Hi Kona -
On Fri, 9 Mar 2007, Kona Andrews wrote:
>> Speaking of standards to consider in this area, we also have VOTable,
>> which is the primary IVOA standard for conveying tabular data of any
>> kind (especially useful for cases where such data may be consumed
>> by scientific applications, where many standard tools are already
>> available), and SQL92, which defines a general and widely implemented
>> standard for describing tables. It would be good to look also at the
>> latter, to see what kind of metadata has been defined.
>
> Could you clarify a little what you were saying yesterday in the telecon
> about not using SIAP-style VOTable to return tabular metadata?
>
> As I understand it, you were suggesting that in your prospective
> use of VOTable for returning tabular metadata descriptions, you
> did *not* propose to use the ordinary VOTable header section to describe
> the metadata (i.e. return an "empty" votable just containing headers
> and no actual data rows).
>
> Instead, you seemed to be proposing that the metadata description would
> be returned inside the actual table rows of the VOTable (with the VOTable
> header presumably conveying some other kind of information, or no
> information?). Is that correct, or have I misunderstood what you were
> saying?
The idea with this approach is to provide a uniform mechanism for accessing both table data and table metadata. A table query returns a table; a metadata query returns a table; the same mechanism serves both cases. (This is of course the essence of the relational database approach, and it is how the SQL92 information schema mechanism, used to query database metadata, works.)
The key things are 1) that in all cases we get back a table, and 2) the information content of the table. The same information could be returned in various ways, e.g., VOTable, XML, CSV, etc. From the application programmer's perspective it is convenient to use the same mechanism for both table data and table metadata.
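To illustrate the "same mechanism for both" point, here is a minimal sketch using Python's built-in sqlite3 module. Note the assumptions: SQLite does not implement the SQL92 INFORMATION_SCHEMA, so its `PRAGMA table_info` and `sqlite_master` catalog stand in for it here, and the `sources` table and the `fetch_table` helper are made up for the example.

```python
import sqlite3

def fetch_table(conn, sql):
    """Run any query and return (column_names, rows); the result is always a table."""
    cur = conn.execute(sql)
    names = [d[0] for d in cur.description]
    return names, cur.fetchall()

# Hypothetical data table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sources (id INTEGER PRIMARY KEY, ra_deg REAL, dec_deg REAL)")
conn.execute("INSERT INTO sources (ra_deg, dec_deg) VALUES (10.5, -3.2)")

# Data query: returns a table.
data_cols, data_rows = fetch_table(conn, "SELECT id, ra_deg, dec_deg FROM sources")

# Metadata queries: also return tables, through exactly the same call.
# (SQLite stand-ins for what SQL92 exposes via INFORMATION_SCHEMA.)
tbl_cols, tbl_rows = fetch_table(conn, "SELECT name FROM sqlite_master WHERE type = 'table'")
meta_cols, meta_rows = fetch_table(conn, "PRAGMA table_info(sources)")

for cols, rows in ((data_cols, data_rows), (tbl_cols, tbl_rows), (meta_cols, meta_rows)):
    print(cols, rows)
```

The application code that consumes the result does not need to know whether it asked for data or for metadata.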
If we describe the list of tables which a given service can access, the response is a table wherein each row describes one such Table object. The columns or fields of the table are the attributes used to describe a Table object. These attributes can be anything we want, regardless of how the information is conveyed or stored (this has nothing to do with VOTable, which is just a container).
If we describe the list of columns or fields in a Table object, the response is a table wherein each row describes one such TableColumn object. The columns of the returned table are the attributes used to describe a TableColumn of the data table. Again, these can be anything we want (probably some combination of SQL92 and VOTable PARAM/FIELD-like attributes, but it is arbitrary).
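As a rough sketch of the data model described above (independent of any serialization), Table and TableColumn can be written as plain records and flattened into rows. The attribute names below (description, nrows, datatype, unit, ucd) are only placeholders chosen for the example; the actual attribute set is exactly what remains to be defined.

```python
from dataclasses import dataclass, fields

@dataclass
class Table:
    # Placeholder attributes; the real set is still to be agreed on.
    name: str
    description: str
    nrows: int

@dataclass
class TableColumn:
    # Placeholder mix of SQL92-like and VOTable FIELD-like attributes.
    table_name: str
    name: str
    datatype: str
    unit: str
    ucd: str

def as_table(objects):
    """Flatten a non-empty list of same-typed records into (column_names, rows)."""
    names = [f.name for f in fields(objects[0])]
    rows = [tuple(getattr(obj, n) for n in names) for obj in objects]
    return names, rows

tables = [Table("sources", "Detected sources", 1200)]
columns = [
    TableColumn("sources", "ra", "double", "deg", "pos.eq.ra"),
    TableColumn("sources", "dec", "double", "deg", "pos.eq.dec"),
]

# One row per Table object; one row per TableColumn object.
table_names, table_rows = as_table(tables)
column_names, column_rows = as_table(columns)
```

The resulting (column_names, rows) pairs can then be handed to whatever container is in use (VOTable, CSV, and so on).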
If the data is returned using VOTable, VOTable provides a standard table container. The fields of the table are the attributes of the Table or TableColumn, with each row describing a single Table or TableColumn. The advantage of VOTable is that, since it is a standard container for table data, many table-oriented tools already exist to deal with this data (no need to write a custom parser). Plus, if VOTable is already used to access table data, the same mechanism can be used in both cases.
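For concreteness, a minimal sketch of carrying TableColumn descriptions in the rows of a VOTable, using only Python's standard xml.etree.ElementTree. This is an illustration under assumptions, not a validated document: it emits only the basic VOTABLE/RESOURCE/TABLE/FIELD/DATA/TABLEDATA/TR/TD skeleton, omits the XML namespace, types every field as a variable-length string, and uses made-up attribute names. In practice an existing VOTable library would do this work.

```python
import xml.etree.ElementTree as ET

def to_votable(column_names, rows):
    """Serialize (column_names, rows) into a bare-bones VOTable string."""
    votable = ET.Element("VOTABLE", version="1.1")
    table = ET.SubElement(ET.SubElement(votable, "RESOURCE"), "TABLE")
    for name in column_names:
        # Simplification: every attribute is carried as a variable-length string.
        ET.SubElement(table, "FIELD", name=name, datatype="char", arraysize="*")
    tabledata = ET.SubElement(ET.SubElement(table, "DATA"), "TABLEDATA")
    for row in rows:
        tr = ET.SubElement(tabledata, "TR")
        for value in row:
            ET.SubElement(tr, "TD").text = str(value)
    return ET.tostring(votable, encoding="unicode")

def from_votable(xml_text):
    """Read the FIELD names and TR/TD cells back as (column_names, rows)."""
    root = ET.fromstring(xml_text)
    names = [field.get("name") for field in root.iter("FIELD")]
    rows = [tuple(td.text for td in tr.iter("TD")) for tr in root.iter("TR")]
    return names, rows

# Hypothetical TableColumn metadata: one row per column of a "sources" table.
meta_names = ["table_name", "name", "datatype", "unit"]
meta_rows = [
    ("sources", "ra", "double", "deg"),
    ("sources", "dec", "double", "deg"),
]

xml_text = to_votable(meta_names, meta_rows)
names_back, rows_back = from_votable(xml_text)
```

The reader side is the same code that would read any other VOTable; nothing in it is specific to metadata.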
If native XML is used to convey table data, we have to invent a new, custom schema just for this class of tabular data. Applications may have to use different software to access table metadata and table data, even though both are tables.
The case of getCapabilities/VOResource is different. If we exclude things like large sequences of table metadata, this case generally involves a small quantity of heterogeneous data which does not map easily into the relational model. It is more easily expressed as a data structure, hence structured XML works well, and a custom schema is required in any case. To access the data in an application, a table mechanism is not very useful; rather, we probably just ingest the data into an object instance of some sort and do GETs on it.
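A small sketch of that "ingest into an object and do GETs" pattern follows. The XML here is invented purely for the example (element and attribute names such as `limits` and `maxRows` are not taken from VOResource or any actual getCapabilities schema); the point is only that small heterogeneous data reads naturally as a structure rather than as rows.

```python
import xml.etree.ElementTree as ET

# Invented capabilities document, for illustration only.
CAPS_XML = """
<capabilities>
  <service name="example-table-service">
    <contact>ops@example.org</contact>
    <limits maxRows="10000"/>
    <format>votable</format>
    <format>csv</format>
  </service>
</capabilities>
"""

class Capabilities:
    """Ingest the structured XML once, then expose plain attributes."""

    def __init__(self, xml_text):
        service = ET.fromstring(xml_text).find("service")
        self.name = service.get("name")
        self.contact = service.findtext("contact")
        self.max_rows = int(service.find("limits").get("maxRows"))
        self.formats = [fmt.text for fmt in service.findall("format")]

caps = Capabilities(CAPS_XML)
print(caps.name, caps.max_rows, caps.formats)
```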
> If the metadata is to be conveyed in the table rows, is there a pre-existing
> IVOA standard for the provision of such metadata within VOTable table data
> rows that we can consider? Or is the standards-compliance here based on
> the use of a IVOA-standard VOTable wrapper to contain a custom-purpose
> metadata description format?
VOTable, the VOResource schema for tabular data, SQL92 information schema, the work already done on the Catalog data model, and so forth would be good things to look at to define the metadata required to describe Table, TableColumn, or any other such objects we may need to deal with. Again, the key thing here is the data model, not how it is serialized.
Regarding standards-compliance, clearly VOTable is the primary IVOA standard for conveying tabular data, with the advantages of existing tools, a uniform interface, and so on, as mentioned above.