Re: Metadata - ADQL WD v1.05

From: Noel Winstanley <Noel.Winstanley-at-manchester.ac.uk>
Date: Tue, 4 Jul 2006 16:19:58 +0200

On 4 Jul 2006, at 10:55, Yuji SHIRASAKI wrote:

>
> From: Noel Winstanley <Noel.Winstanley-at-manchester.ac.uk>
> Subject: Metadata - ADQL WD v1.05
> Date: Tue, 4 Jul 2006 01:12:14 +0200
>
>> page 18
>> Metadata Query
>> The table structures, types and contents are quite informally defined
>> - it's a hard task to be more precise when there's no suitable tools.
>>
>> Wouldn't metadata be more suitably described in an XML document,
>> specified with a schema? It would certainly lead to a more formal
>> specification of the metadata, and something that database
>> descriptions could be validated against. I believe there are Registry
>> Schemas suitable for modeling this information already - seems a pity
>> not to use or improve these - even if we don't require these
>> descriptions to be lodged in a registry.
>
> The metadata query returns a result in a table structure, but the
> format in which it should be returned is not defined. So nothing
> prohibits returning the result in the form of a registry schema.

A registry schema is _not_ a table structure. As a table is mentioned, I assumed VOTable.

> It depends on which
> interface is used. If SkyNode performQuery is used, it is returned in
> VOTable.
>
> The document does not say that "metadata MUST be stored in a database
> table", just that "metadata MUST be queried by ADQL". ADQL is a
> language for relational tables, so it was necessary to define the
> table structure, but that does not necessarily mean that the metadata
> MUST be stored in tabular form; it may be in an XML document.
>

No matter how it is stored, if it appears to be a table in a database, it may as well be a table in a database.

> The reason to use ADQL for metadata queries is just to reuse the same
> spec to access the metadata. It is not convenient to introduce another
> query language.
>
>> Furthermore, I'm not sure that a query language spec should be
>> mandating the contents of data services (i.e. that they should
>> contain metadata tables) in the query language. In addition to
>> accessing metadata tables using adql, I can think of at least 3 other
>> ways an (xml document describing) metadata could be sensibly
>> accessed for different kinds of service
>> a) from the registry, as part of the service's registry entry
>> b) from the service, using the IVOA standard support interface
>> c) using a service-specific form - which may be a metadata query
>> (..&FORMAT=METADATA) such as is used in SIAP and SSAP
>
> I think it is convenient, useful and even necessary to introduce the
> specification of a metadata query into the query language itself.
>
> It is not possible to write an ADQL query if there is no way to access
> the metadata.

Agreed. A client needs a convenient way of accessing metadata before an adql query can be usefully written.

> So this means that, if the metadata query is not
> defined in the ADQL spec, ADQL alone does not provide a way to
> make a query.

Sure. The adql spec does not define other things required to make a query - i.e. the service interface a query is submitted to, or the format of results. Although skynode and votable are being implicitly assumed within the adql spec, this won't necessarily always be the case.

The adql spec should just concentrate on the query language - heaven knows this is complex enough to get right. Leave how metadata is supplied to the service definition.

>
> The SkyNode spec does define the metadata query interface, however
> ADQL is not necessarily used only by SkyNode. It may be used over another
> protocol, such as HTTP GET/POST, SMTP (e-mail), and so on. If the
> metadata query is defined as a part of the language itself, it can
> be used over any protocol.

It's not usefully defined at present, as the return format is not defined. And as we don't want to define the return format within the adql spec, we shouldn't be specifying how to access metadata in this spec.

>
>> Retrieving a single document, by any of these methods, is obviously
>> faster than performing multiple adql queries, and then interpreting
>> the results - notice, for example, the implicit hierarchy between the
>> INFO_TABLES and INFO_COLUMNS tables. A single document is also more
>> convenient for caching client-side.
>
> If there are many tables on the service, the single metadata document
> becomes very big.

For many tables, your proposal would mean that I'd need to do many different queries. In my experience of writing interactive skynode clients, the round-trip time for distant skynodes (e.g. UK to Japan) means that it's much more usable to pre-fetch metadata in its entirety; then the user can browse it without round-tripping to the server.
If metadata is prefetched, a single document is much simpler to fetch.
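
A rough sketch of the round-trip arithmetic behind this point, in Python. It assumes (my assumption, not something the WD states) that a client using the metadata-query approach issues one query against INFO_TABLES and then one INFO_COLUMNS query per table, and that each request costs one full network round trip. The function names and the example figures (50 tables, 0.3 s round trip) are hypothetical.

```python
def round_trips_per_table(n_tables: int) -> int:
    """Round trips when metadata is fetched table by table.

    Assumed access pattern: one query to list INFO_TABLES,
    then one INFO_COLUMNS query for each table listed.
    """
    return 1 + n_tables


def round_trips_single_document() -> int:
    """Round trips when all metadata arrives as one document."""
    return 1


def total_latency_s(round_trips: int, rtt_s: float) -> float:
    """Total waiting time, ignoring transfer and server processing time."""
    return round_trips * rtt_s


if __name__ == "__main__":
    n_tables = 50   # hypothetical service size
    rtt_s = 0.3     # hypothetical long-distance round-trip time, in seconds

    per_table = total_latency_s(round_trips_per_table(n_tables), rtt_s)
    single = total_latency_s(round_trips_single_document(), rtt_s)
    print(f"per-table queries: {per_table:.1f} s of round-trip latency")
    print(f"single document:   {single:.1f} s of round-trip latency")
```

A client that fetched every INFO_COLUMNS row in one query would need only two round trips, so the gap depends heavily on how the client is written; the sketch only illustrates the many-queries case described above.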

> In addition, you must retrieve the whole big document
> even when only a very small change is made to the metadata.
>
> In the case of the ADQL metadata query, it is possible to get only the
> metadata that has changed since the last query by checking the
> "last_modified" attribute.

How often does metadata change for the typical astronomical archive?

>
> So, it is not obvious which query method is better. I think it is
> better to have both methods.
>

Not convinced. I think neither should be suggested in the ADQL specification.

>> Also note that the metadata as currently described wouldn't be very
>> useful for describing hierarchical data. As there's xpath-ish support
>> in adql, and Registries are expected to respond to adql queries,
>> this is something that needs considering. And I don't think we should
>> expect a registry implementation to expose metadata tables.
>
> The Registry is not actually an ADQL service. It just accepts a part of
> the ADQL spec (the WHERE clause), so it does not need to implement the
> metadata query.
>
>> Finally, from a client implementor's point of view, the metadata query
>> approach means I need to build votable parsing into my application to
>> be able to interpret the votable of metadata returned. A client will
>> probably already contain xml / registry parsing libraries - to be
>> able to search for and resolve an ADQL-consuming service. But it's
>> not always the case that the client wishes to do votable parsing - it
>> may just wish to save the query results; or pass them to a PLASTIC
>> application.
>
> If all the metadata are registered with the registry, you may use the
> registry. You don't need to use the ADQL metadata query. Some clients
> may not have the functionality to access the registry.

How would they locate new skynodes then?

> In that case they
> will use the ADQL metadata query or the SkyNode metadata query interface.
> Which method is used depends on the client developer.
>
> Yuji Shirasaki.
>
Received on 2006-07-04Z16:20:39