Re: Response to TAP presentations

From: Guy Rixon <gtr-at-ast.cam.ac.uk>
Date: Thu, 17 May 2007 02:05:07 +0100 (BST)


On Wed, 16 May 2007, Doug Tody wrote:

> I think it could help advance the TAP discussions enormously if we could
> reach agreement on the following key issues:
>
> [...]
>
> Since both table data and table metadata are tabular, it is possible
> to use the same mechanism to query both. This simplifies things
> for the client and will promote code-sharing at all levels, plus it
> is more flexible if we wish to describe more than just tables/columns
> in the future (e.g., views, indexes, etc.).
>
> ** Do we provide a uniform interface for table data and table metadata
> queries?

It doesn't simplify anything for clients that read VOResource.

If you want to simplify matters by using _existing_, commodity code for parsing metadata, then you must use only the VOTable header and not the rows of the table to carry those metadata. Specifically, you have to emit the same VOTable as you would for a data query, omitting the rows. Anything else would need to be specially written for TAP. If you keep the commonality, which is fine, then you can't have the extensibility too. If you think that you can describe any view or index as a kind of table then you could do that with VODataService too.
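
To illustrate the "header only" idea, here is a minimal sketch in Python, assuming a VOTable that carries FIELD elements and an empty data section. The sample document, the table name "stars", the column names and the helper names are all made up for illustration; only the standard-library XML parser is used, which stands in for whatever commodity VOTable parser a client already has.

```python
import xml.etree.ElementTree as ET

# Hypothetical response: same shape as a data-query VOTable, but with no rows.
SAMPLE = """<VOTABLE version="1.1">
  <RESOURCE>
    <TABLE name="stars">
      <FIELD name="ra" datatype="double" unit="deg" ucd="pos.eq.ra"/>
      <FIELD name="dec" datatype="double" unit="deg" ucd="pos.eq.dec"/>
      <FIELD name="vmag" datatype="float" unit="mag" ucd="phot.mag"/>
      <DATA><TABLEDATA/></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def local(tag):
    """Return the element name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def table_metadata(votable_xml):
    """Map each TABLE name to a list of its FIELD attribute dicts."""
    root = ET.fromstring(votable_xml)
    tables = {}
    for element in root.iter():
        if local(element.tag) == "TABLE":
            columns = [dict(child.attrib) for child in element
                       if local(child.tag) == "FIELD"]
            tables[element.get("name")] = columns
    return tables


if __name__ == "__main__":
    for name, columns in table_metadata(SAMPLE).items():
        print(name, [c["name"] for c in columns])
```

The column descriptions come entirely from the FIELD elements, so the same code path serves a data query and a metadata query; nothing here reads table rows.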

If you want elaborate extensibility to new use cases, then the best thing might be to define a new XML schema for ADQL services. Or you could simply define it as a kind of Capability.

> [...]
>
> 4) Grid capabilities
>
> Roy has a good point that asynchronous data staging is complex,
> and supporting this initially expands the scope of TAP and may delay
> the standard. In particular, we will probably need a HTTP/REST-based
> version of VOSpace, and we probably cannot manage the general issue
> of data staging without also solving the problem of authentication.
> We will need to specify and prototype all of these related Grid
> technologies before we can integrate them into the data services
> (not just TAP). Nonetheless, as many have pointed out, we need
> these capabilities if we are ultimately to deliver robust, capable
> data services.
>
> ** Do we wish to have a phased development for TAP which does
> not fully specify the grid capabilities in the initial version?
> Will this be sufficiently useful? If we do go this route, I suggest
> the service interface design should encompass the grid capabilities
> (via a stageData concept or whatever) so that we can be confident
> that yet another interface redesign is not necessary when these
> capabilities are eventually added.

Please note that you can do asynchronous queries using UWS without VOSpace - you don't need the TAP service to be, own or be told about any VOSpace service, and you don't need the client to know anything of VOSpace. This is so for any UWS application and it's written that way to allow simpler clients.
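
As a rough sketch of that separation, the toy model below keeps the job and its result inside the query service itself, so the client only creates a job, checks its phase and fetches the result. Everything here is an assumption made for illustration: the class and method names are hypothetical, the phase names are only meant to resemble UWS job phases, and the query runs inline in start() where a real service would run it in the background. No VOSpace appears anywhere.

```python
import itertools


class AsyncQueryService:
    """Toy stand-in for a UWS-style query service (hypothetical, not a real API)."""

    def __init__(self, run_query):
        self._run_query = run_query      # callable that executes one query
        self._jobs = {}                  # job id -> job record held by the service
        self._ids = itertools.count(1)

    def create_job(self, adql):
        """Register a query and return its job id; nothing runs yet."""
        job_id = str(next(self._ids))
        self._jobs[job_id] = {"adql": adql, "phase": "PENDING", "result": None}
        return job_id

    def start(self, job_id):
        """Run the job (inline here; a real service would do this in the background)."""
        job = self._jobs[job_id]
        job["phase"] = "EXECUTING"
        try:
            job["result"] = self._run_query(job["adql"])
            job["phase"] = "COMPLETED"
        except Exception as exc:
            job["result"] = str(exc)
            job["phase"] = "ERROR"

    def phase(self, job_id):
        """Return the current phase, which is what a client would poll."""
        return self._jobs[job_id]["phase"]

    def result(self, job_id):
        """Return the result stored by the service once the job has completed."""
        job = self._jobs[job_id]
        if job["phase"] != "COMPLETED":
            raise RuntimeError("job %s is %s" % (job_id, job["phase"]))
        return job["result"]
```

Staging to an external store would be an optional extra destination for the result, not something the client or this basic lifecycle depends on.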

A TAP implementation is a VOSpace client. It doesn't follow that a RESTful TAP needs a RESTful VOSpace.

There is a public prototype of asynchronous, ADQL querying with staging to VOSpace. AstroGrid's DSA/Catalogue product does this (using AstroGrid MySpace as a VOSpace) and it's been in service for years. We in IVOA have much more experience with this than many other current topics.

I think asynch queries for TAP are _much_ more important than data staging is for SIA. Many use cases need it; almost all implementations and deployments may meet one of those use cases. I think we need it in the first version of TAP, not as an afterthought.

If we accept UWS, and if we apply it directly to TAP (rather than to SSAP and then bend TAP to fit), then it's conceptually simple. We should not be scared of doing this.

> ---
> On Tue, 15 May 2007, Roy Williams wrote:
>
> > This is my response to a double presentation this morning on the TAP protocol
> > evolution, one from Tody, the other from Stebe/Osuna. I believe that this
> > project has gained unacceptable mission-creep from its original conception as
> > SIMPLE table access. I was VERY disturbed when Osuna joked today that this be
> > renamed "General Access Protocol".
> >
> > Roy Williams
> > -----------------------------------------
> >
> > (1) It was my impression at the beginning of the discussion a year ago that
> > TAP would be a protocol to allow ADQL and/or SQL queries to be sent to a
> > database, and a response to be obtained in table form.
> >
> > (2) The parameterized queries presented this morning should not be part of the
> > TAP protocol. The language is not well defined, and would be difficult to
> > implement. If a cone search is wanted, then the IVOA already has that
> > specification, and TAP is not a replacement for cone search. It is no easier
> > to write x="2.5/" than "select * where x>2.5", but adds a considerable burden
> > in implementation, and the requirement to implement an open-ended
> > parameter-based "language" that is not well-defined.
> >
> > (3) The presentation this morning suggested that error responses from TAP be
> > encoded into the HTTP transport, for example HTTP 204 means "No Content". This
> > is problematic on several grounds. First, no other IVOA protocol is bound to
> > HTTP in this way, so we have a new concept where, I believe, other ways
> > already exist. Second, why should we bind ourselves to a particular transport
> > layer like this? Third, the HTTP messages leave no room for elaboration -- for
> > example *why* is there no content? I propose that errors can be reported as
> > with other IVOA protocols: a VOTable with an INFO element.
> >
> > (4) I very much like the suggestion of three classes of query: the ADQL, the
> > Utype query, the NativeSQL. The Utype method is a natural expression of the
> > Source Catalog Data Model, so that the same query can be sent to many
> > databases, and the NativeSQL allows an extremely easy implementation of TAP
> > for some providers. I believe that none of these three should be mandatory.
> > Thus each column of the table can have two names: the arbitrary one in the
> > database table or view itself, and the IVOA standard Utype name.
> >
> > (5) Tables and table metadata are both tables, and the IVOA has adopted a
> > standard representation for tables, it is called VOTable. I believe VOTable
> > should be the principal way that relational schema and the table data should
> > be returned from TAP, although other formats may be offered by implementers.
> > Because tables and table metadata are unified, it means that querying the
> > table metadata is no different from querying table data itself.
> >
> > (6) Please note that the IVOA Recommendation VOResource does NOT include a
> > mechanism for expressing table metadata; rather it is VODataService that
> > suggests this, and that is not yet defined even as Working Draft. It would be
> > unwise to make TAP dependent on a controversial suggestion that is not yet
> > even defined or documented. Obviously this could in the future be an optional
> > expression of table metadata, but I suggest the IVOA should stick with the
> > standards it has already ratified.
> >
> > (7) The asynchronous query mechanisms presented were rather different, and
> > neither was satisfactory to me. Tody suggested a warping of what has been
> > drafted (but not implemented) for other DAL services. Stebe/Osuna suggested a
> > mechanism with no monitoring or notification that did not seem well thought
> > out. I would like to suggest leaving asynchronous TAP for a future version,
> > and concentrate on getting the plain, simple, synchronous version to
> > Recommendation.
> >
> > (8) Neither presentation considered TAP queries on private data; how the query
> > protocol can include an authentication token so that only a select group of
> > people can launch queries. This is just as important as batch jobs on public
> > data. I believe that this too should be handled in the next version of TAP.
> >
> >
>

Guy Rixon 				        gtr-at-ast.cam.ac.uk
Institute of Astronomy   	                Tel: +44-1223-337542
Madingley Road, Cambridge, UK, CB3 0HA		Fax: +44-1223-337523
Received on 2007-05-17Z03:05:44