Hi Kona, All -
Frankly I'm not really sure how to respond to this posting. The fact is that nearly every detail of what is described here is incorrect, from the get-go, and it fundamentally misstates the (fairly simple, I thought) concept that has been proposed: an object-oriented queryData to estimate the work to be performed (which could then be performed either synchronously or asynchronously), followed by a stageData or whatever to initiate an asynchronous operation, presumably followed by some version of the UWS pattern to interact with the job once it is in progress.
I could respond to this posting in detail and walk through the misstatements, but I don't think this would be worthwhile or productive given that we are already at this point. Let's just say that I do not agree that this is in any way an accurate description of what has been proposed. I have already made a number of postings on this concept in several forums recently, apparently to little effect. It is not clear it would serve any purpose for me to go through it again here now. Very briefly, the only real distinction between an asynchronous, object-oriented data service and a generic job management system is in how the elements or tasks of the job to be executed are defined. This is different for a DAL interface than for the generic case, because we already have a way to define quite precisely the data product(s) we want to generate, and once we have done this, the job to be performed is fully defined.
Instead let's back up and look at the greater issue of integrating TAP with the other DAL interfaces. In the next several years we should have a number of data access interfaces, but I think the "cornerstones" will be catalog, image, and spectral access. We are simply not doing our jobs well if we don't have some uniformity, and sharing of technology, approach, and interface, between these second-generation data access protocols. Plus, there is also still interest in eventually integrating ADQL capabilities into the other DAL services, in which case the distinction becomes even smaller. I think this issue falls in the category of a _requirement_. Unless there is a good reason to diverge, we should adopt a uniform approach; if there is a good reason to do something different, by all means let's do so, while keeping most of the rest of the interface the same.
Also - I think many will argue that simple, synchronous, non-authenticated queries remain the priority. It is good to hear, actually, that people are taking large queries seriously, as I also think this is quite important to have, to move beyond the "toy" stage. A good interface will be simple for synchronous queries against a single table, but easy to extend to asynchronous operations, reusing much of the core interface. What is the difference, really? Once the query is posed, the service can determine whether it is simple enough to proceed synchronously, or expensive enough to require staging (or perhaps rethinking on the part of the client). If a large query is attempted synchronously, the result is truncation or an error response; alternatively, for serious large queries we have a two-stage operation involving estimation and job submission. This is basically what the queryData/stageData concept already provides.
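As a rough illustration of the flow described above, here is a minimal Python sketch of the estimate-then-stage idea. Everything in it is an assumption made for illustration and is not taken from any DAL, TAP, or UWS specification: the class and method names (ToyService, query_data, run_sync, stage_data, submit), the row-count threshold, and the "PENDING" phase label are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from itertools import count

# Hypothetical cutoff: above this many rows the toy service refuses to run synchronously.
SYNC_ROW_LIMIT = 1000


@dataclass
class Estimate:
    """Result of the estimation step: what was asked for and how big it is."""
    table: str
    rows: int

    @property
    def sync_ok(self) -> bool:
        return self.rows <= SYNC_ROW_LIMIT


@dataclass
class Job:
    """Handle for a staged (asynchronous) request; the phase label is a placeholder."""
    job_id: int
    table: str
    phase: str = "PENDING"


class ToyService:
    """In-memory stand-in for a data service; table sizes are supplied up front."""

    def __init__(self, table_sizes):
        self.table_sizes = dict(table_sizes)
        self.jobs = {}
        self._ids = count(1)

    def query_data(self, table: str) -> Estimate:
        # Stage one: describe the work to be done without doing it.
        return Estimate(table, self.table_sizes[table])

    def run_sync(self, est: Estimate) -> list:
        # A large query attempted synchronously yields an error response.
        if not est.sync_ok:
            raise OverflowError(f"{est.rows} rows exceeds sync limit {SYNC_ROW_LIMIT}")
        return [f"{est.table} row {i}" for i in range(est.rows)]

    def stage_data(self, est: Estimate) -> Job:
        # Stage two: register a job the client can poll later.
        job = Job(next(self._ids), est.table)
        self.jobs[job.job_id] = job
        return job


def submit(service: ToyService, table: str):
    """Estimate first, then pick the synchronous or the staged path."""
    est = service.query_data(table)
    if est.sync_ok:
        return service.run_sync(est)
    return service.stage_data(est)
```

In this sketch the same estimation call fronts both paths, so a client written for small synchronous queries only needs to handle one extra return type (the job handle) to cope with large ones.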
On Tue, 1 May 2007, Kona Andrews wrote:
> Dear all,
>
> Copied below is a useful discussion from a colleague of why access
> protocols like SIAP and SSAP don't extend so gracefully to large
> tabular data queries, and why therefore we shouldn't try to make
> TAP exactly conform to the model assumed by these protocols.
>
> Cheers,
> Kona
Received on 2007-05-02Z06:03:16