On Wed, 1 Nov 2006, Kona Andrews wrote:
>> Exactly how this should look for TAP is not yet clear since for
>> TAP a data query can be a long running operation. Most TAP queries
>> however (>90% ?) are probably simple synchronous queries as well,
>> especially if we have a way to page through a large query response with
>> a succession of calls, something which all queries probably require.
>
> I would be loath to start putting functionality such as paging of results
> into TAP; I really think we should keep it as simple as possible
> (think conesearch-simple). If there are a lot of results, let the user
> choose to have them delivered to their VOStore, and then view/manage
> them using the panoply of regular data analysis tools available.
> If TAP starts to get bloated with functionality, it becomes far less
> attractive to implement it alongside the richer asynchronous SOAP interfaces
> we already implement.
For very large query responses I agree: the results should just go to the VOStore.
However, paging through large query responses is pretty fundamental for any query. We already have to deal with this for all the DAL queries; e.g., a mechanism for it has been introduced in the SSA data query, and it will be in SIA V2 as well. Any simple query can easily overflow, and there has to be some mechanism to handle this, even if all the service does is indicate truncation or an overflow error. For large queries where we are moving lots of data around we will need VOStore, but requiring VOStore makes things significantly more complicated, so we shouldn't rule out a simpler mechanism for queries which are not terribly large but may require paging. It would be nice if simple services which don't support VOStore could still function.
Paging may or may not require state to be maintained by the service, depending upon how it is implemented. The simplest solution is to truncate the response or indicate an error. Next, one can do what is done with most Web queries (Google etc.): re-execute the entire query but return only page N+1 of the response, using a token of some sort to indicate the next page. The most sophisticated solution (which really isn't very hard) is to cache result sets up to a certain size on the server and return a pointer in each query response to get the next chunk. Since this can all be driven by a token returned by the service in each call, all the information required is available in each call; hence very little state needs to be maintained in the service, and you don't need a full stateful service mechanism of the usual sort.
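
To make the middle option concrete, here is a rough sketch of token-driven paging in which the service keeps no state between calls and simply re-executes the query each time. Everything in it is an assumption for illustration only: the names fetch_page, run_query and PAGE_SIZE, the response layout, and the token encoding are all hypothetical and are not taken from any TAP, SSA or SIA draft.

```python
import base64
import json

PAGE_SIZE = 3  # hypothetical page size chosen by the service


def run_query(query):
    """Stand-in for the real data query; returns every matching row."""
    return [f"{query}-row{i}" for i in range(8)]


def encode_token(query, offset):
    """Pack the query and the next offset into an opaque token string."""
    raw = json.dumps({"q": query, "offset": offset}).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_token(token):
    """Recover the query and offset that a previous call packed up."""
    data = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
    return data["q"], data["offset"]


def fetch_page(query=None, token=None):
    """Return one page of rows plus a token for the next page, if any."""
    if token is not None:
        query, offset = decode_token(token)
    else:
        offset = 0
    rows = run_query(query)  # the whole query is re-executed on every call
    page = rows[offset:offset + PAGE_SIZE]
    next_offset = offset + PAGE_SIZE
    # A token of None tells the client that the response is complete.
    next_token = encode_token(query, next_offset) if next_offset < len(rows) else None
    return {"rows": page, "next": next_token}


# Client side: keep calling with the returned token until none is given.
response = fetch_page(query="cone")
all_rows = list(response["rows"])
while response["next"] is not None:
    response = fetch_page(token=response["next"])
    all_rows.extend(response["rows"])
```

Because the token carries both the query and the offset, each call is self-contained. The cached variant would look the same to the client; the token would instead name a result set held on the server for a limited time.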