Re: TAP and large resultsets

From: Patrick Dowler <patrick.dowler-at-nrc-cnrc.gc.ca>
Date: Mon, 29 Jan 2007 11:44:20 -0800


On Monday 29 January 2007 11:26, Patrick Dowler wrote:
> Note: by using TOP and ORDER BY judiciously, the client can arrange to do
> the chunking without any support in the service, especially easy if there
> is a known unique key:
>
> select top N * from some_table
> [ where uid > biggest_id_I_have_seen ]
> order by uid

I should have added something else here: doing chunking with top and order by is fairly friendly to the DB since the good columns for the order by and the condition are almost always going to be indexed, so finding the start of the chunk is well optimised. And when you tell the db how many rows you want via top, it can perform other optimisations. Most DBs will try to use an index on uid for the above example, especially when N is small, but can use it in later query exec stages as dictated by table stats, etc.

This kind of client-specified chunking in the query is easy to do and well behaved, and it requires no server-side state. I think it would be tricky to have the TAP service implement the same thing (inject some extra SQL, for example), certainly would require some server-side state, and could conflict with a user-specified ORDER BY clause.

So in an effort to make the TAP spec as simple as possible, I think we should just punt since the client will be able to do it within core ADQL features.

-- 

Patrick Dowler
Tel/Tél: (250) 363-6914                  | fax/télécopieur: (250) 363-0045
Canadian Astronomy Data Centre   | Centre canadien de donnees astronomiques
National Research Council Canada | Conseil national de recherches Canada
Government of Canada                  | Gouvernement du Canada
5071 West Saanich Road               | 5071, chemin West Saanich
Victoria, BC                                  | Victoria (C.-B.)
Received on 2007-01-29Z20:41:18