Dear colleagues,
I am appending below some detailed feedback from my colleague Jeff Lusted, whose advice I sought on the subject of the Region construct.
His feedback reflects a lot of practical experience with the issues of intertranslating ADQL/xml, ADQL/sql and SQL, and the practical issues involved in constructing a graphical Query Builder for ADQL, so I hope it will be helpful.
Message from Jeff follows.
My initial reaction, after having read the mails you sent me, is to stick with the Region construct as it stands, i.e. not as a separate clause. I'll go along with most of what Benjamin says, especially on not deciding the standard on implementation issues or special cases.
But these things need to be qualified!
The problem with Region is that everything I've seen about it so far gives no details of what table(s) are involved or what columns will be involved in the search! Tables have to be inferred from the context, and columns by the implementation. I suspect the implementation will decide whether this is HEALPix, HTM or plain ra & dec. But if the latter, what if there is more than one column that can be construed as ra (or dec)? As for tables, I assume that under the current Region spec the region will be applied to each table in the query.
There is also the complication of whether such a separate Region clause could be included in sub-queries, which after all are simply other forms of select.
If the Region is not to be some super-filter clause, then a table (or tables!) needs to be specified, perhaps with an optional indication of columns. Something along the lines of...
t.Region( 'Circle etc ', t.ra, t.dec )
or
t.Region( 'Circle etc ', t.htm )
in BNF...
<region_predicate> ::=
<table_reference>
<dot>
REGION
<left_paren>
<region_specification>
[ <comma>
<column_reference>[ { <comma> <column_reference> }... ] ]
<right_paren>
Perhaps the column specification might be too much for some. I'm easy, but it might be really handy to specify columns to be used in certain situations.
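To make the grammar above a little more concrete, here is a minimal Python sketch of a recognizer for the proposed predicate. It is only an illustration under stated assumptions: the names `RegionPredicate` and `parse_region_predicate` are hypothetical, the region specification is assumed to be a single-quoted string, and column references are assumed to be plain or table-qualified identifiers.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical pattern for:
#   <table> . REGION ( '<spec>' [ , <column> { , <column> }... ] )
# Assumes a single-quoted spec and simple (optionally qualified) column names.
_PREDICATE = re.compile(
    r"""^\s*(?P<table>\w+)\s*\.\s*REGION\s*\(\s*
        '(?P<spec>[^']*)'
        (?P<cols>(?:\s*,\s*\w+(?:\.\w+)?)*)
        \s*\)\s*$""",
    re.IGNORECASE | re.VERBOSE,
)


@dataclass
class RegionPredicate:
    table: str          # table the region applies to
    spec: str           # region specification, still an opaque string here
    columns: List[str]  # optional columns to search on (may be empty)


def parse_region_predicate(text: str) -> RegionPredicate:
    """Split a region predicate into table, specification and columns."""
    m = _PREDICATE.match(text)
    if m is None:
        raise ValueError(f"not a region predicate: {text!r}")
    columns = [c.strip() for c in m.group("cols").split(",") if c.strip()]
    return RegionPredicate(m.group("table"), m.group("spec").strip(), columns)
```

With this sketch, `parse_region_predicate("t.Region( 'Circle etc ', t.htm )")` yields the table `t`, the specification `Circle etc` and the single column `t.htm`; leaving the columns out gives an empty list, which an implementation could take as "choose the columns yourself".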
A better shorthand might be to include the list of tables within the predicate (see example regarding derived table later).
Having a Region clause as some super-filter presupposes that the query is only interested in one region of the sky. I think this may well be the case but not for certain. It may be, for instance, that a query is interested in comparing stars of a certain character between different regions.
Another complication in this area is that the spec (STC schema) allows for semantics against one or more regions, but presumably against the same "set" of tables. Thus a Region can quote another region inside it. There is for instance a Region of unionType that is the union of two or more regions. This is contained wholly within the Region construct in the schema (which takes some imagination).
Sorry about the above points. They do need serious thought. A possible compromise (worth exploring to see whether it resolves some of the conundrums) is to allow for derived tables. Currently we do not seem to support this idea within adql/s or adql/x as far as I can see. But it is there in SQL92! So, here is the bnf for a table reference...
<table_reference> ::=
<table_name> [ [ AS ] <correlation_name>
[ <left_paren> <derived_column_list> <right_paren> ] ]
| <derived_table> [ AS ] <correlation_name>
[ <left_paren> <derived_column_list> <right_paren> ]
| <joined_table>
where derived table is defined as ....
<derived_table> ::= <table_subquery>
It seems to me as if region is supposed to be some form of special derived table, which obviously acts as a form of filter. So, the table subquery might look like...
Select *
From Region( 'Circle etc', catalogue_A ) as Circle_A,
Region( 'Ellipse etc', catalogue_B ) as Circle_B
Where
Circle_A.Q_Z_ABS >= 0.975 ;
I think this could be thought through and made to work. It may even be possible to quote one region inside another if the spec is carefully worked out. I suppose if you take this seriously, Region must be thought of as a search condition and not as a completely separate clause. The latter might be easier for the easy cases, but I think lacks flexibility.
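As a rough illustration of Region acting as a derived table that filters its input, here is a small Python sketch. Everything in it is an assumption made for the example: tables are lists of dictionaries, positions live in columns called `ra` and `dec`, the circle radius is in degrees, and the helper names `circle` and `region` are hypothetical.

```python
import math
from typing import Callable, Dict, Iterable, List

Row = Dict[str, float]


def angular_separation_deg(ra1: float, dec1: float, ra2: float, dec2: float) -> float:
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def circle(ra: float, dec: float, radius: float) -> Callable[[Row], bool]:
    """Hypothetical shape: true for rows within `radius` degrees of (ra, dec)."""
    return lambda row: angular_separation_deg(ra, dec, row["ra"], row["dec"]) <= radius


def region(shape: Callable[[Row], bool], table: Iterable[Row]) -> List[Row]:
    """Derived table: the rows of `table` that fall inside `shape`."""
    return [row for row in table if shape(row)]


# Made-up rows standing in for catalogue_A.
catalogue_a = [
    {"ra": 181.3, "dec": -0.76, "Q_Z_ABS": 0.98},
    {"ra": 181.4, "dec": -0.70, "Q_Z_ABS": 0.50},
    {"ra": 10.0, "dec": 45.0, "Q_Z_ABS": 0.99},
]

# "From Region('Circle etc', catalogue_A) as Circle_A Where Circle_A.Q_Z_ABS >= 0.975"
circle_a = region(circle(181.3, -0.76, 6.5), catalogue_a)
result = [row for row in circle_a if row["Q_Z_ABS"] >= 0.975]
```

The point of the sketch is that the region is applied to one named table and yields something that behaves like any other table, so the ordinary Where clause can then be applied to it, and two different regions can sit side by side in the same From list.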
One final point. I don't think in the long run the string-based semantics will work (e.g. 'Circle J2000 181.3 -0.76 6.5'). Given the complexity, we need to break things down into the constituent parts as the STC schema attempts to do.
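To show what breaking the string into constituent parts might look like, here is a minimal Python sketch. It is not the STC schema: the class names `Circle` and `UnionRegion` are hypothetical, and the reading of the example string as shape, frame, ra, dec, radius is an assumption about the field order.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Circle:
    frame: str     # coordinate frame, e.g. "J2000"
    ra: float      # assumed: first number is right ascension
    dec: float     # assumed: second number is declination
    radius: float  # assumed: third number is the radius


@dataclass(frozen=True)
class UnionRegion:
    # A region made of other regions, loosely mirroring the unionType idea.
    members: Tuple["Region", ...]


Region = Union[Circle, UnionRegion]


def parse_circle(text: str) -> Circle:
    """Turn a string such as 'Circle J2000 181.3 -0.76 6.5' into its parts."""
    parts = text.split()
    if len(parts) != 5 or parts[0].lower() != "circle":
        raise ValueError(f"not a circle specification: {text!r}")
    _, frame, ra, dec, radius = parts
    return Circle(frame, float(ra), float(dec), float(radius))
```

Once the parts are explicit, a query builder can validate or edit each one separately, and nesting one region inside another becomes a matter of building a `UnionRegion` from existing regions rather than parsing strings inside strings.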
--
Kona Andrews kea-at-roe.ac.uk
AstroGrid Project http://www.astrogrid.org
IfA, Royal Observatory, Blackford Hill, Edinburgh EH9 3HJ
Received on 2006-11-03Z11:21:43