More on REGION implementation

From: Kona Andrews <kea-at-roe.ac.uk>
Date: Thu, 1 Mar 2007 14:09:55 +0000


Dear colleagues,

A few more comments on the implementation of REGION.

At the moment, our DataSet Access component does not support queries using the REGION construct. However, I would hope to support it in the DSA back-end using the same method by which we implement Cone Search (i.e. a CIRCLE REGION), where possible.

When a Cone Search request comes in, it is translated into ADQL/sql by populating a template string. The template contains the basic ADQL/sql circle-region query, together with some placeholder terms that are replaced by the appropriate values/expressions. The replacement is carried out using simple string substitution from within Java, using values calculated by the Java code.

For small-radius queries we use a haversine formula, and for larger-radius queries we use a greatcircle formula. For each of these two cases, we have four separate templates covering all possible combinations of column units and database trig units (radians/degrees). This gives 8 templates in total. While having 8 different templates is a bit of a pain maintenance-wise, hopefully the templates don't actually *need* maintenance once correctly defined, and the java code is a bit more streamlined as a result. However, the units issue could in principle be dealt with in Java, instead of by multiple templates, if that was preferred.

For example, this is the template for a haversine query where the column units are in radians and the DBMS trig functions are in degrees:


SELECT * FROM INSERT_NAME_TABLE AS a WHERE

   (

(a."INSERT_NAME_DEC" <= INSERT_VALUE_MAX_DEC_RAD) AND
(a."INSERT_NAME_DEC" >= INSERT_VALUE_MIN_DEC_RAD)
   )
   INSERT_RA_CLIP_CONDITION
   AND
   (

(2.0 * ASIN(SQRT(

         POWER(SIN(((DEGREES(a."INSERT_NAME_DEC") -
               INSERT_VALUE_DEC_DEG) / 2.0)), 2)  +
           (
              (COS(DEGREES(a."INSERT_NAME_DEC") )  *
                 (
                    COS(INSERT_VALUE_DEC_DEG)  *
                    POWER(SIN(((DEGREES(a."INSERT_NAME_RA") -
                        INSERT_VALUE_RA_DEG) / 2.0)), 2)
                 )
              )
           )
      ) ) )
      < INSERT_VALUE_CIRCRADIUS_DEG

  );

All the terms that begin with INSERT_ are substituted by appropriate values using string substitution in the Java code; the term INSERT_RA_CLIP_CONDITION is replaced by an entire expression representing an appropriate boxcut constraint on RA to make the search more efficient.

Where possible, we would hope to implement other conditions in a similar way (obviously, using templates expressing just the necessary ADQL/s WHERE clauses rather than entire queries). This approach wouldn't work for arbitrary user-defined polygons, but for simple shapes with fixed geometry it is reasonably effective.

I would concur with previous comments that it is important to be able to specify explicitly which columns should be used in the region calculation, to avoid ambiguity.

Note that as the above example shows, we are not explicitly dealing with the question of the coordinate system(s) of the input query and the database columns; we assume that the coordinate system is the same for both. This assumption may not be correct, which is one of the reasons why we are seeking clarification on the issue of default coordinate systems in the VO at large.

All the best,
Kona

-- 
Kona Andrews        kea-at-roe.ac.uk
AstroGrid Project   http://www.astrogrid.org
IfA, Royal Observatory, Blackford Hill, Edinburgh EH9 3HJ
Received on 2007-03-01Z15:10:23