Here is a brief description of how we implement geometry querying (REGION). We are using a clustered DB2 database for this (8 nodes, 16 partitions at the moment).
Spatial querying is implemented using a feature of DB2 called Multi Dimensional Clustering (MDC), which is essentially a multi-dimensional clustered index. A simple example:
table positions (
-- the real columns:
ra double precision not null, dec double precision not null,
-- in cartesian coordinates:
x double precision not null, y double precision not null, z double precision not null,
Then we have to transform something like REGION(a, b, r, ICRS) via:
WHERE x between tx-r and tx+r
AND y between ty-r and ty+r
AND z between tz-r and tz+r
DB2 knows that the binning columns are generated from x,y,z so it can use the binning columns to optimise the search. For points we can make refine the SQL to be a sphere intersecting the unit sphere (equivalent to a circle on the sky) by sticking some extra conditons into the SQL, such as:
AND ((tx-x)*(tx-x) + (ty-y)*(ty-y)+ (tz-z)*(tz-z)) < r*r
or we can do some post-filtering in software to throw away some edge cases that get past the first cube conditions. In practice this is what we do for observations because instead of ra,dec we have a polygon, instead of x,y,z we have a bounding cube, and instead of the simple distance computation we have to do full geometric intersection, which I didn't want to put in the DB.
So, our approach is a combination of replacing REGION(...) with some other simple SQL conditions and some post-filtering of the results to remove (literally) edge cases.
As a point of interest, this approach is taking advantage of the parallel query processing in our clustered DB2 system and the fact that using MDC (scan bins instead of whole table) makes the query performance dependent more on bandwidth (disk/memory/cpu throughout) rather than latency (disk seek time), which just doesn't scale all that well.... a high performance but totally brute force approach with minimal software complexity :-)
We also use MDC to "index" spectral and temporal intervals and the related intersection kind of conditions.
--
Patrick Dowler
Tel/Tél: (250) 363-6914 | fax/télécopieur: (250) 363-0045
Canadian Astronomy Data Centre | Centre canadien de donnees astronomiques
National Research Council Canada | Conseil national de recherches Canada
Government of Canada | Gouvernement du Canada 5071 West Saanich Road | 5071, chemin West Saanich Victoria, BC | Victoria (C.-B.)Received on 2006-11-23Z21:39:20