RE: Collaboration on Source Catalogue DM, ADQL and SkyNodes -- "Spatial is special"

From: Jim Gray <Jim.Gray-at-microsoft.com>
Date: Wed, 21 Dec 2005 09:55:33 -0800


The paradox for me, being a database guy, is that Rtrees seem to lose out to the zone algorithm for crossmatch and seem to lose out to HTM for point-near-point or point-in-polygon. And the footprint queries (polygon-overlaps-polygon) seem to want the algebraic approach (halfspace intersections and then simplification). One might think that Rtrees would beat HTM, but they seem to have problems with the poles, whereas the Cartesian representation of points avoids transcendentals.
And Healpix and Igloo have their place in this "space". As they say: "Spatial is special."
It seems to have a special access method for each application, and every researcher seems to have a favorite access method or organization.
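
To make the zone idea and the Cartesian point concrete, here is a minimal Python sketch, not the SQL implementation discussed in this thread. It assumes each catalogue is a plain list of (RA, Dec) pairs in degrees, and that the zone height equals the match radius. The names unit_vector, zone_id and crossmatch are made up for illustration. Points are bucketed into declination zones, and candidates are compared with a dot product of unit vectors, so the inner loop needs no trigonometric calls and behaves the same at the poles and across RA = 0. A full zone implementation would also restrict the RA range inside each zone; this sketch scans the whole zone.

```python
import math
from collections import defaultdict


def unit_vector(ra_deg, dec_deg):
    """Convert (RA, Dec) in degrees to a Cartesian unit vector."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))


def zone_id(dec_deg, zone_height_deg):
    """Index of the declination stripe (zone) containing dec_deg."""
    return int(math.floor((dec_deg + 90.0) / zone_height_deg))


def crossmatch(cat_a, cat_b, radius_deg):
    """Return sorted (i, j) index pairs separated by at most radius_deg."""
    # Assumption: zone height == match radius, so any match lies in the
    # same zone or in one of the two neighbouring zones.
    zone_height = radius_deg
    cos_radius = math.cos(math.radians(radius_deg))

    # Bucket catalogue B by zone, storing precomputed unit vectors.
    zones = defaultdict(list)
    for j, (ra, dec) in enumerate(cat_b):
        zones[zone_id(dec, zone_height)].append((j, unit_vector(ra, dec)))

    matches = []
    for i, (ra, dec) in enumerate(cat_a):
        ax, ay, az = unit_vector(ra, dec)
        z = zone_id(dec, zone_height)
        for neighbour in (z - 1, z, z + 1):
            for j, (bx, by, bz) in zones.get(neighbour, ()):
                # Separation <= radius  <=>  dot product >= cos(radius).
                if ax * bx + ay * by + az * bz >= cos_radius:
                    matches.append((i, j))
    return sorted(matches)
```

For example, crossmatch([(10.0, 20.0)], [(10.0002, 20.0001), (180.0, -45.0)], 0.001) pairs the first point with its near neighbour and ignores the distant one.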

-----Original Message-----
From: owner-voql-at-eso.org [mailto:owner-voql-at-eso.org] On Behalf Of Clive Page
Sent: Wednesday, December 21, 2005 8:04 AM
To: voql-at-ivoa.net
Subject: RE: Collaboration on Source Catalogue DM, ADQL and SkyNodes

On Wed, 21 Dec 2005, Jim Gray wrote:

> There is also the cross match that Tanu & Maria implemented in SQL.
> That may be the easiest way to do the implementation if the nodes all
> have a SQL backend.

I'd like to point out that if the data are already in a relational DBMS then by far the simplest way to do the cross-match, and in many cases also the fastest, is to use R-tree indexing and a spatial join. I think the first astronomical use of this was by Andrea Baruffolo (see http://monet.ncsa.uiuc.edu/adass98/Proceedings/baruffoloa1/ ) but it has also been extensively tested here and documented on the AstroGrid wiki, see: http://wiki.astrogrid.org/bin/view/Astrogrid/DataDocs

Support for spatial indexing is now included in or readily available for DB2, Oracle, Informix, Sybase, MySQL, and Postgres, i.e. just about all the DBMS widely used in astronomy (with perhaps just one exception, which Jim can tell you about :-).

> But, getting objects into a node dominates all other costs (moving
> stuff thru xml is expensive).

Indeed that is a very serious problem. I wonder if we can't solve this by using, instead of XML, some more efficient data format, e.g. one which holds tabular data in binary form with just the metadata in plain text.
There's something called the "FITS table" with exactly these properties which perhaps astronomers should investigate :-)
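
As a rough illustration of that idea (this is not the FITS format itself), the following Python sketch packs a table as a short plain-text header followed by fixed-width binary rows. The layout, the JSON header and the function names pack_table and unpack_table are all assumptions made for this example.

```python
import json
import struct


def pack_table(columns, rows):
    """Serialize rows as a text header plus fixed-width binary records.

    columns: list of (name, struct format character), e.g. ("ra", "d").
    rows:    list of tuples, one value per column.
    """
    # Metadata stays human-readable; json.dumps emits pure ASCII by default.
    header = json.dumps({"columns": columns, "nrows": len(rows)}).encode("ascii")
    # Big-endian, no padding: every row has the same byte length.
    record = struct.Struct(">" + "".join(fmt for _, fmt in columns))
    body = b"".join(record.pack(*row) for row in rows)
    # A 4-byte length prefix tells the reader where the header ends.
    return struct.pack(">I", len(header)) + header + body


def unpack_table(blob):
    """Inverse of pack_table: return (metadata dict, list of row tuples)."""
    (header_len,) = struct.unpack_from(">I", blob, 0)
    meta = json.loads(blob[4:4 + header_len].decode("ascii"))
    record = struct.Struct(">" + "".join(fmt for _, fmt in meta["columns"]))
    start = 4 + header_len
    rows = [record.unpack_from(blob, start + k * record.size)
            for k in range(meta["nrows"])]
    return meta, rows
```

With columns [("id", "q"), ("ra", "d"), ("dec", "d")] each row takes 24 bytes regardless of how many digits the values have, and reading it back needs no text parsing of the numbers.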

Merry Christmas (translation for those in countries where Christianity is not the established religion: Happy Holidays).

--
Clive Page
Dept of Physics & Astronomy,
University of Leicester,
Leicester, LE1 7RH,  U.K.
Received on 2005-12-21Z18:56:50