Epistemology of the VO

From: Roy Williams <roy-at-cacr.caltech.edu>
Date: Sun, 30 May 2004 10:05:34 -0700


Please find below some Sunday morning musings on the different mindsets in the VO (I count four). Are there more, or fewer? Does this analysis get us any closer to our goals? Does this text have a place in the architecture document? Roy



California Institute of Technology
roy-at-caltech.edu
626 395 3670

Epistemology of the Virtual Observatory

Here we consider the ways of thinking of our VO clients. They might come to the VO with one or more of the following mindsets, described below: Hyper-spatial, Relational Table, Object Data Model, or Semantic.

In the VO, the Cone Search *request* belongs to the Hyper-spatial class -- although the Cone Search *response* is a table (next section). The Simple Image Access request is also in this "spatial" class, as is the Spectral Access request.
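
As an illustration of the spatial request, here is a minimal sketch of how a client might build a Cone Search URL. It assumes the usual RA, DEC and SR query parameters in decimal degrees; the base URL and the function name cone_search_url are hypothetical and stand in for a real service.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a Cone Search request URL from a sky position and radius (degrees)."""
    if not 0.0 <= ra_deg < 360.0:
        raise ValueError("RA must be in [0, 360) degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("Dec must be in [-90, 90] degrees")
    if radius_deg < 0.0:
        raise ValueError("search radius must be non-negative")
    # The request is purely spatial: a point on the sky plus a radius.
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + query


# Hypothetical service address, used only for illustration.
url = cone_search_url("http://example.org/cone", 180.0, -30.5, 0.25)
print(url)
```

Note that nothing in the request mentions columns or tables; those only appear in the response.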

Much of the visualization work uses the sky coordinates (RA, Dec) as its first guiding principle -- Aladin, Oasis, Image Cutout and Virtual Sky. There is work in the VO to build a schema that accurately defines spatial regions, leading to the VORegion standard that allows precise definitions of subsets of the (hyper)sky.

Queries are built as boolean combinations of predicates on the attribute names of the columns. The VO has built ADQL, a standardized variant of SQL with extra features added.
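
To make "boolean combinations of predicates" concrete, here is a small sketch in which rows are plain dictionaries and a query is a function composed from simpler predicates. This is not ADQL; the helper names (lt, gt, all_of, any_of) and the sample rows are invented for illustration.

```python
def lt(column, value):
    """Predicate: the named column is less than value."""
    return lambda row: row[column] < value


def gt(column, value):
    """Predicate: the named column is greater than value."""
    return lambda row: row[column] > value


def all_of(*predicates):
    """Boolean AND of predicates."""
    return lambda row: all(p(row) for p in predicates)


def any_of(*predicates):
    """Boolean OR of predicates."""
    return lambda row: any(p(row) for p in predicates)


# Made-up sample rows; column names are the attribute names of the table.
rows = [
    {"name": "A", "ra": 10.0, "dec": 5.0, "mag": 14.2},
    {"name": "B", "ra": 200.0, "dec": -40.0, "mag": 18.9},
    {"name": "C", "ra": 12.5, "dec": 7.5, "mag": 21.0},
]

# Roughly: WHERE (ra < 20 AND mag < 20) OR dec < -30
where = any_of(all_of(lt("ra", 20.0), lt("mag", 20.0)), lt("dec", -30.0))
selected = [row["name"] for row in rows if where(row)]
print(selected)
```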

In the table model, the Row object is thought to be a sequence of primitive types (integer, float, etc.), and the RowSchema defines this sequence. The RowSchema is limited in complexity: by keeping the table cells very simple in structure (i.e. just fixed-format primitives), we maximize the thrust and utility of general tools.
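
A rough sketch of the Row and RowSchema idea follows. The class and method names are hypothetical, chosen to mirror the terms above rather than any VO library; the point is only that a schema is a fixed sequence of (name, primitive type) pairs and a row is a matching sequence of values.

```python
from typing import NamedTuple


class Column(NamedTuple):
    name: str
    type: type  # a primitive type such as int, float or str


class RowSchema:
    """A fixed sequence of primitive-typed columns (hypothetical sketch)."""

    def __init__(self, columns):
        self.columns = tuple(columns)

    def validate(self, row):
        """Return the row as a tuple if it matches the schema, else raise."""
        if len(row) != len(self.columns):
            raise ValueError("row length does not match schema")
        for column, cell in zip(self.columns, row):
            # Exact type match keeps cells strictly primitive.
            if type(cell) is not column.type:
                raise TypeError(f"column {column.name!r} expects {column.type.__name__}")
        return tuple(row)

    def as_dict(self, row):
        """Pair each validated cell with its column name."""
        cells = self.validate(row)
        return {column.name: cell for column, cell in zip(self.columns, cells)}


schema = RowSchema([Column("id", int), Column("ra", float), Column("dec", float)])
record = schema.as_dict((1, 10.5, -3.0))
print(record)
```

Because every cell is a primitive, a generic tool can sort, filter or plot any such table knowing only the schema.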

VOTable is built on the Relational model. In VOTable, cells can be variable-length, multidimensional arrays of primitives. The OpenSkyQuery protocols are also Relational in their mindset -- what is the set of tables, what are their attributes, here is an SQL query.
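
The following sketch reads a deliberately simplified VOTable-like document to show how a cell can hold a variable-length array. It assumes the familiar FIELD / TABLEDATA / TR / TD layout with arraysize="*" marking a variable-length array, but omits the XML namespace, version attributes and string (char) arrays that a real VOTable would involve; the sample data is invented.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free document; real VOTables carry more metadata.
VOTABLE_TEXT = """<VOTABLE>
  <RESOURCE>
    <TABLE>
      <FIELD name="id" datatype="int"/>
      <FIELD name="flux" datatype="float" arraysize="*"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>1</TD><TD>0.5 0.7 0.9</TD></TR>
          <TR><TD>2</TD><TD>1.25</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

# Only numeric datatypes are handled in this sketch.
CASTS = {"int": int, "float": float, "double": float}


def read_rows(text):
    """Return the table rows as dictionaries keyed by FIELD name."""
    root = ET.fromstring(text)
    fields = root.findall(".//FIELD")
    rows = []
    for tr in root.findall(".//TR"):
        row = {}
        for field, td in zip(fields, tr.findall("TD")):
            cast = CASTS[field.get("datatype")]
            content = td.text or ""
            if field.get("arraysize") is None:
                row[field.get("name")] = cast(content)  # scalar cell
            else:
                # Array cell: whitespace-separated primitives.
                row[field.get("name")] = [cast(token) for token in content.split()]
        rows.append(row)
    return rows


table_rows = read_rows(VOTABLE_TEXT)
print(table_rows)
```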

XML Schema has been a rich language for the VO in communicating the object mindset. Software is available to take an object collection that is expressed in XML Schema, and convert it to client-server code so that the logic of the object can be implemented (stub code), and so the object can be remoted (SOAP).

The VO Registry is built with an object model, so that every resource in the registry has a basic view (e.g. Dublin Core), and then can be specialized to more complex metadata descriptions (e.g. where in the sky).
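
The basic-view-plus-specialization idea might be sketched with two classes, as below. This is not the Registry schema: the class names and the sky-coverage fields are assumptions made for illustration, and only identifier, title and creator are borrowed from the Dublin Core element names.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Basic view shared by every resource (a few Dublin Core style fields)."""
    identifier: str
    title: str
    creator: str

    def summary(self):
        return f"{self.title} ({self.identifier})"


@dataclass
class SkyService(Resource):
    """Hypothetical specialization that also says where in the sky it applies."""
    ra_deg: float
    dec_deg: float
    radius_deg: float

    def summary(self):
        # Extend, rather than replace, the basic view.
        return (f"{super().summary()}, covering {self.radius_deg} deg "
                f"around RA {self.ra_deg}, Dec {self.dec_deg}")


# Invented example entries.
generic = Resource("ivo://example/archive", "Example Archive", "Example Org")
survey = SkyService("ivo://example/survey", "Example Survey", "Example Org",
                    150.0, 2.2, 1.5)

for resource in (generic, survey):
    print(resource.summary())
```

A client that only understands the basic view can still list both entries, while a spatially aware client can use the extra fields.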

The VO has been working to promote the "Unified Content Descriptor" (UCD) project, which has built a formal language of semantic types for astronomical data. For example "phys.mass" means that the quantity is a physical mass. There are services to explain or compare UCDs. UCDs can fulfill the role of names in a query, or act as an intermediate lingua franca when diverse relational databases are federated. In this sense, UCD acts as a generic data model for all of astronomy.
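
A toy sketch of UCDs as a lingua franca: two tables use different column names, but each column is tagged with a UCD, so one query phrased in UCDs can be answered by both. The table contents, column names and helper functions are invented; "phys.mass" is the example from the text and "pos.eq.ra" is used here as a second tag. Units are ignored, which a real federation could not do.

```python
# Each made-up catalog maps its own column names to UCDs.
CATALOG_A = {
    "ucds": {"m_star": "phys.mass", "alpha": "pos.eq.ra"},
    "rows": [{"m_star": 1.0, "alpha": 10.0}, {"m_star": 2.5, "alpha": 11.0}],
}
CATALOG_B = {
    "ucds": {"Mass": "phys.mass", "RAJ2000": "pos.eq.ra"},
    "rows": [{"Mass": 0.8, "RAJ2000": 200.0}],
}


def select_by_ucd(table, ucd):
    """Return the values of the first column tagged with the given UCD."""
    columns = [name for name, tag in table["ucds"].items() if tag == ucd]
    if not columns:
        return []  # this table has no column with that meaning
    return [row[columns[0]] for row in table["rows"]]


def federate(tables, ucd):
    """Collect one semantic quantity across tables with different schemas."""
    values = []
    for table in tables:
        values.extend(select_by_ucd(table, ucd))
    return values


masses = federate([CATALOG_A, CATALOG_B], "phys.mass")
print(masses)
```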

  1. http://iraf.noao.edu/projects/vo/dal/datamodel.html
  2. sorry, I can't find this. It was a proposal from CfA to extend cone search into wavelength, time, etc.
  3. http://jvo.nao.ac.jp/Documents/VOQL-yshirasa-2004-05-v0-5.pdf
Received on 2004-05-30Z17:06:01