6/5/02.
Francois, Roy, et al.,
This message is just a few thoughts prompted by the recent comments on the VOTable document from Ian Evans and the CfA VO Group. I thought that most of the comments were sound and well-made. In particular the remarks about representing null values made sense. A few further specific thoughts on the `General Comments' section follow.
> (2) Is the name VOTable too narrowly focused?
My understanding was that the VOTable was deliberately focussed on the limited problem of tabular data because (a) this was needed most immediately and (b) by limiting the scope of the problem we could achieve rapid progress. I see the VOTable format as a mechanism for representing subsets extracted from astronomical catalogues, and I think it makes sense to restrict it to this purpose, rather than trying to `creep' its specification to allow the representation of other sorts of data.
Conversely, the VO will indeed return other sorts of data, as well as catalogue extracts, and we'll need some format to represent these. We've not discussed such a format at all yet, but I'd implicitly assumed we'd invent a `VOBulkData' format for handling `bulk data' (spectra, images, data cubes etc.), which are essentially n-dimensional arrays. It would seem to make sense for it to be along similar lines to the VOTable: an XML-based scheme with alternatives to represent the basic arrays either in-line using XML tags or externally as FITS images or other binary files.
Some astronomical datasets are more complicated than simple tables or images of course, and there are cases where it is useful to be able to define hierarchical structures optionally containing a mixture of bulk data, tables and auxiliary items. XML is well-suited to this sort of thing and there could be a `VOHDF' (VO hierarchical data format) element which optionally has VOTable and VOBulkData elements as its children.
At the end of this note I include a few informal examples of what the VOBulkData and VOHDF elements might look like.
Starlink has used hierarchical data formats for many years (albeit not in the context of XML), and they can be very powerful and flexible ways of representing astronomical data. However, they need to be used in a controlled fashion, with standard structures agreed, if they are to be effective and useful. These are things that we should think carefully about, rather than rushing into.
> In the FITS world the wish has often been expressed that column
> and keyword be treated as the same kind of beast ...
I think that the underlying requirement here is that column and keyword names should occupy the same name-space. If this is the case then it is possible to support expressions (for both selection and projection) involving both columns and keywords. For example (and using the VOTable notation), if a VOTable includes FIELDs (i.e. columns) a, b and c and PARAMs (i.e. keywords) p and q then:
a + (2.0*b) + (p/4.0) > c - sin(q)
is a valid expression. The CURSA table-handling package provides this feature (and doubtless other packages do too), and it seems to work fine.
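The shared name-space idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how CURSA or any other package implements it: the function name `evaluate`, the sample values, and the use of Python's `eval` are all hypothetical choices made for brevity. PARAMs contribute one value per table, FIELDs one value per row, and a name used for both is rejected.

```python
import math


def evaluate(expression, params, row):
    """Evaluate an expression in one name-space shared by PARAMs and FIELDs."""
    # A shared name-space only works if no name is both a FIELD and a PARAM.
    clash = set(params) & set(row)
    if clash:
        raise ValueError(f"names used as both FIELD and PARAM: {sorted(clash)}")
    namespace = {"sin": math.sin, **params, **row}
    # eval() keeps the sketch short; a real package would parse the expression.
    return eval(expression, {"__builtins__": {}}, namespace)


# Hypothetical values: PARAMs (keywords) apply to the whole table.
params = {"p": 8.0, "q": 0.0}
# Hypothetical rows: FIELDs (columns) take one value per row.
rows = [
    {"a": 1.0, "b": 2.0, "c": 3.0},
    {"a": 0.0, "b": 0.0, "c": 9.0},
]

expression = "a + (2.0*b) + (p/4.0) > c - sin(q)"
selected = [row for row in rows if evaluate(expression, params, row)]
```

Here the expression is used for selection: only rows for which it is true are kept. The same `evaluate` call could equally compute a new column (projection) from an arithmetic expression.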
regards,
Clive.
The following are outline examples of VOBulkData and VOHDF elements. They are presented informally and just intended to show how the elements might work. They are certainly not complete or fully thought-out specifications. The VOBulkData element is quite closely modelled on the VOTable.
<VOBULKDATA>
  <RESOURCE>
    <DESCRIPTION>                 An optional free-text description.
      ...
    </DESCRIPTION>
    <PARAM ID="param1" ...        Auxiliary parameters (keywords),
    </PARAM>                      identical to VOTable PARAMs.
    <PARAM ID="param2" ...
    </PARAM>

                                  The RESOURCE contains one or more
                                  data ARRAYs, allowing related
                                  arrays to be grouped. Optionally
                                  each array may be assigned a role
                                  (primary data, statistical
                                  variance etc.)
    <ARRAY ID= name=
      role= primary | variance | quality flags ...
      ndim="3"                    Dimensionality.
      dims="512,512,3"            Size in each dimension.
      datatype="float"
    >
      <AXIS dim="1"               Details of each axis.
        label="X axis"
        unit="pixels"
        ...
      >
      </AXIS>
      <AXIS dim="2"
        label="Y axis"
        unit="pixels"
        ...
      >
      </AXIS>
      <ARRAYDATA>
        Actual n-dimensional array of data, with options for pure XML,
        wrapping a FITS image or a simple binary representation.
      </ARRAYDATA>
    </ARRAY>
  </RESOURCE>
</VOBULKDATA>
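To make the outline a little more concrete, the following Python sketch reads a small, well-formed instance of it and checks that `ndim` agrees with `dims`. It assumes the element and attribute names of the outline above; the sample document, the `name` and `value` attributes on PARAM, and the function `describe_arrays` are hypothetical and exist only for illustration. All attribute values are quoted, as XML requires.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance of the outline; ARRAYDATA is left empty.
SAMPLE = """
<VOBULKDATA>
  <RESOURCE>
    <DESCRIPTION>Hypothetical three-plane image.</DESCRIPTION>
    <PARAM ID="param1" name="EXPTIME" value="30.0"/>
    <ARRAY ID="a1" name="image" role="primary"
           ndim="3" dims="512,512,3" datatype="float">
      <AXIS dim="1" label="X axis" unit="pixels"/>
      <AXIS dim="2" label="Y axis" unit="pixels"/>
      <ARRAYDATA/>
    </ARRAY>
  </RESOURCE>
</VOBULKDATA>
"""


def describe_arrays(xml_text):
    """Summarise each ARRAY and check that ndim matches the dims list."""
    root = ET.fromstring(xml_text)
    summaries = []
    for array in root.iter("ARRAY"):
        dims = [int(n) for n in array.get("dims").split(",")]
        ndim = int(array.get("ndim"))
        if len(dims) != ndim:
            raise ValueError(f"ARRAY {array.get('ID')}: ndim={ndim} but dims={dims}")
        elements = 1
        for n in dims:
            elements *= n
        # Map each described axis number to its label.
        axes = {int(a.get("dim")): a.get("label") for a in array.findall("AXIS")}
        summaries.append({
            "id": array.get("ID"),
            "role": array.get("role"),
            "dims": dims,
            "elements": elements,
            "axes": axes,
        })
    return summaries


arrays = describe_arrays(SAMPLE)
```

Carrying both `ndim` and `dims` is redundant, so a reader has to decide what to do when they disagree; this sketch simply rejects the array.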
It will probably be necessary to have some WCS mechanism to allow pixel indices to be related to physical coordinates. But, on the basis of previous experience, perhaps we should tiptoe quietly away from this problem for the time being. :-)
A VOHDF element might look something like:
<VOHDF>
  <DESCRIPTION>                   An optional free-text description.
    ...
  </DESCRIPTION>
  <VOTABLE>                       Zero or more VOTable elements.
    ...
  </VOTABLE>
  <VOBULKDATA>                    Zero or more VOBulkData elements.
    ...
  </VOBULKDATA>
  <VOHDF>                         Zero or more VOHDF elements (thus allowing
    ...                           arbitrary structures to be constructed).
  </VOHDF>
</VOHDF>
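Because a VOHDF element may contain further VOHDF elements, any reader of such a structure is naturally recursive. The Python sketch below walks a small hypothetical instance and produces an indented outline of its contents. The sample document and the function `outline` are assumptions made for illustration; the child elements are left empty.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested structure; VOTABLE and VOBULKDATA are left empty.
SAMPLE = """
<VOHDF>
  <DESCRIPTION>Hypothetical observation bundle.</DESCRIPTION>
  <VOTABLE/>
  <VOBULKDATA/>
  <VOHDF>
    <DESCRIPTION>Nested calibration data.</DESCRIPTION>
    <VOBULKDATA/>
    <VOBULKDATA/>
  </VOHDF>
</VOHDF>
"""


def outline(node, depth=0):
    """Return an indented list of lines describing a VOHDF element."""
    description = (node.findtext("DESCRIPTION") or "").strip()
    lines = ["  " * depth + f"VOHDF: {description}"]
    for child in node:
        if child.tag == "VOHDF":
            # Nested structures are handled by the same function.
            lines.extend(outline(child, depth + 1))
        elif child.tag in ("VOTABLE", "VOBULKDATA"):
            lines.append("  " * (depth + 1) + child.tag)
    return lines


tree_lines = outline(ET.fromstring(SAMPLE))
```

Printing `"\n".join(tree_lines)` gives a quick view of what a given structure contains, which is one simple way to check it against whatever standard structures have been agreed.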
Clive Davenhall
Institute for Astronomy,        e-mail (internet, JANET): acd @ roe.ac.uk
Royal Observatory Edinburgh,    fax from within the UK: 0131-668-8416
Blackford Hill, Edinburgh,      fax from overseas: +44-131-668-8416
EH9 3HJ, Scotland.

Received on 2002-05-13Z07:01:23