Re: CfA VO Group comments on the VOTable document

From: shaya <Edward.J.Shaya.1-at-gsfc.nasa.gov>
Date: Mon, 06 May 2002 13:44:17 -0400


Clive,

    What you describe here is close to the thinking that led us to develop XDF soon after the ASTRORES specification was developed. ASTRORES and VOTable are quick and easy experimental methods of exchanging simple tables, but containing all of the complex types of data that scientists deal with requires a more sophisticated design. It is also best to design a layer that is not astronomy-specific and then to inherit from it in a layer that is. We have spent 3 years developing UML data models, a rich API, Java and Perl packages based on the API, and XML Schema work. We have also been using it extensively for the last two years at the ADC, so it is well tested. It is presently the XML specification that CCSDS (Consultative Committee for Space Data Systems) is putting forward for all space agencies' mission data. Recently we presented a Telemetry Markup Language to the "CCSDS Working Group on XML" so that telemetry and science data can have the same underlying mechanisms.

I don't know if you looked at XDF before you wrote this e-mail, but your strawman array and parameter elements are right in line with the XDF way of doing things. We can discuss some details that we would do slightly differently (such as putting each dimension size on the relevant axis element), but I think we are in fairly good agreement on the basic approach and the need to start working out a VO standard. Perhaps we need a working group specifically for this.
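
To illustrate that one detail (a size on each axis element rather than a single dims attribute on the array), here is a minimal Python sketch. The element names, the "size" attribute and the "Band" label are made up for the example; they are not taken from the XDF DTD or from the strawman.

```python
import xml.etree.ElementTree as ET

# Hypothetical element and attribute names, for illustration only;
# this is not the XDF schema.
def array_with_axis_sizes(sizes, labels):
    """Build an ARRAY element whose AXIS children each carry their own size."""
    array = ET.Element("ARRAY", {"datatype": "float"})
    for label, size in zip(labels, sizes):
        ET.SubElement(array, "AXIS", {"label": label, "size": str(size)})
    return array

def shape_of(array):
    """Recover the array shape from the per-axis sizes, in document order."""
    return tuple(int(axis.get("size")) for axis in array.findall("AXIS"))

array = array_with_axis_sizes([512, 512, 3], ["X axis", "Y axis", "Band"])
print(ET.tostring(array, encoding="unicode"))
print(shape_of(array))
```

With the size stored on the axis, the dimensionality is simply the number of AXIS children, so there is no separate count that can fall out of step with the list of sizes.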

If you have not seen XDF, here are some links:

XDF Home:            http://xml.gsfc.nasa.gov/XDF/XDF_home.html
Version 17 diagram:  http://xml.gsfc.nasa.gov/XDF/XDF_017.jpg
API:                 http://xml.gsfc.nasa.gov/XDF/Java/API/
UML:                 http://xml.gsfc.nasa.gov/XDF/UML-XDF-Core.gif (recommend IE to view UML gif)

Two languages that inherit from XDF:

FITSML:              http://xml.gsfc.nasa.gov/DTD/FITSML_dtd.txt
CDFML:               http://nssdc.gsfc.nasa.gov/nssdc_news/dec01/cdf.html

Cheers,
Ed  

Clive Davenhall wrote:

>6/5/02.
>
>Francois, Roy, et al,
>
>This message is just a few thoughts prompted by the recent comments
>on the VOTable document from Ian Evans and the CfA VO Group. I thought
>that most of the comments were sound and well-made. In particular the
>remarks about representing null values made sense. A few further
>specific thoughts on the `General Comments' section follow.
>
>
>
>>(2) Is the name VOTable too narrowly focused?
>>
>>
>
>My understanding was that the VOTable was deliberately focussed on the
>limited problem of tabular data because (a) this was needed most
>immediately and (b) by limiting the scope of the problem we could achieve
>rapid progress. I see the VOTable format as a mechanism for representing
>subsets extracted from astronomical catalogues, and I think it makes
>sense to restrict it to this purpose, rather than trying to `creep' its
>specification to allow the representation of other sorts of data.
>
>Conversely, the VO will indeed return other sorts of data, as well as
>catalogue extracts, and we'll need some format to represent these. We've
>not discussed such a format at all yet, but I'd implicitly assumed we'd
>invent a `VOBulkData' format for handling `bulk data' (spectra, images,
>data cubes etc), which are essentially n-dimensional arrays. It would
>seem to make sense for it to be along similar lines to the VOTable: an XML
>based scheme with alternatives to represent the basic arrays either in-line
>using XML tags or externally as FITS images or other binary files.
>
>Some astronomical datasets are more complicated than simple tables or
>images of course, and there are cases where it is useful to be able to
>define hierarchical structures optionally containing a mixture of bulk
>data, tables and auxiliary items. XML is well-suited to this sort of thing
>and there could be a `VOHDF' (VO hierarchical data format) element which
>optionally has VOTable and VOBulkData elements as its children.
>
>At the end of this note I include a few informal examples of what the
>VOBulkData and VOHDF elements might look like.
>
>Starlink has used hierarchical data formats for many years (albeit not in
>the context of XML), and they can be very powerful and flexible ways of
>representing astronomical data. However, they need to be used in a
>controlled fashion, with standard structures agreed, if they are to be
>effective and useful. These are things that we should think carefully
>about, rather than rushing into.
>
>
>
>>In the FITS world the wish has often been expressed that column
>>and keyword be treated as the same kind of beast ...
>>
>>
>
>I think that the underlying requirement here is that column and keyword
>names should occupy the same name-space. If this is the case then it is
>possible to support expressions (for both selection and projection)
>involving both columns and keywords. For example (and using the VOTable
>notation), if a VOTable includes FIELDs (i.e. columns) a, b and c and PARAMs
>(i.e. keywords) p and q then:
>
> a + (2.0*b) + (p/4.0) > c - sin(q)
>
>is a valid expression. The CURSA table-handling package provides this
>feature (and doubtless other packages do too), and it seems to work fine.
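
A minimal Python sketch of the shared name-space idea follows. It is not how CURSA does it; the sample values, the select_rows helper and the use of Python's eval are all assumptions made for illustration. FIELD and PARAM names are merged into one lookup table, and a name used for both is rejected.

```python
import math

# Hypothetical data: FIELDs are per-row columns, PARAMs are table-wide keywords.
fields = {"a": [1.0, 5.0], "b": [2.0, 0.5], "c": [4.0, 9.0]}
params = {"p": 4.0, "q": 0.0}

def select_rows(expression, fields, params):
    """Return the indices of the rows for which the expression is true."""
    clash = set(fields) & set(params)
    if clash:
        raise ValueError(f"names used for both a FIELD and a PARAM: {sorted(clash)}")
    code = compile(expression, "<expression>", "eval")
    nrows = len(next(iter(fields.values())))
    selected = []
    for i in range(nrows):
        # One name-space per row: PARAM values plus this row's FIELD values.
        namespace = {"sin": math.sin, **params}
        namespace.update({name: column[i] for name, column in fields.items()})
        # eval is used for brevity only; it is not safe for untrusted input.
        if eval(code, {"__builtins__": {}}, namespace):
            selected.append(i)
    return selected

print(select_rows("a + (2.0*b) + (p/4.0) > c - sin(q)", fields, params))
```
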
>
>regards,
>Clive.
>
>-----------------------------------------------------------------------------
>
>The following are outline examples of VOBulkData and VOHDF elements.
>They are presented informally and just intended to show how the elements
>might work. They are certainly not complete or fully thought-out
>specifications. The VOBulkData element is quite closely modelled on the
>VOTable.
>
><VOBULKDATA>
>  <RESOURCE>
>    <DESCRIPTION>              An optional free-text description.
>      ...
>    </DESCRIPTION>
>
>    <PARAM ID="param1" ...     Auxiliary parameters (keywords),
>    </PARAM>                   identical to VOTable PARAMs.
>
>    <PARAM ID="param2" ...
>    </PARAM>
>                               The RESOURCE contains one or more
>                               data ARRAYs, allowing related
>                               arrays to be grouped. Optionally
>                               each array may be assigned a role
>                               (primary data, statistical
>                               variance etc.)
>    <ARRAY ID= name=
>      role= primary | variance | quality flags ...
>      ndim="3"                 Dimensionality.
>      dims="512,512,3"         Size in each dimension.
>      datatype="float"
>    >
>      <AXIS dim=1              Details of each axis.
>        label="X axis"
>        unit="pixels"
>        ...
>      >
>      </AXIS>
>
>      <AXIS dim=2
>        label="Y axis"
>        unit="pixels"
>        ...
>      >
>      </AXIS>
>
>      <ARRAYDATA>
>        Actual n-dimensional array of data, with options for pure XML,
>        wrapping a FITS image or a simple binary representation.
>      </ARRAYDATA>
>    </ARRAY>
>
>  </RESOURCE>
>
></VOBULKDATA>
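
As a rough illustration of how a reader might use the ndim and dims attributes in the outline above, here is a Python sketch that checks them against each other and maps an n-dimensional index onto a flat data array. The outline does not say how the array is ordered; the sketch assumes zero-based indices with the first axis varying fastest (the FITS convention), and the function names are hypothetical.

```python
from math import prod

def parse_shape(ndim, dims):
    """Parse attributes such as ndim="3" and dims="512,512,3" into a shape tuple."""
    shape = tuple(int(size) for size in dims.split(","))
    if len(shape) != int(ndim):
        raise ValueError(f"ndim={ndim} but dims lists {len(shape)} sizes")
    return shape

def flat_offset(indices, shape):
    """Offset of an element in the flat array.

    Assumes zero-based indices and the first axis varying fastest.
    """
    if len(indices) != len(shape):
        raise ValueError("one index is needed per axis")
    offset, stride = 0, 1
    for index, size in zip(indices, shape):
        if not 0 <= index < size:
            raise IndexError(f"index {index} out of range for axis of size {size}")
        offset += index * stride
        stride *= size
    return offset

shape = parse_shape("3", "512,512,3")
print(shape, prod(shape))             # shape and total number of elements
print(flat_offset((1, 2, 0), shape))  # offset of element x=1, y=2, plane 0
```
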
>
>It will probably be necessary to have some WCS mechanism to allow pixel
>indices to be related to physical coordinates. But, on the basis of
>previous experience, perhaps we should tiptoe quietly away from this
>problem for the time being. :-)
>
>
>A VOHDF element might look something like:
>
><VOHDF>
>  <DESCRIPTION>      An optional free-text description.
>    ...
>  </DESCRIPTION>
>
>  <VOTABLE>          Zero or more VOTable elements.
>    ...
>  </VOTABLE>
>
>  <VOBULKDATA>       Zero or more VOBulkData elements.
>    ...
>  </VOBULKDATA>
>
>  <VOHDF>            Zero or more VOHDF elements (thus allowing
>    ...              arbitrary structures to be constructed).
>  </VOHDF>
></VOHDF>
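
Because a VOHDF element may contain further VOHDF elements, any reader has to walk the structure recursively. The Python sketch below shows one way to do that, assuming the nesting in the outline above; the sample document and the count_contents helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample following the VOHDF outline; the children are left empty.
SAMPLE = """
<VOHDF>
  <DESCRIPTION>Hypothetical observation bundle</DESCRIPTION>
  <VOTABLE/>
  <VOBULKDATA/>
  <VOHDF>
    <VOBULKDATA/>
    <VOBULKDATA/>
  </VOHDF>
</VOHDF>
"""

def count_contents(vohdf, depth=0):
    """Return (tables, bulk data elements, deepest nesting level) for a VOHDF tree."""
    tables = len(vohdf.findall("VOTABLE"))
    bulk = len(vohdf.findall("VOBULKDATA"))
    deepest = depth
    # findall with a bare tag matches direct children only, so recurse by hand.
    for child in vohdf.findall("VOHDF"):
        child_tables, child_bulk, child_depth = count_contents(child, depth + 1)
        tables += child_tables
        bulk += child_bulk
        deepest = max(deepest, child_depth)
    return tables, bulk, deepest

root = ET.fromstring(SAMPLE)
print(count_contents(root))
```
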
>
>
>-----------------------------------------------------------------------------
>Clive Davenhall                              Institute for Astronomy,
>e-mail (internet, JANET): acd @ roe.ac.uk    Royal Observatory Edinburgh,
>fax from within the UK: 0131-668-8416        Blackford Hill, Edinburgh,
>fax from overseas: +44-131-668-8416          EH9 3HJ, Scotland.
Received on 2002-05-13Z07:01:24