Column Groups in VOTable

From: Francois Ochsenbein <francois-at-vizir.u-strasbg.fr>
Date: Mon, 28 Apr 2003 20:51:09 +0200 (MET DST)

Dear All,

The forthcoming meeting in Cambridge could be a good opportunity to discuss how "column groups" can be introduced in VOTable. The need for such functionality has already been expressed (though perhaps not explicitly); I feel that it also has implications for UCDs, and I am therefore posting this message to both the VOTable and UCD groups. I apologize to those who will receive this message twice.

The "column groups" proposal tries to answer questions frequently
asked about column associations, typically:
--> error (or standard deviation) associated with a column, e.g. a flux
    consists of two numbers: the measured value + the mean error
--> qualities or weights associated with values
--> source or origin (e.g. telescope, or bibliographical reference) of a value
--> individual components, e.g. x,y position on a CCD
--> etc.

This "column grouping" obviously plays the same role as defining structures made of columns; defining structures made of structures can likewise be viewed as grouping groups of columns.

I see essentially two ways of defining such "column groups" in VOTable:

  1. generalize the <COOSYS> method currently used to describe the coordinate systems. This kind of "by reference" method defines a structure, and any <FIELD> can declare (via the "ref" attribute) that it is a member of that structure. As an illustration of a group of columns containing a flux value and its error, the XML code could look like:
    1. within the <DEFINITIONS> element, define a structure as e.g.:

       <STRUCTURE ID="Flux1" name="FluxParameters">
        <PARAMETER ID="Freq1" name="Frequency" value="8.6" datatype="float"
          unit="GHz" ucd="OBS_FREQUENCY" />
       </STRUCTURE>

    2. within the <TABLE> definition, columns belonging to this structure refer to it:

       <FIELD name="flux" datatype="float" ref="Flux1" unit="mJy" />
       <FIELD name="e_flux" datatype="float" ref="Flux1" unit="mJy" />

  2. introduce a new element, e.g. <GROUP>, in the <TABLE> description which would contain the fields. The same example of a flux + its associated error would be coded as:

   <GROUP name="Flux" ucd="PHOT_FLUX_RADIO_8.4G">
    <FIELD ID="Flux1" name="fluxValue" datatype="float" unit="mJy">
     <DESCRIPTION>Value of the flux at 8.4GHz</DESCRIPTION>
    </FIELD>
    <FIELD ID="e_Flux1" name="errFlux1" ucd="ERROR" datatype="float" unit="mJy">
     <DESCRIPTION>Error on flux value</DESCRIPTION>
    </FIELD>
    <PARAMETER ID="Freq1" name="Frequency" value="8.6" datatype="float"
      unit="GHz" ucd="OBS_FREQUENCY" />
   </GROUP>
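
To give a feeling for what the <GROUP> form would mean for a reader application, here is a minimal Python sketch using only the standard xml.etree.ElementTree module. The element and attribute names are those of the example above; the enclosing <TABLE> wrapper and the function name group_members are assumptions made for the illustration, not part of any VOTable release.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the proposed <GROUP>, wrapped in a <TABLE> so it parses
# as a single document. Attribute values are copied from the example above.
FRAGMENT = """
<TABLE>
  <GROUP name="Flux" ucd="PHOT_FLUX_RADIO_8.4G">
    <FIELD ID="Flux1" name="fluxValue" datatype="float" unit="mJy">
      <DESCRIPTION>Value of the flux at 8.4GHz</DESCRIPTION>
    </FIELD>
    <FIELD ID="e_Flux1" name="errFlux1" ucd="ERROR" datatype="float" unit="mJy">
      <DESCRIPTION>Error on flux value</DESCRIPTION>
    </FIELD>
    <PARAMETER ID="Freq1" name="Frequency" value="8.6" datatype="float"
               unit="GHz" ucd="OBS_FREQUENCY" />
  </GROUP>
</TABLE>
"""


def group_members(table):
    """Map each top-level GROUP name to (group ucd, FIELD names, PARAMETER values)."""
    result = {}
    for group in table.findall("GROUP"):
        # Only direct children are collected; nested groups are not followed here.
        fields = [f.get("name") for f in group.findall("FIELD")]
        params = {p.get("name"): p.get("value") for p in group.findall("PARAMETER")}
        result[group.get("name")] = (group.get("ucd"), fields, params)
    return result


table = ET.fromstring(FRAGMENT)
members = group_members(table)
print(members)
```

The columns stay in document order, so the data part of the table is still read as plain rows; the group only adds a layer of metadata on top of the <FIELD> list.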

There could be a third way, which would introduce new tags within each table cell, e.g. <VAL> and <ERR>, to give

   <TD><VAL>11.35</VAL><ERR>1.12</ERR></TD>

but it would go against the current philosophy of VOTable, which defines all metadata first, followed by the data alone, in order to preserve efficiency and FITS compatibility; this third method would also require frequent modifications of the schema (XMLSchema), which is generally disruptive for working applications.

The <GROUP> defined in option 2 above seems to me to be a good framework for this definition. I see several advantages:
=> the basic tabular scheme remains -- VOTable can still be viewed as a
   relational database, and keeps full compatibility with existing
   FITS binary tables;
=> groups of groups (i.e. recursive <GROUP> tags) enable the definition
   of arbitrarily complex structures;
=> the UCDs become more accurate when defined in a group:

Using the "ref" attribute in <FIELD> also permits one column to be a member of several groups: for example, an error common to two fluxes measured at different frequencies can be defined as:

   <GROUP name="Flux" ucd="PHOT_FLUX_RADIO_8.4G">
    <FIELD ID="Flux1" name="fluxValue" datatype="float" unit="mJy">
     <DESCRIPTION>Value of the flux at 8.4GHz</DESCRIPTION>
    </FIELD>
    <FIELD ID="e_Flux1" name="errFlux1" ucd="ERROR" datatype="float" unit="mJy">
     <DESCRIPTION>Error on flux values, both at 8.4 and 7.5GHz</DESCRIPTION>
    </FIELD>
    <PARAMETER ID="Freq1" name="Frequency" value="8.6" datatype="float"
      unit="GHz" />
   </GROUP>
   <GROUP name="Flux" ucd="PHOT_FLUX_RADIO_7.5G">
    <FIELD ID="Flux2" name="fluxValue" datatype="float" unit="mJy">
     <DESCRIPTION>Value of the flux at 7.5GHz</DESCRIPTION>
    </FIELD>
    <FIELD ref="e_Flux1" />
    <PARAMETER ID="Freq2" name="Frequency" value="7.5" datatype="float"
      unit="GHz" />
   </GROUP>
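
How a reader might resolve such a shared column can be sketched as follows, again in Python with the standard xml.etree.ElementTree module. This is only an illustration under stated assumptions: a <FIELD> carrying "ref" is taken to point at the ID of another <FIELD>, the <TABLE> wrapper is added so the fragment parses, and the function name columns_per_group is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the two groups from the example above inside a <TABLE>.
FRAGMENT = """
<TABLE>
  <GROUP name="Flux" ucd="PHOT_FLUX_RADIO_8.4G">
    <FIELD ID="Flux1" name="fluxValue" datatype="float" unit="mJy"/>
    <FIELD ID="e_Flux1" name="errFlux1" ucd="ERROR" datatype="float" unit="mJy"/>
    <PARAMETER ID="Freq1" name="Frequency" value="8.6" datatype="float" unit="GHz"/>
  </GROUP>
  <GROUP name="Flux" ucd="PHOT_FLUX_RADIO_7.5G">
    <FIELD ID="Flux2" name="fluxValue" datatype="float" unit="mJy"/>
    <FIELD ref="e_Flux1"/>
    <PARAMETER ID="Freq2" name="Frequency" value="7.5" datatype="float" unit="GHz"/>
  </GROUP>
</TABLE>
"""


def columns_per_group(table):
    """Return [(group ucd, [column IDs])], following FIELD ref= to the defining FIELD."""
    # Index every FIELD that declares an ID, wherever it sits in the table.
    by_id = {f.get("ID"): f for f in table.iter("FIELD") if f.get("ID") is not None}
    out = []
    for group in table.findall("GROUP"):
        ids = []
        for field in group.findall("FIELD"):
            ref = field.get("ref")
            # Assumption: a referencing FIELD stands for the FIELD whose ID it names.
            target = by_id.get(ref, field) if ref is not None else field
            ids.append(target.get("ID"))
        out.append((group.get("ucd"), ids))
    return out


groups = columns_per_group(ET.fromstring(FRAGMENT))
for ucd, ids in groups:
    print(ucd, ids)
```

Both groups here share the name "Flux", so the sketch keys the result on the group's ucd rather than on its name.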
Received on 2003-04-28Z20:54:48