re: Characterisation draft

From: Anita Richards <amsr-at-jb.man.ac.uk>
Date: Fri, 8 Sep 2006 17:40:22 +0100 (BST)

Some background to how we arrived at suggestions for mandatory elements in Characterisation (see
http://alinda.u-strasbg.fr/Model/Characterisation/CharacterisationDraftMai06/ ):

I considered what information is required by the tools currently available, and what information could or could not be used by any working tool. I have also tried to consider all kinds of observed or simulated data. I think that we should implement the model as soon as possible, in order to find out in practice whether it works for real data and applications; as long as the structure is OK, the best way to discover whether the elements need small revisions is to have the model in use.

Initially we will provide XML templates for manual editing. We will investigate what would be more convenient for large data collections, depending on how they store their existing metadata, e.g. in databases, as XML, as FITS headers, offline etc.\footnote{For example, something like
http://nvo.stsci.edu/VORegistry/UpdateRegistry.aspx?InsertMode=t&ResourceType=DATACOLLECTION is suitable for coarse-grained registry entries where one individual person only has a few collections to register, but not for providing Char descriptions for thousands of separate observations in a telescope archive. Would it, however, be practical to provide a form where the curator enters the keyword in their database or FITS headers which corresponds to an element in the model, or constant element values (e.g. units on an axis)? What if the element needed by Char had to be derived from the recorded metadata (e.g. the spectral location, bounds etc. need to be derived from an instrument-specific code using a look-up table or algorithm)? For metadata in a DB, we could persuade data providers to make the derivation an extra column, but they might be reluctant to modify FITS headers if these were the only source of metadata.
}
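
To make the form idea in the footnote concrete, here is a minimal sketch of how a curator-supplied mapping might be applied to one set of recorded metadata. Everything named here is an assumption for illustration only: the element names, the FITS-style keywords, the FILTER code and the look-up table values are hypothetical and are not taken from the Char schemata or from any real archive.

```python
# Hypothetical mapping entered once by the curator:
# model element name -> keyword in their DB or FITS headers.
KEYWORD_MAP = {
    "spatial.location.ra": "CRVAL1",
    "spatial.location.dec": "CRVAL2",
}

# Constant element values that are the same for every observation.
CONSTANTS = {"spatial.unit": "deg", "spectral.unit": "m"}

# Hypothetical look-up table: instrument-specific filter code ->
# (spectral location, lower bound, upper bound) in metres. Made-up values.
FILTER_TABLE = {
    "F1": (5.0e-7, 4.5e-7, 5.5e-7),
}


def describe(header):
    """Build a flat dict of element values from one metadata record."""
    desc = {}
    # 1. Elements copied directly from a recorded keyword.
    for element, keyword in KEYWORD_MAP.items():
        if keyword in header:
            desc[element] = header[keyword]
    # 2. Elements that are constant for the whole collection.
    desc.update(CONSTANTS)
    # 3. Elements derived from an instrument-specific code.
    code = header.get("FILTER")
    if code in FILTER_TABLE:
        location, lower, upper = FILTER_TABLE[code]
        desc["spectral.location"] = location
        desc["spectral.bounds"] = (lower, upper)
    return desc


record = {"CRVAL1": 189.2, "CRVAL2": 62.2, "FILTER": "F1"}
description = describe(record)
```

The derived values are computed when the description is built, so neither the database nor the FITS headers would need to be modified; whether that is acceptable to data providers is exactly the open question above.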

We should make it as easy as possible to describe data; the model will have to prove its use in practice before many data providers will invest significant effort - but we can only make it easier if we get feedback. This means minimising the number of compulsory fields, although there has to be enough information to expand the ways in which data can be selected or manipulated by VO tools.

A 'description' is an XML document, based on the Char schemata, describing a specific data set. The data set may contain one or many instrumental pointings (e.g. the GOODS northern field), but separate descriptions would be used for e.g. HST UDF images, Team Keck spectra and WSRT images all in the HDF region, if they have very different coverage on many axes.
\footnote{The relationship between Char and the Obs data model will eventually be available to indicate linked data sets, but we don't presently have any real way to use complicated conditional relationships between different properties on different axes. For example, Resolution R and Wavelength l can be given as separate ranges [min, max] but I don't know of any current VO tool which can handle R = const/l, although that should come in future.}
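
A small sketch of what is lost by giving R and l as separate ranges when they are really linked by R = const/l. The function names and the numbers are illustrative assumptions only, not part of Char or of any VO tool.

```python
import math


def resolution_range(const, lam_min, lam_max):
    """Return (R_min, R_max) implied by R = const/l over [lam_min, lam_max]."""
    if not (0 < lam_min <= lam_max):
        raise ValueError("need 0 < lam_min <= lam_max")
    # R falls as l rises, so the extremes swap ends.
    return const / lam_max, const / lam_min


def in_separate_ranges(lam, r, lam_range, r_range):
    """Test a (l, R) pair against two independent [min, max] ranges."""
    return lam_range[0] <= lam <= lam_range[1] and r_range[0] <= r <= r_range[1]


def on_relation(lam, r, const):
    """Test a (l, R) pair against the coupled relation R = const/l."""
    return math.isclose(r, const / lam, rel_tol=1e-9)


const = 1000.0                 # arbitrary illustrative constant
lam_range = (1.0, 2.0)         # arbitrary wavelength units
r_range = resolution_range(const, *lam_range)

# The shortest wavelength paired with the lowest resolution lies inside
# both ranges, yet the instrument never delivers that combination.
accepted_by_ranges = in_separate_ranges(1.0, 500.0, lam_range, r_range)
allowed_by_relation = on_relation(1.0, 500.0, const)
```

The separate ranges are therefore a safe superset for data discovery, but a tool that needs the resolution at a particular wavelength would have to be given the relation itself.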

Char was not designed to describe catalogues but has been used for this successfully and I can see no reason not to use it for e.g. an object list extracted from an image or an observing log. We should not extend Char to provide for issues exclusive to catalogues e.g. definitions of sources ('star', 'galaxy'...); there is a separate model for that.

Cheers

Anita

Received on 2006-09-08Z18:40:51