Comments on Spectrum Data Model 0.9

From: Anita Richards <amsr-at-jb.man.ac.uk>
Date: Tue, 21 Sep 2004 13:19:56 +0100 (BST)

Hello,

It looks really good, and I was pleased to see that it relates to other Data Models. I have not done a full check for consistency, but that sort of thing gives me confidence that it is designed to be used. I am also impressed by the coverage of errors etc.

I am still having trouble figuring out which parts of the model are intended to apply per data set and which per data point. By analogy, I would expect that the Obs. data model could describe a data cube of images with a single set of metadata, if each plane of the image was the same apart from its content. Of course, the local statistical noise might be different in each plane, but then in a single image it might be different in different regions and we don't use different metadata for each pixel.

As I understand it, the Spectrum model suggests that each spectral point has centre, max and min of bin in spectral units (and other metadata), and that is indeed required for some SEDs, e.g. as provided by NED.

But in fact some IVOA examples just give the centre of the bin, e.g. for many samples which are equally spaced in the specified units. In that case it would be useful to have the (constant) bin width in the header to aid extracting metadata for the Obs DM, Registry etc. Will this be covered under Resolution? Moreover, other parameters such as the available errors may be constant for the whole data set (or each point may have its own errors).

i.e., I want to be able to get at information which says "Spectrum of BigSupernova from 1719 to 1721 MHz in 2048 equal-sized channels", which may then be a FITS file or ASCII etc., with a header giving the metadata which are common to every spectral point, while the data themselves are just a series of pairs of frequencies and flux densities.
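As a minimal sketch of what I mean (not anything defined by the draft model): the class and field names below are purely illustrative, and I assume that 1719 and 1721 MHz are the outer edges of the band rather than the centres of the first and last channels. With only four header values, the centre, min and max of any bin can be reconstructed on demand instead of being stored with every point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpectrumHeader:
    """Hypothetical per-data-set metadata for an equally spaced spectrum."""
    target: str
    freq_start_mhz: float  # assumed: lower edge of the first channel
    freq_stop_mhz: float   # assumed: upper edge of the last channel
    n_channels: int

    @property
    def bin_width_mhz(self) -> float:
        # Constant bin width, derived once from the header.
        return (self.freq_stop_mhz - self.freq_start_mhz) / self.n_channels

    def bin(self, channel: int) -> tuple[float, float, float]:
        """Return (min, centre, max) in MHz for a zero-based channel index."""
        if not 0 <= channel < self.n_channels:
            raise IndexError(f"channel {channel} out of range")
        lo = self.freq_start_mhz + channel * self.bin_width_mhz
        return lo, lo + 0.5 * self.bin_width_mhz, lo + self.bin_width_mhz


header = SpectrumHeader("BigSupernova", 1719.0, 1721.0, 2048)
first_bin = header.bin(0)
last_bin = header.bin(header.n_channels - 1)
```

The data table then only needs one flux density per channel (or a frequency and flux density pair); `header.bin(i)` supplies the rest for any consumer that wants the full per-point description.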

If we are talking about e.g. 50 separate monitoring observations in 512 channels x 4 polarizations, all repeated for 10 sources, the difference between using 2 numbers for each spectral point and about 6 starts to become significant when constructing spectra on the fly, returning lots of spectra, trawling metadata or just intimidating the user/data provider.
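To put rough numbers on that example, the arithmetic can be written out as a short Python sketch. The variable names are illustrative, and "about 6" is taken as exactly 6 values per point for the purpose of the count.

```python
# Counts taken from the example above.
observations = 50
channels = 512
polarizations = 4
sources = 10

# Total number of spectral points across all spectra.
points = observations * channels * polarizations * sources

# Values stored with 2 numbers per point versus 6 (assumed exact).
values_compact = 2 * points
values_full = 6 * points
extra_values = values_full - values_compact

print(f"{points:,} points: {values_compact:,} vs {values_full:,} values "
      f"({extra_values:,} extra)")
```

That is roughly a million spectral points, and the fuller description triples the number of values that have to be generated, shipped and parsed.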

There are more difficult situations - I am not arguing that they should all be covered now, but which are covered and which are we deferring?

I am assuming in the above that the data have been fully calibrated and there is not a separate bandpass function (i.e. sensitivity changes with frequency), but that is not always the case. In any event the user might want access to the bandpass correction function that had been applied, which might be a mathematical function or a look-up table. Data may be stored in other formats, e.g. all the spectral coordinate information in the header and just a table with pairs of values for channel number and flux density, or even a single vector of flux densities.
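A small sketch of how the two forms of bandpass could be treated uniformly, assuming the data are a single vector of flux densities indexed by channel number. The function names are hypothetical, and the convention that the correction divides the raw value by the relative sensitivity of its channel is my assumption, not something the draft specifies.

```python
from typing import Callable, Sequence

# A bandpass maps a zero-based channel number to a relative sensitivity.
Bandpass = Callable[[int], float]


def table_bandpass(gains: Sequence[float]) -> Bandpass:
    """Wrap a per-channel look-up table so it can be used like a function."""
    return lambda channel: gains[channel]


def apply_bandpass(raw: Sequence[float], bandpass: Bandpass) -> list[float]:
    """Divide each raw value by the sensitivity of its channel (assumed convention)."""
    return [value / bandpass(channel) for channel, value in enumerate(raw)]


raw_flux = [2.0, 3.0, 4.0]

# Look-up table form.
from_table = apply_bandpass(raw_flux, table_bandpass([1.0, 0.5, 2.0]))

# Mathematical function form: sensitivity rising linearly with channel number.
from_function = apply_bandpass(raw_flux, lambda channel: 1.0 + 0.5 * channel)
```

Either way, a user who wants the correction that was applied only needs the `Bandpass` itself plus the header, not extra columns on every spectral point.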

It seems to me that, at present, a lot of data providers will have some hard work to provide conformant data. That is not necessarily an argument against the model (apart from the unnecessary bloat, which I will continue to heckle about!) but it does suggest that it might be useful to define a very few aspects as "must" and offer help in sorting out the rest. I would rather see that than use alternative `simplified' models like the Dobos-Budvari schema (Sect. 8.5) - although some such simplification might be useful if in fact some aspects of e.g. Coverage do actually belong in the Obs data model and should be provided by a link rather than duplicated. There are problems with the Dobos-Budvari schema as it stands, e.g. it is a nightmare trying to manipulate a spectrum equally spaced in frequency if you can only use wavelength units, although the latter might be adequate for the Registry.

For the most recent/future data sets, I see that the model is applied to SDSS; what do other people think, e.g. ISO, ALMA? I would also like to know what Ivo Busko (SpecView designer) makes of it.

cheers
a

Received on 2004-09-21Z12:20:39