Hi All -
There have been various discussions on the data model / file format issue, but to keep it simple I will respond to Mark's original message.
On Wed, 13 Sep 2006, Mark Taylor wrote:
> [...] this relates to a point that I raised with Markus Dolensky last
> week and he forwarded to Doug concerning SSAP and serialization
> formats. Since it's come up here, I'll shove my oar in.
>
> My problem is that the information returned from an SSAP query
> gives the serialization MIME type, but no more - as you point out
> above, the fact that a spectrum is encoded as FITS could cover any
> number of specific serialization formats. So a client trying
> to make sense of a spectrum returned from SSAP, which only has
> the MIME type got from the Access.Format response field to tell
> it what kind of data is at the other end of the Access.Reference,
> has an unnecessarily difficult job, in that it really has to
> examine the data itself to work out what the serialization format is
> (and in doing that it may end up downloading a large data file only
> to find out that it is in a format that it can't understand).
>
> Possibly the intention is that an SSAP Access.Format of application/fits
> means the data is in the FITS format defined in the Spectral DM
> document (ditto for application/x-votable+xml, application/xml),
> but I can't see this stated explicitly anywhere.
>
> Otherwise, it seems to me that what is called for is an additional
> field in the SSAP response which names the specific serialization
> format, if known. This would require assigning some sort of name
> to the XML, FITS and VOTable formats defined in the Spectral DM
> document (presumably a URI of some sort).
This is primarily a query matter whereas Spectrum is a dataset data model, hence we are getting into issues here which aren't addressed by the Spectrum model alone.
We distinguish between the data model and the data format or serialization. Both are described in the query response. Since the same data object, conformant to the Spectrum data model, may be viewed via various formats/serializations, it is not clear whether the data model itself should specify the serialization; my view has always been that this is best done externally, e.g., in the access protocol.
What we currently have in the access protocol in this area:
    Dataset.Type        # Spectrum, TimeSeries, etc.
    Dataset.DataModel   # Data model, e.g., "Spectrum V1.0"
    Access.Format       # File format (MIME type)
If the DataModel is "Spectrum" then we have a fully VO-compliant dataset. (Yes, services will need to perform a conversion on the fly to return a dataset compliant with the VO Spectrum data model.)
If instead the service returns native project data (typically different for every data collection/mission/instrument) then Dataset.DataModel should identify the specific project data model for the data to be returned. This is the "pass-through" mechanism for accessing native project data via an SSA query interface. An application doesn't have to scan the data file to determine what it contains; this is specified directly by the dataset Type and DataModel.
The data format or serialization is (in principle at least) independent of the data model. This is true for Spectrum but in general will not be true for native project data, where there is typically only one format. Currently, the file format is specified by its MIME type.
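To make this concrete, here is a rough sketch (in Python) of how a client might use these three fields to decide, before downloading anything, whether it can read a dataset. This is only an illustration: the record class, the function name, the sets of supported models and formats, the "ExampleProject Native V2" model name and the example.org URLs are all made up for the example and are not defined by SSAP or the Spectrum data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SsaRecord:
    """One row of a query response (hypothetical container, not an SSAP type)."""
    dataset_type: str      # Dataset.Type, e.g. "Spectrum"
    data_model: str        # Dataset.DataModel, e.g. "Spectrum V1.0"
    access_format: str     # Access.Format (MIME type)
    access_reference: str  # Access.Reference (URL of the data)


# Assumed capabilities of an imaginary client.
SUPPORTED_MODELS = {"Spectrum V1.0"}
SUPPORTED_FORMATS = {"application/fits", "application/x-votable+xml"}


def can_read(record: SsaRecord) -> bool:
    """Decide from the response metadata alone whether to fetch the data."""
    return (
        record.dataset_type == "Spectrum"
        and record.data_model in SUPPORTED_MODELS
        and record.access_format in SUPPORTED_FORMATS
    )


# A VO-compliant dataset and a pass-through (native project) dataset.
vo_row = SsaRecord("Spectrum", "Spectrum V1.0", "application/fits",
                   "http://example.org/spec/1")
native_row = SsaRecord("Spectrum", "ExampleProject Native V2",
                       "application/fits", "http://example.org/spec/2")

readable = [r.access_reference for r in (vo_row, native_row) if can_read(r)]
```

Both rows carry the same MIME type, so Access.Format alone cannot tell them apart; it is Dataset.DataModel that lets the client skip the native file without fetching it.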