Re: SSA working draft

From: Doug Tody <dtody-at-nrao.edu>
Date: Mon, 20 Nov 2006 15:39:19 -0700 (MST)


Hi Inga -

On Mon, 20 Nov 2006, Inga Kamp wrote:
>> If the service generates cutouts, filters the data, does spectral
>> extraction, etc., this can *only* be done at access time, because
>> these are intrinsically on-demand operations driven by parameters
>> supplied by the client at access time.
>
> I would filtering on the database level not describe as a dynamic creation
> of the dataset. At least it confuses people the way it's phrased now.

By filtering I mean something that changes the actual data samples, such as interpolating, running a Fourier filter on the data, editing defects, or applying a flux calibration, as opposed to a cutout, which does not compute new data samples. For a spectrum, filtering at access time is probably not going to happen unless the service changes the dispersion or applies a flux calibration at access time (e.g., to help get the data into a more generic form), but it is possible.

Anyway, you are probably right that some clarification is necessary.

>>> Why do you want to single one type of response format out? Why not have
>>> them all equal?
>>
>> Not sure what you are referring to here; 1.4.1 describes the levels of
>> compliance of a service, not response formats.
>
> This was referring to "if you provide only one format, then VOTable is the
> preferred".

Thanks for the clarification.

Actually this is a good question: should we define a preferred format in which we would like to get spectra back? This could be desirable so that clients don't have to be prepared to deal with all the various kinds of data formats (the main reason we define multiple formats is so that the client can get whatever it prefers). If we suggest a format to support, my tendency is to go with VOTable for spectra, since it is much better than FITS for metadata and OK in terms of efficiency in most cases for spectra, especially if protocol-level compression such as gzip is used.
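
To get a rough feel for the gzip point, here is a minimal sketch using only the Python standard library. The table fragment is a made-up stand-in for the data section of a VOTable, not a valid VOTable document, and the row count and values are assumptions chosen only for illustration.

```python
import gzip


def compression_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size after gzip."""
    compressed = gzip.compress(payload)
    return len(compressed) / len(payload)


# Hypothetical VOTable-like fragment: 1000 rows of (wavelength, flux) pairs.
rows = "".join(
    f"<TR><TD>{4.0e-7 + i * 1.0e-10:.6e}</TD><TD>{1.0 + 0.001 * i:.4f}</TD></TR>\n"
    for i in range(1000)
)
document = f"<TABLEDATA>\n{rows}</TABLEDATA>\n".encode("utf-8")

ratio = compression_ratio(document)
print(f"compressed size is {ratio:.0%} of the original")
```

The repetitive XML markup is what gzip removes most effectively, which is why a text table of samples can remain reasonably compact on the wire.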

>>> 2.3:
>>>
>>> How can you have metadata on virtual data? How should we anticipate all
>>> possible ways a user may ask for spectral cutouts, extraction etc. to be
>>> prepared to answer?
>>
>> This is what on-demand data generation and virtual data are all about.
>> The service describes the metadata of the virtual data product it would
>> generate.
>>
>> There are an infinite number of possible virtual data products.
>> You don't have to describe them all, rather, given what the client
>> requested, the limitations of your service, and the characteristics
>> of the data, you describe what the service would generate to best
>> match what the client requested.
>>
>> A simple example is if the client requests a certain bandpass range
>> and you have a cutout service, the virtual data product would be a
>> spectrum covering only the given wavelength region (or however close
>> the service can get given a range of other details).
>>
>> If the query is detailed enough, the query response may refer to a
>> single data product. Hence, the query mechanism may be used not only
>> for data discovery, but to negotiate with the service on the details
>> of the data product to be generated.
>>
>
> How do you know the SNR ratio a priori for all possible spectral cutouts?

You wouldn't know it without computing it from the data for the cutout region, but the interface assumes that the SNR may not be known for spectra in general, hence it is optional. Most other metadata can, however, be computed for virtual data from the overall archival dataset values.
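
As an illustration of deriving virtual-data metadata from archival values, here is a minimal sketch. The class and function names are hypothetical and not taken from the SSA draft; it assumes spectral coverage is stored as a wavelength interval in meters and that the SNR is left unset because it has not been computed for the cutout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpectrumMetadata:
    """Hypothetical subset of the metadata describing one spectrum."""
    wave_min: float               # lower wavelength bound, meters
    wave_max: float               # upper wavelength bound, meters
    snr: Optional[float] = None   # optional: may be unknown


def describe_cutout(archival: SpectrumMetadata,
                    req_min: float,
                    req_max: float) -> Optional[SpectrumMetadata]:
    """Describe the cutout a service would generate for a requested bandpass.

    The coverage is the overlap of the request with the archival coverage.
    Returns None when there is no overlap.
    """
    lo = max(archival.wave_min, req_min)
    hi = min(archival.wave_max, req_max)
    if lo >= hi:
        return None
    # The archival SNR does not carry over to the cutout region, so it is
    # left unset rather than guessed.
    return SpectrumMetadata(wave_min=lo, wave_max=hi, snr=None)


archival = SpectrumMetadata(wave_min=3.8e-7, wave_max=9.2e-7, snr=25.0)
cutout = describe_cutout(archival, 5.0e-7, 1.0e-6)
print(cutout)
```

Here the request extends past the red end of the archival spectrum, so the described product is clipped to what the service can actually deliver.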

>>> 3.3.2.3:
>>>
>>> I thought that BAND is always a string. How can you have then "If a
>>> bandpass is specified as a string it is..."
>>
>> BAND is either a numerical bandpass (wavelength in vacuum in meters)
>> or a bandpass name (unspecified; prior discovery is needed to determine
>> the possible values).
>
> Yes, but the type is always string and never real number. It's confusing.

Ok, I see what you meant now. I was referring to the semantic type within the range-list, but the range-list parameter itself is always a string. The text should be clarified.
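
The distinction can be sketched as follows. This assumes, purely for illustration, a range-list syntax in which commas separate items and a slash separates the lower and upper bounds of a range; the function name is hypothetical, and open-ended ranges are not handled.

```python
from typing import List, Tuple, Union

# Each parsed item is either a bandpass name or a (low, high) range in meters.
BandItem = Union[str, Tuple[float, float]]


def parse_band(value: str) -> List[BandItem]:
    """Parse a BAND parameter value, which always arrives as a string.

    Assumed syntax: comma-separated items, each one a numeric range
    "low/high", a single wavelength, or a bandpass name.
    """
    items: List[BandItem] = []
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue
        if "/" in token:
            lo_text, hi_text = token.split("/", 1)
            items.append((float(lo_text), float(hi_text)))
            continue
        try:
            wavelength = float(token)
        except ValueError:
            # Not a number, so treat it as a bandpass name.
            items.append(token)
        else:
            # A single wavelength becomes a zero-width range.
            items.append((wavelength, wavelength))
    return items


print(parse_band("1e-7/3e-6"))
print(parse_band("5.5e-7,V"))
```

The parameter type seen by the protocol is a string in every case; the numeric or name interpretation only appears once the individual range-list elements are parsed.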

>>> Apertures need not be circular, so you may want to phrase the respective
>>> sentence different.
>>
>> Only circular apertures are currently supported for on-demand spectral
>> extraction and this should be adequate for point-source or compact
>> objects (even for Grism data). We could generalize this if needed,
>> but it complicates the interface.
>
> How do you avoid confusion when people actually want to search for a
> particular observing aperture? Wouldn't it be better to make this clear in
> the name like EXTRACT_APER?

In an earlier version of the interface we used the APERTURE parameter for this purpose as well. But it is confusing to overload the parameter, and searching by aperture gets complicated in any case as one might have rectangular apertures and so forth. Hence APERTURE is now used only for spectral extraction. As you suggest, it might be useful to use a more specific parameter name to ensure that people don't confuse the meaning.

There is no way in the current interface to query directly on the aperture size and geometry, but one can query on the spatial resolution, which for most spectra is roughly similar to the aperture size. This does not cover the case where the aperture is much larger than the spatial resolution, but that is probably rare enough not to be worth supporting directly in the query interface. One can always submit a more general query and refine it on the client side using the more detailed spatial coverage information that comes back in the query response.
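
A minimal sketch of such client-side refinement, assuming the query response has already been parsed into a list of dictionaries. The field names `id` and `aperture_deg` are hypothetical and do not come from the SSA draft.

```python
from typing import Dict, List, Optional


def refine_by_aperture(records: List[Dict[str, Optional[object]]],
                       max_aperture_deg: float) -> List[Dict[str, Optional[object]]]:
    """Keep only records whose aperture is known and no larger than the limit.

    Records with no aperture information are dropped, since the client
    cannot tell whether they satisfy the constraint.
    """
    kept = []
    for record in records:
        aperture = record.get("aperture_deg")
        if aperture is not None and aperture <= max_aperture_deg:
            kept.append(record)
    return kept


# Hypothetical rows from a general query response.
response = [
    {"id": "spec-1", "aperture_deg": 0.0003},
    {"id": "spec-2", "aperture_deg": 0.0100},
    {"id": "spec-3", "aperture_deg": None},
]
selected = refine_by_aperture(response, max_aperture_deg=0.001)
print([r["id"] for r in selected])
```

Whether records with unknown aperture should be dropped or kept is a client policy choice; this sketch drops them.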

Received on 2006-11-20Z23:40:10