Hi Pat -
> "dataset" is a heavily (over-)used word. To some people it means one or more
> related files from a telescope or archive (1+ images). To another, the whole
> SDSS source catalog is a dataset (ie. many RDB tables). There are cases
> where a "dataset" is a set of images, spectra, and a source catalog to go
> with it. I think the confusion comes from the fact that "data" is (over-)used
> to mean both the observational data (images, spectra, time series, etc) and
> the derived or extracted information (source catalogs, for example).
This is true; however, I think we should define terms such as dataset and data collection precisely and consistently for use within the VO. The usage that we have been promoting is intended to be consistent with what is used in the data grid community, and it is as consistent with astronomical usage as any alternative.
It is reasonably clear what dataset and data collection mean when we refer to an image data collection like the 2MASS survey. It is less clear what the terms mean, or if they apply at all, when we refer to a catalog. Is a point source catalog a dataset? Is it even data? Probably not. A catalog is derived information about data.
This is an important issue, since we strive for rigor in how we refer to data within the VO. Perhaps we should develop a glossary of terms for the VO.