re: VOResource v0.8.2

From: Anita Richards <amsr-at-jb.man.ac.uk>
Date: Thu, 25 Sep 2003 12:17:12 +0100 (BST)

Ray et al.,

The organisation of the new schemas suggested by Ray http://www.ivoa.net/forum/registry/0591.htm seems nice and logical to me, with a few possible omissions. I think some of my detailed points are similar to those made by Tony, e.g. on links to articles http://www.ivoa.net/forum/registry/0634.htm which I have made more specific.

I apologise if I have missed anything that is already included, but it would be nice to have a text version, either an updated RSM or a tree with comments, which exactly corresponds to the schemas. Also, can we try to keep replies under subjects which attach them to the right thread? There are some general points which concern me:

  1. Testing schemas: How are we going to test the schemas by generating xml describing sample data sets, services etc.? Filling a few in by hand is the best way I have found of understanding what is implied and whether it matches Registry needs. These schemas are at the limit of my understanding of xml and Arnold Rots' STC schemas are way beyond it, but I do not know how to generate complete xml templates (using Oxygen) when many elements have 'minOccurs="0"' - please can someone enlighten me?
  2. Populating the registry: How are service and data providers going to supply the necessary information? The most widely floated suggestion so far is web forms, but these should be kept to about 2 pages max or they won't get filled in. They need to be self-contained (e.g. each element has a pop-up with definition, restrictions - in English) and the processing script needs to translate obvious language into restricted terms (e.g. basic data to BasicData), convert units approximately etc.
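On point 1, one possible way to get a complete template without relying on an editor is to walk the schema and emit every declared element once, whether or not it has 'minOccurs="0"'. The following is only a minimal sketch under stated assumptions: the schema embedded here is a toy, not VOResource-v0.8.2.xsd, its element names are illustrative, and the walker handles only inline complexType/sequence declarations (no named types, refs, choices or extensions).

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Toy schema for illustration only; NOT the real VOResource-v0.8.2.xsd.
SAMPLE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Resource">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Title" type="xs:string"/>
        <xs:element name="Subject" type="xs:string" minOccurs="0"/>
        <xs:element name="Coverage" minOccurs="0">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Spatial" type="xs:string" minOccurs="0"/>
              <xs:element name="Temporal" type="xs:string" minOccurs="0"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def build_template(decl):
    """Build an instance element for one xs:element declaration,
    emitting every child once, including those with minOccurs="0"."""
    node = ET.Element(decl.get("name"))
    children = decl.findall(f"{XS}complexType/{XS}sequence/{XS}element")
    for child in children:
        node.append(build_template(child))
    if not children:
        # Leaf element: leave a visible placeholder naming the declared type.
        node.text = "TODO " + decl.get("type", "")
    return node


def make_template(xsd_text):
    """Parse the schema text and expand its first top-level element."""
    schema = ET.fromstring(xsd_text)
    return build_template(schema.find(f"{XS}element"))


template = make_template(SAMPLE_XSD)
if hasattr(ET, "indent"):  # pretty-printing is only available in Python 3.9+
    ET.indent(template)
print(ET.tostring(template, encoding="unicode"))
```

The resulting skeleton can then be filled in by hand, which is the exercise suggested above; a real schema would need the walker extended before the output could be trusted as complete.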

Where data sets are already in Vizier or other services with standard metadata some information can be harvested, and other elements are VO housekeeping which we can supply. Not all the elements are relevant to any one service. However we should sketch a few dummy forms (or other methods e.g. email questionnaire) to check that it is not all getting too complicated for data/service providers without full-time archivists etc. - or for most VO staff!
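To make the idea of a form-processing script concrete, here is a minimal sketch of the two steps mentioned in point 2: mapping obvious language onto restricted terms and converting units approximately. The vocabulary is hypothetical apart from BasicData, and the function names and the choice of degrees as the target unit are assumptions for illustration only.

```python
import re

# Hypothetical restricted vocabulary; only "BasicData" is taken from the text above.
VOCABULARY = ["BasicData", "DerivedData", "Catalogue"]

# Approximate conversion factors from common angular units to degrees.
TO_DEGREES = {"deg": 1.0, "arcmin": 1.0 / 60.0, "arcsec": 1.0 / 3600.0}


def squash(text):
    """Lower-case the text and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def to_restricted_term(user_text, vocabulary):
    """Map free text such as 'basic data' onto a vocabulary term, or None."""
    lookup = {squash(term): term for term in vocabulary}
    return lookup.get(squash(user_text))


def to_degrees(value, unit):
    """Convert a form value in the given angular unit to degrees."""
    return float(value) * TO_DEGREES[unit.strip().lower()]


print(to_restricted_term("basic data", VOCABULARY))
print(to_degrees("30", "arcmin"))
```

Anything that does not match returns None, so the form could ask the provider to choose from the allowed terms instead of guessing on their behalf.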

3. Detailed comments from AstroGrid work: Some of these points probably contradict my plea above for simplicity; if any element is consistently unfilled in the first dozen services/data sets registered then it should probably be dropped. Also, I think that the registries of IVOA members should contain at least all the IVOA elements, but there may be others which only we want. However, these are what I think are missing or need significant amendment:

Subject
VOResource-v0.8.2.xsd
defined as
"Terms for Subject should be drawn from the IAU Astronomy Thesaurus (http://msowww.anu.edu.au/library/thesaurus/)"

I agree that the IAU Astronomy Thesaurus should be the definitive document, but it needs updating, expanding and major work to produce a namespace. The IVOA should probably get the IAU to ask it (the IVOA) to update the thesaurus, but in the meantime I suggested several alternatives http://wiki.astrogrid.org/bin/view/Astrogrid/RegistryKeywords and adopted one (based on the cut-down set of NASA keywords used by CDS), which I have set up as a namespace:
http://wiki.astrogrid.org/pub/Astrogrid/RegistryIt03Metadata/keywords.xsd

Documentation
VOResource-v0.8.2.xsd
More constraints are needed on what this could contain. As Elizabeth Auden noted http://www.ivoa.net/forum/registry/0625.htm, a link to a Vizier-style ReadMe (e.g. Vizier itself) would be useful as the most concise way to describe catalogues for humans. The VO would not directly search this but it would be readily available to users if they want to make a manual catalogue selection. Ideally this should contain the UCDs, if available, alongside the native column headings.

UCDs
These should be in the Registry if they exist, possibly under Documentation or in the section which includes Coverage. Attempting to implement the AstroGrid science cases showed that being able to search for terms like Proper Motion was very useful. If UCDs are not available nothing is lost, but the selection may be more complicated.
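As a rough illustration of why UCDs in the Registry help, the sketch below selects catalogues that have a proper-motion column by matching on a UCD prefix. The registry entries and catalogue names are made up, the record layout is an assumption, and the UCD1-style strings are given only as examples.

```python
# Hypothetical registry entries mapping native column headings to UCDs.
RECORDS = [
    {"id": "cat-a", "columns": {"RAJ2000": "POS_EQ_RA_MAIN", "pmRA": "POS_EQ_PMRA"}},
    {"id": "cat-b", "columns": {"RAJ2000": "POS_EQ_RA_MAIN", "Vmag": "PHOT_JHN_V"}},
    {"id": "cat-c", "columns": {}},  # provider supplied no UCDs
]


def find_by_ucd(records, ucd_prefix):
    """Return (catalogue id, column name) pairs whose UCD starts with ucd_prefix."""
    hits = []
    for record in records:
        for column, ucd in record["columns"].items():
            if ucd.startswith(ucd_prefix):
                hits.append((record["id"], column))
    return hits


print(find_by_ucd(RECORDS, "POS_EQ_PM"))
```

A catalogue registered without UCDs (cat-c here) is simply not matched, so it would have to be selected by other means such as its ReadMe.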

source - bibcode etc.
In RSMv0.8 this is under content, but I think it should be under Documentation.

Data set size and query restrictions (included in AstroGrid registry)
This will be useful e.g. to help decide which catalogue to search first in a cross-matching exercise; whether it is necessary to call a cut-out server; how long a job will take. RSMv0.8 contains examples of query restrictions (e.g. max. radius of cone search) which seem useful but are not here.
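A minimal sketch of how these two pieces of metadata might be used when planning a cross-match follows. The catalogue entries, field names and the smallest-first ordering are all assumptions for illustration, not anything defined in RSMv0.8 or the AstroGrid registry.

```python
# Hypothetical registry entries; row counts and radius limits are invented.
CATALOGUES = [
    {"id": "big-survey", "rows": 500_000_000, "max_cone_radius_deg": 0.5},
    {"id": "small-list", "rows": 12_000, "max_cone_radius_deg": None},  # no limit declared
    {"id": "mid-cat", "rows": 3_000_000, "max_cone_radius_deg": 2.0},
]


def plan_crossmatch(catalogues, radius_deg):
    """Order catalogues smallest first and flag those whose declared
    cone-search radius limit is smaller than the requested radius."""
    plan = []
    for cat in sorted(catalogues, key=lambda c: c["rows"]):
        limit = cat["max_cone_radius_deg"]
        exceeds_limit = limit is not None and radius_deg > limit
        plan.append((cat["id"], exceeds_limit))
    return plan


print(plan_crossmatch(CATALOGUES, 1.0))
```

Searching the smallest catalogue first is only one plausible heuristic; the point is that neither decision can be made unless the size and the query restrictions are recorded in the Registry.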

DataService
Elizabeth Auden proposed http://www.ivoa.net/forum/registry/0625.htm that we need a broader term than SkyService, and in fact we need to cover non-observational catalogues like lists of phenomena and other non-positional data - hence I suggest DataService (rather than the subdivisions Elizabeth suggested, as some data may not be easy to categorise).

Coverage
Spatial, Spectral, Temporal - maybe Resolution should be one of the subelements in each of these, along with dimensional coverage, accuracy, regionOfRegard etc. - maybe depth/sensitivity is extra? RSMv0.8 seemed to do very well, with a little reorganisation. The STC documentation seems very comprehensive but we will only find out how complete the related schemas are and how far it is necessary to go when using them to answer queries. The Registry should only use a tiny fraction of their complexity and will impose added restrictions (I hope!) e.g. units.

Otherwise I support Elizabeth Auden's suggestions, especially the need to include services like MySpace.

thanks

a

Received on 2003-09-25Z13:17:57