Registry science metadata - Brown Dwarf example

From: Anita Richards <amsr-at-jb.man.ac.uk>
Date: Wed, 30 Apr 2003 13:43:02 +0100 (BST)

Here is an example of the science metadata which would be pulled out of the registry to answer one of the AstroGrid Science Cases.

I think that we need to be very clear about the difference between two kinds of metadata. The first is the metadata needed to enable dataset selection (automatic, using VO algorithms, or just returning a list to the human), which can be handled in one or more separate stages entirely within the registry. The second is the metadata used in actually evaluating the query; in that case we need to go and fetch the dataset, or send an agent to extract values from it in situ, etc.

The material below can also be found at
http://wiki.astrogrid.org/bin/view/Astrogrid/BrownDwarfRegistryRequirements

with links to explanations, the original case etc.

http://wiki.astrogrid.org/bin/view/Astrogrid/BrownDwarfMetadataList

BrownDwarf Registry Requirements - Science content

ResourceMetadata

Where values are strings, I have stated the options if specific values are needed. Otherwise, I have
just given the value type, and the query will select values within a specified range.

coverage "format" value = "ascii", = "VOTable"
coverage "decmin" value = decimal
coverage "sourcedensity" value = decimal
coverage "tablenrows" value = decimal
coverage "tablesize" value = decimal
coverage "startdate" value = decimal
dataquality "astrometryerror" value = decimal
dataquality "photometryerror" value = decimal
dataquality "timingerror" value = decimal
subjectkeyword value = "Null", = "Milky Way", = "Stars"
"type" value = "catalogue", = "survey"
wavelengthrange value = "ir", value = "optical"
wavelengthshort value = decimal
wavelengthlong value = decimal

Additional metadata if nDim data as well as catalogues are used:

coverage "angularfraction" value = decimal
coverage "format" value = "FITS"
resolution "angularresolution" value = decimal
resolution "spectralresolution" value = decimal
"type" value = "archive"
wavelengthrange value = "uv", value = "mm"

UCDs

Note these are not 'proper' UCDs as we do not yet have a final convention, but show the sort of
things which need UCDs. I am not differentiating between UCDs in the header e.g. to describe a
single overall positional uncertainty and UCDs for each column.

I hope we adopt some sort of modular or atomic UCD structure, and a capacity to cross-reference
columns, so that UCDs below which are not single words would be built up in a logical way but we do
not have to have e.g. PHOT_ABC where ABC is every possible frequency or filter. The exact order of
conditional queries (e.g. look for Optical first? or Photometry first?) depends on what UCD convention
we adopt. There may be more than one error per quantity, e.g. systematic errors in the catalogue
header plus random errors per entry.

StellarCluster
Star
Membership of StellarCluster
AngularPosition (RA, Dec)
AngularPosition Error (RA, Dec)
ID
Epoch
Epoch Error
ProperMotion
ProperMotion Error
Optical Colour, Photometry (I_Band, R_Band, numerical band spec.)
IR Colour, Photometry (K_Band, numerical band spec.)
Photometry FilterBandpass (for the above, e.g. Cousins, Johnson)
Photometry Error (per band)
BrownDwarf
BrownDwarf Probability
Distance
Distance Error
Parallax
Parallax Error
DistanceModulus
DistanceModulus Error
DopplerVelocity
DopplerVelocity Error
ChemicalAbundance (Li, CH4 etc.)

Additional UCDs describing nDim image and spectral DataSets, if used.



http://wiki.astrogrid.org/bin/view/Astrogrid/BrownDwarfRegistryQueryExamples

Examples of searching the Registry:

For this example I consider only catalogues, not extraction of data from images, spectra etc. If we
were doing the latter we would need additional ResourceMetadata about angular coverage,
resolution etc.

I have used what I hope are Java arithmetical and logical operators in most places for brevity;
occasionally I have spelt out operators for clarity.

  1. RegistryQuery for potential Brown Dwarfs located in Galactic Clusters

1.1 Query ResourceMetadata

(format value == "ascii" || "VOTable") &&
(type value == "catalogue" || "survey") &&
(subjectkeywords value == "Null" ^ ("Milky Way" || "Stars")) &&
(coverage "decmin" value != NaN)

This should select DataSets which are tabular (not images or other such in this iteration) and
contain measurements of sources (as distinct from a list of instrument pointings). The suggested
values may not be a complete list. Each DataSet either has no subjectkeywords, i.e. the things it
lists are unclassified, or is explicitly classified as Milky Way or Stars.

The DataSets should have meaningful values of decmin, implying that they contain positional
information. This means that we must fill in angular coverage for any data set containing celestial
coordinates or object names which can be resolved by SIMBAD into coordinates.
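As a minimal sketch of how the 1.1 selection might be evaluated, assuming the registry entries have already been read into plain Python dictionaries: the record layout, the dataset names and the use of an empty keyword list to stand in for "Null" are all assumptions made for illustration, not part of any agreed registry interface. Because a DataSet cannot be both unclassified and classified, the exclusive-or in the query reduces to a plain "or" here.

```python
import math

# Hypothetical registry records; the field names follow the ResourceMetadata
# list above, but the values and dataset names are invented for illustration.
RESOURCES = [
    {"name": "cat_a", "format": "VOTable", "type": "catalogue",
     "subjectkeywords": ["Stars"], "decmin": -30.0},
    {"name": "cat_b", "format": "ascii", "type": "survey",
     "subjectkeywords": [], "decmin": 5.5},
    {"name": "img_c", "format": "FITS", "type": "archive",
     "subjectkeywords": ["Stars"], "decmin": -10.0},
    {"name": "cat_d", "format": "ascii", "type": "catalogue",
     "subjectkeywords": ["Galaxies"], "decmin": 0.0},
    {"name": "cat_e", "format": "ascii", "type": "catalogue",
     "subjectkeywords": ["Milky Way"], "decmin": float("nan")},
]


def matches_1_1(res):
    """Return True if a record passes the step 1.1 ResourceMetadata query."""
    keywords = res["subjectkeywords"]
    unclassified = not keywords  # assumption: empty list plays the role of "Null"
    wanted = any(k in ("Milky Way", "Stars") for k in keywords)
    return (res["format"] in ("ascii", "VOTable")
            and res["type"] in ("catalogue", "survey")
            and (unclassified or wanted)
            and not math.isnan(res["decmin"]))  # decmin must be meaningful


selected = [r["name"] for r in RESOURCES if matches_1_1(r)]
print(selected)
```

In this toy data the FITS archive, the catalogue classified only as "Galaxies" and the catalogue with no meaningful decmin are all rejected.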

1.2 From the DataSets meeting criteria in 1.1, query ResourceMetadata and sort DataSets
accordingly (perform steps in order)

coverage "sourcedensity" descending order (ie high=good)
coverage "tablenrows" descending order for completeness or ascending order for speed
coverage "tablesize" ascending order
dataquality "astrometryerror" ascending order
dataquality "timingerror" ascending order

This is an optional prioritisation step to allow only the DataSets which are more likely to be useful to
be selected, and/or to choose the order in which DataSets are queried or (the interim results) moved.

Prioritisation could be applied at any later step. Other criteria like sensitivity could also be used.
Sorting the errors in ascending order implies that, for the errors, NaN counts as very large. It would
be simplest if Null values were not allowed for sourcedensity and the size etc. of the DataSets.
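The prioritisation in 1.2 could be sketched as a single multi-key sort, with NaN errors mapped to infinity so that they sort last. This reads "perform steps in order" as "earlier criteria take precedence and later ones break ties", which is only one possible interpretation; the dataset records are hypothetical, and tablenrows is sorted descending (the "completeness" option).

```python
import math

# Hypothetical DataSets that already passed step 1.1; values are invented.
DATASETS = [
    {"name": "a", "sourcedensity": 120.0, "tablenrows": 5000, "tablesize": 2.0,
     "astrometryerror": 0.3, "timingerror": float("nan")},
    {"name": "b", "sourcedensity": 120.0, "tablenrows": 9000, "tablesize": 4.5,
     "astrometryerror": 0.1, "timingerror": 1.0},
    {"name": "c", "sourcedensity": 300.0, "tablenrows": 100, "tablesize": 0.1,
     "astrometryerror": float("nan"), "timingerror": 0.5},
]


def nan_as_large(x):
    """Treat a missing (NaN) error as very large, so it sorts last."""
    return math.inf if math.isnan(x) else x


def priority_key(ds):
    """Sort key: negated fields sort descending, the rest ascending."""
    return (-ds["sourcedensity"],           # high density first
            -ds["tablenrows"],              # more rows first (completeness)
            ds["tablesize"],                # small tables first
            nan_as_large(ds["astrometryerror"]),
            nan_as_large(ds["timingerror"]))


ranked = [ds["name"] for ds in sorted(DATASETS, key=priority_key)]
print(ranked)
```

Swapping the sign on tablenrows would give the "ascending order for speed" variant instead.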

1.3 From the DataSets meeting criteria in 1.1, query UCD list

UCD == (StellarCluster || (Star && MembershipofStellarCluster))

This should select DataSets which either explicitly list stellar clusters, or which list stars and state
whether they are members of a cluster.
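If each DataSet's UCD list is held as a set of strings, the 1.3 condition becomes a membership test plus a subset test. The sketch below assumes exactly that representation; the UCD names are the placeholder names used in this note, not a final UCD convention, and the dataset names are invented.

```python
# Hypothetical UCD lists per DataSet, using the placeholder UCD names above.
UCD_LISTS = {
    "clusters_cat": {"StellarCluster", "AngularPosition"},
    "members_cat": {"Star", "MembershipofStellarCluster", "ProperMotion"},
    "field_stars": {"Star", "AngularPosition"},
}


def lists_cluster_info(ucds):
    """StellarCluster || (Star && MembershipofStellarCluster)."""
    return ("StellarCluster" in ucds
            or {"Star", "MembershipofStellarCluster"} <= ucds)


selected = sorted(name for name, ucds in UCD_LISTS.items()
                  if lists_cluster_info(ucds))
print(selected)
```

A catalogue of stars with no membership column ("field_stars" here) is rejected at this step.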

3. RegistryQuery for existing proper motion and distance measurements

3.1 From the DataSets meeting criteria in 1.1, query UCD list

UCD == (ProperMotion)

AND 3.2 From the DataSets meeting criteria in 1.1, query ResourceMetadata

coverage "startdate" value != NaN

This should select DataSets with meaningful values of the epoch of observation (so proper motions
could be calculated if not already explicitly catalogued).
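Steps 3.1 and 3.2 combine a UCD test with a ResourceMetadata test. A minimal sketch follows, assuming each DataSet record carries both its UCD set and its coverage "startdate"; the record layout and values are hypothetical, and the two conditions are joined with AND exactly as written above.

```python
import math

# Hypothetical DataSets that already passed step 1.1; values are invented.
DATASETS = [
    {"name": "pm_with_epoch", "ucds": {"ProperMotion", "Epoch"},
     "startdate": 1998.2},
    {"name": "pm_no_epoch", "ucds": {"ProperMotion"},
     "startdate": float("nan")},
    {"name": "positions_only", "ucds": {"AngularPosition"},
     "startdate": 2001.0},
]


def has_proper_motion(ds):
    """Step 3.1: the UCD list contains ProperMotion."""
    return "ProperMotion" in ds["ucds"]


def has_epoch(ds):
    """Step 3.2: coverage "startdate" is a meaningful (non-NaN) value."""
    return not math.isnan(ds["startdate"])


selected = [ds["name"] for ds in DATASETS
            if has_proper_motion(ds) and has_epoch(ds)]
print(selected)
```

Keeping the two tests as separate functions makes it easy to relax the combination later, for example to accept DataSets that have epochs but no catalogued proper motions.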

5. RegistryQuery to enable colour-colour selection:

5.1 From the DataSets meeting criteria in 1.1, query ResourceMetadata

wavelengthrange (value == "ir" || value == "optical")

This should select DataSets which explicitly contain optical or IR measurements.

5.2 From the DataSets meeting criteria in 5.1, query ResourceMetadata

((wavelengthshort (value x)) ||
(wavelengthshort (value y)))

This should select DataSets which explicitly contain optical or IR measurements, or which cover at
least part of the wavelength range x-y which covers the I, R and K bands as defined in the optical/IR.
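One way to read the combined intent of 5.1 and 5.2 is "the DataSet is labelled ir or optical, or its numerical wavelength limits overlap the range x-y". The sketch below assumes that reading and uses a standard interval-overlap test; the comparison operators, the use of both wavelengthshort and wavelengthlong, the values chosen for x and y (in metres) and the dataset records are all assumptions for illustration rather than part of the query as written.

```python
# Assumed bounds of the range x-y in metres, chosen only so that the
# R, I and K bands fall inside it; these are not values from the Science Case.
X_SHORT = 0.55e-6
Y_LONG = 2.4e-6

# Hypothetical DataSets; None marks a missing wavelengthrange label.
DATASETS = [
    {"name": "near_ir", "wavelengthrange": "ir",
     "wavelengthshort": 1.0e-6, "wavelengthlong": 2.5e-6},
    {"name": "unlabelled_red", "wavelengthrange": None,
     "wavelengthshort": 0.6e-6, "wavelengthlong": 0.9e-6},
    {"name": "mm_survey", "wavelengthrange": "mm",
     "wavelengthshort": 1.0e-3, "wavelengthlong": 3.0e-3},
]


def overlaps(ds, lo, hi):
    """True if [wavelengthshort, wavelengthlong] overlaps [lo, hi]."""
    return ds["wavelengthshort"] <= hi and ds["wavelengthlong"] >= lo


def matches_wavelength(ds):
    """Explicit ir/optical label, or numerical coverage inside x-y."""
    return (ds["wavelengthrange"] in ("ir", "optical")
            or overlaps(ds, X_SHORT, Y_LONG))


selected = [ds["name"] for ds in DATASETS if matches_wavelength(ds)]
print(selected)
```

The unlabelled DataSet is kept because its numerical limits fall inside the range, which is the case the numerical test is meant to catch.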

AND 5.3 From the DataSets meeting criteria in 5.1, query UCD list

UCD == (Optical && (Photometry && ((I_Band || R_Band || numerical band spec.)))) ||
(IR && (Photometry && ((K_Band || numerical band spec.)))) ||
((Optical || IR) && (Colour && (I_Band || R_Band || K_Band || numerical band spec.)))

This should select suitable DataSets whether or not they have a detailed numerical description of the
wavelength coverage (ideally all should have but this may take a while to implement) - or which
already contain Colour information. In the case of e.g. K/R colour, I presume this would be classified
as both IR and Optical.

Received on 2003-04-30Z14:44:51