Keith.
--
Keith Noddle                    Phone:  +44 (0)116 223 1894
AstroGrid Project manager       Fax:    +44 (0)116 252 3311
Dept of Physics & Astronomy     Mobile: +44 (0)7721 926 461
University of Leicester         Email:  ktn-at-star.le.ac.uk
Leicester, UK LE1 7RH           Web:    http://www.astrogrid.org

Received on 2007-10-03Z17:49:44

attached mail follows:
Doug, Ray,

Just let me answer both of you about the necessity of an information schema, which I think is NOT the best way -- neither for storing the metadata related to tabular data, nor for data discovery.

1. For storing the metadata related to tabular data:

   a) Please consider that the numerous tabular data servers which are (and will be) connected to the VO may have different opinions from yours. For instance, a server producing a table which is the result of some "on-the-fly" computation would most likely not like having to manage a database just to store an information schema. Should TAP be discarded as a protocol for such servers?

   b) A full information schema is quite complex, because the description of the relations which exist between the tabular schemas may be quite complex. Complex relations are better described by object models -- yes, the relational model has severe limitations!

   c) Contrary to Ray's opinion, it is generally not easy to extend the information schema: any added complexity requires changing the schema of the information schema implemented on hundreds of data servers, not a trivial task!

2. For querying on the basis of the metadata: would it be possible to start from typical examples? Here are some:

   a) Give me the list of all tables that you store in your data server. This looks simple, Doug. But the result from the "TABLES" information schema is missing all relations between tables, which is an important piece. And anyway an information schema can't be required to deliver this information.

   b) Give me the list of the columns of the tables making up the 2MASS survey. Again this looks simple. There are 2 tables (point sources and extended sources, if one restricts to the currently public datasets). These tables are related (an ext_key indicates related sources). A join between the "TABLES" and "COLUMNS" tables of the information schema would again miss this link. And as above, for full lists, an information schema can't be a requirement.

   c) Find tables giving spectroscopic redshifts for bright AGNs (Bmag<18). Implied searches: either "AGN" (or AGNs, or active galactic nuclei) in the table titles, plus a requirement for at least one of the columns having ucd = "src.redshift" plus another having "phot.mag;em.opt.B". It is not easy to formulate this with an SQL query (it implies a join of the "COLUMNS" table with itself). Moreover the object class might be hidden in some column containing object classification.

   d) Find tables containing IR fluxes in the 1-8 micron range in my preferred star-forming region. This would imply a possibility of finding tables having something in a given region, plus a choice of columns giving infra-red fluxes in the appropriate wavelength range. Inserting REGIONS as fields of some table of the information schema is not a trivial task; we are currently far from this.

   Ray or Doug, could you quote practical examples where the existence of an information schema really does simplify the metadata search?

3. VOTable and TAP. Doug, what else than VOTable could be the output from TAP? FITS and CSV are obvious, but obviously these are badly lacking the fundamental pieces of metadata which are necessary for most data processing.

You may have noticed that I'm trying (as usual) the bottom-up approach. Storing metadata of tables in tables looks consistent and attractive -- and I did find it attractive when I started to deal with relational databases 20 years ago -- but from a practical point of view it does not look to be the best solution for storing all metadata (the information schema rapidly becomes quite complex), and the SQL/ADQL languages are not suited to data discovery, which mainly requires textual searches ("a la Google" searches).

Finally, I feel it is important to leave some freedom to the data providers -- from the beginning the VO claimed that it would not imply huge constraints!

Cheers, francois
================================================================================
Francois Ochsenbein       ------   Observatoire Astronomique de Strasbourg
   11, rue de l'Universite F-67000 STRASBOURG   Phone: +33-(0)390 24 24 29
Email: francois-at-astro.u-strasbg.fr (France)    Fax: +33-(0)390 24 24 32
================================================================================
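
To make example 2c of the message above concrete, here is a minimal sketch, in Python with SQLite, of the kind of self-join it refers to. Everything in it is an assumption made for illustration: the table names info_tables and info_columns, their column names, and the sample rows are invented here and are not taken from any actual information schema or from the TAP drafts. Only the two UCD strings and the "AGN" keyword come from the message.

```python
import sqlite3

# Hypothetical, simplified metadata tables (names and layout are assumptions).
SCHEMA = """
CREATE TABLE info_tables  (table_name TEXT PRIMARY KEY, description TEXT);
CREATE TABLE info_columns (table_name TEXT, column_name TEXT, ucd TEXT);
"""

# Made-up sample rows, for illustration only.
TABLE_ROWS = [
    ("agn_z", "Spectroscopic redshifts of bright AGN"),
    ("agn_radio", "Radio fluxes of AGN"),
    ("nuclei_z", "Redshifts of active galactic nuclei"),
]
COLUMN_ROWS = [
    ("agn_z", "z", "src.redshift"),
    ("agn_z", "Bmag", "phot.mag;em.opt.B"),
    ("agn_radio", "z", "src.redshift"),
    ("agn_radio", "S1400", "phot.flux;em.radio"),
    ("nuclei_z", "z", "src.redshift"),
    ("nuclei_z", "Bmag", "phot.mag;em.opt.B"),
]


def build_demo():
    """Create an in-memory database holding the sample metadata."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO info_tables VALUES (?, ?)", TABLE_ROWS)
    conn.executemany("INSERT INTO info_columns VALUES (?, ?, ?)", COLUMN_ROWS)
    return conn


def find_tables(conn, keyword, ucd_a, ucd_b):
    """Names of tables whose description contains `keyword` and which have
    one column tagged `ucd_a` and another column tagged `ucd_b`."""
    sql = """
        SELECT DISTINCT t.table_name
        FROM info_tables AS t
        JOIN info_columns AS c1 ON c1.table_name = t.table_name
        JOIN info_columns AS c2 ON c2.table_name = t.table_name
        WHERE t.description LIKE '%' || ? || '%'
          AND c1.ucd = ?
          AND c2.ucd = ?
        ORDER BY t.table_name
    """
    return [row[0] for row in conn.execute(sql, (keyword, ucd_a, ucd_b))]


if __name__ == "__main__":
    conn = build_demo()
    print(find_tables(conn, "AGN", "src.redshift", "phot.mag;em.opt.B"))
```

With these sample rows the "AGN" search returns only agn_z. The nuclei_z table carries both UCDs but is missed because its description spells out "active galactic nuclei", which is the textual-search weakness the message points at. The Bmag<18 condition is not covered at all, since it concerns the data rather than the metadata.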