Re: role of the registry

From: Anita M. S. Richards <a.m.s.richards-at-manchester.ac.uk>
Date: Thu, 10 May 2007 18:57:12 +0100 (BST)

> The registry was conceived as a resource location service. The formative
> document describes resource metadata. The name of the working group is
> Resource Registry. There has been debate for a long time on how far to push
> the registry in the direction of detailed service-level metadata.
.....
> This is not unique to the registry, but it is clear in the case of
> the registry that even the simple stuff is not done well or completely.

If you mean that people don't use it 'well or completely' then I agree, but there is a difference between implementing it wrongly and implementing it correctly but superficially. As a data provider, I want a tool to help me register the content, including coarse coverage, and basic curation and access information. I will also provide tools/services particularly useful for these data and I want to be able to describe them, too. If I published tables, then the column names and metadata _might_ be appropriate. On the other hand, if I have 100 tables each with 100 columns, it might not be.

At present, almost all of this is covered in the human-language RM document, although tool publishing is a little rough. I am distressed to learn from engineers that the schemata may not be consistent with this, and even more distressed if I am told that in order to understand the model, I have to read the schemata (as mentioned in another posting). I am probably fairly typical of data providers in that, given a nice XML editor (or better still, a form interface) and some instructions and examples, I can just about fill in a document but not evaluate it critically. We should not expect users to have to read schemata!

Even if you don't agree that column names are useful, they are optional, like much else in the RM document. Much as I rant about data providers who don't include proper coverage information, I agree with it being optional because ultimately the providers have to decide what is appropriate for their data sets.

> I have never been convinced by the arguments for including things like table
> column names and UCDs in the registry. The use case recently cited, that of
> dynamically building SEDs by locating catalogs with relevant columns would,
> I think, never be trusted by astronomers doing research.

On the contrary, Bob, it is invaluable at present via Vizier or AstroGrid or even Google _because other more accurate metadata aren't always available_. UCDs in particular are useful. It depends what you are doing, but for finding potentially interesting data it is certainly useful. You can always check more thoroughly as needed...

> I suggest that everyone take a deep breath, step away from their desks for a
> while, and try to recall what the VO is really about. Key issues are data
> discovery, interoperability, and a low barrier to participation.

Exactly, which is why, in my view, we should prioritise:
a) making sure that the human-language documentation and the schemata are consistent;
b) encouraging affiliated VOs to develop tools to make it easy to register resources;
c) if necessary, improving the provision for registering tools; and finally
d) making sure that things are labelled as 'must', 'should', 'may' or equivalent, so that the buy-in is very simple but people who do want to provide more information can do so, especially where there are tools to use it (as is the case for column names and especially UCDs!).

thanks
a

Received on 2007-05-10Z19:57:34