Hi Roy,
I understand your concern. My overall goals in this effort include
We are charting new waters here, so some things we won't get right the first time. Plus, things change fast--it's not possible to have everything at production quality. Let me ask you, as an NVO developer, a couple of questions:
More specifically...
On Tue, 16 Sep 2003, Roy Williams wrote:
> This registry schema is getting to be very complex. Even to understand the
> simplest xml instance, there need to be 6 or 8 schemas ingested. When we
> make binding tools for VOResource, there are hundreds of classes generated,
> one for each element.
In v0.8.1:
VOResource (core): 32 elements
VOOrg: 3
VODataService: 15
VOPerson: 2
> I am reminded of a
> Bill going through Parliament, having special interests adding their own
> pork-barrel projects. The rule in NVO is not to attempt completeness, but
> rather to get 95% of the use cases with 20% of the work. How can we return
> to this maxim?
I don't think we're trying for the 95%. What is in there, for the most part, I believe, comes from current needs.
> (1) Is this schema modular? Do I need to parse all the optional modules in
> order to work with the core?
Yes, it is modular, and it is intended that you do not have to parse the optional modules in order to work with the core. Binding tools, however, may not be set up well to do that, while other parsing tools can handle this better. This is one of the things we need to learn how to cope with. This is the research.
Do you really want extensibility? Do you want to be able to define your "Elephant" resource with specialized elephant metadata? Then this is what we have to figure out how to do.
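To make the "core without the optional modules" idea concrete, here is a minimal sketch of a non-binding parser that reads only the core-namespace elements of an instance and skips anything from an extension. The namespace URIs, the element names (Title, Identifier, Coverage, Band), and the sample document are all made up for illustration; they are not taken from the actual v0.8.1 schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, for illustration only.
CORE_NS = "http://example.org/VOResource"
EXT_NS = "http://example.org/VODataService"

# Hypothetical instance mixing core elements with one extension element.
SAMPLE = f"""<Resource xmlns="{CORE_NS}" xmlns:vs="{EXT_NS}">
  <Title>Example Sky Survey</Title>
  <Identifier>example/survey</Identifier>
  <vs:Coverage><vs:Band>optical</vs:Band></vs:Coverage>
</Resource>"""


def core_metadata(xml_text):
    """Return {local name: text} for direct children in the core namespace."""
    root = ET.fromstring(xml_text)
    prefix = "{" + CORE_NS + "}"
    result = {}
    for child in root:
        # Elements from any other namespace (the extensions) are skipped.
        if child.tag.startswith(prefix):
            result[child.tag[len(prefix):]] = (child.text or "").strip()
    return result


print(core_metadata(SAMPLE))
```

A namespace-aware parser like this never needs the extension schemas loaded; whether a given binding tool can be configured to behave the same way is exactly the open question.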
> What is the semantic nature of the core module?
It describes generic resources. DataCollections and Services are specializations of a generic Resource.
> (2) What is the list of metadata formats that the registry covers? To me it
> is Services, Datasets, Projects, Organizations. Why are "people" still in
> the registry? Can't Astrogrid do their own thing somehow without bothering
> IVOA, since they are the ones that want this? They can make a "person"
> schema that includes VOResource, rather than forcing VOResource to include
> "person".
Recall from previous discussions:
Why do you feel "bothered" by "people"?
> (3) What small committee is responsible for additions -- and pruning -- in
> the light of experience? Let us form this in Strasbourg. What is the best
> number of people? 6? 10?
(I underscore Tony's response here.)
> (4) Why are there suddenly five kinds of linking relationship? If simple
> "citation" is good enough for the Journals, why is it not good enough for
> VO? Half the people filling in these forms will do nothing in response to a
> complicated question -- and so we lose metadata -- but they will recognize
> and respond to the word "citations".
Do we need to show that one resource is a mirror of another somehow? Is this an important issue for compatibility with ADEC identifiers, which are location-independent, and which you feel is a high priority?
What Tony has suggested is an approach to describing relationships that will actually reduce complexity in the future as we find the need to add more. It puts them all in one place.
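As a rough illustration of what "all in one place" could mean (this is my own sketch, not Tony's actual proposal), a single relationship record carrying a type label lets new kinds be added without adding new fields. The class names and the "mirror-of" and "cites" labels are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    kind: str    # hypothetical vocabulary, e.g. "mirror-of" or "cites"
    target: str  # identifier of the related resource


@dataclass
class Resource:
    identifier: str
    # Every relationship lives in this one list, whatever its kind.
    relationships: list = field(default_factory=list)

    def related(self, kind):
        """Return target identifiers for relationships of the given kind."""
        return [r.target for r in self.relationships if r.kind == kind]


survey = Resource("example/survey")
survey.relationships.append(Relationship("mirror-of", "example/survey-origin"))
survey.relationships.append(Relationship("cites", "example/catalog"))
print(survey.related("mirror-of"))
```

A simple input form could still expose only one kind (say, citations) and leave the rest empty.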
> (5) If a Fortran programmer even older than me approaches the registry to
> publish, or to query, can we make something understandable for him/her? What
> does that form look like? Our primary purpose is capturing that metadata,
> not pandering to the most complex cases.
What would be helpful are some examples of the simple queries or input forms we want to create, so that we can test whether we can accomplish this simply.
If you want to present a simple form, then leave out the bits you don't want. As I've mentioned, we've been working on schemes for doing this which I would be happy to share with you.
> (7) Am I the only one with these mutinous thoughts?
Probably not. However, I think that as IVOA developers, it is our job to put ourselves on the front line of difficult, complex issues. In doing so, we try to protect the lines behind us: the data providers and the users. Yes, we don't try to solve the entire problem now, but we also don't paint ourselves into corners by not thinking ahead--that's where modularity and extensible architectures come in.
I think back on a few Z39.50 metadata schemas from previous decades. BIB-1, used by librarians, has over 300 terms. GEO-1, for earth science, has over 500. If you look at these, you'll see incredible redundancy. Talk about complexity. How do we support the simple when it needs to be simple, and how do we extend to the complex without things getting unwieldy?
cheers,
Ray
Received on 2003-09-17Z20:18:12