Re: Beyond the draft proposal

From: Brian Thomas <brian_thomas-at-earthlink.net>
Date: Mon, 4 Feb 2008 13:01:51 -0500


On Monday 04 February 2008 10:55:04 am Frederic Hessman wrote:
> Starting to think beyond the IVOA draft proposal:
>
> Right now, the IVOAT vocabulary (a cleaned-up version of the old IAU
> thesaurus) doesn't really cover everything one might need, e.g. there
> are the following (expressed as tokens)
>
> JohnsonPhotometry
> RMagnitude
> Filter
>
> so don't we really need
>
> RFilter
>
> or even
>
> JohnsonRFilter?

Yes, there are too many 'compound' terms. These can generally be identified by the multiple words that make up the token.

I guess the issue is: if you remove the compound terms, e.g. "JohnsonRFilter", how does one then specify that term? It's a question of the 'grammar'. This is easily done in an ontology, but is it needed outside ontologies? Probably, but I would argue that the grammar is defined by whatever 'specification' is using the IVOA Vocabulary terms.

An example here might be the "VOTable spec", which could define its grammar for using IVOA Vocabulary terms in a VOTable as simple addition, e.g. to get "JohnsonRFilter" one simply creates "Filter+JohnsonPhotometry+RMagnitude" (and I suppose order is important here). I'd rather not postulate further about the difficulties/needs of this hypothetical system... it is just an example! Rather, the point here is that I think this is something we should not worry about.
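
Purely to make the "simple addition" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function name parse_compound, the tiny three-term vocabulary (taken from the tokens quoted above), and the choice to reject unknown terms. None of it comes from the VOTable spec.

```python
# Hypothetical subset of atomic vocabulary tokens, taken from this thread.
ATOMIC_TERMS = {"Filter", "JohnsonPhotometry", "RMagnitude"}


def parse_compound(expression, vocabulary=ATOMIC_TERMS, separator="+"):
    """Split an 'added' expression into its atomic terms, keeping their order."""
    terms = [part.strip() for part in expression.split(separator)]
    unknown = [term for term in terms if term not in vocabulary]
    if unknown:
        # Assumption: a term missing from the vocabulary is treated as an error.
        raise ValueError(f"terms not in vocabulary: {unknown}")
    # A tuple rather than a set, since order may be significant.
    return tuple(terms)


johnson_r_filter = parse_compound("Filter+JohnsonPhotometry+RMagnitude")
```

Returning an ordered tuple keeps open the question of whether "Filter+RMagnitude" and "RMagnitude+Filter" mean the same thing; that decision would belong to whichever specification defines the grammar.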

At any rate, if some eager beaver desires a more 'standard' grammar, then that should be a separate document.

We definitely should try to remove compound terms which may be created from 'atomic' stuff already present in the vocabulary.
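
As a rough illustration of how compound candidates might be flagged by their multiple words, the Python sketch below splits a CamelCase token into its component words. The function names and the regular expression are assumptions made for this example, and the heuristic only flags candidates; deciding whether a flagged token can really be rebuilt from atomic terms still needs a human.

```python
import re

# One capitalised or lowercase word, or a run of capitals not followed by
# a lowercase letter (so "R" in "JohnsonRFilter" is its own word).
WORD_PATTERN = re.compile(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])")


def split_words(token):
    """Split a CamelCase token into its component words."""
    return WORD_PATTERN.findall(token)


def is_compound(token):
    """Heuristic: a token made of more than one word is a compound candidate."""
    return len(split_words(token)) > 1


tokens = ["Filter", "RFilter", "JohnsonRFilter", "Volcano"]
candidates = [token for token in tokens if is_compound(token)]
```

Note that this heuristic also flags JohnsonPhotometry and RMagnitude, so it identifies candidates for review rather than terms to delete outright.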

>
> This is an issue if the only way to use tokens is via rdf:resource :
> can one use
>
> <myFilter rdf:resource="ivoat:SloanPhotometry"
> rdf:resource="ivoat:filter" rdf:resource="rMagnitude"/>
>
> One could argue that IVOAT is already too bloated (it is) and that the
> best approach would be to divide it up into sub-vocabularies, e.g.
>
> math (since this goes far beyond astronomy alone and may need to be
> coupled with an international math vocabulary)
> [snip]

I started to write that I agree that the IVOAT should be broken up, but then, doing so is likely to spark a big debate on *how*.

Rather, I would say we should simply leave it as is, and let the ontology writers decide which terms to import into their ontologies (they hardly have to use *all* the terms). In this fashion, the terms may be easily referenced in the vocabulary, but won't necessarily bloat or slow down the inference engines (I suppose someone could decide to create a whopping huge ontology with all the terms; that would be nasty).

>
> Obviously, this can easily result in a flood of mini-vocabularies
> where no one knows where what can be found. On the other hand, do we
> really need SloanPhotometry alongside with Volcano in the same
> vocabulary?

I'd argue you want to remove institution names from the vocabulary, so SloanPhotometry should not be present. The vocabulary should contain just general, workaday (in astronomy) terms. Specific singular instances of things, like people, institutions, or particular telescopes, should, in general, be avoided.

>
> Thus, I think it's about time we also discussed what all this whoop-
> dee-doo about vocabularies is actually good for and how the process/
> programmer/user on the street is supposed to deal with them.

They can be good for creating ontologies, and potentially also useful for mapping between ontologies. Please don't ask what ontologies are good for (!)

And I would also like to add that I'd like to see a *dictionary* of the vocabulary terms. This would settle the semantic meaning of these tokens, which is the crucial missing link between a vocabulary and its use in ontologies. I have been rebuffed/ignored about adding the definitions to the SKOS vocabulary; could it be time to revive this? I don't think it's impossible to add the definitions (many may be initially borrowed from other vocabularies/dictionaries such as WordNet), and you will find that there are many collisions in meaning in the existing tokens. These identify things which should be 'cleaned up' (probably by splitting the offending token into 2 or more other tokens, each with a separate meaning).
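
To illustrate how such a dictionary could expose collisions in meaning, here is a small Python sketch. The definitions are made-up placeholders (not taken from IVOAT, SKOS, or WordNet), and the function names and the numbered-suffix naming scheme are likewise assumptions for the example only.

```python
# Hypothetical dictionary: token -> list of senses.
# The definitions below are invented placeholders for illustration.
DEFINITIONS = {
    "Filter": [
        "optical element that transmits a selected wavelength band",
        "operation that removes unwanted components from a signal",
    ],
    "Volcano": ["vent in a planetary crust through which material erupts"],
}


def find_collisions(definitions):
    """Return the tokens that carry more than one sense, sorted by name."""
    return sorted(tok for tok, senses in definitions.items() if len(senses) > 1)


def propose_split(token, senses):
    """Propose one numbered token per sense, e.g. Filter1, Filter2."""
    return {f"{token}{i}": sense for i, sense in enumerate(senses, start=1)}


collisions = find_collisions(DEFINITIONS)
proposal = propose_split("Filter", DEFINITIONS["Filter"])
```

In practice the replacement tokens would get meaningful names rather than numbers; the point is only that once definitions are written down, the tokens needing a split become easy to list.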

=brian

>
> Rick
>
>
>
Received on 2008-02-04Z19:02:39