Brian and Rob, hello.
On 2007 Nov 21, at 18:35, Rob Seaman wrote:
>> Those arguments are to do with audience (expert vs. non-expert)
>> and previous investments (three important journals already have
>> actual resources tagged with actual vocabulary items).
>
> So there are a number of curated vocabularies, one of which (the
> "IAU Thesaurus") is called a thesaurus, but is actually a
> vocabulary like all the rest?
Until today, I'd had the nagging feeling that a thesaurus was something more exotic than it is. But no, it's just a vocabulary plus light structure. So...
> As far as audience, we should refer to our work products by names
> designed to reach the non-experts.
Absolutely. And I think that 'vocabulary' would do perfectly well for everyone.
I think we should carry on SKOSifying the structure already contained within things like A&A, AOIM and IAU, because it will be valuable and is free, but I suggest that outside this group we talk exclusively about 'vocabularies', on the grounds that no-one but us knows the distinction, and even we're not that bothered. Yes?
Brian:
> Human-to-machine interaction is not particularly relevant in my mind
> at this time, as it is, and is likely to be, an interface which is
> crafted by the individual archive/repository/tool builder. IF we were
> going ahead to specify a natural language query (NLQ) in which the
> terms of the thesaurus were to be used, then I can see a need for it.
> But a NLQ (particularly one which may be executed across the entire
> IVOA!!) is just far, far away and not as pressing as the issues of
> dataset labeling, machine to machine interchange and development of
> machine understanding of data (e.g. ontologies).
That's interesting -- I see it as very much the other way around!
I'm not thinking of full-scale NLQ, but simply getting the machine to do something a bit brighter when a user types 'cataclysmic binary' into a VOExplorer search box. "Ah: 'cataclysmic binary' is part of an altLabel of the iau#cataclysmicvariablestars concept, so I'll make that concept-query of Registry++; mmm, not many hits, so I'll speculatively query iau#binarystars and iau#variablestars as well and offer those to the user. In any case, by this time we're in logic-land, so I'll find what CDS-AstroOnt ontology classes have iau#cataclysmicvariablestars as a relatedConcept (say), because I know that the CDS-AstroOnt classes have links to SIMBAD terms, so I can hit the SIMBAD database, too. Plus, via inter-vocabulary links, I now know what A&A concepts these relate to, and from their prefLabels know which strings to look up in ADS." And so on.
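The first two steps of that monologue (match the typed string against labels, then widen the search if the registry returns few hits) can be sketched roughly as below. This is only an illustration in Python, not anything VOExplorer or the Registry actually does: the `Concept` class, the label strings, the `broader` links, the `min_hits` threshold and the `count_hits` callback are all assumptions made up for the example. In particular, it assumes the two speculative concepts are reached through SKOS-style broader links.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    uri: str
    pref_label: str
    alt_labels: list = field(default_factory=list)
    broader: list = field(default_factory=list)  # URIs of broader concepts


# Toy vocabulary. The labels and broader links are assumed for illustration;
# they are not copied from the real IAU thesaurus.
VOCAB = {
    c.uri: c
    for c in [
        Concept(
            "iau#cataclysmicvariablestars",
            "cataclysmic variable stars",
            alt_labels=["cataclysmic binary stars"],
            broader=["iau#binarystars", "iau#variablestars"],
        ),
        Concept("iau#binarystars", "binary stars"),
        Concept("iau#variablestars", "variable stars"),
    ]
}


def match_concepts(text, vocab=VOCAB):
    """Return URIs of concepts whose prefLabel or altLabel contains the text."""
    needle = text.strip().lower()
    return [
        c.uri
        for c in vocab.values()
        if any(needle in label.lower() for label in [c.pref_label, *c.alt_labels])
    ]


def plan_queries(text, count_hits, min_hits=5, vocab=VOCAB):
    """Return (primary, speculative) concept URIs to query for a search string.

    count_hits is a hypothetical callback standing in for a registry query;
    it takes a concept URI and returns the number of matching resources.
    """
    primary = match_concepts(text, vocab)
    speculative = []
    for uri in primary:
        if count_hits(uri) < min_hits:
            # Few hits: offer the broader concepts as well.
            for b in vocab[uri].broader:
                if b not in primary and b not in speculative:
                    speculative.append(b)
    return primary, speculative
```

Calling `plan_queries("cataclysmic binary", count_hits=lambda uri: 2)` with a stand-in registry that always reports two hits would pick out the cataclysmic variables concept as the primary query and offer the two broader concepts as speculative ones. The later steps (relatedConcept links into CDS-AstroOnt, and inter-vocabulary links to A&A) would follow the same pattern of walking links between concepts.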
Now, Brian and Ed could tell (and have frequently told) a very similar story using only ontologies; indeed _I've_ told a similar story using ontologies. But ontologies can do more than this, all the way up to machine understanding of data, and we will need this, just as you say.
What I see the vocabulary stuff as doing is a couple of relatively simple things:
So that's why I see the vocabularies stuff as being easier, bringing short-term gains, and providing a route into the fuller ontologies work to come.
> If "Thesaurus" automatically implies machine-to-human interaction,
> then I apologize, and then move that we change to a compatible term
> which implies "machine-to-machine" instead (vocabulary? dictionary?)
I think 'ontologies' is good....
All the best,
Norman
-- 
------------------------------------------------------------
Norman Gray : http://nxg.me.uk
eurovotech.org : University of Leicester, UK

Received on 2007-11-21Z22:19:28