Hi Rick,
I have just been analysing your IVOA Thesaurus. (The version I have used is Wed Oct 24 15:25:24 2007.)
First of all I would like to talk about top level concepts. I found that there were 2,646 top level concepts. I feel this is very high, particularly as there are only about 2,850 concepts in total. It is all the more striking when compared with the IAU Thesaurus, which has only 516 top level concepts.
From this I have a question for everyone: how many top level concepts does it make sense to have?
My opinion about top level concepts is that they should help the user to explore the vocabulary. As such, I believe that there should be only a limited number of top level concepts, few enough to be displayed easily at one time, e.g. at most 50 but probably significantly fewer.
In going through the concepts that appear as top level concepts, I can see plenty of terms that I would think are narrower than other terms, e.g. I would think that "telescope mountings" should be a narrower term than "telescope". Likewise for the many other concepts to do with different features of telescopes.
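To make the counting concrete, here is a minimal sketch of how top level concepts could be identified, assuming a top level concept is simply one that declares no broader term. The dictionary layout and the sample entries are invented for illustration and are not taken from the thesaurus file.

```python
# Hypothetical in-memory form: concept identifier -> its declared broader terms.
# The entries below are illustrative only.
broader_terms = {
    "telescopes": [],
    "telescopemountings": [],            # no broader term declared, so top level
    "secondarymirrors": ["telescopes"],
}


def top_level_concepts(broader_terms):
    """Return the sorted identifiers of concepts that declare no broader term."""
    return sorted(c for c, parents in broader_terms.items() if not parents)


tops = top_level_concepts(broader_terms)
print(len(tops), "top level concepts out of", len(broader_terms))
```

Adding "telescopes" as a broader term of "telescopemountings" in this toy data would drop the count from two to one, which is the kind of change suggested above.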
My second issue is to do with maintenance. In going through the vocabulary I have found quite a few errors that have crept in during the editing process. (The list can be found at the bottom of this email.) Some of these are simple typos, but others arise where the identifier of a concept has been changed and the relationships that refer to it have not all been updated. Others make the poly-hierarchy of the vocabulary inconsistent, e.g. "acceleration of gravity" has "gravity" as a broader term but the inverse relation is not present in "gravity". I think that we need to develop scripts that check that the inverse of every relationship declared in the hierarchy is present. This would rely on the related relationship being symmetric and on broader and narrower being inverses of each other.
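A minimal sketch of such a check is given here, assuming the relationships have already been read into sets of (subject, object) pairs. The function name, the data layout, and the "weight" entry are assumptions made for illustration. Only the "acceleration of gravity" case comes from the thesaurus, and its identifier spelling is a guess.

```python
# Hypothetical relation tables: each holds (subject, object) pairs.
broader = {("accelerationofgravity", "gravity")}
narrower = set()                          # the inverse is missing from "gravity"
related = {("gravity", "weight"), ("weight", "gravity")}   # invented example


def missing_inverses(broader, narrower, related):
    """List the statements needed to make the relationships consistent."""
    problems = []
    # Every "child broader parent" needs "parent narrower child".
    for child, parent in sorted(broader):
        if (parent, child) not in narrower:
            problems.append(f"{parent} narrower {child}")
    # Every "parent narrower child" needs "child broader parent".
    for parent, child in sorted(narrower):
        if (child, parent) not in broader:
            problems.append(f"{child} broader {parent}")
    # "related" is symmetric, so each pair must appear in both directions.
    for a, b in sorted(related):
        if (b, a) not in related:
            problems.append(f"{b} related {a}")
    return problems


for statement in missing_inverses(broader, narrower, related):
    print("missing:", statement)
```

Reporting the missing statement, rather than just the offending concept, means the output can be pasted back into the vocabulary once it has been reviewed.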
I hope this all helps. Rick, please keep up the good work on generating the IVOA thesaurus.
Alasdair
Errors that I have identified in the thesaurus:
The following are labels that I believe appear as identifiers in a relationship but have not been defined as concepts in their own right. This is probably due to typos or the changing of identifier labels.
faberjacksondistances
ugeminoriumstars
ubvrijklphotometry
light
differentialimagemotionmonitor
neutralhydrogen
yerkesgalaxyclassification
tsubdwarfstars
blackdwarfs
correlations
Igalaxies
cgalaxies
millimeterwaves
Isubdwarfstars
bose0einsteinnuclei
period0colorrelations
rareearths
whitedwarfs
tullyfisherrelations
seconarymirrors
oortsconstants
kuiperbelt
lineofsight
multislitspectraphs
multifiberspectraphs
cosmologicalparameters
seti
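A list like the one above could be regenerated automatically after each edit. The following is a minimal sketch, assuming the defined concept identifiers and the relationship triples have already been extracted from the file. The two sample relationships are invented to show how a typo would surface and are not claimed to be the actual statements in the thesaurus.

```python
# Hypothetical extract: identifiers defined as concepts in their own right.
defined = {"telescopes", "secondarymirrors"}

# Hypothetical (subject, relation, object) triples; the first carries a typo.
relationships = [
    ("telescopes", "narrower", "seconarymirrors"),
    ("secondarymirrors", "broader", "telescopes"),
]


def undefined_identifiers(defined, relationships):
    """Return the sorted identifiers used in a relationship but never defined."""
    used = set()
    for subject, _relation, obj in relationships:
        used.update((subject, obj))
    return sorted(used - defined)


for identifier in undefined_identifiers(defined, relationships):
    print(identifier)
```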
Alasdair J G Gray <http://www.dcs.gla.ac.uk/~agray/>
Research Associate: Explicator Project
Computer Science, University of Glasgow
0141 330 6292
Received on 2007-10-25Z16:42:55