Sorry, forgot to include everyone in the reply.
Alasdair
Alasdair J G Gray <http://www.dcs.gla.ac.uk/~agray/>
Research Associate: Explicator Project
http://explicator.dcs.gla.ac.uk
Computer Science, University of Glasgow
0141 330 6292
From: Alasdair Gray
Sent: 14 November 2007 10:24
To: 'Frederic V. Hessman'
Subject: RE: Format of tokens
Hi Rick, All,
Comments in line below preceded by [AG]
From: owner-semantics-at-eso.org [mailto:owner-semantics-at-eso.org] On Behalf
Of Frederic V. Hessman
Sent: 13 November 2007 17:09
To: IVOA semantics
Subject: Re: Format of tokens
My concern is that there is a discrepancy between Rick's SKOS model generated by his script and the original files. My feeling is that the SKOS model representing the IAU Thesaurus that is to be published by the IVOA should be an accurate model. If we cannot produce an accurate SKOS model but claim that it is, then people will not trust the IVOAT or any of the semantics work involving vocabularies and ontologies.
The problem was simply that I had forgotten to delete the entries which turned into aliases. The real raw statistics are
Number of initial entries: 2950
[AG] Glad to see that we agree on this figure.
Number of explicit narrower entries (with BTs): 1226
Number of explicit broader entries (with NTs): 512
Number of entries with references (with RTs): 2134
Final number of SKOS Concepts: 2551
[AG] We are within 1 here, which could be an error on my part.
Number of TopConcepts: 1325
[AG] I do not agree with this figure (see next comment).
Thus, you can't assume that the BT's and NT's are all present in the original (trex.txt). Alasdair's figure of 512 top concepts assumed that the IAU thesaurus was reasonably complete and self-consistent.
[AG] I cannot claim to have looked closely at the BT/NT relationships in
the original (trex.txt) file. However, the IAU thesaurus also issues a
hierarchy file (hierlist.txt). This file gives the hierarchy of the
original thesaurus and it is this that has 516 top level concepts. Rick
has assumed that a top level concept is one that does not have a broader
term. For the IVOAT I would agree with this as it should result in a
less confusing hierarchy that matches users' expectations. However, for
the IAU93 this is wrong as it results in a different number of top level
concepts (although I would have thought that it would have been fewer
than 516 since some of these terms appear as narrower terms of other
concepts) and thus a different hierarchy from the original version of
the thesaurus.
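To make the point above concrete, here is a minimal Python sketch of the "no broader term means top concept" rule. The entries and BT links are hypothetical and are not taken from trex.txt or hierlist.txt; the function name top_concepts is also just an illustration, not part of Rick's script. It shows how an entry whose BT is missing from the source ends up counted as a top concept, which inflates the total.

```python
def top_concepts(broader):
    """Return the entries that list no broader term (BT), sorted by label."""
    return sorted(term for term, bts in broader.items() if not bts)


# Hypothetical miniature thesaurus: entry -> list of explicit BTs.
broader = {
    "STARS": [],
    "VARIABLE STARS": ["STARS"],
    # Suppose the source entry simply omits its BT line:
    "CEPHEIDS": [],
}

# CEPHEIDS is reported as a top concept only because its BT is missing.
print(top_concepts(broader))
```

With a complete set of BT links the same rule would return only STARS, so the count depends directly on how complete the BT/NT entries in the source are.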
Well, better than you thought and better now that I've found the (latest) bug.
[AG] I'm afraid I have to disagree with you here.
[AG] Now correct on the web version too :-)
This problem is solved (it was the bug).
Frankly, the original document uses (practically) all capitals and we want to convert the original thesaurus using as few changes as necessary (the only point of doing it), so why not keep the original labels? If people hate to be shouted at and think that the IAU93 isn't very user-friendly, all the better. Any other format will have problems: e.g. you don't really want to turn "BAADE WESSELINK METHOD" into "Baade wesselink method" - you want people to use the IVOAT and see "Baade-Wesselink method".
Please see the appropriate thread in the semantics list for a full discussion of this issue.
... from which you'll see that there are few people who really care. I still haven't seen any recent complaints about compromise notation #2, whereas there were stronger complaints earlier about #1 and #3. Barring complaints, can we simply adopt #2? There is no perfect solution (e.g. "Ba II stars" -> "BaIiStars", which looks like something else).
[AG] I am happy with option 2 also.
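As an illustration of the "Ba II stars" case, here is a small Python sketch of one way such a token could be formed: capitalize each word and concatenate. This is an assumption that happens to reproduce the example quoted above; it is not a confirmed definition of notation #2, and the name label_to_token is hypothetical.

```python
def label_to_token(label):
    """Capitalize each whitespace-separated word and join them without spaces."""
    return "".join(word.capitalize() for word in label.split())


# The roman numeral "II" is lowercased to "Ii", which is the imperfection noted above.
print(label_to_token("Ba II stars"))
print(label_to_token("BAADE WESSELINK METHOD"))
```

Any rule this simple will mangle roman numerals and lose hyphens, so the human-readable form ("Baade-Wesselink method") would still have to live in the label rather than the token.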
Once we have agreement on these issues, then the results can be applied to the IVOAT.
... and the rest of the thesauri we're going to generate in this exercise.
Cheers (I think I'm going to go for a long drink to recover from this),
Now you all know how many beers you all owe me.
[AG] Absolutely. I am extremely appreciative of all the work that you
have put into generating these models.
Alasdair
Rick
Received on 2007-11-14Z12:15:58