Re: Format of tokens (was Re: Fwd: Re: IVOA Thesaurus)

From: Alasdair Gray <agray-at-dcs.gla.ac.uk>
Date: Fri, 02 Nov 2007 09:40:10 +0000


 > On Thursday 01 November 2007 6:32 pm Ed Shaya wrote:

 > Well, I vote to put back the underscores and the capitalization where appropriate. There is no need to go out of one's way and make all IDs cryptic just to make a point about the concept of tokens. In ontology these become the element names of instances and it is really handy to be able to readily discern what kind of instance it is by looking, rather than going to some lookup table. We need some prescience here, not to be confused with pre_science.

I would like to address Ed's comment about ontologies first, and how they differ from vocabularies modelled in SKOS. In the SKOS world of vocabularies, a user is not meant to use or see the IDs. The application they use should display the preferred label, which every concept should have (in all our examples they do). A concept without a preferred label does not really make sense, either in a SKOS model or in the real world, since humans will use some language phrase to describe the concept. Thus, the idea that the IDs become the element names of instances does not carry over to vocabularies.
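
A minimal sketch of that separation between ID and label, in Python. Everything in it is hypothetical and for illustration only: the vocabulary contents, the concept IDs, and the function name are assumptions, not taken from any IVOA vocabulary or SKOS library.

```python
# Hypothetical in-memory vocabulary: concept ID -> preferred labels by language.
# The IDs are only lookup keys; one is mnemonic, one is fully opaque.
VOCABULARY = {
    "spiralgalaxy": {"prefLabel": {"en": "spiral galaxy", "fr": "galaxie spirale"}},
    "c0042": {"prefLabel": {"en": "conscience"}},
}


def display_label(concept_id, lang="en"):
    """Return the preferred label an application would show for a concept."""
    concept = VOCABULARY.get(concept_id)
    if concept is None:
        raise KeyError(f"unknown concept ID: {concept_id}")
    labels = concept["prefLabel"]
    if lang not in labels:
        raise ValueError(f"no preferred label in language {lang!r} for {concept_id}")
    return labels[lang]


if __name__ == "__main__":
    # The user sees the label, never the ID, so the ID's spelling
    # (underscores, capitals, or neither) does not affect what is displayed.
    for cid in VOCABULARY:
        print(display_label(cid))
```

Under this assumption the ID can be mnemonic or opaque without changing anything the user sees.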

Brian Thomas wrote:
> On Thursday 01 November 2007 5:50:14 pm Matthew Graham wrote:
>
>>> Having it consist of only lowercase alpha means (a) we're guaranteed
>>> to avoid any parsing troubles, with RDF parsers or with anything else;
>>> (b) it's clear to anyone looking at this that they're not supposed to
>>> be displaying the concept name, but using the concept's 'Label' and
>>> declared relationships instead; while (c) it retains some mnemonic value.
>>>
>>>
>>> There is a case which can be made for having fully opaque concept
>>> names (this is what's done in the Gene Ontology, for example): it's
>>> point (b) above, plus it removes any temptation to argue about
>>> relationships based on the name alone. Despite that, I think there's
>>> value in making it at least partly human-recognisable.
>>>
>
> So..it would seem that the strongest reason is a). Point b) is really a stylistic
> one (but it seems to require that we always provide rdf:label when this
> style of naming concepts is in effect).
>

Again, this is not quite correct. It does not require every concept to have an rdf:label, but it does require every concept to have a preferred label. In any case, as I said above, a concept without a preferred label does not make sense.

Alasdair
> Ed's point that underscores can help to disambiguate names which need
> 'spaces' also seems to have some value. I note that you really can't
> go part way, either something is human-recognizable, or it isn't. If
> con_science is not to be confused with conscience then I suppose
> the author of the node can invent some convention to suit their needs, but
> it's likely to take up more characters (like 'conunderscorescience').
>
> Is there really a parser out there which can't handle an underscore? Or is
> it that people are worried that if we allow this one non-alphanumeric
> character more will follow?
>
> So...unless I hear dire predictions of "slippery slope, slippery slope" or
> how any RDF parser can't handle underscores in names I'm mildly in
> favor of the underscore.
>
> =brian
>
>
Received on 2007-11-02Z10:40:47