Voting on format of tokens (was Re: IVOA Thesaurus)

From: Douglas Burke <dburke-at-cfa.harvard.edu>
Date: Thu, 01 Nov 2007 15:26:34 -0400

I vote for there being some form of normalization/canonicalization/some-other-ization of the human-readable terms. The important ones for me are making everything lower case (as I've found too many errors in my own work from case mismatches [*]) and removing problematic characters (or combinations of characters). I don't have a real opinion on whether spaces should be removed or replaced by "_".

[* Are there issues with this particular choice in Ed's ontology use-case below?]
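
As a rough illustration of the kind of normalization meant above, here is a minimal Python sketch. The function name normalize_term and its space_replacement parameter are hypothetical, and treating hyphens like spaces and restricting tokens to ASCII letters, digits and "_" are assumptions for the example, not anything agreed on the list.

```python
import re
import unicodedata


def normalize_term(term, space_replacement=""):
    """Turn a human-readable term into a simple token (illustrative only).

    space_replacement is "" to drop word separators or "_" to keep them
    visible; both options are under discussion in this thread.
    """
    # Split accented characters into base letter + combining mark,
    # then drop the combining marks (removes diacritics).
    decomposed = unicodedata.normalize("NFKD", term)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))

    # Lower-case everything to avoid case-mismatch errors.
    lowered = stripped.lower().strip()

    # Assumption: runs of whitespace, hyphens and underscores all count
    # as one word separator.
    joined = re.sub(r"[\s_\-]+", space_replacement, lowered)

    # Assumption: anything outside a-z, 0-9 and "_" is "problematic".
    return re.sub(r"[^a-z0-9_]", "", joined)


if __name__ == "__main__":
    for label in ["Active Galactic Nuclei", "Ångström unit", "pre_science"]:
        print(label, "->", normalize_term(label), "/", normalize_term(label, "_"))
```

With separators removed, "pre_science" and "prescience" both come out as the same token, which is the sort of collision Ed's message below hints at; passing "_" as space_replacement keeps them distinct.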

Doug

Frederic V. Hessman wrote:
> At the time, there were lots of voices saying that, while you are
> perfectly correct (and I'd prefer to have them as humanly readable as
> possible), the realities of computer-based parsing mean that a trivial
> token format costs less pain.
>
> How about an official show of hands?
>
> Rick
>
> On 1 Nov 2007, at 5:32 pm, Ed Shaya wrote:
>

>> Rick,
>>
>>    Well, I vote to put back the underscores and the capitalization 
>> where appropriate.  There is no need to go out of one's way and make 
>> all IDs cryptic just to make a point about the concept of tokens.  In 
>> ontology these become the element names of instances and it is really 
>> handy to be able to readily discern what kind of instance it is by 
>> looking, rather than going to some lookup table.  We need some 
>> prescience here, not to be confused with pre_science.
>>
>> Ed
>>
>> Frederic V. Hessman wrote:
>>>
>>> On 31 Oct 2007, at 6:54 pm, Ed Shaya wrote:
>>>
>>>> What happened to the underscores between all of the compound words?
>>>> Ed
>>>
>>> A while back, we communally decided that the tokens should be as 
>>> compact and simple as possible, i.e. no caps, no diacritical marking, 
>>> no spaces, no underscores, not only to make them syntactically simple 
>>> but to emphasize that they are only tokens.  The text file still has 
>>> the underscores, but now only for historical reasons (i.e. the 
>>> original SV proposal).
>>>
>>> If everyone would rather see the underscores back again, no problem.
>>>
>>> Rick
>>>
>>

>
>
Received on 2007-11-01Z20:27:05