Hi Norman, all,
On Monday 04 February 2008 4:48:58 pm Norman Gray wrote:
>
> Brian and all, hello.
>
> On 2008 Feb 4, at 18:01, Brian Thomas wrote:
>
> > On Monday 04 February 2008 10:55:04 am Frederic Hessman wrote:
> >> Starting to think beyond the IVOA draft proposal:
> >>
> >> [snip]
> >
> > Yes. there are too many 'compound' terms. These may generally be
> > identified
> > by the multiple words which comprise the token.
>
> I smell mission creep!
Sure, but the man asked, AND he indicated that he was thinking beyond the draft proposal, so all of my suggestions were indicative of post-draft work.
>
> Remember that we're defining _vocabularies_ here. One of the main
> distinctions between vocabularies and ontologies is that the former
> service a different goal from the latter. That goal is searching, or
> something very like it; vocabularies are much closer to humans -- to
> UIs -- than ontologies are, and in consequence they are inevitably
> messier.
I'm aware they are different; however, I don't like messy things, and it's not clear to me that a vocabulary is purely for human-machine interaction, as seems to be implied above. Perhaps it is, but I still don't understand why that makes the vocabulary necessarily messy. Having a controlled, clean set of unique tokens seems to me a very good thing. Do we really have to fit every possible meaning, and every way of expressing that meaning, into the vocabulary? Some terms seem to be of very limited utility; I point to the earlier example of having "volcano" included as a token/term.

As for the messiness, well, it's a natural impulse for me to want to see the plural terms (floating in a sea of singular tokens), the repetition of meaning between more than one token, and the degeneracy in meaning of a token removed or cleared up. As for compound terms, I didn't say we should have none of those, but perhaps the IVOAT goes a bit overboard in this area. Witness the already large size of the thing. Cutting some of these out will surely remove "bloat".

I realize that there is impetus to 'cut a draft'. I agree with this. But if you ask what might be done to 'clean up' the draft *after* creating an initial release, then I answer.
> [snip]
>
> Vocabularies have to comprise the terms that users actually use,
> minimally tidied up. The result may be messy and hard to reason with,
> but that's OK, because the world is messy, and we don't want to reason
> with vocabularies.
I think we are back at the issue of "who" is the "user". I think the user is drawn from a more technical group: data archivists, application writers and the like. It is not scientists or students, who I think are your audience. My audience, I believe, is much more comfortable with atomic terms, unique meanings for tokens, and simple grammars with which to express compound terms. Why would I label data in my archive with messy terms? Why would I search for data when the search will return a combination of dissimilar or unrelated objects? Why would I ship out a table to an application where the meaning of the columns is not unique? Having messiness like this, well, it's nuts, and it makes the vocabulary fairly useless for practical applications.
>
> And I think the result _should_ look much like the IAU original. My
> impression of what was being aimed at in the IVOAT was a tidied up and
> updated IAU93. Let's keep it simple and quick.
Yes, well, we are beyond simple and quick now. To my mind that would have encompassed no more than technical editing (just enough to get the IVOAT into SKOS). But we have added terms and have (at last count) 4 vocabularies in total (are all of those going into the draft??). So it's a matter of opinion whether the process has been sufficiently limited.
> [snip]
So... you are in favor of including something beyond repeating the token name under skos:description?
>
> > which
> > identify things which should be 'cleaned up' (probably by splitting
> > the offending token
> > into 2 or more other tokens each with separate meaning).
>
> I'm all for cleaning things up, but I think we need to be vigilant
> against this tidyup turning into a full-blown ontological exercise
> which, as the DM group can tell us, could end up taking five years
> before anyone notices it's late. How about an effective definition of
> 'cleaned up enough for the IVOAT initial release' being 'whatever can
> get done in the next month'?
Sure. My bar for what is 'good enough' for an initial draft is actually pretty low. The present draft is probably sufficient for community comment. Why do much more work?
Regards,
=brian
Received on 2008-02-05Z00:01:09