Re: VOSpace URIs

From: Guy Rixon <gtr-at-ast.cam.ac.uk>
Date: Wed, 3 May 2006 10:16:22 +0100 (BST)


Paul,

some thoughts on your proposal, and a counter proposal that preserves the use of the ivo: scheme without messing with the fragment or query parts of the URI. This is a looong email; sorry, but your ideas are deep and I can't make the counter argument shorter. :)

Firstly, I'd like to note that when I originally proposed a vos: scheme (http://wiki.astrogrid.org/bin/view/AG2/IvoNamesAndLocators) it had different semantics: _any_ VOSpace node could resolve any VOSpace identifier. Therefore, the vos: URI both named the data item and said how to resolve it; and that resolution was independent of the registry, presuming a client knew of one, "local" VOSpace service.

Your proposal isn't independent of the registry. The _client_ has to resolve a vos: URI to an ivo: URI and then find the right VOSpace service in the registry. "Standard URI-parsing software" won't help with this.
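
As a minimal sketch of that client-side step (Python, standard library only), the following shows how far a generic parser gets with one of the proposed vos: URIs. The function name `vos_to_ivo` is made up for illustration, and the mapping of "!" back to "/" is an assumption based on the examples in the proposal. The registry lookup that would have to follow is not shown.

```python
from urllib.parse import urlsplit


def vos_to_ivo(vos_uri):
    """Split a proposed vos: URI into (IVOID of the space, path in the space)."""
    parts = urlsplit(vos_uri)
    if parts.scheme != "vos":
        raise ValueError("not a vos: URI: " + vos_uri)
    # A generic parser stops here: the authority is an opaque string to it.
    # Undoing the "!" encoding is VOSpace-specific knowledge in the client.
    authority = parts.netloc.lstrip("!")  # optional leading "!" from the proposal
    ivoid = "ivo://" + authority.replace("!", "/")
    return ivoid, parts.path


ivoid, path = vos_to_ivo("vos://org.astrogrid!vospace/my/pathto/mydata")
print(ivoid)  # ivo://org.astrogrid/vospace
print(path)   # /my/pathto/mydata
```

The client still has to look up the resulting ivo: URI in a registry to find the service endpoint; nothing in the generic parsing does that for it.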

Secondly, I'm not sure that your vos: really has "locator-style semantics". A vos: URI doesn't point directly to anything concrete that can be downloaded in the manner of a web page, and the VOSpace semantics are quite different to those of HTTP, FTP etc.

Thirdly, having re-read RFC3986, I don't see anything illegal in an IVOID's use of the fragment to represent the path into VOSpace. From the RFC:

"3.5. Fragment

   The fragment identifier component of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional identifying information. The identified secondary resource may be some portion or subset of the primary resource, some view on representations of the primary resource, or some other resource defined or described by those representations. A fragment identifier component is indicated by the presence of a number sign ("#") character and terminated by the end of the URI.

      fragment = *( pchar / "/" / "?" )

   The semantics of a fragment identifier are defined by the set of representations that might result from a retrieval action on the primary resource. The fragment's format and resolution is therefore dependent on the media type [RFC2046] of a potentially retrieved representation, even though such a retrieval is only performed if the URI is dereferenced. If no such representation exists, then the semantics of the fragment are considered unknown and are effectively unconstrained. Fragment identifier semantics are independent of the URI scheme and thus cannot be redefined by scheme specifications.

   Individual media types may define their own restrictions on or structures within the fragment identifier syntax for specifying different types of subsets, views, or external references that are identifiable as secondary resources by that media type. If the primary resource has multiple representations, as is often the case for resources whose representation is selected based on attributes of the retrieval request (a.k.a., content negotiation), then whatever is identified by the fragment should be consistent across all of those representations. Each representation should either define the fragment so that it corresponds to the same secondary resource, regardless of how it is represented, or should leave the fragment undefined (i.e., not found)."

The first paragraph sounds very much like what we are about. The last suggests that there is an implied media-type for an entire VOSpace that defines the semantics of a space and the syntax of an IVOID fragment pointing into a space...which is _exactly_ how I see it working.
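
To illustrate, here is a small sketch using Python's standard `urllib.parse` on the fragment-style example from Paul's message. It only shows how a generic RFC 3986 style parser separates the primary resource (the space) from the fragment (the path into the space). It says nothing about how a VOSpace service would interpret that fragment.

```python
from urllib.parse import urlsplit

# Fragment-style IVOID taken from the early-draft example.
parts = urlsplit("ivo://org.astrogrid/vospace#my/pathto/mydata")

# Primary resource: the registered VOSpace itself.
primary = f"{parts.scheme}://{parts.netloc}{parts.path}"
print(primary)         # ivo://org.astrogrid/vospace

# Secondary resource: the path into the space, carried by the fragment.
print(parts.fragment)  # my/pathto/mydata
```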

Fourthly, I _do_ use the resource keys in IVOIDs to express hierarchy. The resources in the uk.ac.cam.ast namespace are organized that way.

Fifthly, I can see that we might sometimes want to add a fragment or a query to the URI for an item in VOSpace. We can't do this if we've used up those parts of the URI syntax to point to the thing itself. This strikes me as the strongest argument for dropping the current use of ivo: in VOSpace.
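
A short sketch of the problem, again with Python's `urllib.parse`. The trailing `?column=ra#row1` is a hypothetical query and fragment invented for this example. Once the "#" has been spent on the path into the space, a generic parser treats everything after it as one fragment, so the extra parts are not visible as a query or a fragment in their own right. (Strictly, a second "#" is not even allowed inside a fragment by the RFC 3986 grammar; Python's parser is simply lenient about it.)

```python
from urllib.parse import urlsplit

# Hypothetical: fragment-style IVOID with a further query and fragment appended.
uri = "ivo://org.astrogrid/vospace#my/pathto/mydata?column=ra#row1"
parts = urlsplit(uri)

print(repr(parts.query))  # ''
print(parts.fragment)     # my/pathto/mydata?column=ra#row1
```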

Summary: I don't feel the philosophical or legalistic need to introduce vos:, but I agree that the proposed use of ivo: won't do. Therefore, I propose a different use of ivo:.

 ivo://org.astrogrid/path/to/vospace!/path/to/data/node?etc#etc

In this, the resource key of the data-node in VOSpace includes the resource key of the space itself and there is a stop character ! marking the end of the latter. The ! is part of the space's resource-key; it's not a separator, it doesn't apply to IVOIDs in general and doesn't have to be written into the IVOID spec. The use of ! and the special semantics of the resource key would be part of the VOSpace specification.

Sun did something similar in defining jar: URLs for use with .jar files, where "!/" separates the URL of the archive from the path of an entry inside it.

We could put the ! on the front of the data-node path instead of the back of the space's path. But I suspect it's easier to force all VOSpace registrants to use ! correctly than to exclude it from all node names in VOSpace generated by users.
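
A minimal sketch of how a client might take such an identifier apart, in Python with the standard library only. The function name `split_vospace_ivoid` and the `?column=ra#row1` suffix are invented for illustration; nothing here is taken from a VOSpace specification. Splitting at the first "!" reflects the choice above: the space's key ends in exactly one "!", so node names further along remain free to contain the character.

```python
from urllib.parse import urlsplit


def split_vospace_ivoid(uri):
    """Return (IVOID of the space, node path, query, fragment)."""
    parts = urlsplit(uri)
    if parts.scheme != "ivo":
        raise ValueError("not an ivo: URI: " + uri)
    # The first "!" in the resource key is the stop character.
    space_key, stop, node_path = parts.path.partition("!")
    if not stop:
        raise ValueError("no '!' stop character in resource key: " + uri)
    # The "!" belongs to the space's own resource key, so keep it there.
    space_ivoid = f"ivo://{parts.netloc}{space_key}!"
    return space_ivoid, node_path, parts.query, parts.fragment


print(split_vospace_ivoid(
    "ivo://org.astrogrid/path/to/vospace!/path/to/data/node?column=ra#row1"
))
# ('ivo://org.astrogrid/path/to/vospace!', '/path/to/data/node', 'column=ra', 'row1')
```

The query and fragment come out as ordinary URI components, which is the point of keeping them out of the addressing.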

Cheers,
Guy

On Tue, 2 May 2006, Paul Harrison wrote:

> Early drafts of the VOSpace (or more accurately VOStore) standard
> proposed defining locators for the space by dereferencing using the
> "fragment" (#) extension of the IVOA standard identifier scheme ivo:
> so for example a reference
>
> ivo://org.astrogrid/vospace#my/pathto/mydata
>
> would point to a particular data holding called mydata in a container
> called "my/pathto/" on a vospace server the location of which could
> be found by looking in the registry entry with identifier
> "ivo://org.astrogrid/vospace"
>
> For the particular case of VOSpace I would like to propose that
> this practice is not followed. I believe that, because VOSpace is
> a complete namespace of its own with locator-style semantics, it
> deserves its own URI scheme. This has the following advantages:
>
> 1. Client software and humans could immediately recognise that they
> had been given a VOSpace URI by trivial inspection of the namespace
> prefix, as opposed to having to do at least one dereferencing
> operation in the registry to check if the URI does indeed refer to
> VOSpace. The example given above could refer to a VOEvent for
> instance if someone had been perverse enough to register a voevent
> server with an identifier that looked more like a vospace. One of the
> aims of VOSpace is to make it easy to share data, and so it should
> also be easy to recognise and manipulate the locators for the data.
>
> 2. A VOSpace-specific scheme could more closely follow the general
> URI semantics and syntax as laid down in
> http://www.ietf.org/rfc/rfc3986.txt so that VOSpace URIs could be
> manipulated by generic URI parsing software - i.e.
> a) Use the general form that indicates a "/" to be interpreted as
> part of a locator hierarchy - it is unfortunate that the ivo: scheme
> allows the use of "/" in identifiers when there is no implied
> hierarchy in that scheme.
> b) Use the "fragment" and "query" parts of the URI directly in a
> VOSpace scheme rather than these syntactic constructs having to be
> used simply to delimit what belongs to the ivo: scheme and what is
> part of the VOSpace reference. It was an early design aim of VOSpace
> that it should be possible to reference parts of the data objects,
> e.g. columns in a VOTable file, and it would be beneficial if the URI
> scheme could support this simply and directly.
>
> With these advantages it remains only to define the VOSpace URL
> scheme following the advice in http://www.ietf.org/rfc/rfc2718.txt ;
> The general idea would be to have a scheme vos: that had similar
> semantics to the http: scheme with general structure
>
> vos://authority/path?query#fragment
>
> The authority part is intended to be the ivoa identifier for a
> VOSpace server so that the registry would play the same name lookup
> service role that DNS typically does with http URLs. The main problem
> here is the fact that the IVOA identifiers already use "/" as part of
> the identifier, which makes the direct use of the unaltered IVOA
> identifier impossible, as the "/" that was part of the IVOA
> identifier would be interpreted as the start of the path part of the
> vos: scheme. In this case the "/" needs to be replaced by another
> reserved character, preferably one without another common meaning and
> in the "discouraged" set of characters in the IVOA identifiers
> standard ("!" or "$" would be candidates), so that the original
> example above could be represented as
>
> vos://org.astrogrid!vospace/my/pathto/mydata
>
> it might even be a good idea to prefix the authority part with a "!"
> also to make it clear that the authority should not be interpreted as
> an ordinary IP DNS host name representation.
>
> vos://!org.astrogrid!vospace/my/pathto/mydata
>
>
> Paul Harrison
>
>
>
>
>

Guy Rixon 				        gtr-at-ast.cam.ac.uk
Institute of Astronomy   	                Tel: +44-1223-337542
Madingley Road, Cambridge, UK, CB3 0HA		Fax: +44-1223-337523
Received on 2006-05-03Z11:17:12