Re: VOSpace URIs

From: Paul Harrison <pharriso-at-eso.org>
Date: Thu, 4 May 2006 11:15:40 +0200

On 03.05.2006, at 11:16, Guy Rixon wrote:
>
> Firstly, I'd like to note that when I originally proposed a vos:
> scheme
> (http://wiki.astrogrid.org/bin/view/AG2/IvoNamesAndLocators)

As I remember, that page was prompted by an Astrogrid discussion forum argument (unfortunately the forum server no longer exists, so I cannot reference the discussion) that was raging at the time because various layers of the AG software failed to agree on how to resolve the various interpretations that needed to be placed on ivo: identifier resolution. This was obviously settled in the software, but it did illustrate the point that schemes involving multiple dereferencing just to discover what type of entity is being referenced led to complex software, and that different software engineers had different "natural" interpretations.

>
> Your proposal isn't independent of the registry. The _client_ has
> to resolve a
> vos: URI to an ivo: URI and then find the right VOSpace service in the
> registry. "Standard URI-parsing software" won't help with this.

It would in the specific case of links and relative references - see the reply to Roy.

>
> Secondly, I'm not sure that your vos: really has "locator-style
> semantics". A
> vos: URI doesn't point directly to anything concrete that can be
> downloaded in
> the manner of a web page, and the VOSpace semantics are quite
> different to
> those of HTTP, FTP etc.

I think that it does have locator-style semantics - again, see the reply to Roy - with the registry being analogous to DNS for http:

>
> Thirdly, having re-read RFC3986, I don't see anything illegal in an
> IVOID's
> use of the query fragment to represent the path into VOSpace. From
> the RFC:
>
> "3.5. Fragment
>
> The fragment identifier component of a URI allows indirect
> identification of a secondary resource by reference to a primary
> resource and additional identifying information. The identified
> secondary resource may be some portion or subset of the primary
> resource, some view on representations of the primary resource, or
> some other resource defined or described by those
> representations. A
> fragment identifier component is indicated by the presence of a
> number sign ("#") character and terminated by the end of the URI.
>
> fragment = *( pchar / "/" / "?" )
>
> The semantics of a fragment identifier are defined by the set of
> representations that might result from a retrieval action on the
> primary resource. The fragment's format and resolution is
> therefore
> dependent on the media type [RFC2046] of a potentially retrieved
> representation, even though such a retrieval is only performed
> if the
> URI is dereferenced. If no such representation exists, then the
> semantics of the fragment are considered unknown and are
> effectively
> unconstrained. Fragment identifier semantics are independent of
> the
> URI scheme and thus cannot be redefined by scheme specifications.
>
> Individual media types may define their own restrictions on or
> structures within the fragment identifier syntax for specifying
> different types of subsets, views, or external references that are
> identifiable as secondary resources by that media type. If the
> primary resource has multiple representations, as is often the case
> for resources whose representation is selected based on
> attributes of
> the retrieval request (a.k.a., content negotiation), then
> whatever is
> identified by the fragment should be consistent across all of those
> representations. Each representation should either define the
> fragment so that it corresponds to the same secondary resource,
> regardless of how it is represented, or should leave the fragment
> undefined (i.e., not found)."
>
> The first paragraph sounds very much like what we are about. The
> last suggests
> that there is an implied media-type for an entire VOSpace that
> defines the
> semantics of a space and the syntax of an IVOID fragment pointing into
> a space...which is _exactly_ how I see it working.

Yes - I agree that this updated URI spec does bless indirect referencing for the fragment part, where the earlier RFC2396 had said that it should be explicitly a fragment of the document referred to by the path part of the URL. So the use of the ivo: scheme fragment for the vospace reference is "legal", but not simple, and that is my primary argument for wanting to introduce a vos: scheme.

>
> Fourthly, I _do_ use the resource keys in IVOIDs to express
> hierarchy. The
> resources in the uk.ac.cam.ast namespace are organized that way.

Well, according to the IVOA ID Spec you are not allowed to attach any meaning that can be interpreted by others:

"Any meaning that is suggested by the resource key is intended only for human consumption. The character content of a resource key is not semantically machine-interpretable. "

Additionally, the ivo: scheme is not hierarchical in the sense of implying any containment relationship, so that for the URI

ivo://org.ivoa/operational/siap

the operation "contents of" ivo://org.ivoa/operational has no meaning, whereas the equivalent in the vos: scheme would return the list of child URLs.

Section 2.1.2 of http://www.ietf.org/rfc/rfc2718.txt warns against improper use of // if the path part does not contain a conformant hierarchical structure, and in my opinion this is the biggest fault in the IVOA ID specification.

>
> Fifthly, I can see that we might sometimes want to add a fragment
> or a query
> to the URI for an item in VOSpace. We can't do this if we've used
> up those
> parts of the URI syntax to point the thing itself. This strikes me
> as the
> strongest argument for dropping the current use of ivo: in VOSpace.

I would probably rate this as top amongst the "practical" arguments...

>
> Summary: I don't feel the philosophical or legalistic needs to
> introduce vos:,
> but I agree that the proposed use of ivo: won't do. Therefore, I
> propose a
> different use of ivo:.
>
> ivo://org.astrogrid/path/to/vospace!/path/to/data/node#etc?etc
>
> In this, the resource key of the data-node in VOSpace includes the
> resource
> key of the space itself and there is a stop character ! marking the
> end of the
> latter. The ! is part of the space's resource-key; it's not a
> separator, it
> doesn't apply to IVOIDs in general and doesn't have to be written
> into the
> IVOID spec. The use of ! and the special semantics of the resource
> key would
> be part of the VOSpace specification.

I cannot see how this could be done without changing the IVOID specification - current practice is that everything up to the # or ! is part of the resource key, so clients would try to look up ivo://org.astrogrid/path/to/vospace!/path/to/data/node in the registry and would fail.

We could add this to the specification as being a new sort of marker that indicated an indirection should take place and that the rest of the URI be passed on to the service pointed to by the resource key. This would be a useful general facility and could potentially be used for other services where the primary part of the ivo: uri is used to look up an actual service end point in the registry.
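
As a rough illustration of that indirection, a client could split the identifier at the first ! and treat the two halves differently. This is only a sketch under my own assumptions: the function name split_at_marker is made up, the registry lookup itself is not shown, and I have assumed here that the marker is dropped from the part sent to the registry.

```python
def split_at_marker(uri, marker="!"):
    """Split an identifier at the first marker character.

    Returns (registry_part, remainder). The registry_part would be
    looked up in the registry; the remainder would be handed on to the
    service that the registry entry points to. Both the function and
    the decision to drop the marker are assumptions for illustration.
    """
    head, sep, tail = uri.partition(marker)
    if not sep:
        # No marker present: the whole string is an ordinary identifier.
        return uri, ""
    return head, tail


registry_part, remainder = split_at_marker(
    "ivo://org.astrogrid/path/to/vospace!/path/to/data/node"
)
print(registry_part)  # part to look up in the registry
print(remainder)      # part to pass on to the resolved service
```

With a rule like this the client never needs to know what kind of service sits behind the registry entry; it simply forwards whatever follows the marker.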

Whilst changing the IVOA ID specification, I would also take the opportunity to do the following to make it follow URI conventions better:

  1. remove the use of "/" in the authority and resource key sections of the IVOID so that it does not look like a hierarchical URL, but behaves just as an identifier, which was the original intention - to conform with 2.1.2 of http://www.ietf.org/rfc/rfc2718.txt - the authority and resource part could be separated with another reserved character such as $.
  2. ban the use of "." in the authority part of the IVOID so that it does not look syntactically like an internet address - section 3.2.2. of http://www.ietf.org/rfc/rfc3986.txt states that if the authority part looks like an internet address then that should be the primary interpretation.

So the example might look more like:

ivo:astrogrid$path$to$vospace!/apath/to/data/node?query#fragment

N.B. the convention is that ?query comes before #fragment as ?query is actioned by the service resolving the uri and #fragment by the client.
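
The ordering matters to generic parsers too. As a small check, here is how Python's standard urllib.parse.urlsplit treats the two orderings; the choice of Python is only for illustration, and any RFC 3986 style parser should behave similarly, since everything after the first # belongs to the fragment.

```python
from urllib.parse import urlsplit

# Fragment written before the query, as in the quoted proposal:
# the "?etc" ends up inside the fragment and no query is seen.
fragment_first = urlsplit(
    "ivo://org.astrogrid/path/to/vospace!/path/to/data/node#etc?etc"
)

# Query written before the fragment, as in the reworked example:
# both components are recognised separately.
query_first = urlsplit(
    "ivo:astrogrid$path$to$vospace!/apath/to/data/node?query#fragment"
)

for parts in (fragment_first, query_first):
    print(repr(parts.query), repr(parts.fragment))
```

In the first form a standard parser reports an empty query, so a service would never see the intended query string.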

Having said all this I still think that vospace deserves its own uri scheme because the above still does not conform to the standard URL convention of

          foo://example.com:8042/over/there?name=ferret#nose
          \_/   \______________/\_________/ \_________/ \__/
           |           |            |            |        |
        scheme     authority       path        query   fragment

>
> Sun did similar in defining URIs for use with .jar files.
>
> We could put the ! on the front of the data-node path instead of
> the back of
> the space's path. But I suspect it's easier to force all VOSpace
> registrants
> to use ! correctly than to exclude it from all node names in
> VOSpace generated
> by users.
>
> Cheers,
> Guy
>
Received on 2006-05-04Z11:16:06