I agree with Tom up to a point.
We surely need to define what services the Registry should provide - and recognise that we don't need to make an exhaustive list now because we should be able to extend it.
Every registry needs to sign up to this set of services - or at least be able to specify which subset it adheres to. If, for any of its advertised services, it does not know the answer, then it should "know an archive/registry who does" and delegate the provision of the answer to it.
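As a rough illustration of that delegation rule, here is a minimal Python sketch. All the names in it (Registry, handlers, delegate, the "coverage" service) are made up for the example and do not come from any agreed interface; a handler returning None stands for "I advertise this service but do not know the answer".

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Registry:
    """Hypothetical registry: answers what it can, delegates what it cannot."""
    name: str
    # Advertised services: service name -> handler returning an answer or None.
    handlers: Dict[str, Callable[[str], Optional[str]]] = field(default_factory=dict)
    # The archive/registry "who knows" when this one does not (assumed single).
    delegate: Optional["Registry"] = None

    def advertised(self) -> List[str]:
        """The subset of services this registry adheres to."""
        return sorted(self.handlers)

    def query(self, service: str, question: str) -> str:
        if service not in self.handlers:
            raise LookupError(f"{self.name} does not advertise {service!r}")
        answer = self.handlers[service](question)
        if answer is not None:
            return answer
        if self.delegate is None:
            raise LookupError(f"{self.name} cannot answer {question!r}")
        # Advertised but unknown here: pass the question on.
        return self.delegate.query(service, question)


archive = Registry("archive", {"coverage": lambda q: f"coverage of {q} from archive"})
local = Registry("local", {"coverage": lambda q: None}, delegate=archive)
print(local.advertised())
print(local.query("coverage", "M31"))
```

The caller only ever talks to the local registry; whether the answer came from it or from the delegate is invisible, which is the point of the rule.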
Any particular registry could adopt a caching policy with greater or lesser intelligence. For example, if it knows that a dataset is frozen, i.e. is not being extended because no new observations are being made, then it can safely cache information about it for a long time. We can no doubt come up with a list of archive metadata which helps with this sort of optimisation. Of course some information may not be cacheable, e.g. the compute power or network bandwidth available right now, which will vary from moment to moment - here the answer could be delegated to an appropriate LDAP server.
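To make the caching idea concrete, a small Python sketch follows. The metadata flags (frozen, volatile) and the lifetimes are assumptions chosen for illustration, not proposed values, and the fetch callable simply stands in for whatever actually supplies the answer (the archive itself, or an LDAP server for the volatile case).

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class ArchiveMeta:
    """Hypothetical archive metadata used only to pick a cache lifetime."""
    frozen: bool = False    # no new observations are being made
    volatile: bool = False  # e.g. compute power or bandwidth available right now


def cache_lifetime(meta: ArchiveMeta) -> float:
    """Return a cache lifetime in seconds (illustrative numbers only)."""
    if meta.volatile:
        return 0.0                 # never cache; always ask the live source
    if meta.frozen:
        return 30 * 24 * 3600.0    # frozen dataset: keep for a long time
    return 6 * 3600.0              # default: a few hours


class CachingRegistry:
    def __init__(self, fetch: Callable[[str], Any],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._clock = clock
        self._cache: Dict[str, Tuple[float, Any]] = {}  # key -> (expiry, value)

    def get(self, key: str, meta: ArchiveMeta) -> Any:
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]          # still fresh
        value = self._fetch(key)   # ask the archive (or its delegate)
        ttl = cache_lifetime(meta)
        if ttl > 0:
            self._cache[key] = (now + ttl, value)
        return value
```

A cleverer registry would replace cache_lifetime with something driven by a richer set of archive metadata; the rest of the mechanism need not change.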
Where I differ from Tom is on the ease of implementation. He is almost certainly right that an implementation _could_ be done very easily. However, in order to obtain acceptable performance I would guess that a large number of optimisations would be needed, and this brings us to Clive's view that any project that wants a really popular registry would need to work hard - here an analogy with search engines and Google is useful. Which project will end up providing the Google of the IVOA?
The implication is that we should define the initial set of services we need and the information (data and metadata) needed from archives, and perhaps, as a separate activity, define archive metadata which will help in optimisations.
Cheers
...David
-----Original Message-----
From: Tom McGlynn [mailto:Thomas.A.McGlynn-at-nasa.gov]
Sent: 06 February 2003 14:32
To: Clive Page
Cc: Arnold Rots; registry-at-ivoa.net; metadata-at-us-vo.org
Subject: Re: Scope of registry
One thing that has come to me in thinking about this issue is that there is potentially a difference between the granularity of a registry and that of a registry service.
Consider a registry as being a table holding only high-level (low-granularity) information about services. The services themselves provide some protocol that gives fine-grained information. To give a concrete example, the registry might contain a reference to the Chandra archive, the NTT archive, and so forth. Part of the information the registry has about the Chandra archive is its coverage service, which a user can invoke to get fine-grained information about the positions of Chandra observations. In some sense we might think of this as a registry hierarchy: an observation catalog is a 'registry' of the observations described.
However, there is no reason why a registry service that a user (or other software) might invoke couldn't take advantage of both the registry and the coverage services. I could see this working something like DNS services on the Web. When a domain name server is queried about some name, it goes and queries a chain of services until it resolves the name. When it's queried a second time for the same name, it uses a local cache. Users tend to communicate with only a subset of internet nodes, so the relatively small local cache gives a local user almost the same benefit as if it had the full listing of all X billion web addresses.
Similarly, when a registry service is queried about observations in a given region, the first time it looks at its coarse information to determine possible services and, based upon that and other user criteria, it goes off and gets fine-grained information from the appropriate services. Since this information doesn't change rapidly, and people tend to be interested in the same regions of the sky, the registry service caches the fine-grained information for a period of time of the order of hours or days, perhaps longer for unchanging data sets.
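The two-step lookup described above might be sketched in Python roughly as follows. Everything here is an assumption made for the sake of the example: the coarse information is reduced to a declination band per archive, a query is a cone (RA, Dec, radius in degrees), the coverage service is just a callable returning observation identifiers, and the cache is keyed on the exact cone rather than on overlapping regions.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

Cone = Tuple[float, float, float]  # (ra_deg, dec_deg, radius_deg)


@dataclass
class RegistryEntry:
    """Coarse (low-granularity) record; all fields are hypothetical."""
    name: str
    dec_min: float
    dec_max: float
    coverage_service: Callable[[Cone], List[str]]  # fine-grained lookup


class RegistryService:
    def __init__(self, entries: Iterable[RegistryEntry],
                 ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.entries = list(entries)
        self.ttl = ttl_seconds
        self.clock = clock
        self._cache: Dict[Tuple[str, Cone], Tuple[float, List[str]]] = {}

    def candidates(self, cone: Cone) -> List[RegistryEntry]:
        """Step 1: use only the coarse table to pick possible services."""
        _, dec, radius = cone
        return [e for e in self.entries
                if e.dec_min - radius <= dec <= e.dec_max + radius]

    def observations(self, cone: Cone) -> Dict[str, List[str]]:
        """Step 2: fetch fine-grained results, reusing fresh cached ones."""
        now = self.clock()
        result: Dict[str, List[str]] = {}
        for entry in self.candidates(cone):
            key = (entry.name, cone)
            hit = self._cache.get(key)
            if hit is None or hit[0] <= now:
                hit = (now + self.ttl, entry.coverage_service(cone))
                self._cache[key] = hit
            result[entry.name] = hit[1]
        return result
```

The registry table itself never holds the fine-grained data; only the service's cache does, and only for as long as the lifetime allows, so there is nothing to keep synchronized.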
We don't get static coupling of the various services with the registry, which we'd have if the registry itself contained the fine grained information, but the user is likely to get most of the speed advantage of having the data all in one place. I'd envisage the particular case of registries and coverage services as being a specialization of some more generic support for registry hierarchies.
If we can agree upon a standard protocol by which an archive gives the detailed information, then I think this approach will be easier for all sides: the data providers, who provide only overview information to the registry; the registry builders, who don't need to worry about synchronization of data; and the users, who get the latest information soonest. I don't even think it will be very hard to implement in the registry services.
Tom McGlynn
Clive Page wrote:
> On Wed, 5 Feb 2003, Arnold Rots wrote:
>
>
>>So, it becomes a matter of degree.