RE: Scope of registry

From: Giaretta, DL (David) <D.L.Giaretta-at-rl.ac.uk>
Date: Thu, 6 Feb 2003 15:05:51 -0000


I agree with Tom up to a point.

We surely need to define what services the Registry should provide - and recognise that we don't need to make an exhaustive list now because we should be able to extend it.

Every registry needs to sign up to this set of services - or at least be able to
specify which sub-set it adheres to. If, for any of its advertised services, it does not
know the answer, then it should "know an archive/registry who does" and delegate the provision
of the answer to it.
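
As a rough illustration of the delegation idea above, here is a minimal Python sketch. The names (`Registry`, `handlers`, `referrals`, `answer`) are hypothetical and are not taken from any registry specification; the sketch only assumes that a registry either answers a request itself or passes it to another registry or archive it knows about.

```python
from typing import Callable, Dict, List, Optional, Set


class Registry:
    """Toy registry: answers what it can and delegates the rest (hypothetical design)."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Service name -> handler returning an answer, or None if it does not know.
        self.handlers: Dict[str, Callable[[str], Optional[str]]] = {}
        # Other registries/archives this one "knows" and may delegate to.
        self.referrals: List["Registry"] = []

    def advertised_services(self) -> List[str]:
        """The sub-set of services this registry says it adheres to."""
        return sorted(self.handlers)

    def answer(self, service: str, query: str,
               _seen: Optional[Set[str]] = None) -> Optional[str]:
        # Track visited registries so that mutual referrals cannot loop forever.
        seen = _seen if _seen is not None else set()
        if self.name in seen:
            return None
        seen.add(self.name)

        # Try to answer locally first.
        handler = self.handlers.get(service)
        if handler is not None:
            result = handler(query)
            if result is not None:
                return result

        # Otherwise delegate to a registry/archive "who does" know.
        for other in self.referrals:
            result = other.answer(service, query, seen)
            if result is not None:
                return result
        return None
```

In this sketch a caller only ever talks to one registry; whether the answer came from a local handler or from a referral is invisible to it.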

Any particular registry could adopt a caching policy with greater or lesser intelligence.
For example if it knows that a dataset is not being extended i.e. is frozen because no new
observations are being made, then it can safely cache info about it for a long time. We can
no doubt come up with a list of archive metadata which helps with this sort of optimisation.
Of course some information may not be cacheable e.g. compute power or network bandwidth available
right now, which will vary from moment to moment - here the answer could be delegated to an
appropriate LDAP server.
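
A minimal sketch of such a metadata-driven caching policy, in Python. Everything here is an assumption made for illustration: the metadata flags (`frozen`, `volatile`), the lifetimes, and the `PolicyCache` class are hypothetical, and the real list of useful archive metadata is exactly what would still need to be agreed.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Tuple


@dataclass
class ArchiveMetadata:
    """Hypothetical hints an archive could publish to help a registry cache."""
    frozen: bool = False    # no new observations are being added
    volatile: bool = False  # e.g. compute power or bandwidth available right now


FROZEN_TTL = 30 * 24 * 3600.0  # assumed lifetime for frozen datasets: about a month
DEFAULT_TTL = 6 * 3600.0       # assumed lifetime for datasets still being extended


def ttl_seconds(meta: ArchiveMetadata) -> float:
    """Choose a cache lifetime from the archive metadata."""
    if meta.volatile:
        return 0.0  # never cache; the answer must be fetched (or delegated) each time
    return FROZEN_TTL if meta.frozen else DEFAULT_TTL


class PolicyCache:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def put(self, key: str, value: Any, meta: ArchiveMetadata) -> None:
        ttl = ttl_seconds(meta)
        if ttl > 0:
            self._entries[key] = (self._clock() + ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: the caller should re-query the archive
            return None
        return value
```

The clock is injectable only so that the expiry behaviour can be exercised without waiting; a real registry would presumably also want persistence and explicit invalidation.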

Where I differ from Tom is the view on the ease of implementation. He is almost certainly right in that
an implementation _could_ be done very easily. However in order to obtain acceptable performance
I would guess that a large number of optimisations would be utilised, and this would bring us to Clive's
view in that any project that wants a really really popular registry would need to work hard - and here
an analogy with search engines and Google would be useful. Which project will end up providing the Google
of the IVOA?

The implication from this is that we should define the initial set of services we need and the
information (data and metadata) needed from archives, and perhaps as a separate activity
define archive metadata which will help in optimisations.

Cheers

...David

-----Original Message-----
From: Tom McGlynn [mailto:Thomas.A.McGlynn-at-nasa.gov]
Sent: 06 February 2003 14:32
To: Clive Page
Cc: Arnold Rots; registry-at-ivoa.net; metadata-at-us-vo.org
Subject: Re: Scope of registry

One thing that has come to me in thinking about this issue is that there is potentially a difference between the granularity of a registry and that of a registry service.

Consider a registry as being a table having only high-level (low-granularity) information about services. The services themselves provide some protocol that gives fine-grained information. To give a concrete example, the registry might contain a reference to the Chandra archive, the NTT archive, and so forth. Part of the information the registry has about the Chandra archive is its coverage service, which a user can invoke to get fine-grained information about the position of Chandra observations. In some sense we might think of this as a registry hierarchy: an observation catalog is a 'registry' of the observations described.

However, there is no reason why a registry service that a user (or other software) might invoke couldn't take advantage of both the registry and the coverage services. I could see this working something like DNS services on the Web. When a domain name server is queried about some name, it goes and queries a chain of services until it resolves the name. When it's queried a second time for the same name, it uses a local cache. Users tend to communicate with only a subset of internet nodes, so the relatively small local cache gives a local user almost the same benefit as if it had the full listing of all X billion web addresses.

Similarly, when a registry service is queried about observations in a given region for the first time, it looks at its coarse information to determine possible services and, based upon that and other user criteria, it goes off and gets fine-grained information from the appropriate services. Since this information doesn't change rapidly, and people tend to be interested in the same regions of the sky, the registry service caches the fine-grained information for a period of time of the order of hours or days, perhaps longer for unchanging data sets.
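
The DNS-like behaviour described above could be sketched roughly as follows in Python. All names here (`RegistryEntry`, `RegistryService`, `coverage_service`) are hypothetical, the sky is reduced to whole-degree cells inside simple RA/Dec boxes (ignoring RA wrap-around), and the one-hour default lifetime is an arbitrary assumption; this is an illustration of the idea, not a proposed protocol.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

# A sky cell identified by whole-degree (ra, dec); an assumed simplification.
Region = Tuple[int, int]


@dataclass
class RegistryEntry:
    """Coarse (low-granularity) registry record for one archive."""
    name: str
    ra_range: Tuple[float, float]    # rough RA extent in degrees, no wrap-around
    dec_range: Tuple[float, float]   # rough Dec extent in degrees
    coverage_service: Callable[[Region], List[str]]  # fine-grained lookup

    def may_cover(self, region: Region) -> bool:
        ra, dec = region
        return (self.ra_range[0] <= ra <= self.ra_range[1]
                and self.dec_range[0] <= dec <= self.dec_range[1])


class RegistryService:
    """Uses the coarse registry to pick archives, then caches their fine-grained answers."""

    def __init__(self, entries: Iterable[RegistryEntry],
                 ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.entries = list(entries)
        self.ttl = ttl_seconds
        self.clock = clock
        self.cache: Dict[Tuple[str, Region], Tuple[float, List[str]]] = {}

    def observations(self, region: Region) -> Dict[str, List[str]]:
        results: Dict[str, List[str]] = {}
        for entry in self.entries:
            # Step 1: coarse information rules out archives that cannot match.
            if not entry.may_cover(region):
                continue
            key = (entry.name, region)
            cached = self.cache.get(key)
            if cached is not None and self.clock() < cached[0]:
                # Step 2a: a repeated query is served from the local cache.
                obs = cached[1]
            else:
                # Step 2b: first (or expired) query goes to the archive's own service.
                obs = entry.coverage_service(region)
                self.cache[key] = (self.clock() + self.ttl, obs)
            results[entry.name] = obs
        return results
```

Because only the regions that users actually ask about are cached, the cache stays small relative to the full fine-grained coverage of every archive, which is the same trade-off as in the DNS analogy.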

We don't get static coupling of the various services with the registry, which we'd have if the registry itself contained the fine grained information, but the user is likely to get most of the speed advantage of having the data all in one place. I'd envisage the particular case of registries and coverage services as being a specialization of some more generic support for registry hierarchies.

If we can agree upon a standard protocol by which an archive gives the detailed information, then I think this approach will be easier for all sides: the data providers, who provide only overview information to the registry; the registry builders, who don't need to worry about synchronization of data; and the users, who get the latest information soonest. I don't even think it will be very hard to implement in the registry services.

        Tom McGlynn

Clive Page wrote:
> On Wed, 5 Feb 2003, Arnold Rots wrote:
>
>>So, it becomes a matter of degree.
>
> Yes. And I think the question of the granularity of information in the
> Registry is a matter of debate. It could be that different registries
> have different policies. The AstroGrid project has decided in principle
> that a fine-grained registry is something to aim for (but maybe not in
> version 1.0). Others may have different aims.
>
> It is clear that any query can be answered more definitively by firing
> actual queries to each resource around the world, but the number of such
> resources is getting quite large. And we already know what happens when
> you try that even on a limited scale: just use Astrobrowse to query the
> set of sites they currently have listed and you find that even after a
> minute or so not all the replies have come in. A fine-grained registry
> could, in principle, reduce the number of queries you need to send out
> by quite a considerable factor (few observatories have observed more than
> a tiny fraction of the sky, unless they have done systematic surveys). I
> think that would be nice to have, but I fully accept that it is not easy
> to provide, so must be a matter for debate. At least the debate has now
> started.
>
>
Received on 2003-02-06Z16:06:37