RE: Architecture of IVOA version 0.4

From: Tony Linde <ael-at-star.le.ac.uk>
Date: Thu, 20 May 2004 07:13:57 +0100


> \begin{diatribe}

As I've consistently said, we need a registry spec that allows NVO to implement its coarse-grained registry and AstroGrid to implement its fine-grained registry.

If NVO also doesn't want to use the registry as a channel for apps to get the richer resource metadata, but wants every app to interface directly with the resources themselves, then, again, let's design the spec so that both approaches are accommodated.

All I've said over and over is that we shouldn't close down our options at this stage with restrictive specifications.

> metadata. Comparisons between complex databases are not
> going to be made trivial, at least not in our lifetimes. The

I thought SkyQuery had already prototyped this?

> correct, the system will break down. The AG approach reminds
> me of the folders you find in cheap hotels sometimes, which

I think a better comparison is with the way the commercial world is cooperating on many fronts to allow systems to interoperate by creating common standards for things like orders, invoices, book catalogues, newsfeeds etc. Telling the astronomical community that the VObs will not deliver similar levels of interoperability because the data centres don't want to spend a few hours compiling metadata is not going to go down too well.

And the data centre managers here in the UK that I've spoken to or heard of all want to get their metadata into a VObs-compliant format so that the data *can* be used with other datasets: all they want is to be told what that format is.

Cheers,
Tony.

> -----Original Message-----
> From: owner-architecture-at-eso.org
> [mailto:owner-architecture-at-eso.org] On Behalf Of Robert Hanisch
> Sent: 20 May 2004 03:47
> To: Clive Page; architecture-at-ivoa.net
> Cc: Wil O'Mullane
> Subject: Re: Architecture of IVOA version 0.4
>
> Hi Clive.
>
> I strongly support your suggestion of having a document that
> conveys to the working astronomer what the VO is all about.
> The IVOA architecture document is not the thing. The purpose
> of this is to make sure that we are covering all bases, and
> that connections are understood. It is an internal document.
> We can write something else, but I would not start with the
> architecture document as a basis. An astronomer-friendly
> version of the IVOA Mission/Roadmap might be a better
> starting point (though that, too, needs a major translation effort!).
>
> \begin{who_are_we_doing_this_for?}
> Most working astronomers will not care about how any of the
> VO stuff works. They will care only IF it works, and IF it is
> easy for them to use. They will not care about XML, XSD, SIAP,
> SSAP, ADQL, SOAP, WSDL, HTTP GET/POST, UDDI, or any of the acronyms.
> VOTable vs. FITS, maybe. Most will not understand a simple
> SQL query. As I was discussing with Tamas Budavari
> yesterday, even the term "select" (in SQL parlance) means
> something very different to astronomers than it does to SQL
> speakers. (To be clear, in SQL "select" means "show"; to
> astronomers "select" means "select", i.e., pick those rows
> that match some selection criterion, then "show" the columns
> of that row that interest me.) Something that is obvious to
> you, as a DB-aware astronomer, is unknown to 99% of the
> community we are trying to reach through the VO.
> \end{who_are_we_doing_this_for?}
>
> \begin{the_cognoscenti}
> A few technology-interested astronomers will want to know
> more -- these are the people we are trying to reach through
> our NVO Summer School, for example. They will understand and
> help us to build VO-based applications and tools. But the
> majority will be (justifiably) clueless about everything in
> the previous paragraph.
> \end{the_cognoscenti}
>
> On the ADQL topic, I agree with you -- we should work toward
> a query language that can work against all VO databases:
> catalogs, observation logs, registries. I do not see
> anything so unique about these databases that they cannot be
> queried via the same language.
>
> \begin{diatribe}
> On the interfaces to local relational databases, as I have
> stated many times previously, I think this is best negotiated
> via follow-on (e.g. OpenSkyNode-type) queries once a user has
> determined that a database or collection is of interest based on
> higher level metadata. Comparisons between complex databases are not
> going to be made trivial, at least not in our lifetimes. The
> best we can do is expose - on request - the DB structures to
> knowledgeable users, and provide them with the tools to do the
> cross-correlations and comparisons of interest. I know that
> AstroGrid is taking a fine-grained approach to the registry,
> with a goal of capturing metadata at the level of columns,
> units, and UCDs in registered resources.
> I predict this will fail in the long term. You might get
> compliance now, when funding and political support are strong.
> But when data and service providers see that registering a
> resource requires many hours, if not days, of work, and when
> someone has to review what they've done to make sure it is
> correct, the system will break down. The AG approach reminds
> me of the folders you find in cheap hotels sometimes, which
> include menus of nearby restaurants. The restaurants are
> probably all still there, serving the same type of cuisine at
> the same address and same phone number. The menus have
> probably all changed, and thus what you see in the hotel room
> is partly useful, and mainly (in terms of the bulk of
> information) useless. Yes, you can ping them regularly to
> find out their latest menus, but if they have given up
> complying with your standard for publishing their menu
> because it is too complicated, then you have only the
> essential information -- where to go to get some food, and
> when the restaurant is open. The menu does you no good --
> out of date or not -- if you arrive at 2 am and the
> restaurant closed at 9pm.
> \end{diatribe}
>
> My point is, higher level metadata is the key. Detailed
> metadata about the resource should be provided on request,
> from the resource provider.
>
> Anyway, the IVOA architecture document is not the vehicle for
> explaining the VO to the astronomical world. Let's use it
> for its designated purpose, and find a different way to speak
> to our end-users.
>
> Cheers,
> Bob
>
> ----- Original Message -----
> From: "Clive Page" <cgp-at-star.le.ac.uk>
> To: <architecture-at-ivoa.net>
> Sent: Wednesday, May 19, 2004 7:36 AM
> Subject: Re: Architecture of IVOA version 0.4
>
>
> > Roy
> >
> > Sorry for a belated response, but hope comments on your V0.4 document are
> > still being accepted.
> >
> > I think the VO projects need a document explaining our plans in terms that
> > the astronomer-in-the-street can understand. The majority of astronomers
> > that I meet (other than those directly involved in the VO) seem to think
> > we are simply wasting public money. I'm not sure whether this document is
> > capable of fulfilling that need, but parts of it already come close. To
> > do this it will need footnotes (or appendices) explaining the jargon and
> > TLAs, such as XML.
> >
> > We also need a top-down architectural description, and a reasoned
> > explanation of why each component is needed; this will help fulfil Andy's
> > wish for an analysis which will identify any gaps in our architecture.
> >
> > I think the current draft is lacking in section 2, when it baldly states
> > that there are three broad classes of service: registry services, data
> > services, and compute services. The latter two will be familiar to
> > astronomers, but the registry services will not, and need justification
> > and explanation. Without such background, section 6 will be more or less
> > incomprehensible to most astronomers. A fuller justification may uncover
> > something of a lack of consensus on the centrality of the registry among
> > VO projects, but that's no bad thing: if some VO projects are going to
> > have a more detailed registry than others, we need to be sure that they
> > will interwork properly.
> >
> > The basic idea of a registry is easy to explain: you need at the very
> > least a list of top-level URLs of astronomy data centres and the services
> > they support, otherwise searches will be no better than you can do using
> > Google. The GLU system of CDS is such a basic registry, with the addition
> > of details of the interfaces of each service, so that e.g. cone-searches
> > can be distributed to a number of data centres using a single set of
> > parameters.
> >
> > The addition of more metadata to the registry can be justified on the
> > grounds of making searches more directed: say you want to find all
> > observations of position (ra,dec) in waveband X then having at least crude
> > sky coverage and waveband details attached to each registry entry will
> > make such queries a lot faster, and avoid queries to inappropriate sites.
> >
> > Further justification, I think, comes from the recognition that many data
> > services are fronted by DBMS, but queries to them will be almost
> > impossible to formulate without knowing the name, data types, units, UCDs
> > and maybe other details of each column in each table. Current relational
> > DBMS have no provision for such metadata beyond the bare column name.
> > To make the problems concrete, consider an ADQL/S query like:
> >
> > SELECT * FROM table WHERE REGION('CIRCLE ICRS 123.4, 45.6, 2')
> > AND properMotionRA > 100;
> >
> > The REGION bit is feasible because we have *defined* that its arguments
> > are always in degrees, but the proper-motion bit will work only if the
> > user knows the units or we require them to be the same for every data
> > service in the VO. I think the latter is impractical (Vizier catalogues
> > alone use 19 different units for proper motion, none of them the SI unit
> > for angular velocity). We could require the use of "standard" units in
> > every query with local translations, but I suspect that getting agreement
> > on a set of standard units will be very hard.
> >
> > Alternatively we could require that every table in a VO-compliant DBMS
> > supports a metadata query (such as: give me the units and UCDs of all your
> > columns). If I understand it right, most people seem to think that such
> > queries are best performed on the Registry. Data centre managers who will
> > have to populate these metadata databases will want this under their
> > control, which is likely to lead to a registry associated with each data
> > centre. The local registry will have full details of local datasets, but
> > having full details of all VO datasets in the world might make them too
> > large. The solution seems to be a distributed information system, much
> > like the DNS, with regular harvesting. This structure can perhaps be
> > explained to astronomers in these terms.
> >
> > Section 3: web and grid services. It would be sensible to explain that
> > HTTP get/post services can only be queried using specially tailored form
> > interfaces (or where something like GLU is available to transform query
> > parameters to the form they need). I think that WSDL can be used to
> > describe such interfaces - does this solve the problem of making
> > current HTTP/CGI based services available as VO resources? If so it would
> > be nice to include a short explanation of this.
> >
> > I agree that the main advantage of SOAP services is that WSDL can be used to
> > describe them; does WSDL do everything we want, or is more needed? An
> > outline of UDDI would be worthwhile, explaining why the VO world has
> > rejected it.
> >
> > VOTable format: it would be good to include a description of why the VO
> > needs a new data format. Billions of FITS files exist, so astronomers
> > will want to know why we are inventing a new format which is generally ten
> > times as verbose and which hardly any current software will handle. I
> > think we ought not to ignore the current debates over the structure of XML
> > files holding tabular data: VOTable is something of a compromise over what
> > is convenient for astronomical use, and what is easily parsed by existing
> > XML tools.
> >
> > OpenSkyQuery and ADQL: It concerns me a bit that we are developing or at
> > least considering a number of separate query languages: ADQL for tabular
> > data, with another for image/spectral data, and yet another for registry
> > queries. It would be nice for the architecture document to pull these
> > together, and identify the somewhat conflicting requirements. Then we
> > might be able to work out whether it is feasible to have just one single
> > VO query language, or not.
> >
> > Regards
> >
> >
> > --
> > Clive Page
> > Dept of Physics & Astronomy,
> > University of Leicester,
> > Leicester, LE1 7RH, U.K.
> >
> >
> >
> >
> >
>
Received on 2004-05-20Z06:14:19