Hi Clive.
I strongly support your suggestion of having a document that conveys to the working astronomer what the VO is all about. The IVOA architecture document is not that document. Its purpose is to make sure that we are covering all bases, and that connections are understood. It is an internal document. We can write something else, but I would not use the architecture document as the basis. An astronomer-friendly version of the IVOA Mission/Roadmap might be a better starting point (though that, too, needs a major translation effort!).
\begin{who_are_we_doing_this_for?}
Most working astronomers will not care about how any of the VO stuff works.
They will care only IF it works, and IF it is easy for them to use. They
will not care about XML, XSD, SIAP, SSAP, ADQL, SOAP, WSDL, HTTP GET/POST,
UDDI, or any of the acronyms. VOTable vs. FITS, maybe. Most will not
understand a simple SQL query. As I was discussing with Tamas Budavari
yesterday, even the term "select" (in SQL parlance) means something very
different to astronomers than it does to SQL speakers. (To be clear, in SQL
"select" means "show"; to astronomers "select" means "select", i.e., pick
those rows that match some selection criterion, then "show" the columns of
those rows that interest me.) Something that is obvious to you, as a DB-aware
astronomer, is unknown to 99% of the community we are trying to reach
through the VO.
\end{who_are_we_doing_this_for?}
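To make the "select" distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, column names, and values are invented for illustration and do not come from any real VO catalogue.

```python
import sqlite3

# Hypothetical toy catalogue held in memory; all names and values are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stars (name TEXT, ra_deg REAL, dec_deg REAL, mag REAL)"
)
conn.executemany(
    "INSERT INTO stars VALUES (?, ?, ?, ?)",
    [
        ("A", 10.0, -5.0, 8.2),
        ("B", 123.4, 45.6, 12.9),
        ("C", 200.1, 30.0, 15.4),
    ],
)

# SQL "SELECT" only names the columns to show; every row comes back.
shown = conn.execute("SELECT name, mag FROM stars ORDER BY name").fetchall()

# The astronomer's "select" (pick the rows that match a criterion)
# is expressed by the WHERE clause, not by SELECT.
picked = conn.execute(
    "SELECT name, mag FROM stars WHERE mag < 13 ORDER BY name"
).fetchall()

print(shown)
print(picked)
```

The first query shows two columns for all three rows; only the second, with its WHERE clause, does what an astronomer would call selecting.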
\begin{the_cognescenti}
A few technology-interested astronomers will want to know more -- these are
the people we are trying to reach through our NVO Summer School, for
example. They will understand and help us to build VO-based applications
and tools. But the majority will be (justifiably) clueless about everything
in the previous paragraph.
\end{the_cognescenti}
On the ADQL topic, I agree with you -- we should work toward a query language that can work against all VO databases: catalogs, observation logs, registries. I do not see anything so unique about these databases that they cannot be queried via the same language.
\begin{diatribe}
On the interfaces to local relational databases, as I have stated many times
previously, I think this is best negotiated via follow-on (e.g.
OpenSkyNode-type) queries once a user has determined that a database or
collection is of interest based on higher level metadata. Comparisons
between complex databases are not going to be made trivial, at least not in
our lifetimes. The best we can do is expose - on request - the DB
structures to knowledgeable users, and provide them with the tools to do the
cross-correlations and comparisons of interest. I know that AstroGrid is
taking a fine-grained approach to the registry, with a goal of capturing
metadata at the level of columns, units, and UCDs in registered resources.
I predict this will fail in the long term. You might get compliance now,
when funding and political support are strong. But when data and service
providers see that registering a resource requires many hours, if not days,
of work, and when someone has to review what they've done to make sure it is
correct, the system will break down. The AG approach reminds me of the
folders you find in cheap hotels sometimes, which include menus of nearby
restaurants. The restaurants are probably all still there, serving the same
type of cuisine at the same address and same phone number. The menus have
probably all changed, and thus what you see in the hotel room is partly
useful, and mainly (in terms of the bulk of information) useless. Yes, you
can ping them regularly to find out their latest menus, but if they have
given up complying with your standard for publishing their menu because it
is too complicated, then you have only the essential information -- where to
go to get some food, and when the restaurant is open. The menu does you no
good -- out of date or not -- if you arrive at 2 am and the restaurant
closed at 9 pm.
\end{diatribe}
My point is that higher-level metadata is the key. Detailed metadata about a resource should be provided on request, by the resource provider.
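As a rough sketch of that two-stage approach: a central registry holds only coarse metadata, and column-level detail is fetched from the provider when a user asks for it. Everything here (RegistryEntry, find_candidates, fetch_columns, the sample entries and units) is hypothetical and stands in for real registry and provider interfaces; the lambdas take the place of a follow-on network request.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RegistryEntry:
    """Coarse, slowly changing metadata kept in the central registry."""
    name: str
    waveband: str
    # Stand-in for a follow-on request to the provider for column metadata.
    fetch_columns: Callable[[], Dict[str, str]]


def find_candidates(registry: List[RegistryEntry], waveband: str) -> List[RegistryEntry]:
    """Stage 1: narrow the search using higher-level metadata only."""
    return [entry for entry in registry if entry.waveband == waveband]


# Hypothetical registry contents, for illustration only.
registry = [
    RegistryEntry("xray_survey", "x-ray", lambda: {"flux": "erg/cm2/s"}),
    RegistryEntry("optical_catalog", "optical", lambda: {"pm_ra": "mas/yr", "mag": "mag"}),
]

candidates = find_candidates(registry, "optical")

# Stage 2: ask only the interesting providers for their detailed metadata.
details = {entry.name: entry.fetch_columns() for entry in candidates}
print(details)
```

The design point is that the registry never stores the column names or units itself, so it cannot go stale when a provider changes its tables.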
Anyway, the IVOA architecture document is not the vehicle for explaining the VO to the astronomical world. Let's use it for its designated purpose, and find a different way to speak to our end-users.
Cheers,
Bob
> Roy
>
> Sorry for a belated response, but hope comments on your V0.4 document are
> still being accepted.
>
> I think the VO projects need a document explaining our plans in terms that
> the astronomer-in-the-street can understand. The majority of astronomers
> that I meet (other than those directly involved in the VO) seem to think
> we are simply wasting public money. I'm not sure whether this document is
> capable of fulfilling that need, but parts of it already come close. To
> do this it will need footnotes (or appendices) explaining the jargon and
> TLAs, such as XML.
>
> We also need a top-down architectural description, and a reasoned
> explanation of why each component is needed; this will help fulfil Andy's
> wish for an analysis which will identify any gaps in our architecture.
>
> I think the current draft is lacking in section 2, when it baldly states
> that there are three broad classes of service: registry services, data
> services, and compute services. The latter two will be familiar to
> astronomers, but the registry services will not, and need justification
> and explanation. Without such background, section 6 will be more or less
> incomprehensible to most astronomers. A fuller justification may uncover
> something of a lack of consensus on the centrality of the registry among
> VO projects, but that's no bad thing: if some VO projects are going to
> have a more detailed registry than others, we need to be sure that they
> will interwork properly.
>
> The basic idea of a registry is easy to explain: you need at the very
> least a list of top-level URLs of astronomy data centres and the services
> they support, otherwise searches will be no better than you can do using
> Google. The GLU system of CDS is such a basic registry, with the addition
> of details of the interfaces of each service, so that e.g. cone-searches
> can be distributed to a number of data centres using a single set of
> parameters.
>
> The addition of more metadata to the registry can be justified on the
> grounds of making searches more directed: say you want to find all
> observations of position (ra,dec) in waveband X then having at least crude
> sky coverage and waveband details attached to each registry entry will
> make such queries a lot faster, and avoid queries to inappropriate sites.
>
> Further justification, I think, comes from the recognition that many data
> services are fronted by DBMS, but queries to them will be almost
> impossible to formulate without knowing the name, data types, units, UCDs
> and maybe other details of each column in each table. Current relational
> DBMS have no provision for such metadata beyond the bare column name.
> To make the problems concrete, consider an ADQL/S query like:
>
> SELECT * FROM table WHERE REGION('CIRCLE ICRS 123.4, 45.6, 2')
> AND properMotionRA > 100;
>
> The REGION bit is feasible because we have *defined* that its arguments
> are always in degrees, but the proper-motion bit will work only if the
> user knows the units or we require them to be the same for every data
> service in the VO. I think the latter is impractical (Vizier catalogues
> alone use 19 different units for proper motion, none of them the SI unit
> for angular velocity). We could require the use of "standard" units in
> every query with local translations, but I suspect that getting agreement
> on a set of standard units will be very hard.
>
> Alternatively we could require that every table in a VO-compliant DBMS
> supports a metadata query (such as: give me the units and UCDs of all your
> columns). If I understand it right, most people seem to think that such
> queries are best performed on the Registry. Data centre managers who will
> have to populate these metadata databases will want this under their
> control, which is likely to lead to a registry associated with each data
> centre. The local registry will have full details of local datasets, but
> having full details of all VO datasets in the world might make them too
> large. The solution seems to be a distributed information system, much
> like the DNS, with regular harvesting. This structure can perhaps be
> explained to astronomers in these terms.
>
> Section 3: web and grid services. It would be sensible to explain that
> HTTP get/post services can only be queried using specially tailored form
> interfaces (or where something like GLU is available to transform query
> parameters to the form they need). I think that WSDL can be used to
> describe such interfaces - does this solve the problem of making
> current HTTP/CGI based services available as VO resources? If so it would
> be nice to include a short explanation of this.
>
> I agree that the main advantage of SOAP is that WSDL can be used to
> describe them; does WSDL do everything we want, or is more needed? An
> outline of UDDI would be worthwhile, explaining why the VO world has
> rejected it.
>
> VOTable format: it would be good to include a description of why the VO
> needs a new data format. Billions of FITS files exist, so astronomers
> will want to know why we are inventing a new format which is generally ten
> times as verbose and which hardly any current software will handle. I
> think we ought not to ignore the current debates over the structure of XML
> files holding tabular data: VOTable is something of a compromise over what
> is convenient for astronomical use, and what is easily parsed by existing
> XML tools.
>
> OpenSkyQuery and ADQL: It concerns me a bit that we are developing or at
> least considering a number of separate query languages: ADQL for tabular
> data, with another for image/spectral data, and yet another for registry
> queries. It would be nice for the architecture document to pull these
> together, and identify the somewhat conflicting requirements. Then we
> might be able to work out whether it is feasible to have just one single
> VO query language, or not.
>
> Regards
>
>
> --
> Clive Page
> Dept of Physics & Astronomy,
> University of Leicester,
> Leicester, LE1 7RH, U.K.
>
>
>
>
>
Received on 2004-05-20Z02:44:03