The IVOA in 2007: Assessment and Future Roadmap

From: Roy Williams <roy-at-cacr.caltech.edu>
Date: Mon, 18 Jun 2007 09:45:47 -0700


On behalf of the IVOA Technical Coordination Committee, we are pleased to release a report

"The IVOA in 2007: Assessment and Future Roadmap"

which is available at:
http://www.ivoa.net/internal/IVOA/TechnicalMilestones/IVOARoadMap-2007-final.pdf

This document summarizes many of the issues that were discussed before and during the recent Interop meeting in Beijing, as reproduced below. Each issue comes with one or more Recommendations intended to cement the strong IVOA collaboration, and it is hoped that the Working Groups and national-VO projects can help to maintain this unity.

The report also lists the roadmap for each Working Group, in terms of the dates at which standards documents will reach the stages of Working Draft, Proposed Recommendation, and Recommendation.

Your comments are welcome, either to us alone or to the 22 members of the TCG listed in the report, who can be reached at tcg-at-ivoa.net.

Roy Williams
TCG Chair

Christophe Arviset
TCG vice-Chair


6. Leading Issues
(1) Registry Graininess

The issue of registry graininess has been ongoing throughout the development of the registry framework. The so-called “fine-grained registry” approach encourages capturing detailed, possibly dynamic metadata into the registry. The motivation for this is not just to enable sophisticated resource discovery but also to aid automated planning for and execution of service-driven applications. In contrast, the “coarse-grained registry” approach prefers a registry that restricts itself to more general metadata, with the expectation that more detailed information would be accessed directly from the service. The major concern about fine-grained information is one of registry metadata curation: detailed metadata is more likely to be either incorrect or not provided at all. This concern applies especially to metadata that is available directly from the resource (e.g. table metadata); without a tight coupling between the registry and services, it’s possible for the metadata in the registry to become out of sync or out of date with respect to the resource. This adds to the already large curation costs that registries face in keeping metadata quality good enough for the registries to be practically useful. Today, we are beginning to see applications being built around fine-grained metadata in a registry, though we have not yet effectively addressed the curation issues. Because all metadata are shared across all registries via the harvesting stream, every registry must deal at some level with the associated curation costs, regardless of whether it wishes to support fine-grained applications.
Recommendation: Since the current registry upgrade is a necessary step prior to putting into place more effective curation practices, we recommend that the upgrade be completed as soon as possible and at the highest priority. Further changes in the relevant standards that could delay the completion of the upgrade should be avoided.
Recommendation: Curation practices aimed at improving metadata quality are needed to catch up with the desire to develop applications based on fine-grained registries. After the upgrade is complete, we recommend shifting greater focus to putting such practices into place, including effective use of automated validation of resource metadata and the standard services they describe.
Recommendation: Extension schemas that are expected to be widely supported across all VO registries must be put through the IVOA standardization process. Projects that wish to introduce extensions that are intended only for local support should consult with the Registry Working Group (RWG) regarding possible impact on all registries. Documentation in the form of an IVOA Note or, at least, an RWG wiki page is recommended.
Recommendation: After the completion of the upgrade, the Registry WG and Grid/Web Services WG should develop mechanisms for harvesting more of the fine-grained metadata directly from services (through the VO Standard Interface (VOSI) specification), and for reducing the metadata that gets shared on the harvesting stream. A registry will then have greater control over how much information it manages within the context of its store.

(2) GetCapabilities method for Services

Another driver for making more detailed information available directly from the service has been pursued by the DAL WG: they wish to make the next generation of services more self-describing, independent of the registry. In particular, if the service can reveal its capabilities and behaviors directly, then service clients can negotiate directly with the service. It is expected that such information might often be generated either transparently or dynamically by the service implementation, and (therefore) it will be more up-to-date than the registry. The proposed way of getting this information to clients is via a getCapabilities method. There is still considerable discussion going on regarding the details of exactly what information is returned and in what form, which has been holding up the advancement of critical service specifications (SSA, TAP). Further complicating the discussion is the issue of registry graininess and how registries should get this information -- see (1) above.
Recommendation: In an effort to allow critical specifications to go forward, first-generation techniques for accessing service behaviors from the services should be adopted for current protocols, and the getCapabilities method should be spun off and incorporated into the VO Standard Interfaces (VOSI) specification. This will allow client development based on getCapabilities to go forward without holding back first implementations of SSA and TAP.

(3) Dependencies in IVOA Recommendations

Some IVOA standards are proceeding to Recommendation while depending on IVOA documents that are not themselves Recommendations. One of the first was VOEvent, with a dependency on STC (not a Rec as of this writing, May 2007). Other places where such a dependency could occur are UCD versioning, the VODataService (not a Rec) and VO Support Interfaces.
Recommendation: The rules for IVOA standards should include a provision that a Rec cannot, in principle, depend on non-Recs.

(4) Footprints in the Registry

It would be very useful for some registry records to contain a footprint specification, so that machines can decide if a given point or region intersects the coverage of a dataset or service. Currently the registry record can contain either (de facto) free text, or a full STC (Space Time Coordinates) record.
Recommendation: The registry WG should allow and encourage multiple ways to specify a footprint, including: free text; STC; a restricted subset of STC (e.g. BOX, CIRCLE); pointers to footprint services; and ways by which footprints can be created by probing a service directly.
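As an illustration of the machine decision described above, the following is a minimal sketch of testing whether a point falls inside a circular footprint or inside an RA/Dec interval. The function names and the degree-based conventions are assumptions made for this example; they are not taken from STC or from any registry schema, and the interval test only stands in for a BOX-like region.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two RA/Dec points (degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula; it depends only on sin^2 of half the RA difference,
    # so RA wrap-around through 0/360 needs no special handling.
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def circle_contains(center_ra, center_dec, radius_deg, ra, dec):
    """True if (ra, dec) lies within radius_deg of the circle centre."""
    return angular_separation_deg(center_ra, center_dec, ra, dec) <= radius_deg


def interval_contains(ra_min, ra_max, dec_min, dec_max, ra, dec):
    """True if (ra, dec) lies in the RA/Dec interval.

    Assumption for this sketch: ra_min > ra_max means the interval
    wraps through RA = 0.
    """
    if not (dec_min <= dec <= dec_max):
        return False
    ra %= 360.0
    if ra_min <= ra_max:
        return ra_min <= ra <= ra_max
    return ra >= ra_min or ra <= ra_max
```

A client could call `circle_contains(300, 10, 1, 300, 10.5)` to ask whether the point at RA 300, Dec 10.5 lies in a one-degree disk centred on RA 300, Dec 10.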

(5) Registry Harvesting and concatenated XML

A problem has emerged in the last year concerning the XML documents that registries exchange when harvesting each other, and this is blocking the progress of the VOResource standard to Recommendation. A set of these documents (instances of VOResource) is handled by the registry with the (false) assumption that a concatenation of valid XML documents is also valid. The problem is with the ID construct in XML, whose values must be unique within a document. In particular, the STC schema uses these IDs to identify coordinate systems for spatial coverage, although this is a general XML problem, not one specific to STC. A user might write
ID="UTC-FK5-GEO" href="ivo://STClib/CoordSys#UTC-FK5-GEO"
meaning the ID value can be used as an abbreviation of the referent (href value). However, if the same abbreviation is declared elsewhere in the document, the XML rules make it invalid, hence the problem with concatenating documents that all use the same coordinate system. A solution is emerging based on the following agreements: (a) the ID value can and will be changed arbitrarily in an XML document without changing the essential information; (b) this is easier to do if all ID values are easy to find in the XML; therefore (c) parsing software for the XML document must make decisions based on the referent value, not the ID value; and (d) the referent of the ID must be well-defined and stable, so that parsing software can recognize it.
Recommendation: IVOA standards should try to avoid use of the ID/IDREF mechanism, unless they have good reason to believe that conforming document instances are unlikely ever to be concatenated.
Recommendation: The IVOA registry group should develop a general approach for recognizing this pattern and handling such documents in the registry.
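Agreements (a) to (d) can be sketched in a few lines of Python. This is only an illustration, not the registry's actual harvesting code: the element names (Resource, AstroCoordSystem, AstroCoords, Harvest) and the reference attribute coord_system_id are assumptions made for the example, and the standard-library parser used here does not itself enforce ID uniqueness. The sketch renames every ID while concatenating, so the combined document has no duplicates, and resolves references through the stable href referent rather than through the ID value.

```python
import xml.etree.ElementTree as ET

# A hypothetical harvested record; two copies declare the same ID abbreviation.
RECORD = (
    '<Resource>'
    '<AstroCoordSystem ID="UTC-FK5-GEO" '
    'href="ivo://STClib/CoordSys#UTC-FK5-GEO"/>'
    '<AstroCoords coord_system_id="UTC-FK5-GEO"/>'
    '</Resource>'
)


def concatenate(records, ref_attr="coord_system_id"):
    """Combine records under one root, rewriting IDs so they stay unique."""
    container = ET.Element("Harvest")
    counter = 0
    for text in records:
        root = ET.fromstring(text)
        renames = {}
        # (a)/(b): find every ID value and replace it with an arbitrary new one.
        for elem in root.iter():
            old = elem.get("ID")
            if old is not None:
                counter += 1
                renames[old] = f"id-{counter}"
                elem.set("ID", renames[old])
        # Keep references inside the same record pointing at the renamed ID.
        for elem in root.iter():
            ref = elem.get(ref_attr)
            if ref in renames:
                elem.set(ref_attr, renames[ref])
        container.append(root)
    return container


def referent_of(container, elem, ref_attr="coord_system_id"):
    """(c)/(d): resolve a reference to its stable href, ignoring the ID text."""
    target = elem.get(ref_attr)
    for candidate in container.iter():
        if candidate.get("ID") == target:
            return candidate.get("href")
    return None


merged = concatenate([RECORD, RECORD])
ids = [e.get("ID") for e in merged.iter() if e.get("ID") is not None]
referents = [referent_of(merged, e) for e in merged.iter("AstroCoords")]
```

After the merge, the two records carry different ID values but both references still resolve to the same referent, which is what a consumer should base its decisions on.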

(6) SOAP and REST

In the IVOA, the term "web service" generally implies either a SOAP or a GET/POST/REST-type service protocol. The latter are simpler to understand and implement, and the software is much less complex and bug-infested; they are therefore preferable for simple services. However, in some cases the extra sophistication of SOAP makes it optimal. A significant advantage of SOAP services is that it is easy to create a formal interface document (WSDL), whereas this is more difficult for GET/POST/REST services (done by hand).
Recommendation: The Grid/Web Services WG should conduct a study to understand where SOAP is sufficiently advantageous and where the easier GET/POST/REST can do the job just as well. The Grid/Web Services WG should re-examine the utility of the “VO WS Basic Profile” document in the light of the results of the study.

(7) Asynchronous services

As the VO concept matures, asynchronous services are emerging, where the response to a request is not the answer, but rather a way to check on the running service, which will eventually produce the answer. Asynchronous services are already deployed (UK-VO, US-VO, France-VO, Euro-VO), and standards are converging. The GWS-WG proposal (called UWS) has the paradigm Initialize job / Upload input / Receive quote / Run job / Poll status / Fetch results; and the DAL proposal integrates asynchrony with astronomical services through the stageData / getData / AccessReference attributes of the S*AP protocols. The Table Access Protocol (TAP) (see (10)) is being developed with an asynchronous capability.
Recommendation: Implementors of asynchronous services should utilize the UWS pattern. The DAL stageData protocol should be implemented using the UWS pattern. The TAP should base its asynchronous operations on UWS.
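The Initialize / Upload / Quote / Run / Poll / Fetch paradigm can be pictured with a small in-memory toy. This is a sketch of the pattern only: the class, its method names, and the phase strings are hypothetical and do not reproduce the UWS interface, and the "work" is just summing the uploaded numbers over a few simulated ticks.

```python
class ToyJob:
    """In-memory stand-in for an asynchronous job (hypothetical API)."""

    def __init__(self):
        # Initialize job: nothing has been uploaded or run yet.
        self.phase = "INITIALIZED"
        self.inputs = []
        self.results = None
        self._ticks_left = 0

    def upload(self, values):
        # Upload input.
        self.inputs = list(values)

    def quote(self):
        # Receive quote: here, one simulated tick per input value.
        return len(self.inputs)

    def run(self):
        # Run job: returns immediately; the answer arrives later.
        self._ticks_left = self.quote()
        self.phase = "RUNNING"

    def poll(self):
        # Poll status: each poll advances the simulated work by one tick.
        if self.phase == "RUNNING":
            if self._ticks_left > 0:
                self._ticks_left -= 1
            if self._ticks_left == 0:
                self.results = sum(self.inputs)
                self.phase = "COMPLETED"
        return self.phase

    def fetch(self):
        # Fetch results: only valid once the job has completed.
        if self.phase != "COMPLETED":
            raise RuntimeError("results are not ready yet")
        return self.results


job = ToyJob()
job.upload([1, 2, 3])
estimate = job.quote()
job.run()
phases = [job.poll() for _ in range(estimate)]
answer = job.fetch()
```

The point of the pattern is visible in `run()`: the call does not return the answer, and the client learns that the job is done only by polling before it fetches the results.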

(8) Data Models and utypes

The concept of "utype" was defined in the IVOA as a response to the fuzzy nature of the UCD descriptor: if a quantity has a utype, then it must be part of a specific data model. Proper utypes would allow queries to be built independent of the underlying database structure ("where STC.coords.FK5.RA between 300 and 302"), and would provide a strong framework for parameter-based queries ("http://.....? STC.coords.FK5.RA = 300 &..."). However, many of the data models in use in the IVOA have only an XML representation, and are not represented as a hierarchy of utype values. We note that the syntax of utypes is not well defined in the IVOA, and also that in simple cases the utype can be cleanly derived from the XPath representation of an XML element, so this should be a straightforward matter.
Recommendation: A subcommittee of the IVOA, consisting of the relevant persons across the various WGs (at least DM, UCD, VOQL) should review the situation of utypes within IVOA. The syntax of utype and its namespaces should be well-defined. Just as with UCDs, there should be services to find relevant data models and their utypes from search words, and there should be services to trace a given utype back to its precise meaning.
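For the simple case mentioned above, where a utype follows directly from an element's path, a minimal sketch might look like this. The dotted syntax mirrors the STC.coords.FK5.RA example in the text, but since utype syntax is not yet well defined, the function name and the sample XML are assumptions for illustration; namespaces, attributes, and repeated elements are deliberately ignored.

```python
import xml.etree.ElementTree as ET


def utypes_from_xml(text):
    """Map dotted element paths to the text of each leaf element."""
    root = ET.fromstring(text)
    found = {}

    def walk(elem, prefix):
        path = f"{prefix}.{elem.tag}" if prefix else elem.tag
        children = list(elem)
        if not children:
            # A leaf element: record its path as a candidate utype.
            found[path] = (elem.text or "").strip()
        for child in children:
            walk(child, path)

    walk(root, "")
    return found


# Hypothetical instance document, shaped to match the example in the text.
SAMPLE = "<STC><coords><FK5><RA>300</RA><Dec>10</Dec></FK5></coords></STC>"
utypes = utypes_from_xml(SAMPLE)
```

Here `utypes` maps "STC.coords.FK5.RA" to "300" and "STC.coords.FK5.Dec" to "10", which is the kind of name a query could then refer to without knowing the underlying database structure.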

(9) Space-Time Coordinates

This large and comprehensive working draft has become a de facto standard in the IVOA through multiple implementations, yet it is still not a Recommendation. The IVOA should take firm action to resolve the status of STC. While there are several software packages that use STC, none of them exercises *every* part of the proposed standard. Further, implementers often complain about the complexity of STC -- countered by the contention that astronomical coordinate systems are complex by nature. What astronomers want in this area is both assurance that full rigor and precise coordinates are available in the IVOA, and release from complexity when that full rigor is not deemed necessary by the astronomer.
Recommendation: In addition to STC, there must be a simpler system for everyday use, with mappings to full STC well-defined. It is a matter of defaults. For example, if the information in the simple system is just RA and Dec numbers, this can map to the FK5 system with reference point at the barycenter of the solar system and the epoch 2000.0. Regions that are disks and RA/Dec intervals should be expressible in just a few characters. Alternate syntaxes should provide a straightforward way for a client not only to recognize their use, but also to recognize their mapping into full STC.
Recommendation: Applications and standards should clearly describe the subset of STC that they are using; for example, the Registry uses CIRCLE and BOX, and VOEvent uses longitude, latitude, and error radius. This will allow consumers to build applications against these common subsets. However, STC content beyond these subsets should still be recognized, and either be fully used or fail gracefully.
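The "matter of defaults" can be sketched as follows. The short "CIRCLE ra dec radius" string, the dictionary keys, and the function names are all hypothetical; they only illustrate how a terse form might expand to a fuller description using the defaults named above (FK5, solar-system barycenter, epoch 2000.0), and how an unrecognized form can be rejected cleanly.

```python
# Defaults taken from the example in the text; the key names are assumptions.
DEFAULTS = {
    "frame": "FK5",
    "reference_position": "BARYCENTER",
    "epoch": 2000.0,
}


def expand_position(ra, dec, **overrides):
    """Expand bare RA/Dec numbers into a fuller description with defaults."""
    full = dict(DEFAULTS)
    full.update(overrides)  # an explicit value always wins over a default
    full["ra"] = float(ra)
    full["dec"] = float(dec)
    return full


def parse_circle(text):
    """Parse a hypothetical short form: 'CIRCLE <ra> <dec> <radius>'."""
    parts = text.split()
    if len(parts) != 4 or parts[0].upper() != "CIRCLE":
        # Fail gracefully on anything outside the supported subset.
        raise ValueError(f"unsupported region syntax: {text!r}")
    region = expand_position(parts[1], parts[2])
    region["shape"] = "CIRCLE"
    region["radius"] = float(parts[3])
    return region


disk = parse_circle("CIRCLE 300 10 0.5")
```

A caller who does need more rigor can override any default, for instance `expand_position(300, 10, epoch=1950.0)`.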

(10) Table Access Protocol

The TAP is under development by an IVOA subcommittee. The TCG expects that it will specify how an ADQL (or optionally an SQL) query can be submitted to a service for processing. The response will be in VOTable format.
Recommendation: The TAP should build on existing IVOA standards. Initial versions of the protocol should state clearly what it will eventually define, but mandate only the minimum necessary to ensure that a public release of the protocol is achieved without delay.

(11) Multiple Data Access

A principal justification of the VO itself is to encourage statistical studies of populations of astronomical objects, as well as the more traditional single-object study. The IVOA should encourage this through multi-point protocols, bulk data access, and scalability of services to the grid.
Recommendation: Data access protocols should be re-considered in terms of their ability to handle multiple requests and bulk data. The Cone and SIA services, in particular, do not handle multiple requests.

(12) VO interoperability with popular software

Most astronomers do most of their work with software packages like IDL, IRAF, DS9, MIDAS, SExtractor, etc. It is highly desirable that these be interoperable with the VO framework through use of VO services and desktop messaging.
Recommendation: The VO national projects and Applications WG should assess VO interoperability with these popular astronomy software packages and environments.

(13) Bundling of VO software

Bundling of astronomy software such as the Scisoft and ex-Starlink collections provides a convenient way of distributing many packages at once to ease the burden of installation. Bundled distributions of VO software would assist in uptake of VO tools, and we note that Scisoft VII will contain a selection of VO software.
Recommendation: The list of VO Applications maintained on the (publicly editable) Apps WG wiki pages should serve as a place for Applications to be visible to parties compiling collections of VO tools.

(14) Interoperable Security

Security and authentication are being implemented in several new efforts. The Astrogrid (UK-VO) project has built a sophisticated workflow system for asynchronous computations and is adding authentication; a complementary project from the US NVO project is exploring the idea of “graduated security” for giving community access to high-performance computing. While the IVOA has a mature Single-signon standard for security, using X.509 certificates, there has been little discussion of which VO projects are issuing certificates, the levels of authentication taking place, and which VO projects will accept certificates from which other projects.
Recommendation: The Grid/Web Services WG should create a listing of certificate authorities in the national projects, how to get a certificate from each, what can be done with the certificate, and compliance with accreditation guidelines (e.g. PMA).
Recommendation: VO organizations should encourage their users to obtain certificates from PMA-accredited Certificate Authorities where possible, or, failing this, from properly accredited certificate authorities inside the VObs movement.
Recommendation: Service providers in the VO should be encouraged to accept by default only certificates from PMA-accredited certificate authorities and certificate authorities accredited within the VO movement. They may choose also to accept "weak certificates" for cases where the providers deem this to be sufficiently safe.
Recommendation: The IVOA should choose a set of guidelines for VO-accredited certificate authorities, basing the guidelines on those for the PMA-accredited authorities.

(15) Units

Most scientific quantities carry units, and data returned from IVOA services should also carry explicit unit information when the units are not implicitly clear. Units should follow the IAU recommendation and/or the VOTable recommendation. When a user makes a query based on a quantity, units can either be user-defined or fixed. In the former case, the user has the freedom to express the quantity in arbitrary units (e.g. calories per square furlong per hour!), or to pick from an enumerated choice (e.g. Angstroms OR nanometers). In the case of fixed units, the data model of the query is bound to specific units (e.g. all angles must be in decimal degrees).
Recommendation: The Data Model Working Group should study how units are used in IVOA views and services, where it would be appropriate to simply fix the units, and where it is necessary to allow freedom of choice, distinguishing between unit choice in the user interface and in the back-end services. In the latter case, the report should also recommend how unit conversion is implemented: who is responsible and the nature of the software.
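To make the distinction between an enumerated choice and a fixed unit concrete, here is a minimal sketch in which a front end accepts a wavelength in one of a few enumerated units and converts it to the single unit a back-end service is assumed to require. The table of units, the unit labels, and the function name are hypothetical; where such a conversion should live, and who is responsible for it, is exactly the open question the recommendation raises.

```python
# Enumerated choice of length units, expressed in metres per unit.
METERS_PER_UNIT = {
    "m": 1.0,
    "nm": 1e-9,
    "Angstrom": 1e-10,
}


def to_fixed_unit(value, unit, fixed_unit="nm"):
    """Convert a length from a user-chosen unit to the service's fixed unit."""
    if unit not in METERS_PER_UNIT or fixed_unit not in METERS_PER_UNIT:
        raise ValueError(f"unit not in the enumerated choice: {unit!r}")
    return value * METERS_PER_UNIT[unit] / METERS_PER_UNIT[fixed_unit]


wavelength_nm = to_fixed_unit(5000, "Angstrom")
```

In this sketch a value of 5000 Angstroms reaches the back end as approximately 500 nanometers, while a unit outside the enumerated list is rejected rather than guessed at.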

(16) IVOA Newsletter

Recommendation: The global VO community would be well-served by an IVOA newsletter, including announcements from national projects and working groups, events, press coverage of VO issues, etc.

Received on 2007-06-18Z16:46:05