17/3/02.
Francois, Roy,
Herewith some comments on version 0.94 of the VOTable standard, as
requested. Apologies for not getting them to you earlier, but they
should be just in time for the 18 March deadline. I'll start with general
points and then move on to more specific ones.
- Can the various figures and tables be numbered and have (short)
captions?
- There are various quirks in the English. Below I've only noted the
ones that seriously affect the meaning of the standard. Do you intend to
tidy up the wording once we've agreed the technical content?
- I appreciate that the authors are given in strict alphabetical order by
surname, but nonetheless I'm somewhat embarrassed to find myself the
first author. You have done most of the work and I'd have thought that
you ought to be the primary authors, maybe putting yourselves first and
then everyone else in alphabetical order, as for the paper for the
Garching conference. However, I'm happy to go with whatever you
decide.
My affiliation should be `University of Edinburgh, UK' (the Royal
Observatory is the place where I work; I'm employed by the University
of Edinburgh).
Dave Giaretta's affiliation should be `Rutherford Appleton Laboratory,
UK'.
- p2, Data Storage. The first paragraph is confusing. Maybe change the
first couple of sentences to: `The tabular data in a VOTable may be
represented using one of three different formats: TABLEDATA, FITS and
BINARY. TABLEDATA is a pure XML representation so that small...'
- p3, 2nd paragraph, last sentence. Change `... is a multidimensional
arrays;' to `... is a multidimensional array;' Also change `further release'
to `subsequent release' or `subsequent version of the standard', and I'd
be tempted to change `will use' to `might use'. We will surely discuss
such changes further (but not until after version 1), and I'm not
convinced that allowing arbitrary structures to be included as table
elements is necessary (or even desirable). Does this sentence belong in
the next section, `Future'?
- p3, `Future', first sentence. Change `usage' to `experience gained
using the VOTable'.
- p3, `Future', 3rd paragraph. As I've said previously, I regard the idea
of using an empty VOTable as a query as sophistry.
- p4, Section 2.1, 2nd paragraph. `The 12 primitive types listed in the
table above all have the same length in bytes'. No they don't! A double
complex is 16 bytes whereas a logical is 1 (as the table says).
There is also the more abstract point of what the data type actually
means for a TABLEDATA representation (see below).
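To make the point about lengths concrete, here is a minimal Python sketch
that computes the binary size of a few of the types. The type labels and
the struct format codes are assumptions chosen for illustration; they are
not copied from the table in the draft.

```python
import struct

# Illustrative subset of types. The labels and struct codes are assumptions
# made for this sketch, not the draft's own table.
FORMATS = {
    "logical": ">c",
    "short integer": ">h",
    "integer": ">i",
    "single precision real": ">f",
    "double precision real": ">d",
    "double complex": ">2d",   # two 8-byte reals: real and imaginary parts
}

# Size in bytes of each type when written in big-endian binary form.
SIZES = {name: struct.calcsize(fmt) for name, fmt in FORMATS.items()}

for name, size in SIZES.items():
    print(f"{name}: {size} byte(s)")
```

The sizes range from 1 byte for a logical to 16 for a double complex, so
the types clearly do not share one length.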
- p4, Section 2.2, 2nd sentence. Change `... changing fasts,' to `...
changing fastest,'. Personally I'd be happy not to have multidimensional
arrays in version 1 and defer them to a subsequent version,
but presumably somebody wanted them (they are strictly necessary for
FITS compatibility) and I don't object.
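For readers unsure what `changing fastest' implies, the following Python
sketch computes where an element of a multidimensional array falls in the
flattened sequence. It assumes the FITS convention that the first index
varies fastest; the function name flat_offset is hypothetical.

```python
def flat_offset(index, shape):
    """Return the position of `index` in a flattened array of `shape`,
    assuming the first index changes fastest (the FITS convention)."""
    offset = 0
    stride = 1
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError(f"index {i} out of range for axis of length {n}")
        offset += i * stride
        stride *= n          # later axes step over whole blocks of earlier ones
    return offset

# Enumerate a 3 x 2 array in storage order.
order = sorted(
    ((i, j) for i in range(3) for j in range(2)),
    key=lambda ij: flat_offset(ij, (3, 2)),
)
print(order)
```

With this convention the elements (0,0), (1,0), (2,0) are adjacent in the
stream, followed by (0,1), (1,1), (2,1).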
- p5, Section 2.3, 1st paragraph. Replace the last sentence with: `The
VOTable PARAM element can be used to represent FITS keywords.'
- p5, Section 2.3, 2nd paragraph. I'm not sure that a paragraph should
start with `But' (!?!)
- p5, `What can VOTable do but not FITS?'. I'd have thought that
another entry would be that the VOTable has a standard mechanism for
indicating which celestial coordinate systems are used in a table and
which columns store them. This facility was a major omission from
FITS.
- p5, Section 3, `XML' 1st couple of sentences. Strictly speaking SGML
is a standard which has been used in the publishing industry for many
years (since the mid-80s, I think) and it is used in typesetting all sorts
of books, not just technical documentation.
- p6, Section 3.2, `Syntax policy'. Maybe say that XML element and
attribute names are case-sensitive and have to be used with the specified
capitalisation. The VOTable convention is that element names are in
upper case and attribute names (with the exception of ID) in lower case.
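A short Python sketch shows why the capitalisation matters in practice.
The XML fragment is made up for the purpose; it is not taken from the
draft, and the lower-case `field' element is a deliberate mistake.

```python
import xml.etree.ElementTree as ET

# Made-up fragment: the second element uses the wrong capitalisation.
doc = '<TABLE><FIELD ID="ra" name="RA"/><field ID="dec" name="Dec"/></TABLE>'
root = ET.fromstring(doc)

upper = root.findall("FIELD")   # matches only the correctly named element
lower = root.findall("field")   # a different element name as far as XML goes

print(len(upper), len(lower))
print(upper[0].get("name"))     # attribute names are case-sensitive too
print(upper[0].get("NAME"))     # no such attribute, so None
```

A parser looking for FIELD elements silently misses the `field' one,
which is the kind of error a stated capitalisation rule would prevent.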
- p7, Section 4, paragraph 2. Change `complere' to `complete'.
- p7, Section 5, paragraph 3, last sentence (and elsewhere in the
document). I've mentioned this previously, but I still don't like the
phrase `an exception is thrown' which is prescribing what an application
reading the table should do, whereas its behaviour will, in part, be
modified by its circumstances and purpose. Here I think that it is
better to say that the datatype attribute is mandatory.
- p7, Section 5, paragraph 4, last sentence. Change `datatype="A:' to
`datatype="A"'.
- p7, Section 5, last paragraph. This method of specifying null values
is a recipe for disaster for columns of type real or double; real or
double values should never be compared for equality, particularly when
one value (the null) is stored as a character string and the other isn't. A
better way to represent nulls is to use the IEEE NaN.
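The hazard can be shown in a few lines of Python. This is only a sketch:
the null marker `99.99' is a hypothetical value, and the helper
through_single simply mimics storing a value as a 32-bit real.

```python
import math
import struct

def through_single(x):
    """Round-trip a Python float through a 32-bit IEEE representation."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

declared_null = "99.99"          # hypothetical null marker, given as text
stored = through_single(99.99)   # the same value after storage as a real

# Equality test against the textual null: fails because of rounding.
matches_by_equality = (stored == float(declared_null))

# NaN test: survives the same round trip and needs no comparison value.
matches_by_nan = math.isnan(through_single(float("nan")))

print(matches_by_equality, matches_by_nan)
```

The equality test misses the null once the value has been through single
precision, whereas the NaN test still identifies it.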
- p8, Section 5.5, 1st paragraph, 2nd sentence. Change `In Astrores...'
to `In VOTable...', presumably?
- p10, 2nd paragraph, last sentence. Change `xml' to `XML'.
- p12, Section 6.2. To remove all possibility of ambiguity there
should be references to documents where the gzip and base64 encodings
are defined.
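As an indication of what needs pinning down, here is a Python sketch of
one possible reading: the data are compressed with gzip and the result is
then base64-encoded. The order of the two steps and the sample bytes are
assumptions of mine, not statements from the draft. Plausible documents
to cite would be RFC 1952 for gzip and RFC 2045 for base64.

```python
import base64
import gzip

raw = b"0.0146 -12.5\n0.0153 -12.7\n"   # made-up table bytes

# Assumed order: compress first, then make the result text-safe.
compressed = gzip.compress(raw)
encoded = base64.b64encode(compressed)

# A reader has to undo the steps in the reverse order.
decoded = gzip.decompress(base64.b64decode(encoded))

print(compressed[:2])    # gzip streams begin with the bytes 1f 8b
print(decoded == raw)
```

A reader who assumed the opposite order, or a different compression
format, would fail to decode the stream, hence the need for references.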
- p12, Section 6.3. In the last example:
<STREAM href="file://mydata.dat"/>
where is the file mydata.dat in the local file system (we've been here
before)?
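The difficulty shows up as soon as the URL is parsed. The Python sketch
below uses the standard urllib.parse module; the second URL, with the path
/tmp/mydata.dat, is a hypothetical example of an unambiguous form.

```python
from urllib.parse import urlparse

as_written = urlparse("file://mydata.dat")
absolute = urlparse("file:///tmp/mydata.dat")   # hypothetical absolute path

# In the form used in the draft, 'mydata.dat' lands in the host part
# and the path is empty.
print(repr(as_written.netloc), repr(as_written.path))

# With three slashes the host is empty and the path is explicit.
print(repr(absolute.netloc), repr(absolute.path))
```

So, read literally, the example names a host called mydata.dat rather
than a file, and says nothing about which directory is meant.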
- p13, Section 7. For version 1 I would omit this section altogether. I
don't think the VOTable is a particularly natural way of expressing
queries, and moreover, the mechanism described here is not thought
through. It is better not to tackle the question at all in version 1 than
to enshrine doubtful constructs in the standard, which will then haunt
us for years. Version 1 should just address the problem of tables of
returned results.
- p14, Section 8. `ASTRO' seems a slightly odd name for the top-level
VOTable element. It doesn't seem to bear any particular relation to the
VOTable. Further, as XML comes into regular use in astronomy there
will, presumably, be lots of other definitions with an equal claim to
`ASTRO'. Why not call the top-level element `VOTABLE'?
- p14, Section 9. The title seems slightly odd. Wouldn't something
like `Datatypes' be more informative? An introductory sentence saying
that the following entries describe the permitted datatypes would
probably be useful. It could end by saying that the definitions are
adapted from the FITS binary table specification.
As mentioned above, the descriptions only really apply if the table is in
BINARY or FITS format. For TABLEDATA they are really an indication of
how the column might be represented in a program. For example, look at
the example VOTable on pp15-16. Field RA(J2000) has data type `D', yet
the values are stored not as IEEE floating point numbers but as
series of characters: `0.0146' for the first row. I appreciate that this
point is rather pedantic, but a standard should be correct.
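The distinction can be seen in a small Python sketch. It takes the value
`0.0146' quoted above and compares the characters held in a TABLEDATA
cell with the 8-byte binary form that the description of type `D'
implies. Big-endian byte order is assumed here.

```python
import struct

text_cell = "0.0146"                  # what a TABLEDATA cell contains
value = float(text_cell)              # what a program holds after parsing
binary_cell = struct.pack(">d", value)   # 8-byte IEEE double, big-endian

print(len(text_cell), "characters in the XML")
print(len(binary_cell), "bytes in a binary representation")
print(struct.unpack(">d", binary_cell)[0] == value)
```

The datatype describes the parsed value and the binary form, not the six
characters actually present in the document.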
- p14, Section 9, Logical. A hexadecimal 0 indicates a null value (and
is legal) rather than indicating an invalid value. Anything other than `T',
`t', `F', `f' or hexadecimal 0 in a logical column is invalid.
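The rule proposed here is simple enough to state as code. In this Python
sketch the function name decode_logical is hypothetical, and None stands
for a null.

```python
def decode_logical(byte):
    """Decode one byte of a logical column using the rule suggested above."""
    if byte in (b"T", b"t"):
        return True
    if byte in (b"F", b"f"):
        return False
    if byte == b"\x00":
        return None                  # hexadecimal 0: a legal null value
    raise ValueError(f"invalid logical value: {byte!r}")

for b in (b"T", b"f", b"\x00"):
    print(repr(b), "->", decode_logical(b))
```

Any other byte, for example b"?", raises an error rather than being
treated as a null.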
- p15, Single and double precision floating point numbers. An IEEE NaN
should always be used to represent null values (see above).
Remove the code `F' for double precision reals, which is no longer
mentioned elsewhere.
- p17, the COOSYS element. I would still prefer the connection
between a COOSYS element and the columns which realise the system to
be made by a COOSYS attribute specifying an explicit list of column
identifiers, rather than implicitly by examining the UCD entries for all
the columns and guessing what role each plays. I think that this
approach is both easier to parse and has less scope for ambiguity and
error. Similarly, the COOSYS system attribute would be better as a
character string (some values for which have predefined meanings)
rather than an enumerated type.
I've made these points previously and there is little point in repeating
the detailed arguments. Though I feel reasonably strongly about these
points I'm reluctant to delay the adoption of version 1 because of them.
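For what it is worth, the explicit-list approach to COOSYS could look
roughly like the Python sketch below. The `columns' attribute is entirely
hypothetical (it is my suggestion, not part of the draft), and the
fragment, its attribute values and the position of COOSYS within it are
made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up fragment. The 'columns' attribute is hypothetical: it lists the
# IDs of the columns that realise the coordinate system.
doc = """
<TABLE>
  <COOSYS ID="sys1" system="eq_FK5" equinox="2000" columns="ra dec"/>
  <FIELD ID="ra" name="RA(J2000)" datatype="D"/>
  <FIELD ID="dec" name="Dec(J2000)" datatype="D"/>
  <FIELD ID="mag" name="Vmag" datatype="D"/>
</TABLE>
"""
root = ET.fromstring(doc)

# Index the columns by identifier, then follow the explicit list.
fields = {f.get("ID"): f for f in root.findall("FIELD")}
coosys = root.find("COOSYS")
coord_fields = [fields[c] for c in coosys.get("columns").split()]
names = [f.get("name") for f in coord_fields]

print(names)
```

The lookup is a direct dictionary access with no inspection of UCDs, and
a reference to a missing column fails immediately instead of being
guessed around.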
I hope that some of these comments are useful.
regards,
Clive.
Clive Davenhall
Institute for Astronomy, Royal Observatory Edinburgh,
Blackford Hill, Edinburgh, EH9 3HJ, Scotland.
e-mail (internet, JANET): acd @ roe.ac.uk
fax from within the UK: 0131-668-8416
fax from overseas: +44-131-668-8416
Received on 2002-03-18Z01:27:12