Re: ADQL simplified

From: Anita Richards <amsr-at-jb.man.ac.uk>
Date: Thu, 29 Jan 2004 09:47:10 +0000 (GMT)

> (In fact at the moment the type is usually derived from the way the
> number is specified: 2 is an integer, 2.0 is a real, '2' is a string).

A comment from a 'user' point of view:

This seems a sensible rule of thumb. Similarly, if one encountered an expression "thing > 2" or even "thing > stuff", it should be assumed that 2 is a number, not a string, and that "stuff" is a string variable representing a number (as is "thing"). (That also assumes that numbers can't be string variables ...)
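
To make the quoted rule of thumb concrete, here is a minimal sketch in Python. The function name infer_literal_type and the exact patterns are my own assumptions for illustration and are not taken from the ADQL specification. It only looks at how a token is written, which is exactly why it can do no better than "unknown" for bare names such as stuff or ONE.

```python
import re

def infer_literal_type(token: str) -> str:
    """Guess a type purely from how a literal is written (hypothetical helper)."""
    token = token.strip()
    # Anything wrapped in matching quotes is treated as a string: '2' or "2".
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "'\"":
        return "string"
    # Digits only (optionally signed) are treated as an integer: 2
    if re.fullmatch(r"[+-]?\d+", token):
        return "integer"
    # A decimal point and/or an exponent makes it a real: 2.0, .5, 2e3
    if re.fullmatch(r"[+-]?(\d+\.\d*|\.\d+|\d+)([eE][+-]?\d+)?", token):
        return "real"
    # Bare words (column names, variables, ONE) cannot be typed from spelling alone.
    return "unknown"

for tok in ["2", "2.0", "'2'", "stuff", "ONE"]:
    print(tok, "->", infer_literal_type(tok))
```

The "unknown" branch is the interesting one: for those tokens the type has to come from metadata (e.g. the service provider) rather than from a default.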

However, assumptions/defaults should only be applied as a last resort (although they are then invaluable).

Here's why I think it is a last resort:

We need a way to tell the difference between ONE (ehex for some large number), ONE (string) and 1

Some languages require values, i.e. numbers, to be in "" in some cases.

Moreover, it is very dangerous to assume that 2 is an integer if that can lead to a round number acquiring a definition as an integer and then being used in a language like Fortran, where 1.27389/2 can give a result different from 1.27389/2.0 (this may not be a real example, but you see what I mean).
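
A small sketch of the kind of trap meant here, written in Python rather than Fortran. The helper divide is hypothetical: it imitates a language in which dividing one integer by another discards the fractional part, and it is not a statement about how any particular compiler treats the example above.

```python
def divide(a, b):
    """Imitate a language where integer / integer drops the fractional part.

    Uses floor division, which matches truncation for non-negative operands.
    """
    if isinstance(a, int) and isinstance(b, int):
        return a // b
    return a / b

# The same "round" value gives different answers depending on its assumed type.
print(divide(7, 2))    # both operands typed as integers
print(divide(7.0, 2))  # one operand typed as a real
```

So a value that merely happens to be round, and is therefore labelled integer by default, can silently change the result of later arithmetic.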

The question of how much specification can affect precision is already a troublesome issue for VO tools.

Example 1: Aladin/AVO-Aladin, especially ACE, does not transport data with enough precision to measure and plot position with milli-arcsec accuracy.

Example 2: Whilst the datacube renderer was being developed, there were stages when positions known only to e.g. 1 arcmin were given to an arbitrary and excessive number of decimal places.
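
One way to avoid the problem in Example 2 is to let the known uncertainty decide how many decimal places are printed. The following is only a sketch under my own assumptions (positions in decimal degrees, a hypothetical helper name, and the convention of printing down to the first significant digit of the uncertainty); it is not how any of the tools mentioned actually work.

```python
import math

def format_with_uncertainty(value_deg: float, uncertainty_deg: float) -> str:
    """Format a value with decimals down to the leading digit of its uncertainty."""
    if uncertainty_deg <= 0:
        raise ValueError("uncertainty must be positive")
    # e.g. 1 arcmin = 0.0167 deg -> leading digit in the 2nd decimal place
    decimals = max(0, -math.floor(math.log10(uncertainty_deg)))
    return f"{value_deg:.{decimals}f}"

ARCMIN_DEG = 1.0 / 60.0   # one arcminute in degrees
MAS_DEG = 1.0 / 3.6e6     # one milli-arcsec in degrees

print(format_with_uncertainty(83.8220833, ARCMIN_DEG))  # coarse position
print(format_with_uncertainty(83.8220833, MAS_DEG))     # milli-arcsec position
```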

What I am saying in a confused way, I think, is that

  1. Information should not be lost. If it is _known_ (e.g. from service provider) that a number is a <Value><NumberLiteral><IntNum><Value> (why is Value twice? by the way...) then we should keep all the baggage.
  2. If information is not given, then any assumptions have to be applied with great care. E.g. 2 is assumed to be a number, but it is safer to leave the type unspecified and make sure that if an application is expecting a real, it gets 2.0 (but not 2.0000000 or 1.99999985457965478007 etc.). Perl is the most intelligent language I have come across in this respect and might have some good things to copy if we need them. But you guys probably know much more sophisticated ways.
  3. This probably goes beyond the ADQL remit and should be passed on to people wrapping tools and to the Registry as the implications are really for how we validate and manipulate data.
  4. The issue for astronomer queries is: how much can we guide the querier to express a query sensibly (e.g. don't use numbers as string variables!), how much can be done by checking and reporting apparent errors in input queries, how can we guide the querier to ask for an appropriate precision explicitly if needed, and how can we report accuracy? I don't mean the accuracy of the data in the sense of instrumental position errors etc. (although that has to be included). I mean being able to decide whether a query needs double precision, or avoiding falling into traps like using a full Julian day as an integer 2451516 for tools that can't handle long integers, or using decimal radians and single precision for milli-arcsec positions - or using 32-bit arithmetic and replies when only arcmin precision is needed.
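
The traps in point 4 can be checked numerically. The sketch below assumes IEEE 754 single precision and two's-complement integers; the helper names as_float32 and fits_signed are made up for this illustration, and the sample angle of 3 radians is arbitrary.

```python
import math
import struct

def as_float32(x: float) -> float:
    """Round-trip a value through IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def fits_signed(value: int, bits: int) -> bool:
    """True if value fits in a two's-complement signed integer of that width."""
    return -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1

MAS_IN_RAD = math.radians(1.0 / 3.6e6)  # one milli-arcsec in radians

# Near 3 rad, adjacent single-precision values are 2**-22 rad apart,
# which is roughly 49 milli-arcsec, so a 10 mas offset vanishes.
shifted = 3.0 + 10 * MAS_IN_RAD
print(shifted != 3.0)              # the offset survives in double precision
print(as_float32(shifted) == 3.0)  # but is lost in single precision

# A full Julian day number does not fit in a 16-bit integer.
print(fits_signed(2451516, 16), fits_signed(2451516, 32))

# A fractional Julian date in single precision is quantised to 0.25 day.
print(as_float32(2451516.3))
```

The converse case (shipping double precision when only arcmin accuracy is needed) is harmless numerically but wasteful, which is the point about asking for an appropriate precision explicitly.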

Cheers
a

Received on 2004-01-29Z10:47:33