Re: ranges

From: Alberto Micol <Alberto.Micol-at-eso.org>
Date: Fri, 02 May 2003 16:43:38 +0200

Yep, I was indeed thinking in terms of non-equispaced bins. And of course one does not need to list the bins where the value is zero.

The point is that by associating a percentage of records with each range, the registry could give the user some kind of estimate of the total number of records s/he will retrieve when posing a given query. Important for ranking!
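
To make the idea concrete, here is a minimal sketch of such a sparse, non-equispaced histogram and the record estimate derived from it. The bin edges, percentages, total record count, and the names WAVELENGTH_BINS, percent_in_bin and estimated_records are all invented for illustration; nothing here is a proposed registry format.

```python
from bisect import bisect_right

# Hypothetical sparse histogram: (lower_edge, upper_edge, percent_of_records).
# Bins are non-equispaced, sorted by lower edge, and bins whose value is
# zero are simply not listed.
WAVELENGTH_BINS = [
    (100.0, 300.0, 10.0),
    (300.0, 1000.0, 65.0),
    (5000.0, 25000.0, 25.0),
]


def percent_in_bin(bins, value):
    """Return the percentage of the bin containing value, or 0.0 if no bin is listed."""
    lowers = [lo for lo, _, _ in bins]
    i = bisect_right(lowers, value) - 1  # last bin whose lower edge is <= value
    if i >= 0:
        lo, hi, pct = bins[i]
        if lo <= value < hi:
            return pct
    return 0.0  # value falls in a gap, i.e. an unlisted (zero) bin


def estimated_records(bins, value, total_records):
    """Rough estimate of how many records a query at value would return."""
    return total_records * percent_in_bin(bins, value) / 100.0
```

With these made-up numbers, a resource holding 20000 records would be estimated to return about 13000 of them for a query at 500.0, and none for a query at 2000.0, which falls in an unlisted bin.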

For example, suppose that the metadata of an archive resource contains

  1. the bidimensional distribution of the percentage of observations in the 2D (Wavelength, Resolution) space: WRD(w,r)
  2. the monodimensional distribution of limiting fluxes LFD(f)

If the user is querying the registry for archives containing data in a certain (user_w, user_r) wavelength/resolution regime, and reaching at least a certain (user_lf) limiting flux, then the product of the two distributions

   WRD(user_w,user_r) * LFD( user_lf )

might give an indication of the probability of finding data with the required characteristics. It is not necessarily the correct number, but such a number might be good enough for ranking the list of resources returned by the registry.
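
The ranking described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the two archives, their bin ranges and fractions, and the helper names lookup, score and rank are hypothetical, and the score is the plain product WRD(user_w,user_r) * LFD(user_lf) taken at the bin containing each user value, with no further refinement.

```python
def lookup(bins, point):
    """Return the fraction of the bin containing point, or 0.0 if unlisted.

    bins is a list of (ranges, fraction) pairs, where ranges holds one
    (lower, upper) pair per axis; point holds one coordinate per axis.
    """
    for ranges, fraction in bins:
        if all(lo <= x < hi for (lo, hi), x in zip(ranges, point)):
            return fraction
    return 0.0


def score(resource, user_w, user_r, user_lf):
    """Plain product WRD(user_w, user_r) * LFD(user_lf) for one resource."""
    wrd = lookup(resource["WRD"], (user_w, user_r))
    lfd = lookup(resource["LFD"], (user_lf,))
    return wrd * lfd


def rank(resources, user_w, user_r, user_lf):
    """Order resources from highest to lowest score."""
    return sorted(
        resources,
        key=lambda res: score(res, user_w, user_r, user_lf),
        reverse=True,
    )


# Hypothetical resource metadata (all numbers invented for illustration).
RESOURCES = [
    {
        "name": "archive_a",
        "WRD": [
            (((300.0, 1000.0), (1000.0, 5000.0)), 0.5),
            (((300.0, 1000.0), (5000.0, 50000.0)), 0.5),
        ],
        "LFD": [(((18.0, 22.0),), 0.8), (((22.0, 26.0),), 0.2)],
    },
    {
        "name": "archive_b",
        "WRD": [
            (((300.0, 1000.0), (1000.0, 5000.0)), 0.25),
            (((1000.0, 2500.0), (100.0, 1000.0)), 0.75),
        ],
        "LFD": [(((22.0, 26.0),), 1.0)],
    },
]

ranked = rank(RESOURCES, user_w=500.0, user_r=2000.0, user_lf=23.0)
```

For this made-up query, archive_b scores 0.25 * 1.0 and archive_a scores 0.5 * 0.2, so archive_b is listed first even though a smaller share of its observations falls in the requested wavelength/resolution bin.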

/* Of course the math is a bit simplified here, since there will be the need *

Alberto

--
Alberto.Micol-at-eso.org                         Tel: +49 89 32006365
HST Science Archive       ST-ECF              Fax: +49 89 32006480
ESA/RSSD/SN               c/o ESO             Karl Schwarzschild Str.2,
http://archive.eso.org/   No ads, thanks.     Garching bei Muenchen,
http://www.stecf.org/     HTML emails         D-85748 Germany
Received on 2003-05-02Z16:44:51