Logical storage units in VOSpace 1.1

Reagan Moore moore at sdsc.edu
Wed Aug 15 11:50:15 PDT 2007


Dave:
Actually, we use logical storage names in iRODS/SRB to simplify 
interactions with remote storage systems.  Users interact with:

- logical file names
- logical storage names

The system selects which replica of the file to access, based on 
whether there is a copy at the same IP address, whether there is a 
copy on any disk resource in the world, or whether there is a copy on 
tape.
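
As a rough illustration of that selection order (a sketch only, not 
SRB/iRODS code; the Replica record and select_replica function are 
hypothetical names invented for this example):

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Replica:
    host: str   # IP address of the storage system holding the copy
    media: str  # "disk" or "tape"
    path: str   # physical file name, normally hidden from the user


def select_replica(replicas: Sequence[Replica], client_host: str) -> Optional[Replica]:
    """Prefer a copy at the client's own address, then any disk copy, then tape."""

    def rank(replica: Replica) -> int:
        if replica.host == client_host:
            return 0  # copy at the same IP address
        if replica.media == "disk":
            return 1  # copy on any disk resource
        return 2      # copy on tape

    # min() returns the first best-ranked replica, or None if there are no copies.
    return min(replicas, key=rank, default=None)
```

The user only supplies the logical file name; the lookup that produces 
the list of replicas is assumed to happen elsewhere.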

When a user writes a file, the logical storage name can be used to represent:

- compound resource.  This is typically a disk cache in front of a 
tape.  The system stages all interactions with the tape through the 
disk cache transparently to the user.  Any disk can be paired with 
any tape system.  Thus you can have a local cache for a tape system 
at a remote site.
- cluster resource.  This is typically a collection of storage 
systems, each managed by a separate node in a cluster.  The data grid 
levels the load by automatically distributing the files across the 
node file systems.
- automated replication.  The system makes a copy on each of the 
associated storage systems represented by the logical storage name.
- fault tolerant replication.  The system makes a copy on n of m 
storage systems, allowing for the possibility that some might be 
offline.
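
The fault tolerant case can be sketched as follows.  This is only an 
illustration under assumed names: replicate_n_of_m and the write 
callback are hypothetical, and the real system's policy for choosing 
among the m storage systems is not described here.

```python
from typing import Callable, Iterable, List


def replicate_n_of_m(
    stores: Iterable[str],
    n: int,
    write: Callable[[str], bool],
) -> List[str]:
    """Try each of the m stores in turn until n copies have been written.

    `write(store)` is a hypothetical callback that returns False when the
    store is offline or the write otherwise fails.
    """
    written: List[str] = []
    for store in stores:
        if len(written) == n:
            break
        if write(store):
            written.append(store)
    if len(written) < n:
        raise RuntimeError(f"only {len(written)} of {n} copies could be written")
    return written
```

Automated replication is the special case where n equals m, so every 
associated storage system must accept a copy.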

From the user's perspective, all interaction is with logical file 
names and logical storage names.  A sophisticated user can query the 
system, identify the associated physical storage systems, and direct 
specific replicas to specific storage locations, but this is the 
exception.

The concern about integrating both storage systems and databases into 
the same logical name space is handled by differentiating between 
BLOBs and tables.  The SRB provides the same set of operations on 
BLOBs as on files.  Thus a Binary Large Object can be read, written, 
or modified through the same file manipulation commands used to read, 
write, or modify files.

For interactions with a database table, the user issues a different 
set of commands, such as an SQL query command with the results 
packaged as an XML file that is sent to the client.

Operations that depend upon the data type of a file are implemented 
through separate APIs.  An example is manipulation of HDFv5 files. 
We use the HDFv5 client APIs to control the requested operations, but 
execute the HDFv5 library calls at the remote storage system to 
perform the desired manipulations.

If a user extracts data from a database table into an XML file using 
an SQL query command, an SRB client that understands how to parse XML 
files is needed to do further manipulation.

The iRODS system removes this restriction by supporting 
micro-services that process explicit data format types.  A rule can 
be written that checks the data type, and then automatically invokes 
the correct parsing operations for manipulating the file.  It is 
also possible to create a rule that checks the type of storage system 
and then, based upon the file type, invokes different micro-services 
to parse and manipulate a file.
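
The idea of a rule that dispatches on data type can be sketched in a 
few lines.  This is not iRODS rule syntax; the registry, the parser 
functions and apply_rule are all hypothetical stand-ins for real 
micro-services.

```python
from typing import Callable, Dict

# Hypothetical micro-services: each takes the file contents and returns a
# description of what it did.  Real micro-services would do the parsing.
MicroService = Callable[[bytes], str]


def parse_xml(data: bytes) -> str:
    return "parsed as XML"


def parse_hdf5(data: bytes) -> str:
    return "parsed as HDF5"


# Assumed registry mapping a data type label to its micro-service.
MICRO_SERVICES: Dict[str, MicroService] = {
    "xml": parse_xml,
    "hdf5": parse_hdf5,
}


def apply_rule(data_type: str, data: bytes) -> str:
    """Check the data type and invoke the matching micro-service."""
    try:
        service = MICRO_SERVICES[data_type]
    except KeyError:
        raise ValueError(f"no micro-service registered for type {data_type!r}") from None
    return service(data)
```

With the check done on the server side, the client no longer needs to 
understand each format itself.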

The VOSpace interface can remain independent of the logical name 
spaces used by SRB/iRODS.  However, interactions with data in the 
data grid would be based on the logical names for both files and 
storage.  The physical file names (including replicas) and actual 
storage locations would be hidden from the users by default data grid 
functions.  The default functions would implement a standard 
algorithm for selecting the best file (closest replica) and for 
writing to the best storage resource.

Reagan



>Matthew Graham wrote:
>
>>A request has come from our friends at SDSC to include references to the 
>>actual storage units that data is being deposited on. The use case 
>>is data replication so, for example, I want to move/copy a data 
>>object from a slow tape archive to an ultrafast disk but both 
>>hardware units are within the same VOSpace or I want to retrieve a 
>>data object from the ultrafast disk copy and not the slow tape one.
>
>Yep, would be nice to have this.
>We talked about this with the SDSC developers in February, but I 
>haven't figured out an easy way to integrate this into vospace yet.
>
>>I think that we can incorporate this easily into our existing data 
>>model. We will refer to hardware units as logical storage units 
>>with the implication that they are identified via a logical 
>>identifier (URI) that is set by the particular VOSpace 
>>implementation.
>
>Ok so far.
>
>>To get the list of available storage units from a VOSpace, we will 
>>need a method: getLogicalStorageUnits() which will return a list of 
>>URIs.
>
>Problem : this works if everything is treated as a 'BLOB'.
>As soon as we distinguish between types of structured data, e.g. 
>'tabular data' or 'image', then the global method no longer works.
>
>If I have three 'logical storage units', two implemented as disk 
>files, and one as a relational database, then the system behaves 
>differently depending on which 'storage unit' the data is stored in.
>
>You could transfer tabular data from  the 'database store' to one of 
>the 'disk store'(s), but it would be stored as a file on the disk, 
>and you will probably lose the ability to treat it as structured 
>data (would this change the node type from StructuredData to 
>UnstructuredData ?).
>
>If you have two copies of the data, one in a database store and one 
>in a file store, and used an ADQL interface to modify the structured 
>data in the database, do all the changes get replicated to the copy 
>stored as a file ?
>
>How do we express rules like "you can replicate a FITS tabular file 
>in a database store, but you can't replicate a FITS image file in a 
>database store".
>
>To do this sort of thing, we would need a list of 'allowed stores' 
>for each node, and some may be mutually exclusive.
>So although you could transfer the tabular data from database store 
>to file store, you can't have it in both at the same time (one is 
>structured queryable data the other isn't  - if it was in both, 
>would it be represented as a Structured or Unstructured data node ?).
>
>We have a similar problem with the global listViews and 
>listProtocols methods at the moment, not all nodes may be able to 
>support all the views and protocols, but the global methods don't 
>tell you which ones are valid for which nodes.
>
>>These URIs may be resolvable to a description of the storage unit.
>
>Yep, ok with that.
>
>>The logical storage unit identifier will be an optional argument in 
>>the <transfer> entity so that as part of the data transfer 
>>negotiation, the user can specify a list of storage units that they 
>>want the data transferred to/from.
>
>This implies replicated storage, which is what SRB is very good at. 
>However, this does add a lot of complications.
>
>Do we guarantee that data replication is handled transparently, or 
>do we mark some of the stored data as out of date ?
>If the data for a node is stored in two 'storage units'[a] and [b], 
>user 'A' sends new data to 'storage unit'[a] and user B reads their 
>data from 'storage unit'[b], what data does user B get back ?
>
>Do the same permissions apply to all the copies of the data ?
>If user 'root' can read/write from all the stores, but user 'fred' 
>can't write to the tape store, then what happens if 'root' creates a 
>replicated copy of a node on the tape store. Can 'fred' still modify 
>the data on disk (making the tape version out of date), or does it 
>become read only because he can't modify the tape copy ?
>
>>The identifier will also be an optional argument in the <node> 
>>entity so that specific hardware can be targeted in moving and 
>>copying data.
>
>If the data for one node may be replicated on more than one store, 
>it would have to be a list of <store> elements in each <node>.
>
>>Comments, suggestions, etc.
>
>Yes, data replication and logical stores would be nice.
>However, to do it right would mean a lot of work, and add a lot of complexity.
>The SRB and IRODS systems have already solved these problems, so do 
>we really want to re-invent this particular wheel ?
>
>What is the science use case for this ?
>And can the use case be handled by using the existing SRB or IRODS systems ?
>
>When our astronomers saw a demo of IRODS, their comments were
>    "I'd like the system to have this capability, but I wouldn't need 
>to use it as part of my normal work".
>    "It would be very useful if our sys admin had these kind of tools 
>... so they could manage the data for us"
>    ".. but as a scientist I wouldn't want to handle things at this level"
>
>If we add a list of [capability] elements alongside the [accepts] 
>and [provides] elements in a [node], then a replicated data store 
>based on IRODS could include [uri for IRODS interface + endpoint] as 
>a capability.
>
>    [node]
>        [properties]
>            ....
>        [/properties]
>        [accepts]
>            ....
>        [/accepts]
>        [provides]
>            ....
>        [/provides]
>        [capabilities]
>            [capability uri="ivo://capability.uri.for.irods"]
>                [endpoint].....[/endpoint]
>            [/capability]
>        [/capabilities]
>    [/node]
>
>Effectively, the vospace service would be saying 'replication 
>settings for the data in this node can be manipulated with the IRODS 
>API using this endpoint'. We get access to all of the very nice 
>tools that SDSC are developing, without having to define a whole new 
>API for handling replication.
>
>Note : I haven't studied IRODS in detail, but I am impressed 
>with what I have seen.
>
>Note : This does not mean that IRODS is the only replication API. If 
>we really, really, want to, we could still define an IVOA standard 
>replication API, and add that as a capability.
>
>        ....
>        [capabilities]
>            [capability uri="ivo://capability.uri.for.ivoa.replication"]
>                [endpoint].....[/endpoint]
>            [/capability]
>        [/capabilities]
>        ....
>
>Data replication would be nice, but do we need to define it in vospace ?
>Or can we pass it over to an established API that has been designed 
>to handle this sort of thing ?
>
>Dave


