Re: Logical storage units in VOSpace 1.1

From: Arun Jagatheesan <arun-at-sdsc.edu>
Date: Mon, 20 Aug 2007 00:50:21 -0700


>

Embedded are my replies to this discussion...

>> To get the list of available storage units from a VOSpace, we will
>> need a method: getLogicalStorageUnits() which will return a list
>> of URIs.
>
> Problem : this works if everything is treated as a 'BLOB'.
> As soon as we distinguish between types of structured data, e.g.
> 'tabular data' or 'image', then the global method no longer works.
>
> If I have three 'logical storage units', two implemented as disk
> files, and one as a relational database, then the system behaves
> differently depending on which 'storage unit' the data is stored in.
>
> You could transfer tabular data from the 'database store' to one
> of the 'disk store'(s), but it would be stored as a file on the
> disk, and you will probably lose the ability to treat it as
> structured data (would this change the node type from
> StructuredData to UnstructuredData ?).
>
> If you have two copies of the data, one in a database store and one
> in a file store, and used an ADQL interface to modify the
> structured data in the database, do all the changes get replicated
> to the copy stored as a file ?
>
> How do we express rules like "you can replicate a FITS tabular file
> in a database store, but you can't replicate a FITS image file in a
> database store".
>
> To do this sort of thing, we would need a list of 'allowed stores'
> for each node, and some may be mutually exclusive.
> So although you could transfer the tabular data from database store
> to file store, you can't have it in both at the same time (one is
> structured queryable data the other isn't - if it was in both,
> would it be represented as a Structured or Unstructured data node ?).
> We have a similar problem with the global listViews and
> listProtocols methods at the moment, not all nodes may be able to
> support all the views and protocols, but the global methods don't
> tell you which ones are valid for which nodes.

Just as each VOSpace defines its list of accepted protocols, each storage unit would have its own list of known protocols and known data types that it supports. If a VOSpace has only an FTP or HTTP store (and assuming it does not want a database store or an ADQL interface), it just throws a "not supported" exception when a put or replicate operation is attempted with structured data.
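
A minimal sketch of the behaviour described above, in Python. The names used here (StorageUnit, NotSupportedError, check_put, the "structured"/"unstructured" labels and the example URI) are hypothetical and do not come from any VOSpace specification.

```python
from dataclasses import dataclass, field


class NotSupportedError(Exception):
    """Raised when a storage unit cannot accept a protocol or data type."""


@dataclass
class StorageUnit:
    # Hypothetical description of one logical storage unit.
    uri: str
    protocols: set = field(default_factory=set)
    data_types: set = field(default_factory=set)

    def check_put(self, protocol: str, data_type: str) -> None:
        """Reject a put/replicate that this unit does not support."""
        if protocol not in self.protocols:
            raise NotSupportedError(f"{self.uri}: protocol {protocol!r} not supported")
        if data_type not in self.data_types:
            raise NotSupportedError(f"{self.uri}: data type {data_type!r} not supported")


# A store that only speaks FTP/HTTP and only holds unstructured data.
ftp_store = StorageUnit("ivo://example/ftp-store", {"ftp", "http"}, {"unstructured"})
ftp_store.check_put("ftp", "unstructured")  # accepted, returns None
try:
    ftp_store.check_put("ftp", "structured")
except NotSupportedError as err:
    print(err)
```

The check is per storage unit rather than global, so a client can learn which units would accept a given node before starting a transfer.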

>
>> These URIs may be resolvable to a description of the storage unit.
>
> Yep, ok with that.
>
>> The logical storage unit identifier will be an optional argument
>> in the <transfer> entity so that as part of the data transfer
>> negotiation, the user can specify a list of storage units that
>> they want the data transferred to/from.
>
> This implies replicated storage, which is what SRB is very good at.
> However, this does add a lot of complications.
>
> Do we guarantee that data replication is handled transparently, or
> do we mark some of the stored data as out of date ?
> If the data for a node is stored in two 'storage units'[a] and
> [b], user 'A' sends new data to 'storage unit'[a] and user B reads
> their data from 'storage unit'[b], what data does user B get back ?

You are referring to the classical dirty-read problem. It is left to the designer: whether they want a very flexible system that embraces everything (the analogy is HTML code) or a strict system that ensures replica consistency (the analogy would be languages that are very strict). An intermediate approach that allows policies at the VOSpace level would be better. [a] and [b] could be replicas following a policy X, where X could again be a logical identifier that could mean anything from no-guarantee-of-replication to 1-second-latency-replication or real-time.

An example: node [/file/a/b/c], where c could be replicated at two places. The data node c will have the logical storage identifiers of the places where the data is physically present. As part of the node it could have a resource description for "replication policy", where each VOSpace could define very simple policies.
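
One way to picture such a policy identifier is sketched below in Python. The policy names, the lag values attached to them and the DataNode class are all assumptions made for illustration; a real VOSpace would define its own policies.

```python
from dataclasses import dataclass

# Hypothetical policy identifiers mapped to the maximum replication lag in
# seconds; None means the VOSpace gives no guarantee at all.
POLICIES = {
    "no-guarantee": None,
    "one-second-latency": 1.0,
    "real-time": 0.0,
}


@dataclass
class DataNode:
    path: str
    storage_units: list  # logical identifiers where the data physically is
    replication_policy: str = "no-guarantee"

    def may_be_stale(self, seconds_since_write: float) -> bool:
        """True if a replica read now is not guaranteed to see the last write."""
        max_lag = POLICIES[self.replication_policy]
        if max_lag is None:
            return True
        return seconds_since_write < max_lag


node = DataNode("/file/a/b/c", ["store-a", "store-b"], "one-second-latency")
print(node.may_be_stale(0.5), node.may_be_stale(2.0))
```

With this shape, user B reading from [b] can at least find out from the node whether the copy is guaranteed to be current.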

>
> Do the same permissions apply to all the copies of the data ?
> If user 'root' can read/write from all the stores, but user 'fred'
> can't write to the tape store, then what happens if 'root' creates
> a replicated copy of a node on the tape store. Can 'fred' still
> modify the data on disk (making the tape version out of date), or
> does it become read only because he can't modify the tape copy ?

Good question. I am giving my opinion of the expected behavior. I don't know if V1.1 has access controls. But if it did, there could be access controls on both data and storage units. Usually, the access controls on the storage units override the access controls on the data namespace. So even if fred is allowed to access /a/b/c based on the data namespace, he might not be able to access it if the data is on a resource where he is not even allowed to read.

Continuing with the approach: let us assume /a/b/c has two replicas, on disk and on tape. When Fred tries to update c on disk, where he has access, the system checks the access controls and approves Fred's write. If the replication is carried out by root later (as per the "policy"), it should go through and the physical copy on tape would be updated.

On the other hand, Fred cannot himself replicate data to tape, as he does not have "write" access on tape.
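
The two-level check can be sketched as follows in Python. The permission tables are made up for the example (in particular, Fred's read access on tape is an assumption), and "override" is modelled here as "a denial on the storage unit wins over a grant on the namespace".

```python
# Hypothetical permission tables: (user, resource) -> set of allowed actions.
NAMESPACE_ACL = {
    ("root", "/a/b/c"): {"read", "write"},
    ("fred", "/a/b/c"): {"read", "write"},
}
STORAGE_ACL = {
    ("root", "disk"): {"read", "write"},
    ("root", "tape"): {"read", "write"},
    ("fred", "disk"): {"read", "write"},
    ("fred", "tape"): {"read"},  # assumption: fred may read but not write tape
}


def allowed(user: str, action: str, path: str, unit: str) -> bool:
    """An action needs permission on the namespace AND on the storage unit."""
    on_path = action in NAMESPACE_ACL.get((user, path), set())
    on_unit = action in STORAGE_ACL.get((user, unit), set())
    return on_path and on_unit


def replicate(user: str, path: str, target_unit: str) -> bool:
    """Replication writes to the target unit, so it needs write access there."""
    return allowed(user, "write", path, target_unit)


print(allowed("fred", "write", "/a/b/c", "disk"))  # fred updates the disk copy
print(replicate("fred", "/a/b/c", "tape"))         # fred replicating to tape
print(replicate("root", "/a/b/c", "tape"))         # root replicating to tape
```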

>
>> The identifier will also be an optional argument in the <node>
> entity so that specific hardware can be targeted in moving and
>> copying data.
>
> If the data for one node may be replicated on more than one store,
> it would have to be a list of <store> elements in each <node>,
>> Comments, suggestions, etc.
>
> Yes, data replication and logical stores would be nice.
> However, to do it right would mean a lot of work, and add a lot of
> complexity.
> The SRB and IRODS system have already solved these problems, so do
> we really want to re-invent this particular wheel ?
>

Hmm.. that is the question I have asked from day one of VOSpace. Why not just use iRODS as the protocol? It can be standardized, and it is open software and an open protocol too. Well, it does not have late binding of the transfer protocol, though. But we are trying to add something similar, with support for TCP/IP and non-TCP protocols. Having a large community working on a common product would provide faster results and standards.

> What is the science use case for this ?
> And can the use case be handled by using the existing SRB or IRODS
> systems ?

Well, going with the philosophy of supporting both the small and the big (without taking away functionality).....

A logical storage unit is nothing but a name that identifies a mapping from a logical name to a storage end point. This can be done just as we do for the data-name to node-name mapping. However, concepts like replication and ensuring replication might be too much for smaller systems. We could bring this inside by allowing any type of storage policy that is guaranteed by the VOSpace (which could be no-guarantee too, as mentioned before).

>
> When our astronomers saw a demo of IRODS, their comments were
> "I'd like the system to have this capability, but I wouldn't
> need to use it as part of my normal work".
> "It would be very useful if our sys admin had these kind of
> tools ... so they could manage the data for us"
> ".. but as a scientist I wouldn't want to handle things at this
> level"

It just has to be made simple, with the bells and whistles provided only when needed (and without being difficult for anyone to understand and use). Everyone is aware that files are not on a single disk and that they are distributed. It is easy to understand "this file is in "sdsc-tape" and let me replicate it to "my-lab-disk" using "replicate-daily"" (the system finds out the protocols and does the transfers every day), rather than "let me find the data in ftp://ftp.sdsc.edu:8090/a/b/c and copy it every day (replicate) to ftp://132.232.33.80:24/a/b/c".
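
A rough Python sketch of that idea follows: the user works with logical names and the system resolves them to endpoints. The registry layout and the function names (resolve, plan_replication) are hypothetical; only the unit names, the policy name and the two FTP endpoints are taken from the example above.

```python
# Hypothetical registry: logical storage unit name -> physical endpoint.
STORAGE_UNITS = {
    "sdsc-tape": "ftp://ftp.sdsc.edu:8090",
    "my-lab-disk": "ftp://132.232.33.80:24",
}


def resolve(unit: str, path: str) -> str:
    """Turn a (logical unit, logical path) pair into a physical URL."""
    return STORAGE_UNITS[unit] + path


def plan_replication(path: str, source: str, target: str, policy: str) -> dict:
    """Describe the transfer the system would schedule under a named policy."""
    return {
        "policy": policy,
        "from": resolve(source, path),
        "to": resolve(target, path),
    }


# The user only ever types the logical names.
plan = plan_replication("/a/b/c", "sdsc-tape", "my-lab-disk", "replicate-daily")
print(plan["from"], "->", plan["to"])
```

If the tape archive moves to another host, only the registry entry changes; the user's request stays the same.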

>
> If we add a list of [capability] elements alongside the [accepts]
> and [provides] elements in a [node], then a replicated data store
> based on IRODS could include [uri for IRODS interface + endpoint]
> as a capability.
>
> [node]
> [properties]
> ....
> [/properties]
> [accepts]
> ....
> [/accepts]
> [provides]
> ....
> [/provides]
> [capabilities]
> [capability uri="ivo://capability.uri.for.irods"]
> [endpoint].....[/endpoint]
> [/capability]
> [/capabilities]
> [/node]
>

I forgot some of these. I think each node must have:

- logical name of the data
- formats it provides (forgot the VOSpace lingo here)
- logical storage unit(s), assuming it's a data node. It *MIGHT* also individually have information such as the date it was created, last user or owner
- if more than one storage unit has the data, what replication policy is used
- any user-defined microServices that can be invoked on this node (provides behaviour to the node)

  Note that the replication policy here could also be carried out by any tool, or manually. So iRODS is not the only API for doing replication.
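
The list above could be written down as a record along these lines. This is only a Python sketch: the class name, field names and the validate/invoke helpers are hypothetical and are not VOSpace or iRODS terms.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class NodeRecord:
    # Field names are hypothetical; they mirror the list above.
    logical_name: str
    provided_formats: List[str]
    storage_units: List[str]
    replication_policy: Optional[str] = None  # needed only with >1 unit
    created: Optional[str] = None             # optional metadata
    owner: Optional[str] = None
    micro_services: Dict[str, Callable[["NodeRecord"], object]] = field(
        default_factory=dict
    )

    def validate(self) -> None:
        """A node held on several units must say how they are kept in sync."""
        if len(self.storage_units) > 1 and self.replication_policy is None:
            raise ValueError("replication policy required for replicated node")

    def invoke(self, name: str) -> object:
        """Run a user-defined micro-service against this node."""
        return self.micro_services[name](self)


node = NodeRecord("/a/b/c", ["fits"], ["sdsc-tape", "my-lab-disk"], "replicate-daily")
node.validate()
node.micro_services["count_replicas"] = lambda n: len(n.storage_units)
print(node.invoke("count_replicas"))
```

Nothing in the record says which tool performs the replication; the policy is just a name that the VOSpace promises to honour.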

> Effectively, the vospace service would be saying 'replication
> settings for the data in this node can be manipulated with the
> IRODS API using this endpoint'. We get access to all of the very
> nice tools that SDSC are developing, without having to define a
> whole new API for handling replication.
>
> Note : I haven't studied the IRODS in detail, but I am impressed
> with what I have seen.

Thanks. There is a lot more that could be done as a wider community, rather than everyone building and discovering their own individual wheels.

>
> Note : This does not mean that IRODS is the only replication API.
> If we really, really, want to, we could still define an IVOA
> standard replication API, and add that as a capability.
> ....
> [capabilities]
> [capability uri="ivo://capability.uri.for.ivoa.replication"]
> [endpoint].....[/endpoint]
> [/capability]
> [/capabilities]
> ....
>
> Data replication would be nice, but do we need to define it in
> vospace.
> Or can we pass it over to an established API that has been designed
> to handle this sort of thing.
>
> Dave
Received on 2007-08-20Z09:50:45