I'm forwarding these messages to the vospace list for discussion.
There are a number of interested parties on the list who may not have
seen the original messages.
Dave
attached mail follows:
On 23.11.2006, at 20:00, Matthew Graham wrote:
> One of the problems with the first way is that there will be
> different behaviour depending on whether the space has coupled or
> decoupled data servers:
> Coupled servers:
> A. Client calls pushToVoSpace(<node>, <transfer>) returns <node>
> and <transfer> - the latter containing details for the data server
> B. Client transfers data to data server
> C. Behind the scenes, the data server tells VOSpace that transfer
> has occurred
>
> Decoupled servers:
> A. Client calls pushToVoSpace(<node>, <transfer>) returns <node>
> and <transfer> - the latter containing details for the data server
> B. Client transfers data to data server
> C. Client notifies VOSpace that transfer has been completed, e.g.
> transferComplete(<node>).
>
> Unless, of course, we make the decoupled server scenario the only
> way of doing it. We also have to enforce the use of
> transferComplete otherwise the state of the data transfer is
> indeterminate.
>
> The alternate is that the first part of the process is the user
> finding out what data servers are available with the getDataServers
> call I suggest at the end:
> A. Clients get list of data servers with getDataServers
> B. Client transfers data to data server
> C. Client registers data with VOSpace: register(<node>, URI of
> location) returns the registered <node>
>
> This is much more in keeping with the other data discovery methods
> we already have such as getProtocols. The process also does not
> leave the space in an indeterminate state.
Except, as I said before, one of the original use cases was that VOSpace was supposed to manage the physical location of data. If the client gets to choose where the data are sent first, that breaks the use case (though I suppose this could still be done if the getDataServers call took the intended <node> as its argument). I am also not convinced that the "breakage scenario" in this last process is any better: if the client pushes data to a store and then fails to register the node, the data server fills up without the VOSpace knowing about it, and there are still three steps for the client to perform. In fact, if the getDataServers call takes a Node as an argument, then there is really little difference between process 2 and process 3 above; all that differs is when the VOSpace chooses to actually make the entry in its metadata tree.
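To make the breakage scenario concrete, here is a minimal in-memory sketch of the third process (getDataServers, transfer, register). Everything in it is an assumption for illustration: the class names, the method names get_data_servers, put and register, and the URI format are hypothetical and are not taken from any VOSpace specification. The find_orphans helper stands in for an out-of-band audit; the space itself has no record of the unregistered data.

```python
# Toy model only: all names here are hypothetical, loosely mirroring the
# getDataServers and register calls discussed in the thread.

class DataServer:
    def __init__(self, name):
        self.name = name
        self.blobs = {}  # path -> bytes held on this server

    def put(self, path, data):
        # Store the data and return a made-up URI for it.
        self.blobs[path] = data
        return f"{self.name}/{path}"


class VOSpace:
    def __init__(self, servers):
        self.servers = servers
        self.nodes = {}  # node name -> URI of registered data

    def get_data_servers(self):
        return list(self.servers)

    def register(self, node, uri):
        self.nodes[node] = uri
        return node


def upload(space, node, data, register=True):
    server = space.get_data_servers()[0]  # step A: discover data servers
    uri = server.put(node, data)          # step B: transfer the data
    if register:
        space.register(node, uri)         # step C: register with the space
    return uri


def find_orphans(space):
    # Audit: data sitting on a server that the space has no node for.
    known = set(space.nodes.values())
    found = []
    for server in space.servers:
        for path in server.blobs:
            uri = f"{server.name}/{path}"
            if uri not in known:
                found.append(uri)
    return sorted(found)
```

Calling upload(space, "b.fits", data, register=False) models a client that stops after step B: the data server holds the bytes, but the space has no node pointing at them.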
The last process also makes it more difficult to implement simple "coupled" data stores. It implies that we need authentication on simple HTTP in which the client knows an authentication secret in advance, for instance to stop mass uploading of porn. Simple one-time-password implementations cannot work if the client has to contact the data server first.
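The one-time-password point can be sketched as follows, assuming a coupled store in which the space and the data server share state. The class name CoupledSpace and the methods push_to_vospace and upload are hypothetical stand-ins, not a real API. The token only exists because the client called the space first; in the third process the client reaches the data server before any such call, so there is nothing to hand over.

```python
import secrets


class CoupledSpace:
    """Toy coupled store: the space and its data server share state."""

    def __init__(self):
        self.pending = {}  # one-time token -> node awaiting upload
        self.stored = {}   # node -> uploaded bytes

    def push_to_vospace(self, node):
        # The space issues a single-use token with the transfer details.
        token = secrets.token_urlsafe(16)
        self.pending[token] = node
        return token

    def upload(self, token, data):
        # The data server accepts the upload only for a known token,
        # and consumes the token so it cannot be reused.
        node = self.pending.pop(token, None)
        if node is None:
            return False
        self.stored[node] = data
        return True
```

An upload with a guessed or already-used token is refused, which is the cheap protection against anonymous mass uploads that a plain HTTP store would otherwise lack.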
Another point is how decoupled a setup you are talking about. I am presuming that the space still has access to the same filesystem as the decoupled data servers. If it is more decoupled than that, then the space is unlikely to be able to control the contents of the data server, which will lead to inconsistent states anyway.
Cheers,
Paul.
Received on 2006-11-24Z17:47:04