Re: VOStore interface

From: Matthew Graham <mjg-at-cacr.caltech.edu>
Date: Fri, 05 Aug 2005 17:34:11 -0700


Hi,

I would argue that this is an implementation issue: you have to make sure that VOStore can fulfil what it promises.

The required functionality for authentication is just that the VOStore can recognise a valid message, e.g. the certificate used to sign the SOAP message has the NVO CA in its certificate chain.
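
As a rough illustration of that check, here is a minimal Python sketch. It assumes the chain has already been extracted from the signed SOAP message as (subject, issuer) pairs, leaf certificate first. The names `TRUSTED_CAS` and `chain_is_trusted` and the DN strings are hypothetical, and the sketch compares names only: a real VOStore would also have to verify signatures, validity dates and revocation with a proper X.509 library.

```python
# Hypothetical trust anchor; the DN string is invented for illustration.
TRUSTED_CAS = {"CN=NVO CA"}


def chain_is_trusted(chain, trusted=TRUSTED_CAS):
    """Return True if a leaf-first chain links up to a trusted CA.

    `chain` is a list of (subject, issuer) pairs. Only names are compared;
    signatures, validity dates and revocation are NOT checked here.
    """
    if not chain:
        return False
    # Each certificate must have been issued by the next one in the list.
    for (_, issuer), (next_subject, _) in zip(chain, chain[1:]):
        if issuer != next_subject:
            return False
    # Accept if a trusted CA is an intermediate or issued the last entry.
    ca_names = {subject for subject, _ in chain[1:]} | {chain[-1][1]}
    return bool(ca_names & trusted)
```

The point of the sketch is that the VOStore needs no list of users for this step, only a list of trust anchors.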

    Cheers,

    Matthew

Reagan Moore wrote:

> Matthew:
> This is where we differ in the specification.
>
> Which account does VOStore run under on the local storage system?
>
> Reagan
>
>> Hi,
>>
>> If the user is accessing the local storage system directly then they
>> can do whatever they want. VOStore, however, is the presentation of
>> that repository to the VO world and does not necessarily interface
>> with a VOSpace layer: this means that the VOStore interface has to be
>> capable of handling the VO authentication mechanism. The
>> authorization story is as we seem to have agreed.
>>
>> Cheers,
>>
>> Matthew
>>
>>
>> Reagan Moore wrote:
>>
>>> Matthew:
>>>
>>> The expectation is that the VOStore interface does not need to do
>>> either authentication or authorization. If a person is working
>>> directly with a local storage system, then they are accessing their
>>> own personal data while running under their personal account ID.
>>> They can execute the VOStore interface as a local application.
>>>
>>> If VOSpace is accessing the local storage system through VOStore,
>>> then VOSpace authenticates its access to the local storage system to
>>> read or write files under the VOSpace account ID. Again VOStore is
>>> just a local application that VOSpace executes.
>>>
>>> If the owner of data on the local storage repository chooses to make
>>> a file world readable, then VOSpace would be able to access the file
>>> through VOStore.
>>>
>>> Reagan
>>>
>>>
>>>> Reagan Moore wrote:
>>>>
>>>>> I would like to propose the following separation of identity and
>>>>> access control management. The issues appear to be how to
>>>>> separate support for local files in a local storage repository
>>>>> from the files that are registered into a shared collection that
>>>>> spans multiple storage repositories. An easy way to make the
>>>>> differentiation is to identify the usage model for each type of
>>>>> data management system. I would like to learn whether this
>>>>> approach would meet all of the IVOA requirements.
>>>>>
>>>>> Local storage repository:
>>>>>
>>>>> This is a storage system that is controlled by local
>>>>> administrators who establish access accounts for the persons who
>>>>> are allowed to use the system.
>>>>> The users can choose their own file names, manipulate the files
>>>>> with the utilities that are available on the local storage, and
>>>>> are authenticated by the local system. If desired, a user could
>>>>> log onto the local storage repository, and use a VO specific
>>>>> interface such as VOStore to access their own personal data. Since
>>>>> VOStore would be run under their account ID to access files that
>>>>> they own, there is no additional required authentication. They
>>>>> could also use other access mechanisms such as perl scripts, or
>>>>> Unix shell commands, C library calls, whatever is supported on the
>>>>> local storage repository. These access mechanisms allow them to
>>>>> access files that they own.
>>>>>
>>>>> A VOStore interface for this usage model would provide:
>>>>> - get file
>>>>> - put file
>>>>> - list files
>>>>> The only advantage is that if the VOStore interface were supported
>>>>> on all local storage repositories, the user would have a standard
>>>>> access mechanism.
>>>>>
>>>>> Shared collection - VOSpace:
>>>>>
>>>>> The purpose of the shared collection is to organize files across
>>>>> multiple storage repositories, provide a way to register files
>>>>> into the shared collection, establish access controls on the
>>>>> shared data, provide standard services for manipulating the files
>>>>> (Cone Search, SIAP, SSAP, Mosaic, ...), support replication, and
>>>>> support selection of the closest file.
>>>>>
>>>>> The shared collection provides a global (or logical) name space
>>>>> that can be organized in a directory structure independently of
>>>>> the naming convention and path hierarchy employed at the local
>>>>> storage systems. Thus the VOSpace system must manage the mapping
>>>>> from the logical name space to the naming convention used in the
>>>>> local storage system.
>>>>>
>>>>> An account ID is established under which the shared collection
>>>>> (VOSpace) is able to deposit files in the local storage
>>>>> repository. This means the shared collection owns the data that is
>>>>> stored at the local storage repository. In order to access the
>>>>> data, a user would need to authenticate herself to the shared
>>>>> collection, which in turn authenticates itself to the local
>>>>> storage repository. Whether or not to allow the access is
>>>>> controlled by ACLs managed by VOSpace. This means that the
>>>>> authentication mechanism used by VOSpace is completely independent
>>>>> of the authentication mechanisms used by the local storage systems.
>>>>>
>>>>> In order to handle the fact that local storage systems use a
>>>>> variety of authentication mechanisms (Unix password, PKI
>>>>> certificates, Kerberos tickets, DCE credentials, ...) the
>>>>> VOSpace implementation could use the Generic Security Service API
>>>>> (GSSAPI) to handle the heterogeneity. In addition, an arbitrary
>>>>> authentication mechanism can be chosen for authenticating users to
>>>>> VOSpace.
>>>>>
>>>>> If a VOStore interface is provided by the local storage
>>>>> repository, then VOSpace would be able to invoke the VOStore
>>>>> access mechanism (running under the VOSpace account ID). Note
>>>>> that in this model VOStore does no authentication. All
>>>>> authentication is controlled by a combination of the local storage
>>>>> system and VOSpace.
>>>>>
>>>>> The type of operations that would be required by VOStore, however,
>>>>> are more sophisticated. They include:
>>>>> - get file
>>>>> - put file
>>>>> - list files
>>>>> - register an existing file into VOSpace, while mapping from the
>>>>> local name to the VOSpace preferred name
>>>>> - register an existing directory structure into VOSpace, while
>>>>> setting the VOSpace logical names and VOSpace directory structure
>>>>> to be the same as the local directory structure
>>>>> - register an existing local file into VOSpace as a replica of an
>>>>> existing VOSpace logical file.
>>>>>
>>>>> With the latter three commands, it is possible to meet the
>>>>> specific requirement that users be able to control the names of
>>>>> files both on the local system and in VOSpace. Note that for the
>>>>> user to access the local file system they required an account ID
>>>>> on the local file system. They then stored a local file under
>>>>> their own account ID. They would add read permission for the
>>>>> VOSpace account ID to their local file to permit access by VOSpace.
>>>>>
>>>>> This separates authorization cleanly between the local storage
>>>>> system (which only checks for access by local account IDs) and the
>>>>> VOSpace shared collection (which authorizes all accesses to files
>>>>> owned by VOSpace). This means that VOSpace is managing multiple
>>>>> levels of indirection:
>>>>> - mapping from the global or logical file name space to the local
>>>>> repository name space
>>>>> - mapping from an authenticated user through application of ACLs
>>>>> to decide whether the user can read a VOSpace owned file.
>>>>> - mapping preferred location for accessing replicas (typically
>>>>> pick a file on the file system with the user's IP address, then
>>>>> any other file system, then a tape archive)
>>>>>
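The three levels of indirection listed above (logical name to local name, ACL check, replica preference) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names (`VOSpaceCatalog`, `Replica`, `replica_rank`) are hypothetical, the ACL is reduced to a set of permitted readers, and "the file system with the user's IP address" is approximated by comparing host names.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    host: str
    local_path: str      # name used by the local storage repository
    kind: str = "disk"   # "disk" or "tape"


@dataclass
class LogicalFile:
    readers: set = field(default_factory=set)    # ACL: users allowed to read
    replicas: list = field(default_factory=list)


def replica_rank(replica, user_host):
    """Lower is better: the user's own host, then other disk, then tape."""
    if replica.kind == "tape":
        return 2
    return 0 if replica.host == user_host else 1


class VOSpaceCatalog:
    def __init__(self):
        self.files = {}  # logical name -> LogicalFile

    def register(self, logical_name, replica, readers=()):
        """Register a local file under a logical name, or add a replica."""
        entry = self.files.setdefault(logical_name, LogicalFile())
        entry.replicas.append(replica)
        entry.readers.update(readers)

    def locate(self, user, user_host, logical_name):
        """Apply the ACL, then return the preferred replica."""
        entry = self.files[logical_name]
        if user not in entry.readers:
            raise PermissionError(f"{user} may not read {logical_name}")
        return min(entry.replicas, key=lambda r: replica_rank(r, user_host))
```

In this sketch the local storage system never sees the user's identity: it is only asked for `local_path` by whatever account the catalog runs under.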
>>>>> For completeness, VOStore may need an operation that sets access
>>>>> permission for VOSpace, when VOStore is run under the local user
>>>>> account ID.
>>>>>
>>>>>
>>>>> Reagan Moore
>>>>>
>>>>>>
>>>>>> I think that most of what is VOStore and what is VOSpace is
>>>>>> clear; however, the two grey areas are access control
>>>>>> (authorization) and identifiers and this stems from the use case
>>>>>> where the user wants direct access to a VOStore (e.g. a local
>>>>>> store) and does not want to go through the VOSpace layer. Here
>>>>>> are my suggestions for handling these areas:
>>>>>>
>>>>>> Access control:
>>>>>> -------------------
>>>>>>
>>>>>> A VOStore can run in two modes: authorized and unauthorized. An
>>>>>> unauthorized VOStore is semantically equivalent to an anonymous
>>>>>> ftp site: any authenticated user (we still maintain security) can
>>>>>> put something in, move/rename it, get it and delete it.
>>>>>> An authorized VOStore will only allow the requested operation if
>>>>>> a valid authentication token is included in the request - all the
>>>>>> VOStore has to do here is validate the authentication token. The
>>>>>> generation of the authentication token is handled by VOSpace: it
>>>>>> makes sure that the authenticated user has permission to do what
>>>>>> they are requesting and if so, places a valid token in the
>>>>>> request down to the VOStore.
>>>>>>
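One way to picture the split of work described above is the following Python sketch, in which VOSpace issues a token after its own permission check and the VOStore merely validates it. The thread does not specify a token format: the HMAC construction, the shared secret between VOSpace and VOStore, and the function names `issue_token` and `validate_token` are all assumptions made for illustration, and the sketch omits expiry and replay protection.

```python
import hashlib
import hmac


def issue_token(secret, user, operation, path):
    """VOSpace side: call only after the user's permission has been checked."""
    message = "\n".join((user, operation, path)).encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def validate_token(secret, user, operation, path, token):
    """VOStore side: recompute the token and compare in constant time."""
    expected = issue_token(secret, user, operation, path)
    return hmac.compare_digest(expected, token)
```

The VOStore here holds one secret per trusted VOSpace and no per-user access lists, which is the property the authorized mode is after.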
>>>>>> Identifiers:
>>>>>> --------------
>>>>>>
>>>>>> The protocol identifier ivo:// identifies a resource that exists
>>>>>> in the VO. It does not promise that you can completely resolve a
>>>>>> URI beginning ivo:// in a registry, merely that some component of
>>>>>> the URI will relate to a resource that has a registry entry, i.e.
>>>>>> the bit before the first # can be resolved in a registry. So I
>>>>>> can go to a registry and find out where
>>>>>> ivo://nvo.caltech/vostores/vostore1 is
>>>>>> but I need to go to VOStore interface for this store to resolve
>>>>>> ivo://nvo.caltech/vostores/vostore1#halibut3. I do not see why we
>>>>>> need to introduce a second protocol just for VOStore contents.
>>>>>>
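The two-step resolution described above (registry for the part before the first #, the VOStore itself for the rest) amounts to a simple split. A minimal Python sketch, in which the function name `split_identifier` is hypothetical:

```python
def split_identifier(uri):
    """Split an ivo:// identifier into (registry_part, store_key).

    registry_part is everything before the first '#', which a registry can
    resolve; store_key is the remainder, which only the VOStore can resolve,
    or None when the identifier has no fragment.
    """
    if not uri.startswith("ivo://"):
        raise ValueError(f"not an ivo:// identifier: {uri!r}")
    registry_part, _, fragment = uri.partition("#")
    return registry_part, fragment or None
```

For the example in the text, `split_identifier("ivo://nvo.caltech/vostores/vostore1#halibut3")` gives the registry entry `ivo://nvo.caltech/vostores/vostore1` and the store-local key `halibut3`.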
>>>>>> Now resolution of individual VOStore identifiers has to be done
>>>>>> at the VOStore level; however, VOSpace gives you the ability to
>>>>>> set up a single logical identifier for multiple copies of the
>>>>>> same resource, so here we might want a separate protocol, vos:.
>>>>>> Resolution of this identifier has to be done at the VOSpace level
>>>>>> since VOSpace manages multiple VOStores.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Matthew
>>>>>>
>>>>>>
>>>>>> Paul Harrison wrote:
>>>>>>
>>>>>>> Reagan Moore wrote:
>>>>>>>
>>>>>>>> The differentiation between the VOStore and VOSpace interfaces
>>>>>>>> is becoming unclear. The latest draft implies that properties
>>>>>>>> that were originally associated with VOSpace would now be
>>>>>>>> supported by VOStore.
>>>>>>>>
>>>>>>>
>>>>>>> I have to say that I agree that there seems to be some confusion
>>>>>>> in this area - with hindsight it was probably a mistake to defer
>>>>>>> the specification of VOSpace and work on VOStore alone as the
>>>>>>> "easier" problem - the specifications should be worked in tandem
>>>>>>> to see where it is most appropriate to place roles and
>>>>>>> responsibilities for particular use cases, so that a "global"
>>>>>>> solution is arrived at.
>>>>>>>
>>>>>>> I thought that the original separation into VOStore and VOSpace
>>>>>>> was done so that VOStore could be an essentially "dumb" BLOB
>>>>>>> repository that did what it was told by the VOSpace layer when
>>>>>>> it comes to issues of file permissions and hierarchical file
>>>>>>> names. However, because no VOSpace specification was created,
>>>>>>> these more advanced features have crept into the VOStore layer.
>>>>>>>
>>>>>>>>
>>>>>>>> Let's look at the current VOStore and VOSpace proposal:
>>>>>>>>
>>>>>>>> VOStore VOSpace
>>>>>>>> Storage of objects management of
>>>>>>>> virtual file system
>>>>>>>> data stored under unspecified ID?
>>>>>>>> no user home directory User home directory
>>>>>>>> directory hierarchy Directory hierarchy
>>>>>>>> Unique file name within storage User-defined file
>>>>>>>> names
>>>>>>>> Mapping VOSpace
>>>>>>>> name to VOStore name
>>>>>>>> List files for user
>>>>>>>> Restrict access by user identity?
>>>>>>>> Identify files with URIs
>>>>>>>> Access controls on local file name Access controls on
>>>>>>>> VOSpace name
>>>>>>>>
>>>>>>>> This characterization mixes name spaces, mixes access controls,
>>>>>>>> does not provide consistent identity, and does not allow
>>>>>>>> consistent management. For instance, if a URI is being provided
>>>>>>>> for file identity within the VOStore interface, then there is no
>>>>>>>> need for user-specified names within VOStore. A second issue is
>>>>>>>> the assumption that file access can be restricted by user
>>>>>>>> identity. This means that the VOStore must manage the owner and
>>>>>>>> the access controls for each file. File systems usually do
>>>>>>>> this by creating accounts for each user name and applying Unix
>>>>>>>> permissions. Is this capability to be provided now by both
>>>>>>>> VOSpace and VOStore? We need a cleaner separation of
>>>>>>>> capabilities.
>>>>>>>
>>>>>>> This security aspect is crucial - it is clear that the owners of
>>>>>>> VOStores would not want to be managing user identity lists of
>>>>>>> all the VObs users at their stores - the fine grained access
>>>>>>> controls should be at the VOSpace level. If VOStores only
>>>>>>> respond to requests from trusted VOSpace services then this is
>>>>>>> possible, but I think that the perceived requirement for more
>>>>>>> detailed access control in the VOStore layer has come about
>>>>>>> because prototype end-user applications have appeared that talk
>>>>>>> directly to the VOStore layer - of course, it is not surprising
>>>>>>> that this has happened because there was no VOSpace definition
>>>>>>> for the end user applications to talk to.
>>>>>>>
>>>>>>> How file/BLOB identity is managed is also crucial to producing a
>>>>>>> system that offers more than ftp. I thought that one of the
>>>>>>> fundamental driving use cases for a VOSpace was that the same
>>>>>>> BLOB could potentially live on several VOStores, and that when
>>>>>>> specifying a resource in VOSpace, in a workflow for instance,
>>>>>>> the resource could be retrieved from the VOStore that was
>>>>>>> "closest" on the network to where the resource would be
>>>>>>> consumed. This sort of use case does require some careful
>>>>>>> thought about the allocation and management of identifiers, and
>>>>>>> I think probably means that the VOStore will have to be aware of
>>>>>>> the VOSpace identifier.
>>>>>>>
>>>>>>> I also have an issue with reusing ivo: as the protocol part for
>>>>>>> the URI of an identifier in this system - ivo: is already well
>>>>>>> defined and used as the identifier for registry entries, and the
>>>>>>> "protocol" for accessing the entity associated with the
>>>>>>> identifier is defined in the registry interface standard. This
>>>>>>> means that given an identifier of the form
>>>>>>> ivo://authority.org/something#blah a software agent (or human
>>>>>>> for that matter) cannot tell by inspection whether the
>>>>>>> identifier refers to a file in VOSpace or is simply a reference
>>>>>>> to a registry entry (e.g. for a SkyNode) - this leads to
>>>>>>> software having to be more complex in order constantly to test
>>>>>>> for the different possibilities. I think that it would be better
>>>>>>> to have a URI with a different protocol part, vos: for instance,
>>>>>>> it would then be immediately apparent that the VOSpace protocol
>>>>>>> should be used to access the entity referred to by the identifier.
>>>>>>>
>>>>>>>>
>>>>>>>> Let's look at the Storage Resource Broker data grid separation
>>>>>>>> of local storage management from the virtual file system
>>>>>>>> management:
>>>>>>>>
>>>>>>>> Local storage system SRB name space
>>>>>>>> Storage of objects management of
>>>>>>>> virtual file system
>>>>>>>> data stored under SRB ID
>>>>>>>> no user home directory User home directory
>>>>>>>> directory indirection structure Directory hierarchy
>>>>>>>> Unique file name within storage User-defined file
>>>>>>>> names
>>>>>>>> Mapping SRB name to
>>>>>>>> local file name
>>>>>>>> List files for user
>>>>>>>> Access through SRB ID, controlled by SRB
>>>>>>>> Identify files by URIs
>>>>>>>> Access controls on
>>>>>>>> SRB name
>>>>>>>>
>>>>>>>
>>>>>>> I think that, as Reagan points out, the separation of
>>>>>>> responsibilities that SRB has with the local storage system is
>>>>>>> pretty much the right model for VOSpace and VOStore - though it
>>>>>>> means that SRB is pretty much at VOSpace level rather than a
>>>>>>> VOStore as is suggested in the current VOSpace definition document.
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>> Hi,
>>>>
>>>> If you also allow the possibility that the local storage repository
>>>> can run in an unauthorized (anonymous access) manner then this is
>>>> exactly what Guy and I were suggesting. Does that mean that we
>>>> actually all agree on this? :-)
>>>>
>>>> Cheers,
>>>>
>>>> Matthew
>>>
>
Received on 2005-08-06Z02:34:40