DIRAC storage file paths for multiple VOs with fully qualified names


Daniela Bauer

Nov 3, 2014, 12:52:59 PM
to diracgri...@googlegroups.com
Hi,

we are trying to set up a multi-VO DIRAC server, but we've run into the problem that the VO-specific part of the path on a storage element doesn't always match the VO name. This is especially true for fully qualified VO names. E.g. the path for comet.j-parc.jp is /pnfs/hep.ph.ic.ac.uk/data/comet (and not /pnfs/hep.ph.ic.ac.uk/data/comet.j-parc.jp).

Usually querying the BDII takes care of this, but there seems to be no obvious way to convey this to DIRAC, especially as the LFN should contain the fully qualified VO name. If we try to set this for each VO and each storage element separately, we end up with an almost unmanageable number of storage elements (and files stored in funny paths like /pnfs/hep.ph.ic.ac.uk/data/comet/comet.j-parc.jp/user/specified/bit).

As far as we can tell, this is the current behaviour:

Each storage element has a path specified:
/Resources/StorageElements/AccessProtocol.1/Path = /pnfs/hep.ph.ic.ac.uk/data

When a user uploads a file, the LFN is appended to this path:
/pnfs/hep.ph.ic.ac.uk/data/vo/some_file

The problem is that the VO name in the LFN should be fully qualified, whereas some storage elements (mainly dCache and CASTOR) use the short name (or possibly any other directory name) in the path. All this information is in the BDII, but it is currently ignored by bdii2CS.

To fix this, I think we need to add a mapping section to each StorageElement block in the config:
Mappings
{
  dteam = /pnfs/hep.ph.ic.ac.uk/data/dteam
  comet.j-parc.jp = /pnfs/hep.ph.ic.ac.uk/data/comet
}
If the first path element of the LFN (the full VO name) is in Mappings, then the mapped value should be substituted when creating the PFN from the SE path, i.e. if the VO is found in Mappings, that value is used instead of the Path field. The Mappings section can be fully populated from the Glue 1 schema.
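
The substitution proposed above can be sketched in a few lines of standalone Python. This is only an illustration of the idea, not DIRAC code: the function name lfn_to_pfn_path and the SE_CONFIG dictionary are hypothetical stand-ins for the StorageElement block, and the sketch assumes that the mapped value already contains the VO directory, so the VO element of the LFN is dropped rather than appended a second time.

```python
import posixpath

# Hypothetical stand-in for one StorageElement block of the configuration.
SE_CONFIG = {
    "Path": "/pnfs/hep.ph.ic.ac.uk/data",
    "Mappings": {
        "dteam": "/pnfs/hep.ph.ic.ac.uk/data/dteam",
        "comet.j-parc.jp": "/pnfs/hep.ph.ic.ac.uk/data/comet",
    },
}


def lfn_to_pfn_path(lfn, se_config):
    """Build the storage path for an LFN, honouring per-VO mappings."""
    parts = lfn.strip("/").split("/")
    vo, rest = parts[0], parts[1:]
    mappings = se_config.get("Mappings", {})
    if vo in mappings:
        # Assumption: the mapped value replaces both Path and the VO element.
        return posixpath.join(mappings[vo], *rest)
    # No mapping: fall back to appending the whole LFN to Path.
    return posixpath.join(se_config["Path"], vo, *rest)


if __name__ == "__main__":
    print(lfn_to_pfn_path("/comet.j-parc.jp/user/specified/bit", SE_CONFIG))
    print(lfn_to_pfn_path("/vo/some_file", SE_CONFIG))
```

With this rule the comet.j-parc.jp file lands under .../data/comet/user/specified/bit instead of .../data/comet/comet.j-parc.jp/user/specified/bit, while a VO without a mapping keeps the existing Path-plus-LFN behaviour.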

Is our understanding correct? And would it make sense to proceed as suggested above?

Cheers,

Daniela




Andrei Tsaregorodtsev

Nov 3, 2014, 2:45:22 PM
to diracgri...@googlegroups.com
  Hi Daniela,

  What you are describing is a real problem now, and it was supposed to be addressed in the v7r0 version of DIRAC. In this
version all the computing resources (including Storage Elements) can have parameters that depend on the VO. So,
the path can have a generic value that is overridden for some VOs as necessary. All the consumers of those parameters
get the right settings depending on the VO context in which the StorageElement interface is used.

  The problem is that v7r0 is coded but its testing is very slow, because the new multi-VO oriented resource
description affects almost all the components of DIRAC. So, I would prefer to intensify the efforts in testing v7r0,
which will bring a solution to this problem among others. Alternatively, some interim solution can be coded, but
this would be an extra effort.

  One question that I have: you say that a VO-specific path is in the Glue schema. Where exactly is it stored?
I did not manage to find it.

  Any comments are welcome.

  Cheers,
  Andrei

Daniela Bauer

Nov 4, 2014, 5:43:09 AM
to diracgri...@googlegroups.com
Hi Andrei,

The problem is that I need something at short notice, so we might have to put in a hack anyway. Would it make sense to try to backport this bit (the storage issue) of v7 to v6? (Not necessarily by you.)
As for v7r0: we have a multitude of DIRAC servers here and we (as a group) are more than happy to a) test it and b) develop off it, but the last time we tried, it just didn't install due to some database issues, and we need something that at least installs without too much voodoo. Is there a v7 around for that purpose? Any instructions?

As for the glue schema, I was thinking of this bit (here for my own SE):
ldapsearch -H ldap://topbdii.grid.hep.ph.ic.ac.uk:2170 -b "o=grid" -x '(&(GlueChunkKey=GlueSEUniqueID=gfe02.grid.hep.ph.ic.ac.uk)(objectClass=GlueVOInfo))' GlueVOInfoAccessControlBaseRule GlueVOInfoTag GlueVOInfoPath
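
As a rough sketch of how the output of such a query could be turned into the per-VO path mapping discussed earlier in the thread, the following standalone Python parses LDIF text and pairs each GlueVOInfoAccessControlBaseRule with the GlueVOInfoPath of the same entry. The function name parse_vo_paths and the SAMPLE text (including its shortened dn lines) are made up for illustration and are not real BDII output. The sketch assumes rules of the form "VO:<name>" and does not handle folded LDIF lines or base64-encoded values.

```python
# Made-up sample in the shape of ldapsearch LDIF output (not real BDII data).
SAMPLE = """\
dn: GlueVOInfoLocalID=comet,o=grid
GlueVOInfoAccessControlBaseRule: VO:comet.j-parc.jp
GlueVOInfoTag: comet
GlueVOInfoPath: /pnfs/hep.ph.ic.ac.uk/data/comet

dn: GlueVOInfoLocalID=dteam,o=grid
GlueVOInfoAccessControlBaseRule: VO:dteam
GlueVOInfoPath: /pnfs/hep.ph.ic.ac.uk/data/dteam
"""


def parse_vo_paths(ldif_text):
    """Return {vo_name: path} from GlueVOInfo entries in LDIF text."""
    mappings = {}
    # Entries in LDIF output are separated by blank lines.
    for entry in ldif_text.strip().split("\n\n"):
        vos, path = [], None
        for line in entry.splitlines():
            # Skip comments and anything that is not a plain "key: value" line.
            if line.startswith("#") or ": " not in line:
                continue
            key, value = line.split(": ", 1)
            if key == "GlueVOInfoAccessControlBaseRule" and value.startswith("VO:"):
                vos.append(value[len("VO:"):])
            elif key == "GlueVOInfoPath":
                path = value
        # Only record VOs for entries that actually publish a path.
        if path is not None:
            for vo in vos:
                mappings[vo] = path
    return mappings


if __name__ == "__main__":
    for vo, path in sorted(parse_vo_paths(SAMPLE).items()):
        print(vo, "=", path)
```

The resulting dictionary has the same shape as the Mappings section suggested above, so it could in principle be written out per storage element.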

Cheers,
Daniela

--
Sent from the pit of despair

-----------------------------------------------------------
daniel...@imperial.ac.uk
HEP Group/Physics Dep
Imperial College
London, SW7 2BW
Tel: +44-(0)20-75947810
http://www.hep.ph.ic.ac.uk/~dbauer/

Andrei Tsaregorodtsev

Nov 5, 2014, 3:23:59 AM
to diracgri...@googlegroups.com
  Hi Daniela,


On 04/11/14 11:42, Daniela Bauer wrote:
> Hi Andrei,
>
> The problem is I need something on short notice, so we might have to put in a hack anyway. Would it make sense trying to backport this bit (the storage  issue) of v7 to v6 ? (Not necessarily by you.)

  Let me see what can be done quickly. I take note of your offer to help in the tests.

  Will come back to that shortly.

  Andrei

Daniela Bauer

Nov 5, 2014, 8:35:39 AM
to diracgri...@googlegroups.com
Hi Andrei,

we had a go at installing DIRAC off the integration branch this morning. If we can get this to work, we can probably try to do our own backporting, and for obvious reasons we'd rather put any new code we might develop into the integration branch.

There were two immediate problems with the integration branch, one with IPv6 and one with the databases. We had a go at fixing them and you should see two of Simon's pull requests, which have all the details. I don't know if it would be possible to merge them at short notice (we weren't quite sure about the database one). We are then left with this error:

Creating startup link at /srv/localstage/ddirac/startup/WorkloadManagement_SiteDirector
Loading WorkloadManagement/InputDataAgent
Trying to autodiscover WorkloadManagement/InputDataAgent
Can't load InputDataAgent
== EXCEPTION ==
<type 'exceptions.ImportError'>:cannot import name getStorageElementOptions
  File "/srv/localstage/ddirac/DIRAC/Core/Base/private/ModuleLoader.py", line 232, in __recurseImport
    impModule = imp.load_module( modName[0], *impData )

  File "/srv/localstage/ddirac/DIRAC/WorkloadManagementSystem/Agent/InputDataAgent.py", line 16, in <module>
    from DIRAC.ConfigurationSystem.Client.Helpers.Resources    import getStorageElementOptions
===============
Software for agent WorkloadManagement/InputDataAgent is not installed


The error appears in the install step of the SiteDirector (while running dirac-setup-site).

Any idea how to get this to work?

Cheers,
Daniela