Some job requirements for execution nodes


Nikolay Kutovskiy

Sep 26, 2019, 9:00:56 AM
to diracgrid-forum
Hello!

There is a need to specify the following requirements for execution nodes:

1) certain python version available on the node (e.g. 2.7.13 or 3.6);
2) amount of free disk space for input and output data (e.g. >20 GB).

If it's possible to put such requirements into a JDL file, then how can it be done?

Best regards,
Nikolay.

Federico Stagni

Oct 7, 2019, 10:14:02 AM
to Nikolay Kutovskiy, diracgrid-forum
Hello,
the short answer is: no, right now it is not possible to specify such requirements without a bit of development.

The question I have is: why would you need a certain Python version on the node?
The DIRAC pilot will start on Python 2.6 and 2.7 (3.x versions have never been tested), and then DIRAC will install its own Python. Applications should be sandboxed well enough to also use their own Python versions.

Cheers,
Federico


--
You received this message because you are subscribed to the Google Groups "diracgrid-forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to diracgrid-for...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/diracgrid-forum/c81df7fd-fad5-41a0-b45d-ced2aefac042%40googlegroups.com.

Andrei Tsaregorodtsev

Oct 7, 2019, 5:30:12 PM
to diracgrid-forum
   If you control your resources, for example by using special images in the cloud, then you can make
any software available there and describe it with Tags in the resource description. Then
you can use these tags in the job JDL description.

   For arbitrary numerical values, e.g. disk space, it is not possible to specify a job requirement like
DiskSpace > 20. But you can tag your resource with something like BigDiskSpace and then require
this tag in the JDL.
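
For illustration, Andrei's tagging approach might look like the following in a job JDL. This is a sketch only: the tag name BigDiskSpace is just the example from above, and the exact attribute name and list syntax for tags may differ between DIRAC versions, so check your installation's documentation:

```
JobName       = "analysis-job";
Executable    = "run.sh";
OutputSandbox = {"StdOut", "StdErr"};
Tags          = {"BigDiskSpace"};
```

The job would then only be matched to computing resources whose description carries the same tag.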

   Cheers,
   Andrei

Nikolay Kutovskiy

Oct 8, 2019, 9:09:20 AM
to diracgrid-forum
Dear Federico,

thank you for your reply. Please, see my comments inline.

On Monday, 7 October 2019 17:14:02 UTC+3, Federico Stagni wrote:
Hello,
the short answer is: no, right now it is not possible to specify such requirements without a bit of development.

The question I have is: why would you need a certain Python version on the node?
The DIRAC pilot will start on Python 2.6 and 2.7 (3.x versions have never been tested), and then DIRAC will install its own Python. Applications should be sandboxed well enough to also use their own Python versions.
Our user already has his code written in Python 3.6.x, so the dilemma is either to rewrite the whole code in Python 2.7.x or to somehow find a computational resource with the required Python version in the DIRAC infrastructure he is using.

Best regards,
Nikolay.
 


Daniela Bauer

Oct 8, 2019, 9:43:22 AM
to Nikolay Kutovskiy, diracgrid-forum
Hi Nikolay,

the most common way to resolve this is to ship the preferred Python version with the user software.
Most experiments that use DIRAC use CVMFS (https://cernvm.cern.ch/portal/filesystem) for this; I'm not sure how accessible this is for you.
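
A minimal sketch of that idea in Python: a small wrapper that probes candidate interpreters on the node before launching the payload. The CVMFS path below is purely hypothetical; what is actually available depends on what your VO publishes there.

```python
import os
import subprocess
import sys

def find_python(min_version=(3, 6), candidates=None):
    """Return the path of the first interpreter meeting min_version, or None."""
    if candidates is None:
        candidates = [
            # Hypothetical CVMFS location; depends on what the VO publishes.
            "/cvmfs/example.org/python/3.6/bin/python3",
            sys.executable,  # the interpreter running this wrapper
        ]
    for path in candidates:
        if not path or not os.path.exists(path):
            continue
        try:
            # Ask the candidate interpreter for its own version.
            out = subprocess.check_output(
                [path, "-c", "import sys; print('%d.%d' % sys.version_info[:2])"],
                universal_newlines=True,
            ).strip()
        except (OSError, subprocess.CalledProcessError):
            continue
        version = tuple(int(part) for part in out.split("."))
        if version >= min_version:
            return path
    return None
```

The payload script can then be re-executed under whatever `find_python()` returns, or the job can fail fast with a clear message when nothing suitable is found.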

Regards,
Daniela



--
Sent from the pit of despair

-----------------------------------------------------------
daniel...@imperial.ac.uk
HEP Group/Physics Dep
Imperial College
London, SW7 2BW
Tel: +44-(0)20-75947810
http://www.hep.ph.ic.ac.uk/~dbauer/

Federico Stagni

Oct 8, 2019, 9:48:37 AM
to Daniela Bauer, Nikolay Kutovskiy, diracgrid-forum
Hi Nikolay,
Don't use the DIRAC python for the user payload: instead, the user payload should come with its own python version (and dependencies). What you have is an isolation (or containerization) issue. Please see this thread for a discussion on user payload isolation: https://groups.google.com/d/msg/diracgrid-forum/TfiRouxA0XA/JbdzxotsCgAJ

Cheers,
Federico

Nikolay Kutovskiy

Oct 16, 2019, 1:44:32 AM
to diracgrid-forum
On Tuesday, 8 October 2019 00:30:12 UTC+3, Andrei Tsaregorodtsev wrote:
   If you control your resources, for example using special images in the cloud, then you can make
available there any software
The VM image is the same on all clouds of a certain VO in the given grid infrastructure. Such an image has minimal software installed. As far as I understand, modification of the VM image by a cloud admin at some resource centre is against the rules.
 
and you can describe it with Tags in the resource description. Then
you can use these tags in the job JDL description.

   For arbitrary numerical values, e.g. disk space, this is not possible to specify job requirement like
DiskSpace > 20. But you can tag your resource with something like BigDiskSpace and then require
this tag in the JDL.
Does the JobAgent, apart from other info, send information about the available disk space in the scratch area of the Worker Node (WN)? If it does, would it be technically possible to use that info when pulling jobs from the central queue?
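
While there is no matcher-side support, a job can at least guard itself. A minimal payload-side sketch using only the Python standard library (`shutil.disk_usage` needs Python 3.3+):

```python
import shutil

def scratch_space_ok(path=".", required_gb=20.0):
    """Return True if `path` has at least `required_gb` gigabytes free.

    A payload-side guard: since the matcher does not consider free scratch
    space, the job can check on arrival and fail fast with a clear message
    instead of dying halfway through a download.
    """
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb
```

A job script could call this at startup against its scratch directory and exit early when the worker node's scratch area is too small.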

Nikolay Kutovskiy

Oct 16, 2019, 2:33:51 AM
to diracgrid-forum
Hi Daniela


On Tuesday, 8 October 2019 16:43:22 UTC+3, Daniela Bauer wrote:
Hi Nikolay,

the most common way to resolve this is to ship the preferred python version in the user software.
What is the proper way to do that? Virtualenv is not available on worker nodes as far as I know (and if there are some sites with virtualenv installed, the problem of finding such resources appears again, because those sites would have to be properly tagged). It looks like using containers (e.g. Singularity) is the way to go to bring all the software users need for their analysis jobs. But it seems (https://github.com/DIRACGrid/DIRAC/issues/3674, https://github.com/DIRACGrid/DIRAC/issues/3381) that containers (in particular Singularity) are not supported yet. On the other hand, what if we enable Singularity support in the VM image spread across cloud resources and tag such resources accordingly (e.g. SingularityEnabled or just Singularity)? In that case users should be able to run custom Singularity containers with the required software environment inside.
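
As an illustration of that idea, the payload side could assemble the Singularity invocation as below. The image name and bind mounts are hypothetical, and a `singularity` binary on the PATH of the VM image is assumed:

```python
def singularity_command(image, payload, binds=()):
    """Build the argv for running a shell `payload` inside a Singularity image.

    `image` would be e.g. a .sif file shipped in the sandbox or on CVMFS;
    `binds` are host:container mount specs passed to --bind. Illustrative
    only: available options depend on the Singularity version installed
    in the VM image.
    """
    cmd = ["singularity", "exec"]
    for bind in binds:
        cmd += ["--bind", bind]
    cmd += [image, "/bin/sh", "-c", payload]
    return cmd
```

The resulting list can be handed to `subprocess.check_call()` by whatever wrapper launches the user analysis.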
 
Most experiments that use DIRAC use cvmfs (https://cernvm.cern.ch/portal/filesystem) for this, I'm not sure how accessible this is for you.
Some CVMFS repositories, including the VO-specific one, are available on the worker nodes, but not all of the Python modules or functions that the user's code needs (e.g. numpy) are available there.

Best regards,
Nikolay.
 


Nikolay Kutovskiy

Oct 16, 2019, 2:40:19 AM
to diracgrid-forum
Hi Federico


On Tuesday, 8 October 2019 16:48:37 UTC+3, Federico Stagni wrote:
Hi Nikolay,
Don't use the DIRAC python for the user payload: instead, the user payload should come with its own python version (and dependencies). What you have is an isolation (or containerization) issue. Please see this thread for a discussion on user payload isolation: https://groups.google.com/d/msg/diracgrid-forum/TfiRouxA0XA/JbdzxotsCgAJ

The problem with user analysis jobs is the necessity to have a certain version of Python (e.g. 3.6, not 2.7) and some Python modules (e.g. numpy). I've just posted, in the thread you are referring to, a question about the possibility of installing an additional Python module (e.g. numpy).

Best regards,
Nikolay.
 


Federico Stagni

Oct 16, 2019, 6:44:19 AM
to Nikolay Kutovskiy, diracgrid-forum
Hi Nikolay,
DIRAC does not yet consider scratch space as a requirement for job matching, but indeed this is something that will be added at some point.
Regarding the environment for (user) jobs: it's not the role of DIRAC per se to create a suitable environment for user jobs (often referred to as "user payloads"): this is something that you, as a user, should find a way to set up.

Cheers,
Federico


Daniela Bauer

Oct 16, 2019, 7:00:54 AM
to Federico Stagni, Nikolay Kutovskiy, diracgrid-forum
Hi Federico,

But as we (Simon and I, as the maintainers of a rather mixed-use DIRAC instance) alluded to in previous emails, this is something that is extremely difficult for users, and we should try to make it easier for them.
I really don't understand your reluctance here: sure, you can just state that this is a user problem, but if I have to solve it for the nth time, I'd rather try to approach it from the other end. Look at Igor's problems. His users don't have the technical know-how to do this, so it falls to the DIRAC admin, who in turn doesn't actually know what the users are doing, and hilarity ensues. That's how I (and Simon) end up being members of half a dozen experiments, just to make their code somehow run.
As noted, Simon and I are willing to put some effort into this from the DIRAC end, if only because it will make life for people outside CERN a lot easier.

@Nikolay: As far as I understand the situation now, you would like to run a container on a cloud? Or a specific image? Because that should be possible, though the admin might have to provide it.
We have just started doing something along those lines on the GridPP DIRAC; I guess this is a question for Andrei as the EGI DIRAC maintainer? We handle the space needs on the cloud by providing a flavour that has adequate space, but this needs the agreement of the cloud provider; there's no point creating a flavour that won't fit on the hardware.

Cheers,
Daniela



Marko Petric

Oct 16, 2019, 8:21:57 AM
to diracgrid-forum
Dear Daniela,
But as we (Simon and I, as the maintainers of a rather mixed-use DIRAC instance) alluded to in previous emails, this is something that is extremely difficult for users, and we should try to make it easier for them.
I think it would be productive to clarify the problem a bit better. Presently I only understand that your users have problems (but not what the problems actually are), and not what their initial setup is (do they use CVMFS, containers, or how do they get their software to the grid), nor, given such a setup, what problem they are experiencing.
 
I really don't understand your reluctance here: sure, you can just state that this is a user problem, but if I have to solve it for the nth time, I'd rather try to approach it from the other end. Look at Igor's problems. His users don't have the technical know-how to do this, so it falls to the DIRAC admin, who in turn doesn't actually know what the users are doing, and hilarity ensues. That's how I (and Simon) end up being members of half a dozen experiments, just to make their code somehow run.
As noted, Simon and I are willing to put some effort into this from the DIRAC end, if only because it will make life for people outside CERN a lot easier.
Also here more context, maybe a concrete example, would be appreciated. But I don't see any good solution that would satisfy a broad community and the myriad of setups that different groups have; there are basically only two options:
 - you are a service provider that will execute any payload a community gives you, and it's the job of the community to provide the payload in a form that can be executed on a vanilla grid node;
 - you are a member of a given community, understand their software, and are able to construct e.g. DIRAC Workflow modules in an extension for their software, and then it's super easy for the community to use DIRAC.

I think the current description of the problem is too vague.

Cheers,
Marko

 


Federico Stagni

Oct 17, 2019, 5:24:52 AM
to Marko Petric, diracgrid-forum
Hi,
there's no reluctance on my side. I think that we, as vanilla DIRAC, can implement what we can to minimize possible environment issues, and I appreciate the effort you guys are putting into this.

At the same time I see (correct me if I'm wrong) two use cases:
1) DIRAC client installations "messing up" the current users' environments (and the consequent inability to even run simple stuff like "less");
2) the definition of the environment for users' payloads.

We can take care, as DIRAC, of point 1. You have already made steps in this direction.

Regarding point 2, things get trickier. The users' job environment largely depends on what the user expects to find. Somehow, a kind of "containerization" needs to happen (by "container" here I do not mean specifically Singularity or Docker containers, but anything that can provide a toolbox -- even DIRACOS, if extended, can be one), and for this DIRAC is largely not responsible.
Changes, if ever, would need to be in the Workflow package, but this of course assumes that the DIRAC Job APIs have been used to create the job and that dirac-jobexec is the chosen executable.
The suggestion to use CVMFS for distributing the necessary libraries and binaries is just a way of easing the definition of such containers.

Cheers,
Federico


Daniela Bauer

Oct 17, 2019, 6:07:32 AM
to Federico Stagni, Marko Petric, diracgrid-forum
Hi Federico, Marko,

Obviously I can't speak for anyone else, but I know what my users want. These are people coming from the WMS model, and they use WLCG sites. These sites set up a standard worker node and some standard libraries. So what we (GridPP) are aiming for is that users end up in the environment the site provides (whatever that is; none of our business, that's rather the point), plus the DIRAC commands. For ILC and other experiments that already run in their own (isolated) environment that should make no difference, but for my (and possibly Igor's) user communities (and lone users, we have a few) it would be very helpful.

Regards,
Daniela

Marko Petric

Oct 18, 2019, 5:37:16 AM
to diracgrid-forum
Hi Daniela,
Do I understand correctly that your users call DIRAC commands inside the payload?
Cheers,
Marko
