Checkpointing in singularity

Rémy Dernat

Jan 26, 2017, 9:55:56 AM
to singu...@lbl.gov
Hi,


However, this means that the checkpoint method has to be part of the software design. As an HPC system administrator, you generally have to install applications without having deep knowledge of each one (or even the permission to modify it, or its sources), except for those you write yourself.

For some container technologies, it is possible to use CRIU, for example:

Rémy Dernat

Jan 26, 2017, 10:01:45 AM
to singu...@lbl.gov
Sorry for the mishandling.

So, CRIU is already available directly with Docker, LXC, or even OpenVZ:

Is there any plan to include CRIU in upcoming versions of Singularity?

There are many advantages to using freeze/restore techniques for a container. For example, in an HPC environment, you could migrate a running job from one host to another if anything goes wrong or if you need more resources.
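
As a rough illustration of that freeze/restore workflow, here is a minimal Python sketch that only builds the CRIU command lines for checkpointing a process tree and restoring it later. The helper names, the PID, and the image directory are hypothetical; the options used (`--tree`, `--images-dir`, `--shell-job`, `--leave-running`) are standard CRIU options, and CRIU generally has to be run with root privileges. Moving the image directory to another host is left out.

```python
from __future__ import annotations

import shlex


def criu_dump_cmd(pid: int, images_dir: str, leave_running: bool = False) -> list[str]:
    """Build the argv that checkpoints the process tree rooted at `pid`.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    cmd = ["criu", "dump", "--tree", str(pid), "--images-dir", images_dir, "--shell-job"]
    if leave_running:
        # Keep the job running after the dump instead of killing it.
        cmd.append("--leave-running")
    return cmd


def criu_restore_cmd(images_dir: str) -> list[str]:
    """Build the argv that restores a process tree from a checkpoint directory."""
    return ["criu", "restore", "--images-dir", images_dir, "--shell-job"]


if __name__ == "__main__":
    images = "/tmp/job-checkpoint"  # hypothetical checkpoint directory
    # Print the commands; an administrator would run them as root.
    print(shlex.join(criu_dump_cmd(12345, images)))  # 12345 is a made-up PID
    print(shlex.join(criu_restore_cmd(images)))
```

To migrate a job, the image directory would have to be made available on the target host (for example on a shared filesystem) before running the restore command there.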

Best regards

Rémy

Gregory M. Kurtzer

Jan 26, 2017, 12:36:57 PM
to singularity
I will look into this in more detail. Singularity would face some technical difficulties that Docker, LXC, and OpenVZ do not, namely because those all run daemons as root, so it is a root process that invokes the checkpointing. Singularity doesn't use a root-owned daemon, and thus the calls to ptrace() from CRIU would not work.

As I said, I will investigate this before the release of 2.3, but please submit a feature enhancement request via the GitHub issue tracker (just to be sure I don't forget).

Thanks!

--
You received this message because you are subscribed to the Google Groups "singularity" group.
To unsubscribe from this group and stop receiving emails from it, send an email to singularity+unsubscribe@lbl.gov.



--
Gregory M. Kurtzer
HPC Systems Architect and Technology Developer
Lawrence Berkeley National Laboratory HPCS
University of California Berkeley Research IT
Singularity Linux Containers (http://singularity.lbl.gov/)
Warewulf Cluster Management (http://warewulf.lbl.gov/)

Rémy Dernat

Jan 26, 2017, 12:43:21 PM
to singu...@lbl.gov
OK, I will do that!

Thanks, Greg