[slurm-users] work with sensitive data


Michał Kadlof

Dec 14, 2021, 3:22:58 PM
to Slurm User Community List
Hi,

some of my users work with "sensitive data". Currently we use standard
unix groups with ACLs to limit access, but I wonder if there is any way
to keep data encrypted (for example with gpg), decrypt it "on the
fly" in a Slurm job, and then encrypt the results again after the job has
finished.

We store users' homes on a Lustre shared filesystem, if it matters...

Are there any recommendations, guides or "best practices" on how to keep
such data safe?

--
cheers
Michał Kadlof


Hermann Schwärzler

Dec 15, 2021, 4:29:56 AM
to slurm...@lists.schedmd.com
Hi Michał,
hi everyone,

we have similar issues looming on the horizon (sensitive medical
and human genetic data). :-)

We are currently looking into telling our users to use EncFS
(https://en.wikipedia.org/wiki/EncFS) for this. As it is a filesystem in
user space, unprivileged users can use it freely, and as there are
implementations available for Windows and OS X as well, they can
transfer data in its encrypted form to and from the cluster.

We do not have a "turn-key" solution yet.
One of the open problems is a way to provide the password for mounting
the encrypted directory inside a Slurm job. But this should be solvable.

Regards,
Hermann

Josef Dvoracek

Dec 16, 2021, 6:41:35 AM
to slurm...@lists.schedmd.com
> One of the open problems is a way to provide the password for
> mounting the encrypted directory inside a slurm-job. But this should be
> solvable.

I'd be really interested to hear more about a mechanism to distribute
credentials across compute nodes in a secure way, especially if we're
using a filesystem that is not secure by default.

On 15. 12. 21 10:29, Hermann Schwärzler wrote:
...

--
Josef Dvoracek
Institute of Physics | Czech Academy of Sciences | office 230A
cell+signal: +420 608 563 558

Michał Kadlof

Dec 17, 2021, 5:41:25 PM
to slurm...@lists.schedmd.com

On 15.12.2021 10:29, Hermann Schwärzler wrote:
> We are currently looking into telling our users to use EncFS (https://en.wikipedia.org/wiki/EncFS) for this.

This looks good to me. However, it looks like it still requires an interactive job to provide the password manually. It would be great if anyone could point out how to decrypt it with "sbatch".

Do you know what happens to the "decrypted" mount point after a job runs out of time or is killed for some other reason? Is it then unmounted automatically? Does it remain safe when left mounted permanently (for example on an access node)?
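
To make the two questions concrete, here is a minimal, untested sketch of a non-interactive mount with explicit cleanup. It assumes `encfs` and `fusermount` are installed on the compute node and that the job step actually receives SIGTERM before it is killed. The names `encfs_mounted` and `install_sigterm_handler` are made up for illustration, and how the password reaches the job is deliberately left open, since that is the unsolved part.

```python
import signal
import subprocess
from contextlib import contextmanager


def install_sigterm_handler():
    """Turn SIGTERM into SystemExit so that 'finally' blocks still run.

    Call this once from the main thread of the job script.
    """
    def _handler(signum, frame):
        raise SystemExit(128 + signum)
    signal.signal(signal.SIGTERM, _handler)


@contextmanager
def encfs_mounted(encrypted_dir, mount_point, password, run=subprocess.run):
    """Mount an EncFS directory, yield the mount point, then try to unmount.

    Both paths should be absolute. `run` is injectable so the logic can be
    exercised without encfs being installed.
    """
    mounted = False
    try:
        # --stdinpass makes encfs read the password from stdin, without a prompt.
        run(["encfs", "--stdinpass", encrypted_dir, mount_point],
            input=password + "\n", text=True, check=True)
        mounted = True
        yield mount_point
    finally:
        if mounted:
            # Best-effort unmount; a SIGKILL would still skip this step.
            run(["fusermount", "-u", mount_point], check=False)
```

Inside a batch job this would be used as `with encfs_mounted(enc, plain, pw): ...` after calling `install_sigterm_handler()`. It does not answer what happens to the mount if the process is killed outright.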

--
best regards
Michał Kadlof

Renfro, Michael

Dec 17, 2021, 6:32:55 PM
to Slurm User Community List

Untested, but given a common service account with a GPG key pair, a user with a GPG key pair, and the EncFS encrypted with a password, the user could sign the password with their own private key, encrypt it with the service account's public key, and leave it alongside the EncFS.

If the service account is monitoring a common area for new files, it can grab the EncFS and the signed, encrypted password, decrypt the password with its own private key, verify the signature with the user's public key, unlock the EncFS, and run the job.

 

Afterwards, the service account can re-lock the EncFS and let the user unlock it for viewing final results.
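
As a rough illustration of the GPG side of this idea (equally untested), the sketch below only builds the `gpg` command lines for sealing and opening the password file. The helper names and key IDs are hypothetical. Note that `gpg --decrypt` reports on the signature but does not by itself enforce that the signer is the expected user, so a real setup would need to check gpg's status output as well.

```python
import subprocess


def seal_command(user_key, service_key, plaintext_path, sealed_path):
    """gpg argv: sign with the user's key, encrypt to the service account's key."""
    return [
        "gpg", "--batch", "--yes",
        "--local-user", user_key,     # private key used for signing
        "--recipient", service_key,   # public key the file is encrypted to
        "--sign", "--encrypt",
        "--output", sealed_path,
        plaintext_path,
    ]


def open_command(sealed_path):
    """gpg argv: decrypt to stdout (run as the service account)."""
    return ["gpg", "--batch", "--decrypt", sealed_path]


def read_password(sealed_path, run=subprocess.run):
    """Run the decrypt command and return the password without a trailing newline.

    `run` is injectable so this can be tested without a GPG keyring.
    """
    result = run(open_command(sealed_path),
                 capture_output=True, text=True, check=True)
    return result.stdout.rstrip("\n")
```

The user would run the command from `seal_command(...)` once per EncFS directory, and the service account would call `read_password(...)` before mounting.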

 

From: slurm-users <slurm-use...@lists.schedmd.com> on behalf of Michał Kadlof <m.ka...@mini.pw.edu.pl>
Date: Friday, December 17, 2021 at 4:41 PM
To: slurm...@lists.schedmd.com <slurm...@lists.schedmd.com>
Subject: Re: [slurm-users] work with sensitive data


William Brown

Dec 17, 2021, 6:51:42 PM
to Slurm User Community List
I realise this is not helpful with Lustre, but we are using NFSv4 with krb5p mounts to encrypt data in flight.

We also use AUKS to make the Kerberos tickets available to the compute nodes, an idea from CERN.

All our nodes are AD integrated, so if the user is authenticated by AD they can access the data, and not otherwise.

Authorization is by AD group membership, with RFC2307 attributes in AD so we have username mapping. That is why we use NFSv4.

That suits NGS, as most of the software isn't written for MPI or other approaches where a real cluster file system is needed.

An advantage is that the users don't really see anything unusual apart from having to log in with a password, as GSSAPI cannot work with this setup.

