Limiting remote connections


Felix Kunzweiler

Nov 27, 2020, 6:17:30 AM
to Luigi
Hi everyone,
First, thank you for all the great work. I really enjoy using luigi.

I have a very specific problem regarding remote execution over SSH.
I built a fairly complex pipeline that runs some of the heavy tasks on a remote cluster by submitting jobs to an sbatch system. All the necessary data files are handled by RemoteTargets and copied to the cluster in a separate Task before the heavy task runs.

The luigi scheduler and the workers are running on one machine that connects through SSH to the remote cluster and starts the job, using the RemoteContext.
The subsequent luigi Task waits for the remote job to finish by checking for the creation of the output files using RemoteTargets.

This was all working well with SSHPASS, but with the latest update to the security guidelines of the compute cluster, two-factor authentication is required for SSH connections.
To solve this, I added SSH multiplexing: I establish an external ControlMaster connection, providing the 2FA manually, which should then be used by luigi.
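
For readers unfamiliar with SSH multiplexing, the following is a minimal sketch of how a client command can reuse an already authenticated master connection. The function names, host name, and socket path are hypothetical, and this is not how luigi's RemoteContext builds its commands; it only illustrates the OpenSSH options involved (`ControlMaster`, `ControlPath`, `BatchMode`, and `-O check`).

```python
def multiplexed_ssh_command(host, remote_command, control_path):
    """Build an ssh command that reuses an existing master connection.

    All arguments are illustrative; control_path must point at the socket
    created by the manually started ControlMaster session.
    """
    return [
        "ssh",
        # Do not try to become a master; only reuse the existing one.
        "-o", "ControlMaster=no",
        # Socket of the master connection that already passed 2FA.
        "-o", f"ControlPath={control_path}",
        # Never prompt interactively if the master is unavailable.
        "-o", "BatchMode=yes",
        host,
        remote_command,
    ]


def master_check_command(host, control_path):
    """Build a command that asks whether the master connection is alive."""
    return ["ssh", "-S", control_path, "-O", "check", host]


if __name__ == "__main__":
    # Hypothetical host and socket path, printed rather than executed.
    print(multiplexed_ssh_command("cluster", "squeue", "/tmp/cm.sock"))
    print(master_check_command("cluster", "/tmp/cm.sock"))
```

The same options could instead be placed in `~/.ssh/config` for the cluster host, so that every SSH invocation picks them up without extra arguments.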

The problem is that when the pipeline is started and the dependency tree is built, many RemoteTargets are checked for existence, resulting in a lot of SSH connections going through the ControlMaster connection.

Unfortunately, the maximum number of allowed concurrent connections is 10.
When luigi tries to add one additional connection, the ControlMaster connection breaks.

I tried defining the maximum number of connections as a resource and setting that resource in the tasks that have RemoteTargets.
As far as I understand, this does not work because luigi checks the targets of the whole dependency tree when it is built, not only when a task is executed.

I would be very glad if someone has an idea on how to deal with this problem.

Best,
Felix

Lars Albertsson

Nov 27, 2020, 12:04:25 PM
to Felix Kunzweiler, Luigi
Hi,

Resources are meant for this use case. Luigi allocates the resource per task, so only your tasks having RemoteTargets would count towards the resource. Other tasks in the dependency tree would run in spite of congestion on the remote resource.

Are you seeing a different behaviour?
