
XNAT 1.9.1.1 file descriptor leak


Trey Dockendorf

Jan 31, 2025, 9:06:09 AM
to xnat_discussion
We are running several instances of XNAT 1.9.1.1 and XNAT 1.8.10.1. All of the 1.9.1.1 instances have begun leaking file descriptors. We've raised the file descriptor limit from 1024 to 4096 to 16385, and within 24-48 hours of a restart each XNAT instance still hits the limit.

The contents of /proc/<pid>/fd are mostly sockets like this:

lrwx------ 1 osundaxnatdev PCON0340 64 Jan 30 09:07 9996 -> 'socket:[16542392]'
lrwx------ 1 osundaxnatdev PCON0340 64 Jan 30 09:07 9997 -> 'socket:[16541493]'
lrwx------ 1 osundaxnatdev PCON0340 64 Jan 30 09:07 9998 -> 'socket:[16542421]'
lrwx------ 1 osundaxnatdev PCON0340 64 Jan 30 09:07 9999 -> 'socket:[16542422]'

This happens on both RHEL8 and RHEL9; both are using Java 1.8.0 from RHEL.

Plugins installed are:

ohif-viewer-3.7.0
container-service-3.6.2
batch-launch-plugin-0.7.0
xsync-plugin-1.8.1

We are not entirely sure how to narrow down what is leaking file descriptors, so any guidance is appreciated.
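One way to start narrowing this down is to tally what kind of object each descriptor in /proc/<pid>/fd points at and watch how the totals change over time. The following is a minimal Python sketch, assuming a Linux host and permission to read the process's fd directory; the function names are illustrative and not part of XNAT.

```python
import os
import sys
from collections import Counter


def classify_fd_target(target: str) -> str:
    """Map a /proc/<pid>/fd symlink target to a coarse category."""
    if target.startswith("socket:["):
        return "socket"
    if target.startswith("pipe:["):
        return "pipe"
    if target.startswith("anon_inode:"):
        return "anon_inode"
    if target.startswith("/"):
        return "file"
    return "other"


def summarize_fds(pid: int) -> Counter:
    """Count the open descriptors of a process by category (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        counts[classify_fd_target(target)] += 1
    return counts


if __name__ == "__main__":
    # Usage: python fd_summary.py <pid>   (defaults to this process)
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    for kind, count in summarize_fds(pid).most_common():
        print(f"{kind:12s} {count}")
```

Running this every few minutes against the Tomcat PID shows which category is growing. If it is sockets, as in the listing above, `lsof -p <pid>` can then show whether they are TCP or Unix domain sockets.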

Trey Dockendorf

Jan 31, 2025, 3:59:45 PM
to xnat_discussion
Digging into the configuration.log I see this:

2025-01-31 15:47:14,129 [taskScheduler-3] ERROR org.nrg.xnat.initialization.InitializingTasksExecutor - An error occurred while running the task "Check if all images referenced in commands are present on docker server. If not, pull them.", 1 incomplete tasks found.
java.lang.RuntimeException: java.io.IOException: com.sun.jna.LastErrorException: [2] No such file or directory
        at com.github.dockerjava.okhttp.UnixSocketFactory.createSocket(UnixSocketFactory.java:23)

We do not run any Docker daemons, and it seems that the container-service plugin may be repeatedly attempting to access the Docker socket and leaving a file descriptor open each time the attempt fails.
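To illustrate the suspected pattern (this is not the plugin's or docker-java's actual code, which is Java and is not shown in this thread): a socket descriptor is allocated before the connect call, so if connect fails and the error path never closes the socket, the descriptor stays open. A minimal Python sketch of the safe pattern, assuming a Unix domain socket path that does not exist:

```python
import socket
from typing import Optional


def connect_unix(path: str) -> Optional[socket.socket]:
    """Connect to a Unix domain socket, releasing the fd if connect fails."""
    # The file descriptor exists as soon as the socket object is created,
    # before any connection attempt is made.
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
    except OSError:
        # A missing socket file fails with errno 2 (ENOENT). Without this
        # close(), every failed attempt would keep one descriptor open.
        sock.close()
        return None
    return sock


if __name__ == "__main__":
    # Hypothetical path used only for illustration.
    result = connect_unix("/nonexistent/docker.sock")
    print("connected" if result else "connect failed, descriptor released")
```

A retry loop that omits the close on the failure path would match the symptom described here: a steadily growing number of socket entries in /proc/<pid>/fd on hosts where no Docker socket exists.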


Rick Herrick

Jan 31, 2025, 5:26:00 PM
to xnat_di...@googlegroups.com
Hi Trey,

I think your conclusion is reasonable. If you don't run Docker at all, you should be able to fix this by stopping Tomcat, moving the container service plugin out of XNAT's plugins folder, and then restarting. Let us know if that mitigates the issue for you.


Trey Dockendorf

Feb 4, 2025, 6:54:41 PM
to xnat_discussion
Removing both the container service plugin and the batch launch plugin resolved this. We had to remove the batch launch plugin because it depends on the container service plugin. We never saw this with 1.8.10.1 and its associated plugin versions, so something must have changed in the container service plugin so that it attempts to open Docker sockets and then does not close the file descriptor when the attempt fails.
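For anyone applying the same workaround on several hosts, here is a minimal Python sketch of the move step, to be run only while Tomcat is stopped. The directory paths and the assumption that the jar file names begin with the plugin names listed earlier in this thread are hypothetical; adjust them to your installation.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def disable_plugins(plugins_dir: Path, disabled_dir: Path,
                    prefixes: Iterable[str]) -> List[str]:
    """Move plugin jars whose names start with any prefix out of plugins_dir."""
    disabled_dir.mkdir(parents=True, exist_ok=True)
    wanted = tuple(prefixes)
    moved = []
    # sorted() materializes the listing before any file is moved.
    for jar in sorted(plugins_dir.glob("*.jar")):
        if jar.name.startswith(wanted):
            shutil.move(str(jar), str(disabled_dir / jar.name))
            moved.append(jar.name)
    return moved


if __name__ == "__main__":
    # Hypothetical locations; replace with your XNAT home directory.
    moved = disable_plugins(
        Path("/data/xnat/home/plugins"),
        Path("/data/xnat/home/plugins-disabled"),
        ["container-service", "batch-launch-plugin"],
    )
    print("moved:", moved)
```

Moving the jars rather than deleting them makes it easy to restore both plugins later by moving them back and restarting Tomcat.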

Rick Herrick

Feb 5, 2025, 11:21:45 AM
to xnat_di...@googlegroups.com
That's interesting. What versions of the container service plugin do you have installed on 1.8.10.1 and 1.9.1.1?

Haroon Chughtai

Feb 6, 2025, 7:54:32 AM
to xnat_discussion
We've had the same issue after upgrading some of our instances from XNAT 1.8.10.1/CS 3.4.3 to XNAT 1.9.1.1/CS 3.6.2.

The instances remaining on XNAT 1.8.10.1/CS 3.4.3 have had no issues, and our instances that are now on XNAT 1.9.1.1/CS 3.6.2 and have the Container Service configured for use are fine.
However, our instances on XNAT 1.9.1.1/CS 3.6.2 that had CS installed but not set up were encountering the file descriptor leak and crashing. Removing the CS and batch launch plugins has mitigated the issue.

Cheers,
Haroon

--
Haroon Raza Chughtai
Senior Research Software Engineer | UCL Advanced Research Computing Centre
Part-time PhD Student | UCL Hawkes Institute
University College London

Rick Herrick

Feb 6, 2025, 11:14:10 AM
to xnat_di...@googlegroups.com
Great, thanks for letting us know. I created a couple of issues for this so that we can hopefully prevent it in the future.

Timothy Olsen

Feb 6, 2025, 11:16:11 AM
to xnat_di...@googlegroups.com
Yes. Thanks for identifying this issue. Our test servers all have Docker running, so we didn't catch it. This could be a problem for anyone who is doing a new install of XNAT and doesn't have Docker set up yet. We'll prioritize a fix for it.

Tim

Trey Dockendorf

Feb 6, 2025, 2:30:03 PM
to xnat_discussion
We had the same versions as others.

XNAT 1.8.10.1 and container service 3.4.3 were working fine; we upgraded some hosts to XNAT 1.9.1.1 and container service 3.6.2, and that's when we noticed the issue. We've removed the container service and batch launch plugins where possible, which resolved it. In all cases we don't run Docker due to site policies.
