XNAT Performance/Scaling Issues

Alexander Barton

Feb 12, 2025, 1:33:57 PM
to xnat_discussion
Hello XNAT Experts,

We are experiencing performance issues with our XNAT instance and I was hoping to get some guidance or suggestions.

Our specs:
version 1.9.0, build 407
CentOS 7
Postgres 11.21

Several days ago we started seeing CPU usage spike on our server; the Tomcat process was routinely consuming 100% across almost every core. At that point the VM had 64 GB of RAM (32 GB for Tomcat) and 10 cores. In response we more than doubled the core count (24 now), but the issue persists.

Anecdotally, CPU use seems to spike every time someone tries to download data, or even just view some data (e.g. clicking "Details" in the prearchive). We have seen an increase in users over the past month, but the spike seems to happen even when a single user does this; the site slows to a crawl, our uploads and archiving slow down, &c.

Importantly, we have not changed anything about our configuration -- except for adding CPUs to the VM in response.  It might just be that we need to start scaling to multiple servers, but the increase in users (2 - 3 frequent to 5 - 7 frequent) feels like it should not be impacting the service this much.

Any tips/thoughts/suggestions are much appreciated. 

Thank you!

Alex

PS - I've attached a screenshot of the JavaMelody CPU chart from today for reference. This type of activity was also happening over the weekend.
xnat-cpu-java-melody.png

Rick Herrick

Feb 12, 2025, 3:03:59 PM
to xnat_di...@googlegroups.com
Hey Alex, couple questions:
  • You said this started happening "several days ago". Was that contemporaneous with upgrading to 1.9.0? Or had you been on 1.9.0 for a little while before this started happening?
  • Do you have any plugins installed and if so which ones and what versions?
The first thing that jumped out at me was the versions of your OS and PostgreSQL, both of which are quite old: CentOS 7 EOL'ed in June 2024 and PostgreSQL 11 in November 2023. Honestly, I'm less concerned about PostgreSQL in this regard and more about CentOS. It's possible that there's an issue with something like the NFS or other network-attached storage drivers that is causing these I/O operations to run up the CPU utilization.
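
One rough way to check whether the CPU time is going to I/O wait rather than to the Tomcat process itself is to compare two samples of the aggregate `cpu` line in `/proc/stat`. The following is a minimal sketch, assuming a Linux host; the function names (`parse_cpu_line`, `cpu_shares`, `sample_live`) and the 5-second interval are illustrative choices, not part of XNAT or any of its tooling.

```python
import time

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_shares(before, after):
    """Return each state's share (in percent) of the CPU time between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def sample_live(interval_s=5.0, path="/proc/stat"):
    """Take two samples interval_s seconds apart (hypothetical helper, Linux only)."""
    with open(path) as f:
        first = parse_cpu_line(f.readline())
    time.sleep(interval_s)
    with open(path) as f:
        second = parse_cpu_line(f.readline())
    return cpu_shares(first, second)
```

Calling `sample_live()` while someone triggers a download would show how the interval splits between `user`, `system`, and `iowait`. A large `iowait` or `system` share would point more toward storage or drivers than toward application code, though this is only a coarse indicator.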

Alexander Barton

Feb 12, 2025, 5:15:13 PM
to xnat_discussion
Thank you for the response Rick! 

- We migrated to 1.9.0 back in October (the performance improvements were quite nice, the last few days notwithstanding...)
- Many plugins:
  • Batch Launch - 0.7.0
  • Container Service - 3.6.0
  • LDAP Auth - 1.1.0
  • OHIF viewer - 3.7.0
  • OpenID Auth - 1.3.1
  • Ximg View - 1.0.2 (this one is no longer supported, but we use it because the OHIF viewer cannot handle some of our GE Spiral sequences)
I agree on the CentOS front. We've been in the process of upgrading to RHEL, but procurement, validation, &c. have delayed this a bit. You're also spot on with the NFS: our main archive sits on NFS. However, our IOPS didn't look too irregular during this.

In terms of the Catalina/Tomcat logs, is there anything I should be looking for? Could I raise the log level and check for certain processes? Currently I do see one SQL error in `catalina.out` (usernames redacted):

```
Loaded subject object (org.nrg.xdat.om.XnatSubjectdata) as context parameter 'subject'.
Exception in thread "Thread-189" org.springframework.jdbc.BadSqlGrammarException: StatementCallback; bad SQL grammar [DROP TABLE xdat_search._BUR01_ROM_xnat_col_subjectData_[username]_1739383960371]; nested exception is org.postgresql.util.PSQLException: ERROR: table "_[project_id]_xnat_col_subjectdata_[username]_1739383960371" does not exist
```
It seems to be trying to drop a table that does not exist. It only occurs once during the day, however.
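
To see whether any other exceptions recur in the log, one option is to tally exception class names across `catalina.out`. This is a minimal sketch, assuming exceptions appear as fully qualified Java class names ending in `Exception` or `Error`, as in the excerpt above; the function name `count_exceptions` and the log path in the usage note are hypothetical.

```python
import re
from collections import Counter

# Fully qualified Java class names ending in "Exception" or "Error",
# e.g. org.postgresql.util.PSQLException.
EXCEPTION_RE = re.compile(r"\b((?:[A-Za-z_]\w*\.)+[A-Z]\w*(?:Exception|Error))\b")


def count_exceptions(lines):
    """Count how often each exception class name appears in the given log lines."""
    counts = Counter()
    for line in lines:
        counts.update(EXCEPTION_RE.findall(line))
    return counts
```

It could be run against the log with something like `with open("/path/to/catalina.out") as f: print(count_exceptions(f).most_common(10))`, where the path is a placeholder for wherever Tomcat writes its logs on the host.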

Once again thank you so much for any guidance!

Alex


mohammad amanuddin

Feb 21, 2025, 9:57:02 AM
to xnat_discussion
Hi All,
Since the Tomcat service is causing the spike, what are the specs of the Tomcat service? What version is it on? I would recommend upgrading the Tomcat service to the latest version.

Timothy Olsen

Feb 21, 2025, 10:54:13 AM
to xnat_di...@googlegroups.com
We've been working with this group to try to figure out what is going on.  It's definitely an odd one.  We'll make sure to post back when we know the resolution.

Tim
