help request: broken assessor access after minor plugin update

74 views
Skip to first unread message

Martin Boswell

unread,
Mar 23, 2023, 3:01:53 PM3/23/23
to xnat_discussion
XNAT folks,

I seem to have broken some functionality in my XNAT server and I am out of ideas on how to fix it.  I'd appreciate any advice that you can offer...  

The short story is that I upgraded a custom plugin yesterday, and now my server refuses to correctly respond to certain requests (and leaves a postgres error in sql.log).  

In more detail:  (XNAT 1.8.3 build 5 on a Dell 740xd, Rocky Linux 8.5, Java 1.8.0_322-b06, Tomcat 9.0.54.0, PostgreSQL 12.10)  

A little over a year ago, I built a custom plugin to create image assessors for PET sessions (I have a similar plugin to create assessors for MRI sessions). It is currently used with 10's of thousands of experiments. Yesterday I changed the code in the PET plugin to allow 4 of the fields to accept NULL values.  Here is a diff of one of the `linked_mri_offset_days` field:  

           <xs:element name="linked_mri_offset_days">
                   <xs:annotation>
-                          <xs:documentation>Count of days from scan to linked MRI experiment.</xs:documentation>
+                          <xs:documentation>Count of days from scan to linked MRI experiment (or empty).</xs:documentation>
                   </xs:annotation>
                   <xs:simpleType>
-                          <xs:restriction base="xs:integer">
-                                  <xs:maxInclusive value="14600"/>
-                                  <xs:minInclusive value="-14600"/>
-                          </xs:restriction>
+                          <xs:union>
+                <xs:simpleType>
+                  <xs:restriction base='xs:string'>
+                    <xs:length value='0'/>
+                  </xs:restriction>
+                </xs:simpleType>
+                                  <xs:simpleType>
+                                          <xs:restriction base="xs:integer">
+                                                  <xs:maxInclusive value="14600"/>
+                                                  <xs:minInclusive value="-14600"/>
+                                          </xs:restriction>
+                                  </xs:simpleType>
+                          </xs:union>
                   </xs:simpleType>

The only other change to the code was to the plugin version number in build.gradle:  

-version "1.0.1"
+version "1.0.2"

The plugin compiled without isssue, and I stopped the server, removed the old plugin files and installed the new versions and then restarted the server.  My catalina log showed the following:  

22-Mar-2023 07:08:33.819 INFO [main] org.apache.jasper.servlet.TldScanner.scanJars At least one JAR was scanned for TLDs yet contained no TLDs. Enable debug logging for this logger for a complete list of JARs that were scanned but no TLDs were found in them. Skipping unneeded JARs during scanning can improve startup time and JSP compilation time.
SOURCE: /pool/xnatprod/xnat/data/catalina/webapps/ROOT/
===========================
Database out of date... updating


SELECT dependencies_save_and_drop('jaglab_petanalysisdata');



--Database column jaglab_petanalysisdata.linked_mri_date is no longer required.
ALTER TABLE jaglab_petanalysisdata  ALTER COLUMN linked_mri_date DROP NOT NULL;



--Database column jaglab_petanalysisdata.linked_mri_offset_days is no longer required.
ALTER TABLE jaglab_petanalysisdata  ALTER COLUMN linked_mri_offset_days DROP NOT NULL;


SELECT dependencies_restore('jaglab_petanalysisdata');
===========================
22-Mar-2023 07:09:09.214 INFO [main] org.apache.catalina.startup.HostConfig.deployWAR Deployment of web application archive [/pool/xnatprod/xnat/data/catalina/webapps/ROOT.war] has finished in [43,369] ms
22-Mar-2023 07:09:09.217 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8080"]
22-Mar-2023 07:09:09.228 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in [43498] milliseconds

This all seemed to work correctly, but now when I try to use the data API to access the "PET Analysis" assessors that are created by this plugin, the calls fail.  For instance, I can retrieve a PET experiment's assessor list:  

(reporting) [mboswell@nx5 ~]$ xcurl "https://MY_XNAT_SERVER/data/archive/projects/bacs/subjects/JagustLab_S00001/experiments/JagustLab_E41584/assessors?format=json"
{"ResultSet":{"Result":[{"date":"2010-10-14","xsiType":"jaglab:petAnalysisData","session_label":"B10-282_PET_2010-10-14_PIB_LBL_recon","xnat:imageassessordata/id":"JagustLab01_E55715","insert_date":"2022-08-04 15:32:13.181","session_ID":"JagustLab_E41584","project":"bacs","ID":"JagustLab01_E55715","label":"JagustLab_E41584_PET_analysis_2022-08-04","URI":"/data/experiments/JagustLab01_E55715"}], "totalRecords": "1"}}

But when I try to retrieve the assessor in that list, it returns an error:  

(reporting) [mboswell@nx5 ~]$ xcurl "https://MY_XNAT_SERVER/data/archive/projects/bacs/subjects/JagustLab_S00001/experiments/JagustLab_E41584/assessors/JagustLab01_E55715?<html>=json"
<head>
   <title>Status page</title>
</head>
<body>
<h3>Unable to find the specified experiment.</h3><p>You can get technical details <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5">here</a>.<br>
Please continue your visit at our <a href="/">home page</a>.
</p>
</body>
</html>


The same response is received from the shorter URL: "https://MY_XNAT_SERVER/data/experiments/JagustLab_E41584/assessors/JagustLab01_E55715?format=json".  
Accessing the same experiment/assessor via XNATpy gives a similar response:  

Traceback (most recent call last):
  File "/home/jagust/mboswell/.conda/envs/reporting/lib/python3.9/site-packages/xnat/core.py", line 261, in get
    value = self._overwrites[name]
KeyError: 'date'
...
File "/home/jagust/mboswell/.conda/envs/reporting/lib/python3.9/site-packages/xnat/session.py", line 311, in _check_response
    raise exceptions.XNATResponseError('Invalid status for response from XNATSession for url {} (status {}, accepted status: {})'.format(uri, response.status_code, accepted_status))
xnat.exceptions.XNATResponseError: Invalid status for response from XNATSession for url https://MY_XNAT_SERVER/data/archive/projects/bacs/subjects/JagustLab_S00001/experiments/JagustLab_E41584/assessors/JagustLab01_E55715?format=json (status 404, accepted status: [200])

If I perform similar queries with the analogous "MRI Analysis" assessors, all works as expected (i.e. it returns details of an "MRI Analysis" assessor).  
After struggling to figure this out for a while, I shut down the server, restored the previous version of the plugin, and restarted the server.  However, the problem remains.  

The source of this problem as far as I can diagnose it seems to be this error from the servers sql.log file (whenever one of the failing requests is issued):  

2023-03-23 11:25:23,510 [http-nio-8080-exec-6] ERROR org.nrg.xft.db.PoolDBUtils - SELECT i_xnat_experimentData('JagustLab01_E55715',0,FALSE,TRUE,false)
2023-03-23 11:25:23,512 [http-nio-8080-exec-6] ERROR org.nrg.xft.db.PoolDBUtils - An error occurred trying to execute the user fulladmin query: SELECT i_xnat_experimentData('JagustLab01_E55715',0,FALSE,TRUE,false)
org.postgresql.util.PSQLException: ERROR: function replace(date, unknown, unknown) does not exist
  Hint: No function matches the given name and argument types. You might need to add explicit type casts.
  Where: PL/pgSQL function i_jaglab_petanalysisdata(character varying,integer,boolean,boolean,boolean) line 37 at assignment
PL/pgSQL function i_xnat_experimentdata(character varying,integer,boolean,boolean,boolean) line 225 at assignment
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2455)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2155)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:288)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:430)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:356)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:303)
        at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:289)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:266)
        at org.postgresql.jdbc.PgStatement.executeQuery(PgStatement.java:233)
        at org.apache.commons.dbcp2.DelegatingStatement.executeQuery(DelegatingStatement.java:207)
        at org.apache.commons.dbcp2.DelegatingStatement.executeQuery(DelegatingStatement.java:207)
        at org.nrg.xft.db.PoolDBUtils.executeQuery(PoolDBUtils.java:914)
        at org.nrg.xft.db.PoolDBUtils.returnStatisticQuery(PoolDBUtils.java:380)
        at org.nrg.xft.db.PoolDBUtils.ReturnStatisticQuery(PoolDBUtils.java:405)
        at org.nrg.xft.search.ItemSearch.getItemsFromKeys(ItemSearch.java:377)
        at org.nrg.xft.search.ItemSearch.exec(ItemSearch.java:417)
...(clipped)...
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:357)
    at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:382)
    at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
    at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:895)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1722)
    at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
    at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)
    at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:750)


Does this seem to be the source of these problems?  

What can I do to fix this issue?  

Thanks for any advice.  

- Martin

Rick Herrick

unread,
Mar 28, 2023, 11:42:14 AM3/28/23
to xnat_discussion
Hey Martin,

Could you possibly share before and after versions of your XSD file? We have some idea of what's failing but not necessarily why it's failing. It would be helpful to be able to inspect the database schema that gets generated.

Martin Boswell

unread,
Mar 28, 2023, 12:26:25 PM3/28/23
to xnat_discussion
Thanks for your reply, Rick.  

I'm happy to provide anything that might help remedy this problem.  I've attached the before (v1.0.1) and after  (v1.0.2) versions of my "pet_analysis.xsd" file.  

Unfortunately, in my attempts to fix this problem, I've actually made things worse.  In short, I used ran the sql script to rebuild the database views (https://wiki.xnat.org/display/XKB/Rebuilding+the+XNAT+database+views), and after restarting my xnat instance, all project, subject, experiment, assessor list views in the XNAT web views are missing and give a "Failed to create search results." error (except for the "Recent Data Activity" list which appears and behaves normally).  

It's unclear to me whether I've cause a new problem or compounded the previous problem.  I've never performed this procedure before, so I didn't know exactly what to expect, but I find it surprising that when I started tomcat after running the sql script, it reported "Database up to date.".  During the course of my troubleshooting I've reverted to the older plugin version and restored the new version again as well as rebooting my server and restarting tomcat quite a few times.  Is there something that I can do to force XNAT to rebuild the database?  Is that even the direction that I should be going?  

 - Martin
pet_analysis_v1.0.1.xsd
pet_analysis_v1.0.2.xsd

Martin Boswell

unread,
Apr 13, 2023, 10:27:32 PM4/13/23
to xnat_discussion
Rick,

I've resolved this problem but I believe that it resulted from an XNAT bug.  Basically, these two functions were built in such a way that they could not operate without an error.  

  public.i_jaglab_petanalysisdata
  public.i_jaglab_petanalysisdata_history

Before I changed my plugin to try to allow NULL values for some fields, those two functions included lines like this:  

  fullText := fullText || '(linked_mri_date:date)=(' || current_row.linked_mri_date || ')';

When I changed the definition of the date field, `linked_mri_date`, to allow for NULL values (in my plugin's xsd file), it built the above functions with this type of line for my null-allowing date field (also for the int field):  

  fullText := fullText || '(linked_mri_date:string)=(' || REPLACE(REPLACE(current_row.linked_mri_date,'(','*OPEN*'),')','*CLOSE*') || ')';

That line cannot work in plpgsql, as the REPLACE() function will not accept a date field name for it's first argument.  Perhaps it should have tried to cast current_row.linked_mri_date to a string first?  Anyway, I was finally able to restore my plugin functionality by editing these functions and changing those lines back to their original form.  

So, somewhere between the definition of the `linked_mri_date` field in my xsd file and the generation of the sql for those function, an illegal instruction was generated.  

Should I have used a different definition in my xsd file to allow NULL values?

On a side note, in order to fix my issue with rebuilding the database views, I found some of the public views were owned by postgres instead of xnat (going back about 18 months according to archived postgres dumps; it started just after my xnat 1.8 upgrade).  When I dropped these views in addition to the xnat owned views, functionality was restored on tomcat startup.  

 - Martin
Reply all
Reply to author
Forward
0 new messages