Question about running containers on all project files via REST API

Tanae Bousfiha

Oct 9, 2025, 3:12:14 PM
to xnat_discussion

Hello everyone,

I have a question.
Is it theoretically possible to process all files within a project using a container launched via the REST API? Does the container have access to the database (since I couldn’t find any information about this in the XNAT documentation)?
Would this approach cause any mounting or storage space issues?

Thank you very much for your support.

Best regards,
Bousfiha

Rick Herrick

Oct 9, 2025, 3:59:56 PM
to xnat_di...@googlegroups.com
Yes, it is not just theoretically possible but practically quite possible to do this. It requires a decent familiarity with the REST API, but the UI itself is mostly just a proxy to that same API. In fact, launching containers from scripts that call the REST API is a very common operation.

Exactly how you'd do this depends a lot on the type and structure of your data and your use case(s), so it's hard to describe a general process. But if you have a container that runs at, e.g., the experiment level, you can take the API call(s) for launching the container, parameterize them so that you can pass experiment-specific values on each call, then iterate over all of the experiments in your project, calling the REST API with the appropriate values.
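
As a rough sketch in Python (the host, credentials, project, command ID, and wrapper name are all placeholders, and the exact launch endpoint and input name depend on your command and wrapper definitions), that might look like:

import requests

XNAT = "https://xnat.example.org"  # placeholder host
AUTH = ("user", "password")        # placeholder credentials

with requests.Session() as s:
    s.auth = AUTH

    # List all experiments in the project (standard XNAT REST API).
    r = s.get(f"{XNAT}/data/projects/MYPROJECT/experiments",
              params={"format": "json"})
    r.raise_for_status()
    experiments = r.json()["ResultSet"]["Result"]

    # Launch the container once per experiment. The root element and input
    # name ("session" here) must match the wrapper's external input; the
    # value may be the experiment ID or its /archive URI, depending on setup.
    for exp in experiments:
        launch_url = (f"{XNAT}/xapi/projects/MYPROJECT/commands/42"
                      "/wrappers/my-wrapper/root/session/launch")
        s.post(launch_url, json={"session": exp["ID"]}).raise_for_status()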

Containers do not have direct access to the database by default. You could enable that by passing credentials to the container (or using some sort of secret store like Vault or Keycloak), but it's generally not a great idea: beyond the security concerns, there's a non-trivial risk of doing something to the database that confuses XNAT or breaks the database altogether. If there are operations you really require, I'd recommend writing a plugin with its own API endpoints that query and update XNAT through the internal API, and having the scripts/processing in your container use those instead. Note also that, depending on where the containers are running, they may not even be able to reach the database: security restrictions on PostgreSQL usually whitelist the IP addresses that can connect remotely, which would completely break containers running under Docker Swarm or Kubernetes.

This approach would cause no mounting or storage space issues beyond the standard ones you can run into with the container service, simply because this is how the container service always works (i.e. through REST calls). The input and output mounts are usually parameters of the container launch, so if it works when run from the UI, it will work when run from your script. If your container generates a great deal of data (e.g. Freesurfer can produce many GBs of output from a few MBs of input data), you'll need a correspondingly large amount of storage to hold it.
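
For reference, the mounts in a command definition look something like this (a minimal sketch; the mount names and container paths are illustrative, and the command's inputs and output handlers wire up to them):

"mounts": [
    {"name": "in",  "writable": false, "path": "/input"},
    {"name": "out", "writable": true,  "path": "/output"}
]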

One thing to be very aware of is storing session state with whatever tool you're using for scripting. XNAT's underlying application server is Tomcat, which has a default session limit of 1K. That means there can be at most 1,000 active sessions on the server at one time, after which new session requests are denied until an existing session is terminated or times out. A common mistake people make when scripting is doing something like this (illustrated here in Python; the host, credentials, and IDs are placeholders):
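
import requests

# Every call below authenticates from scratch with basic auth and throws
# away the JSESSIONID cookie, so each request creates a new server-side session.
experiment_ids = ["XNAT_E00001", "XNAT_E00002"]  # imagine 1,000 of these
for exp_id in experiment_ids:
    requests.get(
        f"https://xnat.example.org/data/experiments/{exp_id}",
        params={"format": "json"},
        auth=("user", "password"),
    )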


Each of those calls creates a new session, and each session persists for however long the session timeout is configured on the system. A script that works through 1,000 experiments without storing session state will make the server inaccessible to other users. How you store session state depends on the tool, e.g. for curl you can do:

curl --user admin:admin --cookie-jar cookies.txt --cookie cookies.txt http://server/data/projects?format=xml

Other tools like xnatpy or httpie have their own means of persisting session state, so make sure you know how to use that and use it!
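
In Python with requests, for instance, a Session object will hold on to the JSESSIONID cookie that XNAT returns on the first authenticated call, so only one server-side session is created (again with placeholder host and credentials):

import requests

with requests.Session() as s:
    s.auth = ("user", "password")
    for exp_id in ["XNAT_E00001", "XNAT_E00002"]:
        # The first response sets JSESSIONID; the Session re-sends the cookie
        # on every subsequent request, so the server-side session is reused.
        s.get(f"https://xnat.example.org/data/experiments/{exp_id}",
              params={"format": "json"})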

Rick Herrick 

Senior Software Developer

ri...@xnatworks.io

https://xnatworks.io | Find us on LinkedIn




Tanae Bousfiha

Oct 9, 2025, 4:26:31 PM
to xnat_discussion
Dear Mr. Herrick,
First of all, thank you very much for your excellent support.

I’ve developed an automated Docker container script that dynamically generates the Dockerfile (linked to an external user script), creates a generic command.json for all cases, uploads everything to XNAT, and enables it via the REST API.

I selected the Project context and extracted all files within the project (including scans, subjects, and resources).
After running the workflow, the container launched successfully and reached the “Complete” status.

However, when checking the stdout, I found the message:

no input files

Interestingly, under the Container Information all the extracted files were correctly listed, yet none of them actually reached the container.

This makes me suspect a mounting issue. Since the container completed successfully (rather than failing with a "Failed upload file" error), I am unsure how to interpret this behavior. If it had failed during upload, I would have assumed the problem was with the output definition; here, however, the input seems to be the issue.

Could you please help me understand how this situation could occur?

Thank you very much for your time and assistance.

Best regards,
Bousfiha

import json

import requests


def launch_container_with_all_files(xnat_host, project_id, command_id, wrapper_name,
                                    xnat_user, xnat_password, files):

    if not files:
        print("No files provided.")
        return

    # Keep only files with valid names (is_valid_filename is defined
    # elsewhere in the script) and join the names into one string.
    valid_files = [f for f in files if is_valid_filename(f["Name"])]
    input_file_names = [f["Name"] for f in valid_files]
    input_files_str = " ".join(input_file_names)

    payload = {
        "project": project_id,
        "input_files": input_files_str
    }

    # Container service launch endpoint for a project-level wrapper.
    url = f"{xnat_host}/xapi/projects/{project_id}/commands/{command_id}/wrappers/{wrapper_name}/root/project/launch"

    print("Launching container for all files in the project.")
    print("Files:", input_file_names)
    print("Launch URL:", url)
    print("Payload:", json.dumps(payload, indent=2))

    headers = {"Content-Type": "application/json"}
    response = requests.post(
        url, auth=(xnat_user, xnat_password),
        headers=headers, json=payload, verify=False  # verify=False disables TLS verification
    )

    print("Status:", response.status_code)
    print("Response:", response.text)
    if response.status_code in [200, 201, 202]:
        print("Container successfully launched with ALL files!")
    else:
        print("Error launching container:", response.status_code, response.text)

John Flavin

Oct 9, 2025, 5:11:50 PM
to xnat_di...@googlegroups.com
It's hard to say what exactly is happening without knowing all the details about what your container is doing, how you're handling the inputs in the command, how you've configured the container service and XNAT paths, and so forth. Lots of things might be happening.

But all that aside, my first guess is that this is a Path Translation issue: if the archive path as XNAT sees it differs from the path on the Docker host, the container service needs a path translation configured so that the mounts resolve correctly.

Can you check the Path Translation page in the Container Service documentation and see if that seems like it might solve your issue? If it doesn't seem correct to you, let us know and we can begin the work of figuring out what is happening.

John Flavin

Tanae Bousfiha

Oct 11, 2025, 1:37:08 PM
to xnat_discussion

Dear Mr. Flavin,

I have checked the XNAT Path Prefix (xnat/data/) and my Server Path Prefix (xnat/data), and they seem to match so far.

Within the experiment, I selected only one context in the JSON command: xnat:projectData.

"contexts": ["xnat:projectData"],
                "external-inputs": [
                    {
                        "name": "project",
                        "type": "Project",
                        "required": True,
                        "load-children": True
                    }
                ],
                "output-handlers": [
                    {
                        "name": "output",
                        "accepts-command-output": "result_file",
                        "as-a-child-of": "project",
                        "type": "Resource",
                        "label": "Results"
                    }

I couldn't select more than one output destination since only one context was defined; otherwise, the command would be rejected.

My goal with this setup is to have one container running at the xnat:projectData level that can still access and extract all files (using REST API calls) within the project, including those at the subject, session (MR/PET/CT), and scan levels.

I assume the container wasn’t able to receive any files because only one context (xnat:projectData) was defined in the JSON command. However, if I define multiple contexts, it seems I would need to launch multiple containers instead of just one.

Is my understanding correct?

Best regards,

Bousfiha