Help with XNAT container outputs on NFS when the compute backend runs Docker remotely


Carmen Giugliano

Nov 19, 2025, 8:48:41 AM
to xnat_discussion

Hello everyone,

Here’s my setup:

  • XNAT: 1.9.2.1

  • Container Service: 3.7.2

  • OS: AlmaLinux

  • Installation: via Docker 

I've connected the XNAT compute backend to a Docker daemon running on a different machine from the XNAT host. Jobs run correctly inside containers, and output files are generated in the staging area (build/<job-id>), but XNAT fails to copy them to the final resource folder.
Here's what I observe:
  • the container runs as UID/GID 1002:1002 (corresponding to xnat:xnat)
  • files inside the container's /output folder are owned by 1002:1002 and have drwxr-x--- permissions
  • the staging folder on the NFS share is drwxrwx---, owned by 1002:1002
  • on the host, ls -l shows the same permissions and ownership
It seems that the XNAT service on the host cannot access the staging folder.
What should I do?
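One way to rule out a plain POSIX permission problem is to model the mode check the kernel applies when the XNAT/Tomcat process tries to enter and read the staging directory. Below is a minimal sketch of that check (it ignores ACLs, supplementary groups, and NFS root squashing, and `can_access` is a hypothetical helper, not anything from XNAT):

```python
def can_access(mode: int, dir_uid: int, dir_gid: int, uid: int, gid: int) -> bool:
    """Return True if a process with (uid, gid) can enter and list a
    directory with the given mode bits and ownership.
    Simplified POSIX check: ACLs and supplementary groups are ignored."""
    if uid == dir_uid:
        shift = 6          # owner permission bits
    elif gid == dir_gid:
        shift = 3          # group permission bits
    else:
        shift = 0          # "other" permission bits
    bits = (mode >> shift) & 0o7
    # a directory needs both read (4) and execute/search (1) to be listed
    return bits & 0o5 == 0o5

# drwxrwx--- 1002:1002, as reported for the staging folder on the NFS share
mode = 0o770
print(can_access(mode, 1002, 1002, 1002, 1002))  # True: xnat:xnat can read it
print(can_access(mode, 1002, 1002, 990, 990))    # False: any other uid/gid cannot
```

If the Tomcat process on the XNAT host really runs as 1002:1002, these bits are sufficient; if it runs as some other uid (or as root with NFS `root_squash` in effect), the "other" bits of `---` would explain the failure.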
Thanks in advance,
Best.

More details follow:

Here's my path translation:
XNAT path prefix: /data/xnat
Server path prefix: /mnt/xnat_shared
Container user: 1002:1002
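With these two prefixes, paths under the XNAT archive/build tree are rewritten onto the remote Docker host's mount point before the bind mount is created (the log below shows exactly this: xnatHostPath under /data/xnat, containerHostPath under /mnt/xnat_shared). A rough sketch of that prefix substitution, not the actual Container Service code:

```python
# Prefixes from the path-translation settings above
XNAT_PREFIX = "/data/xnat"
SERVER_PREFIX = "/mnt/xnat_shared"

def to_container_host(xnat_path: str) -> str:
    """Rewrite a path as seen by XNAT into the equivalent path as seen
    by the remote Docker daemon (simple prefix substitution)."""
    if not xnat_path.startswith(XNAT_PREFIX):
        raise ValueError(f"{xnat_path} is not under {XNAT_PREFIX}")
    return SERVER_PREFIX + xnat_path[len(XNAT_PREFIX):]

job_id = "3eac0fb3-1082-4e28-bca0-d05d4de5f2c8"
print(to_container_host(f"/data/xnat/build/{job_id}"))
# -> /mnt/xnat_shared/build/3eac0fb3-1082-4e28-bca0-d05d4de5f2c8
```

Both sides of the translation must point at the same NFS export, since XNAT reads the outputs back through its own prefix after the container writes them through the server prefix.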

Here's my command JSON:
{
  "name": "ciao-gpu-writer",
  "label": "Write ciao.txt to Project from GPU",
  "description": "Runs ciao-image and saves results to a Project Resource.",
  "version": "1.0",
  "schema-version": "1.0",
  "image": "my-xnat-app:latest",
  "type": "docker",
  "command-line": "pwd && python /app/hello_from_gpu.py",
  "override-entrypoint": true,
  "mounts": [
    {
      "name": "out",
      "writable": true,
      "path": "/output"
    }
  ],
  "environment-variables": {},
  "ports": {},
  "inputs": [],
  "outputs": [
    {
      "name": "txt_out",
      "description": "All text files written to /output",
      "required": true,
      "mount": "out",
      "path": "",
      "glob": "*.txt"
    }
  ],
  "xnat": [
    {
      "name": "project",
      "label": "Ciao Writer Project Launch",
      "description": "Lancia il container su un Progetto",
      "contexts": [
        "xnat:projectData"
      ],
      "external-inputs": [
        {
          "name": "project",
          "description": "Target project",
          "type": "Project",
          "required": true,
          "load-children": true
        }
      ],
      "derived-inputs": [],
      "output-handlers": [
        {
          "name": "save_to_project_resource",
          "accepts-command-output": "txt_out",
          "as-a-child-of": "project",
          "type": "Resource",
          "label": "ciao-output-resource",
          "tags": []
        }
      ]
    }
  ],
  "container-labels": {},
  "generic-resources": {},
  "ulimits": {},
  "secrets": [],
  "visibility": "public"
}
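The `txt_out` output plus the `save_to_project_resource` handler mean that, after the container exits, the Container Service applies the output's `glob` inside the mount's staging directory (as seen from the XNAT host) and uploads the matches to the project Resource. A hypothetical sketch of that collection step, using an assumed helper `collect_outputs` (not actual Container Service code):

```python
import glob
import os
import tempfile

def collect_outputs(staging_dir: str, pattern: str) -> list:
    """Apply an output's glob inside the mount's staging directory and
    return the files that would be uploaded to the Resource."""
    return sorted(glob.glob(os.path.join(staging_dir, pattern)))

# demo against a throwaway staging directory
with tempfile.TemporaryDirectory() as staging:
    open(os.path.join(staging, "ciao.txt"), "w").close()
    open(os.path.join(staging, "job.log"), "w").close()
    print([os.path.basename(p) for p in collect_outputs(staging, "*.txt")])
    # -> ['ciao.txt']
```

Since `"path"` is empty, the glob is applied at the root of the mount, so a `ciao.txt` written directly into /output should match `*.txt`.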

Here's my Dockerfile:
FROM python:3.12-slim

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

WORKDIR /app

RUN mkdir -p /output && chown 1002:1002 /output

COPY --chown=1002:1002 hello_from_gpu.py /app/hello_from_gpu.py

USER 1002:1002
ENTRYPOINT ["sh", "-c", "chmod 775 /output && chown 1002:1002 /output && exec python /app/hello_from_gpu.py"]
# note: with the "sh -c" ENTRYPOINT above, this CMD is effectively ignored
CMD ["python", "/app/hello_from_gpu.py"]

Here's my Python script:
import os
import stat
import sys
import pwd
import grp


try:
    current_uid = os.getuid()
    current_gid = os.getgid()

    try:
        username = pwd.getpwuid(current_uid).pw_name
    except KeyError:
        username = f"UID {current_uid}"

    try:
        groupname = grp.getgrgid(current_gid).gr_name
    except KeyError:
        groupname = f"GID {current_gid}"

    print(f"[DEBUG] Process running as: {username} (UID: {current_uid}) / {groupname} (GID: {current_gid})")

    output_stat = os.stat('/output')

    print(f"[DEBUG] /output exists: {os.path.exists('/output')}")
    print(f"[DEBUG] /output writable: {os.access('/output', os.W_OK)}")
    print(f"[DEBUG] /output permissions: {stat.filemode(output_stat.st_mode)}")
    print(f"[DEBUG] /output owned by: UID {output_stat.st_uid} / GID {output_stat.st_gid}")

    out_dir = "/output"
    os.makedirs(out_dir, exist_ok=True)

    # Write the output file
    path = f"{out_dir}/ciao.txt"
    with open(path, "w") as f:
        f.write("HELLO FROM GPU \n")

except Exception as e:
    print(f"[ERROR] An error occurred: {e}", file=sys.stderr)

Here's my log (stdout):
/app
[DEBUG] Process running as: UID 1002 (UID: 1002) / GID 1002 (GID: 1002)
[DEBUG] /output exists: True
[DEBUG] /output writable: True
[DEBUG] /output permissions: drwxr-x---
[DEBUG] /output owned by: UID 1002 / GID 1002

John Flavin

Nov 19, 2025, 5:22:32 PM
to xnat_di...@googlegroups.com
My first thought on reading the description was that the file permissions don't match. However, you have checked all the permissions, and everything looks like I would expect it to look. I don't see any reason it wouldn't work, and I don't have any other immediate thoughts based on what you've said, so we'll need some more information.

What do you mean that XNAT fails to copy them to the resource? Is there an error message? Are there any relevant entries in the XNAT or container service logs around the time the container finishes and CS tries to finalize it?

John Flavin


Carmen Giugliano

Nov 21, 2025, 11:31:22 AM
to xnat_discussion
Dear John,

Thanks a lot for your prompt reply.


> What do you mean that XNAT fails to copy them to the resource?

I mean that nothing appears in Manage Files, as you can see from the screenshots below:
[Screenshot 2025-11-21 alle 11.46.09.png] [Screenshot 2025-11-21 alle 11.46.23.png]


> Is there an error message? Are there any relevant entries in the XNAT or container service logs around the time the container finishes and CS tries to finalize it?


Also, I want to add that in xnat-web's docker-compose.yml I've added:
    environment:
      - XNAT_DATASERVER_UMASK=000       
      - XNAT_DATASERVER_DIRECTORY_PERMS=0777 


Below is my containers.log:
2025-11-21 11:59:22,123 [http-nio-8080-exec-34] DEBUG org.nrg.containers.rest.LaunchRestApi - Creating launch UI.
2025-11-21 11:59:22,125 [http-nio-8080-exec-34] DEBUG org.nrg.containers.model.command.auto.LaunchUi - ROOT project - Populating input relationship tree.
2025-11-21 11:59:22,125 [http-nio-8080-exec-34] DEBUG org.nrg.containers.model.command.auto.LaunchUi - ROOT project - Populating input value tree.
2025-11-21 11:59:23,574 [http-nio-8080-exec-19] INFO  org.nrg.containers.rest.LaunchRestApi - Launch requested for wrapper id 13
2025-11-21 11:59:24,156 [http-nio-8080-exec-19] DEBUG org.nrg.containers.services.impl.ContainerServiceImpl - Created workflow 435.
2025-11-21 11:59:24,156 [http-nio-8080-exec-19] DEBUG org.nrg.containers.services.impl.ContainerServiceImpl - Adding to staging queue: count [not computed], project newstorage, wrapperId 13, commandId 0, wrapperName null, inputValues {project=/archive/projects/newstorage}, username admin, workflowId 435
2025-11-21 11:59:24,306 [stagingQueueListener-43] DEBUG org.nrg.containers.jms.listeners.ContainerStagingRequestListener - Consuming staging queue: count [not computed], project newstorage, wrapperId 13, commandId 0, wrapperName null, inputValues {project=/archive/projects/newstorage}, username admin, workflowId 435
2025-11-21 11:59:24,306 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.ContainerServiceImpl - consumeResolveCommandAndLaunchContainer wfid 435
2025-11-21 11:59:24,377 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.ContainerServiceImpl - Configuring command for wfid 435
2025-11-21 11:59:24,392 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.ContainerServiceImpl - Resolving command for wfid 435
2025-11-21 11:59:24,401 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.ContainerServiceImpl - Launching command for wfid 435
2025-11-21 11:59:24,401 [stagingQueueListener-43] INFO  org.nrg.containers.services.impl.ContainerServiceImpl - Preparing to launch resolved command.
2025-11-21 11:59:24,436 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.ContainerServiceImpl - Checking input values to find root XNAT input object.
2025-11-21 11:59:24,436 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.ContainerServiceImpl - Input "project".
2025-11-21 11:59:24,436 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.ContainerServiceImpl - Getting input value as XFTItem.
2025-11-21 11:59:24,436 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.ContainerServiceImpl - Found a valid root XNAT input object: project.
2025-11-21 11:59:24,436 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.ContainerServiceImpl - Update workflow for Wrapper project - Command ciao-gpu-writer - Image my-xnat-app:latest.
2025-11-21 11:59:24,477 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.ContainerServiceImpl - Updated workflow 435.
2025-11-21 11:59:24,477 [stagingQueueListener-43] INFO  org.nrg.containers.services.impl.ContainerServiceImpl - Creating container from resolved command.
2025-11-21 11:59:24,479 [stagingQueueListener-43] DEBUG org.nrg.containers.api.DockerControlApi - Creating container:
server docker_gpu tcp://gctd-gpu.epiccloud:2376
image my-xnat-app:latest
command "pwd && python /app/hello_from_gpu.py"
working directory "null"
containerUser "0:0"
volumes [/mnt/xnat_shared/build/3eac0fb3-1082-4e28-bca0-d05d4de5f2c8:/output]
environment variables [XNAT_USER=e62d8ahgcjfxjtrzsaje, XNAT_EVENT_ID=435, XNAT_WORKFLOW_ID=435, XNAT_HOST=XXXXX, XNAT_PASS=XXXXXX]
exposed ports: {}
2025-11-21 11:59:25,056 [stagingQueueListener-43] INFO  org.nrg.containers.services.impl.ContainerServiceImpl - Recording container launch.
2025-11-21 11:59:25,056 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.ContainerServiceImpl - Updating workflow for Container b8b1a87188c270468309e9d7e199a0a0888c342d6a6d8f3f49c39b5c2ec3a4d2
2025-11-21 11:59:26,029 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.ContainerServiceImpl - Updated workflow 435.
2025-11-21 11:59:26,038 [stagingQueueListener-43] INFO  org.nrg.containers.services.impl.HibernateContainerEntityService - Adding new history item to container entity 224
2025-11-21 11:59:26,040 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.HibernateContainerEntityService - Acquiring lock for the container 224
2025-11-21 11:59:26,040 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.HibernateContainerEntityService - Acquired lock for the container 224
2025-11-21 11:59:26,040 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.HibernateContainerEntityService - Setting container entity 224 status to "Created", based on history entry status "Created".
2025-11-21 11:59:26,040 [stagingQueueListener-43] DEBUG org.nrg.containers.utils.ContainerUtils - Updating status of workflow 435.
2025-11-21 11:59:26,044 [stagingQueueListener-43] DEBUG org.nrg.containers.utils.ContainerUtils - Found workflow 435.
2025-11-21 11:59:26,044 [stagingQueueListener-43] INFO  org.nrg.containers.utils.ContainerUtils - Updating workflow 435 pipeline "project" from "Queued" to "Created" (details: ).
2025-11-21 11:59:26,334 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.HibernateContainerEntityService - Releasing lock for the container 224
2025-11-21 11:59:26,363 [stagingQueueListener-43] INFO  org.nrg.containers.services.impl.ContainerServiceImpl - Starting container.
2025-11-21 11:59:26,364 [stagingQueueListener-43] INFO  org.nrg.containers.api.DockerControlApi - Starting container b8b1a87188c270468309e9d7e199a0a0888c342d6a6d8f3f49c39b5c2ec3a4d2
2025-11-21 11:59:27,002 [stagingQueueListener-43] INFO  org.nrg.containers.services.impl.ContainerServiceImpl - Launched command for wfid 435: command 7, wrapper 13 project. Produced container 224.
2025-11-21 11:59:27,003 [stagingQueueListener-43] DEBUG org.nrg.containers.services.impl.ContainerServiceImpl - Container for wfid 435: Container{databaseId=224, commandId=7, status=Created, statusTime=Fri Nov 21 11:59:26 CET 2025, wrapperId=13, containerId=b8b1a87188c270468309e9d7e199a0a0888c342d6a6d8f3f49c39b5c2ec3a4d2, workflowId=435, userId=admin, project=newstorage, backend=docker, serviceId=null, taskId=null, nodeId=null, dockerImage=my-xnat-app:latest, containerName=null, commandLine=pwd && python /app/hello_from_gpu.py, overrideEntrypoint=true, workingDirectory=null, subtype=docker, parent=null, parentSourceObjectName=null, environmentVariables={XNAT_USER=e62d8a0e-b185-401e-a13b-d515bfeef45e, XNAT_EVENT_ID=435, XNAT_WORKFLOW_ID=435, XNAT_HOST=XXXX:443, XNAT_PASS=XXXXXX}, ports={}, mounts=[ContainerMount{databaseId=251, name=out, writable=true, xnatHostPath=/data/xnat/build/3eac0fb3-1082-4e28-bca0-d05d4de5f2c8, containerHostPath=/mnt/xnat_shared/build/3eac0fb3-1082-4e28-bca0-d05d4de5f2c8, containerPath=/output, mountPvcName=null, inputFiles=[]}], inputs=[ContainerInput{databaseId=668, type=RAW, name=project, value=/archive/projects/newstorage, sensitive=false}, ContainerInput{databaseId=669, type=WRAPPER_EXTERNAL, name=project, value=/archive/projects/newstorage, sensitive=false}], outputs=[ContainerOutput{databaseId=219, name=txt_out:save_to_project_resource, fromCommandOutput=txt_out, fromOutputHandler=save_to_project_resource, type=Resource, required=true, mount=out, path=, glob=*.txt, label=ciao-output-resourcecarmen, format=null, description=null, content=null, tags=[], created=null, handledBy=project, viaWrapupContainer=null}], history=[ContainerHistory{databaseId=1305, status=Created, entityType=user, entityId=admin, timeRecorded=Fri Nov 21 11:59:26 CET 2025, externalTimestamp=null, message=null, exitCode=null}], logPaths=[], reserveMemory=null, limitMemory=null, limitCpu=null, swarmConstraints=null, runtime=null, ipcMode=null, autoRemove=false, 
shmSize=null, network=null, containerLabels={XNAT_USER_EMAIL=txxx.it, XNAT_PROJECT=newstorage, XNAT_ID=newstorage, XNAT_USER_ID=admin, XNAT_DATATYPE=Project}, gpus=null, genericResources=null, ulimits=null, secrets=[]}
... (repeating in a loop)
2025-11-21 12:02:24,948 [docker-java-stream--2126463612] DEBUG org.nrg.containers.api.DockerControlApi - Received event: Event(status=create, id=b8b1a87188c270468309e9d7e199a0a0888c342d6a6d8f3f49c39b5c2ec3a4d2, from=my-xnat-app:latest, node=null, type=CONTAINER, action=create, actor=EventActor(id=b8b1a87188c270468309e9d7e199a0a0888c342d6a6d8f3f49c39b5c2ec3a4d2, attributes={XNAT_DATATYPE=Project, XNAT_ID=newstorage, XNAT_PROJECT=newstorage, XNAT_USER_EMAIL=XXXt, XNAT_USER_ID=admin, image=my-xnat-app:latest, name=distracted_greider}), time=1763722598, timeNano=1763722598657450172)
2025-11-21 12:02:24,948 [docker-java-stream--2126463612] DEBUG org.nrg.containers.api.DockerControlApi - Received event: Event(status=start, id=b8b1a87188c270468309e9d7e199a0a0888c342d6a6d8f3f49c39b5c2ec3a4d2, from=my-xnat-app:latest, node=null, type=CONTAINER, action=start, actor=EventActor(id=b8b1a87188c270468309e9d7e199a0a0888c342d6a6d8f3f49c39b5c2ec3a4d2, attributes={XNAT_DATATYPE=Project, XNAT_ID=newstorage, XNAT_PROJECT=newstorage, XNAT_USER_EMAIL=XXXXX, XNAT_USER_ID=admin, image=my-xnat-app:latest, name=distracted_greider}), time=1763722600, timeNano=1763722600604767564)
2025-11-21 12:02:24,948 [docker-java-stream--2126463612] DEBUG org.nrg.containers.api.DockerControlApi - Received event: Event(status=die, id=b8b1a87188c270468309e9d7e199a0a0888c342d6a6d8f3f49c39b5c2ec3a4d2, from=my-xnat-app:latest, node=null, type=CONTAINER, action=die, actor=EventActor(id=b8b1a87188c270468309e9d7e199a0a0888c342d6a6d8f3f49c39b5c2ec3a4d2, attributes={XNAT_DATATYPE=Project, XNAT_ID=newstorage, XNAT_PROJECT=newstorage, XNAT_USER_EMAIL=XXXXX, XNAT_USER_ID=admin, execDuration=0, exitCode=0, image=my-xnat-app:latest, name=distracted_greider}), time=1763722601, timeNano=1763722601506361086)


again, thanks for any advice, 
best, 
Carmen 