I don't know if anyone else has hit this, but our nnInteractive setup stopped working today even though nothing changed on our end. It looks like a Hugging Face repo redirect broke session creation. I was able to resolve it with an AI coding assistant (Cursor). Sharing in case others hit the same wall.
Please note that I don't do Python setup/development/deployment/etc. The summary and resolution of the issue written below were provided by Cursor. I hope it will be useful for someone who knows the internal workings of itksnap_dls.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
This setup was working fine for us until today. Same server, same workflow (ITK-SNAP on Windows → SSH tunnel → itksnap_dls on Linux GPU). Session creation started failing with no changes to our scripts or ITK-SNAP config.
Setup
- Remote mode: ITK-SNAP on Windows → SSH tunnel → itksnap_dls on Linux GPU server (RTX A6000)
- Package: itksnap-dls v0.0.10
- Symptom: /status returns 200, but ITK-SNAP shows "Error creating session on DLS server: Internal Server Error"
- Server log: GET /start_session → 500, with JSONDecodeError in huggingface_hub → api.repo_info() → r.json()
What we found
On every session init, SegmentSession in segment.py calls:
hf.snapshot_download(
repo_id="nnInteractive/nnInteractive",
allow_patterns=["nnInteractive_v1.0/*"],
local_dir=config.hf_models_path)
So the server contacts Hugging Face’s model registry to look up/download nnInteractive — even when the model is already on disk. It is not uploading patient imaging data; it’s a repo metadata check (and download only if files aren’t cached).
Our best guess for why it broke today without any local changes:
- Outdated repo ID: nnInteractive/nnInteractive now redirects to MIC-DKFZ/nnInteractive. The Python client gets a non-JSON response and crashes with JSONDecodeError.
- No true offline path: HF_HUB_OFFLINE=1 alone didn’t help because config_hf_backend() swaps in a plain httpx.Client that still hits the API when the network works. JSONDecodeError isn’t caught by snapshot_download’s fallback logic (unlike ConnectError).
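To illustrate the second point, here is a simplified, self-contained sketch of the failure pattern. It is not huggingface_hub's actual code: `fetch_repo_info` and `download_with_fallback` are hypothetical stand-ins, and Python's built-in `ConnectionError` stands in for httpx's ConnectError. It only shows why a fallback that catches connection errors still lets a JSON parse error through.

```python
import json


def fetch_repo_info(mode: str) -> dict:
    """Hypothetical stand-in for the metadata request made on session init."""
    if mode == "unreachable":
        # The endpoint cannot be reached at all (e.g. it points at a closed port).
        raise ConnectionError("connection refused")
    if mode == "redirect":
        # The server answered, but with a non-JSON body, so parsing fails.
        return json.loads("<html>Moved</html>")
    return {"id": "nnInteractive"}


def download_with_fallback(mode: str) -> str:
    """Fall back to local files only when the request fails to connect."""
    try:
        fetch_repo_info(mode)
        return "online"
    except ConnectionError:
        return "local files"


# A connection failure is caught and triggers the local fallback.
print(download_with_fallback("unreachable"))

# A parse failure is a different exception type and escapes the fallback.
try:
    download_with_fallback("redirect")
except json.JSONDecodeError as exc:
    print("uncaught:", type(exc).__name__)
```

If the real code path behaves like this sketch, it would explain why making the endpoint unreachable works where HF_HUB_OFFLINE=1 alone did not.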
We also found broken symlinks in our local HF cache snapshot folder; fixing those alone wasn’t enough while the API call still ran.
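For anyone who wants to check their own cache for the same problem, a minimal sketch of a broken-symlink scan follows. The function name `find_broken_symlinks` is ours, and the folder to scan is site-specific (whatever your models path or HF cache directory is); nothing here is part of itksnap_dls.

```python
import os
from pathlib import Path


def find_broken_symlinks(root: str) -> list[Path]:
    """Return every symlink under `root` whose target does not exist."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Dangling links show up in `filenames`; links to folders in `dirnames`.
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            # is_symlink() inspects the link itself; exists() follows it.
            if path.is_symlink() and not path.exists():
                broken.append(path)
    return sorted(broken)
```

Calling it on the snapshot folder and printing `os.readlink(link)` for each result shows where each dangling link was supposed to point.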
Workaround (verified)
Force the HF API to fail so snapshot_download uses local files only:
env HF_ENDPOINT=http://127.0.0.1:1 HF_HUB_OFFLINE=1 TRANSFORMERS_OFFLINE=1 \
python -m itksnap_dls --host 0.0.0.0 --port 8911 --device cuda
Success = terminal shows nnInteractive session initialized in X.XX seconds, then /start_session returns 200.
Suggested fixes (for maintainers / docs)
- Update NNINTERACTIVE_REPO_ID to MIC-DKFZ/nnInteractive
- Use local_files_only=True (or skip snapshot_download) when --models-path or cache already has nnInteractive_v1.0
- Don’t replace the HF HTTP client in a way that disables offline mode
- Document offline deployment in the quick start — e.g. pre-download models, --models-path, and env vars for sites that block outbound traffic
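As a rough sketch of the first two suggestions combined, the session init could look for the model on disk first and contact Hugging Face only when it is missing. This is an illustration, not the itksnap_dls implementation: `ensure_model`, `REPO_ID` and `MODEL_SUBDIR` are hypothetical names, and treating a non-empty folder as "already downloaded" is an assumption that does not verify the files are complete.

```python
from pathlib import Path

REPO_ID = "MIC-DKFZ/nnInteractive"   # redirect target reported above
MODEL_SUBDIR = "nnInteractive_v1.0"  # folder matched by allow_patterns


def ensure_model(models_path: str) -> Path:
    """Return the model folder, contacting Hugging Face only if it is missing."""
    model_dir = Path(models_path) / MODEL_SUBDIR
    # Assumption: a non-empty folder means the model was already downloaded.
    if model_dir.is_dir() and any(model_dir.iterdir()):
        return model_dir

    # Imported lazily so the local-only path never touches huggingface_hub.
    from huggingface_hub import snapshot_download

    snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=[f"{MODEL_SUBDIR}/*"],
        local_dir=models_path,
    )
    return model_dir
```

With a check like this, a site that pre-downloads the model never makes the metadata request, so a redirect or an outbound firewall would not affect session creation.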