nnInteractive and HuggingFace

mfk16

Jun 25, 2026, 6:36:29 PM
to itksnap-users
Hi,

I don't know if anyone else has hit this, but our nnInteractive setup stopped working today even though nothing changed on our end. It looks like a Hugging Face repo redirect broke session creation. I was able to resolve it with an AI coding assistant (Cursor). Sharing in case others hit the same wall.

Please note that I don't do Python setup, development, or deployment. The summary and resolution below were written by Cursor. I hope it will be useful for someone who knows the internal workings of itksnap_dls.

Thanks,

Matt

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

This setup was working fine for us until today. Same server, same workflow (ITK-SNAP on Windows → SSH tunnel → itksnap_dls on Linux GPU). Session creation started failing with no changes to our scripts or ITK-SNAP config.

Setup
  • Remote mode: ITK-SNAP on Windows → SSH tunnel → itksnap_dls on Linux GPU server (RTX A6000)
  • Package: itksnap-dls v0.0.10
  • Symptom: /status returns 200, but ITK-SNAP shows "Error creating session on DLS server: Internal Server Error"
  • Server log: GET /start_session → 500, with JSONDecodeError in huggingface_hub → api.repo_info() → r.json()

What we found
On every session init, SegmentSession in segment.py calls:
hf.snapshot_download(
    repo_id="nnInteractive/nnInteractive",
    allow_patterns=["nnInteractive_v1.0/*"],
    local_dir=config.hf_models_path)

So the server contacts Hugging Face’s model registry to look up (and, if needed, download) nnInteractive, even when the model is already on disk. It is not uploading patient imaging data; it’s a repo metadata check, with a download only if files aren’t cached.

Our best guess for why it broke today without any local changes:
  1. Outdated repo ID: nnInteractive/nnInteractive now redirects to MIC-DKFZ/nnInteractive. The Python client gets a non-JSON response and crashes with JSONDecodeError.
  2. No true offline path: HF_HUB_OFFLINE=1 alone didn’t help because config_hf_backend() swaps in a plain httpx.Client that still hits the API when the network works. JSONDecodeError isn’t caught by snapshot_download’s fallback logic (unlike ConnectError).
We also found broken symlinks in our local HF cache snapshot folder; fixing those alone wasn’t enough while the API call still ran.
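
For anyone who wants to check their own cache for the same symlink problem, here is a minimal sketch using only the Python standard library. The function name find_broken_symlinks is our own, not part of itksnap_dls or huggingface_hub, and the directory to scan is whatever your models or cache path happens to be.

```python
import os
import sys
from pathlib import Path


def find_broken_symlinks(root):
    """Return paths under root that are symlinks whose targets do not exist."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            # is_symlink() inspects the link itself; exists() follows it,
            # so a link with a missing target is a symlink that "does not exist".
            if path.is_symlink() and not path.exists():
                broken.append(path)
    return sorted(broken)


if __name__ == "__main__":
    # Usage: python check_links.py /path/to/models-or-cache
    for link in find_broken_symlinks(sys.argv[1]):
        print(f"broken: {link} -> {os.readlink(link)}")
```

As noted above, repairing the links was not sufficient on its own in our case, so treat this as a diagnostic rather than a fix.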

Workaround (verified)
Force the HF API call to fail with a connection error (by pointing HF_ENDPOINT at an unreachable local port) so snapshot_download falls back to local files only:

env HF_ENDPOINT=http://127.0.0.1:1 HF_HUB_OFFLINE=1 TRANSFORMERS_OFFLINE=1 \
  python -m itksnap_dls --host 0.0.0.0 --port 8911 --device cuda

Success = terminal shows nnInteractive session initialized in X.XX seconds, then /start_session returns 200.

Suggested fixes (for maintainers / docs)
  1. Update NNINTERACTIVE_REPO_ID to MIC-DKFZ/nnInteractive
  2. Use local_files_only=True (or skip snapshot_download) when --models-path or cache already has nnInteractive_v1.0
  3. Don’t replace the HF HTTP client in a way that disables offline mode
  4. Document offline deployment in the quick start — e.g. pre-download models, --models-path, and env vars for sites that block outbound traffic
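
Suggested fixes 1 and 2 could look roughly like the sketch below. This is not the actual itksnap_dls code: model_is_cached and fetch_model are hypothetical helpers, the "at least one readable file" check is a deliberately crude stand-in for a real completeness check, and how snapshot_download behaves with local_dir plus local_files_only=True may depend on the installed huggingface_hub version.

```python
from pathlib import Path

REPO_ID = "MIC-DKFZ/nnInteractive"   # repo ID from suggested fix 1
MODEL_SUBDIR = "nnInteractive_v1.0"  # folder matched by allow_patterns


def model_is_cached(models_path):
    """True if the model folder exists and holds at least one readable file.

    Assumption: any regular file counts as "cached". Broken symlinks are
    ignored because is_file() follows links and returns False for them.
    """
    model_dir = Path(models_path) / MODEL_SUBDIR
    return model_dir.is_dir() and any(p.is_file() for p in model_dir.rglob("*"))


def fetch_model(models_path):
    """Resolve the model from disk when present, otherwise download it."""
    # Imported lazily so the cache check above works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=[f"{MODEL_SUBDIR}/*"],
        local_dir=models_path,
        # Skip the Hub metadata request entirely when files are already local.
        local_files_only=model_is_cached(models_path),
    )
```

The point of the design is that a site with pre-downloaded models never issues the repo metadata request, so a redirect or outage on the Hub side cannot break session creation.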