About environment setup


Kris

Sep 4, 2025, 4:49:35 AM
to Challenge on Learned Image Compression (CLIC)

Dear Organizers,

Sorry for bothering you again, and thank you for your kind support.

During the validation-stage submission, I noticed that the decoder model’s intermediate outputs became unexpectedly large, which did not occur in my local Docker container built from the same source. This led to very low PSNR scores after evaluation. I have carefully checked the submitted weight loading and the input bitstream, and both are correct. PyTorch is also configured to run fully deterministically.
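One way to confirm that the weights and bitstreams are byte-identical in both environments is to compare checksums. The following is only a minimal sketch using the Python standard library. The helper names `sha256_of_file` and `checksum_manifest` are hypothetical and are not part of any CLIC tooling.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_manifest(paths):
    """Map each path (as a string) to its SHA-256 digest (hypothetical helper)."""
    return {str(p): sha256_of_file(p) for p in paths}
```

Printing the manifest for the weight files and bitstreams locally and again inside the submission environment would show whether any file was altered or replaced along the way. If the digests match, the discrepancy is more likely to come from the runtime environment than from the inputs.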

In addition, I observed that some cache files present in the uploaded Docker image, such as public pretrained weights from HuggingFace, were not found in the final submission environment. This is not a big issue, because I can still upload them with the decoder file, but it raises my concern about how the submission image was built and whether differences in the environment setup, such as CUDA/cuDNN versions or NCCL settings, might explain the behavior.

For reference, my local inference environment uses an A6000 GPU, with two GPUs initialized but only one utilized. The software setup includes PyTorch built with CUDA 12.8, cuDNN version 91002, and NCCL version 2.27.3, all installed via pip on a clean base image. The Docker container was started with the following command:

docker run --gpus '"device=2,3"' --cpuset-cpus="20-23" --dns 8.8.8.8 -d -p 2038:22 \
  --hostname server01 --name kris23_compression_env_0831 \
  --shm-size 80G maplelab/mapl_clic:250831
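The version numbers listed above could be collected in the same way on both sides with a short script, so that the two environments can be compared field by field. This is a minimal sketch, assuming PyTorch may or may not be importable. The function name `collect_env_report` is hypothetical, and the set of fields is only a suggestion.

```python
import json
import platform


def collect_env_report():
    """Return a dict describing the inference environment (hypothetical helper)."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    try:
        import torch
    except ImportError:
        report["torch"] = None  # PyTorch is not installed here
        return report

    report["torch"] = torch.__version__
    report["cuda_build"] = torch.version.cuda  # CUDA version PyTorch was built with
    report["cudnn"] = torch.backends.cudnn.version()  # integer such as 91002, or None
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["gpus"] = [
            torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())
        ]
    try:
        # Not every build ships NCCL, so a failure here is reported, not raised.
        report["nccl"] = str(torch.cuda.nccl.version())
    except Exception:
        report["nccl"] = None
    return report


if __name__ == "__main__":
    print(json.dumps(collect_env_report(), indent=2))
```

Running the script in the local container and in the submission environment, then diffing the two JSON outputs, would make any mismatch in GPU model or CUDA, cuDNN, and NCCL versions visible at a glance.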

I sincerely apologize for requesting the Docker image late, which may have limited the time available for debugging and aligning with the submission environment.

I would also like to clarify that I am not participating in the challenge to compete for any prize, as my method is not practical in terms of decoding complexity. However, this challenge is very valuable for my future research, and I sincerely appreciate the opportunity and the platform you have provided.

It would be very helpful if you could share any information about how the submission environment is built, such as the base image, the CUDA/cuDNN setup, and whether any specific NCCL or CUDA settings are used, so that I can better reproduce and debug this issue locally.

Thank you very much for your time and support.

Best regards,
Kris 
