Question about Multimodal Semantic Segmentation Challenge


맹제모

Feb 14, 2026, 1:41:00 AM
to MaCVi Support

I am writing to clarify a few details regarding the Multimodal Semantic Segmentation Challenge. I have successfully downloaded the dataset from the provided link (https://lmi.fe.uni-lj.si/en/MULTIAQUA/) and have begun my preliminary analysis.

Based on my understanding of the dataset structure, I have the following questions regarding the evaluation and submission process:

  1. Label Mapping: Could you confirm whether the predicted masks in a submission should use the following integer values for each class?

    • 1: Static obstacle

    • 2: Dynamic obstacle

    • 3: Water

    • 4: Sky

  2. Submission Format: Is the correct submission procedure to generate inference masks with the same resolution as the provided annotation images, and then submit a compressed (.zip) folder containing these predicted mask images?

Thank you very much for your time and for organizing this interesting challenge. I look forward to your guidance.

Jon Muhovič

Feb 19, 2026, 12:36:56 PM
to MaCVi Support
Hello, and sorry for the delay in replying.

1. Yes, these values are exactly correct. Integer 0 also appears in the ground-truth (GT) labels, where it denotes ignored pixels (unknown regions or parts of the recording boat).
2. Given the recent leaderboard submissions, I guess you have figured this out, but yes: regardless of the resolution at which you perform training/inference, the final predictions should be the same size as the RGB images. The results can be encoded with colors or integers; the evaluation server should handle either (more details here). The uploaded zip should contain predictions for all the test set images.

Best regards and good luck,
Jon Muhovič
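
For anyone preparing a submission, the points above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the official tooling: the helper names (resize_nearest, check_mask, zip_predictions) are hypothetical, the masks are plain nested lists of integer class ids, predictions are assumed to use only values 1 to 4, and the archive is assumed to hold the mask files flat at its root. The file naming and archive layout the evaluation server expects should be taken from the challenge details. Encoding each mask as an image file is left to whichever image library you use.

```python
import zipfile
from pathlib import Path

# Class ids confirmed in the thread. 0 marks ignored pixels in the GT labels,
# so this sketch ASSUMES predictions should never contain it.
CLASS_IDS = {1: "static obstacle", 2: "dynamic obstacle", 3: "water", 4: "sky"}


def resize_nearest(mask, out_h, out_w):
    """Resize a 2D list of class ids with nearest-neighbour sampling.

    Nearest-neighbour copies existing ids, so no new (invalid) label
    values are created, unlike bilinear interpolation.
    """
    in_h, in_w = len(mask), len(mask[0])
    return [
        [mask[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def check_mask(mask, rgb_h, rgb_w):
    """Raise ValueError if the mask size or label values look wrong."""
    if len(mask) != rgb_h or any(len(row) != rgb_w for row in mask):
        raise ValueError("mask size differs from the RGB image size")
    bad = {value for row in mask for value in row} - set(CLASS_IDS)
    if bad:
        raise ValueError(f"unexpected label values: {sorted(bad)}")


def zip_predictions(pred_dir, zip_path, test_names):
    """Zip the mask files, checking that every test image has a prediction.

    test_names is a hypothetical list of expected mask file names.
    """
    pred_dir = Path(pred_dir)
    missing = [name for name in test_names if not (pred_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing predictions: {missing}")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in test_names:
            # Assumed layout: files stored flat at the archive root.
            archive.write(pred_dir / name, arcname=name)
```

A typical flow would be to run inference at any resolution, call resize_nearest to bring each mask back to the RGB image size, run check_mask, save the mask as an image, and finally call zip_predictions with the full list of test image names.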
