Clarification on PsyDefDetect Evaluation and Inclusion of Class 0

Eric Rudolph

Mar 9, 2026, 9:49:19 AM

to psydef...@googlegroups.com, Philipp Steigerwald

Dear organisers,

I have a question regarding the PsyDefDetect challenge evaluation.

The dataset contains 9 labels: seven hierarchical levels of defensive maturity and two auxiliary labels. On the leaderboard, the best reported result from the paper reaches an F1 score of 0.3148 and was achieved by fine-tuning Ministral-8B. However, the experimental setup section appears to indicate that the evaluation was performed only on the positive classes (1-8).

Could you please clarify whether class 0 should be excluded from the dataset for the challenge evaluation? If class 0 is included in the challenge setting, then the leaderboard entry from the paper may not be directly comparable and it might be worth clarifying or adjusting this on the leaderboard.
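To make the comparability concern concrete, here is a minimal sketch of how the same predictions can score differently depending on whether class 0 is averaged in. It assumes a macro-averaged F1, which is my assumption and not something the paper or leaderboard has confirmed. The helper `macro_f1` and the toy labels are hypothetical and invented purely for illustration.

```python
from typing import Iterable, Sequence


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], labels: Iterable[int]) -> float:
    """Unweighted mean of per-class F1 over the given labels (hypothetical helper)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no true and no predicted instances contributes 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data, invented for illustration only (not taken from train.json or test.json).
y_true = [0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8]
y_pred = [0, 0, 1, 1, 2, 3, 4, 5, 6, 7, 0]

# Averaging over the positive classes only (labels 1 to 8).
f1_positive = macro_f1(y_true, y_pred, labels=range(1, 9))

# Averaging over all nine classes (labels 0 to 8).
f1_all = macro_f1(y_true, y_pred, labels=range(0, 9))

print(f"positive classes only: {f1_positive:.4f}")
print(f"all classes:           {f1_all:.4f}")
```

Note that in this sketch the class-0 samples stay in the data even when averaging over labels 1 to 8, so a class-0 sample predicted as a positive class still counts as a false positive. Dropping those samples entirely would be a third variant, which is part of what I would like to have clarified.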

Best regards,

Eric Rudolph

王子木

Mar 14, 2026, 10:43:48 AM

to PsyDefDetect @ BioNLP 2026

Dear Eric,

Thank you for your question. We confirm that our initial plan is to include the positive classes in the shared task, which is consistent with the paper and with previous practice in multi-class classification. For a more complete comparison, we will ask all participants to submit their results after the evaluation phase and will set up an additional leaderboard that includes all classes.

Best Regards,

Zimu

Eric Rudolph

Mar 15, 2026, 6:15:33 AM

to 王子木, PsyDefDetect @ BioNLP 2026

Dear Zimu,

Thank you for answering my question, and I am sorry to bother you again. I am still a little confused. When you say "include the positive classes in the shared task", does that mean we need to exclude the negative labels from train.json for classification? And does the test.json file include data points with label 0 or not?

From: psydef...@googlegroups.com <psydef...@googlegroups.com> on behalf of 王子木 <zimuw...@gmail.com>
Date: Saturday, 14 March 2026 at 15:43
To: PsyDefDetect @ BioNLP 2026 <psydef...@googlegroups.com>
Subject: Re: Clarification on PsyDefDetect Evaluation and Inclusion of Class 0

Dear Zimu,

Thank you for your prompt response. I apologise for the continued questions, but I would like to make sure I understand correctly.

When you say "include the positive classes in the shared task", does this mean that label 0 should be excluded from the training data (train.json) during model training? And accordingly, will the test set (test.json) only contain data points with labels 1–8, or will label 0 also be present there?

Thank you again for your clarification.

Best regards,

Eric Rudolph
