baseline issue in Track 2

Wei Zhao

Apr 8, 2024, 11:47:55 AM

to AXOLOTL-24

Hello,

Can you please check whether the data processing [1] in Track 2 is correct?

L110: dataset = dataset.dropna(subset=["gloss", "example", "word"])

This line removes every instance that is empty in any of the three columns. L110 works on the dev data but fails on the test data, where the glosses are empty. I think we should keep the instances with empty glosses in the test data, since filling in the glosses is the goal of Track 2.

Below is the correction:

L110: dataset = dataset.dropna(subset=["gloss", "example", "word"])

Please confirm this issue. Thanks.

[1] https://github.com/ltgoslo/axolotl24_shared_task/blob/main/code/baselines/baseline_track2.py#L110

Best,

Wei

Wei Zhao

Apr 8, 2024, 11:49:00 AM

to AXOLOTL-24

Below is the correction:

L110: dataset = dataset.dropna(subset=["example", "word"])
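
To illustrate the difference between the two filters, here is a minimal sketch. It assumes the test file leaves the gloss column empty and that pandas reads those cells as missing values; the toy rows and the variable names (`test`, `strict`, `relaxed`) are made up for illustration and are not taken from the baseline script.

```python
import pandas as pd

# Hypothetical toy rows standing in for a Track 2 test file: the glosses
# are missing (None is treated as NaN) because predicting them is the task.
test = pd.DataFrame(
    {
        "word": ["bank", "bank", None],
        "example": [
            "She sat on the river bank.",
            "The bank raised its rates.",
            "A row with no target word.",
        ],
        "gloss": [None, None, None],
    }
)

# Original filter: a row is dropped if any of the three columns is missing,
# so every test row disappears.
strict = test.dropna(subset=["gloss", "example", "word"])

# Proposed filter: only the columns needed as model input must be present.
relaxed = test.dropna(subset=["example", "word"])

print(len(strict), len(relaxed))
```

With the original filter no rows survive, so anything downstream receives an empty dataset; with the proposed filter the rows that have both a word and an example are kept, and their glosses stay empty for the model to fill in.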

Best,

Wei

Andrey Kutuzov

Apr 8, 2024, 12:37:59 PM

to axolo...@googlegroups.com

Hi Wei,

We will have a look, but this is only an example baseline
implementation, so you are completely free to simply ignore it.

> https://github.com/ltgoslo/axolotl24_shared_task/blob/main/code/baselines/baseline_track2.py#L110
>
> Best,
> Wei
>
> --
> You received this message because you are subscribed to the Google
> Groups "AXOLOTL-24" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to axolotl-24+...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/axolotl-24/2a721ccb-7cae-4235-a5d8-7c6546371993n%40googlegroups.com.

--
Andrey
Language Technology Group (LTG)
University of Oslo

Mariia Fedorova

Apr 9, 2024, 5:30:11 AM

to AXOLOTL-24

Hello Wei,

We have updated the Track 2 baseline so that it runs on the test set without failures (even if subtask 1 was not solved first), can run prediction only without always training the model first, etc.

We have also found an issue with the usage of the target word indices, so that feature is disabled until it is fixed. But, as said, this is just an example baseline and not the best possible solution, so feel free to change whatever you need.

With kind regards,

Maria.

From: axolo...@googlegroups.com <axolo...@googlegroups.com> on behalf of Wei Zhao <andywe...@gmail.com>
Sent: 08 April 2024 17:47:55
To: AXOLOTL-24
Subject: [axolotl] baseline issue in Track 2
