Dataset: evaluation of floating point numbers


Kanchan Shivashankar

Aug 13, 2024, 5:34:51 AM
to scholarly-qald-2024
Dear Organizers,

Thank you for organizing this challenge. My name is Kanchan Shivashankar, I am currently participating in Scholarly QALD challenge.
I encountered a problem when evaluating my generated results against the gold answers in the train dataset. The floating point numbers are rounded off from the values available in the KG, which affects the final evaluation scores.

Are they rounded off in the test dataset as well? If so, what precision should we follow?

Here is an example from the train dataset (for your reference):
id: 48f41d21-b0fb-45b7-a6e4-96495160f2d7
gold_answer: 3.4050634
SemOpenAlex: 3.405063390731812
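(For illustration, a minimal sketch of the problem: an exact comparison fails on these two values, while a tolerance-based comparison such as `math.isclose` treats them as equal. This is not the challenge's official metric, just a demonstration.)

```python
import math

gold_answer = 3.4050634          # rounded value from the train dataset
kg_value = 3.405063390731812     # full-precision value from SemOpenAlex

# Exact equality fails because the gold answer was rounded:
print(gold_answer == kg_value)   # False

# A relative-tolerance comparison treats them as the same number:
print(math.isclose(gold_answer, kg_value, rel_tol=1e-6))  # True
```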

Thank you for your response in advance!

Regards,
Kanchan

Debayan Banerjee

Aug 14, 2024, 9:47:54 AM
to Kanchan Shivashankar, scholarly-qald-2024
Hi Kanchan,

Thanks for participating.
I am afraid the rounding off is performed by the LLM that generates the dataset, and it is unpredictable in nature. We do not have a perfect evaluation metric to account for these differences. I did a quick count, and around 10 questions in the test set have long floats. From my observation, the rounding varies between 6 and 8 places after the decimal.
Sadly, at this late stage we cannot do much about this; we can only make sure that we handle such cases better in the future.
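(A possible workaround based on the observation above, sketched as a hypothetical helper rather than an official fix: round the full-precision KG value to each of 6-8 decimal places and accept a match at any of them. For the example in the original post, rounding to 7 places reproduces the gold answer.)

```python
def matches_gold(kg_value: float, gold: float, places=(6, 7, 8)) -> bool:
    """Accept a match if rounding the KG value to 6, 7, or 8 decimal
    places yields the gold answer (hypothetical helper, not the
    challenge's official metric)."""
    return any(round(kg_value, p) == gold for p in places)

# round(3.405063390731812, 7) == 3.4050634, so this matches:
print(matches_gold(3.405063390731812, 3.4050634))  # True
```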




--
Regards,
Debayan Banerjee