--
You received this message because you are subscribed to the Google Groups "SemEval 2019: Task 9" group.
To unsubscribe from this group and stop receiving emails from it, send an email to semeval-2019-ta...@googlegroups.com.
To post to this group, send email to semeval-2...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/semeval-2019-task-9/b64468a1-bb8e-4ba9-be42-d7ccaa957576%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
<duplicate_ids.txt>
Hi Ananda, All,
Thanks for pointing this out. We will upload the revised file. About 219 rows are duplicated.
However, this should have a minimal effect, and you may not want to re-train your model immediately just to update the trial leaderboard.
Sapna
On 15 Nov 2018, at 02:06, Ananda Seelan <ananda...@gmail.com> wrote:
Hello,
There seems to be a bunch of duplicate lines in the training file. I checked the Subtask-A training file and I've attached the list of duplicate row ids.
Cheers
On Sunday, 28 October 2018 22:23:50 UTC+5:30, Sapna Negi wrote:
Dear participants,
The final versions of the datasets are now available in our GitHub repos, after incorporating the corrections to the previous versions pointed out by some of you.
In case you are not aware of this already, please note that the evaluation phase will tentatively start on the 10th of January, when a fresh evaluation dataset will be uploaded to CodaLab.
Currently, our CodaLab leaderboard reflects the trial phase results only. The trial test data can be treated as a validation set for your final SemEval submissions.
We would also like to remind you that we have a separate CodaLab website for subtask B, i.e. cross-domain suggestion mining.
Regards
Task organisers
Hi Ananda,
I think the duplicates could be originating from the raw data itself, i.e. duplication of text as title, re-submission of posts by the authors, or quoting of a post in a different post.
This dataset is a SemEval-specific extension of an older set in which such issues were not encountered. We will investigate this further as soon as possible.
Regardless, apologies. We will try to rectify this soon, and will take extra precautions with the evaluation set.
Thanks and Regards
Organizers