Training set data format


ynu...@gmail.com

Sep 28, 2018, 9:52:48 AM
to SemEval 2019: Task 9

1538142572(1).jpg

1538142594(1).jpg

When I use pandas read_csv to read the "Training_Full_V1.0.csv" data, I always get an error. When I opened the file in Excel, the file format looked as shown. Opening the same file in Sublime Text 3, some of the data looked as shown. Has anyone else had this problem, and how can it be solved?

Sapna Negi

Sep 29, 2018, 7:36:21 AM
to ynu...@gmail.com, SemEval 2019: Task 9
Hi,

If you look at our baseline script:

We are reading the csv using csv.reader. Perhaps you can consider using that.

Regards
Sapna
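
A minimal sketch of the csv.reader approach suggested above. This is not the baseline script itself. The three-column layout (id, sentence, label), the `read_rows` helper, and the sample rows are all assumptions made for illustration. Rows with an unexpected number of fields are set aside rather than indexed directly, which is one common way a "list index out of range" error is avoided.

```python
import csv
import io


def read_rows(handle, expected_fields=3):
    """Split rows from csv.reader into well-formed and malformed lists.

    expected_fields=3 (id, sentence, label) is an assumed layout,
    not something confirmed in this thread.
    """
    good, bad = [], []
    for row in csv.reader(handle):
        if len(row) == expected_fields:
            good.append(row)
        else:
            bad.append(row)
    return good, bad


# Tiny in-memory sample standing in for the real file (hypothetical content).
sample = io.StringIO(
    'id_1,"A sentence, with a comma",1\n'
    'id_2,Plain sentence,0\n'
    'id_3,Row with a missing label\n'
)
good, bad = read_rows(sample)
print(len(good), "well-formed rows;", len(bad), "malformed rows")
```

For a real file, the handle would typically come from `open(path, newline="", encoding="utf-8")`. The encoding is a guess and may need to be changed for this dataset.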


--
You received this message because you are subscribed to the Google Groups "SemEval 2019: Task 9" group.
To unsubscribe from this group and stop receiving emails from it, send an email to semeval-2019-ta...@googlegroups.com.
To post to this group, send email to semeval-2...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/semeval-2019-task-9/663baff1-8b84-4d14-a434-fb83f7b8ff99%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Yunxia Ding

Sep 29, 2018, 8:30:45 AM
to SemEval 2019: Task 9
I have tried the CSV-reading function in the baseline script, and there are two errors. The first is "list index out of range". The second is that sent_list contains only 3105 sentences, because the data after id='na_6' cannot be read.

I then found that the row with id="na_6" in Training_Full_V1.0.csv looks like this:




1538224110(1).jpg
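
One way to pin down where a file stops parsing cleanly is to report the line number of every record whose field count is unexpected. The sketch below assumes a three-column layout and uses a made-up sample in which an unclosed quote swallows the following row. That is only one possible cause of the symptom described here, and the thread does not confirm it. `find_malformed` is a hypothetical helper name.

```python
import csv
import io


def find_malformed(handle, expected_fields=3):
    """Yield (line_number, row) for rows whose field count is unexpected."""
    reader = csv.reader(handle)
    for row in reader:
        if len(row) != expected_fields:
            # reader.line_num counts physical lines read so far, so it
            # points at the line where this record ended.
            yield reader.line_num, row


# Hypothetical sample: the quote opened on the na_6 line is never closed,
# so the reader folds the rest of the input into a single field.
sample = io.StringIO(
    "na_5,Fine sentence,0\n"
    'na_6,"Sentence with an unclosed quote,0\n'
    "na_7,This row is swallowed by the quote above,0\n"
)
problems = list(find_malformed(sample))
for line_num, row in problems:
    print("record ending at line", line_num, "id:", row[0], "fields:", len(row))
```

If the reported id matches the last record that loads correctly, the raw text around that line is the place to look for stray quotes or embedded line breaks.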




Sapna Negi

Sep 29, 2018, 9:37:30 AM
to Yunxia Ding, SemEval 2019: Task 9
Hi,

Please use the revised version of the baseline script and let me know if the error still persists.

Regards
Sapna


Yunxia Ding

Sep 29, 2018, 10:02:11 AM
to SemEval 2019: Task 9
Hi,
I have tried the revised version of the baseline script, but the error still persists. I also tried the TrialData_SubtaskA_Test.csv dataset, and it reads without any problem. So I think the problem is with Training_Full_V1.0.csv.

Is it possible to adjust the format of the training data to be consistent with the test data?

Best wishes,
Yunxia
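
A quick way to check whether two CSV files share the same structure is to compare how many fields each row has. The sketch below is illustrative only: `field_count_profile` is a hypothetical helper, and the two in-memory samples stand in for the training and trial files, whose real contents are not shown in this thread.

```python
import csv
import io
from collections import Counter


def field_count_profile(handle):
    """Count how many rows have each number of fields."""
    return Counter(len(row) for row in csv.reader(handle))


# Hypothetical stand-ins for the two files being compared.
train_like = io.StringIO("a_1,First sentence,1\na_2,Second sentence\n")
test_like = io.StringIO("b_1,First sentence,1\nb_2,Second sentence,0\n")

train_profile = field_count_profile(train_like)
test_profile = field_count_profile(test_like)
print("train-like:", dict(train_profile))
print("test-like:", dict(test_profile))
```

A file in which every row has the same field count is consistent. More than one distinct count suggests rows that a strict parser will reject or misread.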


Sapna Negi

Sep 29, 2018, 10:04:51 AM
to Yunxia Ding, SemEval 2019: Task 9
The data reads well at our end.
But yes, we will adjust the data format next. 
We will send an email when we release the final version of the train dataset.

Thank you for your feedback.
Sapna

