csv to arff format

durvesh javle

Aug 10, 2014, 1:29:06 PM
to wekamooc...@googlegroups.com
Hi,

I am finding it difficult to convert CSV, XLS, and XLSX files to ARFF. Can you please help?

Alistair Thomson

Aug 10, 2014, 7:34:44 PM
to wekamooc...@googlegroups.com
You give almost no information with your request, so here is a generic solution.

It's simplest to manually create an arff from a csv file, so convert your xls and xlsx files to csv by using 'save as' from your spreadsheet program.

Open the csv file in a text editor (like notepad, notepad++, etc).

Insert a line at the top of the file that names the @relation, then one line for each @attribute in the file, then a final line that just says @data. The original csv lines should follow unaltered from the next line (if the first csv line is a header row of column names, remove it, since the @attribute lines replace it). For the attributes, you need to understand the data: is each field numeric or literal?

Here is an example - the schizophrenia arff file.
Note that:
 - there is one attribute for each column of the original spreadsheet
 - use ? for a missing value
 - for a limited number of literal values in a particular field, list them inside the curly brackets like so "@attribute target {CS,PS,TR}"
 - by default, WEKA treats the last attribute as the class - here, schizophrenic or not.

@relation schizo
@attribute ID numeric
@attribute target {CS,PS,TR}
@attribute gain_ratio_1 numeric
@attribute gain_ratio_2 numeric
@attribute gain_ratio_3 numeric
@attribute gain_ratio_4 numeric
@attribute gain_ratio_5 numeric
@attribute gain_ratio_6 numeric
@attribute gain_ratio_7 numeric
@attribute gain_ratio_8 numeric
@attribute gain_ratio_9 numeric
@attribute gain_ratio_10 numeric
@attribute gain_ratio_11 numeric
@attribute sex {female,male}
@attribute class {non-schizophrenic,schizophrenic}
@data
7,PS,0.935,0.933,0.968,0.92,0.993,0.978,0.974,0.871,0.912,0.92,0.883,female,non-schizophrenic
7,PS,0.909,0.82,0.882,0.884,0.813,0.944,0.892,0.882,0.912,0.881,0.879,female,non-schizophrenic
7,CS,0.86,0.863,0.935,?,0.846,0.91,0.846,0.835,0.719,0.782,0.795,female,non-schizophrenic
7,TR,?,?,0.814,0.783,?,0.768,0.749,0.82,0.76,0.783,0.722,female,non-schizophrenic
7,PS,0.879,0.864,0.804,0.65,0.74,0.766,0.866,0.817,0.879,0.733,0.845,female,non-schizophrenic
19,PS,0.919,0.875,?,0.915,0.883,0.802,0.802,0.77,0.963,0.932,1.01,female,non-schizophrenic
19,PS,0.848,0.786,0.83,0.802,0.865,0.827,?,0.765,0.839,0.798,0.857,female,non-schizophrenic
19,CS,0.925,0.786,0.732,0.77,0.736,0.711,0.653,0.832,0.864,0.868,0.837,female,non-schizophrenic
19,PS,0.813,0.88,0.753,0.87,0.795,0.902,0.889,0.837,0.731,0.808,0.894,female,non-schizophrenic
... etcetera ...
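
For larger files, the manual steps above can be scripted. What follows is only a rough sketch in Python, not anything WEKA provides: the function names csv_to_arff and is_number are made up for illustration, and the sample rows are invented. It assumes the csv has a header row, that a column is numeric when every non-missing value parses as a number, and that attribute names and values contain no spaces or commas (those would need quoting in a real ARFF file).

```python
import csv
import io


def is_number(text):
    """Return True if text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation):
    """Convert CSV text with a header row into ARFF text (hypothetical helper).

    A column is declared numeric when every non-missing value parses as a
    number; otherwise it becomes a nominal attribute listing its distinct
    values. Empty cells and "?" are treated as missing.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    lines = ["@relation " + relation]
    for col, name in enumerate(header):
        values = [r[col] for r in data if r[col] not in ("", "?")]
        if all(is_number(v) for v in values):
            lines.append("@attribute %s numeric" % name)
        else:
            labels = ",".join(sorted(set(values)))
            lines.append("@attribute %s {%s}" % (name, labels))
    lines.append("@data")
    for r in data:
        # Empty cells become "?" so WEKA reads them as missing values.
        lines.append(",".join(v if v != "" else "?" for v in r))
    return "\n".join(lines) + "\n"


# Invented sample rows, only to show the shape of the output.
sample = "ID,target,gain_ratio_1,sex\n7,PS,0.935,female\n7,CS,,female\n19,TR,0.919,male\n"
print(csv_to_arff(sample, "sample"))
```

Always check the generated @attribute lines by eye: a numeric-looking ID or code column may really be nominal, and the script cannot know that.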

Gabriel Santos

Aug 10, 2014, 9:29:13 PM
to wekamooc...@googlegroups.com
Hi Durvesh,

The easiest way is to convert your data to CSV first and load the CSV file into Weka. Once the data is loaded, you can click the "Save" button (the rightmost button) to save it to your computer. The default file format is ARFF.

BR,
Gabriel Santos
Macau
Community TA

durvesh javle

Aug 11, 2014, 2:53:16 AM
to wekamooc...@googlegroups.com
Hi Gabriel/Alistair,

Thanks for the details.

When I use Weka to open the CSV file, I get the error below:

java.io.IOException: Wrong number of values. Read 6, expected 2, read Token[EOL], line 3

I'm trying to open a file for one of the Kaggle contests. The CSV file works well with R.

Durvesh.

Gabriel Santos

Aug 11, 2014, 4:18:50 AM
to wekamooc...@googlegroups.com
Hi Durvesh,

After googling the Kaggle contest, I found the website with 2 CSV files: one called example_submission.csv and another that is zipped.
I was able to open both files without a problem.

Weka generates the error you describe when it finds a comma at the end of a line instead of the [EOL] character. Perhaps the separator is being misinterpreted.
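
One way to track down a "Wrong number of values" error is to compare each row's field count against the header. The sketch below is only an illustration in Python: the function name find_ragged_rows and the sample text are made up, and it does not reproduce Weka's own parsing rules (quoting, for example, may be handled differently).

```python
import csv
import io


def find_ragged_rows(csv_text, delimiter=","):
    """Return the header width and a list of (line_number, field_count)
    for every row whose width differs from the header (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader)
    expected = len(header)
    bad = []
    for row in reader:
        if len(row) != expected:
            # reader.line_num counts the source lines read so far.
            bad.append((reader.line_num, len(row)))
    return expected, bad


# Invented sample: the header has 2 fields, but line 3 has 6.
sample = "id,label\n1,a\n2,b,c,d,e,f\n"
expected, bad = find_ragged_rows(sample)
print("expected", expected, "fields; mismatched rows:", bad)
```

If every row is flagged, the delimiter is probably wrong (try passing delimiter=";" or "\t"); if only a few are, look for stray or trailing commas on those lines.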

Attached is a screenshot of Weka after loading the bigger file, fer2013.csv (more than 294 MB once decompressed).
fer2013.jpg

tvss pavan kumar

Aug 11, 2014, 4:24:03 AM
to wekamooc...@googlegroups.com
You can also convert CSV to ARFF in R by using write.arff() from the foreign package. Load the library, import the CSV as a data frame first, then write it out.

durvesh javle

Aug 11, 2014, 5:00:16 AM
to wekamooc...@googlegroups.com
Hi Gabriel/Pavan,

Thanks for the prompt help. Much appreciated!