Ignoring extra columns in CSV


Miki Tebeka

Feb 20, 2013, 3:24:59 PM
to pyd...@googlegroups.com
Greetings,

I have a CSV file with some rows that have extra (empty) columns.
When I try to read_csv it, I get the following error:

CParserError: Error tokenizing data. C error: Expected 66 fields in line 1033, saw 68

Is there a way to tell pandas to ignore the extra columns?
(I saw this thread, looks related but not sure how it helps me).

Thanks,
--
Miki

Wes McKinney

Feb 20, 2013, 5:11:24 PM
to pyd...@googlegroups.com

Did you try error_bad_lines=False? It looks like this argument is not
in the docstring =(

- Wes

Miki Tebeka

Feb 21, 2013, 2:21:14 PM
to pyd...@googlegroups.com
> did you try error_bad_lines=False? it looks like this argument is not
> in the docstring =(
Thanks, I'll try it.
However, this will skip the whole line. I'd rather keep the line and just ignore the extra columns at the end.
(The Python CSV parser seems to do that).
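
For illustration, here is a minimal sketch of that idea using only the standard-library csv module: read each row and cut it down to the header's width, so rows with extra trailing columns are kept instead of dropped. The helper name read_rows_truncated and the sample data are hypothetical, and it assumes the first row is a header that defines the expected number of fields.

```python
import csv
import io


def read_rows_truncated(lines, delimiter=","):
    """Yield rows from `lines`, dropping fields beyond the header width.

    Hypothetical helper: assumes the first row is the header and that
    any extra fields sit at the end of a row. Rows shorter than the
    header are passed through unchanged.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader)
    width = len(header)
    yield header
    for row in reader:
        # Slicing keeps at most `width` fields and never raises.
        yield row[:width]


# Made-up sample: the third line has two extra (empty) columns.
sample = io.StringIO("a,b,c\n1,2,3\n4,5,6,,\n7,8,9\n")
rows = list(read_rows_truncated(sample))
```

The cleaned rows could then be handed to pandas with pd.DataFrame(rows[1:], columns=rows[0]). Note that every value arrives as a string this way, so numeric columns would still need converting.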

Miki Tebeka

Feb 22, 2013, 3:16:35 PM
to pyd...@googlegroups.com
Any ideas? I get the same error when loading the tab-separated file from the UFO database in https://github.com/johnmyleswhite/ML_for_Hackers/tree/master/01-Introduction/data/ufo

In [1]: df = pd.read_csv('ufo_awesome.tsv', sep='\t', header=False)
          ...
CParserError: Error tokenizing data. C error: Expected 6 fields in line 755, saw 7
