Reading a CSV with pandas when a column value contains both the quotechar and the delimiter

Amol Sharma

Feb 28, 2016, 2:07:58 PM
to pyd...@googlegroups.com
Here is the content of a CSV file, 'test.csv', that I am trying to read with pandas read_csv():

    "col1", "col2", "col3", "col4"
    "v1", "v2", "v3", "v4"
    "v21", "v22", "v23", "this, "creating, what to do? " problems"

This is the command I am using:
     
    messages = pd.read_csv('test.csv', sep=',', skipinitialspace=True)

But I am getting the following error:

    CParserError: Error tokenizing data. C error: Expected 4 fields in line 3, saw 5

I want the content of column 4 in line 3 to be 'this, "creating, what to do? " problems'.

How can I read the file when a column value can contain both the quotechar and the delimiter?


--
Thanks and Regards,
Amol Sharma

dartdog

Mar 3, 2016, 2:37:28 PM
to PyData
You need to have the header row with the same number of columns. It is not a "" problem.

Nicolas Bonnotte

Mar 7, 2016, 10:02:02 AM
to pyd...@googlegroups.com
IMHO, the problem is the quotechars « " » inside the last column 4 value. If you want « " » to be treated as a regular character, it needs to be doubled:

In [39]: csv_source = StringIO.StringIO(""" "col1", "col2", "col3", "col4"
"v1", "v2", "v3", "v4"
"v21", "v22", "v23", "this, ""creating, what to do? "" problems" """)

In [40]: pd.read_csv(csv_source, skipinitialspace=True)
Out[40]:
  col1 col2 col3                                      col4
0   v1   v2   v3                                        v4
1  v21  v22  v23  this, "creating, what to do? " problems
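
For anyone on Python 3, where the `StringIO` module used in this session no longer exists, a roughly equivalent sketch with `io.StringIO` might look like the following. The variable names are only illustrative; the data is the same three lines with the inner quotes doubled.

```python
import io

import pandas as pd

# Same sample data as in the session, with the inner quotes doubled ("").
csv_text = (
    '"col1", "col2", "col3", "col4"\n'
    '"v1", "v2", "v3", "v4"\n'
    '"v21", "v22", "v23", "this, ""creating, what to do? "" problems"\n'
)

# skipinitialspace=True drops the blank after each comma, so the quote that
# follows is recognised as the start of a quoted field.
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

print(df.loc[1, "col4"])
# this, "creating, what to do? " problems
```

Doubling works because read_csv's `doublequote` option defaults to True, so two consecutive quotechars inside a quoted field are read as one literal quote.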

There might be an intelligent way to tell the parser to treat « " » as a quotechar only when it is preceded or followed by a separator « , » or a newline, but I don't know of one.
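
One caveat with a separator-adjacent rule: in the sample row the stray quote in `this, "creating` itself follows a comma, so that rule alone cannot tell it apart from an opening quote. A different, rougher workaround is to repair the text before parsing. The sketch below assumes that the number of columns is known and that only the last column can contain stray quotes or commas. The helper name `repair_line` and the constant `N_COLS` are made up for illustration and are not part of pandas.

```python
import io

import pandas as pd

N_COLS = 4  # assumption: the file has a known, fixed number of columns


def repair_line(line, n_cols=N_COLS):
    """Re-quote one raw line, doubling stray quotes inside each field.

    Assumes only the last column may contain commas or quotes.
    """
    # Split on the first n_cols - 1 commas only; the remainder is the last field.
    parts = [p.strip() for p in line.rstrip("\n").split(",", n_cols - 1)]
    cleaned = []
    for part in parts:
        # Drop the outer pair of quotes, if present.
        if len(part) >= 2 and part[0] == '"' and part[-1] == '"':
            part = part[1:-1]
        # Double any remaining quotes and wrap the field in quotes again.
        cleaned.append('"' + part.replace('"', '""') + '"')
    return ",".join(cleaned)


raw = '''"col1", "col2", "col3", "col4"
"v1", "v2", "v3", "v4"
"v21", "v22", "v23", "this, "creating, what to do? " problems"
'''

fixed = "\n".join(repair_line(line) for line in raw.splitlines())
df = pd.read_csv(io.StringIO(fixed))

print(df.loc[1, "col4"])
# this, "creating, what to do? " problems
```

If stray commas or quotes can also appear in earlier columns, this split-based repair will misassign fields, and the file would need to be fixed at the source instead.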

N.

