I am trying to write a DataFrame to CSV in gzip-compressed format using Python 3.
My code works in Python 2.7, but it fails in Python 3 because of the difference between bytes and strings, and I am not sure how to fix it.
My code creates a temporary file and passes it as the fileobj argument to gzip.GzipFile. I then pass the resulting GzipFile object as the first argument to DataFrame.to_csv.
I get the error "'str' does not support the buffer interface". I understand why it occurs (to_csv writes str, while the GzipFile expects bytes), but I'm not sure how to change my code to avoid it in Python 3.
I've tried different 'mode' arguments, but none of them got rid of the error.
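The mismatch can be reproduced without pandas. The following is a minimal sketch, assuming Python 3 and using only the standard library; the variable names (buffer, str_accepted, round_trip) and the in-memory io.BytesIO target are illustrative and not part of my real code. The exact wording of the TypeError differs between Python versions.

```python
import gzip
import io

buffer = io.BytesIO()  # in-memory stand-in for the temporary file

with gzip.GzipFile(fileobj=buffer, mode='wb') as archive:
    try:
        # Writing str to a binary GzipFile raises TypeError in Python 3.
        archive.write('foo,one,56\n')
        str_accepted = True
    except TypeError as exc:
        str_accepted = False
        error_message = str(exc)  # wording depends on the Python version

    # The same data encoded to bytes is accepted.
    archive.write('foo,one,56\n'.encode('utf-8'))

# Decompress to confirm that only the bytes write reached the archive.
round_trip = gzip.decompress(buffer.getvalue())
```

So whatever sits between to_csv and the GzipFile has to turn str into bytes.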
Here is the sample code I've been using:
from tempfile import NamedTemporaryFile
import gzip
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'foo', 'bar', 'foo', 'bar'],
                   'B': ['one', 'one', 'two', 'two',
                         'two', 'two', 'one', 'two'],
                   'C': [56, 2, 3, 4, 5, 6, 0, 2],
                   'D': [51, 2, 3, 4, 5, 6, 0, 2]})

with NamedTemporaryFile(mode='wb') as tmp:
    with gzip.GzipFile(fileobj=tmp) as archive:
        # This is the call that fails in Python 3:
        # 'str' does not support the buffer interface
        df.to_csv(archive, header=False)
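One direction that might work is to wrap the GzipFile in io.TextIOWrapper, so that to_csv writes str and the wrapper encodes it to bytes. This is only a sketch under assumptions: UTF-8 as the encoding, a smaller illustrative DataFrame, mode='w+b' so the result can be read back for checking, and the names text and csv_text, none of which come from my original code.

```python
import gzip
import io
from tempfile import NamedTemporaryFile

import pandas as pd

# Smaller illustrative frame; the real one has more rows and columns.
df = pd.DataFrame({'A': ['foo', 'bar'],
                   'B': ['one', 'two'],
                   'C': [56, 2]})

with NamedTemporaryFile(mode='w+b') as tmp:
    with gzip.GzipFile(fileobj=tmp, mode='wb') as archive:
        # TextIOWrapper accepts str and passes encoded bytes to the GzipFile.
        # newline='' leaves the line endings chosen by to_csv untouched.
        with io.TextIOWrapper(archive, encoding='utf-8', newline='') as text:
            df.to_csv(text, header=False)

    # Closing the wrapper also closed the GzipFile, which wrote the gzip
    # trailer; the temporary file itself is still open at this point.
    tmp.seek(0)
    csv_text = gzip.decompress(tmp.read()).decode('utf-8')
```

Here csv_text holds the decompressed CSV, which makes it easy to check that the archive is complete. If a real path on disk is acceptable instead of a file object, to_csv also has a compression argument (compression='gzip') that may avoid the manual wrapping altogether.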