One option is to use the replace method:
In [10]: import numpy as np; import pandas as pd
In [11]: df
Out[11]:
     a  b
0  NaN  1
1    1  2
2  NaN  3
3    1  1
4  NaN  2
5    1  3
In [12]: df.dtypes
Out[12]:
a    object
b     int64
dtype: object
In [13]: newdf = df.replace('NaN', np.nan)
In [14]: newdf
Out[14]:
     a  b
0  NaN  1
1    1  2
2  NaN  3
3    1  1
4  NaN  2
5    1  3
In [15]: newdf.dtypes
Out[15]:
a    float64
b      int64
dtype: object
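
For anyone who wants to run this end to end, here is a minimal sketch of the same idea followed by dropna. The sample data is made up to resemble the frame above, and the explicit pd.to_numeric step is my own addition: whether replace also converts the column from object to float64 seems to depend on the pandas version, so converting explicitly is the safer assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical data resembling the frame above: missing values arrive
# as the literal string 'NaN' rather than as real nulls.
df = pd.DataFrame({
    "a": ["NaN", 1, "NaN", 1, "NaN", 1],
    "b": [1, 2, 3, 1, 2, 3],
})

# Swap the placeholder strings for real missing values.
newdf = df.replace("NaN", np.nan)

# Assumption: do not rely on replace() to change the dtype;
# convert the column to a numeric dtype explicitly.
newdf["a"] = pd.to_numeric(newdf["a"])

# Null-aware methods now treat the replaced entries as missing.
cleaned = newdf.dropna()
print(cleaned)
```

After this, isna, fillna and friends should also see those entries as missing.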
I'm probably missing something obvious here, but I'm creating a DataFrame from a dict that contains a number of 'NaN' strings for missing data. Pandas seems to be interpreting these literally as strings. How do I convert them to nulls so that I can use methods such as dropna to deal with them?