dataframe.hasNaN()

Adam Hughes

Feb 16, 2013, 10:06:59 PM
to pyd...@googlegroups.com
Hi all,

I'm trying to test a DataFrame to see if it has any NaNs.  I didn't see a function like this in the API docs on the website (although that API reference is out of date), and I saw on Stack Overflow that the following operation works smoothly on NumPy arrays.

import numpy as np
from pandas import DataFrame
df = DataFrame(...)
any(np.isnan(x) for x in df.values.flatten())

My first question is: does pandas already have a DataFrame method like this?  Secondly, is there a better way to do this?  Third, would this be a method worth adding to the DataFrame class?  E.g.:

df.hasNan() 


Note: for a Series, there is a quicker operation:

s=Series(...)
np.isnan(np.sum(s))
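
For reference, the two checks above can be sketched in vectorized form. This is a minimal sketch with made-up data, and it assumes every column is float. One caveat: Series.sum() skips NaN by default, so the sum trick is applied to the underlying NumPy array here, not to the Series itself.

```python
import numpy as np
import pandas as pd

# Made-up example data; the values are illustrative only.
df = pd.DataFrame({"a": [1.0, 2.0, np.nan], "b": [4.0, 5.0, 6.0]})

# Vectorized equivalent of the generator expression above:
# np.isnan runs over the whole 2-D array at once.
has_nan = bool(np.isnan(df.values).any())

# The sum trick relies on NaN propagating through a plain NumPy sum.
# Series.sum() skips NaN by default, so sum the underlying array instead.
s = df["a"]
series_has_nan = bool(np.isnan(np.sum(s.values)))

print(has_nan, series_has_nan)
```

Both checks raise a TypeError on object-dtype data, since np.isnan only accepts numeric input.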

Paul Hobson

Feb 16, 2013, 10:36:03 PM
to pyd...@googlegroups.com
Adam, you should be able to use a method similar to the one for Series.

import numpy as np
import pandas as pd  
df = pd.DataFrame(np.ma.masked_less(np.random.normal(size=(5,5)), 0))
number_NANs = np.isnan(df).sum().sum()

Does that work for you?
-paul

Adam Hughes

Feb 16, 2013, 10:39:38 PM
to pyd...@googlegroups.com
Paul,

Thanks.  This works.  I originally tried it but called the sum inside the parentheses:

np.isnan(df.sum()).sum()

Thanks.
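
The difference between the two orderings can be seen in a small sketch (made-up data, all-float columns assumed). DataFrame.sum() skips NaN by default, so checking the column sums for NaN finds nothing, while applying np.isnan to the frame first marks each missing cell before counting.

```python
import numpy as np
import pandas as pd

# Made-up example data with three missing cells.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

# isnan first, then sum: counts the missing cells per column, then overall.
count_nans = int(np.isnan(df).sum().sum())

# sum first, then isnan: df.sum() skips NaN, so the column sums are finite
# and no NaN is detected.
count_after_sum = int(np.isnan(df.sum()).sum())

print(count_nans, count_after_sum)
```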


Wes McKinney

Feb 16, 2013, 11:20:57 PM
to pyd...@googlegroups.com
Is df.isnull() sufficient? df.isnull().values.sum() might also be what you want.

Adam Hughes

Feb 17, 2013, 2:04:11 AM
to pyd...@googlegroups.com
Wes,

Thanks.  Does this work on your current working version of pandas?  I get the following error on versions 0.10.0-1 and 0.10.1-1:

In [8]: from pandas import DataFrame

In [9]: df=DataFrame()

In [10]: df.isnull()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-10-a21e30720f75> in <module>()
----> 1 df.isnull()

/usr/local/EPD/lib/python2.7/site-packages/pandas/core/frame.py in __getattr__(self, name)
   2013             return self[name]
   2014         raise AttributeError("'%s' object has no attribute '%s'" %
-> 2015                              (type(self).__name__, name))
   2016 
   2017     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'isnull'

Wes McKinney

Feb 17, 2013, 10:52:24 AM
to pyd...@googlegroups.com
Apologies. Try pandas.isnull(df) instead; isnull/notnull are only
instance methods on Series.

- Wes
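
A minimal sketch of the function form applied to the original question (made-up data; the variable names are illustrative). Unlike np.isnan, pandas.isnull also handles object columns and None. Newer pandas releases also provide this as a DataFrame method, df.isnull(), but the function form is the one confirmed in this thread.

```python
import numpy as np
import pandas as pd

# Made-up example data; "y" is an object column, where np.isnan would fail.
df = pd.DataFrame({"x": [1.0, np.nan], "y": ["a", None]})

mask = pd.isnull(df)                # boolean DataFrame, True where a value is missing
has_nan = bool(mask.values.any())   # is anything missing at all?
n_missing = int(mask.values.sum())  # how many cells are missing?

print(has_nan, n_missing)
```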

Adam Hughes

Feb 17, 2013, 1:40:04 PM
to pyd...@googlegroups.com
Thanks guys, all of these solutions are good.

Appreciate it

Tanushree Pareek

Mar 9, 2018, 8:49:37 AM
to PyData
How can I use this on my dataset?
null_columns = loan_data.isnull(df)
It gives me an error.

I tried lots of methods, but none of them helped:
null_columns=loan_data[null_columns('int_rate')].isnull().sum()
print(train[train.isnull().any(axis=1)][null_columns].head())
print(loan_data[loan_data["int_rate"].isnull()][null_columns])

Please help me. I want to find the null values and also impute them.
int_rate is the column name in loan_data.
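
One possible way to approach this, as a sketch only. The loan_data frame below is a hypothetical stand-in with invented values; only the int_rate column name comes from the post. isnull() is called on the frame or column itself and takes no argument, which is likely why loan_data.isnull(df) fails. Median imputation is just one common choice and may not suit the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for loan_data; only the int_rate column name comes from the post.
loan_data = pd.DataFrame({
    "int_rate": [10.0, np.nan, 14.0, np.nan],
    "loan_amnt": [1000, 2000, 3000, 4000],
})

# Count the missing values in one column.
n_missing = int(loan_data["int_rate"].isnull().sum())

# Select the rows where int_rate is missing.
missing_rows = loan_data[loan_data["int_rate"].isnull()]

# Impute with the column median (one common choice, not the only one).
median_rate = loan_data["int_rate"].median()
loan_data["int_rate"] = loan_data["int_rate"].fillna(median_rate)

print(n_missing, len(missing_rows), median_rate)
```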
