dataframe.hasNaN()

Adam Hughes

Feb 16, 2013, 10:06:59 PM

to pyd...@googlegroups.com

Hi all,

I'm trying to test a DataFrame to see if it has any NaNs. I didn't see a function like this in the API docs on the website (although those docs are out of date), and I saw on Stack Overflow that the following operation works smoothly on NumPy arrays.

import numpy as np
from pandas import DataFrame

df = DataFrame(...)

any(np.isnan(x) for x in df.values.flatten())

My first question: does pandas already have a DataFrame method like this? Second, is there a better way to do this? Third, would this be a method worth putting into the DataFrame class? E.g.:

df.hasNan()

Note: for a Series, there is a quicker operation.

s=Series(...)

np.isnan(np.sum(s))
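A minimal sketch of the sum-based check described above, with made-up sample values. It works on the underlying NumPy array (s.values) rather than on the Series itself, because Series.sum skips NaN by default and can hide the missing value. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value.
s = pd.Series([1.0, np.nan, 3.0])

# NaN propagates through a plain NumPy sum, so a single NaN
# anywhere in the array makes the total NaN.
has_nan_via_sum = bool(np.isnan(np.sum(s.values)))

# Element-wise check, shown for comparison.
has_nan_elementwise = bool(np.isnan(s.values).any())

print(has_nan_via_sum, has_nan_elementwise)
```

Both checks agree on float data; the sum-based one avoids building a boolean array of the same length.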

Paul Hobson

Feb 16, 2013, 10:36:03 PM

to pyd...@googlegroups.com

Adam, you should be able to use a method similar to the one for Series.

import numpy as np

import pandas as pd

df = pd.DataFrame(np.ma.masked_less(np.random.normal(size=(5,5)), 0))

number_NANs = np.isnan(df).sum().sum()

Does that work for you?

-paul

Adam Hughes

Feb 16, 2013, 10:39:38 PM

to pyd...@googlegroups.com

Paul,

Thanks. This works. I originally tried it but called the sum inside the parentheses:

np.isnan(df.sum()).sum()

Thanks.
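A small sketch of why the placement of the sum matters, using a made-up two-column frame. It assumes the default behaviour of DataFrame.sum, which skips NaN values, so summing before testing leaves nothing for np.isnan to find.

```python
import numpy as np
import pandas as pd

# Hypothetical 2x2 frame with two missing values.
df = pd.DataFrame({"a": [1.0, np.nan], "b": [np.nan, 4.0]})

# Test every cell first, then add up the True values: this counts the NaNs.
nan_count = int(np.isnan(df).sum().sum())

# Summing first drops the NaNs (DataFrame.sum skips them by default),
# so the column totals are finite and no NaN is detected.
nan_in_totals = int(np.isnan(df.sum()).sum())

print(nan_count, nan_in_totals)
```

Here the first form reports two NaNs and the second reports none, even though the data is the same.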


Wes McKinney

Feb 16, 2013, 11:20:57 PM

to pyd...@googlegroups.com

Is df.isnull() sufficient? df.isnull().values.sum() might also be what you want.

Adam Hughes

Feb 17, 2013, 2:04:11 AM

to pyd...@googlegroups.com

Wes,

Thanks. Does this work on your current working version of pandas? I get the following error on versions 0.10.0-1 and 0.10.1-1:

In [8]: from pandas import DataFrame

In [9]: df=DataFrame()

In [10]: df.isnull()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-10-a21e30720f75> in <module>()
----> 1 df.isnull()

/usr/local/EPD/lib/python2.7/site-packages/pandas/core/frame.py in __getattr__(self, name)
   2013             return self[name]
   2014         raise AttributeError("'%s' object has no attribute '%s'" %
-> 2015                              (type(self).__name__, name))
   2016
   2017     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'isnull'

Wes McKinney

Feb 17, 2013, 10:52:24 AM

to pyd...@googlegroups.com

Apologies. Try pandas.isnull(df) instead; isnull/notnull are only
instance methods on Series.

- Wes
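A minimal sketch of the pandas.isnull(df) approach, on a made-up frame. The column names and values are hypothetical. Unlike np.isnan, this check also handles object columns, where a missing entry may be None.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mixing a float column and an object column.
df = pd.DataFrame({"x": [1.0, np.nan], "label": ["a", None]})

# Boolean frame of the same shape: True wherever a value is missing.
null_mask = pd.isnull(df)

# Reduce over the underlying array to answer "any missing at all?"
has_null = bool(null_mask.values.any())

# Or count the missing cells, as suggested with .values.sum().
null_count = int(null_mask.values.sum())

print(has_null, null_count)
```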

Adam Hughes

Feb 17, 2013, 1:40:04 PM

to pyd...@googlegroups.com

Thanks guys, all of these solutions are good.

Appreciate it

Tanushree Pareek

Mar 9, 2018, 8:49:37 AM

to PyData

How can I use this on my dataset?
null_columns = loan_data.isnull(df)
It gives me an error.

I tried lots of methods, but none of them help:
null_columns=loan_data[null_columns('int_rate')].isnull().sum()
print(train[train.isnull().any(axis=1)][null_columns].head())
print(loan_data[loan_data["int_rate"].isnull()][null_columns])

Please help me. I want to find the null values and also impute them.
int_rate is a column name in loan_data.
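One possible sketch of counting and imputing nulls in a column such as int_rate. The sample values, the loan_amnt column, and the choice of median imputation are all assumptions made for illustration; the real loan_data is not shown in the thread, and another fill strategy may suit it better. It assumes a pandas version in which isnull is available as a DataFrame method, called without arguments.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for loan_data; only int_rate comes from the question.
loan_data = pd.DataFrame({
    "int_rate": [10.5, np.nan, 12.0, np.nan],
    "loan_amnt": [1000, 2000, 1500, 3000],
})

# Number of missing values in each column.
null_counts = loan_data.isnull().sum()

# Rows where int_rate is missing.
rows_missing_rate = loan_data[loan_data["int_rate"].isnull()]

# Impute with the column median (an assumed strategy, not the only option).
median_rate = loan_data["int_rate"].median()
loan_data["int_rate"] = loan_data["int_rate"].fillna(median_rate)

print(null_counts)
print(rows_missing_rate)
```

Note that isnull() is called on the frame or column itself and takes no argument, which is likely why loan_data.isnull(df) raises an error.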
