{Series,DataFrame,..}.astype(bool) converts NaN values to True


Seth P

Sep 27, 2014, 12:56:27 PM
to pyd...@googlegroups.com
As mentioned in the subject, {Series,DataFrame,..}.astype(bool) converts NaN values to True. I realize that bool(NaN) is True, so there's obvious consistency there. However my intuition, especially when using a container of bools as a mask, would be that NaN values would convert to False. Perhaps this is one of those cases where the Pandas treatment of NaN should differ from numpy's?
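
For readers who want to reproduce the behavior described above, here is a minimal sketch. The Series contents and variable names are made up for illustration and are not from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: one truthy value, one NaN, one falsy value.
s = pd.Series([1.0, np.nan, 0.0])

# NaN is a nonzero float, so plain Python treats it as truthy.
nan_is_truthy = bool(np.nan)

# astype(bool) applies the same truthiness rule element-wise,
# so the NaN entry comes out as True.
as_bool = s.astype(bool)
print(nan_is_truthy)     # True
print(as_bool.tolist())  # [True, True, False]
```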

Here are some related discussions, though none seem to address explicitly what the desired treatment of NaNs (or Nones) by .astype(bool) should be:
https://groups.google.com/d/msg/pydata/pOz9LCx3JF0/selM28IIbCsJ
https://github.com/pydata/pandas/issues/6528
https://github.com/pydata/pandas/pull/8151

Apologies if there have been other discussions on the topic that I've missed.

Moritz

Sep 28, 2014, 5:45:11 PM
to pyd...@googlegroups.com
Hi,
 
I tripped over this one as well, but an easy solution is to use the .notnull() method. Always worked for me.
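
A small sketch of the .notnull() approach, using made-up data. Note that .notnull() only tests for missing values, so it is not a drop-in replacement for .astype(bool): a present but falsy value such as 0.0 comes out as True. Combining the two, as in the last line, is one possible way to get "present and truthy"; that combination is an illustration, not something proposed in the thread.

```python
import numpy as np
import pandas as pd

# Hypothetical example data.
s = pd.Series([1.0, np.nan, 0.0])

# True wherever a value is present, regardless of its truthiness.
present = s.notnull()
print(present.tolist())  # [True, False, True]

# Illustrative combination: present AND truthy.
present_and_truthy = s.notnull() & s.astype(bool)
print(present_and_truthy.tolist())  # [True, False, False]
```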

Moritz

Seth P

Sep 30, 2014, 2:36:12 PM
to pyd...@googlegroups.com
Yep, easy to do .fillna(False).astype(bool) to get my expected behavior; it's just that, at least for masks, it seems to make more sense to me for NaN to be converted to False. But as others have pointed out, for consistency with other behavior (e.g. True if np.NaN else False evaluates to True), perhaps best to leave things as is and just use .fillna(False).astype(bool) when so desired.
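
To make the mask case concrete, here is a minimal sketch comparing the two conversions when the result is used for boolean indexing. The names values, flags, naive_mask and filled_mask, and the data, are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: values to select from, and flags with a missing entry.
values = pd.Series([10, 20, 30])
flags = pd.Series([1.0, np.nan, 0.0])

# Plain astype(bool): the NaN flag becomes True, so row 1 is selected.
naive_mask = flags.astype(bool)
selected_naive = values[naive_mask].tolist()

# Filling first: the NaN flag becomes False, so row 1 is dropped.
filled_mask = flags.fillna(False).astype(bool)
selected_filled = values[filled_mask].tolist()

print(selected_naive)   # [10, 20]
print(selected_filled)  # [10]
```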

Jeff

Sep 30, 2014, 2:47:39 PM
to pyd...@googlegroups.com
In internal code, pandas always uses an explicit fillna (e.g. ``.fillna(False).astype(bool)``) to avoid this exact situation (as there are times when you want True)

e.g. imagine a != comparison
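
One way to read the != remark, sketched with made-up data: under the usual floating-point rules NaN compares unequal to everything, so an element-wise != yields True at a NaN position while == yields False. A missing value "wanting" True in the != case is therefore consistent with the comparison itself. This is an interpretation of the comment, not a description of pandas internals.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second element of `a` is missing.
a = pd.Series([1.0, np.nan])
b = pd.Series([1.0, 2.0])

# NaN is never equal to anything, so == is False at the NaN position ...
equal = (a == b).tolist()

# ... and != is True there.
not_equal = (a != b).tolist()

print(equal)      # [True, False]
print(not_equal)  # [False, True]
```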