Incomplete explanation of Python `in` operator with Series


K Shibley

Apr 9, 2021, 5:32:29 PM
to PyData
The Series object behaves unusually with the Python `in` operator. It is the only data structure I can think of that checks membership against its keys (the index) but yields its values when iterated. This confuses many new users, and the documentation gives little explanation for the behavior.

The FAQ explains why keys are used for checking membership (similar to a dictionary) but it does not explain why values are yielded when iterating (unlike a dictionary).
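To make the contrast with a dictionary concrete, here is a minimal sketch. The names `d` and `s` and their contents are made up for illustration; only the behavior of `in` and of iteration matters.

```python
import pandas as pd

# Illustrative data: the same key/value pairs as a dict and as a Series.
d = {"a": "x", "b": "y", "c": "z"}
s = pd.Series(["x", "y", "z"], index=["a", "b", "c"])

# Iteration: the dict yields keys, the Series yields values.
dict_iter = list(d)      # keys
series_iter = list(s)    # values

# Membership: both test the keys (the index, for a Series).
key_in_dict = "a" in d
key_in_series = "a" in s
value_in_dict = "x" in d
value_in_series = "x" in s

print(dict_iter, series_iter)
print(key_in_dict, key_in_series, value_in_dict, value_in_series)
```

So the two agree on membership and disagree only on what iteration yields.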

To give a clear example of why this behavior is unintuitive and deserves a proper explanation, consider the following:

```
import pandas as pd
import numpy as np

df = pd.DataFrame(np.ones((3,3)),columns=['a','b','c'])
df.replace(1,'abc',inplace=True)
a = df['a']
print([x in a for x in a])  # prints [False, False, False]
```

Intuitively, `[x in a for x in a]` should be all `True`, but because a Series iterates over its values and checks membership against its keys, it is not.
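For anyone who hits this, here is a hedged sketch of ways to test membership against the values instead. The Series `a` below is built directly rather than through the DataFrame, purely to keep the example short; `Series.isin` and `Series.to_numpy` are existing pandas methods.

```python
import pandas as pd

# Stand-in for the column from the example: default integer index 0, 1, 2.
a = pd.Series(["abc", "abc", "abc"])

# `in` on the Series itself checks the index, so values are not found.
in_keys = [x in a for x in a]

# Checking against the underlying array of values instead.
in_values = [x in a.to_numpy() for x in a]

# Vectorized value membership with isin().
via_isin = a.isin(["abc"]).tolist()

# Index labels, by contrast, are found by plain `in`.
labels_found = [k in a for k in a.index]

print(in_keys, in_values, via_isin, labels_found)
```

This only works around the behavior; it does not explain why iteration and membership differ.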

If there is a practical reason for this inconsistency, it should be documented.

K Shibley

Apr 24, 2021, 5:14:19 PM
to PyData
Bump