
IndexError for using pandas dataframe values


Daiyue Weng

May 25, 2016, 10:19:30 AM
Hi, I tried to use DataFrame.values to convert a list of columns in a
dataframe to a numpy ndarray/matrix,

matrix = df.values[:, list_of_cols]

but got an error,

IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis
(None) and integer or boolean arrays are valid indices

so what's the problem with the list of columns I passed in?

many thanks

Peter Otten

May 25, 2016, 12:03:29 PM
Your suggestively named list_of_cols is probably not a list. Have your
script print its value and type before the failing operation:

print(type(list_of_cols), list_of_cols)

Peter Otten

May 28, 2016, 3:26:06 AM
On Thu, May 26, 2016, 09:21:59, Daiyue Weng wrote:

[If you had sent this to the list I would have seen it earlier.
Just in case you didn't solve the problem in the meantime:]

> it prints
>
> <class 'list'> ['key1', 'key2']

So my initial assumption was wrong -- list_of_cols is a list. However,
df.values is a plain numpy array, which knows nothing about the column
labels and therefore expects integer indices:

>>> df = pd.DataFrame([[1,2,3],[4,5,6]], columns="key1 key2 key3".split())
>>> df
   key1  key2  key3
0     1     2     3
1     4     5     6

[2 rows x 3 columns]
>>> df.values
array([[1, 2, 3],
       [4, 5, 6]])
>>> df.values[["key1", "key2"]]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'key1'

(I get a different error message, probably because we use different versions
of numpy)

To fix the problem you can either use integers

>>> df.values[:,[0, 1]]
array([[1, 2],
       [4, 5]])

or select the columns in pandas:

>>> df[["key1", "key2"]].values
array([[1, 2],
       [4, 5]])
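
If the column names are only known at run time, the two fixes above can be combined by translating the names into integer positions first. The following is a minimal sketch, not something from the original exchange: it assumes the example df from above and uses pandas' Index.get_indexer to do the lookup. The missing-column check is an illustrative addition.

```python
import numpy as np
import pandas as pd

# Same example frame as above.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["key1", "key2", "key3"])
list_of_cols = ["key1", "key2"]

# Translate column labels into integer positions. get_indexer returns -1
# for any label that is not present, so check for that before indexing.
positions = df.columns.get_indexer(list_of_cols)
if (positions == -1).any():
    raise KeyError("some requested columns are not in the DataFrame")

# Integer positions are valid indices for the underlying numpy array.
by_position = df.values[:, positions]

# Selecting by label in pandas first should give the same array.
by_label = df[list_of_cols].values

print(by_position)
print(np.array_equal(by_position, by_label))
```

Selecting by label in pandas is usually the simpler choice. The positional lookup is mainly useful when the bare array has already been extracted and has to be sliced several times.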


