selecting rows from dataframe

SS

Jun 21, 2016, 6:47:33 AM
to PyData
I want to select the rows of a dataframe that match the values in my list. When I use the commands:
list1 = ['Row2-Column2','Row3-Column2']
df2 = df1[df1.columns.isin(list1)]

where df1 is 
Row1_Column1    Row1-Column2    Row1-Column3    Row1-Column4    2   Row1-Column6
Row2_Column1    Row2-Column2    Row2-Column3    Row2-Column4    3   Row2-Column6
Row3_Column1    Row3-Column2    Row3-Column3    Row3-Column4    1   Row3-Column6
Row4_Column1    Row4-Column2    Row4-Column3    Row4-Column4    2   Row4-Column6
 
I am getting the error:
Item wrong length 1 instead of 3

Please let me know where I am going wrong.

Thanks 
SS

Miki Tebeka

Jun 22, 2016, 4:48:05 AM
to PyData
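df.columns is just the list of column labels, so df1.columns.isin(list1) tests the labels rather than the cell values, and the resulting mask has one entry per column instead of one per row. If I understand correctly, you want to select based on values.
To do this you can use boolean indexing (http://pandas.pydata.org/pandas-docs/stable/indexing.html): df[df.isin(list1).any(axis=1)] should work.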
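
Here is a minimal sketch of the difference between the two masks, assuming a frame shaped like the one in the question. The column names c1..c6 are made up for illustration, since the original post does not show any.

```python
import pandas as pd

# Hypothetical column names; the original post does not show any.
cols = ["c1", "c2", "c3", "c4", "c5", "c6"]
df1 = pd.DataFrame(
    [
        ["Row1_Column1", "Row1-Column2", "Row1-Column3", "Row1-Column4", 2, "Row1-Column6"],
        ["Row2_Column1", "Row2-Column2", "Row2-Column3", "Row2-Column4", 3, "Row2-Column6"],
        ["Row3_Column1", "Row3-Column2", "Row3-Column3", "Row3-Column4", 1, "Row3-Column6"],
        ["Row4_Column1", "Row4-Column2", "Row4-Column3", "Row4-Column4", 2, "Row4-Column6"],
    ],
    columns=cols,
)
list1 = ["Row2-Column2", "Row3-Column2"]

# One boolean per column label, so its length matches the columns, not the rows.
col_mask = df1.columns.isin(list1)

# One boolean per row: True where any cell in that row is in list1.
row_mask = df1.isin(list1).any(axis=1)
df2 = df1[row_mask]
```

With this layout, col_mask has six entries while the frame has four rows, which is the kind of length mismatch the error complains about. row_mask has one entry per row and can be used to index df1 directly.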

Note that "in" on a list is O(n); if you convert list1 to a set, the lookups may be faster:

values = {'Row2-Column2', 'Row3-Column2'}
df[df.isin(values).any(axis=1)]
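
As a quick check that a set and a list select the same rows, here is a small sketch. The frame and its column names "a" and "b" are hypothetical stand-ins, and it makes no claim about timing; whether the set is actually faster is best measured on your own data.

```python
import pandas as pd

# Small stand-in frame; the column names "a" and "b" are hypothetical.
df = pd.DataFrame({
    "a": ["Row1_Column1", "Row2_Column1", "Row3_Column1", "Row4_Column1"],
    "b": ["Row1-Column2", "Row2-Column2", "Row3-Column2", "Row4-Column2"],
})

as_list = ["Row2-Column2", "Row3-Column2"]
as_set = set(as_list)

# Same boolean-indexing expression, once with a list and once with a set.
rows_from_list = df[df.isin(as_list).any(axis=1)]
rows_from_set = df[df.isin(as_set).any(axis=1)]

# True when both selections contain exactly the same rows.
same = rows_from_list.equals(rows_from_set)
```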
 
 
 
 