How can two pandas dataframe can be merged on the basis of partial match of column value

8 views

Skip to first unread message

himan...@gmail.com

unread,

Feb 7, 2020, 8:50:36 AM2/7/20

to Python Programming for Autodesk Maya

Hi all, I am a biologist trying to learn python. Please help me for the below code.

Number of files can be more than 50

>>> file1

cNo gene year

1 A,B 2004

2 C,D 2008

3 K,L 2011

>>> file2

cNo gene year

1 a,e 2001

2 d,c,p 2003

3 x,y,x 2000

4 m,n 1988

>>> file3

cNo gene year

1 R,S 2002

2 X 2005

3 A,Q 2002

Condition: compare gene of each file among all and find common/partial common 'genes' and respective 'name' and 'cNo'

final_output

name cNo genes

file1,file2,file3 1,1,3 [A,B], [a,e], [A,Q]

file1,file2 2,2 [C,D], [d,c,p]

file2,file3 3,2 [x,y,z], [X]

import pandas as pd

import functools

file1 = pd.DataFrame({'cNo':[1,2,3], 'gene': ['A,B','C,D','K,L'],'year':[2004,2008,2011]})

file2 = pd.DataFrame({'cNo':[1,2,3,4],'gene':['a,e','d,c,p','x,y,x','m,n'],'year':[2001,2003,2000,1988]})

file3 = pd.DataFrame({'cNo':[1,2,3],'gene':['R,S','X','A,Q'],'year':[2002,2005,2002]})

files = [file1, file2, file3]

#I don't know how to merge by partial matching of column

df = functools.reduce(lambda left,right: pd.merge(left,right,on='genes'), files)

print (df)

Justin Israel

unread,

Feb 7, 2020, 12:56:41 PM2/7/20

to python_in...@googlegroups.com

Seems like the wrong scope of a question for this particular mailing list. Maybe you want a python data science group that is likely to use pandas in their workflows?

--
You received this message because you are subscribed to the Google Groups "Python Programming for Autodesk Maya" group.
To unsubscribe from this group and stop receiving emails from it, send an email to python_inside_m...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/python_inside_maya/c94db8e5-3eaa-479e-a028-940fe3cdd54d%40googlegroups.com.

Reply all

Reply to author

Forward

0 new messages