How to do this "join" for pairing in Pandas?

26 views
Skip to first unread message

Rex

unread,
Aug 25, 2016, 9:15:06 PM8/25/16
to PyData
1. Given following CSV file

     $cat data.csv>     

     ID,City,Zip,Flag
     1,A,95126,0
     2,A,95126,1
     3,A,95126,1
     4,B,95124,0
     5,B,95124,1
     6,C,95124,0
     7,C,95127,1
     8,C,95127,0
     9,C,95127,1


(a) where "ID" above is a primary key (unique), 

(b) for each "City" and "Zip" combination, there is one ID in max with Flag=0; while it can contain multiple IDs with Flag=1 for each "City" and "Zip" combination.

(c) Flag can be 0 or 1


2. For each ID with Flag=0, we want to pair it with another ID with Flag=1 but with the same City - Zip. If one cannot find another paired ID with Flag=1 and matched City - Zip, we just delete that record.

Here is the expected result:

     ID,City,Zip,Flag
     1,A,95126,0
     2,A,95126,1
     4,B,95124,0
     5,B,95124,1
     7,C,95127,1
     8,C,95127,0


Any valuable tips how to do this pairing in Pandas?

Great thanks!

Reply all
Reply to author
Forward
0 new messages