1. Given following CSV file
$cat data.csv>
ID,City,Zip,Flag
1,A,95126,0
2,A,95126,1
3,A,95126,1
4,B,95124,0
5,B,95124,1
6,C,95124,0
7,C,95127,1
8,C,95127,0
9,C,95127,1
(a) where "ID" above is a primary key (unique),
(b) for each "City" and "Zip" combination, there is one ID in max with Flag=0; while it can contain multiple IDs with Flag=1 for each "City" and "Zip" combination.
(c) Flag can be 0 or 1
2. For each ID with Flag=0, we want to pair it with another ID with Flag=1 but with the same City - Zip. If one cannot find another paired ID with Flag=1 and matched City - Zip, we just delete that record.
Here is the expected result:
ID,City,Zip,Flag
1,A,95126,0
2,A,95126,1
4,B,95124,0
5,B,95124,1
7,C,95127,1
8,C,95127,0
Any valuable tips how to do this pairing in Pandas?