Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

how to join two files using two columns as key

0 views
Skip to first unread message

Yong Liu

unread,
Apr 24, 2003, 3:27:05 PM4/24/03
to
Hi

I have two tab-delimited flat files. Both have column x coordinate and y
coordinate. I want to join them by using their coordinate as the key
(which should include both x and y coordinate). Is there way to do this
using shell commands sort and join, etc? Thanks a lot! (only those
coordinates in file1 will be appeared in final merged results). -Yong

File 1:
x y number1
1 2 343.5
1 3 444.2
0 1 33.4

File 2:

x y number2
0 1 333
1 2 444
1 3 555
0 8 666

Merge results:
x y number1 number2
1 2 343.5 444
1 3 444.2 555
0 1 33.4 333


William Park

unread,
Apr 24, 2003, 3:55:24 PM4/24/03
to


1. Shell:
- read and print 'x', 'y', 'number1'
- grep "^$x[ \t]+$y[ \t]" file2
- print out 'number2'

2. Awk:
- read file1 and store 'x,y':'number1' (key:value) in associative
array 'var1'
- read file2 and store 'x,y':'number2' (key:value) in associative
array 'var2'
- for each 'x,y' key in var1, print var1["x,y"] and var2["x,y"]

3. Python:
- read file1 and file2, and store '(x,y)':'number' in dictionary
'var1' and 'var2'.
- for each '(x,y)' key in var1, print var1[(x,y)] and var2[(x,y)]

--
William Park, Open Geometry Consulting, <openge...@yahoo.ca>
Linux solution for data management and processing.

commandos

unread,
Apr 25, 2003, 6:11:49 AM4/25/03
to
Yong Liu <yon...@uiuc.edu> wrote in message news:<Pine.GSO.4.31.030424...@staff1.cso.uiuc.edu>...


Assume 2 files are sorted, you can try
grep "^`cut -d' ' -f 1,2 FILE1`" | awk '{print $3}' > FILE3
paste FILE1 FILE3

Stephane CHAZELAS

unread,
Apr 25, 2003, 7:12:14 AM4/25/03
to
Yong Liu wrote:
> I have two tab-delimited flat files. Both have column x coordinate and y
> coordinate. I want to join them by using their coordinate as the key
> (which should include both x and y coordinate). Is there way to do this
> using shell commands sort and join, etc? Thanks a lot! (only those
> coordinates in file1 will be appeared in final merged results). -Yong
>
> File 1: > File 2: > Merge results:
> x y number1 > > x y number1 number2
> 1 2 343.5 > x y number2 > 1 2 343.5 444
> 1 3 444.2 > 0 1 333 > 1 3 444.2 555
> 0 1 33.4 > 1 2 444 > 0 1 33.4 333

> > 1 3 555
> > 0 8 666

You'll have to first merge the key fields

T=`printf '\t'`
join -t"$T" <(tail +2 $file1 | sed "s/$T/,/" | sort) \
<(tail +2 $file2 | sed "s/$T/,/" | sort) \
| sed "s/,/$T/"

if your shell doesn't have process substitution (<(...)), use
fifos or tempfiles.

--
Stéphane

rakesh sharma

unread,
Apr 28, 2003, 5:25:53 PM4/28/03
to
Yong Liu <yon...@uiuc.edu> wrote in message news:<Pine.GSO.4.31.030424...@staff1.cso.uiuc.edu>...


#!/usr/local/bin/perl

open(FA,"< fileA");
open(FB,"< fileB");
chomp(my @A=<FA>);
chomp(my @B=<FB>);

$f{join ",",(split)[0,1]} = (split)[-1] for (@B);
print map {$_,"\t",$f{join ",",(split)[0,1]},"\n"} grep {exists
$f{join ",",(split)[0,1]}} @A;

rakesh sharma

unread,
Apr 28, 2003, 10:07:51 PM4/28/03
to
Yong Liu <yon...@uiuc.edu> wrote in message news:<Pine.GSO.4.31.030424...@staff1.cso.uiuc.edu>...


perl -lane '$f{"$F[0],$F[1]"}=$F[2] if(@ARGV);
$k="$F[0],$F[1]";print$_,"\t",$f{$k} if(exists$f{$k}&&!@ARGV);' fileB fileA

0 new messages