group by data in a unix file based on multiple columns


Maimu PV

Aug 20, 2014, 4:24:18 PM
to unix-and-perl-...@googlegroups.com
Hello -
Can anyone help me resolve this issue?

I need to sum one column, grouped by multiple key columns.

Input:

D1 G1 3
D1 G1 4
D2 G1 5
D2 G1 3
D2 G2 2
D2 G2 3

The output should look like this:

D1 G1 7
D2 G1 8
D2 G2 5


I can do it for one key column, but with two key columns I can't get it to work.

Thanks
Maimu

Keith Bradnam

Aug 20, 2014, 5:13:07 PM
to unix-and-perl-...@googlegroups.com
If you are reading this data from a file, then you can just use a simple hash to do this. Each hash key is a combination of the 1st and 2nd columns, and the value in the 3rd column is added to the current value stored under that key. E.g. something like

# assumes the data is in a file named on the command line and is whitespace delimited (tabs or spaces)
# makes heavy use of $_ variable which is not to be encouraged!
my %hash;

while(<>){
    chomp;
    my ($colA, $colB, $colC) = split; 
    $hash{"$colA $colB"} += $colC;
}
# at this point %hash has all the data you need.

Maimu PV

Aug 20, 2014, 5:37:23 PM
to unix-and-perl-...@googlegroups.com
Thank you Keith!! I really appreciate your fast reply.

I am new to Perl.

How do I print these hash values? I mean, how do I write the resulting sums to a file or to the display?

I tried print $hash and it won't work.

Also, can I achieve this using awk?

I tried the awk below and I am getting a syntax error:

awk 'BEGIN {OFS="\t"}
{ for(i=2;i<=NF;i++){s[$1][k[i]]+=$(i); names[$1]++;}}
       {for(i in names){
           printf "%s%s",i,OFS;
           for(l in s[i]){printf "%s%s", s[i][l],OFS;}
           printf "\n";}
       }' data.txt


Thanks
Jabir


Syed Arshi

Aug 20, 2014, 6:39:01 PM
to unix-and-perl-...@googlegroups.com
Hi,

You can use a foreach loop to get the keys and their respective values from the hash:

foreach my $key (keys %hash) {
    print $key . "\t" . $hash{$key} . "\n";
}
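
On the awk question earlier in the thread: the `s[$1][k[i]]` form is array-of-arrays syntax, which is a gawk extension (gawk 4.0 and later), so other awk implementations are likely to reject it as a syntax error; `k` is also never defined. A minimal sketch of the same two-key sum with an ordinary one-dimensional array follows. The temporary file and the variable names are assumptions made for illustration, and the data is the sample from the first post.

```shell
# Hypothetical temp file holding the sample input from the first post.
data=$(mktemp)
printf '%s\n' 'D1 G1 3' 'D1 G1 4' 'D2 G1 5' 'D2 G1 3' 'D2 G2 2' 'D2 G2 3' > "$data"

# Build one array key from columns 1 and 2 and add column 3 to it.
# "for (k in sum)" visits keys in no particular order, so the
# output is piped through sort to make it stable.
result=$(awk '{ sum[$1 " " $2] += $3 }
              END { for (k in sum) print k, sum[k] }' "$data" | sort)

echo "$result"
rm -f "$data"
```

The same idea applies to the Perl loop above: writing `sort keys %hash` instead of `keys %hash` gives a predictable order, since Perl hashes are unordered as well.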

Best,

Arshi


Maimu PV

Aug 20, 2014, 9:08:34 PM
to unix-and-perl-...@googlegroups.com
Thank you Arshi.
Now I've got it.

One more question on this:
if I have a CSV file, is there a way I can do the same hash approach?
If so, how can I set the field separator to a comma for this operation?


Thanks

Keith Bradnam

Aug 20, 2014, 10:34:08 PM
to unix-and-perl-...@googlegroups.com