Sum the middle column.

Noatec

Sep 13, 2006, 4:01:46 PM

I have a data set like the one below.
1st column is a tape volser.
2nd column is the number of reads in the last hour.
3rd column is the last write date.
I need to sum the 2nd column for each unique volser and spit out
volser,sum,last write date.
Could someone help me get started?
300000,1,2003-08-22
300000,1,2003-08-22
300000,1,2003-08-22
300000,1,2003-08-22
300000,1,2003-08-22
300000,1,2003-08-22
300000,2,2003-08-22
300000,2,2003-08-22
300000,2,2003-08-22
300000,2,2003-08-22
300000,3,2003-08-22
300000,3,2003-08-22
300001,1,2003-04-21
300001,1,2003-04-21
300001,1,2003-04-21
300001,1,2003-04-21
300001,1,2003-04-21
300001,1,2003-04-21
And so on and so forth for 1500 volsers.

Aaron Dougherty

Sep 13, 2006, 7:37:24 PM

Howdy,
Just about any time you're looking for unique data, you're going to be
looking at hashes. In this case, you want the tape volser to be unique,
so that will be the key of your hash. It has two values, reads and
write date, so the value of your hash would be an anonymous hash with
those two keys.

## Assuming the data provided is in the scalar $data
my %volsers = ();
for my $line (split(/\n/, $data)) {
    my ($volser, $reads, $date) = split(/,/, $line);
    $volsers{$volser}{reads} += $reads;   # running total of reads per volser
    $volsers{$volser}{date}   = $date;    # last write date seen for this volser
}
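
To see the whole idea end to end, here is a minimal sketch that runs the same loop to completion and prints one line per volser in the volser,sum,last write date form asked for. The sample data is a shortened copy of the rows in the question, and the sub name sum_reads is only an illustrative choice, not something from this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes the raw text and returns a hash of
# volser => { reads => total, date => last date seen }.
sub sum_reads {
    my ($data) = @_;
    my %volsers;
    for my $line (split /\n/, $data) {
        next unless $line =~ /\S/;             # skip blank lines
        my ($volser, $reads, $date) = split /,/, $line;
        $volsers{$volser}{reads} += $reads;    # running total per volser
        $volsers{$volser}{date}   = $date;     # last date seen wins
    }
    return %volsers;
}

# Shortened sample taken from the question.
my $data = join "\n",
    '300000,1,2003-08-22',
    '300000,2,2003-08-22',
    '300000,3,2003-08-22',
    '300001,1,2003-04-21',
    '300001,1,2003-04-21';

my %volsers = sum_reads($data);

# Print volser,sum,last write date, sorted by volser.
for my $volser (sort keys %volsers) {
    print join(',', $volser, $volsers{$volser}{reads}, $volsers{$volser}{date}), "\n";
}
```

With that sample it should print 300000,6,2003-08-22 followed by 300001,2,2003-04-21. The sort is only there because hash keys come back in no particular order.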

Noatec

Sep 14, 2006, 9:30:32 AM

Thanks a bunch for the insights Aaron,
I'll play with incorporating this into what I have.
This is the base I started with:

#!/usr/bin/perl
use strict;
use warnings;
my $filename = 'mlist';
my $sum = 0;

# open the file
open(my $mlist, '<', $filename) or die "Unable to open $filename: $!\n";

# read it in one record at a time
while (my $line = <$mlist>) {
    my ($volser, $reads, $lwd) = split(/,/, $line);
    print "$volser $reads $lwd";    # $lwd still ends with the newline
}
# close the file
close($mlist);

Noatec

Sep 18, 2006, 4:15:31 PM

Aaron,
I need to apologize for jumping groups.
I got frustrated when I couldn't get the code you gave me to mesh with
what I already had, so I posted the same question to comp.lang.perl.misc.
Apparently that's heavily frowned upon. Don't worry though, the guys
over there really let me have it.
Your code worked fine, I just didn't understand hashes like I should
have before posting.

Here's the end result.
Thanks again
#!/usr/bin/perl
use strict;
use warnings;

my $thresh = $ARGV[0];
die "Usage: $0 threshold\n" unless defined $thresh;
my $filename = 'mlist';

# open the file
open my $MLIST, '<', $filename or die "Unable to open $filename: $!\n";

# load the hash
my %hash = ();
while (my $line = <$MLIST>) {
    chomp $line;
    my ($volser, $reads, $date) = split(/,/, $line);
    $hash{$volser}{reads} += $reads;
    $hash{$volser}{date}   = $date;
}

# close the file
close($MLIST);

# print each volser whose total reads are at or below the threshold
for my $volser (sort keys %hash) {
    if ($hash{$volser}{reads} <= $thresh) {
        print "$volser $hash{$volser}{reads} $hash{$volser}{date}\n";
    }
}
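
Because the sample data is already grouped by volser, the same totals can also be produced in a single pass that reports each volser as soon as the next one begins, and once more at end of input for the last one. What follows is only a sketch under that assumption (all rows for a volser are contiguous). The sub name report_grouped, and the choice to return lines instead of printing them, are illustrative and not from this thread. As in the script above, a volser is reported only when its total is at or below the threshold.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: $lines is an array ref of "volser,reads,date" rows
# already grouped by volser; $thresh is the largest total to report.
sub report_grouped {
    my ($lines, $thresh) = @_;
    my @out;
    my ($cur, $sum, $date);

    # Emit the group collected so far if it is at or below the threshold.
    my $flush = sub {
        return unless defined $cur;
        push @out, "$cur $sum $date" if $sum <= $thresh;
    };

    for my $line (@$lines) {
        chomp(my $row = $line);
        next unless length $row;                  # skip empty rows
        my ($volser, $reads, $lwd) = split /,/, $row;
        if (!defined $cur or $volser ne $cur) {   # string compare: a volser need not be numeric
            $flush->();                           # previous volser is complete
            ($cur, $sum) = ($volser, 0);
        }
        $sum  += $reads;
        $date  = $lwd;
    }
    $flush->();                                   # report the last volser too
    return @out;
}

my @rows = ('300000,1,2003-08-22', '300000,2,2003-08-22',
            '300001,1,2003-04-21', '300001,1,2003-04-21');
print "$_\n" for report_grouped(\@rows, 2);
```

With a threshold of 2 this should print only 300001 2 2003-04-21, since 300000 totals 3. If the input is not grouped, the hash version is the safer choice because it does not care about row order.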
