Perl Project 2 Questions


Timothy Butterfield

Sep 2, 2013, 4:23:48 PM
to unix-and-perl-...@googlegroups.com
Hi, I recently completed Perl Project 2: Descriptive Statistics. I was able to write functional scripts that returned the correct values, but I did so by importing functions from List::Util and using several array tools that have not been introduced in the course at this point. I would like to know whether, in doing so, I have skipped over tools that I will need for the actual expression analyses I have to do.

Below is the code that I wrote and the output. Thanks for your input.

Tim
--
#!/usr/bin/perl
# statsl.pl by tsbutterfield
use strict; use warnings; use List::Util qw(sum); use List::Util qw(min max);

#This program will determine the following for a string of values entered on the command line: Count, Sum, Mean, Min, Max, Median Value, Population Variance, Sample Variance, Population Standard Deviation, Sample Standard Deviation

die "usage: stats.pl <number1> <number2> <etc.>\n" unless @ARGV >1;


#Report #1: Counts
my @numbers = @ARGV;
print "Counts: ",( scalar(@numbers)), "\n";

#Report #2: Sum
my $sum = sum(@ARGV);
print "Sum: $sum\n";

#Report #3: Mean
my $mean = (sum(@ARGV)/@ARGV);
print "Mean: $mean\n";

#Report #4: Min Value
my $min = min @ARGV;
print "Min: $min\n";

#Report #5: Max Value
my $max = max @ARGV;
print "Max: $max\n";

#Report #6: Median
my $median = ($min + $max)/2;
print "Median: $median\n";

#Report #7: Population Variance
my @squares = map { ((($_)-$mean)**2)} @numbers;
my @sum = sum(@squares);

my $pop_var = (@sum/5);
print "pop var: $pop_var\n";

#Report #8: Sample Variance
my $samp_var = (@sum/4);
print "samp var: $samp_var\n";

#Report #9: Population Standard Deviation
my $pop_stdev = (($pop_var)**0.5);
print "pop stdev: $pop_stdev\n";

#Report #10: Sample Standard Deviation
my $samp_stdev = (($samp_var)**0.5);
print "samp stdev: $samp_stdev\n";

---
Mountain-Lion:Code tsbutter$ perl stats.pl 1 3 5 7 9
Counts: 5
Sum: 25
Mean: 5
Min: 1
Max: 9
Median: 5
pop var: 0.2
samp var: 0.25
pop stdev: 0.447213595499958
samp stdev: 0.5

Ian Korf

Sep 3, 2013, 3:22:06 AM
to unix-and-perl-...@googlegroups.com
Using List::Util to do min, max, and sum is more efficient than writing the code yourself. Is it cheating? Yes, but cheating is good, because using other people's code is a good thing. But for your own peace of mind, you should be able to do min, max, and sum yourself. They're very easy, so if you can't do them without resorting to List::Util, you need to go back and practice your list skills a bit more.
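
As a rough illustration of doing min, max, and sum by hand, here is a minimal sketch that uses a single foreach loop. The helper name sum_min_max is hypothetical (it is not part of the course material or of List::Util), and the sketch assumes a non-empty list of numbers.

```perl
use strict; use warnings;

# Hypothetical helper: returns (sum, min, max) for a non-empty list of numbers
sub sum_min_max {
    my @values = @_;
    my $sum = 0;
    my ($min, $max) = ($values[0], $values[0]);   # start from the first value
    foreach my $value (@values) {
        $sum += $value;
        $min = $value if $value < $min;
        $max = $value if $value > $max;
    }
    return ($sum, $min, $max);
}

my ($sum, $min, $max) = sum_min_max(1, 3, 5, 7, 9);
print "Sum: $sum\nMin: $min\nMax: $max\n";   # Sum: 25, Min: 1, Max: 9
```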

I'm surprised (maybe shocked) that you used map to calculate variance. We don't usually teach map because it's a little confusing. If you can use map, you can probably write min, max, and sum without List::Util.

Your median is not calculated correctly. It's the middle value in a list (or the average of 2 middle values). So you need to sort the list and throw in a conditional to determine if the list has an odd or even number of elements.
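
One way to sketch the sort-plus-conditional approach described above is shown here. The helper name median is an assumption made for illustration. Note that ($min + $max)/2 is the midrange, which only happens to equal the median for evenly spaced input such as 1 3 5 7 9; for 1 2 10 20 the midrange is 10.5 but the median is 6.

```perl
use strict; use warnings;

# Hypothetical helper: median of a non-empty list of numbers
sub median {
    my @sorted = sort { $a <=> $b } @_;   # numeric sort of a copy of the input
    my $mid = int(@sorted / 2);           # middle index (upper middle if even)
    if (@sorted % 2 == 0) {
        # even count: average the two middle values
        return ($sorted[$mid - 1] + $sorted[$mid]) / 2;
    }
    else {
        # odd count: the single middle value
        return $sorted[$mid];
    }
}

print "Median: ", median(1, 3, 5, 7, 9), "\n";   # 5
print "Median: ", median(1, 2, 10, 20), "\n";    # 6
```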

One more thing, you should never hard-code values like 4 or 5. Those should be variables. If your input has more than 5 numbers, your variance calculations will be way off.
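
To make the point about hard-coded divisors concrete, here is a small sketch in which the count comes from the array itself. The helper name variances is hypothetical, and the sketch assumes at least two input values so that the sample variance divisor is not zero.

```perl
use strict; use warnings;

# Hypothetical helper: returns (population variance, sample variance)
sub variances {
    my @values = @_;
    my $count = @values;          # an array in scalar context gives its length

    my $total = 0;
    foreach my $value (@values) {
        $total += $value;
    }
    my $mean = $total / $count;

    my $sum_squares = 0;
    foreach my $value (@values) {
        $sum_squares += ($value - $mean) ** 2;
    }

    # divide by n for the population, n - 1 for a sample
    return ($sum_squares / $count, $sum_squares / ($count - 1));
}

my ($pop_var, $samp_var) = variances(1, 3, 5, 7, 9);
print "pop var: $pop_var\nsamp var: $samp_var\n";   # 8 and 10
```

Because nothing depends on a literal 4 or 5, the same code works for any number of command-line values.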

Timothy Butterfield

Sep 5, 2013, 1:22:54 AM
to unix-and-perl-...@googlegroups.com
Ian, 

I went through and modified my code to use the shift and pop functions to identify min/max values, an if/else block to evaluate the median, and foreach loops to evaluate my squares, sums of squares, variance and standard deviation.

I like to look at each step using the print function, but when I print my squares, the elements are all run together, without any spaces. When I used the map function, however, the array elements were printed in a readable form. What do I need to add to the print command on line 66, in Report #7: Squares & Sum of Squares, in order to view the @squares array elements separated from one another?

*My input & output are immediately below.

Thanks,

Tim

#!/usr/bin/perl
# stats.pl by tsbutterfield
use strict; use warnings; 


#This program will determine the following for a string of values entered on the command line: Count, Sum, Mean, Min, Max, Median Value, Squares, Sum of Squares, Population Variance, Sample Variance, Population Standard Deviation, Sample Standard Deviation

die "usage: stats.pl <number1> <number2> <etc.>\n" unless @ARGV >1;


#Report #0: Original & Sorted Array
my @numbers = @ARGV;
print "Array: @ARGV\n";

my @sorted_numbers = sort {$a <=> $b} @numbers;
print "Sorted_Array: @sorted_numbers\n";


#Report #1: Counts
my $counts = scalar(@numbers);
print "Counts: $counts\n";


#Report #2: Sum
my $sum =0;

foreach my $num (@sorted_numbers) {
$sum = $sum + $num;
}
print "Sum: $sum\n";


#Report #3: Mean
my $mean = ($sum/@numbers);
print "Mean: $mean\n";


#Report #4: Min Value
my $min = shift (@sorted_numbers);
print "Min: $min\n";


#Report #5: Max Value
my $max = pop (@sorted_numbers);
print "Max: $max\n";


#Report #6: Median Value
if ( @sorted_numbers % 2 == 0) {
#if even, then:
my $med = ($sorted_numbers[(@sorted_numbers/2)-1] + $sorted_numbers[(@sorted_numbers/2)])/2;
print "Median: $med\n";
}
else{
#if odd, then:
print "Median: $sorted_numbers[@sorted_numbers/2]\n";
}


#Report #7: Squares & Sum of Squares
my @squares;

foreach my $number (@numbers) {
push @squares, (($number - $mean)**2);
}
print "Squares: ", @squares, "\n"; #No spaces print between elements

my $sum_squares = 0;
foreach my $i (@squares) {
$sum_squares = $sum_squares + $i;
}
print "Sum_squares: ", $sum_squares, "\n";


#Report #8: Population  & Sample Variance
my $pop_var = ($sum_squares/$counts);
print "Pop. Variance: $pop_var\n";


my $samp_var = ($sum_squares/($counts - 1));

print "Sample Variance: $samp_var\n";


#Report #9: Population & Sample Standard Deviation
my $pop_stdev = (($pop_var)**0.5);
print "Pop. St.Dev.: $pop_stdev\n";


my $samp_stdev = (($samp_var)**0.5);
print "Sample St.Dev.: $samp_stdev\n";



Mountain-Lion:Code tsbutter$ perl stats.pl 5 13 7 9 11 3 15
Array: 5 13 7 9 11 3 15
Sorted_Array: 3 5 7 9 11 13 15
Counts: 7
Sum: 63
Mean: 9
Min: 3
Max: 15
Median: 9
Squares: 16164043636
Sum_squares: 112
Pop. Variance: 16
Sample Variance: 18.6666666666667
Pop. St.Dev.: 4
Sample St.Dev.: 4.32049379893857

Ian Korf

Sep 5, 2013, 2:48:11 AM
to unix-and-perl-...@googlegroups.com
print "@array" will put spaces between the array elements. You can change the separator with special Perl variables (such as $" and $,), but really you should use join instead. For example, to print tab-separated values you would print join("\t", @array).
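
The difference between these ways of printing an array can be sketched as follows. The values are the squared deviations from the run above (input 5 13 7 9 11 3 15, mean 9); the variable $line is introduced only for illustration.

```perl
use strict; use warnings;

my @squares = (16, 16, 4, 0, 4, 36, 36);

# Passing the array as a list to print: elements run together
print "Squares: ", @squares, "\n";               # Squares: 16164043636

# Interpolating the array in a double-quoted string: space-separated
print "Squares: @squares\n";                      # Squares: 16 16 4 0 4 36 36

# join lets you choose any separator explicitly
print "Squares: ", join("\t", @squares), "\n";   # tab-separated
my $line = join(", ", @squares);
print "Squares: $line\n";                         # Squares: 16, 16, 4, 0, 4, 36, 36
```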

For the min and max, you should not use pop and shift. These remove values from the array, which you might want in its original form later. Better to index into the sorted array: $sorted_numbers[0] and $sorted_numbers[@sorted_numbers-1].
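
A minimal sketch of reading min and max by index, assuming the array has already been sorted numerically (the variable names follow the script above). Unlike shift and pop, indexing leaves the array intact for the median step.

```perl
use strict; use warnings;

my @numbers = (5, 13, 7, 9, 11, 3, 15);
my @sorted_numbers = sort { $a <=> $b } @numbers;

my $min = $sorted_numbers[0];     # first element of the sorted array
my $max = $sorted_numbers[-1];    # last element; same as $sorted_numbers[@sorted_numbers - 1]

print "Min: $min\nMax: $max\n";   # Min: 3, Max: 15
print "Still ", scalar(@sorted_numbers), " values: @sorted_numbers\n";
```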

I think you want $sum as a scalar value, not an array. Also, you have to divide the sum by the number of values, or by that number minus 1. So $sum / @numbers or $sum / scalar(@numbers), which are the same thing in this context.
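
The scalar-versus-array point may explain the "pop var: 0.2" in the first script's output: an array used in a numeric expression evaluates to its length, not its contents. The sketch below reproduces that under the assumption that @sum held a single element; the names $wrong and $right are illustrative only.

```perl
use strict; use warnings;

my @squares = (16, 4, 0, 4, 16);   # squared deviations for 1 3 5 7 9 (mean 5)

# Storing a single total in an array, as in: my @sum = sum(@squares);
my @sum = (40);
my $wrong = @sum / 5;              # @sum in numeric context is its length, 1

# Keeping the total in a scalar and dividing by the element count
my $sum_squares = 0;
foreach my $square (@squares) {
    $sum_squares += $square;
}
my $count = @squares;              # 5
my $right = $sum_squares / $count;

print "wrong: $wrong\n";           # 0.2
print "right: $right\n";           # 8
```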


On Monday, September 2, 2013 1:23:48 PM UTC-7, Timothy Butterfield wrote: