I'm fairly new to Perl, so I apologize if this is an often asked question.
I have read through the FAQ and several manuals and haven't seen examples
of what I am trying to do.
I basically want to read through a text file line by line, place each
unique line in an array. When I hit a duplicate line, I want to increment a
value in my array so I can keep track of how many times that line is
duplicated. I'm not sure if I should be using a List of Lists or a Hash for
this (like I said, my exposure to Perl is limited).
I know that to print out all elements of an array, I would use something
like this:
for $i (0 .. $#UniqueLines)
{
    for $j ( 0 .. $#{$UniqueLines[$i]} )
    {
        print $UniqueLines[$i][$j];
    }
}
I tried to do something similar to this to change the value of a slice. I
haven't found any examples on how to do this, just how to print out. So I
came up with this:
while ($line = <INPUTFILE>)
{
    for $i (0 .. $#UniqueLines)
    {
        if ($UniqueLines[$i] eq $line)
        {
            for $j ( 0 .. $#{$UniqueLines[$i]} )
            {
                $UniqueLines[$i][$j++];
            }
            $duplicate = 1;
            last;
        }
    }
}
Obviously, this doesn't do what I want, so if someone could point me to a
man page or an area of the FAQ that could help me out, I'd greatly
appreciate it.
Thanks,
Kate
Please read section four of the Perl FAQ. Pay careful attention when
you reach "How can I extract just the unique elements of an array?".
Greg
--
The first truth is that liberty is not safe if the people tolerate the growth
of private power to the point where it becomes stronger than that of their
democratic state itself. That, in its essence, is Fascism.
-- Franklin D. Roosevelt
>In article <8DD46B23...@news.chiso.com>,
> kate_no_spam@for_me_chiso.com (Kate) writes:
>: I basically want to read through a text file line by line, place
>: each unique line in an array. When I hit a duplicate line, I want to
>: increment a value in my array so I can keep track of how many times
>: that line is duplicated. I'm not sure if I should be using a List of
>: Lists or a Hash for this (like I said, my exposure to Perl is limited).
>
>Please read section four of the Perl FAQ. Pay careful attention when
>you reach "How can I extract just the unique elements of an array?".
>
>Greg
If I do it this way, I'm still not sure how to count how many times an
entry occurs. Extracting the unique entries isn't my problem, it's
incrementing my counter.
Thanks for the help,
Kate
my @uniq;
my %seen;

while (<>) {
    chomp;
    push @uniq, $_ unless $seen{$_}++;
}

my $mode = shift @uniq;
for (@uniq) {
    $mode = $_ if $seen{$_} > $seen{$mode};
}

print "The most frequently occurring line was:\n $mode\n",
      "It occurred $seen{$mode} time",
      $seen{$mode} == 1 ? "" : "s",
      ".\n";
Greg
--
A lot of people mistake a short memory for a clear conscience.
-- Doug Larson
GB> In article <8DD4695E...@news.chiso.com>,
GB> kate_no_spam@for_me_chiso.com (Kate) writes:
GB> : Greg Bacon <gba...@cs.uah.edu> wrote in <7imcj5$jh2$3...@info2.uah.edu>:
GB> : >Please read section four of the Perl FAQ. Pay careful attention when
GB> : >you reach "How can I extract just the unique elements of an array?".
GB> :
GB> : If I do it this way, I'm still not sure how to count how many times an
GB> : entry occurs. Extracting the unique entries isn't my problem, it's
GB> : incrementing my counter.
GB> my @uniq;
GB> my %seen;
GB> while (<>) {
GB>     chomp;
GB>     push @uniq, $_ unless $seen{$_}++;
GB> }
GB> my $mode = shift @uniq;
GB> for (@uniq) {
GB>     $mode = $_ if $seen{$_} > $seen{$mode};
GB> }
greg, you can do better than that! why two loops? why the push into @uniq?
while(<>) {
    chomp ;
    $count = ++$seen{$_} ;
    $max_line = $_ && $max_count = $count if $count > $max_count ;
}
@uniq = keys %seen ;
or for you fans of shorter code:
$max_line = $_ && $max_count = $seen{$_} if ++$seen{$_} > $max_count ;
uri
--
Uri Guttman ----------------- SYStems ARCHitecture and Software Engineering
u...@sysarch.com --------------------------- Perl, Internet, UNIX Consulting
Have Perl, Will Travel ----------------------------- http://www.sysarch.com
The Best Search Engine on the Net ------------- http://www.northernlight.com
Thanks a lot for your guidance. I appreciate the help!
Thanks again,
Kate
Uri Guttman <u...@sysarch.com> wrote in <x7emk1i...@home.sysarch.com>:
But re-read that FAQ more carefully. Try part (b), and print
out the hash %saw afterward. The keys are your unique lines, and
the values are your counts. Once you see that this actually does
what you want, you'll wish to go read up on the cute pieces
which do this magic for you. Part of this depends on the fact
that Perl automatically enlarges arrays and hashes when you
give them new pieces, and changes old values in hashes when you
use new values with the previous keys.
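A minimal sketch of the counting idea described above, assuming the lines are already in a list rather than coming from a file. The sub name count_lines and the sample data are made up for illustration; %saw follows the FAQ's naming.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash reference mapping line => count.
sub count_lines {
    my @lines = @_;
    my %saw;
    # A key springs into existence the first time it is used; ++ treats
    # the missing value as 0, so the first sighting stores 1.
    $saw{$_}++ for @lines;
    return \%saw;
}

my $saw = count_lines("apple", "pear", "apple", "fig", "apple");
for my $line (sort keys %$saw) {
    print "$line: $saw->{$line}\n";
}
```

With the sample data this prints apple: 3, fig: 1 and pear: 1, one per line. To count the lines of a file instead, the same increment can sit inside a while (<>) loop after a chomp.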
> Thanks for the help,
You're welcome,
David
--
David Cassell, OAO cas...@mail.cor.epa.gov
Senior computing specialist
mathematical statistician
It's a pedagogical issue. I wanted to show how to collect the
information she wanted and then show one possible way to use it.
Playing Perl golf (fewest (key)strokes wins!) with people who have lots
of experience is fine, but it's not going to help much for people who
are still trying to get the hang of it.
Greg
--
VMS is a text-only adventure game. If you win you can use Unix.
-- Bill Davidsen
GB> It's a pedagogical issue. I wanted to show how to collect the
GB> information she wanted and then show one possible way to use it.
GB> Playing Perl golf (fewest (key)strokes wins!) with people who have lots
GB> of experience is fine, but it's not going to help much for people who
GB> are still trying to get the hang of it.
i agree to a point. showing a beginner a simpler way and later the
better but more idiomatic way is common. where to draw the line is an
issue. at the boston tutorials i heard a line (from randal?) which was
something like teaching beginners is progressive lying. everything you
teach early on is rescinded later with better techniques. i just like to
start off with the better techniques instead of the simpler ones.
why can't newbies use hash slices in their hello world programs? :-)
In article <x7emk1i...@home.sysarch.com> on 28 May 1999 12:12:09 -0400,
Uri Guttman <u...@sysarch.com> says...
...
> greg, you can do better than that! why two loops? why the push into @uniq?
Uri, you can do better than that! Why not post code that compiles?
> while(<>) {
>     chomp ;
>
>     $count = ++$seen{$_} ;
>     $max_line = $_ && $max_count = $count if $count > $max_count ;
> }
>
> @uniq = keys %seen ;
>
> or for you fans of shorter code:
>
> $max_line = $_ && $max_count = $seen{$_} if ++$seen{$_} > $max_count ;
Do you want me to spell out the problem for you, or would you prefer to
own up to it yourself?
:-( for you, buddy.
--
(Just Another Larry) Rosler
Hewlett-Packard Company
http://www.hpl.hp.com/personal/Larry_Rosler/
l...@hpl.hp.com
LR> [Posted and a courtesy copy mailed, so Uri can feel bad sooner :-).]
i don't feel bad. i left the bug as an exercise for you. :-)
>>
>> $max_line = $_ && $max_count = $seen{$_} if ++$seen{$_} > $max_count ;
LR> Do you want me to spell out the problem for you, or would you prefer to
LR> own up to it yourself?
LR> :-( for you, buddy.
i don't like or. i like ||. i will use || even when it is buggy! in fact
i use it for open since i use parens with open. i don't normally paren =
so this should have used or or parens. but the thought of winning perl
golf (i like greg's coinage!) can lead to bugs.
and of course, i didn't test it as it was too simple. :-)
In article <x71zg1h...@home.sysarch.com> on 28 May 1999 15:03:31 -0400,
Uri Guttman <u...@sysarch.com> says...
> >>>>> "LR" == Larry Rosler <l...@hpl.hp.com> writes:
> LR> [Posted and a courtesy copy mailed, so Uri can feel bad sooner :-).]
>
> i don't feel bad. i left the bug as an exercise for you. :-)
>
> >> $max_line = $_ && $max_count = $seen{$_} if ++$seen{$_} > $max_count ;
>
> LR> Do you want me to spell out the problem for you, or would you prefer to
> LR> own up to it yourself?
>
> LR> :-( for you, buddy.
>
> i don't like or. i like ||. i will use || even when it is buggy! in fact
> i use it for open since i use parens with open. i don't normally paren =
> so this should have used or or parens. but the thought of winning perl
> golf (i like greg's coinage!) can lead to bugs.
For those whom Uri is (deliberately?) leaving clueless, in the paragraph
above,
s/or/'and'/;
s/\|\|/&&/g;
s/!.*/!/s;
> and of course, i didn't test it as it was too simple. :-)
That's what they all say. :-)
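For anyone who wants the problem spelled out: && binds tighter than =, so the posted statement parses as $max_line = (($_ && $max_count) = $count), and perl will not assign to a logical-and expression. Switching to 'and' compiles, but it skips the second assignment whenever the line is "0" or empty. Here is a sketch of one repair that avoids both operators by using a list assignment. The sub name most_frequent and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (most frequent line, its count).
sub most_frequent {
    my @lines = @_;
    my (%seen, $max_line);
    my $max_count = 0;
    for my $line (@lines) {
        my $count = ++$seen{$line};
        # List assignment sets both variables at once, so a line of "0"
        # or "" cannot short-circuit the second half.
        ($max_line, $max_count) = ($line, $count) if $count > $max_count;
    }
    return ($max_line, $max_count);
}

my ($line, $count) = most_frequent("b", "0", "a", "0", "b", "0");
print "'$line' occurred $count times\n";
```

With the sample data this prints '0' occurred 3 times, which is exactly the case the 'and' version would get wrong.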
I'm still not satisfied that you've established that your way is
better or even more idiomatic. The one advantage you might be able
to press with your snippet is that it's more specialized for the task
of finding the mode. Your approach doesn't preserve order either.
In production code, I like to separate code into task chunks as much
as possible to save some hair pulling for the poor sod who has to come
behind me to maintain it. Your approach loses style points in that
category too.
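To make the two points concrete, here is a rough sketch (not code from either post) that keeps first-seen order and gives each task its own sub. The sub names collect_lines and find_mode and the sample data are invented for illustration.

```perl
use strict;
use warnings;

# Task 1: gather unique lines in first-seen order, plus their counts.
sub collect_lines {
    my @lines = @_;
    my (@uniq, %seen);
    for my $line (@lines) {
        push @uniq, $line unless $seen{$line}++;
    }
    return (\@uniq, \%seen);
}

# Task 2: pick the most frequent line; earlier lines win ties.
sub find_mode {
    my ($uniq, $seen) = @_;
    my $mode;
    for my $line (@$uniq) {
        $mode = $line
            if !defined($mode) || $seen->{$line} > $seen->{$mode};
    }
    return $mode;
}

my ($uniq, $seen) = collect_lines(qw(pear fig pear apple fig pear));
print "order: @$uniq\n";
print "mode:  ", find_mode($uniq, $seen), "\n";
```

With the sample data, the order line shows pear fig apple and the mode is pear. Because a hash does not remember insertion order, the separate @uniq array is what preserves it; keys %seen alone would not.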
: where to draw the line is an
: issue. at the boston tutorials i heard a line (from randal?) which was
: something like teaching beginners is progressive lying. everything you
: teach early on is rescinded later with better techniques. i just like to
: start off with the better techniques instead of the simpler ones.
"I already have too much problem with people thinking the efficiency
of a perl construct is related to its length."
-- Larry Wall
Greg
--
We have enough youth, how about a fountain of SMART?
GB> "I already have too much problem with people thinking the efficiency
GB> of a perl construct is related to its length."
GB> -- Larry Wall
i have shown and seen that many times here. efficiency analysis of perl
code is not something you can always eyeball.
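One way to measure rather than eyeball is the standard Benchmark module. The sketch below is only an illustration: the sub names two_loops and one_loop and the sample data are made up, and the numbers will vary by machine and perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up sample data: 200 distinct lines plus one line repeated 50 times.
my @lines = (map("line $_", 1 .. 200), ("popular line") x 50);

# Collect first, then scan the unique lines for the mode.
sub two_loops {
    my (@uniq, %seen);
    for my $line (@lines) {
        push @uniq, $line unless $seen{$line}++;
    }
    my $mode = $uniq[0];
    for my $line (@uniq) {
        $mode = $line if $seen{$line} > $seen{$mode};
    }
    return $mode;
}

# Track the running maximum while counting.
sub one_loop {
    my (%seen, $max_line);
    my $max_count = 0;
    for my $line (@lines) {
        my $count = ++$seen{$line};
        ($max_line, $max_count) = ($line, $count) if $count > $max_count;
    }
    return $max_line;
}

# Run each version for at least one CPU second, then print a comparison table.
cmpthese(-1, {
    two_loops => \&two_loops,
    one_loop  => \&one_loop,
});
```

Both subs return the same line for this data, so any difference in the table comes from how the work is arranged, not from what is computed.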
What, you mean like:
#!/usr/bin/perl -w
use strict;

my @hello = qw(hello world);
my %blah;
@blah{@hello} = (1,2);

print join ',', map {ucfirst($_)}
                sort { $blah{$a} <=> $blah{$b} }
                keys %blah;
;-}
/J\
--
Jonathan Stowe <j...@gellyfish.com>
Some of your questions answered:
<URL:http://www.btinternet.com/~gellyfish/resources/wwwfaq.htm>
Hastings: <URL:http://www.newhoo.com/Regional/UK/England/East_Sussex/Hastings>