> I want to know how many new line chars there are in all files in a directory
> (and it's subdirectories). What's the best way?
You'll want to use File::Find (a standard module) to do your directory
recursion for you. For each file you get to, open it, count its newlines,
and add that to your running total.
--
Jeff "japhy" Pinyan % How can we ever be the sold short or
RPI Acacia Brother #734 % the cheated, we who for every service
http://www.perlmonks.org/ % have long ago been overpaid?
http://princeton.pm.org/ % -- Meister Eckhart
Thanks!
> I want to know how many new line chars there are in all files in a
> directory (and it's subdirectories). What's the best way?
I'm sure this isn't how you want to do it, but this might work:
$ cat `find . -type f` | wc -l
It'll choke if you have too many files in the directory in question, as
there are limits to how long the argument list can be in the shell, but
provided that you don't exceed that limit, this will get you a quick and
dirty answer to your question.
Otherwise, you'll need to build up a list with File::Find or a similar
module, then work through the list looking for newline characters for
each file in that list. It should get the same result as above, but will
take more hand-coding to get to the final result, and it shouldn't hit
the limitation of too many files that the shell approach will have.
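A minimal sketch of that File::Find approach might look like the following. The sub names count_newlines and count_tree are made up for illustration, and the script assumes the directories to scan are given on the command line; treat it as a starting point rather than a finished tool.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Count newline characters in one file, reading line by line so a large
# file is never held in memory all at once.
sub count_newlines {
    my ($path) = @_;
    open my $fh, '<', $path or do {
        warn "couldn't open $path: $!";
        return 0;
    };
    my $count = 0;
    while (my $line = <$fh>) {
        $count++ if $line =~ /\n\z/;   # a final line without "\n" is not counted
    }
    close $fh;
    return $count;
}

# Walk every directory given and sum the counts of all plain files.
# Inside the callback File::Find has already changed into the file's
# directory, so $_ (the bare file name) can be opened directly.
sub count_tree {
    my @dirs  = @_;
    my $total = 0;
    find(sub { $total += count_newlines($_) if -f $_ }, @dirs);
    return $total;
}

# Hypothetical usage: perl count.pl dir1 dir2
print count_tree(@ARGV), "\n" if @ARGV;
```

Because the file names never pass through the shell, the argument-length limit mentioned above does not come into play.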
--
Chris Devers
: I want to know how many new line chars there are in all files
: in a directory (and it's subdirectories). What's the best way?
A lot depends on your idea of "best". It might be that the
best way is to hand the project off to someone else and reap the
benefits of their skills. In fact, many very rich people say this
is the best way to do just about anything. So what do you mean by
"best"?
One way to tackle this problem is to figure out how to find
the number of new lines in any file. Since any file may be very
large, assume at least one file cannot be loaded into memory.
Now take that solution and File::Find to apply it to many files.
HTH,
Charles K. Clarkson
--
Mobile Homes Specialist
254 968-8328
. . . With Liberty and Justice for all (heterosexuals).
Use File::Find to iterate over the files and then sum up the
newlines you find in each file. Counting the newlines in a
single file is left as an exercise for the reader.
HTH,
Thomas
$|=1;
# Set here the pattern extensions of your image files
# Usage
(@ARGV == 1 ) || die ("Usage: recurs.pl [-h] \n\t-h this help\n\n");
$command=$ARGV[0];
#$command=~s/(.*)/'$1'/;
&recursive();
# Subroutine "recursive" walks recursively down the directory tree and
# runs the command in $ARGV[0] in each directory. After reaching the end
# it comes back up the tree.
sub recursive {
system($command);
#print "$command\n";
#die;
foreach $dir (<*>) {
if (-d $dir) {
# print "$dir\n";
chdir $dir;
&recursive();
chdir "..";
}
}
}
[snip]
: # Last modified: Apr 10 1997
[snip]
Please do not provide outdated, buggy solutions to a beginners
list. We are trying to do much more than just solve problems. We
are (hopefully) fostering good programming skills first and good
solutions second. Your solution not only didn't run, it provided
several excellent reasons why simple approaches do not always work
across many OS platforms.
As others have said, you can use File::Find (or, my favorite module,
IO::All) to identify the files. For counting the number of lines, you
ought to check your docs:
perldoc -q "How do I count the number of lines in a file?"
Be careful with $|=1. The documentation for it makes it sound like
it's a good idea to set this but doing so turns buffering OFF, not ON.
Normally you leave this alone, even for pipes and sockets; Perl does the
right thing in almost every case.
See:
perldoc perlvar (search for $|)
perldoc FileHandle
--
Just my 0.00000002 million dollars worth,
--- Shawn
"Probability is now one. Any problems that are left are your own."
SS Heart of Gold, _The Hitchhiker's Guide to the Galaxy_
#!/usr/bin/perl
use File::Find;
my $totalLines;
find(\&wanted, '@directories');
sub wanted {
unless ($_=~m/.html|.mas|.pl|.txt$/i) {return 0;} #filter the kinds of files you want
open FILE, "<$File::Find::name";
print "$_: ";
my @lines=<FILE>;
print "$#lines\n";
$totalLines+=$#lines; #wanted's value is ignored so we have to do this here.
return;}
print "$totalLines\n";
This only limits me by the size of the file, or no?
Thanks!
$#lines is the index of the last entry in @lines. scalar( @lines ) is
the number of items in the array. Normally, $#lines + 1 == scalar(
@lines ). I think you should use scalar( @lines ) here.
> return;}
> print "$totalLines\n";
>
> This only limits me by the size of the file, or no?
Yes. If you have big files, replace the slurp with a loop.
(snipped)
> So far I did this:
> #!/usr/bin/perl
> use File::Find;
My personal advice is to not use a module unless
you have good reason, such as speed of a good
module or inability to write code equal to a module.
Use a module when doing so is beneficial. Many
modules are worth using. Most are not.
Research and read about Perl's $. default variable.
Below you will find an efficient, easy-to-configure
method. It presents an alternative for you to study.
Purl Gurl
#!perl
$internal_path = "c:/your/path/directory"; # your path here
chdir($internal_path);
@Directory_Parent = $internal_path;
while (@Directory_Parent)
{
$directory = shift (@Directory_Parent);
opendir(DIR, $directory) || next;
while (defined($child = readdir(DIR)))
{
if (-d "$directory/$child" && $child ne "." && $child ne "..")
{ push(@Directory_Parent, "$directory/$child"); }
if (-f "$directory/$child")
{ push (@File_List, "$directory/$child"); }
}
closedir(DIR);
}
for $file (@File_List)
{
open (COUNT, $file) || die $!;
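while (<COUNT>)
{ } # read to end of file; $. holds the current line number
$total_lines += $.; # add this file's count once, not once per line
close (COUNT); # an explicit close resets $. for the next file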
}
print "Total Lines: $total_lines";
# optional
# {
# local ($") = "\n";
# print "Files Checked:\n@File_List";
# }
The #! line should be followed by these two lines:
use warnings;
use strict;
> use File::Find;
> my $totalLines;
> find(\&wanted, '@directories');
Do you actually have a directory in the current directory named '@directories'?
> sub wanted {
> unless ($_=~m/.html|.mas|.pl|.txt$/i) {return 0;} #filter the
Your regular expression says to match any character (.) followed by the string
'html' anywhere in $_ OR any character followed by the string 'mas' anywhere
in $_ OR any character followed by the string 'pl' anywhere in $_ OR any
character followed by the string 'txt' only at the end of $_. What you
probably want is:
return unless /\.(?:html|mas|pl|txt)$/i;
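To see the difference between the two patterns, a small comparison along these lines may help. The sub names loose_match and strict_match and the sample file names are invented for the demonstration.

```perl
use strict;
use warnings;

# The pattern from the original post: unescaped dots, and "$" binds only to ".txt".
sub loose_match  { return $_[0] =~ m/.html|.mas|.pl|.txt$/i ? 1 : 0 }

# The anchored pattern: a literal dot, one of four extensions, end of name.
sub strict_match { return $_[0] =~ /\.(?:html|mas|pl|txt)$/i ? 1 : 0 }

# Hypothetical file names chosen to show where the two patterns disagree.
for my $name (qw(index.html notes.txt christmas.jpg simple.c page.html.bak)) {
    printf "%-14s loose=%d strict=%d\n",
        $name, loose_match($name), strict_match($name);
}
```

The loose pattern accepts christmas.jpg (because of "tmas"), simple.c (because of "mpl") and page.html.bak, while the anchored one accepts only index.html and notes.txt.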
> kinds of files you want
> open FILE, "<$File::Find::name";
> print "$_: ";
> my @lines=<FILE>;
> print "$#lines\n";
> $totalLines+=$#lines; #wanted's value is ignored so we have to
$#lines is one less than the number of lines in the file so your total will
not be accurate. That should be:
$totalLines += @lines;
But you don't really need to store all the lines in an array, you can do it
more simply as:
() = <FILE>;
print "$.\n";
$totalLines += $.;
Or use the example in the FAQ:
perldoc -q "How do I count the number of lines in a file"
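For reference, a block-reading counter in the spirit of that FAQ entry could be sketched as below. This is not a copy of the FAQ text; the sub name count_newlines_in_blocks and the 64 KB block size are arbitrary choices made for the example.

```perl
use strict;
use warnings;

# Count "\n" characters by reading fixed-size blocks, so memory use stays
# constant however large the file is.
sub count_newlines_in_blocks {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "couldn't open $path: $!";
    my ($count, $buffer) = (0, '');
    while (sysread $fh, $buffer, 65536) {
        $count += ($buffer =~ tr/\n//);   # tr returns how many characters matched
    }
    close $fh or die "couldn't close $path: $!";
    return $count;
}

# Hypothetical usage: perl nl.pl file1 file2
print "$_: ", count_newlines_in_blocks($_), "\n" for @ARGV;
```

Unlike $., which counts records read, this counts the newline characters themselves, so a last line without a trailing newline does not add to the total.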
> do this here.
> return;}
> print "$totalLines\n";
>
> This only limits me by the size of the file, or no?
John
--
use Perl;
program
fulfillment
Hi,
In addition to John W. Krahn's good advice:
> So far I did this:
>
> #!/usr/bin/perl
>
> use File::Find;
> my $totalLines;
> find(\&wanted, '@directories');
> sub wanted {
> unless ($_=~m/.html|.mas|.pl|.txt$/i) {return 0;} #filter the kinds
> of files you want
> open FILE, "<$File::Find::name";
Always check if operations succeeded:
open (FILE, '<', $File::Find::name)
or die "couldn't open $File::Find::name: $!";
> print "$_: ";
> my @lines=<FILE>;
and close opened files:
close FILE or die "couldn't close $File::Find::name: $!";
Thanks, don't know how I missed that. :-)
You bring up an interesting point about closing the filehandle because
normally you don't have to worry about that as perl will do the right thing.
However in the example I posted using the $. variable:
sub wanted {
...
() = <FILE>;
print "$.\n";
$totalLines += $.;
}
print "$totalLines\n";
Produces an incorrect value for $totalLines unless you close the filehandle
but if you don't close the filehandle then you can do this:
sub wanted {
...
() = <FILE>;
}
print "$.\n";
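The behaviour of $. described here can be checked with a small experiment like the one below. The scratch files and the sub name line_numbers are made up for the demonstration; the bareword handle FILE is used on purpose, since the effect depends on reopening the same handle.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two scratch files of 3 and 2 lines, created only for this demonstration.
my @files;
for my $text ("a\nb\nc\n", "d\ne\n") {
    my ($out, $name) = tempfile(UNLINK => 1);
    print $out $text;
    close $out;
    push @files, $name;
}

# Read each file through the same bareword handle and record $. afterwards.
sub line_numbers {
    my ($close_between) = @_;
    my @seen;
    for my $name (@files) {
        open FILE, '<', $name or die "couldn't open $name: $!";
        () = <FILE>;                   # read to end of file, discarding the lines
        push @seen, $.;
        close FILE if $close_between;  # an explicit close resets $.
    }
    close FILE unless $close_between;  # tidy up the handle left open
    return @seen;
}

print "with close:    @{[ line_numbers(1) ]}\n";
print "without close: @{[ line_numbers(0) ]}\n";
```

With an explicit close the recorded values should be 3 and 2, one count per file. Without it the second value should be 5, because the implicit close done by open does not reset $., so the counter becomes a running total across files.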
John
Not sure if I understand you correctly:
Do you suggest *not* to close filehandles, because it's done by perl "doing
the right thing"?
Or should one decide in every case, if closing should be explicitly done or
not?
My thought was: When I always close filehandles, I don't have to think about
closing or not closing them (comparable to always signaling when driving,
even if nobody else is on the road).
For example, perldoc -f open states:
"[...] You don't have to close FILEHANDLE if you are immediately
going to do another "open" on it [...]"
In a normal case, there is one point at which any filehandle is not reopened:
After the last reopening. So this case would have to be checked in a loop
(pseudocode: close FH if finished reopening)?
> However in the example I posted using the $. variable:
>
> sub wanted {
> ...
> () = <FILE>;
> print "$.\n";
> $totalLines += $.;
> }
> print "$totalLines\n";
>
> Produces an incorrect value for $totalLines unless you close the filehandle
I did not see this point.
> but if you don't close the filehandle then you can do this:
>
> sub wanted {
> ...
> () = <FILE>;
> }
> print "$.\n";
Sorry if it's just a misunderstanding by me wasting your time!
joe