Finding # of msgs in mailbox?


Neil Bowers

Nov 14, 1994, 9:41:40 PM
James A. Robinson (ji...@plato.simons-rock.edu) wrote:
: I am using procmail to drop mail msgs into different folders, and I
: want to make a script that tells me how many msgs are in each folder.

I have a Perl version of "from" which parses your .procmailrc to determine
a list of spools, then gives one of two styles of "from" listing.
I emailed James a copy separately.

neilb

James A. Robinson

Nov 14, 1994, 7:24:53 PM

I am using procmail to drop mail msgs into different folders, and I
want to make a script that tells me how many msgs are in each folder.
The mail folders are in this format:

From person1\n
rest of header\n
\n
Msg txt.\n
\n
\n
From person2\n
rest of header\n
\n
Msg txt.\n
\n
\n
etc...

I don't understand the multiple line mode, and my attempt at

$/ = "";
$* = 1;

if (/\n\nFrom.*/)
{
$count++;
}
etc...

failed miserably. Can anybody help me out with a pattern search that
will accurately count the number of msgs? Do I have to group each
header+msg into a paragraph or something?


Jim
--
http://plato.simons-rock.edu/~jimr/

Lawrence Kesteloot

Nov 14, 1994, 8:56:16 PM
In article <3a8v4l$4...@plato.simons-rock.edu>,
James A. Robinson <ji...@plato.simons-rock.edu> wrote:
>I am using procmail to drop mail msgs into different folders, and I
>want to make a script that tells me how many msgs are in each folder.
>
> ...
> if (/\n\nFrom.*/)

As far as I know, you're guaranteed that the first five letters of the
line will be F-r-o-m-space, so:

egrep -c "^From " folder

should work. This perl script is actually faster on my machine:

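$count = 0;                    # so an empty folder prints 0, not a blank line
while (<>) {
    $count++ if (/^From /);    # each message starts with a "From " line
}
print "$count\n";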
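
Since the original question asks for a count per folder, the same line-by-line test can be wrapped in a loop over several files. What follows is only a sketch in present-day Perl 5 (lexical filehandles and three-argument open are newer than this thread). count_messages is a made-up name, and the folders are whatever paths you pass on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the lines that start with "From " in one mbox-style file.
# (count_messages is a hypothetical helper name.)
sub count_messages {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!\n";
    my $count = 0;
    while (my $line = <$fh>) {
        $count++ if $line =~ /^From /;
    }
    close($fh);
    return $count;
}

# Print one "folder: count" line per file named on the command line.
foreach my $folder (@ARGV) {
    printf "%s: %d\n", $folder, count_messages($folder);
}
```

Run it with the folder files as arguments. Like the one-liner above, it trusts every line that starts with "From ", so it carries the same limitation.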

Lawrence

Jeffrey Friedl

Nov 15, 1994, 6:17:39 AM
ji...@plato.simons-rock.edu (James A. Robinson) writes:
|> I don't understand the multiple line mode, and my attempt at

"Multiple-line mode" just means that ^ and $ can match embedded newlines.
No other magic. If you don't have ^ or $ in your regex, multiline mode (or
lack thereof) just doesn't matter.

Perhaps "multiple-line mode" is a misnomer. How about "magic ^/$ mode"?
Without the magic, ^ and $ match the beginning and end of the STRING.
With the magic, they match the beginning and end of a text line.

Searching for /^From /

multi-line mode:
will find any line in $_ beginning with "From ";

default mode:
will be true only if $_ begins with "From "

So, if your whole spool is in a variable (such as $_), you can count
lines beginning with "From " with:

        ___ perl 4 ___                      ____ perl 5 _____

    {
        local($*) = 1;
        $count = 0;                         $count = 0;
        $count++ while m/^From /g;          $count++ while m/^From /gm;
    }                                       (the m in /gm means multi-line)

(the perl4 way will work with perl5, of course, but is not the "approved" way)

It's important for your understanding to know what would happen if
1) you forgot to set multiline mode.
2) you used /\nFrom / instead of /^From /

Think about it a bit. Answers discussed below.

-----

If you don't already have the whole spool in a variable and don't need to
do it that way, Lawrence Kesteloot already posted something that should
help.

|> $/ = ""; $* = 1;
|> if (/\n\nFrom.*/)
|> {
|> $count++;
|> }

Just to comment here, had you used 'while' rather than 'if', *AND* used the
/g modifier, you would have been close. All you would have needed to do
then was insert a
$_ = <>
before the if line (-:

(oh, and you would have missed the first "From " in the file if it was on
the first two lines.)

Your mail system may well be different, but many (most?) UNIX mailers
will ensure that only start-of-message lines will begin with 'From '.
Try
echo 'From me' | mail <yourself>

and you'll probably see that the mailer inserts '>' or whatnot in front
of the "From" so that it doesn't look like the start of a header.
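
The quoting described above can be imitated in a couple of lines. This is a sketch of the common convention only, not of what any particular mailer does. escape_from_lines is a hypothetical helper, and it assumes it is handed the message body alone, without headers.

```perl
use strict;
use warnings;

# Prefix ">" to every body line that starts with "From ", so that such a
# line can no longer be mistaken for the start of a new message.
# (escape_from_lines is a made-up name for illustration.)
sub escape_from_lines {
    my ($body) = @_;
    $body =~ s/^From />From /mg;   # /m lets ^ match at every line start
    return $body;
}

print escape_from_lines("Hi,\nFrom me to you.\nBye.\n");
```

Only the second line of the sample body changes: it comes out as ">From me to you."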


--------------

It's important for your understanding to know what would happen if
1) you forgot to set multiline mode.
2) you used /\nFrom / instead of /^From /

If you forgot to set multiline mode with
m/^From /g,
the ^ would only match the beginning of the string. Thus, $count would
be either 0 or 1 depending on whether the very first line started with
"From " or not.

If you used
m/\nFrom /g
then multiline mode wouldn't matter -- it only matters with ^ and $. The
result would be the same as m/^From / (*with* multiline mode) UNLESS the
very first line DID begin with "From ". Such a line would be matched by
m/^From / but NOT m/\nFrom / since there's no \n before the very first
"From ".

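The three cases above can be checked on a small in-memory spool. This is a sketch in Perl 5 syntax. The spool text and the count_matches helper are made up for illustration, and the /m flag stands in for setting $*.

```perl
use strict;
use warnings;

# A tiny made-up spool: two messages, the first starting on the first line.
my $spool = "From alice\nSubject: one\n\nHello.\n\n"
          . "From bob\nSubject: two\n\nBye.\n";

# Count how often a pattern matches; in list context //g returns all matches.
sub count_matches {
    my ($text, $re) = @_;
    my @hits = $text =~ /$re/g;
    return scalar @hits;
}

my $no_m    = count_matches($spool, qr/^From /);   # ^ matches only at string start
my $with_m  = count_matches($spool, qr/^From /m);  # ^ matches at every line start
my $newline = count_matches($spool, qr/\nFrom /);  # needs a \n before "From "

print "no /m: $no_m, with /m: $with_m, \\nFrom: $newline\n";
# prints: no /m: 1, with /m: 2, \nFrom: 1
```

The \nFrom count is one short here precisely because the first message starts on the very first line of the string.
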
Go ye forth and practice unto perl.

*jeffrey*
-------------------------------------------------------------------------
Jeffrey E.F. Friedl <jfr...@omron.co.jp> Omron Corporation, Kyoto Japan
See my Jap/Eng dictionary at http://www.omron.co.jp/cgi-bin/j-e
or http://www.cs.cmu.edu:8001/cgi-bin/j-e

James A. Robinson

Nov 18, 1994, 9:55:29 AM
In article <3a94g0$j...@fermi.cs.unc.edu>,
Lawrence Kesteloot <kest...@cs.unc.edu> wrote:
>In article <3a8v4l$4...@plato.simons-rock.edu>,
>James A. Robinson <ji...@plato.simons-rock.edu> wrote:
>>I am using procmail to drop mail msgs into different folders, and I
>>want to make a script that tells me how many msgs are in each folder.
>>
>> ...
>As far as I know, you're guaranteed that the first five letters of the
>line will be F-r-o-m-space, so:


Hello,

Thanks to everyone for their help. The problem I have with
everybody's solution (/^From\s*/) is that my mailer does not alter
non-header "From"s that are flush left, so I may end up counting a
"From" that is in the message body itself. That's why I wanted to scan
for 2 returns as well; it would lessen the likelihood of a mistake
being made.
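
One way to scan for the two returns as well is to read each folder in one gulp and count only those "From " lines that either open the file or directly follow a blank line. The following is a sketch in Perl 5 syntax, with count_separated as a made-up name. It reduces the chance of a miscount but cannot remove it: a body paragraph that begins with "From " right after a blank line is still counted.

```perl
use strict;
use warnings;

# Count "From " lines that open the text or directly follow a blank line.
# (count_separated is a hypothetical helper name.)
sub count_separated {
    my ($text) = @_;
    my $count = 0;
    $count++ while $text =~ /(?:\A|\n\n)From /g;
    return $count;
}

# Slurp each folder whole: with $/ undefined, <$fh> returns the entire file.
foreach my $folder (@ARGV) {
    open(my $fh, '<', $folder) or die "Cannot open $folder: $!\n";
    my $text = do { local $/; <$fh> };
    close($fh);
    printf "%s: %d\n", $folder, count_separated($text);
}
```

Using \A for the first message avoids the problem of a plain \n\nFrom pattern missing a "From " on the first line of the folder.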

Well, perhaps I'll move back to MH and use those separate files as an
indicator. :) (anybody know what plum DOES? Are there any docs/faqs
on it?)


Thanks again,

Jim Robinson
--
http://plato.simons-rock.edu/~jimr/
