Wanted: C-News log report generator.

45 views
Skip to first unread message

Emmett Hogan

unread,
Aug 15, 1991, 2:58:53 PM8/15/91
to

The subject line says it all...has anyone written a C or PERL program
to parse the C-News log file and spit out a nice formatted report?

Thanks in advance,
Emmett


--
-------------------------------------------------------------------
Emmett Hogan Computer Science Lab, SRI International
Inet: ho...@csl.sri.com
UUCP: {ames, decwrl, pyramid, sun}!fernwood!hercules!hogan
USMAIL: EL231, 333 Ravenswood Ave, Menlo Park, CA 94025
PacBell: (415)859-3232 (voice), (415)859-2844 (fax)
ICBM: 37d 27' 14" North, 122d 10' 52" West
-------------------------------------------------------------------

Les Carleton

unread,
Aug 16, 1991, 3:49:48 AM8/16/91
to

In article <HOGAN.91A...@cujo.csl.sri.com>, ho...@csl.sri.com (Emmett Hogan) writes:
|>The subject line says it all...has anyone written a C or PERL program
|>to parse the C-News log file and spit out a nice formatted report?
|>
|>Thanks in advance,
|>Emmett

I don't usually like to do this, but "Me too!!!" (I've been looking
around the net but can't seem to find)

...Les...
"Newswatcher"
--
___________________________________________________________________
Les Carleton / Tel: +44 (256) 488351 \ \
Digital Equipment Corporation / Email: l...@decuk.uvo.dec.com \ \
CSC/UK Ultrix Support Group / "The Video Widget" \ \
----------------------------+ "Open Standards ... Free Software" / /
"My opinions, My Mind, Don't take that away from me!" / /
_________________________________________________________________/_/
This .signature has been stolen by others ... please flame any you see!!!

Paul A Vixie

unread,
Aug 15, 1991, 8:19:37 PM8/15/91
to
here's what we use. it's called "yacnls". i didn't write it. it gets
very large when it runs and sometimes it just keeps getting larger until
either it or the system crashes. but when it works it works very well.

#! /usr/local/bin/perl

#
# C News Log Report Generator
#
# Mark Nagel <na...@ics.uci.edu>
# $Id: rep_log.pl,v 1.6 89/11/29 11:00:51 news Exp $
#
# Large parts of this script were based on the B News log report
# awk script.
#

$NEWSBIN = $ENV{"NEWSBIN"} || "/usr/lib/newsbin";
$NEWSCTL = $ENV{"NEWSCTL"} || "/usr/lib/news";
$newshist = "$NEWSBIN/maint/newshist";

##############################################################################
# customization #
##############################################################################

#
# The MAXARGLEN variable controls how many message-ids will be queried
# for at one time via the newshist program. Tune to your system (make
# as large as allowed). The length here is the total length in
# characters of all the arguments.
#
$MAXARGLEN = 2048;

#
# The "local" array contains a list of regular expressions that
# identify a site entry in the log file as local. Each regular
# expression will be matched case-independently and anchored at the
# beginning/end.
#
@local = (
"bacchus", # randoms
"[^.]*\.pa\.dec\.com" # other local client names
);

#
# The "gateway" array contains a list of regular expressions that
# identify a site entry in the log file as a gateway. Each regular
# expression will be matched case-independently and anchored at the
# beginning/end.
#
@gateway = (
"decwrl"
);

##############################################################################
# initialization #
##############################################################################

$duplicates = 0;
@msgids = ();
$arglen = 0;

$silent = 0;
while ($_ = $ARGV[0], /^-/) {
shift;
last if (/^--$/);
/^-s/ && ($silent = 1);
}

##############################################################################
# log file scan #
##############################################################################

while (<>) {
next if /^$/; # skip blank lines
chop;

#
# extract fields from line
#
($month,$date,$time,$site,$code,$msgid,@logent) = split;

#
# fix up the site name as necessary
#
for $regexp (@gateway) {
if ($site =~ /^$regexp$/i) {
$site = "(GATEWAY)";
last;
}
}
for $regexp (@local) {
if ($site =~ /^$regexp$/i) {
$site = "local";
last;
}
}
$site =~ s/\..*$//;

#
# check the receipt code
#
if ($code eq "-") { # rejected article
$reject{$site}++;
if ($logent[0] eq "duplicate") {
$duplicates++;
} elsif ($logent[0] eq "no" && $logent[1] eq "subscribed") {
#
# "no subscribed groups in `...'"
#
$ng = $logent[4];
$ng =~ s/`([^']*)'/$1/;
@ng = split(/,/, $ng);
for $i (@ng) {
$unsub{$i}++;
}
} elsif ($logent[0] eq "all" && $logent[3] eq "excluded") {
#
# "all groups `...' excluded in active"
#
$ng = $logent[2];
$ng =~ s/`([^']*)'/$1/;
@ng = split(/,/, $ng);
for $i (@ng) {
$excluded{$i}++;
}
} else {
#
# print any others as-is for inspection
#
print "$_\n" unless ($silent);
}
} elsif ($code eq "+") { # accepted article
$accept{$site}++;
if ($arglen + length($msgid) > $MAXARGLEN) {
do recordgroups(@msgids);
@msgids = ($msgid);
$arglen = length($msgid);
} else {
push(@msgids, $msgid);
$arglen += length($msgid);
}
for ($i = 0; $i <= $#logent; $i++) {
$n = $logent[$i];
$neighbor{$n} = 1;
$xmited{$n}++;
}
} elsif ($code eq "j") { # junked after accepted
$junked{$site}++;
if ($logent[0] eq "junked") {
$ng = $logent[4];
$ng =~ s/`([^']*)'/$1/;
@ng = split(/,/, $ng);
for $i (@ng) {
$badng{$i}++;
}
}
} elsif ($code eq "i") { # ihave message
$ihave++;
} elsif ($code eq "s") { # sendme message
$sendme++;
} else { # illegal/unknown code
print "$_\n" unless ($silent);
}
}
do recordgroups(@msgids) if ($#msgids >= 0);

##############################################################################
# statistics generation #
##############################################################################

#
# rejected messages
#
$rtot = 0;
while (($key, $val) = each(reject)) {
if ($val > 0) {
$list{$key} = 1;
$rtot += $val;
}
}

#
# accepted messages
#
$atot = 0;
while (($key, $val) = each %accept) {
if ($val > 0) {
$list{$key} = 1;
$atot += $val;
}
}

#
# transmitted messages
#
$xtot = 0;
while (($key, $val) = each(xmited)) {
if ($val > 0) {
$list{$key} = 1;
$xtot += $val;
}
}

#
# junked messages
#
$jtot = 0;
while (($key, $val) = each(junked)) {
if ($val > 0) {
$list{$key} = 1;
$jtot += $val;
}
}

##############################################################################
# report generation #
##############################################################################

#
# Transmission Statistics
#
$totalarticles = $atot + $rtot;
$totalarticles++ if ($totalarticles == 0);
print "\n" unless ($silent);
print "System \tAccept\tReject\tJunked\tXmit to\t %total\t%reject\n";
for $i (sort(keys(list))) {
$sitetot = $accept{$i} + $reject{$i};
$sitetot++ if ($sitetot == 0);
$articles{$i} = $sitetot;

printf "%-14s\t%6d\t%6d\t%6d\t%7d\t%6d%%\t%6d%%\n",
$i, $accept{$i}, $reject{$i}, $junked{$i}, $xmited{$i},
($sitetot * 100) / $totalarticles, ($reject{$i} * 100) / $sitetot;
}
printf "\nTOTALS \t%6d\t%6d\t%6d\t%7d\t%6d%%\t%6d%%\n",
$atot, $rtot, $jtot, $xtot, 100, ($rtot * 100) / $totalarticles;
print "\nTotal Articles processed $totalarticles";
print " (1 duplicate)" if ($duplicates == 1);
print " ($duplicates duplicates)" if ($duplicates > 1);
print "\n";

#
# Netnews Categories
#
if ($atot > 0) {
print "\nNetnews Categories Received\n";
$l = 0;
for $i (keys(ngcount)) {
$l = length($i) if ($l < length($i));
}
$fmt = "%-${l}s %d\n";
while (1) {
$max = 0;
for $j (keys(ngcount)) {
if ($ngcount{$j} > $max) {
$max = $ngcount{$j};
$i = $j;
}
}
last if ($max == 0);
printf $fmt, $i, $ngcount{$i};
$ngcount{$i} = 0;
}
}

#
# Bad Newsgroups
#
@keys = sort(keys(badng));
if ($#keys >= 0) {
print "\nBad Newsgroups Received\n";
$l = 0;
for $i (@keys) {
$l = length($i) if ($l < length($i));
}
$fmt = "%-${l}s %d\n";
for $i (@keys) {
printf $fmt, $i, $badng{$i};
}
}

#
# Unsubscribed Newsgroups
#
@keys = sort(keys(unsub));
if ($#keys >= 0) {
print "\nUnsubscribed Newsgroups Received\n";
$l = 0;
for $i (@keys) {
$l = length($i) if ($l < length($i));
}
$fmt = "%-${l}s %d\n";
for $i (@keys) {
printf $fmt, $i, $unsub{$i};
}
}

#
# Excluded Newsgroups
#
@keys = sort(keys(excluded));
if ($#keys >= 0) {
print "\nExcluded Newsgroups Received\n";
$l = 0;
for $i (@keys) {
$l = length($i) if ($l < length($i));
}
$fmt = "%-${l}s %d\n";
for $i (@keys) {
printf $fmt, $i, $excluded{$i};
}
}

##############################################################################
# recordgroups(msgid)
#
# Given a list of message-ids, retrieve the newsgroups associated with each
# message-id and update the global ngcount table appropriately.

sub recordgroups {
local(@msgids) = @_;
local($i, @groups);

for ($i = 0; $i <= $#msgids; $i++) {
$msgids[$i] =~ s/<([^>]*)>/$1/;
}
open(NH, "-|") || exec $newshist, '--', @msgids;
while (<NH>) {
chop;
($_, $_, @groups) = split;
foreach $i (@groups) {
$i =~ s/\/.*$//;
if ($i =~ /\./) {
$i =~ s/\..*//;
$ngcount{$i}++;
}
}
}
close(NH) || warn("exec($newshist): $!\n");
}
--
Paul Vixie, DEC Western Research Lab
Palo Alto, California, USA "Be neither a conformist or a rebel,
<vi...@decwrl.dec.com> decwrl!vixie for they are really the same thing.
<pa...@vixie.sf.ca.us> vixie!paul Find your own path, and stay on it." (me)

Ron Nash

unread,
Aug 16, 1991, 12:00:43 PM8/16/91
to
In article <HOGAN.91A...@cujo.csl.sri.com> ho...@csl.sri.com (Emmett Hogan) writes:
>
>The subject line says it all...has anyone written a C or PERL program
>to parse the C-News log file and spit out a nice formatted report?
>
>Thanks in advance,
>Emmett

Here is a perl script called nstats. It gives a nice report.

Enjoy:

#!/usr/bin/perl
#
# Nstats - Print C news statistics via Perl
#
# Version 1.2 (10/17/89)
#
#
#
# Author's notes:
#
# Constructive comments and enhancements are solicited (flames are not).
# Please send suggestions or enhancements to denny@mcmi.
#
# Larry Wall has a Very Nice Work in Perl. Many thanks to him.
#
# Denny Page, 1989
#
#
#
# Program notes:
#
# The simplest usage is 'perl nstats ~news/log'. I leave you to find
# more complicated invocations.
#
# While a duplicate is actually a rejected message, it is treated
# separately here. Rejected messages herein are messages that are not
# subscribed to in the sys file or are excluded in the active file.
#
# Junked messages are not displayed in the system summaries. It's not
# your neighbor's fault that you are missing active file entries. If
# you are concerned about receiving junk groups, exclude them in your
# sys or active file. They will then be summarized :-).
#
# The reason for a newsgroup being bad is assigned only once. If the
# reason changes later in the log (such as the sys file being modified
# such that a newsgroup is no longer rejected, but rather is filed in
# junk), no notice will be taken.
#
# Calls to newshist are cached at 25. This may need to be adjusted at
# some sites.
#
# Sitenames are truncated to 15 characters. This could be done better.
#
#
# Output headers have the following meanings:
#
# System Name of the neighboring system.
# Accept Number of accepted articles from system.
# Dup Number of duplicate articles received from system.
# Rej Number of rejected articles from system.
# Sent Number of articles sent to system.
# Sys% Accepted (or duplicate or rejected) articles as a
# percentage of total articles from that system.
# Tot% Accepted (or duplicate) articles as a percentage
# of total accepted (or duplicate) articles.
# Avl% Number of articles sent as a percentage of total
# available (accepted) articles.
#
############################################################
#
# Revision history:
#
# 09/24/89 dny Initial version
# 09/28/89 dny Added category totals
# 10/02/89 dny Fixed link count bug in record_groups
# 10/03/89 dny Cleaned up variable names
# 10/16/89 dny Renamed variables - Perl 3.0
# 10/17/89 dny Fixed bug in rejection counts
# 04/18/91 ma...@comp.vuw.ac.nz
# speedups replacing newshist now
# that it is a shell script
# 06/01/91 mer...@iWarp.intel.com
# replaced mark's shell script with Perl code
# 02/06/91 ma...@comp.vuw.ac.nz
# fix randal's code to handle Message-IDs
# of the form <`=C*~5...@cck.cov.ac.uk>
#
############################################################

############################################################


# Record the category of a list of message-ids
sub record_groups {
local(@ids) = @_;

grep(!/</ && s/.*/<$&>/, @ids);
local($ids) = join("\n",@ids);
open(newshist, "-|") || exec <<"PERL_EOF";
/usr/lib/newsbin/dbz -ix /usr/lib/news/history <<'SH_EOF'
$ids
SH_EOF
PERL_EOF

$batchcnt = $#_ + 1;
while (<newshist>) {
if (s/^.+\t.+\t(.+)\n$/$1/) {
$batchcnt--;
foreach $link (split(/ /)) {
$link =~ s/^([^\.\/]+).*/$1/;
$category{$link}++;
}
}
}
$category{"*expired*"} += $batchcnt;
close(newshist);
}

############################################################

$#id_cache = -1;

while (<>) {
($from, $action, $message_id, $text) =
/^.+\s(\S+)\s(.)\s<(.+)>\s(.*)$/;
$from = substr($from, 0, 15);

# Accepted message
if ($action eq '+') {
$accepted{$from}++;
foreach $site (split(/ /, $text)) {
$site = substr($site, 0, 15);
$sent{$site}++;
}

$id_cache[++$#id_cache] = $message_id;
unless ($#id_cache < 50) {
do record_groups(@id_cache);
$#id_cache = -1;
}
next;
}
elsif ($action eq '-') {
# Duplicate
if ($text eq 'duplicate') {
$duplicates{$from}++;
next;
}
$rejected{$from}++;
# Group not in sys
if ($text =~ s/no subscribed groups in `(.+)'/$1/) {
foreach $group (split(/,/, $text)) {
if ($badgroup{$group}++ == 0) {
$badgroup_reason{$group} = "not subscribed in sys";
}
}
next;
}
# Group excluded in active
elsif ($text =~ s/all groups `(.+)' excluded in active/$1/) {
foreach $group (split(/,/, $text)) {
if ($badgroup{$group}++ == 0) {
$badgroup_reason{$group} = "excluded in active";
}
}
next;
}
}
# Junked message
elsif ($action eq 'j') {
$junk{$from}++;
if ($text =~ s/junked due to groups `(.+)'/$1/) {
foreach $group (split(/,/, $text)) {
if ($badgroup{$group}++ == 0) {
$badgroup_reason{$group} = "not in active (junked)";
}
}
next;
}
}
# Ignore ihave/sendme messages
elsif ($action eq 'i') {next;}
elsif ($action eq 's') {next;}

# Unknown input line
print $_;
}


if ($#id_cache >= 0) {
do record_groups(@id_cache);
}


# Collect all sitenames and calc totals
foreach $system (keys(accepted)) {
$systems{$system} = 1;
$total_accepted += $accepted{$system};
}
foreach $system (keys(duplicates)) {
$systems{$system} = 1;
$total_duplicates += $duplicates{$system};
}
foreach $system (keys(rejected)) {
$systems{$system} = 1;
$total_rejected += $rejected{$system};
}
foreach $system (keys(sent)) {
$systems{$system} = 1;
$total_sent += $sent{$system};
}
$total_articles = $total_accepted + $total_duplicates + $total_rejected;

# Print system summaries
print "\nSystem Accept sys% tot% Dup sys% tot% Rej sys% Sent avl%\n";

foreach $system (sort keys(systems)) {
$articles = $accepted{$system} + $duplicates{$system} + $rejected{$system};

if ($accepted{$system} > 0) {
$accepted_pct = ($accepted{$system} * 100) / $articles + 0.5;
$accepted_totpct = ($accepted{$system} * 100) / $total_accepted + 0.5;
}
else {
$accepted_pct = 0;
$accepted_totpct = 0;
}
if ($duplicates{$system} > 0) {
$duplicates_pct = ($duplicates{$system} * 100) / $articles + 0.5;
$duplicates_totpct = ($duplicates{$system} * 100) / $total_duplicates + 0.5;
}
else {
$duplicates_pct = 0;
$duplicates_totpct = 0;
}
if ($rejected{$system} > 0) {
$rejected_pct = ($rejected{$system} * 100) / $articles + 0.5;
}
else {
$rejected_pct = 0;
}
if ($sent{$system} > 0) {
$sent_pct = ($sent{$system} * 100) / $total_accepted + 0.5;
}
else {
$sent_pct = 0;
}

printf "%-15s %5d %3d%% %3d%% %4d %3d%% %3d%% %4d %3d%% %5d %3d%%\n",
$system,
$accepted{$system}, $accepted_pct, $accepted_totpct,
$duplicates{$system}, $duplicates_pct, $duplicates_totpct,
$rejected{$system}, $rejected_pct,
$sent{$system}, $sent_pct;
}


if ($total_accepted > 0) {
$accepted_pct = ($total_accepted * 100) / $total_articles + 0.5;
}
else {
$accepted_pct = 0;
}
if ($total_rejected > 0) {
$rejected_pct = ($total_rejected * 100) / $total_articles + 0.5;
}
else {
$rejected_pct = 0;
}
if ($total_duplicates > 0) {
$duplicates_pct = ($total_duplicates * 100) / $total_articles + 0.5;
}
else {
$duplicates_pct = 0;
}

printf "TOTALS %5d %3d%% %4d %3d%% %4d %3d%% %5d\n",
$total_accepted, $accepted_pct,
$total_duplicates, $duplicates_pct,
$total_rejected, $rejected_pct,
$total_sent;

# Display any bad newsgroups received
@keys = sort(keys(badgroup));
if ($#keys >= 0) {
print "\n\nBad Newsgroups Articles Reason\n";
foreach $group (@keys) {
printf "%-35s %4d %s\n",
$group, $badgroup{$group}, $badgroup_reason{$group};
}
}


# Display news categories received
@keys = sort(keys(category));
if ($#keys >= 0) {
print "\n\nCategories Received Articles\n";
foreach $group (@keys) {
printf "%-35s %4d\n",
$group, $category{$group};
}
}
################################################## snip snip
#
#Just another Perl and Cnews hacker,
#--
#/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
#| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III |
#| mer...@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
#\=Cute Quote: "Intel: putting the 'backward' in 'backward compatible'..."====/
#
--
Ron Nash San Diego State University
Internet: na...@ucselx.sdsu.edu
Gin-N-Tonic 5 year old 1/2 Arab endurance prospect
Luv on Fire 8 year old Arab, trusty steed and friend

Michael Richardson

unread,
Aug 16, 1991, 12:03:48 PM8/16/91
to
In article <1991Aug16.0...@decuk.uvo.dec.com> l...@decuk.uvo.dec.com writes:
>In article <HOGAN.91A...@cujo.csl.sri.com>, ho...@csl.sri.com (Emmett Hogan) writes:
>|>The subject line says it all...has anyone written a C or PERL program
>|>to parse the C-News log file and spit out a nice formatted report?

Awk, but you could run a2p on it.
It starts out like: (oh geesh, it is so short, I'll just post the
whole thing)

#!/bin/sh
#
# Shell/awk script for chewing up Cnews log files and spitting out
# summary news stats. Reads from standard input or give it a news log
# file as an argument. Bug fixes and enhancements welcome.
#
# If you don't run with the relaynews daemon patch change the 'i = 8'
# in the for() just below to 'i = 7'. You might want to rip out the
# "control" and "failed" stuff, too.
#
# John A. Palkovic 1/31/91
#
# Keith Cantrell (kcan...@digi.lonestar.org) added the ability to
# print the summary of files sent out. 1/30/90
#
awk '
{
if ($5 == "+") {
freq[$4]++;
for ( i = 8; i <= NF; i++)
sent[$(i)]++;
}
if ($5 == "-") reject[$4]++
if ($5 == "c") control[$4]++
if ($5 == "f") failed[$4]++
}END{
printf " posts\t rjct\t ctrl\t fail\thost\n"
printf " -----\t-----\t-----\t-----\t--------------------------\n"
for(host in freq) {
printf " %5d\t%5d\t%5d\t%5d\t%s\n", \
freq[host], reject[host], control[host], failed[host], host
totalpost += freq[host]
totalreject += reject[host]
totalcontrol += control[host]
totalfailed += failed[host]
}
printf " -----\t-----\t-----\t-----\t--------------------------\n"
printf " %5d\t%5d\t%5d\t%5d\ttotals\n", \
totalpost,totalreject,totalcontrol,totalfailed
printf "\n\nNumber of articles sent:\n";
for(host in sent) {
printf "%12s:%7d\n", host, sent[host]
totalsent += sent[host];
}
printf "--------------------\n";
printf "%12s %7d\n", "total", totalsent;
}' $1

>I don't usually like to do this, but "Me too!!!" (I've been looking
>around the net but can't seem to find)

Has anyone been collecting any of the non-standard Cnews utilities?
I sure would like to have an idea what else is out there, if only to
give me a bigger perspective on ways to solve specific problems.

--
:!mcr!: | The postmaster never | So much mail,
Michael Richardson | resolves twice. | so little time.
HOME: m...@sandelman.ocunix.on.ca Bell: (613) 237-5629
Small Ottawa nodes contact me about joining ocunix.on.ca!

Scott Lawrence

unread,
Aug 16, 1991, 2:38:17 PM8/16/91
to

This isn't exactly what was requested, but I've found it usefull. It is
a perl script I wrote to help me tune my expiration times. Run on the
machine where the news spool is, it reports on the number of files and
disk blocks for groups and hierarchies.

#! /bin/sh
# This is a shell archive. Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file". To overwrite existing
# files, type "sh file -c". You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g.. If this archive is complete, you
# will see the following message at the end:
# "End of shell archive."
# Contents: getargs.pl news-stats
# Wrapped by lawrence@adastra on Fri Aug 16 09:30:47 1991
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'getargs.pl' -a "${1}" != "-c" ; then
echo shar: Will not clobber existing file \"'getargs.pl'\"
else
echo shar: Extracting \"'getargs.pl'\" \(5958 characters\)
sed "s/^X//" >'getargs.pl' <<'END_OF_FILE'
X#!/usr/local/bin/perl
X#
X# $Header: /home/simpsons/work/lawrence/odr/commands/RCS/getargs.pl,v 1.2 90/12/07 08:16:30 lawrence Exp Locker: lawrence $
X#
X# Provides the routine getargs
X# which takes a picture of the expected arguments of the form:
X# ( <tuple> [, <tuple> ]... )
X# <tuple> ::= <type>, <keyword>, <size>, <variable>
X# <type> ::= '-' for switch arguments
X# 'm' for mandatory positional arguments
X# 'o' for optional positional arguments
X# <keyword> ::= string to match for switch arguments
X# (also used to print for usage of postional arguments)
X# <size> ::= number of values to consume from ARGV ( 0 = set variable to 1 )
X# <variable> ::= name of variable (not including $ or @) to assign
X# argument value into
X#
X# automatically provides -usage (unless it is specified in the picture,
X# in which case the caller will get it like any other switch).
X# automatically provides --
X#
package getargs;
X
sub main'getargs
X{
local(@Picture) = @_;
X
X# Now parse the argument picture
local( $Type, $Keyword, $Size, $Variable, $Tuple, %Sizes, %Switches );
local( $Options, $Mandatories, @Positional, $Target, %Targets );
X
for ( $Tuple = 0; $Tuple < $#Picture; $Tuple += 4 )
X{
X ( $Type, $Keyword, $Size, $Variable ) = @Picture[ $Tuple..$Tuple+3 ];
X
X $Sizes{ $Keyword } = $Size;
X $Targets{ $Keyword } = $Variable;
X
X if ( $Type eq "-" ) # switch argument
X {
X # print "Switch: -$Keyword\n";
X }
X elsif ( $Type eq "m" ) # mandatory positional argument
X {
X $Options && die "Optional Arg in picture before Mandatory Arg\n";
X $Mandatories++;
X push( @Positional, $Keyword );
X }
X elsif ( $Type eq "o" ) # optional positional argument
X {
X $Options++;
X push( @Positional, $Keyword );
X }
X else { die "Undefined Type: $Type\n"; }
X}
X
X local( @ActualArgs ) = @ARGV;
X
Switch: while ( $#Switches && ($_ = shift @ActualArgs) )
X{
X if ( /^--/ ) ## force end of options processing
X {
X last Switch;
X }
X elsif ( /^-\d+/ ) ## numeric argument - not an option
X {
X unshift( @ActualArgs, $_ );
X last Switch;
X }
X elsif ( /^-\?/ || /^-usage/ )
X {
X &usage( @Picture );
X return 0;
X }
X elsif ( /^-(\w+)/ ) ## looks like a switch...
X {
X if ( $Target = $Targets{ $1 } )
X {
X &assign_value( $Target, $Sizes{ $1 } );
X }
X else
X {
X warn "Invalid switch $_\n";
X &usage( @Picture );
X return 0;
X }
X }
X else
X {
X unshift( @ActualArgs, $_ );
X last Switch;
X }
X } # Switch
X
X Positional: while( $_ = shift( @Positional ) )
X {
X &assign_value( $Targets{ $_ }, $Sizes{ $_ } ) || last Positional;
X $Mandatories--;
X }
X
X if ( @ActualArgs )
X {
X warn "Too many arguments: @ActualArgs\n";
X &usage( @Picture );
X 0;
X }
X elsif ( $Mandatories > 0 )
X {
X warn "Not enough arguments supplied\n";
X &usage( @Picture );
X 0;
X }
X else
X {
X 1;
X }
X
X} # sub getargs
X
sub assign_value
X{
X local ( $Target, $Size ) = @_;
X local ( $Assignment );
X
X if ( $Size <= @ActualArgs )
X {
X Assign:
X {
X $Assignment = '$main\''.$Target.' = 1;'
X , last Assign if ( $Size == 0 );
X $Assignment = '$main\''.$Target.' = shift @ActualArgs;'
X , last Assign if ( $Size == 1 );
X $Assignment = '@main\''.$Target.' = @ActualArgs[ $[..$[+$Size-1 ],@ActualArgs = @ActualArgs[ $[+$Size..$#ActualArgs ];'
X , last Assign if ( $Size > 1 );
X $Assignment = '@main\''.$Target.' = @ActualArgs, @ActualArgs = ();'
X , last Assign if ( $Size == -1 );
X die "Invalid argument type in picture\n";
X }
X
X eval $Assignment;
X 1;
X }
X else
X {
X @ActualArgs = ();
X 0;
X }
X}
X
sub usage
X{
X local( $CommandName ) = $0;
X $CommandName =~ s\^.*/\\;
X print "Usage:\n";
X print " $CommandName";
X local( @Picture ) = @_;
X local( $Type, $Keyword, $Size, $Tuple, $Switches );
X
X $Switches = 0;
X Switch: for ( $Tuple = 0; $Tuple < $#Picture; $Tuple += 4 )
X {
X ( $Type, $Keyword, $Size ) = @Picture[ $Tuple..$Tuple+2 ];
X
X if ( $Type eq "-" ) # switch argument
X {
X $Switches++;
X print " [-$Keyword";
X if ( $Size == -1 )
X {
X print " <$Keyword> ... ]";
X last Switch;
X }
X print " <$Keyword>" while ( $Size-- > 0 );
X print "]";
X }
X }
X
X print "\n "." " x length($CommandName)." [--]" if $Switches;
X
X Positional: for ( $Tuple = 0; $Tuple < $#Picture; $Tuple += 4 )
X {
X ( $Type, $Keyword, $Size ) = @Picture[ $Tuple..$Tuple+2 ];
X
X if ( $Type eq "m" ) # mandatory positional argument
X {
X if ( $Size == -1 )
X {
X print " <$Keyword> ...";
X last Positional;
X }
X print " <$Keyword>" while ( $Size-- > 0 );
X }
X elsif ( $Type eq "o" ) # optional positional argument
X {
X if ( $Size == -1 )
X {
X print " [<$Keyword>] ...";
X last Positional;
X }
X print " [<$Keyword>" while ( $Size-- > 0 );
X print "]";
X }
X }
X
X print "\n";
X}
X1;
X
X# $Log: getargs.pl,v $
X#Revision 1.2 90/12/07 08:16:30 lawrence
X#Fixed comment leaders
X#Fixed usage to only print the last component of the command pathname
X#
X# Revision 1.1 90/12/06 23:39:18 lawrence
X# Initial revision
X#
END_OF_FILE
if test 5958 -ne `wc -c <'getargs.pl'`; then
echo shar: \"'getargs.pl'\" unpacked with wrong size!
fi
chmod +x 'getargs.pl'
# end of 'getargs.pl'
fi
if test -f 'news-stats' -a "${1}" != "-c" ; then
echo shar: Will not clobber existing file \"'news-stats'\"
else
echo shar: Extracting \"'news-stats'\" \(8969 characters\)
sed "s/^X//" >'news-stats' <<'END_OF_FILE'
X#!/usr/local/bin/perl
X
require 'ctime.pl';
require 'getargs.pl';
X
X&getargs( '-', 'debug', 0, 'Debug'
X ,'-', 'top', 1, 'TopN'
X ) || exit 1;
X
X$TopN = 15 unless defined $TopN;
X
X@ExpAges = ( 5, 10, 15 ) unless defined @ExpAges;
X
X$OverAge = $#ExpAges + 1;
X
X$ReportDate = &ctime(time);
X$CurrentDir = '/usr/spool/news';
X@DirStack = ();
X
chdir $CurrentDir
X || die "$!\nCan't change to `$CurrentDir'\n";
X
X$DiskStats = `df $CurrentDir`;
X
X( $AllBlocks, $AllFiles, $AllBlockAges, $AllFileAges ) = &StatGroup( '' );
X
X##
X## Print the total summary and hierarchy stats report
X##
X$- = 0;
X$^ = 'summaryTop';
X$~ = 'detailStats';
X
X@FlsT = split( ':', $AllFileAges );
X@BlksT = split( ':', $AllBlockAges );
X
sub byTotalBlocks { $TotalBlocks{ $b } <=> $TotalBlocks{ $a }; }
foreach $Group ( (sort( byTotalBlocks @Hierarchies))[1..($TopN-1)] )
X{
X @FlsA = split( ':', $FilesAged{ $Group } );
X @BlksA = split( ':', $BlocksAged{ $Group } );
X write;
X}
X
X##
X## Print the top groups by Files and Blocks
X##
X$~ = 'detailStats';
X
X$- = 0;
X$^ = 'groupBlocks';
X
sub byBlocks { $Blocks{ $b } <=> $Blocks{ $a }; }
foreach $Group ( (sort( byBlocks @Groups ))[0..($TopN-1)] )
X{
X @FlsA = split( ':', $FilesAged{ $Group } );
X @BlksA = split( ':', $BlocksAged{ $Group } );
X write;
X}
X
X$- = 0;
X$^ = 'groupFiles';
X
sub byFiles { $Files{ $b } <=> $Files{ $a }; }
foreach $Group ( (sort( byFiles @Groups ))[0..($TopN-1)] )
X{
X @FlsA = split( ':', $FilesAged{ $Group });
X @BlksA = split( ':', $BlocksAged{ $Group });
X write;
X}
X
X##
X## Print the raw numbers
X##
X$- = 0;
X$^ = 'top';
X$~ = 'detailStats';
foreach $Group ( sort( @Groups ) )
X{
X @FlsA = split( ':', $FilesAged{ $Group });
X @BlksA = split( ':', $BlocksAged{ $Group });
X write;
X}
X
X
exit;
X
X###
X### FORMATS
X###
X
X###
X### Summary
X###
format summaryTop =
X
News Statistics Report for @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
X $ReportDate
X
X@*
X$DiskStats
X
Total Files = @<<<<<<<<< Total Blocks = @<<<<<<<<<
X $AllFiles $AllBlocks
X Files/Blocks < @> <=Files/Blocks < @> <=Files/Blocks < @> <=Files/Blocks
X $ExpAges[0], $ExpAges[1], $ExpAges[2]
X @>>>>>>>/@<<<<<<< @>>>>>>>/@<<<<<<< @>>>>>>>/@<<<<<<< @>>>>>>>/@<<<<<<<
X $FlsT[0],$BlksT[0], $FlsT[1],$BlksT[1], $FlsT[2],$BlksT[2], $FlsT[3],$BlksT[3]
X
Top @<<< News Hierarchies
X $TopN
X
Hierarchy Total Files//Total Blocks
X Files/Blocks < @> <=Files/Blocks < @> <=Files/Blocks < @> <=Files/Blocks
X $ExpAges[0], $ExpAges[1], $ExpAges[2]
X-------------------------------------------------------------------------------
X.
X
format detailStats =
X@<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @>>>>>>>>>>//@<<<<<<<<<<<
X$Group, $TotalFiles{ $Group }, $TotalBlocks{ $Group }
X @>>>>>>>/@<<<<<<< @>>>>>>>/@<<<<<<< @>>>>>>>/@<<<<<<< @>>>>>>>/@<<<<<<<
X $FlsA[0],$BlksA[0], $FlsA[1],$BlksA[1], $FlsA[2],$BlksA[2], $FlsA[3],$BlksA[3]
X
X.
X
X###
X### Groups by Files
X###
format groupFiles =
X
Top @<<< Groups by number of files for @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
X $TopN $ReportDate
X
Newsgroup Total Files//Total Blocks
X Files/Blocks < @> <=Files/Blocks < @> <=Files/Blocks < @> <=Files/Blocks
X $ExpAges[0], $ExpAges[1], $ExpAges[2]
X----------------------------------------------------------------------------
X.
X
X###
X### Groups by Blocks
X###
format groupBlocks =
X
Top @<<< Groups by number of blocks for @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
X $TopN $ReportDate
X
Newsgroup Total Files//Total Blocks
X Files/Blocks < @> <=Files/Blocks < @> <=Files/Blocks < @> <=Files/Blocks
X $ExpAges[0], $ExpAges[1], $ExpAges[2]
X----------------------------------------------------------------------------
X.
X
X###
X### Raw Statistics
X###
X
format top =
X
Raw News Group Statistics for @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
X $ReportDate
X
Newsgroup Total Files//Total Blocks
X Files/Blocks < @> <=Files/Blocks < @> <=Files/Blocks < @> <=Files/Blocks
X $ExpAges[0], $ExpAges[1], $ExpAges[2]
X---------------------------------------------------------------------------
X.
X
sub StatGroup # ( $Group )
X # returns ( $TotalBlocks, $TotalFiles, $BlocksAged, $FilesAged )
X{
X local( $Group ) = $_[0];
X local( $Blocks, $Files, $SubGroups, $TotalBlocks, $TotalFiles )
X = (0,0,0,0,0);
X local( @AgeFiles, @AgeBlocks );
X local( $SubBlocks, $SubFiles, $SubBlocksAged, $SubFilesAged );
X local( $GroupName );
X
X ##
X ## Initialize the Age arrays
X ## Each element is the number of {files,blocks} in the age bracket
X ## bounded by the elements of @ExpAges
X ##
X $#AgeFiles = $OverAge;
X $#AgeBlocks = $OverAge;
X foreach ( 0..$OverAge )
X {
X $AgeFiles[ $_ ] = 0;
X $AgeBlocks[ $_ ] = 0;
X }
X
X do { print "Check `$Group' <pause>:"; $Pause = <STDIN>; } if $Debug;
X
X &PushDir( $Group ) unless $Group eq '';
X
X if ( opendir( DIR, '.' ) )
X {
X ## Get all the entries in this directory that we care about
X ## that leaves out anything with a '.' in it, which means that
X ## it is either '.', '..', some kind of metafile, or a database
X ## directory (they all have '.' in thier names).
X ##
X local( @Entries ) = grep( ! /\./, readdir( DIR ));
X closedir DIR;
X
X local( $Entry );
X Entry: foreach $Entry ( @Entries )
X {
X ## print "\tchecking `$Entry'\n";
X ($dev, $ino, $mode, $nlink, $uid, $gid
X , $rdev, $size, $atime, $mtime, $ctime, $blksize
X , $EntryBlocks ) = stat( $Entry );
X
X if ( -d _ )
X {
X ( $SubBlocks, $SubFiles, $SubBlocksAged, $SubFilesAged )
X = &StatGroup( $Entry );
X $TotalBlocks += $SubBlocks;
X $TotalFiles += $SubFiles;
X $SubGroups++;
X @SubBlocksAgedList = split( ':', $SubBlocksAged );
X @SubFilesAgedList = split( ':', $SubFilesAged );
X foreach ( 0..$OverAge )
X {
X $AgeBlocks[ $_ ] += $SubBlocksAgedList[$_];
X $AgeFiles[ $_ ] += @SubFilesAgedList[$_];
X }
X
X }
X else
X {
X /\D/ && next Entry;# just in case some non-article snuck in.
X $Blocks += $EntryBlocks;
X $Files++;
X
X ##
X ## Figure out which age bracket it is in
X ##
X $IsOverAge = 1;
X $EntryAge = -M _;
X AgeBracket: foreach $Age ( 0..$#ExpAges )
X {
X if ( $EntryAge < $ExpAges[$Age] )
X {
X $AgeFiles[ $Age ]++;
X $AgeBlocks[ $Age ] += $EntryBlocks;
X $IsOverAge = 0;
X last AgeBracket;
X }
X }
X if ( $IsOverAge )
X {
X $AgeFiles[ $OverAge ]++;
X $AgeBlocks[ $OverAge ] += $EntryBlocks;
X }
X }
X }
X
X ## Make up the statistics record for this group
X $GroupName = $CurrentDir;
X $GroupName =~ s#^/usr/spool/news/##;
X $GroupName =~ s#/#.#g;
X
X $TotalBlocks += $Blocks;
X $TotalFiles += $Files;
X
X push( @StatList, "$GroupName $Blocks $Files $TotalBlocks $TotalFiles");
X push( @Hierarchies, $GroupName ) if $SubGroups;
X push( @Groups, $GroupName ) if $Files;
X
X $Blocks{ $GroupName } = $Blocks;
X $Files{ $GroupName } = $Files;
X $BlocksAged{ $GroupName } = join( ':', @AgeBlocks );
X $FilesAged{ $GroupName } = join( ':', @AgeFiles );
X $TotalBlocks{ $GroupName } = $TotalBlocks;
X $TotalFiles{ $GroupName } = $TotalFiles;
X }
X else
X {
X warn "$!\nCan't check `$CurrentDir'\n";
X $Blocks = $Files = 0;
X }
X
X &PopDir unless $Group eq '';
X return( $TotalBlocks, $TotalFiles
X ,$BlocksAged{ $GroupName }, $FilesAged{ $GroupName }
X );
X}
X
sub PushDir # ( $NewDir )
X{
X local( $NewDir ) = $_[0];
X
X print "\tpushing to `$CurrentDir/$NewDir'\n" if $Debug;
X
X chdir $NewDir
X || die "$!\nCan't change to $NewDir\n";
X
X push( @DirStack, $CurrentDir );
X $CurrentDir .= '/'.$NewDir;
X}
X
sub PopDir # ( )
X{
X print "\tpoping from `$CurrentDir' " if $Debug;
X
X $CurrentDir = pop( @DirStack );
X chdir $CurrentDir
X || die "$!\nCan't return to `$CurrentDir'\n";
X
X print "to `$CurrentDir'\n" if $Debug;
X}
X
X
END_OF_FILE
if test 8969 -ne `wc -c <'news-stats'`; then
echo shar: \"'news-stats'\" unpacked with wrong size!
fi
chmod +x 'news-stats'
# end of 'news-stats'
fi
echo shar: End of shell archive.
exit 0
--
Scott Lawrence <lawr...@APS.Atex.Kodak.COM> Voice: {US} 508-670-4023
Atex Advanced Publishing Systems G3Fax: {US} 508-670-4033
Atex, Inc; 805 Middlesex Tpke. MS 153; Billerica MA 01821

Gordon Moffett

unread,
Aug 16, 1991, 3:56:56 PM8/16/91
to
In the referenced article, vi...@decwrl.dec.com (Paul A Vixie) writes:
# here's what we use. it's called "yacnls". i didn't write it. it gets
# very large when it runs and sometimes it just keeps getting larger until
# either it or the system crashes. but when it works it works very well.
#
# #! /usr/local/bin/perl
#
# #

# # C News Log Report Generator
# #
# # Mark Nagel <na...@ics.uci.edu>
# # $Id: rep_log.pl,v 1.6 89/11/29 11:00:51 news Exp $

This script uses /usr/lib/newsbin/maint/newshist, which calls the dbz command
once (and possibly twice) *for each and every message-id* in the log file.
Thus puts a horrendous load on the system when it would be just as
simple to pipe the message-ids to the dbz command.


Gordon Moffett
--
Gordon Moffett g...@netcom.com

"If there were no Humanity it would be necessary to invent one."

Mark Nagel

unread,
Aug 18, 1991, 9:02:18 PM8/18/91
to
In <1991Aug16.1...@netcom.COM> g...@netcom.COM (Gordon Moffett) writes:

># # C News Log Report Generator
># #
># # Mark Nagel <na...@ics.uci.edu>
># # $Id: rep_log.pl,v 1.6 89/11/29 11:00:51 news Exp $

>This script uses /usr/lib/newsbin/maint/newshist, which calls the dbz command
>once (and possibly twice) *for each and every message-id* in the log file.
>Thus puts a horrendous load on the system when it would be just as
>simple to pipe the message-ids to the dbz command.

I wrote this script quite some time ago, before the dbz command was
around. I think that newshist was more efficient at the time. Mea
culpa.

Mark
--
Mark Nagel
UC Irvine Department of ICS +----------------------------------------+
INET: na...@ics.uci.edu | Charisma doesn't have jelly in the |
UUCP: ucbvax!ucivax!nagel | middle. -- Jim Ignatowski |

Nik Simpson

unread,
Aug 19, 1991, 9:27:59 AM8/19/91
to
In article <1991Aug16.1...@Sandelman.OCUnix.on.ca> m...@Sandelman.OCUnix.on.ca (Michael Richardson) writes:
>In article <1991Aug16.0...@decuk.uvo.dec.com> l...@decuk.uvo.dec.com writes:
>>In article <HOGAN.91A...@cujo.csl.sri.com>, ho...@csl.sri.com (Emmett Hogan) writes:
>>|>The subject line says it all...has anyone written a C or PERL program
>>|>to parse the C-News log file and spit out a nice formatted report?
>
> Awk, but you could run a2p on it.
> It starts out like: (oh geesh, it is so short, I'll just post the
>whole thing)
>
Script deleted
I found that this script would cause awk to barf on my machine
if the log file was empty for some reason or if no local posting had
taken place, so here is a modified version to fix these problems


# shell/awk script for chewing up Cnews log files and spitting out


# summary news stats. Reads from standard input or give it a news log
# file as an argument. Bug fixes and enhancements welcome.
#
# If you don't run with the relaynews daemon patch change the 'i = 8'
# in the for() just below to 'i = 7'. You might want to rip out the
# "control" and "failed" stuff, too.
#
# John A. Palkovic 1/31/91
#
# Keith Cantrell (kcan...@digi.lonestar.org) added the ability to
# print the summary of files sent out. 1/30/90
#

# Fixed problem of nothing in sent/freq array causing awk to barf
# Nik Simpson (n...@swindon.ingr.com) 7/19/91
awk '
BEGIN {
FREQ=0
sent[0]="null"


}
{
if ($5 == "+") {

FREQ=1
freq[$4]++;
for ( i = 7; i <= NF; i++)


sent[$(i)]++;
}
if ($5 == "-") reject[$4]++
if ($5 == "c") control[$4]++
if ($5 == "f") failed[$4]++
}END{

if ( FREQ == 1 )


{
printf " posts\t rjct\t ctrl\t fail\thost\n"
printf " -----\t-----\t-----\t-----\t--------------------------\n"
for(host in freq)
{
printf " %5d\t%5d\t%5d\t%5d\t%s\n", \
freq[host], reject[host], control[host], failed[host], host

Tpost += freq[host]
Treject += reject[host]
Tcontrol += control[host]
Tfailed += failed[host]


}
printf " -----\t-----\t-----\t-----\t--------------------------\n"

printf " %5d\t%5d\t%5d\t%5d\ttotals\n",Tpost,Treject,Tcontrol,Tfailed
}
for(host in sent) {
if ( host != "0" )


{
printf "%12s:%7d\n", host, sent[host]

Tsent += sent[host];
}
}
if ( Tsent > 0 )
{


printf "\n\nNumber of articles sent:\n";

printf "--------------------\n";
printf "%12s %7d\n", "total", Tsent;
}
}'


>I sure would like to have an idea what else is out there, if only to
>give me a bigger perspective on ways to solve specific problems.
>

In the distribution I provide for Intergraph I have a couple
of scripts that I find useful

1) fixactive, take a known good but out of date copy of the
active file , use it as a the basis for creating a new
active file. Useful when something or someone trashes the
active file.

2) checkjunk, Go through junk directory adding newgroups
for selected hierarchies, for example I want new comp
groups added immediately, however the same cannot be said
for all "alt" groups. It also tries decide whether the
group shpould be moderated or not by looking for a
"Approved:" header line.
--
|--------------------------------------------------|
| Nik Simpson Mail : n...@infonode.ingr.com |
| Intergraph Corp. System Product Marketing |
|--------------------------------------------------|

Steve Simmons

unread,
Aug 19, 1991, 10:47:22 AM8/19/91
to
Nice report generators being posted. Here's one that has a
considerably different philosophy. Note -- some awks will break
on the size of the here document. Use gawk if needed. It's not
been formally posted as there are clearly more things needed(!).

#! /bin/sh
# This is a shell archive. Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file". To overwrite existing
# files, type "sh file -c". You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g.. If this archive is complete, you
# will see the following message at the end:
# "End of shell archive."

# Contents: cnews.logrep
# Wrapped by ne...@hela.iti.org on Mon Aug 19 10:49:52 1991


PATH=/bin:/usr/bin:/usr/ucb ; export PATH

if test -f 'cnews.logrep' -a "${1}" != "-c" ; then
echo shar: Will not clobber existing file \"'cnews.logrep'\"
else
echo shar: Extracting \"'cnews.logrep'\" \(21554 characters\)
sed "s/^X//" >'cnews.logrep' <<'END_OF_FILE'
X: /bin/sh
X#
X# Shell script for cnews log reports
X#
X# This script scans the cnews data areas and looks for things that
X# are out of whack. It reads some of the log files looking for
X# interesting information. If there are no errors or oddities, it
X# tells you so. This should be run immediately after the cnews daily
X# 'newsdaily' script. It expects the logs to be named <thing>.o
X# but you can change that by resetting the *LOG variables as shown
X# below.
X#
X# One point of "philosophy": this whole script is built on the idea
X# that silence is *not* golden. If things are OK, it explicitly
X# tells you so.
X#
X# The log file analyzer is a big awk script. Make sure you understand
X# awk associative arrays before attempting to modify it.
X#
X# $RCSfile: cnews.logrep,v $ $Revision: 0.16 $
X#
X# $Author: news $ $Date: 91/04/22 11:15:10 $
X#
X# $State: Exp $ $Locker: news $
X#
X# $Log: cnews.logrep,v $
X# Revision 0.16 91/04/22 11:15:10 news
X# Cleaned up extra backslashes. Added sendme processing with the ihaves.
X# Made names of all log files explicit.
X#
X# Revision 0.15 91/04/22 10:03:59 news
X# Added cancel and failed cancel reporting.
X#
X# Revision 0.14 91/04/22 09:12:14 news
X# Got oldies number right.
X#
X# Revision 0.13 91/04/12 11:40:12 news
X# Added reports on future dates and too-old dates, added notice on
X# postings which lack message ids.
X#
X# Revision 0.12 91/04/11 16:46:08 news
X# Added outgoing batch checking.
X#
X# Revision 0.11 91/04/11 16:33:33 news
X# Many changes as per suggestions from Brendan Kehoe and Owen Medd.
X# Many many changes. Many many many changes.
X#
X# Initialize the dirs, names of the logs, and variables
X#
X. /usr/lib/news/cnewsbin/config
XPATH="$NEWSPATH:$NEWSBIN"
Xexport PATH
XSCRIPT=`basename $0`
X#
X# Shorthand names for standard C news locations
X#
XLOGDIR=$NEWSCTL
XSYSFILE=${LOGDIR}/sys
XINCOMING=${NEWSARTS}/in.coming
XOUTGOING=${NEWSARGS}/out.going
XBADBATCHES=${INCOMING}/bad
X#
X# If you are running RELAYNEWS as a daemon, uncomment this next line
X#
X#RELAYNEWS=1
X#
X# When there are completely unrecognizable entries in the log, the
X# normal procedure is to print them out in toto so you can extend
X# this script to say something intelligent about them. But if
X# there are hundreds, it can be overwhelming. These variables allow
X# you to restrict the number you will print.
X#
XMAX_DEFECTS=5
XMAX_UNKNOWN=5
XMAX_MYSTERY_DASH=5
XMAX_MYSTERY=5
XMAX_NOPATH=5
X#
X# If this is not where you put the old log files, adjust
X# these locations and names accordingly
X#
XERRLOG=${LOGDIR}/errlog.o
XNEWSLOG=${LOGDIR}/log.o
XBATCHLOG=${NEWSCTL}/batchlog.o
XBATCHPARMS=${NEWSCTL}/batchparms
X#
XNEWSMGR=news
XHOSTNAME=`newshostname`
XTEMPFILE=/tmp/${SCRIPT}.$$
Xset `date`
XDATE="${1} ${2} ${3}, ${6}"
Xtrap 'rm -f $TEMPFILE ; exit' 0 1 2 3 15
Xexec > $TEMPFILE
X#
X# Print preface.
X#
Xcat << EOF
XThis is the news system status report for $DATE as
Xgenerated by $0.
X
XEOF
X#
X# Errorcheck to be sure the error log exists.
X#
Xif [ "" = "$HOSTNAME" ] ; then
X cat << EOF
XThe news host name was not found. Please check the file
X$NEWSCTL/whoami, and make sure the host name is set
Xappropriately for your system (e.g, look at \'uuname -l\' and
X\'uname -n\'. The report will be somewhat incomplete.
X
XEOF
Xfi
X#
X# Errorcheck to be sure the error log exists.
X#
Xif [ ! -f "$ERRLOG" ] ; then
X cat << EOF
XThe news error log ($ERRLOG) for was not found.
XPlease check the news system and this script.
X
XEOF
Xelse
X #
X # If the error log is empty, report that. Otherwise copy it to
X # the news manager. In either case, give a nice prefatory remark.
X #
X SIZE=`wc -l < $ERRLOG`
X SIZE=`echo $SIZE`
X if [ $SIZE = 0 ] ; then
X cat << EOF
XThe news error log ($ERRLOG) shows no errors.
X
XEOF
X else
X #
X # Give a nice prefatory remark, then copy the log to the news
X # manager. Put beginning and end markers.
X #
X cat << EOF
XThe news error log ($ERRLOG) has $SIZE complaints. A
Xcopy of the error log is included below:
X
X$ERRLOG:
XEOF
X cat < ${ERRLOG}
X cat << EOF
XEnd of $ERRLOG
X
XEOF
X fi
Xfi
X#
X# Report on bad batches left lying around
X#
Xif [ -d ${BADBATCHES} ] ; then
X (
X cd $BADBATCHES
X COUNT=`ls | wc -l`
X COUNT=`echo $COUNT`
X if [ $COUNT = 0 ] ; then
X echo "There are no bad batches being held in $BADBATCHES."
X else
X echo "There are $COUNT bad batches being held in $BADBATCHES:"
X if [ $COUNT -gt 10 ] ; then
X ls -C
X else
X ls -l
X fi
X fi
X )
Xelse
X echo "Could not find bad batches directory ($BADBATCHES)!"
Xfi
Xecho ""
X#
X# Report on old nrun files.
X#
Xif [ ! -d $INCOMING ] ; then
X echo "Could not find incoming directory ($INCOMING)!"
Xelse
X (
X cd $INCOMING
X COUNT=`find . -name 'n*' -type f -mtime +1 -print | wc -l`
X COUNT=`echo $COUNT`
X if [ $COUNT != 0 ] ; then
X echo "There seem to be old run fragments left in ${INCOMING}:"
X echo ""
X ls -ls n*
X else
X echo "There are no old run fragments left in ${INCOMING}."
X fi
X echo ""
X #
X # Report on old batch files.
X #
X LIST=`find . -name "[1-9]*" -type f -mtime +1 -print`
X COUNT=`echo $LIST | wc -w`
X COUNT=`echo $COUNT`
X if [ $COUNT != 0 ] ; then
X cat << EOF
XThere seem to be $COUNT old incoming batches left in ${INCOMING}.
XYou should check to see if unbatching is being done properly. The
Xold batches are:
X
XEOF
X if [ $COUNT -gt 10 ] ; then
X ls -C $LIST
X else
X ls -ld $LIST
X fi
X else
X echo "There are no old batches left in ${INCOMING}."
X fi
X )
Xfi
Xecho ""
X#
X# Report on the status of batching. If the batchparms files
X# does not exist, simply state that we think no batching is
X# going on. If it's there, try to report on batching. If your
X# site doesn't do batching and you'd like to shut up this section
X# of the report, rename $BATCHPARMS to $BATCHPARMS.sample.
X#
X# This needs to be expanded to report more fully on batch results....
X#
Xif [ ! -f "$BATCHPARMS" ] ; then
X echo "You do not appear to be doing any uucp batching (no batch"
X echo "paramters file $BATCHPARMS)."
Xelse
X if [ ! -f "$BATCHLOG" ] ; then
X echo "Could not process batch log ${BATCHLOG}."
X else
X #
X # Report on stalled batching for outside sites. Report if the
X # log does not exist.
X #
X COUNT=`grep "no recent movement" ${BATCHLOG} | wc -l`
X COUNT=`echo $COUNT`
X if [ $COUNT != 0 ] ; then
X echo "The news batch queue is full for the following sites:"
X grep "no recent movement" ${BATCHLOG} | awk '{ printf "%s\n", $1 }' | sort | uniq
X else
X echo "There are no full outgoing batch queues show in $BATCHLOG."
X fi
X fi
Xfi
Xecho ""
X#
X# Errorcheck to be sure the regular log exists.
X#
Xif [ ! -f "$NEWSLOG" ] ; then
X cat << EOF
XCould not process news log!
X
XThe standard news log ($NEWSLOG) was not found.
XPlease check the news system and this script.
X
XEOF
Xelse
X #
X # Generate traffic, activity, and oddity report. I am
X # not a big awk fan, but gotta admit you can do a lot
X # with associative arrays.
X #
X awk < $NEWSLOG '
XBEGIN {
X accept_count = 0
X entry_count = 0
X ship_count = 0
X x_count = 0
X duplicate_count = 0
X ihave_count = 0
X sendme_count = 0
X unapproved_count = 0
X unsub_count = 0
X junk_count = 0
X local_articles = 0
X bad_header_count = 0
X empty_header_count = 0
X non_header_count = 0
X bad_date_count = 0
X no_msgid_count = 0
X unknown_no_count = 0
X future_count = 0
X oldies_count = 0
X cancel_count = 0
X precancel_count = 0
X nopath_count = 0
X mystery_dash_count = 0
X mystery_count = 0
X defect_count = 0
X hostname = "'$HOSTNAME'"
X}
X{
X #
X # Count all entries, note and save defective ones
X #
X entry_count++
X if ( NF < 6 )
X {
X defect_count++
X defective_entry[ $0 ]++
X }
X else if ( $5 == "+" )
X {
X #
X # Track the accepted articles
X #
X accept_count++
X accept_host[ $4 ]++
X if ( NF > 6 )
X {
X ship_count++
X shipfield = 7 + "'$RELAYNEWS'"
X #
X # Record which systems got articles
X #
X while ( shipfield <= NF )
X {
X ship_list[ $shipfield ]++
X shipfield++
X }
X }
X #
X # Track the number of local postings
X #
X if ( $4 == hostname )
X local_count++
X }
X #
X # Track all junked articles by reason junked.
X #
X else if ( $5 == "j" )
X {
X junk_count++
X reason = 7
X junkfor=""
X while ( reason <= NF )
X {
X junkfor = junkfor $reason " "
X reason++
X }
X junk_reason[ junkfor ]++
X }
X #
X # Track ihave/sendme records
X #
X else if ( $5 == "i" )
X {
X ihave_count++
X ihave_source[ $4 ]++
X }
X else if ( $5 == "s" )
X {
X sendme_count++
X sendme_source[ $4 ]++
X }
X #
X # Track the exceptions on a class-by-class basis
X #
X else if ( $5 == "-" )
X {
X #
X # Track the number of duplicates we get from various sites.
X #
X if ( $7 == "duplicate" )
X {
X duplicate_count++
X duplicate_host[ $4 ]++
X }
X #
X # Track the number of unapproved articles and their sources.
X #
X else if ( $7 == "unapproved" )
X {
X unapproved_count++
X unapproved_source[ $4 ]++
X unapproved_target[ $12 ]++
X }
X #
X # Lots of errors are no this, no that, etc. We process
X # them all at once.
X #
X else if ( $7 == "no" )
X {
X if ( $8 == "subscribed" )
X {
X #
X # Track the number of unsubscribed articles we
X # get from various sites and the target group.
X #
X unsub_count++
X unsub_source[ $4 ]++
X unsub_target[ $11 ]++
X }
X #
X # These are needed for new pedantic Cnews. Over
X # time these should decline. We track where the
X # articles came from but not the message ids. If
X # the admin is really interested, they can track
X # them down by grepping the logfile for "no FOO"
X # and the name of the offending host.
X #
X else if ( ( $8 == "Date:" ) || ( $8 == "@" ) || ( $8 == "From:" ) || ( $8 == "Subject:" ) )
X {
X bad_header_count++
X bad_header_source[ $4 ]++
X }
X else
X #
X # These are minus for unknown reason. An obvious
X # candidate for mods to this script.
X #
X {
X unknown_no_count++
X unknown_no_field[ $8 ]++
X unknown_no_entry[ $0 ]++
X }
X }
X #
X # This next is a *severe* error!
X #
X else if ( ( $6 == "no" ) && ( $7 == "Message-ID:" ) )
X {
X no_msgid_count++
X no_msgid_source[ $4 ]++
X }
X #
X # Other less serious errors
X #
X else if ( $7 == "empty" )
X {
X empty_header_count++
X empty_header_field[ $8 ]++
X empty_header_source[ $4 ]++
X }
X else if ( ( $9 == "contains" ) && ( $10 == "non-header" ) )
X {
X non_header_count++
X non_header_source[ $4 ]++
X }
X else if ( ( $7 == "unparsable" ) && ( $8 == "Date:" ) )
X {
X bad_date_count++
X bad_date_source[ $4 ]++
X }
X #
X # Track the number of articles which we explicitly reject via
X # x records in the active file. We do not track the specific
X # groups for two reasons -- presumably since you xed it, you
X # know what you reject; and it is hard to parse.
X #
X else if ( ( $7 == "all" ) && ( $8 == "groups" ) )
X {
X x_count++
X }
X #
X # Track the articles which are too far in the future
X #
X else if ( ( $7 == "Date:" ) && ( $8 == "too" ) && ( $9 == "far" ) )
X {
X future_count++
X future_source[ $4 ]++
X }
X #
X # Track the articles which are too old to be worth keeping
X # I love this cnews feature -- it will probably keep things
X # freer of loops than anything else.
X #
X else if ( ( $7 == "older" ) && ( $8 == "than" ) )
X {
X oldies_count++
X oldies_source[ $4 ]++
X }
X #
X # Some articles come in sans Path: headers. This
X # identifies them.
X #
X else if ( ( $7 == "no" ) && ( $7 == "Path:" ) && ( $8 == "header" ) )
X {
X nopath_count++
X nopath_source[ $4 ]++
X }
X #
X # Any unrecognised "-" tag is kept here. As we find these
X # they should be added to the things handled above.
X #
X else
X {
X mystery_dash_count++
X mystery_dash_entry[ $0 ]++
X }
X }
X #
X # This counts cancel messages. They are not generated by
X # standard C news, but by Dave Aldens relaynews daemon.
X #
X else if ( $5 == "c" )
X {
X cancel_count++
X }
X #
X # This counts failed cancel messages. They are not generated
X # by standard C news, but by Dave Aldens relaynews daemon.
X #
X else if ( $5 == "f" )
X {
X precancel_count++
X }
X #
X # Any unrecognised tags get noted here. As we get these
X # they sould be added to the things handled above.
X #
X else
X {
X mystery_count++
X mystery_tags[ $5 ]++
X mystery_entry[ $0 ]++
X }
X}
XEND {
X printf "\nThere were %d entries in the standard log. Breakdown:\n", entry_count
X if ( entry_count != 0 )
X {
X #
X # Report the data by categories.
X #
X # Local postings. It would be nice to report the newsgroups
X # posted to, but that data is not in the log.
X #
X printf "\n%6d articles were posted from this site (%s)\n", local_count, hostname
X #
X # Next, where outside articles came from
X #
X printf "\n%6d incoming articles accepted for processing\n", accept_count
X if ( accept_count > 0 )
X for ( host in accept_host )
X printf " %6d from %s\n", accept_host[ host ], host
X printf "\n%6d of those were rejected as duplicates\n", duplicate_count
X #
X # Report number of duplicates and who gave them to us.
X #
X if ( duplicate_count > 0 )
X for ( host in duplicate_host )
X printf " %6d from %s\n", duplicate_host[ host ], host
X #
X # Ihave activity.
X #
X printf "\n%6d/%d ihave/sendme messages were processed\n", ihave_count, sendme_count
X if ( ihave_count > 1 )
X {
X for ( source in ihave_source )
X printf " %6d from %s\n", ihave_source[ source ], source
X }
X if ( sendme_count > 1 )
X {
X for ( source in sendme_source )
X printf " %6d from %s\n", sendme_source[ source ], source
X }
X #
X # Cancel and failed cancel reporting. These messages only appear
X # in the log file if you are running Dave Aldens relaynews
X # daemon. Since no cancel messages almost certianly means
X # vanilla C news, we do not report on zero counts.
X #
X if ( cancel_count > 0 )
X {
X printf " \n%6d articles were cancelled.\n", cancel_count
X }
X if ( precancel_count > 0 )
X {
X printf " \n%6d articles were cancelled before receipt.\n", precancel_count
X }
X #
X # Junkage report. Give total junkage, then break it down by
X # reason.
X #
X printf "\n%6d articles were junked\n", junk_count
X if ( junk_count > 0 )
X for ( group in junk_reason )
X printf " %6d %s\n", junk_reason[ group ], group
X #
X # Outgoing traffic report. Total ships, then break down by
X # system.
X #
X printf( "\n%6d articles were shipped to other systems\n", ship_count )
X if ( ship_count > 0 )
X {
X for ( to in ship_list )
X printf " %6d for %s\n", ship_list[ to ], to
X printf " (Totals may differ due to same article shipped to multiple systems)\n"
X }
X #
X # List how many articles we explictly rejected. Note we
X # do not track the groups explicitly.
X #
X if ( x_count > 0 )
X {
X printf "\n%6d articles were accepted but not posted/Xed by the active file\n", x_count
X }
X #
X # Now we report on the questionable stuff
X #
X #
X # This error comes first, as it indicates a severe problem
X # with either you or your immediate neighbors
X #
X if ( no_msgid_count > 0 )
X {
X printf "\n* * * Begin Serious Error! * * *\n"
X printf "* * * There were attempts to insert articles which had *no* messages ids.\n"
X printf "* * * This is a sign of significant errors in the posting or transfer\n"
X printf "* * * software and should be checked out IMMEDIATELY!\n"
X for ( source in no_msgid_source )
X printf "* * * %6d attempts were made from %s\n", no_msgid_source[ source ], source
X printf "* * * End Serious Error! * * *\n"
X }
X #
X # Other errors are less serious.
X #
X # The attempts to post to moderated groups.
X #
X printf "\n%6d articles were rejected as unapproved for moderated groups\n", unapproved_count
X if ( unapproved_count > 0 )
X {
X for ( target in unapproved_target )
X printf " %6d posted to %s\n", unapproved_target[ target ], target
X for ( source in unapproved_source )
X printf " %6d came from %s\n", unapproved_source[ source ], source
X }
X #
X # Now the transfer of unsubscribed stuff
X #
X printf "\n%6d articles were rejected as for unsubscribed groups\n", unsub_count
X if ( unsub_count > 0 )
X {
X for ( target in unsub_target )
X printf " %6d posted to %s\n", unsub_target[ target ], target
X for ( source in unsub_source )
X printf " %6d came from %s\n", unsub_source[ source ], source
X }
X #
X # List the articles rejected for having non-headers in the
X # header section. Since the further spread of pedantic
X # C news will eventually eliminate these, we do not report
X # in the case where everything is OK.
X #
X if ( non_header_count > 0 )
X {
X printf "\n%6d articles were rejected as having nonheaders in the header section\n", non_header_count
X for ( source in non_header_source )
X printf " %6d came from %s\n", non_header_source[ source ], source
X }
X #
X # List the number of articles rejected due to incorrect headers
X # and where they came from. Since the further spread of pedantic
X # C news will eventually eliminate these, we keep silent if
X # everything is OK.
X #
X if ( bad_header_count > 0 )
X {
X printf "\n%6d articles were rejected as having bad headers (no Date:, etc)\n", bad_header_count
X for ( source in bad_header_source )
X printf " %6d came from %s\n", bad_header_source[ source ], source
X }
X #
X # List those header fields and origins which were noted as
X # being empty. Since the further spread of pedantic C news will
X # eventually eliminate these, we keep silent if everything is OK.
X #
X if ( empty_header_count > 0 )
X {
X printf "\n%6d articles were rejected as having empty headers\n", empty_header_count
X for ( field in empty_header_field )
X printf " %6d entries had empty \"%s\" fields\n", empty_header_field[ field ], field
X for ( source in empty_header_source )
X printf " %6d entries came from %s\n", empty_header_source[ source ], source
X }
X #
X # Date rejections. Since the further spread of pedantic C news will
X # eventually eliminate these, we keep silent if everything is OK.
X #
X if ( bad_date_count > 0 )
X {
X printf "\n%6d articles were rejected as having unparseable dates:\n", bad_header_count
X for ( source in bad_date_source )
X printf " %6d came from %s\n", bad_date_source[ source ], source
X }
X #
X # These are article which are dated too far in the future.
X #
X if ( future_count > 0 )
X {
X printf "\n%6d articles were rejected as dated too far in the future:\n", future_count
X for ( source in future_source )
X printf " %6d came from %s\n", future_source[ source ], source
X }
X #
X # Time rejects -- articles which are just too damned old. Probably
X # signs of a news loop.
X #
X if ( oldies_count > 0 )
X {
X printf "\n%6d articles were rejected as dated too far in the past:\n", oldies_count
X for ( source in oldies_source )
X printf " %6d came from %s\n", oldies_source[ source ], source
X }
X #
X # The fun stuff. This is where we report unrecognizable things.
X # Data reported here is either meat for future mods to this script
X # or indication of bugs in the news software.
X #
X if ( unknown_no_count > 0 )
X {
X printf "\n%6d articles were rejected with an unrecognised comment\n", unknown_no_field
X printf "about \"no such-and-such\" in the entry. Those rejections were:\n"
X for ( no_field in unknown_no_field )
X printf " %s had the reason: %s\n", unknown_no_field[ no_field ], no_field
X max_unknown = '$MAX_UNKNOWN'
X printf "Here is up to %d of the entries:\n", max_unknown
X for ( no_field in unknown_no_entry )
X if ( max_unknown-- > 0 )
X printf " %s\n", unknown_no_entry[ no_field ]
X else
X break
X }
X #
X if ( nopath_count > 0 )
X {
X printf "\n%6d articles were rejected with a missing Paths header.\n", nopath_count
X max_nopath = '$MAX_NOPATH'
X printf "Here are up to %d of the entries:\n", max_nopath
X for ( nopath in nopath_entry )
X if ( max_nopath-- > 0 )
X printf " %s\n", nopath_entry
X else
X break
X }
X #
X if ( mystery_dash_count > 0 )
X {
X printf "\n%6d articles were rejected with an unrecognised \"-\" field.\n", mystery_dash_count
X max_mystery_dash = '$MAX_MYSTERY_DASH'
X printf "Here are up to %d of the entries:\n", max_mystery_dash
X for ( mystery_dash in mystery_dash_entry )
X if ( max_mystery_dash-- > 0 )
X printf " %s\n", mystery_dash
X else
X break
X }
X if ( mystery_count > 0 )
X {
X printf "\nThere were %d entries which were correctly formatted but with tag\n", mystery_count
X printf "fields that were not recognized. Those tag fields and frequency were:\n"
X tag_count = 0
X for ( mystery in mystery_tags )
X tag_count++
X for ( mystery in mystery_tags )
X {
X if ( tag_count > 1 )
X printf " \"%s\" (%d),", mystery, mystery_tags[ mystery ]
X else
X printf " \"%s\" (%d)\n", mystery, mystery_tags[ mystery ]
X tag_count--
X }
X mystery_max='$MAX_MYSTERY'
X printf "Here are up to %d of the entries:\n", mystery_max
X for ( mystery in mystery_entry )
X if ( mystery_max-- > 0 )
X printf " %s\n", mystery
X else
X break
X }
X #
X # Report on malformed lines in the log. Data reported here is
X # almost certianly a bug in the news software.
X #
X defect_max = '$MAX_DEFECTS'
X if ( defect_count > 0 )
X {
X printf "\nThere were %d entries in the log with too few fields. Here is a sample:\n", defect_count
X for ( defect in defective_entry )
X if ( defect_max-- > 0 )
X printf " \"%s\"\n", defect
X else
X break
X }
X }
X #
X # If nothing was wrong, print a nice reassuring message.
X #
X if ( ( mystery_count == 0 ) && ( mystery_dash_count == 0 ) && ( defect_count == 0 ) )
X printf "\nNo defects or unrecognized entries were found in the standard log.\n"
X}'
Xfi
X/usr/ucb/Mail -s "News Log Report for $DATE" $NEWSMGR < $TEMPFILE
END_OF_FILE
if test 21554 -ne `wc -c <'cnews.logrep'`; then
echo shar: \"'cnews.logrep'\" unpacked with wrong size!
fi
chmod +x 'cnews.logrep'
# end of 'cnews.logrep'


fi
echo shar: End of shell archive.
exit 0

--
When Charlemagne returned from a campaign and discovered his wife in bed with
one of his ministers, he said to the man, "I don't understand. I **have** to
sleep with her." -- as told by Roy Richter (rric...@ph.gmr.com)

Chip Salzenberg

unread,
Aug 19, 1991, 12:26:54 PM8/19/91
to
Here's one reminicent of the B News log scan.

#! /bin/sh
# This is a shell archive. Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file". To overwrite existing
# files, type "sh file -c". You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g.. If this archive is complete, you
# will see the following message at the end:
# "End of shell archive."

# Contents: logscan
# Wrapped by chip@animal on Mon Aug 19 12:26:28 1991


PATH=/bin:/usr/bin:/usr/ucb ; export PATH

if test -f 'logscan' -a "${1}" != "-c" ; then
echo shar: Will not clobber existing file \"'logscan'\"
else
echo shar: Extracting \"'logscan'\" \(8195 characters\)
sed "s/^X//" >'logscan' <<'END_OF_FILE'
Xeval 'exec /bin/perl -S $0 ${1+"$@"}'
X if 0;
X
X# $Id: logscan,v 1.3 1991/06/18 16:39:17 news Exp $
X#
X# C News Log Report Generator
X#
X# Mark Nagel <na...@ics.uci.edu>
X# Chip Salzenberg <ch...@tct.com>
X#
X# Large parts of this script were based on the B News log report
X# awk script.
X#
X
X$NEWSBIN = $ENV{"NEWSBIN"} || "/news/cbin";
X$NEWSCTL = $ENV{"NEWSCTL"} || "/news/lib";
X$dbz = "$NEWSBIN/dbz";
X
X##############################################################################
X# customization #
X##############################################################################
X
X#
X# The "local" array contains a list of regular expressions that
X# identify a site entry in the log file as local. Each regular
X# expression will be matched case-independently and anchored at the
X# beginning/end.
X#
X@local = (
X "me", # news server name
X "tct(\.uucp)?" # other local client names
X);
X
X#
X# The "gateway" array contains a list of regular expressions that
X# identify a site entry in the log file as a gateway. Each regular
X# expression will be matched case-independently and anchored at the
X# beginning/end.
X#
X@gateway = (
X "local-.*",
X "gateway"
X);
X
X##############################################################################
X# initialization #
X##############################################################################
X
X$duplicates = 0;
X$oldies = 0;
X$bad_headers = 0;
X$bad_dates = 0;
X
X$arglen = 0;
X
X$silent = 0;
Xwhile ($_ = $ARGV[0], /^-/) {
X shift;
X last if (/^--$/);
X /^-s/ && ($silent = 1);
X}
X
X##############################################################################
X# log file scan #
X##############################################################################
X
X$IDS = "/tmp/ids.$$";
Xopen(IDS, "> $IDS") || die "logscan: can't create $IDS: $!\n";
X
Xwhile (<>) {
X next if /^$/; # skip blank lines
X chop;
X
X #
X # extract fields from line
X #
X ($month,$date,$time,$site,$code,$msgid,@logent) = split;
X
X #
X # fix up the site name as necessary
X #
X for $regexp (@gateway) {
X if ($site =~ /^$regexp$/i) {
X $site = "(GATEWAY)";
X last;
X }
X }
X for $regexp (@local) {
X if ($site =~ /^$regexp$/i) {
X $site = "local";
X last;
X }
X }
X # $site =~ s/\..*$//;
X
X #
X # check the receipt code
X #
X if ($code eq "-") { # rejected article
X $reject{$site}++;
X if ($logent[0] eq "duplicate") {
X $duplicates++;
X } elsif ($logent[0] eq "older" && $logent[1] eq "than") {
X $oldies++;
X } elsif ($logent[0] eq "article" && $logent[1] eq '"header"') {
X $bad_headers++;
X } elsif ($logent[0] eq "unparsable" && $logent[1] eq "Date:") {
X $bad_dates++;
X } elsif ($logent[0] eq "no" && $logent[1] eq "subscribed") {
X #
X # "no subscribed groups in `...'"
X #
X $ng = $logent[4];
X $ng =~ s/`([^']*)'/$1/;
X @ng = split(/,/, $ng);
X for $i (@ng) {
X $unsub{$i}++;
X }
X } elsif ($logent[0] eq "all" && $logent[3] eq "excluded") {
X #
X # "all groups `...' excluded in active"
X #
X $ng = $logent[2];
X $ng =~ s/`([^']*)'/$1/;
X @ng = split(/,/, $ng);
X for $i (@ng) {
X $excluded{$i}++;


X }
X } else {
X #

X # print any others as-is for inspection
X #
X print "$_\n" unless ($silent);
X }
X } elsif ($code eq "+") { # accepted article
X $accept{$site}++;
X print IDS $msgid, "\n";
X for $n (@logent) {
X $neighbor{$n} = 1;
X $xmited{$n}++;
X }
X } elsif ($code eq "j") { # junked after accepted
X $junked{$site}++;
X if ($logent[0] eq "junked") {
X $ng = $logent[4];
X $ng =~ s/`([^']*)'/$1/;
X @ng = split(/,/, $ng);
X for $i (@ng) {
X $badng{$i}++;
X }
X }
X } elsif ($code eq "i") { # ihave message
X $ihave++;
X } elsif ($code eq "s") { # sendme message
X $sendme++;
X } else { # illegal/unknown code
X print "$_\n" unless ($silent);
X }
X}
X
Xclose(IDS);
Xdo recordgroups($IDS);
Xunlink($IDS);
X
X##############################################################################
X# statistics generation #
X##############################################################################
X
X#
X# rejected messages
X#
X$rtot = 0;
Xwhile (($key, $val) = each(reject)) {
X if ($val > 0) {
X $list{$key} = 1;
X $rtot += $val;
X }
X}
X
X#
X# accepted messages
X#
X$atot = 0;
Xwhile (($key, $val) = each %accept) {
X if ($val > 0) {
X $list{$key} = 1;
X $atot += $val;
X }
X}
X
X#
X# transmitted messages
X#
X$xtot = 0;
Xwhile (($key, $val) = each(xmited)) {
X if ($val > 0) {
X $list{$key} = 1;
X $xtot += $val;
X }
X}
X
X#
X# junked messages
X#
X$jtot = 0;
Xwhile (($key, $val) = each(junked)) {
X if ($val > 0) {
X $list{$key} = 1;
X $jtot += $val;
X }
X}
X
X##############################################################################
X# report generation #
X##############################################################################
X
X#
X# Transmission Statistics
X#
X$totalarticles = $atot + $rtot;
X$totalarticles++ if ($totalarticles == 0);
Xprint "\n" unless ($silent);
Xprint "System \tAccept\tReject\tJunked\tXmit to\t %total\t%reject\n";
Xfor $i (sort keys(list)) {
X $sitetot = $accept{$i} + $reject{$i};
X $sitetot++ if ($sitetot == 0);
X $articles{$i} = $sitetot;
X
X printf "%-14.14s\t%6d\t%6d\t%6d\t%7d\t%6d%%\t%6d%%\n",
X $i, $accept{$i}, $reject{$i}, $junked{$i}, $xmited{$i},
X ($sitetot * 100) / $totalarticles, ($reject{$i} * 100) / $sitetot;
X}
Xprintf "\nTOTALS \t%6d\t%6d\t%6d\t%7d\t%6d%%\t%6d%%\n",
X $atot, $rtot, $jtot, $xtot, 100, ($rtot * 100) / $totalarticles;
Xprint "\n";
Xprint "Total Processed: $totalarticles\n";
X
Xprint "\n";
Xprint "Rejected for duplicate ids: $duplicates\n" if $duplicates;
Xprint "Rejected for stale dates: $oldies\n" if $oldies;
Xprint "Rejected for bad dates: $bad_dates\n" if $bad_dates;
Xprint "Rejected for bad headers: $bad_headers\n" if $bad_headers;
X
X#
X# Netnews Categories
X#
Xif ($atot > 0) {
X print "\nNetnews Categories Received\n";
X $l = 0;
X for $i (keys(ngcount)) {
X $l = length($i) if ($l < length($i));
X }
X $fmt = "%-${l}s %d\n";
X while (1) {
X $max = 0;
X for $j (keys(ngcount)) {
X if ($ngcount{$j} > $max) {
X $max = $ngcount{$j};
X $i = $j;
X }
X }
X last if ($max == 0);
X printf $fmt, $i, $ngcount{$i};
X $ngcount{$i} = 0;
X }
X}
X
X#
X# Bad Newsgroups
X#
X@keys = sort by_bad keys(badng);
Xsub by_bad { return $badng{$b} <=> $badng{$a}; }
X
Xif (@keys) {
X print "\nBad Newsgroups Received\n";
X $l = 0;
X for $i (@keys) {
X $l = length($i) if ($l < length($i));
X }
X $fmt = "%-${l}s %d\n";
X for $i (@keys) {
X printf $fmt, $i, $badng{$i};
X }
X}
X
X#
X# Unsubscribed Newsgroups
X#
X@keys = sort by_unsub keys(unsub);
Xsub by_unsub { return $unsub{$b} <=> $unsub{$a}; }
X
Xif (@keys) {
X print "\nUnsubscribed Newsgroups Received\n";
X $l = 0;
X for $i (@keys) {
X $l = length($i) if ($l < length($i));
X }
X $fmt = "%-${l}s %d\n";
X for $i (@keys) {
X printf $fmt, $i, $unsub{$i};
X }
X}
X
X#
X# Excluded Newsgroups
X#
X@keys = sort by_excl keys(excluded);
Xsub by_excl { return $excluded{$b} <=> $excluded{$a}; }
X
Xif (@keys) {
X print "\nExcluded Newsgroups Received\n";
X $l = 0;
X for $i (@keys) {
X $l = length($i) if ($l < length($i));
X }
X $fmt = "%-${l}s %d\n";
X for $i (@keys) {
X printf $fmt, $i, $excluded{$i};
X }
X}
X
X##############################################################################
X# recordgroups()
X#
X# With the given file of article ids, retrieve their associated
X# newsgroups and update the global ngcount table.
X
Xsub recordgroups {
X local($idfile) = @_;
X local($i, @groups);
X
X $pid = open(HIST, "-|");
X die "logscan: can't fork: $!\n" unless defined($pid);
X if ($pid == 0) {
X open(STDIN, $idfile) || die "logscan: can't reopen $idfile\n";
X exec $dbz, "-ix", "$NEWSCTL/history";
X warn "logscan: can't execute $dbz: $!\n";
X $ppid = getppid;
X kill 15, $ppid if $ppid > 1;
X exit 1;
X }
X while (<HIST>) {
X chop;
X ($_, $_, @groups) = split;
X for $i (@groups) {
X $i =~ s/\/.*$//;
X if ($i =~ /\./) {
X $i =~ s/\..*//;
X $ngcount{$i}++;
X }
X }
X }
X close(HIST) ||
X printf STDERR "logscan: $dbz exited with status 0x%04X\n", $?;
X}
END_OF_FILE
if test 8195 -ne `wc -c <'logscan'`; then
echo shar: \"'logscan'\" unpacked with wrong size!
fi
chmod +x 'logscan'
# end of 'logscan'


fi
echo shar: End of shell archive.
exit 0
--

Chip Salzenberg at Teltronics/TCT <ch...@tct.com>, <uunet!pdn!tct!chip>
"He's alive; he's fat; and he's fighting crime. He's Elvis: FBI."

Henry Spencer

unread,
Aug 20, 1991, 11:42:29 AM8/20/91
to
In article <28AF189...@ics.uci.edu> na...@buckaroo.ics.uci.edu (Mark Nagel) writes:
>I wrote this script quite some time ago, before the dbz command was
>around. I think that newshist was more efficient at the time...

Not significantly so, at least not if you supply the <>.
--
Any program that calls itself an OS | Henry Spencer @ U of Toronto Zoology
(e.g. "MSDOS") isn't one. -Geoff Collyer| he...@zoo.toronto.edu utzoo!henry

Henry Spencer

unread,
Aug 20, 1991, 1:50:13 PM8/20/91
to
I wrote:
>>I wrote this script quite some time ago, before the dbz command was
>>around. I think that newshist was more efficient at the time...
>
>Not significantly so, at least not if you supply the <>.

Wups, correction, if you're giving it a whole bunch of message IDs, yes,
there is a difference. Didn't think anyone did that. :-) However, in
general, yes, using the dbz command is preferable nowadays (it's present
even if you aren't using the dbz package).

Larry Wall

unread,
Aug 20, 1991, 4:45:27 PM8/20/91
to
In article <1991Aug20....@zoo.toronto.edu> he...@zoo.toronto.edu (Henry Spencer) writes:

: I wrote:
: >>I wrote this script quite some time ago, before the dbz command was
: >>around. I think that newshist was more efficient at the time...
: >
: >Not significantly so, at least not if you supply the <>.
:
: Wups, correction, if you're giving it a whole bunch of message IDs, yes,
: there is a difference. Didn't think anyone did that. :-) However, in
: general, yes, using the dbz command is preferable nowadays (it's present
: even if you aren't using the dbz package).

Some of you may be pleased to hear that the next Perl patch includes
support for creating a dbzperl. In fact, we've been fetching news
from PSI for that last couple of months using a pair of rather
sophisticated dbzperl scripts. I'll post them one of these days.

Larry

Christopher Davis

unread,
Aug 20, 1991, 10:30:08 PM8/20/91
to

lwall> == Larry Wall <lw...@netlabs.com>

lwall> [...] we've been fetching news from PSI for that last couple of
lwall> months using a pair of rather sophisticated dbzperl scripts. [...]

USENET programs.... in Perl.... hrm....

Which will we see first: rn.pl, or P news? (Can't call it PC news, right?)

(There are *some* implicit smileys. This article is not completely
facetious, however. I think a C news re-implemented in Perl would
probably gain a fair amount of speed, and I *know* I'd like a newsreader
that allowed 'eval' in a kill file.)

--
Christopher Davis <c...@eff.org> | ELECTRONIC MAIL WORDS OF WISDOM #5:
System Manager & Postmaster | "Internet mail headers are
Electronic Frontier Foundation | not unlike giblets."
+1 617 864 0665 NIC: [CKD1] | -- Brian Reid <re...@decwrl.dec.com>

Tom Christiansen

unread,
Aug 21, 1991, 12:00:20 AM8/21/91
to
From the keyboard of c...@eff.org (Christopher Davis):
:Which will we see first: rn.pl, or P news? (Can't call it PC news, right?)

:
:(There are *some* implicit smileys. This article is not completely
:facetious, however. I think a C news re-implemented in Perl would
:probably gain a fair amount of speed, and I *know* I'd like a newsreader
:that allowed 'eval' in a kill file.)

Tell me how you'd like to see this work. I don't see how
to avoid giving plum (a program of mine) a newsreading mode.

--tom

Stefan Linnemann

unread,
Aug 21, 1991, 9:13:00 AM8/21/91
to

>--tom

Do you have the time to produce trn.pl, Tom? :-) Us users would be ever
so grateful! :-)

Stefan.

Stefan M. Linnemann | The cutest .sig
System programmer | is not so big.
Leiden University, the Netherlands. |
Email: cri...@rulcvx.LeidenUniv.nl | SMiLe 1991

Henry Spencer

unread,
Aug 21, 1991, 11:53:10 AM8/21/91
to
In article <CKD.91Au...@eff.org> c...@eff.org (Christopher Davis) writes:
>... I think a C news re-implemented in Perl would
>probably gain a fair amount of speed...

Just curious: what on Earth would give you that idea? All the critical
paths are in C.

Matthew Farwell

unread,
Aug 21, 1991, 4:35:14 AM8/21/91
to

I don't think that rewriting the entirity of Cnews in perl is a very
good idea, but writing some of the longer shell scripts might have some
benefit. The ones in particular that I'm talking about are anne.jones and
inews. Newsdaily might benefit too. Anyone fancy having a go?

Dylan.
--
Matthew J Farwell: dy...@ibmpcug.co.uk || ...!uunet!ukc!ibmpcug!dylan
Never trust a programmer with a screwdriver.

Henry Spencer

unread,
Aug 21, 1991, 2:30:19 PM8/21/91
to
In article <1991Aug21.0...@ibmpcug.co.uk> dy...@ibmpcug.CO.UK (Matthew Farwell) writes:
>... writing some of the longer shell scripts might have some

>benefit. The ones in particular that I'm talking about are anne.jones and
>inews. Newsdaily might benefit too. Anyone fancy having a go?

Newsdaily is pretty completely egrep-bound; you won't see much improvement
just from redoing the skeleton. Note that some recent changes to it sped
it up considerably.

The inews complex (which includes anne.jones) is being rethought and
reimplemented anyway.

Brendan Kehoe

unread,
Aug 21, 1991, 3:14:19 PM8/21/91
to
dy...@ibmpcug.CO.UK wrote:
>I don't think that rewriting the entirity of Cnews in perl is a very
>good idea, but writing some of the longer shell scripts might have some
>benefit. The ones in particular that I'm talking about are anne.jones and
>inews. Newsdaily might benefit too. Anyone fancy having a go?

I implemented anne.jones, tear, and spacefor all in C; they're
available via anonymous FTP from ftp.cs.widener.edu in pub/cnews.set.shar.Z,
or via mail server by writing to archive...@cs.widener.edu with a
Subject: line or body of `send widener cnews.set.shar'. Please be
aware they're not part of the official CNews distribution, but I've
found them very useful (especially for users who post and don't want
to wait around).

--
Brendan Kehoe - Widener Sun Network Manager - bre...@cs.widener.edu
Widener University in Chester, PA A Bloody Sun-Dec War Zone
". . . if you were a gas, you would be inert." -- Dieter

Jeff Wandling

unread,
Aug 21, 1991, 5:46:17 PM8/21/91
to
c...@eff.org (Christopher Davis) writes:

> lwall> == Larry Wall <lw...@netlabs.com>
> lwall> [...] we've been fetching news from PSI for that last couple of
> lwall> months using a pair of rather sophisticated dbzperl scripts. [...]

>USENET programs.... in Perl.... hrm....

No. Please.

>Which will we see first: rn.pl, or P news? (Can't call it PC news, right?)

>(There are *some* implicit smileys. This article is not completely
>facetious, however. I think a C news re-implemented in Perl would
>probably gain a fair amount of speed, and I *know* I'd like a newsreader
>that allowed 'eval' in a kill file.)

Speed. Heh... who cares about speed. The question is, will a perl script
written *today* run under perl of *tomorrow*?... Not a flame 2 Larry. Just
an observation. #include <std_ice_cubes.h>

>--
>Christopher Davis <c...@eff.org> | ELECTRONIC MAIL WORDS OF WISDOM #5:
>System Manager & Postmaster | "Internet mail headers are
>Electronic Frontier Foundation | not unlike giblets."
>+1 617 864 0665 NIC: [CKD1] | -- Brian Reid <re...@decwrl.dec.com>


--
Jeff Wandling <je...@cs.wwu.edu> <uw-beaver!henson.cc.wwu.edu!jeff>
disclaimer: "They just work my wheels off, I'm not paid to have opinions"- me
"It's ok to be young. And the csh is still out to get you. :-)" -t.christiansen

Christopher Davis

unread,
Aug 21, 1991, 6:00:26 PM8/21/91
to

HS> == Henry Spencer <he...@zoo.toronto.edu>

HS> Just curious: what on Earth would give you that idea? All the
HS> critical paths are in C.

Having discussed this with Geoff in e-mail, also, I must say that my
impression was wrong. The only C news code I've really looked at in
much detail is stuff like inews and related bits, which (of course) are
the ones that would (at least at first glance) benefit from perl rewrites.

It also probably didn't help that I was looking at the 10-Jan-1990 source...

*sigh* I overgeneralized on this one. Mea culpa. I'll make it up by
being one of the volunteers to re-run the sf-lovers vote or something.

Christopher Davis

unread,
Aug 21, 1991, 6:18:58 PM8/21/91
to
[Followups to news.software.readers and/or comp.lang.perl; the
discussion's on how to integrate perl code into a KILL file, so it's
germane to both. Moving out of news.software.b.]

Tom> == Tom Christiansen <tch...@convex.COM>

ckd> I *know* I'd like a newsreader that allowed 'eval' in a kill file.

Tom> Tell me how you'd like to see this work. I don't see how
Tom> to avoid giving plum (a program of mine) a newsreading mode.

Well, that depends on whether or not we're looking for backward
compatibility. It shouldn't be too hard to write a 'rn2plum' in perl to
convert the syntax, so I'll assume we don't care if rn-style KILL file
syntax is compatible.

GNUS does pretty well with a sort of "power-steering" elisp for KILL
files; it allows you to hit a few keys to insert a "kill this subject"
or "kill this author" command, which you can always edit to be more (or
less) stringent.

Example: I hit ESC k to get a buffer news.software.b.KILL, then C-c C-k
C-a to "command kill author", and it inserts:
(gnus-kill "From" "tchrist@convex\\.COM (Tom Christiansen)")

It even nicely regexpifies the name, as you can see.

I'd like to see something like:

$HOME/News/comp.lang.perl.FILTER:
----------------
require 'ckdextensions.pl';
&ckd'archive if $header{'From'} =~ /^tchrist@.*convex\.com/i;
&kill if $header{'Subject'} =~ /MODERATOR ABUSE!!!/;
&ckd'mail-to-friend('da...@somevax.big-u.edu') if
(length($header{'References'}) > 50);
&ckd'unshar('/usr/local/lib/perl/usenet')
if ($header{'Keywords'} =~ /\bshar\b/);
----------------

with some easy way of generating simple "kill if from" and "kill if
subject match" without having to hand-write it, but I really want full
perl in there.

I don't think KILL files are still the right name, which is why I
suggested FILTER instead...

--Chris

Tom Christiansen

unread,
Aug 21, 1991, 7:20:30 PM8/21/91
to
From the keyboard of je...@henson.cc.wwu.edu (Jeff Wandling):
:Speed. Heh... who cares about speed. The question is, will a perl script

:written *today* run under perl of *tomorrow*?... Not a flame 2 Larry. Just
:an observation. #include <std_ice_cubes.h>

Perl1 programs, with very few exceptions run fine under perl4(*). The
question is not whether the perl programs of today will run under the perl
of tomorrow, but rather whether the perl programs of tomorrow will run
under the perl of today! I'm serious: people get bent out of shape about
this all the time: we post perl4 programs that make use of perl4 features
(i.e. stuff from the book) and don't work under perl3, and they complain.
Sigh.

--tom

(*) An exception that comes to mind is lower case filehandle idents that
have since become reserved words -- should have been upper case all
the time of course.

Matthew Farwell

unread,
Aug 22, 1991, 4:59:36 AM8/22/91
to
In article <1991Aug21.1...@zoo.toronto.edu> he...@zoo.toronto.edu (Henry Spencer) writes:
>In article <1991Aug21.0...@ibmpcug.co.uk> dy...@ibmpcug.CO.UK (Matthew Farwell) writes:
>>... writing some of the longer shell scripts might have some
>>benefit. The ones in particular that I'm talking about are anne.jones and
>>inews. Newsdaily might benefit too. Anyone fancy having a go?
>Newsdaily is pretty completely egrep-bound; you won't see much improvement
>just from redoing the skeleton. Note that some recent changes to it sped
>it up considerably.

But you could avoid multiple passes, which would become more important
with longer and longer logs.

>The inews complex (which includes anne.jones) is being rethought and
>reimplemented anyway.

It was only a suggestion. Newsdaily isn't exactly a life-critical
application anyway.

Henry Spencer

unread,
Aug 22, 1991, 1:26:41 PM8/22/91
to
In article <1991Aug22....@ibmpcug.co.uk> dy...@ibmpcug.CO.UK (Matthew Farwell) writes:
>>Newsdaily is pretty completely egrep-bound; you won't see much improvement
>>just from redoing the skeleton. *Note that some recent changes to it sped
>>it up considerably.* [highlighting added]

>
>But you could avoid multiple passes, which would become more important
>with longer and longer logs.

Note the highlighted phrase. It already avoids multiple passes to a large
extent.

Denny Page

unread,
Aug 22, 1991, 6:41:44 PM8/22/91
to
Ron> Xref: tekbspa news.software.b:6991 comp.lang.perl:5299

Ron> In article <HOGAN.91A...@cujo.csl.sri.com> ho...@csl.sri.com (Emmett Hogan) writes:
Ron> Here is a perl script called nstats. It gives a nice report.

Ron> Enjoy:

Ron> #!/usr/bin/perl
Ron> #
Ron> # Nstats - Print C news statistics via Perl
Ron> #
Ron> # Version 1.2 (10/17/89)


I haven't had time to read news much lately, but someone told me I
just *had* to look at this posting in comp.lang.perl... It's amazing
how these things can come back an get you... :-)

Anyway, I have recently fixed nstats to go with newer C-news (dbz),
and have included the new version (1.3) below.

Denny

#!/usr/bin/perl
#
# Nstats - Print C news statistics via Perl
#

# Version 1.3 (08/22/91)


#
#
#
# Author's notes:
#
# Constructive comments and enhancements are solicited (flames are not).

# Please send suggestions or enhancements to de...@tss.com.


#
# Larry Wall has a Very Nice Work in Perl. Many thanks to him.
#

# Denny Page, 1989, 1991


#
#
#
# Program notes:
#
# The simplest usage is 'perl nstats ~news/log'. I leave you to find
# more complicated invocations.
#
# While a duplicate is actually a rejected message, it is treated
# separately here. Rejected messages herein are messages that are not
# subscribed to in the sys file or are excluded in the active file.
#
# Junked messages are not displayed in the system summaries. It's not
# your neighbor's fault that you are missing active file entries. If
# you are concerned about receiving junk groups, exclude them in your
# sys or active file. They will then be summarized :-).
#
# The reason for a newsgroup being bad is assigned only once. If the
# reason changes later in the log (such as the sys file being modified
# such that a newsgroup is no longer rejected, but rather is filed in
# junk), no notice will be taken.
#

# Sitenames are truncated to 15 characters. This could be done better.
#
#
# Output headers have the following meanings:
#
# System Name of the neighboring system.
# Accept Number of accepted articles from system.
# Dup Number of duplicate articles received from system.
# Rej Number of rejected articles from system.
# Sent Number of articles sent to system.
# Sys% Accepted (or duplicate or rejected) articles as a
# percentage of total articles from that system.
# Tot% Accepted (or duplicate) articles as a percentage
# of total accepted (or duplicate) articles.
# Avl% Number of articles sent as a percentage of total
# available (accepted) articles.
#
############################################################
#
# Revision history:
#

# 09/24/89 denny@mcmi Initial version
# 09/28/89 denny@mcmi Added category totals
# 10/02/89 denny@mcmi Fixed link count bug in record_groups
# 10/03/89 denny@mcmi Cleaned up variable names
# 10/16/89 denny@mcmi Renamed variables - Perl 3.0
# 10/17/89 denny@mcmi Fixed bug in rejection counts
# 07/31/91 de...@tss.com Changed to use dbz
# 08/03/91 de...@tss.com Added "older than" check
#
############################################################

################ ***** Change this ***** ###################
#
$history='/usr/local/lib/news/history';
$dbz='/usr/local/lib/newsbin/dbz';
$tmpfile="/tmp/nstats.$$";
#
############################################################


# Open dbz command
open(DBZ, "|-") || exec "$dbz -ix $history >$tmpfile";

while (<>) {
($from, $action, $message_id, $text) =

/^.+\s(\S+)\s(.)\s(<.+>)\s(.*)$/;


$from = substr($from, 0, 15);

# Accepted message
if ($action eq '+') {
$accepted{$from}++;
foreach $site (split(/ /, $text)) {
$site = substr($site, 0, 15);
$sent{$site}++;
}

print DBZ "$message_id\n";


next;
}
elsif ($action eq '-') {
# Duplicate
if ($text eq 'duplicate') {
$duplicates{$from}++;
next;
}
$rejected{$from}++;

# Older than specified in explist
if ($text =~ /older than .* days/) {
next;

close(DBZ);

open(DBZ_OUTPUT, "< $tmpfile");
while (<DBZ_OUTPUT>) {


if (s/^.+\t.+\t(.+)\n$/$1/) {

foreach $link (split(/ /)) {
$link =~ s/^([^\.\/]+).*/$1/;
$category{$link}++;
}
}

else {
$category{"*expired*"}++;
}
}
unlink($tmpfile);

Larry Wall

unread,
Aug 22, 1991, 6:09:29 PM8/22/91
to
In article <1991Aug21....@henson.cc.wwu.edu> je...@henson.cc.wwu.edu (Jeff Wandling) writes:
: Speed. Heh... who cares about speed. The question is, will a perl script

: written *today* run under perl of *tomorrow*?... Not a flame 2 Larry. Just
: an observation. #include <std_ice_cubes.h>

The answer is basically "yes". Apart from the occasional bugs I install
while fixing something else, and the even rarer breakages that are induced
by regularizing the language, Perl has been astonishingly upward compatible.
Most Perl version 1 scripts still run under Perl version 4 without change.

(Part of the reason for this is that the namespaces in Perl are kept distinct
via typemarkers, so that I can add new "reserved" words without blowing up
scripts that use that name for an identifier.)

For most System Admin tasks, however, your primary concern will not be
whether Perl 4.0 will be compatible with Perl 18.0, but whether your BSD Perl
is compatible with your AT&T Perl and your IBM Perl and your MIPS Perl...

Compatibility is actually Perl's long suit. This is enhanced by the
fact that you can install whatever version you want, wherever you want,
without economic considerations. There's also a mechanism for keeping
old versions of the executable around for critical operations--for
instance, if you know that 4.010 works on a particular script, you can
change

#!/usr/local/bin/perl

to

#!/usr/local/bin/perl4.010

So, while I agree with Henry about the fact that the core of C News
wouldn't be significantly sped up by recasting in Perl, for many
ancillary tasks it may be the tool of choice, particularly if you want
to do something like write an efficient inews that works under n
different architectures without recompilation. (Yes, I'm aware
that you have to compile Perl on the different architectures first.
The idea is to factor out that work and do it just once.)

Besides, including <std_ice_cubes.h> is a fatal error on machines that
don't have it yet. Bad language design, there... :-)

Larry

Scott Sanbeg

unread,
Aug 23, 1991, 1:00:20 AM8/23/91
to
In article <CKD.91Au...@eff.org> c...@eff.org (Christopher Davis) writes:
...
>
>*sigh* I overgeneralized on this one. Mea culpa. I'll make it up by
>being one of the volunteers to re-run the sf-lovers vote or something.

Good gawd -- you don't deserve THAT severe a punishment! :)
Scott

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Scott Sanbeg Spokane, Wa. Voice:509/535-2561 BIX: ssanbeg
ssa...@visual.spk.wa.us (or) visual!ssa...@tau-ceti.isc-br.com

Root Boy Jim

unread,
Aug 26, 1991, 6:17:00 PM8/26/91
to
he...@zoo.toronto.edu (Henry Spencer) writes:
>Newsdaily might benefit too. Anyone fancy having a go?
>
>Newsdaily is pretty completely egrep-bound; you won't see much improvement
>just from redoing the skeleton. Note that some recent changes to it sped
>it up considerably.

The real problem is that grep doesn't do what you want.
Remember the grep wars of a few years ago? What is needed
is a grep that splits data into two streams:

grep -2 pat1 2>pat1.match file |
grep -2 pat2 2>pat2.match |
grep -2 pat3 2>pat3.match |
grep -2 pat4 2>pat4.match > rest

Now nobody's news log is quite as big as ours (8M),
but then again, multiple egrep is still a drag.

Here is my newsdaily.pl:

#! /usr/local/bin/perl
$NAME=shift || "news";
$DUPS="$NAME@dups"; open(DUPS,">$DUPS") || die("open($DUPS)"); $dups=0;
$JUNK="$NAME@junk"; open(JUNK,">$JUNK") || die("open($JUNK)"); $junk=0;
$DATE="$NAME@date"; open(DATE,">$DATE") || die("open($DATE)"); $date=0;
$NOSB="$NAME@nosb"; open(NOSB,">$NOSB") || die("open($NOSB)"); $nosb=0;
$UNAP="$NAME@unap"; open(UNAP,">$UNAP") || die("open($UNAP)"); $unap=0;
$HDRS="$NAME@hdrs"; open(HDRS,">$HDRS") || die("open($HDRS)"); $hdrs=0;
$SEND="$NAME@send"; open(SEND,">$SEND") || die("open($SEND)"); $send=0;
$MISC="$NAME@misc"; open(MISC,">$MISC") || die("open($MISC)"); $misc=0;

while (<>) {
study;
# print("$count lines\n") unless ++$count % 1000;
++$total;
++$good,next if / \+ /; # good news is no news

++$dups,print(DUPS $_),next if / duplicate$/;

++$junk,print(JUNK $_),next if / junked /;

++$date,print(DATE $_),next if / unparsable [dD]ate: /;
++$date,print(DATE $_),next if / [dD]ate: too far in the future:/;
++$date,print(DATE $_),next if / ancient [dD]ate/;
++$date,print(DATE $_),next if / older than 28 days$/; # OBSOLETE

++$nosb,print(NOSB $_),next if / no subscribed/;

++$unap,print(UNAP $_),next if /unapproved/;

++$hdrs,print(HDRS $_),next if /header/;
++$hdrs,print(HDRS $_),next if /Message-ID/;

++$send,print(SEND $_),next if / [is] /;

++$misc, print(MISC $_);
}
print("$dups duplicates, $junk junked\n");
print("$unap unapproved, $nosb unsubscribed\n");
print("$date date problems, $hdrs header problems\n");
print("$send ihave/sendme, $misc misc\n");
print("$good good articles, $total total.\n");
print("\n");

Here is the tail of our newsdaily:

# look for problem newsgroups on input (can miss cross-posted articles)
####
#### use perl to split log rather than looking at it many times
####
cd /var/log
newsdaily.pl < news.0 >>$gripes
####egrep '`' news.0 | egrep junked | sed 's/.*`\(.*\)'"'"'.*/\1/' | sort |
#### uniq -c | sort -nr | sed 5q >$tmp
####if test -s $tmp
if test -s news@junk
then
(
echo 'leading five unknown newsgroups by number of articles:'
#### cat $tmp
sed 's/.*`\(.*\)'"'"'.*/\1/' news@junk | sort | uniq -c | sort -nr | sed 5q
echo
) >>$gripes
fi

####egrep '`' news.0 | egrep unapproved | sed 's/.*`\(.*\)'"'"'.*/\1/' | sort |
#### uniq -c | sort -nr | sed 5q >$tmp
####if test -s $tmp
if test -s news@unap
then
(
echo 'top five supposedly-moderated groups with unmoderated postings:'
#### cat $tmp
sed 's/.*`\(.*\)'"'"'.*/\1/' news@unap | sort | uniq -c | sort -nr | sed 5q
echo
) >>$gripes
fi

####egrep '`' news.0 | egrep 'no subscribed' | sed 's/.*`\(.*\)'"'"'.*/\1/' | sort |
#### uniq -c | sort -nr | sed 5q >$tmp
####if test -s $tmp
if test -s news@nosb
then
(
echo 'leading five unsubscribed newsgroups:'
#### cat $tmp
sed 's/.*`\(.*\)'"'"'.*/\1/' news@nosb | sort | uniq -c | sort -nr | sed 5q
echo
) >>$gripes
fi

# And other signs of problems.
####egrep 'older than|too far in the future|unparsable Date' news.0 | egrep ' - ' |
#### awk '{print $4}' | sort | uniq -c | sort -nr | sed 5q >$tmp
####if test -s $tmp
if test -s news@date
then
(
echo 'leading five sites sending stale/future/misdated news:'
#### cat $tmp
awk '{print $4}' news@date | sort | uniq -c | sort -nr | sed 5q
echo
) >>$gripes
fi

####egrep ' (no|empty) .* header|contains non-header|Message-ID' news.0 |
#### egrep ' - ' | awk '{print $4}' | sort | uniq -c | sort -nr |
#### sed 5q >$tmp
####if test -s $tmp
if test -s news@xhdr
then
(
echo 'leading five sites sending news with bad headers:'
#### cat $tmp
awk '{print $4}' news@xhdr | sort | uniq -c | sort -nr | sed 5q
echo
) >>$gripes
fi

####RBJ@UUNET Added Duplicate Summary

if test -s news@dups
then
(
echo 'leading ten sites sending duplicate news:'
awk '{print $4}' news@dups | sort | uniq -c | sort -nr | sed 10q
echo
) >>$gripes
fi

INEWS=$NEWSBIN/inject/inews

# and send it
if test -s $gripes
then
#### mail $gurus rbj <$gripes
$INEWS -t "News Summary for `hostname` on `date`" -n mail.reports.usenet <$gripes
fi

####RBJ@UUNET Added NNTP Summary

awk -f /usr/local/lib/nntp/nntp_awk nntp.0 |
$INEWS -t "NNTP Summary for `hostname` on `date`" -n mail.reports.nntp

cd /var/log
compress news.0 newsxd.0 nntp.0 nntpxd.0 nntplink.0 newsrun.0 batch.0
--
Desolation Row Jimmy <rbj@uunet>

Henry Spencer

unread,
Aug 26, 1991, 7:27:53 PM8/26/91
to
In article <1991Aug26....@uunet.uu.net> r...@uunet.uu.net (Root Boy Jim) writes:
>>Newsdaily is pretty completely egrep-bound; you won't see much improvement
>>just from redoing the skeleton. Note that some recent changes to it sped
>>it up considerably.
>
>The real problem is that grep doesn't do what you want.
>Remember the grep wars of a few years ago? What is needed
>is a grep that splits data into two streams...

I can see I need to say this more loudly:

NOTE THAT SOME RECENT CHANGES TO NEWSDAILY SPED IT UP CONSIDERABLY.

The current newsdaily strips out all the uninteresting lines -- the vast
bulk of the log -- before it gets down to multiple egrepping. This does
not quite solve the problem; RBJ is right that a splitting egrep is what
is really wanted, and I have an idea or two along those lines that I'm
going to pursue. However, it does reduce the problem by orders of
magnitude, which should suffice for a while.

Reply all
Reply to author
Forward
0 new messages