
FAQ extractor


Bill England

Jan 2, 1991, 7:13:03 PM
In article <10...@unisql.UUCP> you write:
>Does anyone have a little cron daemon or some such that recognizes and
>extracts the growing number of FAQ (Frequently Asked Questions) postings
>and puts them in some other location? I currently extract those in the
>newsgroups that I follow and put them in a public location, but I'm
>thinking of writing something like the above to relieve myself of the
>burden and to also catch those FAQs in newsgroups I don't follow (but
>it's possible someone has already considered and done this) ...
>--
>alfred

It should be pretty easy to hack such an animal in perl. Right now
I have a utility that renames archive news files as a preliminary for
archiving comp.sources.* to floppy disk.

As an initial design, how about running find /usr/spool/news -ctime 1
-print to find new news articles, and then parsing the Subject line
of each one for the string FAQ?
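A rough sketch of that design done entirely in Perl, in case it helps. It assumes a Perl 5 with the standard File::Find module rather than a pipe from find, and the sub names (article_subject, recent_faqs) and the subject pattern are made up for illustration. It only reads the headers, stopping at the first blank line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Return the Subject header of an article file, or undef if none is
# found before the blank line that ends the headers.
sub article_subject {
    my ($path) = @_;
    open(my $fh, '<', $path) or return;
    my $subject;
    while (my $line = <$fh>) {
        last if $line =~ /^\r?$/;              # end of headers
        if ($line =~ /^Subject:\s*(.*?)\s*$/i) {
            $subject = $1;
            last;
        }
    }
    close($fh);
    return $subject;
}

# Collect [path, subject] pairs for plain files under $root whose inode
# changed less than $max_days days ago and whose subject looks like a FAQ.
sub recent_faqs {
    my ($root, $max_days) = @_;
    my @hits;
    find(sub {
        return unless -f $_;
        my $age_days = -C _;                   # days since inode change
        return unless defined $age_days && $age_days < $max_days;
        my $subject = article_subject($_);
        push @hits, [$File::Find::name, $subject]
            if defined $subject
            && $subject =~ /\bFAQs?\b|Frequently Asked/i;
    }, $root);
    return @hits;
}

# Usage (assumed): script.pl /usr/spool/news
if (@ARGV) {
    printf("name: %s, Subject: %s\n", @$_) for recent_faqs($ARGV[0], 1);
}
```

Note that the age test here means "changed within the last day", which is not quite what find -ctime 1 selects; adjust to taste.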

You will probably need a copy/rename function to translate file names ...
Hmm, maybe copy /usr/spool/news/GROUP/0000 /u/ftp/FAQS/GROUP. If your file
system only supports 14 chars then some group renaming/mapping function
will be required.
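One possible shape for that copy/rename step, as a sketch only. It assumes Perl 5 with File::Copy and File::Path, and it reads the destination as one directory per group with the article number as the file name, which is just one interpretation. The sub names and the shortening scheme (truncate and append a 4-digit hex checksum) are hypothetical, and the checksum can collide, so a real archive might prefer a hand-kept mapping table.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Path qw(make_path);

# Turn /usr/spool/news/comp/lang/perl/1234 into ("comp.lang.perl", "1234").
# Returns an empty list if $path is not an article under $spool.
sub split_article_path {
    my ($spool, $path) = @_;
    $spool =~ s{/+$}{};
    return unless index($path, "$spool/") == 0;
    my @parts = grep { length } split m{/}, substr($path, length($spool) + 1);
    return unless @parts >= 2;
    my $article = pop @parts;
    return (join('.', @parts), $article);
}

# Squeeze a name into $limit characters: names that already fit are kept;
# longer ones are cut and given a 4-hex-digit checksum of the full name.
sub short_name {
    my ($name, $limit) = @_;
    return $name if length($name) <= $limit;
    my $sum = unpack('%16C*', $name);
    return substr($name, 0, $limit - 4) . sprintf('%04x', $sum);
}

# Destination path for an article, or nothing if it is not in the spool.
sub faq_dest {
    my ($spool, $dest_root, $path) = @_;
    my ($group, $article) = split_article_path($spool, $path) or return;
    return join('/', $dest_root, short_name($group, 14), $article);
}

# Usage (assumed): script.pl /usr/spool/news/comp/lang/perl/1234
if (@ARGV == 1) {
    my $dest = faq_dest('/usr/spool/news', '/u/ftp/FAQS', $ARGV[0])
        or die "not a spool article: $ARGV[0]\n";
    (my $dir = $dest) =~ s{/[^/]+$}{};
    make_path($dir);
    copy($ARGV[0], $dest) or die "copy failed: $!\n";
}
```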

A modification of this utility could provide you with a daily or weekly
posting list of your favorite authors or topics.
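For the author/topic variation, something along these lines might do. Again a sketch: it assumes Perl 5.10 or later, the watch list entries are placeholders, and the sub names (read_headers, is_watched) are invented here. Folded header continuation lines are ignored for simplicity.

```perl
use strict;
use warnings;

# Read the header block of an article into a hash keyed by the
# lower-cased header name. Stops at the blank line before the body.
sub read_headers {
    my ($path) = @_;
    my %hdr;
    open(my $fh, '<', $path) or return;
    while (my $line = <$fh>) {
        last if $line =~ /^\r?$/;
        $hdr{lc $1} = $2 if $line =~ /^([\w-]+):\s*(.*?)\s*$/;
    }
    close($fh);
    return %hdr;
}

# Hypothetical watch list: [ header name, pattern ] pairs.
my @watch = (
    [ from    => qr/someone\@example\.com/i ],
    [ subject => qr/\bperl\b/i              ],
);

# True if any rule in the watch list matches the article's headers.
sub is_watched {
    my ($hdr, $watch) = @_;
    for my $rule (@$watch) {
        my ($name, $pattern) = @$rule;
        return 1 if defined $hdr->{$name} && $hdr->{$name} =~ $pattern;
    }
    return 0;
}

# Usage (assumed): script.pl article-file ...
for my $path (@ARGV) {
    my %hdr = read_headers($path);
    printf("%s: %s (%s)\n", $path, $hdr{subject} // '', $hdr{from} // '')
        if is_watched(\%hdr, \@watch);
}
```

Fed the file list from a daily find run, this would print one line per article of interest.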

Bill England
weng...@stephsf.COM

#!/u/bin/perl
##
# As a starter here is a perl script that finds some FAQ files
# (This one currently goes through all news files; in practice
# a -ctime 1 would go into the find.)
#
# Bill England
# weng...@stephsf.com
##
eval "exec /u/bin/perl -S $0 $*"
    if $running_under_some_shell;

open(FIND_PIPE, "find /usr/spool/news/ -print|")
    || die "Cannot run find: $!\n";

while (<FIND_PIPE>) {
    chop($fn = $_);     # strip the trailing newline from the path

    # Skip files that are not text files ... ( Skip Directory names )
    #
    next unless -T $fn;

    open(IN_NEWS, $fn) || next;

    while (<IN_NEWS>) {
        last if /^$/;   # blank line ends the headers; no Subject found

        next unless /^Subject: /;
        ($junk, $subject) = split(/: /, $_, 2);
        chop($subject);

        # Match "Frequently Asked ..." at the start, or FAQ anywhere.
        printf("name: %s, Subject: %s\n", $fn, $subject)
            if $subject =~ /^Freq|FAQ/;

        last;           # IN_NEWS
    }
    close(IN_NEWS);
}   # end of find_pipe
close(FIND_PIPE);
--
+- Bill England, weng...@stephsf.COM -----------------------------------+
| * * H -> He +24Mev |
| * * * ... Oooo, we're having so much fun making itty bitty suns * |
|__ * * ___________________________________________________________________|

harald.a...@elab-runit.sintef.no

Jan 4, 1991, 5:32:32 AM
There is a simpler way, at least for the end-user.
I hacked up a PERL script to run nntp (the socket-based mechanism), and
by using the XHDR command, I can get the subject fields, and search them.

This gives me at least the local article numbers of the FAQ articles.

I expect the server would not like me to implement a command that searches
all groups for FAQ subjects :-)
(no, the script IS too ugly for posting :-)
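For anyone wanting to try the XHDR route without the ugly script, here is a minimal sketch. It is not the script described above: it swaps the hand-rolled socket code for the Net::NNTP module that ships with Perl 5, and the sub names and subject pattern are made up for illustration. The server must support XHDR, and the host and group come from the command line.

```perl
use strict;
use warnings;
use Net::NNTP;

# Given the hash reference returned by XHDR (article number => subject),
# return the numbers whose subject looks like a FAQ, in ascending order.
sub faq_article_numbers {
    my ($subjects) = @_;
    return sort { $a <=> $b }
           grep { $subjects->{$_} =~ /\bFAQs?\b|Frequently Asked/i }
           keys %$subjects;
}

# Ask the server for the Subject header of every article in one group.
sub fetch_subjects {
    my ($host, $group) = @_;
    my $nntp = Net::NNTP->new($host) or die "cannot connect to $host\n";
    my ($count, $first, $last) = $nntp->group($group)
        or die "no such group: $group\n";
    my $subjects = $nntp->xhdr('Subject', [$first, $last]) || {};
    $nntp->quit;
    return $subjects;
}

# Usage (assumed): script.pl news.example.org comp.lang.perl
if (@ARGV == 2) {
    my $subjects = fetch_subjects(@ARGV);
    print "$_: $subjects->{$_}\n" for faq_article_numbers($subjects);
}
```

This asks about one group at a time, which keeps the load on the server modest.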


Harald Tveit Alvestrand
Harald.A...@elab-runit.sintef.no
C=no;PRMD=uninett;O=sintef;OU=elab-runit;S=alvestrand;G=harald
+47 7 59 70 94
