leafnode: running texpire weekly instead of daily?

Adam Funk
Jul 24, 2012, 6:53:21 AM

(I posted this in the leafnode-users mailing list a little while ago,
but no-one replied there. Apologies to those who are seeing it again
here.)


I have a fairly long retention period in my leafnode configuration,
and the cron.daily/leafnode job takes about 1 hour to run. (I'm
running leafnode-2.0.0.alpha20110807b.luascript, and the cron script
might have been copied from an old Debian/Ubuntu package.)

Is the texpire job's running time related more to the size of the
spool or the number of articles that it decides to delete?

Would it do any harm to run it weekly instead of daily?

Thanks,
Adam




--
But the government always tries to coax well-known writers into the
Establishment; it makes them feel educated. [Robert Graves]

Mike Yetto
Jul 24, 2012, 8:17:03 AM

Adam Funk <a24...@ducksburg.com> writes and having writ moves on.
>(I posted this in the leafnode-users mailing list a little while ago,
>but no-one replied there. Apologies to those who are seeing it again
>here.)

I saw this on the mailing list, but don't have a definitive
answer. So, here's my best guess.

>I have a fairly long retention period in my leafnode configuration,
>and the cron.daily/leafnode job takes about 1 hour to run. (I'm
>running leafnode-2.0.0.alpha20110807b.luascript, and the cron script
>might have been copied from an old Debian/Ubuntu package.)

>Is the texpire job's running time related more to the size of the
>spool or the number of articles that it decides to delete?

I would think that it is dependent on the size of the spool since
each article must be checked before a decision is made. A factor
to consider is if you are deleting by thread or by individual
article. I assume deleting by thread could leave more articles on the
spool as well as require more processing. As this might be
negligible or spool-size dependent YMMV.

>Would it do any harm to run it weekly instead of daily?

There should be no harm done, but if you're wondering which way
is more efficient / less intrusive, only some testing will tell.

Mike "post results / corrected assumptions" Yetto
--
In theory, theory and practice are the same.
In practice they are not.

Whiskers
Jul 24, 2012, 1:10:47 PM

On 2012-07-24, Adam Funk <a24...@ducksburg.com> wrote:
> (I posted this in the leafnode-users mailing list a little while ago,
> but no-one replied there. Apologies to those who are seeing it again
> here.)
>
>
> I have a fairly long retention period in my leafnode configuration,
> and the cron.daily/leafnode job takes about 1 hour to run. (I'm
> running leafnode-2.0.0.alpha20110807b.luascript, and the cron script
> might have been copied from an old Debian/Ubuntu package.)
>
> Is the texpire job's running time related more to the size of the
> spool or the number of articles that it decides to delete?
>
> Would it do any harm to run it weekly instead of daily?
>
> Thanks,
> Adam

I don't use Leafnode at present (slrn talks directly to Individual.net
over ADSL), but when I did use it I had texpire run weekly, in the
early hours of Sunday when no-one would be using the machine; then all
I had to do was remember not to shut down on Saturday night :))

--
-- ^^^^^^^^^^
-- Whiskers
-- ~~~~~~~~~~

Fritz Wuehler
Jul 24, 2012, 2:38:44 PM

Adam Funk <a24...@ducksburg.com> wrote:

> (I posted this in the leafnode-users mailing list a little while ago,
> but no-one replied there. Apologies to those who are seeing it again
> here.)
>
>
> I have a fairly long retention period in my leafnode configuration,
> and the cron.daily/leafnode job takes about 1 hour to run. (I'm
> running leafnode-2.0.0.alpha20110807b.luascript, and the cron script
> might have been copied from an old Debian/Ubuntu package.)
>
> Is the texpire job's running time related more to the size of the
> spool or the number of articles that it decides to delete?

It seems to be directly related to the size of the spool. My daily texpire
runs in the middle of the night but it also takes a long time. It doesn't
take an hour though. maybe 20 minutes. How big is your spool? Mine is about
8G right now. I have other problems with leafnode and I'm going to switch to
INN when I get the time.

> Would it do any harm to run it weekly instead of daily?

No, the only purpose is to free up space. If you don't need the space freed
daily you could even run it monthly and probably not even miss it. It does
do some repair but if you don't have problems you're aware of then you won't
miss that part either.

John F. Morse
Jul 24, 2012, 4:31:43 PM

Fritz Wuehler wrote:
> Adam Funk <a24...@ducksburg.com> wrote:
>> (I posted this in the leafnode-users mailing list a little while ago,
>> but no-one replied there. Apologies to those who are seeing it again
>> here.)
>>
>> I have a fairly long retention period in my leafnode configuration,
>> and the cron.daily/leafnode job takes about 1 hour to run. (I'm
>> running leafnode-2.0.0.alpha20110807b.luascript, and the cron script
>> might have been copied from an old Debian/Ubuntu package.)
>>
>> Is the texpire job's running time related more to the size of the
>> spool or the number of articles that it decides to delete?
>>
>
> It seems to be directly related to the size of the spool. My daily texpire
> runs in the middle of the night but it also takes a long time. It doesn't
> take an hour though. maybe 20 minutes. How big is your spool? Mine is about
> 8G right now. I have other problems with leafnode and I'm going to switch to
> INN when I get the time.
>


Running INN is precisely what I was going to suggest.

Without knowing what Adam's computer's hardware capabilities are, the
times given are not of much value for comparison. But here is what some
of my servers did at midnight.

My oldest INN server, using a 266 MHz AMD-K6 CPU and only 256 MB of RAM,
can expire a "full Usenet text feed" plus a few binary groups, in less
than two hours.

That would be 88,798 articles from a spool containing 7,985,277 articles
as shown below:

Expire messages:
expireover start Tue Jul 24 00:04:42 CDT 2012: ( -z/var/log/news/expire.rm -Z/var/log/news/expire.lowmark)
Article lines processed 51109
Articles dropped 0
Overview index dropped 0
expireover end Tue Jul 24 01:29:43 CDT 2012
lowmarkrenumber begin Tue Jul 24 01:29:43 CDT 2012: (/var/log/news/expire.lowmark)
lowmarkrenumber end Tue Jul 24 01:29:45 CDT 2012
expire begin Tue Jul 24 01:30:16 CDT 2012: (-v1)
Article lines processed 7985277
Articles retained 7896479
Entries expired 88798
expire end Tue Jul 24 01:45:13 CDT 2012
all done Tue Jul 24 01:45:13 CDT 2012



Total time is 01:45:13 which includes everything (expiring the overview
database, checking and renumbering the lowmark, expiring articles, and
making and mailing the report).
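
The timestamps in a report like the one above can be turned into per-phase durations. What follows is a minimal Python sketch, not part of INN: the function name `phase_durations` and the regular expression are assumptions based only on the line layout shown in this post, and the time zone is ignored on the assumption that every stamp in one report uses the same zone.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as "expire begin Tue Jul 24 01:30:16 CDT 2012: (-v1)".
LINE = re.compile(
    r"^(?P<phase>expireover|lowmarkrenumber|expire)\s+(?P<edge>start|begin|end)\s+"
    r"\w{3} (?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) \w+ (?P<year>\d{4})"
)

# Copied from the first server's report in this post.
REPORT = """\
expireover start Tue Jul 24 00:04:42 CDT 2012: ( -z/var/log/news/expire.rm -Z/var/log/news/expire.lowmark)
Article lines processed 51109
expireover end Tue Jul 24 01:29:43 CDT 2012
lowmarkrenumber begin Tue Jul 24 01:29:43 CDT 2012: (/var/log/news/expire.lowmark)
lowmarkrenumber end Tue Jul 24 01:29:45 CDT 2012
expire begin Tue Jul 24 01:30:16 CDT 2012: (-v1)
Article lines processed 7985277
expire end Tue Jul 24 01:45:13 CDT 2012
all done Tue Jul 24 01:45:13 CDT 2012
"""


def phase_durations(report):
    """Return {phase name: elapsed timedelta} for each start/end pair found."""
    started, durations = {}, {}
    for line in report.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # counters and "all done" lines carry no phase boundary
        when = datetime.strptime(f"{m['stamp']} {m['year']}", "%b %d %H:%M:%S %Y")
        if m["edge"] == "end":
            durations[m["phase"]] = when - started[m["phase"]]
        else:
            started[m["phase"]] = when
    return durations


durations = phase_durations(REPORT)
for phase, elapsed in durations.items():
    print(phase, elapsed)
print("sum of phases", sum(durations.values(), timedelta()))
```

For the first server this attributes 1:25:01 to expireover and 0:14:57 to expire, so the overview pass accounts for most of the run.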

This server uses a 40 GB hard drive, which is 95% full.

An identical computer with the same hardware, except that it has a 20 GB hard
drive that is 84% full, finishes in 37:54, expiring 71,708 articles.


Expire messages:
expireover start Tue Jul 24 00:04:39 CDT 2012: ( -z/var/log/news/expire.rm -Z/var/log/news/expire.lowmark)
Article lines processed 51104
Articles dropped 0
Overview index dropped 0
expireover end Tue Jul 24 00:32:13 CDT 2012
lowmarkrenumber begin Tue Jul 24 00:32:13 CDT 2012: (/var/log/news/expire.lowmark)
lowmarkrenumber end Tue Jul 24 00:32:14 CDT 2012
expire begin Tue Jul 24 00:32:44 CDT 2012: (-v1)
Article lines processed 2649036
Articles retained 2577328
Entries expired 71708
expire end Tue Jul 24 00:37:54 CDT 2012
all done Tue Jul 24 00:37:54 CDT 2012



Here is another server, using a slower 233 MHz AMD-K6 with only a 4.3 GB
hard drive. It takes six minutes, but it is a transit-only server (it has no
active file and won't serve readers):


Renumbering active file.
Expire messages:
expire begin Tue Jul 24 00:05:19 CDT 2012: (-v1 -z/var/log/news/expire.rm)
Article lines processed 363582
Articles retained 290295
Entries expired 73287
expire end Tue Jul 24 00:06:12 CDT 2012
all done Tue Jul 24 00:06:12 CDT 2012



Still another transit server, which has an Intel 1.6 GHz Dual Core
Celeron E2100, 4 GB RAM and a 160 GB hard drive:


Expire messages:
expireover start Tue Jul 24 00:00:20 CDT 2012: ( -z/var/log/news/expire.rm -Z/var/log/news/expire.lowmark)
Article lines processed 5152
Articles dropped 0
Overview index dropped 0
expireover end Tue Jul 24 00:04:38 CDT 2012
lowmarkrenumber begin Tue Jul 24 00:04:38 CDT 2012: (/var/log/news/expire.lowmark)
lowmarkrenumber end Tue Jul 24 00:04:38 CDT 2012
expire begin Tue Jul 24 00:05:08 CDT 2012: (-v1)
Article lines processed 3089660
Articles retained 3015844
Entries expired 73816
expire end Tue Jul 24 00:05:35 CDT 2012
all done Tue Jul 24 00:05:35 CDT 2012



Over three million articles in 05:35.

I can only guess that Adam's computer is old, slow, or is being used by
other processes.

Or he pulls/sucks many groups, or large groups. (Hint: Use filtering if
available.)

Or his retention is too lengthy. (Hint: Download and save interesting
articles locally.)

Or Leafnode leaves a lot to be desired (I've never used Leafnode so I'll
not judge it).

I should also point out that I use CNFS for the spool, which is a set
size of spool buffers. Old articles are overwritten, saving deletion
(removal) time.

With CNFS, retention can vary by incoming load, and by server
configuration. Junk groups come and go within days or maybe hours, while
groups that contain useful information can remain for many months and
even years. You get to choose which groups are assigned to which class,
how many cycbuffs serve each class, and their sizes.
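
The idea of a fixed-size buffer that overwrites its oldest articles can be illustrated with a toy model. This is only a sketch of the concept in Python; the class `ToyCycbuff` is hypothetical and says nothing about INN's real on-disk CNFS layout.

```python
from collections import deque


class ToyCycbuff:
    """Toy fixed-size article store; NOT INN's real on-disk CNFS format."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.articles = deque()  # (message_id, size) pairs, oldest first

    def store(self, message_id, size):
        """Store an article, returning the IDs of any articles it overwrote."""
        if size > self.capacity:
            raise ValueError("article is larger than the whole buffer")
        overwritten = []
        # Make room by dropping the oldest articles; no separate expire pass.
        while self.used + size > self.capacity:
            old_id, old_size = self.articles.popleft()
            self.used -= old_size
            overwritten.append(old_id)
        self.articles.append((message_id, size))
        self.used += size
        return overwritten


buf = ToyCycbuff(capacity_bytes=100)
buf.store("<a@example.invalid>", 40)
buf.store("<b@example.invalid>", 40)
print(buf.store("<c@example.invalid>", 40))  # the oldest article is overwritten
```

In this model retention is never set directly: it falls out of the buffer size and the incoming volume, which is why a busy class turns over in days while a quiet one can hold articles for years.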

Transit servers do not need to keep articles very long, maybe ten days
at the most, just in case a downstream peer has been offline for a while.
Most servers will not accept articles older than ten days anyway.

Reader servers should keep articles for a minimum of 16 days, which
would allow someone on a two-week vacation to read, or at least download
the news (this includes both weekends).

For an example, the first server shown above has the following expiry
configuration results:


Class veryfast for groups matching "control.cancel,*.test,news.lists.filters"
Buffer 001, size: 64.0 MBytes, position: 16.2 MBytes 67.25 cycles
Newest: 2012-07-24 15:00:10, 0 days, 0:04:21 ago
Oldest: 2012-06-29 15:45:02, 24 days, 23:19:29 ago

Class fastbins for groups matching "*.bina*,!local.binaries"
Buffer 002, size: 256 MBytes, position: 23.0 MBytes 75.09 cycles
Newest: 2012-07-24 14:59:44, 0 days, 0:04:48 ago
Oldest: 2012-05-16 16:51:34, 68 days, 22:12:58 ago

Class fasttext for groups matching "news.admin.net-abuse.email,news.admin.net-abuse.sightings,*.jobs*,perl.cpan.testers,alt.alt.spamtrap"
Buffer 003, size: 512 MBytes, position: 43.8 MBytes 56.09 cycles
Newest: 2012-07-24 14:59:44, 0 days, 0:04:48 ago
Oldest: 2012-06-30 19:37:22, 23 days, 19:27:10 ago

Class slowbins for groups matching "local.binaries"
Buffer 004, size: 2.00 GBytes, position: 132 MBytes 0.06 cycles
Newest: 2012-07-22 17:25:39, 1 days, 21:38:53 ago
Oldest: 2009-07-01 2:03:46, 1119 days, 13:00:46 ago

Class slowtext for groups matching "*,!junk"
Buffer 005, size: 2.00 GBytes, position: 528 kBytes 6.00 cycles
Newest: 2012-06-16 8:51:43, 38 days, 6:12:49 ago
Oldest: 2012-06-04 16:27:58, 49 days, 22:36:34 ago
Buffer 006, size: 2.00 GBytes, position: 528 kBytes 6.00 cycles
Newest: 2012-06-27 10:32:16, 27 days, 4:32:16 ago
Oldest: 2012-06-16 8:51:52, 38 days, 6:12:40 ago
Buffer 007, size: 2.00 GBytes, position: 528 kBytes 6.00 cycles
Newest: 2012-07-09 13:23:31, 15 days, 1:41:02 ago
Oldest: 2012-06-27 10:32:29, 27 days, 4:32:04 ago
Buffer 008, size: 2.00 GBytes, position: 528 kBytes 6.00 cycles
Newest: 2012-07-21 10:32:27, 3 days, 4:32:06 ago
Oldest: 2012-07-09 13:23:39, 15 days, 1:40:54 ago
Buffer 009, size: 2.00 GBytes, position: 566 MBytes 5.28 cycles
Newest: 2012-07-24 15:04:22, 0 days, 0:00:11 ago
Oldest: 2012-01-31 15:46:29, 174 days, 22:18:04 ago
Buffer 011, size: 2.00 GBytes, position: 528 kBytes 5.00 cycles
Newest: 2012-02-23 17:35:50, 151 days, 20:28:43 ago
Oldest: 2012-02-10 12:37:27, 165 days, 1:27:06 ago
Buffer 012, size: 2.00 GBytes, position: 528 kBytes 5.00 cycles
Newest: 2012-03-08 11:09:25, 138 days, 2:55:09 ago
Oldest: 2012-02-23 17:35:56, 151 days, 20:28:38 ago
Buffer 013, size: 2.00 GBytes, position: 528 kBytes 5.00 cycles
Newest: 2012-03-22 13:03:23, 124 days, 2:01:11 ago
Oldest: 2012-03-08 11:09:32, 138 days, 2:55:02 ago
Buffer 014, size: 2.00 GBytes, position: 528 kBytes 5.00 cycles
Newest: 2012-04-07 10:15:24, 108 days, 4:49:10 ago
Oldest: 2012-03-22 13:03:37, 124 days, 2:00:57 ago
Buffer 015, size: 2.00 GBytes, position: 528 kBytes 5.00 cycles
Newest: 2012-04-22 15:58:14, 92 days, 23:06:20 ago
Oldest: 2012-04-07 10:15:34, 108 days, 4:49:00 ago
Buffer 016, size: 2.00 GBytes, position: 528 kBytes 5.00 cycles
Newest: 2012-05-07 8:44:55, 78 days, 6:19:40 ago
Oldest: 2012-04-22 15:58:14, 92 days, 23:06:21 ago
Buffer 017, size: 2.00 GBytes, position: 528 kBytes 5.00 cycles
Newest: 2012-05-22 7:30:03, 63 days, 7:34:32 ago
Oldest: 2012-05-07 8:45:02, 78 days, 6:19:33 ago
Buffer 018, size: 2.00 GBytes, position: 528 kBytes 5.00 cycles
Newest: 2012-06-04 16:27:50, 49 days, 22:36:45 ago
Oldest: 2012-05-22 7:30:03, 63 days, 7:34:32 ago

Class junk for groups matching "junk"
Buffer 019, size: 256 MBytes, position: 136 MBytes 149.53 cycles
Newest: 2012-07-24 15:04:21, 0 days, 0:00:15 ago
Oldest: 2012-07-23 17:24:49, 0 days, 21:39:47 ago


Note the "Oldest:" lines above.
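
Reading the "Oldest:" lines by eye gets tedious with many buffers. As a rough Python sketch (the function name `retention_by_class` and the patterns are assumptions derived only from the output format shown in this post), the oldest article age per class can be pulled out like this:

```python
import re

CLASS = re.compile(r"^Class (\S+) for groups matching")
OLDEST = re.compile(r"^Oldest: \S+\s+\S+, (\d+) days,")

# A few lines copied from the listing in this post.
SAMPLE = """\
Class veryfast for groups matching "control.cancel,*.test,news.lists.filters"
Buffer 001, size: 64.0 MBytes, position: 16.2 MBytes 67.25 cycles
Newest: 2012-07-24 15:00:10, 0 days, 0:04:21 ago
Oldest: 2012-06-29 15:45:02, 24 days, 23:19:29 ago

Class slowbins for groups matching "local.binaries"
Buffer 004, size: 2.00 GBytes, position: 132 MBytes 0.06 cycles
Newest: 2012-07-22 17:25:39, 1 days, 21:38:53 ago
Oldest: 2009-07-01 2:03:46, 1119 days, 13:00:46 ago

Class junk for groups matching "junk"
Buffer 019, size: 256 MBytes, position: 136 MBytes 149.53 cycles
Newest: 2012-07-24 15:04:21, 0 days, 0:00:15 ago
Oldest: 2012-07-23 17:24:49, 0 days, 21:39:47 ago
"""


def retention_by_class(stat_output):
    """Return {class name: age in whole days of its oldest article}."""
    retention = {}
    current = None
    for line in stat_output.splitlines():
        line = line.strip()
        m = CLASS.match(line)
        if m:
            current = m.group(1)
            continue
        m = OLDEST.match(line)
        if m and current is not None:
            # A class with several buffers keeps the largest age seen.
            retention[current] = max(retention.get(current, 0), int(m.group(1)))
    return retention


for name, days in retention_by_class(SAMPLE).items():
    print(f"{name}: oldest article is {days} days old")
```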

Hope that helps.


--
John

When a person has -- whether they knew it or not -- already
rejected the Truth, by what means do they discern a lie?

Fritz Wuehler
Jul 25, 2012, 12:18:47 PM

"John F. Morse" <jo...@example.invalid> wrote:

> Or Leafnode leaves a lot to be desired (I've never used Leafnode so I'll
> not judge it).

I have used it for years with very few problems. I have had to rebuild the
overview and groupinfo a few times but never lost any articles. Lately I've
been having problems with getting a message about new newsgroups that
doesn't go away and I can no longer read several local newsgroups unless I
do texpire. As soon as fetchnews runs something gets corrupted and the local
groups are mostly unreadable again. Leafnode seems fine for a small to midsize
personal server but I don't trust it anymore so I am going to INN.

> I should also point out that I use CNFS for the spool, which is a set
> size of spool buffers. Old articles are overwritten, saving deletion
> (removal) time.
>
> With CNFS, retention can vary by incoming load, and by server
> configuration. Junk groups come and go within days or maybe hours, while
> groups that contain useful information can remain for many months and
> even years. You get to choose which groups are assigned to which class,
> how many cycbuffs serve each class, and their sizes.

I read about that and it looks very attractive from a speed standpoint. I am
thinking I would like more control over retaining certain groups though.

> Reader servers should keep articles for a minimum of 16 days, which
> would allow someone on a two-week vacation to read, or at least download
> the news (this includes both weekends).

I use my news server as an archive as well. I will have to figure out how to
set up INN for this. My retention needs vary widely by group.

John F. Morse
Jul 25, 2012, 1:56:56 PM

I've tried to find time to take a look at Leafnode, but there is always
something else coming up that needs my attention. <sigh>

I've been running a Usenet server farm using several (five at this time)
computers, since 2000. From 1982 I ran a simple BBS on a 1976 Apple ][,
then an Apple //e. In 1989 I switched over to a multi-node BBS on a
Macintosh Plus, then a Mac IIsi, followed by a Mac IIci.

The lure of the Internet attracted all of the users, running it into
disuse around 2000, so I fired up a Usenet server to take over from a
commercial Texas NNTP server that had shut down. My new server used
RumorMill software, which could pull news from other NNTP servers.

In 2005 I started up INN under Debian 3.1 "Sarge" and slowly added
additional servers. I shut down the RumorMill server shortly afterward,
but occasionally used it for short periods to pull news from specific
groups on two ISPs, feeding the INN server(s).

Then I established peering accounts with some of the top Usenet servers
around the world, and no longer needed the RumorMill server.

I do use Pullnews (a program that is part of the INN package) to pull
five binary groups from my ISP's NSP, mainly for fills, because most of
my peers do not handle binaries, or those which do limit them in size.
Pullnews runs as a cron job twice every hour, so news articles are never
older than that.

Control is what INN provides.

I have seven groups that never expire. I use the tradspool for them (one
file per article), and have it optioned to keep forever. These are local
groups where local users are interested in archiving them for historical
reasons. The articles date from 2001, but only occupy 507,732 kilobytes
(496 MB). Compare to the 85 GB used by the CNFS cycbuffs on the main
reader server.

All other groups (I host 32,669 total groups) are in CNFS, which keeps
the hard disk from filling up. The CNFS buffers are created at whatever
size is required, and they do not grow.

Buffers can be grouped into metabuffers, which then can hold whichever
newsgroups you desire. This makes retention needs by group(s) easy to
set and forget.

Here is how the /etc/news/cycbuff.conf is set up for the same example
old reader server (you may want to save this message to refer back to as
a guide):

------------------------------------------------------------------------

news@news5:/etc/news$ cat cycbuff.conf
## $Id: cycbuff.conf 7651 2007-08-20 10:28:34Z iulius $
##
## Configuration file for INN CNFS storage method.
##
## This file defines the cyclical buffers that make up the storage pools
## for CNFS (Cyclic News File System). For information about how to
## configure INN to use CNFS, see the storage.conf man page; and for
## information about how to create the CNFS buffers, see the cycbuff.conf
## man page.
##
## The order of lines in this file is not important among the same item.
## But all cycbuff item should be presented before any metacycbuff item.

## Number of articles written before the cycbuff header is
## written back to disk to (25 by default).

cycbuffupdate:25

## Interval (in seconds) between re-reads of the cycbuff (30 by default).

refreshinterval:30

## 1. Cyclic buffers
## Format:
## "cycbuff" (literally) : symbolic buffer name (less than 7 characters) :
## path to buffer file : length of symbolic buffer in kilobytes in decimal
## (1KB = 1024 bytes)
##
## If you're trying to stay under 2 GB, keep your sizes below 2097152.

cycbuff:001:/var/spool/news/cycbuffs/001:65535
cycbuff:002:/var/spool/news/cycbuffs/002:262143
cycbuff:003:/var/spool/news/cycbuffs/003:524287
cycbuff:004:/var/spool/news/cycbuffs/004:2097151
cycbuff:005:/var/spool/news/cycbuffs/005:2097151
cycbuff:006:/var/spool/news/cycbuffs/006:2097151
cycbuff:007:/var/spool/news/cycbuffs/007:2097151
cycbuff:008:/var/spool/news/cycbuffs/008:2097151
cycbuff:009:/var/spool/news/cycbuffs/009:2097151
cycbuff:010:/var/spool/news/cycbuffs/010:2097151
cycbuff:011:/var/spool/news/cycbuffs/011:2097151
cycbuff:012:/var/spool/news/cycbuffs/012:2097151
cycbuff:013:/var/spool/news/cycbuffs/013:2097151
cycbuff:014:/var/spool/news/cycbuffs/014:2097151
cycbuff:015:/var/spool/news/cycbuffs/015:2097151
cycbuff:016:/var/spool/news/cycbuffs/016:2097151
cycbuff:017:/var/spool/news/cycbuffs/017:2097151
cycbuff:018:/var/spool/news/cycbuffs/018:2097151
cycbuff:019:/var/spool/news/cycbuffs/019:262143

## 2. Meta-cyclic buffers
## Format:
## "metacycbuff" (literally) : symbolic meta-cyclic buffer name (less than
## 8 characters) : comma separated list of cyclic buffer symbolic names
## [:INTERLEAVE|SEQUENTIAL]
##
## With the default INTERLEAVE mode, articles are stored in each cycbuff
## in a round-robin fashion, one article per cycbuff in the order listed.
## With the SEQUENTIAL mode, each cycbuff is written in turn until that
## cycbuff is full and then moves on to the next one.
##
## Symbolic meta-cyclic buffer names are used in storage.conf in the
## options: field.

metacycbuff:veryfast:001
metacycbuff:fastbins:002
metacycbuff:fasttext:003
metacycbuff:slowbins:004
metacycbuff:slowtext:005,006,007,008,009,010,011,012,013,014,015,016,017,018:SEQUENTIAL
metacycbuff:junk:019
news@news5:/etc/news$

------------------------------------------------------------------------
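
To see how much space each metacycbuff gets from a file like the one above, the cycbuff sizes can be summed per metacycbuff. This is a small Python sketch, not an INN tool; the function name `metacycbuff_sizes` is made up for illustration, and it assumes buffer paths contain no colons.

```python
SAMPLE_CONF = """\
## comment lines and blank lines are skipped
cycbuffupdate:25
cycbuff:001:/var/spool/news/cycbuffs/001:65535
cycbuff:005:/var/spool/news/cycbuffs/005:2097151
cycbuff:006:/var/spool/news/cycbuffs/006:2097151
metacycbuff:veryfast:001
metacycbuff:slowtext:005,006:SEQUENTIAL
"""


def metacycbuff_sizes(text):
    """Return {metacycbuff name: total size in kilobytes of its cycbuffs}."""
    sizes, metas = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if fields[0] == "cycbuff":
            # cycbuff:<name>:<path>:<size in KB>
            sizes[fields[1]] = int(fields[3])
        elif fields[0] == "metacycbuff":
            # metacycbuff:<name>:<buffer list>[:INTERLEAVE|SEQUENTIAL]
            metas[fields[1]] = fields[2].split(",")
    return {meta: sum(sizes[b] for b in bufs) for meta, bufs in metas.items()}


for meta, kbytes in metacycbuff_sizes(SAMPLE_CONF).items():
    print(f"{meta}: {kbytes} KB ({kbytes / 1024 / 1024:.2f} GB)")
```

Applied to the full listing, the fourteen 2097151 KB buffers behind slowtext add up to just under 28 GB.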

You create your server however you want. Use whatever names you choose
for the cycbuffs, etc. INN will do whatever you tell it to do in the
various configuration files.

I should also add that INN has never crashed on any of my Debian
GNU/Linux computers (well, neither has Debian for that matter). It would
run for years unattended.

Then your newsgroups are assigned to each class in the
/etc/news/storage.conf file:

------------------------------------------------------------------------

news@optima5:/etc/news$ cat storage.conf
## $Id: storage.conf 7651 2007-08-20 10:28:34Z iulius $
##
## Rules for where INN should store incoming articles.
##
## This file is used to determine which storage method articles are sent
## to be stored and which storage class they are stored as. Each
## method is described as follows:
##
## method <methodname> {
## newsgroups: <wildmat>
## class: <storage class #>
## size: <minsize>[,<maxsize>]
## expires: <mintime>[,<maxtime>]
## options: <options>
## exactmatch: <bool>
## }
##
## See the storage.conf man page for more information.
##
## Only newsgroups, class, and (for CNFS, to specify the metacycbuff)
## options are required; the other keys are optional. If any CNFS
## methods are configured, you will also need to set up cycbuff.conf.

## By default, store everything in tradspool.

#method tradspool {
# newsgroups: *
# class: 0
#}

## Here are some samples for a CNFS configuration. This assumes that you
## have two metacycbuffs configured, one for text newsgroups and one for
## binaries. Cancel messages, which tend to be very high-volume, are
## stored in the binary metacycbuff as well. This assumes storeonxref is
## set to true in inn.conf.

## Pick off the binary newsgroups first.

#method cnfs {
# newsgroups: *.bina*,control.cancel
# class: 1
# options: BINARY
#}

## Put the remaining (text) groups in the other cycbuff.

#method cnfs {
# newsgroups: *
# class: 2
# options: TEXT
#}

method tradspool {
newsgroups: local.announce,local.computers,local.info,@local.binaries,@local.test
class: 0
}

method cnfs {
newsgroups: control.cancel,*.test,news.lists.filters
class: 1
options: veryfast
}

method cnfs {
newsgroups: *.bina*,!local.binaries
class: 2
options: fastbins
}

method cnfs {
newsgroups: news.admin.net-abuse.email,news.admin.net-abuse.sightings,*.jobs*,perl.cpan.testers,alt.alt.spamtrap
class: 3
options: fasttext
}

method cnfs {
newsgroups: local.binaries
class: 4
options: slowbins
}

method cnfs {
newsgroups: *,!junk
class: 5
options: slowtext
}

method cnfs {
newsgroups: junk
class: 6
options: junk
}

news@news5:/etc/news$

------------------------------------------------------------------------

The /etc/news/expire.ctl file controls expiry. I use it for keeping the
articles in the tradspool groups from expiring, except I do want the
local.test group to expire in three days. Here is how it looks for this
same example server:

------------------------------------------------------------------------

news@news5:/etc/news$ cat expire.ctl
## $Id: expire.ctl 8575 2009-08-18 13:53:54Z iulius $
##
## Sample configuration file for article expiration.
##
## Format:
## /remember/:<keep>
## <class>:<min>:<default>:<max>
## <wildmat>:<flag>:<min>:<default>:<max>
##
## First line gives history retention; second line specifies expiration
## for classes; third line specifies expiration for group if groupbaseexpiry
## is true in inn.conf.
## <class> Class specified in storage.conf.
## <wildmat> Wildmat-style patterns for the newsgroups.
## <flag> Status of the newsgroups.
## <keep> Number of days to retain a message-ID in history.
## <min> Minimum number of days to keep the article.
## <default> Default number of days to keep the article.
## <max> Flush the article after this many days.
## <keep>, <min>, <default> and <max> can be floating-point numbers or the
## word "never". Times are based on the arrival time for expire and expireover
## (unless -p is used; see expire(8) and expireover(8)), and on the posting
## time for history retention.
##
## See the expire.ctl man page for more information.

## When an article is rejected or expires before 10 days, we still remember
## it for 11 days from its original posting time in case we get offered it
## again. See the artcutoff parameter in inn.conf; it should match this
## parameter (/remember/ uses 11 days instead of 10 in order to take into
## account articles whose posting date is one day into the future).
/remember/:11

## Keep for 1-15 days, allow Expires: headers to work. This entry uses
## the syntax appropriate when groupbaseexpiry is true in inn.conf. Times
## are based on the arrival time (unless -p is used).

*:A:14:14:14
local.*:A:never:never:never
local.test:A:3:3:3

## Keep for 1-15 days, allow Expires: headers to work. This is an entry
## based on storage class, used when groupbaseexpiry is false. Times
## are based on the arrival time (unless -p is used).
#0:1:15:never
news@news5:/etc/news$

------------------------------------------------------------------------
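
The order of the three group lines in that file matters. The Python sketch below assumes, as the expire.ctl documentation is understood to say, that the last matching line wins, which is why the general `*` entry comes first and `local.test` last. It approximates INN's wildmat matching with Python's `fnmatchcase`, which is only an assumption good enough for these simple patterns; the helper names are hypothetical.

```python
from fnmatch import fnmatchcase

EXPIRE_CTL = """\
/remember/:11
*:A:14:14:14
local.*:A:never:never:never
local.test:A:3:3:3
"""


def parse_group_rules(text):
    """Return [(pattern, default)] from <wildmat>:<flag>:<min>:<default>:<max> lines."""
    rules = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) == 5:  # skips /remember/ and class-based lines
            pattern, _flag, _minimum, default, _maximum = fields
            rules.append((pattern, default))
    return rules


def default_retention(group, rules):
    """Return the <default> value of the last rule whose pattern matches."""
    result = None
    for pattern, default in rules:
        if fnmatchcase(group, pattern):
            result = default  # later matches override earlier ones
    return result


rules = parse_group_rules(EXPIRE_CTL)
for group in ("comp.lang.c", "local.announce", "local.test"):
    print(group, default_retention(group, rules))
```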

Everything you will need can be found here:
http://www.eyrie.org/~eagle/software/inn/docs-2.5

INN 2.5.2 is the current release, and is available as a binary package
by many Linux distros through their package manager. Or you can compile
it as described at the above link.

More help here: http://www.eyrie.org/~eagle/faqs/inn.html

The news.software.nntp newsgroup is where discussion takes place, plus
there is a mailing list that is mostly useful for developers.

If you prefer to compile INN from source, and for additional info, see:

http://www.isc.org/software/inn

Nomen Nescio
Jul 31, 2012, 4:33:14 PM

"John F. Morse" <jo...@example.invalid> wrote:

> Fritz Wuehler wrote:
> > "John F. Morse" <jo...@example.invalid> wrote:
> >

> I've tried to find time to take a look at Leafnode, but there is always
> something else coming up that needs my attention. <sigh>

I don't think there is any point, given your INN farm :)

> I've been running a Usenet server farm using several (five at this time)


> Control is what INN provides.
>
> I have seven groups that never expire. I use the tradspool for them (one
> file per article), and have it optioned to keep forever. These are local
> groups where local users are interested in archiving them for historical
> reasons. The articles date from 2001, but only occupy 507,732 kilobytes
> (496 MB). Compare to the 85 GB used by the CNFS cycbuffs on the main
> reader server.
>
> All other groups (I host 32,669 total groups) are in CNFS, which keeps
> the hard disk from filling up. The CNFS buffers are created at whatever
> size is required, and they do not grow.
>
> Buffers can be grouped into metabuffers, which then can hold whichever
> newsgroups you desire. This makes retention needs by group(s) easy to
> set and forget.

This is amazing. Thanks for all the detail, it's really helpful.

[configs snipped]

> I should also add that INN has never crashed on any of my Debian
> GNU/Linux computers (well, neither has Debian for that matter). It would
> run for years unattended.

This is interesting and good to hear.

> Everything you will need can be found here:
> http://www.eyrie.org/~eagle/software/inn/docs-2.5

Thanks for the links. I had looked into INN a few years ago and bookmarked
quite a few sites but then was happy enough with leafnode until recently.

Sorry for not answering sooner, been on the road for work and just got back
Sunday night. Thanks for the great info and helpful posts always.





John F. Morse
Jul 31, 2012, 5:55:41 PM

Nomen Nescio wrote:
> "John F. Morse" <jo...@example.invalid> wrote:
>> Fritz Wuehler wrote:
>>
>>> "John F. Morse" <jo...@example.invalid> wrote:
>>>
>> I've tried to find time to take a look at Leafnode, but there is always
>> something else coming up that needs my attention. <sigh>
>>
>
> I don't think there is any point, given your INN farm :)
>


My interest is more curiosity than needing Leafnode. I'd like to examine
it to see how it ticks, scores, filters, posts, etc.


>> I've been running a Usenet server farm using several (five at this time)
>> Control is what INN provides.
>>
>> I have seven groups that never expire. I use the tradspool for them (one
>> file per article), and have it optioned to keep forever. These are local
>> groups where local users are interested in archiving them for historical
>> reasons. The articles date from 2001, but only occupy 507,732 kilobytes
>> (496 MB). Compare to the 85 GB used by the CNFS cycbuffs on the main
>> reader server.
>>
>> All other groups (I host 32,669 total groups) are in CNFS, which keeps
>> the hard disk from filling up. The CNFS buffers are created at whatever
>> size is required, and they do not grow.
>>
>> Buffers can be grouped into metabuffers, which then can hold whichever
>> newsgroups you desire. This makes retention needs by group(s) easy to
>> set and forget.
>>
>
> This is amazing. Thanks for all the detail, it's really helpful.
>
>> I should also add that INN has never crashed on any of my Debian
>> GNU/Linux computers (well, neither has Debian for that matter). It would
>> run for years unattended.
>>
>
> This is interesting and good to hear.
>
>> Everything you will need can be found here:
>> http://www.eyrie.org/~eagle/software/inn/docs-2.5
>>
>
> Thanks for the links. I had looked into INN a few years ago and bookmarked
> quite a few sites but then was happy enough with leafnode until recently.
>
> Sorry for not answering sooner, been on the road for work and just got back
> Sunday night. Thanks for the great info and helpful posts always.
>


Work always gets in the way. ;-)

You are welcome, and if you have questions, remember news.software.nntp
is the place to ask.

Nomen Nescio
Aug 5, 2012, 1:50:03 PM

"John F. Morse" <jo...@example.invalid> wrote:

> >> I've tried to find time to take a look at Leafnode, but there is always
> >> something else coming up that needs my attention. <sigh>
> >>
> >
> > I don't think there is any point, given your INN farm :)
> >
>
>
> My interest is more curiosity than needing Leafnode. I'd like to examine
> it to see how it ticks, scores, filters, posts, etc.

Leafnode scoring is done with a filter file in the /etc/leafnode
directory. You specify regexp-like expressions by newsgroup(s), the
pattern you want to match (e.g. header matches), and then the action to
take, such as killing the post or accepting it. It's not as powerful as
slrn's scoring but it seems fine at the server level. Scoring and
filtering are one and the same on leafnode, AFAICS. One filter I like is
the ability to score on max crossposts, since heavy crossposting is
usually a tipoff for SPAM and pr0n. I also like being able to kill by
newsserver; screwgol groups is high on the list. You can also kill posts
by age there, and the main config can do some amount of filtering too.
You have to run leafnode's texpire program to filter messages that have
already hit your spool. I used to remember more details about when the
filter file and the scoring in the config file actually took effect, but
now I am too tired to remember. The doc in the config files is mostly
good and the users are pretty helpful. But I think my system has
outgrown it.

> You are welcome, and if you have questions, remember news.software.nntp
> is the place to ask.

Thank you!


John F. Morse
Aug 6, 2012, 4:14:55 AM

Thanks for the Leafnode info, Nomen.

Something to look into, maybe when there's a couple of feet of snow on
the ground. ;-)

Adam Funk
Aug 28, 2012, 6:35:03 AM

On 2012-07-24, Mike Yetto wrote:

> Adam Funk <a24...@ducksburg.com> writes and having writ moves on.

>>Is the texpire job's running time related more to the size of the
>>spool or the number of articles that it decides to delete?
>
> I would think that it is dependent on the size of the spool since
> each article must be checked before a decision is made. A factor
> to consider is if you are deleting by thread or by individual
> article. I assume deleting by thread could leave more articles on the
> spool as well as require more processing. As this might be
> negligible or spool-size dependent YMMV.
>
>>Would it do any harm to run it weekly instead of daily?
>
> There should be no harm done, but if you're wondering which way
> is more efficient / less intrusive, only some testing will tell.
>
> Mike "post results / corrected assumptions" Yetto


OK, here are the results so far, churned through a spreadsheet:

days since  texpire   h:m:s per          m:s per
last run    run time  million deletions  million retentions
1           1:20:56   07:09:06           37:23
1           1:21:23   00:02:03           37:34
1           1:21:59   15:14:27           37:50
1           1:19:31   15:32:24           36:40
1           1:24:04   05:57:46           38:45
1           1:21:09   12:07:42           37:23
1           1:20:09   11:54:04           36:54
4           1:24:11   04:55:53           38:43
7           1:23:34   20:12:31           38:16
7           1:27:13   14:06:30           39:47
7           1:28:35   18:24:21           40:15


(The number of deletions ranged from 142 to 4427; the number retained
stayed around 2,100,000.) So the texpire run time seems consistently
dependent on the spool size, and it does no harm to run it less
frequently.
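
The run times in the table also show what each schedule costs per week. The following Python sketch uses only the daily and weekly run times listed above (the single 4-day row is left out); the helper names are illustrative.

```python
from datetime import timedelta

# Run times copied from the table: interval of 1 day and of 7 days.
DAILY_RUNS = ["1:20:56", "1:21:23", "1:21:59", "1:19:31",
              "1:24:04", "1:21:09", "1:20:09"]
WEEKLY_RUNS = ["1:23:34", "1:27:13", "1:28:35"]


def hms(text):
    """Convert an 'h:mm:ss' string to a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def mean(durations):
    """Arithmetic mean of a list of timedeltas."""
    return sum(durations, timedelta()) / len(durations)


daily_mean = mean([hms(t) for t in DAILY_RUNS])
weekly_mean = mean([hms(t) for t in WEEKLY_RUNS])

print("mean daily run: ", daily_mean)
print("mean weekly run:", weekly_mean)
print("texpire time per week, daily schedule: ", daily_mean * 7)
print("texpire time per week, weekly schedule:", weekly_mean)
```

A weekly run takes roughly five minutes longer than a daily one, but the daily schedule spends about nine and a half hours a week in texpire against under an hour and a half for the weekly schedule.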


--
Nam Sibbyllam quidem Cumis ego ipse oculis meis vidi in ampulla
pendere, et cum illi pueri dicerent: beable beable beable; respondebat
illa: doidy doidy doidy. [plorkwort]

Mike Yetto
Aug 28, 2012, 8:01:49 AM

Thanks for the follow-up. It seems that such a large spool size
is the determining factor. My guess is that a smaller or varying
spool might show a preference for more frequent processing, but
not by much

Mike "with a few thousand retained the test is hardly worthwhile" Yetto

Adam Funk
Aug 29, 2012, 7:52:47 AM

On 2012-08-28, Mike Yetto wrote:

> Thanks for the follow-up. It seems that such a large spool size
> is the determining factor. My guess is that a smaller or varying
> spool might show a preference for more frequent processing, but
> not by much
>
> Mike "with a few thousand retained the test is hardly worthwhile" Yetto


I see the texpire log message says:

message.id: 2702 articles deleted, 2200539 kept

I guess that number kept refers to the number of unique messages (in
the message.id subfolders) rather than the number of "Xref" duplicates
(hard links to the same inodes)?



--
Civilization is a race between catastrophe and education.
[H G Wells]