
INN odd behavior during expireover?


Jesse Rehmer

Jun 24, 2022, 12:05:33 AM
For the past week I've been pulling articles into a news server running FreeBSD
and INN 2.7.0rc1, and got to a stopping point after adding a few hundred
gigabytes to the spool, so I decided to run news.daily, since I hadn't run it
since this endeavor began.

I understand that during certain portions of the news.daily process the server
gets paused. However, I am noticing very strange behavior while expireover is
running, during periods when the server claims (according to the logs) that it
is not paused.

INN is unresponsive to ctlinnd commands for ~5 minutes and then suddenly
becomes responsive, and the upstream innfeed process (on another VM) reports a
non-responsive connection during these periods. nnrpd is responding normally
for the most part: I can AUTH and issue GROUP, BODY, and similar commands, but
when I issue a POST I get no response from the server until my connection
times out.

I can't find anything useful in the logs as to what is happening while INN is
not responding, nor do I see any underlying OS or I/O issue. The expireover
process is using 100% of one CPU core, and quite a bit of I/O, but not enough
that it should cause this kind of unresponsiveness. The OS is responsive and
other applications/daemons on the box respond as expected, so I don't see a
server-wide performance problem.

I've put this server under taxing I/O load while injecting articles over the
last week and never saw this behavior until expireover had been running for a
little over 4 hours. Any ideas?

top:
last pid: 53975; load averages: 1.22, 1.20, 1.15 up 4+20:06:35 22:59:56
75 processes: 2 running, 73 sleeping
CPU: 2.1% user, 0.0% nice, 10.6% system, 0.0% interrupt, 87.3% idle
Mem: 160M Active, 6103M Inact, 62M Laundry, 8652M Wired, 905M Free
ARC: 5910M Total, 3785M MFU, 1455M MRU, 912K Anon, 91M Header, 578M Other
3725M Compressed, 8526M Uncompressed, 2.29:1 Ratio
Swap: 2048M Total, 2048M Free

PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
8478 news 1 103 0 66M 42M CPU5 5 434:17 99.86% expireover
55548 root 1 20 0 32M 8916K select 3 3:45 0.09% snmpd
53975 root 1 20 0 14M 3440K CPU3 3 0:00 0.05% top
31561 root 4 20 0 49M 18M select 0 6:12 0.05% vmtoolsd
53913 news 1 24 4 37M 16M sbwait 6 0:00 0.03% nnrpd
29975 root 1 20 0 13M 2292K select 1 4:26 0.01% syslogd
-----

iostat:
root@spool1:/var/log # iostat
tty da0 cd0 pass0 cpu
tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id
1 728 43.4 222 9.4 17.2 0 0.0 0.0 0 0.0 6 0 6 0 88
-----

systat:
/0 /1 /2 /3 /4 /5 /6 /7 /8 /9 /10
Load Average ||||||

/0% /10 /20 /30 /40 /50 /60 /70 /80 /90 /100
root idle XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root idle XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root idle XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
news expireover XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root idle XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root idle XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root idle XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root idle XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root idle XXXXXXXXXXXXXXX
-----

Cheers,
Jesse

The Doctor

Jun 24, 2022, 12:15:45 AM
In article <435317104...@freebsd-inject1.usenet.blueworldhosting.com>,
How is your expire configuration configured?
--
Member - Liberal International This is doctor@@nl2k.ab.ca Ici doctor@@nl2k.ab.ca
Yahweh, Queen & country!Never Satan President Republic!Beware AntiChrist rising!
Look at Psalms 14 and 53 on Atheism https://www.empire.kred/ROOTNK?t=94a1f39b
Saying good things is evil when they are lies. -unknown Beware https://mindspring.com

Jesse Rehmer

Jun 24, 2022, 12:42:40 AM
On Jun 23, 2022 at 11:05:31 PM CDT, "Jesse Rehmer"
<jesse....@blueworldhosting.com> wrote:
<snip>
> I've put this server under taxing I/O load while injecting articles over the
> last week and never saw this behavior until expireover had been running for a
> little over 4 hours. Any ideas?
<snip>

So after killing the few nnrpd processes that were running, the performance
issue has resolved. There were only two active processes: my reader, and one
from UZANTO-RETO (<https://www.uzantoreto.com/>), which scans my server. They
make slow requests and were not hammering the box, and my connection was idle.
The expireover process is still running the same as before, but article
delivery and other commands are good.

As soon as I start nnrpd and make a connection, the problem returns even
though I am doing nothing. WTH.

-Jesse

Jesse Rehmer

Jun 24, 2022, 1:17:30 AM
On Jun 23, 2022 at 11:15:44 PM CDT, "The Doctor" <The Doctor> wrote:

> How is your expire configuration configured?

This is the first time the expire process has been run on this box since
injecting ~52,000,000 articles.

expire.ctl is pretty simple, though I understand, given the history I've
brought over, that the *.jobs* and *.test groups are going to take a long time.

*:A:1:never:never
*.jobs*:A:1:90:90
*.test:A:1:300:300
news.lists.filters:A:1:10:10

I've put this hardware and INN under much heavier load. In tests prior to
filling the spool, this server achieved article delivery throughput of almost
100 articles per second (a 600-800Mbps input stream of text and binary
articles) and kept up with 1.2Gbps of output to downstream nodes without
delay, using far more disk I/O and throughput than I'm seeing now while the
expireover process is running.

What I'm seeing now just doesn't make any sense.

Jesse Rehmer

Jun 24, 2022, 6:47:03 AM
On Jun 23, 2022 at 11:42:39 PM CDT, "Jesse Rehmer" wrote:
There was an 18-minute gap in innd activity according to the news.notice log
where it was completely unresponsive (no response on port 119, no response to
ctlinnd commands, etc.). During this period truss did not output anything for
innd or its children (unfortunately I wasn't attached when the condition
began). When things sprang to life what I saw with truss looked like history
or overview read/writes and articles started flowing.

What I'm seeing in the logs is the period of unresponsiveness ends with innd
logging its stats. There are some interesting entries after that, such as
"SERVER cant sendto CCreader" and I have a lot of "Connection reset by peer"
and "broken pipe" messages. The latter two appear frequently for a few minutes
and subside.

news.notice:

Jun 24 05:10:50 spool1 innd[39682]: rejecting[python] <part26of137.j3RK...@camelsystem-powerpost.local> 437 Binary (yEnc)
Jun 24 05:10:50 spool1 innd[39682]: rejecting[python] <ozVqI.399066$DJ2.3...@fx42.iad> 437 Binary (yEnc)
Jun 24 05:10:51 spool1 innd[39682]: rejecting[python] <ozVqI.216709$Qf2.1...@fx38.iad> 437 Binary (yEnc)
Jun 24 05:28:51 spool1 innd[39682]: ME time 1106338 hishave 2281(1872) hisgrep 3391(1340) hiswrite 2113(761) hissync 0(0) idle 12369(8902) artclean 7(955) artwrite 1303(761) artcncl 0(0) hisgrep/artcncl 0(0) hishave/artcncl 0(0) artlog/artcncl 0(0) hiswrite/artcncl 0(0) sitesend 16(761) overv 1080211(761) perl 27(761) python 2857(1868) nntpread 135(9724) artparse 182(8378) artlog/artparse 0(0) artlog 36(955) datamove 27(10983)
Jun 24 05:28:51 spool1 innd[39682]: ME HISstats 0 hitpos 926 hitneg 0 missed 946 dne
Jun 24 05:28:51 spool1 innd[39682]: ME status seconds 3613 accepted 27217 refused 14 rejected 10902 duplicate 0 accepted size 248355872 duplicate size 0 rejected size 3782013184
Jun 24 05:28:51 spool1 innd[39682]: <feed1.usenet.blueworldhosting.com> status seconds 3613 accepted 1346 refused 5 rejected 50 duplicate 0 accepted size 124000704 duplicate size 0 rejected size 91707
Jun 24 05:28:51 spool1 innd[39682]: localhost status seconds 1856 accepted 25871 refused 9 rejected 10852 duplicate 0 accepted size 124355152 duplicate size 0 rejected size 3781921280
Jun 24 05:28:51 spool1 innd[39682]: ctlinnd command s
Jun 24 05:28:51 spool1 innd[39682]: SERVER cant sendto CCreader bytes 211 Connection refused
Jun 24 05:28:51 spool1 innd[39682]: <feed1.usenet.blueworldhosting.com> connected 38 streaming allowed
Jun 24 05:28:51 spool1 innd[39682]: <feed1.usenet.blueworldhosting.com> connected 77 streaming allowed
Jun 24 05:28:52 spool1 innd[39682]: <feed1.usenet.blueworldhosting.com> connected 292 streaming allowed
Jun 24 05:28:52 spool1 innd[39682]: <feed1.usenet.blueworldhosting.com:30> cant read: Connection reset by peer
Jun 24 05:28:52 spool1 innd[39682]: <feed1.usenet.blueworldhosting.com:38> cant read: Connection reset by peer
Jun 24 05:28:52 spool1 innd[39682]: <feed1.usenet.blueworldhosting.com:77> cant read: Connection reset by peer
Jun 24 05:28:52 spool1 innd[39682]: <feed1.usenet.blueworldhosting.com> connected 295 streaming allowed
Jun 24 05:28:52 spool1 innd[39682]: <feed1.usenet.blueworldhosting.com:165> cant read: Connection reset by peer
Jun 24 05:28:52 spool1 innd[39682]: <feed1.usenet.blueworldhosting.com:165> cant write: Broken pipe
Jun 24 05:28:52 spool1 innd[39682]: <feed1.usenet.blueworldhosting.com:165> checkpoint seconds 1131 accepted 6 refused 1 rejected 0 duplicate 0 accepted size 27845 duplicate size 0 rejected size 0
Jun 24 05:28:52 spool1 innd[39682]: <feed1.usenet.blueworldhosting.com:165> closed seconds 1131 accepted 6 refused 1 rejected 0 duplicate 0 accepted size 27845 duplicate size 0 rejected size 0
Jun 24 05:28:52 spool1 innd[39682]: localhost:176 cant read: Connection reset by peer
Jun 24 05:28:52 spool1 innd[39682]: localhost:176 cant write: Broken pipe
Jun 24 05:28:52 spool1 innd[39682]: localhost:176 checkpoint seconds 1108 accepted 0 refused 0 rejected 68 duplicate 0 accepted size 0 duplicate size 0 rejected size 27010432
Jun 24 05:28:52 spool1 innd[39682]: localhost:176 closed seconds 2815 accepted 0 refused 0 rejected 4268 duplicate 0 accepted size 0 duplicate size 0 rejected size 1689575168
Jun 24 05:28:52 spool1 innd[39682]: <feed1.usenet.blueworldhosting.com> connected 165 streaming allowed
Jun 24 05:28:52 spool1 innd[39682]: localhost:188 cant read: Connection reset by peer
Jun 24 05:28:52 spool1 innd[39682]: localhost:188 cant write: Broken pipe
Jun 24 05:28:52 spool1 innd[39682]: localhost:188 checkpoint seconds 1089 accepted 44 refused 0 rejected 0 duplicate 0 accepted size 78612 duplicate size 0 rejected size 0
Jun 24 05:28:52 spool1 innd[39682]: localhost:188 closed seconds 2679 accepted 10005 refused 9 rejected 39 duplicate 0 accepted size 24383800 duplicate size 0 rejected size 441789
Jun 24 05:28:52 spool1 innd[39682]: localhost:190 cant read: Connection reset by peer
Jun 24 05:28:52 spool1 innd[39682]: localhost:190 cant write: Broken pipe

Russ Allbery

Jun 24, 2022, 12:06:53 PM
Jesse Rehmer <jesse....@blueworldhosting.com> writes:

> There was an 18-minute gap in innd activity according to the news.notice
> log where it was completely unresponsive (no response on port 119, no
> response to ctlinnd commands, etc.). During this period truss did not
> output anything for innd or its children (unfortunately I wasn't
> attached when the condition began). When things sprang to life what I
> saw with truss looked like history or overview read/writes and articles
> started flowing.

> What I'm seeing in the logs is the period of unresponsiveness ends with
> innd logging its stats. There are some interesting entries after that,
> such as "SERVER cant sendto CCreader" and I have a lot of "Connection
> reset by peer" and "broken pipe" messages. The latter two appear
> frequently for a few minutes and subside.

This feels like lock contention. What overview method are you using? I
tried to find that in the thread but may have missed it.

--
Russ Allbery (ea...@eyrie.org) <https://www.eyrie.org/~eagle/>

Please post questions rather than mailing me directly.
<https://www.eyrie.org/~eagle/faqs/questions.html> explains why.

Jesse Rehmer

Jun 24, 2022, 1:18:25 PM
On Jun 24, 2022 at 11:06:51 AM CDT, "Russ Allbery" in
<87edzeg...@hope.eyrie.org> wrote:

> Jesse Rehmer <jesse....@blueworldhosting.com> writes:
>
>> There was an 18-minute gap in innd activity according to the news.notice
>> log where it was completely unresponsive (no response on port 119, no
>> response to ctlinnd commands, etc.). During this period truss did not
>> output anything for innd or its children (unfortunately I wasn't
>> attached when the condition began). When things sprang to life what I
>> saw with truss looked like history or overview read/writes and articles
>> started flowing.
>
>> What I'm seeing in the logs is the period of unresponsiveness ends with
>> innd logging its stats. There are some interesting entries after that,
>> such as "SERVER cant sendto CCreader" and I have a lot of "Connection
>> reset by peer" and "broken pipe" messages. The latter two appear
>> frequently for a few minutes and subside.
>
> This feels like lock contention. What overview method are you using? I
> tried to find that in the thread but may have missed it.

I agree, but unfortunately I lack the technical knowledge to understand why
this would be happening, and what, if anything, I could do to resolve it.

My apologies, I did leave out some critical details - I am using
tradspool/tradindexed. Additional observations:

- While innd was unresponsive, nnrpd was happily serving clients, and the only
indicator something was amiss from the client perspective was that posting
would fail with a timeout reported by the client.
- Article flow from the upstream feeder was at its lowest point during this
period.
- The problem resolved as expireover was nearing completion, based on these
timestamps:

The last connection issue reported from the upstream feeder:

Jun 24 06:25:44 feed1 innfeed[91878]: <spool1.usenet.blueworldhosting.com>:1 cxnsleep non-responsive connection

Expire output:

Expire messages:
expireover start Thu Jun 23 15:39:00 CDT 2022: ( -z/var/log/news/expire.rm -Z/var/log/news/expire.lowmark)
Article lines processed 53326142
Articles dropped 56726
Overview index dropped 57054
expireover end Fri Jun 24 07:06:07 CDT 2022
lowmarkrenumber begin Fri Jun 24 07:06:07 CDT 2022: (/var/log/news/expire.lowmark)
lowmarkrenumber end Fri Jun 24 07:06:07 CDT 2022
expirerm start Fri Jun 24 07:06:07 CDT 2022
expirerm end Fri Jun 24 07:09:14 CDT 2022
expire begin Fri Jun 24 07:09:44 CDT 2022: (-v1)
Article lines processed 32835813
Articles retained 32805948
Entries expired 29865
expire end Fri Jun 24 08:52:30 CDT 2022
all done Fri Jun 24 08:52:30 CDT 2022

Jesse Rehmer

Jun 24, 2022, 1:51:01 PM
On Jun 24, 2022 at 12:18:24 PM CDT, "Jesse Rehmer" wrote:
Did some more poking around, and this may just be coincidental, but I found
errors from pyClean in /usr/local/news/pyclean/log/traceback. The timestamp of
the traceback file coincides roughly with when I noticed the issue, but there
are no timestamps within the individual traceback messages.

-rw-rw-r-- 1 news news 2097 Jun 23 21:11 traceback

root@spool1:/usr/local/news/pyclean/log # cat traceback
Traceback (most recent call last):
File "/usr/local/news/bin/filter/filter_innd.py", line 324, in filter_art
return self.pyfilter.filter(art)
File "/usr/local/news/bin/filter/filter_innd.py", line 702, in filter
if art[Lines] and int(art[Lines]) != int(art[__LINES__]):
ValueError: invalid literal for int() with base 10: "20\r\n AT&T Launches
Free Wi-Fi Service Throughout August in Select\r\n McDonald's Outlets;
$2.99 a Day Thereafter\r\n - Jul 29, 2003 07:00 AM (PR Newswire)"
Traceback (most recent call last):
File "/usr/local/news/bin/filter/filter_innd.py", line 324, in filter_art
return self.pyfilter.filter(art)
File "/usr/local/news/bin/filter/filter_innd.py", line 1001, in filter
if not self.bin_allowed:
AttributeError: Filter instance has no attribute 'bin_allowed'
Traceback (most recent call last):
File "/usr/local/news/bin/filter/filter_innd.py", line 324, in filter_art
return self.pyfilter.filter(art)
File "/usr/local/news/bin/filter/filter_innd.py", line 1001, in filter
if not self.bin_allowed:
AttributeError: Filter instance has no attribute 'bin_allowed'
Traceback (most recent call last):
File "/usr/local/news/bin/filter/filter_innd.py", line 324, in filter_art
return self.pyfilter.filter(art)
File "/usr/local/news/bin/filter/filter_innd.py", line 1001, in filter
if not self.bin_allowed:
AttributeError: Filter instance has no attribute 'bin_allowed'
Traceback (most recent call last):
File "/usr/local/news/bin/filter/filter_innd.py", line 324, in filter_art
return self.pyfilter.filter(art)
File "/usr/local/news/bin/filter/filter_innd.py", line 702, in filter
if art[Lines] and int(art[Lines]) != int(art[__LINES__]):
ValueError: invalid literal for int() with base 10: 'l54'
Traceback (most recent call last):
File "/usr/local/news/bin/filter/filter_innd.py", line 324, in filter_art
return self.pyfilter.filter(art)
File "/usr/local/news/bin/filter/filter_innd.py", line 1001, in filter
if not self.bin_allowed:
AttributeError: Filter instance has no attribute 'bin_allowed'
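
In case anyone else hits the same ValueError: it looks like the filter trusts
int() on a folded Lines: header. A defensive tweak along these lines (an
untested sketch against the stock filter_innd.py; Lines and __LINES__ are the
names the filter already uses) would treat an unparsable header as absent
instead of raising:

def int_or_none(value):
    # Return the header as an int, or None when it is missing or malformed
    # (e.g. the folded "20\r\n AT&T Launches..." value in the traceback above).
    try:
        return int(value)
    except (TypeError, ValueError):
        return None

# The failing comparison in filter() could then become:
#     hdr = int_or_none(art[Lines])
#     counted = int_or_none(art[__LINES__])
#     if hdr is not None and counted is not None and hdr != counted:
#         ... (unchanged mismatch handling)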

Russ Allbery

Jun 24, 2022, 1:56:13 PM
Jesse Rehmer <jesse....@blueworldhosting.com> writes:

> My apologies, I did leave out some critical details - I am using
> tradspool/tradindexed.

Do you by chance have a really large group that would take a long time to
expire? It's been a while since I wrote the locking strategy for
tradindexed, but expiring a group locks it for write and receiving an
article in the same group will also attempt to lock it for write, and innd
is single-threaded. So if you receive an article for a group while that
group is being expired, innd will wait for the lock until the expiration
of that group finishes. If that group is extremely large, this could well
take ten minutes.
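
As a minimal sketch of the failure mode (illustrative Python, not INN's actual
C code, and the index path is hypothetical): storing overview takes a blocking
exclusive lock, so one held lock stalls the whole single-threaded server.

import fcntl
import os

def write_overview_entry(index_path, entry):
    # innd's equivalent takes the group's write lock with F_SETLKW, i.e.
    # it waits. While expireover holds the lock, this call (and therefore
    # everything else in a single-threaded server) blocks.
    fd = os.open(index_path, os.O_RDWR | os.O_APPEND | os.O_CREAT, 0o664)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until the lock is free
        os.write(fd, entry)
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN)
        os.close(fd)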

> Additional observations:

> - While innd was unresponsive, nnrpd was happily serving clients, and the
> only indicator something was amiss from the client perspective was that
> posting would fail with a timeout reported by the client.

Right, only innd would wait for the lock, so it would fail to spawn new
nnrpd processes if you use the single port model, and nnrpd can't POST
because it hands articles off to innd, but nnrpd doesn't care (and indeed
won't even lock the group overview for read unless it sees that the
overview data has been replaced, which only happens at the end of
expiration and takes a much shorter length of time).

Jesse Rehmer

Jun 24, 2022, 2:12:11 PM
On Jun 24, 2022 at 12:56:11 PM CDT, "Russ Allbery" in
<87a6a2g...@hope.eyrie.org> wrote:

> Jesse Rehmer <jesse....@blueworldhosting.com> writes:
>
>> My apologies, I did leave out some critical details - I am using
>> tradspool/tradindexed.
>
> Do you by chance have a really large group that would take a long time to
> expire? It's been a while since I wrote the locking strategy for
> tradindexed, but expiring a group locks it for write and receiving an
> article in the same group will also attempt to lock it for write, and innd
> is single-threaded. So if you receive an article for a group while that
> group is being expired, innd will wait for the lock until the expiration
> of that group finishes. If that group is extremely large, this could well
> take ten minutes.

I injected articles from 2003-present for almost all of the Big8 groups, so
many groups have hundreds of thousands of articles.

I will admit it has been 10+ years since I've used INN with a spool containing
this many articles, but long ago I helped another admin move to a new server.
That box was receiving a full feed; I don't recall the number of articles in
the spool, but it was several terabytes (my spool is ~150GB for comparison).
When we ran expireover on that spool, the average run time was 6-8 hours, and
I do not recall experiencing any issues like I am seeing. My expireover run,
on a fraction of the data and on faster hardware, took ~14 hours. Is that
expected?

Jesse Rehmer

Jun 24, 2022, 2:48:26 PM
On Jun 24, 2022 at 1:12:10 PM CDT, "Jesse Rehmer" wrote:
As a follow-up, would you recommend a different overview storage mechanism?
My understanding is that buffindexed would probably make my situation worse,
ovdb is a non-starter for me, and ovsqlite has *serious* performance issues.
I started building this box with the intent to use ovsqlite, but when I
started injecting articles the throughput rate was abysmal.

Russ Allbery

Jun 24, 2022, 2:59:15 PM
Jesse Rehmer <jesse....@blueworldhosting.com> writes:
> "Russ Allbery" in <87a6a2g...@hope.eyrie.org> wrote:

>> Do you by chance have a really large group that would take a long time
>> to expire? It's been a while since I wrote the locking strategy for
>> tradindexed, but expiring a group locks it for write and receiving an
>> article in the same group will also attempt to lock it for write, and
>> innd is single-threaded. So if you receive an article for a group
>> while that group is being expired, innd will wait for the lock until
>> the expiration of that group finishes. If that group is extremely
>> large, this could well take ten minutes.

> I injected articles from 2003-present for almost all of the Big8 groups,
> so many groups have hundreds of thousands of articles.

I'm not sure hundreds of thousands would be enough to take that long.
Maybe? It's been a long time since I've benchmarked this stuff, and never
with modern hardware. It feels more likely to me if you have a single
group with tens of millions of articles in it.

> I will admit it has been 10+ years since I've used INN with a spool
> containing this many articles, but long ago I helped another admin move to
> a new server. That box was receiving a full feed; I don't recall the number
> of articles in the spool, but it was several terabytes (my spool is ~150GB
> for comparison). When we ran expireover on that spool, the average run time
> was 6-8 hours, and I do not recall experiencing any issues like I am
> seeing. My expireover run, on a fraction of the data and on faster
> hardware, took ~14 hours. Is that expected?

Were you using tradindexed before I rewrote it to make it safer, maybe?

You could try ovsqlite, which will have much finer-grained locking rather
than the all-or-nothing approach that tradindexed uses (and kind of has to
use due to its structure).
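
If you want to try it, my recollection (please double-check the docs for your
version) is that the switch is setting ovmethod in inn.conf and rebuilding the
overview data from the spool while the server is throttled, roughly:

# inn.conf
ovmethod: ovsqlite

# then rebuild overview from the spool:
ctlinnd throttle 'rebuilding overview'
makehistory -O -x
ctlinnd go 'rebuilding overview'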

Julien ÉLIE

Jun 24, 2022, 3:08:49 PM
Hi Jesse,

> As a follow-up, would you recommend a different overview storage mechanism?

Incidentally, I was planning to make a video about exactly that subject:
explaining how to set up each of the overview storage mechanisms, and
comparing them.
I would recommend either tradindexed or ovsqlite.
Nonetheless, I do not have a spool as huge as yours. It seems that
performance is not satisfactory with 53 million overview lines :-/


> My understanding is that buffindexed would probably make my situation worse

Yes, do *not* use it in your situation!


> ovdb is a non-starter for me, and ovsqlite has *serious* performance issues.
> I started building this box with the intent to use ovsqlite, but when I
> started injecting articles the throughput rate was abysmal.

Did you try to increase transrowlimit when injecting your articles?


In case it could help, here are the results of the benchmark I performed
last week with 3,713,040 overview lines on a VPS with 2 vCores at 2.4 GHz
and 4 GB of RAM.


Duration of expireover
----------------------

ovdb -> 0:14
compressed ovdb -> 0:21
tradindexed -> 0:25
ovsqlite -> 0:27
compressed ovsqlite -> 0:44
buffindexed -> 1:21


Duration of an overview rebuild
-------------------------------

tradindexed -> 1:30
ovdb -> 2:16
buffindexed -> 2:25
compressed ovdb -> 3:36
ovsqlite -> 3:48
compressed ovsqlite -> 5:31


Yes, I agree ovsqlite (SQLite version 3.34.1 here) is slow when adding new
articles, but quite fast at expiry (and I do not think it will lock any
table).

--
Julien ÉLIE

"The friends of truth are those who seek it, not those who boast of having
found it." (Condorcet)

Jesse Rehmer

Jun 24, 2022, 3:22:28 PM
On Jun 24, 2022 at 2:08:48 PM CDT, "Julien ÉLIE" in
<t95241$kphs$1...@news.trigofacile.com> wrote:

> Hi Jesse,
>
>> As a follow-up, would you recommend a different overview storage mechanism?
>
> Incidentally, I was planning to make a video about exactly that subject:
> explaining how to set up each of the overview storage mechanisms, and
> comparing them.
> I would recommend either tradindexed or ovsqlite.
> Nonetheless, I do not have a spool as huge as yours. It seems that
> performance is not satisfactory with 53 million overview lines :-/

And I'm only about half-way finished injecting articles... :)

>> ovdb is a non-starter for me, and ovsqlite has *serious* performance issues.
>> I started building this box with the intent to use ovsqlite, but when I
>> started injecting articles the throughput rate was abysmal.
>
> Did you try to increase transrowlimit when injecting your articles?

I have zero SQLite knowledge; I did glance at the options in ovsqlite.conf,
but lack the knowledge to tune those values well, so I hastily switched to
tradindexed, as I knew I would get the expected throughput with no tinkering.

If you have suggestions for tuning, I'm open to giving ovsqlite another try,
especially after seeing your benchmark results for expiry and rebuild.

Julien ÉLIE

Jun 24, 2022, 3:24:53 PM
Hi Russ,

> Do you by chance have a really large group that would take a long time to
> expire? It's been a while since I wrote the locking strategy for
> tradindexed, but expiring a group locks it for write and receiving an
> article in the same group will also attempt to lock it for write, and innd
> is single-threaded. So if you receive an article for a group while that
> group is being expired, innd will wait for the lock until the expiration
> of that group finishes. If that group is extremely large, this could well
> take ten minutes.

I've not looked at the complexity, but couldn't the expiration of a
newsgroup's overview be done in batches of 100,000 articles, for instance?
Lock the group, process the first 100,000 overview lines (remembering that n
lines have been dropped), release the lock, then acquire a new lock and
process the next 100,000 lines starting from line (100,000 - n), etc.
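
Something like this, in illustrative Python (only to show the bookkeeping;
tradindexed's real on-disk format is of course different):

BATCH = 100000

def expire_in_batches(overview, group_lock, still_valid):
    # overview: the group's overview lines; group_lock: its write lock.
    pos = 0
    while True:
        with group_lock:                   # held for one batch only
            batch = overview[pos:pos + BATCH]
            if not batch:
                break
            kept = [line for line in batch if still_valid(line)]
            overview[pos:pos + BATCH] = kept
            pos += len(kept)               # i.e. resume at (BATCH - n) further on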

--
Julien ÉLIE

"Coffee is a beverage that puts you to sleep when you don't drink it."
(Alphonse Allais)

Julien ÉLIE

Jun 24, 2022, 3:37:04 PM
Hi Jesse,

> *:A:1:never:never
> *.jobs*:A:1:90:90
> *.test:A:1:300:300
> news.lists.filters:A:1:10:10

You may also want to expire control.cancel sooner... It may get very
large...
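
For instance something like this, where the retention values are only an
illustration:

control.cancel:A:1:2:2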

To sort your newsgroups by increasing article counts, run:

% getlist -R counts | sort -n -k 4

The fourth field contains the number of articles.

Jesse Rehmer

Jun 24, 2022, 3:53:15 PM
On Jun 24, 2022 at 2:37:03 PM CDT, "Julien ÉLIE" in
<t953ov$kqir$1...@news.trigofacile.com> wrote:

> Hi Jesse,
>
>> *:A:1:never:never
>> *.jobs*:A:1:90:90
>> *.test:A:1:300:300
>> news.lists.filters:A:1:10:10
>
> You may also want to expire control.cancel sooner... It may get very
> large...
>
> To sort your newsgroups by increasing article counts, run:
>
> % getlist -R counts | sort -n -k 4
>
> The fourth field contains the number of articles.

Thanks for the reminder about adding control.cancel... Right now it shouldn't
be an issue:

[news@spool1 ~]$ getlist -R counts | sort -n -k 4 | grep control.cancel
control.cancel 15447 1 15447 n

However, I can see where ignoring it could cause problems in the future. Do
you also recommend "control"?

Somehow I've missed the getlist utility all these years... pretty spiffy.
Looks like I have a few groups that are indeed larger than even I expected. I
haven't paid attention to the high/low counts and whatnot since beginning the
injection project.

---------------
rec.outdoors.rv-travel 355495 1 355482 y
alt.folklore.computers 377199 1 377133 y
talk.politics.guns 417832 1 417766 y
comp.sys.mac.system 443025 1 442909 y
alt.politics 460595 1 460585 y
soc.culture.polish 515936 1 515825 y
comp.lang.c 519799 1 517421 y
soc.culture.jewish 521143 1 520973 y
alt.fan.rush-limbaugh 529383 1 529297 y
comp.sys.mac.advocacy 540625 1 540607 y
rec.crafts.metalworking 585043 1 585042 y
rec.arts.sf.written 632786 1 632665 y
soc.culture.usa 671665 1 671598 y
soc.culture.israel 749617 1 749617 y
alt.atheism 769678 1 769661 y
rec.sport.pro-wrestling 777234 1 777234 y
rec.arts.drwho 798601 1 798600 y
rec.sport.football.college 828057 1 828057 y
rec.sport.tennis 849759 1 849759 y
rec.arts.tv 1367076 1 1366870 y
---------------

Russ Allbery

Jun 24, 2022, 3:57:28 PM
Julien ÉLIE <iul...@nom-de-mon-site.com.invalid> writes:

> I've not looked at the complexity, but couldn't the expiration of a
> newsgroup's overview be done in batches of 100,000 articles, for instance?
> Lock the group, process the first 100,000 overview lines (remembering that
> n lines have been dropped), release the lock, then acquire a new lock and
> process the next 100,000 lines starting from line (100,000 - n), etc.

The basic problem is that the expire process involves rewriting the data
file, which invalidates all of the offset information stored in the index.
That's why it holds a write lock on the index while expiring.

Multiple passes would potentially require rewriting the data file and
updating the offsets multiple times, which I would worry would cause other
performance issues. (Back when I was testing this, I think expiration was
mostly I/O bound and writing the data file was the expensive bit.)

I can think of a couple of possible approaches:

* Store articles that arrive in groups being expired but queue the
overview information and store it later once the lock is released. This
would require some complex state tracking in innd that may be tricky to
get right and not lose data.

* Perform the bulk of the expiration without holding a lock, and then take
the lock at the end, find only the articles that have come in during
expiration, and add them to the new data file before replacing the data
file (sketched below). I think this would work and is probably a simpler
place to put the complexity, but it will require some careful design to
make sure there are no race conditions and we can find all the articles
that came in after expiry started.
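
A rough sketch of that second approach, in illustrative Python with lists
standing in for the data file (all of the real difficulty is in the offset
bookkeeping and making the swap race-free):

def expire_with_catchup(data_file, arrivals, still_valid, group_lock):
    # Long pass, no lock held: build the replacement data file.
    replacement = [line for line in data_file if still_valid(line)]
    with group_lock:
        # Short critical section: fold in whatever arrived during the
        # rewrite, then swap in the new file (and rebuilt index offsets).
        replacement.extend(arrivals)
        arrivals.clear()
        return replacement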

Deferring the article back to the peer probably won't work since not all
peers honor deferrals and I think the article is already stored before we
try to write overview.

Using a proper database fixes this problem correctly (but obviously at the
risk of introducing other issues). In an ideal world, it would be nice to
have one overview mechanism that's good enough we could just have everyone
use it.

Jesse Rehmer

Jun 24, 2022, 4:08:14 PM
On Jun 24, 2022 at 2:57:26 PM CDT, "Russ Allbery" in
<87sfntg...@hope.eyrie.org> wrote:

> Deferring the article back to the peer probably won't work since not all
> peers honor deferrals and I think the article is already stored before we
> try to write overview.

I'm not a developer by any means, but based on what I see with truss, I
believe this statement to be correct.

sigprocmask(SIG_SETMASK,{ SIGHUP|SIGPIPE|SIGTERM|SIGCHLD|SIGUSR1 },0x0) = 0 (0x0)
read(282,"fNCus+94CPtlu8N1+G\r\n\tx9e9mKrN"...,7223) = 2752 (0xac0)
msync(0x808e00000,2101248,MS_ASYNC) = 0 (0x0)
openat(AT_FDCWD,"/usr/local/news/spool/articles/talk/politics/guns/417838",O_WRONLY|O_CREAT|O_EXCL,0664) = 109 (0x6d)
writev(109,[{"Sender: Nomen Nescio <nobody@diz"...,2936},{"news.usenet.blueworldhosting.com"...,33},{"spool1.usenet.blueworldhosting.c"...,35},{"usenet.blueworldhosting.com!feed"...,133},{"spool1.usenet.blueworldhosting.c"...,172},{"\r\n\r\nIn article <a90a66d9-444"...,606}],6) = 3915 (0xf4b)
close(109) = 0 (0x0)
link("/usr/local/news/spool/articles/talk/politics/guns/417838","/usr/local/news/spool/articles/alt/politics/obama/143985") = 0 (0x0)
link("/usr/local/news/spool/articles/talk/politics/guns/417838","/usr/local/news/spool/articles/alt/politics/nationalism/white/5158") = 0 (0x0)
link("/usr/local/news/spool/articles/talk/politics/guns/417838","/usr/local/news/spool/articles/rec/arts/tv/1367080") = 0 (0x0)
link("/usr/local/news/spool/articles/talk/politics/guns/417838","/usr/local/news/spool/articles/alt/politics/democrats/176951") = 0 (0x0)
fcntl(8,F_SETLKW,0x7fffffffcf10) = 0 (0x0)
write(173,"417838\tRe: Do you support laws "...,443) = 443 (0x1bb)
lseek(173,0x0,SEEK_CUR) = 290699959 (0x1153bab7)
pwrite(165,"\M-|\M-8S\^Q\0\0\0\0\M-;\^A\0\0"...,56,0x16509d8) = 56 (0x38)
msync(0x802d64000,4096,MS_ASYNC) = 0 (0x0)
fcntl(8,F_SETLKW,0x7fffffffcf10) = 0 (0x0)
fcntl(8,F_SETLKW,0x7fffffffcf10) = 0 (0x0)
write(254,"143985\tRe: Do you support laws "...,443) = 443 (0x1bb)
lseek(254,0x0,SEEK_CUR) = 82669496 (0x4ed6fb8)
pwrite(182,"\M-}m\M-m\^D\0\0\0\0\M-;\^A\0\0"...,56,0x7b0880) = 56 (0x38)
msync(0x802b31000,4096,MS_ASYNC) = 0 (0x0)
fcntl(8,F_SETLKW,0x7fffffffcf10) = 0 (0x0)
openat(AT_FDCWD,"/usr/local/news/spool/overview/a/p/n/w/alt.politics.national
ism.white.IDX",O_RDWR|O_CREAT,0664) = 109 (0x6d)
fstat(109,{ mode=-rw-rw-r-- ,inode=34168327,size=288792,blksize=131072 }) = 0
(0x0)
fcntl(109,F_GETFD,) = 0 (0x0)
fcntl(109,F_SETFD,FD_CLOEXEC) = 0 (0x0)
openat(AT_FDCWD,"/usr/local/news/spool/overview/a/p/n/w/alt.politics.national
ism.white.DAT",O_RDWR|O_APPEND|O_CREAT,0664) = 113 (0x71)
fcntl(113,F_GETFD,) = 0 (0x0)
fcntl(113,F_SETFD,FD_CLOEXEC) = 0 (0x0)
close(35) = 0 (0x0)
close(86) = 0 (0x0)
fcntl(8,F_SETLKW,0x7fffffffcf10) = 0 (0x0)
write(113,"5158\tRe: Do you support laws th"...,441) = 441 (0x1b9)
lseek(113,0x0,SEEK_CUR) = 3775657 (0x399ca9)
pwrite(109,"\M-p\M^Z9\0\0\0\0\0\M-9\^A\0\0\0"...,56,0x46818) = 56 (0x38)
msync(0x802b31000,4096,MS_ASYNC) = 0 (0x0)
fcntl(8,F_SETLKW,0x7fffffffcf10) = 0 (0x0)
fcntl(8,F_SETLKW,0x7fffffffcf10) = 0 (0x0)
write(211,"1367080\tRe: Do you support laws"...,444) = 444 (0x1bc)
lseek(211,0x0,SEEK_CUR) = 643936438 (0x2661b0b6)
pwrite(57,"\M-z\M-.a&\0\0\0\0\M-<\^A\0\0\0"...,56,0x4902888) = 56 (0x38)
msync(0x802d32000,4096,MS_ASYNC) = 0 (0x0)
fcntl(8,F_SETLKW,0x7fffffffcf10) = 0 (0x0)
fcntl(8,F_SETLKW,0x7fffffffcf10) = 0 (0x0)
write(229,"176951\tRe: Do you support laws "...,443) = 443 (0x1bb)
lseek(229,0x0,SEEK_CUR) = 107740676 (0x66bfe04)
pwrite(9,"I\M-|k\^F\0\0\0\0\M-;\^A\0\0\0\0"...,56,0x9733d0) = 56 (0x38)
msync(0x802b30000,4096,MS_ASYNC) = 0 (0x0)
fcntl(8,F_SETLKW,0x7fffffffcf10) = 0 (0x0)
write(5,"[F274954F241C01B17699DD5A2DDD219"...,98) = 98 (0x62)
msync(0x80de00000,398255200,MS_ASYNC) = 0 (0x0)
msync(0x825a00000,298691400,MS_ASYNC) = 0 (0x0)
lseek(7,0x0,SEEK_SET) = 0 (0x0)
write(7,"dbz 6 49781900 14 66\n32810510 3"...,111) = 111 (0x6f)
write(1,"Jun 24 15:05:02.205 + feed1.usen"...,100) = 100 (0x64)
write(1," 3912 inpaths!\n",15) = 15 (0xf)
write(112,"news.usenet.blueworldhosting.com"...,194) = 194 (0xc2)
write(282,"239 <ed02c4f9d54f3b99bfe0840ae0c"...,50) = 50 (0x32)
sigprocmask(SIG_SETMASK,{ },0x0) = 0 (0x0)

Jesse Rehmer

Jun 24, 2022, 4:14:59 PM
On Jun 24, 2022 at 3:08:13 PM CDT, "Jesse Rehmer" in
<012930186...@freebsd-inject1.usenet.blueworldhosting.com> wrote:

Argh... I pasted the wrong snippet, this is better, I think:

sigprocmask(SIG_SETMASK,{ },0x0) = 0 (0x0)
select(283,{ 12 14 15 17 282 },{ },0x0,{ 449.000000 }) = 1 (0x1)
sigprocmask(SIG_SETMASK,{ SIGHUP|SIGPIPE|SIGTERM|SIGCHLD|SIGUSR1 },0x0) = 0 (0x0)
read(282,"dhosting.com junk:1181312\r\n\r"...,7223) = 2286 (0x8ee)
openat(AT_FDCWD,"/usr/local/news/spool/articles/uk/media/radio/archers/2289",O_WRONLY|O_CREAT|O_EXCL,0664) = 232 (0xe8)
writev(232,[{"Path: ",6},{"news.usenet.blueworldhosting.com"...,33},{"spool1.usenet.blueworldhosting.c"...,35},{"usenet.blueworldhosting.com!feed"...,942},{"spool1.usenet.blueworldhosting.c"...,62},{"\r\n\r\nOn Fri, 24 Jun 2022 10:0"...,2261}],6) = 3339 (0xd0b)
close(232) = 0 (0x0)
openat(AT_FDCWD,"/usr/local/news/spool/overview/u/m/r/a/uk.media.radio.archers.IDX",O_RDWR|O_CREAT,0664) = 232 (0xe8)
fstat(232,{ mode=-rw-rw-r-- ,inode=5370741,size=128128,blksize=128512 }) = 0 (0x0)
fcntl(232,F_GETFD,) = 0 (0x0)
fcntl(232,F_SETFD,FD_CLOEXEC) = 0 (0x0)
openat(AT_FDCWD,"/usr/local/news/spool/overview/u/m/r/a/uk.media.radio.archers.DAT",O_RDWR|O_APPEND|O_CREAT,0664) = 270 (0x10e)
fcntl(270,F_GETFD,) = 0 (0x0)
fcntl(270,F_SETFD,FD_CLOEXEC) = 0 (0x0)
close(274) = 0 (0x0)
close(283) = 0 (0x0)
fcntl(8,F_SETLKW,0x7fffffffcf10) = 0 (0x0)
write(270,"2289\tRe: OT: Boodly Asda\tNick "...,323) = 323 (0x143)
lseek(270,0x0,SEEK_CUR) = 898638 (0xdb64e)
pwrite(232,"\v\M-5\r\0\0\0\0\0C\^A\0\0\0\0\0"...,56,0x1f480) = 56 (0x38)
msync(0x802d7d000,4096,MS_ASYNC) = 0 (0x0)
fcntl(8,F_SETLKW,0x7fffffffcf10) = 0 (0x0)
write(5,"[4FAD1BE6FF8B411EF21EBF2100E369F"...,98) = 98 (0x62)
write(1,"Jun 24 15:11:55.536 + feed1.usen"...,100) = 100 (0x64)
write(1," 3336 inpaths!\n",15) = 15 (0xf)
write(112,"news.usenet.blueworldhosting.com"...,204) = 204 (0xcc)
write(282,"239 <ab6cbhtfltggcj4f9fsp5o1nh2o"...,50) = 50 (0x32)

Julien ÉLIE

Jun 24, 2022, 4:39:37 PM
Hi Jesse,

> Thanks for the reminder about adding control.cancel... Right now it shouldn't
> be an issue:
>
> [news@spool1 ~]$ getlist -R counts | sort -n -k 4 | grep control.cancel
> control.cancel 15447 1 15447 n
>
> However, I can see where ignoring it could cause problems in the future. Do
> you also recommend "control"?

I would recommend having control.newgroup, control.rmgroup and
control.checkgroups. Then there shouldn't be anything (but crap) in
control.
Traffic is very low in these groups so only expiring control.cancel
should be OK.


> Somehow I've missed the getlist utility all these years... pretty spiffy.

:-)


> Looks like I have a few groups that are indeed larger than even I expected. I
> haven't paid attention to the high/low counts and whatnot since beginning the
> injection project.
>
> rec.arts.tv 1367076 1 1366870 y

Is it your largest group? (It represents about 2.5% of all your articles.)

Jesse Rehmer

Jun 24, 2022, 4:45:22 PM
On Jun 24, 2022 at 3:39:36 PM CDT, "Julien ÉLIE" in
<t957e8$ksm6$1...@news.trigofacile.com> wrote:

> I would recommend having control.newgroup, control.rmgroup and
> control.checkgroups. Then there shouldn't be anything (but crap) in
> control.
> Traffic is very low in these groups so only expiring control.cancel
> should be OK.

Thanks, as always, your insight is welcome.

>> Looks like I have a few groups that are indeed larger than even I expected. I
>> haven't paid attention to the high/low counts and whatnot since beginning the
>> injection project.
>>
>> rec.arts.tv 1367076 1 1366870 y
>
> Is it your largest group? (It represents about 2,5% of all your articles.)

That is the largest group according to getlist.

Jesse Rehmer

Jun 25, 2022, 3:52:25 AM
If I change:

*:A:1:never:never

to:

*:A:never:never:never

Would that stop expireover from traversing all the groups and overview data,
except the specific groups listed later in expire.ctl?

Julien ÉLIE

Jun 25, 2022, 4:30:52 PM
Hi Jesse,

No, expireover still has a look at all the overview lines to check whether
the related articles still exist in the spool. (Yet, as you're using
tradspool, this check is in fact not done because it is considered expensive,
unless you give the -s flag to expireover.)

In the source code of tdx_data_expire_start(), there's a note about
exactly the point you are raising:

** There's probably some room for optimization here for newsgroups that
** don't expire so that the files don't have to be rewritten, or
** newsgroups where all the data at the end of the file is still good
** and just needs to be moved as-is.


However, you may want to run expireover only on a subset of newsgroups, like
you say. Instead of running it on all your newsgroups, you could run
"news.daily noexpireover" (plus the usual flags you're using), plus a few
individual overview expiries:

grep -E 'news.lists.filters |\.jobs|\.test ' <pathdb>/active \
| expireover -f - -Z <pathtmp>/lowmark
ctlinnd lowmark <pathtmp>/lowmark
rm -f <pathtmp>/lowmark

The grep command extracts the newsgroups from your active file.

(And why not expire the others monthly, and not all of them the same day?)
Just ideas.

Jesse Rehmer

Jun 30, 2022, 8:27:32 PM
On Jun 24, 2022 at 2:08:48 PM CDT, "Julien ÉLIE" in
<t95241$kphs$1...@news.trigofacile.com> wrote:

>> ovdb is a non-starter for me, and ovsqlite has *serious* performance issues.
>> I started building this box with the intent to use ovsqlite, but when I
>> started injecting articles the throughput rate was abysmal.
>
> Did you try to increase transrowlimit when injecting your articles?

Hi Julien - I am going to give ovsqlite another try. I was reading through the
man page and configuration file, paying attention to this recommendation:

A transaction occurs every transrowlimit articles or transtimelimit
seconds, whichever is smaller. You are encouraged to keep the default
value for row limits and, instead, adjust the time limit according to
how many articles your news server usually accepts per second during
normal operation (you can find statistics about incoming articles in
your daily Usenet reports).
Inserting or deleting a database row within a transaction is very fast
whereas committing a transaction is slow, especially on rotating
storage. Setting transaction limits too low leads to poor
performance. When rebuilding overview data, it may be worth
temporarily raising these values, though.

Do you recommend increasing both transrowlimit and transtimelimit? I am also
looking at the pagesize and cachesize parameters.

Thanks,

Jesse

Julien ÉLIE

Jul 3, 2022, 5:32:07 PM
Hi Jesse,

> I am going to give ovsqlite another try. I was reading through the
> man page and configuration file, paying attention to this recommendation:
>
> A transaction occurs every transrowlimit articles or transtimelimit
> seconds, whichever is smaller. You are encouraged to keep the default
> value for row limits and, instead, adjust the time limit according to
> how many articles your news server usually accepts per second during
> normal operation (you can find statistics about incoming articles in
> your daily Usenet reports).
> Inserting or deleting a database row within a transaction is very fast
> whereas committing a transaction is slow, especially on rotating
> storage. Setting transaction limits too low leads to poor
> performance. When rebuilding overview data, it may be worth
> temporarily raising these values, though.
>
> Do you recommend increasing both transrowlimit and transtimelimit? I am also
> looking at the pagesize and cachesize parameters.

When testing on my news server, which is a relatively fast VPS, I did not
notice any change with a 10x increase to transrowlimit or transtimelimit
during an overview rebuild.
This recommendation comes from the implementer of ovsqlite, and I believe it
may be useful on slower systems (with rotating hard drives or a slow CPU).

I kept the defaults for pagesize and cachesize. I'm not an SQLite expert,
and ovsqlite is fast enough on my test news server. Of course, if you find
tweaks that make it even faster on your news server, feel free to share them
and I'll be happy to add them to our documentation for other users.
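
If you do experiment, the knobs live in ovsqlite.conf; for a bulk injection
it could look like this (values purely illustrative, not a recommendation):

# ovsqlite.conf -- temporary settings while bulk-injecting articles
# (commit every 100,000 rows, or every 60 seconds, whichever comes first)
transrowlimit: 100000
transtimelimit: 60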

--
Julien ÉLIE

"Medicine is a dangerous profession. Those who don't die can sue you."
(Coluche)