Slow archive-wal

Anco Huiberts

Mar 31, 2022, 1:03:19 AM
to Barman, Backup and Recovery Manager for PostgreSQL
We are seeing a slow archive-wal process, and there is one thing I would like to verify with you.

When a backup is running and files are being added via archive-wal, it appears to wait for the receive-wal process.

Is this by design? In other words, is archive-wal blocked by the receive-wal process while it runs? Is it possible to run the two in parallel?
Thx!

Anco Huiberts

Mar 31, 2022, 1:04:33 AM
to Barman, Backup and Recovery Manager for PostgreSQL
Like this:

2022-03-31 07:02:56,068 [26936] barman.wal_archiver INFO: Archiving segment 898 of 11679 from streaming: prc/0000000E0003961700000037
2022-03-31 07:02:56,144 [30514] barman.command_wrappers INFO: prc: pg_receivewal: finished segment at 39651/8000000 (timeline 14)
2022-03-31 07:02:57,044 [30514] barman.command_wrappers INFO: prc: pg_receivewal: finished segment at 39651/9000000 (timeline 14)
2022-03-31 07:02:57,914 [30514] barman.command_wrappers INFO: prc: pg_receivewal: finished segment at 39651/A000000 (timeline 14)
2022-03-31 07:02:58,796 [30514] barman.command_wrappers INFO: prc: pg_receivewal: finished segment at 39651/B000000 (timeline 14)
2022-03-31 07:02:58,814 [26936] barman.wal_archiver INFO: Archiving segment 899 of 11679 from streaming: prc/0000000E0003961700000038
2022-03-31 07:02:59,844 [30514] barman.command_wrappers INFO: prc: pg_receivewal: finished segment at 39651/C000000 (timeline 14)
2022-03-31 07:03:01,260 [26936] barman.wal_archiver INFO: Archiving segment 900 of 11679 from streaming: prc/0000000E0003961700000039
2022-03-31 07:03:01,517 [30514] barman.command_wrappers INFO: prc: pg_receivewal: finished segment at 39651/D000000 (timeline 14)
2022-03-31 07:03:03,379 [30514] barman.command_wrappers INFO: prc: pg_receivewal: finished segment at 39651/E000000 (timeline 14)
2022-03-31 07:03:03,515 [26936] barman.wal_archiver INFO: Archiving segment 901 of 11679 from streaming: prc/0000000E000396170000003A
2022-03-31 07:03:05,099 [30514] barman.command_wrappers INFO: prc: pg_receivewal: finished segment at 39651/F000000 (timeline 14)
2022-03-31 07:03:06,763 [30514] barman.command_wrappers INFO: prc: pg_receivewal: finished segment at 39651/10000000 (timeline 14)

Luca Ferrari

Mar 31, 2022, 2:54:34 AM
to Barman, Backup and Recovery Manager for PostgreSQL
On Thu, Mar 31, 2022 at 7:03 AM 'Anco Huiberts' via Barman, Backup and
Recovery Manager for PostgreSQL <pgba...@googlegroups.com> wrote:
>
> We are seeing a slow archive-wal process, and there is one thing I would like to verify with you.
>

What do you mean by "slow"? PostgreSQL will archive the WALs, so
you should not see any slowdown in database activity (unless you are
running out of IOPS).


> Is this by design? In other words, is archive-wal blocked by the receive-wal process while it runs? Is it possible to run the two in parallel?
> Thx!

AFAIK it is not possible to do parallel archiving. Other backup
solutions allow for an "offline" kind of parallel approach, so as to
be cheap on connection setup and data transfer.

Luca

Anco Huiberts

Mar 31, 2022, 2:58:04 AM
to Barman, Backup and Recovery Manager for PostgreSQL
What do you mean by "slow"?
=> We see a long delay when copying the WAL from the streaming location to the backup location (using pigz). Sometimes it takes 6 seconds per 16 MB file, sometimes 2 minutes. We are trying to find out why ;-)
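
One way to narrow this down is to check how long compressing a single segment takes when nothing else is competing for the machine. The sketch below is only a rough baseline under stated assumptions: it uses Python's single-threaded gzip module as a stand-in for pigz (both produce gzip output, but their speed will differ), and the function name time_compression is made up for this illustration. If an in-memory compression of 16 MiB takes well under a second, delays of 6 seconds to 2 minutes are more likely caused by I/O waits than by compression itself.

```python
import gzip
import time

# Default PostgreSQL WAL segment size (16 MiB).
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def time_compression(data, level=6):
    """Compress data in memory and return (elapsed_seconds, compressed_size).

    Hypothetical helper: stdlib gzip stands in for pigz here, so the
    timing is a rough baseline, not a measurement of Barman itself.
    """
    start = time.perf_counter()
    compressed = gzip.compress(data, compresslevel=level)
    elapsed = time.perf_counter() - start
    return elapsed, len(compressed)
```

To try it on real data, read one WAL file from the streaming directory into memory and pass its bytes to time_compression; doing the read separately keeps disk time out of the measurement.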

Luca Ferrari

Mar 31, 2022, 5:22:23 AM
to Barman, Backup and Recovery Manager for PostgreSQL
On Thu, Mar 31, 2022 at 8:58 AM 'Anco Huiberts' via Barman, Backup and
Recovery Manager for PostgreSQL <pgba...@googlegroups.com> wrote:
>
> What do you mean by "slow"?
> => We see a long delay when copying the WAL from the streaming location to the backup location (using pigz). Sometimes it takes 6 seconds per 16 MB file, sometimes 2 minutes. We are trying to find out why ;-)

From the logs you pasted, it seems to take 1-2 seconds per file. I
suspect there is sometimes a burst and pg_receivewal (on the Barman
side) is not able to keep up.
Or something is consuming all the IOPS and the system slows down (but
you should see evidence of that on the PostgreSQL side too).
There are many possible reasons for this, so I suggest monitoring
both machines when the WAL is sent with a delay.

Luca
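
To put numbers on the per-file figure, the gaps between consecutive "Archiving segment" lines in the Barman log can be measured directly. The following is a minimal sketch, assuming the log format matches the excerpt pasted earlier in this thread; the function name archive_intervals and the regular expression are illustrative, not part of Barman.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2022-03-31 07:02:56,068 [26936] barman.wal_archiver INFO: Archiving segment 898 of 11679 ...
ARCHIVE_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[\d+\] "
    r"barman\.wal_archiver INFO: Archiving segment (?P<n>\d+) of \d+"
)


def archive_intervals(lines):
    """Return (segment_number, seconds_since_previous_archive_line) pairs.

    Lines that are not wal_archiver "Archiving segment" entries
    (for example pg_receivewal output) are skipped.
    """
    previous = None
    intervals = []
    for line in lines:
        match = ARCHIVE_LINE.match(line)
        if match is None:
            continue
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        if previous is not None:
            gap = (ts - previous).total_seconds()
            intervals.append((int(match.group("n")), gap))
        previous = ts
    return intervals
```

Feeding it the lines of the Barman log file and sorting the result by gap shows which segments took unusually long, which gives concrete timestamps to correlate with I/O and CPU monitoring on both machines.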