Is a single full backup followed by perpetual incremental backups a viable strategy?

JK Laiho

Mar 25, 2025, 4:25:07 AM
to Barman, Backup and Recovery Manager for PostgreSQL
I'm looking at backing up a PostgreSQL 17 server with a combination of streaming backups and WAL streaming. The retention policy I've been considering is RECOVERY WINDOW OF 14 DAYS.

As block-level incremental backups have been available since PG 17, I'm thinking of using those.

Here's my proposed scenario: take one full initial backup, then just have a daily cron job run barman backup --incremental latest servername in perpetuity, with the assumption that barman cron is smart about combining backups that become older than the 14-day recovery window.

But: the documentation isn't very clear on if this scenario is actually how Barman operates.

I can see two possible scenarios here. Which is true, or is neither?

1. When the first incremental backup becomes 15 days old, Barman uses pg_combinebackup to create a synthetic full backup from the original full backup and the oldest incremental backup via barman cron. This repeats each day, so at any given point you're storing a synthetic full backup and 14 days of incremental backups on top of it. Together with WAL streaming, this enables PITR to any point within those 14 days.

2. Only the initial full backup matters in terms of the recovery window. Barman doesn't do any automatic combination of it and its following incremental backups. 2 months later, I'll have the original full backup and 2 months worth of incremental backups. WAL files are still only stored for the past 14 days, enabling the same PITR period as scenario 1, but incremental backups can never be deleted until the next full backup is taken (or possibly until I run pg_combinebackup manually, though I don't know if this would mess up Barman's bookkeeping). Eventually I'll run out of disk space. In a recovery scenario, Barman would need to replay everything from the original full backup onwards, applying months or years of incremental backups on top.

I really hope that scenario 1 is how it works, but I hope this group can confirm this one way or another. If it's scenario 2 (or something else entirely), is scenario 1 even conceivable as a future feature?

-- JK

Martin Marques

Mar 25, 2025, 4:52:21 AM
to pgba...@googlegroups.com
Hi,

Thank you for bringing this up. We've been talking about it for some
time, but other priorities have prevented us from completing it.

I'm going to go a bit into detail below.

On Tue, 25 Mar 2025 at 09:25, JK Laiho <jarkko...@gmail.com> wrote:
>
> But: the documentation isn't very clear on if this scenario is actually how Barman operates.

The part of the docs that you need to look at is this note (we may want
to word that note better; I'll take that as an action item):

```
Block-level incremental backups are not considered in retention
policies, as they depend on their parent backups and the root backup.
Only the root backup is used to determine retention.
```
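
One way to read that note, as a minimal Python sketch rather than Barman's actual code: every incremental backup is judged by the full backup at the start of its chain. The `Backup` class, the IDs, and the recovery-window rule used here (keep the newest full backup taken at or before the start of the window, drop only older chains) are simplifying assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Backup:
    backup_id: str            # hypothetical ID, not Barman's real ID format
    end_time: datetime
    parent_id: Optional[str]  # None for a full backup


def root_of(backup, by_id):
    """Follow parent links until the full backup at the start of the chain."""
    while backup.parent_id is not None:
        backup = by_id[backup.parent_id]
    return backup


def obsolete_ids(backups, now, window_days):
    """IDs that a recovery-window policy could drop in this simplified model."""
    by_id = {b.backup_id: b for b in backups}
    window_start = now - timedelta(days=window_days)
    fulls = sorted((b for b in backups if b.parent_id is None),
                   key=lambda b: b.end_time)
    # Assumption: the newest full backup taken at or before the start of the
    # window must stay, so only full backups older than that one are obsolete.
    before_window = [b for b in fulls if b.end_time <= window_start]
    obsolete_roots = {b.backup_id for b in before_window[:-1]}
    # An incremental backup shares the fate of its root.
    return [b.backup_id for b in backups
            if root_of(b, by_id).backup_id in obsolete_roots]


# One full backup 60 days ago, then a daily incremental on top of the previous one.
now = datetime(2025, 3, 25)
backups = [Backup("full-0", now - timedelta(days=60), None)]
for day in range(1, 61):
    backups.append(Backup(f"incr-{day}", now - timedelta(days=60 - day),
                          backups[-1].backup_id))

print(len(backups), obsolete_ids(backups, now, window_days=14))  # prints: 61 []
```

With a single full backup at the start, nothing in the chain ever becomes obsolete in this model, no matter how old the early incrementals get.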

> I can see two possible scenarios here. Which is true, or is neither?
>
> 1. When the first incremental backup becomes 15 days old, Barman uses pg_combinebackup to create a synthetic full backup from the original full backup and the oldest incremental backup via barman cron. This repeats each day, so at any given point you're storing a synthetic full backup and 14 days of incremental backups on top of it. Together with WAL streaming, this enables PITR to any point within those 14 days.

What you describe here is Evergreen with a retention policy.
We'd love to deliver this feature, but there are some other blocking
features we would need to have first. There's also an internal
discussion on how the combinebackup would be executed. Also, PG v18
has a link mode for combinebackup which would really help here.

We have plans to deliver such a feature, but it's not there yet.

> 2. Only the initial full backup matters in terms of the recovery window. Barman doesn't do any automatic combination of it and its following incremental backups. 2 months later, I'll have the original full backup and 2 months worth of incremental backups. WAL files are still only stored for the past 14 days, enabling the same PITR period as scenario 1, but incremental backups can never be deleted until the next full backup is taken (or possibly until I run pg_combinebackup manually, though I don't know if this would mess up Barman's bookkeeping). Eventually I'll run out of disk space. In a recovery scenario, Barman would need to replay everything from the original full backup onwards, applying months or years of incremental backups on top.

This is closer to how it works, but not exactly.

If your oldest full backup is 2 months old, the full backup and full
chain of incremental backups will be retained, together with all the
WAL files.

This is an interesting feature you brought up that we hadn't thought
about: only retain the WALs for the retention period as long as
there's an incremental backup inside the retention policy. It would
have to be implemented with an option switch so people can choose.

> I really hope that scenario 1 is how it works, but I hope this group can confirm this one way or another. If it's scenario 2 (or something else entirely), is scenario 1 even conceivable as a future feature?

Option one, from the user perspective, would look like a command that
would look at the chain of full and incremental backups, and reduce
that chain so the retention policy is still met, but the chain reduced
to its minimal length.

This is something we want to implement this year.

For now, you will need to take a full backup every now and then so
that Barman's maintenance work will clean up obsolete backups.
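
To get a feel for how much a periodic full backup changes what stays on disk, here is a rough Python sketch. It assumes one backup per day, counts days as plain integers, and uses a simplified recovery-window rule (an old chain is dropped only once a newer full backup is old enough to cover the start of the window). The function name and parameters are hypothetical, and this is not how Barman computes retention internally.

```python
def kept_backup_days(today, window_days, full_every):
    """Days (0 = first full backup) whose daily backups are still kept on `today`.

    Simplified model: one backup per day, a full one every `full_every` days
    and incrementals otherwise; a chain is dropped only once a newer full
    backup is old enough to cover the start of the recovery window.
    """
    window_start = today - window_days
    full_days = range(0, today + 1, full_every)
    covering = [d for d in full_days if d <= window_start]
    oldest_kept = covering[-1] if covering else 0
    return list(range(oldest_kept, today + 1))


weekly = kept_backup_days(today=60, window_days=14, full_every=7)
never = kept_backup_days(today=60, window_days=14, full_every=10_000)
print(weekly[0], len(weekly))  # prints: 42 19
print(never[0], len(never))    # prints: 0 61
```

In this model a weekly full backup keeps the stored history bounded to roughly the window plus one chain, while a single initial full backup keeps everything.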

JK Laiho

Mar 25, 2025, 9:46:28 AM
to Barman, Backup and Recovery Manager for PostgreSQL
On Tuesday, March 25, 2025 at 10:52:21 AM UTC+2 Martin Marques wrote:
> ```
> Block-level incremental backups are not considered in retention
> policies, as they depend on their parent backups and the root backup.
> Only the root backup is used to determine retention.
> ```

Yeah, I saw that one but wasn't quite sure what its practical implications were. Why is this, by the way? Especially if the incremental backup dependency chain is intact.

"Considered for retention" — are incremental backups within a recovery window (if present) still used for recovery purposes, so that replaying WALs doesn't need to go all the way back to the root backup? I'd assume so, because if they're not, then I'm not sure I understand what the point of having them would even be.

By the way, "root backup"—is that synonymous to "full backup", or just one or more specific types of full backups? I see the term used in a couple of places in the docs in similar contexts as "full backup". If they're one and the same, standardizing on just one term might be appropriate?
 
> This is an interesting feature you brought up that we hadn't thought
> about: only retain the WALs for the retention period as long as
> there's an incremental backup inside the retention policy. It would
> have to be implemented with an option switch so people can choose.

Nice; hoping to see this :)

> Option one, from the user perspective, would look like a command that
> would look at the chain of full and incremental backups, and reduce
> that chain so the retention policy is still met, but the chain reduced
> to its minimal length.
>
> This is something we want to implement this year.

Very glad to hear this, I'll look forward to it!
 
> For now, you will need to take a full backup every now and then so
> that Barman's maintenance work will clean up obsolete backups.

All right. Thank you very much for clarifying this!

-- JK 

Martin Marques

Mar 26, 2025, 9:05:58 AM
to pgba...@googlegroups.com
Hi,

On Tue, 25 Mar 2025 at 14:46, JK Laiho <jarkko...@gmail.com> wrote:
>
> On Tuesday, March 25, 2025 at 10:52:21 AM UTC+2 Martin Marques wrote:
>
> ```
> Block-level incremental backups are not considered in retention
> policies, as they depend on their parent backups and the root backup.
> Only the root backup is used to determine retention.
> ```
>
>
> Yeah, I saw that one but wasn't quite sure what its practical implications were. Why is this, by the way? Especially if the incremental backup dependency chain is intact.

If, for any reason, one of your incremental backups gets corrupted,
you will not be able to recover to any point in time if you don't have
the WALs. That is a risk someone may be willing to take.
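
The dependency behind that risk can be sketched in a few lines of Python. This is an illustration under assumptions, not Barman code: the backup names and the `restorable` helper are hypothetical, and "corrupted" simply means the backup cannot be read.

```python
def chain_to_full(backup_id, parent_of):
    """Return the backup plus every ancestor, ending at its full backup."""
    path = [backup_id]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    return path


def restorable(parent_of, corrupted):
    """Backups whose whole chain is intact (hypothetical helper)."""
    return [b for b in parent_of
            if not any(link in corrupted for link in chain_to_full(b, parent_of))]


# Each incremental backup points at the backup it was taken on top of.
parent_of = {"full": None, "incr-1": "full", "incr-2": "incr-1", "incr-3": "incr-2"}
print(restorable(parent_of, corrupted={"incr-2"}))  # prints: ['full', 'incr-1']
```

Once `incr-2` is unreadable, `incr-3` is unusable as well, so reaching any later point in time means starting from `incr-1` (or the full backup) and replaying WAL from there, which only works if those WAL files were kept.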

> "Considered for retention" — are incremental backups within a recovery window (if present) still used for recovery purposes, so that replaying WALs doesn't need to go all the way back to the root backup? I'd assume so, because if they're not, then I'm not sure I understand what the point of having them would even be.

Yes, that is correct. The reason why we keep the WALs back to the full
backup is explained above.

> By the way, "root backup"—is that synonymous to "full backup", or just one or more specific types of full backups? I see the term used in a couple of places in the docs in similar contexts as "full backup". If they're one and the same, standardizing on just one term might be appropriate?

I'll get that changed. Root is not a good term for documentation.

Kind regards, Martín