Greenplum 7: wal_compression GUC default value


Marbin Tan

Jun 2, 2023, 5:20:12 PM
to gpdb...@greenplum.org

Hi folks,

 

I want to start a discussion on whether the wal_compression GUC should be on or off by default.

The current default value is off.

Postgres has wal_compression off by default since the introduction of the GUC.

 

However, as we move forward with Greenplum 7, it is good to evaluate whether we should turn wal_compression on by default.

Greenplum 7 differs from Postgres here in that we strictly use ZSTD as the compression algorithm.

In Postgres, the user has the option to use other compression algorithms, e.g., pglz, lz4, and zstd.

 

Furthermore, the compression level in ZSTD is tunable. It can go from 1 to 22. As of right now, we are using the default value of compression level 3.

Compression level 3 is the value recommended by ZSTD, as it covers most general use cases.

The question is, should we change this value?
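
For reference, flipping the setting cluster-wide for experimentation looks roughly like the following. This is a minimal sketch using the usual gpconfig/gpstop workflow; treat the exact invocation as an assumption and verify it for your environment (wal_compression is reloadable, so no restart should be needed):

# Hedged sketch: enable WAL compression on all segments and reload.
gpconfig -c wal_compression -v on
gpstop -u                        # reload configuration across the cluster
psql -c "SHOW wal_compression;"  # confirm the active value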

 

I’ve done some performance testing to see the potential benefits of having wal_compression set to on, along with an exploration of different ZSTD compression levels.

 

The performance tests were conducted on two different Greenplum 7 clusters, a multi-node and a single-node setup:

  • Multi-node deployment: 64 primary + 64 mirrors Greenplum 7 cluster
    • 8 primaries and 8 mirrors per host
    • Segment hosts: 50 sockets, 1 core/CPU, 2 NUMA nodes, 196GiB RAM
    • Coordinator host: 100 sockets, 1 core/CPU, 2 NUMA nodes, 393GiB RAM
  • Single-node deployment: 3 primary + 3 mirrors (demo-cluster)
    • Local machine (MacBook Pro 2019 Intel chip)

 

The GUC settings used for most of the tests (a gpconfig sketch follows the list):

  • wal_consistency_checking=""
  • fsync=on
  • full_page_writes=on
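
As a concrete illustration, these could be applied cluster-wide roughly as below. This is a hedged sketch; in particular, the quoting needed to set wal_consistency_checking to an empty string via gpconfig is an assumption and may need adjustment:

# Hedged sketch of applying the test GUCs and reloading all segments.
gpconfig -c fsync -v on
gpconfig -c full_page_writes -v on
gpconfig -c wal_consistency_checking -v "''"   # empty value; quoting may vary
gpstop -u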

 

Summary

  • Pros of wal_compression=on
    • It will reduce the amount of WAL generated by 20-30%.
    • It will reduce the overall duration of updates/loads by 6-15%.
    • It will reduce the amount of data transferred across the interconnect.
    • It will reduce I/O, as less data is generated.
    • Based on TPC-DS load duration, AO/CO loads will speed up by 4%.
  • Cons of wal_compression=on
    • It will increase overall CPU consumption: +0.5-1.5% for multi-node and +0.5-10% for single node.
    • Memory allocation for ZSTD is not tracked, so (as a corner case) we may run out of memory.
      • Total memory usage went from 8.8% to 9.0% (single node) when going from wal_compression off to on.
  • ZSTD compression levels 1 and 3 behave relatively the same for a multi-node deployment.

 

 

Beyond this point are the test information, the data, and an analysis of the results.

 

Note: the main test performed here is based on prior testing from the Postgres mailing list.

 

The test creates tables, inserts rows, and then updates the values.

Originally, only the updates were measured; however, I’ve modified the tests to gather some additional interesting data.

 

    9 short and 1 long, short changed

    hundred tiny fields, all changed

    hundred tiny fields, half changed

    hundred tiny fields, half nulled

    one short and one long field, no change

    ten tiny fields, all changed

    two short fields, both changed

    two short fields, no change

    two short fields, one changed

 

Also, I will mostly present aggregated data for the above tests, since showing each individual test result would be too long.
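
To make the shape of these tests concrete, below is a rough, hypothetical sketch of one case ("two short fields, one changed") and of how duration and WAL volume can be measured around a single statement. It is not the actual upstream script; the table, row counts, and database name are made up, and on Greenplum the real measurements would be aggregated across all segments rather than taken from the coordinator as shown here:

# Hypothetical sketch only (not the upstream wal-update test script).
psql -d testdb <<'SQL'
DROP TABLE IF EXISTS two_short;
CREATE TABLE two_short (a int, b int) DISTRIBUTED BY (a);
-- "scale" acts as a row-count multiplier for the initial insert
INSERT INTO two_short SELECT g, g FROM generate_series(1, 1000000) g;
CHECKPOINT;

-- remember the WAL insert position before the measured statement
SELECT pg_current_wal_insert_lsn() AS start_lsn \gset
\timing on
UPDATE two_short SET b = b + 1;   -- "two short fields, one changed"
\timing off
-- WAL generated by the update (coordinator-local view only)
SELECT pg_size_pretty(
         pg_wal_lsn_diff(pg_current_wal_insert_lsn(), :'start_lsn'));
SQL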

 

The scale mentioned below is a multiplier when inserting rows.

1000 scale will be around 14GB – 21GB

5000 scale will be around 40GB – 55GB

10000 scale will be around 55GB – 83GB

 

 

 

Single node test

 

scenario: create + insert

type | zstd compression level | duration delta | wal_generated delta | wal_bytes delta

heap | 1                      | -0.1%          | -0.4%               | -0.4%

 

scenario: update

type | scale | zstd compression level | duration delta | wal_generated delta | wal_bytes delta

AO/CO| 1000  | 3                      | 2.1%           | -0.3%               | -0.2%

heap | 1000  | 1                      | -4.9%          | -31.1%              | -31.3%

heap | 1000  | 3                      | 7%             | -31%                | -31%

heap | 1000  | 19                     | 1442.0%        | -30.5%              | -33.3%

heap | 5000  | 1                      | -3.2%          | -31.0%              | -31.6%

heap | 5000  | 3                      | 4.4%           | -31.3%              | N/A

heap | 10000 | 3                      | 3%             | -31%                | -32%

heap | 50000 | 1                      | -3.9%          | -31.0%              | -31.6%

 

scenario: create + insert + update

type | scale | zstd compression level | duration delta | wal_generated delta | wal_bytes delta

heap | 1000  | 1                      | -1.8%          | -19.8%              | -20.0%

 

GUC wal_consistency_checking=heap

type | scale | zstd compression level | duration delta | wal_generated delta | wal_bytes delta

heap | 1000  | 1                      | -43.1%         | -86.4%              | -86.7%

 

 

Based on the results from the update scenario, increasing the compression level increases the duration while compressing the data further.

However, the increase in duration is very costly compared to how much additional compression we get in return.

Hence, the tests beyond this point are limited to compression levels 1 and 3 (the default).

 

A very similar result was obtained from testing done on Postgres.

The results show that the full page image (FPI) size stops decreasing beyond a certain point, even as the ZSTD compression level increases.

descr                         | rel_size | fpi_size | time_s

int column no compression     | 429 MB   | 727 MB   | 13.15

int column zstd default level | 429 MB   | 523 MB   | 14.23

int column zstd level 1       | 429 MB   | 524 MB   | 13.94

int column zstd level 10      | 429 MB   | 523 MB   | 23.46

int column zstd level 19      | 429 MB   | 523 MB   | 103.71

int/text no compression       | 344 MB   | 558 MB   | 10.08

int/text zstd default level   | 344 MB   | 415 MB   | 11.48

int/text zstd level 1         | 344 MB   | 418 MB   | 11.25

int/text zstd level 10        | 344 MB   | 415 MB   | 20.59

int/text zstd level 19        | 344 MB   | 413 MB   | 62.64
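
(For anyone who wants to reproduce numbers like the fpi_size column locally, the record vs. FPI volume for a WAL range can be summarized with pg_waldump; a hedged sketch with placeholder LSNs and segment data directory:)

# Hedged sketch: per-record-type statistics, including FPI size, for a
# WAL range of one segment. Paths and LSNs are placeholders.
pg_waldump --stats=record \
  --start=0/02000000 --end=0/09000000 \
  -p /data/primary/gpseg0/pg_wal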

 

 

Multi-node test

 

scenario: create only

type | scale | zstd compression level | duration delta | wal_generated delta | wal_bytes delta

heap | 1000  | 3                      | 0.1%           | -7.8%               | -4.1%

 

scenario: update

type | scale | zstd compression level | duration delta | wal_generated delta | wal_bytes delta

heap | 5000  | 3                      | -9.6%          | -29.2%              | -28.1%

heap | 10000 | 1                      | -11.7%         | -31.0%              | -33.7%

heap | 10000 | 3                      | -10.7%         | -27.4%              | -27.8%

 

The surprising result here is that the multi-node cluster actually gained a roughly 10% speed-up, in contrast to the single-node results.

The wal_bytes and wal_generated are within the expected range of 20-30% reduction in size.

 

 

perf capturing

 

All perf records were captured on one segment host (sdw1).

The CPU % values below are taken from the flamegraphs generated from those perf records.
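
The capture itself was of roughly the following shape. This is a hedged sketch: the sampling frequency, duration, and file names are placeholders, and stackcollapse-perf.pl/flamegraph.pl are Brendan Gregg's FlameGraph scripts, assumed to be on PATH:

# Hedged sketch: system-wide sampling with call graphs, then collapse
# the stacks and render a flamegraph SVG.
perf record -a -g -F 99 -o wal_on.perf.data -- sleep 120
perf script -i wal_on.perf.data \
  | stackcollapse-perf.pl \
  | flamegraph.pl > wal_on.svg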


The flamegraph showed that the extra CPU usage is mainly from ZSTD_compress and ZSTD_getErrorString.

 

Multi-node mirrorless

 

testname                                | compression=3 on | off

9 short and 1 long, short changed       | 5.55%            | 0.90%

hundred tiny fields, all changed        | 2.56%            | 1.06%

hundred tiny fields, half changed       | 2.83%            | 1.05%

hundred tiny fields, half nulled        | 3.07%            | 1.05%

one short and one long field, no change | 1.56%            | 0.90%

ten tiny fields, all changed            | 10.98%           | 2.04%

two short fields, both changed          | 11.06%           | 1.82%

two short fields, no change             | 11.63%           | 1.82%

two short fields, one changed           | 12.07%           | 1.75%

 

 

Multi-node with mirrors

 

testname                                | compression=3 on | off

9 short and 1 long, short changed       | 1.64%            | 1.25%

hundred tiny fields, all changed        | 3.81%            | 4.13%

hundred tiny fields, half changed       | 3.63%            | 3.94%

hundred tiny fields, half nulled        | 3.81%            | 3.48%

one short and one long field, no change | 1.36%            | 1.24%

ten tiny fields, all changed            | 6.52%            | 5.74%

two short fields, both changed          | 7.35%            | 6.60%

two short fields, no change             | 4.05%            | 3.32%

two short fields, one changed           | 4.29%            | 3.80%

 

testname                                |  compression=1 on | off

9 short and 1 long, short changed       |  1.72%            | 1.20%

hundred tiny fields, all changed        |  3.84%            | 3.73%

hundred tiny fields, half changed       |  4.18%            | 3.85%

hundred tiny fields, half nulled        |  3.89%            | 3.66%

one short and one long field, no change |  1.28%            | 1.35%

ten tiny fields, all changed            |  7.09%            | 6.03%

two short fields, both changed          |  7.02%            | 6.68%

two short fields, no change             |  3.81%            | 3.95%

two short fields, one changed           |  4.03%            | 3.85%

 

testname                                | compression=3 on with mirror caught up | off with mirror caught up

9 short and 1 long, short changed       | 1.79%                                  | 1.37%

hundred tiny fields, all changed        | 3.84%                                  | 3.10%

hundred tiny fields, half changed       | 3.87%                                  | 3.75%

hundred tiny fields, half nulled        | 3.83%                                  | 3.59%

one short and one long field, no change | 1.41%                                  | 1.29%

ten tiny fields, all changed            | 7.37%                                  | 6.89%

two short fields, both changed          | 7.01%                                  | 7.23%

two short fields, no change             | 4.23%                                  | 4.05%

two short fields, one changed           | 3.68%                                  | 3.54%

 

Based on the results here, even with waiting for the mirror to decompress and replay the WAL, it still does not take a significant amount of CPU time.

Multi-node with mirrors uses less overall CPU for Postgres compared to mirrorless. This is probably because the data needs to be sent to the mirrors?

It seems that the overall Postgres CPU usage for wal_compression on compared to off increased by 0.5% to 1.5%.

The increase in CPU cycles for Postgres doesn’t seem to be very much for a host with 16 segments (8 primary + 8 mirrors).

 

 

TPC-DS load of 1TB with ZSTD compression 3

 

AO/CO

off | 1003.607694 seconds

on  | 962.35554 seconds

duration delta (off -> on) -4.11%

 

Heap

off | 1551.47638 seconds

on  | 1241.385109 seconds

duration delta (off -> on) -19.99%

 

Surprisingly, even AO/CO received a 4% benefit from turning compression on. This might be related to auxiliary tables being heap.

As for TPC-DS with heap tables, we can see an almost +20% improvement in speed. (Note: I had another run that came to about +6% improvement)

But overall, this is a significant improvement for the TPC-DS load.

 

 

Memory

 

ZSTD uses its own memory allocation, which means tracking the memory consumption is not simple. Unfortunately, the feature for using custom memory allocation for ZSTD is still not stable. This is something we may need to consider in the future.

 

The memory consumption did not increase much for the single node test; the max memory usage went from 8.8% to 9.0%. The memory usage was captured by watching the ps output of all postgres processes every half a second.
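
(The exact command is an assumption, but the sampling was of roughly this shape on Linux; the ps flags on the macOS single-node machine differ:)

# Hedged sketch: sample resident size and %MEM of all postgres processes
# every half second (GNU ps syntax).
watch -n 0.5 "ps -C postgres -o pid,rss,pmem,cmd --sort=-rss | head -n 40"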

Keep in mind that as we increase the number of segments per host, ZSTD will use more memory overall, since each segment does its own WAL compression.

 

 

I would like to hear your thoughts regarding wal_compression being on by default. If so, which ZSTD compression level: 1 or 3?

Feedback on what should be investigated further to better determine which direction we should go would also be great.

Also, if anything is unclear please let me know.
Thank you very much.

 

Regards,
Marbin

Soumyadeep Chakraborty

Jun 2, 2023, 8:42:52 PM
to Marbin Tan, gpdb...@greenplum.org
Hey Marbin,

Awesome work! Thanks for digging and doing so many different experiments!

On Fri, Jun 2, 2023 at 2:20 PM 'Marbin Tan' via Greenplum Developers
<gpdb...@greenplum.org> wrote:

> Single node test
> scenario: update
> type | scale | zstd compression level | duration delta | wal_generated delta | wal_bytes delta
> heap | 1000 | 3 | 7% | -31% | -31%

The 7% duration hit is unsurprising. Here, there is no network involved,
and if we aren't saving on network, there is no point in enabling this
GUC. The same conclusion can be applied to mirrorless clusters.

> GUC wal_consistency_checking=heap
> type | scale | zstd compression level | duration delta | wal_generated delta | wal_bytes delta
> heap | 1000 | 1 | -43.1% | -86.4% | -86.7%

This is where wal_compression shines the most -> this is a pathological
workload generating a large amount of FPIs. This actually simulates a
situation where we are loading a very high volume of data, with high
concurrency. It can happen if the data ingest volume is so much that
checkpoints are occurring too frequently, causing more FPIs.

> Multi-node with mirrors
> testname | compression=3 on | off
> 9 short and 1 long, short changed | 1.64% | 1.25%

What exactly does 1.64% mean? Is it the total CPU time of zstd relative
to all PG processes or relative to all processes?

Also what does 1.25% mean? (since compression is off, how is there a
number?)

> testname | compression=3 on with mirror caught up | off with mirror caught up

This is the most interesting configuration, because we are also
factoring in CPU overhead from decompression. The numbers are definitely
encouraging.

Do you have a variant of this: compression=1 on with mirror caught up?

> TPC-DS load of 1TB with ZSTD compression 3
> AO/CO
> off | 1003.607694 seconds

Wow, it took just 1000s for loading 1TB of data? And was this in a
mirrored multi-host setting?

> The memory consumption did not increase much for the single node test; the max memory usage went from 8.8% to 9.0%.

Which test case did this correspond to?

At any rate, I don't think we have to worry about memory too much. If
there aren't any leaks etc, which your experiments seem to have
eliminated, then we are good. This is because, at a time, a backend can
only be compressing a 32K page. The input size to the compression
algorithm will always be 32K. So, all good!

Some follow-up questions:

1. What is the zstd version that you used?
2. Did you have indexes on the tables for any of the runs?

These are my conclusions so far (I will wait for your replies to my
questions before considering these final):

1. We should definitely enable it by default in guc.c.
However, there is no point in doing wal compression if there is no
network to push WAL to. So, this means that we shouldn't enable it in a
mirrorless setting (edge case being mirrorless cluster with WAL archive
turned on where there might still be some benefit)

2. Even though we may be able to squeeze some CPU cycles from setting
compresslevel = 1, given the marginal improvements in CPU your
experiments show, I would lean toward favoring the default
compresslevel = 3. This is because:

* It is what upstream does.
* It is the one that ZSTD advocates and the setting that the zstd
dev team would pay more attention to and optimize in newer releases.
* There is not enough of a CPU gap between level=3 and level=1.
* There might be a workload in which level=3 pulls ahead in terms of
WAL compression ratio (and thus overall duration).

Regards,
Soumyadeep (VMware)

Marbin Tan

Jun 5, 2023, 12:37:36 PM
to Soumyadeep Chakraborty, gpdb...@greenplum.org
Hey Deep,

Thank you for the detailed response.

> Multi-node with mirrors
> testname                                | compression=3 on | off
> 9 short and 1 long, short changed       | 1.64%            | 1.25%

What exactly does 1.64% mean? Is it the total CPU time of zstd relative
to all PG processes or relative to all processes?
The 1.64% is the overall CPU share of the PG processes themselves (relative to everything sampled). So, going from 1.25% to 1.64%, there is roughly 0.39% used by ZSTD.
Looking at the specifics of the flamegraph for this one:
0.02 (ZSTDMT_endStream) +
0.28 (ZSTD_compress) +
0.02 (ZSTD_decompressStream...) +
0.04 (ZSTD_getErrorString) +
0.12 (ZSTD_endStream, ZSTD_compress, ZSTD_getErrorString)
That comes to exactly 0.48% of the overall CPU usage, seen from the perspective of PG.
Relative to the off case, the CPU usage of the PG processes increased by about 31%.

If you would like the granularity of this for the other tests, please let me know and I'll post them up.

Also what does 1.25% mean? (since compression is off, how is there a
number?)
It is the overall % of CPU samples attributed to the PG processes; since compression is off, there is no compression happening, just regular PG process activity.
> testname                                | compression=3 on with mirror caught up | off with mirror caught up

This is the most interesting configuration, because we are also
factoring in CPU overhead from decompression. The numbers are definitely
encouraging.

Do you have a variant of this: compression=1 on with mirror caught up?
No, unfortunately, I do not. It would be great to have, but there are limited resources at the moment.
> TPC-DS load of 1TB with ZSTD compression 3
> AO/CO
> off | 1003.607694 seconds

Wow, it took just 1000s for loading 1TB of data? And was this in a
mirrored multi-host setting?
Yes, this is a mirrored multi-host setting. I believe gpfdist runs on each segment to load the data.
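
For context, a minimal sketch of what a gpfdist-based load path looks like; the host, port, file names, and table are hypothetical and this is not the actual TPC-DS tooling:

# Hypothetical gpfdist load sketch: serve flat files from an ETL host,
# then load them in parallel through a readable external table.
gpfdist -d /data/tpcds_flat -p 8081 >/tmp/gpfdist.log 2>&1 &

psql -d tpcds <<'SQL'
CREATE READABLE EXTERNAL TABLE ext_store_sales (LIKE store_sales)
  LOCATION ('gpfdist://etl-host:8081/store_sales_*.dat')
  FORMAT 'TEXT' (DELIMITER '|');
INSERT INTO store_sales SELECT * FROM ext_store_sales;
SQL
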
> The memory consumption did not increase much for the single node test; the max memory usage went from 8.8% to 9.0%.

Which test case did this correspond to?
This would be the single node local host test.

At any rate, I don't think we have to worry about memory too much. If
there aren't any leaks etc, which your experiments seem to have
eliminated, then we are good. This is because, at a time, a backend can
only be compressing a 32K page. The input size to the compression
algorithm will always be 32K. So, all good!
I have the same thoughts on that as well.
Some follow-up questions:

1. What is the zstd version that you used?
Local machine: *** Zstandard CLI (64-bit) v1.5.5, by Yann Collet ***
Multi-node machine: 1.4.4+dfsg-3ubuntu0.1
2. Did you have indexes on the tables for any of the runs?
No, there were no indexes. I ran only a few tests with indexes, but not enough to draw a conclusion.

These are my conclusions so far (I will wait for your replies to my
questions before considering these final):

1. We should definitely enable it by default in guc.c.
However, there is no point in doing wal compression if there is no
network to push WAL to. So, this means that we shouldn't enable it in a
mirrorless setting (edge case being mirrorless cluster with WAL archive
turned on where there might still be some benefit)
There might be more testing needed for mirrorless, and the result may be specific to each datastore.
For example, vSAN by default does not do "data locality".
It pools all the disk groups (a disk group is the set of disks within a host) together to create a vSAN datastore, and
all the writes go through the vSAN network. VMDKs are split into components and may reside on different disk groups.
So, if vSAN is not doing compression by default, there is a possibility of some performance gain.
2. Even though we may be able to squeeze some CPU cycles from setting
compresslevel = 1, given the marginal improvements in CPU your
experiments show, I would lean toward favoring the default
compresslevel = 3. This is because:

    * It is what upstream does.
    * It is the one that ZSTD advocates and the setting that the zstd
    dev team would pay more attention to and optimize in newer releases.
    * There is not enough of a CPU gap between level=3 and level=1.
    * There might be a workload in which level=3 pulls ahead in terms of
    WAL compression ratio (and thus overall duration).
Agreed, I think compression level 3 is a safe bet.

Regards,
Marbin

Soumyadeep Chakraborty

Jun 5, 2023, 2:13:38 PM
to Marbin Tan, gpdb...@greenplum.org
On Mon, Jun 5, 2023 at 9:37 AM Marbin Tan <tma...@vmware.com> wrote:

> This is the overall PG process itself. So, from 1.25% to 1.64%, there is roughly 0.39% used by ZSTD.

Got it. Thanks for the detailed explanation!

> Local machine: *** Zstandard CLI (64-bit) v1.5.5, by Yann Collet ***
> Multi-node machine: 1.4.4+dfsg-3ubuntu0.1

Interesting. I checked the min required version for zstd in our code:

if test "$with_zstd" = yes; then
dnl zstd_errors.h was renamed from error_public.h in v1.1.1
PKG_CHECK_MODULES([ZSTD], [libzstd >= 1.1.1])
fi

I think it's high time we bump it. I think we will be more confident in
turning wal_compression on by default that way! (PG requires at least
1.4.0 on devel)

Any thoughts on what min version to require? (My opinion is to require
at least 1.4.0)

> There might be more testing needed for mirrorless, and the result may be specific to each datastore.
> For example, vSAN by default does not do "data locality".
> It pools all the disk groups (a disk group is the set of disks within a host) together to create a vSAN datastore, and
> all the writes go through the vSAN network. VMDKs are split into components and may reside on different disk groups.
> So, if vSAN is not doing compression by default, there is a possibility of some performance gain.

I see, so for high WAL volume, it might increase throughput in such a
setting. This discussion can also be extended to cloud-based deployments
with non-local storage, I guess.

With regard to the numbers you posted for the mirrorless deployment, like:

testname | compression=3 on | off
9 short and 1 long, short changed | 5.55% | 0.90%

Even though we take a hit on PG process CPU usage, what was the net
effect on the overall duration of each test? Was the spike in CPU usage
offset by networked-storage-writes?

Regards,
Soumyadeep (VMware)

Marbin Tan

Jun 5, 2023, 3:20:17 PM
to Soumyadeep Chakraborty, gpdb...@greenplum.org
> Local machine: *** Zstandard CLI (64-bit) v1.5.5, by Yann Collet ***
> Multi-node machine: 1.4.4+dfsg-3ubuntu0.1

Interesting. I checked the min required version for zstd in our code:

if test "$with_zstd" = yes; then
  dnl zstd_errors.h was renamed from error_public.h in v1.1.1
  PKG_CHECK_MODULES([ZSTD], [libzstd >= 1.1.1])
fi

I think it's high time we bump it. I think we will be more confident in
turning wal_compression on by default that way! (PG requires at least
1.4.0 on devel)

Any thoughts on what min version to require? (My opinion is to require
at least 1.4.0)

I think 1.4.0 is a good next step and will future-proof us.
There are some advanced functionalities that became stable in 1.4.0 and are
possibly used by base backup ZSTD. I'm not sure when we will be pulling in ZSTD for base backup,
but once that happens, a higher version of ZSTD will be a requirement.

> There might be more testing needed for mirrorless, and the result may be specific to each datastore.
> For example, vSAN by default does not do "data locality".
> It pools all the disk groups (a disk group is the set of disks within a host) together to create a vSAN datastore, and
> all the writes go through the vSAN network. VMDKs are split into components and may reside on different disk groups.
> So, if vSAN is not doing compression by default, there is a possibility of some performance gain.

I see, so for high WAL volume, it might increase throughput in such a
setting. This discussion can also be extended to cloud-based deployments
as well with non-local storage, I guess.

With regard to the numbers you posted for the mirrorless deployment, like:

testname                                | compression=3 on | off
9 short and 1 long, short changed       | 5.55%            | 0.90%

Even though we take a hit on PG process CPU usage, what was the net
effect on the overall duration of each test? Was the spike in CPU usage
offset by networked-storage-writes?
For mirrorless deployment, going from off to on, on is slower by 4 to 8%.
Unfortunately, I do not have the metrics for network-storage-writes since I do not have access to the vSphere environment.
I think more investigation will be needed to iron things out for mirrorless deployment on vSphere, but with the current findings, it is safe to say that there will be a performance hit if wal_compression is on for mirrorless deployments.

Soumyadeep Chakraborty

Jun 5, 2023, 6:20:16 PM
to Marbin Tan, gpdb...@greenplum.org
On Mon, Jun 5, 2023 at 12:20 PM Marbin Tan <tma...@vmware.com> wrote:

> I think 1.4.0 is a good next step and will future proof us.
> There are some advanced functionalities that became stable for 1.4.0 and
> possibly used by base backup ZSTD. I'm not sure when we will be pulling in ZSTD for base backup,
> but once that happens, a higher version of ZSTD will be a requirement.

We don't have any plans for that at the moment. We don't even do the
gzip compression that is available with pg_basebackup in 7X from
gprecoverseg. But, we definitely should IMO. I will let others weigh
in, in terms of priority. But, just to be clear, even if we backport
that support in pg_basebackup, 1.4.0 should be more than enough, right?
Or did you see anything that would benefit from going even higher in
pg_basebackup?
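
(For reference, the gzip compression mentioned above is the tar-format option of pg_basebackup; a hedged example with placeholder host, port, and target directory:)

# Hedged example: gzip-compressed, tar-format base backup of one segment.
pg_basebackup -h sdw1 -p 6001 -D /data/backup/gpseg0 -Ft -z -P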

That said, when we raise the PR to turn wal_compression on by default,
let's bump the version in a separate commit.
It might need some dev/release pipeline changes as well.

> For mirrorless deployment, going from off to on, on is slower by 4 to 8%.
> Unfortunately, I do not have the metrics for network-storage-writes since I do not have access to the vSphere environment.
> I think more investigation will be needed to iron things out for mirrorless deployment on vSphere, but with the current findings, it is safe to say that there will be a performance hit if wal_compression is on for mirrorless deployments.

Understood.

This is mainly a question of what we document. I think we can say that
wal_compression *might not* provide a benefit in mirrorless settings.
It completely depends on the environment - non-local GCP storage can be
so slow (I felt the pain last year when experimenting with some readahead
stuff) that the extra CPU might be worth it for high WAL volume
workloads!

Regards,
Soumyadeep (VMware)

Marbin Tan

Jun 5, 2023, 7:16:28 PM
to Soumyadeep Chakraborty, gpdb...@greenplum.org

Marbin Tan

Jun 5, 2023, 7:23:33 PM
to Soumyadeep Chakraborty, gpdb...@greenplum.org
> I think 1.4.0 is a good next step and will future proof us.
> There are some advanced functionalities that became stable for 1.4.0 and
> possibly used by base backup ZSTD. I'm not sure when we will be pulling in ZSTD for base backup,
> but once that happens, a higher version of ZSTD will be a requirement.

We don't have any plans for that at the moment. We don't even do the
gzip compression that is available with pg_basebackup in 7X from
gprecoverseg. But, we definitely should IMO. I will let others weigh
in, in terms of priority. But, just to be clear, even if we backport
that support in pg_basebackup, 1.4.0 should be more than enough, right?
Or did you see anything that would benefit from going even higher in
pg_basebackup?
I think that 1.4.0 should be more than enough.
There are some performance improvements in the later releases, but nothing that affects the API.
Just to note, there is a rare corruption bug that can occur in 1.4.0, but we shouldn't hit it; the release notes for the follow-up zstd release describe it as follows:

It includes a few bug fixes, including a fix for a rare data corruption bug, which could only be triggered in a niche use case, when doing all of the following: using multithreading mode, with an overlap size >= 512 MB, using a strategy >= ZSTD_btlazy, and compressing more than 4 GB. None of the default compression levels meet these requirements (not even --ultra ones).


That said, when we raise the PR to turn wal_compression on by default,
let's bump the version in a separate commit.
It might need some dev/release pipeline changes as well.
Agreed, this should be a separate commit.
> For mirrorless deployment, going from off to on, on is slower by 4 to 8%.
> Unfortunately, I do not have the metrics for network-storage-writes since I do not have access to the vSphere environment.
> I think more investigation will be needed to iron things out for mirrorless deployment on vSphere, but with the current findings, it is safe to say that there will be a performance hit if wal_compression is on for mirrorless deployments.

Understood.

This is mainly a question of what we document. I think we can say that
wal_compression *might not* provide a benefit in mirrorless settings.
It completely depends on the environment - non-local GCP storage can be
so slow (I felt the pain last year when experimenting with some readahead
stuff) that the extra CPU might be worth it for high WAL volume
workloads!
+1

Regards,
Marbin
