Long lived PgPool connections crashing database

659 views
Skip to first unread message

colin....@canopytax.com

unread,
Jul 20, 2022, 12:36:58 PM7/20/22
to vert.x
I'm using the 3.9.13 PostgreSQL Client's PgPool to connect to multiple RDS instances with multiple microservices. I'm seeing a strange issue where long lived connections in these pools seem to be causing a memory leak at the database level. The memory consumption will increase over time so much that all available memory for the DB instance will run out and start using gigs of swap, before the DB crashes and gets restarted. On a DB instance with 4 GB of RAM and only about 10 connections to the DB, it will usually take a few days to completely run out of memory and use about 2 GB of swap. That means each connection is somehow using more than 600 MB of RAM. If I restart the application, it clears out the connections and all the memory on the DB is freed up again.

I'm at a loss as to what could be using that much memory, and what at the application level could be causing that. Most of the DBs themselves are brand new and quite small, with maybe 4 or 5 tables with no more than a couple thousand rows in them. I know long lived postgres connections can accumulate metadata and consume memory, but for a DB this size, I'm not sure that's what I'm seeing here.

So here's my questions:
1) Is there something on the client (vertx) side that could be causing this memory accumulation? If so, is there any way to clear it out?
2) If not, what's the best way to expire DB connections in the pool? Ideally, I could remove connections after a specified time or number of queries ran after checking it back into the pool, and just grab a new connection. All I see is an option for idleTimeout but that will only clear out connections if they are idle for a specified time. With a small pool with constant traffic, they may never (or rarely) go idle for long enough to expire.

Here's my pool setup:

val connectOptions = pgConnectOptionsOf(
    database = config.getString("name"),
    host = config.getString("host"),
    port = 5432,
    connectTimeout = 30,
    idleTimeout = 30,
    user = config.getString("user"),
    password = config.getString("password")
)

val pool = PgPool.pool(vertx, connectOptions, poolOptionsOf(maxSize = 10))

colin....@canopytax.com

unread,
Jul 25, 2022, 11:35:12 AM7/25/22
to vert.x
Still looking for a solution here. Would love to hear if anyone knows of a way to better automatically expire connections than just setting the idleTimeout. I lowered this value, which ends up expiring most of them after hours when traffic is lower, but being able to expire them during the day would be even better.

colin....@canopytax.com

unread,
Jul 28, 2022, 4:20:02 PM7/28/22
to vert.x
I saw some significant improvement in memory consumption by running a "DISCARD ALL" before releasing every connection back to the pool. Julien (or anyone else), do you know if there's something PgPool should be doing to clear out session data after releasing a connection to the pool? Seems odd that I would need to manually discard everything to avoid major memory problems. I feel like that's something the pool/driver should be handling.

My test setup:
- Simple vertx server with one endpoint that hits the DB, returning 0 rows.
- Postgres 14 docker image. Installed htop to monitor memory within the container.
- Artillery to hit endpoint about 40x/second for 3 minutes.

I saw about a 5x memory reduction when calling "DISCARD ALL" before closing the connection.

Thomas SEGISMONT

unread,
Aug 1, 2022, 6:13:58 AM8/1/22
to ve...@googlegroups.com
Can you create a small reproducer without RDS?

It would be interesting to check with your DBA what kind of DB objects are retaining memory.

--
You received this message because you are subscribed to the Google Groups "vert.x" group.
To unsubscribe from this group and stop receiving emails from it, send an email to vertx+un...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/vertx/1cf216c7-a075-4f73-bcea-0bc6ff4cfa6dn%40googlegroups.com.

Julien Viet

unread,
Aug 1, 2022, 7:36:28 AM8/1/22
to vert.x
can you try this with Vert.x 4 ?

colin....@canopytax.com

unread,
Aug 1, 2022, 12:28:48 PM8/1/22
to vert.x
Thanks so much for your replies. Yes, I've been able to reproduce on Vert.x 4.3 locally (not using RDS) as well. I don't have a dedicated DBA here, so I'm working on trying to narrow down what portion of the DISCARD ALL (query plans, prepared statements, etc) actually fixes the memory issue. We do not see this issue at all with our services that run the JDBC client with the postgres driver and C3P0. Ideally, the PgPool class would have some more options around expiring connections like other pools do, such as a max age per connection (not just when idle) or max statements executed. I'll try to get a reproducer put together, but in the meantime, we are working toward getting all of our services upgraded to Vert.x 4.3 and connecting to RDS through Pgbouncer, which has better expiration options and therefore should remedy the issue for now.

Are there any plans in the works for adding some more pooling options to the PgPool? I didn't realize it until too late, but we lost quite a bit of functionality moving some new services away from the JDBC client to PgPool, such as min pool size, initial pool size, connection max age, testing idle connections, testing connections on checkin/out, etc.

Thomas SEGISMONT

unread,
Aug 2, 2022, 7:20:01 AM8/2/22
to ve...@googlegroups.com
Would you mind putting together your feature enhancement requests on a GH issue? Please also add why you need them.

Would you be able to share a minimal reproducer with us?


Julien Viet

unread,
Aug 3, 2022, 4:37:46 PM8/3/22
to vert.x
Hi,

if the pool has issues with such connection it would be good to find
out the actual reason, can you help us understand the issue with a
reproducer for this case ?

as for other options, it seems reasonable to have max-age per
connection implemented at least.

Julien
> To view this discussion on the web, visit https://groups.google.com/d/msgid/vertx/7471b929-b74a-4d5a-a5d5-b236e0a1dbden%40googlegroups.com.

Chandra R.

unread,
Oct 18, 2022, 9:44:34 PM10/18/22
to vert.x
I have a similar ask for max-age per connection, but in a different context (MySqlPool)
https://groups.google.com/g/vertx/c/C_3W1AXoPEc

The ask is similar to this feature in Hikari: https://www.javadoc.io/doc/com.zaxxer/HikariCP/2.6.3/com/zaxxer/hikari/HikariConfig.html#setMaxLifetime-long-
And, the motivation for the feature is the following:

colin....@canopytax.com

unread,
Nov 16, 2023, 2:13:50 PM11/16/23
to vert.x
I know this is really old, but I just wanted to give a quick update here on this issue. For the last year, we've gotten around this problem by simply restarting our services every night. We recently upgraded Vert.x to get the changes released in 4.4.2, and we added in a value for the new maxLifetime in the pool options. That seems to have immediately resolved the issues for us. So it appears that really long lived connections accumulate junk over time and need to be reset on occasion. We had been hoping to solve this using PgBouncer (which we use on our other services that connect with a JDBC pool), but unfortunately, support for that wasn't added until 4.4.2 as well. Anyway, thanks for adding those new features. They are definitely appreciated.

Thomas SEGISMONT

unread,
Nov 16, 2023, 2:27:32 PM11/16/23
to ve...@googlegroups.com
Thank *you* for taking the time to share this with us, very much appreciated!

Reply all
Reply to author
Forward
0 new messages