Memory leak from sqlite3 leads to go runtime?

660 views
Skip to first unread message

Gavra

unread,
Mar 9, 2025, 3:59:53 PMMar 9
to golang-nuts
Hi,
So we have been trying really hard to understand a major leak in our product.
We are using this sqlite3 driver: https://github.com/mattn/go-sqlite3
The pprof heap profile indicates the source is the call to SQLiteRows.Columns func and two additional calls, one on SQLiteStmt and the other on SQLiteConn.
According to the code
1. SQLiteRows references SQLiteStmt which then references to SQLiteConn
2. The SQLiteRows instance is the sole object holding a ref to the allocation by Columns().
This is a strong indication that refs to SQLiteRows are leaked.
We can confirm the leak is increasing over time and does not appear to reflect a burst or large data.
We thought we forgot to close rows or somehow appened refs to rows etc but we ruled it out completely since. Note the heap profile only shows SQLite allocations, no app objects allocated. We verified that forgetting to call SQLiteRows.Close leaks only memory in CGO which means it is not visible in the heap profile.
So on one hand we are convinced someone is holding a reference to SQLiteRows but on the other hand it is not our application and not the SQLite driver.

We tracked back our source code changes and noticed that correlate to the appearance of this issue:
1. Upgrading go: 1.22.6 to 1.22.9
2. Upgrading the sqlite3 driver which had one non-negligible change: adding SetFinalizer to all these objects. 

This is a very weird thing to suggest, but we think this could be caused by the go runtime, somehow.
I attached a screenshot of the heap profile, focused on the major leak around the sqlite3 driver.
(our current plans is to use goref or dlv on a core dump to understand who is holding these references but that could take a few more days).

Thank you.
Screenshot 2025-03-09 at 21.40.22.png

robert engels

unread,
Mar 9, 2025, 5:49:03 PMMar 9
to Gavra, golang-nuts
Looks to me like you are reading a lot of rows under a lock, and never releasing the lock, so the rows remain in memory.

I don’t know the internals of the SQLite very well, but my understanding is that it is not really a “driver” in the traditional sense that communicates with a db - but rather it is the implementation as well. Since it is the implementation, holding the lock seems reasonable to also hold the rows.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/golang-nuts/8a8b608b-e501-403a-b7a7-0d0bda657e4cn%40googlegroups.com.
<Screenshot 2025-03-09 at 21.40.22.png>

Michael Knyszek

unread,
Mar 9, 2025, 11:52:05 PMMar 9
to golang-nuts
I suspect this fact is going to be the most relevant thing to your investigation:


> Upgrading the sqlite3 driver which had one non-negligible change: adding SetFinalizer to all these objects. 

See https://go.dev/doc/gc-guide#Common_finalizer_issues for a variety of ways finalizers can cause leaks (on both the Go and C side). Go 1.24's cleanups (runtime.AddCleanup) might work be better, provided a deterministic execution order isn't required. (This point might be moot since it's in go-sqlite3, which sounds like it's something you don't have control over.)

I have a patch that provides a finalizer/cleanup leak detector by setting GODEBUG=detectcleanupleaks=1, if you want to try it. It's https://go.dev/cl/634599. Happy to explain how to patch and build the Go toolchain if you're up for it.

Robert Engels

unread,
Mar 10, 2025, 12:14:49 AMMar 10
to Michael Knyszek, golang-nuts
I suspect that they added the SetFinalizer calls to help those using that driver that weren’t properly managing the driver resources (connections, queries, etc. ). 

On Mar 9, 2025, at 10:52 PM, 'Michael Knyszek' via golang-nuts <golan...@googlegroups.com> wrote:

I suspect this fact is going to be the most relevant thing to your investigation:

Elad Gavra

unread,
Mar 10, 2025, 2:08:07 AMMar 10
to Robert Engels, Michael Knyszek, golang-nuts

Thank you for the replies.

Robert, I forgot to mention we verified there are no locks/any sqlite routine that is stuck (using the routines profile).

Michael, thanks I'll have a look. We shall see how the debug deployment will go. I'll also check out your suggestion (problem is this reproduces only at customer env). By the way, the author of sqlite 3 package removed the finalizer for rows but the reason was "redundant call".

Thanks!


You received this message because you are subscribed to a topic in the Google Groups "golang-nuts" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/golang-nuts/uL68-fxg2K4/unsubscribe.
To unsubscribe from this group and all its topics, send an email to golang-nuts...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/golang-nuts/2B907B52-C0D4-4C42-8513-95BA4BFF3CF0%40ix.netcom.com.

Robert Engels

unread,
Mar 10, 2025, 3:41:37 AMMar 10
to Elad Gavra, Michael Knyszek, golang-nuts
Hi. You may have to provide a small reproducible test case…

The graph you provided seems to show a memory leak. I would use goref on the rows instances to find the roots. If the memory is off heap that means either the driver has a bug or the driver is being used incorrectly. 


On Mar 10, 2025, at 1:07 AM, Elad Gavra <gav...@gmail.com> wrote:



Gavra

unread,
Mar 19, 2025, 5:55:58 AMMar 19
to golang-nuts
Hey,
We finally figured it all out.
The cause to the leak is a bug in a finalizer that is unrelated to sqlite3. This finalizer blocks the runtime's finalizer goroutine, leading to objects with finalizers to leak.
We find it very surprising that the runtime does nothing to warn about this. Especially when GO's mfinal routine is not visible in a normal build in the goroutine profile.
We intend to bring this to a discussion with the community. IMO, the runtime should expose an API to tell how much time had passed since the current finalizer func started. In addition, it should mention this edge case explicitly. You only need one rogue package to block and all the packages are exposed to a leak.

Please let me know your thoughts.

Thanks.

Brian Candler

unread,
Mar 19, 2025, 6:29:34 AMMar 19
to golang-nuts
On Wednesday, 19 March 2025 at 09:55:58 UTC Gavra wrote:
This finalizer blocks the runtime's finalizer goroutine

Out of interest, what made it block? Was the finalizer doing some channel communication, for example?

Gavra

unread,
Mar 19, 2025, 7:18:18 AMMar 19
to golang-nuts
https://github.com/hirochachacha/go-smb2/blob/c8e61c7a5fa7bcd1143359f071f9425a9f4dda3f/client.go#L1063
We are looking for why exactly it blocked (probably incorrect ctx) but I think this close should run in a goroutine.

Robert Engels

unread,
Mar 19, 2025, 7:32:44 AMMar 19
to Gavra, golang-nuts
In principle, I would argue that there is a correctness problem. You should not rely on finalizers ever - they are catches and often optional - so the design relying on finalizers to run is what is broken. 

In the real world they can make solving certain problems much easier - especially with shared resources. I know Java has deprecated them in lieu of other technologies like Cleaners. 


On Mar 19, 2025, at 6:19 AM, Gavra <gav...@gmail.com> wrote:

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.

Robert Engels

unread,
Mar 19, 2025, 7:38:04 AMMar 19
to Gavra, golang-nuts
In fact the code you reference - the close() - does things the Go docs warn specifically not to do. 

You may be better off using runtime.AddCleanup()



On Mar 19, 2025, at 6:32 AM, Robert Engels <ren...@ix.netcom.com> wrote:



Gavra

unread,
Mar 19, 2025, 7:45:53 AMMar 19
to golang-nuts
Yes, I intend to open a bug for them. 
I agree that one should not relay on the execution of finalizers. But, the fact that the runtime just piles them up because one package did wrong is outrageous.

By the way, I am not sure why but I can see the mfinal routine only when running my program from Goland; When I build it manually the routine is invisible using pprof.

Robert Engels

unread,
Mar 19, 2025, 8:17:56 AMMar 19
to Gavra, golang-nuts
The docs:

 A single goroutine runs all finalizers for a program, sequentially. If a finalizer must run for a long time, it should do so by starting a new goroutine.”

So the code did not follow the api docs - as apparently the close() is indefinite. 

On Mar 19, 2025, at 6:46 AM, Gavra <gav...@gmail.com> wrote:

Yes, I intend to open a bug for them. 

Gavra

unread,
Mar 19, 2025, 8:49:59 AMMar 19
to golang-nuts
Yes, even if it was only a few ms, they should have run it in a goroutine.

Michael Pratt

unread,
Mar 19, 2025, 8:59:47 AMMar 19
to Gavra, golang-nuts
On Wed, Mar 19, 2025 at 8:46 PM Gavra <gav...@gmail.com> wrote:
Yes, I intend to open a bug for them. 
I agree that one should not relay on the execution of finalizers. But, the fact that the runtime just piles them up because one package did wrong is outrageous.

By the way, I am not sure why but I can see the mfinal routine only when running my program from Goland; When I build it manually the routine is invisible using pprof.

The finalizer goroutine should appear in goroutine profiles if it is running finalizers (even if a finalizer makes the goroutine block). It is hidden if it is blocked because there are no finalizers to run. If that is not the behavior you are seeing, please file a bug.

 

On Wednesday, 19 March 2025 at 13:38:04 UTC+2 Robert Engels wrote:
In fact the code you reference - the close() - does things the Go docs warn specifically not to do. 

You may be better off using runtime.AddCleanup()



On Mar 19, 2025, at 6:32 AM, Robert Engels <ren...@ix.netcom.com> wrote:


In principle, I would argue that there is a correctness problem. You should not rely on finalizers ever - they are catches and often optional - so the design relying on finalizers to run is what is broken. 

In the real world they can make solving certain problems much easier - especially with shared resources. I know Java has deprecated them in lieu of other technologies like Cleaners. 


On Mar 19, 2025, at 6:19 AM, Gavra <gav...@gmail.com> wrote:

https://github.com/hirochachacha/go-smb2/blob/c8e61c7a5fa7bcd1143359f071f9425a9f4dda3f/client.go#L1063
We are looking for why exactly it blocked (probably incorrect ctx) but I think this close should run in a goroutine.

On Wednesday, 19 March 2025 at 12:29:34 UTC+2 Brian Candler wrote:
On Wednesday, 19 March 2025 at 09:55:58 UTC Gavra wrote:
This finalizer blocks the runtime's finalizer goroutine

Out of interest, what made it block? Was the finalizer doing some channel communication, for example?

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/golang-nuts/5c7c034d-3bc6-4fe5-82bb-a318310bd82fn%40googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.

Michael Knyszek

unread,
Mar 19, 2025, 9:05:30 AMMar 19
to golang-nuts
Sorry you ran into this issue.

Unfortunately, this is pretty much how such features tend to be designed to keep overheads low. You make a good point that we should have more diagnostics to detect when this happens. We can add a `runtime/metrics` metric, or perhaps add extend the capabilities of an experimental debug mode I'm working on for cleanups and finalizers (https://go.dev/cl/634599). At the very least, I'll make note of this in https://go.dev/doc/gc-guide.

Note also that even though `runtime.AddCleanup` is documented as running cleanups concurrently,
1. In Go 1.24 we don't actually do that, so it won't help you to switch in the short term. I have a patch for Go 1.25 to do that properly. (https://go.dev/cl/650697)
2. Even once we can run cleanups concurrently, it's still hazardous to spend a long time in a cleanup. The Go 1.25 implementation will not run each cleanup in its own goroutine, as that is just too expensive and very hard to change. You can still exhaust the finite number of goroutines used to run cleanups.

Java Cleaners have a similar limitation in that each Cleaner runs all its actions on a single thread. ("The execution of the cleaning action is performed by a thread associated with the cleaner. [...] The thread runs until all registered cleaning actions have completed and the cleaner itself is reclaimed by the garbage collector.", from https://docs.oracle.com/javase/9/docs/api/java/lang/ref/Cleaner.html) There is a difference here in that you can create as many cleaners as you want, AFAICT.

I'll note that we might be able to do something more complicated in the future in which we "detach" a cleanup goroutine if we detect that it blocks or runs too long at runtime. But just to set expectations, I don't think there's a high likelihood that we'll do something like this any time soon.

Michael Knyszek

unread,
Mar 19, 2025, 9:24:46 AMMar 19
to golang-nuts
> Unfortunately, this is pretty much how such features tend to be designed to keep overheads low. 

Er, sorry, this is actually a little more nuanced. Creating a new goroutine for each finalizer/cleanup could cause a different problem of generating too many goroutines (so, too much memory being used) in addition to the cost of creating and scheduling new goroutines, and once that behavior is relied on it's very difficult to change. For example, context.AfterFunc has this behavior, and that's an example where the new goroutine overhead has been reported to us as causing performance issues as well. I suspect there just isn't going to be one right answer for all cases. Cleanups and finalizers are relatively low level, so the scales tilted more toward having users do their own scheduling, at the risk of the blocking issues you ran into. (As well as the fact that cleanups and finalizers have more loose guarantees in general.)

Gavra

unread,
Mar 19, 2025, 9:52:17 AMMar 19
to golang-nuts
Thanks Michael.
My opinion is
1. It should be mentioned explicitly in the doc that a blocking finalizer will cause future objects with finalizers to leak. Probably next to the "A single goroutine runs all finalizers for a program..." section.
2. Go should provide a way to know about it when it happens. For example, we could have runtime.LastFinalizer() which returns {StartTime, Trace}. Users may choose to monitor it to detect the issue and its source.
3. I am not sure if it is intentional, but the goroutine dump indicates finq only if debug=1 is used.
4. Maybe we can add pprof/sys to indicate such metrics/potential issues?
5. Regarding the single routine vs multi-routine. Yes, this is clear. Did you consider having a finalizer routine per package? I think this is a significant deficit since it break isolation. Either way, I find providing 1 and 2 sufficient.

Do you want me to start a formal discussion in golang/go or will you take it from here?

Thanks again.

Michael Knyszek

unread,
Mar 19, 2025, 11:02:42 AMMar 19
to golang-nuts
Filed https://github.com/golang/go/issues/72948https://github.com/golang/go/issues/72949, and https://github.com/golang/go/issues/72950 to track the suggested improvements.

On Wednesday, March 19, 2025 at 9:52:17 AM UTC-4 Gavra wrote:
Thanks Michael.
My opinion is
1. It should be mentioned explicitly in the doc that a blocking finalizer will cause future objects with finalizers to leak. Probably next to the "A single goroutine runs all finalizers for a program..." section.
Sure, I filed a bug to add a sentence to the documentation and an example to the GC guide. 
2. Go should provide a way to know about it when it happens. For example, we could have runtime.LastFinalizer() which returns {StartTime, Trace}. Users may choose to monitor it to detect the issue and its source.
I'm not sure it makes sense to add a new API for this. A new API is always a complicated affair, whereas the runtime/metrics package and existing diagnostics (like possibly an analysis in the execution trace tooling) can support detecting common issues without having users change their code (much).

Also, I'm personally interested in an explicit debug mode that detects issues with finalizers and cleanups, since that can also be far more thorough. A bespoke API would likely only be able to help with some narrower set of issues.
3. I am not sure if it is intentional, but the goroutine dump indicates finq only if debug=1 is used.
It's classified as a system goroutine, but it is supposed be re-classified as a user goroutine while it is executing user code, such that it shows up in regular stack traces. There may be a bug in this reclassification. Do you have a reproducer? That would help isolate the issue.
4. Maybe we can add pprof/sys to indicate such metrics/potential issues?
Mentioned above, but yes. Not sure a pprof profile is a good fit, though I suppose you could imagine a "finalizer duration" profile.
5. Regarding the single routine vs multi-routine. Yes, this is clear. Did you consider having a finalizer routine per package? I think this is a significant deficit since it break isolation. Either way, I find providing 1 and 2 sufficient.
Unfortunately no, this was not considered. Reference queues a la Java probably should have been discussed more in the design for AddCleanup. Simultaneously, given that we control goroutine scheduling in the runtime, I do think there are ways to make this better in the future without having each package manage its own queue.

Do you want me to start a formal discussion in golang/go or will you take it from here?
I think the issues I filed are sufficient, but feel free to file more if you disagree.

Thanks again.
Thank you for reporting back to this thread! 

Gavra

unread,
Mar 19, 2025, 11:55:21 AMMar 19
to golang-nuts
Thanks.
The routine is visible on debug=2 but the frame of mfinal.go is missing.
The metrics approach sounds great (provided a low performance footprint).

Yes, great I've just reviewed them. Thank you for your help.

Reply all
Reply to author
Forward
0 new messages