File corrupted while reading record: "/var/gerrit/cache/git_tags-v2.mv.db"


Sumit Singh

Feb 4, 2026, 6:04:40 PM
to Repo and Gerrit Discussion
Hi, 

We are running Gerrit version 3.12.3. Last week we noticed that Gerrit was down and the error log file was full of various warning and error messages. One of the errors is mentioned in the subject line.
----------------------------------
{"@timestamp":"2026-01-24T04:04:45.447Z","source_host":"Host_IP","message":"Cannot read cache jdbc:h2:file:///var/gerrit/cache/persisted_projects-v2 for project: \"device/google_car\"\nrevision: \"\\221\\210\\003\\272\\212z.L\\373\\371\\345P\\263~\\262\\373\\312\\270\\322%\"\n\n[CONTEXT SSH_SESSION=\"cdf05c90\" TRACE_ID=\"1760125355562-3c402b56\" request=\"SSH\" ]","file":"H2CacheImpl.java","line_number":"504","class":"com.google.gerrit.server.cache.h2.H2CacheImpl$SqlStore","method":"getIfPresent","logger_name":"com.google.gerrit.server.cache.h2.H2CacheImpl","mdc":{},"ndc":"","level":"WARN","thread_name":"SSH gerrit ls-projects -b dev/update-26 ..  big stack trace

{"@timestamp":"2026-01-24T04:04:45.470Z","source_host":"Host_IP","message":"Cannot put into cache jdbc:h2:file:///var/gerrit/cache/persisted_projects-v2 [CONTEXT SSH_SESSION=\"cdf05c90\" ... 
-----------------------------------------
Later we saw an internal server error and a Java heap space error.
-----------------------------------------
{"@timestamp":"2026-01-24T07:07:57.873Z","source_host":"Host_IP","message":"Internal server error (user arden account 2000049) during gerrit ls-projects -b dev/update-26 .... {other many more branches} --type code --all --format json_compact (arden)","@version":2,"exception":{"stacktrace":"com.google.common.util.concurrent.ExecutionError: java.lang.OutOfMemoryError: Java heap space\n\tat com.github.benmanes.caffeine.guava.CaffeinatedGuavaLoadingCache.get(CaffeinatedGuavaLoadingCache.java:67)\n\tat
---------------------------------------------------
We restarted the Gerrit service, which brought it back up for one day; the next day we noticed H2 database corruption for a few of the cache databases.
-----------------------------------------------------
[2026-01-26T09:22:04.574Z] [SSH git-upload-pack project/infra/jobs (arden)] WARN  com.google.gerrit.server.cache.h2.H2CacheImpl : Cannot read cache jdbc:h2:file:///var/gerrit/cache/git_tags-v2 for project/infra/jobs [CONTEXT TRACE_ID="1767458944483-7480c9fc" project="project/infra/jobs" request="GIT_UPLOAD" ]
org.h2.jdbc.JdbcSQLNonTransientConnectionException: File corrupted while reading record: "/var/gerrit/cache/git_tags-v2.mv.db". Possible solution: use the recovery tool [90030-232]
 at org.h2.message.DbException.getJdbcSQLException(DbException.java:690)
--------------------------------------------------
We deleted the corrupted databases from the cache folder and restarted the Gerrit container again; that seems to have resolved the problem.
We were investigating the root cause of this issue, but we could not establish a concrete relation between how it started and what caused these errors.

We need your support in this case, and suggestions/solutions to avoid future occurrences of a similar issue.

Thank you. 

Matthias Sohn

Feb 4, 2026, 6:12:36 PM
to Sumit Singh, Repo and Gerrit Discussion
This may have been caused by an unclean shutdown of the gerrit process.
We also experienced that a few times. We are running gerrit master in a k8s-gerrit HA setup.
It seems h2 v2 isn't rock solid in case of unclean shutdowns.
 


Sumit Singh

Feb 5, 2026, 6:08:19 AM
to Repo and Gerrit Discussion
Hi Matthias, thank you for your response.

This may have been caused by an unclean shutdown of the gerrit process.
"unclean shutdown" - Does this mean whenever Gerrit service down (for reason like java heap space or any other) , there is chance of h2 db corruption ?

We upgraded to Gerrit version 3.12.3 two months back; this H2 database corruption error was seen for the first time after the upgrade. We also noticed the info message below started appearing in the error log after the upgrade. Gerrit is functioning properly, but we still see this message continuously for different projects.
-------------------------
{"@timestamp":"2026-01-24T03:58:52.772Z","source_host":"Host_IP","message":"Performing visibility check for all refs. This can be expensive. [CONTEXT ratelimit_period=\"1 SECONDS\" skipped=54 TRACE_ID=\"17695898742760-e17ft4te\" project=\"project/infra/jobs\" request=\"GIT_UPLOAD\" ]","file":"DefaultRefFilter.java","line_number":"223","class":"com.google.gerrit.server.permissions.DefaultRefFilter","method":"filterRefs","logger_name":"com.google.gerrit.server.permissions.DefaultRefFilter","mdc":{},"ndc":"","level":"INFO","thread_name":"SSH git-upload-pack /project/infra/jobs (arden)","@version":2}
--------------------------
Due to the big stack trace messages our error log file grew huge, which caused the log mount point /var/log/gerrit to fill up, and we also saw the error "No space left on device".
My question is: could this cause the H2 database corruption, since the cache files are also mounted at /var/gerrit/cache on the VM?

Matthias Sohn

Feb 5, 2026, 7:31:00 AM
to Sumit Singh, Repo and Gerrit Discussion
On Thu, Feb 5, 2026 at 12:08 PM Sumit Singh <sumit...@gmail.com> wrote:
Hi Matthias, thank you for your response.

This may have been caused by an unclean shutdown of the gerrit process.
"unclean shutdown" - Does this mean whenever Gerrit service down (for reason like java heap space or any other) , there is chance of h2 db corruption ?

We upgraded to Gerrit version 3.12.3 two months back; this H2 database corruption error was seen for the first time after the upgrade. We also noticed the info message below started appearing in the error log after the upgrade. Gerrit is functioning properly, but we still see this message continuously for different projects.
-------------------------
{"@timestamp":"2026-01-24T03:58:52.772Z","source_host":"Host_IP","message":"Performing visibility check for all refs. This can be expensive. [CONTEXT ratelimit_period=\"1 SECONDS\" skipped=54 TRACE_ID=\"17695898742760-e17ft4te\" project=\"project/infra/jobs\" request=\"GIT_UPLOAD\" ]","file":"DefaultRefFilter.java","line_number":"223","class":"com.google.gerrit.server.permissions.DefaultRefFilter","method":"filterRefs","logger_name":"com.google.gerrit.server.permissions.DefaultRefFilter","mdc":{},"ndc":"","level":"INFO","thread_name":"SSH git-upload-pack /project/infra/jobs (arden)","@version":2}
--------------------------

This is a warning from DefaultRefFilter which is emitted at most once per minute 
if it needs to do a full scan of all refs to check their visibility for the user who triggered the operation.
This is a performance warning since these scans can take a long time on large repos;
it is unrelated to corruption of the persistent caches.
 
Due to the big stack trace messages our error log file grew huge, which caused the log mount point /var/log/gerrit to fill up, and we also saw the error "No space left on device".
My question is: could this cause the H2 database corruption, since the cache files are also mounted at /var/gerrit/cache on the VM?

Put the log directory on a volume which has enough space, and monitor free space of the important volumes
so that you can react and add more disk space or prune old log files before the server stops working when no space is left.
You can configure gerrit to rotate logs, compress them on rotation and prune old log files; see the [log] section of the Gerrit configuration documentation.
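As a minimal illustration (the option names below come from the [log] section of the Gerrit configuration documentation; both default to true, so this mainly documents the intent):

[log]
  # rotate log files daily at midnight
  rotate = true
  # compress log files when they are rotated
  compress = true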

Check the H2 trace files which end up in the cache directory; on k8s-gerrit deployments they are in /var/gerrit/cache.
H2 trace files have names like git_modified_files-v2.trace.db.

For us tuning graceful shutdown options helped.

We set in gerrit.config 
httpd.gracefulStopTimeout=2m
sshd.gracefulStopTimeout=2m

for the shutdown of gerrit pods in gerrit.yaml of k8s-gerrit deployment (this is in seconds)
gracefulStopTimeout: 300

and k8s probes
        startupProbe:
          initialDelaySeconds: 10
          periodSeconds: 30
          timeoutSeconds: 80
          successThreshold: 1
          failureThreshold: 7
        readinessProbe:
          initialDelaySeconds: 60
          periodSeconds: 10
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 3
        livenessProbe:
          initialDelaySeconds: 60
          periodSeconds: 10
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 3

Sumit Singh

Feb 5, 2026, 10:18:58 AM
to Repo and Gerrit Discussion
This is a warning from DefaultRefFilter which is emitted at most once per minute 
if it needs to do a full scan of all refs to check their visibility for the user who triggered the operation.
This is a performance warning since these scans can take a long time on large repos;
it is unrelated to corruption of the persistent caches.

I guess it is emitting every second, per the latest container log. Is there any option to suppress or get rid of this check?
--------------------------------------------------------------------
[2026-02-05T15:07:13.367Z] [SSH git-upload-pack sigma/build (AutoBot)] INFO  com.google.gerrit.server.permissions.DefaultRefFilter : Performing visibility check for all refs. This can be expensive. [CONTEXT ratelimit_period="1 SECONDS" skipped=9 TRACE_ID="177052013633366-79cf5g54" project="sigma/build" request="GIT_UPLOAD" ]
[2026-02-05T15:07:14.420Z] [SSH git-upload-pack project/infra/jobs (arden)] INFO  com.google.gerrit.server.permissions.DefaultRefFilter : Performing visibility check for all refs. This can be expensive. [CONTEXT ratelimit_period="1 SECONDS" skipped=12 TRACE_ID="17703045896231-515895c" project="project/infra/jobs" request="GIT_UPLOAD" ]
[2026-02-05T15:07:15.575Z] [SSH git-upload-pack project/infra/jobs (arden)] INFO  com.google.gerrit.server.permissions.DefaultRefFilter : Performing visibility check for all refs. This can be expensive. [CONTEXT ratelimit_period="1 SECONDS" skipped=15 TRACE_ID="1770304054863-d64d9b78" project="project/infra/jobs" request="GIT_UPLOAD" ]
----------------------------------------------------------------

Check the H2 trace files which end up in the cache directory; on k8s-gerrit deployments they are in /var/gerrit/cache.
H2 trace files have names like git_modified_files-v2.trace.db.

In the git_modified_files-v2.trace.db file:
-------------------------------------------
2026-01-24 07:30:46.630535Z database: flush
org.h2.message.DbException: Out of memory. [90108-232]
at org.h2.message.DbException.get(DbException.java:212)
at org.h2.message.DbException.convert(DbException.java:401)
at org.h2.mvstore.db.Store.lambda$new$0(Store.java:122)
at org.h2.mvstore.MVStore.handleException(MVStore.java:1546)
at org.h2.mvstore.FileStore.writeInBackground(FileStore.java:1847)
at org.h2.mvstore.FileStore$BackgroundWriterThread.run(FileStore.java:2256)
Caused by: org.h2.jdbc.JdbcSQLNonTransientConnectionException: Out of memory. [90108-232]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:690)
at org.h2.message.DbException.getJdbcSQLException(DbException.java:489)
... 6 more
Caused by: java.lang.OutOfMemoryError: Java heap space
-------------------------------------------
I remember we removed 2-3 database files like git_tags-v2.mv.db and persisted_projects-v2.mv.db; only then did Gerrit start working again.

Matthias Sohn

Feb 5, 2026, 11:25:06 AM
to Sumit Singh, Repo and Gerrit Discussion
On Thu, Feb 5, 2026 at 4:19 PM Sumit Singh <sumit...@gmail.com> wrote:
This is a warning from DefaultRefFilter which is emitted at most once per minute 

Sorry, I was wrong: this message is emitted at most once per second; this throttling was introduced in 3.12.0.
This might have caused the h2 database corruption you observed since OOM causes the JVM to terminate abruptly.
To avoid OOM, either increase the heap size or reduce the load, e.g. by lowering the size of thread pools or adding
more pods to distribute the load across more JVM processes if you use e.g. a k8s-gerrit HA setup.
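As an illustration only (these are standard gerrit.config option names, but the values are made up and need to be sized for your instance and workload):

[container]
  # maximum JVM heap size (-Xmx) for the Gerrit process
  heapLimit = 64g
[sshd]
  # thread pool serving interactive SSH requests
  threads = 16
  # separate, smaller pool for non-interactive/batch users such as CI accounts
  batchThreads = 8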
 

Sumit Singh

Feb 6, 2026, 8:36:13 AM
to Repo and Gerrit Discussion
This might have caused the h2 database corruption you observed since OOM causes the JVM to terminate abruptly. 

Thank you for the explanation. Now it is clearer to me what happened.
Could "Performing visibility check for all refs" cause a JVM OOM in the future, since we are still seeing this message in the log?

To avoid OOM, either increase the heap size or reduce the load, e.g. by lowering the size of thread pools or adding
more pods to distribute the load across more JVM processes if you use e.g. a k8s-gerrit HA setup.

We have a Docker container on a VM, with 64 GB allocated to the JVM out of 128 GB, and the monitoring graph shows a peak usage of 80%.

Matthias Sohn

Feb 6, 2026, 4:57:33 PM
to Sumit Singh, Repo and Gerrit Discussion
On Fri, Feb 6, 2026 at 2:36 PM Sumit Singh <sumit...@gmail.com> wrote:
This might have caused the h2 database corruption you observed since OOM causes the JVM to terminate abruptly. 

Thank you for the explanation. Now it is clearer to me what happened.
Could "Performing visibility check for all refs" cause a JVM OOM in the future, since we are still seeing this message in the log?

I think this is an unrelated warning and don't expect this to cause OOM.
First of all, scanning all refs can take a long time on a large repo since it needs to read data from disk.

To avoid OOM, either increase the heap size or reduce the load, e.g. by lowering the size of thread pools or adding
more pods to distribute the load across more JVM processes if you use e.g. a k8s-gerrit HA setup.

We have a Docker container on a VM, with 64 GB allocated to the JVM out of 128 GB, and the monitoring graph shows a peak usage of 80%.

From these few numbers I can't tell what might have caused the OOM.
Could be caused by a single giant clone request or too many requests from a farm of CI servers.
Often the bulk of the load is caused by clone requests on large repos (>1GB).

You can check the logs; Gerrit logs the allocated memory in bytes per request in the log field "memory".

Another important metric is the percentage of CPU the JVM spends on Java GC.
You can e.g. use the gerrit-monitoring setup to collect and analyse such metrics.

Please avoid top posting and instead use interleaved posting style on this list 
which helps following the conversation.

-Matthias 
