RocksDB checkpoint created zero-byte WAL file


Burak Yavuz

Aug 1, 2022, 4:29:13 PM
to rocksdb
Hi all,

We use RocksDB's Java interface. We periodically create a checkpoint of RocksDB and persist the WAL and SST files to cloud storage incrementally.

In one of our data pipelines, we noticed that one of our periodic checkpoints resulted in:
 - one 0-byte .log file
 - two non-zero .log files
 - no new SSTs

We noticed that there was a segfault at that time:
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f898159e3f5, pid=773, tid=0x00007f897a34a700
#
# JRE version: OpenJDK Runtime Environment (Zulu 8.56.0.21-CA-linux64) (8.0_302-b08) (build 1.8.0_302-b08)
# Java VM: OpenJDK 64-Bit Server VM (25.302-b08 mixed mode linux-amd64 )
# Problematic frame:
# C  [librocksdbjni4830164748909826191.so+0x6403f5]  rocksdb::ThreadLocalPtr::Swap(void*)+0x15
#

Then, when we tried to recover the state, the job started failing with:
Caused by: org.rocksdb.RocksDBException: Corruption: IO error: No such file or directory: While open a file for random read: /local_disk0/tmp/spark-fd4fcecf-6190-44ba-83bf-c6442b279775/395955.ldb: No such file or directory
    at org.rocksdb.OptimisticTransactionDB.open(Native Method)
    at org.rocksdb.OptimisticTransactionDB.open(OptimisticTransactionDB.java:40)

I'd like to add some defense in depth in case we have such issues in the future to avoid corrupting our persistent backup. My questions are:
1. Is it expected, or even remotely possible, to have 0-byte WAL and SST files? My guess is no.
2. After creating a Checkpoint, if we see new WAL files, should we expect to see new SST files as well? My guess is yes.

Thanks for your help in advance!

Best,
Burak

Siying Dong

Aug 4, 2022, 4:42:38 PM
to Burak Yavuz, rocksdb, Peter Dillinger

There is no chance of a zero-size SST file in a checkpoint. Any SST file recorded in the manifest file should have contents.

It is possible that there are .log files and no new SST file. In this case, the log file has not yet been flushed to an SST file.

I'm not 100% sure, but from the source code it seems that it is possible to have a 0-size .log file in a checkpoint. My teammate, Peter (CCed), knows best.
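
As a rough illustration of the defense-in-depth check asked about earlier in the thread, the Java sketch below treats a zero-byte .sst file as an error and a zero-byte .log file only as a warning, in line with the answer above. The class and method names (CheckpointSanityCheck, findProblems, emptyWalFiles, listSizes) are hypothetical and not part of RocksDB. The sketch only looks at file names and sizes, and it assumes the checkpoint directory is flat.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical pre-upload sanity check for a checkpoint directory (not a RocksDB API). */
final class CheckpointSanityCheck {

    /** Returns one message per zero-byte .sst file, sorted by file name. */
    static List<String> findProblems(Map<String, Long> fileSizes) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Long> e : new TreeMap<>(fileSizes).entrySet()) {
            if (e.getKey().endsWith(".sst") && e.getValue() == 0L) {
                problems.add("zero-byte SST: " + e.getKey());
            }
        }
        return problems;
    }

    /** Returns the names of zero-byte .log files; these are warnings, not errors. */
    static List<String> emptyWalFiles(Map<String, Long> fileSizes) {
        List<String> empty = new ArrayList<>();
        for (Map.Entry<String, Long> e : new TreeMap<>(fileSizes).entrySet()) {
            if (e.getKey().endsWith(".log") && e.getValue() == 0L) {
                empty.add(e.getKey());
            }
        }
        return empty;
    }

    /** Collects name -> size for the regular files directly under dir. */
    static Map<String, Long> listSizes(Path dir) {
        Map<String, Long> sizes = new TreeMap<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    sizes.put(p.getFileName().toString(), Files.size(p));
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return sizes;
    }
}
```

A caller could run findProblems(listSizes(checkpointDir)) before uploading and skip the upload when the returned list is non-empty, while only logging whatever emptyWalFiles returns.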

Regarding the error itself, does the file /local_disk0/tmp/spark-fd4fcecf-6190-44ba-83bf-c6442b279775/395955.sst exist in your directory? The error means that RocksDB expects the file to exist there, but it doesn't. You should be able to build the ldb tool and use the "ldb manifest_dump" command to print out the contents of the MANIFEST-xxxx file and see which files are expected (you may need to add --hex if keys are not printable, and perhaps --verbose to see each historic update).
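
Once the expected file numbers are known (for example, read by hand from the manifest dump), a small check like the following Java sketch can report which table files are missing from a directory listing. The class ExpectedTableFiles and its method are hypothetical. The sketch assumes table files are named with a six-digit zero-padded file number plus ".sst", and it also accepts the legacy ".ldb" suffix that appears in the error message above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: compares expected table file numbers with the names present. */
final class ExpectedTableFiles {

    /** Returns the expected file numbers with neither an .sst nor an .ldb file present. */
    static List<Long> missing(Collection<Long> expectedNumbers, Set<String> presentNames) {
        List<Long> missing = new ArrayList<>();
        for (long number : expectedNumbers) {
            // Assumed naming scheme: zero-padded to six digits, e.g. 000012.sst
            String sst = String.format("%06d.sst", number);
            String ldb = String.format("%06d.ldb", number);
            if (!presentNames.contains(sst) && !presentNames.contains(ldb)) {
                missing.add(number);
            }
        }
        return missing;
    }
}
```

Running such a check against the restored directory before opening the database would surface a missing file such as 395955 as an explicit list entry, not as a failure at open time.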

It might be a little hard to figure out what caused the rocksdb::ThreadLocalPtr::Swap() segfault. Either way, a crash isn't expected to cause DB corruption like what you've seen.

Burak Yavuz

Aug 5, 2022, 2:30:41 PM
to Siying Dong, rocksdb, Peter Dillinger
Thank you Siying!

> It is possible that there are .log files and no new SST file. In this case, the log file has not yet been flushed to an SST file.

We explicitly call flush before we create the Checkpoint, so I assume this still shouldn't be expected.

> it is possible to have a 0-size .log file in a checkpoint.

@Peter, would love it if you can confirm.

Best,
Burak
