RocksDB corruption during compaction on unresponsive filesystem


Lukas

Feb 22, 2023, 6:04:31 AM2/22/23
to rocksdb
Hello All,
we are running multiple RocksDB instances (RocksDB version 6.29.5) inside a Kubernetes environment where the volumes are backed by an NFSv4 server. We use fsync writes. When the NFS server becomes unresponsive, we still frequently run into database corruption. The corruption seems to happen when the NFS outage occurs during a compaction.

Below is a snippet from the RocksDB LOG.
 
 1/ Number of files at the start of the DB.
 2/ The compaction job started due to LevelL0FilesNum.
 3/ An independent filesystem probe detected that the filesystem was not responding.
 4/ The compaction job still proceeded and eventually failed after detecting a corrupted file on read.
 5/ The next start failed because files generated by the compaction were missing.

We are using fsync writes, so I would expect the compaction to stall or fail if the underlying filesystem is not responding or responding slowly.
The log shows that the compaction proceeds and the failure only appears once the file is read back. Also, the MANIFEST is updated as if the new files were valid.
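Not a fix, but a sketch of existing RocksDB options that can make this kind of corruption surface at write time instead of on a later read. The function name is just for illustration; the option names are real, the values are an assumption, not a tested recommendation:

```cpp
#include <rocksdb/options.h>

// Illustrative helper: tighten durability / verification settings.
rocksdb::Options MakeDurableOptions() {
  rocksdb::Options options;
  options.use_fsync = true;        // use fsync() rather than fdatasync()
  options.paranoid_checks = true;  // default true: surface background errors
  // Re-read and checksum-verify every SST file produced by flush/compaction
  // before it is installed in the MANIFEST. Costs extra read I/O, but it
  // should catch a corrupted compaction output (like 112883.sst here)
  // at write time rather than when the file is read later.
  options.paranoid_file_checks = true;
  return options;
}
```

With `paranoid_file_checks` enabled, a compaction whose output fails verification should fail the job before the MANIFEST references the bad file.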

I am struggling to conclude whether this is an NFS issue (I would expect COMMIT to guarantee durability, and it happens with different NFS server implementations) or whether there is an issue in the compaction itself.
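For post-mortem inspection, the stock RocksDB admin tools can check the suspect files offline. A sketch, assuming `sst_dump` and `ldb` are built from the same RocksDB version; the file number is taken from the log above:

```shell
# Verify block checksums of the compaction output flagged in the log:
sst_dump --file=./data/db/112883.sst --command=verify

# Dump the MANIFEST to see which files the current version expects:
ldb --db=./data/db manifest_dump

# Last resort: rebuild the MANIFEST from the SST files that actually exist
# (drops references to the missing compaction outputs; data loss is possible):
ldb --db=./data/db repair
```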

Thanks for any hints or guidance on what should be checked or configured.
Any experience with running RocksDB on NFS-backed volumes would be highly appreciated too.
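One NFS-side factor worth stating for anyone comparing setups: the client mount options change the failure mode when the server goes away. A hedged sketch (server and export path are illustrative):

```shell
# "soft" mounts can time out and return I/O errors mid-write, which an
# application may not fully propagate; "hard" blocks the writer until the
# server recovers. "noac" disables client-side attribute caching so stale
# metadata is not served while the server is flapping.
mount -t nfs4 -o hard,sync,noac nfsserver:/export/rocksdb /data
```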

Regards,
Lukas

--------------------------8<---------------------
2023/02/21-06:54:45.145261 140319991092928 SST files in ./data/db dir, Total Num: 187

...
2023/02/21-09:01:22.309835 140318485771968 EVENT_LOG_v1 {"time_micros": 1676970082309794, "job": 1307, "event": "compaction_started", "compaction_reason": "LevelL0FilesNum",

...
2023-02-21 09:06:24.192 ERROR 1 --- [Thread-32] Filesystem is not responding (now: 1676970384191,lastWrite: 1676970350653) - killing myself
...

2023/02/21-09:01:32.515167 140318485771968 [/compaction/compaction_job.cc:1945] [default] [JOB 1307] Generated table #112883: 32205 keys, 1606390 bytes
2023/02/21-09:01:32.515248 140318485771968 EVENT_LOG_v1 {"time_micros": 1676970092515208, "cf_name": "default", "job": 1307, "event": "table_file_creation", "file_number": 112883, "file_size": 1606390, "file_checksum": "", "file_checksum_func_name": "Unknown", "table_properties": {"data_size": 1600701, "index_size": 15563, "index_partitions": 1, "top_level_index_size": 67, "index_key_is_user_key": 1, "index_value_is_delta_encoded": 1, "filter_size": 0, "raw_key_size": 1932300, "raw_average_key_size": 60, "raw_value_size": 9776423, "raw_average_value_size": 303, "num_data_blocks": 326, "num_entries": 32205, "num_filter_entries": 0, "num_deletions": 4, "num_merge_operands": 0, "num_range_deletions": 0, "format_version": 0, "fixed_key_len": 0, "filter_policy": "", "column_family_name": "default", "column_family_id": 0, "comparator": "leveldb.BytewiseComparator", "merge_operator": "UInt64AddOperator", "prefix_extractor_name": "nullptr", "property_collectors": "[]", "compression": "ZSTD", "compression_options": "window_bits=-14; level=3; strategy=0; max_dict_bytes=0; zstd_max_train_bytes=0; enabled=0; max_dict_buffer_bytes=0; ", "creation_time": 1673266184, "oldest_key_time": 0, "file_creation_time": 1676970090, "slow_compression_estimated_data_size": 0, "fast_compression_estimated_data_size": 0, "db_id": "aa400e4d-6845-4ba2-8a25-4f2b30d05f22", "db_session_id": "RCVDVSTRFSZ06ENMBJ79", "orig_file_number": 112883}}

....
2023/02/21-09:09:59.323473 140318485771968 [ERROR] [ble/block_based/block_based_table_reader.cc:819] Cannot find Properties block from file.
2023/02/21-09:09:59.323709 140291185047232 [ERROR] [ble/block_based/block_based_table_reader.cc:819] Cannot find Properties block from file.
2023/02/21-09:09:59.325044 140318258120384 [ERROR] [ble/block_based/block_based_table_reader.cc:819] Cannot find Properties block from file.
2023/02/21-09:09:59.327625 140242633881280 [ERROR] [ble/block_based/block_based_table_reader.cc:819] Cannot find Properties block from file.
2023/02/21-09:09:59.566947 140318485771968 [WARN] [/db_impl/db_impl_compaction_flush.cc:3438] Compaction error: Corruption: block checksum mismatch: stored = 2341636434, computed = 3753742046, type = 1  in ./data/db/112883.sst offset 0 size 0
2023/02/21-09:09:59.566993 140318485771968 [/error_handler.cc:283] ErrorHandler: Set regular background error
2023/02/21-09:09:59.567403 140318485771968 (Original Log Time 2023/02/21-09:09:59.401480) [/compaction/compaction_job.cc:962] [default] compacted to: files[8 109 72 6 0 0 0] max score 0.92, MB/sec: 1.2 rd, 1.2 wr, level 1, files in(6, 103) out(104 +0 blob) MB in(71.1, 511.3 +0.0 blob) out(570.9 +0.0 blob), read-write-amplify(16.2) write-amplify(8.0) Corruption: block checksum mismatch: stored = 2341636434, computed = 3753742046, type = 1  in ./data/db/112883.sst offset 0 size 0, records in: 10461313, records dropped: 0 output_compression: ZSTD
2023/02/21-09:09:59.567407 140318485771968 (Original Log Time 2023/02/21-09:09:59.566894) EVENT_LOG_v1 {"time_micros": 1676970599401503, "job": 1307, "event": "compaction_finished", "compaction_time_micros": 516932372, "compaction_time_cpu_micros": 32009117, "output_level": 1, "num_output_files": 104, "total_output_size": 598603982, "num_input_records": 10461313, "num_output_records": 26975748, "num_subcompactions": 4, "output_compression": "ZSTD", "num_single_delete_mismatches": 0, "num_single_delete_fallthrough": 0, "lsm_state": [8, 109, 72, 6, 0, 0, 0]}
2023/02/21-09:09:59.567411 140318485771968 [ERROR] [/db_impl/db_impl_compaction_flush.cc:2936] Waiting after background compaction error: Corruption: block checksum mismatch: stored = 2341636434, computed = 3753742046, type = 1  in ./data/db/112883.sst offset 0 size 0, Accumulated background error counts: 1

...
2023/02/21-09:13:08.901184 140431150794432 SST files in ./data/db dir, Total Num: 18, files: 089704.sst 111184.sst 111197.sst 111221.sst 111251.sst 111276.sst 111299.sst 111481.sst 111600.sst

...
2023/02/21-09:13:08.940462 140431150794432 [ERROR] [/version_set.cc:2470] Unable to load table properties for file 112948 --- IO error: No such file or directory: While open a file for random read: ./data/db/112948.sst: No such file or directory
2023/02/21-09:13:08.940477 140431150794432 [ERROR] [/version_set.cc:2470] Unable to load table properties for file 112939 --- IO error: No such file or directory: While open a file for random read: ./data/db/112939.sst: No such file or directory
2023/02/21-09:13:08.940484 140431150794432 [ERROR] [/version_set.cc:2470] Unable to load table properties for file 112877 --- IO error: No such file or directory: While open a file for random read: ./data/db/112877.sst: No such file or directory
....
--------------------------8<---------------------

Dan Carfas

Feb 22, 2023, 7:24:54 AM2/22/23
to rocksdb
Hi Lukas,

I shared your question with the Speedb hive, and this is the reply we have for you from Hilik, our co-founder and chief scientist:

Lukas, the issue of recovery after hardware failure is one of the top items on our priority list. We will investigate this problem and would love to work with you on verifying the solution. If you like, please open a PR with as much info as you can collect, and we will try to recreate and resolve it.

Join the Speedb hive, link the thread with your question there, and have a look at the Speedb OSS repo.
Let us know if you need any assistance or additional info.
Dan