I have a possible corruption of the leveldb transaction log. This
happened when the disk was almost full, so some writes were succeeding
while some were failing. Please let me know whether the scenario
described below is a feasible cause of this corruption.

One thread ran CompactMemTable, which successfully flushed the
current memtable to a new file F1 via a call to WriteLevel0Table.
It then invoked VersionSet::LogAndApply() to record the newly
created file in the MANIFEST file.

The LogAndApply method first invoked descriptor_log_->AddRecord()
which successfully recorded this transaction in the MANIFEST file.
Then it invoked descriptor_file_->Fsync(), which failed. This caused
LogAndApply to return an error without updating the current in-memory
version. This means that the in-memory version of the database does
not have file F1 whereas F1 is recorded inside a valid transaction in
the MANIFEST file. A subsequent invocation of DeleteObsoleteFiles
found what looked like a redundant F1 file on disk and deleted it,
because F1 did not belong to the set of live files in the current
version.

Now, when the database is restarted, it reads in the contents from the
MANIFEST file and looks for file F1 on disk which it cannot find. Bad!
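
To make the sequence above concrete, here is a toy model of it. This
is only a sketch: ToyDb and the Toy* functions are hypothetical and
are not leveldb code. They mimic the ordering described above
(append the record, sync, install the new version only on success),
and they assume the appended record survives on disk even though the
sync reported failure.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy model of the state involved; none of these types exist in leveldb.
struct ToyDb {
  std::set<std::string> disk;         // table files present on disk
  std::vector<std::string> manifest;  // files recorded by MANIFEST records
  std::set<std::string> live;         // files in the current in-memory version
};

// Mimics the LogAndApply ordering described above: append the record,
// then sync, and install the new version only if the sync succeeds.
bool ToyLogAndApply(ToyDb& db, const std::string& file, bool sync_ok) {
  db.manifest.push_back(file);  // the AddRecord step succeeds
  if (!sync_ok) return false;   // sync fails: in-memory version untouched
  db.live.insert(file);
  return true;
}

// Mimics DeleteObsoleteFiles: drop every on-disk file that is not live.
void ToyDeleteObsoleteFiles(ToyDb& db) {
  for (auto it = db.disk.begin(); it != db.disk.end();) {
    if (db.live.count(*it) == 0) {
      it = db.disk.erase(it);
    } else {
      ++it;
    }
  }
}

// Mimics recovery: returns the files that the MANIFEST references but
// that are no longer present on disk.
std::vector<std::string> ToyMissingOnRecovery(const ToyDb& db) {
  std::vector<std::string> missing;
  for (const auto& f : db.manifest) {
    if (db.disk.count(f) == 0) missing.push_back(f);
  }
  return missing;
}

// Replays the scenario end to end and reports what recovery cannot find.
std::vector<std::string> ReplayScenario() {
  ToyDb db;
  db.disk.insert("F1");              // the memtable flush succeeds
  assert(db.disk.count("F1") == 1);  // F1 is on disk at this point
  ToyLogAndApply(db, "F1", /*sync_ok=*/false);
  ToyDeleteObsoleteFiles(db);        // F1 is not live, so it is removed
  return ToyMissingOnRecovery(db);
}
```

In this model ReplayScenario() reports F1 as missing, which matches
the failure I see on restart. Passing sync_ok=true instead leaves F1
both live and on disk, so nothing is reported.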
Comments/feedback are appreciated.
dhruba
--
Subscribe to my posts at
http://www.facebook.com/dhruba