1) Compaction seems incapable of compacting multiple files into a single file. That would be a really nice feature, as otherwise you end up with 10k tiny files :)
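Roughly what I'm asking for, ignoring the index side of it (this is just a sketch; it assumes records are self-contained so small cleaned segments can simply be streamed into one bigger file):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Sketch of many-to-one compaction: stream several small segment files into one
// output file with transferTo. The in-memory index would still need repointing
// to (mergedFile, oldOffset + rebase) for every live record.
final class SegmentMerge {
    static void mergeInto(List<Path> smallSegments, Path merged) throws IOException {
        try (FileChannel out = FileChannel.open(merged,
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            for (Path seg : smallSegments) {
                try (FileChannel in = FileChannel.open(seg, StandardOpenOption.READ)) {
                    long pos = 0, size = in.size();
                    while (pos < size) {
                        pos += in.transferTo(pos, size - pos, out);  // zero-copy on most OSes
                    }
                }
            }
            out.force(true);  // see point 6: make it durable before swapping it in
        }
    }
}
```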
2) I saw multi-threaded compaction and got really scared: this can even kill the ability to write due to write amplification. More likely you want to be able to slow compaction down under load. This also ties into #1, as compactions are usually not single-file (though still naively parallelizable).
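What I'd rather see is a knob like this, where the compactor asks for permission before each chunk of copy I/O and the rate can be turned down while the store is under write load (a sketch, not tied to this codebase):

```java
// Sketch of a byte-rate throttle for compaction: the compaction thread calls
// acquire(bytes) before each copy and sleeps whenever it gets ahead of the
// configured rate. bytesPerSecond is assumed > 0 and can be lowered at runtime.
final class CompactionThrottle {
    private volatile long bytesPerSecond;
    private long windowStartNanos = System.nanoTime();
    private long bytesInWindow = 0;

    CompactionThrottle(long bytesPerSecond) { this.bytesPerSecond = bytesPerSecond; }

    void setRate(long bytesPerSecond) { this.bytesPerSecond = bytesPerSecond; }

    synchronized void acquire(long bytes) throws InterruptedException {
        bytesInWindow += bytes;
        double elapsedSec = (System.nanoTime() - windowStartNanos) / 1e9;
        double allowed = elapsedSec * bytesPerSecond;
        if (bytesInWindow > allowed) {
            long sleepMs = (long) ((bytesInWindow - allowed) / bytesPerSecond * 1000);
            Thread.sleep(sleepMs);
        }
        if (elapsedSec > 1.0) {            // reset so an idle period can't bank a huge burst
            windowStartNanos = System.nanoTime();
            bytesInWindow = 0;
        }
    }
}
```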
3) On the way things are made durable: this is a trade-off between latency and throughput. In another system I worked on we came up with a dead simple way of balancing the two that may be useful: queue batches. When the writer finishes a write and the queue is empty, fsync immediately; if not, keep taking writes off the queue for up to a period of time (we base that period on how long an fsync takes).
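Roughly what that looks like as a sketch (the single writer thread, the queue type and the names here are my assumptions, not how this codebase is structured):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical single-writer loop: fsync immediately when the queue is empty,
// otherwise keep draining pending writes for roughly one fsync's worth of time,
// so latency stays low when idle and throughput rises under load.
final class GroupCommitWriter implements Runnable {
    private final FileChannel log;  // already-open journal file
    private final BlockingQueue<ByteBuffer> pending = new LinkedBlockingQueue<>();
    private long lastFsyncNanos = 1_000_000;  // running estimate of fsync cost (~1 ms to start)

    GroupCommitWriter(FileChannel log) { this.log = log; }

    void submit(ByteBuffer record) { pending.add(record); }

    @Override public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                ByteBuffer record = pending.take();        // block until there is work
                log.write(record);
                if (!pending.isEmpty()) {
                    // writers are queued behind us: batch them, but only wait for new
                    // ones for about as long as a single fsync takes
                    long deadline = System.nanoTime() + lastFsyncNanos;
                    ByteBuffer next;
                    while ((next = pending.poll(deadline - System.nanoTime(),
                                                TimeUnit.NANOSECONDS)) != null) {
                        log.write(next);
                    }
                }
                long start = System.nanoTime();
                log.force(false);                          // one fsync covers the whole batch
                lastFsyncNanos = System.nanoTime() - start;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (IOException e) {
            throw new RuntimeException(e);                 // real code: fail the pending writes
        }
    }
}
```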
4) With the use of byte[], my guess is it will take down the JVM under load with larger messages (large per-message allocations and GC pressure).
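What I'd expect to help is streaming large values through a small reusable buffer instead of allocating a byte[] per message; something like this (a sketch, the record layout and names are assumed):

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Sketch: copy a large value out of the data file through a fixed 64 KB buffer
// rather than materializing the whole message as one byte[]. One instance per
// reader thread (the scratch buffer is reused, so this is not thread-safe).
final class ValueStreamer {
    private final byte[] chunk = new byte[64 * 1024];
    private final ByteBuffer view = ByteBuffer.wrap(chunk);

    void copyValue(FileChannel data, long offset, long length, OutputStream out) throws IOException {
        long remaining = length, pos = offset;
        while (remaining > 0) {
            view.clear().limit((int) Math.min(chunk.length, remaining));
            int n = data.read(view, pos);
            if (n < 0) throw new IOException("unexpected EOF at offset " + pos);
            out.write(chunk, 0, n);
            pos += n;
            remaining -= n;
        }
    }
}
```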
5) Quite a few places delete data. We have a rule to never delete without first making a copy (in case of bugs, etc.). A reordered write (or a bad sector) could cause an entire journal chunk to be deleted.
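The cheapest version of that rule is just a rename into a trash directory instead of an actual delete (a sketch; the names and the retention policy are made up):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.time.Instant;

// Sketch: instead of Files.delete(), move the file into a trash directory on the
// same filesystem (so it is just a rename) and let a separate janitor purge
// anything older than a retention window once you trust the new data.
final class SafeDelete {
    private final Path trashDir;

    SafeDelete(Path trashDir) throws IOException {
        this.trashDir = Files.createDirectories(trashDir);
    }

    void delete(Path file) throws IOException {
        Path dest = trashDir.resolve(file.getFileName() + "." + Instant.now().toEpochMilli());
        Files.move(file, dest, StandardCopyOption.ATOMIC_MOVE);
    }
}
```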
6) When switching temporary files in compaction, where you need fsyncs is highly OS-specific (there are none now, so this can corrupt the database). E.g. does closing the handle sync? Does a copy ensure an fsync? A rename? etc. We found a ton of bugs with this when doing power-pull tests on various operating systems.
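The pattern we ended up with looks roughly like this (a sketch; the directory fsync in step 3 is Linux-specific, and other platforms need their own handling):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for publishing a compacted temp file over the old one.
// The explicit force() calls are the part that is easy to miss: whether close(),
// copy or rename implies a sync is OS/filesystem specific, so don't rely on it.
final class AtomicPublish {
    static void publish(Path tmp, Path target) throws IOException {
        // 1) flush the temp file's contents to stable storage
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
            ch.force(true);
        }
        // 2) atomically rename it over the old file (on POSIX this replaces the target)
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        // 3) fsync the parent directory so the rename itself survives a power pull
        //    (works on Linux; some platforms won't let you open a directory like this)
        try (FileChannel dir = FileChannel.open(target.getParent(), StandardOpenOption.READ)) {
            dir.force(true);
        }
    }
}
```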
7) Adding a footer to the log with an incrementing hash might help detect bad data (if you write it incrementally in the log, you might even be able to save the data written before the bad record).
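Something along these lines (the record framing here is made up; the point is the chained CRC in the footer):

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Sketch of an incrementing hash footer: each record ends with a CRC32 that covers
// the previous record's CRC plus this record's bytes. On replay you recompute the
// chain and stop at the first mismatch, keeping everything written before it.
final class ChainedCrcFramer {
    private long runningCrc = 0;  // carried across appends

    ByteBuffer frame(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(4 + payload.length + 8);
        buf.putInt(payload.length);
        buf.put(payload);
        CRC32 crc = new CRC32();
        crc.update(ByteBuffer.allocate(8).putLong(runningCrc).array());  // chain in previous CRC
        crc.update(buf.array(), 0, 4 + payload.length);
        runningCrc = crc.getValue();
        buf.putLong(runningCrc);   // footer
        buf.flip();
        return buf;
    }
}
```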
Cheers,
Greg