Chronicle Queue Crash Safety Guarantees

Drew Kutcharian

Mar 11, 2015, 5:23:19 PM
to java-ch...@googlegroups.com
Hi,

I'm new to Chronicle and I'm experimenting with a few things. What are the crash safety guarantees of VanillaChronicle? Scenarios that come to mind:

1. What happens when the appender crashes in the middle of a write?

2. Is there a possibility where index and data would be out of sync?

3. What happens when the tailer updates a record? (during marking a record processed)

4. Do I need to store CRC32 of the data in Chronicle to make sure the data is not corrupted?

My use case is to build an in-process message queue, where payload is a variable length byte[]. I'm currently thinking of using the following structure:
4 bytes, int, status (NEW, PROCESSED)
8 bytes, long, CRC32(payload)
4 bytes, int, payload.length
n bytes, byte[], payload

To process a record, I check that the status is NEW, then read the message and compute its CRC32 to make sure it is not corrupted. If the checksum matches, I process the message and set the status to PROCESSED once processing is done.
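As a rough illustration of the layout described above, here is a minimal sketch using only the JDK (java.nio.ByteBuffer and java.util.zip.CRC32). The class name RecordCodec, its method names, and the numeric values chosen for NEW and PROCESSED are assumptions made for this example; nothing here is Chronicle API, and the ByteBuffer merely stands in for a record's bytes.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical helper: encodes and validates the 4 + 8 + 4 + n byte layout.
final class RecordCodec {
    static final int STATUS_NEW = 0;        // assumed value
    static final int STATUS_PROCESSED = 1;  // assumed value
    static final int HEADER_BYTES = 4 + 8 + 4;

    static long crc32(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    // Builds: status (int), CRC32 (long), length (int), payload (byte[]).
    static ByteBuffer encode(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_BYTES + payload.length);
        buf.putInt(STATUS_NEW);
        buf.putLong(crc32(payload));
        buf.putInt(payload.length);
        buf.put(payload);
        buf.flip();
        return buf;
    }

    // Returns the payload if the record is NEW and its checksum matches, else null.
    static byte[] readIfNewAndValid(ByteBuffer record) {
        ByteBuffer buf = record.duplicate(); // leave the caller's position untouched
        if (buf.getInt() != STATUS_NEW) {
            return null;
        }
        long expectedCrc = buf.getLong();
        int length = buf.getInt();
        if (length < 0 || length > buf.remaining()) {
            return null; // a corrupted length field must not cause an overrun
        }
        byte[] payload = new byte[length];
        buf.get(payload);
        return crc32(payload) == expectedCrc ? payload : null;
    }

    // Overwrites the status field (first 4 bytes) in place.
    static void markProcessed(ByteBuffer record) {
        record.putInt(0, STATUS_PROCESSED);
    }
}
```

A consumer would call readIfNewAndValid, process the returned payload, and then call markProcessed. A crash between those two steps leaves the record NEW, so it would be processed again after a restart.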

Obviously there are multiple places where the above process can fail, so I'm trying to figure out what scenarios I need to look out for.

Additionally, how can I do the above use case using a batch model, where I read and mark as processed multiple messages in batches?

Furthermore, can I get rid of the "length" field above?

Thanks,

Drew

Peter Lawrey

Mar 12, 2015, 6:14:05 AM
to java-ch...@googlegroups.com

If the appender crashes during a write, the incomplete message is truncated on the next write.

The index and data could be out of sync on disk, and in the event of a power failure the file could be corrupted. If the index is more up to date, the entries will appear to be full of null bytes; if the data is more up to date, the entries will be lost.
The best way to prevent loss is to use replication.

If a reader updates a record, you should use the thread-safe operations. The update will be visible to other readers; however, there is no event-driven way for another reader to know this was done (without reading the same data).

You could use CRC32, but replication is better imho. I suggest adding the checksum to the end of the message.

VanillaChronicle and Queue version 4 don't need the length field either. The record will only be the length you wrote.

Instead of a status, I would store the timestamp at which the record was processed. This gives the field a dual purpose. For extra monitoring you might want:
- The time the record is written
- The time the record is first read
- The time the record was processed.

By performing the "first read" as a compare and swap from 0, you can assign the record to exactly one reader. If the compare and swap fails, the record was already read (possibly by another worker).
You might like to record which worker it was assigned to.
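The claim-by-compare-and-swap idea can be sketched with plain JDK atomics. In this example an AtomicLongArray stands in for the "first read" timestamp field of each record; in Chronicle the field would live inside the record's bytes and be updated with one of the thread-safe operations mentioned above, which this thread does not spell out. The class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical stand-in: one "first read" timestamp slot per record, 0 = unclaimed.
final class RecordClaims {
    private final AtomicLongArray firstReadMillis;

    RecordClaims(int recordCount) {
        this.firstReadMillis = new AtomicLongArray(recordCount);
    }

    // Returns true for exactly one caller per record; all later callers get false.
    boolean tryClaim(int recordIndex, long nowMillis) {
        if (nowMillis == 0L) {
            throw new IllegalArgumentException("0 is reserved for 'unclaimed'");
        }
        return firstReadMillis.compareAndSet(recordIndex, 0L, nowMillis);
    }

    // The time the record was first read, or 0 if no worker has claimed it yet.
    long firstReadMillis(int recordIndex) {
        return firstReadMillis.get(recordIndex);
    }
}
```

A worker that gets true from tryClaim(i, System.currentTimeMillis()) owns record i. A worker that gets false skips it. Storing a worker id in a neighbouring field, as suggested above, would follow the same pattern.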

Lastly, you can have another process watch for records that are taking a long time to be read or a long time to be processed. If a record is taking a long time to process, you can trigger a stack trace to start to see why.

Drew Kutcharian

Mar 12, 2015, 6:22:02 PM
to java-ch...@googlegroups.com
Thanks for the detailed response. I'm building an in-process write-ahead log and thus far loving Chronicle. Based on your comments, here's version two of the design (inspired by Kafka):

Write path, single thread per chronicle:
1. Write CRC32 of the message (byte[])
2. Write the message (byte[])
3. Finish

Read path, single thread per chronicle:
1. Seek to the "index"
2. Read the CRC32
3. Read all the "available" bytes as message
4. Verify CRC32 is correct
5. Process the message
6. If the message was processed successfully, write the "index" to a file
7. Go back to #1

This approach allows for batch processing and ability to rewind and replay.
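Step 6 of the read path (persisting the last processed index) could look roughly like the sketch below. It uses only JDK file APIs and is not part of Chronicle; the class name IndexCheckpoint, the ".tmp" suffix, and the NONE sentinel are assumptions for illustration. It writes to a temporary file, forces it to disk, and then renames it over the real file so that a crash mid-write does not leave a half-written checkpoint.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper that stores the last successfully processed index.
final class IndexCheckpoint {
    static final long NONE = -1L; // assumed sentinel: nothing processed yet

    private final Path file;
    private final Path tmp;

    IndexCheckpoint(Path file) {
        this.file = file;
        this.tmp = file.resolveSibling(file.getFileName() + ".tmp");
    }

    void save(long index) {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
        buf.putLong(index);
        buf.flip();
        try {
            try (FileChannel ch = FileChannel.open(tmp,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE,
                    StandardOpenOption.TRUNCATE_EXISTING)) {
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
                ch.force(true); // flush the temp file before it becomes visible
            }
            // Atomic rename; whether an existing target is replaced is
            // platform-specific (it is on POSIX file systems).
            Files.move(tmp, file, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    long load() {
        try {
            byte[] bytes = Files.readAllBytes(file);
            return bytes.length == Long.BYTES ? ByteBuffer.wrap(bytes).getLong() : NONE;
        } catch (NoSuchFileException e) {
            return NONE; // first run: no checkpoint written yet
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For batch processing, save would be called once per batch with the index of the last message in the batch. After a crash, messages between the saved index and the crash point are replayed, so processing needs to tolerate duplicates.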

Do you see any issues with this approach?

Also, I noticed that I can do either sync or async writes. When not synchronous, how often does Chronicle do an fsync? Is that configurable? Can I say fsync every n seconds or manually fsync?

Also a few more questions:

1. What's the logic for generating index values? Is it timestamp + sequence?

2. Why is the minimum cycle length an hour?

3. Does Chronicle delete the old index and data files, if so, how do you configure the retention?

Cheers,

Drew

Peter Lawrey

Mar 13, 2015, 3:29:12 AM
to java-ch...@googlegroups.com
You control whether fsync is called or not. There isn't an option for a periodic fsync. Note: the OS does this for you, so perhaps tuning your Linux kernel is the better option.

You could assume everything but the last 4 bytes is the message and the last 4 bytes are the CRC, which you can read as an int rather than a byte[].
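A minimal sketch of that trailing-checksum layout, using only the JDK: the payload comes first and the CRC32 is stored as an int in the last 4 bytes, so no length field is needed as long as the record length is known. The class name TrailingCrcCodec is hypothetical, and a byte[] stands in for the bytes of one Chronicle record.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical helper: record = payload bytes followed by a 4-byte CRC32.
final class TrailingCrcCodec {
    static int crc32(byte[] data, int offset, int length) {
        CRC32 crc = new CRC32();
        crc.update(data, offset, length);
        return (int) crc.getValue(); // the CRC fits in the low 32 bits
    }

    static byte[] encode(byte[] payload) {
        return ByteBuffer.allocate(payload.length + Integer.BYTES)
                .put(payload)
                .putInt(crc32(payload, 0, payload.length))
                .array();
    }

    // Returns the payload if the trailing CRC matches, else null.
    static byte[] decode(byte[] record) {
        if (record.length < Integer.BYTES) {
            return null; // too short to even hold a checksum
        }
        int payloadLength = record.length - Integer.BYTES;
        int storedCrc = ByteBuffer.wrap(record).getInt(payloadLength);
        if (crc32(record, 0, payloadLength) != storedCrc) {
            return null;
        }
        return Arrays.copyOf(record, payloadLength);
    }
}
```

Note that a record of all null bytes with a length greater than 4 fails this check, which is useful given the earlier remark that entries can appear as null bytes after a power failure.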

Drew Kutcharian

Mar 13, 2015, 4:12:21 AM
to java-ch...@googlegroups.com
How do I control when fsync is called, by setting nextSynchronous(true)? Is there a way to sync the appender without using nextSynchronous(true) so I can run it in a separate thread to periodically sync?

Peter Lawrey

Mar 13, 2015, 10:59:50 PM
to java-ch...@googlegroups.com

At the moment it is not possible to sync without writing a record. You could add a dummy message periodically.
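One way to approximate a periodic sync under that constraint is to let the writing thread itself decide when a synchronous write is due, which avoids touching the appender from a second thread. The sketch below is plain Java and contains no Chronicle calls; the class name SyncPacer and its method are assumptions for illustration.

```java
// Hypothetical helper: tells the single writer thread when a sync is due.
// Not thread-safe by design; it is meant to be owned by the writer thread.
final class SyncPacer {
    private final long intervalNanos;
    private long lastSyncNanos;

    SyncPacer(long intervalNanos, long startNanos) {
        this.intervalNanos = intervalNanos;
        this.lastSyncNanos = startNanos;
    }

    // Returns true at most once per interval; the caller should then make
    // its next write synchronous, or append a dummy record if it is idle.
    boolean syncDue(long nowNanos) {
        if (nowNanos - lastSyncNanos >= intervalNanos) {
            lastSyncNanos = nowNanos;
            return true;
        }
        return false;
    }
}
```

In the write loop, the writer would check pacer.syncDue(System.nanoTime()) before each record and, when it returns true, request a synchronous write (the nextSynchronous(true) setting mentioned above) for that record. When there is no traffic, the same check can trigger the dummy message.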

paul Bandler

Aug 27, 2015, 6:44:29 AM
to Chronicle
When using multiple writers (one per thread), to achieve a graceful shutdown with all data flushed to disk, is it necessary to write a message using nextSynchronous(true) from each writer, or just one?

Peter Lawrey

Aug 27, 2015, 7:29:57 AM
to java-ch...@googlegroups.com
Hello Paul,

If you are about to pull the power from the machine, you need to write one such message. Then you can power off the machine without shutting down the OS.

If you are just shutting down the application or performing a graceful shut down of the OS, you don't need to do anything.

Regards,
  Peter.