Do I need Chronicle or Chronicle+Disruptor?


Paul Dillon

May 19, 2013, 1:23:32 AM
to java-ch...@googlegroups.com
Hi Peter,
 
I'm looking at adding journaling to my existing disruptor-based application, but after looking at your library I'm wondering if I need the disruptor at all.  Specifically:
 
1. Does reading from a memory mapped file also benefit from the CPU-cache read-ahead benefit that disruptor achieves with the ring buffer?
 
2. When the reader marshals the excerpt back into an object, doesn't that entail an additional memory copy that would not be needed with disruptor?  Or at least in disruptor it would not be needed in it's core service thread?  My core service thread sometimes falls behind, and seems that it benefits from not having to do any marshalling.
 
3. The outbound side of my service doesn't require journaling - in this case, would it be less performant to use chronicle, since data must be written to disk? (versus using ringbuffer to handball to workers)
 
Many Thanks,
 
Paul

Peter Lawrey

May 19, 2013, 7:37:24 AM
to java-ch...@googlegroups.com
I'm looking at adding journaling to my existing disruptor-based application, but after looking at your library I'm wondering if I need the disruptor at all.

That is my opinion, but I am biased ;)

Does reading from a memory mapped file also benefit from the CPU-cache read-ahead benefit that disruptor achieves with the ring buffer?

Yes, but the difference is that it is completely contiguous.  The first byte of one message starts immediately after the last byte of the previous message.

As such the "structure" is arranged as a stream.

When the reader marshals the excerpt back into an object, doesn't that entail an additional memory copy that would not be needed with disruptor? 

This is done for simplicity.  You can randomly access fields by offset and avoid the additional copy.  Also, readers are not required to read all the fields, just the ones they are interested in.  Even with the additional copies that serialization and deserialization imply, you can still achieve over 2 million events per second in real systems (not just micro-benchmarks).  Given this might be enough for you, I would start with the simplest approach that gives you more than enough performance.
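
To make the offset access concrete, here is a minimal sketch using a plain java.nio MappedByteBuffer rather than the Chronicle API itself. The record layout ([long timestamp][int quantity][double price]) and the file name journal.dat are made up for the example; the point is that one field can be read by position without deserializing the whole record.

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Sketch only: read one field from a memory mapped journal by offset.
// The fixed record layout [long timestamp][int quantity][double price]
// is invented for this example.
public class FieldByOffset {
    static final int RECORD_SIZE = 8 + 4 + 8; // 20 bytes per record

    public static void main(String[] args) throws Exception {
        try (RandomAccessFile file = new RandomAccessFile("journal.dat", "r");
             FileChannel channel = file.getChannel()) {
            MappedByteBuffer map =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());

            long recordIndex = 5;                        // the 6th record
            int base = (int) (recordIndex * RECORD_SIZE);

            // Pick out just the price; the timestamp and quantity are never copied.
            double price = map.getDouble(base + 8 + 4);
            System.out.println("price = " + price);
        }
    }
}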

 My core service thread sometimes falls behind, and seems that it benefits from not having to do any marshalling.

With Chronicle there is virtually no limit to how far the consumers can fall behind the producer.  The practical limit is the amount of free disk space you have, though staying in main memory will be faster.  e.g. you can have consumers which are not even running for hours (or all day) with no impact on the producer.

The outbound side of my service doesn't require journaling - in this case, would it be less performant to use chronicle, since data must be written to disk? (versus using ringbuffer to handball to workers)

When using Chronicle for IPC, it outperforms all the commercial solutions even though they don't include serialization/deserialization or persistence.  Without any tuning you can expect a delay of 0.5 to 2 micro-seconds per event, and with some tuning as low as 0.1 micro-seconds.  There is also a benefit in recording timestamps against all your outputs: it helps you monitor what your system is doing and reproduce the system as a state machine in test and development, i.e. from the output alone, any downstream system should be able to recreate the state of the server.
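
For illustration, a rough sketch of recording a nanoTime() stamp against each outbound message so a downstream or test process can replay the outputs later. The file name and length-prefixed format are made up here; with Chronicle you would write the same fields into an excerpt instead.

import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.nio.charset.StandardCharsets;

// Sketch only: append a nanoTime() stamp with every outbound message so the
// journal doubles as a performance record and a replay log. The file name
// (outbound.journal) and length-prefixed format are invented for this example.
public class TimestampedJournal {
    public static void main(String[] args) throws Exception {
        try (DataOutputStream out =
                     new DataOutputStream(new FileOutputStream("outbound.journal", true))) {
            byte[] payload = "order-ack:42".getBytes(StandardCharsets.UTF_8);
            out.writeLong(System.nanoTime());   // when the event left the service
            out.writeInt(payload.length);
            out.write(payload);
        }
    }
}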

The bottleneck with Chronicle is the long term sustained disk write throughput.  You can have large bursts of data at about 1.5 GB/s, up to about 10% of main memory.  e.g. say you have 64 GB and 200 byte messages; you should be able to handle bursts of ~6 GB, or about 30 million events, at a time (which would take about 4-5 seconds at this rate).  At some point (depending on your OS) the OS pushes the data to disk, and from then on you are limited by the speed of your disk, e.g. an HDD with 50 MB/s might sustain ~250K events per second, but an SSD with 500 MB/s write throughput can handle 2.5 M events per second.
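
As a back-of-envelope check on those figures (64 GB of RAM, 200 byte messages, 1.5 GB/s burst rate, 50 MB/s HDD vs 500 MB/s SSD):

// Back-of-envelope arithmetic only, using the figures quoted above.
public class ThroughputEstimate {
    public static void main(String[] args) {
        long messageBytes = 200;
        long burstBytes   = 64L * 1_000_000_000 / 10;       // ~10% of 64 GB RAM
        long burstEvents  = burstBytes / messageBytes;       // ~32 million events
        double burstSecs  = burstBytes / 1.5e9;              // ~4.3 s at 1.5 GB/s

        long hddEventsPerSec = 50_000_000L / messageBytes;   // ~250 K/s sustained
        long ssdEventsPerSec = 500_000_000L / messageBytes;  // ~2.5 M/s sustained

        System.out.printf("burst: %d events over %.1f s; sustained: HDD %d/s, SSD %d/s%n",
                burstEvents, burstSecs, hddEventsPerSec, ssdEventsPerSec);
    }
}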

There is a slight overhead in all this copying and persistence, but the cost is so low that it may not matter for your solution.  What you get instead is an exact record of everything your system is doing, as it does it.  This precise view of the performance of every event, e.g. including a nanoTime() stamp as it passes through your system, gives you the information you need to tune your system at a level beyond what regular profilers can achieve.

Peter.






 

Paul Dillon

May 19, 2013, 11:08:13 PM
to java-ch...@googlegroups.com
Thanks Peter!  All makes sense.  Disk speed and memory are not a problem for my expected burst rate, so yes the latency should be negligible for me.
 
Lastly, about the diagram in this article:
 
 
On the processing server, I'm assuming you'd need one reader thread per GW.  What technique would you use to pass those GW messages from the multiple reader threads to the processing engine thread?
 
Thanks,
 
Paul

Peter Lawrey

May 20, 2013, 12:56:09 AM
to java-ch...@googlegroups.com
Multiple reader threads are not a problem.  You can do this locklessly by having each reader open its own Chronicle.  If you have multiple writer threads, however, you need synchronized or a ReentrantLock.  The latter might be better as it tends to handle lock contention a little better; IMHO synchronized tends to be better if you typically have just one thread using the resource.  You can have multiple readers across multiple processes, but multiple writer threads must be in the same process.  If you have multiple writer processes, each must have its own Chronicle files.
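
As a sketch of the multiple-writer case, the pattern is simply to serialize writer threads around one appender with a ReentrantLock. The Journal interface below is a stand-in for whatever appender you use, not a Chronicle class:

import java.util.concurrent.locks.ReentrantLock;

// Sketch only: multiple writer threads in one process share a single appender,
// serialized by a ReentrantLock. "Journal" is a hypothetical abstraction.
final class GuardedWriter {
    private final ReentrantLock lock = new ReentrantLock();
    private final Journal journal;

    GuardedWriter(Journal journal) { this.journal = journal; }

    void write(byte[] message) {
        lock.lock();                 // only one writer thread appends at a time
        try {
            journal.append(message);
        } finally {
            lock.unlock();
        }
    }

    interface Journal { void append(byte[] message); }
}

Each reader thread, by contrast, opens its own Chronicle and needs no lock at all.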


 
Paul Dillon

May 20, 2013, 1:33:48 AM
to java-ch...@googlegroups.com
So for the case of multiple gateway servers and a single processing server:
 
On the core server:
* Multiple reader threads, each with its own Chronicle pointing at the IP of one gateway server
* Those reader threads would read a message and then write it to a single inbound Chronicle, controlling access using a ReentrantLock or similar.
* The single-threaded core service would read the inbound Chronicle and write its responses to an outbound Chronicle.
 
On the gateway servers:
* HTTP or other requests are received and serviced directly, or written to a Chronicle.
* A thread reads responses from the outbound Chronicle from the core server, and relays them back to its clients
 
Am I on the right track?
 
Thanks,
 
Paul