Other language bindings for Chronicle Map?


Kent Hoxsey

Oct 4, 2014, 11:48:26 AM
to java-ch...@googlegroups.com
I am working out how to use Chronicle Map to feed a group of monitoring applications. Most of these applications have Java APIs available, but one is C# (and unlikely to be ported anytime soon). I am wondering if anyone else has attempted (or succeeded at) building a C# client to a Map.

If not, I would appreciate any guidance about how best to go about it. The interface would be one-way, with the C# client reading a Map fed by a different (higher-frequency) Java process. My current approach is to use IKVM.NET to create the .NET "wrapper", but I am open to more-informed suggestions.

TIA
Kent

Peter Lawrey

Oct 4, 2014, 11:56:14 AM
to java-ch...@googlegroups.com

What I suggest is creating a TCP service you can connect to and access the map through that connection. This way the serialization can work however you need it to.
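Something along these lines would do as a starting point. This is illustrative only, not an existing Chronicle API: the class name, the line-oriented read-only protocol and the key/value types are placeholders, and it relies on ChronicleMap being usable as a plain ConcurrentMap on the Java side.

    import java.io.*;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.nio.charset.StandardCharsets;
    import java.util.Map;

    // Minimal read-only TCP bridge: the C# client sends one key per line
    // and gets the value (or an empty line) back. The serialization here is
    // plain text, but it can be anything both sides agree on.
    public class MapTcpBridge {

        public static void serve(Map<String, String> map, int port) throws IOException {
            try (ServerSocket server = new ServerSocket(port)) {
                while (true) {
                    Socket client = server.accept();
                    new Thread(() -> handle(client, map)).start();
                }
            }
        }

        private static void handle(Socket socket, Map<String, String> map) {
            try (BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
                 BufferedWriter out = new BufferedWriter(
                     new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8))) {
                String key;
                while ((key = in.readLine()) != null) {
                    String value = map.get(key);   // ChronicleMap is a ConcurrentMap, so this is safe
                    out.write(value == null ? "" : value);
                    out.newLine();
                    out.flush();
                }
            } catch (IOException ignored) {
                // client disconnected
            }
        }
    }

The C# side then just opens a socket, writes a key and reads a line back; no IKVM wrapper needed.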


Ben Alex

Oct 4, 2014, 7:24:10 PM
to java-ch...@googlegroups.com
I recently had a similar interoperability challenge and used Simple Binary Encoding (SBE) for serialization. The main advantage it provides is zero-copy, alignment-friendly generation of C++ and Java source. It also generates C#, although I didn't try that. As the SBE code generator is written in Java and has JARs in Maven Central, you can easily run code generation for all your languages from Maven or a similar build tool, and you don't even need to manually download or install anything. In the interests of completeness, other modern zero-copy cross-platform serialization formats include Cap'n Proto and Flat Buffers. The author of the former has written a detailed comparison here, including his analysis of SBE. I'd add that Cap'n Proto looks wonderful, but its Java support is "early stage" according to its website. As Flat Buffers is from Google, you need to consider its future support.

In my use case I needed the highest-performance way to deliver C events to Java code, so I had Java allocate an off-heap array and pass it, along with counter pointers, to a C function via BridJ. There's no performance downside to BridJ given only one C function is ever called and BridJ/JNI functions are never used to read the data prepared by C. This is critical, as JNI overhead (either C calling back into Java or Java reading C structs) is extremely high compared with deserializing an SBE-encoded event. I generated the BridJ bindings with JNAerator, given the latter conveniently runs within a Maven build phase.

Anyway, the two sides use a circular buffer pattern to move SBE-encoded events from C to Java. It is extremely fast once you do some tuning, and I've found the Java side blocks most of the time waiting for the C side. Peter's affinity library really helps here too, as you can lock the C writer thread and the Java reader thread to different cores.

As a bonus, the SBE serialization format is good for disk persistence and you can later replay the events without even loading the C side. For disk serialization I found SBE payloads offer excellent compression ratios and rapid decompression time via lz4, although the lack of lz4-java support for lz4 stream format 1.4 is about the only inelegant part of that solution (I worked around it by using the native lz4 command to decompress to a /dev/shm file system on demand, so you still only have one read operation involving disks). xz also compresses SBE well (about 3 times smaller files than lz4) but decompression time is around 7 times slower than lz4.
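To make the hand-off pattern concrete, here is a simplified sketch of what the Java-side consumer can look like. It is not my actual code (which uses generated SBE decoders, shared counters and proper wrap-around handling plus memory fences); it just assumes 4-byte length-prefixed frames and a write limit published by the producer, and shows that frames can be drained without any per-frame JNI call:

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    // Java-side consumer of a C -> Java hand-off: the C producer appends
    // length-prefixed frames into an off-heap buffer and publishes how many
    // bytes are ready; the Java side decodes everything up to that limit.
    // Wrap-around and memory fences are deliberately omitted in this sketch.
    public final class SbeFrameDrainer {

        public interface FrameHandler {
            void onFrame(ByteBuffer buffer, int offset, int length);
        }

        /** Decodes all complete frames between readPosition and writeLimit; returns the new read position. */
        public static int drain(ByteBuffer buffer, int readPosition, int writeLimit, FrameHandler handler) {
            buffer.order(ByteOrder.LITTLE_ENDIAN);         // must match the C side's byte order
            int position = readPosition;
            while (position + 4 <= writeLimit) {
                int length = buffer.getInt(position);      // 4-byte length prefix per frame
                if (position + 4 + length > writeLimit) {
                    break;                                 // frame not fully published yet
                }
                handler.onFrame(buffer, position + 4, length);  // hand the payload to an SBE decoder
                position += 4 + length;
            }
            return position;
        }
    }

The important discipline is that the producer publishes the write limit only after the frame bytes are visible, which is what the omitted fences are for.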

I guess the real issue is whether your use case requires "highest throughput possible" one-way message delivery on a single machine. If that's your situation, I found the above worked well. If other characteristics apply (eg machines that don't share memory, request-reply patterns etc) it won't be suitable. Regardless, SBE might still be a nice option for you to consider given it's so latency-focused, compression-friendly, supports C# and is very convenient from a Java tooling perspective.

As an aside, I would have loved it if Chronicle Queue could have allowed this cross-platform interop. That was my initial approach to the above interop problem. While it performed impressively in Java, the lack of a convenient inter-platform serialization approach was the main issue. I could have written C code to write out the format I would later read on the Java side, but writing an SBE schema and having it generate alignment-aware, high-performance code was simply more convenient and less error-prone. I could have written SBE payloads into Chronicle data files, but I still would have needed to deal with memory-map segments and index file appending on the C side. Hopefully someday someone will write a C library that easily appends SBE content to a Chronicle memory-mapped file. Indeed, integration between SBE and Chronicle Queue, even in Java, would be pretty nice.

Cheers
Ben

PS: Sorry if this is too off-topic, but as you mentioned "monitoring" I thought I'd add that I've had very nice experiences using collectd over multicast. It is very lightweight and has numerous built-in plugins. For your custom application-level metrics (gauges, counters etc), you can transmit them out of your application via a statsd library (there are several Java options). This combination doesn't slow down your applications, given it's all non-blocking multicast and UDP. The main challenge is where to ultimately write the collected data. For that we've been using InfluxDB, as it can emulate a Graphite endpoint and collectd offers a Graphite plugin. Add some Grafana and you have a nice JavaScript GUI that can query InfluxDB. The best part about all this is you get turnkey, system-wide, low-footprint monitoring and pretty graphing without writing any code.
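To show why the application side stays cheap, a bare-bones statsd-style sender is only a few lines. This is illustrative only, not any particular library's API; the class name, host, port and metric names are placeholders, and it just emits the plain statsd UDP line format:

    import java.io.IOException;
    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    // Fire-and-forget metric sender: one UDP datagram per metric, no response
    // expected, so the application thread never blocks on the monitoring path.
    public class StatsdSender implements AutoCloseable {
        private final DatagramSocket socket;
        private final InetSocketAddress target;

        public StatsdSender(String host, int port) throws IOException {
            this.socket = new DatagramSocket();
            this.target = new InetSocketAddress(host, port);
        }

        public void count(String metric, long delta) {
            send(metric + ":" + delta + "|c");     // statsd counter line format
        }

        public void gauge(String metric, long value) {
            send(metric + ":" + value + "|g");     // statsd gauge line format
        }

        private void send(String line) {
            byte[] payload = line.getBytes(StandardCharsets.UTF_8);
            try {
                socket.send(new DatagramPacket(payload, payload.length, target));
            } catch (IOException ignored) {
                // metrics are best-effort; never let monitoring break the application
            }
        }

        @Override
        public void close() {
            socket.close();
        }
    }

Usage is eg new StatsdSender("localhost", 8125).count("orders.processed", 1); if a send is lost you simply lose a sample.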

Peter Lawrey

Oct 5, 2014, 4:55:59 AM
to java-ch...@googlegroups.com

Hello,
  There is a lot of good content here. You should write this up in a blog post, as this is the sort of thing we post on the Performance Java User's Group.

  I had lunch with Martin Thompson this week and I am familiar with SBE. I used to be a C programmer, but I am not as current with it.
In the HFT space the trend is generally away from C/C++, as you either need the speed that an FPGA or GPU gives you, or you need the time to market that Java or a higher-level language like Scala or Python gives you. This might be a London-centric view.
As we make our money from consulting, we focus on the technology areas our clients do. As such, C isn't a high priority for us.

As the name suggests, SBE is designed to be simple, portable, easy to decode, cross-platform and faster than FIX.
You don't have to use Chronicle serialization, but it offers compressed types, general object serialization, zero copy and thread-safe operations.

lz4 sounds interesting. We intend to improve our support for compression, especially for WAN link replication, and we will have a look at this.

Another trend is towards reactive programming. In this model you avoid blocking operations or offload them away from your critical execution path (eg SEDA). As such, request/response is turned into asynchronous messages. This is far more efficient, but harder to work with, especially if you have a legacy application which wasn't designed with this in mind.
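As a generic illustration of that idea (not a Chronicle API), the caller can enqueue a request and get a future back immediately, with a single worker doing the blocking work off the critical path. The class name, queue size and handler below are made up for the example:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.CompletableFuture;
    import java.util.function.Function;

    // Request/response turned into asynchronous messages: submit() never does
    // the work itself; a single worker thread drains the bounded queue.
    public final class AsyncRequestPipeline<REQ, RES> {

        private static final class Job<REQ, RES> {
            final REQ request;
            final CompletableFuture<RES> promise = new CompletableFuture<>();
            Job(REQ request) { this.request = request; }
        }

        private final BlockingQueue<Job<REQ, RES>> queue = new ArrayBlockingQueue<>(1024);

        public AsyncRequestPipeline(Function<REQ, RES> handler) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        Job<REQ, RES> job = queue.take();   // blocking happens here, not on the caller
                        try {
                            job.promise.complete(handler.apply(job.request));
                        } catch (Throwable t) {
                            job.promise.completeExceptionally(t);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "async-pipeline-worker");
            worker.setDaemon(true);
            worker.start();
        }

        /** Enqueues the request and returns immediately; the future completes once the worker has processed it. */
        public CompletableFuture<RES> submit(REQ request) throws InterruptedException {
            Job<REQ, RES> job = new Job<>(request);
            queue.put(job);
            return job.promise;
        }
    }

The caller never waits on the handler itself; back-pressure comes from the bounded queue.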

What is on the roadmap is improved off-heap layout control. Our library will automatically optimise the layout, but it might not do it the way you need or want.

I think we can do more to improve SBE integration. Thank you for the feedback in that regard.

Peter.


Ben Alex

Oct 5, 2014, 8:25:55 PM
to java-ch...@googlegroups.com
Hi Peter

C++ remains the mainstay of algo trading here in Sydney, at least in the firms I've come across. What you said makes sense though: ULLT (ultra-low-latency trading) will increasingly demand FPGAs/ASICs, whereas slower HFT benefits from faster TTM, safety, tooling, libraries, productive mixed teams and so on, which are simply more easily accessible in higher-level languages. Personally I've used Python, Go, Java, C/C++ and Scala for trading projects and keep returning to Java as the ideal balance for my projects.

Where I've found C/C++ simply unavoidable is fast interfacing with market data. To back-test market-wide strategies over an extended period I need to replay hundreds of TBs of data as quickly as possible, even after compression and aggressive summarisation (eg trades on one-millisecond boundaries, discarding all quotes and merely retaining NBBO etc). C/C++ has you amply covered in that space, but it's tricky to move that data as rapidly as possible out of C/C++ libraries into the JVM for the remainder of the processing. I guess this is so niche that everyone rolls their own solution, but having tried a bunch of techniques (JNI, BridJ, C callbacks, length-framed streams, circular buffers with varying claim/commit patterns, even starting the JVM from a C++ program etc) I found the approach described in my last email was the fastest way to deal with read-only data that originates from a C/C++ library.

My initial approach was to use Chronicle Queue to do a one-time conversion of the C-based tapes into Chronicle format. The idea was to pull the data from C using whatever interop technique was fastest, then ETL it into Chronicle Queue format for efficient future reads in Java. The problems were (a) the resulting Chronicle files were massive, although they did replay extremely quickly, (b) using the compact functions (eg Bytes.writeCompactLong()) in a crude attempt to improve storage requirements resulted in a material speed reduction, (c) it was still far too slow to get the data out of C into the JVM in the first place (as at that stage I was using JNI-based struct parsing), and (d) it required the JVM to be running so that a reliable encoder was available for Chronicle Queue. To overcome (c) I introduced SBE on circular buffers, with a view to then exploring more sophisticated compression or aggressive summarisation strategies to address (a) without needing the compact functions from (b). However, with SBE solving the C-to-JVM throughput problem extremely well, it became natural to just keep using SBE to also resolve (d), as it allowed a standalone C/C++ program to write SBE straight to disk or stdout without even loading a JVM. That in turn allowed me to benchmark all current compression algorithms with the output files and subsequently select lz4 to compress those files, without even needing compression awareness in the C/C++ program.

All of this is very specific to my particular requirements, but I hope writing it down might offer suggestions for others encountering similar challenges. I also wanted to share the experience of someone who actually started by trying to store the data in Chronicle Queue directly but ended up not using it. Even with SBE support, compression is still critical to make it feasible for big data projects. It's a big issue to purchase 1 PB of storage for an application if a low-overhead compressor would have reduced that down to 130 TB (roughly 7.7:1, which is the compression ratio of lz4 -9 on my SBE binaries). I looked for compression options in Chronicle Queue via Google and the GitHub project but didn't find any. If it's there already, can you point me to it (I'm certainly open to benchmarking it)?

As an aside, I've now written a stream-based decompressor for lz4. It uses java.lang.Process to spawn lz4 and redirects its stdout for decoding the SBE frames via a piped stream. It works fine. As such, the lack of lz4-java support for stream format 1.4 is even more easily worked around, as you don't need massive /dev/shm space. If lz4 was of interest to Chronicle, you could easily compress using the pure-Java lz4, given you don't need to interface with streams created elsewhere (eg by the native lz4 program) and therefore don't need to comply with the streaming format 1.4 specification.
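For anyone wanting to try the same trick, here is a simplified sketch of the pattern (not my actual code). It assumes the native lz4 binary is on the PATH and that each record was written with a 4-byte length prefix; the class name and framing (including the big-endian readInt) are assumptions for the example, and in practice you would hand the bytes straight to an SBE decoder rather than copying them:

    import java.io.BufferedInputStream;
    import java.io.DataInputStream;
    import java.io.EOFException;
    import java.io.IOException;
    import java.io.InputStream;

    // Stream the output of the native lz4 binary straight into Java and decode
    // length-prefixed frames from it, so no intermediate /dev/shm file is needed.
    public class Lz4StreamReader {

        public interface FrameHandler {
            void onFrame(byte[] frame);
        }

        public static void replay(String compressedFile, FrameHandler handler)
                throws IOException, InterruptedException {
            Process lz4 = new ProcessBuilder("lz4", "-d", "-c", compressedFile).start();
            try (InputStream stdout = new BufferedInputStream(lz4.getInputStream());
                 DataInputStream in = new DataInputStream(stdout)) {
                while (true) {
                    int length;
                    try {
                        length = in.readInt();          // frame length prefix
                    } catch (EOFException end) {
                        break;                          // clean end of stream
                    }
                    byte[] frame = new byte[length];
                    in.readFully(frame);
                    handler.onFrame(frame);             // hand off to an SBE decoder
                }
            }
            if (lz4.waitFor() != 0) {
                throw new IOException("lz4 exited with status " + lz4.exitValue());
            }
        }
    }

Decompression stays inside the native lz4 process, so stream format 1.4 compliance is its problem rather than lz4-java's.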

Cheers
Ben

Peter Lawrey

Oct 6, 2014, 1:54:31 AM
to java-ch...@googlegroups.com

That is all interesting feedback.
