micro-services design for HFT application


Vero K.

Nov 8, 2015, 8:39:34 AM11/8/15
to mechanical-sympathy
Hi, we are building an HFT application with very strict requirements for latency, reliability and maintainability. We want to split our application into micro-services and update trading modules online by starting a new version of a micro-service and switching the other micro-services over to the new link. The problem we face is latency, which is critical: we can't simply split and combine services over HTTP, so we started to look into other solutions for micro-service communication: Java Chronicle from OpenHFT, etc.

I just wanted to ask the community to share design ideas we could use, and to list all possible communication solutions apart from Java Chronicle, both free and commercial.

What also confuses me: if we use something similar to Chronicle, will we be able to bring up a second, updated version of a micro-service, and will it be able to substitute for the old version cleanly and keep communicating over Chronicle? How safe is it to use Chronicle or other solutions for this task? Please recommend all possible solutions, with pros and cons.

Martin Thompson

Nov 8, 2015, 9:52:33 AM11/8/15
to mechanical-sympathy
I'm a bit biased, but Aeron is very low latency across a network and between threads or processes, in Java and C++. It also works well with SBE for encoding the messages.



Feel free to compare with Chronicle and see what best suits your requirements.

Martin...

Gil Tene

Nov 8, 2015, 4:10:38 PM11/8/15
to mechanical-sympathy

Regardless of your choice of transport, the main tension between using micro-services and minimizing latency has to do with the number of handoffs involved in whatever end-to-end path you'll be taking. The most effective means of cutting down end-to-end latencies tends to be minimizing the overall number of "hops" involved. Most systems end up using the minimal number of hops actually required by their business needs (e.g. protocol gateways, persistence/journaling/replication, and load bearing/distribution needs), keeping as much of each logic step as possible to a single process and thread. Parallelizing actually parallelizable paths is usually secondary to this first "keep the number of hops small" step.

A good example of an opportunity to reduce hops can be found in some risk evaluation steps that are often found in a trading flow. You can use a separate risk-evaluation "service" which will necessarily incur the latency of a hop or two, or you can perform risk computations in-line by using in-process data that is updated outside of the critical latency path. People do some very creative things in such data update patterns to keep the critical (usually read-only) path wait-free...
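To make the in-line risk idea concrete, here is a minimal JDK-only sketch (class and field names are invented for illustration, not anyone's actual system): the slow path publishes an immutable limits snapshot through an `AtomicReference`, so the critical order path does a single volatile read - wait-free, with no extra hop:

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration: risk limits are recomputed off the critical path
// and published as an immutable snapshot; the trading thread's check is a
// single volatile load -- no locks, no hops.
public final class InlineRiskCheck {

    // Immutable snapshot of risk state, replaced wholesale on each update.
    record RiskLimits(long maxOrderQty, long maxNotional) {}

    private final AtomicReference<RiskLimits> limits =
            new AtomicReference<>(new RiskLimits(1_000, 10_000_000));

    // Called from the slow path (e.g. a risk feed); never blocks readers.
    void publish(RiskLimits fresh) {
        limits.set(fresh);
    }

    // Called on the critical order path: one volatile read, wait-free.
    boolean allows(long qty, long notional) {
        RiskLimits l = limits.get();
        return qty <= l.maxOrderQty() && notional <= l.maxNotional();
    }

    public static void main(String[] args) {
        InlineRiskCheck risk = new InlineRiskCheck();
        System.out.println(risk.allows(500, 5_000_000));   // true
        risk.publish(new RiskLimits(100, 1_000_000));
        System.out.println(risk.allows(500, 5_000_000));   // false
    }
}
```

Because the snapshot is immutable and swapped as a whole, readers can never observe a half-updated state and never wait, no matter how often the off-path updater runs.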

Greg Young

Nov 8, 2015, 4:43:03 PM11/8/15
to mechanica...@googlegroups.com
On Sun, Nov 8, 2015 at 10:10 PM, Gil Tene <g...@azul.com> wrote:
>
> Regardless of your choice of transport, the main tension point between using
> micro-services and minimizing latency has to do with the number of handoffs
> involved in whatever end-to-end path you'll be taking. The most effective
> means for cutting down end-to-end latencies tends to be minimizing the
> overall number of "hops" involved. Most systems end up using the minimal
> number of hops actually required by their business needs (e.g. protocol
> gateways, persistence/journaling/replication, and load bearing/distribution
> needs), keeping as much of each logic step as possible to a single process
> and thread. Parallelizing actually parallelizeable paths is usually secondly
> to this first "keep number of hops small" step.
>

Just to add to this: it's quite common as a strategy to integrate the "services" through events, with each service working on near-real-time data as opposed to being authoritative.





--
Studying for the Turing test

Avi Kivity

Nov 9, 2015, 6:14:34 AM11/9/15
to mechanica...@googlegroups.com
I don't know much about HFT, but I agree that micro-services are anti-latency and anti-throughput. The micro-er your services are, the more dominant communication costs and latencies become.

Micro-services seem to be more about allowing uncoordinated teams to sort of work together rather than improving the quality of the service itself.

(and they seem to be yet another take on 2000-era J2EE EJBs, only with less support from the framework and more potential for breakage)


Greg Young

Nov 9, 2015, 6:18:39 AM11/9/15
to mechanica...@googlegroups.com
"(and they seem to be yet another take on 2000-era J2EE EJBs, only
with less support from the framework and more potential for breakage)"

Or Actors!

Richard Warburton

Nov 9, 2015, 7:36:54 AM11/9/15
to mechanica...@googlegroups.com
Hi,

> "(and they seem to be yet another take on 2000-era J2EE EJBs, only
> with less support from the framework and more potential for breakage)"
>
> Or Actors!

Or Objects!

* messaging
* local retention and protection and hiding of state-process
* extreme late-binding of all things (ie dynamic service discovery)

regards,

  Richard Warburton

Vero K.

Dec 2, 2015, 3:16:01 PM12/2/15
to mechanical-sympathy
Guys, what do you think about OSGi as a framework for low-latency microservices?

Avi Kivity

Dec 3, 2015, 5:45:12 AM12/3/15
to mechanica...@googlegroups.com
Yes.  Micro-services are a management failure.  I can't get my team to release a coherent application, so I'll let each developer pick their own language and tooling, and connect them via poorly defined HTTP interfaces, ignoring the fact that networking latency and HTTP parsing overhead will soon dominate application latency and CPU consumption, respectively.

Daniel Worthington-Bodart

Dec 3, 2015, 6:49:02 AM12/3/15
to mechanica...@googlegroups.com
Microservices should really be an internal implementation detail of a single team. They can be used successfully as long as the team doesn't expose their existence to its clients (or the rest of the organisation). This allows the team to in-line them / move them back in-process when performance, coupling or refactoring support dictates (with no breaking changes), or to do the opposite when it becomes clear a service's change cadence differs from its in-process neighbours'. Some libraries natively support having services called in-process or on the wire with no code change but a big difference in performance. It's a shame more people don't explore this space; maybe I'll write a blog post.
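A minimal sketch of that idea, with hypothetical names: clients depend only on an interface, so whether a call stays in-process or crosses the wire is a wiring decision rather than a code change:

```java
// Hypothetical illustration: callers depend only on the PricingService
// interface, so the team can swap an in-process implementation for a remote
// one (or back again) without touching client code.
public class ServiceBinding {

    interface PricingService {
        double price(String instrument);
    }

    // In-process implementation: a direct method call, no network hop.
    static final class LocalPricingService implements PricingService {
        public double price(String instrument) {
            return 100.0; // stand-in for real pricing logic
        }
    }

    // Wire implementation: same contract, transport hidden behind the
    // interface. A real version would serialise the request and send it over
    // HTTP, Aeron, Chronicle, etc.
    static final class RemotePricingService implements PricingService {
        public double price(String instrument) {
            return 100.0;
        }
    }

    // The deployment topology is a wiring choice, not a code change.
    static PricingService bind(boolean inProcess) {
        return inProcess ? new LocalPricingService() : new RemotePricingService();
    }

    public static void main(String[] args) {
        PricingService svc = bind(true); // flip to false to go over the wire
        System.out.println(svc.price("VOD.L"));
    }
}
```

The point is that the moment of in-lining a service is invisible to its callers, exactly because the contract lives in the interface rather than in the transport.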

If your management are talking about microservices, you are probably doing it wrong. I wouldn't discuss in-lining a method in a performance-critical section with a manager, and neither should I discuss in-lining a microservice. Writing microservices in different languages is neither an anti-pattern nor a recommendation; it is a tradeoff, as usual. If time to market or first-mover advantage is key, then writing a bunch of microservices in different languages behind HAProxy might get you there quicker; I've certainly done this. Once you work out what your market actually is and your team has started to settle, you will probably see a convergence of technology and a redivision of service boundaries.

I think the reason there tends to be such a binary opinion on microservices is the domain most people work in day to day. If you are in a web startup, time to market normally dominates your needs, so you choose tools and patterns that let you instantly deploy tiny changes to production, where the financial risk from spikes in latency is minimal; often performance will only be fixed if the feature is a success, or your startup runs out of money first. If latency literally is money, then your laser focus will be to eliminate it; and if your enterprise change-control process eliminates any benefit from tiny releases, again microservices will look dumb.

HTTP is inherently high latency, but it has amazing operational tool support (reverse proxies, caches, human-readable, etc.), and given the price you are already paying, it usually adds very little to insert Nginx or HAProxy. I'm not saying you would use HTTP for HFT, just that the tool support is incredible, and it would be very interesting to see if one could create Aeron- or Chronicle-compatible reverse proxies, routers or Wireshark plugins. This could create a middle ground where some additional latency would be accepted on the non-critical path in exchange for the ability to release independently from the core.

Richard Warburton

Dec 4, 2015, 9:39:21 AM12/4/15
to mechanica...@googlegroups.com
Hi,

> HTTP is inherently high latency but has amazing operational tool support (reverse proxies, caches, human readable etc) and usually the price you are already paying it adds very little to insert Nginx or Haproxy. I'm not saying you would use HTTP for HFT just that the tool support is incredible and it would be very interesting to see if one could create Aeron or Chronicle compatible reverse proxies or routers or wireshark plugins. This could create a middle ground where some additional latency would be accepted on the non critical path for the ability to release independently from the core.

Yeah, I think you've hit the nail on the head with respect to the main reason people use HTTP. For most people, developer productivity trumps performance concerns.

FYI, there is an Aeron Wireshark dissector, though I think it's currently a bit out of date: https://github.com/dameiss/wireshark-aeron. It's an interesting question, though - are the Chronicle wire protocols defined independently of the implementation?

I don't think I really understand what the motivating use case for reverse proxies is in the case of something like Aeron. Not saying they don't exist - just not something I've hit in my Aeron usage.

I can see that a router could make sense for point-to-point communication. If you look at the ZeroMQ book, you'll see that once you have a proper messaging system in place, many of the enterprise messaging patterns, like routing, are easily implementable in very small quantities of code.
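As a toy illustration of how small such a pattern can be (invented names, not ZeroMQ's actual API): a content-based router on top of any messaging substrate is little more than a map from stream id to handler:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch: once messages arrive with a stream id, routing them to
// the right subscriber is a map lookup -- a few lines, as the ZeroMQ book
// suggests for many enterprise messaging patterns.
public class TinyRouter {
    private final Map<Integer, Consumer<String>> routes = new HashMap<>();

    // Register a handler for a stream id.
    void route(int streamId, Consumer<String> handler) {
        routes.put(streamId, handler);
    }

    // Dispatch an incoming message; unknown streams are silently dropped here
    // (a real system might dead-letter them instead).
    void onMessage(int streamId, String payload) {
        Consumer<String> h = routes.get(streamId);
        if (h != null) h.accept(payload);
    }

    public static void main(String[] args) {
        TinyRouter r = new TinyRouter();
        r.route(1, m -> System.out.println("orders: " + m));
        r.route(2, m -> System.out.println("quotes: " + m));
        r.onMessage(1, "BUY 100 VOD.L"); // prints "orders: BUY 100 VOD.L"
    }
}
```

The substance of a real router is in the transport underneath it; the routing logic itself stays this small.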

Shripad Agashe

Dec 5, 2015, 10:21:13 AM12/5/15
to mechanical-sympathy
Given that HTTP/2 is going to be binary, how would this impact these choices, especially for someone making forward-looking decisions?

Shripad

Jimmy Jia

Dec 5, 2015, 10:28:50 AM12/5/15
to mechanical-sympathy
As always, Gil's post a ways up has it just about right.

Microservices and modularization are a great fit for trading systems - you just keep them off the critical path (which, for all I know, these days isn't even always in software, period), and let them do the quant heavy lifting separately from your critical-path trading logic.

Then the challenge just becomes having your "critical path" service consume these updates as cheaply as possible (Gil pointed this out above), and making sure you don't overwhelm your network if you have chatty services or are trading in spaces with many products.

Both are solvable - really, I don't see how you'd build a system that's both competitive latency-wise these days and not horrible to maintain without this sort of architecture.

Shripad Agashe

Dec 7, 2015, 11:05:19 AM12/7/15
to mechanical-sympathy
"If your management are talking about microservices you are probably doing it wrong. "
On this point I have a different POV. The moment one decides to move from a monolith to a micro-service architecture, a host of options opens up. What was so far provided implicitly in terms of transactions/context suddenly becomes apparent and needs to be handled in the program, and this generally requires a business decision. If you follow Pat Helland (Building on Quicksand), he pretty much argues that the business should get involved, and that we should move away from a notion of operational consistency based on pure DB reads/writes to a more application-specific notion of consistency. One classic example is Amazon's shopping-cart implementation on Dynamo: in case of a conflict, the DB returns all possible values and the app needs to decide the next course of action. Pat goes on to argue that a DB write is not commutative and hence is a problematic abstraction. So newer techniques like order-insensitive programming indeed need business-process context, and maybe even a change in business processes based on the risk to the business.

Shripad





Daniel Worthington-Bodart

Dec 7, 2015, 11:21:25 AM12/7/15
to mechanical-sympathy

HTTP/2 (like SPDY) has higher initial latency due to the TLS handshakes etc.; this can be improved a lot by various options (OCSP stapling etc.):

https://istlsfastyet.com/

Once the initial handshake has been done, I believe it's better than HTTP/1.1 unless you're on a really bandwidth-limited connection.

Officially, HTTP/2 supports not being encrypted, but I believe none of the browser vendors are planning to support this mode of operation. Not sure about the server side.

