Concurrency and off-heap access (memory-mapped file) in Java for inter-process communication doesn't work


bob.tw...@gmail.com

Apr 30, 2015, 12:14:28 AM
to mechanica...@googlegroups.com


Concurrency over unsafe and aligned memory does NOT work in Java. According to my tests it works 90% of the time. I was close!

What I am saying is:

Two JVMs exchanging messages through a memory-mapped file; in other words, both access the same unsafe memory address through a MappedByteBuffer:

- First JVM writes a long to an aligned memory address (addressPointer % 8 == 0)

- Second JVM never sees the update. It keeps trying to re-read the long from the address but keeps getting an old value. Forever.

Is there a workaround to make this work or is it simply impossible to do it in Java?

If it is impossible, how does someone go about synchronizing access to a memory mapped file in Java, so two processes can share it and use it appropriately as a transfer concurrent queue?
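The setup being described can be sketched in a single process by mapping the same file through two independent FileChannels, so the "writer" and the "reader" each see the shared page cache through their own MappedByteBuffer (a stand-in for the two JVMs; the class name and layout here are illustrative, not the poster's code):

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Minimal single-process stand-in for the two-JVM setup: the same file is
// mapped twice through independent FileChannels, so writer and reader each
// access the shared memory through their own MappedByteBuffer.
public class SharedMapSketch {

    static long roundTrip() throws Exception {
        File file = File.createTempFile("ipc", ".dat");
        file.deleteOnExit();

        try (RandomAccessFile writerFile = new RandomAccessFile(file, "rw");
             RandomAccessFile readerFile = new RandomAccessFile(file, "rw")) {
            // Pre-size the file before mapping, as one would for real IPC.
            writerFile.setLength(4096);

            MappedByteBuffer writerMap =
                writerFile.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            MappedByteBuffer readerMap =
                readerFile.getChannel().map(FileChannel.MapMode.READ_ONLY, 0, 4096);

            writerMap.putLong(0, 42L);    // aligned write: offset 0 % 8 == 0
            return readerMap.getLong(0);  // read through the other mapping
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip());
    }
}
```

Note this sketch uses plain (non-volatile) accesses with no concurrent spin loop, so it only demonstrates that two mappings of one file share the same memory; the visibility problem in the thread concerns concurrent access, discussed below.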

Wojciech Kudla

Apr 30, 2015, 2:09:28 AM
to mechanica...@googlegroups.com

We're using Peter's Chronicle in many of our deployments. Some other environments employ IPC over more lightweight, bespoke solutions (also using mmapped files).

Not sure if I'm interpreting your statement correctly, but IPC over mmapped files does work in Java, regardless of whether you use ByteBuffer or Unsafe.

--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Wojciech Kudla

Apr 30, 2015, 2:15:47 AM
to mechanica...@googlegroups.com

Are you using any memory fences in your tests?

bob.tw...@gmail.com

Apr 30, 2015, 2:38:34 AM
to mechanica...@googlegroups.com
Are you using any memory fences in your tests?

No. How do I do that? I tried putLongVolatile. It did not work either :( I suspect it is a visibility problem: one JVM gets the value cached in its CPU and doesn't see the write coming from the other JVM. Not sure how that can happen if I am using putLongVolatile, passing null as its first argument. Hope someone can shed some light.

Wojciech Kudla

Apr 30, 2015, 2:42:50 AM
to mechanica...@googlegroups.com

How are you obtaining the value on the reader side?


bob.tw...@gmail.com

Apr 30, 2015, 2:48:08 AM
to mechanica...@googlegroups.com
I tried everything:

unsafe.getLong
unsafe.getLongVolatile
mappedBuffer.getLong

No luck !!!



rick.ow...@gmail.com

Apr 30, 2015, 2:56:10 AM
to mechanica...@googlegroups.com
Just my 2 cents here: why go through the trouble of doing IPC, seriously? Just use the network. It's cleaner, bug-free and totally predictable. Easy to manage. Yes, it takes a couple of microseconds more, but do you really care about that? I bet not...

Martin Thompson

Apr 30, 2015, 3:03:29 AM
to mechanica...@googlegroups.com
Have you worked with a large market data feed in finance doing low-latency trading?


Wojciech Kudla

Apr 30, 2015, 3:04:21 AM
to mechanica...@googlegroups.com

I'm pretty sure for many people on this list a few micros per handoff makes a difference.
Also, mmap-backed IPC gives you journalling for (almost) free.


rick.ow...@gmail.com

Apr 30, 2015, 3:16:07 AM
to mechanica...@googlegroups.com
> Have you worked with a large market data feed in finance doing low-latency trading?

You won't be able to run everything on a single machine with 512 CPU cores anyway :) Stuff running on the same machine can be placed inside the same JVM and talk to each other through the Disruptor. Not saying it's the only way, but it is how we used to trade with some positive PnL :)

Martin Thompson

Apr 30, 2015, 3:16:26 AM
to mechanica...@googlegroups.com
The few microseconds is typically about 16 via loopback in a realistic TCP application, with huge variance due to scheduling jitter. So if you do the math and want anything remotely predictable, you cannot deal with any problem space going into the tens of thousands of events/messages per second. Then you need to add your data marshalling and everything else you are doing on those threads.

Always go with the simplest solution for your requirements. For many the requirements are greater.

bob.tw...@gmail.com

Apr 30, 2015, 3:18:52 AM
to mechanica...@googlegroups.com
Can anyone shed some light on my visibility problem? :P How do I properly handle a writer and readers exchanging sequences through a shared memory-mapped file without visibility problems?

rick.ow...@gmail.com

Apr 30, 2015, 3:33:11 AM
to mechanica...@googlegroups.com
If micros count, then let's move everything to the same JVM and spend nanos instead of micros through pipelining (Disruptor). Everything else goes to a distributed system, spanning multiple machines, not a single one. Decoupling is paramount.

Martin Thompson

Apr 30, 2015, 3:45:09 AM
to mechanica...@googlegroups.com

I've found it works perfectly well. You can see many examples in the Agrona and Aeron projects, the simplest being the AtomicCounter class, or something more complicated like the ManyToOneRingBuffer.

https://github.com/real-logic/Agrona


Martin Thompson

Apr 30, 2015, 3:50:26 AM
to mechanica...@googlegroups.com

NUMA regions on the same machine are effectively distributed nodes. IPC is the best way to communicate between them if they are decoupled. This is why people typically use IPC. One big address space that spans multiple nodes is often the wrong solution.


bob.tw...@gmail.com

Apr 30, 2015, 3:57:51 AM
to mechanica...@googlegroups.com

Thanks, Martin. I took a look but am still not sure why my IPC queue is not working. Putting the problem in words:

Two JVMs sharing sequences (a read and a write sequence) through a shared memory-mapped file, pretty much like the Disruptor. What do I have to do to guarantee the visibility of these sequences? I believe I am doing everything:

- aligning in memory
- padding to avoid false sharing
- using unsafe.putLongVolatile to write the sequence
- using unsafe.getLongVolatile to read the sequence

Do I have to do anything else to synchronize and guarantee visibility? I wish I had AtomicLong here, but we are across JVMs doing IPC :)

Martin Thompson

Apr 30, 2015, 4:07:00 AM
to mechanica...@googlegroups.com

If you take the simplest thing that can possibly work and build from there, that is usually the best way to start. How about this: map a file in two processes for read and write. Then wrap the mapped buffer with my UnsafeBuffer class. Then write with putLongVolatile in one process and read with getLongVolatile in the other. Use offset 0 in the file for simplicity. You need to spin on the read to wait for the publication.
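The recipe above can be sketched with two threads standing in for the two processes. Agrona's UnsafeBuffer provides putLongVolatile/getLongVolatile directly; to keep this self-contained, a VarHandle view over the mapped buffer (available since Java 9, so an anachronism relative to this 2015 thread) is used here as a stand-in for the volatile accesses:

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Two threads stand in for two processes: each gets its own mapping of the
// same file; the writer publishes a long at offset 0 with a volatile store,
// and the reader spins with volatile loads until the publication is visible.
public class SpinOnReadSketch {
    static final VarHandle LONGS =
        MethodHandles.byteBufferViewVarHandle(long[].class, ByteOrder.nativeOrder());

    static long publishAndSpin() throws Exception {
        File file = File.createTempFile("seq", ".dat");
        file.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(4096);  // zero-filled, so 0 means "not yet published"
            FileChannel ch = raf.getChannel();
            MappedByteBuffer writerMap = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            MappedByteBuffer readerMap = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);

            Thread writer = new Thread(() ->
                LONGS.setVolatile(writerMap, 0, 7L));  // publish at aligned offset 0
            writer.start();

            long value;
            while ((value = (long) LONGS.getVolatile(readerMap, 0)) == 0L) {
                Thread.onSpinWait();  // spin on the read, waiting for publication
            }
            writer.join();
            return value;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(publishAndSpin());
    }
}
```

The volatile load in the spin loop is the important part: with a plain getLong the JIT is free to hoist the read out of the loop, which produces exactly the "keeps getting an old value, forever" hang described at the top of the thread.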


Nitsan Wakart

Apr 30, 2015, 4:07:12 AM
to mechanica...@googlegroups.com
Hi Bob,
In my experience, using Unsafe.putVolatile/Ordered on the producer and Unsafe.getVolatile on the consumer works just fine (assuming an aligned address). I have seen some issues with Unsafe.getVolatile on older JVMs, though. Can you provide a minimal reproduction (coherent working code) and some information on your setup (JVM version/OS/hardware)?
Thanks,
Nitsan
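The Unsafe pattern under discussion, a volatile long write and read against an aligned off-heap address with null as the base object, looks roughly like this. For a self-contained demo the memory comes from Unsafe.allocateMemory; in the IPC case the address would come from the MappedByteBuffer instead (class name and structure are illustrative, not anyone's production code):

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Volatile write/read of a long at an aligned off-heap address, passing
// null as the base object, the same call shape used against a mapped
// buffer's address in the thread. allocateMemory stands in for the mapping.
public class VolatileOffHeapSketch {

    static Unsafe unsafe() throws Exception {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        return (Unsafe) f.get(null);
    }

    static long writeThenRead() throws Exception {
        Unsafe u = unsafe();
        long address = u.allocateMemory(8);  // malloc'd memory is 8-byte aligned
        try {
            u.putLongVolatile(null, address, 99L);   // producer side
            return u.getLongVolatile(null, address); // consumer side
        } finally {
            u.freeMemory(address);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(writeThenRead());
    }
}
```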

rick.ow...@gmail.com

Apr 30, 2015, 4:09:17 AM
to mechanica...@googlegroups.com

I am old school => one node per CPU core, disregarding hyper-threading. A CPU core is the smallest unit of processing power; there is no escape. If two processes running on the same machine need to talk to each other as fast as possible, then pipelining (Disruptor) is by far the fastest solution; you will agree with me on that. That would be the case of a feed talking to a strategy, for example. For the less critical stuff you will probably have other machines doing the job in a true distributed system. There is only so much you can stuff into a single machine, no?

How can NUMA regions make that better in this situation?

Martin Thompson

Apr 30, 2015, 4:10:25 AM
to mechanica...@googlegroups.com

Did you prepopulate the file with zeros to the correct size before mapping?


bob.tw...@gmail.com

Apr 30, 2015, 4:13:06 AM
to mechanica...@googlegroups.com, nit...@yahoo.com
Hi Nitsan,

Thanks for trying to help. It looks like I am doing everything correctly, so there must be something else happening that only code will tell :P I will write something to isolate the problem. I tested on Linux and Mac, JDK 1.7. Sometimes it works, sometimes it doesn't. It is like quantum physics: when I try to debug I make it work, so I can't debug :P

I will get back with some code !!!!

Nitsan Wakart

Apr 30, 2015, 4:15:55 AM
to mechanica...@googlegroups.com

"Concurrency over unsafe and aligned memory does NOT work in Java. According to my tests it works 90% of the time. I was close!"
I think you misrepresented me here; the conclusion in that post is that they work fine, but unaligned access is not atomic.



Nitsan Wakart

Apr 30, 2015, 4:19:31 AM
to mechanica...@googlegroups.com
OpenJDK/Oracle 1.7 before update 40 (or 45, I can't remember) had some issues I've observed in this area. Please make sure you use the latest JDK and that you use the Oracle/OpenJDK/Zulu releases; I have no idea what the Apple JDK will do...




bob.tw...@gmail.com

Apr 30, 2015, 4:20:35 AM
to mechanica...@googlegroups.com

Did you prepopulate the file with zeros to the correct size before mapping?

Hey Martin! What!?? I must be missing something here. I am trying to pass 1 million messages from producer to consumer using the memory-mapped file as a shared queue. So I map, let's say, 16k, put two sequences (read and write) at the beginning of the mapped buffer and go from there. It looks like the sequences are getting corrupted. When you create an empty mmapped file, the bytes you get from it are all zero by default. But I am writing first and reading second anyway. Totally lost here :)

Wojciech Kudla

Apr 30, 2015, 4:30:52 AM
to mechanica...@googlegroups.com

Nitsan,

Could you shed some more light on those issues?

Ben Evans

Apr 30, 2015, 4:32:03 AM
to mechanica...@googlegroups.com
The "Apple JDK" never made it to version 7.

The standard JDK on Apple for 7 & 8 is OpenJDK-based & released by Oracle.

Thanks,

Ben

Martin Thompson

Apr 30, 2015, 4:46:36 AM
to mechanica...@googlegroups.com
I think you should start with a lib written by someone who is experienced at this. Without seeing your code it is hard to tell what is going wrong.

Why not pick up a third-party lib and start with that rather than writing your own? Chronicle and the collections in Agrona are good starting points.

bob.tw...@gmail.com

Apr 30, 2015, 4:50:45 AM
to mechanica...@googlegroups.com, nit...@yahoo.com
OK. I broke the return key on my keyboard executing the test one hundred times on Linux Ubuntu:

java version "1.8.0_20"
Java(TM) SE Runtime Environment (build 1.8.0_20-b26)
Java HotSpot(TM) 64-Bit Server VM (build 25.20-b23, mixed mode)

AND IT WORKS! I mean, I did not see the problem a single time.

As soon as I switch back to:

java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)

And give it a couple of runs, and the bug happens! I am running the exact same JVM on my Mac. Same problem.

Therefore I am forced to conclude that we found a bug in the JVM :) :) :) (of course, my code never has bugs!)

If you guys are able to confirm that 1.7.0_51-b13 is crooked for unsafe volatile stuff, then I would sleep well tonight (4:48am here!)



Nitsan Wakart

Apr 30, 2015, 4:56:17 AM
to mechanica...@googlegroups.com
"Could you shed some more light on those issues?"
See this post:
"...all the [queue] implementations which utilize Unsafe to access the array [i.e. Unsafe.putOrdered/getVolatile] (Y5/SPSCQueue5 and later,  most of the FFBuffer implementations) were in fact broken when running the busy spin test and would regularly hang indefinitely. I tested locally and found I could not reproduce the issue at all for the SPSCQueue variations and could only reproduce it for the FFBuffer implementations which used no fences (the original) or only the STORE/STORE barrier (FFO1/2). Between us we have managed to verify that this is not an issue for JDK7u40 which had a great number of concurrency related bug fixes in it, but is an issue for previous versions! I strongly suggest you keep this in mind and test thoroughly if you intend to use this code in production."

Nitsan Wakart

Apr 30, 2015, 4:59:03 AM
to mechanica...@googlegroups.com
I can't really confirm anything without some code...

Martin Thompson

Apr 30, 2015, 5:08:24 AM
to mechanica...@googlegroups.com
On Thursday, 30 April 2015 09:09:17 UTC+1, rick.ow...@gmail.com wrote:

I am old school => one node per CPU core, disregarding hyper-threading. A CPU core is the smallest unit of processing power; there is no escape. If two processes running on the same machine need to talk to each other as fast as possible, then pipelining (Disruptor) is by far the fastest solution; you will agree with me on that. That would be the case of a feed talking to a strategy, for example. For the less critical stuff you will probably have other machines doing the job in a true distributed system. There is only so much you can stuff into a single machine, no?

The Disruptor is not even close to the fastest solution here. :-) It has too much indirection and thus is not prefetcher-friendly compared to other approaches. When setting references via it you are also subject to card marking.

There is a lot you can do on a single machine these days. The new Haswell Xeons are 18 cores per socket with terabytes of RAM.

Going across machines is also necessary for resilience alone. However, just because we do does not mean our on-box designs have to be compromised.
 
How can NUMA regions make that better in this situation?

You might want to break up a large JVM for many reasons. Things I often see a need for are different settings for biased locking, conditional card marking, GC, etc. You might also want to configure the OS around the JVM for security, or capabilities, or resource binding that does not involve individual core pinning.

Sometimes a 4-socket box is a better option than 4 servers. Node 0 can be the gateway to the outside world, with nodes 1-3 running different applications like a ticker plant, trading strategies, and back office.

bob.tw...@gmail.com

Apr 30, 2015, 5:18:01 AM
to mechanica...@googlegroups.com, nit...@yahoo.com
Ok folks !!!! I can confirm that everything works great on Linux ubuntu:

java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)

> Between us we have managed to verify that this is not an issue for JDK7u40 which had a great number of concurrency related bug fixes in it

Not sure about u40, but u51 is not working for sure!

I will try to come up with some code that clearly isolates the problem so I can post here.

> were in fact broken when running the busy spin test and would regularly hang indefinitely. 

It was exactly that. It looks like a visibility problem to me. A Thread.sleep(1) was solving the issue because it flushes the CPU cache, I would imagine...

Cheers !!!!!!!!!!

Nitsan Wakart

Apr 30, 2015, 5:55:04 AM
to mechanica...@googlegroups.com

Vitaly Davidovich

Apr 30, 2015, 7:20:37 AM
to mechanica...@googlegroups.com, Nitsan Wakart

Can you show the code? I don't think Thread.sleep() has any direct relationship to this, and you don't need to "flush the cpu cache" (not quite the right terminology, by the way). More likely there's some code generation problem that Thread.sleep() is perturbing.

Can you reproduce this if you run with -Xint? Also try -client to use the C1 JIT compiler.

sent from my phone


Vitaly Davidovich

Apr 30, 2015, 7:23:45 AM
to mechanica...@googlegroups.com

As mentioned in the other thread, it may be worth noting that an unaligned read/write can be atomic so long as it doesn't cross a cache line, depending on the CPU.

sent from my phone

Wojciech Kudla

Apr 30, 2015, 7:31:16 AM
to mechanica...@googlegroups.com

I remember reading slightly opposing opinions on word tearing.
One says it's not atomic; the other states that it would force atomicity by locking the bus.
I guess that depends on the particular CPU model. Is there any information on what to expect from different Intel CPUs?

Vitaly Davidovich

Apr 30, 2015, 7:46:24 AM
to mechanica...@googlegroups.com

The Intel software developer's manual talks about atomicity guarantees. I think atomicity of unaligned accesses within a cache line has been around for a while on Intel; what's fairly recent (Sandy Bridge+) is unaligned accesses being the same speed as aligned, for the most part. Store-to-load forwarding also works for them, for the most part, IIRC.

If you use lock'd instructions to read/write unaligned, cacheline-spanning locations, then you do get atomicity due to bus arbitration - perhaps that's what you're thinking of? AFAIK, there's no automatic bus arbitration for plain unaligned accesses on Intel.

sent from my phone

Richard Warburton

Apr 30, 2015, 8:33:09 AM
to mechanica...@googlegroups.com
Hi,

OK. I broke the return key on my keyboard executing the test one hundred times on Linux ubuntu:

Keyboards are for life, not just for Christmas. Also, POSIX shell has for loops:

for i in `seq 1 100`; do echo $i; done

regards,

  Richard Warburton
