And the memory_order_consume semantics, to get better performance on CPUs with weak memory models... it's interesting for a Java programmer like me to read code like this :) It helps show what Java could gain by adding a little more control over certain operations...
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
I spent a while looking at read-copy-update (RCU) and QSBR for Aeron's C++ API. I opted not to use any of those techniques because they require users to make idle calls for reclamation, which is difficult for a library to impose effectively. But hooking pthread calls, among other options, is quite valid, and from a systems perspective it is trivial to add these kinds of reclamation calls to an idle strategy. I'm a proponent of the general approach because of what it opens up.

It's worth noting that C++11 atomics are quite laborious compared to the JMM. The weaker memory orders for std::atomic are... tricky, but more useful than most in the C++ community seem willing to accept. I'm not sure Java could actually leverage them, honestly. I don't know whether the optimizations they enable are things the current JVM optimizers can do much with. So much data dependency.
On Monday, February 8, 2016, Todd Montgomery <tm...@nard.net> wrote: [quote trimmed]

These are coming to Java by way of VarHandle, with the exception of memory_order_consume, I think.

By "so much data dependency" do you mean pointer/memory chasing?
Hmmm. Might have to look at the latest VarHandle changes then. Didn't notice them before.

Yes, sorry for being imprecise. When I was doing some optimizations with memory_order_relaxed before (not in Aeron, though), the more indirection there was (specifically pointer chasing), the fewer optimizations could be done. In one case, a change to use std::make_shared was enough to allow the optimizer to do its thing. So I've started to dig down and look when I have the chance. I haven't been able to find real analogues in the JVM yet; i.e. the same kinds of things don't do anything useful.

Aeron's usage of relaxed is in the rate reporter and tied to the samples only, which was an easy optimization. But you can play with heap-allocating the RateReporter and see if it gets the same optimizations. When I checked clang pre-7, it didn't optimize as well.
Assembly for my claims - http://goo.gl/5rU0l3
On Monday, February 8, 2016 at 9:27:04 PM UTC-8, Rajiv Kurian wrote:
The only good theoretical use of memory_order_relaxed I can find is updating counters, progress bars, etc., where you give the compiler more room to play with dependencies than memory_order_release allows. Do you have any short examples where this actually generates better code? On x86, at least, both GCC and Clang generate a "lock add" for fetch_add in both relaxed and release mode. Similarly, they generate a simple "mov" for a store in both relaxed and release mode. So the only place I see it generating better code is some affordance gained by playing with dependencies.

What I found weird is that neither GCC nor Clang optimizes a loop of stores in release or relaxed mode down to a single store. As far as I can tell that transformation is totally valid.
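The sort of test case behind these observations, as a sketch. The codegen notes in the comments restate the claims above for x86 at the time of this thread; check the linked assembly rather than trusting them for current compilers:

```cpp
#include <atomic>

std::atomic<int> counter{0};
std::atomic<int> flag{0};

void increment_relaxed() {
    // On x86, relaxed and release fetch_add reportedly both compile
    // to `lock add`; the relaxation buys nothing here.
    counter.fetch_add(1, std::memory_order_relaxed);
}

void store_release(int v) {
    // On x86, relaxed and release stores reportedly both compile to a
    // plain `mov`, since x86 stores are not reordered with each other.
    flag.store(v, std::memory_order_release);
}

void store_loop_relaxed() {
    // The transformation in question: could a compiler legally collapse
    // this loop to a single flag.store(100, relaxed)? Neither GCC nor
    // Clang did so at the time of this thread.
    for (int i = 1; i <= 100; ++i)
        flag.store(i, std::memory_order_relaxed);
}
```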
I last looked about a year or so ago, and at that time neither GCC nor Clang optimized memory_order_consume in cases where it could actually be optimized (both preferring to just treat it like memory_order_acquire). Not that it even matters much for x86. Though it seems there are a few papers out there to narrow its definition and actually make it implementable - http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4036.pdf
On Tuesday, February 9, 2016, Rajiv Kurian <geet...@gmail.com> wrote: [quote trimmed]

Yeah, I don't think x86 is the best example for the weaker-order operations, given the CPU has a pretty strong memory model (only store-load can be reordered). Relaxed mode can be good if it allows other scheduling to be done around the load/store, but again, given the x86 OoO engine, there's not much juice to be squeezed there.

On weaker memory models these things can be more useful. For example, some archs have a weak CAS that doesn't act like a full fence (unlike x86). Other archs don't order stores, which means relaxed and release stores will issue different instructions. Archs like Itanium are in-order, so the compiler does the scheduling; I suspect you would see a lot more code motion opened up by weak-order operations there.

As for the loop of stores: why do you think that is a valid transformation? AFAICT all the ordered instructions require the compiler to actually perform them; it can schedule other things around them and omit CPU fences for some of them, but the operation must be performed (like volatile, except with atomicity).
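The weak CAS mentioned above, sketched with compare_exchange_weak. The weak form may fail spuriously (e.g. on LL/SC architectures it can map to a bare load-linked/store-conditional pair), so it belongs in a retry loop:

```cpp
#include <atomic>

std::atomic<int> value{0};

// Retry loop using the weak form. On ARM/POWER this can avoid the
// full-fence behavior of x86's lock cmpxchg; spurious failure (the
// CAS failing even though `expected` matched) is harmless because
// the loop simply retries.
int increment_via_cas() {
    int expected = value.load(std::memory_order_relaxed);
    while (!value.compare_exchange_weak(expected, expected + 1,
                                        std::memory_order_acq_rel,
                                        std::memory_order_relaxed)) {
        // On failure, `expected` is refreshed with the current value.
    }
    return expected + 1;
}
```

compare_exchange_strong hides the retry internally; the weak form exposes it, which is exactly the kind of control the thread is discussing.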
On Tuesday, February 9, 2016, Rajiv Kurian <geet...@gmail.com> wrote: [quote trimmed]

My understanding is memory_order_consume is really meant for archs like Alpha that don't respect data-dependent loads through a pointer, which are rare (is there anything besides Alpha with that property?).
On Monday, February 8, 2016 at 6:21:40 PM UTC-8, Vitaly Davidovich wrote:
On Monday, February 8, 2016, Todd Montgomery <tm...@nard.net> wrote:I spent a while looking at read-copy-update (RCU) and QSBR for Aeron's C++ API. Opted not to use any of those techniques due to the need to have users make idle calls for reclamation. Which is difficult for a library to impose effectively. But hooking pthread calls and some other options are quite valid. And at a systems perspective, it is trivial to add these type of reclamation calls to an idle strategy. I'm a proponent of the general approach because of what it opens up.It's worth noting that C++11 atomics are quite laborious in comparison to the JMM. The weaker memory models for std::atomic are.... tricky. But more useful than most in the C++ community seem to want to go along with. But not sure Java could actually leverage them, honestly. I'm not sure if the optimizations they enable are things the current optimizers in Java are able to do much with. So much data dependency.These are coming to Java by way of VarHandle, with the exception of memory_order_consume I think.By "so much data dependency" you mean pointer/memory chasing?
On Mon, Feb 8, 2016 at 10:11 AM, Francesco Nigro <nigr...@gmail.com> wrote:
And the memory consume semantic to exploit better perf on weak memory model CPUs...it's interesting to read code like this for a java programmer like me :) It help to understand what java could gain adding a little bit of control for certain operation...
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Sent from my phone
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Sent from my phone
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
On Tuesday, February 9, 2016, Rajiv Kurian <geet...@gmail.com> wrote:The only good theoretical use of memory_order_relaxed I find is updating counters, progress bars etc where you give the compiler more room to play around with dependencies than with memory_order_release. Do you have any short examples where this actually generates better code? On x86 at least both GCC and Clang generate a "lock add" for fetch_and_add in both relaxed and release mode. Similarly they generate a simple "mov" for a store operation for both relaxed and release mode. So the only place where I see it generating better code is some affordance through playing around with dependencies.Yeah I don't think x86 is the best example for the weaker order operations given the cpu has pretty strong memory model (only store-load can be reordered). Relaxed mode can be good if it allows other scheduling to be done around the load/store, but again, given x86 OoO engine there's not much juice to be squeezed there.On weaker memory models, these things can be more useful. For example, some archs have a weak CAS which doesn't act like a full fence (unlike x86). Other archs don't order stores which means a relaxed and release store will issue different set of instructions. Other archs like Itanium are in-order so compiler does scheduling - I suspect one would see a lot more different code motion opened up by using weak order operations.What I found weird is that neither GCC nor Clang optimize a loop of stores in release or relaxed mode to a single store. As far as I can tell that transformation is totally valid.Why do you think this is a valid transformation? AFAICT all the ordered instructions require the compiler to actually perform them; they can schedule other things around them and omit cpu fences in some of them, but the operation must be performed (like volatile, except with atomicity).
On Monday, February 8, 2016 at 6:49:02 PM UTC-8, Todd L. Montgomery wrote:
Hmmm. Might have to look at the latest VarHandles changes then. Didn't notice them before.Yes, sorry for being imprecise. As I was doing some optimizations before using memory_order_relaxed (not Aeron, though), the more indirection (specifically pointer chasing),the less optimizations could be done. In one case, a change to use std::make_shared was enough to allow the optimizer to do its thing. So, I've started to dig down and look when Ihave the chance. Haven't been able to really find analogues in the JVM yet. i.e. same kinds of things don't do anything useful.Aeron usage of relaxed is in the rate reporter and tied to the samples only. Which was an easy optimization. But you can play with heap allocating the RateReporter andsee if that has the same optimizations. When I checked clang pre-7, it didn't optimize as well then.
On Mon, Feb 8, 2016 at 6:21 PM, Vitaly Davidovich <vit...@gmail.com> wrote:
On Monday, February 8, 2016, Todd Montgomery <tm...@nard.net> wrote:I spent a while looking at read-copy-update (RCU) and QSBR for Aeron's C++ API. Opted not to use any of those techniques due to the need to have users make idle calls for reclamation. Which is difficult for a library to impose effectively. But hooking pthread calls and some other options are quite valid. And at a systems perspective, it is trivial to add these type of reclamation calls to an idle strategy. I'm a proponent of the general approach because of what it opens up.It's worth noting that C++11 atomics are quite laborious in comparison to the JMM. The weaker memory models for std::atomic are.... tricky. But more useful than most in the C++ community seem to want to go along with. But not sure Java could actually leverage them, honestly. I'm not sure if the optimizations they enable are things the current optimizers in Java are able to do much with. So much data dependency.These are coming to Java by way of VarHandle, with the exception of memory_order_consume I think.By "so much data dependency" you mean pointer/memory chasing?
On Mon, Feb 8, 2016 at 10:11 AM, Francesco Nigro <nigr...@gmail.com> wrote:
And the memory consume semantics to exploit better perf on weak memory model CPUs... it's interesting for a Java programmer like me to read code like this :) It helps to understand what Java could gain by adding a little bit of control for certain operations...
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Sent from my phone
On Tuesday, February 9, 2016, Rajiv Kurian <geet...@gmail.com> wrote:

I last looked about a year or so ago, and at that time neither GCC nor Clang optimized memory_order_consume in cases where it could actually be optimized (both preferring to just treat it like memory_order_acquire). Not that it even matters much for x86. Though it seems like there are a few papers out there to narrow its definition and actually make it implementable: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4036.pdf

My understanding is memory_order_consume is really meant for archs like Alpha that don't respect data-dependent loads through a pointer, which are rare (is there something else besides Alpha with such a property?).
They are somewhat problematic though. One can imagine updating a progress bar from some download thread using release order, with the UI thread reading it and showing it on the screen. The compiler is free to figure out that your progress-bar atomic goes from 0 to 100 and set it directly to 100, kind of killing the experience. Given how release works, I think the compiler could move things (like the real work of downloading) above the release store and set it directly to 100.
Yeah I feel like some of these transformations are not done because of the assumption that atomics subsume volatile semantics.
On Tuesday, February 9, 2016, Rajiv Kurian <geet...@gmail.com> wrote:
Yeah I feel like some of these transformations are not done because of the assumption that atomics subsume volatile semantics.

My guess is they simply play it conservative; given the relative recency of atomics and the serious problems miscompiles can cause, I don't blame them :). That paper alludes to ongoing research to determine which exact transformations are safe.

They are somewhat problematic though. One can imagine updating a progress bar from some download thread using release order, with the UI thread reading it and showing it on the screen. The compiler is free to figure out that your progress-bar atomic goes from 0 to 100 and set it directly to 100, kind of killing the experience. Given how release works, I think the compiler could move things (like the real work of downloading) above the release store and set it directly to 100.

Well yeah, but it's a data race to begin with if relaxed mode is used; e.g. the consumer may not see increments of the progress bar simply due to scheduling. If it's racily observing some counter then anything goes, which means 0 to 100 is possible. That being said, the conservative (and safe, although not as performant) approach is to treat it like volatile in this regard. It sounds like over time compilers will get more aggressive there.
Starting conservatively and gradually getting more aggressive has its own pitfalls -- people grow to expect the existing behavior, then moan when it changes.
Starting conservative and going aggressive later means real programs will break down the line with a new compiler, and people will shout at the compiler writers, who will either revert their changes or leave us with broken programs.
Starting conservative and going aggressive later means real programs will break down the line with a new compiler, and people will shout at the compiler writers, who will either revert their changes or leave us with broken programs.

I'd expect compiler devs to take possible breakage into account when determining whether further optimizations are worth the risk. But improving the optimizer (or the stdlib, for that matter) almost always carries some risk of breaking someone, for various definitions of "breaking" (races, UB, perf degradation, etc.). That's the price of progress, IMHO. What's a better alternative?
On Tue, Feb 9, 2016 at 1:12 PM, Rajiv Kurian <geet...@gmail.com> wrote:
On Tuesday, February 9, 2016 at 9:29:26 AM UTC-8, Vitaly Davidovich wrote:
Well yeah, but it's a data race to begin with if relaxed mode is used; e.g. the consumer may not see increments of the progress bar simply due to scheduling. If it's racily observing some counter then anything goes, which means 0 to 100 is possible. That being said, the conservative (and safe, although not as performant) approach is to treat it like volatile in this regard. It sounds like over time compilers will get more aggressive there.

This will happen even in release mode and not just relaxed mode, since operations after the release store might be brought above it but not the other way round. It is a data race, exactly! The compiler is free to do this reordering per the C++11 memory model. The problem is that people grow used to the existing behavior, which as it stands is super conservative, and after these optimizations programs break very subtly. This is not a simple "the programmer was doing it wrong so they deserved it" thing. Real software will be written with these wrong assumptions, and it will break when the compiler decides to take advantage of the spec to optimize further. For a good parallel, one can see all the software that has subtly (or sometimes very obviously) broken when the compiler took advantage of undefined behavior to create nasal demons. Clang has become especially aggressive in recent years and has gone as far as optimizing any function with undefined behavior down to a "rep ret".
Compiler devs are way too eager to win benchmark games to do anything else
Look at all the programs (illegal but working) that they've broken through optimizations: null pointer checks elided because there was a dereference before them, range checks elided because x + 1 > x is assumed always true for signed x, a handwritten memcpy "optimized" into a standard-library memcpy call that is not async-signal-safe, thus breaking code.
The alternative is to be conservative and let developers have control. I know how to write 100 to memory directly; don't do it for me.
Yes, the language semantics are to blame - no argument there. The fact that C/C++ is chock full of UB instead of implementation-defined behavior is to blame.
As a sad side note, undefined behavior has now even crept into the so-called language-agnostic backends like LLVM. We are at a point where it is not even possible to verify that a C/C++ program does not invoke UB. I agree with you, we need a different language with more well-defined semantics. I know that every transformation the compiler is doing is legal - my only point was that it breaks real programs, and compiler devs know it will and they still don't care.
The memset, memcpy examples are particularly grievous and illustrate my point better - writing my own memset does not cause UB, but the compiler changes my program's semantics a lot. Another example is that a memset to 0 followed by free results in the memset being elided - again, no UB, but it's just the compiler being too smart, totally defeating the programmer's intention to make things secure. The fact that compilers are so aware of the standard library is somewhat weird to me.
Thanks for bringing up the CPU example - I was going to bring that up myself. What does a CPU do when you do something illegal? It will either fail to decode an illegal instruction or throw an exception. It won't silently recompile your code to eliminate checks, return early, or call your parents. Dividing by zero on a CPU causes a divide-by-zero exception. Shifting by more than the width of an integer has implementation-defined behavior. Neither of these "errors" can cause your CPU to issue instructions that will delete your hard drive. Programmers are fine with implementation-defined behavior (note: it HAS to be defined by every valid implementation, even though not by the standard) IMHO, and not undefined behavior, which can cause anything including erasing your hard drive or making demons fly out of your nose.
I wasn't quite arguing for the transformations - I was surprised that the compilers didn't do it given all the other transformations they do. I know how difficult it is for compilers to warn users of UB when they take advantage of it especially because of how it falls out of other passes like inlining etc. But it goes to show that compiler writers will prioritize optimizations for UB instead of building the (really really complex) infrastructure required to pass information between optimization passes so that undefined behavior can be pointed out.
I want a new language - have had enough of C :)
Yes, the language semantics are to blame - no argument there. The fact that C/C++ is chock full of UB instead of implementation-defined behavior is to blame.
I'm not sure implementation-defined is so much better (it is better, though) than UB. It makes code non-portable across compilers, or perhaps even across compiler versions, as each version may have different implementation-defined semantics.
As a sad side note, undefined behavior has now even crept into the so-called language-agnostic backends like LLVM. We are at a point where it is not even possible to verify that a C/C++ program does not invoke UB. I agree with you, we need a different language with more well-defined semantics. I know that every transformation the compiler is doing is legal - my only point was that it breaks real programs, and compiler devs know it will and they still don't care.

While I understand how some optimizations the compiler does can seem hostile to the developer, I don't think it's because compiler devs don't care. They simply look for code patterns/shapes, and then optimize them the exact same way -- the compiler cannot differentiate between a security-sensitive zero'ing of a stack buffer vs needless zero'ing that occurred because a lot of code was inlined, constants folded and propagated, branches folded/pruned out, calls devirtualized, and you're left with a pointless zero'ing once the dust settles. So it seems like the correct/reasonable thing to do here is to find ways to communicate to the compiler that some piece of code cannot be optimized a certain way. AFAIK, there are security-specific zero'ing calls in various C runtimes. But that's the gist of the issue -- the compiler simply does not have enough info to determine true intent.
The memset, memcpy examples are particularly grievous and illustrate my point better - writing my own memset does not cause UB, but the compiler changes my program's semantics a lot. Another example is that a memset to 0 followed by free results in the memset being elided - again, no UB, but it's just the compiler being too smart, totally defeating the programmer's intention to make things secure. The fact that compilers are so aware of the standard library is somewhat weird to me.

This is yet again another case of the compiler pattern matching without knowing the context. memset to 0 followed by free can just as well fall out of other optimizations, and it generally makes sense to elide the zero'ing, just like it makes sense to remove other needless operations that weren't needless on their own but became such after sufficient prior optimization.
Thanks for bringing up the CPU example - I was going to bring that up myself. What does a CPU do when you do something illegal? It will either fail to decode an illegal instruction or throw an exception. It won't silently recompile your code to eliminate checks, return early, or call your parents. Dividing by zero on a CPU causes a divide-by-zero exception. Shifting by more than the width of an integer has implementation-defined behavior. Neither of these "errors" can cause your CPU to issue instructions that will delete your hard drive. Programmers are fine with implementation-defined behavior (note: it HAS to be defined by every valid implementation, even though not by the standard) IMHO, and not undefined behavior, which can cause anything including erasing your hard drive or making demons fly out of your nose.

I wasn't thinking of CPUs in terms of them doing completely nonsensical things in the face of illegal instructions, traps, etc. Instead, I was thinking of them doing things out of order, specifically reordering memory loads/stores with respect to their program order. If someone is writing racy programs that happen to work today, should the CPU vendor be disallowed from making further reordering optimizations for fear of breaking someone whose code happens to work today by virtue of current CPU reordering? This is analogous to more aggressively optimizing things like memory_order_relaxed.
I wasn't quite arguing for the transformations - I was surprised that the compilers didn't do them, given all the other transformations they do. I know how difficult it is for compilers to warn users of UB when they take advantage of it, especially because of how it falls out of other passes like inlining. But it goes to show that compiler writers will prioritize optimizations for UB instead of building the (really, really complex) infrastructure required to pass information between optimization passes so that undefined behavior can be pointed out.

I think they're trying to get better at this, both through compiler warnings and through the toolchains including address sanitizers, thread sanitizers, UB sanitizers, etc. Compiler writers are like anyone else in terms of exploiting constraints -- if you're programming something with certain constraints on the domain, it behooves you to take advantage of them to optimize performance, at least in release builds. Now, if those constraints happen to be hard for users to program against, or error prone, well then the blame is with the constraints (i.e. the C/C++ spec).
But let's look at the other side of the coin - Java. It has a lot more safety and defined behavior, but a poorer performance model. The JIT now needs to pull heroics to try and recover some of the performance loss. It's able to do that in some cases, but not nearly all of them, leading to people opting out of those safety checks. I do think that safe-by-default with explicit opt-in escape hatches is a better default than fully unsafe. Its optimizer isn't nearly as aggressive because it has to ensure the semantics are preserved with optimizations applied (i.e. Java code running in the interpreter must behave the same way as JIT-compiled code, modulo speed). A quick and simple example of such an artifact, supposing you have:

    final Foo _foo;

    int bar() {
        return _foo.doSomething();
    }

    class Foo {
        int doSomething() { return 8; }
    }

The JIT inlines doSomething and returns 8 from bar(), but it still performs an explicit null check of _foo. Instance final fields aren't treated as compile-time constants (but, mind you, *static* final fields are!) because there are frameworks that set them via reflection, even though that's not defined behavior. So now Hotspot engineers are thinking of clever ways of addressing this by speculatively treating them as constants, but detecting such writes to them dynamically and deopt'ing the compiled code. But now that's a lot of complexity (which may bring about its own bugs) to perpetuate users' bad behavior which was never allowed but happened to work. The "rest of us" who aren't doing such things get worse performance in the meantime.

So basically every language will have some warts. C/C++ tends to care a lot more about performance than safety vs, say, Java, and focuses more on exploiting that. Java cares more about safety/portability/uniformity, and will always lean towards those things if there's ever a safety vs performance tradeoff to be made.
No matter what, with both languages having so many users, there will always be people who are unhappy with the direction :).
I want a new language - have had enough of C :)

You're not alone, but thus far there's no suitable replacement for the niches C occupies (things like Rust look promising, but they'll need to attain some critical mass of users before gaining sufficient momentum to make a dent here, IMO).
Racy programs have undefined ordering, but they can't delete your hard drive can they?
Racy programs have undefined ordering, but they can't delete your hard drive, can they?

They can delete your hard drive. They could even delete you, if the program controls things connected to the real world, like the brakes on your car or a nuclear missile in a silo :)
The CPU will not change your code, but it can execute it in a different order, leading to undefined behavior from the application's point of view. It may not cause things like buffer overflows or double frees in managed languages, but it may make you send bogus orders to an exchange, corrupt a database, turn off a pacemaker, or make an autopilot do something bogus. If you rely on what memory_order_relaxed does today (what started this discussion), your code will break if that changes. The bottom line is that if you rely on unspecified behavior, you can get undefined behavior in your system. If you rely on the implementation details of a library, your code will break when those change. How exactly the bug manifests itself is secondary.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Sent from my phone
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Sent from my phone
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Sent from my phone
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
Some UB exists for things the compiler can't foresee or can't reliably detect. Check out the UB associated with const_cast: http://en.cppreference.com/w/cpp/language/const_cast Then imagine if that had to be defined for all possible access patterns.
Personally, I can deal with UB. I can find the rules and I can, for the most part, understand them, whether a given rule is a hole in the spec or just a way of avoiding a compatibility nightmare.
This has been a great discussion. Since we are bashing on C++ some, I thought I would throw some stuff out for thought. I've got to say, though, the memcpy/memset optimization behavior is not new. Variables and actions being optimized away is not new. Relying on side effects has always been a dicey idea, no matter the language. Java has the same issue with variables that can be optimized out, which is common in micro-benchmarks. It should not be surprising that the stdlib is optimized, since it is now part of the C++ spec and has been part of the C spec for a very long time. It has been optimized for a while in many compilers.
In fact, these issues are so well known and understood that there are CERT recommendations attached to these and similar practices. Some links.
And if you think that Java is a secure language.... yeah, read through
> In fact, these issues are so well known and understood that there is a CERT attached to some and similar practices.

"Do not depend on undefined behavior" is easier said than done. Just as compilers find it difficult to report undefined behavior (but have no difficulty exploiting it), undefined behavior can arise from multiple levels of inlining of the user's code, and a human may not be able to track it as well as the compiler can. Why else are smart people like Ulrich Drepper and the maintainers of OpenSSL writing so much UB?
> And if you think that Java is a secure language.... yeah, read through

That seems like a long list. A lot of it is about vulnerabilities like SQL injection, which is not a Java-specific thing. Others are about catching exceptions, which is a best practice. None of the points seem to be undefined behavior. Java has some implementation-defined behavior, which is very different from C++'s nasal-demons undefined behavior.
On Tue, Feb 9, 2016 at 3:51 PM, Rajiv Kurian <geet...@gmail.com> wrote:
> Why else are smart people like Ulrich Drepper and the maintainers of OpenSSL writing so much UB?

Writing quality code isn't about being smart so much as being rigorous and relentless. Having seen OpenSSL, it grew a lot of cruft over the years. It has issues, a lot of them, some obvious, some not. And anyone can make mistakes.
If we started blaming a language as the root of all evil, we had better start with JavaScript, because there are some horrendous web apps out there. :)
> Java has some implementation defined behavior which is very different from nasal demons C++ undefined behavior.

Java is a language controlled by a single company; C++ is governed by a standards committee. Each has pluses, minuses, and compromises.
To be more specific around Java, check into the layers of buffer handling, etc. around https://www.securecoding.cert.org/confluence/display/java/MSC59-J.+Limit+the+lifetime+of+sensitive+data which is similar in nature. Different beasts, but a similar security intention.
int is platform-dependent in size; it must be "at least" 16 bits: http://en.cppreference.com/w/c/language/arithmetic_types And yes, some platforms do have 16-bit ints and some 64-bit ints. I'm a fan of using fixed-size types.
Interesting: if you play around with that, it seems to warn only when it can find a constant/literal that is passed directly to "<<", e.g. change shift45 to `return z << 45` and the warning shows up. To me this suggests the compiler can't propagate that particular warning past the function declaration. Which isn't surprising, really.
> Writing quality code isn't about being smart so much as rigorous and relentless. [...] And also, anyone can make mistakes.

When experienced coders are regularly tapping into UB, one can blame the language. It is a weird mix: too low-level, yet not low-level enough to let you be specific. We deserve better. It's been more than 50 years.
> If we started blaming a language as the root of all evil, we better start with JavaScript because there is some horrendous web apps out there. :)

I write and read C++ regularly, so I am more passionate about blaming it. I am sure I'd be blaming JavaScript if I wrote it more often.
> Java has some implementation defined behavior which is very different from nasal demons C++ undefined behavior.

> To be more specific around Java, check into the layers of buffer, etc. around https://www.securecoding.cert.org/confluence/display/java/MSC59-J.+Limit+the+lifetime+of+sensitive+data Which is similar in nature. Different beasts, but similar security intention.

All of those seem to stem from Java's garbage collection and from objects not disappearing as soon as they go out of scope. The compiler is not eliminating memsets; there is a very straightforward way of writing zeros into a buffer in Java. A C compiler can eliminate the zeroing of memory even when that memory is actually read later, once it is handed back by the memory allocator. Moreover, given that there is little (if any) undefined behavior in Java, it doesn't do any of the crazy things that C compilers are allowed to do.
On Tue, Feb 9, 2016 at 4:19 PM, Rajiv Kurian <geet...@gmail.com> wrote:
> When experienced coders are regularly tapping into UB, one can blame the language. [...] We deserve better. It's been more than 50 years.

I can't argue with that. It's a firearm without any safety; it's wise to carry it carefully.

> I write and read C++ regularly so I am more passionate about blaming it. I am sure I'd be blaming Javascript if I wrote it more often.

I tend not to blame the language. In C/C++, when I find something amiss, I check the spec and look into it. Usually I find out I've been doing something wrong and the spec says don't do it. To which I just sigh and learn. I don't like it, I just accept it. Call me resigned. In Java, I tend to look at the doc, see that it basically told me what to expect, then sigh and learn. Except when the optimizer won't inline. Then I use coarse language and throw things. :)
By "infer" I meant the warning subsystem. We could argue about the merits of how best to surface this type of thing so that the optimizer and the warnings are both happy. But your point is well taken: sometimes the compiler can warn and sometimes it can't, simply because of how it is implemented. And with gray areas of the spec, it is going to be messy.
Meh, I blame the language, the language being the spec, that is. The language is specific about when it is NOT specific, which doesn't help much because it is ultimately not very specific. I like this test that I read on Twitter the other day: would you use a compiler that gives you 4x speed, but if you have a single undefined behavior anywhere it will kill your entire family? I'd choose no, if I were given the choice. :)
On Tuesday, February 9, 2016 at 3:13:31 PM UTC-8, Vitaly Davidovich wrote:
> The bottom line is if you're relying on unspecified behavior, you can get undefined behavior in your system. [...] How exactly the bug will manifest itself is secondary.

I think we agree that one should not depend on the implementation details of a library. I also agree that most of the changes the C compiler makes are legal, because of the lax spec. The lax spec is what I am against. It means that operations perfectly defined by the underlying hardware are potentially converted into nasal demons instead of causing compiler errors. Please note the big difference: x86 allows `sal eax, 33`, but C compilers choose to convert a shift by 33 into a `rep ret` without a single warning. So the compiler identifies the problem, takes advantage of it, and chooses NOT to inform you about it. And you are okay with that?
I think this discussion is winding down but I'll comment on a few select things ...
On Tuesday, February 9, 2016, Rajiv Kurian <geet...@gmail.com> wrote:
> So the compiler identifies the problem, takes advantage of it and chooses NOT to inform you about it. And you are okay with it?

No, I'm not OK with it. Your shift45 example in this thread is a good one, and it illustrates a problem with the compiler itself, not just the spec (I think we agree the spec is the root enabler of this mess). But let's suppose overflowing shifts were implementation-defined rather than UB. In the shift example, it's highly likely that shifting an int32 by 45 is a bug. With implementation-defined behavior you'll get some answer, but it will be just as wrong as the compiler returning 0, since it's a bug.
That wrong answer will have undefined consequences for further execution.
The bigger issue, IMO, is the compiler can remove *subsequent* perfectly legal code if it's dominated by the illegal code.
I understand what you are saying, but this is not necessarily true. If I am coding for x86, for example, shift by greater than the bit width does what I want it to do when writing a bitmap. Here is an example for a 128-bit bitmap: http://goo.gl/WvzT2c Note how setBitWithDefensiveShift() and setBitWithoutDefensiveShift() have the exact same assembly. The compiler notices that the extra mod 64 is redundant on x86 and eliminates it in the setBitWithDefensiveShift() version. If you were to call either function with a random bitIndex, your output would be exactly the same. However, when you try the two functions with the exact same constant arguments, setBitWithoutDefensiveShift() invokes undefined behavior and setBitWithDefensiveShift() doesn't! This is pretty sad, given that the compiler had the smarts to know that the mod 64 was not needed to begin with and compiled both functions down to the same assembly. Now here is the version where we forbid the compiler from inlining: http://goo.gl/zmM3kv The assembly is now exactly the same for the implementation and the invocations. We are at the mercy of the compiler.
> We are at the mercy of the compiler

Well, in the UB examples you're explicitly invoking UB. In the no-UB examples, you're explicitly reducing the bitIndex because you know the overflow is UB. The fact that the compiler treats the mask as a no-op and generates the same assembly is beside the point: you wrote different semantics. Having said that, I agree the compiler could do a better job of warning on this. It is another example of missing diagnostics, like those discussed earlier in the thread.
Soooooo, use macros instead of inline functions! :) j/k
Dan Eloff wrote:
> I agree wholeheartedly, if the compiler can detect UB and abuse it, I would rather it dump an (even unhelpful) error
> instead.
In theory this can be done, but in practice doing it without a huge
number of false positives is a major layering issue. For instance,
the component that "proves" `x s< (x+1)` usually has no idea whether
simplifying that predicate will actually result in a transform that
exploits the UB on overflow. I'd say you'd have to structure a C/C++
compiler from the ground up to accommodate the kind of design that
would print a diagnostic when UB has been exploited.
Also, relevant to this thread, there have been some proposals around cutting down the amount of
UB in C/C++ to manageable levels:
http://blog.regehr.org/archives/1180.
-- Sanjoy
> Having said that, I agree the compiler could do a better job of warning on this - this is another example of missing diagnostics, like discussed earlier in the thread.

Yes, if it is UB and the compiler can detect it, then it should make some noise instead of generating no-ops (or worse). Except I don't think it should be a warning; it should be an in-your-face "We are about to delete your code - fix it or we won't compile!!" error.
This also shows that implementation-defined would be fine in this example, because the implementation (x86) is perfectly fine with a shift above the register width, and the code is still perfectly right for the implementation it was built for.
I didn't write separate semantics for the platform I was coding for; it is the exact same semantics (as observed by the compiler). When I memcpy a uint64_t into a uint8_t array of size 8, I get platform-defined behavior, not a `ret`. I don't see any good reason why shift should be otherwise besides "because the standard says so". There is no good reason to make shift by greater than register width undefined when every piece of hardware I know of defines it. Shift is just one example: detectable unaligned loads, detectable data races, detectable reads from uninitialized storage, and detectable access beyond lifetime are all things that could have much better behavior as implementation-defined or unspecified (or, preferably, error messages) than the current nasal-demons UB. Returning 0 can hide problems forever. The compiler is allowed to return the right answer every time in your test binary and invoke the UB only in the release binary. John Regehr's blog post that Sanjoy linked to goes into more detail. I don't see any of his suggestions leading to any performance loss in real-life code. Only in the C/C++ community do I see people okay with this adversarial relationship between compilers and programmers. The whole "you did wrong, so you must be punished" attitude doesn't make sense to me. The compiler folks took great effort to add all these smarts to detect UB, but since reporting it would require a complete rewrite of the compiler, they decided exploiting it for "performance gains" was a better idea. The spec and its implementations are supposed to help us find our errors, not punish us for the sins we've committed. Every time there is a security bug from such UB, I see the security community (full of experienced C developers) wanting a simple C compiler (I don't think this will help) or a friendly C dialect (I do think this will help). We could rally for a better language/spec, or we could accept programs written by expert C programmers breaking with every other compiler upgrade.
Like John Reghr says here - I'm pretty sure I know which one of these will happen.
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
On Wednesday, February 10, 2016, Rajiv Kurian <geet...@gmail.com> wrote:
On Wednesday, February 10, 2016 at 10:08:12 AM UTC-8, Vitaly Davidovich wrote:

I understand what you are saying, but this is not necessarily true. If I am coding for x86, for example, shift by greater than bit width does what I want it to do when writing a bitmap. Here is an example for a 128-bit bitmap - http://goo.gl/WvzT2c Note how setBitWithDefensiveShift() and setBitWithoutDefensiveShift() have the exact same assembly. The compiler notices that the extra mod 64 is redundant on x86 and eliminates it in the setBitWithDefensiveShift() version. If you were to call either function with a random bitIndex, your output would be exactly the same. However, when you try the two functions with the exact same constant numbers, setBitWithoutDefensiveShift() invokes undefined behavior and setBitWithDefensiveShift() doesn't! This is pretty sad given that the compiler had the smarts to know the mod 64 was not needed to begin with and compiled both functions down to the same assembly. Now here is the version where we forbid the compiler from inlining - http://goo.gl/zmM3kv The assembly is now exactly the same for the implementation and the invocations. We are at the mercy of the compiler.

Well, in the UB examples you're explicitly invoking UB. In the no-UB version, you're explicitly reducing the bitIndex because you know overflow is UB. The fact that the compiler treats it as a nop and generates the same assembly is beside the point -- you wrote different semantics. Having said that, I agree the compiler could do a better job of warning on this - this is another example of missing diagnostics, like discussed earlier in the thread.

Yes, if it is UB and the compiler can detect it, then it should create some noise instead of generating noops (or worse). Except I don't think it should be a warning - it should be an in-your-face "We are about to delete your code - fix it or we won't compile!!" error.

In this case, yes, but compilers routinely delete your code as part of normal, mundane optimization. It would need to be smart enough to not cause false positives. Then you probably have pass-ordering issues inside the backend, where the pass that does this transform no longer has access to the AST, or at least to unmodified IR.

Also shows that implementation-defined would be fine in this example because the implementation (x86) is perfectly fine with the shift above register width and the code is still perfectly correct for the implementation it was built for.

Good argument if you're writing assembly, but you're not. As you said earlier, it's no longer a portable assembler. It may never have been in principle, since you were always targeting an abstract C machine, but compilers didn't used to do much optimization. If you're writing C/C++ you need to obey its rules (and their rules kinda suck by modern standards, which I agree with).
I didn't generate separate semantics for the platform I was coding for. It is the exact same semantics (as observed by the compiler). When I memcpy a uint64_t to a uint8_t array of size 8, I get platform-defined behavior, not a 'ret'. I don't see any good reason why shift should be otherwise besides "Because the standard says so". There is no good reason to make shift by greater than register width undefined when every piece of hardware I know of defines it. Shift is just one example - detectable unaligned loads, detectable data races, detectable reads from uninitialized storage, and detectable access beyond lifetime are all things that can have much better behavior when implementation-defined or unspecified (or, preferably, an error message) than the current nasal-demons UB. Returning 0 can hide problems forever. The compiler is allowed to return the right answer every time in your test binary and invoke the UB only in the release binary. John Regehr's blog post that Sanjoy linked to goes into more detail. I don't see any of his suggestions leading to any performance loss in real-life code.

Only in the C/C++ community do I see people okay with this adversarial relationship between compilers and programmers. The whole "you did wrong, so you must be punished" attitude doesn't make sense to me. The compiler folks took great effort in adding all these smarts to detect UB. But since reporting these would require a complete rewrite of the compiler, they decided exploiting it for "performance gains" was a better idea. The spec and its implementations are supposed to help us find our errors, not punish us for the sins we've committed. Every time there is a security bug from such UB, I see the security community (full of experienced C developers) wanting a simple C compiler (I don't think this will help) or a friendly C dialect (I do think this will help). We could rally for a better language/spec, or we could accept programs written by expert C programmers breaking with every other compiler upgrade. Like John Regehr says here - I'm pretty sure I know which one of these will happen.

I'm not sure anyone is OK with it - that's why there are attempts at creating new languages with fewer booby traps, sanitizers, static analysis tools, etc. I'm just not sure much can be done about C, realistically speaking. I think if you have to write C then you have to play by the (tricky, hostile, brittle, error-prone, confusing, etc.) rules. There are definitely C projects that are of top-notch quality despite all these issues (e.g. sqlite, postgresql, Linux, the various BSDs), so it's possible - just don't rely on the compiler holding your hand.
Let me just ask you this after having this discussion -- why are you using C?
...
On Wednesday, February 10, 2016 at 12:14:37 PM UTC-8, Vitaly Davidovich wrote: Let me just ask you this after having this discussion -- why are you using C?

Because I used to get paid to do it (no longer).
On Wednesday, February 10, 2016, Rajiv Kurian <geet...@gmail.com> wrote:
On Wednesday, February 10, 2016 at 12:14:37 PM UTC-8, Vitaly Davidovich wrote: Let me just ask you this after having this discussion -- why are you using C? Because I used to get paid to do it (no longer).

Fair enough (and what I thought your answer would be). So suppose you were going to write some systems-like or performance-sensitive software (e.g. database, user-space networking, OS kernel, AAA game, financial exchange, flight simulator, etc.) today that needs to run across a variety of hardware - what would you pick today?
...
--
Just to give one example where UB (or something similar) can be used for good (instead of evil) by compilers:

struct Foo {
  Foo *next;

  void release() {
    Foo *tmp = 0;
    for (Foo *it = next; it; it = tmp) {
#ifndef NDEBUG
      // some kind of check/assertion on 'it'
#endif
      tmp = it->next;
    }
  }
};

void test(Foo &f) { f.release(); }

This loop doesn't vanish in release builds on GCC, but does on clang. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67809

The reason clang can optimize it away is that an infinite loop that doesn't call any library I/O functions or synchronize on anything isn't allowed (not exactly UB, but the compiler is allowed to assume this, according to the reference cited in the bug). Thus, clang assumes the loop will terminate (else it'd be UB, I guess) and just drops the whole thing. GCC still does all the pointer chasing even when there's nothing to do.
I've very much enjoyed this thread, and share a few of the UB concerns with Rajiv. That said; I really don't find UB in general an issue in my day-to-day programming. Having spent some time working on compilers I do have a lot more sympathy for compiler writers and the near-impossible job they have in generating efficient, correct code. The lack of warnings for some of the more interesting cases is a usability concern I suppose. The ease-of-use ship for C++ sailed in the late 90s though... (albeit came nearer to shore with C++14 et al).
...
This loop doesn't vanish in release builds on GCC, but does on clang. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67809 The reason clang can optimize it away is that an infinite loop that doesn't call any library I/O functions or synchronize on anything isn't allowed (not exactly UB, but the compiler is allowed to assume this according to the reference cited in the bug). Thus, clang assumes the loop will terminate (else it'd be UB I guess), and just drops the whole thing. GCC still does all the pointer chasing even when there's nothing to do.

Removing "empty" loops is a possibly good optimization. Moreover, one man's optimization is another man's debugging nightmare. I don't think it is using the undefined-behavior escape hatch, or so I hope. This sounds a bit like optimizing away a for (int i = 0; i < 10; i++); since there are no observable side effects. Weirdly enough, void spinForever() { while(1); } does not get compiled to a 'rep ret' even at -O3.
The ease-of-use ship for C++ sailed in the late 90s though... (albeit came nearer to shore with C++14 et al).

C++14 stows the shotgun away towards the back of the store. The savvy customer can still find it and shoot themselves in the foot ;)
--
It's a bit philosophical, but "time passing" is kind of a side effect, which may be used, e.g., to prevent timing attacks in some crypto code (this isn't the "right" way to do it, but just an illustration). Of course, just making code faster as part of standard optimization affects timing (on purpose!), but the crypto guys are consistently fighting compilers AFAIK (and dropping to assembly in cases precisely to subvert compiler optimizations).
I do agree that clang not removing while(1) is odd considering it eliminated a much less obvious loop.BTW, if you add -march=<some intel>, gcc removes the rep prefix (is that what you fixed Matt?).
Vitaly Davidovich wrote:
So if I understand correctly, LLVM never actually proved the loop terminates here (it can't), but release() was marked as noexcept and readonly and the call was removed erroneously; by erroneously, I mean

If in your language infinite loops are observable things, like in Java, and are not UB, like in C++, then it was the inference of readonly that was broken since, clearly, the function has a side effect. If your language is C++, then "undefined behavior" was "exploited" to infer "readonly". This is one of the places where the semantics of C/C++ have subtly leaked into the semantics of LLVM IR (and need to be fixed).

Removal of a "readonly nounwind" call is, in itself, fine -- since readonly functions are allowed to only read memory and compute results.

This also demonstrates the layering difficulty I was talking about earlier -- the analysis step that inferred UB is a different layer than the one that actually used the inference to do something interesting. The latter does not know how "readonly" was inferred, and the former does not know if inferring readonly is going to affect the program in an interesting way.
On Thu, Feb 11, 2016 at 4:11 PM Rajiv Kurian <geet...@gmail.com> wrote:

This loop doesn't vanish in release builds on GCC, but does on clang. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67809 The reason clang can optimize it away is that an infinite loop that doesn't call any library I/O functions or synchronize on anything isn't allowed (not exactly UB, but the compiler is allowed to assume this according to the reference cited in the bug). Thus, clang assumes the loop will terminate (else it'd be UB I guess), and just drops the whole thing. GCC still does all the pointer chasing even when there's nothing to do. Removing "empty" loops is a possibly good optimization. Moreover one man's optimization is another man's debugging nightmare. I don't think it is using the undefined behavior escape hatch or so I hope.

How can the compiler prove that the linked list is not circular? GCC thinks it can't, and so cannot optimize away the loop. Clang seems to subscribe to the C++ [intro.multithread] interpretation that an infinite loop is not allowed (without synchronization or external calls).

This sounds a bit like optimizing away a for (int i = 0; i < 10; i++); since there are no side effects observable. Weirdly enough void spinForever() { while(1); } does not get compiled to a 'rep ret' even at -O3.

It's a little different, as the compiler can in principle see that "obviously" that loop will never return. It is a little confusing why clang doesn't turn it into a ret. (BTW we should really stop perpetuating the whole "rep ret" thing! It drove me mad enough to patch GCC for my comp
...
FWIW, Java says the result is undefined in this case as well. It's a memory-safe language, so if you index out of bounds you get an exception, but that's UB in C. C is just saying anything can happen because it can't promise memory safety, for example. In other words, it's UB because of the other, simpler UB behaviors that such a search might elicit (null deref, overflow, infinite loop, etc.).
...