I thought Erlang would optimize the ++ operator when the left side is known
at compile time.
For example, if the compiler sees the following outside of a function
signature:
"Foo" ++ Bar
It could rewrite it as:
[$F, $o, $o | assert_list(Bar)]
However, I ran some benchmarks and it seems the optimization does not
happen (on R15B).
With a local dummy implementation of assert_list(Bar), I got that the first
format is 50% slower than the second one.
That said, given the possibility something is odd in my setup, does Erlang
optimize it or not? If not, could it?
On Thu, May 31, 2012 at 1:55 PM, José Valim <jose.va...@gmail.com> wrote:
> I thought Erlang would optimize the ++ operator when the left side is known
> at compile time.
> For example, if the compiler sees the following outside of a function
> signature:
> "Foo" ++ Bar
> It could rewrite it as:
> [$F, $o, $o | assert_list(Bar)]
The '++' operator does not verify that the second argument is a list. Therefore, the compiler will rewrite the first expression to simply:
[$F,$o,$o|Bar]
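One way to check this rewrite on a particular OTP release is a small module like the sketch below. The module and function names (concat_sketch, with_operator/1, with_cons/1) are made up for illustration and do not come from this thread.

```erlang
-module(concat_sketch).
-export([with_operator/1, with_cons/1]).

%% Literal string on the left of ++, variable on the right.
with_operator(Bar) ->
    "Foo" ++ Bar.

%% The hand-written cons form of the same expression.
with_cons(Bar) ->
    [$F, $o, $o | Bar].
```

Compiling with `erlc -S concat_sketch.erl` writes an assembler listing to concat_sketch.S. If the compiler performs the rewrite described above, the two function bodies in that listing should look essentially the same. Because '++' does not check its right operand, calling either function with a non-list such as the atom bar returns an improper list instead of raising an error.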
> However, I ran some benchmarks and it seems the optimization does not happen
> (on R15B).
> With a local dummy implementation of assert_list(Bar), I got that the first
> format is 50% slower than the second one.
Have you made sure that you run each test in a newly spawned process? Do you run the test for a long enough time?
If you have set up your benchmark environment properly, and your second example is still faster, I can only assume that the code sits better in cache.
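As a rough illustration of running a measurement in a newly spawned process, here is a minimal sketch. The module name bench_sketch and its run/2 function are hypothetical and not the code used in this thread. It relies only on timer:tc/1 and spawn_monitor/1.

```erlang
-module(bench_sketch).
-export([run/2]).

%% Calls Fun N times inside a freshly spawned process and returns the
%% elapsed time in microseconds as reported by timer:tc/1.
run(Fun, N) when is_function(Fun, 0), is_integer(N), N > 0 ->
    Parent = self(),
    Ref = make_ref(),
    {Pid, MRef} = spawn_monitor(fun() ->
        {Elapsed, ok} = timer:tc(fun() -> repeat(Fun, N) end),
        Parent ! {Ref, Elapsed}
    end),
    receive
        {Ref, Micros} ->
            %% Drop the monitor and any pending 'DOWN' message.
            erlang:demonitor(MRef, [flush]),
            Micros;
        {'DOWN', MRef, process, Pid, Reason} ->
            %% The worker died before reporting a result.
            erlang:error({benchmark_failed, Reason})
    end.

%% Plain tail-recursive loop so that loop overhead is the same
%% for every function being compared.
repeat(_Fun, 0) -> ok;
repeat(Fun, N) -> _ = Fun(), repeat(Fun, N - 1).
```

For example, `bench_sketch:run(fun() -> lists:seq(1, 100) end, 100000)` returns one timing. Each variant being compared gets its own call to run/2, so neither inherits the heap state left behind by the other.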
> The '++' operator does not verify that the second argument is a list.
> Therefore, the compiler will rewrite the first expression to simply:
> [$F,$o,$o|Bar]
Yes, thanks. For some weird reason I thought Erlang restricted the right
side to be a list in such cases.
> Have you made sure that you run each test in a newly spawned process?
> Do you run the test for a long enough time?
Running in a newly spawned process did the trick, the results are now
consistent.
I was already running it 1000 times, here is the code (without using spawn):
I assume having a "clean" heap (reducing the chance of garbage collection)
is the reason why it is a good idea to run benchmarks in a newly spawned
process?
On 1 Jun 2012 at 10:46, "José Valim" <jose.va...@gmail.com> wrote:
> I assume having a "clean" heap (reducing the chance of garbage collection)
> is the reason why it is a good idea to run benchmarks in a newly spawned
> process?
If you want to vary one parameter in a test, make sure all other parameters
stay the same; otherwise you have no idea what you are measuring!
Code loading, heaps, position of Mars... everything has to be taken into
account.
> If you want to vary one parameter in a test, make sure all other parameters
> stay the same; otherwise you have no idea what you are measuring!
> Code loading, heaps, position of Mars... everything has to be taken into
> account.
The effect of Mars's magnetic field on that of Earth, even if Earth's aphelion and the perihelion of Mars were to coincide and the interplanetary distance were to be at its theoretical minimum, is negligible compared to the normal variations in Earth's magnetic field.
Therefore it's very unlikely that the relative position of Mars would affect the probability of a memory bitflip occurring due to cosmic radiation and thus affecting the benchmark.
> On 2012-06-01 12:07, Björn-Egil Dahlberg wrote:
>> If you want to vary one parameter in a test, make sure all other parameters
>> stay the same; otherwise you have no idea what you are measuring!
>> Code loading, heaps, position of Mars... everything has to be taken into
>> account.
> The effect of Mars's magnetic field on that of Earth, even if Earth's
> aphelion and the perihelion of Mars were to coincide and the
> interplanetary distance were to be at its theoretical minimum, is
> negligible compared to the normal variations in Earth's magnetic field.
> Therefore it's very unlikely that the relative position of Mars would
> affect the probability of a memory bitflip occurring due to cosmic
> radiation and thus affecting the benchmark.
My point, in case you missed it, is that you can't disregard something out of hand. You have to be sure. In physics you might be familiar with similar topics from dimensional analysis, i.e. how to come up with a model and how to disregard useless parameters.
All I'm saying is that you should be sure of what you are measuring. =)
The "very unlikely" is the important part here ;) Also, imagine Mars's position causes it to catch an asteroid in its gravitational field and slingshot it right into your computer. Now that would screw up your benchmark results!
--
Heinz N. Gies
he...@licenser.net
http://licenser.net