[erlang-questions] trouble with erlang or erlang is a ghetto

Joel Reymont

Jul 26, 2011, 3:07:19 PM

to erlang-questions@erlang.org Questions

Did I miss a lively and heated discussion?

http://www.unlimitednovelty.com/2011/07/trouble-with-erlang-or-erlang-is-ghetto.html

Bring it on!

--------------------------------------------------------------------------
- for hire: mac osx device driver ninja, kernel extensions and usb drivers
---------------------+------------+---------------------------------------
http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont
---------------------+------------+---------------------------------------

_______________________________________________
erlang-questions mailing list
erlang-q...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions

Bob Ippolito

Jul 26, 2011, 3:20:41 PM

to Joel Reymont, erlang-questions@erlang.org Questions

"Funargs: Ruby-like blocks for Erlang" is the discussion thread that
resulted in that blog post:
http://erlang.org/pipermail/erlang-questions/2011-July/060104.html

Joel Reymont

Jul 26, 2011, 3:22:51 PM

to Bob Ippolito, erlang-questions@erlang.org Questions

On Jul 26, 2011, at 8:20 PM, Bob Ippolito wrote:

> "Funargs: Ruby-like blocks for Erlang" is the discussion thread that
> resulted in that blog post:

Yes, I've been dutifully ignoring it for the past few days :-).

This post warrants a new discussion then!

Joel Reymont

Jul 26, 2011, 3:45:08 PM

to Bob Ippolito, erlang-questions@erlang.org Questions

I'll offer a datapoint…

I hate Erlang, although it's been a great way for me to make money. OpenPoker, for example, keeps on giving and is being licensed by a major game development company.

To Tony's list of Erlang warts I would also add the complete lack of transparency during performance optimization. This has bitten me during OpenPoker development and also during the "Hot wheels" optimization contest [1].

You do have the facility for call counting (eprof) as well as for per-process profiling with fprof. A regular Erlang system these days will likely have thousands of processes running, processing requests and returning a response. What you get with fprof is a useless profiling listing of thousands of processes.

What you really want is to know what happens along the request processing path. It took N seconds to process the request. Where did this time go? You want to average this over hundreds of requests and see where the bottlenecks are. Do you really want this? You are out of luck, I'm sorry. No tools exist to help you.

That said, I recently came off a project where we built an ad serving platform with OCaml and ZeroMQ. I pushed for this technology stack as I wanted to see if it was a suitable replacement for Erlang. The short answer is no.

We ended up defining tons of Google Protocol Buffers messages for everything we needed and writing boring serialization and deserialization code. Also, what would normally be an Erlang process turned out to be a separate executable. There was some loss of compilation speed as the number of these executables multiplied. Still, it was quite manageable until we got to sharing data.

We started by using plain Redis. This required a server to "front" Redis and process various requests, notifying other processes of state changes via ZeroMQ. Serializing things to strings and back is an incredible pain in the ass!

Later, it turned out that Redis was not the ideal choice since a lot of the long-running processes required most of the data in Redis and it was too slow to suck it out upon startup of each of these processes. We ended up dumping data into "bootstrap files", easily and quickly loadable into OCaml.

What I'm trying to say here is that our application clearly benefited from code running "on top" of the data, e.g. Erlang processes using ETS or Mnesia. Our scores of Protobuf messages would have become simple Erlang tuples or records. Data sharing would have become a non-issue.

I wish Tony good luck with his search for the Erlang replacement. I haven't found one yet.

[1] https://groups.google.com/d/topic/erlang-programming/2pRrWneJwG8/overview

Frédéric Trottier-Hébert

Jul 26, 2011, 4:05:03 PM

to Joel Reymont, erlang-questions@erlang.org Questions

Yes, you did: http://erlang.org/pipermail/erlang-questions/2011-July/thread.html#60104

--
Fred Hébert
http://www.erlang-solutions.com

Frédéric Trottier-Hébert

Jul 26, 2011, 4:06:05 PM

to Frédéric Trottier-Hébert, erlang-questions@erlang.org Questions

Whoops, sorry for the very late reply. It appears the e-mails got clogged in the mail server and I thought it was brand new. Disregard that e-mail and the previous one.

--
Fred Hébert
http://www.erlang-solutions.com

Jesper Louis Andersen

Jul 26, 2011, 4:23:54 PM

to Joel Reymont, erlang-questions@erlang.org Questions

On Tue, Jul 26, 2011 at 21:45, Joel Reymont <joe...@gmail.com> wrote:

> What you really want is to know what happens along the request processing path. It took N seconds to process the request. Where did this time go? You want to average this over hundreds of requests and see where the bottlenecks are. Do you really want this? You are out of luck, I'm sorry. No tools exist to help you.

I agree this thing would be really neat to have. To make it somewhat
efficient, I'd definitely propose an sFlow-like approach. Add a flag
to erlang:trace/3 such that we can trace a function based upon a
sampling value, for instance so that 1 in 8192 messages on average is
traced. Together with call/return we now know how much time was spent
in that function as a whole. If you also add dependent trace probes
(if the 1/8192 sample triggers, then these should be enabled as well) you
should in principle be able to build such a tool out of the trace
facilities. It is *almost* there.
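
A minimal sketch of the sampling idea above, written in Python rather than Erlang and not using erlang:trace/3 at all. The class `SampledTracer` and its `root`/`dependent` probes are hypothetical names invented for this illustration; it assumes a single-threaded caller and only shows the control flow: a root probe fires for roughly 1 call in 8192, and dependent probes record timings only while a sampled root call is in flight.

```python
import random
import time
from collections import defaultdict

SAMPLE_RATE = 1 / 8192  # the sampling value suggested in the message above


class SampledTracer:
    """Hypothetical sampled tracer; not an Erlang or OTP API."""

    def __init__(self, rate=SAMPLE_RATE, rng=None):
        self.rate = rate
        self.rng = rng or random.Random()
        self.active = False                # True while a sampled root call runs
        self.totals = defaultdict(float)   # probe name -> total seconds
        self.counts = defaultdict(int)     # probe name -> number of samples

    def root(self, name):
        """Wrap a function so that a random fraction of its calls is timed."""
        def wrap(fn):
            def inner(*args, **kwargs):
                # Skip nested roots and every call the sampler does not pick.
                if self.active or self.rng.random() >= self.rate:
                    return fn(*args, **kwargs)
                self.active = True
                start = time.perf_counter()
                try:
                    return fn(*args, **kwargs)
                finally:
                    self._record(name, time.perf_counter() - start)
                    self.active = False
            return inner
        return wrap

    def dependent(self, name):
        """Wrap a function that is timed only inside a sampled root call."""
        def wrap(fn):
            def inner(*args, **kwargs):
                if not self.active:
                    return fn(*args, **kwargs)
                start = time.perf_counter()
                try:
                    return fn(*args, **kwargs)
                finally:
                    self._record(name, time.perf_counter() - start)
            return inner
        return wrap

    def _record(self, name, elapsed):
        self.totals[name] += elapsed
        self.counts[name] += 1
```

Dividing `totals[name]` by `counts[name]` gives an average per probe over the sampled calls only, which keeps the overhead on unsampled calls to one random draw.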

--
J.

Joel Reymont

Jul 26, 2011, 4:51:06 PM

to Jesper Louis Andersen, erlang-questions@erlang.org Questions

On Jul 26, 2011, at 9:23 PM, Jesper Louis Andersen wrote:

> Add a flag to erlang:trace/3 such that we can trace a function based upon a
> sampling value, for instance so that 1 in 8192 messages on average is
> traced. Together with call/return we now know how much time was spent
> in that function as a whole.

There may be message passing along the request path, processes talking to one another.

You want to capture and time this interaction.

You can do all this by capturing and processing gigabytes of trace messages, figuring out where the request starts and ends and taking it from there.

Loïc Hoguin

Jul 26, 2011, 5:05:05 PM

to Joel Reymont, erlang-questions@erlang.org Questions

On 07/26/2011 09:07 PM, Joel Reymont wrote:
> Did I miss a lively and heated discussion?
>
> http://www.unlimitednovelty.com/2011/07/trouble-with-erlang-or-erlang-is-ghetto.html
>
> Bring it on!

All I'm seeing there is "Make Erlang like language XYZ". Not that Erlang
can't learn from other languages and platforms, I'm sure it does, but
Erlang focuses on reliability, and you can't add new trendy features
every year while remaining reliable. Adding features to Erlang is a long
and progressive process, and this sort of development pace doesn't fit
with the "following trends" approach most programmers use with regard to
programming languages. Two days ago Java was trendy, yesterday it was
Ruby, today it's Javascript, tomorrow it'll be another, every time
reinventing the loop or partially taking from other languages.
Personally I like the slow but certain approach better.

So yeah, maybe Erlang doesn't have the best features of Lisp, and
Clojure, and Java, and Ruby, and Javascript, and-- but neither do they.
I'm a firm believer in the "right tool for the right job", and it turns
out that Erlang actually is good for most server application jobs, so it
pleases me quite well. And it doesn't need to have 10 million features
to fit that role, it just needs to be reliable, concurrent and distributed.

I disagree about Erlang's syntax though. Not that it doesn't suck, but I
don't know of any language syntax that doesn't (I'm sure you'll tell
me your favorite language has great syntax). They all have issues, they
all have traps you must learn to avoid, and nothing you can do will
change that. Maybe you can add some syntax sugar to avoid typing 5
characters here and there, but seriously what's the point? If the syntax
changes don't make me think differently about the way I program, and
don't allow me to do things I couldn't do before, then they're pretty
much irrelevant to the matter at hand: getting things done. There *are*
interesting concepts that could be good, one day, in Erlang. But not
having them doesn't prevent us from being efficient so they should go
after making Erlang even more reliable, and also after adding missing
library components to the pool of applications. Unless you're doing your
project for fun or something, in which case have fun. But toying with
syntax doesn't make much sense if you need to bring in money.

Erlang does very well what it's designed for, and I'm thankful for that.
I'm using Erlang to do a lot of things. I wouldn't use Erlang to do
everything. But Erlang allows me to get the work done and more
importantly it allows me to move on to other projects quickly because
Erlang provides me with everything needed to avoid almost all possible
bugs (Dialyzer, PropEr, eunit and ct are wonderful to work with), and
once in production I'm confident my applications will always be running
and available. That's something I can't say for any other language or
platform out there today.

--
Loïc Hoguin
Dev:Extend

Jesper Louis Andersen

Jul 26, 2011, 6:14:42 PM

to Joel Reymont, erlang-questions@erlang.org Questions

On Tue, Jul 26, 2011 at 22:51, Joel Reymont <joe...@gmail.com> wrote:
>
> On Jul 26, 2011, at 9:23 PM, Jesper Louis Andersen wrote:
>
>> Add a flag to erlang:trace/3 such that we can trace a function based upon a
>> sampling value, for instance so that 1 in 8192 messages on average is
>> traced. Together with call/return we now know how much time was spent
>> in that function as a whole.
>
> There may be message passing along the request path, processes talking to one another.
>
> You want to capture and time this interaction.

Ah yes, that is something I completely missed. Indeed you want to
trace along messages also, but that makes it less obvious what to do.
I am more for capturing such instrumentation information on a random
sampling basis. You could do it manually though by inserting
instrumentation functions along the path and then use those functions
as hooks for a unique tag. That is probably what I'd do today if I
needed this. Then I'd trace the instrumentation functions at random
intervals.
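
The manual approach described above could look roughly like the following sketch, in Python rather than Erlang. `PathRecorder`, `new_tag`, `checkpoint` and `stage_averages` are hypothetical names for this illustration only. It assumes every checkpoint for a request is recorded in one place with one clock, and it leaves out the random sampling step for brevity.

```python
import time
import uuid
from collections import defaultdict


class PathRecorder:
    """Hypothetical recorder for instrumentation points along a request path."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.events = defaultdict(list)  # tag -> [(stage, timestamp), ...]

    def new_tag(self):
        # Unique tag that travels with the request from stage to stage.
        return uuid.uuid4().hex

    def checkpoint(self, tag, stage):
        # The instrumentation function: call it wherever the request passes.
        self.events[tag].append((stage, self.clock()))

    def stage_averages(self):
        """Average time between consecutive checkpoints, over all requests."""
        totals = defaultdict(float)
        counts = defaultdict(int)
        for marks in self.events.values():
            for (a, t_a), (b, t_b) in zip(marks, marks[1:]):
                totals[(a, b)] += t_b - t_a
                counts[(a, b)] += 1
        return {step: totals[step] / counts[step] for step in totals}
```

Each key of the result is a pair of adjacent stages, so the largest value points at the leg of the path where requests spend most of their time on average.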

--
J.

Tim Watson

Jul 26, 2011, 6:53:13 PM

to Joel Reymont, erlang-questions@erlang.org Questions

> To Tony's list of Erlang warts I would also add the complete lack of transparency during performance optimization. This has bitten me during OpenPoker development and also during the "Hot wheels" optimization contest [1].
>
> You do have the facility for call counting (eprof) as well as for per-process profiling with fprof. A regular Erlang system these days will likely have thousands of processes running, processing requests and returning a response. What you get with fprof is a useless profiling listing of thousands of processes.
>
> What you really want is to know what happens along the request processing path. It took N seconds to process the request. Where did this time go? You want to average this over hundreds of requests and see where the bottlenecks are. Do you really want this? You are out of luck, I'm sorry. No tools exist to help you.
>

Better profiling tool support would be nice. A few people started
trying to pull together something that would provide better visibility
- my own project, nodewatch, which is heavily based on eper and more
recently Richard Jones and friends produced
https://github.com/beamspirit/bigwig during the spawnfest competition.
I've stopped work on nodewatch for now, to see whether bigwig will
move faster - I don't have that much spare time and if others are
doing the same thing as their day job then I needn't bother. My
ultimate aim was to provide configurable aggregated stats as well as
trace support with time budgeting in mind (e.g., how long in the web
tier, how long hitting the back end, etc).

Jon Watte

Jul 27, 2011, 12:30:06 AM

to Jesper Louis Andersen, erlang-questions@erlang.org Questions

FWIW: We tag each application-level request with a "what" and "when" and pass this record along to other parts of the system. When the application gets back its reply, it computes statistics. This allows us to track messages that take a surprisingly long time -- typically based on some external dependency -- and alert on how many there are, and get some coarse-level statistics.

The most fun is when the request enters through one node, but finishes/exits through another node, because the clocks are not 100% in sync, so we sometimes end up with messages taking negative time. The fix for that is to also tag the "source node" in the per-request record that we carry along, and finish off by forwarding that record back to the creator for measurement.
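
A small sketch of the scheme described above, in Python rather than Erlang. `RequestTag` and `Node` are hypothetical names, and the "nodes" here are plain objects with their own clock functions, standing in for real distributed nodes. The point it illustrates: the elapsed time is always computed on the node that created the tag, so clock differences between nodes cannot produce a negative duration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestTag:
    what: str     # kind of request
    when: float   # creator's clock reading when the request started
    source: str   # name of the node that created the tag


class Node:
    """Hypothetical stand-in for a node with its own, possibly skewed, clock."""

    def __init__(self, name, clock):
        self.name = name
        self.clock = clock   # callable returning this node's local time
        self.samples = []    # (what, elapsed) pairs measured on this node

    def start(self, what):
        return RequestTag(what, self.clock(), self.name)

    def finish(self, tag, nodes):
        # Called on whichever node the request exits through: forward the
        # tag back to its creator instead of measuring with the local clock.
        nodes[tag.source].measure(tag)

    def measure(self, tag):
        if tag.source != self.name:
            raise ValueError("tag must be measured on the node that created it")
        self.samples.append((tag.what, self.clock() - tag.when))
```

Subtracting `tag.when` from the exit node's own clock is the naive measurement that can go negative when that clock runs behind the creator's.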

Sincerely,

jw

--
Americans might object: there is no way we would sacrifice our living standards for the benefit of people in the rest of the world. Nevertheless, whether we get there willingly or not, we shall soon have lower consumption rates, because our present rates are unsustainable.

Richard O'Keefe

Jul 27, 2011, 12:42:19 AM

to Loïc Hoguin, erlang-questions@erlang.org Questions

On 27/07/2011, at 9:05 AM, Loïc Hoguin wrote:

> On 07/26/2011 09:07 PM, Joel Reymont wrote:
>> Did I miss a lively and heated discussion?
>>
>> http://www.unlimitednovelty.com/2011/07/trouble-with-erlang-or-erlang-is-ghetto.html
>>
>> Bring it on!

There are some substantive issues in that blog entry.

(1) Frames have not been implemented yet.

Now if I had provided a model implementation, things might have been
different. Or if Joe had provided a model implementation of his
earlier and fundamentally similar "proper structs", again things
might have been different.

My excuse is that the BEAM architecture is simply not documented
anywhere that I can find, and I have too many other things to do
to grovel around in the guts of Erlang to figure it out. So let
me add my item here:

(2) BEAM has no usable documentation.

One reason this is important is because some people have liked the
ideas underneath Erlang well enough to try to build other syntaxes
for it. But that is inexcusably hard with BEAM undocumented.

(3) There doesn't seem to be anything in the Erlang _approach_ that
should interfere with scaling, but the _implementation_ appears
not to scale well much past 16 cores.

I don't know if that is current information. If true, it's an
important limitation of the implementation which I'm sure will be
given a lot of attention. I don't expect it to stay true.

(4) He doesn't seem to like the Erlang garbage collector, but that's
something which has changed more than once, and he does not
offer any actual _measurements_.

I tried the experiment of allocating 1,000,000,000 list cells
(but never keeping more than 10,000 of them at a time).

    Erlang, byte-codes:   7.58 seconds (=  7.58 nsec/allocation)
    Erlang, native:       3.92 seconds (=  3.92 nsec/allocation)
    Java -O -server:     11.40 seconds (= 11.40 nsec/allocation)
    Java -O -client:     12.26 seconds (= 12.26 nsec/allocation)

Java has come a long way. I don't have an Azul system to try.

He praised tcmalloc. I note that it only recently became usable
without pain on my laptop (MacOS X) and that building it produced
reams of warning messages about using a deprecated interface, so
it may not work much longer. It doesn't work at all on the other
machine on my desk. (Erlang works on both.) I wrote a similar
benchmark in C and linked it with libtcmalloc.a. I killed that
program after it had run for more than 10 times as long as the
Erlang code. So when he says "Erlang ... can't take advantage of
libraries like tcmalloc", there doesn't appear to be any
advantage that Erlang *could* take.

In short, Erlang's memory management is criticised, and it MAY be
that this is justified, but the blog entry provides no EVIDENCE.

By the way, savour the irony. "Erlang's approach [of] using
separate heaps per process", which he criticises, is in fact
used elsewhere: ptmalloc does it, the tcmalloc documentation
makes it absolutely clear that tcmalloc does this also (more
precisely, it uses a per-thread cache, which is what the "tc"
part of the name means), and some recent Java systems have
done the same thing, with a per-thread cache for memory
management, backed by a shared heap. The point of the per-
thread cache is to reduce locking. There is a spectrum of
approaches from nothing shared to everything shared, and it
seems clear that everyone sees merit in not being at either
extreme. Progress must be driven by measurement.

(5) He doesn't like HiPE. For myself, I don't _care_ whether HiPE is
a JIT or a jackal, as long as it gives me a useful improvement in
performance. Inlining across module boundaries _has_ been tried,
I believe (there's a paper about it somewhere), but it's hard to
reconcile with hot loading. HiPE was, of course, a project
contributed by "the community", and depended on funding which I
believe has come to an end. Anyone who wants a better compiler
should try to find funds for it. Sun and Google have vastly
deeper pockets than Kostis Sagonas!

I think it is particularly unfair to criticise HiPE for having a
limited range of back ends when it works on *MORE* systems than
the tcmalloc library he praised.

(6) Erlang is not general purpose.

But it was never intended to be, and isn't advertised as such.

In fact Erlang loves state, but it wants state to be encapsulated
within processes.

"What should you do if you want to deal with a shared-state
concurrency program in Erlang?"
Lie down until the feeling passes off.

If you want shared-state concurrency, I can tell you where to find
Ada. I can tell you where to find Concurrent ML (MLton does it OK).
I can tell you where to find Haskell, which is actually pretty
amazing these days.

One can respect Erlang without being married to it.

(7) He doesn't like the syntax.
Well, it's not quite as much of a disaster as Java syntax, and for
sure it's not as ugly as CAML or F#. (They are so ugly that they
make SML look beautiful, and for someone who prefers Haskell to
SML on aesthetic grounds, that's saying a lot.) The funny thing
is that what makes Erlang syntax clunky is precisely its *similarity*
to classical languages like Pascal and C...

Given documentation for BEAM, we might get more alternative syntaxes
to play with.

It's fair to point out that Erlang resulted from an experiment with
several approaches, so responsible steps were taken to make sure that
it wasn't _too_ bad.

(8) He criticises immutable state on the grounds that while you can share
tails of a list, you can't share prefixes or infixes. The answer, of
course, is multifold:
- there are immutable data structures where you *can* share infixes
  and you *can* use them in Erlang, it's just that they don't have
  built in syntax. (For that matter, it would be possible to implement
  Erlang so that slices of tuples could be shared just like slices of
  strings in Java. SML does this. In fact, Concurrent SML would answer
  so many of his issues that I'm surprised he didn't mention it.)
- this is only a problem if you *want* to share prefixes or infixes,
  and somehow I never do
- in languages with mutable state you cannot safely share ANYTHING.

The argument that allocating pure objects is a bad match for modern
hardware is a non sequitur. Let me quote a paper about Fork/Join
parallelism for Java: "In many ways, modern GC facilities are perfect
matches to fork/join frameworks: These programs can generate enormous
numbers of tasks, nearly all of which quickly turn into garbage after
they are executed." That is, generating enormous amounts of garbage
can be a >good< thing, provided it's the kind that garbage collectors
manage well. I've been on the garbage collection mailing list for a
while, and the Memory Management proceedings have contained papers
showing that garbage collection can be *worse* for locality (and thus
modern hardware) and papers showing that it can be *better* for
locality (and thus modern hardware). What this means is that one
cannot simply *assume* "side effects = good for cache, immutability
= bad", one must *measure*.

(9) He doesn't like the standard library.

Interestingly enough, I hear the same kind of thing in the Haskell
mailing list about the Haskell "standard Prelude". And there is a
project to develop a new standard Prelude.

I believe there is general agreement that an improved Erlang library
could be developed and would be very nice to have. (I've always
found the differences between say ETS and DETS more confusing than
helpful.)

This is something that can be done piecemeal and by individuals.

The things I would like to say about the Java libraries would have to
be displayed on asbestos screens... Heck, I like the Ada libraries
less than I used to; the additions are *pointful* but not to my mind
*tasteful*.

And so it goes.

There *are* things about Erlang that can be improved.
Some things *have* been improved, some things are being improved,
and some things don't need anyone to wait for Ericsson to do them.
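
The allocation experiment in point (4) is described but its source is not shown, so the following is only a guess at its general shape, written in Python: allocate many cons cells while never keeping more than 10,000 alive, then divide the elapsed time by the number of allocations. The function names `churn` and `ns_per_allocation` are hypothetical, and the run is scaled down from 1,000,000,000 cells. Timings from this sketch are not comparable with the Erlang and Java figures above.

```python
import time


def churn(total_cells, max_live=10_000):
    """Allocate total_cells cons cells, never holding more than max_live."""
    allocated = 0
    while allocated < total_cells:
        batch = min(max_live, total_cells - allocated)
        cells = None                 # the empty list
        for i in range(batch):
            cells = (i, cells)       # one cons cell: (head, tail)
        allocated += batch
        # The chain is dropped when 'cells' is rebound on the next pass,
        # so at most max_live cells are reachable at any time.
    return allocated


def ns_per_allocation(seconds, allocations):
    """Convert a total run time into nanoseconds per allocation."""
    return seconds / allocations * 1e9


if __name__ == "__main__":
    n = 1_000_000                    # scaled down from 1,000,000,000
    start = time.perf_counter()
    churn(n)
    elapsed = time.perf_counter() - start
    print(f"{ns_per_allocation(elapsed, n):.2f} ns/allocation")
```

The conversion is the same arithmetic as in the figures above: 7.58 seconds over 1,000,000,000 allocations is 7.58 nanoseconds each.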

Michael Truog

Jul 27, 2011, 2:08:15 AM

to erlang-questions@erlang.org Questions

On 07/26/2011 09:42 PM, Richard O'Keefe wrote:
> (3) There doesn't seem to be anything in the Erlang _approach_ that
> should interfere with scaling, but the _implementation_ appears
> not to scale well much past 16 cores.
>
> I don't know if that is current information. If true, it's an
> important limitation of the implementation which I'm sure will be
> given a lot of attention. I don't expect it to stay true.

Any problems scaling on systems with more than 16 cores just seems to be related to the current cost of those systems (would love to know what the tests currently show on the Tileras > 16 cores though). Erlang appears to have much more natural scalability when you compare it to Java, and the criticism of the Erlang garbage collector is unsubstantiated (as previously mentioned). You can not say an approach is wrong because it isn't the Java-way, or perhaps the Sun-way. The per-process garbage collection avoids central state, so it encourages scalability with a design for parallelism. Throwing a ton of money at Azul to push a single heap garbage collector beyond normal limits might sound fun, but it only shows how long and drawn out technical failure can be (like Sun's stock price was for instance, an almost perfect bell curve).

It is hard to believe that there is controversy still here. Per-process garbage collection avoids shared state which avoids the need for low-level locking which makes Erlang scalable. To argue for some other approach in Erlang seems like idiocy to me, because it wouldn't be a real Actor model that can provide fault-tolerance (actually keep failures isolated). Why would you want to fool around with some broken fake Actor implementation that is unscalable (like your operating system, for instance)? Seems like a waste of time, just like this garbage collector complaint.
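
To make the locking argument concrete, here is a rough Python sketch of the per-thread cache arrangement mentioned earlier in the thread (a thread-local cache backed by a shared heap). It is not how BEAM, tcmalloc or any JVM is implemented; `SharedPool` and `ThreadCache` are hypothetical names, and "memory" is just a list of slot numbers. It only shows that refilling a private cache in batches cuts the number of lock acquisitions on the shared structure.

```python
import threading


class SharedPool:
    """Central pool of free slots; every access takes the lock."""

    def __init__(self, size):
        self._free = list(range(size))
        self._lock = threading.Lock()
        self.lock_acquisitions = 0

    def take(self, n):
        with self._lock:
            self.lock_acquisitions += 1
            if n > len(self._free):
                raise MemoryError("shared pool exhausted")
            taken = self._free[-n:]
            del self._free[-n:]
            return taken


class ThreadCache:
    """Per-thread front end that refills from the shared pool in batches."""

    def __init__(self, pool, batch=64):
        self._pool = pool
        self._batch = batch
        self._local = threading.local()

    def alloc(self):
        cache = getattr(self._local, "slots", None)
        if not cache:
            # Only this refill touches the shared lock.
            cache = self._local.slots = self._pool.take(self._batch)
        return cache.pop()
```

With a batch of 64, a thread takes the shared lock once per 64 allocations instead of once per allocation. A fully private heap per process, as in Erlang, sits further along the same spectrum.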

Yes, HiPE has issues, Erlang is not meant for all programs, and the syntax is different. Nothing is perfect.

Not even the standard library is perfect. I have yet to hear of a perfect standard library in any language. I think having problems with a standard library is a natural problem because the people that write the standard library impose a taxonomy on functionality that not everyone shares naturally, because we are not psychic. Learning the standard library just seems like a natural process in computer science when you get to relate a standard library to the others you had the misfortune to use in the past. Why bother complaining about a taxonomy that is just as bad as any other? If it serves the purpose it was designed for well, then there is no reason to care, it just happens to be different from what you might expect based on your own limited knowledge.

> And so it goes.
>
> There *are* things about Erlang that can be improved.
> Some things *have* been improved, some things are being improved,
> and some things don't need anyone to wait for Ericsson to do them.

The frustration with these issues seems natural and common, both on this mailing list and elsewhere. However, I think it is important to be proactive rather than giving in to emotional arguments that lack justification or evidence.

Ulf Wiger

Jul 27, 2011, 3:16:20 AM

to Michael Truog, erlang-questions@erlang.org Questions

On 27 Jul 2011, at 08:08, Michael Truog wrote:

> On 07/26/2011 09:42 PM, Richard O'Keefe wrote:
>> (3) There doesn't seem to be anything in the Erlang _approach_ that
>> should interfere with scaling, but the _implementation_ appears
>> not to scale well much past 16 cores.
>>
>> I don't know if that is current information. If true, it's an
>> important limitation of the implementation which I'm sure will be
>> given a lot of attention. I don't expect it to stay true.
>
> Any problems scaling on systems with more than 16 cores just seems to be related to the current cost of those systems (would love to know what the tests currently show on the Tileras > 16 cores though).

I guess it's fair to say that the Erlang/OTP team doesn't have a "dangerously sexy" approach to SMP scalability. They recognise that their sponsors [1] would have their heads if BEAM started hanging or dumping core in the sole interest of scaling to ridiculously many cores on hardware that none of the sponsors are using. Instead, they try to deliver incremental improvements without ever sacrificing the rock-solid performance of BEAM that Erlang users have come to expect. They're not batting 100% in that regard, but close enough to be very impressive, IMHO.

They *have* run benchmarks on 64-core machines, but mainly to learn about what's around the corner, and understand what changes will be needed to get there. You will soon hear about some pretty exciting developments in the area of Erlang for massive scalability, but there is a time and a place for everything… ;-)

What is interesting for most Erlang users (I think), is not how well Erlang can scale with synthetic benchmarks, but on *real applications*, where lots of other factors affect the result, and scalability is only one parameter (albeit a very important one).

Having had the privilege of participating in key roles throughout the development lifecycle of a number of telecom-class Erlang-based systems, I've come to think that constant-factor nuisances (like syntax quibbles) are simply unimportant. What matters in these projects is that you get the fundamentals right, esp. in regard to messaging, fault-tolerance and provisioning. If you don't, you will never be able to control schedules and costs, and your product is unlikely to ever be profitable.

In this environment, very few programming languages are even considered: usually, it's either C/C++ or Java, and in some companies, Erlang as well. The .NET languages are usually ruled out because their commercial viability is largely tied to the Windows platform, and such a single-platform focus is an issue when you expect your product to serve for decades. Ruby, Python etc. can be useful for writing test scripts, and perhaps some supporting tools, or perhaps even some peripheral component in the system, but (at this point in time) not much more than that.

I realise that most people today are not in this environment, but rather developing small [2] projects, often in areas where Ruby, Python, Clojure, Scala, Haskell et al are perfectly viable alternatives. I think it's great that the Erlang/OTP team is getting feedback from this crowd, and even pressure to compete with the "coolest" languages out there. Erlang will be better for it.

But let's do this with some mutual respect. I'm not even going to begin commenting on the "Erlang cargo culters" references, and find it curious that Erlang is so often said to "suck", even though the same people often admit that there is no viable replacement - isn't this much more of a criticism against those other languages, which apparently don't just suck enough to give discomfort, but to the point of being near unusable in this realm?

Having been in the position, some years ago, where I was thinking about changing jobs, and couldn't really find any jobs where I could make good use of my Erlang experience (even if not in Erlang), I find today's software market much more exciting - where other languages are drawing inspiration from Erlang, and in some cases really challenging it, even in the concurrency domain.

[1] Sponsors = commercial projects within Ericsson that use Erlang to build 5-nines-style complex messaging products for commercial gain. They typically need to use NEBS-compliant ATCA hardware from reputable vendors, which is standardised across the corporation to minimise supply chain costs. In these settings, I would expect most systems to be running on 4- or 8-core machines today, perhaps starting to move up to 16-core in a few places.

[2] To me, anything under 100 KLOC is small.

BR,
Ulf W

Ulf Wiger, CTO, Erlang Solutions, Ltd.
http://erlang-solutions.com

Max Lapshin

Jul 27, 2011, 3:34:13 AM

to Ulf Wiger, erlang-questions@erlang.org Questions

On Wed, Jul 27, 2011 at 11:16 AM, Ulf Wiger
<ulf....@erlang-solutions.com> wrote:
>
> I guess it's fair to say that the Erlang/OTP doesn't have a "dangerously sexy" approach to SMP scalability. They recognise that their sponsors [1] would have their heads if BEAM started hanging or dumping core in the sole interest of scaling to ridiculously many cores on hardware that none of the sponsors are using. Instead, they try to deliver incremental improvements without ever sacrificing the rock-solid performance of BEAM that Erlang users have come to expect. They're not batting 100% in that regard, but close enough to be very impressive, IMHO.
>

Many people forget about it. Erlang's messaging system seems to be
99.999% bug free. Scala is full of silly bugs.
I'd prefer to build the software that I sell on a bug-free platform,
and many thanks to your team for it.

The only thing that really affects me is lack of profiling tools or
lack of documentation on them.

Thomas Lindgren

Jul 27, 2011, 5:07:44 AM

to erlang-questions

----- Original Message -----
> From: Ulf Wiger <ulf....@erlang-solutions.com>
> To: Michael Truog <mjt...@gmail.com>
> Cc: "erlang-q...@erlang.org Questions" <erlang-q...@erlang.org>
> Sent: Wednesday, July 27, 2011 9:16 AM
> Subject: Re: [erlang-questions] trouble with erlang or erlang is a ghetto
>
> ...

> I guess it's fair to say that the Erlang/OTP doesn't have a
> "dangerously sexy" approach to SMP scalability. They recognise that
> their sponsors [1] would have their heads if BEAM started hanging or dumping
> core in the sole interest of scaling to ridiculously many cores on hardware that
> none of the sponsors are using.

Furthermore, IMO it's something of a mistake to obsess over single-process SMP. I think distributed erlang is pretty sexy.

Best,
Thomas

Hendrik Visage

Jul 27, 2011, 6:21:05 AM

to Thomas Lindgren, erlang-questions

On Wed, Jul 27, 2011 at 11:07 AM, Thomas Lindgren
<thomasl...@yahoo.com> wrote:

> ----- Original Message -----
>> From: Ulf Wiger <ulf....@erlang-solutions.com>
>

>> ...
>> I guess it's fair to say that the Erlang/OTP doesn't have a
>> "dangerously sexy" approach to SMP scalability. They recognise that
>> their sponsors [1] would have their heads if BEAM started hanging or dumping
>> core in the sole interest of scaling to ridiculously many cores on hardware that
>> none of the sponsors are using.
>
>
> Furthermore, IMO it's something of a mistake to obsess over single-process SMP. I think distributed erlang is pretty sexy.

And *that* my dear friends, is where the "clouds" are hanging out ;)

A rack in each city centre with a bunch of 1U/2U boxes, using
something like GoogleFS, and we have close to a petabyte of
distributed, disaster-tolerant storage and processing, and then I also
say: Bring me my Erlang :)

Dmitrii Dimandt

Jul 27, 2011, 6:27:10 AM

to erlang-questions@erlang.org Questions

On Jul 27, 2011, at 10:16 AM, Ulf Wiger wrote:

> On 27 Jul 2011, at 08:08, Michael Truog wrote:
>
>> On 07/26/2011 09:42 PM, Richard O'Keefe wrote:
>>> (3) There doesn't seem to be anything in the Erlang _approach_ that
>>>   should interfere with scaling, but the _implementation_ appears
>>>   not to scale well much past 16 cores.
>>>
>>>   I don't know if that is current information. If true, it's an
>>>   important limitation of the implementation which I'm sure will be
>>>   given a lot of attention. I don't expect it to stay true.
>>
>> Any problems scaling on systems with more than 16 cores just seems to be related to the current cost of those systems (would love to know what the tests currently show on the Tileras > 16 cores though).
>
> I guess it's fair to say that the Erlang/OTP doesn't have a "dangerously sexy" approach to SMP scalability. They recognise that their sponsors [1] would have their heads if BEAM started hanging or dumping core in the sole interest of scaling to ridiculously many cores on hardware that none of the sponsors are using. Instead, they try to deliver incremental improvements without ever sacrificing the rock-solid performance of BEAM that Erlang users have come to expect. They're not batting 100% in that regard, but close enough to be very impressive, IMHO.
>
> They *have* run benchmarks on 64-core machines, but mainly to learn about what's around the corner, and understand what changes will be needed to get there. You will soon hear about some pretty exciting developments in the area of Erlang for massive scalability, but there is a time and a place for everything… ;-)

Also don't forget that there's a grant given to universities to further develop Erlang to become massively multicore, as Kostis Sagonas said at Erlang Factory in London

===================================

Dmitrii Dimandt
dmi...@dmitriid.com

------------------------------------------------------------
Erlang in Russian
http://erlanger.ru/

TurkeyTPS

http://turkeytps.com/

------------------------------------------------------------

LinkedIn: http://www.linkedin.com/in/dmitriid

GitHub: https://github.com/dmitriid

Ulf Wiger

Jul 27, 2011, 7:29:18 AM

to Dmitrii Dimandt, erlang-questions@erlang.org Questions

On 27 Jul 2011, at 12:27, Dmitrii Dimandt wrote:

>> They *have* run benchmarks on 64-core machines, but mainly to learn about what's around the corner, and understand what changes will be needed to get there. You will soon hear about some pretty exciting developments in the area of Erlang for massive scalability, but there is a time and a place for everything… ;-)
>>
>
>
> Also don't forget that there's a grant given to universities to further develop Erlang to become massively multicore, as Kostis Sagonas said at Erlang Factory in London

Well, that's what I was referring to, but there will be a more coordinated announcement soon. :)

…That, and another one, with a different slant.

Lukas Larsson

Jul 27, 2011, 8:06:00 AM

to Michael Truog, erlang-questions@erlang.org Questions

Here[1] is a pretty recent publication about testing Erlang on a TILEPro64. It also contains a quite detailed description of how the BEAM VM works internally, with its scheduling and memory allocation algorithms.

Lukas

[1] http://kth.diva-portal.org/smash/record.jsf?pid=diva2:392243

----- Original Message -----
From: "Michael Truog" <mjt...@gmail.com>
To: "erlang-q...@erlang.org Questions" <erlang-q...@erlang.org>
Sent: Wednesday, 27 July, 2011 8:08:15 AM
Subject: Re: [erlang-questions] trouble with erlang or erlang is a ghetto

Valentin Micic

Jul 27, 2011, 8:24:07 AM

to Lukas Larsson, erlang-questions@erlang.org Questions

On 27 Jul 2011, at 2:06 PM, Lukas Larsson wrote:

Yes, HiPE has issues, Erlang is not meant for all programs, and the syntax is different. Nothing is perfect.

Particularly programmers...

V/

Mihai Balea

Jul 27, 2011, 8:58:42 AM

to Ulf Wiger, erlang-questions@erlang.org Questions

On Jul 27, 2011, at 7:29 AM, Ulf Wiger wrote:

>
> On 27 Jul 2011, at 12:27, Dmitrii Dimandt wrote:
>
>>> They *have* run benchmarks on 64-core machines, but mainly to learn about what's around the corner, and understand what changes will be needed to get there. You will soon hear about some pretty exciting developments in the area of Erlang for massive scalability, but there is a time and a place for everything… ;-)
>>>
>>
>>
>> Also don't forget that there's a grant given to universities to further develop Erlang to become massively multicore, as Kostis Sagonas said at Erlang Factory in London
>
> Well, that's what I was referring to, but there will be a more coordinated announcement soon. :)
>
>
> …That, and another one, with a different slant.

Are you referring to this announcement?

http://www.erlang-solutions.com/press-releases/3/entry/1253

Mihai

Dmitrii Dimandt

Jul 27, 2011, 9:11:25 AM

to erlang-questions@erlang.org Questions

On Jul 27, 2011, at 3:58 PM, Mihai Balea wrote:

>
> On Jul 27, 2011, at 7:29 AM, Ulf Wiger wrote:
>
>>
>> On 27 Jul 2011, at 12:27, Dmitrii Dimandt wrote:
>>
>>>> They *have* run benchmarks on 64-core machines, but mainly to learn about what's around the corner, and understand what changes will be needed to get there. You will soon hear about some pretty exciting developments in the area of Erlang for massive scalability, but there is a time and a place for everything… ;-)
>>>>
>>>
>>>
>>> Also don't forget that there's a grant given to universities to further develop Erlang to become massively multicore, as Kostis Sagonas said at Erlang Factory in London
>>
>> Well, that's what I was referring to, but there will be a more coordinated announcement soon. :)
>>
>>
>> …That, and another one, with a different slant.
>
> Are you referring to this announcement?
>
> http://www.erlang-solutions.com/press-releases/3/entry/1253

Oh wow. Congrats to Erlang Solutions!

Ulf Wiger

Jul 27, 2011, 9:38:39 AM

to Mihai Balea, erlang-questions@erlang.org Questions

Yeah, something like that. Being on vacation, I haven't kept up with the announcements. :)

BR,
Ulf W

On 27 Jul 2011, at 14:58, Mihai Balea wrote:

>
> On Jul 27, 2011, at 7:29 AM, Ulf Wiger wrote:
>
>>
>> On 27 Jul 2011, at 12:27, Dmitrii Dimandt wrote:
>>
>>>> They *have* run benchmarks on 64-core machines, but mainly to learn about what's around the corner, and understand what changes will be needed to get there. You will soon hear about some pretty exciting developments in the area of Erlang for massive scalability, but there is a time and a place for everything… ;-)
>>>>
>>>
>>>
>>> Also don't forget that there's a grant given to universities to further develop Erlang to become massively multicore, as Kostis Sagonas said at Erlang Factory in London
>>
>> Well, that's what I was referring to, but there will be a more coordinated announcement soon. :)
>>
>>
>> …That, and another one, with a different slant.
>
> Are you referring to this announcement?
>
> http://www.erlang-solutions.com/press-releases/3/entry/1253
>
> Mihai

Ulf Wiger, CTO, Erlang Solutions, Ltd.
http://erlang-solutions.com

Ulf Wiger

Jul 27, 2011, 10:27:17 AM

to Dmitrii Dimandt, erlang-questions@erlang.org Questions

On 27 Jul 2011, at 15:11, Dmitrii Dimandt wrote:

>> Are you referring to this announcement?
>>
>> http://www.erlang-solutions.com/press-releases/3/entry/1253
>
> Oh wow. Congrats to Erlang Solutions!

Congrats to Erlang in general! This is a major infusion of research money on two fronts: low-level multicore performance, and high-level SMP and large-cluster performance.

Ericsson is a key player in the RELEASE project, as is Kostis, with one leg in Uppsala and one in Athens. University of Kent continues its erlang-related research and Heriot-Watt University picks it up again after a hiatus. ;-)

Another important commercial player is EDF, adding both expertise and case studies on massively scaled discrete-time simulations. One ESL angle will be using Moebius to enable capability-based orchestration and testing of large heterogeneous clusters.

The two projects also have great collaboration potential, partly due to the common Erlang ingredient, but also due to good ties between Heriot-Watt, Kent and St Andrews.

Here is a factsheet for the RELEASE project:

http://www.macs.hw.ac.uk/~trinder/RELEASEfactsheet.pdf

No corresponding factsheet for Paraphrase, I'm afraid. More detailed announcements will follow.

Ulf Wiger

Jul 27, 2011, 4:23:17 PM

to erlang-pr...@googlegroups.com, erlang-questions@erlang.org Questions

On 27 Jul 2011, at 18:58, Richard Bucker wrote:

> There is nothing in the language or VM that makes its sigma rating any better or worse than any other language or framework.

Erlang doesn't pretend to have a sigma rating. What language does (except perhaps in some very narrow context)?

Erlang has been used to build systems with better than 5-nines availability, according to the rules for measuring In-Service Performance (ISP) in Telecoms. That's what it was made for.

> Indeed Mnesia and Mnesia's replication cannot add to its rating as it fails all the time. And hello-world is not an acceptable app to test or rate.

As it happens, the system responsible for the 99.99999% availability figure floating around [1] did in fact use mnesia. Klarna uses mnesia, and has best-in-class availability among online factoring services in Scandinavia (I believe - I have no hard data, but their CTO was pretty confident at the last Erlang Factory in London). There are other examples, and just about every database used in anger has its share of horror stories too. Perhaps you shouldn't believe everything you read in the blogosphere? ;-)

[1] That was a real data point, made official by British Telecom - which is why it could be used by Joe in a talk at MIT. The average ISP reported from all our deployments was lower, but consistently better than 99.999% while I was keeping track. I cannot give you the exact numbers.

> Erlang solves some interesting problems and creates some new ones. It's more of a "religion of functionalism" than anything else. And for the same reasons it's a little dangerous and risky.

I'm not sure what you base that on. Erlang's history has clearly been driven by pragmatism, and has made it this far mainly because it proved excellent for solving the problems it was designed for.

But I admit - Erlang has gone through some different phases:
* from being young and hyped,
* to being banned, bashed and nearly killed,
* to seeing some kind of truce, where it was tolerated, as long as its proponents didn't try to make a big deal out of it,
* to wooing part of the Open Source crowd and seeing some hype again (and some bashing too, for good measure).

Having experienced all of those, I have long since given up on trying to convince anyone who doesn't want to be convinced. For the purpose of the list, I will try to comment on things that I find incorrect, and share my experience when asked.

If you don't see anything special about Erlang, and consider using it a risk, I'd assume the sensible thing is not to use it. Are you willing to entertain the possibility that other people reach a very different conclusion, without being blinded by religious fervour?

BR,
Ulf W

Ulf Wiger, CTO, Erlang Solutions, Ltd.
http://erlang-solutions.com

Max Lapshin

Jul 27, 2011, 4:36:19 PM

to Ulf Wiger, erlang-pr...@googlegroups.com, erlang-questions@erlang.org Questions

Ulf, there may be much speculation about the "most convenient language" or
the "best garbage collection", but one serious problem really is
mentioned.

It is the lack of a data structure that can survive an upgrade of only one
part of the software.

I can't create redistributable plugins for erlyvideo that use
headers with record descriptions; I have to write things like:
ems_media:get(Media, last_dts)
and frankly speaking I don't understand how other people live without this.

I understand that the C language has the same problem, but maybe there
are some ideas for delaying the instantiation of record usage in
modules?
I.e. add a "record_get(Record, name)" instruction to BEAM, which would
be translated to "element(N, Record)" on loading?

Joel Reymont

Jul 27, 2011, 4:39:04 PM

to Max Lapshin, erlang-q...@erlang.org

On Jul 27, 2011, at 9:36 PM, Max Lapshin wrote:

> I can't create redistributable plugins for erlyvideo that can use
> headers with record description, I have to write such things:
> ems_media:get(Media, last_dts)
> and frankly speaking I don't understand how do other people live without this.

What exactly is the problem with this?

Records are just tuples, no?

P.S. I don't think one needs to post to the Google group as it mirrors this list.

Max Lapshin

Jul 27, 2011, 4:49:01 PM

to Joel Reymont, erlang-q...@erlang.org

On Thu, Jul 28, 2011 at 12:39 AM, Joel Reymont <joe...@gmail.com> wrote:
>
> What exactly is the problem with this?
>
> Records are just tuples, no?
>

The problem occurs when a plugin is compiled against version 2, where
the rtmp_session record looked like:

-record(rtmp_session, {pid, user_id}).

but is launched against version 3, where rtmp_session is:

-record(rtmp_session, {pid, user_id, session_id}).

Yes, records are only tuples, and their field names are lost at runtime.
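
To make the version skew concrete, here is a minimal sketch. The module name and the helper user_id/1 are made up for illustration; only the rtmp_session record comes from the description above. It shows that code compiled against the two-field layout stops matching once the running system hands it the three-field tuple.

```erlang
-module(record_skew_sketch).
-export([demo/0]).

%% The "version 2" layout that the (hypothetical) plugin was compiled against.
-record(rtmp_session, {pid, user_id}).

%% Field access compiled against version 2: the record pattern is turned
%% into a check on the tuple's tag and size plus a positional fetch.
user_id(#rtmp_session{user_id = UserId}) -> UserId.

demo() ->
    %% A version 2 value is just a 3-tuple tagged with the record name.
    V2 = #rtmp_session{pid = self(), user_id = 42},
    {rtmp_session, _, 42} = V2,
    42 = user_id(V2),
    %% A version 3 value (extra session_id field) is a 4-tuple, so the
    %% version 2 pattern no longer matches it.
    V3 = {rtmp_session, self(), 42, <<"abc">>},
    try user_id(V3) of
        _ -> unexpected_match
    catch
        error:function_clause -> ok
    end.
```

Calling record_skew_sketch:demo() exercises both cases; the second one fails inside the old code even though the field it wants is still present in the new tuple.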

Loïc Hoguin

Jul 27, 2011, 4:49:48 PM

to Joel Reymont, erlang-q...@erlang.org

On 07/27/2011 10:39 PM, Joel Reymont wrote:
>
> On Jul 27, 2011, at 9:36 PM, Max Lapshin wrote:
>
>> I can't create redistributable plugins for erlyvideo that can use
>> headers with record description, I have to write such things:
>> ems_media:get(Media, last_dts)
>> and frankly speaking I don't understand how do other people live without this.
>
>
> What exactly is the problem with this?
>
> Records are just tuples, no?

I think that's what confuses most people. They see records as something
more, when they're just tuples and aren't designed to be anything else.

Records need to be upgraded the same as tuples when changes are made and
of course it takes a little more care than struct/frames would.

--
Loïc Hoguin
Dev:Extend

Ulf Wiger

Jul 27, 2011, 5:10:13 PM

to Max Lapshin, erlang-questions@erlang.org Questions

Max,

Yes, records are awkward for sharing across interfaces for this reason.

When I wrote the Diameter application (which is now part of OTP), I tried to mitigate the problem in two ways:

- Allow the user to pass a proplist as an alternative to the record. This requires a conversion step, and some slight overhead.

- Use 'exprecs' (part of http://github.com/esl/parse_trans, although the Diameter app has its own version), to generate accessor functions similar to what you describe. The notation is perhaps not the most beautiful, but you get used to it.

Exprecs actually also has some upgrade support, where you can specify different versions of the same record:

http://erlang.2086793.n4.nabble.com/RFC-exprecs-and-record-versions-td2114178.html

I don't know if anyone has ever used it; I've never received comments on it, as far as I recall.
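
The first option (accepting a proplist in place of the record) could look roughly like the sketch below. This is not the Diameter code; the module name and to_record/1 are hypothetical, and the record is borrowed from Max's example. The conversion step is where the slight overhead comes from.

```erlang
-module(proplist_record_sketch).
-export([to_record/1, demo/0]).

%% Current layout, known only to the module that owns the record.
-record(rtmp_session, {pid, user_id, session_id}).

%% Accept either the record itself or a list of {Field, Value} pairs.
%% Fields missing from the proplist end up as 'undefined'.
to_record(#rtmp_session{} = R) ->
    R;
to_record(Props) when is_list(Props) ->
    #rtmp_session{pid        = proplists:get_value(pid, Props),
                  user_id    = proplists:get_value(user_id, Props),
                  session_id = proplists:get_value(session_id, Props)}.

demo() ->
    %% A caller written before session_id existed keeps working,
    %% because it never depends on the tuple layout.
    R = to_record([{user_id, 42}]),
    42 = R#rtmp_session.user_id,
    undefined = R#rtmp_session.session_id,
    ok.
```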

BR,

Ulf W

Richard O'Keefe

Jul 27, 2011, 8:01:28 PM

to Michael Truog, erlang-questions@erlang.org Questions

On 27/07/2011, at 6:08 PM, Michael Truog wrote:
>
> Not even the standard library is perfect. I have yet to hear of a perfect standard library in any language. I think having problems with a standard library is a natural problem because the people that write the standard library impose a taxonomy on functionality that not everyone shares naturally, because we are not psychic. Learning the standard library just seems like a natural process in computer science when you get to relate a standard library to the others you had the misfortune to use in the past. Why bother complaining about a taxonomy that is just as bad as any other? If it serves the purpose it was designed for well, then there is no reason to care, it just happens to be different from what you might expect based on your own limited knowledge.

Let me offer a two-fold contrast with Eiffel here.

When Bertrand Meyer designed Eiffel, he didn't just design a language,
he designed a *consistent* naming convention to be applied across
*all* library classes. From time to time this got revised, but it
did mean that there was a *documented* convention: once you had learned
the names for one class, you knew a great deal about what the names in
another class meant.

I tried to follow a similar policy in the Quintus Prolog library.

That's the kind of "inconsistency" people talk about when they criticise
the Erlang/OTP libraries: same name for different things, different names
for same things.

That's the contrast that makes Eiffel look good and Erlang look bad.

Now for the other contrast.

The Eiffel libraries were proprietary, so other Eiffel implementors
had to come up with others. Eventually the N.I.C.E. came up with a
common but minimal library. Oy was it minimal! The nearest thing
there was to a _useful_ Eiffel library was the GOBO library, and one
of the Eiffel implementations (as it happened, the one I used) changed
too much for the GOBO maintainer to keep up (or me; I stopped using
Eiffel).

The *coverage* of a library can be more important than its *consistency*.
And to be honest, I think coverage is going to drive Erlang/OTP library
development more than consistency is. If someone wants to use SCTP, it
is probably more important to them to have a finished SCTP module they
can use now than to have a beautiful module they might be able to use
next year, maybe.

It might be interesting to run some sort of poll to see what currently
unsupported areas people care about most, particularly the people who
would say "I'd like to use Eiffel but I cannot because it does not
have library support for <whatever>."

Loïc Hoguin

Jul 27, 2011, 8:15:40 PM

to Richard O'Keefe, erlang-questions@erlang.org Questions

On 07/28/2011 02:01 AM, Richard O'Keefe wrote:
> It might be interesting to run some sort of poll to see what currently
> unsupported areas people care about most, particularly the people who
> would say "I'd like to use Eiffel but I cannot because it does not
> have library support for <whatever>."

Proper Unicode support in Erlang would probably get many votes.

--
Loïc Hoguin
Dev:Extend

Richard O'Keefe

Jul 27, 2011, 8:39:44 PM

to Lukas Larsson, erlang-questions@erlang.org Questions

On 28/07/2011, at 12:06 AM, Lukas Larsson wrote:

> Here[1] is a pretty recent publication about testing Erlang on a TILEPro64. It also contains a quite detailed description of how the BEAM VM works internally, with its scheduling and memory allocation algorithms.
>
> Lukas
>
> [1] http://kth.diva-portal.org/smash/record.jsf?pid=diva2:392243

Thanks a lot for the reference.
You've got to love this sentence from the abstract:

scalability can be improved by reducing lock contention.

We all _know_ that fire is hot and water is wet, but sometimes
it's nice to see it proved...

It looks as though the claim that Erlang runs out of steam at 16
cores is a furphy, and that there are good ideas about how to make it
even better.

Bob Ippolito

Jul 27, 2011, 9:04:03 PM

to Loïc Hoguin, erlang-questions@erlang.org Questions

On Wed, Jul 27, 2011 at 5:15 PM, Loïc Hoguin <es...@dev-extend.eu> wrote:
> On 07/28/2011 02:01 AM, Richard O'Keefe wrote:
>> It might be interesting to run some sort of poll to see what currently
>> unsupported areas people care about most, particularly the people who
>> would say "I'd like to use Eiffel but I cannot because it does not
>> have library support for <whatever>."
>
> Proper Unicode support in Erlang would probably get many votes.

UTF8 support works great these days, what else should you need? ;)

-bob

Loïc Hoguin

Jul 27, 2011, 9:26:17 PM

to Bob Ippolito, erlang-questions@erlang.org Questions

On 07/28/2011 03:04 AM, Bob Ippolito wrote:
> On Wed, Jul 27, 2011 at 5:15 PM, Loïc Hoguin <es...@dev-extend.eu> wrote:
>> On 07/28/2011 02:01 AM, Richard O'Keefe wrote:
>>> It might be interesting to run some sort of poll to see what currently
>>> unsupported areas people care about most, particularly the people who
>>> would say "I'd like to use Eiffel but I cannot because it does not
>>> have library support for <whatever>."
>>
>> Proper Unicode support in Erlang would probably get many votes.
>
> UTF8 support works great these days, what else should you need? ;)

You can output UTF8 as binary, yes. Maybe as strings too (I'm not really
using those so I wouldn't know). But to give an example, can you search
inside your UTF8 text for the word "trouvé" including all different
variants of the é character (perhaps even just 'e')? Byte search isn't
doing any good here.

--
Loïc Hoguin
Dev:Extend

Bob Ippolito

Jul 27, 2011, 9:38:28 PM

to Loïc Hoguin, erlang-questions@erlang.org Questions

On Wed, Jul 27, 2011 at 6:26 PM, Loïc Hoguin <es...@dev-extend.eu> wrote:
> On 07/28/2011 03:04 AM, Bob Ippolito wrote:
>> On Wed, Jul 27, 2011 at 5:15 PM, Loïc Hoguin <es...@dev-extend.eu> wrote:
>>> On 07/28/2011 02:01 AM, Richard O'Keefe wrote:
>>>> It might be interesting to run some sort of poll to see what currently
>>>> unsupported areas people care about most, particularly the people who
>>>> would say "I'd like to use Eiffel but I cannot because it does not
>>>> have library support for <whatever>."
>>>
>>> Proper Unicode support in Erlang would probably get many votes.
>>
>> UTF8 support works great these days, what else should you need? ;)
>
> You can output UTF8 as binary, yes. Maybe as strings too (I'm not really
> using those so I wouldn't know). But to give an example, can you search
> inside your UTF8 text for the word "trouvé" including all different
> variants of the é character (perhaps even just 'e')? Byte search isn't
> doing any good here.

It sounds like you want a Unicode normalization library; I don't think
this is really a search problem. In Python you'd do this with the
unicodedata module. You're right that there is nothing that ships with
Erlang for this purpose, at least not that I know of. It seems like
this might be easy to solve in a third-party library, maybe a binding
to ICU. At least one of these probably already exists.
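
As a sketch of the "normalize first, then search" idea: the example below assumes an OTP release recent enough to have the normalization functions in the unicode module (OTP 20 or later, which postdates this thread). The module name and contains/2 are made up for illustration. It handles the precomposed versus decomposed é case only; matching a bare 'e' as well would need an extra step that strips combining marks.

```erlang
-module(nfc_search_sketch).
-export([contains/2, demo/0]).

%% Bring both the text and the search word to NFC, then do an
%% ordinary byte-level search on the resulting UTF-8 binaries.
contains(Text, Word) ->
    T = unicode:characters_to_nfc_binary(Text),
    W = unicode:characters_to_nfc_binary(Word),
    binary:match(T, W) =/= nomatch.

demo() ->
    %% "trouvé" written with the precomposed code point U+00E9 ...
    Precomposed = [$t, $r, $o, $u, $v, 16#E9],
    %% ... and with a plain "e" followed by U+0301 (combining acute).
    Decomposed = [$t, $r, $o, $u, $v, $e, 16#301],
    true = contains(Decomposed, Precomposed),
    ok.
```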

-bob

Jon Watte

Jul 27, 2011, 10:23:17 PM

to Hendrik Visage, erlang-questions

A rack in each city? Is the Erlang kernel getting more latency tolerant?

Btw, when it comes to "cost effective hardware":

With regard to hardware: the price/performance curve right now favors two dies with 6 hyperthreaded cores on each, for a total of 24 hardware threads. Intel will soon be selling (if they aren't already) parts with 10, 12 and more cores, and you can buy motherboards and core logic with support for four-way and eight-way dies, but the price/performance isn't quite there yet.

When it comes to RAM, you get 32 GB of high-quality, high-performance ECC RAM for about $600. That's almost less than some companies charge for shipping :-/

When it comes to disk space, you can pay $100 for a SATA spindle, or $200 for a SAS spindle, or $400 for a SSD disk. However, that SSD disk will have 20x the transaction throughput of the other spindles, so again, there's really no question that any new box should have an SSD. Even large, database server type boxes generally should have SSDs, except for long-term storage nodes (I'm talking dozens of terabytes and up) where transaction throughput simply doesn't matter.

We're also getting to the point where 10 Gbps Ethernet is palatable. It's certainly cheaper to get one 10 Gbps port than 10 1 Gbps ports in switching and network hardware. I imagine that a year from now, you'll start seeing 1 Gbps being pushed down to the "connectivity" part of the segment, and 10 Gbps will be default for any new switches, on-motherboard networks, etc.

So, the *current* best-value hardware looks something like:

32 GB of RAM

24 hardware threads (in 2-way NUMA, btw -- does BEAM pay attention to memory affinity?)

240 GB SSD, 1 or 2 (RAID-1 for redundancy)

Probably 10 Gbps networking

Next year's best-value hardware will probably look something like:

64 GB of RAM

40 hardware threads (still with 2-way NUMA)

240 GB SSD, 1 or 2 (RAID-1 for redundancy) (it will be cheaper than this year, but still the "sweet spot" unless you're building RAID 6 volumes or something)

Definitely 10 Gbps networking

We're all living in the future! :-)

Sincerely,

jw

--
Americans might object: there is no way we would sacrifice our living standards for the benefit of people in the rest of the world. Nevertheless, whether we get there willingly or not, we shall soon have lower consumption rates, because our present rates are unsustainable.

Jon Watte

Jul 27, 2011, 10:28:00 PM

to Max Lapshin, erlang-pr...@googlegroups.com, erlang-questions@erlang.org Questions

> It is lack of data structure, which can survive upgrade of only one
> part of software.

All modern languages have a simple hash map class that is efficient and built into the language. JavaScript, Python, Ruby, etc.

In Erlang, we have dict, and gb_trees, and proplists, and ets, but none of them are as easy to use, and some of them have performance problems once you scale over a few dozen or a few hundred elements, and others (ets) are way overkill for this particular use case.

Or would anyone want to argue that "dict" is actually as good/productive as JavaScript objects or Python dicts?

Sincerely,

jw

Loïc Hoguin

Jul 28, 2011, 4:39:59 AM

to Bob Ippolito, erlang-questions@erlang.org Questions

Hello,

>> You can output UTF8 as binary, yes. Maybe as strings too (I'm not really
>> using those so I wouldn't know). But to give an example, can you search
>> inside your UTF8 text for the word "trouvé" including all different
>> variants of the é character (perhaps even just 'e')? Byte search isn't
>> doing any good here.
>
> It sounds like you want a unicode normalization library, I don't think
> this is really a search problem. In Python you'd do this with the
> unicodedata module. You're right that there is nothing that ships with
> Erlang for this purpose, at least not that I know of. It seems like
> this might be easy to solve in a third party library, maybe a binding
> to ICU. At least one of these probably already exists.

Well yeah. Actually I should have just mentioned something simpler like
to_upper that produces quite unexpected effects when done wrong in Unicode.

I retract my statement though. Michael Uvarov forwarded me off-list to
this library that seems to be just what's needed for any kind of Unicode
string manipulation, although I didn't test it:
https://github.com/freeakk/ux

--
Loïc Hoguin
Dev:Extend

Ulf Wiger

Jul 28, 2011, 6:33:39 AM

to Jon Watte, erlang-questions

On 28 Jul 2011, at 04:23, Jon Watte wrote:

> A rack in each city? Is the Erlang kernel getting more latency tolerant?

I assume you are referring to the occasional "node not responding" issues? As far as I know, the kernel doesn't have issues with latency.

The problems with Distributed Erlang are related to a heavy-handed backpressure solution, where processes trying to send to the dist_port are simply suspended if the output queue exceeds a given threshold. When the queue falls under the threshold, all suspended processes are resumed. Since the algorithm doesn't differentiate between processes, this fate can befall the net ticker as well.
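
A toy model of that behaviour, for illustration only: this is not the runtime's actual code, and the module, the functions and the Limit value are all invented. It keeps a queue length and a list of suspended senders, suspends any sender that arrives while the queue is at the threshold, and resumes everyone at once when the queue drops below it.

```erlang
-module(busy_port_sketch).
-export([send/3, drain/2, demo/0]).

%% State is {QueueLength, SuspendedSenders}.
%% Limit is a hypothetical threshold, not a real VM setting.
send(Sender, Limit, {Len, Suspended}) when Len >= Limit ->
    %% Queue at the threshold: the sender is suspended, whoever it is.
    {Len, [Sender | Suspended]};
send(_Sender, _Limit, {Len, Suspended}) ->
    %% Room in the queue: the message is enqueued.
    {Len + 1, Suspended}.

%% One message leaves the queue; if that brings the queue below the
%% threshold, every suspended sender is resumed at the same time.
drain(Limit, {Len, Suspended}) ->
    NewLen = max(Len - 1, 0),
    case NewLen < Limit of
        true  -> {NewLen, []};
        false -> {NewLen, Suspended}
    end.

demo() ->
    Limit = 2,
    S1 = send(worker_a, Limit, {0, []}),
    S2 = send(worker_b, Limit, S1),
    %% The net ticker gets no special treatment and is suspended too.
    {2, [net_ticker]} = send(net_ticker, Limit, S2),
    {1, []} = drain(Limit, {2, [net_ticker]}),
    ok.
```

The point of the model is the third send: nothing distinguishes the ticker from ordinary workers, which is how a busy link can end up looking like a dead node.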

This *has* been improved a bit in the latest release, and I believe more improvements are forthcoming. Simply increasing the thresholds (currently not configurable) should mostly eliminate the problem.

> So, the *current* best-value hardware looks something like:
> 32 GB of RAM
> 24 hardware threads (in 2-way NUMA, btw -- does BEAM pay attention to memory affinity?)
> 240 GB SSD, 1 or 2 (RAID-1 for redundancy)
> Probably 10 Gbps networking

BEAM is starting to make use of NUMA, for example when allowing you to control the binding of schedulers to cores. See e.g.

http://www.erlang.org/doc/man/erl.html (search for NUMA)
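
For a quick look at what a running node reports, something like the following sketch can be used. The module name is made up; the system_info items are standard ones, and the values depend on the machine and on how the node was started (for example with a scheduler bind type given via "erl +sbt db").

```erlang
-module(sched_info_sketch).
-export([report/0]).

%% Collect what the VM knows about schedulers and CPU topology.
%% cpu_topology may be 'undefined' if the VM could not detect it.
report() ->
    [{schedulers,          erlang:system_info(schedulers)},
     {cpu_topology,        erlang:system_info(cpu_topology)},
     {scheduler_bind_type, erlang:system_info(scheduler_bind_type)},
     {scheduler_bindings,  erlang:system_info(scheduler_bindings)}].
```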

I believe that the current activities are mainly laying the groundwork for some more powerful optimizations, e.g. delayed deallocation, but R14B actually included quite a few improvements already, esp. in regard to locking.

http://www.erlang.org/download/otp_src_R14B.readme

Still, micro benchmarks have indicated that memory allocation (not least GC meta-data) locking issues mainly start affecting performance somewhere beyond 40 cores. The question is how much this really affects applications on the "usual" hardware of today?

One problem is that it's hard to do detailed profiling on complex real-world applications. The issues limiting scalability might well be wholly unrelated to core VM aspects such as GC, scheduling and message passing. In the first SMP experiments with Ericsson's Telephony Gateway Controller, the big bottleneck was the big lock protecting the ports and linked-in drivers.

> Next year's best-value hardware will probably look something like:
> 64 GB of RAM
> 40 hardware threads (still with 2-way NUMA)
> 240 GB SSD, 1 or 2 (RAID-1 for redundancy) (it will be cheaper than this year, but still the "sweet spot" unless you're building RAID 6 volumes or something)
> Definitely 10 Gbps networking

Yes, but one thing I learned while at Ericsson was that NEBS-compliant ATCA processor boards don't exactly stay on the leading edge of processor capacity. The top-end blade servers today seem to host up to two dual- or quad-core processors. This is not to say that everyone has to evolve at the same pace, but the main funding sources for Erlang/OTP tend to follow this path.

Now, Joe has publicly mentioned running an application on a 24-core architecture, for which the optimum setup at the time seemed to be 4 erlang nodes - one for each physical CPU. The problems arise when the application isn't embarrassingly parallel, but requires processes to actually interact with each other, sometimes in fairly complex ways. Also, these many-core architectures are still finding their way in terms of memory access architectures, and each vendor has different ideas. The combination of bottlenecks in the VM, limitations of the hardware architecture, and complex interaction patterns, can easily result in emergent behaviour, which can be quite dramatic.

My own take on this is that it's something we will have to learn to live with, and I started developing Jobs - a load control framework (http://github.com/esl/jobs) to allow for "traffic regulation" of erlang-based systems similarly to how one achieves quality of service on TCP-based networks (another messaging system).

The key, in my experience, is not usually to go as fast as possible, but to deliver reliable and predictable performance that is good enough for the problem at hand. As many have pointed out through the years, Erlang was never about delivering maximum performance.

BR,

Ulf W

Jesper Louis Andersen

unread,

Jul 28, 2011, 6:37:15 AM7/28/11

to Jon Watte, erlang-pr...@googlegroups.com, erlang-questions@erlang.org Questions

On Thu, Jul 28, 2011 at 04:28, Jon Watte <jwa...@gmail.com> wrote:

> All modern languages have a simple hash map class that is efficient and
> built-into the language. JavaScript, Python, Ruby, etc.
> In Erlang, we have dict, and gb_tree, and proplist, and ets, but none of
> them are as easy to use, and some of them have performance problems once you
> scale over a few dozen or a few hundred elements, and others (ets) are way
> overkill for this particular use case.
>
> Or would anyone want to argue that "dict" is actually as good/productive as
> JavaScript objects or Python dicts?

JavaScript, Python, Ruby, etc almost all have ephemeral dict
structures. In Erlang, the structure must be persistent, like in
Clojure/Haskell and so on. This rules out a number of possible
structures.

As it stands, "dict" is probably the right replacement, with gb_trees
for some situations as they have different speed in those. The "dict"
does fairly well, but it is implemented in Erlang and fast as it is,
it cannot compete with a C-implementation of an ephemeral structure.
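
To make the persistent/ephemeral distinction concrete, here is a minimal sketch (the module name `persist` is just for illustration). Updating a `dict` or a `gb_trees` tree returns a new structure and leaves the old one readable, which is exactly what an in-place hash table cannot offer:

```erlang
-module(persist).
-export([demo/0]).

%% Store a key, then "overwrite" it. Both versions of each structure
%% remain valid afterwards, because updates return a new value instead
%% of mutating the old one.
demo() ->
    D0 = dict:store(colour, red, dict:new()),
    D1 = dict:store(colour, blue, D0),
    T0 = gb_trees:enter(colour, red, gb_trees:empty()),
    T1 = gb_trees:enter(colour, blue, T0),
    {dict:fetch(colour, D0), dict:fetch(colour, D1),
     gb_trees:get(colour, T0), gb_trees:get(colour, T1)}.
```

Calling `persist:demo()` should show the old and new values side by side for both structures.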

Should we want to BIF/NIFify such a structure, I'd recommend either a
patricia hash table or a HAMT. In fact, that would be an interesting
thing to try. Though I wonder how much I can alter the in-memory
structure of the tree from Erlang terms - I'd really like some cache
efficient packing there for tree nodes.

--
J.

Joel Reymont

unread,

Jul 28, 2011, 6:41:47 AM7/28/11

to Ulf Wiger, erlang-questions

On Jul 28, 2011, at 11:33 AM, Ulf Wiger wrote:

> The problems with Distributed Erlang are related to a heavy-handed backpressure solution, where processes trying to send to the dist_port are simply suspended if the output queue exceeds a given threshold. When the queue falls under the threshold, all suspended processes are resumed. Since the algorithm doesn't differentiate between processes, this fate can befall the net ticker as well.

I thought my net splits were due to heavy process messaging traffic and the net ticker messages falling behind.

That didn't quite explain it but what you said does.

Ulf Wiger

unread,

Jul 28, 2011, 6:51:13 AM7/28/11

to Richard O'Keefe, erlang-questions@erlang.org Questions

On 28 Jul 2011, at 02:39, Richard O'Keefe wrote:

On 28/07/2011, at 12:06 AM, Lukas Larsson wrote:

...

[1] http://kth.diva-portal.org/smash/record.jsf?pid=diva2:392243

...

It looks as though the claim that Erlang runs out of steam at 16
is a furphy, and that there are good ideas about how to make it
even better.

Indeed (although I had to look up the word "furphy").

The picture on page 64, illustrating near-perfect scalability of the Big Bang benchmark on a simulated 128-core SPARC [1] is nice - partly because not everyone has access to the fairly pricey Simics simulations, but also because it illustrates how much of the scalability problem on multicore is about negotiating the divide between processor speed and memory bandwidth [2].

One may also ponder, based on this, how important it is to understand the memory access characteristics - and by extension, the dependency patterns - of the application you want to scale. These are exciting times… :)

BR,

Ulf W

[1] I learned from Simics experts that SPARC is easy to scale this way because it is a very symmetrical architecture. In contrast, many other vendors have specific architectures for their 4-, 8-, 16-core systems, and so on, so just tweaking the number of cores is not very meaningful.

[2] For those who haven't read the paper, the simulation assumed zero cost for memory access.

Ulf Wiger

unread,

Jul 28, 2011, 7:00:45 AM7/28/11

to Joel Reymont, erlang-questions

On 28 Jul 2011, at 12:41, Joel Reymont wrote:

On Jul 28, 2011, at 11:33 AM, Ulf Wiger wrote:

The problems with Distributed Erlang are related to a heavy-handed backpressure solution, where processes trying to send to the dist_port are simply suspended if the output queue exceeds a given threshold. When the queue falls under the threshold, all suspended processes are resumed. Since the algorithm doesn't differentiate between processes, this fate can befall the net ticker as well.

I thought my net splits were due to heavy process messaging traffic and the net ticker messages falling behind.

That didn't quite explain it but what you said does.

Yeah, we (or mainly, Michal Ptaszek) had reason to dig into this fairly recently, and found that tuning can really make a big difference. Still, the whole area should be revisited for smarter overload handling.

A particularly interesting fault situation was when this dynamic ended up suspending the rpc server. It could still receive and process requests (spawning dynamic workers for the processing), but was suspended practically every time it tried to send a reply. Eventually, its message queue used up all memory and killed the node. :)

Actually, these changes in R14B01 are relevant:

    OTP-8901  The runtime system is now less eager to suspend processes
	      sending messages over the distribution. The default value of
	      the distribution buffer busy limit has also been increased
	      from 128 KB to 1 MB. This in order to improve throughput.

    OTP-8912  The distribution buffer busy limit can now be configured at
	      system startup. For more information see the documentation of
	      the erl +zdbbl command line flag. (Thanks to Scott Lystig
	      Fritchie)

and possibly also this:

    OTP-8916  The inet driver internal buffer stack implementation has been
	      rewritten in order to reduce lock contention.

(http://www.erlang.org/download/otp_src_R14B01.readme)
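
For anyone wanting to check what limit a node is actually running with, a small sketch follows. The module name `dbbl` is made up for the example, and it assumes R14B01 or later, where `+zdbbl` takes a size in kilobytes and `erlang:system_info(dist_buf_busy_limit)` reports the limit in bytes:

```erlang
%% Start the node with a larger limit, for example 8 MB:
%%     erl +zdbbl 8192 -sname a
-module(dbbl).
-export([limit_kb/0]).

%% Returns the distribution buffer busy limit in kilobytes,
%% i.e. the same unit that +zdbbl uses on the command line.
limit_kb() ->
    erlang:system_info(dist_buf_busy_limit) div 1024.
```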

Joel Reymont

unread,

Jul 28, 2011, 7:03:12 AM7/28/11

to Ulf Wiger, erlang-questions

On Jul 28, 2011, at 12:00 PM, Ulf Wiger wrote:

> Yeah, we (or mainly, Michal Ptaszek) had reason to dig into this fairly recently, and found that tuning can really make a big difference.

Tuning what?

> A particularly interesting fault situation was when this dynamic ended up suspending the rpc server.

How did you find out?

Robert Virding

unread,

Jul 28, 2011, 7:33:23 AM7/28/11

to Jesper Louis Andersen, erlang-pr...@googlegroups.com, erlang-questions@erlang.org Questions

On 28/07/2011 12:37, Jesper Louis Andersen wrote:
> On Thu, Jul 28, 2011 at 04:28, Jon Watte<jwa...@gmail.com> wrote:
>
>> All modern languages have a simple hash map class that is efficient and
>> built-into the language. JavaScript, Python, Ruby, etc.
>> In Erlang, we have dict, and gb_tree, and proplist, and ets, but none of
>> them are as easy to use, and some of them have performance problems once you
>> scale over a few dozen or a few hundred elements, and others (ets) are way
>> overkill for this particular use case.
>>
>> Or would anyone want to argue that "dict" is actually as good/productive as
>> JavaScript objects or Python dicts?
> JavaScript, Python, Ruby, etc almost all have ephemeral dict
> structures. In Erlang, the structure must be persistent, like in
> Clojure/Haskell and so on. This rules out a number of possible
> structures.
>
> As it stands, "dict" is probably the right replacement, with gb_trees
> for some situations as they have different speed in those. The "dict"
> does fairly well, but it is implemented in Erlang and fast as it is,
> it cannot compete with a C-implementation of an ephemeral structure.
>
> Should we want to BIF/NIFify such a structure, I'd recommend either a
> patricia hash table or a HAMT. In fact, that would be an interesting
> thing to try. Though I wonder how much I can alter the in-memory
> structure of the tree from Erlang terms - I'd really like some cache
> efficient packing there for tree nodes.

dict uses linear hashing (as does ets) to make sure that there are no
long pauses to interrupt the interactivity of the system. This would
definitely be more efficient if it were to be done in C in a NIF.
Unfortunately you would still have to copy parts of the tree when doing
updates to preserve the immutability property of erlang data. Or use a
smart way of doing destructive updates and keeping change lists.

Robert

--
Robert Virding, Erlang Solutions Ltd.

Ulf Wiger

unread,

Jul 28, 2011, 7:56:00 AM7/28/11

to Joel Reymont, erlang-questions

On 28 Jul 2011, at 13:03, Joel Reymont wrote:

>
> On Jul 28, 2011, at 12:00 PM, Ulf Wiger wrote:
>
>> Yeah, we (or mainly, Michal Ptaszek) had reason to dig into this fairly recently, and found that tuning can really make a big difference.
>
> Tuning what?

As I mentioned (indirectly) in that email, the distribution buffer busy limit is now tunable, but it has also been increased, as of R14B01, from 128 KB to 1 MB, so that may well be enough.

>
>> A particularly interesting fault situation was when this dynamic ended up suspending the rpc server.
>
> How did you find out?

Well, we could see from the crash dumps that it was the rpc server's message queue that caused the OOM, and it was also listed as 'suspended' (I believe). Initially, we had enabled a system_monitor on busy_dist_port events. As that helped us identify busy_dist_port as a culprit, we [1] inserted some debug printouts in the VM to increase our understanding. Of course, Tony Rogvall and Rickard Green were also extremely helpful. :)
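
A rough sketch of the system_monitor step, for those who want to try it. This is not the code we used; the module name `dist_watch` and the logging are assumptions, while `erlang:system_monitor/2` and the `busy_dist_port` option are standard:

```erlang
-module(dist_watch).
-export([start/0]).

%% Spawn a logger process and register it as the node's system monitor
%% for busy_dist_port events. Note that a node has only one system
%% monitor at a time, so this replaces any previous setting.
start() ->
    Pid = spawn(fun loop/0),
    erlang:system_monitor(Pid, [busy_dist_port]),
    Pid.

loop() ->
    receive
        {monitor, SuspendedPid, busy_dist_port, Port} ->
            %% SuspendedPid was suspended while sending over Port
            error_logger:warning_msg("~p suspended on busy dist port ~p~n",
                                     [SuspendedPid, Port]),
            loop();
        stop ->
            ok
    end.
```

If the rpc server (or the net ticker) keeps showing up as the suspended pid, that is a strong hint that you are in the situation described above.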

BR,
Ulf W

[1] Michal Ptaszek, that is. I was an enthusiastic cheerleader and messenger boy, relaying ideas - and having a few of my own.

Ulf Wiger, CTO, Erlang Solutions, Ltd.
http://erlang-solutions.com

Joel Reymont

unread,

Jul 28, 2011, 8:07:24 AM7/28/11

to Ulf Wiger, erlang-questions Questions

On Jul 28, 2011, at 12:56 PM, Ulf Wiger wrote:

> I was an enthusiastic cheerleader and messenger boy

I shall make that my official title.

Dmitrii Dimandt

unread,

Jul 28, 2011, 10:26:47 AM7/28/11

to erlang-questions@erlang.org Questions

On Jul 28, 2011, at 3:01 AM, Richard O'Keefe wrote:

It might be interesting to run some sort of poll to see what currently
unsupported areas people care about most, particularly the people who
would say "I'd like to use Eiffel but I cannot because it does not
have library support for <whatever>."

You might want to split such a poll in several groups — one for hackers/enthusiasts/early adopters and one "for the average programmer".

Because the results will be quite different.

The first group would probably want something like worker pools, various protocol implementations, control structures or what not.

The second group would want image manipulation libraries on par with imagemagick, full PCRE support (with unicode!) and a unified database driver a la JDBC.

Some of these choices may overlap, but not all of them.

===================================

Dmitrii Dimandt
dmi...@dmitriid.com

------------------------------------------------------------
Erlang in Russian
http://erlanger.ru/

TurkeyTPS

http://turkeytps.com/

------------------------------------------------------------

LinkedIn: http://www.linkedin.com/in/dmitriid

GitHub: https://github.com/dmitriid

Scott Lystig Fritchie

unread,

Jul 28, 2011, 1:45:50 PM7/28/11

to Ulf Wiger, erlang-questions

Ulf Wiger <ulf....@erlang-solutions.com> wrote:

uw> Actually, these changes in R14B01 are relevant:

uw> [...]

uw> OTP-8912 The distribution buffer busy limit can now be configured at
uw> system startup. For more information see the documentation of the
uw> erl +zdbbl command line flag. (Thanks to Scott Lystig Fritchie)

I would be quite interested to hear if others have used that flag and
what values have been helpful or unhelpful. Since I moved to Basho, I
haven't been able to get updates "from the field" about how my Former
Employer has been using it. (Or they've told me and I've forgotten
about it.) Perhaps one of my former colleagues lurking here on the list
could mention something?

It would also be quite interesting to know if R14's net_kernel can still
be deadlocked. (I'd submitted a patch for R13 to fix a nasty one, and
the OTP folks reworked it a bit for inclusion into R14.)

The rpc server deadlock ... ouch. One trick for the net_kernel patch
was to spawn a new process to send the reply: if that short-lived
process blocked on a busy TCP port, at least the main server wouldn't be
blocked.
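
The trick looks roughly like the sketch below. This is not the actual net_kernel patch, just a toy server (module name `reply_worker` and the `{echo, X}` request are invented for illustration) showing the shape of it:

```erlang
-module(reply_worker).
-export([start/0, call/2]).

start() ->
    spawn(fun loop/0).

%% Synchronous client call with a unique reference to match the reply.
call(Server, Request) ->
    Ref = make_ref(),
    Server ! {request, self(), Ref, Request},
    receive
        {reply, Ref, Result} -> Result
    after 5000 ->
        timeout
    end.

loop() ->
    receive
        {request, From, Ref, Request} ->
            Result = handle(Request),
            %% Send the reply from a short-lived process. If the send
            %% gets suspended on a busy distribution port, only this
            %% helper is stuck; the server keeps draining its mailbox.
            spawn(fun() -> From ! {reply, Ref, Result} end),
            loop()
    end.

handle({echo, X}) -> X.
```

The cost is one extra spawn per reply, and replies to the same client may arrive out of order, which is why each reply carries its reference.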

-Scott

Tim Watson

unread,

Jul 28, 2011, 6:06:19 PM7/28/11

to Dmitrii Dimandt, erlang-questions@erlang.org Questions

> The second group would want image manipulation libraries on par with
> imagemagick, full PCRE support (with unicode!) and a unified database driver
> a la JDBC.

I'm not sure if I qualify as "average" or not, but I would really like
to have an edbc API that all the various database drivers (incl. 3rd
party ones) supported. I toyed with the idea of writing one and doing
some compile time code generation to generate mappings to the drivers
we use (postgres, mysql, odbc for mssql) but haven't done it yet as
none of the team could agree on what it should look like. :)

Richard O'Keefe

unread,

Jul 28, 2011, 7:45:21 PM7/28/11

to Ulf Wiger, erlang-questions

One of the things criticised in the blog entry that we've been responding to was
that
{ok,Foo} = bar(...),
{ok,Foo} = ugh(...)
is too easy to write (when you really meant, say, Foo0, Foo1).

This is a well defined part of the language, and it would not be a good idea to
ban it. But how about an optional style warning (and we really need
-warn(on | off, [warning...])
directives) whenever a bound variable appears in a pattern on the left of "="?
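
For readers who have not been bitten by this yet, a small sketch of what the second match does (module and function names are invented for the example). With Foo already bound, the second "=" is an assertion, not an assignment:

```erlang
-module(rebind).
-export([same/0, different/0]).

bar() -> {ok, 1}.
ugh(N) -> {ok, N}.

%% Second match succeeds because ugh/1 returns the same value.
same() ->
    {ok, Foo} = bar(),
    {ok, Foo} = ugh(1),
    Foo.

%% Second match fails with badmatch because Foo is already bound to 1.
different() ->
    {ok, Foo} = bar(),
    try
        {ok, Foo} = ugh(2),
        Foo
    catch
        error:{badmatch, V} -> {badmatch, V}
    end.
```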

OvermindDL1

unread,

Jul 28, 2011, 9:19:59 PM7/28/11

to Tim Watson, Dmitrii Dimandt, erlang-questions@erlang.org Questions

It should be possible to make a nice interface that is parse-transformed into optimized code based on the engine selected in a config file or something.

Tim Watson

unread,

Jul 29, 2011, 5:18:00 AM7/29/11

to OvermindDL1, Dmitrii Dimandt, erlang-questions@erlang.org Questions

On 29 July 2011 02:19, OvermindDL1 <overm...@gmail.com> wrote:
> It should be possible to make a nice interface that is parse-transformed
> into optimized code based on the engine selected in a config file or something.
>

That's exactly what I have in mind. In fact, I was thinking of doing
this for a logging API first of all (so you can choose between
log4erl, error_logger, lager, etc) and then pulling out the common
configuration and code generation features into a library/tool and
reusing that to build a database connectivity API. I'll post an update
if I actually get around to it! :)

Tim Watson

unread,

Jul 29, 2011, 5:30:10 AM7/29/11

to Richard O'Keefe, erlang-questions

On 29 July 2011 00:45, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
> One of the things criticised in the blog entry that we've been responding to was
> that
> {ok,Foo} = bar(...),
> {ok,Foo} = ugh(...)
> is too easy to write (when you really meant, say, Foo0, Foo1).
>
> This is a well defined part of the language, and it would not be a good idea to
> ban it. But how about an optional style warning (and we really need
> -warn(on | off, [warning...])
> directives) whenever a bound variable appears in a pattern on the left of "="?

We would need to be able to set that warning with a very localised
scope though. What you are intending in such a match might be
*exactly* what the code says - call bar/1 to get a Foo and then call
ugh/2 and match (assert) that the resulting tuple contains exactly the
same Foo. Things like a session id, (ets) table id and so on are
probably examples of this. I won't comment on whether this is good API
design or not.

A -warn(on | off, [Warning]) sounds lovely, but would only work at
module level (or as some option passed in to the compiler) so how
would I do (or override) this for an individual line of code or for a
single function? What about function level attributes? That might get
the scope at which the switch is applied tight enough to be useful:

-pragma(no_warnings, [bound_variable_in_match]).
fwibble_ardvarks() ->
    {ok,Foo} = bar(...),
    {ok,Foo} = ugh(...),
    %% etc....
    Foo.

Harder to implement though, as a bigger change to the compiler I'm
guessing. It would however, open up lots of other possibilities -
imagine how useful function annotations would be for parse_transform
writers, or even at runtime if they were preserved in the generated
beam!

-transactional(required | start_new, [{distributed, false}]).
do_something(Data) ->
    Object = convert(Data),
    mnesia:write(Object).
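
As a point of comparison for scoping warning switches, the compiler already accepts a few options that name individual functions, although they are still written at module level. A minimal sketch (module and function names are made up; `nowarn_unused_function` with a list of Name/Arity pairs is an existing compile option, the bound-variable warning discussed above is not):

```erlang
-module(scoped_warn).
-export([run/0]).

%% Silence the "function is unused" warning for helper/0 only;
%% any other unused function in this module would still be reported.
-compile({nowarn_unused_function, [{helper, 0}]}).

run() -> ok.

helper() -> unused.
```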

Alex Arnon

unread,

Jul 29, 2011, 5:27:53 PM7/29/11

to Tim Watson, Dmitrii Dimandt, erlang-questions@erlang.org Questions

I had built a rather basic prototype that uses a jnode running a JDBC proxy.
This approach would increase latency, but provide access to every RDBMS under the sun.

I think that what is truly missing is an agreed-upon, standard EDBC API definition. IMHO one that is based on JDBC could certainly be good enough.
Once that is in place, implementation is just a detail :)
Erlang Community, time to claim your place in the Enterprise!!!! :)

Tim Watson

unread,

Jul 29, 2011, 8:44:52 PM7/29/11

to Alex Arnon, Dmitrii Dimandt, erlang-questions@erlang.org Questions

On 29 July 2011 22:27, Alex Arnon <alex....@gmail.com> wrote:
> I had built a rather basic prototype that uses a jnode running a JDBC proxy.
> This approach would increase latency, but provide access to every RDBMS
> under the sun.

That really scares me, and I would hesitate to use it. If you're planning
on conquering the enterprise, I suspect that jumping out of your main
language VM into another one just to do database access isn't going to
go down too well. Although I could be wrong, because honestly I've
seen big companies buy "All Kinds of Stuff" (TM).

Once I'd gone to that much trouble, I might as well write the whole
shebang in Java. In fact, I prefer picking "the right tool for the
job" for each part of a project and Erlang doesn't fit every niche so
sometimes turning to Java, Python and other things is the correct
choice. Erlang does have some decent open source libraries for doing
database access (to postgres at least) and I've seen a very good
closed source Oracle driver written as a port_driver using the OCI
libraries - quick and stable, but sadly not available to the general
public. The author claims he'll help do a rewrite when native
processes get added to Erlang. :)

>
> I think that what is truly missing is an agreed-upon, standard EDBC API
> definition. IMHO one that is based on JDBC could certainly be good enough.
> Once that is in place, implementation is just a detail :)
> Erlang Community, time to claim your place in the Enterprise!!!! :)
>

I kind of agree that a standard API would be good. Unless it comes
from Ericsson or gets into the OTP distro, it's not really going to be
a standard though. If someone starts one, they really need to get on
this list and push people for feedback. I'd also suggest making it as
thin a veneer as possible on top of the real library/application being
used, as this is less likely to introduce problems.

I think a big part of the reason why medium/large enterprises don't
tend to use Erlang as a core development language, actually has
nothing to do with technology. It's a resourcing issue. It's harder to
find real Erlang expertise (i.e., commercial experience) than it is to
find Java and/or .NET developers and that increases both cost and
risk. I certainly think that Erlang's productivity and other factors
(like its built-in support for making fault tolerant applications) can
actually outweigh the initial cost of hiring good developers, but
again this is a risk that many big businesses (which are mainly run by
accountants in my experience) won't take. Start-ups and niche sectors
are a completely different kettle of fish.

Finally, I agree that JDBC has some good patterns to follow, but its
whole design is very Object Oriented (i.e., mutable state oriented)
and therefore not much of it is really a good fit for Erlang. Things
that might translate nicely as features are (IMHO):

1. Runtime discovery of drivers
2. Runtime discovery of data source(s) a la JNDI - this is very useful
in Java applications and something like it (maybe gproc based?) would
be nice
3. Standard API for dealing with connections (and possibly connection
pools, although I'm not sure I'd want to open that particular can of
worms myself)
4. Standard API for working with database metadata (e.g.,
INFORMATION_SCHEMA, system views and the like)
5. Standard API for working with result sets (e.g., moving/scrolling,
indexed access to columns in the current row, etc)
6. QLC queries for interacting with result sets?

Not sure if that last one is really a good idea - it's very late here
and I've had a busy week.

AFAICT good API design for Erlang doesn't usually involve exposing
internal data structures (records, etc) to the end user and any state
is usually hidden as much as possible. My assumption is that the
various API modules involved would usually just return an opaque
handle to the user, which is then used during further calls. I
certainly wouldn't go off using parameterised modules as they're not
supported. So you'd get something more like:

{ok, DSHandle} = edbc_datasource:get_named_datasource(postgres_test),
{ok, ConnHandle} = edbc_connection:open(DSHandle, [PerConnOptions]),
%% 'begin' is a reserved word in Erlang, so the function name must be quoted
{ok, TxnHandle} = edbc_transaction:'begin'(ConnHandle,
                                           [{isolation, read_committed}]),
try
    %% fall back to normal (not prepared) statement when this doesn't work...
    {ok, StatementHandle} = edbc_connection:prepare_statement(ConnHandle, SQL),
    %% asserts that binding returns the same opaque handle
    {ok, StatementHandle} = edbc_statement:bind_named_params(
                              StatementHandle,
                              [{id, 12345}, {customer_name, "Mr Smith"}]),

    {ok, Cursor} = edbc_connection:execute_query(ConnHandle, StatementHandle),
    edbc_rowset:foreach(
      fun(Row, Col, Value) ->
              io:format("~p:~p = ~p~n", [Row, Col, Value])
      end, Cursor),

    %% version of execute/2 that takes raw SQL instead of prepared statements
    edbc_connection:execute_update(
      ConnHandle, "INSERT INTO Food VALUES ('Ice Cream', 'Vanilla')"),
    {ok, committed} = edbc_transaction:commit(TxnHandle)
catch
    _:_ -> {ok, aborted} = edbc_transaction:rollback(TxnHandle)
after
    {ok, closed} = edbc_connection:close(ConnHandle)
end

Probably a lot of these calls to specific modules could be hidden
behind a slightly higher level API:

{ok, Conn} = edbc:open_connection(postgres_test),
Results = edbc:map(Conn, fun do_with_each_cell/2, SqlQuery),
%% or
Tree = edbc:fold(Conn, Query,
                 fun(Row, Col, Val, Acc) ->
                         gb_trees:enter({Row, Col}, Val, Acc)
                 end,
                 gb_trees:empty()),
etc.....

You're right that having a good standard API against which all of the
various implementations could be used is a very nice idea. The
question is, how close are the existing libraries/applications to a
*nice* API (whatever that is) and how much work is involved in
bridging between them. This is why I'd like to see if the binding
between API and implementation can be generated - I'd like to avoid
having state in the API and let the underlying implementations deal
with their own states, servers, errors, etc.
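
To illustrate the "thin veneer, no state in the API" idea, a minimal sketch follows. Everything here is hypothetical: the module name `edbc_sketch`, the handle shape, and the three driver callbacks (`connect/1`, `run/2`, `disconnect/1`) are assumptions, and the module doubles as its own in-memory fake driver so that the example is self-contained:

```erlang
-module(edbc_sketch).
-export([open/2, query/2, close/1]).
%% Callbacks of a fake in-memory driver, exported so the veneer can
%% call them the same way it would call a real driver module.
-export([connect/1, run/2, disconnect/1]).

%% The handle is opaque to callers: a tag, the driver module, and
%% whatever state the driver chose to return. The veneer keeps none.
open(Driver, Opts) ->
    {ok, DriverState} = Driver:connect(Opts),
    {ok, {edbc_handle, Driver, DriverState}}.

query({edbc_handle, Driver, DriverState}, Sql) ->
    Driver:run(DriverState, Sql).

close({edbc_handle, Driver, DriverState}) ->
    Driver:disconnect(DriverState).

%% --- fake driver: "connects" to a list of rows given in the options ---
connect(Opts) -> {ok, proplists:get_value(rows, Opts, [])}.
run(Rows, _Sql) -> {ok, Rows}.
disconnect(_Rows) -> ok.
```

A real binding would swap the fake callbacks for calls into the postgres, mysql or odbc application, which is the part that could be generated.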

Jon Watte

unread,

Jul 30, 2011, 1:04:32 PM7/30/11

to Ulf Wiger, erlang-questions

BEAM is starting to make use of NUMA, for example when allowing you to control the binding of schedulers to cores. See e.g.

The real efficiencies come from processes (and their working set) being bound to particular spots in the memory hierarchy, though.

I imagine that, because processes have affinity for schedulers, then schedulers being bound to cores would help some, but when the runtime decides to migrate a process, it probably needs to consider how far down the memory hierarchy to "fork" the process. (These days, we have to worry about L1, L2, L3 and NUMA nodes as distinct points in this choice)

Similarly, if one queue is falling behind, and another queue on the same core (using hyper-threading) is lightly loaded, it doesn't really make sense to shift load between those two hyper-threads as they are limited on the same functional units and L1.

It sounds like you're already considering these things; just making sure it's stated explicitly in this thread!
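
For anyone who wants to see what their own node is doing, a small sketch that dumps the relevant settings (the module name `sched_info` is invented; the `system_info` items are standard). Binding itself is requested at startup, for example with `erl +sbt db`:

```erlang
-module(sched_info).
-export([report/0]).

%% Collect the scheduler count, the requested bind type, the actual
%% scheduler-to-logical-processor bindings and the detected CPU
%% topology (which may be 'undefined' on some platforms).
report() ->
    [{schedulers,   erlang:system_info(schedulers)},
     {bind_type,    erlang:system_info(scheduler_bind_type)},
     {bindings,     erlang:system_info(scheduler_bindings)},
     {cpu_topology, erlang:system_info(cpu_topology)}].
```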

Yes, but one thing I learned while at Ericsson was that NEBS-compliant ATCA processor boards don't exactly stay on the leading edge of processor capacity.

That is of course a consideration. On the other hand, I imagine projects run over several years, and what's best price/performance in data centers today will likely trickle down to the telecom industry eventually :-)

The key, in my experience, is not usually to go as fast as possible, but to deliver reliable and predictable performance that is good enough for the problem at hand. As many have pointed out through the years, Erlang was never about delivering maximum performance.

I'm with you there! If I know that I can add another node to get close to linear X% increased capacity, that's really powerful.

Sincerely,

jw

Alex Arnon

unread,

Jul 31, 2011, 6:44:48 AM7/31/11

to Tim Watson, Dmitrii Dimandt, erlang-questions@erlang.org Questions

On Sat, Jul 30, 2011 at 3:44 AM, Tim Watson <watson....@gmail.com> wrote:

On 29 July 2011 22:27, Alex Arnon <alex....@gmail.com> wrote:
> I had built a rather basic prototype that uses a jnode running a JDBC proxy.
> This approach would increase latency, but provide access to every RDBMS
> under the sun.

That really scares me, and I would hesitate to use it. If you're planning
on conquering the enterprise, I suspect that jumping out of your main
language VM into another one just to do database access isn't going to
go down too well. Although I could be wrong, because honestly I've
seen big companies buy "All Kinds of Stuff" (TM).

I assure you, I am not planning on conquering any Enterprises :)
However, I disagree with the above... please see below.

Once I'd gone to that much trouble, I might as well write the whole
shebang in Java. In fact, I prefer picking "the right tool for the
job" for each part of a project and Erlang doesn't fit every niche so
sometimes turning to Java, Python and other things is the correct
choice. Erlang does have some decent open source libraries for doing
database access (to postgres at least) and I've seen a very good
closed source Oracle driver written as a port_driver using the OCI
libraries - quick and stable, but sadly not available to the general
public. The author claims he'll help do a rewrite when native
processes get added to Erlang. :)

I agree with the sentiment. Java will indeed usually get the least friction for any implementation, which for the Enterprise-minded is usually reason enough to use it for everything.
However, in many places one could find instances where building at least a prototype in Erlang might be the Right Thing. Several back-end or even front-end services I have built I would certainly have knocked up in Erlang very rapidly, just as an idea to Show The Boss or even as a real deployment candidate. In all these instances, the database was either Oracle or MS-SQL, with reasonably low DB access performance expectations.
Now, like you said above, just dropping into another VM for whatever service is going to "look bad". Certainly if you've really built it from scratch for your pet project, but what if we were talking about something that's been battle-proven? That maybe speaks Enterprise lingo? Maybe put a nice face on it, for instance - add a RESTful JSON interface or something, as a diversion? :) Like you said, the Enterprise is often willing to take a lot of crap just to have something that looks "standard" - and deploying a Lean JDBC Proxy can go down relatively easily.

>
> I think that what is truly missing is an agreed-upon, standard EDBC API
> definition. IMHO one that is based on JDBC could certainly be good enough.
> Once that is in place, implementation is just a detail :)
> Erlang Community, time to claim your place in the Enterprise!!!! :)
>

I kind of agree that a standard API would be good. Unless it comes
from Ericsson or gets into the OTP distro, it's not really going to be
a standard though. If someone starts one, they really need to get on
this list and push people for feedback. I'd also suggest making it as
thin a veneer as possible on top of the real library/application being
used, as this is less likely to introduce problems.

Right.
Something I'd use the JDBC Proxy Abomination suggested above for is a fallback. In that case, you're not expecting blazing performance - however, you'll have a stable backend to bang your API out on. Gradually adding clean Erlang backends can come later.

I think a big part of the reason why medium/large enterprises don't
tend to use Erlang as a core development language, actually has
nothing to do with technology. It's a resourcing issue. It's harder to
find real Erlang expertise (i.e., commercial experience) than it is to
find Java and/or .NET developers and that increases both cost and
risk. I certainly think that Erlang's productivity and other factors
(like its built-in support for making fault tolerant applications) can
actually outweigh the initial cost of hiring good developers, but
again this is a risk that many big businesses (which are mainly run by
accountants in my experience) won't take. Start-ups and niche sectors
are a completely different kettle of fish.

I think that's spot on.
Neither should they change this approach.
If you want to leap forward in language/tool technology in the Enterprise, there's always stuff like Scala (which is an excellent step up from Java, and starting to get some Traction).

AFAICT, the existing APIs are similar to a severely cut-down JDBC API. So nothing very surprising there - I'm guessing most (if not all) have been built on a per-need basis.
I agree that one should steer as far away from heavily stateful APIs as possible, and heavy state can probably be avoided.
However, since we ARE dealing with RDBMS's, some of the flavour of using them is bound to come through the API - connection pooling, transactions and error handling/reporting should definitely be part of any API that expects to be widely useful. I'd also add an Erlang-y flavour to some operations, like streaming of result sets, to work with the platform's grain and not so much against it.
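
By an Erlang-y flavour for result sets I mean something along these lines. It is only a sketch, with an invented module name and message shapes, and a plain list standing in for the rows a driver would fetch:

```erlang
-module(row_stream).
-export([start/2, collect/1]).

%% Spawn a producer that sends each row to Consumer as a message,
%% tagged with a unique reference, followed by an end-of-stream marker.
start(Rows, Consumer) ->
    Ref = make_ref(),
    spawn(fun() ->
                  lists:foreach(fun(Row) -> Consumer ! {row, Ref, Row} end,
                                Rows),
                  Consumer ! {done, Ref}
          end),
    Ref.

%% Consume the stream identified by Ref, in arrival order.
collect(Ref) ->
    collect(Ref, []).

collect(Ref, Acc) ->
    receive
        {row, Ref, Row} -> collect(Ref, [Row | Acc]);
        {done, Ref}     -> lists:reverse(Acc)
    after 5000 ->
        {error, timeout}
    end.
```

The consumer can of course process each row as it arrives instead of accumulating; a real driver would also need some flow control so a fast database does not flood a slow consumer's mailbox.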

Tim Watson

unread,

Jul 31, 2011, 3:10:45 PM7/31/11

to Alex Arnon, Dmitrii Dimandt, erlang-questions@erlang.org Questions

>
> I agree with the sentiment. Java will indeed usually get the least friction
> for any implementation, which for the Enterprise-minded is usually reason
> enough to use it for everything.
> However, in many places one could find instances where building at least a
> prototype in Erlang might be the Right Thing. Several back-end or even
> front-end services I have built I would certainly have knocked up in Erlang
> very rapidly, just as an idea to Show The Boss or even as a real deployment
> candidate. In all these instances, the database was either Oracle or MS-SQL,
> with reasonably low DB access performance expectations.
> Now, like you said above, just dropping into another VM for whatever service
> is going to "look bad". Certainly if you've really built it from scratch for
> your pet project, but what if we were talking about something that's been
> battle-proven? That maybe speaks Enterprise lingo? Maybe put a nice face on
> it, for instance - add a RESTful JSON interface or something, as a
> diversion? :) Like you said, the Enterprise is often willing to take a lot
> of crap just to have something that looks "standard" - and deploying a Lean
> JDBC Proxy can go down relatively easily.
>

I've written a few real production systems in Erlang, but they were
all OSS applications and it was the right fit. Even in a telco, most
BSS applications tend not to be written in Erlang. I'm not saying
that's right or good - I personally would rather build something in
Erlang than Java - but there we have it.

>
> Right.
> Something that I'd use the JDBC Proxy Abomination suggested above is a
> fallback. In that case, you're not expecting blazing performance - however,
> you'll have a stable backend to bang your API out on. Gradually adding clean
> Erlang backends can come later.
>

Yeah I get that it will work ok, it just feels a bit weird not to use
existing native implementations that are known to work. ESL have a
postgres driver which I'm assuming (?) is used in production
applications. The database support for Zotonic is also very solid in
my experience. As I said, a guy at work wrote a driver based Oracle
back end and it's absolutely rock solid in production - the app is quite
data access centric and has never once failed in the last 3 years
whilst it deals with a reasonable load (avg. few thousand
regular/daily users with peak times dipping into five figures).

>
> I think that's spot on.
> Neither should they change this approach.
> If you want to leap forward in language/tool technology in the Enterprise,
> there's always stuff like Scala (which is an excellent step up from Java,
> and starting to get some Traction).
>

Like I said, we use Erlang at work where it's deemed appropriate.
Nobody there thinks Erlang "is a ghetto" but they do see it as niche
and the fact that they pay sub contractors a lot more for it than bog
standard Java applications is something they think about a lot. Having
said that IMHO the fact that these Erlang based systems never seem to
go wrong is testament to what I call the "you get what you pay for"
effect combined with "some tools are better than others". Just my
opinion though.

>
> AFAICT, the existing APIs are similar to a severely cut-down JDBC API. So
> nothing very surprising there - I'm guessing most (if not all) have been
> built on a per-need basis.

That's why I'd like to start by fronting a well established API - like
the ESL postgres driver - just to see what's already there and build
on it in the general case.

> I agree that one should try to veer as much away from heavily-stateful APIs,
> and this can probably be avoided.

Yes - just avoiding exposing internal data structures by providing a
handle is probably enough to do this. The various modules can actually
return whatever (real state if required) but the API should just hide
this by exporting opaque types only:

-opaque connection_handle() :: term().
-export_type([connection_handle/0]).
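
A minimal sketch of how such a handle-only API might look. The module name db_api_sketch, the #conn{} record and the connect/1 and backend/1 functions are made up for illustration and are not taken from any existing driver:

```erlang
-module(db_api_sketch).
-export([connect/1, backend/1]).
-export_type([connection_handle/0]).

%% Internal representation (hypothetical). Callers only ever see the
%% opaque connection_handle() type and cannot rely on its shape.
-record(conn, {backend :: atom(), state :: term()}).
-opaque connection_handle() :: #conn{}.

%% Pretend to open a connection for the given backend.
-spec connect(atom()) -> {ok, connection_handle()}.
connect(Backend) when is_atom(Backend) ->
    {ok, #conn{backend = Backend, state = undefined}}.

%% The only sanctioned way to look inside the handle.
-spec backend(connection_handle()) -> atom().
backend(#conn{backend = Backend}) ->
    Backend.
```

Dialyzer would then complain about client code that pattern matches on the record directly, which is the point of exporting only the opaque type.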

> However, since we ARE dealing with RDBMS's, some of the flavour of using
> them is bound to come through the API - connection pooling, transactions and
> error handling/reporting should definitely be part of any API that expects
> to be widely useful. I'd also add an Erlang-y flavour to some operations,
> like streaming of result sets, to work with the platform's grain and not so
> much against it.
>

Yes all these things (pooling, transactions, etc) are necessary. I
maintain that you can provide them to the user without the
implementation details leaking though.

I also agree with the notion of having operations that are Erlang-y.
Not sure what the best thing to do with streaming result sets would
be, but I suspect looking at the existing implementations would
clarify what people expect to see. For sure you don't always want to
have to wait until the entire result set has been processed before you
can do anything. I do think having operations that convert result sets
using some transformation fun/function would be useful though - you
might think of these as the Erlang equivalent of the Spring Framework
JdbcTemplate ResultSetExtractor or RowMapper interfaces. Except in
Erlang you just use a function, plain and simple.
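
As a rough illustration of the "just a function" idea, here is a sketch that assumes a result set is already available as a list of row tuples. The module row_mapper_sketch and map_rows/2 are hypothetical names, not part of any existing driver:

```erlang
-module(row_mapper_sketch).
-export([map_rows/2]).

%% Apply a caller-supplied fun to every row of a result set.
%% Rows are assumed to be tuples, one element per column.
-spec map_rows(fun((tuple()) -> T), [tuple()]) -> [T].
map_rows(RowFun, Rows) when is_function(RowFun, 1), is_list(Rows) ->
    [RowFun(Row) || Row <- Rows].
```

For example, map_rows(fun({Id, Name}) -> {Name, Id} end, Rows) swaps the two columns of every row, with no interface to implement.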

Is your JDBC thing open source? I might have a stab at this just for
fun - maybe fronting your library and the ESL postgres API as the
first cut.

Cheers,

Tim

Jon Watte

Jul 31, 2011, 7:00:46 PM

to Ulf Wiger, erlang-questions

> One problem is that it's hard to do detailed profiling on complex real-world applications. The issues limiting scalability might well be wholly unrelated to core VM aspects such as GC, scheduling and message passing. In the first SMP experiments with Ericsson's Telephony Gateway Controller, the big bottleneck was the big lock protecting the ports and linked-in drivers.

Speaking of which: what *is* best practice for profiling bottlenecks in a big Erlang system/application?

I've read the Erlang and OTP books, but they don't talk about it, and trapexit.org or other references are also vexingly vague on this subject.

Sincerely,

jw

Richard O'Keefe

Jul 31, 2011, 8:05:09 PM

to Tim Watson, erlang-questions

On 29/07/2011, at 9:30 PM, Tim Watson wrote:

> On 29 July 2011 00:45, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
>> One of the things criticised in the blog entry that we've been responding to was
>> that
>> {ok,Foo} = bar(...),
>> {ok,Foo} = ugh(...)
>> is too easy to write (when you really meant, say, Foo0, Foo1).
>>
>> This is a well defined part of the language, and it would not be a good idea to
>> ban it. But how about an optional style warning (and we really need
>> -warn(on | off, [warning...])
>> directives) whenever a bound variable appears in a pattern on the left of "="?
>
> We would need to be able to set that warning with a very localised
> scope though. What you are intending in such a match might be
> *exactly* what the code says - call bar/1 to get a Foo and then call
> ugh/2 and match (assert) that the resulting tuple contains exactly the
> same Foo. Things like a session id, (ets) table id and so on are
> probably examples of this. I won't comment on whether this is good API
> design or not.

There is another way to express that intention that does not require
the use of this anti-pattern.

crash_if_different(X, X) -> ok.

... ->
...
{ok,Bar_Foo} = bar(...),
{ok,Ugh_Foo} = ugh(...),
crash_if_different(Bar_Foo, Ugh_Foo),
...

> A -warn(on | off, [Warning]) sounds lovely, but would only work at
> module level (or as some option passed in to the compiler)

Where did that restriction come from?
Not from anything *I* wrote.
I was imitating Prolog style warnings, which take effect from the
point where they are asserted to the point where they are next set.
For example,
:- style_check(+charset).
p(fübär).
:- style_check(-charset).
p(façade).
gets a non-portable characters warning for the first clause of p/1
but not for the second clause.

> so how
> would I do (or override) this for an individual line of code or for a
> single function?

Switch the warning on (or off) just before the function,
then switch it off (or on) again just after.

As for single lines of code, crash_if_different/2 means there is no
use case for such an override.

> What about function level attributes? That might get
> the scope at which the switch is applied tight enough to be useful:
>
> -pragma(no_warnings, [bound_variable_in_match]).

This is exactly what I proposed, just spelled differently.
I don't think the spelling of the style check directives matters much,
and have no objection to using -pragma here.

> fwibble_ardvarks() ->
> {ok,Foo} = bar(...),
> {ok,Foo} = ugh(...),
> %% etc....
> Foo.

Alex Arnon

Aug 1, 2011, 4:20:32 AM

to Tim Watson, Dmitrii Dimandt, erlang-questions@erlang.org Questions

I agree, it does feel unnatural. However, in this case:
1. It's rather simple, easy to make consistent and stable.
2. It enables us to deliver immediate value to the customer... HELP I'VE GOT ENTERPRISITIS!!!
But seriously:
2. It makes any and all RDBMS's available, immediately.
3. It's a fallback.

>>
>> I think that's spot on.
>> Neither should they change this approach.
>> If you want to leap forward in language/tool technology in the Enterprise,
>> there's always stuff like Scala (which is an excellent step up from Java,
>> and starting to get some Traction).
>>
>
> Like I said, we use Erlang at work where it's deemed appropriate.
> Nobody there thinks Erlang "is a ghetto" but they do see it as niche
> and the fact that they pay sub contractors a lot more for it than bog
> standard Java applications is something they think about a lot. Having
> said that IMHO the fact that these Erlang based systems never seem to
> go wrong is testament to what I call the "you get what you pay for"
> effect combined with "some tools are better than others". Just my
> opinion though.
>
>>
>> AFAICT, the existing APIs are similar to a severely cut-down JDBC API. So
>> nothing very surprising there - I'm guessing most (if not all) have been
>> built on a per-need basis.
>
> That's why I'd like to start by fronting a well established API - like
> the ESL postgres driver - just to see what's already there and build
> on it in the general case.

OK.

Right.
One beneficial side effect of streaming result sets is that it enables processing enormous sets without blowing up your process/node. Transformation/map/reduce/whichever is probably a good starting point.
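
A minimal sketch of what streaming could look like from the caller's side, assuming the driver hands rows over in batches. The stream_fold_sketch module, fold/4 and the {rows, Rows, NextCursor} | done protocol are invented for illustration only:

```erlang
-module(stream_fold_sketch).
-export([fold/4]).

%% Fold Fun over a result set that is fetched one batch at a time.
%% FetchFun(Cursor) must return {rows, Rows, NextCursor} or done
%% (a hypothetical protocol). Only one batch is held in memory at
%% any time, so the full result set is never materialised.
-spec fold(fun((C) -> {rows, [Row], C} | done), C,
           fun((Row, Acc) -> Acc), Acc) -> Acc.
fold(FetchFun, Cursor, Fun, Acc) ->
    case FetchFun(Cursor) of
        {rows, Rows, Next} ->
            fold(FetchFun, Next, Fun, lists:foldl(Fun, Acc, Rows));
        done ->
            Acc
    end.
```

The accumulator is the only thing that grows with the size of the result, which is what keeps an enormous SELECT from blowing up the process.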

> Is your JDBC thing open source? I might have a stab at this just for
> fun - maybe fronting your library and the ESL postgres API as the
> first cut.

It was a prototype, got to the usable-but-hell-no-you're-not-seeing-this-pile stage. :)
I don't think I've even updated my main repo with the last changes in a few months.

Basically what I did was:
1. register a global controller (edbc_master). This is optional, meant to be used for general monitoring of the connection processes.
2. Each connection is an {Erlang process, JNode} pair, linked to its creator/user and registered with 'edbc_master'.
    The connection is started with a configuration such as:
    { { proxy_dir, "c:/dev/edbc/java/deploy" },
      { driver_class, "com.oracle.xxx.yyy.OracleDriver" },
      { hostname, "localhost" },
      { port, 4444 },
      ...
      other connection-stringy stuff
    }
3. The Controller process (the linked Erlang process) gets a "unique" node name for the JNode from the edbc_master, and spawns it.
4. A basic handshake with X-second timeout between the Controller process and a counterpart on the JNode (registered as 'edbc_proxy' on the JNode I think) is done, including adding monitors on success.
5. From this point on, operations are forwarded via the Controller to the JNode, which executes them, packages up the results and sends them back.

What I've done is somewhat crude - a process per connection etc., no proper support for most datatypes - but it was just a couple of days' work to get a simple SELECT/UPDATE/INSERT working. And it would work on Oracle, MSSql, Postgres, MySQL, Sybase, what have you.

Joel Reymont

Aug 1, 2011, 6:11:12 AM

to Jon Watte, erlang-questions

What about a new thread on this subject?

On Aug 1, 2011, at 12:00 AM, Jon Watte wrote:

> Speaking of which: what *is* best practice for profiling bottlenecks in a big Erlang system/application?
> I've read the Erlang and OTP books, but they don't talk about it, and trapexit.org or other references are also vexingly vague on this subject.


Tim Watson

Aug 1, 2011, 6:42:20 AM

to Joel Reymont, erlang-questions

Good point Joel - will do.

Tim Watson

Aug 1, 2011, 6:56:26 AM

to Richard O'Keefe, erlang-questions

On 1 August 2011 01:05, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
>> A -warn(on | off, [Warning]) sounds lovely, but would only work at
>> module level (or as some option passed in to the compiler)
>
> Where did that restriction come from?
> Not from anything *I* wrote.
> I was imitating Prolog style warnings, which take effect from the
> point where they are asserted to the point where they are next set.
> For example,
> :- style_check(+charset).
> p(fübär).
> :- style_check(-charset).
> p(façade).
> gets a non-portable characters warning for the first clause of p/1
> but not for the second clause.
>

Um, that's lovely but we're talking about Erlang and it doesn't
support that syntax.

>> so how
>> would I do (or override) this for an individual line of code or for a
>> single function?
>
> Switch the warning on (or off) just before the function,
> then switch it off (or on) again just after.
>

Now you're going to have to exercise your patience with me here Mr
O'Keefe, as I've lost you. How do I turn warnings on/off for
individual functions? Was your proposal to introduce the `-warn(....)`
syntax as an attribute that can be applied to a function and if so, is
adding annotations to functions a new feature? I'm not aware this
works today and that's why I chimed in.

> As for single lines of code, crash_if_different/2 means there is no
> use case for such an override.
>

Yes I can see that now.

>> What about function level attributes? That might get
>> the scope at which the switch is applied tight enough to be useful:
>>
>> -pragma(no_warnings, [bound_variable_in_match]).
>
> This is exactly what I proposed, just spelled differently.
> I don't think the spelling of the style check directives matters much,
> and have no objection to using -pragma here.
>

Agreed - I like `pragma` but likewise I'm not really fussed. My point
was that AFAIK you can't put annotations on functions today and this
would be a new feature - with the annotation info preserved at runtime
in the generated beam if possible.

Richard O'Keefe

Aug 2, 2011, 1:59:50 AM

to Tim Watson, erlang-questions

The basic misunderstanding here is that I was describing one
desirable instance of a style warnings facility that does not
yet exist. I was not supposing or implying that any means of
selectively {en,dis}abling warning exists. That was not the
focus of my attention. I assumed for the purpose of argument
something like the $SET <check> ... $RESET <check> $POP <check>
facility found in 40-year-old Burroughs compilers for Algol,
Fortran, COBOL, BASIC et al or the similar age (*$+I*) (*$-R*)
check switches found in many Pascal compilers, not any sort of
function annotation scheme.

That's not to say that a function annotation scheme (perhaps not
entirely unlike the one found in Ciao Prolog) might not be a good
thing, just that at this stage of the conversation, it's more
useful to focus on

1. Do we want lint-like checks for Erlang at all?
2. Is matching against an already-bound variable a check we want?
3. What are some other checks we might have a use for?

The SmallLint checks found in the Refactoring Browser for Smalltalk
contain many examples of such checks, most of which don't apply to
Erlang, although being able to check for consistency in the spelling
of function and variable names is rather nice. Here's one that
makes sense for Erlang:

if G1 -> S1, S
; G2 -> S2, S
...
end

could have been

if G1 -> S1
; G2 -> S2
...
end,
S

Sam Bobroff

Aug 2, 2011, 8:52:55 PM

to Richard O'Keefe, erlang-questions

Hello all,

I'll put in my $0.02 on this because I've just been bitten by a bug that would have been caught by this new warning. I'll stick to the point:

[snip]

2. Is matching against an already-bound variable a check we want?

[snip]

Yes, because it protects against a kind of bug that is otherwise very easy to introduce in Erlang:

Quite often when debugging I want to add a new line or two to a function, to examine or check an intermediate value. e.g.:

...

Result = function_call(),

validate(Result),

...

The problem is that I now have to scan down the whole rest of the function just in case Result has already been used because, if it has, then it's going to compile fine and cause a run time error!

It's such a simple scenario that it seems very annoying that Erlang can't help me to avoid it. The "cost" of not being able to do a "backwards" assignment with the newly bound variable on the right doesn't seem like a cost at all: I find code that is that way round, or mixes assignments on the left and right (!) to be unnecessarily confusing. I'd be happy if this was required by the language but as this is the real world I'd be happy with a warning I could turn on.
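
A self-contained sketch of the scenario, with made-up function names (function_call/0, later_call/0 and validate/1 are placeholders). It compiles cleanly and only goes wrong at run time:

```erlang
-module(rebind_sketch).
-export([run/0]).

%% Placeholder functions, for illustration only.
function_call() -> 1.
later_call() -> 2.
validate(_) -> ok.

run() ->
    Result = function_call(),      % line added while debugging
    validate(Result),              % line added while debugging
    %% ... existing code further down that already used the name:
    try
        Result = later_call(),     % now an assertion, not a binding
        {ok, Result}
    catch
        error:{badmatch, Value} -> {badmatch, Value}
    end.
```

Calling rebind_sketch:run() gives {badmatch, 2}: the second "assignment" is silently turned into a match against the value bound by the debugging line.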

Sam.

Anthony Shipman

Aug 3, 2011, 7:05:26 AM

to erlang-questions

On Wed, 3 Aug 2011 10:52:55 am Sam Bobroff wrote:
> Quite often when debugging I want to add a new line or two to a function,
> to examine or check an intermediate value. e.g.:
>
> ...
> Result = function_call(),
> validate(Result),

I find a show function useful

show(Name, Value) ->
    io:format("~s ~p\n", [Name, Value]),
    Value.

noshow(_, V) -> V.

validate(show("result", function_call())).

--
Anthony Shipman Mamas don't let your babies
a...@iinet.net.au grow up to be outsourced.

Frédéric Trottier-Hébert

Aug 3, 2011, 9:02:57 AM

to Richard O'Keefe, erlang-questions

On 2011-08-02, at 01:59 AM, Richard O'Keefe wrote:

> 2. Is matching against an already-bound variable a check we want?

To me, matching against an already-bound variable is a valid assertion, as much as 'ok = function_call()' might be, and as implicit as '{ok, Pid}' or '{error, already_started}' might be. Matching on already-bound variables is part of my standard code when trying to crash early when possible, and also part of many basic test suites that simply do pattern matching here and there. To me it's as basic as using the same variable twice in a single pattern (f([A,A,B,B|_]) when A =/= B -> ...), or something similar with records.

I would likely not use the check at all, and if it were to be added, would prefer it to be a compiler argument (which could be enabled in -compile(...).) I foresee little use of such warnings for myself and would dislike to see it becoming a default setting.

--
Fred Hébert
http://www.erlang-solutions.com

Joe Armstrong

Aug 3, 2011, 9:37:50 AM

to Frédéric Trottier-Hébert, erlang-questions

Actually I like things as they are. My eye is trained to see bound and
free occurrences of a variable.

On the other hand it would be *excellent* if the color coding in emacs
(whatever) was changed to reflect the
binding status of a variable.

If variables in the LHS of an equals were colored green if they were
unbound and red
if bound then one would see at a glance what was intended.

(( assuming you're not red-green color blind that is )) - or set the
variables in a different type face.

/Joe

2011/8/3 Frédéric Trottier-Hébert <fred....@erlang-solutions.com>:

Thomas Lindgren

Aug 3, 2011, 11:05:57 AM

to erlang-questions

>From: Joe Armstrong <erl...@gmail.com>
...

>Actually I like things as they are. My eye is trained to see bound and
>free occurrences of a variable.
>
>On the other hand it would be *excellent* if the color coding in emacs
>(whatever) was changed to reflect the
>binding status of a variable.
>
>If variables in the LHS of an equals were colored green if they were
>unbound and red
>if bound then one would see at a glance what was intended.
>
>(( assuming you're not red-green color blind that is )) - or set the
>variables in a different type face.

This raises the question of whether the order of matching variables is well-defined?

I'm thinking of things like (order inside pattern):

f(A) -> {X, X} = A, X.

or (order of argument matching):

same([X|Xs], [X|Ys]) -> same(Xs, Ys);
same([], []) -> true;
same(_, _) -> false.

I seem to recall that Mike Williams wanted a fully determined evaluation order long ago, for the Erlang spec. Was this implemented? And did it extend to pattern matching?

Best,
Thomas

Richard Carlsson

Aug 3, 2011, 12:17:39 PM

to erlang-q...@erlang.org

On 08/03/2011 05:05 PM, Thomas Lindgren wrote:
> I seem to recall that Mike Williams wanted a fully determined
> evaluation order long ago, for the Erlang spec. Was this implemented?

Yes. Arguments to functions and constructors are always evaluated
left-to-right. This linearization is done in the transformation to Core
Erlang, which doesn't in itself define an evaluation order for function
arguments - the order from the Erlang source code is expressed on the
Core level using let-bindings. (Run "erlc +to_core foo.erl" and look at
the resulting file foo.core to see the explicit evaluation order.)

But I don't know if this rule has been formally documented anywhere
after the work on the formal Erlang specification got indefinitely
suspended. For reference, here's an old exchange between me, Robert, and
Ulf:
http://erlang.org/pipermail/erlang-questions/2008-October/039170.html

(I also recall that Barklund had originally suggested that arguments of
list constructors should be evaluated right-to-left as an exception,
probably in order to reduce the number of temporaries needed while
constructing a list like [f(), g(), h()], but this was voted down since
it didn't make much sense to a human being reading the code.)

> And did it extend to pattern matching?

No, there was never any discussion about that. If the match succeeds
then it would succeed no matter what the order, and if you get a
'badmatch' error, the entire matched term is blamed, not just the first
subterm that doesn't match, so I can't see that there would be any
observable difference depending on the order of matched subterms.

/Richard

Ulf Wiger

Aug 3, 2011, 1:05:26 PM

to Richard Carlsson, erlang-q...@erlang.org

According to the Barklund spec (page 62), it said that "a compiler must only accept a program if for any evaluation order, there will not be an applied occurrence of an unbound variable".

"For example, in a context where X is unbound, the expression (X=8) + X should give a compile-time error."

(And it does, too. I checked).

BR,
Ulf W

On 3 Aug 2011, at 18:17, Richard Carlsson wrote:

>> And did it extend to pattern matching?
>
> No, there was never any discussion about that. If the match succeeds then it would succeed no matter what the order, and if you get a 'badmatch' error, the entire matched term is blamed, not just the first subterm that doesn't match, so I can't see that there would be any observable difference depending on the order of matched subterms.

Ulf Wiger, CTO, Erlang Solutions, Ltd.
http://erlang-solutions.com

Richard Carlsson

Aug 3, 2011, 5:47:13 PM

to Ulf Wiger, erlang-q...@erlang.org

On 08/03/2011 07:05 PM, Ulf Wiger wrote:
>
> According to the Barklund spec (page 62), it said that "a compiler
> must only accept a program if for any evaluation order, there will
> not be an applied occurrence of an unbound variable".
>
> "For example, in a context where X is unbound, the expression (X=8) +
> X should give a compile-time error."
>
> (And it does, too. I checked).

That's a slightly different thing - a requirement that any path that
reaches a use of a variable must first pass through a binding of that
variable. For instance:

f(Y) ->
    case test(Y) of
        true -> X = 1;
        false -> ok
    end,
    g(X).

is not a valid program, because if the false path is taken, X is not
defined. The quote from the spec can be seen as a general design
principle, which in practice is implemented by the scoping rules: For
the case expression, a variable binding is only propagated out of a case
if it is guaranteed to be bound in all clauses. For a function call (A+B
is really a call to erlang:'+'(A,B)), all arguments are evaluated in the
same ingoing environment, so even though the first argument to +
creates a binding of X, this does not affect the environment for
evaluating the second argument, in which X is not yet bound, so in both
these examples the scoping rules forbid them.

Hence, the scoping rules partially determine the order of evaluation of
the subexpressions, for the programs that pass this check (and assuming
there is not a bug in the lower levels of the compiler), but on top of
this, Erlang defines a strict left-to-right evaluation order for
function arguments that has nothing to do with scoping. E.g.,

{io:format("hello"), io:format("ulf")}

should always print "helloulf" and never "ulfhello".
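
A small sketch for checking the left-to-right rule without relying on printed output. The module eval_order_sketch and its helpers are hypothetical, and the sketch assumes the calling process starts with an empty mailbox:

```erlang
-module(eval_order_sketch).
-export([run/0]).

%% Record that this subexpression was evaluated by sending a
%% message to ourselves, then return the tag as the value.
note(Tag) ->
    self() ! Tag,
    Tag.

%% Collect every message currently in the mailbox, oldest first.
drain() ->
    receive
        Tag -> [Tag | drain()]
    after 0 ->
        []
    end.

%% Build a tuple from three calls and report the order in which
%% the calls were actually made.
run() ->
    _ = {note(first), note(second), note(third)},
    drain().
```

With the defined evaluation order, eval_order_sketch:run() returns [first, second, third].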

An interesting twist on the example (X=8) + X is that if *both*
arguments create a binding of the same variable, as in:

(X=8) + (X=8)

then Erlang will first evaluate the subexpressions, then *check* that
the results match so that a single agreed value of X can be exported out
from the whole expression, and only in that case will it call + on the
results. You can experiment with the following to see what I mean:

f(Y) ->
(X=Y) + (X=8).

(and no, it doesn't matter if the match against Y is on the right or
left of the plus sign). f(8) will return 16, and f(Y) for any other Y
will cause a badmatch. (I'm glad this worked - for a moment I was afraid
that someone had forgotten to implement the check.)

/Richard

Richard Carlsson

Aug 3, 2011, 6:05:20 PM

to Ulf Wiger, erlang-q...@erlang.org

On 08/03/2011 07:05 PM, Ulf Wiger wrote:

> On 3 Aug 2011, at 18:17, Richard Carlsson wrote:
>
>>> And did it extend to pattern matching?
>>
>> No, there was never any discussion about that. If the match
>> succeeds then it would succeed no matter what the order, and if
>> you get a 'badmatch' error, the entire matched term is blamed, not
>> just the first subterm that doesn't match, so I can't see that
>> there would be any observable difference depending on the order of
>> matched subterms.

I forgot to clarify this. As I said in the previous mail, for an
expression such as {f(), g()}, Erlang defines the evaluation order to be
f() first, then g(). But in the case of pattern matching:

{P1, P2} = SomeExpression

(where P1 and P2 can be unbound variables, bound variables, or any other
patterns large or small, including where P1 and P2 are exactly the same)
- if the match works, then the resulting bindings (if any) from the
pattern as a whole must be the same regardless of the order you matched
the subpatterns against the value of SomeExpression. And if the match
fails, then you just see a {badmatch, SomeExpression} error which
doesn't reveal in what part of SomeExpression the match first discovered
that the value couldn't be matched against the pattern, even though
there could be more than one place, as in {1,2,3}={3,2,1}. Hence,
whether the matching tries P1 or P2 first is an internal implementation
detail.
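
A minimal sketch of that observation, using a hypothetical module badmatch_sketch. Whatever order the subpatterns are tried in, the error only ever carries the whole right-hand value:

```erlang
-module(badmatch_sketch).
-export([check/1]).

%% Match Value against a fixed three-element pattern and report
%% what the badmatch error carries when the match fails.
check(Value) ->
    try
        {1, 2, 3} = Value,
        ok
    catch
        error:{badmatch, Whole} -> {badmatch, Whole}
    end.
```

Here check({3,2,1}) gives {badmatch, {3,2,1}}, with no indication of whether the first or the third element was found to differ first.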

/Richard

Robert Virding

Aug 3, 2011, 6:17:21 PM

to Joe Armstrong, erlang-questions

I am assuming that the check only applies for variables in matches with '=' and not in case/receive patterns. In that case I would not mind having such a warning turnable on/off at will. I seldom test for values in '=' matches, only in case/receive.

The emacs colouring is difficult given how erlang handles variables and evaluation ordering etc.

Robert

Robert Virding

Aug 3, 2011, 6:24:13 PM

to Richard Carlsson, erlang-q...@erlang.org

I must say that I find this a little strange. If the evaluation order of subexpressions is defined, which it is, then I would have expected that the whole effect of a subexpression would be visible in "later" subexpressions. Hence you could do things like:

(X=8) + X
foo(X = bar(42), X)

but not

X + (X=8)
foo(X, X = bar(42))

because of the evaluation order. Although I suppose it does make it easier to see the effects of variable bindings if they are delayed. I will have to go back and re-read the spec. It should really be updated.

Robert

Richard Carlsson

Aug 3, 2011, 6:39:39 PM

to Robert Virding, erlang-q...@erlang.org

On 08/04/2011 12:24 AM, Robert Virding wrote:
> I must say that I find this a little strange. If the evaluation order
> of subexpressions is defined, which it is, then I would have expected
> that the whole effect of a subexpression would be visible in "later"
> subexpressions. Hence you could do things like:
>
> (X=8) + X
> foo(X = bar(42), X)
>
> but not
>
> X + (X=8)
> foo(X, X = bar(42))
>
> because of the evaluation order.

You could, yes, but it's not a good idea. For example, a refactoring
that switches the order of arguments to a function would have to
consider any bindings in calling occurrences and lift them out to
preserve the evaluation order *and* the bindings:

foo(f(), X=bar(42), g(X))

must become

V1 = f(),
X = bar(42),
V2 = g(X),
foo(V1, V2, X)

or possibly

foo(f(), begin X=bar(42), g(X) end, X)

if you want to switch the order of the last two arguments.

The rule that all arguments are evaluated in the same environment makes
the world a happier place. If I want Basic, I know where I can get it. ;-)

Håkan Huss

Aug 4, 2011, 3:13:21 AM

to Robert Virding, erlang-questions

On Thu, Aug 4, 2011 at 00:17, Robert Virding
<robert....@erlang-solutions.com> wrote:
> I am assuming that the check only applies for variables in matches with '=' and not in case/receive patterns. In that case I would not mind having such a warning turnable on/off at will. I seldom test for values in '=' matches, only in case/receive.
>

There is one area of code where I use matching of bound variables with
= a lot, and that is in ct test cases. A typical example is:

...
Alarms0 = get_current_alarm_list(),
<code that should not raise an alarm>
Alarms0 = get_current_alarm_list(),
<code that should cause an alarm to be raised>
Alarms1 = get_current_alarm_list(),
verify_alarm_raised(Alarms1, Alarms0, <alarm that should be
present in Alarms1 but not Alarms0>),
<code that should clear the alarm>
Alarms0 = get_current_alarm_list(),
...

While ROK's crash_if_different could be used here, my personal opinion
is that the test code as written expresses the test better and does
not require that I introduce a bunch of new variables to hold old
values.

In normal code, however, I use this very sparingly. Then again, I
seldom experience this as a problem. Thinking back, most of the times
I've been bitten by this is probably in case clauses, especially when
adding code. Typically, when using canonical variables like Pid,
Reason or Result without checking that the code isn't in a branch
which already uses it. And since I agree with Robert that case and
receive should not cause a warning, I don't really feel the need for
it.

Regards,
/Håkan

Motiejus Jakštys

Aug 4, 2011, 3:50:48 AM

to Richard O'Keefe, erlang-questions

On Fri, Jul 29, 2011 at 11:45:21AM +1200, Richard O'Keefe wrote:
> One of the things criticised in the blog entry that we've been responding to was
> that
> {ok,Foo} = bar(...),
> {ok,Foo} = ugh(...)
> is too easy to write (when you really meant, say, Foo0, Foo1).

I usually remember all variable names in a function I am working on,
because functions are small (and easy to skim through in 0.3 sec).

Don't write functions with more than 15 to 20 lines of code. Split large
functions into several smaller ones. Don't solve the problem by writing
long lines, remember?
http://www.erlang.se/doc/programming_rules.shtml#REF32141

This advice solved this problem for me. I re-use bound variables in order
to crash early, so I would not like anything to change in
the current behaviour.

Motiejus

Tim Watson

Aug 4, 2011, 5:55:23 AM

to Håkan Huss, erlang-questions

> In normal code, however, I use this very sparingly. Then again, I
> seldom experience this as a problem. Thinking back, most of the times
> I've been bitten by this is probably in case clauses, especially when
> adding code. Typically, when using canonical variables like Pid,
> Reason or Result without checking that the code isn't in a branch
> which already uses it. And since I agree with Robert that case and
> receive should not cause a warning, I don't really feel the need for
> it.
>

I suspect most experienced Erlang developers don't get bitten by this.
Going back to the blog post that was the genesis of the original
thread, the person in question (whose name I've forgotten) was
complaining about this because the match fails at runtime. I think
it's just an example of someone who's not used to single assignment
and finds that they don't like it in practice. I would qualify that
statement by pointing out the poster also went on to actually suggest
that single assignment is bad. I don't agree but each to their own.

Richard O'Keefe

Aug 4, 2011, 6:29:55 PM

to Tim Watson, erlang-questions

On 4/08/2011, at 9:55 PM, Tim Watson wrote:
> I suspect most experienced Erlang developers don't get bitten by this.
> Going back to the blog post that was the genesis of the original
> thread, the person in question (whose name I've forgotten) was
> complaining about this because the match fails at runtime. I think
> it's just an example of someone who's not used to single assignment
> and finds that they don't like it in practice. I would qualify that
> statement by pointing out the poster also went on to actually suggest
> that single assignment is bad. I don't agree but each to their own.

The person in question was Tony Arcieri, who built Reia
(see http://reia-lang.org/). There are three things that seem
obvious about Tony Arcieri:
(1) He likes Ruby. (Nobody who didn't would want to build
something like Reia.)
(2) He knows rather a lot about Erlang. (Nobody who didn't
*could* build something like Reia.)
(3) He is a skilled programmer. (Nobody who wasn't could
get something like Reia working and out there -- as long
as directories don't include dots in their names...)

It is important to understand the distinction that Tony Arcieri
correctly makes between
(A) single assignment in the *semantics* (data are immutable), and
(B) single assignment in the *syntax* (variables can be assigned
only once).
The point he makes is that you can have (A), which he seems to be
happy with, without requiring (B), which he is not: you can treat
re*assignment* as re*binding*. Thus in

X = f(),
X = g(X),
X = h(X)

each occurrence of X on the left would be a new binding, equivalent to

X0 = f(),
X1 = g(X0),
X2 = h(X1)

but without requiring the programmer to keep track of numeric suffixes.
There's a classic paper showing that the imperative programming language
Euclid can actually be viewed as a functional programming language using
such renaming tricks (and taking advantage of the fact that procedure
interfaces had to specify _all_ the variables that could be used (extra
inputs) and set (extra outputs) and the aliasing rules that meant that
in-place mutation could not be detected). This _does_ include loops.

A language treating assignment as rebinding could look more comfortable
to an ex-imperative programmer without actually sacrificing any of the
virtues of Erlang. It's not a change that could be made to Erlang by
now, but it certainly would not need a new VM, and it would not introduce
any failure modes that are not now present.
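
A minimal sketch of the difference in today's Erlang, assuming hypothetical helpers f/0, g/1 and h/1 and a made-up module name rebind_demo (none of these come from the thread). The renamed form is what has to be written now, while repeating X on the left of "=" is a match against the existing binding and fails at run time:

```erlang
-module(rebind_demo).
-export([renamed/0, matched/0]).

%% Hypothetical helper functions, used only for illustration.
f() -> 1.
g(X) -> X + 1.
h(X) -> X * 10.

%% What today's Erlang requires: a fresh name for each intermediate value.
renamed() ->
    X0 = f(),
    X1 = g(X0),
    X2 = h(X1),
    X2.

%% Here the second "=" is a match against the bound X, not a rebinding.
%% f() returns 1 and g(1) returns 2, so the match raises {badmatch, 2}.
matched() ->
    X = f(),
    try
        X = g(X),
        X
    catch
        error:{badmatch, V} -> {badmatch, V}
    end.
```

Under these assumptions rebind_demo:renamed() returns 20 and rebind_demo:matched() returns {badmatch, 2}. A rebinding dialect would presumably translate the second form into the first before compilation.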

I strongly criticised a recent suggestion of Tony Arcieri's, but while I
personally do not feel a need for conventional-seeming variables in
Erlang code, there is much good sense in his position on single assignment.

Jeff Schultz

Aug 4, 2011, 10:22:14 PM

to Richard Carlsson, erlang-q...@erlang.org

On Wed, Aug 03, 2011 at 11:47:13PM +0200, Richard Carlsson wrote:
>> According to the Barklund spec (page 62), it said that "a compiler
>> must only accept a program if for any evaluation order, there will
>> not be an applied occurrence of an unbound variable".
>>
>> "For example, in a context where X is unbound, the expression (X=8) +
>> X should give a compile-time error."
>>
>> (And it does, too. I checked).

Thanks for pointing that out. I'd have missed that one.

> An interesting twist on the example (X=8) + X is that if *both* arguments
> create a binding of the same variable, as in:
>
> (X=8) + (X=8)
>
> then Erlang will first evaluate the subexpressions, then *check* that the
> results match so that a single agreed value of X can be exported out from
> the whole expression, and only in that case will it call + on the results.
> You can experiment with the following to see what I mean:

The difference between the two cases seems a bit weird to me. I would
have expected that the meaning of a call f(Expr1, Expr2) (I'll write
it as |f(Expr1, Expr2)|) where the Expr are other Erlang expressions
should be exactly the same as

|T1 = Expr1|,
|T2 = Expr2|,
f(T1, T2)

where the T are new variables and |.| is applied recursively.

In this case, |(X = 8) + X| means X = 8, T1 = X, T2 = X, T1 + T2,
which is okay, and |X + (X = 8)| means T1 = X, X = 8, T2 = X, T1 + T2,
which is a compile-time error.

While I appreciate the intent of Barklund's rule above, I don't think
it plays well with Erlang's explicit left-to-right order of
evaluation.

Erlang really could do with a concise, agreed, current specification,
preferably executable.

Jeff Schultz

Richard O'Keefe

Aug 4, 2011, 10:55:38 PM

to Motiejus Jakštys, erlang-questions

On 4/08/2011, at 7:50 PM, Motiejus Jakštys wrote:

> On Fri, Jul 29, 2011 at 11:45:21AM +1200, Richard O'Keefe wrote:
>> One of the things criticised in the blog entry that we've been responding to was
>> that
>> {ok,Foo} = bar(...),
>> {ok,Foo} = ugh(...)
>> is too easy to write (when you really meant, say, Foo0, Foo1).

Please remember, people, that this wasn't *my* complaint,
but a complaint in the original "Erlang is a Ghetto" blog.
I was wondering whether we could do anything useful to
help people who _do_ tend to make that kind of mistake.

>
> I usually remember all variable names in a function I am working on,
> because functions are small (and easy to skim through in 0.3 sec).

I don't make that kind of mistake much myself.
However, I wouldn't take that "functions are small" for granted.

>
> Don't write functions with more than 15 to 20 lines of code.

I took the Erlang/OTP R12B-5 *.erl files and ran them through a
filter that removes comments. Blank lines were not counted.
A line beginning with ['a-z] was taken as beginning a clause,
and a line ending with [.] was taken as ending a function.
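
The counting heuristic described above could be sketched roughly as follows. This is not the filter that was actually used; the module name fun_sizes and the function names are assumptions, and the input is taken to be a list of lines with comments already removed:

```erlang
-module(fun_sizes).
-export([lines_per_function/1]).

%% Lines: list of strings with comments already stripped.
%% Returns the number of non-blank lines of each function, in order.
%% A function starts at a line beginning with a-z or a single quote
%% and ends at the next line whose last character is a full stop.
lines_per_function(Lines) ->
    NonBlank = [S || S <- [rstrip(L) || L <- Lines], S =/= ""],
    scan(NonBlank, outside, []).

%% An unfinished function at the end of the input is silently dropped.
scan([], _State, Acc) ->
    lists:reverse(Acc);
scan([[C | _] = Line | Rest], outside, Acc)
  when C >= $a, C =< $z; C =:= $\' ->
    step(Line, 1, Rest, Acc);
scan([_ | Rest], outside, Acc) ->
    %% Attributes such as -module(...). are skipped.
    scan(Rest, outside, Acc);
scan([Line | Rest], {inside, N}, Acc) ->
    step(Line, N + 1, Rest, Acc).

%% Line is the N-th line of the current function; close it on a full stop.
step(Line, N, Rest, Acc) ->
    case lists:last(Line) of
        $. -> scan(Rest, outside, [N | Acc]);
        _  -> scan(Rest, {inside, N}, Acc)
    end.

%% Remove trailing whitespace from a line.
rstrip(Line) ->
    IsSpace = fun(C) -> lists:member(C, " \t\r\n") end,
    lists:reverse(lists:dropwhile(IsSpace, lists:reverse(Line))).
```

Like the description it follows, this is purely textual: a line inside a function body that happens to end with a full stop would be miscounted as the end of the function.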

> summary(clauses.per.function)
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
    1.000    1.000    1.000    2.297    2.000  495.000
> summary(lines.per.clause)
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
    1.000    2.000    2.000    3.998    4.000  797.000
> summary(lines.per.function)
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
    1.000    2.000    5.000    9.184   10.000 2414.000

So yes, the mean is somewhere around 10. However,
8.8% of functions in R12B-5 were over 20 lines, and
2.7% were over 40 lines.
0.6% were over 100 lines. The "whopper" at 2,414 lines
is admittedly extreme; the next largest function is
"only" 1,072 lines.

> Split large
> functions into several smaller ones. Don't solve the problem by writing
> long lines, remember?
> http://www.erlang.se/doc/programming_rules.shtml#REF32141

Oh yes, line length.

> summary(columns.per.nonempty.line)
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
     1.00    24.00    39.00    40.07    54.00  1515.00

If you look at the source code of Reia, which I have,
you may come away with the same impression I did, which is
that Tony Arcieri writes pretty clean code that is no
hardship to read. He tends to spread things out vertically
more than I do, but I have to say that it's tidy and
readable. The complaint about unintentionally writing a
variable name more than once did NOT come from someone
who doesn't know how to write clean code.

You will also notice pretty quickly that some functions
just *can't* be kept below 21 lines: if you have more
than 20 cases to deal with, you are going to need more
than 20 lines to do it. And a compiler runs into that
kind of problem quite often. Here are the summaries
for Reia:

> summary(clauses.per.function)
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
    1.000    1.000    1.000    2.542    2.000  105.000
> summary(lines.per.clause)
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
    1.000    2.000    2.000    3.666    3.000   35.000
> summary(lines.per.function)
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
    1.000    2.000    5.000    9.316    8.000  229.000

Since the scope of variables is a clause, not a
function, the size we should be concerned with is
clause size, and we see that each and every one
of the clauses in Reia fits well within an edit
window on a modern machine. (The big function
is one which has, for good reason, lots of clauses;
the individual clauses are not particularly large.)

You may recall a recent thread in which I was not
particularly kind to an Erlang "enhancement" suggestion
of Arcieri's. I think that was a bad idea, but nobody
looking at his code could accuse him of ignorance of
Erlang or incompetence as a programmer. If he says that
he runs into unintentionally-repeated-variable-names
often enough to be a nuisance, then this is a problem
that a *GOOD* programmer can have in Erlang, even if you
or I do not.

I gave my second lecture about static checking this
morning, and spent some time explaining the importance
of turning off inappropriate warnings so that you can
see the real problems. I gave this example from my
Smalltalk-to-C compiler's run-time library:

#define ik_bitShift(x, y) \
(AREINTS(x, y) && -y < ASINT(INT_BITS) \
? (Word)((Int)(x) >> (int)(((-y) >> 2)) &~ (ASINT(1)-1)) \
: k_bitShift(x, y))

and the code that gets generated

t4 = ik_bitShift(l1 /*a*/, ASINT(24));

producing warnings like

"foobar.c", line 229490: warning: shift count negative or too big: >> 1073741800

from Sun cc, gcc-4.5, and clang. These compilers are smart enough to
see that the shift count is out of range, but so dumb that they pay
no attention whatsoever to the preceding test which ensures that the
shift in question can never under any circumstances be executed. I get
hundreds of lines of worse-than-useless warnings, which I cannot switch
off, because these compiler writers were so arrogantly cock-sure that
THEY would never be mistaken about what was probably wrong.

For efficiency the test should probably be special-cased by the Smalltalk
compiler rather than the C compiler, but I want to make that decision when
it rises to the top of the TODO list naturally, not because half-smart
compilers make my life a misery if I don't.

I really don't want to get in the way of programmers who know what they are
doing and don't have a particular problem, but that's why I suggested being
able to switch off warnings selectively.

Tim Watson

Aug 5, 2011, 2:25:49 AM

to Richard O'Keefe, erlang-questions

On 4 August 2011 23:29, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
> The person in question was Tony Arcieri, who built Reia

Wasn't paying attention to the details. Although I haven't used Reia,
I am aware of it and it is indeed a very impressive thing to have
built.

>
> It is important to understand the distinction that Tony Arcieri
> correctly makes between
> (A) single assignment in the *semantics* (data are immutable), and
> (B) single assignment in the *syntax* (variables can be assigned
> only once).

Thanks for clearing that up - I confess I didn't notice the
distinction but on second reading you're quite right.

> The point he makes is that you can have (A), which he seems to be
> happy with, without requiring (B), which he is not: you can treat
> re*assignment* as re*binding*. Thus in
>
> X = f(),
> X = g(X),
> X = h(X)
>

I must confess I find that very confusing. I have written plenty of
imperative code over the years but it just feels wrong to keep
reassigning like that even in Ruby. Given that Ruby is Object Oriented
as well as imperative, I'd expect to see something more like `f.g.h`
perhaps with parens in later versions where they seem to be required
once again. If the code is not doing anything with these intermediate
variables, why type them in the first place? After all, `h(g(f()))` is
actually more concise than the alternative, and rubyists do seem to
like conciseness (although I'm being a little flippant, as this
doesn't mean they all prefer inlining).

When I can make function/method parameters and/or local variables
immutable, readonly, `final`, or whatever syntax is available to me, I
do. I find this delegates checking for a common class of errors to the
compiler. In Erlang I might use `{ok, X} = f(), {still_ok, X} = g(X)`
as a deliberate check from time to time.

Richard Carlsson

Aug 5, 2011, 4:47:29 AM

to Jeff Schultz, erlang-q...@erlang.org

On 08/05/2011 04:22 AM, Jeff Schultz wrote:
> The difference between the two cases seems a bit weird to me. I would
> have expected that the meaning of a call f(Expr1, Expr2) (I'll write
> it as |f(Expr1, Expr2)|) where the Expr are other Erlang expressions
> should be exactly the same as
>
> |T1 = Expr1|,
> |T2 = Expr2|,
> f(T1, T2)
>
> where the T are new variables and |.| is applied recursively.
>
> In this case, |(X = 8) + X| means X = 8, T1 = X, T2 = X, T1 + T2,
> which is okay, and |X + (X = 8)| means T1 = X, X = 8, T2 = X, T1 + T2,
> which is a compile-time error.
>
> While I appreciate the intent of Barklund's rule above, I don't think
> it plays well with Erlang's explicit left-to-right order of
> evaluation.

I think you have misunderstood something - or maybe I'm missing
something in your reasoning. Barklund's stated rule means that |(X = 8)
+ X| is just as invalid in Erlang as |X + (X = 8)|, because even though
it will work in a left-to-right evaluation order (which is ultimately
the order in which the arguments _will_ be executed), it will not work
in _any_ evaluation order. The compiler therefore rejects it.

The nice consequence is that you can always naively reorder the
arguments of a function or operator call (in a valid program), because
there's no possibility that one affects the variable bindings expected
of another.

Hence, the rule actually plays very _well_ with the explicit evaluation
order: it says that even though it's tempting to think it would be good
to allow this kind of variable propagation since the evaluation order is
strictly left-to-right anyway, that's not a good idea, because it
actually makes programs harder to reason about and harder to refactor
without introducing bugs.

The general linearization of f(Expr1, Expr2) can be expressed as:

%% *if* Expr1 and Expr2 don't depend on each other's bindings:
T1 = Expr1,
T2 = Expr2,
%% for all X,Y,... exported from both Expr1 and Expr2:
X1 = X2, Y1 = Y2, ... % ensure they are the same
f(T1, T2)

(of course, the check of bindings only happens in extremely rare cases
like my example f(X=8, X=8), so there's normally no overhead for this).

Of course, side effects in argument expressions are a different matter.
An expression like f(P ! foo, P ! bar) will _always_ cause foo to arrive
before bar at P, because the evaluation order is fixed. And naively
reordering arguments with side effects in them could easily screw up
your program - which is why having side effects in arguments is bad style.
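
Both points can be tried out with a small sketch. The module name arg_eval_demo and the helper pair/2 are hypothetical, chosen only for illustration:

```erlang
-module(arg_eval_demo).
-export([both_bind/0, send_order/0]).

%% Both operands bind X to the same value, so the compiler accepts this
%% and X is visible afterwards. Writing (X = 8) + X instead is rejected
%% at compile time, because X is unbound in the right operand.
both_bind() ->
    Sum = (X = 8) + (X = 8),
    {Sum, X}.

%% Hypothetical two-argument function; it only pairs up its arguments.
pair(A, B) -> {A, B}.

%% Arguments are evaluated left to right, so foo is sent before bar.
%% The reference keeps unrelated messages in the mailbox out of the way.
send_order() ->
    Self = self(),
    Ref = make_ref(),
    _ = pair(Self ! {Ref, foo}, Self ! {Ref, bar}),
    receive {Ref, First} -> ok end,
    receive {Ref, Second} -> ok end,
    {First, Second}.
```

Here arg_eval_demo:both_bind() returns {16, 8} and arg_eval_demo:send_order() returns {foo, bar}.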

/Richard

Robert Virding

Aug 5, 2011, 9:56:41 AM

to Richard Carlsson, erlang-q...@erlang.org

This wouldn't help as you still have an order dependency.

I don't have any problems with refactoring tools, I never use them. :-) Being old-school I prefer my code exactly as it is written, if I had wanted it in another way I would have written it in another way.

I view {X,Y} = foo(A) as *one* expression so think it is more logical to see all the effects of it after it has been evaluated, even the variable bindings. It is not a big issue as most people, including me, seldom write code where this would be an issue, although not handling bindings in this way could make some code a bit more verbose.

> The rule that all arguments are evaluated in the same environment
> makes
> the world a happier place. If I want Basic, I know where I can get
> it. ;-)

Yes, write my own. The next language on the Erlang VM, Ebasic, for real programming.

Robert

Garrett Smith

Aug 7, 2011, 7:24:22 PM

to Richard O'Keefe, erlang-questions

On Thu, Aug 4, 2011 at 5:29 PM, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
>
> The person in question was Tony Arcieri, who built Reia
> (see http://reia-lang.org/). There are three things that seem
> obvious about Tony Arcieri:
> (1) He likes Ruby. (Nobody who didn't would want to build
> something like Reia.)
> (2) He knows rather a lot about Erlang. (Nobody who didn't
> *could* build something like Reia.)
> (3) He is a skilled programmer. (Nobody who wasn't could
> get something like Reia working and out there -- as long
> as directories don't include dots in their names...)

I reread his post, largely due to your generous (and dignified)
representation of his position in this thread.

The crux of his argument is this:

"This is the type of error you only encounter at runtime. It can lay
undetected in your codebase, unless you're writing tests."

As if other dynamically typed languages (Ruby) excel in catching
"errors" at compile time.

This isn't even a straw man. It's straw. His post, for whatever
reason, is an emotional outburst and largely unconstructive. At least
that's my reading of it, twice.

If your average number of lines per function is < 10, honestly, this
is *really* a problem?

Nonsense.

Garrett

Richard O'Keefe

Aug 7, 2011, 10:13:15 PM

to Garrett Smith, erlang-questions

On 8/08/2011, at 11:24 AM, Garrett Smith wrote:
> I reread his post, largely due to your generous (and dignified)
> representation of his position in this thread.

Well, given how I jumped all over his proposal to adopt
Ruby-style "blocks" into Erlang, I thought it was time to
shake the venom out of the pen...

> The crux of his argument is this:
>
> "This is the type of error you only encounter at runtime. It can lay
> undetected in your codebase, unless you're writing tests."
>
> As if other dynamically typed languages (Ruby) excel in catching
> "errors" at compile time.

My point here was that this kind of error *can* be caught at
compile time. I have a Smalltalk to C compiler (it does not handle
exceptions yet, so it's not really ready to release) and it's
actually quite surprising how much you *can* detect at compile time.

For example, I have a rule language that lets me write rules like these:

rule: 'isNil is sacred'
defines: #isNil -> is: Object | is: UndefinedObject

rule: '= and hash must be consistent (A)'
defines: #= -> defines: #hash

rule: '= and hash must be consistent (B)'
defines: #hash -> defines: #=

rule: 'network addresses are values'
kindOf: NetworkAddress -> isAbstract | isValue

rule: 'streams are NOT values'
kindOf: InputStream | kindOf: OutputStream -> isAbstract | isMutable

Rules like these are not traditional in Smalltalk, but they have caught a
surprising number of mistakes. The rule language is compiled into AWK
by an AWK script, which was rather fun. There are a couple of
hundred rules and there should be more. Of course, now I need a checker for
the checking language! One of the rules above had a typo I just noticed
when pasting it in...

Jeff Schultz

Aug 8, 2011, 3:33:18 AM

to Richard Carlsson, erlang-q...@erlang.org

On Fri, Aug 05, 2011 at 10:47:29AM +0200, Richard Carlsson wrote:
> On 08/05/2011 04:22 AM, Jeff Schultz wrote:
>> While I appreciate the intent of Barklund's rule above, I don't think
>> it plays well with Erlang's explicit left-to-right order of
>> evaluation.

> I think you have misunderstood something - or maybe I'm missing something
> in your reasoning. Barklund's stated rule means that |(X = 8) + X| is just
> as invalid in Erlang as |X + (X = 8)|, because even though it will work in
> a left-to-right evaluation order (which is ultimately the order in which
> the arguments _will_ be executed), it will not work in _any_ evaluation
> order. The compiler therefore rejects it.

> The nice consequence is that you can always naively reorder the arguments
> of a function or operator call (in a valid program), because there's no
> possibility that one affects the variable bindings expected of another.

My point was that in a strict language with side-effects and an
explicit evaluation order, the rule doesn't do anything useful. It
doesn't, for example, identify any code where the program will "go
wrong" that won't be picked up by the use-before-definition check
anyway.

Jeff Schultz

Richard Carlsson

Aug 8, 2011, 5:10:51 AM

to Jeff Schultz, erlang-q...@erlang.org

On 08/08/2011 09:33 AM, Jeff Schultz wrote:
> On Fri, Aug 05, 2011 at 10:47:29AM +0200, Richard Carlsson wrote:
>> On 08/05/2011 04:22 AM, Jeff Schultz wrote:
>>> While I appreciate the intent of Barklund's rule above, I don't think
>>> it plays well with Erlang's explicit left-to-right order of
>>> evaluation.
>
>> I think you have misunderstood something - or maybe I'm missing something
>> in your reasoning. Barklund's stated rule means that |(X = 8) + X| is just
>> as invalid in Erlang as |X + (X = 8)|, because even though it will work in
>> a left-to-right evaluation order (which is ultimately the order in which
>> the arguments _will_ be executed), it will not work in _any_ evaluation
>> order. The compiler therefore rejects it.
>
>> The nice consequence is that you can always naively reorder the arguments
>> of a function or operator call (in a valid program), because there's no
>> possibility that one affects the variable bindings expected of another.
>
> My point was that in a strict language with side-effects and an
> explicit evaluation order, the rule doesn't do anything useful. It
> doesn't, for example, identify any code where the program will "go
> wrong" that won't be picked up by the use-before-definition check
> anyway.

As I said, the rule should be read as a general design principle for
Erlang - it says that there has to *be* a check, one way or another (and
at compile time), for potentially uninitialized uses - regardless of
argument evaluation order. Otherwise, it's not Erlang. You could make a
nonstandard dialect of Erlang that uses right-to-left evaluation order
instead of left-to-right, and it would still be mostly the same
language, but if you break the overall rule, then your language is no
longer even Erlang-like.

For comparison, C is a "strict language with side effects" and lax
evaluation order - but even if you fixed the evaluation order, there
would be no rule that said that a C program containing a use of an
uninitialized variable is not a legal program - it's just that its
behaviour is undefined if you execute such a path at runtime.

Finally, it's best to keep the concepts of side effects and variable
bindings separate. In a language like Erlang, variable bindings have
nothing to do with side effects.

/Richard

Kenneth Lundin

Aug 17, 2011, 1:09:03 PM

to Richard O'Keefe, erlang-questions

We in the OTP team think this is a good idea but we can't say when we would have time to implement it.

It should be quite simple to implement, mostly involving the module erl_lint.

A nice user contribution would make it possible to introduce it more quickly.

The possibility to turn it on and off per function is not so important;
it can be added later if needed and would require some extra thought.

/Kenneth Erlang/OTP Ericsson

Den 29 jul 2011 01.45, "Richard O'Keefe" <o...@cs.otago.ac.nz> skrev:

One of the things criticised in the blog entry that we've been responding to was
that
{ok,Foo} = bar(...),
{ok,Foo} = ugh(...)
is too easy to write (when you really meant, say, Foo0, Foo1).

This is a well defined part of the language, and it would not be a good idea to
ban it. But how about an optional style warning (and we really need
-warn(on | off, [warning...])
directives) whenever a bound variable appears in a pattern on the left of "="?

Steve Vinoski

Aug 17, 2011, 2:02:57 PM

to Frédéric Trottier-Hébert, erlang-questions

2011/8/3 Frédéric Trottier-Hébert <fred....@erlang-solutions.com>:

> On 2011-08-02, at 01:59 AM, Richard O'Keefe wrote:
>
>> 2. Is matching against an already-bound variable a check we want?
>
> To me, matching against an already-bound variable is a valid assertion, as much as 'ok = function_call()' might be, and as implicit as '{ok, Pid}' or '{error, already_started}' might be. Matching on already-bound variables is part of my standard code when trying to crash early when possible, and also part of many basic test suites that simply do pattern matching here and there. To me it's as basic as using the same variable twice in a single pattern (f([A,A,B,B|_]) when A =/= B -> ...), or something similar with records.
>
> I would likely not use the check at all, and if it were to be added, would prefer it to be a compiler argument (which could be enabled in -compile(...).) I foresee little use of such warnings for myself and would dislike to see it becoming a default setting.

I completely agree. I use matching this way quite a bit, especially in
testing, and wouldn't want to see any warnings for it by default.

--steve

Kenneth Lundin

Aug 17, 2011, 4:09:23 PM

to Steve Vinoski, erlang-questions

I agree that it is very common in test code to match against already bound variables.

But in production code it is quite rare.

The warning should definitely not be on by default.

We will also run a check over all OTP libraries and possibly other production code as well to see how it turns out before we make any definite decisions.

But as said we have not given the implementation any priority at all yet.

/Kenneth

2011/8/17 Steve Vinoski <vin...@ieee.org>:

OvermindDL1

Aug 19, 2011, 3:11:38 AM

to Kenneth Lundin, erlang-questions, Steve Vinoski

It is not so rare in my production code, for the record: I use it heavily in a few of my math-oriented modules and would not want the warning on by default. In my few years of Erlang I have not yet been bitten by the bug the warning would prevent, although I do have other functional backgrounds.
