[erlang-questions] What I dislike about Erlang


Richard O'Keefe

Aug 31, 2012, 2:20:22 AM
to Erlang Questions, cos...@cs.otago.ac.nz, cos...@mailhub2.otago.ac.nz
We've just had a thread about what people like about Erlang.
We also had the announcement of TinyMQ.
So I'm going to use this as an example of what's *really*
wrong with Erlang.

Don't get me wrong. I endorse everything everyone else has
said in favour of Erlang. Erlang is like democracy: the worst
thing in its class except for all the others, and something
that is increasingly imitated by people who just don't get
some of the fundamental things about it.

I also endorse what people have said in praise of TinyMQ.
There are lots of things that it does right:
- there is a README
- there are EDoc comments with @specs for the public
  interface
- the functions and variables are named well enough that
  I was never in doubt about what any part of the code was
  up to, at least not for longer than a second or two
- the hard work of process management is delegated to OTP
  behaviours
At this point, it's looking better than anything I've written.

Make no mistake: I am not saying that Erlang or TinyMQ are *bad*.
They are good things; I'm just ranting somewhat vaguely about
why they should be better.


LUMPS OF INDISTINGUISHABLE CODE.

Up to a certain level of hand-waving, TinyMQ can be roughly
understood thus:
    The TinyMQ *system* is a monitor
        guarding a dictionary mapping strings to channels,
    where
        a channel is a monitor
            guarding a bag of subscribers and
            a sliding window of {Message, Timestamp} pairs.

YOU CANNOT SEE THIS AT A GLANCE.

This is not Evan Miller's fault. *Anything* you write in
Erlang is going to end up as lumps of indistinguishable code,
because there is nothing else for it to be.

This is also true in C, C++, Java, C#, Javascript, Go,
Eiffel, Smalltalk, Prolog, Haskell, Clean, SML, ...,
not to mention Visual Basic and Fortran.

Almost the only languages I know where it doesn't *have* to
be true are Lisp, Scheme, and Lisp-Flavoured Erlang. Arguably
Prolog *could* be in this group, but in practice it usually is
in the other camp. Thanks to the preprocessor, C *can* be
made rather more scrutable, but for some reason this is frowned on.

There's the e2 project (http://e2project.org) which is a step
in a good direction, but it doesn't do much about this problem.
A version of TinyMQ using e2_service instead of gen_server
would in fact exacerbate the problem by mushing
handle_call/3, handle_cast/2, and handle_info/2 into one
function, turning three lumps into one bigger lump.

LUMPS OF DATA.

Take tinymq_channel_controller as an example.
Using an OTP behaviour means that all six dimensions of the state
are mushed together in one data structure. This goes a long way
towards hiding the fact that

supervisor, channel, and max_age are never changed
messages, subscribers, and last_pull *are* changed.

One teeny tiny step here would be to offer an alternative set of
callbacks for some behaviours where the "state" is separated into
immutable "context" and mutable "state", so that it is obvious
*by construction* that the context information *can't* be changed.

Another option would be to have some way of annotating in a
-record declaration that a field cannot be updated.

I prefer the segregation approach on the grounds of no language
change being needed and the improved efficiency of not copying
fields that can't have changed. Others might prefer the revised
-record approach on the grounds of not having to change or
duplicate the OTP behaviours.

I had to read each file in detail
- to find that certain fields *happened* not to be changed
- to understand the design well enough to tell that this was
almost certainly deliberate.
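
A minimal sketch of the segregation idea, assuming it is done by
convention inside an ordinary gen_server rather than through a new
behaviour. The callback state is a {Context, State} pair, and the
helper update/4 receives the context read-only and can return only the
mutable part. The module name channel_sketch, the records ctx and st,
and the push/3 interface are all invented for illustration; none of
them come from TinyMQ.

```erlang
-module(channel_sketch).
-behaviour(gen_server).
-export([start_link/2, push/3]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Immutable context: built once in init/1 and passed through untouched.
-record(ctx, {channel :: term(), max_age :: non_neg_integer()}).
%% Mutable state: the only part a callback is expected to replace.
-record(st, {messages = [] :: [{term(), integer()}]}).

start_link(Channel, MaxAge) ->
    gen_server:start_link(?MODULE, {Channel, MaxAge}, []).

%% Now is passed in explicitly so the sketch stays deterministic.
push(Pid, Msg, Now) ->
    gen_server:call(Pid, {push, Msg, Now}).

init({Channel, MaxAge}) ->
    {ok, {#ctx{channel = Channel, max_age = MaxAge}, #st{}}}.

handle_call({push, Msg, Now}, _From, {Ctx, St}) ->
    NewSt = update(Ctx, St, Msg, Now),
    %% Ctx goes back exactly as it came in.
    {reply, {ok, length(NewSt#st.messages)}, {Ctx, NewSt}}.

handle_cast(_Msg, CtxAndSt) ->
    {noreply, CtxAndSt}.

%% Reads the context, returns only a new #st{}; it has no way to
%% hand back a changed context.
update(#ctx{max_age = MaxAge}, #st{messages = Ms} = St, Msg, Now) ->
    Fresh = [M || {_, T} = M <- Ms, Now - T =< MaxAge],
    St#st{messages = [{Msg, Now} | Fresh]}.
```

This only makes the split visible to a reader; nothing stops a careless
clause from rebuilding the #ctx{} record, which is why a behaviour with
separate context and state arguments would be stronger.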

WE DOCUMENT THE WRONG THINGS.

It's well known that there are two kinds of documentation,
"external" documentation for people writing clients of a module,
and "internal" documentation for people maintaining the module
itself. It's also well known that the division is simplistic;
if the external documentation is silent about material points
you have to read the internal documentation.

In languages like Prolog and Erlang and Scheme where you build
data structures out of existing "universal" types and have no
data structure declarations, we tend to document procedures
but not data. This is backwards. If you understand the data,
and especially its invariants, the code is often pretty obvious.

There are two examples of this in TinyMQ. One is specific to
TinyMQ. The other is nearly universal in Erlang practice.

Erlang systems are made of lots of processes sending messages
to each other. Joe Armstrong has often said THINK ABOUT THE
PROTOCOLS. But Erlang programmers very seldom *write* about
the protocols.

Using the OTP behaviours, a "concurrent object" is implemented
as a module with a bunch of interface functions that forward
messages through the OTP layer to the callback code managed by
whatever behaviour it is. This protocol is unique to each kind
of concurrent object. It's often generated in one module (the
one with the interface functions) and consumed in another (the
one with the callback code), as it is in TinyMQ. And it's not
documented.
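
As a sketch of what writing the protocol down could look like, assuming
nothing more than ordinary -type and -spec declarations: the request
type below lists every message the server accepts together with its
reply, and both the interface functions and the callback refer to it.
The module proto_sketch and its push/poll messages are hypothetical and
are not TinyMQ's actual protocol.

```erlang
-module(proto_sketch).
-behaviour(gen_server).
-export([start_link/0, push/2, poll/2]).
-export([init/1, handle_call/3, handle_cast/2]).

%% The internal protocol, stated in one place.
-type request() ::
        {push, Msg :: term()}                 % reply: {ok, Seq}
      | {poll, Since :: non_neg_integer()}.   % reply: {ok, [Msg]}

%% State: last sequence number issued, and messages newest first.
-type state() :: {non_neg_integer(), [{pos_integer(), term()}]}.

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% All requests go through call/2, so its spec covers every sender.
-spec call(pid(), request()) -> term().
call(Pid, Request) ->
    gen_server:call(Pid, Request).

push(Pid, Msg)   -> call(Pid, {push, Msg}).
poll(Pid, Since) -> call(Pid, {poll, Since}).

init([]) ->
    {ok, {0, []}}.

-spec handle_call(request(), term(), state()) -> {reply, term(), state()}.
handle_call({push, Msg}, _From, {Seq, Msgs}) ->
    Next = Seq + 1,
    {reply, {ok, Next}, {Next, [{Next, Msg} | Msgs]}};
handle_call({poll, Since}, _From, {_Seq, Msgs} = State) ->
    %% Messages with a sequence number above Since, oldest first.
    Newer = lists:reverse([M || {N, M} <- Msgs, N > Since]),
    {reply, {ok, Newer}, State}.

handle_cast(_Other, State) ->
    {noreply, State}.
```

With the type in one place, a clause such as set_max_age that nobody
sends would at least stand out to a reader, and a tool such as Dialyzer
may be able to flag some mismatches between senders and callbacks.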

It is possible to reconstruct this protocol by reading the code
in detail and noting down what you see. It is troublesome when,
as in TinyMQ, the two modules disagree about the protocol. It's
clear that _something_ is wrong, but what, exactly?

For example, tinymq_controller has a case
    handle_cast({set_max_age, NewMaxAge}, State) ->
but this is the only occurrence of set_max_age anywhere in TinyMQ.
Is its presence in tinymq_controller an example of dead code,
or is its absence from the rest of the application an example
of missing code? The same question can be asked about 'expire'
(which would forget a channel without making it actually go away,
if it could ever be invoked, which it can't.)

Almost as soon as I started reading Erlang code many years ago
it seemed obvious to me that documenting (and if possible, type
checking) these internal protocols was a very important part of
Erlang internal documentation. There must be something wrong
with my brain, because other people don't seem to feel this lack
anywhere nearly as strongly as I do. I think Joe Armstrong sort
of sees this at the next level up or he would never have invented
UBF.

But Occam, Go, and Sing# have typed channels, so they *are*
addressing the issue, and *do* have a natural central point to
document what the alternatives of an internal protocol signify.

Another documentation failure is that we fail to document what
is not there. In TinyMQ, a channel automatically comes into
existence when you try to use it. Perhaps as a consequence of
this, there is no way to shut a channel down. In TinyMQ, old
messages are not removed from a channel when they expire, but
the next time someone does a 'subscribe' (waves hands) or a 'poll'
or a 'push' *after* they expire. So if processes stop sending
and requesting messages to some channel, the last few messages,
no matter how large, may hang around forever. I'm sure there
is a reason, but because it's a reason for something *not* being
there, there's no obvious place to hang the comment, and there
isn't one. (Except for the dead 'expire' clause mentioned above.)

IT'S HARD TO SPOT SALIENT DETAIL IN A SEA OF GLUE CODE.

The central fact about TinyMQ is that it holds the messages of
a channel in a simple list of {Message, Timestamp} pairs. As
a result, every operation on the data takes time linear in the
current size.

This is not stated anywhere in any comments nor in the README.
You have to read the code in detail to discover this. And it
is a rather nasty surprise. If a channel holds N messages,
the operations *can* be done in O(log(N)) time. (I believe it
is possible to do even better.) Some sliding window applications
have a bound on the number of elements in the window. This one
has a bound on the age of elements, but they could arrive at a
very high rate, so N *could* get large.
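
One way the logarithmic bound might be reached is sketched below with
the standard gb_trees module, keyed on {Timestamp, Seq}. The module
name sliding_window and its API are assumptions made for illustration;
this is not an existing library and not how TinyMQ stores messages.

```erlang
-module(sliding_window).
-export([new/1, push/3, expire/2, count/1, to_list/1]).

%% Invariant: tree maps {Timestamp, Seq} -> Message. Seq is unique per
%% window, so messages with equal timestamps never collide, and key
%% order is arrival order within one timestamp.
-record(win, {max_age :: non_neg_integer(),
              seq     :: non_neg_integer(),
              tree    :: gb_trees:tree()}).

new(MaxAge) ->
    #win{max_age = MaxAge, seq = 0, tree = gb_trees:empty()}.

%% Insert one message: one logarithmic tree insertion.
push(Msg, Now, #win{seq = S, tree = T} = W) ->
    W#win{seq = S + 1, tree = gb_trees:insert({Now, S}, Msg, T)}.

%% Drop every message older than max_age relative to Now.
%% Each removal is one take_smallest; nothing else is visited.
expire(Now, #win{max_age = MaxAge, tree = T} = W) ->
    W#win{tree = drop_old(Now - MaxAge, T)}.

drop_old(Cutoff, T) ->
    case gb_trees:is_empty(T) of
        true ->
            T;
        false ->
            case gb_trees:smallest(T) of
                {{Ts, _Seq}, _Msg} when Ts < Cutoff ->
                    {_Key, _Val, T1} = gb_trees:take_smallest(T),
                    drop_old(Cutoff, T1);
                _ ->
                    T
            end
    end.

count(#win{tree = T}) ->
    gb_trees:size(T).

%% Messages oldest first (gb_trees:values/1 follows key order).
to_list(#win{tree = T}) ->
    gb_trees:values(T).
```

Expiry here costs time proportional to the number of messages actually
dropped (times a logarithmic factor), rather than to the size of the
whole window.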

It is very easy to implement the necessary operations using lists,
so much so that they are present in several copies. Revising the
TinyMQ implementation to work better with long queues would be
harder than necessary because of this. And this goes un-noticed
because there is so much glue code for the guts to get lost in.

Given that Evan Miller took the trouble to use library components
for structuring this application, why didn't he take the next step,
and use the existing 'sliding window' library data structure?

Because there is none!

Yet sliding windows of one sort or another have come up before in
this mailing list. Perhaps we should have a Wiki page on
trapexit to gather requirements for one or more sliding window
libraries. Or perhaps not. "true religion jeans for women" --
what has that or "Cheap Nike Shoes" to do with Erlang/OTP
(http://www.trapexit.org/forum/viewforum.php?f=20)?





_______________________________________________
erlang-questions mailing list
erlang-q...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions

Max Lapshin

Aug 31, 2012, 2:42:57 AM
to Richard O'Keefe, cos...@cs.otago.ac.nz, Erlang Questions, cos...@mailhub2.otago.ac.nz
I really don't understand what makes you sad.

Is it that people don't write internal documentation? Perhaps it is
because the code is the documentation.

Francesco Mazzoli

Aug 31, 2012, 3:36:10 AM
to Richard O'Keefe, cos...@cs.otago.ac.nz, Erlang Questions, cos...@mailhub2.otago.ac.nz
At Fri, 31 Aug 2012 10:42:57 +0400,
Max Lapshin wrote:
> I really don't understand what makes you sad.
>
> Is it that people don't write internal documentation? Perhaps it is because
> the code is the documentation.

Did you read his post? His main point about documentation is that people tend
to document functions but not data structures, which is something that I fully
agree with.

This problem derives from Erlang being unityped - In languages like Haskell or
SML with proper ADTs documenting them is a natural thing to do, and that's
usually the main part of the documentation in general.

--
Francesco * Often in error, never in doubt

Rapsey

Aug 31, 2012, 3:43:41 AM
to Richard O'Keefe, Erlang Questions
My biggest gripe with Erlang is the limitations of records. Does anyone know when frames will make an appearance?


Sergej

Max Lapshin

Aug 31, 2012, 3:57:53 AM
to Francesco Mazzoli, cos...@cs.otago.ac.nz, Erlang Questions, cos...@mailhub2.otago.ac.nz
On Fri, Aug 31, 2012 at 11:36 AM, Francesco Mazzoli <f...@mazzo.li> wrote:
>
> Did you read his post? His main point about documentation is that people tend
> to document functions but not data structures, which is something that I fully
> agree with.
>

This post has nothing to do with Erlang. It is just an ancient
question: whether or not to waste your time documenting inner code.
I never document such things, because it is silly: you spend more time
rewriting documentation.

And I read the ffmpeg sources perfectly well without any documentation inside them.


So, about Erlang.

There are two BIG problems in Erlang: error_logger and the lack of frames.

error_logger is a problem because 99% of the time, when the Erlang VM
crashes, it was error_logger that allocated 40 GB of RAM and died.
If Erlang is eating 100% CPU, it is error_logger. error_logger is the
Achilles' heel of Erlang. And it is a pity, because it makes it
impossible to create a gen_server with state larger than 100 KBytes.


The lack of frames makes hot code upgrades that change data structures
almost impossible. It also makes distributing plugins impossible.

Loïc Hoguin

Aug 31, 2012, 4:42:34 AM
to Max Lapshin, cos...@cs.otago.ac.nz, Erlang Questions, cos...@mailhub2.otago.ac.nz
On 08/31/2012 08:42 AM, Max Lapshin wrote:
> I really don't understand what makes you sad.
>
> Is it that people don't write internal documentation? Perhaps it is
> because the code is the documentation.

Only to an extent.

If your gen_server is only meant to be accessed through the API
functions, then sure, that's fine.

If it can receive messages from other sources, then these should
probably be documented. For gen_server the module description would be a
good place to put explanations, and handle_info should probably have a
detailed spec, clause by clause, on what it can receive.

I think the difficulty in documenting the protocol is that we write
modules while the protocol is about processes. What do we do about
processes that use more than one module that can send messages? I go
about it with an "Internals" documentation but that's separate from the
code.

--
Loïc Hoguin
Erlang Cowboy
Nine Nines
http://ninenines.eu

Thomas Lindgren

Aug 31, 2012, 4:47:53 AM
to Erlang Questions




----- Original Message -----
> From: Max Lapshin <max.l...@gmail.com>

>
> Lack of frames makes almost impossible hot code upgrades with changing
> of data structures. It makes impossible distributing plugins.


The whole "upgrade your record/state" thing is pretty awkward (because at this point you usually have two definitions of the same record, for those who haven't thought about it). One approach is to write a shim module and do a more elaborate upgrade, which is a pain. Another approach is to grunt that "records are just syntactic sugar for tuples" and hack away. But I think this problem could also be fixed by changing the gen_* interface. Add two callbacks, invoked at the appropriate points:

* convert internal state to key-value list (when exiting the old version)
* convert key-value list to internal state (when entering the new version)

No more clashing record definitions. It would also be nice if the preprocessor could generate the from/to conversions. 
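
A rough sketch of the two conversions, assuming they are written by
hand with record_info/2 instead of being generated by the preprocessor.
The names state_to_kv/1 and kv_to_state/1 and the fields of the state
record are hypothetical; no gen_* behaviour calls them today.

```erlang
-module(state_conv).
-export([state_to_kv/1, kv_to_state/1]).

%% The record as this version of the module defines it. Suppose
%% last_pull is new in this version and has a default.
-record(state, {channel, max_age = 60, last_pull = 0}).

%% Leaving a version: flatten the record into a key-value list.
state_to_kv(#state{} = S) ->
    lists:zip(record_info(fields, state), tl(tuple_to_list(S))).

%% Entering a version: take each field from the list when present,
%% otherwise fall back to this version's default for that field.
kv_to_state(KVs) ->
    Defaults = state_to_kv(#state{}),
    Values = [proplists:get_value(K, KVs, Def) || {K, Def} <- Defaults],
    list_to_tuple([state | Values]).
```

Because only the key-value list crosses the version boundary, the old
and new modules never need to agree on the tuple layout of the record.
Keys the new version does not know about are silently dropped in this
sketch, which may or may not be what an upgrade wants.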

Usual disclaimers apply.

Best,
Thomas

Francesco Mazzoli

Aug 31, 2012, 6:45:03 AM
to Max Lapshin, Francesco Mazzoli, cos...@cs.otago.ac.nz, Erlang Questions, cos...@mailhub2.otago.ac.nz
At Fri, 31 Aug 2012 11:57:53 +0400,
Max Lapshin wrote:
> This post has nothing to do with erlang. It is just an ancient question: waste
> your time on documenting inner code or not.

Again, did you read the post?

> It's well known that there are two kinds of documentation, "external"
> documentation for people writing clients of a module, and "internal"
> documentation for people maintaining the module itself. It's also well known
> that the division is simplistic; if the external documentation is silent about
> material points you have to read the internal documentation.

The "internal" documentation, as you call it, is the only one that really
matters. It's no waste of time. It rarely happens that after using a library
for more than a few hours I don't need to look at the code to understand what
the functions I'm using do.

Moreover, as ROK says, I really don't see the distinction: the "internal" and
"external" documentation should be one.

> I never document such things, because it is silly: you spend more time on
> rewriting documentation.

I'd much rather have something that explains the internal design than some edoc
that tells me more or less what functions will do.


> And I perfectly read ffmpeg sources without any documentation inside them.

Well, some code is pretty self explanatory. But most code isn't.

--
Francesco * Often in error, never in doubt

o...@cs.otago.ac.nz

Aug 31, 2012, 8:19:32 AM
to Max Lapshin, Francesco Mazzoli, cos...@cs.otago.ac.nz, Erlang Questions, cos...@mailhub2.otago.ac.nz
> On Fri, Aug 31, 2012 at 11:36 AM, Francesco Mazzoli <f...@mazzo.li> wrote:
>>
>> Did you read his post? His main point about documentation is that
>> people tend
>> to document functions but not data structures, which is something that I
>> fully
>> agree with.
>>
>
> This post has nothing to do with erlang. It is just an ancient
> question: waste your time on documenting inner code or not.

Waste? In this case, code that is basically *good*, written
by a competent Erlang programmer, has a serious performance
limitation --- which may well have been a complete non-issue
in its original context, but now that the code is available
for use in other contexts, may be very important. And this is
not documented *anywhere*, and it is hard to see in the code
because it is swamped by other things.

> I never document such things, because it is silly: you spend more time
> on rewriting documentation.

Nobody said documentation had to be *bulky*.
In my Smalltalk system, I've found that the code changes like
dreams as I refactor and extend it, but that the data structure
invariants are far more stable. It is precisely the act of
documenting those invariants that makes it *possible* to work
so easily on the code.

> And I perfectly read ffmpeg sources without any documentation inside them.

Perhaps that is why the ffmpeg site says the software
"work really well 99% of the time". That kind of failure
rate is not tolerated in the Erlang world.

Hmm. Why am I believing this? Ad fontes! Calculemus!
Download ffmpeg sources.
Count SLOC: 460,014 lines
Count comments: 77,320 lines (excluding copyright notices & GNU licences)
That's roughly one comment line for every 6 SLOC.
This is *NOT* "sources without any documentation inside them".

Returning to Erlang, I thought I had made the point that
I wanted the structure of the code to reveal what is going
on, but it doesn't. And for the specific example considered,
10 lines of commentary would have rendered roughly 200 SLOC
of code pretty near superfluous to understanding.

Joe Armstrong

Aug 31, 2012, 10:20:31 AM
to Richard O'Keefe, cos...@cs.otago.ac.nz, Erlang Questions, cos...@mailhub2.otago.ac.nz
On Fri, Aug 31, 2012 at 8:20 AM, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
Thank you.

I was wondering - perhaps it is wrong to publish source code. If you
have to read the internal documentation to understand the external
behavior of a program then the external documentation is not good enough.

Dare you publish only binary code and documentation of the interfaces?

I find that large numbers of programs cannot be understood by reading
the documentation of the interfaces - you have to read the internal
documentation and (horrors) the code.

Reading code is no fun, since you always wonder *why* they wrote it
that way and not some other way, and you get tempted to change it.

I think you should only publish binary code and external documentation.
If a user wants to know how to use the code and has to ask, then you
have failed to document your code.

If a user wants to see the code, not because they wish to use it, but
because they wish to see how you solved the problem - then you can let
them see the code.

Programs and code are supposed to be black-boxes. If you have to open
the black box and peep inside then they are not black boxes any more.

The practice of reading code to figure out how to use the code is
crazy and an incredible waste of time.

When programming I spend most of my time fixing things that should not
be broken and figuring out stuff that should be documented.

I have said many times - code is the result of research. It might take
me hours of research to write twenty lines of code. If I publish the 20
lines and throw away the research I am doing nobody a favor.

Programs should be released with all the necessary documents needed to
understand the code.

/Joe

Tim Watson

Aug 31, 2012, 10:23:15 AM
to Joe Armstrong, cos...@cs.otago.ac.nz, Erlang Questions, cos...@mailhub2.otago.ac.nz

On 31 Aug 2012, at 15:20, Joe Armstrong wrote:
> Programs and code are supposed to be black-boxes. If you have to open
> the black box and
> peep inside then they are not black boxes any more.
>
> The practice of reading code to figure out how to use the code is
> crazy and an incredible
> waste of time.
>

+1

> When programming I spend most of my time fixing things that should not be broken
> and figuring out stuff that should be documented.
>

+1, much to my annoyance

> I have said many times - code is the result of research - It might
> take me hours of research
> to write twenty lines of code. If I publish the 20 lines and throw away
> the research I am doing nobody
> a favor.


I wish this was brainwashed into every programmer at birth. :)

Ivan Uemlianin

Aug 31, 2012, 10:25:31 AM
to erlang-q...@erlang.org
On 31/08/2012 15:23, Tim Watson wrote:
>
> On 31 Aug 2012, at 15:20, Joe Armstrong wrote:
>>...
>> I have said many times - code is the result of research - It might
>> take me hours of research
>> to write twenty lines of code. If I publish the 20 lines and throw away
>> the research I am doing nobody
>> a favor.
>
>
> I wish this was brainwashed into every programmer at birth. :)

... and every manager!


--
============================================================
Ivan A. Uemlianin PhD
Llaisdy
Speech Technology Research and Development

iv...@llaisdy.com
www.llaisdy.com
llaisdy.wordpress.com
github.com/llaisdy
www.linkedin.com/in/ivanuemlianin

"hilaritas excessum habere nequit"
(Spinoza, Ethica, IV, XLII)
============================================================

Max Lapshin

Aug 31, 2012, 11:18:31 AM
to Ivan Uemlianin, erlang-q...@erlang.org
Guys, this is nice in theory, but my practice is different.

For example, there is a protocol called MPEG-TS (in fact the problem
is deeper: it is a whole set of protocols).
No single program implements 100% of it. Every program
lacks something.
Without the source code it is impossible to find out what the problem is.

So I consider that the "black-box" approach doesn't work, because all
code has bugs. If you can't modify some code, don't use it in your
program at all.

Ivan Uemlianin

Aug 31, 2012, 11:24:24 AM
to erlang-q...@erlang.org
One point about documentation is the language it's written in. If the
documentation was written in Greek but you only understand Turkish, the
documentation won't be much good.
--
============================================================
Ivan A. Uemlianin PhD
Llaisdy
Speech Technology Research and Development

iv...@llaisdy.com
www.llaisdy.com
llaisdy.wordpress.com
github.com/llaisdy
www.linkedin.com/in/ivanuemlianin

"hilaritas excessum habere nequit"
(Spinoza, Ethica, IV, XLII)
============================================================

Evan Miller

Aug 31, 2012, 1:42:17 PM
to Richard O'Keefe, cos...@cs.otago.ac.nz, Erlang Questions, cos...@mailhub2.otago.ac.nz
Richard,

Thanks for your comments. To preface, I plead guilty to charges of
gross negligence in failing to document TinyMQ's internals. This was
laziness on my part.

I released TinyMQ only because I felt guilty for sitting on the code
for about a year. Like many open-source programmers, I have a lot of
demands on my attention, and it is not clear in advance what
documentation is actually worth writing. The @spec and @doc strings
for the public API seemed like a good start. But if it turned out that
no one was interested in using the library in the first place, why
should I bother documenting internal protocols and data structures?
I've wasted many hours in the past documenting, refactoring, and
generally cleaning up application internals for the benefit of
nebulous "others", only to receive zero patches and no indication that
any of my efforts were of any assistance to anyone.

So in the spirit of your capitalized complaints, I will just say:

ALL YOU HAVE TO DO IS ASK

Want to know about the big-O performance characteristics? Just ask.
Want to know how channel creation works? Just ask. As a lazy person,
if a few people ask me the same thing I'll usually add a note to the
README in order to avert future emails from strangers. We all like a
well-documented project, but without feedback and communication it is
not clear where one's efforts are best spent on a project that doesn't
have an explicit client. If I knew in advance who would be using and
reading the code (i.e. if I wrote this code for an employer), I would
put more effort into writing documents for that specific audience. But
as a rule, if I am just putting some code "out there", I would rather
wait and see what people would like to know about, rather than
pre-emptively document every thought that has ever occurred to me
relating to the code base.

Now, I know you were not trying to pick on TinyMQ, and your interest
is more in how Erlang tends to result in lumps of code that obscure
key characteristics of the application. I agree with the assessment,
but I am not quite as hopeless about the situation.

I would like to see the development of graphical tools that let you
see in an instant how applications are structured and how they behave.
I am thinking of something like Pman on steroids, where I can *watch*
messages travel between processes, *inspect* gen_server state, and
*test* the system by seeing the result of single function calls or
many (load-testing). I'd like to be able to do all this with my mouse,
and generally get the feeling that I am watching the operation of a
machine that *shows* me how messages are passed, processes are
created, and state is updated.

Did anyone else ever play Marble Drop from Maxis in the late 90s? That
is the kind of interface I would like to see for the Erlang run-time.

For now, I'll update the README.

Evan
--
Evan Miller
http://www.evanmiller.org/

Garrett Smith

Aug 31, 2012, 2:14:46 PM
to Evan Miller, cos...@cs.otago.ac.nz, Erlang Questions, cos...@mailhub2.otago.ac.nz
It's fun to watch this process. Back in the days when most source code
was closed, you'd never get to see the iterations of maturing code.

Now we write something and then put it on a publicly accessible server
where people can see it in its earliest, rawest form.

And to get minds like RoK and Joe to weigh in -- it's very special I think.

Garrett Smith

Aug 31, 2012, 2:16:15 PM
to Evan Miller, cos...@cs.otago.ac.nz, Erlang Questions, cos...@mailhub2.otago.ac.nz
This of course wasn't meant to imply that TinyMQ was early or raw!

Evan Miller

Aug 31, 2012, 2:57:46 PM
to Richard O'Keefe, cos...@cs.otago.ac.nz, Erlang Questions, cos...@mailhub2.otago.ac.nz
A few more comments specific to TinyMQ:

On Fri, Aug 31, 2012 at 1:20 AM, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
> Another documentation failure is that we fail to document what
> is not there. In TinyMQ, a channel automatically comes into
> existence when you try to use it. Perhaps as a consequence of
> this, there is no way to shut a channel down. In TinyMQ, old
> messages are not removed from a channel when they expire, but
> the next time someone does a 'subscribe' (waves hands) or a 'poll'
> or a 'push' *after* they expire. So if processes stop sending
> and requesting messages to some channel, the last few messages,
> no matter how large, may hang around forever. I'm sure there
> is a reason, but because it's a reason for something *not* being
> there, there's no obvious place to hang the comment, and there
> isn't one. (Except for the dead 'expire' clause mentioned above.)
>
> IT'S HARD TO SPOT SALIENT DETAIL IN A SEA OF GLUE CODE.

This probably proves your point, but just to be clear old messages
will only "hang around forever" if 1. there are no new messages on a
channel and 2. the channel continues to receive requests. If there is
no activity on a channel for max_age, the gen_server timeout is
invoked here:

https://github.com/evanmiller/tinymq/blob/master/src/tinymq_channel_controller.erl#L76

That calls "expire" (erasing the reference to the channel) and then
exits the supervisor (eliminating the channel and all of its
messages). So in the absence of channel activity, the longest a
message will hang around is 2 * max_age, e.g. when there is a push
\epsilon seconds before an old message is set to expire.

>
> The central fact about TinyMQ is that it holds the messages of
> a channel in a simple list of {Message, Timestamp} pairs. As
> a result, every operation on the data takes time linear in the
> current size.
>
> This is not stated anywhere in any comments nor in the README.
> You have to read the code in detail to discover this. And it
> is a rather nasty surprise. If a channel holds N messages,
> the operations *can* be done in O(log(N)) time. (I believe it
> is possible to do even better.) Some sliding window applications
> have a bound on the number of elements in the window. This one
> has a bound on the age of elements, but they could arrive at a
> very high rate, so N *could* get large.

I have added some notes about TinyMQ's run-time characteristics to the README:

https://github.com/evanmiller/tinymq/commit/568bc7ce94ebaa42e3ce047c375c73323042853a

>
> It is very easy to implement the necessary operations using lists,
> so much so that they are present in several copies. Revising the
> TinyMQ implementation to work better with long queues would be
> harder than necessary because of this. And this goes un-noticed
> because there is so much glue code for the guts to get lost in.
>
> Given that Evan Miller took the trouble to use library components
> for structuring this application, why didn't he take the next step,
> and use the existing 'sliding window' library data structure?
>
> Because there is none!

This would be a useful addition!

Evan

>
> Yet sliding windows of one sort or another have come up before in
> this mailing list. Perhaps we should have a Wiki page on
> trapexit to gather requirements for one or more sliding window
> libraries. Or perhaps not. "true religion jeans for women" --
> what has that or "Cheap Nike Shoes" to do with Erlang/OTP
> (http://www.trapexit.org/forum/viewforum.php?f=20)?
>



Evan Miller

Aug 31, 2012, 3:20:06 PM
to Richard O'Keefe, cos...@cs.otago.ac.nz, Erlang Questions, cos...@mailhub2.otago.ac.nz
On Fri, Aug 31, 2012 at 12:42 PM, Evan Miller <emmi...@gmail.com> wrote:
> Did anyone else ever play Marble Drop from Maxis in the late 90s? That
> is the kind of interface I would like to see for the Erlang run-time.

For those of you who were not in middle school in 1997, here is a
screenshot of Marble Drop:

http://www.mobygames.com/images/shots/l/168377-marble-drop-windows-screenshot-one-of-the-many-puzzless.jpg

The game was to figure out what order to drop colored balls down into
the different funnels to achieve a desired ordering of balls at the
bottom, but every drop would (predictably) change the internal state
of the level as the marble passed by. I think of the colored balls as
being like Erlang messages, the funnels as being processes that
receive messages from the outside, the gates and switches as being
various bits of internal state, and the trays at the bottom being
assertions about the final state of the application.

I don't believe in "visual programming" per se but with the right
interface I think "visual debugging" would make Erlang programs much
more comprehensible (and fun).

Evan

Per Melin

Sep 2, 2012, 10:55:47 AM
to Evan Miller, Erlang Questions
On Aug 31, 2012, at 20:57 , Evan Miller wrote:

> A few more comments specific to TinyMQ:
>
> On Fri, Aug 31, 2012 at 1:20 AM, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
>> Another documentation failure is that we fail to document what
>> is not there. In TinyMQ, a channel automatically comes into
>> existence when you try to use it. Perhaps as a consequence of
>> this, there is no way to shut a channel down. In TinyMQ, old
>> messages are not removed from a channel when they expire, but
>> the next time someone does a 'subscribe' (waves hands) or a 'poll'
>> or a 'push' *after* they expire. So if processes stop sending
>> and requesting messages to some channel, the last few messages,
>> no matter how large, may hang around forever. I'm sure there
>> is a reason, but because it's a reason for something *not* being
>> there, there's no obvious place to hang the comment, and there
>> isn't one. (Except for the dead 'expire' clause mentioned above.)
>>
>> IT'S HARD TO SPOT SALIENT DETAIL IN A SEA OF GLUE CODE.
>
> This probably proves your point, but just to be clear old messages
> will only "hang around forever" if 1. there are no new messages on a
> channel and 2. the channel continues to receive requests. If there is
> no activity on a channel for max_age, the gen_server timeout is
> invoked here:
>
> https://github.com/evanmiller/tinymq/blob/master/src/tinymq_channel_controller.erl#L76

There must be a bug there. You only specify a timeout in the return of init, but that does not persist beyond the first message received. I can't see how that timeout would ever be invoked.


> That calls "expire" (erasing the reference to the channel) and then
> exits the supervisor (eliminating the channel and all of its
> messages).

If you by that mean that it kills the supervisor, it does not. It kills itself, with the supervisor pid as its exit message.
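The distinction between the two forms of exit can be shown in a few lines. This is only a sketch with plain processes standing in for the channel and its supervisor; the module name exit_demo is made up for illustration, and it says nothing about what the right shutdown call for TinyMQ would be.

```erlang
-module(exit_demo).
-export([run/0]).

%% exit/1 terminates the *calling* process; its argument is only the reason.
%% exit/2 sends an exit signal to *another* process.
run() ->
    %% Stand-in for "the supervisor": a process that just waits.
    Target = spawn(fun() -> receive stop -> ok end end),

    %% Case 1: exit(Target) ends the caller, with the pid as exit reason.
    {Caller, Ref1} = spawn_monitor(fun() -> exit(Target) end),
    Reason = receive {'DOWN', Ref1, process, Caller, R} -> R end,
    TargetSurvived = is_process_alive(Target),

    %% Case 2: exit(Target, kill) really does terminate the target.
    Ref2 = erlang:monitor(process, Target),
    exit(Target, kill),
    receive {'DOWN', Ref2, process, Target, killed} -> ok end,

    {Reason =:= Target, TargetSurvived, is_process_alive(Target)}.
```

Calling exit_demo:run() returns {true, true, false}: the first caller died with the pid as its reason while the target lived on, and only the two-argument form took the target down.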

Either way, why do you create a new supervisor for each channel, and why do you create it outside the supervisor tree (i.e. from a gen_server and not another supervisor)? What purpose does it serve?

Evan Miller

Sep 2, 2012, 11:12:19 AM
to Per Melin, Erlang Questions
On Sun, Sep 2, 2012 at 9:55 AM, Per Melin <per....@gmail.com> wrote:
> On Aug 31, 2012, at 20:57 , Evan Miller wrote:
>
>> A few more comments specific to TinyMQ:
>>
>> On Fri, Aug 31, 2012 at 1:20 AM, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
>>> Another documentation failure is that we fail to document what
>>> is not there. In TinyMQ, a channel automatically comes into
>>> existence when you try to use it. Perhaps as a consequence of
>>> this, there is no way to shut a channel down. In TinyMQ, old
>>> messages are not removed from a channel when they expire, but
>>> the next time someone does a 'subscribe' (waves hands) or a 'poll'
>>> or a 'push' *after* they expire. So if processes stop sending
>>> and requesting messages to some channel, the last few messages,
>>> no matter how large, may hang around forever. I'm sure there
>>> is a reason, but because it's a reason for something *not* being
>>> there, there's no obvious place to hang the comment, and there
>>> isn't one. (Except for the dead 'expire' clause mentioned above.)
>>>
>>> IT'S HARD TO SPOT SALIENT DETAIL IN A SEA OF GLUE CODE.
>>
>> This probably proves your point, but just to be clear old messages
>> will only "hang around forever" if 1. there are no new messages on a
>> channel and 2. the channel continues to receive requests. If there is
>> no activity on a channel for max_age, the gen_server timeout is
>> invoked here:
>>
>> https://github.com/evanmiller/tinymq/blob/master/src/tinymq_channel_controller.erl#L76
>
> There must be a bug there. You only specify a timeout in the return of init, but that does not persist beyond the first message received. I can't see how that timeout would ever be invoked.
>

Perhaps I am confused, but I thought the gen_server timeout will also
occur between messages.

>
>> That calls "expire" (erasing the reference to the channel) and then
>> exits the supervisor (eliminating the channel and all of its
>> messages).
>
> If you by that mean that it kills the supervisor, it does not. It kills itself, with the supervisor pid as its exit message.

Thanks, that is a bug in the code.

>
> Either way, why do you create a new supervisor for each channel, and why do you create it outside the supervisor tree (i.e. from a gen_server and not another supervisor)? What purpose does it serve?
>

The gen_server does "double duty" as a controller and a supervisor;
this is poor design on my part. There should probably be a single
supervisor process for all the channels, monitored by the same
supervisor that monitors the main controller.
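
One way the "single supervisor for all the channels" idea might look is a simple_one_for_one supervisor that starts one child per channel on demand. This is a sketch, not TinyMQ code: the module name channel_sup, the function names, and the trivial channel_loop/1 process are all hypothetical, and a real channel would be a gen_server rather than a bare receive loop.

```erlang
-module(channel_sup).
-behaviour(supervisor).
-export([start_link/0, start_channel/1, start_channel_proc/1]).
-export([init/1]).

%% One registered supervisor for every channel. It is meant to be
%% started as a child of the application's top-level supervisor.
start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Ask the supervisor to start a channel process for Name.
start_channel(Name) ->
    supervisor:start_child(?MODULE, [Name]).

init([]) ->
    %% simple_one_for_one: all children share one spec and are added
    %% dynamically; [Name] is appended to the start arguments.
    ChildSpec = {channel,
                 {?MODULE, start_channel_proc, []},
                 temporary, brutal_kill, worker, [?MODULE]},
    {ok, {{simple_one_for_one, 5, 10}, [ChildSpec]}}.

%% Hypothetical stand-in for a channel process, linked to the supervisor.
start_channel_proc(Name) ->
    {ok, spawn_link(fun() -> channel_loop(Name) end)}.

channel_loop(Name) ->
    receive
        stop -> ok;
        _Other -> channel_loop(Name)
    end.
```

With this shape the main controller only has to call channel_sup:start_channel(Name) and remember the returned pid; it no longer starts or owns any supervisors itself.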

Per Melin

Sep 2, 2012, 11:41:54 AM
to Evan Miller, Erlang Questions
On Sep 2, 2012, at 17:12 , Evan Miller wrote:

>>> This probably proves your point, but just to be clear old messages
>>> will only "hang around forever" if 1. there are no new messages on a
>>> channel and 2. the channel continues to receive requests. If there is
>>> no activity on a channel for max_age, the gen_server timeout is
>>> invoked here:
>>>
>>> https://github.com/evanmiller/tinymq/blob/master/src/tinymq_channel_controller.erl#L76
>>
>> There must be a bug there. You only specify a timeout in the return of init, but that does not persist beyond the first message received. I can't see how that timeout would ever be invoked.
>>
>
> Perhaps I am confused, but I thought the gen_server timeout will also
> occur between messages.

The timeout only persists until it fires or a message comes in. The timer is not restarted by a message; it is discarded. So you'll need to supply the timeout duration again in each and every reply/noreply tuple from handle_call, handle_cast and handle_info.
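
A minimal sketch of what that looks like in practice, assuming a server whose whole state is the idle period in milliseconds. The module name idle_server and the ping call are made up for illustration; the point is only that every return tuple carries the timeout again.

```erlang
-module(idle_server).
-behaviour(gen_server).
-export([start/1, ping/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start(IdleMs) -> gen_server:start(?MODULE, IdleMs, []).
ping(Pid) -> gen_server:call(Pid, ping).

init(IdleMs) ->
    %% Armed only until the first message arrives.
    {ok, IdleMs, IdleMs}.

handle_call(ping, _From, IdleMs) ->
    %% Re-arm. Returning {reply, pong, IdleMs} here would silently
    %% leave the server with no timeout at all.
    {reply, pong, IdleMs, IdleMs}.

handle_cast(_Msg, IdleMs) ->
    {noreply, IdleMs, IdleMs}.

handle_info(timeout, IdleMs) ->
    %% No message for IdleMs milliseconds: shut down.
    {stop, normal, IdleMs};
handle_info(_Other, IdleMs) ->
    {noreply, IdleMs, IdleMs}.
```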

I find this inconvenient and very error-prone, though, and normally use erlang:send_after(Time, self(), timeout) instead. Of course, you'll then need to keep track of the timer reference and reset the timer yourself.
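
A sketch of the send_after variant, written as a plain process to keep it short; the same bookkeeping would sit in a gen_server's state and handle_info. The module name idle_timer, the touch message, and the idle_timeout atom are assumptions made for the example.

```erlang
-module(idle_timer).
-export([start/1, touch/1]).

%% Start a process that ends itself after IdleMs ms without a touch.
start(IdleMs) ->
    spawn(fun() -> loop(IdleMs, arm(IdleMs)) end).

touch(Pid) ->
    Pid ! touch,
    ok.

%% Arm the timer and return its reference so it can be cancelled later.
arm(IdleMs) ->
    erlang:send_after(IdleMs, self(), idle_timeout).

loop(IdleMs, TRef) ->
    receive
        touch ->
            erlang:cancel_timer(TRef),
            %% The timer may have fired just before the cancel; drop
            %% any idle_timeout that is already in the mailbox.
            receive idle_timeout -> ok after 0 -> ok end,
            loop(IdleMs, arm(IdleMs));
        idle_timeout ->
            ok
    end.
```

The cost compared with the built-in timeout is the reference kept in the loop state and the explicit cancel and flush on every message.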

Richard O'Keefe

Sep 2, 2012, 7:23:54 PM
to Evan Miller, cos...@cs.otago.ac.nz, Erlang Questions, cos...@mailhub2.otago.ac.nz

On 1/09/2012, at 5:42 AM, Evan Miller wrote:

> Richard,
>
> Thanks for your comments. To preface, I plead guilty to charges of
> gross negligence in failing to document TinyMQ's internals. This was
> laziness on my part.

I do hope that I did not give offence.
There would not have been any point in criticising something *bad*.
I picked on your code for three reasons:
- I had just downloaded it and spent some time reading it carefully.
- It was small enough that I *could* read it carefully in full.
- The problems I had doing that are ones that I have had over and
over again, and the biggest of them is nothing to do with you.

> I released TinyMQ only because I felt guilty for sitting on the code
> for about a year. Like many open-source programmers, I have a lot of
> demands on my attention, and it is not clear in advance what
> documentation is actually worth writing.

Limitations and examples.

> The @spec and @doc strings
> for the public API seemed like a good start. But if it turned out that
> no one was interested in using the library in the first place, why
> should I bother documenting internal protocols and data structures?

All I can say is that the best programmers I have worked with
wrote the documentation *first*. It's part of the "research" phase
that Joe Armstrong mentioned. It's how you get quality code.

As Dijkstra once meant but said differently, "I am a bear of very
little brain", and the first person I have to explain the code to
is myself.

> I've wasted many hours in the past documenting, refactoring, and
> generally cleaning up application internals for the benefit of
> nebulous "others", only to receive zero patches and no indication that
> any of my efforts were of any assistance to anyone.
>
> So in the spirit of your capitalized complaints, I will just say:
>
> ALL YOU HAVE TO DO IS ASK
>
> Want to know about the big-O performance characteristics?

But this is where you *START* the design!

If it's a well known data structure, you can get away with leaving
it to the books, but sliding windows, while common, are _not_ a
well known data structure. (Well, ones where the window is a *count*
of elements are tolerably well known, but ones where the window is
an *age* are not. I asked about this in the Haskell mailing list,
and Chris Okasaki -- Mr Functional Data Structures himself -- didn't
know of anything off-the-shelf, although he did suggest something
that might just work. He also raised an interesting question about
the interface.)
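
For concreteness, here is one naive way an age-based window could be written. It is a sketch only: it is neither TinyMQ's implementation nor the suggestion mentioned above, the module name age_window is made up, and it assumes timestamps are plain numbers that never decrease.

```erlang
-module(age_window).
-export([new/1, push/3, messages/2]).

%% A window is {MaxAge, Queue}. The queue holds {Message, Timestamp}
%% pairs with the oldest at the front.
new(MaxAge) ->
    {MaxAge, queue:new()}.

%% Add a message stamped Now, discarding entries older than MaxAge.
push(Msg, Now, {MaxAge, Q}) ->
    {MaxAge, queue:in({Msg, Now}, prune(Now - MaxAge, Q))}.

%% Messages still inside the window at time Now, oldest first.
messages(Now, {MaxAge, Q}) ->
    [M || {M, _T} <- queue:to_list(prune(Now - MaxAge, Q))].

%% Drop entries from the front while they are older than Cutoff.
prune(Cutoff, Q) ->
    case queue:peek(Q) of
        {value, {_M, T}} when T < Cutoff -> prune(Cutoff, queue:drop(Q));
        _ -> Q
    end.
```

Each entry is added once and dropped at most once, so push is amortised constant time; expired entries are only discarded when push or messages is next called.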

> Just ask.

By the time I've asked and got a reply, seeing as we're in very
different time zones, I'd have written my own.

Last year I was involved in a project where I had to "just ask".
The turnaround time was a week at best, and about half of my
questions never did get answered.

> Now, I know you were not trying to pick on TinyMQ, and your interest
> is more in how Erlang tends to result in lumps of code that obscure
> key characteristics of the application. I agree with the assessment,
> but I am not quite as hopeless about the situation.
>
> I would like to see the development of graphical tools that let you
> see in an instant how applications are structured and how they behave.
> I am thinking of something like Pman on steroids, where I can *watch*
> messages travel between processes, *inspect* gen_server state, and
> *test* the system by seeing the result of single function calls or
> many (load-testing). I'd like to be able to do all this with my mouse,
> and generally get the feeling that I am watching the operation of a
> machine that *shows* me how messages are passed, processes are
> created, and state is updated.

Something very like such graphical tools already exists;
I recall viewing some videos someone had made using Ubigraph
showing the dynamic process structure of Erlang programs.
What I want, though, is to see the structure of something without
having to run it or even compile it.

I'm thinking more in terms of some sort of preprocessor taking
something visibly and obviously a server process and emitting the
lower-level Erlang code. The thing is that I believe that it is
better to make something really obvious at the outset than to
provide better tools for undoing obfuscation.



Oh, let's put this all in perspective.

Apart from my general complaint that all Erlang code tends to
look alike (just like all Smalltalk or Prolog or Haskell or Lisp
code tends to look alike), the internal documentation
that I'd have liked to see is this:

TinyMQ is a gen_server guarding a dictionary mapping
channel names (normally strings) to channels, where
a channel is a supervised gen_server guarding a
list of subscribers and a list of {Message,Timestamp}
pairs. Messages expire after MaxAge seconds but are
not removed until the next time the channel does something.

There's an hour's close reading time saved right there.

Steve Davis

Sep 3, 2012, 3:33:19 PM
to erlang-pr...@googlegroups.com, Erlang Questions, cos...@cs.otago.ac.nz, cos...@mailhub2.otago.ac.nz, o...@cs.otago.ac.nz
I think I have said before that the greatest missing piece in code maintainability is not type safety but documentation of intent.