On Thu, May 10, 2012 at 8:10 AM, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
[...]
> Joe's little document made it clear that
> (a) there are a lot more X registers allowed in Erlang than in
> Quintus Prolog;
> (b) maintaining them is more expensive than in Quintus Prolog;
> (c) the nilling instructions I expected don't exist;
> (d) there is a (temporary) space leak: if register K is live
> at an allocation point, all registers <= K are assumed to
> be live.
There is not a temporary space leak in practice, because the compiler will insert an instruction that will clear (set to NIL) each dead register below K before any allocation point.
On Thu, May 10, 2012 at 9:59 AM, Joe Armstrong <erl...@gmail.com> wrote:
> Now there are two levels at which one could describe the Beam - level one
> is the relationship between erlang code and the beam instructions -
> this is what I described.
> To describe the next level - we suddenly jump from a one chapter
> description to a
> entire book. This is a book that is tricky to write - I guess no one
> person knows
> all the answers.
This is not entirely on topic, but close enough (and probably only
Björn could know the answer; I have a feeling that reading the code
would require many years of study :-)): I have been wondering how
tightly the runtime is tied to the BEAM instruction set.
More precisely, would it be possible (at least conceptually) to
separate the scheduler + process manager + message passing
infrastructure from the instruction executor?
One common thing is the internal data format (because the runtime
creates for example exit messages) so there are restrictions imposed
by that, but it would be kind of cool to be able to plug in an
interpreter for a different instruction set, in case one wants to use
something other than Erlang and this other thing can't be compiled to
Erlang, Core Erlang or BEAM.
On Monday, May 07, Jonathan Coveney wrote:
> This question seems to come up now and again, and it's surprising to me
> that a crucial part of the documentation isn't better documented.
I began writing about the instruction set and possibly binary format at
some point, but got sidetracked by trying to explain "why isn't this
documented, and is that a good or a bad thing?"
(I have a bit of text and graphics, though, which perhaps I ought to
publish.)
One interesting thing came out of that effort, though: I noticed that
- there is no tail-call version of the call_fun (function object
application) instruction
- even so, TCO still works in beam as it should
- the reason for this is that a tail-call version exists in the *internal*
version of the instruction set, which is introduced by rewriting the
sequence "call_fun; deallocate; return" (iirc)
- Erjang had, of course, missed this the first time around, so I corrected
it once I'd found out.
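A minimal sketch of the kind of load-time rewrite described in the list above, assuming a flat array of decoded instructions. The opcode names (including OP_CALL_FUN_LAST), the struct layout and the function name are hypothetical and chosen for illustration; they are not taken from the real loader. Jump targets, which a real rewrite would have to adjust, are ignored here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for illustration; not the real BEAM numbering. */
enum opcode { OP_CALL_FUN, OP_DEALLOCATE, OP_RETURN, OP_MOVE, OP_CALL_FUN_LAST };

struct instr {
    enum opcode op;
    int a;   /* first operand: arity for call_fun, frame size for deallocate */
    int b;   /* second operand, used only by the fused instruction */
};

/*
 * Rewrite every "call_fun A; deallocate N; return" triple into the single
 * internal instruction "call_fun_last A N". The rewrite is done in place
 * and the new length of the instruction sequence is returned.
 */
size_t fuse_tail_call_fun(struct instr *code, size_t len)
{
    size_t in = 0, out = 0;

    while (in < len) {
        if (in + 2 < len &&
            code[in].op == OP_CALL_FUN &&
            code[in + 1].op == OP_DEALLOCATE &&
            code[in + 2].op == OP_RETURN) {
            /* Build the fused instruction before overwriting anything. */
            struct instr fused = { OP_CALL_FUN_LAST, code[in].a, code[in + 1].a };
            code[out++] = fused;
            in += 3;
        } else {
            code[out++] = code[in++];
        }
    }
    return out;
}
```

A consumer of beam code that executes the external instruction set directly would need an equivalent step, otherwise the stack frame stays allocated across the call and the tail call is not constant-space.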
I've been wondering whether this special case was caught in your
description of beam...
Of course, what I should have been wondering is whether the JS interpreter
had got it right :-)
That kind of irregularity is probably not unrelated to the unpublishedness
of the format.
On the other hand, this is one gotcha that affects not only would-be
producers of beam code (probably the harder part to get right, when you
don't know which invariants you have to maintain in the produced code)
but also beam consumers, which have to do the same kind of rewrite
(consuming is in some ways easier, at least when you can ignore
everything GC-related, as was the case with Erjang).
> ---------- Forwarded message ----------
> From: Joe Armstrong <erl...@gmail.com>
> Date: Mon, May 7, 2012 at 10:46 AM
> Subject: Re: [erlang-questions] Is there a good source for
> documentation on BEAM?
> To: Jonathan Coveney <jcove...@gmail.com>
> Hi,
> I did start writing a description but it's not very complete.
> This is on my list of things-to-do-one-day-when-you-get-time
> "As for what I see would cause a slowdown: the attention of the key
> hackers would be spent on writing this
> documentation (and then maintaining it, I assume)."
> Perhaps better: volunteers could document it (on a relatively
> controlled wiki, for example). Then the "key hackers" could mention
> any needed corrections.
This is certainly possible, and it would be a lot better than nothing.
The problem is of course that it goes backwards:
volunteers can document what *is* there, not what was *meant* to
be there, and cannot document why certain things that *aren't*
there shouldn't be.
Peitho: Here you are, Socrates, your very own orrery.
Socrates: But Peitho, where is the description of how it works?
Peitho: When you've figured it out, why don't you write that?
Socrates: [censored]
On 10/05/2012, at 8:28 PM, Björn Gustavsson wrote:
> On Thu, May 10, 2012 at 8:10 AM, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
> [...]
>> Joe's little document made it clear that
>> (a) there are a lot more X registers allowed in Erlang than in
>> Quintus Prolog;
>> (b) maintaining them is more expensive than in Quintus Prolog;
>> (c) the nilling instructions I expected don't exist;
>> (d) there is a (temporary) space leak: if register K is live
>> at an allocation point, all registers <= K are assumed to
>> be live.
> There is not a temporary space leak in practice, because the compiler
> will insert an instruction that will clear (set to NIL) each dead register
> below K before any allocation point.
And *THAT* is precisely the kind of BEAM documentation that
volunteers will not be able to reconstruct without a disproportionate
amount of effort:
This operand gives the number of the highest live
X register, so that if a garbage collection is needed,
the collector knows which registers to trace from.
If any X registers with smaller numbers are dead at
this point, the compiler MUST ensure that they contain
immediate values, by nilling them if necessary.
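The invariant stated above can be sketched in C. This is an illustration only: in the real system the compiler emits the clearing instructions ahead of time, whereas this sketch applies the same rule at run time over a liveness table. The term type, the NIL_TERM value and the function name are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uintptr_t term;

/* Placeholder immediate value; the real tagged representation of NIL
 * is not assumed here. */
#define NIL_TERM ((term)0x3B)

/*
 * highest_live: number of the highest live X register at the allocation
 *               point (the operand described in the text).
 * is_live[i]:   what liveness analysis says about register x[i].
 *
 * Every dead register below highest_live is overwritten with an immediate,
 * so a collector that traces x[0..highest_live] never follows a stale
 * pointer and never keeps dead data alive.
 */
void nil_dead_below(term *x, const bool *is_live, int highest_live)
{
    for (int i = 0; i < highest_live; i++) {
        if (!is_live[i])
            x[i] = NIL_TERM;
    }
}
```

Registers above highest_live are left alone, since the collector does not look at them.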
Couldn't some of the bootstrap Perl scripts like beam_makeops and make_tables be
rewritten and documented in Erlang? I think it would make things more obvious if
they were not obscure Perl scripts without comments. Furthermore it would make
Erlang/OTP eat more of its own dog food.
The only thing that would need to be changed with regard to the bootstrap itself
is that their output would have to be versioned just as the erts/preloaded/ BEAM
files. A new command should also be added to otp_build to update them.
> Let me illustrate the Icon approach by showing you a fragment of the
> micro-BEAM I wrote to get the performance numbers in the frames proposal.
> (The whole thing is fragmentary.)
> This computes the maximum using the micro-Erlang term ordering.
> If src and snd are tagged immediate integers the comparison is
> done inline; the compare() function is called otherwise.
> @c
> T = @src;
> U = @snd;
> @dst = cmp(T, >, U) ? T : U;
> @step;
> @e
> ...
> @i
> check_record src, size, const
> @d
> Type test.
> Fail unless src is tagged as a pointer to a tuple or frame,
> the first word it points to is size, and the second is the
> const (which must be an atom, but we don't check that).
> Used for record matching.
> @c
> T = @src;
> if (!is_tuple(T)) @fail "is_record"; else
> if (FIELD(T, TUP_TAG, 0) != @size) @fail "is_record"; else
> if (FIELD(T, TUP_TAG, 1) != @const) @fail "is_record"; else
> @step;
> @e
> ...
> There is a preprocessor written in AWK that turns this into
> several C files. One of them is the emulator cases. For
> the check_record instruction you get
> #line 75 "frame.master"
> case CHECK_RECORD:
> #line 76 "frame.master"
> T = reg[(int)P[1]];
> #line 77 "frame.master"
> if (!is_tuple(T))
> P = failure, operation = "is_record"; else
> #line 78 "frame.master"
> if (FIELD(T, TUP_TAG, 0) != 4)
> P = failure, operation = "is_record"; else
> #line 79 "frame.master"
> if (FIELD(T, TUP_TAG, 1) != P[3])
> P = failure, operation = "is_record"; else
> #line 80 "frame.master"
> P += 4;
> break;
> where I've broken the long lines (the preprocessor doesn't).
> The #line directives are optional.
> @i introduces an instruction; the next line is a template
> for it saying what the operands are.
> @d introduces the description for people.
> @c introduces the code. In it, various built-in @macros
> are expanded.
> One advantage of doing it this way is that by using
> @step to update the PC I *cannot* get the offset wrong;
> the preprocessor counted the operands and their sizes
> for me. Similarly, what I write has *no* operand numbers;
> the preprocessor counted those, and supplies all necessary
> casts as well. I can shuffle operands around (in @i)
> without revising the code (in @c), and have.
> It wouldn't be too hard to write another preprocessor that
> built some kind of documentation (HTML would probably be
> easiest) out of this, but since this was an experiment,
> it didn't seem worth while.
> Why did I write the preprocessor?
> Well, to be honest, the first draft didn't use one.
> I got a bit sick of debugging, and wrote the preprocessor
> (based on vague memories of Icon) to eliminate a class of
> errors. It turned out to be _easier_ to develop a
> documented emulator than an undocumented one.
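The operand counting that the quoted message describes can be sketched as follows. This is not the AWK preprocessor itself, only a minimal C illustration of the idea: given the template line that follows @i, find the position of a named operand and the size of the step for the PC. It assumes every operand occupies exactly one word, which simplifies the "operands and their sizes" bookkeeping mentioned in the message; the function names are invented for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Skip the instruction name and return a pointer to the operand list. */
static const char *operand_list(const char *tmpl)
{
    while (*tmpl && isspace((unsigned char)*tmpl)) tmpl++;
    while (*tmpl && !isspace((unsigned char)*tmpl)) tmpl++;
    return tmpl;
}

/* Return the 1-based position of operand `name` in the template,
 * or 0 if the template has no such operand. */
int operand_index(const char *tmpl, const char *name)
{
    const char *p = operand_list(tmpl);
    size_t want = strlen(name);
    int index = 0;

    while (*p) {
        while (*p && (isspace((unsigned char)*p) || *p == ',')) p++;
        if (!*p) break;
        const char *start = p;
        while (*p && !isspace((unsigned char)*p) && *p != ',') p++;
        index++;
        if ((size_t)(p - start) == want && strncmp(start, name, want) == 0)
            return index;
    }
    return 0;
}

/* Words the PC must advance: one opcode word plus one per operand. */
int step_size(const char *tmpl)
{
    const char *p = operand_list(tmpl);
    int count = 0;

    while (*p) {
        while (*p && (isspace((unsigned char)*p) || *p == ',')) p++;
        if (!*p) break;
        while (*p && !isspace((unsigned char)*p) && *p != ',') p++;
        count++;
    }
    return count + 1;
}
```

For the template "check_record src, size, const" this gives position 1 for src and a step of 4, matching the P[1] and "P += 4" in the generated case shown in the message. Because both numbers are derived from the template, reordering the operands there cannot leave a stale offset in the code.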
On Mon, 2012-05-14 at 15:02 +0200, Anthony Ramine wrote:
> Couldn't some of the bootstrap Perl scripts like beam_makeops and make_tables be
> rewritten and documented in Erlang? I think it would make things more obvious if
> they were not obscure Perl scripts without comments. Furthermore it would make
> Erlang/OTP eat more of its own dog food.
> The only thing that would need to be changed with regard to the bootstrap itself
> is that their output would have to be versioned just as the erts/preloaded/ BEAM
> files. A new command should also be added to otp_build to update them.
> There may be an obvious reason for them not to be generated by Erlang itself but
> I'm not aware of it.
> Regards.
> --
> Anthony Ramine
> On 9 May 2012, at 01:58, Richard O'Keefe wrote:
> > [...]
There is already a chicken-and-egg problem; that's why there are some BEAM files
in the Git repository (look in erts/preloaded/ebin).
Nothing prevents us from generating these C files from some Erlang code and
versioning them in Git too. This way the system would be bootstrapped from the
previously generated files in the repository.
> Perhaps there is a chicken-and-egg problem with requiring Erlang to
> generate files used to build Erlang?
> bengt
> [...]
> There is already a chicken-and-egg problem, that's why there are some BEAM
> files in the Git repository, look in erts/preloaded/ebin.
> Nothing prevents us from generating these C files from some Erlang code and
> versioning them in Git too. This way the system would be bootstrapped from
> the previously generated files in the repository.
> --
> Anthony Ramine
> On 14 May 2012, at 15:08, Bengt Kleberg wrote:
> > Greetings,
> > Perhaps there is a chicken-and-egg problem with requiring Erlang to
> > generate files used to build Erlang?
> > bengt
> > [...]
The reason for not having documentation of the BEAM instructions is just a
matter of prioritization, nothing else.
Given a limited amount of resources, we have not focused much on the
internal design documentation, since we have
enough designers who know this.
I don't think documentation of these things is so far away now (but no
promise).
I also want to comment on the stability and compatibility of the .beam
file format and thus the instruction set.
We always keep backwards compatibility with 2 major releases. This means
that, for example, an R15B-based system
can load and run .beam files produced with an R13B-based system. In
practice this means 3-4 years of compatibility.
The other way around, i.e. loading .beam files produced with R15B on an
older system, is not supported.
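The compatibility rule above can be written down as a small predicate. This is a sketch of the stated support policy only, not of what the loader actually checks; representing a release such as R13B by the integer 13 and the function name are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of older major releases whose .beam files are still supported. */
#define COMPAT_WINDOW 2

/*
 * file_release:    major release that compiled the .beam file (R13B -> 13).
 * runtime_release: major release of the system loading it (R15B -> 15).
 *
 * Returns true when loading falls inside the supported window: the file
 * must not be newer than the runtime, and at most two releases older.
 */
bool can_load(int file_release, int runtime_release)
{
    if (file_release > runtime_release)
        return false;
    return runtime_release - file_release <= COMPAT_WINDOW;
}
```

With these definitions can_load(13, 15) holds, while can_load(15, 13) does not, which mirrors the R13B/R15B example in the text.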
The sparse documentation of Erlang internals can also be seen as an
opportunity. I'm not sure this needs to be done by the OTP
team at Ericsson.
The first opportunity is for a 1-3 day tutorial about Erlang internals
during the Erlang conferences (probably as part of Erlang University).
I would book a training like this as soon as it is offered alongside a
conference I can attend. (Are you listening, Erlang Solutions?)
The second opportunity is for a book author. Perfect would be a book
about Erlang internals like TCP/IP Illustrated Vol 2. The
Implementation (http://www.amazon.com/TCP-IP-Illustrated-Vol-Implementation/dp/020163...)
where basically every line of code of the BSD TCP/IP implementation is
explained.
I think the people who teach the tutorial or write the book
need not necessarily be members of the OTP team but should rather be
great teachers/writers. They would need good access to the people who
have the internals knowledge, however.
With this we can have Erlang internals knowledge (I'm not only talking
about beam) more spread and the OTP team can concentrate on making
their great new releases.
----- Original Message -----
> From: Richard O'Keefe <o...@cs.otago.ac.nz>
> To: Thomas Lindgren <thomasl_erl...@yahoo.com>
> Cc: Michael Turner <michael.eugene.tur...@gmail.com>; "erlang-questi...@erlang.org" <erlang-questi...@erlang.org>
> Sent: Thursday, May 10, 2012 8:10 AM
> Subject: Re: [erlang-questions] Is there a good source for documentation on BEAM?
> (1) We were told that BEAM documentation isn't needed because
> there are other Erlang implementations.
> (2) I ask whether any of those other implementations ever kept
> up with The Real Thing. (By the way, as far as I know,
> none of them ever supported bit syntax, and my recent
> attempt to install GERL failed miserably.)
> (3) Suddenly we are told that the abandoning of those other
> things just *proves* that we don't need BEAM documentation.
> ?
It looks like this discussion has terminated, and I think I've made whatever points I wanted to be made so I'll leave it at that. But since this seemed to be unclear, let me recapitulate:
BEAM docs are not needed to produce a second source implementation, as shown by several examples. (1)
Also, there has so far been little practical interest shown by Erlang users in such a second source. So implementation efforts may be in vain. (3)
My personal view, at least, is that most of the difficulty in "keeping up with The Real Thing", Erlang/OTP, would be not in reproducing BEAM but in writing a fully compatible implementation tracking the rest of the runtime, ERTS. (2)
So, is there a _practical_ case for doing these docs? In particular, will the effort result in useful external contributions that outweigh time spent? Not at all clear to me.
> BEAM docs are not needed to produce a second source implementation, as shown by several examples. (1)
People were not asking for BEAM documentation in order to produce
a second source implementation. That's a red herring. People
ask for BEAM documentation in order to understand and perhaps
revise the main implementation.
One thing that Open Source is about is community contributions,
and it is awfully hard to contribute to something that is
under-documented.
> Also, there has so far been little practical interest shown by Erlang users in such a second source. So implementation efforts may be in vain. (3)
Again, the red herring. This is *NOT* about second implementations.
All of the additional implementations I've ever heard of used their
own back end, *not* BEAM, and would *not* have received any great
benefit from BEAM being documented.
There is detailed documentation available for Lua and Icon and the
WAM and several other VMs around, so there's not even any great
advantage in learning about VMs from BEAM.
> My personal view, at least, is that most of the difficulty in "keeping up with The Real Thing", Erlang/OTP, would be not in reproducing BEAM but in writing a fully compatible implementation tracking the rest of the runtime, ERTS. (2)
Well, not _just_ that, but that's a big part of it.
> So, is there a _practical_ case for doing these docs? In particular, will the effort result in useful external contributions that outweigh time spent? Not at all clear to me.
Bearing in mind that "BEAM documentation" and "second system implementation"
are about as irrelevant to each other as any two topics about the same
language can be, what _might_ we realistically expect from BEAM documentation?
(1) There are now several languages implemented on top of Erlang.
Some are interpreted in Erlang, some are compiled to Erlang ASTs,
and some are compiled to BEAM. Compiling to the BEAM is *much*
harder than it needs to be because of the current undocumentation.
This isn't *replacing* Erlang/OTP but *augmenting* it.
(2) I've proposed a number of minor but handy extensions to Erlang
syntax that lack a reference implementation because I have no idea
what the compiler should generate for them; these could use the
existing BEAM.
(3) There are existing things that could be compiled more efficiently.
I'm thinking here in particular of list comprehension. I'm not
sure if the improved translation can be done with the existing
BEAM or if minor extensions would be needed, because the
undocumentation for the BEAM does not make clear any range
limits or other restrictions on BEAM instructions.
(4) There are even bigger changes, like frames, where even estimating
the scope of the changes is hard because of the undocumentation.
But above all you are making an assumption which I utterly reject,
namely that documentation is a COST and ONLY a cost, that producing
documentation provides no DIRECT benefits to the people writing the
documentation.
On the contrary,
- you may find defects as you document
- you may be able to structure the documentation so that
test cases can automatically be extracted
- you may in the very act of explaining a limitation see
how it can be overcome
- you may be able to extract parts of the implementation
automatically from the documentation
and I could go on.
There's a slogan I learned from a business textbook:
find the indispensable man and fire him!
There was something I found unsettling: we were told in this thread that there wasn't any need for documentation because
the Erlang/OTP maintainers had enough people who _knew_ this
stuff.
One of my colleagues here was managing a software project
once; a key employee went on holiday and was murdered in a
far country. A former student of mine was starting up a
new company with me as a consultant, and being a worse
driver than he thought, drove at speed into a tree. The
tree survived. One of my former colleagues, a very
intelligent and likable guy, was cycling down a Melbourne
street and got knocked over by a hit-and-run driver. The
result was head injury and someone who could dress himself
but couldn't program if his life depended on it. When I
was at Quintus, the founder who wrote the compiler and was
the only person who really understood it quit and was never
heard from again.
Fergus O'Brien supervised an MSc on "Organisational
Forgetting" and the fact of organisational forgetting has
haunted the fringes of my mind ever since.
People die. People get head injuries. People quit.
Documents get lost. (If you ever find an architecture-
of-the-ICL-2900 manual, I'd like to see it.) Even if
things _are_ written down, organisations forget _where_.
If they aren't written down, they WILL be forgotten
sooner or later.
People Well, my colleague Andrew Trotman once had a key employee
On Tue, May 29, 2012 at 12:08 AM, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
> On 29/05/2012, at 3:25 AM, Thomas Lindgren wrote:
> [...]
> People die. People get head injuries. People quit.
> Documents get lost. (If you ever find an architecture-
> of-the-ICL-2900 manual, I'd like to see it.) Even if
> things _are_ written down, organisations forget _where_.
> If they aren't written down, they WILL be forgotten
> sooner or later.
Brilliant! and well argued. Thank you Richard. This last argument is
excellent brain-food.
<aside> I like the way many Erlang discussion threads turn into
meta-discussions about the underlying
problems .. great stuff </aside>
So "stuff" should be well documented precisely because one day the creator
of the stuff will get bored or die and the encompassing organisation
will collectively
forget how it works.
I've noticed the "some other guy knows this stuff" phenomenon a lot -
I've been chasing a particular
problem for months. I always get the "I don't know but X knows" - I
ask X and they refer me to X1 etc.
But X1 does not know ... I'm running out of leads. It appears that
*nobody* knows.
Fortunately the BEAM is not yet there. I know who knows - and when I
ask them they *do* know
so I just hope the entire OTP group aren't on the same plane to an
Erlang conference in a far away
land ...
I conclude it should be documented.
The next question is "priority" ... in the real world we juggle
priorities. The case *is* made that
we should document the BEAM but is this more or less important than
implementing frames etc?
As always, a tricky question...
/Joe
> People
> Well, my colleague Andrew Trotman once had a key employee
When there was some documentation long ago, the bulk of it was about
the dozens and dozens of BEAM opcodes, and that's not actually the
important part.
What *really* matters are all the undocumented assumptions in the
system. What are the rules about floating point registers? Which BIFs
can cause garbage collection and how are they treated differently than
non-GC BIFs? When should registers be marked as unused? When are
destructive tuple updates allowed? What inefficient-looking sequences
are improved by the BEAM loader?
I can learn BEAM by disassembling code (anyone who is interested can
use http://prog21.dadgum.com/127.html as a starting point), but I
can't learn the underlying rules and philosophies of the VM.
BEAM documentation can be as simple as:
* One-line-per-instruction description of opcodes.
* Couple of pages of VM architecture and rules.
* Two HOWTO docs: adding a BIF and adding a VM instruction.
* List of transformations done by the loader.
_______________________________________________
erlang-questions mailing list
erlang-questi...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions
> If they aren't written down, they WILL be forgotten
> sooner or later.
I would also like to suggest that the act of writing down a design forces a degree of rigour that is hard to achieve otherwise. I have several times thought that I had considered a design from 'every angle', only to find that when I *write it down*, some of the angles were imperial and some were metric, and some were just smoke and mirrors.
When something is written down, the handwaving stops and the engineering starts.
On Wed, May 30, 2012 at 3:39 PM, james <ja...@mansionfamily.plus.com> wrote:
> I would also like to suggest that the act of writing down a design forces a
> degree of rigour that is hard to achieve otherwise. I have several times
> thought that I had considered a design from 'every angle', only to find that
> when I *write it down*, some of the angles were imperial and some were
> metric, and some were just smoke and mirrors.
> When something is written down, the handwaving stops and the engineering
> starts.
I *completely* agree with this. At my work we *have* to write down our systems, the APIs, interfaces, backend, everything, as we code it. It really does help to solidify how the system should work, and helps us to consider angles that we would not have considered before. Even if we never use the documentation again, the fact that we are forced to write it in *detail* and keep it up to date really is a great boon to the quality of the code. If you do not find problems in your code when writing documentation, then likely your documentation is not detailed enough.
On Mon, May 7, 2012 at 4:47 AM, Joe Armstrong <erl...@gmail.com> wrote:
> I think it works like this:
> 1) first you don't understand how the X works (X=Beam, JVM, X11,
> ... you name it)
> 2) You struggle - and think - google and have a hot bath
> 3) Eureka - bath flows over
> 4) Now you can understand it - and you can also remember why you
> could not understand it
> 5) Now it's easy you understand it
> 6) You see no reason to document it since it's obvious
> Round about 4) there is a small window of opportunity to explain to
> other people how it works.
> Once you get to 6) it's very difficult to remember what it felt like
> at point 2) and consequently difficult
> to write decent documentation.
> /Joe
> On Mon, May 7, 2012 at 11:27 AM, Michael Turner
> <michael.eugene.tur...@gmail.com> wrote:
>> "Actually, I don't think such docs are all _that_ crucial -- who
>> really needs to know, except a small number of VM implementors?"
>> Aren't Erlang's chances of greater mindshare improved by making it
>> easier to become a VM implementor? I doubt very much that Java would
>> be where it is today had it not been for clear VM specification.
>> That's not to say that Erlang should follow in all of Java's
>> footsteps, even if it could. But I have to say I was boggled to
>> learn that you can't find out what the VM opcodes mean without reading
>> the source (and maybe not even then, if the source contains bugs
>> vis-a-vis some idealized machine model.)
>> -michael turner
>> On Mon, May 7, 2012 at 5:46 PM, Thomas Lindgren
>> <thomasl_erl...@yahoo.com> wrote:
>>>>________________________________
>>>> From: Jonathan Coveney <jcove...@gmail.com>
>>>>To: erlang-questi...@erlang.org
>>>>Sent: Monday, May 7, 2012 8:39 AM
>>>>Subject: [erlang-questions] Is there a good source for documentation on BEAM?
>>>>This question seems to come up now and again, and it's surprising to me that a crucial part of the documentation isn't better documented. Is there a reason that it is the case? Is the reason that there is no VM spec to give the devs the flexibility to change the intermediate layer without having to worry about backwards compatibility to the degree that Java does?
>>> Actually, I don't think such docs are all _that_ crucial -- who really needs to know, except a small number of VM implementors? (And they should read the source to get at all the goodies.) But perhaps someone on the list might be moved to do a tutorial presentation on an Erlang Factory or something?
>>> (By the way, I too assume not doing it is to avoid getting bogged down into minutiae.)
On Wed, May 30, 2012 at 11:39 PM, james <ja...@mansionfamily.plus.com> wrote:
>> If they aren't written down, they WILL be forgotten
>> sooner or later.
> I would also like to suggest that the act of writing down a design forces a
> degree of rigour that is hard to achieve otherwise. I have several times
> thought that I had considered a design from 'every angle', only to find that
> when I *write it down*, some of the angles were imperial and some were
> metric, and some were just smoke and mirrors.
I agree 100% - when I get *really stuck* - I write out in clear
English what I want my program to do.
I often start programming without a clear idea of what my program is to do - I
"think" I have a clear idea but as the program evolves I find that the
original idea was
unclear. Writing down in another language (ie English, instead of
Erlang or C or whatever) forces
a cognitive shift in my brain - I suspect that actually *different*
parts of the brain are involved.
I can also feel the ideas moving around in my head - feel is too
strong a word here.
There is also another strange phenomenon - until a program is
completed, the problem
is not really solved. But as soon as the program is completed in its
entirety, a strange thing
happens - then, and not sooner, I realise that a better solution was possible.
Why is this? - I suspect a different part of the brain is involved.
Some part of the brain
says "move on problem completed" and as soon as that happens the "move
on" part of the brain
realises that the solution you have just arrived at was flawed - so
throw it all away and start again.
Knuth, he of the wise ways, says this rewriting process should be
repeated seven times.
In organisations this causes problems. Just about when the system is
to be delivered
you fix the final bug and then realize that it was
all wrong and needs a total rewrite, and you start the rewrite the day
before delivery ...
I have never met a project manager on the planet who understands this
(Project managers are from Venus,
Programmers are from Mars)
I also believe in working in a distraction-free environment (no phone,
email, twitters etc) you have to
listen very carefully to catch the small fleeting thoughts in your
brain. When I solve sudoku I
get instant flashes where I see where numbers are to be placed - but
these are fleeting and easy to miss.
I suspect my right-brain instantly sees a solution and tries to tell
my left-brain, but since I wasn't listening I missed it.
Programming is actually a form of "applied thinking" where you can
actually test the results of the
thought process. Since all the real work takes place inside the brain,
a brain-friendly environment is
essential. I once worked in an open-plan office, after a while I
noticed my (programmer) productivity
dropped to zero and that all my programs were written at home.
I could wax on but I'm supposed to be writing a book, but have been
distracted by the Erlang mailing
list ...
Cheers
/Joe
> When something is written down, the handwaving stops and the engineering
> starts.
Joe writes: "... you fix the final bug and then realize that it was
all wrong and needs a total rewrite, and you start the rewrite the day
before delivery ...."
A friend of mine, former coder who went on to manage coders, once met
me for a beer and moaned that he'd spent the day "fighting a sudden
outbreak of Truth and Beauty." ;-)
This is actually an advantage of generating as much of your
implementation as possible from your documentation (as ROK suggests
above): it gets you more of a Separation of Concerns between where you
want your Truth (the spec) and where you need the Beauty (the code.)
If Semantic Mediawiki isn't powerful enough to generate at least the
*makings* of a VM implementation from a rigorous (but still reasonably
human-readable) spec, I don't know what is.
The above main page is, of course, wrong. In almost every possible
way. But anybody can help out in fixing it ....
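To make the "generate from the documentation" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the toy stack machine, its three instructions (push, add, dup), and the spec line format (name and arity | description | worked example) are invented here and are NOT real BEAM opcodes or any existing spec format. It only shows how a one-line-per-instruction description can double as an opcode table and as a source of test cases.

```python
import ast

# Hypothetical spec for a toy stack machine (not BEAM).
# Line format, invented for this sketch:
#   name arity | description | stack_before instruction => stack_after
SPEC = """\
push 1 | push the operand onto the stack     | [] push 2 => [2]
add  0 | pop two values and push their sum   | [2, 3] add => [5]
dup  0 | push a copy of the top of the stack | [7] dup => [7, 7]
"""


def parse_spec(text):
    """Turn the spec text into {name: {"arity", "doc", "example"}}."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        head, doc, example = (part.strip() for part in line.split("|"))
        name, arity = head.split()
        left, right = example.split("=>")
        end = left.index("]") + 1            # end of the "before" stack literal
        before = ast.literal_eval(left[:end])
        instr = left[end:].split()           # e.g. ["push", "2"] or ["add"]
        after = ast.literal_eval(right.strip())
        table[name] = {"arity": int(arity), "doc": doc,
                       "example": (before, instr, after)}
    return table


def step(stack, instr):
    """Hand-written executor for the toy machine; returns the new stack."""
    op, *args = instr
    if op == "push":
        return stack + [int(args[0])]
    if op == "add":
        return stack[:-2] + [stack[-2] + stack[-1]]
    if op == "dup":
        return stack + [stack[-1]]
    raise ValueError(f"unknown instruction: {op}")


def check_examples(table):
    """Run every documented example; return names whose example fails."""
    failures = []
    for name, entry in table.items():
        before, instr, after = entry["example"]
        arity_ok = len(instr) - 1 == entry["arity"]
        if not arity_ok or step(before, instr) != after:
            failures.append(name)
    return failures


if __name__ == "__main__":
    print(check_examples(parse_spec(SPEC)))
```

In this sketch the executor is still written by hand; only the opcode table and the test cases come from the spec. The point is that an example in the documentation that disagrees with the code shows up as a failing check instead of going unnoticed.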
On Fri, Jun 1, 2012 at 4:03 PM, Joe Armstrong <erl...@gmail.com> wrote:
> On Wed, May 30, 2012 at 11:39 PM, james <ja...@mansionfamily.plus.com> wrote:
>>> If they aren't written down, they WILL be forgotten
>>> sooner or later.
>> I would also like to suggest that the act of writing down a design forces a
>> degree of rigour that is hard to achieve otherwise. I have several times
>> thought that I had considered a design from 'every angle', only to find that
>> when I *write it down*, some of the angles were imperial and some were
>> metric, and some were just smoke and mirrors.
> I agree 100% - when I get *really stuck* - I write out in clear
> English what I want my program to do.
> I often start programming without a clear idea of what my program is to do - I
> "think" I have a clear idea but as the program evolves I find that the
> original idea was
> unclear. Writing down in another language (ie English, instead of
> Erlang or C or whatever) forces
> a cognitive shift in my brain - I suspect that actually *different*
> parts of the brain are involved.
> I can also feel the ideas moving around in my head - feel is too
> strong a word here.
> There is also another strange phenomenon - until a program is
> completed, the problem
> is not really solved. But as soon as the program is completed in its
> entirety, a strange thing
> happens - then, and not sooner, I realise that a better solution was possible.
> Why is this? - I suspect a different part of the brain is involved.
> Some part of the brain
> says "move on problem completed" and as soon as that happens the "move
> on" part of the brain
> realises that the solution you have just arrived at was flawed - so
> throw it all away and start again.
> Knuth, he of the wise ways, says this rewriting process should be
> repeated seven times.
> In organisations this causes problems. Just about when the system is
> to be delivered
> you fix the final bug and then realize that it was
> all wrong and needs a total rewrite, and you start the rewrite the day
> before delivery ...
> I have never met a project manager on the planet who understands this
> (Project managers are from Venus,
> Programmers are from Mars)
> I also believe in working in a distraction-free environment (no phone,
> email, twitters etc) you have to
> listen very carefully to catch the small fleeting thoughts in your
> brain. When I solve sudoku I
> get instant flashes where I see where numbers are to be placed - but
> these are fleeting and easy to miss.
> I suspect my right-brain instantly sees a solution and tries to tell
> my left-brain, but since I wasn't listening I missed it.
> Programming is actually a form of "applied thinking" where you can
> actually test the results of the
> thought process. Since all the real work takes place inside the brain,
> a brain-friendly environment is
> essential. I once worked in an open-plan office, after a while I
> noticed my (programmer) productivity
> dropped to zero and that all my programs were written at home.
> I could wax on but I'm supposed to be writing a book, but have been
> distracted by the Erlang mailing
> list ...
> Cheers
> /Joe
>> When something is written down, the handwaving stops and the engineering
>> starts.
When you don't understand something you can't write the documentation, while when you understand you see no need to write the documentation. So you never write the documentation.
----- Original Message -----
> Excellent! This should be codified into Armstrong's Law of Technology
> Obfuscation!
> On Mon, May 7, 2012 at 4:47 AM, Joe Armstrong <erl...@gmail.com> wrote:
> > I think it works like this:
> > 1) first you don't understand how the X works (X=Beam, JVM, X11,
> > ... you name it)
> > 2) You struggle - and think - google and have a hot bath
> > 3) Eureka - bath flows over
> > 4) Now you can understand it - and you can also remember why you
> > could not understand it
> > 5) Now it's easy you understand it
> > 6) You see no reason to document it since it's obvious
> > Round about 4) there is a small window of opportunity to explain to
> > other people how it works.
> > Once you get to 6) it's very difficult to remember what it felt like
> > at point 2) and consequently difficult
> > to write decent documentation.
> > /Joe
> > On Mon, May 7, 2012 at 11:27 AM, Michael Turner
> > <michael.eugene.tur...@gmail.com> wrote:
> >> "Actually, I don't think such docs are all _that_ crucial -- who
> >> really needs to know, except a small number of VM implementors?"
> >> Aren't Erlang's chances of greater mindshare improved by making it
> >> easier to become a VM implementor? I doubt very much that Java would
> >> be where it is today had it not been for clear VM specification.
> >> That's not to say that Erlang should follow in all of Java's
> >> footsteps, even if it could. But I have to say I was boggled to
> >> learn that you can't find out what the VM opcodes mean without reading
> >> the source (and maybe not even then, if the source contains bugs
> >> vis-a-vis some idealized machine model.)
> >> -michael turner
> >> On Mon, May 7, 2012 at 5:46 PM, Thomas Lindgren
> >> <thomasl_erl...@yahoo.com> wrote:
> >>>>________________________________
> >>>> From: Jonathan Coveney <jcove...@gmail.com>
> >>>>To: erlang-questi...@erlang.org
> >>>>Sent: Monday, May 7, 2012 8:39 AM
> >>>>Subject: [erlang-questions] Is there a good source for
> >>>>documentation on BEAM?
> >>>>This question seems to come up now and again, and it's surprising
> >>>>to me that a crucial part of the documentation isn't better
> >>>>documented. Is there a reason that it is the case? Is the reason
> >>>>that there is no VM spec to give the devs the flexibility to
> >>>>change the intermediate layer without having to worry about
> >>>>backwards compatibility to the degree that Java does?
> >>> Actually, I don't think such docs are all _that_ crucial -- who
> >>> really needs to know, except a small number of VM implementors?
> >>> (And they should read the source to get at all the goodies.) But
> >>> perhaps someone on the list might be moved to do a tutorial
> >>> presentation on an Erlang Factory or something?
> >>> (By the way, I too assume not doing it is to avoid getting bogged
> >>> down into minutiae.)
> >>> If you want to learn more about some of the intellectual roots,
> >>> try these:
> >>> http://wambook.sourceforge.net/
1. The first type is the one that scratches the surface, gives a few usage
examples, and generally helps newcomers and users.
It explains why you'd use the library/module/thing, and shows how to
do so. It holds the user's hand so they can quickly solve whatever problem
they have.
The raw version of this is either reading test cases or Open Source code
that uses the product.
2. The reference manual is a listing made to help those who already know
the item. It details everything but gives few explanations. It can be used
to deepen one's knowledge, but won't do any hand holding, or very little.
EDocs and the general javadoc-style stuff fit this well.
The raw version of it is 'read the source, Luke'. It works for experienced
users, and is of nearly no use to others.
3. Architecture is the doc that explains why the app was built the way it
is, a view of how it works from 10,000 feet. It shows how to understand the
app were you to dive into its source.
This kind of doc explains the rationale of some choices made, and is
especially helpful to developers or contributors to your code, so that they
do not undo future plans, respect trade-off decisions you made, and know
where to dive in to extend things without causing headaches to anyone.
I know of no raw version of this; it's experience and in-depth knowledge of
the product's life. Writing this doc is often vital to raise the quality
and relevance of contributions received.
----
In my opinion, you don't have great documentation until you have covered all 3
aspects. You can have poor, decent, or good docs without all of them, but
greatness requires covering all bases.
A self-proclaimed complete book, in-depth tutorial, or course, should touch
all 3, for example.
Understanding your code well and making it self-explanatory sadly rarely
helps with more than one out of 3 points.
On Jun 1, 2012 12:35 PM, "Robert Virding" <