[erlang-questions] Getting rid of the preprocessor

Vlad Dumitrescu

May 24, 2012, 4:52:46 AM
to erlang-questions
Hi,

There have been discussions now and then about the preprocessor and
how it would be better to get rid of it. As someone who has to parse
and handle source files before they are preprocessed, I am very much
affected by it -- with macros that can be anything, a raw Erlang file
can look like pretty much anything. Even if the macros are in practice
mostly well-behaved, using them as alternatives to inlining code
creates problems when debugging, for example, because you can't step
through a macro invocation.

One of the main reasons that was presented for not being able to
replace the preprocessor with something that works at the lexical
level is that records would first need to be made first-class citizens
(as frames or whatever). I don't understand why that is -- the
preprocessor doesn't do anything with records, their expansion is done
later, by the compiler. Could someone comment on that, please?

Is this issue a definite no-no for OTP, or would it be considered
if we can come up with a solution that doesn't break anything?

<aside>
Stepping through the code in the debugger is being thwarted even by
the use of parse transforms. This could be handled by letting the
debugger reconstruct the source it shows from the parsed
representation and not from the source file.
</aside>

best regards,
Vlad
_______________________________________________
erlang-questions mailing list
erlang-q...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions

Max Bourinov

May 24, 2012, 7:44:28 AM
to Vlad Dumitrescu, erlang-questions
Hi Vlad,

I have a relatively big OTP system. I use macros. I don't have any of the problems you described. What am I doing wrong?

I don't debug my code because I think debugging is bad. I do unit tests and ct tests. If there is still a need to debug after those tests, my tests are bad, so I write better tests. Will this approach work for you too?

Best regards,
Max

Vlad Dumitrescu

May 24, 2012, 8:02:32 AM
to Max Bourinov, erlang-questions
Hi Max,

On Thu, May 24, 2012 at 1:44 PM, Max Bourinov <bour...@gmail.com> wrote:
> I have a relatively big OTP system. I use macros. I don't have any of the
> problems you described. What am I doing wrong?

If you have no problems, then you do nothing wrong! :-)

> I don't debug my code because I think debugging is bad. I do unit tests and
> ct tests. If there is still a need to debug after those tests, my tests are
> bad, so I write better tests. Will this approach work for you too?

Yes, I know about that. But there is a debugger and if someone wants
to use it, it should be as useful as possible.

I have problems with macros because I am working on erlide [*], which,
like any IDE, works on the source files and needs to be as useful as
possible by providing editing help. Source files with unrestricted
macros can't be parsed with normal parsers. Even putting restrictions
on the macro values so that they are well-behaved still requires
context-dependent parsing.

As one example, how would you parse the following line?
?HELLO(world)

The answer is that it depends.
With -define(HELLO, hello), it is a function call to hello(world).
With -define(HELLO(Arg), whatever), it is whatever the macro value is.

This could be disambiguated by requiring all macro references to be
followed by parentheses, like function calls, but that would be
massively not backwards-compatible.
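
To make the ambiguity concrete, here is a minimal sketch. The module name hello_demo, the flag HELLO_TAKES_ARG and the returned tuples are invented for illustration; only the two -define shapes come from the message above. The call site is textually identical in both configurations, so a parser that has not seen the definitions cannot tell which reading applies.

```erlang
-module(hello_demo).
-export([run/0, hello/1]).

%% Hypothetical switch: compile with {d, 'HELLO_TAKES_ARG'} to select the
%% parameterised definition; otherwise HELLO is a plain constant macro.
-ifdef(HELLO_TAKES_ARG).
-define(HELLO(Arg), {macro_body, Arg}).
-else.
-define(HELLO, hello).
-endif.

%% Only called from run/0 when HELLO expands to the atom hello.
hello(X) -> {called_hello, X}.

%% Same source text, two different parse trees after preprocessing:
%% either a call to hello/1 or whatever the macro body is.
run() -> ?HELLO(world).
```

Compiled without the flag, run/0 is a call to hello(world); compiled with it, run/0 returns the macro body with world substituted for Arg.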



[*] Eclipse based IDE, http://erlide.org

Max Bourinov

May 24, 2012, 8:38:35 AM
to Vlad Dumitrescu, erlang-questions
Hi Vlad,

You are right, it is nearly impossible to work properly with macros in an IDE. I never tried erlide. Does it give any coding performance boost? I use sublime2 and emacs. Most Erlangers use Emacs (as far as I know).

About your example: You can agree within your team which macros you will use and which not. For example:

1. No complicated code snippets in macros - this is good for code simplicity. 
2. All atoms that are flying between modules must be in macros. Or even better - use records for that.

Simple rules - simple code. Macros are good. They are cool. Use the right tool and you have no problems.

Best regards,
Max

Vlad Dumitrescu

May 24, 2012, 8:54:38 AM
to Max Bourinov, erlang-questions
On Thu, May 24, 2012 at 2:38 PM, Max Bourinov <bour...@gmail.com> wrote:
> You are right, it is nearly impossible to work properly with macros in an
> IDE. I never tried erlide. Does it give any coding performance boost? I use
> sublime2 and emacs. Most Erlangers use Emacs (as far as I know).

Well, there's a whole bunch of people at Ericsson who use erlide, plus
others about whom I only have fragmentary information. Of course, I
believe that an IDE helps the development process, but I know that it
is a controversial question. Some people like it, some don't.

> About your example: You can agree within your team which macros you will use
> and which not. For example:
> 1. No complicated code snippets in macros - this is good for code
> simplicity.
> 2. All atoms that are flying between modules must be in macros. Or even
> better - use records for that.
> Simple rules - simple code. Macros are good. They are cool. Use the right
> tool and you have no problems.

Yes, that would be good enough for me -- but as a generic tool
provider, I can't force people to follow any rules. There is also the
issue of legacy code, that nobody will touch as long as it works.

Anyway, the simple usage you describe above doesn't require a
preprocessor, these could be handled inside the language with some
additions to the compiler. This would enforce the cleanliness and I
don't think anybody would feel sorry about that. For the dirty work,
the preprocessor could still be there, if needed.
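
As a rough sketch of what "handled inside the language" could look like with today's compiler, a constant macro can often be written as a local function that the compiler is asked to inline. The module and function names below are invented for illustration, and this is only an approximation of the idea, not a proposal from the thread.

```erlang
-module(no_macro_demo).
-export([timeout_via_macro/0, timeout_via_fun/0]).

%% Ask the compiler to inline the local function defined below.
-compile({inline, [default_timeout/0]}).

%% The preprocessor version of the constant.
-define(DEFAULT_TIMEOUT, 5000).

%% The in-language version of the same constant.
default_timeout() -> 5000.

timeout_via_macro() -> ?DEFAULT_TIMEOUT.
timeout_via_fun() -> default_timeout().
```

The substitution is not complete: a user-defined function cannot be called in a guard or used in a pattern, whereas a constant macro can appear in both places.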

Max Bourinov

May 24, 2012, 9:11:32 AM
to Vlad Dumitrescu, erlang-questions
Ah I see :-)

The "bad" key-word here is "generic tool provider".

Maybe a good README.markdown file will help somehow? It always helps when code cannot solve everything.

Best regards,
Max

alisdair sullivan

May 24, 2012, 4:42:11 PM
to Max Bourinov, erlang-questions
you could compile using 'P' or 'E' to get a source listing after preprocessing, compile that and debug from that source. not optimal, but better than not having source at all
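
A minimal sketch of producing those listings from Erlang code, assuming the standard compile module; the wrapper module pp_listing is invented for illustration. The 'P' option writes a listing after preprocessing and parse transforms to <File>.P, and 'E' writes a listing after all source transformations to <File>.E. Neither option produces an object file.

```erlang
-module(pp_listing).
-export([listings/1]).

%% File is the path to an .erl source file. The listing files are written
%% next to the current working directory with .P and .E extensions.
listings(File) ->
    P = compile:file(File, ['P']),   %% after preprocessing / parse transforms
    E = compile:file(File, ['E']),   %% after all source transformations
    {P, E}.
```

The same listings are available from the command line with erlc -P and erlc -E.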

-- 
alisdair sullivan
Sent with Sparrow

Vlad Dumitrescu

May 24, 2012, 5:09:18 PM
to alisdair sullivan, erlang-questions
Hi,

On Thu, May 24, 2012 at 10:42 PM, alisdair sullivan
<alisdair...@yahoo.ca> wrote:
> you could compile using 'P' or 'E' to get a source listing after
> preprocessing, compile that and debug from that source. not optimal, but
> better than not having source at all

Yes, that is an option that helps when debugging. It's not useful for
keeping track of all the places a function is called from, for
example, so that one can rename it. It doesn't make parsing the
original source easier, either.

best regards,

Richard O'Keefe

May 24, 2012, 6:16:33 PM
to Vlad Dumitrescu, erlang-questions

On 25/05/2012, at 12:02 AM, Vlad Dumitrescu wrote:

> As one example, how would you parse the following line?
> ?HELLO(world)


I am no friend of the preprocessor.
But this at least is no new problem: C has exactly the same
problem, and presumably the C and C++ support in Eclipse
already does something with it. (If the editor manual is
bigger than the listing of my editor, I don't use it. Eclipse
massively fails this test, so I don't know much about it.)
I presume it goes something like this:
    if it looks like a variable, display it as one;
    if it looks like a function call, display it as one;
    if the display is wrong, it's the programmer's fault
    for defining such a stupid macro.

There's one possible extra you _might_ want to consider.
I have C macros that would benefit from it.
That is
    %%erlide%% begin_like <macro name>
    %%erlide%% end_like <macro name>
so that for example you could use

    %%erlide%% begin_like SPAWN
    %%erlide%% end_like END
    -define(SPAWN, spawn(fun () ->).
    -define(END, end)).
    ...
        Pid = ?SPAWN
                  Ping!pong,
                  Pong!ping
              ?END,
    ...

and have it coloured/styled/indented appropriately.

Vlad Dumitrescu

May 25, 2012, 3:12:05 AM
to Richard O'Keefe, erlang-questions
Hi Richard,

My real question got buried in the discussion about my particular use
case: previously when the preprocessor and its demise were discussed,
one main reason for not being able to do anything about it at the
moment was that first we need to do something about the records. But
the preprocessor doesn't do anything to records, they are processed by
the compiler. To quote you from a recent thread: "Abstract patterns
and frames are all part of a long-time project to make the
preprocessor unnecessary". Could you please enlighten me as to how the
preprocessor is involved here?

On Fri, May 25, 2012 at 12:16 AM, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
> On 25/05/2012, at 12:02 AM, Vlad Dumitrescu wrote:
>> As one example, how would you parse the following line?
>>    ?HELLO(world)
> I am no friend of the preprocessor.
> But this at least is no new problem:  C has exactly the same
> problem, and presumably the C and C++ support in Eclipse
> already does something with it.

Yes, it's not new and yes, the C/C++ support does something with it.
The problem at hand is that we don't have the kind of resources that
were put into the C support by the likes of Intel, IBM, Texas
Instruments and, yes, even Ericsson. There are many tens of man-years
invested in that project...

> I presume it goes something like this:
>        if it looks like a variable, display it as one;
>        if it looks like a function call, display it as one;
>        if the display is wrong, it's the programmer's fault
>        for defining such a stupid macro.

The displaying of fancy colors in the editor is not the important
issue. An IDE needs to know as much detail as possible about the code,
to support navigation and refactorings, to be able to suggest
meaningful completions and offer reasonable ways to fix any issues it
finds. This isn't possible if one can't understand (=parse) the code
properly.

>        if the display is wrong, it's the programmer's fault
>        for defining such a stupid macro.

Unfortunately, this argument doesn't really work for legacy systems,
where the development environment has to work with code that won't be
changed unless it's broken at runtime.

For macros that expand to well-formed expressions, we can treat them
as function calls, that's the easy part. There are other uses where
the parser needs to be prepared for all kinds of weirdness. The most
pervasive example is the '?line' macro used with common_test (this is
now no longer necessary, but see above about legacy), but I have to
handle for example macros that expand to:
- whole function clauses, in the middle of regular clauses
- guard tests including the 'when' keyword
The parser grammar is basically describing this nice language, except
that any grammar construct or combination thereof may be represented
by a macro call...
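
For readers who have not met such macros, here is a small compilable sketch of the shapes described above. All names are invented for illustration; MARK_LINE only imitates the trailing-comma style of the old '?line' macro and is not its actual definition.

```erlang
-module(weird_macros).
-export([classify/1, traced/0]).

%% Expands to a guard, including the 'when' keyword.
-define(WHEN_SMALL(X), when is_integer(X), X < 10).

%% Expands to a whole function clause.
-define(DEFAULT_CLAUSE, classify(_) -> other).

%% Expands to an expression followed by a comma, so it can be placed
%% in front of any other expression.
-define(MARK_LINE, put(last_line, ?LINE),).

classify(N) ?WHEN_SMALL(N) -> small;
classify(N) when is_integer(N) -> big;
?DEFAULT_CLAUSE.

traced() ->
    ?MARK_LINE ok.
```

None of the three macro bodies is a well-formed expression on its own, which is why the unexpanded source cannot be handled by a parser for plain Erlang.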

In any case, the problems related to smart editors are not the only
ones: many people were complaining not so many years ago, and
you wrote the oh-so-old-that-Google-can't-find-an-electronic-copy-if-one-even-exists
paper "Delenda est preprocessor" even before that (1998?).

I'm taking the liberty of quoting you here:
"... the preprocessor is violently
at odds with everything else in the language. One of the papers I
wrote for SERC had the title "Delenda Est Preprocessor". There
really isn't anything that can be done with the processor that
could not be done better without it. In particular, one of the
major things about Erlang is the module system combined with hot
loading, but the preprocessor subverts the module system and
causes dependencies between source units that are not and cannot
be tracked by the run time system."
[http://erlang.org/pipermail/erlang-questions/2006-March/019614.html]

best regards,
Vlad

Ulf Wiger

May 25, 2012, 3:31:38 AM
to Vlad Dumitrescu, erlang-questions

On 25 May 2012, at 09:12, Vlad Dumitrescu wrote:

> The displaying of fancy colors in the editor is not the important
> issue. An IDE needs to know as much detail as possible about the code,
> to support navigation and refactorings, to be able to suggest
> meaningful completions and offer reasonable ways to fix any issues it
> finds. This isn't possible if one can't understand (=parse) the code
> properly.

Yes - many years ago, I wrote a source code browser for code
stored in ClearCase, allowing you to select a "config spec" and get a
consistent view of all the code. To make it attractive to our erlang
programmers, I put a lot of effort into doing syntax highlighting and
cross-linking of erlang code. You could click on a function head
and get a listing of all call points, for example.

I struggled with what to do with macros. I finally decided to treat
macros and records as similar to functions, and cross-linked them
too. Thus, clicking on a macro invocation would take you to the
definition of the macro, and from a macro definition, you could
find all the invocations - same with records.

Personally, I found this very useful, and did put in some work to
try to make it VCS-agnostic (it did handle CVS repositories,
actually).

In the end, though, I gave up trying to maintain it, since I really
didn't have the time to do so. I also found that after a series of
unfinished refactorings to make the code more maintainable, it
was more or less *impossible* to maintain. :)

Still, compared to ErlIDE/Eclipse, this tool had pretty shallow
ambitions as regards actually understanding what the code did.
But just having that level of cross-linking was a great benefit.
Somewhere, there is a cost-benefit threshold, where reaching for
more sophistication simply doesn't pay off.

(At the scale of AXD 301 - 1.5 million lines of code and lots of
branches in the repos - I felt I had exceeded that threshold when
my cross-reference database exhausted the inodes on the SUN
server it was running on).

BR,
Ulf W

Ulf Wiger, Co-founder & Developer Advocate, Feuerlabs Inc.
http://feuerlabs.com

Vlad Dumitrescu

May 25, 2012, 5:23:46 AM
to Ulf Wiger, erlang-questions
On Fri, May 25, 2012 at 9:31 AM, Ulf Wiger <u...@feuerlabs.com> wrote:
>
> On 25 May 2012, at 09:12, Vlad Dumitrescu wrote:
>> The displaying of fancy colors in the editor is not the important
>> issue. An IDE needs to know as much detail as possible about the code,
>
> Still, compared to ErlIDE/Eclipse, this tool had pretty shallow
> ambitions as regards actually understanding what the code did.
> But just having that level of cross-linking was a great benefit.
> Somewhere, there is a cost-benefit threshold, where reaching for
> more sophistication simply doesn't pay off.

Hi Ulf,

I think one issue with using Eclipse is that many of the users are
coming from using Eclipse with Java or C, where the code
cross-referencing works fully. If we have apparently the same
functionality, but it doesn't really find everything for one reason
or another, then people get confused. They could even hack
happily ahead, unaware of the problems they left behind. There is not
much one can do about that, except try to cover as much ground as
possible and put a warning in the docs (should anyone read them).

regards,
Vlad

Vlad Dumitrescu

May 25, 2012, 7:03:23 AM
to Richard O'Keefe, erlang-questions
On Fri, May 25, 2012 at 9:12 AM, Vlad Dumitrescu <vlad...@gmail.com> wrote:
> My real question got buried in the discussion about my particular use
> case: previously when the preprocessor and its demise were discussed,
> one main reason for not being able to do anything about it at the
> moment was that first we need to do something about the records. But
> the preprocessor doesn't do anything to records, they are processed by
> the compiler. To quote you from a recent thread: "Abstract patterns
> and frames are all part of a long-time project to make the
> preprocessor unnecessary". Could you please enlighten me as to how the
> preprocessor is involved here?

I think I can answer that myself now: the preprocessor is needed
because most of the time the record definitions are in hrl files
that are included in the module source. I don't see any other reasons,
are there any?

I missed that because including files and ifdef-ing sections of code
aren't problems for me and are useful, so I kind of automatically
assumed that a restricted preprocessor that performs these tasks will
still be available. The predefined macros should probably be handled
by the preprocessor too.

Michael Turner

May 25, 2012, 11:18:06 AM
to Vlad Dumitrescu, erlang-questions
> ... the oh-so-old-that-Google-can't-find-an-electronic-copy-if-one-even-exists
> paper "Delenda est preprocessor" even before that (1998?).

Is it this?

Richard O'Keefe. Abstract patterns for Erlang. Fourth International
Erlang/OTP User Conference.

In context, the citations I've seen seem to indicate this paper.
Perhaps someone would be so kind as to scan it and put it online?

-michael turner

Michael Turner

May 25, 2012, 11:45:05 AM
to Vlad Dumitrescu, erlang-questions
"For macros that expand to well-formed expressions, we can treat them
as function calls, that's the easy part."

That would be nice. After an exchange with ROK (mostly offline) about
how to achieve a pattern-directed invocation style more like that of
Smalltalk, I found myself experimenting with ungainly syntax that
included using macros in arguments, to wrap the prepositional phrases.
The result is too embarrassing to show here. If the macro invocations
were instead inlined function calls, it would be slightly less
embarrassing.

-michael turner

Richard O'Keefe

May 27, 2012, 6:40:35 PM
to Vlad Dumitrescu, erlang-questions

On 25/05/2012, at 7:12 PM, Vlad Dumitrescu wrote:
> My real question got buried in the discussion about my particular use
> case: previously when the preprocessor and its demise were discussed,
> one main reason for not being able to do anything about it at the
> moment was that first we need to do something about the records. But
> the preprocessor doesn't do anything to records, they are processed by
> the compiler. To quote you from a recent thread: "Abstract patterns
> and frames are all part of a long-time project to make the
> preprocessor unnecessary". Could you please enlighten me as to how the
> preprocessor is involved here?

If two modules need to use the same record,
the definition is put in a .hrl file
and each module -includes that file.
This inclusion is done by the preprocessor.

I just did a quick survey of the .hrl files in a directory
that (sigh) has two releases of Erlang/OTP in it.

            defines=Y   defines=N |
records=Y         212         160 |       372
records=N         485          26 |       511
            =========   ========= | =========
Totals            697         186 |       883

(The .hrl files with neither a -record nor a -define
either -include other files, contain -compile directives,
or contain generated code including function definitions.)
>
>> I presume it goes something like this:
>> if it looks like a variable, display it as one;
>> if it looks like a function call, display it as one;
>> if the display is wrong, it's the programmer's fault
>> for defining such a stupid macro.
>
> The displaying of fancy colors in the editor is not the important
> issue. An IDE needs to know as much detail as possible about the code,
> to support navigation and refactorings, to be able to suggest
> meaningful completions and offer reasonable ways to fix any issues it
> finds. This isn't possible if one can't understand (=parse) the code
> properly.

I find myself working on code that isn't _finished_ yet most of the
time, for which there _are_ as yet no facts of the matter on which
an IDE could depend. If memory serves me, the way Quintus handled
the Emacs interface was for Emacs to ask Prolog to do the parsing.

I for one do not expect "meaningful completions" for macros.

It's worth reflecting that refactoring IDEs started in Smalltalk,
where the syntax is simple, there is no compile time/run time
distinction, and above all, no preprocessor.

> The parser grammar is basically describing this nice language, except
> that any grammar construct or combination thereof may be represented
> by a macro call...

The difficulty is acknowledged. As is the fact that you can't stop
people using a processor even if there is none in the language. I
have Smalltalk code generated using m4(1) and awk(1). The Smalltalk+m4
source code has to be edited *somehow*.

I think "Delenda est preprocessor" was started while I was at RMIT,
perhaps in 1997. My copy has the date 20 July 1998 on it, which is
after it was sent to SERC. A copy can be found at
http://www.cs.otago.ac.nz/staffpriv/ok/delenda.txt

It was an internal report to SERC never meant for external publication,
so it is _not_ beautifully formatted.

Vlad Dumitrescu

May 28, 2012, 4:07:38 AM
to Richard O'Keefe, erlang-questions
Thank you for the answer, Richard.

On Mon, May 28, 2012 at 12:40 AM, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:
>
> If two modules need to use the same record,
> the definition is put in a .hrl file
> and each module -includes that file.
> This inclusion is done by the preprocessor.

Yes. So if the preprocessor only does inclusions and ifdefs (which
don't affect the structure of the code because they only work outside
form definitions), then "clean macros" could be part of the language
proper (with a suitable list of restrictions on what they could
resolve to). I can even imagine that a macro that isn't clean (value
can't be parsed standalone) may be still handled as today, during a
period of transition.

> I find myself working on code that isn't _finished_ yet most of the
> time, for which there _are_ as yet no facts of the matter on which
> an IDE could depend.

Why not? Even if the current module is not finished, you can still
know things about the other modules and about the entities in this
module that can be parsed.

An IDE can also help find problems before runtime, because it knows
about the whole code base while the compiler only handles one module
at a time. If I know I am not using weird meta-programming techniques
and I refer to a "foo:bar/3" where foo is in my project and bar/3
doesn't exist, I'd like to get a warning about that. The tests will
probably catch that, but this is faster.

> If memory serves me, the way Quintus handled
> the Emacs interface was for Emacs to ask Prolog to do the parsing.

That's what we do today, we use the Erlang scanner and parser. Problem
is that they know nothing about macros, so we need to do manual magic.
There are problems with having to talk to an Erlang node all the time
like this (loss of synchronization, etc), so we are now moving to a
Java-based parser. The disadvantage of that one is that we can't do
any magic, so we have to be able to parse as much as possible.

> I for one do not expect "meaningful completions" for macros.

For example, "?DE<tab>" will complete to "?DEBUG" or give a list with
"?DEBUG" and "?DELETE".
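
The completion itself is a simple prefix filter once the set of defined macro names is known; collecting that set from the source and its include files is the hard part. A minimal sketch, with an invented module name and the names passed in as plain strings:

```erlang
-module(macro_complete).
-export([complete/2]).

%% Prefix is what the user typed after the '?'; Names is the list of
%% macro names known to the editor (how they are collected is left open).
complete(Prefix, Names) ->
    [Name || Name <- Names, lists:prefix(Prefix, Name)].
```

For example, macro_complete:complete("DE", ["DEBUG", "DELETE", "HELLO"]) keeps "DEBUG" and "DELETE" and drops "HELLO".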

>> The parser grammar is basically describing this nice language, except
>> that any grammar construct or combination thereof may be represented
>> by a macro call...
>
> The difficulty is acknowledged.  As is the fact that you can't stop
> people using a processor even if there is none in the language.  I
> have Smalltalk code generated using m4(1) and awk(1).  The Smalltalk+m4
> source code has to be edited *somehow*.

Yes, but if they do that, they basically have a new language and are
on their own, just as if they were using LFE, Elixir or Reia. I am
talking about the base language.

best regards,
Vlad