[erlang-questions] case expression scope


Daniel Goertzen

Mar 3, 2014, 11:59:10 AM
to Erlang Questions
One thing that repeatedly bites me is the way variable bindings leak out of case expressions.  For example, the following code fails to even compile because the Ps and Qs smack into each other...

f1(X, Y) ->
    A = case {X,Y} of
            {0, Q} -> Q;
            {P, Q} -> P*Q
        end,

    B = case {X,Y} of
            {0, Q} -> Q;
            {P, Q} -> P+Q
        end,
    {A,B}.


In situations like the above, I artificially change the variable names to avoid trouble and end up with something like this:

f1(X, Y) ->
    A = case {X,Y} of
            {0, Q0} -> Q0;
            {P0, Q0} -> P0*Q0
        end,

    B = case {X,Y} of
            {0, Q1} -> Q1;
            {P1, Q1} -> P1+Q1
        end,
    {A,B}.


The binding leakage seems like nothing but hassles and headaches to me, so my question is, is there actually a reason why bindings need to leak out of case expressions?  Is there a programming style that leverages this feature?  Is there a limitation in the emulator that forces it to be this way?

I'd love to have a 'hygienic' case expression.  Could it be faked with parse transforms?

Dan.

Ivan Uemlianin

Mar 3, 2014, 12:06:35 PM
to erlang-q...@erlang.org
It doesn't quite answer your question but perhaps case was a late
addition. Often more concise code avoids it and uses pattern matching
instead. e.g., your example using pattern matching could be:

f1(0, Y) ->
    {Y, Y};
f1(X, Y) ->
    {X*Y, X+Y}.

Best wishes

Ivan
> _______________________________________________
> erlang-questions mailing list
> erlang-q...@erlang.org
> http://erlang.org/mailman/listinfo/erlang-questions
>

--
============================================================
Ivan A. Uemlianin PhD
Llaisdy
Speech Technology Research and Development

iv...@llaisdy.com
www.llaisdy.com
llaisdy.wordpress.com
github.com/llaisdy
www.linkedin.com/in/ivanuemlianin

festina lente
============================================================

Thomas Lindgren

Mar 3, 2014, 1:46:03 PM
to Daniel Goertzen, Erlang Questions
You could hide the bindings inside a fun.

   A = (fun() -> case ... end end)( ),
   B = (fun() -> case ... end end)( ),
   {A, B}.

Though I agree with Ivan that refactoring might be a better idea in this case. 

A related issue: I'd like to get warnings when variables are matched (not bound) in patterns. I've seldom found repeated variables very useful in practice, they can easily be rewritten to use explicit compares, and sometimes they have hidden groan-inducing bugs. 

(Binary patterns are a special case; _some_ repeated variables are benign there.)
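
To make the "matched, not bound" distinction concrete, here is a minimal sketch. The module and function names are made up for illustration; the only assumption is standard Erlang pattern-matching semantics, where a repeated variable in a pattern is an exact-equality test.

```erlang
-module(repeat_match).
-export([same_pair/1, same_pair_explicit/1]).

%% Repeated variable: the second X is MATCHED against the first,
%% so this clause succeeds only when both elements are exactly equal.
same_pair({X, X}) -> true;
same_pair({_, _}) -> false.

%% The same test rewritten as an explicit comparison in a guard.
%% =:= is exact equality, which is what pattern matching uses.
same_pair_explicit({X, Y}) when X =:= Y -> true;
same_pair_explicit({_, _}) -> false.
```

Both versions should treat {1, 1.0} as a non-match, because matching is exact rather than arithmetic.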

Best,
Thomas


Daniel Goertzen

Mar 3, 2014, 2:07:36 PM
to Thomas Lindgren, Erlang Questions
Hmm, fun wrapping could be the basis of hygienic-case parse transform.  I'll have to see what the performance overhead of a fun wrap is like.

The example I gave is contrived and a bit weak I guess.  In my most recent run-in with this issue there were a lot of bindings from the enclosing scope that were used, so breaking things out to top level functions would not always be a good option.  Funs would be better, but add clutter and maybe runtime overhead.  Compromises, compromises....
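
One rough way to look at the fun-wrap overhead is a small timing harness. This is only a sketch: the module name, the helper names and the loop are assumptions, it uses timer:tc/1 from the standard library, and micro-benchmarks like this say little about a real application.

```erlang
-module(funwrap_bench).
-export([plain/2, wrapped/2, time/3]).

%% The case expression written directly in the clause.
plain(X, Y) ->
    case {X, Y} of
        {0, Q} -> Q;
        {P, Q} -> P * Q
    end.

%% The same case hidden inside an immediately applied fun,
%% so P and Q are not visible in the rest of the clause.
wrapped(X, Y) ->
    (fun() ->
         case {X, Y} of
             {0, Q} -> Q;
             {P, Q} -> P * Q
         end
     end)().

%% Calls F(X, Y) N times and returns the elapsed microseconds.
time(F, N, {X, Y}) ->
    {Micros, ok} = timer:tc(fun() -> loop(F, N, X, Y) end),
    Micros.

loop(_F, 0, _X, _Y) -> ok;
loop(F, N, X, Y) ->
    _ = F(X, Y),
    loop(F, N - 1, X, Y).
```

For example, funwrap_bench:time(fun funwrap_bench:plain/2, 1000000, {3, 4}) can be compared with the same call using wrapped/2.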

Cheers,
Dan.

Richard A. O'Keefe

Mar 3, 2014, 4:02:24 PM
to Daniel Goertzen, Erlang Questions

On 4/03/2014, at 5:59 AM, Daniel Goertzen wrote:

> One thing that repeatedly bites me is the way variable bindings leak out of case expressions. For example, the following code fails to even compile because the Ps and Qs smack into each other...
>
> f1(X, Y) ->
> A = case {X,Y} of
> {0, Q} -> Q;
> {P, Q} -> P*Q
> end,
>
> B = case {X,Y} of
> {0, Q} -> Q;
> {P, Q} -> P+Q
> end,
> {A,B}.

That is of course a sketch, not real code.
But as a sketch, it looks like something that
is crying out for a couple of functions:

f1(X, Y) ->
    A = f1_A(X, Y),
    B = f1_B(X, Y),
    {A,B}.

f1_A(0, Q) -> Q;
f1_A(P, Q) -> P*Q.

f1_B(0, Q) -> Q;
f1_B(P, Q) -> P+Q.


> The binding leakage seems like nothing but hassles and headaches to me, so my question is, is there actually a reason why bindings need to leak out of case expressions?

Yes. (It's not just 'case' either.)
Calling it "leaking" is rather prejudicial;
when you turn on the tap over your kitchen sink
you do not complain that it's "leaking".

> Is there a programming style that leverages this feature?

Yes.

> Is there a limitation in the emulator that forces it to be this way?

Certainly not.
>
> I'd love to have a 'hygienic' case expression. Could it be faked with parse transforms?

All selection expressions in Erlang do this, and it is as
hygienic as anyone could wish.

You don't need a parse transform.
(It would be an AMAZINGLY bad idea to have things that
*look* like 'case' expressions not *act* like them.)

In fact all you really need is

-define(BEGIN, ((fun () ->).
-define(END, end)())).

f1(X, Y) ->
    A = ?BEGIN
            case {X,Y}
              of {0,Q} -> Q
               ; {P,Q} -> P*Q
            end
        ?END,
    B = ?BEGIN
            case {X,Y}
              of {0,Q} -> Q
               ; {P,Q} -> P+Q
            end
        ?END,
    {A,B}.

Hmm. I seem to recall that this very same solution
has already been posted in this mailing list.

Right now, I suggest you take this deliberate design
feature of Erlang as a style warning: "You are using
the same variable for two different purposes in a
single clause; this is pretty much guaranteed to
confuse people so DON'T."

Think of it as not entirely unlike gcc's -Wshadow.

Richard A. O'Keefe

Mar 3, 2014, 4:23:20 PM
to Daniel Goertzen, Erlang Questions

On 4/03/2014, at 8:07 AM, Daniel Goertzen wrote:

> Hmm, fun wrapping could be the basis of hygienic-case parse transform. I'll have to see what the performance overhead of a fun wrap is like.
>
> The example I gave is contrived and a bit weak I guess. In my most recent run-in with this issue there were a lot of bindings from the enclosing scope that were used, so breaking things out to top level functions would not always be a good option.

I see a non-sequitur there.

Simplified examples are great for revealing bugs.

For illuminating style concerns, NOTHING beats REAL code.

"There were a lot of bindings from the enclosing scope"
is already a warning sign that the code should be restructured.

> Funs would be better, but add clutter and maybe runtime overhead.

There is currently some run-time overhead.
In R16B03-1, the compiler still does not optimise

(fun (...) -> ... ... end)(...)

by turning it into the equivalent of an SML let ... in ... end
form, but there is no particular reason why it couldn't.

Whether you can _measure_ the overhead in a real application
is another matter entirely.

Splitting out little functions and then telling the compiler
to inline them might well have the least overhead of all.
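
A sketch of what that could look like, using the documented {inline, [Name/Arity]} compile option. The module and helper names are hypothetical, and no claim is made here about how much overhead this actually saves.

```erlang
-module(inline_demo).
-export([f1/2]).

%% Ask the compiler to inline the two local helpers into their callers.
-compile({inline, [f1_a/2, f1_b/2]}).

f1(X, Y) ->
    {f1_a(X, Y), f1_b(X, Y)}.

%% Each helper is its own scope, so P and Q cannot collide.
f1_a(0, Q) -> Q;
f1_a(P, Q) -> P * Q.

f1_b(0, Q) -> Q;
f1_b(P, Q) -> P + Q.
```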

Eric Pailleau

Mar 3, 2014, 5:54:47 PM
to Daniel Goertzen, erlang-q...@erlang.org
f1(X,Y)->
    case {X,Y} of
        {0, Q} -> {Q, Q} ;
        {P, Q} -> {P*Q, P+Q}
    end.

regards.

« Sent from my mobile » Eric


Eric Pailleau

Mar 4, 2014, 9:51:50 AM
to Daniel Goertzen, erlang-q...@erlang.org
Hello,
in fact there is no need even to bind any other variable:

f1(X, Y) ->
    case {X, Y} of
        {0, Y} -> {Y, Y} ;
        {X, Y} -> {X*Y, X+Y}
    end.

and this can be reduced to testing only whether X is 0 or some other value. No need to test Y.
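
A minimal sketch of that reduction, testing X alone (the module name is made up for the example):

```erlang
-module(f1_case).
-export([f1/2]).

%% Only X decides which branch is taken; Y is used as-is,
%% so no new variables are bound inside the case at all.
f1(X, Y) ->
    case X of
        0 -> {Y, Y};
        _ -> {X * Y, X + Y}
    end.
```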

Regard...



« Sent from my mobile » Eric

Daniel Goertzen

Mar 4, 2014, 11:10:57 AM
to Erlang Questions
I've been reflecting on this some more, and have realized one of my biases on the subject comes from the way C/C++ manages scope:  Anything within curly braces, whether it is a function, if, while, for, try, or catch, is a scope which encapsulates all declarations within.  This uniform rule makes it easier to reason about code, and not having it in Erlang is, well, jarring to me.

Now both Erlang and C/C++ can *read* variables from enclosing scopes, but only C/C++ can *mutate* variables from enclosing scopes.  Perhaps Erlang's case scoping rules are just a way to give similar powers to affect the enclosing scope.


The actual clause that predicated all this is here:  https://gist.github.com/goertzenator/9347573

There are a lot of case expressions that leave unwanted bindings lying around.  I have to pay attention to use different binding names for each to avoid collisions.  Specifically the AuthKey and PrivKey expressions had collisions at first because they are nearly identical.  The code would be a lot easier to reason about if I didn't have to look out for such things.

I tried putting this function together in various different ways, and this way was the most concise, most readable, and most amenable to future tweaking.  I normally don't tolerate clauses this long, but breaking it up always seemed to compromise those things.


Thank you for your ideas Richard.  I see I could also use funs directly instead of cases.

f1(X, Y) ->
    A = fun({0, Q}) -> Q;
           ({P, Q}) -> P*Q
           end({X,Y}),

    B = fun({0, Q}) -> Q;
           ({P, Q}) -> P+Q
        end({X,Y}),
    {A,B}.


Regards,
Dan.

Richard A. O'Keefe

Mar 4, 2014, 4:49:12 PM
to Daniel Goertzen, Erlang Questions

On 5/03/2014, at 5:10 AM, Daniel Goertzen wrote:

> I've been reflecting on this some more, and have realized one of my biases on the subject comes from the way C/C++ manages scope: Anything within curly braces, whether it is a function, if, while, for, try, or catch, is a scope which encapsulates all declarations within. This uniform rule makes it easier to reason about code, and not having it in Erlang is, well, jarring to me.
>
> Now both Erlang and C/C++ can *read* variables from enclosing scopes, but only C/C++ can *mutate* variables from enclosing scopes. Perhaps Erlang's case scoping rules are just a way to give similar powers to affect the enclosing scope.

No, you are still thinking in C/C++ terms.
There are three things in Erlang that are scopes:

- a function clause is a scope
- a 'fun' is a scope
- a list comprehension is a scope

In something like

f(X, Y) ->
    case Y
      of {A,B} -> Z = 1
       ; [A,B] -> Z = 2
    end,
    {foo,A,B,Z}.

there is no "leakage" from one scope to another and
there is no "affect[ing] the enclosing scope" because
THERE IS ONLY ONE SCOPE.

> The actual clause that predicated all this is here: https://gist.github.com/goertzenator/9347573

Yike! I have never shared the vulgar prejudice against
long functions, but 78 lines for a single clause?

I would _definitely_ start by breaking out little functions.

engine_id(#{engine_id := local}) ->
    snmp_agent_controller:get_engine_id();
engine_id(#{engine_id := E})
  when is_list(E) ->
    E.

authkey(#{authkey := AuthKey}, _, _)
  when is_list(AuthKey) ->
    AuthKey;
authkey(#{authpassword := AuthPass}, Localization_Hash, Engine_Id) ->
    snmp:passwd2localized_key(Localization_Hash, AuthPass, Engine_Id).

privkey(#{privkey := PrivKey}, _, _)
  when is_list(PrivKey) ->
    PrivKey;
privkey(#{privpassword := PrivPass}, Localization_Hash, Engine_Id)
  when is_list(PrivPass) ->
    lists:sublist(snmp:passwd2localized_key(
                    Localization_Hash, PrivPass, Engine_Id), 16).

...

> There are a lot of case expressions that leave unwanted bindings lying around.

There shouldn't _be_ a lot of case expressions in a single
function clause. _That_ is the problem.

> I have to pay attention to use different binding names for each to avoid collisions. Specifically the AuthKey and PrivKey expressions had collisions at first because they are nearly identical. The code would be a lot easier to reason about if I didn't have to look out for such things.

And if you split them out into separate functions,
you _wouldn't_ have to look out for such things.
>
> I tried putting this function together in various different ways, and this way was the most concise, most readable, and most amenable to future tweaking.

If this was the most readable, I really do not want to see the
other versions! I find the code quite unreadable (the
RunTogetherVariableNames don't help) and splitting out little
functions made it MUCH easier for me to see what's going on.

As for "most amenable to future tweaking", I'd like to point
out an advantage of splitting things out as functions:

++ you can give them types and have the types checked. ++
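
As an illustration of that point, here is a sketch with -spec attributes on split-out helpers. The names are hypothetical and the integer() types are an assumption about the intended inputs. The compiler only records these specs; a tool such as Dialyzer is what actually checks callers against them.

```erlang
-module(typed_helpers).
-export([combine/2]).

-spec combine(integer(), integer()) -> {integer(), integer()}.
combine(X, Y) ->
    {product_or_second(X, Y), sum_or_second(X, Y)}.

%% Each little function gets its own contract.
-spec product_or_second(integer(), integer()) -> integer().
product_or_second(0, Q) -> Q;
product_or_second(P, Q) -> P * Q.

-spec sum_or_second(integer(), integer()) -> integer().
sum_or_second(0, Q) -> Q;
sum_or_second(P, Q) -> P + Q.
```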

> I see I could also use funs directly instead of cases.
>
> f1(X, Y) ->
> A = fun({0, Q}) -> Q;
> ({P, Q}) -> P*Q
> end({X,Y}),
>
> B = fun({0, Q}) -> Q;
> ({P, Q}) -> P+Q
> end({X,Y}),
> {A,B}.

You do not need those tuples.

f1(X, Y) ->
    A = (fun (0, Q) -> Q
           ; (P, Q) -> P*Q
         end)(X, Y),
    B = (fun (0, Q) -> Q
           ; (P, Q) -> P+Q
         end)(X, Y),
    {A,B}.

I would write

    {AuthP, LocalizationHash} = case maps:get(authp, Record) of
        md5 -> {usmHMACMD5AuthProtocol, md5};
        sha -> {usmHMACSHAAuthProtocol, sha};
        usmHMACMD5AuthProtocol -> {usmHMACMD5AuthProtocol, md5};
        usmHMACSHAAuthProtocol -> {usmHMACSHAAuthProtocol, sha}
    end,

as

    Localization_Hash =
        case maps:get(authp, Record)
          of md5 -> md5
           ; sha -> sha
           ; usm_HMAC_MD5_Auth_Protocol -> md5
           ; usm_HMAC_SHA_Auth_Protocol -> sha
        end,
    Auth_Protocol =
        case Localization_Hash
          of md5 -> usm_HMAC_MD5_Auth_Protocol
           ; sha -> usm_HMAC_SHA_Auth_Protocol
        end,

Martin Schut

Mar 4, 2014, 3:14:10 PM
to erlang-q...@erlang.org
Why not put the extraction of the AuthKey and PrivKey into separate functions? I'd probably do that to a lot of other case statements as well. This leaves readable code. I do not know why you want it to be concise. Readable would be my first requirement, concise a secondary.

Best regards,
--Martin
--

Anthony Ramine

Mar 4, 2014, 5:27:32 PM
to Richard A. O'Keefe, Erlang Questions
Note that there is also one scope per generator input, which does not spill into the generator pattern nor the rest of the comprehension.
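
A small sketch of how this plays out in practice. The module name is made up, and the compiler is expected to emit a "shadowed" warning for the generator pattern, which is the point of the demonstration.

```erlang
-module(lc_scope).
-export([demo/0]).

demo() ->
    X = 3,
    %% The generator input lists:seq(1, X) is evaluated with the outer
    %% X (3). The generator pattern X is a fresh variable that shadows
    %% the outer one inside the comprehension only.
    Ys = [X * 10 || X <- lists:seq(1, X)],
    %% Out here X is still the outer binding.
    {X, Ys}.
```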

--
Anthony Ramine

On 4 March 2014 at 22:49, Richard A. O'Keefe <o...@cs.otago.ac.nz> wrote:

> - a list comprehension is a scope

Szoboszlay Dániel

Mar 4, 2014, 5:41:19 PM
to Erlang Questions
On Tue, 04 Mar 2014 22:49:12 +0100, Richard A. O'Keefe <o...@cs.otago.ac.nz>
wrote:

> No, you are still thinking in C/C++ terms.
> There are three things in Erlang that are scopes:
> - a function clause is a scope
> - a 'fun' is a scope
> - a list comprehension is a scope

Personally, I feel that these rules are neither convenient nor intuitive.
Mainly due to the last one, about the list comprehensions. I *know* that
list comprehensions are implemented with funs, hence the scoping, but it
still *looks like* a big exception from the "one function, one scope"
rule. And if you have background in any C-descendant language you would
almost certainly feel that cases, ifs, trys, begins etc. shall have their
own scopes too.

Some examples where I really miss them:

foo(Dict) ->
    Value = case orddict:find(foo, Dict) of
                {ok, Value} -> Value;
                error -> calculate_very_expensive_default_value()
            end,
    ...

Sorry, variable 'Value' is unsafe in 'case', please rename your internal
variable to 'Value1' or refactor this humble four-liner into a
get_foo_from_orddict/1 function, whichever you feel more inconvenient.

And we aren't done yet, let's add my favorite, the error-handling mess-up
to the mix!

foo() ->
    ...
    V1 = case bar() of
             {ok, Bar} ->
                 Bar;
             {error, Reason} ->
                 lager:error("SOS ~p", [Reason]),
                 BarDefault
         end,
    V2 = case baz() of
             {ok, Baz} ->
                 Baz;
             {error, Reason} ->
                 lager:error("SOS ~p", [Reason]),
                 BazDefault
         end,
    ...

Of course I know real programmers never put two cases in the same
function, but lazy people tend to. And lazy people also tend to use
convenient variable names like Reason or Error in error branches. And then
get compile errors or random badmatch crashes, depending on their code's
complexity.

The only benefit I see of *not starting* a new scope within a case is that
you can do this:

foo() ->
    case bar() of
        1 -> X = 1;
        2 -> X = 2;
        _ -> X = 0
    end,
    bar(X).

Which, in my opinion, is less readable than:

foo() ->
    X = case bar() of
            1 -> 1;
            2 -> 2;
            _ -> 0
        end,
    bar(X).

I hope these examples may pass the "this code is too trivial/ugly to even
notice your actual problem" filter with better luck than the previous
nominee. :)

Cheers,
Daniel

Anthony Ramine

Mar 4, 2014, 5:48:43 PM
to Szoboszlay Dániel, Erlang Questions
Hello,

This would be better written as:

maybe_log_error(Default, Fun) ->
    case Fun() of
        {ok,Result} ->
            Result;
        {error,Reason} ->
            lager:error("SOS ~p", [Reason]),
            Default
    end.

foo() ->
    ...
    V1 = maybe_log_error(BarDefault, fun bar/0),
    V2 = maybe_log_error(BazDefault, fun baz/0),
    ...

Sorry, but no luck for the filter.

--
Anthony Ramine

Anthony Ramine

Mar 4, 2014, 5:57:11 PM
to Szoboszlay Dániel, Erlang Questions
My point still stands, don’t put such complicated code in the middle of nowhere, why would you call calculate_very_expensive_default_value() here in foo/1? Put the nastiness away from main path.

--
Anthony Ramine

On 4 March 2014 at 23:55, Szoboszlay Dániel <dszob...@gmail.com> wrote:

> But what if BazDefault is based on my trusty calculate_very_expensive_default_value(), and the error would have to be logged on warning level with a different format string? :P
>
> Daniel

Tony Rogvall

Mar 4, 2014, 6:23:12 PM
to Anthony Ramine, Erlang Questions
Of course. use list comprehensions! or?

This is kind of fun. Check the generated beam code for this module (using R16B02 right now)

-module(compcase).
-compile(export_all).
-compile(inline).
-compile({inline_size, 100}).
-compile({inline_unroll, 2}).

f(X,Y) ->
    [A] = [case {X,Y} of
               {0,Q} -> Q;
               {P,Q} -> P+Q
           end || _ <- [[]]],
    [B] = [case {X,Y} of
               {0,Q} -> Q;
               {P,Q} -> P*Q
           end || _ <- [[]]],
    {A,B}.


g(X,Y) ->
    A = fun({0,Q}) -> Q;
           ({P,Q}) -> P+Q
        end({X,Y}),
    B = fun({0,Q}) -> Q;
           ({P,Q}) -> P*Q
        end({X,Y}),
    {A,B}.



"Installing applications can lead to corruption over time. Applications gradually write over each other's libraries, partial upgrades occur, user and system errors happen, and minute changes may be unnoticeable and difficult to fix"



Richard A. O'Keefe

Mar 4, 2014, 9:52:22 PM
to Erlang Questions

On 5/03/2014, at 11:41 AM, Szoboszlay Dániel wrote:

> On Tue, 04 Mar 2014 22:49:12 +0100, Richard A. O'Keefe <o...@cs.otago.ac.nz> wrote:
>
>> No, you are still thinking in C/C++ terms.
>> There are three things in Erlang that are scopes:
>> - a function clause is a scope
>> - a 'fun' is a scope
>> - a list comprehension is a scope
>
> Personally, I feel that these rules are neither convenient nor intuitive. Mainly due to the last one, about the list comprehensions. I *know* that list comprehensions are implemented with funs, hence the scoping,

Wrong. Yes, list comprehensions *happen* to be implemented
as calls to out-of-line functions, but the *necessity* for
list comprehensions to be separate scopes falls directly
out of their semantics.

Suppose it were otherwise.

    R = [begin Y = f(X), Y end || X <- L].

What value would X have after the comprehension
- if L = []?
- if L = [1]?
- if L = [1,2]?
What value would Y have?

The key thing is that X and Y get _different_ values
on each iteration, and Erlang variables being immutable,
the only way we can make sense of that is if each
iteration gets its _own_ X and Y.
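
A runnable sketch of that behaviour as Erlang actually implements it (the module and function names are invented for the example). Each iteration binds its own X and Y, and neither is visible after the comprehension, so the clause is free to bind X and Y afresh.

```erlang
-module(lc_iter).
-export([squares_and_stats/1]).

squares_and_stats(L) ->
    %% Y is bound anew on every iteration; X comes from the generator.
    Squares = [begin Y = X * X, Y end || X <- L],
    %% The comprehension's X and Y do not exist out here, so these are
    %% first bindings in the clause, not matches against old values.
    X = length(Squares),
    Y = lists:sum(Squares),
    {Squares, X, Y}.
```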

> And if you have background in any C-descendant language you would almost certainly feel that cases, ifs, trys, begins etc. shall have their own scopes too.

Surely C counts as a C-descendant language.
And in C, 'switch' is *not* a scope. Consider:

int goo(int n) {
    switch (n) if (0)
    case 1: return -27; else if (0)
    case 2: return 43; else if (0)
    default: return 122;
}

#include <stdio.h>

int main(void) {
    int i;

    for (i = 0; i < 4; i++) printf("%d => %d\n", i, goo(i));
    return 0;
}

Yep. That's legal (classic C, C89, C99). I don't have
a copy of the C11 standard, but I'd be pretty surprised
if it wasn't legal C11.

In Classic C and C89, *none* of 'if', 'switch', 'while',
'do', and 'for' introduce new scopes.
In C99, it's clear from 6.8.4 that 'if' and 'switch'
do *NOT* introduce new scopes and it is clear from
6.8.5 that 'while' and 'do' do *NOT* introduce new
scopes.

In Classic C and C89 the *ONLY* construction that
introduces a new scope is '{...}'.
C99 adds
    iteration-statement:
        ...
        for ( declaration expr-opt ; expr-opt ) statement
which *is* a new scope, but in C this is very exceptional.

Conversely, if switch cases were scopes,
you'd expect

void goo(int n) {
    switch (n) {
    case 1:
        int x = 1;
        x++;
    case 2:
        int x = 2;
        x++;
    default:
        int x = 3;
        x--;
    }
}

to be legal, but it isn't.
Change this to

void goo(int n) {
    switch (n) {
    case 1:;
        int x = 1;
        x++;
    case 2:;
        int x = 2;
        x++;
    default:;
        int x = 3;
        x--;
    }
}

and it *still* isn't legal, because the whole body of the
switch is a *single* scope and you are not allowed
more than one declaration of x within it.

I would therefore expect a C programmer to *expect* that
occurrences of a variable in two cases would be the same
variable.

It turns out that the entire body of a switch is a
*single* scope, and the example isn't legal Java either.
Looking at the Java Language Specification for Java 7,
chapter 14, we find that 'if' and 'while' and 'do' do
*NOT* introduce new scopes, that 'for' may but need not,
that 'catch' does, but that 'try' does only in the
14.20.3 try-with-resources form. So in Java, which is
surely a 'C-descendant' language, selection statements
do NOT introduce new scopes and any Java programmer who
thought they did, or who expected selection statements
in some other language to do so on the strength of what
Java does, would be exposed as not understanding Java.


>
> Some examples where I really miss them:
>
> foo(Dict) ->
> Value = case orddict:find(foo, Dict) of
> {ok, Value} -> Value;
> error -> calculate_very_expensive_default_value()
> end,
> ...

So don't do that. If the compiler *did* allow that,
it would be as confusing as all-get-out for suffering
human beings trying to make sense of it.
It is a blessing that abominations like this are blocked.

This example is obviously not real code.
We have already seen in this thread that the *real*
code that inspired this would be hugely improved by
being broken into little functions, whereupon the
problem disappears.

> And we aren't done yet, let's add my favorite, the error-handling mess-up to the mix!
>
> foo() ->
> ...
> V1 = case bar() of
> {ok, Bar} ->
> Bar;
> {error, Reason} ->
> lager:error("SOS ~p", [Reason]),
> BarDefault
> end,
> V2 = case baz() of
> {ok, Baz} ->
> Baz;
> {error, Reason} ->
> lager:error("SOS ~p", [Reason]),
> BazDefault
> end,
> ...

Again, this is totally unreal code.
Let me offer an equally unreal alternative:

safe_bar() ->
    case bar()
      of {ok, Bar} ->
            Bar
       ; {error, Reason} ->
            lager:error("SOS ~p", [Reason]),
            BarDefault
    end.

safe_baz() ->
    case baz()
      of {ok, Baz} ->
            Baz
       ; {error, Reason} ->
            lager:error("SOS ~p", [Reason]),
            BazDefault
    end.

foo() ->
    ...
    V1 = safe_bar(),
    V2 = safe_baz(),
    ...

> Of course I know real programmers never put two cases in the same function, but lazy people tend to. And lazy people also tend to use convenient variable names like Reason or Error in error branches.

Real programmers who put multiple cases in the same clause
take care to use disjoint names SO THAT HUMAN BEINGS WILL
NOT BE CONFUSED.

Lazy programmers who over-use names like Value, Error, Reason
deserve to be forced to maintain code written by other lazy
programmers.

> The only benefit I see of *not starting* a new scope within a case is that you can do this:
>
> foo() ->
> case bar() of
> 1 -> X = 1;
> 2 -> X = 2;
> _ -> X = 0
> end,
> bar(X).
>
> Which, in my opinion, is less readable than:
>
> foo() ->
> X = case bar() of
> 1 -> 1;
> 2 -> 2;
> _ -> 0
> end,
> bar(X).

You are forgetting that there may be multiple variables
involved. Packaging them up in tuples just so that you
can unpack them again? Feh.
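
A sketch of the contrast, with invented names. The first version binds two variables in every branch and uses them after the case; the second builds a tuple in each branch only to take it apart again.

```erlang
-module(multi_bind).
-export([classify/1, classify_tuple/1]).

%% Both Parity and Half are bound in every branch, so they are safe
%% to use after the case.
classify(N) ->
    case N rem 2 of
        0 -> Parity = even, Half = N div 2;
        _ -> Parity = odd,  Half = (N - 1) div 2
    end,
    {Parity, Half}.

%% The "value of the case" style: pack in each branch, unpack outside.
classify_tuple(N) ->
    {Parity, Half} =
        case N rem 2 of
            0 -> {even, N div 2};
            _ -> {odd, (N - 1) div 2}
        end,
    {Parity, Half}.
```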

The thing is that Erlang is what it is
and is not something else.
Criticising Erlang for not being like something else
(especially when it turns out that the something else
is not in fact like that either)
is silly.

Before 'fun' and list comprehensions were added to Erlang
the rule was simple and uniform: one name, one variable,
everywhere in a function clause. 'fun' and list
comprehensions break that rule because they *have* to,
but even they try not to break it any more than they can help.

>
> I hope these examples may pass the "this code is too trivial/ugly to even notice your actual problem" filter with better luck than the previous nominee. :)

Your examples are profoundly unreal.
Real examples could well be informative.
So far, all you have shown is that lazy programmers
can screw up and the compiler sometimes notices.
So this is news, already?

Richard A. O'Keefe

Mar 4, 2014, 10:00:36 PM
to Daniel Goertzen, Erlang Questions

On 5/03/2014, at 5:10 AM, Daniel Goertzen wrote:

> I've been reflecting on this some more, and have realized one of my biases on the subject comes from the way C/C++ manages scope: Anything within curly braces, whether it is a function, if, while, for, try, or catch, is a scope which encapsulates all declarations within. This uniform rule makes it easier to reason about code, and not having it in Erlang is, well, jarring to me.

This is rather confusingly worded. In Classic C and C89,
new scopes were introduced by '{...}' and only by '{...}'.
There was a proposal years ago to add
    begin
        <body1>
    within
        <body2>
    end
to Erlang with the idea that variable bindings introduced
in <body1> would be visible in <body2> but not outside,
while variable bindings in <body2> would be visible outside
as usual.

I don't know why that was rejected.

Thomas Lindgren

Mar 5, 2014, 6:18:01 AM
to Szoboszlay Dániel, Erlang Questions

However, note that you don't want to write that sort of code anyway!
However, note that that code also is problematic in another way. 

What happens if we modify the second clause of your example to make Value safe? 
Value = 
   case orddict:find(foo, Dict) of
     {ok, Value} -> Value;
     error -> Value = calculate_very_expensive_default_value()
   end
How does this evaluate? If orddict finds foo, variable Value is bound in {ok, Value} in the first clause of the case. Then that value is returned from the case clause and MATCHED WITH the binding of the variable Value (itself). Recall that, since Value has been bound in the case clause previously, "Value = case ... end" means matching, not just variable binding. The same reasoning goes for the second clause. 

The basic problem is that the code above mixes two ways of writing erlang. It works, but only by accident, and only at the cost of needlessly walking Value a number of times matching itself. (See the end of this mail for the underlying problem, though.)

Here are the two ways to get it right. First, the old way of writing the same code, which these days gives warnings about exported variables:
case orddict:find(foo, Dict) of
  {ok, Value} -> Value
  ...
end,
... %% use Value afterwards
Note that there's no enclosing "Value = case ... end" up there. Then the new way (which is what most of us use now):
Value =
   case orddict:find(foo, Dict) of
      {ok, Val} -> Val;    %% fresh variable
      ...
   end,
... %% use Value, but not Val

In the larger scheme of things: The way I see it, the feature of implicit variable matching is more trouble than it's worth. Warnings about such matches would catch this kind of problem.

Best,
Thomas

Thomas Lindgren

Mar 5, 2014, 6:20:43 AM
to Szoboszlay Dániel, Erlang Questions
(The beginning of that mail looks like Yahoo did a remix of multiple drafts, but I think the rest is okay. Sorry.)

Best,
Thomas

Anthony Ramine

Mar 5, 2014, 6:25:06 AM
to Thomas Lindgren, Erlang Questions
The compile option is warn_export_vars:

Causes warnings to be emitted for all implicitly exported variables referred to after the primitives where they were first defined. No warnings for exported variables unless they are referred to in some pattern, which is the default, can be selected by the option nowarn_export_vars.
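
A minimal sketch of turning the option on for one module (module and function names are hypothetical). With the attribute in place, the compiler should warn about S being exported from the case at the point where it is used afterwards.

```erlang
-module(export_warn).
-compile(warn_export_vars).
-export([sign/1]).

sign(N) ->
    case N < 0 of
        true  -> S = negative;
        false -> S = non_negative
    end,
    %% S is bound in both branches, so this is legal, but with
    %% warn_export_vars the use below triggers a compile-time warning.
    S.
```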

--
Anthony Ramine

On 5 March 2014 at 12:18, Thomas Lindgren <thomasl...@yahoo.com> wrote:

> In the larger scheme of things: The way I see it, the feature of implicit variable matching is more trouble than it's worth. Warnings about such matches would catch this kind of problem.
>

Thomas Lindgren

Mar 5, 2014, 9:33:42 AM
to Anthony Ramine, Erlang Questions
Thanks Anthony, that option warns for Value in the previous example because it's exported from the case, but not for the variable match that then occurs. I'd like to see a compiler option that warned about matching (bound variables occurring in patterns), which would catch other inadvertent reuse of variable names.

Best,
Thomas

Anthony Ramine

Mar 5, 2014, 9:39:01 AM
to Thomas Lindgren, Erlang Questions
What would you like to be warned about? I don’t get it.

--
Anthony Ramine

Thomas Lindgren

Mar 5, 2014, 10:38:21 AM
to Anthony Ramine, Erlang Questions
Occurrences of bound variables in patterns. For example A, B, C, and D in these:

  {A, B} = f(X),
  {A, B} = f(Y),   % A,B already bound
  C = g(Z),
  C = g(W),        % C is already bound
  {D,D} = h(V)     % D is bound then matched

Best,
Thomas

Anthony Ramine

Mar 5, 2014, 10:57:28 AM
to Thomas Lindgren, Erlang Questions
Why would you warn for this? It’s a feature.

--
Anthony Ramine

Daniel Goertzen

Mar 5, 2014, 12:12:38 PM
to Erlang Questions

So I received suggestions to factor out the little functions.  But I was sure that would ruin the visual locality and “flow” of the code thereby making it harder to read.  So I decided to go ahead and implement it that way so I could show those people that they are wrong.  But as I started factoring things out I could see the larger-grain structure more easily, and saw some easy ways to collapse and clean up things… and some of the things that I thought would be problems were not problems at all.  Boy, am I ever eating crow now. :)


Here is the result: https://gist.github.com/goertzenator/9370847

There may be a bug or two; I am in the process of making functional changes.


So a big thank you to Richard and the others who commented on my coding style.  I am the sole Erlanger at a small startup, so I don’t get the benefit of that kind of review.


The scoping rules of case still irritate me, but now I have a better sense as to why it is the way it is, and I have been shown a better coding style that mostly removes the irritation.


Thanks again,

Dan.

Thomas Lindgren

Mar 5, 2014, 12:13:29 PM
to Anthony Ramine, Erlang Questions
The way I see it, the feature of implicit variable matching is more trouble than it's worth. Warnings about such matches would catch this kind of problem.

For myself, it's seldom used in practice, and mostly a source of hidden bugs and unobvious code. Rewriting into explicit comparisons is normally straightforward and clearer, so having it doesn't buy me a lot. But maybe I'm being unreasonable. Anyone use bound variables in patterns a lot by intent?
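
As an illustration of the rewrite mentioned above (a sketch, not code from the thread; the module and function names are hypothetical), the same check is written once with a bound variable in the pattern and once with an explicit comparison in a guard:

```erlang
-module(match_demo).
-export([same_implicit/2, same_explicit/2]).

%% Implicit: Expected is already bound when the case is evaluated, so
%% the pattern {ok, Expected} matches only a term exactly equal to it.
same_implicit(Expected, Result) ->
    case Result of
        {ok, Expected} -> true;
        _ -> false
    end.

%% Explicit: a fresh variable is bound and compared in a guard.
%% =:= is exact equality, which is what pattern matching uses.
same_explicit(Expected, Result) ->
    case Result of
        {ok, Actual} when Actual =:= Expected -> true;
        _ -> false
    end.
```

Both functions should agree on every input; the second spells out the comparison that the first leaves implicit.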

Best,
Thomas

Loïc Hoguin

Mar 5, 2014, 2:22:59 PM
to Thomas Lindgren, Erlang Questions
On 03/05/2014 06:13 PM, Thomas Lindgren wrote:
> The way I see it, the feature of implicit variable matching is more
> trouble than it's worth. Warnings about such matches would catch this
> kind of problem.
>
> For myself, it's seldom used in practice, and mostly a source of hidden
> bugs and unobvious code. Rewriting into explicit comparisons is normally
> straightforward and clearer, so having it doesn't buy me a lot. But
> maybe I'm being unreasonable. Anyone use bound variables in patterns a
> lot by intent?

*Yes*.

--
Loïc Hoguin
http://ninenines.eu

Jesper Louis Andersen

Mar 5, 2014, 2:35:06 PM
to Loïc Hoguin, Erlang Questions

On 05 Mar 2014, at 20:22, Loïc Hoguin <es...@ninenines.eu> wrote:

> On 03/05/2014 06:13 PM, Thomas Lindgren wrote:
>> The way I see it, the feature of implicit variable matching is more
>> trouble than it's worth. Warnings about such matches would catch this
>> kind of problem.
>>
>> For myself, it's seldom used in practice, and mostly a source of hidden
>> bugs and unobvious code. Rewriting into explicit comparisons is normally
>> straightforward and clearer, so having it doesn't buy me a lot. But
>> maybe I'm being unreasonable. Anyone use bound variables in patterns a
>> lot by intent?
>
> *Yes*.

Me too!

Thomas Lindgren

Mar 5, 2014, 3:31:51 PM
to Jesper Louis Andersen, Loïc Hoguin, Erlang Questions

Thanks, any examples either of you can share where they are particularly handy?

Best,
Thomas

Szoboszlay Dániel

Mar 5, 2014, 3:48:05 PM
to Erlang Questions, Richard A. O'Keefe

On Wed, 05 Mar 2014 03:52:22 +0100, Richard A. O'Keefe <o...@cs.otago.ac.nz>
wrote:

> Wrong. Yes, list comprehensions *happen* to be implemented
> as calls to out-of-line functions, but the *necessity* for
> list comprehensions to be separate scopes falls directly
> out of their semantics.
>
> Suppose it were otherwise.
>
> R = [(Y = f(X), Y) || X <- L].
>
> What value would X have after the comprehension
> - if L = []?
> - if L = [1]?
> - if L = [1,2]?
> What value would Y have?
>
> The key thing is that X and Y get _different_ values
> on each iteration, and Erlang variables being immutable,
> the only way we can make sense of that is if each
> iteration gets its _own_ X and Y.

The key thing is that list comprehensions introduce a new scope, and thus
behave differently from the rest of the expressions. This is not intuitive,
even if it is necessary.

> In Classic C and C89, *none* of 'if', 'switch', 'while',
> 'do', and 'for' introduce new scopes.
> In C99, it's clear from 6.8.4 that 'if' and 'switch'
> do *NOT* introduce new scopes and it is clear from
> 6.8.5 that 'while' and 'do' do *NOT* introduce new
> scopes.
>
> In Classic C and C89 the *ONLY* construction that
> introduces a new scope is '{...}'.

I don't want to argue with specs, you are technically correct of course.
But the mental model is that whenever I introduce a new variable in an
if-branch, it will go into a new scope. From the compiler's perspective
it's due to the curly braces. From my perspective it's because I want to
use that variable in that single block only.

> The thing is that Erlang is what it is
> and is not something else.
> Criticising Erlang for not being like something else
> (especially when it turns out that the something else
> is not in fact like that either)
> is silly.

I'm not criticizing Erlang for being what it is. I love this language the
way it is already. But I think it could be improved in some areas, and the
lack of scoping expressions is one such area. Changing the way case works
would no doubt be a terrible decision. Allowing code to be written in the
style of my examples without breaking backward compatibility would be a useful
addition for those who prefer this style.

If starting a new scope were as easy in Erlang as in C, I'd be very
happy. And why shouldn't it be?

For example consider a new expression: "<- Expression" that has the value
of Expression but introduces new variables in Expression to a new scope.
(OK, probably using the left arrow might have some problems, but at least
it looks very Erlangish). Then I could write:

foo(Dict) ->
    Value = <- case orddict:find(foo, Dict) of
                   {ok, Value} -> Value;
                   error -> calculate_very_expensive_default_value()
               end,
    ...

(Btw. my problem with "begin ... within ... end" mentioned in a later
email is that it is way too verbose. Which is of course also very
Erlangish...)
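
For comparison, a fresh scope can already be had in current Erlang by moving the case into its own function, since variables bound in a function body are not visible to the caller. A minimal sketch under that assumption (find_or_default/2 is a hypothetical helper, and the default-value function is a stand-in):

```erlang
-module(scope_demo).
-export([foo/1]).

foo(Dict) ->
    %% Value is bound exactly once here; nothing leaks out of the helper.
    Value = find_or_default(foo, Dict),
    {value, Value}.

%% V exists only inside this function clause.
find_or_default(Key, Dict) ->
    case orddict:find(Key, Dict) of
        {ok, V} -> V;
        error -> calculate_very_expensive_default_value()
    end.

%% Hypothetical stand-in for the expensive computation in the example.
calculate_very_expensive_default_value() ->
    default.
```

This costs a function name and a few extra lines, which is the verbosity trade-off under discussion.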

And once we are there, there are some other nice tricks we could do with
scopes. Consider this real life example:
https://github.com/rebar/rebar/blob/master/src/rebar_core.erl#L182-L263 -
this function performs a lot of sequential tasks and generates a series of
Config variables in the meantime (Config1, Config2, ...). A typical error
is that after creating Config123 somewhere later you accidentally use
Config122 instead. It would be nice if I could "drop" a variable from the
scope and get a compile error if I'd happen to reuse it later. Like this:

DirSet3 = sets:add_element(Dir, DirSet2),
~DirSet2, % Never ever use DirSet2 again!
...

or even:

DirSet3 = sets:add_element(Dir, ~DirSet2),
...

But I can accept if you think these ideas are worthless as well, only I'd
love to hear and discuss your arguments.

Cheers,
Daniel

Anthony Ramine

Mar 5, 2014, 4:12:35 PM
to Szoboszlay Dániel, Erlang Questions
Replied inline.

--
Anthony Ramine
JavaScript compilers would love it if scope was just a matter of a pair of curly braces.

From my perspective, the code

X = ...,
foo(X).

isn’t the same as

let X = ... in
foo(X).

In the first, « = » does not denote a let-binding of which the resulting augmented environment is used to evaluate its body. It denotes a match expression, which happens to never fail because the pattern matches everything, and of which the environment spills into the next expression « foo(X) ». Given that a match exports variables, why shouldn’t case? if?

I have a patch somewhere which I need to finish, that makes exporting from try possible too, for consistency’s sake.

That being said, if it were just me, I would just forbid the use of all unsafe and exported names. And make fun heads non-shadowing. One True Scope, all the way long.

>> The thing is that Erlang is what it is
>> and is not something else.
>> Criticising Erlang for not being like something else
>> (especially when it turns out that the something else
>> is not in fact like that either)
>> is silly.
>
> I'm not criticizing Erlang for being what it is. I love this language the way it is already. But I think it could be improved in some areas, and the lack of scoping expressions is one such area. Changing the way case works would no doubt be a terrible decision. Allowing code to be written in the style of my examples without breaking backward compatibility would be a useful addition for those who prefer this style.

The lack of scoping expressions is a feature, I don’t want to have to look hard to find the initial binding of a variable, rebinding makes finding the original binding place harder. Also, I find the One True Scope and nonlinear patterns to be quite related and I don’t think it would be nice to have one without the other. Removing nonlinear patterns makes very idiomatic code impossible:

Ref = erlang:make_ref(),
Pid ! {request,Ref},
receive {reply,Ref,Msg} -> ok end.
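
A fuller, runnable sketch of that idiom follows. It is an illustration only: the module name, the echo server, the extra self() in the request tuple and the one-second timeout are all assumptions added here. The point is that Ref is already bound when the receive pattern is evaluated, so only the reply carrying the same reference is taken from the mailbox.

```erlang
-module(ref_demo).
-export([call/2, echo_server/0]).

%% Send a request tagged with a fresh reference and wait for the
%% reply that carries the same reference.
call(Pid, Request) ->
    Ref = erlang:make_ref(),
    Pid ! {request, self(), Ref, Request},
    receive
        %% Ref is bound, so this is a nonlinear pattern: replies with
        %% any other reference stay in the mailbox.
        {reply, Ref, Msg} -> {ok, Msg}
    after 1000 ->
        {error, timeout}
    end.

%% Hypothetical server that echoes each request back to its sender.
echo_server() ->
    receive
        {request, From, Ref, Request} ->
            From ! {reply, Ref, Request},
            echo_server()
    end.
```

Usage, for example from the shell: Pid = spawn(ref_demo, echo_server, []), followed by ref_demo:call(Pid, hello).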

> If starting a new scope were as easy in Erlang as in C, I'd be very happy. And why shouldn't it be?

Because we don’t want a new scope.

> For example consider a new expression: "<- Expression" that has the value of Expression but introduces new variables in Expression to a new scope. (OK, probably using the left arrow might have some problems, but at least it looks very Erlangish). Then I could write:
>
> foo(Dict) ->
>     Value = <- case orddict:find(foo, Dict) of
>                    {ok, Value} -> Value;
>                    error -> calculate_very_expensive_default_value()
>                end,
>     ...
>
> (Btw. my problem with "begin ... within ... end" mentioned in a later email is that it is way too verbose. Which is of course also very Erlangish…)

This is certainly not Erlangish because it introduces a new scope even though there is no lambda involved, and because there is no arrow in Erlang which is a prefix or postfix operator. Why would you use an arrow unbalanced?

> And once we are there, there are some other nice tricks we could do with scopes. Consider this real life example: https://github.com/rebar/rebar/blob/master/src/rebar_core.erl#L182-L263 - this function performs a lot of sequential tasks and generates a series of Config variables in the meantime (Config1, Config2, ...). A typical error is that after creating Config123 somewhere later you accidentally use Config122 instead. It would be nice if I could "drop" a variable from the scope and get a compile error if I'd happen to reuse it later. Like this:

Sometimes, the rules in Erlang are just here to make sure your life will be miserable if you get too fancy in a single clause; this is such an occurrence. Also if there were fewer comments, it would be better in my opinion. And all these newlines, they make my eyes wander all around.

%% Check that this directory is not on the skip list
Config7 = case rebar_config:is_skip_dir(Config3, Dir) of

What’s the point of such comments?

Richard A. O'Keefe

Mar 5, 2014, 8:02:25 PM
to Anthony Ramine, Erlang Questions
> On 5 March 2014, at 21:48, Szoboszlay Dániel <dszob...@gmail.com> wrote:

>> And once we are there, there are some other nice tricks we could do with scopes. Consider this real life example: https://github.com/rebar/rebar/blob/master/src/rebar_core.erl#L182-L263 - this function performs a lot of sequential tasks and generates a series of Config variables in the meantime (Config1, Config2, ...).

I don't have a problem with multipage functions that have
lots of clauses. I have a serious problem with clauses
that I can't see all at once. These days I have a lovely
big screen so I can get 60 lines on screen at once, which
is also about what I can see on a sheet of paper.

This function is one 82-line clause.

There are a number of little bits that bug me about it.
For example

%% Make sure the CWD is reset properly; processing the dirs may have
%% caused it to change
ok = file:set_cwd(Dir),

occurs twice. That tells me that the code is in the wrong place:
if "processing the dirs may" cause an *unwanted* change to the
current directory, then it *shouldn't*. It would be interesting
to change the design, but the comments don't tell me *where* the
current directory might be changed.

Oddly enough, an issue about changing the current working
directory came up recently in the SWI Prolog mailing list.
The answer was "These days, SWI Prolog is multithreaded,
so after your program starts up, DON'T change the current
working directory."

The file server's notion of the current working directory
counts as *shared mutable state*, the kind of thing Erlang
is supposed to help us avoid.

Just looking at this function, it is far from obvious that
Dir ever _was_ the current directory. Let us suppose that
it was. Then there is a snag. The pattern

let saved_cwd = getcwd() in $(
do something
chdir(saved_cwd)
$)

can fail. That's why modern UNIX systems have

fd = open(".", O_RDONLY);
...
fchdir(fd);

If I were doing anything in Erlang where I needed to change
"the" current working directory, I'd put the information in
the process dictionary.
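
One possible reading of that suggestion, as a minimal sketch: each process remembers its own notion of a current directory in its process dictionary and resolves relative paths against it, so the file server's shared working directory is never changed. The module name, the key cwd_demo_dir and both functions are hypothetical.

```erlang
-module(cwd_demo).
-export([set_dir/1, resolve/1]).

%% Remember a per-process "current directory" without calling
%% file:set_cwd/1, so no shared state is modified.
set_dir(Dir) ->
    put(cwd_demo_dir, Dir),
    ok.

%% Resolve Path against the remembered directory. If no directory
%% has been set in this process, Path is returned unchanged.
resolve(Path) ->
    case get(cwd_demo_dir) of
        undefined -> Path;
        Dir -> filename:join(Dir, Path)
    end.
```

Callers would then pass resolve(Path) to the file functions instead of relying on the working directory.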

Now let's return to the issue of variable names and
possible collisions between them. I see a lot of
Config variables, and I haven't a clue what they are,
and I especially haven't a clue how the
execute_* functions are supposed to be updating them.
The documentation for execute/5, for example, says
nothing about the result. What a 'Config' *is* and
how it's processed are core concepts in this file,
but nothing in the file documents them.

When it comes to trying to read the code, _this_ is
a far more serious issue than 'case'; it's even
more serious than the length of the function.