[erlang-questions] Dict size in guards

45 views
Skip to first unread message

Alessandro Sivieri

unread,
Feb 10, 2011, 11:05:52 AM2/10/11
to Erlang Questions
Hi all,

is there a good reason for not allowing dict:size() in guards? I mean, it
looks exactly like length() for lists, in theory (of course, their
implementations are different).

--
Sivieri Alessandro
alessandr...@gmail.com
http://www.chimera-bellerofonte.eu/
http://www.poul.org/

Robert Virding

unread,
Feb 10, 2011, 12:27:14 PM2/10/11
to Alessandro Sivieri, Erlang Questions
There is a general restriction on guard tests that no calls to user defined functions are allowed. This irrespective of how simple these functions may be, as there is no way of guaranteeing this at run-time.

Robert

--
Robert Virding, Erlang Solutions Ltd.


________________________________________________________________
erlang-questions (at) erlang.org mailing list.
See http://www.erlang.org/faq.html
To unsubscribe; mailto:erlang-questio...@erlang.org

Alessandro Sivieri

unread,
Feb 10, 2011, 12:28:59 PM2/10/11
to Robert Virding, Erlang Questions
On Thu, Feb 10, 2011 at 6:27 PM, Robert Virding <
robert....@erlang-solutions.com> wrote:

> There is a general restriction on guard tests that no calls to user defined
> functions are allowed. This irrespective of how simple these functions may
> be, as there is no way of guaranteeing this at run-time.
>
>

I know, but this does not change my question, at least about that specific
function (which is not user defined, is in stdlib...).

Robert Virding

unread,
Feb 10, 2011, 12:33:32 PM2/10/11
to Alessandro Sivieri, Erlang Questions
----- "Alessandro Sivieri" <alessandr...@gmail.com> wrote:
>
> On Thu, Feb 10, 2011 at 6:27 PM, Robert Virding < robert....@erlang-solutions.com > wrote:
>

There is a general restriction on guard tests that no calls to user defined functions are allowed. This irrespective of how simple these functions may be, as there is no way of guaranteeing this at run-time.
>
>
> I know, but this does not change my question, at least about that specific function (which is not user defined, is in stdlib...).

In this case "user-defined" means written in Erlang. And irrespective of who wrote the module and whether it is in the release there are still no guarantees that at run-time this is the code which is run.

Robert

OvermindDL1

unread,
Feb 10, 2011, 4:28:50 PM2/10/11
to Robert Virding, Alessandro Sivieri, Erlang Questions


Would not the Abstract Patterns and other enhancements put through be
Richard A. O'Keefe help this issue a great deal?

Richard O'Keefe

unread,
Feb 10, 2011, 9:02:57 PM2/10/11
to Alessandro Sivieri, Erlang Questions

On 11/02/2011, at 5:05 AM, Alessandro Sivieri wrote:

> Hi all,
>
> is there a good reason for not allowing dict:size() in guards?

Yes. Who knows what dict:size() actually does?
Guards cannot and should not call functions.

> I mean, it
> looks exactly like length() for lists, in theory (of course, their
> implementations are different).

The inclusion of length/1 in guard tests has always been more than
a bit dodgy. It's better taste to avoid it. Not least because
if you do something like

f(L)
when length(L) > 10 -> long_case(L);
when length(L) > 5 -> medium_case(L);
when true -> short_case(L).

doesn't do what people often think. I've several times been
caught by that.

f(L) ->
N = length(L),
if N > 10 -> long_case(L)
; N > 5 -> medium_case(L)
; true -> short_case(L)
end.

is safer and only evaluates the length once.

A style warning "length/1 in guard" would be nice to have.

Richard O'Keefe

unread,
Feb 10, 2011, 9:04:25 PM2/10/11
to Alessandro Sivieri, Robert Virding, Erlang Questions

On 11/02/2011, at 6:28 AM, Alessandro Sivieri wrote:

> On Thu, Feb 10, 2011 at 6:27 PM, Robert Virding <
> robert....@erlang-solutions.com> wrote:
>
>> There is a general restriction on guard tests that no calls to user defined
>> functions are allowed. This irrespective of how simple these functions may
>> be, as there is no way of guaranteeing this at run-time.
>>
>>
> I know, but this does not change my question, at least about that specific
> function (which is not user defined, is in stdlib...).

I have been known to patch the occasional stdlib function...

Ulf Wiger

unread,
Feb 11, 2011, 3:03:17 AM2/11/11
to Richard O'Keefe, Alessandro Sivieri, Erlang Questions

On 11 Feb 2011, at 02:02, Richard O'Keefe wrote:

> The inclusion of length/1 in guard tests has always been more than
> a bit dodgy. It's better taste to avoid it. Not least because
> if you do something like
>
> f(L)
> when length(L) > 10 -> long_case(L);
> when length(L) > 5 -> medium_case(L);
> when true -> short_case(L).
>
> doesn't do what people often think. I've several times been
> caught by that.


Also, as pointed out by Kostis in his Tidier presentations, a fairly
common use is

f(L) when length(L) == 0 -> …

which of course has to traverse the whole (potentially very long) list.

BR,
Ulf

Ulf Wiger, CTO, Erlang Solutions, Ltd.
http://erlang-solutions.com

Tony Rogvall

unread,
Feb 11, 2011, 2:35:36 AM2/11/11
to Richard O'Keefe, Alessandro Sivieri, Erlang Questions, Björn Gustavsson

One way of avoiding length in this case is of course:

f(L) ->
case L of
[_,_,_,_,_,_,_,_,_,_,_|_] ->
long_case(L);
[_,_,_,_,_,_|_] ->
medium_case(L);
_ ->
short_case(L)
end.

May be the compiler could take care of the "simple" cases like this, and then
issue a warning if it can not do the transformation ?

BTW.
The compiler needs a fix to handle this transform in general. It uses way to many registers.
Maybe it's a problem with compiling guards?


> A style warning "length/1 in guard" would be nice to have.
>
>
>
> ________________________________________________________________
> erlang-questions (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-questio...@erlang.org
>

"Have run Make so many times I dunno what's installed anymore"

Tony Rogvall

unread,
Feb 11, 2011, 3:54:37 AM2/11/11
to Ulf Wiger, Richard O'Keefe, Alessandro Sivieri, Erlang Questions
And in general guard you can use something like this:

f(L) ->
if is_list(tl(tl(tl(tl(tl(tl(tl(tl(tl(tl(tl(L)))))))))))) -> long_case(L);
is_list(tl(tl(tl(tl(tl(tl(L))))))) -> medium_case(L);
true -> short_case(L)
end.

It compiles a LOT better than the corresponding pattern version.

/Tony

"Have run Make so many times I dunno what's installed anymore"

Michael Turner

unread,
Feb 11, 2011, 4:40:12 AM2/11/11
to Tony Rogvall, Ulf Wiger, Richard O'Keefe, Alessandro Sivieri, Erlang Questions
FWIW, Doug Clark found that the average length of lists in large, mature
lisp applications was about four.

http://portal.acm.org/citation.cfm?id=359423.359427

<http://portal.acm.org/citation.cfm?id=359423.359427>Even if f() [see below]
were statistically tuned, however, the basic problem remains -- you wouldn't
be able to use that f() function in a guard, and you CAN use length() in a
guard, despite a possible compromise of the "soft real-time" promise.

How hard would it be to make length() execute in (short) constant time,
e.g., with an actual length attribute? I know this implies consing up a
length for each list returned by any evaluation, so (except for
length()-heavy list computation) it would tend to hurt performance. But if
the compiler has good live-variable analysis, it might be possible to avoid
computing list lengths (and storing them) in the overwhelming majority of
cases. Some such data flow analysis might also (with a little work) do
double-duty by making more opportunistic garbage collection of temporary
lists possible. For applications where space is more important than time,
the extra space required for storing list lengths could then be *much* more
than made up for, with cdr-coding and other time-honored tricks for reducing
list bulk.

-michael turner

David Mercer

unread,
Feb 11, 2011, 9:35:01 AM2/11/11
to Richard O'Keefe, Erlang Questions
On Thursday, February 10, 2011, Richard O'Keefe wrote:

> Guards cannot and should not call functions.

^^^^^^^^^^^^^^

Is the "should not" because of side-effects in Erlang functions? If Erlang
functions were pure, would you remove the "and should not" from your
statement? I realize there may be issues of performance if they are
allowed, but that would be up to the programmer, I suppose. Does Haskell
permit user-defined (non-monadic) functions in guards?

Just asking from a language theory point of view, since I know you're an
expert in this area. Thanks, ROK.

Cheers,

DBM

Raphael Korsoski

unread,
Feb 11, 2011, 10:29:09 AM2/11/11
to Erlang Questions
Hi,

To my understanding, Haskell guards are very different from Erlang guards. Haskell compiles guards to case statements (afaik), while Erlang evaluates them for selection at runtime.

In Haskell we can write:

f x | my_length(x) > 3 = "longer than 3"
| otherwise = "shorter than or equal to 3"
where my_length = λa → <some function, e.g. length(a)>

which does contain a user-defined non-monadic funtion.

In GHC it's also possible to do binding within guards, like so:

g x | s ← my_length(x)
, s > 3 = "longer than 3"
| otherwise = "shorter than or equal to 3"
where my_length = λa → <some funtion>

Hope that answers something of your question.

BR,
Raphael K

Alessandro Sivieri

unread,
Feb 11, 2011, 10:47:19 AM2/11/11
to Raphael Korsoski, Erlang Questions
Well, I think guards are more elegant than case/if statements; but the
Erlang limitation forces you to use the latter, that's all.
I mean, it is not a problem to get the dictionary length as the first
operation of a function and create an if statement, but...

James Churchman

unread,
Feb 11, 2011, 12:10:02 PM2/11/11
to Alessandro Sivieri, Raphael Korsoski, Erlang Questions
f(L)
when length(L) > 10 -> long_case(L);
when length(L) > 5 -> medium_case(L);
when true -> short_case(L).

(which is subtly wrong, as its misses the two following f(L)'s :-) )

can also be written like :

f(L)->f(L,length(L)).

f(L,Len) when Len > 10 -> long_case(L);
f(L,Len) when Len > 5 -> medium_case(L);
f(L,_) -> short_case(L).

which i think i personally prefer....

and if the list L is only needed for length it gets even shorter (tho f/2 needs renaming if becoming f/1 )

also you could consider using an orddict, the api is identical to the dict, but it consists of a normal list so both of the above code works with out modification. There will obviously just be a lookup time difference for large lists (eg ones that are more than "4" :-) )

http://www.erlang.org/doc/man/orddict.html

and if using normal dict, just replace length with dict:size() in the 2nd code


Also why does the compiler do a length check for every time you use length in a guard ?? I thought it did not do repetitive pattern matches that are "the same", so surly repetitive guard would also be similarly optimised?


also "hacks" like [_,_,_,_,_,_,_,_,_,_,_|_] look very clever, and i guess for very long lists have a speed benefit, but again could the compiler not realise that when you put "when Len > 10" it only needs to check that the list is longer than 10? not get the genuine numeric length and then numeric compare to 10. That seems a basic optimisation that could be made

also, why does is_list(tl(tl(tl(tl(tl(tl(L))))))) compile better?? it this true?? it looks really nasty!

Björn Gustavsson

unread,
Feb 12, 2011, 2:28:12 AM2/12/11
to James Churchman, Alessandro Sivieri, Raphael Korsoski, Erlang Questions
On Fri, Feb 11, 2011 at 6:10 PM, James Churchman
<jamesch...@gmail.com> wrote:
> Also why does the compiler do a length check for every time you use length in a guard ?? I thought it did not do repetitive pattern matches that are "the same", so surly repetitive guard would also be similarly optimised?

That would be possible, but it is not done yet.
The compiler currently does not do much
optimization of guards and it does not do
common sub-expression elimination.

>
> also "hacks" like [_,_,_,_,_,_,_,_,_,_,_|_] look very clever, and i guess for very long lists have a speed benefit, but again could the compiler not realise that when you put "when Len > 10" it only needs to check that the list is longer than 10? not get the genuine numeric length and then numeric compare to 10. That seems a basic optimisation that could be made

That basic optimization is not done because the
length(L) will cause an exception if the L is improper,
but only checking 10 elements will not cause an exception.
The compiler does not do optimizations that would
silence potential exceptions (silencing exceptions could
hide bugs in the program and make debugging more
difficult).

Here is an example of an improper list that will
cause a badarg exception in length/1:

lists:seq(1, 20) ++ [a|b]

--
Björn Gustavsson, Erlang/OTP, Ericsson AB

Jeff Schultz

unread,
Feb 12, 2011, 3:29:51 AM2/12/11
to Erlang Questions
On Fri, Feb 11, 2011 at 06:40:12PM +0900, Michael Turner wrote:
> FWIW, Doug Clark found that the average length of lists in large, mature
> lisp applications was about four.

FWIW,

1. As far as I remember, LISP programs don't typically manage strings
of characters as lists of small integers.

2. LISP doesn't have tuples, so things that in Erlang are usually
tuples are represented in LISP as small lists.

The one makes average Erlang lists longer, and the other makes average
LISP lists smaller.


Jeff Schultz

James Churchman

unread,
Feb 12, 2011, 8:12:33 AM2/12/11
to Björn Gustavsson, Alessandro Sivieri, Raphael Korsoski, Erlang Questions
i seee, so due to improper lists they have different behaviour

Trying : Len = length([1 , 2 | ok ]).
Gives : exception error: bad argument in function length/1 called as length([1,2|ok])

which is guess is ok because the list is improper and length is therefore undefined

and with a guard :

LengthFun = fun
(List) when length(List) > 1 -> big;
(_) -> small
end.

LengthFun( [1 , 2 | ok ] )

Gives small as the guard silently fails..

I guess the optimisation could still be made internally in beam somehow but its definitely more complex that it seems at first.

On 12 Feb 2011, at 07:28, Björn Gustavsson wrote:

> That basic optimization is not done because the
> length(L) will cause an exception if the L is improper,
> but only checking 10 elements will not cause an exception.
> The compiler does not do optimizations that would
> silence potential exceptions (silencing exceptions could
> hide bugs in the program and make debugging more
> difficult).
>
> Here is an example of an improper list that will
> cause a badarg exception in length/1:
>
> lists:seq(1, 20) ++ [a|b]

o...@cs.otago.ac.nz

unread,
Feb 13, 2011, 11:16:10 PM2/13/11
to Jeff Schultz, Erlang Questions
> On Fri, Feb 11, 2011 at 06:40:12PM +0900, Michael Turner wrote:
>> FWIW, Doug Clark found that the average length of lists in large, mature
>> lisp applications was about four.
>
> FWIW,
>
> 1. As far as I remember, LISP programs don't typically manage strings
> of characters as lists of small integers.
>
> 2. LISP doesn't have tuples, so things that in Erlang are usually
> tuples are represented in LISP as small lists.
>
> The one makes average Erlang lists longer, and the other makes average
> LISP lists smaller.

I suspect that this is the study behind the "CDR coding" paper,
which was based on 'five large Lisp programs (about 50,000 cells each)'.
This goes back to the era when a DECsystem-10 with 250 kilowords
(about 1.25 MB in today's terms, although it could give you very nearly
200k cons cells, the _effective_ equivalent of 1.6MB) was a mainframe
and a Xerox 1108 running Interlisp with 4 MB of memory was huge
(and a 24-bit address space and 8 bit CDR field gave you 4 byte
cons cells).

Since then, things have changed a lot. We have and we use a LOT more
memory. We expect to fit a lot more data in.
(HEAP-UTILIZATION) in a quiescent Clozure Common Lisp --- none of my
own code loaded --- says there are 293679 cons cells, but says nothing
about how long the lists are. I note that a modern Lisp with NO user
program in it has more cons cells than classic DEC-10 Lisps could
hold when full.

For comparison, I just took a quiescent Squeak 3.10 system and did
|n d|
n := d := 0.
Array allInstances do: [:each | n := n + each size. d := d + 1].
n / d asFloat
and got an average Array size of a whisker under 12.

Using 'String allSubInstancesDo:' instead gave me an average string
size of a whisker over 30.

Anyone have Lisp figures that *aren't* obsolete?

Reply all
Reply to author
Forward
0 new messages