[erlang-questions] Trouble with Erlang's lenient comparisons

JohnyTex

Apr 13, 2011, 5:08:45 AM
to erlang-q...@erlang.org
Hello! :)
I'm a newbie Erlang programmer, and I've really liked my experience so
far. However, I've made mistakes related to Erlang's type system a
couple of times now, more specifically the fact that you can compare
different types with each other without Erlang complaining.

I've inherited someone else's code base; sometimes I forget that a
function returns a tuple and I end up comparing it with an integer or
similar, and get weird behaviour and hard-to-find bugs as a result.
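
A minimal sketch of the kind of bug described here. The module and function names are hypothetical and not taken from the inherited code base. It relies on Erlang's documented term order, in which every number sorts before every atom, every atom before every tuple, and every tuple before every list:

```erlang
-module(term_order_demo).
-export([over_limit/1, mixed_sort/0]).

%% Hypothetical limit check. Nothing stops Result from being a tuple:
%% any tuple is "greater" than any number in the term order, so a
%% forgotten {ok, 3} wrapper makes this return true without an error.
over_limit(Result) ->
    Result > 10.

%% Sorting a mixed list shows the order between types:
%% numbers, then atoms, then tuples, then lists.
mixed_sort() ->
    lists:sort([{ok, 1}, 3, "abc", foo, 2.5]).
```

Calling over_limit({ok, 3}) returns true, while over_limit(3) returns false, which is exactly the silent misbehaviour in question.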

What's the best way to prevent this? I will try writing some unit
tests later today with EUnit, and I'm generating PLTs for Dialyzer as
I write this, but I was wondering if there are any other tools or
"best practices" that could help me to write more "type safe" code?

Thanks in advance! :)
_______________________________________________
erlang-questions mailing list
erlang-q...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions

Ahmed Omar

Apr 13, 2011, 5:17:09 AM
to JohnyTex, erlang-q...@erlang.org
Using Typer and Dialyzer is usually what you need
--
Best Regards,
- Ahmed Omar
Follow me on twitter

Gordon Guthrie

Apr 13, 2011, 6:27:34 AM
to erlang-q...@erlang.org
> I've inherited someone else's code base; sometimes I forget that a
> function returns a tuple and I end up comparing it with an integer or
> similar, and get weird behaviour and hard-to-find bugs as a result.

The key here is always to check return types whenever you call a function.

You should be thinking 'if there is an error how do I make it appear
as quickly as possible?'.

So with function returns you don't want to pass them round in a mush
of generic variables until the error surfaces 20 fn calls from when it
appeared.

Also check for bad input to functions by using guards like:

fun(A, B, C) when is_list(A), is_integer(B), is_record(C, recC) ->

This will force errors to appear quicker if you do have a mushy
untyped variable error.

Always code the happy path and make everything else an error. If a fn
is designed so that the second parameter should be an integer and not a
float then use pattern matching/guards to make it so...
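
As a rough sketch of the happy-path idea (repeat/2 is a made-up function, used only for illustration): the guards accept exactly the intended argument types, and any other call fails straight away with function_clause.

```erlang
-module(happy_path).
-export([repeat/2]).

%% Hypothetical example: only a list and a non-negative integer are
%% accepted. A float count, or a tuple in place of the list, fails
%% immediately with function_clause instead of travelling further
%% into the program.
repeat(List, N) when is_list(List), is_integer(N), N >= 0 ->
    lists:append(lists:duplicate(N, List)).
```

So repeat([a, b], 2) works, while repeat([a, b], 2.0) crashes at the call, close to where the mistake was made.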

Use Dialyzer to pick up cases where you have let this sort of thing
happen, and write specs to tell Dialyzer, yourself and your co-workers
your intentions with functions:
http://www.erlang.org/doc/reference_manual/typespec.html
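
A small sketch of what such a spec might look like. The lookup_age/2 function and the person() type are invented for this example. The point is that the spec states the result is a tagged tuple or an atom, never a bare integer:

```erlang
-module(spec_demo).
-export([lookup_age/2]).

%% Hypothetical data shape: a name paired with an age.
-type person() :: {Name :: string(), Age :: non_neg_integer()}.

%% The spec tells the reader (and Dialyzer) that callers get
%% {ok, Age} or not_found back, so comparing the result directly
%% with an integer is a mistake.
-spec lookup_age(string(), [person()]) -> {ok, non_neg_integer()} | not_found.
lookup_age(Name, People) ->
    case lists:keyfind(Name, 1, People) of
        {_, Age} -> {ok, Age};
        false -> not_found
    end.
```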

Gordon

--
Gordon Guthrie
CEO hypernumbers

http://hypernumbers.com
t: hypernumbers
+44 7776 251669

Kostis Sagonas

Apr 13, 2011, 6:36:38 AM
to erlang-q...@erlang.org
Gordon Guthrie wrote:
>...

>
> Also check for bad input to functions by using guards like:
>
> fun(A, B, C) when is_list(A), is_integer(B), is_record(C, recC) ->

The following is a bit pedantic, but please do not offer use of the
is_record guard (esp. the one with arity two instead of three) as advice
to newcomers. I suggest that you try to forget about this guard's
existence; the only place where this guard is possibly needed is in
or/orelse contexts.

The above code is written better using pattern matching, as in:

fun(A, B, #recC{} = C) when is_list(A), is_integer(B) ->
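
To make the two styles concrete, here is a sketch with made-up records (circle and square are assumptions for illustration only). The first function uses record patterns in the head; the second is the kind of "either record" test where a guard disjunction is the natural fit:

```erlang
-module(shape_guards).
-export([describe/1, is_shape/1]).

-record(circle, {r}).
-record(square, {side}).

%% Record patterns in the head state the expected shape directly.
describe(#circle{r = R}) when is_number(R) -> {circle, R};
describe(#square{side = S}) when is_number(S) -> {square, S}.

%% One clause that accepts either record: an or-style guard context
%% where is_record/2 is hard to avoid.
is_shape(X) when is_record(X, circle); is_record(X, square) -> true;
is_shape(_) -> false.
```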

Kostis

Håkan Mattsson

Apr 13, 2011, 7:16:21 AM
to Kostis Sagonas, erlang-questions
When you discourage one way of writing code and claim that another way
is better you ought to motivate why.

/Håkan

On Wed, Apr 13, 2011 at 12:36 PM, Kostis Sagonas <kos...@cs.ntua.gr> wrote:
> Gordon Guthrie wrote:
>>
>> ...
>>
>> Also check for bad input to functions by using guards like:
>>
>> fun(A, B, C) when is_list(A), is_integer(B), is_record(C, recC) ->
>
> The following is a bit pedantic, but please do not offer use of the
> is_record guard (esp. the one with arity two instead of three) as advice to
> newcomers.  I suggest that you try to forget about this guard's existence;
> the only place where this guard is possibly needed is in or/orelse contexts.
>
> The above code is written better using pattern matching, as in:
>
>  fun(A, B, #recC{} = C) when is_list(A), is_integer(B) ->

Torben Hoffmann

Apr 13, 2011, 7:26:58 AM
to Håkan Mattsson, erlang-questions
From a practical standpoint I agree with Kostis.

When we started doing Erlang we used a lot of is_record/2 guards - that generally makes the code a lot harder to understand, so we dropped that as we threw away our imperative inheritance.

And the minute you need to take a parameter out of the record it will be done with the pattern matching anyway, which makes it easier to make changes in the code.

One learning here: try to do the pattern matching on fields only in the cases where it is used to pick between different function clauses. It makes it easier to figure out what is controlling the flow of execution and what is merely taken out of the parameters to be used as input elsewhere.  We had a lot of functions that pattern matched a lot of fields in a record, but only one of the fields was used to control the flow of execution. It was clearer to do a R#rec.field1 in the function code for the other fields. But it is a matter of style, of course...
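
A sketch of that style, using a hypothetical req record: only the field that selects the clause is matched in the head, and the other fields are read with R#req.field in the body.

```erlang
-module(flow_fields).
-export([handle/1]).

%% Hypothetical record, for illustration only.
-record(req, {method, path, body}).

%% Only `method` decides which clause runs, so only it appears in
%% the head. The remaining fields are plain inputs, read in the body.
handle(#req{method = get} = R) ->
    {fetch, R#req.path};
handle(#req{method = put} = R) ->
    {store, R#req.path, R#req.body}.
```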

Cheers,
Torben

2011/4/13 Håkan Mattsson <h...@tail-f.com>



--
http://www.linkedin.com/in/torbenhoffmann

JohnyTex

Apr 13, 2011, 9:42:41 AM
to erlang-q...@erlang.org
Thanks for the advice guys! I managed to get Dialyzer running and it
worked very nicely; it managed to spot some unreachable code right off
the bat :)

However, it doesn't seem to get the more insidious errors that arise
from comparing other types than was intended, i.e. accidentally
comparing a string to an integer... but since it's completely valid
Erlang I guess it's not supposed to complain?

I will heed your advice and insert some type checking where
appropriate, thanks :)

The code is written in a very imperative style and I will try to
rewrite it to look more like Erlang as well, which I hope will solve a
lot of my problems.

Kind of off topic, but does anyone have experience with Haskell and
its type system? I keep hearing claims like "if it compiles, it
works!", and so on, which does sound quite nice... if it's true ;)


Kostis Sagonas

Apr 13, 2011, 9:54:24 AM
to Erlang
JohnyTex wrote:
> Thanks for the advice guys! I managed to get Dialyzer running and it
> worked very nicely; it managed to spot some unreachable code right off
> the bat :)
>
> However, it doesn't seem to get the more insidious errors that arise
> from comparing other types than was intended, i.e. accidentally
> comparing a string to an integer... but since it's completely valid
> Erlang I guess it's not supposed to complain?

The answer depends on the kind of comparison that is used.

If by comparison you mean case expressions or direct matching as in

  case String of
    Integer -> ...
  end

or

String = Integer

then dialyzer will report these to you as impossible matches.

If by comparison you mean uses of =/=, <, ... then of course these are
allowed in Erlang and of course dialyzer will not warn you that you are
doing something wrong there. (*)

Kostis

(*) Though it will warn you that e.g. the 'false' clause is unreachable
in something like:

  case String =/= Integer of
    true -> ...
    false -> ...
  end

Gordon Guthrie

Apr 13, 2011, 10:36:46 AM
to erlang-q...@erlang.org
Kostis

I have never read the documentation on is_record/2 before (didn't even
know there was is_record/3)

I *assumed* that the is_record guard checked that it indeed was a
record - it would appear it doesn't.

The documentation also explicitly says to use is_record/2

> Note!
> This BIF is documented for completeness. In most cases is_record/2 should be used.

If you shouldn't use it then the documentation should probably say so.

I presume you are saying that using #rec{} in the function call is
correctly expanded by the preprocessor to {rec, _, _, } and is
therefore a better specification of the contract than is_record/2
which just says 'it is a tuple with first element of this atom'

Gordon


On 13 April 2011 11:36, Kostis Sagonas <kos...@cs.ntua.gr> wrote:
> Gordon Guthrie wrote:
>>
>> ...
>>
>> Also check for bad input to functions by using guards like:
>>
>> fun(A, B, C) when is_list(A), is_integer(B), is_record(C, recC) ->
>
> The following is a bit pedantic, but please do not offer use of the
> is_record guard (esp. the one with arity two instead of three) as advice to
> newcomers.  I suggest that you try to forget about this guard's existence;
> the only place where this guard is possibly needed is in or/orelse contexts.
>
> The above code is written better using pattern matching, as in:
>
>  fun(A, B, #recC{} = C) when is_list(A), is_integer(B) ->
>
> Kostis
>

--
Gordon Guthrie
CEO hypernumbers

JohnyTex

Apr 13, 2011, 10:50:37 AM
to erlang-q...@erlang.org
What I meant was when you use lesser/greater than on two different
types, i.e. checking if a tuple is greater than an integer - sorry for
being unclear :)
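
One possible workaround, sketched under the assumption that the values really should be numbers (lt/2 and gt/2 are hypothetical helper names, not existing library functions): wrap the ordering operators so that anything non-numeric is rejected.

```erlang
-module(num_cmp).
-export([lt/2, gt/2]).

%% Hypothetical helpers: ordering defined for numbers only.
%% Passing a tuple, string or atom fails with function_clause
%% instead of silently using the term order.
lt(A, B) when is_number(A), is_number(B) -> A < B.
gt(A, B) when is_number(A), is_number(B) -> A > B.
```

With these, num_cmp:gt({ok, 3}, 1) crashes at the call, whereas {ok, 3} > 1 quietly evaluates to true.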

Thanks again for your advice!


Robert Virding

Apr 13, 2011, 11:52:18 AM
to JohnyTex, erlang-q...@erlang.org
The comparison operators as they are now were a mistake, almost a bad mistake. IMAO what we should have done was to have had two different sets of operators, one set of numeric comparisons (without type conversion) and one set of general term comparisons (without type conversion). So for example:

== /= =< < >= > would only work on numbers
@== @/= @=< @< @>= @> would work on all terms

Note that the existing =:= =/= are the same as @== @/= in my scheme above.
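
For readers less familiar with the existing operators, a short sketch of the difference between == and =:= as they behave today (the module and function names are made up for the example):

```erlang
-module(eq_demo).
-export([loose/2, exact/2]).

%% == compares numbers by value, so an integer and a float can be equal.
loose(A, B) -> A == B.

%% =:= is exact term equality: no conversion between integer and float.
exact(A, B) -> A =:= B.
```

Here loose(1, 1.0) is true and exact(1, 1.0) is false, while both return false for 1 and {ok, 1}.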

We could add the full set of term comparison operators, but not change the existing operators to be only numeric comparison. The reaction if we did would be "interesting". :-)

Old sins cast long shadows.

Robert

H. Diedrich

Apr 13, 2011, 12:04:13 PM
to erlang-q...@erlang.org
Robert Virding wrote:
> The comparison operators as they are now were a mistake, almost a bad mistake. IMAO what we should have done was to have had two different sets of operators, one set of numeric comparisons (without type conversion) and one set of general term comparisons (without type conversion). So for example:
>
> == /= =< < >= >   would only work on numbers
> @== @/= @=< @< @>= @>   would work on all terms
>
> Note that the existing =:= =/= are the same as @== @/= in my scheme above.
>
> We could add the full set of term comparison operators, but not change the existing operators

So you could introduce the @ variants for numbers only instead. How about a triple sign notation for "strictness", as ===, /==, ==< ... !? Well ... <=< ... :-<


Henning

Robert Virding

Apr 13, 2011, 12:43:37 PM
to H. Diedrich, erlang-q...@erlang.org
You could, but then we would be getting way too many operators. In many ways the pure term comparison operators may be more needed as they would clean up issues with ordering terms (you can have two terms in which one is not greater or less than the other but they are not equal). If we wish to have specific operators then we could have two different prefixes for whether they are term or numeric comparisons, for example @ for term and : for numeric:


@== @/= @=< @< @>= @>
:== :/= :=< :< :>= :>

Then phase out the old ones completely. This is the only way without causing too much grief.

The question is, of course, how many problems do the existing operators actually cause, and is it worth trying to fix it?
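
The ordering issue mentioned above (two terms where neither is greater or less than the other, yet they are not equal) can be seen with the existing operators. A small sketch, with a hypothetical relation/2 function and made-up result atoms:

```erlang
-module(order_gap).
-export([relation/2]).

%% Classifies two terms using only the existing operators.
%% The last clause is reached when neither < nor > holds and the
%% terms are still not exactly equal, as with 1 and 1.0.
relation(A, B) when A < B -> less;
relation(A, B) when A > B -> greater;
relation(A, B) when A =:= B -> identical;
relation(_, _) -> unordered_but_not_identical.
```

relation(1, 1.0) falls through to the last clause, because 1 < 1.0 and 1 > 1.0 are both false while 1 =:= 1.0 is false as well.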

Robert

Richard O'Keefe

Apr 13, 2011, 8:10:24 PM
to JohnyTex, erlang-q...@erlang.org

On 13/04/2011, at 9:08 PM, JohnyTex wrote:
> I've inherited someone else's code base; sometimes I forget that a
> function returns a tuple and I end up comparing it with an integer or
> similar, and get weird behaviour and hard-to-find bugs as a result.
>
> What's the best way to prevent this?

There are two somewhat different questions lurking here.

- what's the best way to ensure that such mistakes are caught?

You've mentioned EUnit and Dialyzer, and they are excellent
for this.

- what's the best way to avoid making such mistakes in the first place?

Clear, consistent, informative naming conventions.
Consistent "algebraic" interface design.
Well thought out policy on when to return "oops" as a result
and when to raise an exception -- this is still contentious,
the point here is that you want to *know* the choice a
function made without having to think too much about it.
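
One way to picture such a policy, as a sketch only (the find/fetch naming is an assumed convention for this example, and the module is hypothetical): a function named find returns a tagged result, and a function named fetch returns the bare value or raises.

```erlang
-module(result_policy).
-export([find/2, fetch/2]).

%% Assumed convention: find/2 never raises for a missing key and
%% returns {ok, Value} or the atom error.
find(Key, Pairs) ->
    case lists:keyfind(Key, 1, Pairs) of
        {_, Value} -> {ok, Value};
        false -> error
    end.

%% Assumed convention: fetch/2 returns the bare value and raises
%% {badkey, Key} when the key is missing.
fetch(Key, Pairs) ->
    case find(Key, Pairs) of
        {ok, Value} -> Value;
        error -> erlang:error({badkey, Key})
    end.
```

With a consistent naming rule like this, the name alone tells the caller whether to expect a tuple or a plain value.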

Perhaps you could provide an example of a function that returns a
tuple that you want to compare with an integer.

Richard O'Keefe

Apr 13, 2011, 9:35:25 PM
to Robert Virding, erlang-q...@erlang.org

On 14/04/2011, at 3:52 AM, Robert Virding wrote:

> The comparison operators as they are now were a mistake, almost a bad mistake. IMAO what we should have done was to have had two different sets of operators, one set of numeric comparisons (without type conversion) and one set of general term comparisons (without type conversion). So for example:
>
> == /= =< < >= > would only work on numbers
> @== @/= @=< @< @>= @> would work on all terms

Just like Erlang's predecessor Prolog!
Since the term/number distinction _was_ borrowed for equality and
inequality (although with the symbols switched around, which still
confuses me), I've often wondered why it wasn't borrowed for ordering.

Mazen Harake

Apr 14, 2011, 3:09:44 AM
to Gordon Guthrie, erlang-q...@erlang.org
Actually they end up being essentially the same thing.

Check out Number 6 in this list:

http://mazenharake.wordpress.com/2010/10/31/9-erlang-pitfalls-you-should-know-about/

I believe the #rec{} notation is preferred for clarity and I agree but
in my opinion is_record/2 is just as correct (literally)

/M

Björn Gustavsson

Apr 14, 2011, 3:27:23 AM
to Gordon Guthrie, erlang-q...@erlang.org
On Wed, Apr 13, 2011 at 4:36 PM, Gordon Guthrie <gor...@hypernumbers.com> wrote:
> Kostis
>
> I have never read the documentation on is_record/2 before (didn't even
> know there was is_record/3)
>
> I *assumed* that the is_record guard checked that it indeed was a
> record - it would appear it doesn't.
>
> The documentation also explicitly says to use is_record/2
>
>> Note!
>> This BIF is documented for completeness. In most cases is_record/2 should be used.
>
> If you shouldn't use it then the documentation should probably say so.
>
> I presume you are saying that using #rec{} in the function call is
> correctly expanded by the preprocessor to {rec, _, _, } and is
> therefore a better specification of the contract than is_record/2
> which just says 'it is a tuple with first element of this atom'

Did you read the note in the documentation for is_record/2?
As long as the RecordTag argument is a literal atom, the
compiler will essentially rewrite it to a call to is_record/3,
which will check the size too.

Therefore, we still recommend that you use is_record/2,
if you are going to use is_record() at all.

Historically, only is_record/2 existed and it was not a
BIF, but specially treated by the compiler. We added the
BIF versions so that it would be possible to use apply on
them, and for consistency with match specs. In most
circumstances, the BIFs will not be called as the compiler
tries to convert calls to is_record/{2,3} to pattern matching
and inline most of the remaining calls.
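
A sketch of the difference described in that note, using a made-up record rec with two fields. With a literal atom as the tag the size is checked too; when the tag only arrives at run time, the BIF is called and just the first element is checked:

```erlang
-module(is_record_demo).
-export([new/2, literal_tag/1, dynamic_tag/2]).

%% Hypothetical record, represented as a 3-element tuple {rec, A, B}.
-record(rec, {a, b}).

new(A, B) -> #rec{a = A, b = B}.

%% Literal tag: the compiler also verifies the tuple size.
literal_tag(Term) -> is_record(Term, rec).

%% Tag known only at run time: the BIF checks that Term is a tuple
%% whose first element is Tag, but not its size.
dynamic_tag(Term, Tag) -> is_record(Term, Tag).
```

So literal_tag({rec, 1}) is false because the tuple is too short, while dynamic_tag({rec, 1}, rec) is true.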

--
Björn Gustavsson, Erlang/OTP, Ericsson AB

Gordon Guthrie

Apr 14, 2011, 5:20:29 AM
to erlang-q...@erlang.org
Björn

> Did you read the note in the documentation for is_record/2?

I did, but in my defence I was confused by Kostis and in a fluster :(

I have re-read it now and understand it properly. is_record/2 does what I
always assumed it did (without reading the documentation) and I stand
by my suggestion at the top of this thread.

Gordon


2011/4/14 Björn Gustavsson <bgust...@gmail.com>:

--
Gordon Guthrie
CEO hypernumbers

Robert Virding

Apr 14, 2011, 10:23:02 AM
to Richard O'Keefe, erlang-q...@erlang.org

----- "Richard O'Keefe" <o...@cs.otago.ac.nz> wrote:

> On 14/04/2011, at 3:52 AM, Robert Virding wrote:
>
> > The comparison operators as they are now were a mistake, almost a
> bad mistake. IMAO what we should have done was to have had two
> different sets of operators, one set of numeric comparisons (without
> type conversion) and one set of general term comparisons (without
> type conversion). So for example:
> >
> > == /= =< < >= > would only work on numbers
> > @== @/= @=< @< @>= @> would work on all terms
>
> Just like Erlang's predecessor Prolog!
> Since the term/number distinction _was_ borrowed for equality and
> inequality (although with the symbols switched around, which still
> confuses me), I've often wondered why it wasn't borrowed for ordering.

Yes, my suggestion is taken directly from the Prolog operator names. I don't have a problem with that. :-)

Originally there was no problem as we did not have floating point numbers. It was only when they were added and we decided to do implicit type conversions that the problem arose. We needed an exact equality check for pattern matching so =:= was added (yes it was a bad name choice) and =/= just tagged along and slipped in.

I suppose I will have to make an EEP which suggests the complete range of term comparison operators without type conversions. The problem is whether it is worth the effort to add pure numeric comparisons as well.

Robert

David Mercer

Apr 14, 2011, 11:52:20 AM
to Robert Virding, Richard O'Keefe, erlang-q...@erlang.org
On Thursday, April 14, 2011, Robert Virding wrote:

> I suppose I will have to make an eep which suggests the complete range
> of term comparison operators without type conversions. The problem is
> whether it is worth effort to add pure numeric comparisons as well.

Not meaning to be an ass, but is it really worth it? Is it *that* big a
problem? Thinking about both in terms of your time and effort and the time
and effort of everyone else who has to change their code to use the new
operators, and also in the added complexity to the language and the
confusion caused by adding the new @ operators.

DBM
