[erlang-questions] Binary match in function head doesn't compile

Erik Pearson

Oct 26, 2012, 2:27:06 PM
to erlang-questions Questions
Hi,

I'm wondering why this 

test(<<Field:Len/binary, Rest/binary>>, Len) ->
    {Len, Field, Rest}.

does not compile, complaining that "variable 'Len' is unbound", while this

test(<<Field:2/binary, Rest/binary>>, Field) ->
    {Field, Rest}.

does. For some reason the compiler doesn't see the Len from the match spec in the arguments, but it does see Field. Is that by design? 

BTW, supplying a variable for Len does work in this case:

test(Bin, Len) ->
  <<Field:Len/binary, Rest/binary>> = Bin, 
  {Len, Field, Rest}.

Any insights?

The reason I ask is that one way to save memory allocation when parsing binaries might be to walk the binary recursively, incrementing Len while matching on whatever follows. This would yield a sub-binary reference rather than a sub-binary copy. This is not my idea (thanks for the tip, Dmitry), but I think it loses efficiency if the match state can't be carried recursively in the function head.

Thanks,
Erik.

Jeff Schultz

Oct 26, 2012, 9:34:39 PM
to Erik Pearson, erlang-questions Questions
On Fri, Oct 26, 2012 at 11:27:06AM -0700, Erik Pearson wrote:
> test(<<Field:Len/binary, Rest/binary>>, Len) ->
> {Len, Field, Rest}.

> does not compile, complaining that "variable 'Len' is unbound", while this

Short answer: Erlang isn't a general constraint solver.

> test(<<Field:2/binary, Rest/binary>>, Field) ->
> {Field, Rest}.

> does. For some reason the compiler doesn't see the Len from the match spec
> in the arguments, but it does see Field. Is that by design?

That depends on what you mean by "see." It actually treats that clause
as if it were something like

test(<<Field:2/binary, Rest/binary>>, Arg2),
Field = Arg2
->
{Field, Rest}.

(Which is not Erlang, but I hope you get the idea.)

And, of course, your problem clause looks like

test(<<Field:Len/binary, Rest/binary>>, Arg2),
Len = Arg2
->
{Len, Field, Rest}.

which makes clear why the compiler sees Len as unbound when used in
the pattern match.

It's obviously easy to re-arrange the data-flow to make this sort of
thing compilable. I wouldn't expect to run across a need for it in
practice though.
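
For readers who want to see one such re-arrangement, here is a minimal sketch. It moves the binary match into the function body, where Len is already bound. The module name head_match, the guards, and the nomatch return value are assumptions added for illustration; they are not from the original post.

```erlang
-module(head_match).
-export([test/2]).

%% Len is bound by the function head, so it can be used as a size
%% in the binary pattern inside the body.
test(Bin, Len) when is_binary(Bin), is_integer(Len), Len >= 0 ->
    case Bin of
        <<Field:Len/binary, Rest/binary>> ->
            {Len, Field, Rest};
        _ ->
            %% Hypothetical fallback: Bin is shorter than Len bytes.
            nomatch
    end.
```

With this sketch, head_match:test(<<"abcd">>, 2) splits the binary after two bytes, and a binary that is too short falls through to the second case clause.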


Jeff Schultz

Erik Pearson

Oct 28, 2012, 2:08:44 PM
to erlang-questions Questions
Is there a reference for the resolution of patterns in function/clause head similar to 


but for regular Erlang? The docs on the abstract format are useful


There are a few places in the docs that refer to the process of matching the function clause head against arguments

e.g. the function overview


but it would be really useful to have those references link to documentation which describes this process.

Thanks,
Erik.

(ps - I'm happy with your answer, Björn-Egil, but hoping this thread can include some solid leads for others researching similar issues.)

(pps - there is a definitive response to my original post from Björn-Egil Dahlberg below -- he inadvertently sent it directly to me)

On Fri, Oct 26, 2012 at 11:59 AM, Björn-Egil Dahlberg <wallentin...@gmail.com> wrote:


2012/10/26 Erik Pearson <er...@defunweb.com>

Hi,

I'm wondering why this 

test(<<Field:Len/binary, Rest/binary>>, Len) ->
    {Len, Field, Rest}.

does not compile, complaining that "variable 'Len' is unbound", while this

test(<<Field:2/binary, Rest/binary>>, Field) ->
    {Field, Rest}.

does. For some reason the compiler doesn't see the Len from the match spec in the arguments, but it does see Field. Is that by design?

Yes, and it is a limitation that is being addressed.



BTW  supplying a variable for Len does work in this case:

test(Bin, Len) ->
  <<Field:Len/binary, Rest/binary>> = Bin, 
  {Len, Field, Rest}.

The difference here is that Len is already bound when entering the function body, as opposed to when it appears in the function head.

We have had fierce debates on, among other things, matching behaviors for Maps (extended frames/hashes), which have also led to redesigning parts of how binary matching is done in function heads. This is currently in the prototyping stage, and it is too early to say which release it will be ready for.

// Björn-Egil

Robert Virding

Oct 31, 2012, 2:45:35 PM
to Erik Pearson, erlang-questions Questions
I don't know if it is actually stated explicitly anywhere, but multiple occurrences of a variable in a pattern mean that the values the variable would get at each occurrence are tested for equality. They are not used the way you would like. Also, the order in which function arguments are matched is not defined, and if we just test multiple variable occurrences for equality, the order does not matter. This means that when you match the binary, N (your Len) may not yet have a value. In fact we do match left-to-right (but don't tell anyone), so N will in fact not have a value. Flipping the order of the arguments will not help here.
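
As a small illustration of the equality-test reading, here is a sketch; the module name same_var and the function same/2 are made up for this example.

```erlang
-module(same_var).
-export([same/2]).

%% X occurs twice in the head, so the first clause matches only
%% when both arguments match the same value exactly.
same(X, X) -> true;
same(_, _) -> false.
```

Note that the comparison is an exact match, so same_var:same(1, 1.0) takes the second clause.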

It is different when you match *INSIDE* a binary. There, by necessity, the pattern match goes left-to-right, and if you match a value out of a binary you can use that value later in the same binary match. So you can do:

<<N,B1:N/binary,Rest/binary>>  = Bin
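>

Because N is bound inside the same binary pattern, this form also works directly in a function head. A minimal sketch, assuming a one-byte length prefix; the module name len_prefix, the function split/1, and the ok/error return values are hypothetical:

```erlang
-module(len_prefix).
-export([split/1]).

%% The first byte N is matched first, then used as the size of
%% Payload later in the same binary pattern.
split(<<N, Payload:N/binary, Rest/binary>>) ->
    {ok, Payload, Rest};
split(_) ->
    %% Hypothetical fallback: empty input or fewer than N payload bytes.
    error.
```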

Robert


Erik Pearson

Oct 31, 2012, 3:52:09 PM
to erlang-questions Questions
Thanks, Robert, that helps clarify the state of things.
Do you think it would be useful if it worked "the way I would like"?
It seems that in the specific case of walking a binary by extending
the length of a sub-binary in the match (via a Len argument) would be
much more efficient than binary accumulation which always does at
least some allocation (256 bytes) and copying, and also more efficient
than moving the code into a case (where the Len will already be
bound.) It seems that the compiler would need to notice that what it
thinks is an unbound variable is actually guaranteed to be bound in
another argument, and express this as a dependency in the head pattern
match code.
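
To make the idea concrete, here is a rough sketch of the "grow Len and match on what follows" walk, written with the match in a case (the form that compiles today). The module name scan, the function token/1, and the choice of a comma as delimiter are assumptions for illustration, and no performance claim is implied.

```erlang
-module(scan).
-export([token/1]).

%% Split Bin at the first comma. Returns {Token, Rest}, or
%% {Bin, <<>>} when no comma is found.
token(Bin) when is_binary(Bin) ->
    token(Bin, 0).

token(Bin, Len) ->
    case Bin of
        <<Tkn:Len/binary, $,, Rest/binary>> ->
            %% The byte at offset Len is a comma: Tkn is the prefix.
            {Tkn, Rest};
        <<_:Len/binary>> ->
            %% Len has reached the size of Bin without finding a comma.
            {Bin, <<>>};
        _ ->
            token(Bin, Len + 1)
    end.
```

Here Tkn and Rest are matched out of the original binary instead of being built up by appending bytes to an accumulator.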
Erik.

Dmitry Kolesnikov

Oct 31, 2012, 4:13:37 PM
to Erik Pearson, erlang-questions Questions
Erik,

I believe you can use a closure instead of a case to achieve re-usability of the binary match context.

e.g.

parse(Len, Bin) ->
    fun(<<Tkn:Len/binary, $,, Rest/binary>>) ->
            {Len, Tkn, Rest};
       (<<Tkn:Len/binary, $:, Rest/binary>>) ->
            ....
    end.
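
A runnable variant of this idea might look like the sketch below. It relies on Len being bound in the enclosing function, so the fun head can use it as a size. The module name closure_parse, the comma/colon tags, the nomatch clause, and applying the fun to Bin right away are all assumptions added to make the example self-contained; whether this helps performance is not something the sketch demonstrates.

```erlang
-module(closure_parse).
-export([parse/2]).

parse(Len, Bin) ->
    %% Len is already bound here, so it is a legal size expression
    %% inside the binary patterns of the fun clauses.
    Match = fun(<<Tkn:Len/binary, $,, Rest/binary>>) ->
                    {comma, Len, Tkn, Rest};
               (<<Tkn:Len/binary, $:, Rest/binary>>) ->
                    {colon, Len, Tkn, Rest};
               (_) ->
                    %% Hypothetical fallback when neither delimiter
                    %% follows the first Len bytes.
                    nomatch
            end,
    Match(Bin).
```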

- Dmitry

Erik Pearson

Oct 31, 2012, 4:47:46 PM
to erlang-questions Questions
Hi Dmitry,


On Wed, Oct 31, 2012 at 1:13 PM, Dmitry Kolesnikov
<dmkole...@gmail.com> wrote:
> Erik,
>
> I believe you can use closure instead of case to achieve re-usability of binary context.
>
> e.g.
>
> parse(Len, Bin) ->
> fun(<<Tkn:Len/binary, $, , Rest/binary>>) ->
> {Len, Tkn, Rest};
> (<<Tkn:Len/binary, $: , Rest/binary>>) ->
> ....
> end.

Have you tested this for performance? Wouldn't it require a new
closure for every invocation of parse? I'm not sure that the binary
match context would be preserved through a closure -- I thought that
optimization was for self-recursion.
But I don't really know -- and I really shouldn't care -- heck, this
is the stuff I'm hoping to leave behind with the wonderful world of
Erlang!

Erik.