[erlang-questions] Binary pattern matching inconsistencies with R12B


Rory Byrne

Feb 29, 2008, 12:43:44 PM
to erlang-q...@erlang.org
Hello

I'm writing a scanner for a query language and I'm encountering
intermittent segmentation faults and other odd errors. The
code I'm working on appears to work fine on 11.b.2-4
(linux/amd64), but gives problems on r12b-0 (linux/i386) and
r12b-1 (linux/amd64). I didn't add any fancy options when I
compiled r12b, just a --prefix.

I'm an erlang newbie so highly likely I've written something
stupid. Just hope it's obvious whatever it is!

The scanner is quite large so I've reduced it down to two
smaller programs which show similar symptoms. The first one
just throws exceptions from time to time. The second program
ends up dying as a result of a segmentation fault sooner
or later.

The following code won't make real sense. The full scanner
makes sense but this is only a mutated 10% of that. Sorry
the code is so unintelligible - but on the bright side
it fails more frequently and predictably than the full
scanner does.


%% START OF CODE: weird.erl %%

-module(weird).
-compile(export_all).

%% For testing - runs scanner N number of times with same input
run(N) ->
    lists:foreach(fun(_) ->
                          scan(<<"region:whatever">>, [])
                  end, lists:seq(1, N)).

scan(<<>>, TokAcc) ->
    lists:reverse(['$thats_all_folks$' | TokAcc]);

scan(<<D, $\s, Rest/binary>>, TokAcc) when
      (D =:= $D) or (D =:= $d) ->
    scan(Rest, ['AND' | TokAcc]);

scan(<<D>>, TokAcc) when
      (D =:= $D) or (D =:= $d) ->
    scan(<<>>, ['AND' | TokAcc]);

scan(<<N, Z, Rest/binary>>, TokAcc) when
      (N =:= $N) or (N =:= $n),
      (Z =:= $\s) ->
    scan(<<Z, Rest/binary>>, ['NOT' | TokAcc]);

scan(<<C, Rest/binary>>, TokAcc) when
      (C >= $A) and (C =< $Z);
      (C >= $a) and (C =< $z);
      (C >= $0) and (C =< $9) ->
    case Rest of
        <<$:, R/binary>> ->
            scan(R, [{'FIELD', C} | TokAcc]);
        _ ->
            scan(Rest, [{'KEYWORD', C} | TokAcc])
    end.

%% END OF CODE %%


Here's what I see from the shell on an i386 machine:

1> c(weird).
{ok,weird}
2> weird:run(1000).
ok
3> weird:run(1000).
ok
4> weird:run(1000).
ok
5> weird:run(1000).
** exception error: no function clause
matching weird:scan(<<"whatever">>,
[{'FIELD',110},
{'KEYWORD',111},
{'KEYWORD',105},
{'KEYWORD',103},
{'KEYWORD',101},
{'KEYWORD',114}])
in function lists:foreach/2
6> weird:run(1000).
** exception error: no function clause
matching weird:scan(<<"whatever">>,
[{'FIELD',110},
{'KEYWORD',111},
{'KEYWORD',105},
{'KEYWORD',103},
{'KEYWORD',101},
{'KEYWORD',114}])
in function lists:foreach/2
7>

It will then keep throwing exceptions from this point on. On an
amd64 machine I'm getting similar output, but it usually has
the sequence ok, error, ok, error... And if I bump it from
1,000 up to 10,000 iterations the errors usually stop (on amd64).
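Because the failures are intermittent, it can help to record the first iteration at which a run diverges rather than waiting for an exception to reach the shell. The following is only a minimal sketch of such a harness; the module name repeat_check and the function first_failure/3 are hypothetical and not part of the original post. It calls a fun repeatedly and compares each result with a known-good value.

```erlang
-module(repeat_check).
-export([first_failure/3]).

%% Call Fun() up to N times and compare each result with Expected.
%% Returns ok if every call agrees. Otherwise returns {failed_at, I, Got},
%% where I is the 1-based iteration and Got is either the unexpected
%% return value or {Class, Reason} for a caught exception.
first_failure(Fun, Expected, N) ->
    first_failure(Fun, Expected, 1, N).

first_failure(_Fun, _Expected, I, N) when I > N ->
    ok;
first_failure(Fun, Expected, I, N) ->
    Got = try Fun() catch Class:Reason -> {Class, Reason} end,
    case Got of
        Expected ->
            %% Expected is already bound, so this clause is an equality test.
            first_failure(Fun, Expected, I + 1, N);
        _ ->
            {failed_at, I, Got}
    end.
```

With the weird module from this post, it might be called as repeat_check:first_failure(fun() -> weird:scan(<<"region:whatever">>, []) end, Expected, 1000), where Expected is the token list taken from a run that is known to be good. A harness like this cannot report a segmentation fault, since that takes the whole emulator down.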


The second block of code is:

%% START OF CODE: scanner.erl %%

-module(scanner).
-compile(export_all).

%% For testing - runs scanner N number of times with same input
run(N) ->
    lists:foreach(fun(_) ->
                          scan(<<"region:whatever">>, [])
                  end, lists:seq(1, N)).

scan(<<>>, TokAcc) ->
    lists:reverse(['$thats_all_folks$' | TokAcc]);

scan(<<D, Z, Rest/binary>>, TokAcc) when
      (D =:= $D orelse D =:= $d) and
      ((Z =:= $\s) or (Z =:= $() or (Z =:= $))) ->
    scan(<<Z, Rest/binary>>, ['AND' | TokAcc]);

scan(<<D>>, TokAcc) when
      (D =:= $D) or (D =:= $d) ->
    scan(<<>>, ['AND' | TokAcc]);

scan(<<N, Z, Rest/binary>>, TokAcc) when
      (N =:= $N orelse N =:= $n) and
      ((Z =:= $\s) or (Z =:= $() or (Z =:= $))) ->
    scan(<<Z, Rest/binary>>, ['NOT' | TokAcc]);

scan(<<C, Rest/binary>>, TokAcc) when
      (C >= $A) and (C =< $Z);
      (C >= $a) and (C =< $z);
      (C >= $0) and (C =< $9) ->
    case Rest of
        <<$:, R/binary>> ->
            scan(R, [{'FIELD', C} | TokAcc]);
        _ ->
            scan(Rest, [{'KEYWORD', C} | TokAcc])
    end.

%% END OF CODE %%


When I use this code in the shell (on i386) it usually works okay
for a smaller number of iterations but when you get into the
hundreds it dies fast:

1> c(scanner).
{ok,scanner}
2> scanner:run(10). % Start with 10
ok
3> scanner:run(10).
ok
4> scanner:run(100). % Bumped up to 100
** exception error: no function clause
matching weird:scan(<<"whatever">>,
[{'FIELD',110},
{'KEYWORD',111},
{'KEYWORD',105},
{'KEYWORD',103},
{'KEYWORD',101},
{'KEYWORD',114}])
in function lists:foreach/2
5> scanner:run(100).
Segmentation fault


Anyone got any ideas?

Cheers,

Rory


Rusty Klophaus

Feb 29, 2008, 1:00:29 PM
to erlang-q...@erlang.org
Hi Rory,

You may want to try specifying a size for the variables D, N, and C.

For example: scan(<<C:8/integer, Rest/binary>>, TokAcc)

According to the manual: "In matching, this default value is only valid
for the very last element. All other bit string or binary elements in the
matching must have a size specification."
(otp_doc_html_R12B-1/doc/reference_manual/expressions.html#6.16)

It's possible that the lack of a size is confusing things.

Hope that helps,
Rusty

--
Rusty Klophaus (http://rklophaus.com)

Kostis Sagonas

Feb 29, 2008, 2:32:24 PM
to erlang-q...@erlang.org
Rory Byrne wrote:
>
> I'm writing a scanner for a query language and I'm encountering
> intermittent segmentation faults and other odd errors. The
> code I'm working on appears to work fine on 11.b.2-4
> (linux/amd64), but gives problems on r12b-0 (linux/i386) and
> r12b-1 (linux/amd64). I didn't add any fancy options when I
> compiled r12b, just a --prefix.
>
> I'm an erlang newbie so highly likely I've written something
> stupid. Just hope it's obvious whatever it is!
>
> ... SNIP
>
> Anyone got any ideas?

I confirm your experiences. Mine are slightly different than yours, but
the end results are the same; see below. This is with the most recent
development version. I suspect a GC-related bug.

Kostis

PS. Surprisingly, I cannot manage to get a seg-fault if I compile
to native code. [using hipe:c() instead of c()]

========================================================================
Erlang (BEAM) emulator version 5.6.2 [source] [async-threads:0] [hipe]
[kernel-poll:false]

Eshell V5.6.2 (abort with ^G)
1> c(weird).
{ok,weird}
2> weird:run(10000).
ok
3> weird:run(10000).
ok
4> weird:run(10000).
ok
5> weird:run(10000).
ok
6> weird:run(10000).
ok
7> weird:run(10000).
ok
8> weird:run(10000).
ok
9> weird:run(10000).
ok
10> weird:run(10000).
ok
11> weird:run(10000).
ok
12> weird:run(10000).
ok
13> weird:run(10000).
ok
14> weird:run(10000).
ok
15> weird:run(10000).
ok
16> weird:run(10000).
ok
17> weird:run(10000).
ok
18> weird:run(10000).
ok
19> weird:run(10000).
ok
20> weird:run(10000).
ok
21> weird:run(10000).
ok
22> weird:run(10000).
ok
23> weird:run(10000).
ok
24> weird:run(100).
** exception error: no function clause matching weird:scan(<<"whatever">>,
[{'FIELD',110},
{'KEYWORD',111},
{'KEYWORD',105},
{'KEYWORD',103},
{'KEYWORD',101},
{'KEYWORD',114}])
in function lists:foreach/2

25> weird:run(100).
ok
26> weird:run(100).
ok
27> halt().
@statler [~/HiPE/otp] hipe
Erlang (BEAM) emulator version 5.6.2 [source] [async-threads:0] [hipe]
[kernel-poll:false]

Eshell V5.6.2 (abort with ^G)
1> c(scanner).
{ok,scanner}
2> scanner:run(100).
ok
3> scanner:run(100).
Segmentation fault

Rory Byrne

Feb 29, 2008, 4:42:48 PM
to erlang-q...@erlang.org
On Fri, Feb 29, 2008 at 09:32:24PM +0200, Kostis Sagonas wrote:
>
> I confirm your experiences. Mine are slightly different than yours, but
> the end results are the same; see below. This is with the most recent
> development version. I suspect a GC-related bug.
>

Thanks Kostis. Bit of a relief really - I wasn't exactly making a
whole lot of progress fixing my code!

>
> PS. Surprisingly, I cannot manage to get a seg-fault if I compile
> to native code. [using hipe:c() instead of c()]
>

Excellent. I just tried hipe on the amd64 machine and no seg-fault.
Oddly hipe doesn't seem to be enabled on my i386 even though
I compiled it the same way as the amd64 version.

As an added bonus, I was steering clear of hipe because I had
read somewhere that there were problems with running it on
a xen instance (due to the threading library used if I recall).
However, the amd64 machine I just tried it on is a xen
instance, so that looks promising. So, thanks again!

Rory

Rory Byrne

Feb 29, 2008, 5:04:31 PM
to erlang-q...@erlang.org
On Fri, Feb 29, 2008 at 01:00:29PM -0500, Rusty Klophaus wrote:
>
> You may want to try specifying a size for the variables D, N, and C.
>
> For example: scan(<<C:8/integer, Rest/binary>>, TokAcc)
>
> According to the manual: "In matching, this default value is only valid
> for the very last element. All other bit string or binary elements in the
> matching must have a size specification."
> (otp_doc_html_R12B-1/doc/reference_manual/expressions.html#6.16)
>
> It's possible that the lack of a size is confusing things.
>
> Hope that helps,

Cheers Rusty, I took a shot at that, but no dice I'm afraid.

Actually, I ran into a problem on another project that led me to
this passage last week. I was trying to write something like

<<Data/binary, Pad:8>> = Payload.

but the compiler was complaining (as compilers do). What it was
trying to tell me was that a binary type must have a length
field unless it appears at the end of a <<binary>> pattern.
Sorry, when speaking about this stuff the term binary inevitably
gets overloaded. In essence, it was telling me I had to
do something like:

Length = size(Payload) - 1,
<<Data:(Length)/binary, Pad:8>> = Payload.

Something like that anyway. That's what the passage you quoted is
about - it's talking about using the binary type within a pattern.
You must specify a length with it unless it's at the end of the
pattern.

Also, the defaults for items in a pattern are size 8 and type
integer - so I think my code is safe. Truth be told, if I
had to write that stuff for each term I'd probably just convert
the thing to a list and do matching that way. Yeah, I'm that
lazy :-)
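The two points above can be put side by side in a small sketch. The module name binpat_sketch and its function names are hypothetical, chosen only for illustration. The first function gives a non-final binary segment an explicit size; the other two show that a segment written without size or type is matched as an 8-bit integer, so both accept the same inputs.

```erlang
-module(binpat_sketch).
-export([split_pad/1, first_byte/1, first_byte_explicit/1]).

%% Split off the final byte of a binary. The leading binary segment is
%% not the last segment of the pattern, so it needs an explicit size.
split_pad(Payload) when is_binary(Payload), byte_size(Payload) >= 1 ->
    Length = byte_size(Payload) - 1,
    <<Data:Length/binary, Pad:8>> = Payload,
    {Data, Pad}.

%% A segment with no size or type defaults to an 8-bit integer ...
first_byte(<<C, Rest/binary>>) ->
    {C, Rest}.

%% ... so spelling the size and type out matches the same inputs.
first_byte_explicit(<<C:8/integer, Rest/binary>>) ->
    {C, Rest}.
```

For example, binpat_sketch:split_pad(<<"abcd">>) separates the leading three bytes from the trailing byte, and both first_byte variants return the same result for <<"region">>.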

Thanks again Rusty,

Rory


Bjorn Gustavsson

Mar 3, 2008, 7:29:21 AM
to erlang-q...@erlang.org
Rory Byrne <ro...@jinsky.com> writes:

> The following code won't make real sense. The full scanner
> makes sense but this is only a mutated 10% of that. Sorry
> the code is so unintelligible - but on the bright side
> it fails more frequently and predictably than the full
> scanner does.

Thanks for the bug report. I was able to reproduce the crash
by running the scanner module. I'll start investigating it.

/Bjorn
--
Björn Gustavsson, Erlang/OTP, Ericsson AB

Bjorn Gustavsson

Mar 3, 2008, 10:42:24 AM
to Rory Byrne, erlang-q...@erlang.org
Rory Byrne <ro...@jinsky.com> writes:

> The following code won't make real sense. The full scanner
> makes sense but this is only a mutated 10% of that. Sorry
> the code is so unintelligible - but on the bright side
> it fails more frequently and predictably than the full
> scanner does.

Again thanks for your bug report.

I have extended our test suites and corrected the bug. The correction will
be included in R12B-2.

Here is the correction:

*** erts/emulator/beam/beam_emu.c@@/OTP_R12B-1 Tue Feb 5 14:37:01 2008
--- erts/emulator/beam/beam_emu.c Mon Mar 3 16:21:22 2008
***************
*** 3471,3476 ****
--- 3471,3477 ----
ms = (ErlBinMatchState *) boxed_val(tmp_arg1);
dst = (ErlBinMatchState *) HTOP;
*dst = *ms;
+ *HTOP = HEADER_BIN_MATCHSTATE(slots);
HTOP += wordsneeded;
StoreResult(make_matchstate(dst), Arg(3));

Rory Byrne

Mar 3, 2008, 4:01:49 PM
to erlang-q...@erlang.org
On Mon, Mar 03, 2008 at 04:42:24PM +0100, Bjorn Gustavsson wrote:
>
> I have extended our test suites and corrected the bug. The correction will
> be included in R12B-2.
>

Just applied the patch and everything works great now. Many thanks Bjorn!

Rory


Rory Byrne

Mar 8, 2008, 2:50:02 PM
to erlang-q...@erlang.org
On Mon, Mar 03, 2008 at 10:01:49PM +0100, Rory Byrne wrote:
>
> Just applied the patch and everything works great now. Many thanks Bjorn!
>

Hello,

I'm seeing some problems with fprof on i386 (but not on amd64).
I'm not certain that the problem is related to this thread but
I think it might be since it's the same code that is affected.

Basically, fprof dies when it tries to load certain modules. It's
not just my own modules that cause this - here's what happens
when running fprof on http:request/1 on my machine:


%% -- START CODE -- %%

Erlang (BEAM) emulator version 5.6 [source] [async-threads:0]
[kernel-poll:false]

Eshell V5.6 (abort with ^G)
1> inets:start().
ok
2> fprof:apply(http, request, ["http://www.erlang.com"]).
Aborted

%% -- END -- %%


From what I have seen, the modules that are affected can
only be profiled successfully if you load the module and its
dependencies before running fprof. The code that I posted
at the start of this thread (weird.erl and scanner.erl) is
affected by this problem so I'll use weird.erl in the
following examples:


%% -- START CODE -- %%

$ erl
Erlang (BEAM) emulator version 5.6 [source] [async-threads:0]
[kernel-poll:false]

Eshell V5.6 (abort with ^G)
1> l(weird).
{module,weird}
2> fprof:apply(weird, run, [1]).
ok
3> fprof:apply(weird, run, [1]).
ok
4> q().
ok

$ erl
Erlang (BEAM) emulator version 5.6 [source] [async-threads:0]
[kernel-poll:false]

Eshell V5.6 (abort with ^G)
1> fprof:apply(weird, run, [1]).
Aborted

%% -- END -- %%


Also, there is no weird:run/3, but I get the same result if I ask
fprof to call it:


%% -- START CODE -- %%

$ erl
Erlang (BEAM) emulator version 5.6 [source] [async-threads:0]
[kernel-poll:false]

Eshell V5.6 (abort with ^G)
1> fprof:apply(weird, run, [what, the, f]).
Aborted

%% -- END -- %%


Hipe is not supported on my i386 so I can't test with it.

This isn't really a problem for me as I can make warm-up
calls to my modules before profiling - probably a smart
thing to do anyway. Just thought it might be of interest.
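The warm-up step could be sketched roughly as follows. The module name fprof_warmup and the function profile/3 are hypothetical; code:ensure_loaded/1 and fprof:apply/3 are the standard library calls. The sketch loads only the one module explicitly, so any dependencies would need the same treatment.

```erlang
-module(fprof_warmup).
-export([profile/3]).

%% Load Mod explicitly first, so that the profiled call does not
%% trigger the code load itself, then run the call under fprof.
profile(Mod, Fun, Args) ->
    case code:ensure_loaded(Mod) of
        {module, Mod} ->
            fprof:apply(Mod, Fun, Args);
        {error, Reason} ->
            %% Report the load failure instead of starting the profiler.
            {error, {not_loaded, Mod, Reason}}
    end.
```

With the module from this thread, the call would be fprof_warmup:profile(weird, run, [1]).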

Cheers.

Rory


Bjorn Gustavsson

Mar 10, 2008, 7:39:06 AM
to Rory Byrne, erlang-q...@erlang.org
Rory Byrne <ro...@jinsky.com> writes:

> Erlang (BEAM) emulator version 5.6 [source] [async-threads:0]
> [kernel-poll:false]
>
> Eshell V5.6 (abort with ^G)

You should update to R12B-1. If I remember correctly, this was one of the
bugs we fixed for the R12B-1 release.

Rory Byrne

Mar 10, 2008, 10:08:29 AM
to erlang-q...@erlang.org
On Mon, Mar 10, 2008 at 12:39:06PM +0100, Bjorn Gustavsson wrote:
> Rory Byrne <ro...@jinsky.com> writes:
>
> > Erlang (BEAM) emulator version 5.6 [source] [async-threads:0]
> > [kernel-poll:false]
> >
> > Eshell V5.6 (abort with ^G)
>
> You should update to R12B-1. If I remember correctly, this was one of the
> bugs we fixed for the R12B-1 release.
>

Oops! I really thought I was using R12B-1 on this machine. I'm
all fixed now. Sorry about that.

Rory
