[Erlang Programming Book] Exercise 6-3: Testing the supervisor.


Ivan Uemlianin

Apr 21, 2010, 10:04:44 AM
to Erlang Programming
Dear All

Starting into this exercise to expand the supervisor example, I
thought the first thing I should do is write a test script. The first
version should just run my_supervisor.erl with the mutex semaphore and
database server as children. It should verify that the supervisor
restarts processes after exits.

Actually the version I've got now is the zeroth version: it sets up
the supervisor and its children, and I interact with the children in
the shell.

However, children seem not always to receive pids and consequently
restarts fail. What is the correct way to write this kind of test
script?

Code and shell session below:

%% stest.erl

-module(stest).
-export([test/0]).

test() ->
    ChildList = [{mutex, start, []},
                 {my_db, loop, [[]]}],
    my_supervisor:start_link(my_supervisor, ChildList),
    reportChildPids(ChildList).

reportChildPids([]) -> ok;
reportChildPids([{Mod, _Func, _Arg}|T]) ->
    CPid = erlang:whereis(Mod), % ?undefined pid as module not registered
    io:format("~p pid: ~p.~n", [Mod, CPid]),
    reportChildPids(T).


%% shell session

1> c(stest).
{ok,stest}
2> stest:test().
mutex pid: undefined.
my_db pid: undefined.
ok
3> mutex:wait().
linking to and locking for <0.33.0>
ok_waiting
4> mutex:signal().
signal from/unlinking <0.33.0>
ok_freeing
5> whereis(mutex).
<0.41.0>
6> exit(whereis(mutex), kill).
true
7> whereis(mutex).
undefined
8> mutex:wait().

=ERROR REPORT==== 21-Apr-2010::14:41:42 ===
Error in process <0.40.0> with exit value: {{badmatch,false},
[{my_supervisor,restart_child,2},{my_supervisor,loop,1}]}

** exception error: bad argument
in function mutex:wait/0


When the supervisor first starts up, mutex and my_db don't have pids
(because they haven't been registered?); but after a couple of uses
mutex has a pid (command 5). Finally the error report seems to say
that my_supervisor:restart_child/2 has failed (presumably because
keysearch returns false, which won't match onto {value,
{Pid, ...etc}).
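
The suspected failure mode can be reproduced in isolation. The following is a minimal sketch, not the book's code: the module name badmatch_demo and the function strict_lookup/2 are made up for illustration, and the child list shape {Pid, {Mod, Func, Args}} is an assumption based on the error report above.

```erlang
-module(badmatch_demo).
-export([strict_lookup/2]).

%% Hypothetical helper: matches directly on the result of
%% lists:keysearch/3, the way the error report suggests
%% restart_child/2 does. ChildList is assumed to hold
%% {Pid, {Mod, Func, Args}} tuples.
strict_lookup(Pid, ChildList) ->
    %% lists:keysearch/3 returns {value, Tuple} on success and the
    %% atom false when no tuple has Pid as its first element, so this
    %% match raises {badmatch, false} for an unknown pid.
    {value, {Pid, Spec}} = lists:keysearch(Pid, 1, ChildList),
    Spec.
```

Calling badmatch_demo:strict_lookup(self(), []) fails with a {badmatch,false} error, the same exit value as in the error report.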

With thanks and best wishes

Ivan

--
Erlang Programming Website:
http://www.erlangprogramming.org/

Ale

Apr 21, 2010, 10:22:50 AM
to erlang-prog...@googlegroups.com
Hello Ivan,

> %% stest.erl
>
> -module(stest).
> -export([test/0]).
>
> test() ->
>    ChildList = [{mutex, start, []},
>                 {my_db, loop, [[]]}],
>    my_supervisor:start_link(my_supervisor, ChildList),
>    reportChildPids(ChildList).

Even though this probably doesn't answer your question fully, it might
help. Why don't you make start_link return a list of the pids it is
monitoring?


start_link([], Pids) -> Pids;
start_link([{RegName, Func, Args}|T], Pids) ->
    Pid = spawn_link(RegName, Func, Args),
    % ^ or whatever you use to monitor;
    % in case of spawn_monitor it's a tuple, so you could pattern match
    % {Pid, Ref}
    NewPids = [Pid|Pids],
    start_link(T, NewPids).

And then you print the pids..

Just a thought,

Cheers,


--
Ale.

Ivan Uemlianin

Apr 21, 2010, 12:15:23 PM
to Erlang Programming
Hi Ale

I went and had another look at my_supervisor.erl (it's on pp153-4 of
the book, and in the code zipfile). I was looking in the wrong place
for the pids, but start_link isn't the right place either.

The function start_children/1 goes through the list of {Mod, Func,
Args} starting each in turn (with apply/3) and returning a list of
{Pid, {Mod, Func, Args}}:

start_children([]) -> [];
start_children([{M, F, A} | ChildSpecList]) ->
    case (catch apply(M,F,A)) of
        {ok, Pid} ->
            [{Pid, {M,F,A}}|start_children(ChildSpecList)];
        _ ->
            start_children(ChildSpecList)
    end.

To get a report of the actual pids in the running childlist I added a
report entry to the receive loop:

loop(ChildList) ->
    receive
        {'EXIT', Pid, _Reason} ->
            NewChildList = restart_child(Pid, ChildList),
            loop(NewChildList);
        {stop, From} ->
            From ! {reply, terminate(ChildList)};
        {report, From} ->
            From ! {reply, {length(ChildList), getPids(ChildList)}},
            loop(ChildList)
    end.

report(Name) ->
    Name ! {report, self()},
    receive {reply, Reply} -> Reply end.

getPids([]) -> [];
getPids([{Pid, {Mod,_F,_A}}|T]) ->
    [{Mod,Pid}|getPids(T)].


However, the pids don't seem to be added to the list:

1> c(my_supervisor).
{ok,my_supervisor}
2> c(mutex).
{ok,mutex}
3> my_supervisor:start_link(my_supervisor, [{mutex, start, []}]).
ok
4> mutex:wait().
linking to and locking for <0.33.0>
ok_waiting
5> my_supervisor:report(my_supervisor).
{0,[]}

In fact, apply(mutex, start, []) does not return {ok, Pid}, but
true. So, although mutex is started, and we can communicate with it,
and although my_supervisor is trapping exits, mutex is not being added
to the supervision tree.

I have two conclusions:

1: children should be rewritten to return {ok, Pid} from their
initialisation functions. Fair enough;

2: The example code is broken: processes which start up ok but do not
return {ok, Pid} to apply/3 seem to be playing along but are not in
fact in the supervision tree. When restart_child/2 is called on one
of them the supervisor crashes. I am wrong surely?
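
One way to keep the supervisor alive in that situation is to handle the false case explicitly. This is only a sketch under assumptions: the real restart_child/2 is not quoted in this thread, so the shape below is inferred from the error report and from start_children/1, and the module name restart_sketch is invented for the example.

```erlang
-module(restart_sketch).
-export([restart_child/2]).

%% Defensive variant of restart_child/2 (hypothetical; not the
%% book's version). ChildList holds {Pid, {Mod, Func, Args}} tuples.
restart_child(Pid, ChildList) ->
    case lists:keysearch(Pid, 1, ChildList) of
        {value, {Pid, {M, F, A}}} ->
            Rest = lists:keydelete(Pid, 1, ChildList),
            %% Restart the child; keep it only if it honours the
            %% {ok, Pid} convention, as start_children/1 does.
            case (catch apply(M, F, A)) of
                {ok, NewPid} -> [{NewPid, {M, F, A}} | Rest];
                _            -> Rest
            end;
        false ->
            %% 'EXIT' came from a linked process that was never added
            %% to the child list: leave the list unchanged instead of
            %% crashing on a failed match.
            ChildList
    end.
```

With this variant an 'EXIT' from an unknown pid is ignored. The child is still not restarted, so the start function still needs to return {ok, Pid} for supervision to work.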

Best wishes

Ivan

Ale

Apr 21, 2010, 12:37:55 PM
to erlang-prog...@googlegroups.com
A quick reply, later I'll take a better look.

>
> In fact, apply(mutex, start, [])., does not return {ok, Pid}, but
> true.  So, although mutex is started, and we can communicate with it,
> and although my_supervisor is trapping exits, mutex is not being added
> to the supervision tree.

apply returns whatever the function you called returns, in this case
start. In the book it says it assumes the child's start function uses
spawn_link and returns {ok, Pid} (bottom of page 152). So the mutex
start function should return {ok, Pid}.

>
> I have two conclusions:
>
> 1: children should be rewritten to return {ok, Pid} from their
> initialisation functions.  Fair enough;

Yeah, like I said above.

> 2: The example code is broken: processes which start up ok but do not
> return {ok, Pid} to apply/3 seem to be playing along but are not in
> fact in the supervision tree.  When restart_child/2 is called on one
> of them the supervisor crashes.  I am wrong surely?
>
To tell you the truth, I've not looked at this exercise in detail, but
it seems, like you said, that because the apply isn't returning
{ok, Pid}, the child isn't included in the list passed to the loop.
restart_child needs that entry, so it will probably fail in the
pattern-matching expression {value, ... }, as I think keysearch returns
false if it doesn't find anything. So yes, I agree with you. :-)


Cheers,

--
Ale.

Ivan Uemlianin

Apr 21, 2010, 2:17:57 PM
to Erlang Programming
You're right: I had missed the stated {ok, Pid} assumption on p. 152.

I'll edit mutex and my_db to return {ok, Pid} from start/0, and start
again.
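
Roughly what I have in mind, as a sketch only: the module name mutex_start_sketch, the registered name mutex_sketch and the trivial loop are placeholders, not the book's mutex. The original start/0 presumably returns true because it ends with a call to register/2, which returns true on success.

```erlang
-module(mutex_start_sketch).
-export([start/0, init/0]).

%% start/0 spawns the child linked to the caller (the supervisor),
%% registers it, and returns {ok, Pid} rather than the 'true' that
%% register/2 returns.
start() ->
    Pid = spawn_link(?MODULE, init, []),
    register(mutex_sketch, Pid),   % placeholder registered name
    {ok, Pid}.

init() ->
    loop().

%% Placeholder loop standing in for the real mutex state machine.
loop() ->
    receive
        stop   -> ok;
        _Other -> loop()
    end.
```

With a start function of this shape, start_children/1 matches the {ok, Pid} clause and the child ends up in the supervisor's list.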

festina lente.

Ivan