async_unix helper threads never finalize

Owen Gunden

Nov 3, 2015, 10:38:46 AM
to ocaml...@googlegroups.com
I think there is a bug in the code that is supposed to reclaim helper
threads back into the general async thread pool. The following test
program crashes when it hits 50 threads:

open! Core.Std
open! Async.Std

(* Work done inside each helper thread: pause for one second, then finish. *)
let thread_main i =
  Caml.Printf.printf "thread main %d start\n%!" i;
  Time.pause (Time.Span.of_sec 1.);
  Caml.Printf.printf "thread main %d end\n%!" i;
  Ok ()
;;

(* Create a named helper thread, raising if the pool cannot supply one. *)
let make_helper_exn name =
  In_thread.Helper_thread.create ~name () |! function
  | Error e ->
    failwithf "make_helper_exn: %s" (Error.to_info e |! Info.to_string_hum) ()
  | Ok x -> x
;;

let make_thread_and_run n =
  let thread = make_helper_exn (sprintf "thread-%d" n) in
  In_thread.run ~thread (fun () -> thread_main n) >>| ok_exn
;;

(* Start a new helper thread every 5 seconds, forever. Each helper thread
   becomes unreachable once its one-second job has finished. *)
let run () =
  let rec f n =
    don't_wait_for (make_thread_and_run n);
    after (Time.Span.of_sec 5.) >>= fun () -> f (n + 1)
  in
  f 1
;;

let () =
  (run () >>> fun () -> Shutdown.shutdown 0);
  never_returns (Scheduler.go ())
;;


Results:
% ./test_helper_thread_example.exe
thread main 1 start
thread main 1 end
thread main 2 start
thread main 2 end
thread main 3 start
thread main 3 end
thread main 4 start
thread main 4 end
thread main 5 start
thread main 5 end
thread main 6 start
thread main 6 end
thread main 7 start
thread main 7 end
thread main 8 start
thread main 8 end
thread main 9 start
thread main 9 end
thread main 10 start
thread main 10 end
thread main 11 start
thread main 11 end
thread main 12 start
thread main 12 end
thread main 13 start
thread main 13 end
thread main 14 start
thread main 14 end
thread main 15 start
thread main 15 end
thread main 16 start
thread main 16 end
thread main 17 start
thread main 17 end
thread main 18 start
thread main 18 end
thread main 19 start
thread main 19 end
thread main 20 start
thread main 20 end
thread main 21 start
thread main 21 end
thread main 22 start
thread main 22 end
thread main 23 start
thread main 23 end
thread main 24 start
thread main 24 end
thread main 25 start
thread main 25 end
thread main 26 start
thread main 26 end
thread main 27 start
thread main 27 end
thread main 28 start
thread main 28 end
thread main 29 start
thread main 29 end
thread main 30 start
thread main 30 end
thread main 31 start
thread main 31 end
thread main 32 start
thread main 32 end
thread main 33 start
thread main 33 end
thread main 34 start
thread main 34 end
thread main 35 start
thread main 35 end
thread main 36 start
thread main 36 end
thread main 37 start
thread main 37 end
thread main 38 start
thread main 38 end
thread main 39 start
thread main 39 end
thread main 40 start
thread main 40 end
thread main 41 start
thread main 41 end
thread main 42 start
thread main 42 end
thread main 43 start
thread main 43 end
thread main 44 start
thread main 44 end
thread main 45 start
thread main 45 end
thread main 46 start
thread main 46 end
thread main 47 start
thread main 47 end
thread main 48 start
thread main 48 end
thread main 49 start
thread main 49 end
thread main 50 start
thread main 50 end
(((pid 23908) (thread_id 0))
((human_readable 2015-11-03T15:31:03+0000)
(int63_ns_since_epoch 1446564663932287000))
"unhandled exception in Async scheduler"
("unhandled exception"
((src/monitor.ml.Error_
((exn
(Failure
"make_helper_exn: (\"create_helper_thread could not get a thread\"\
\n ((id 0) (state In_use)\
\n (mutex ((id_of_thread_holding_lock 0) (num_using_blocker 0)
(blocker ())))\
\n (default_priority 0) (max_num_threads 50) (num_threads 50)\
\n (thread_creation_failure_lockout 1s)\
\n (last_thread_creation_failure (1970-01-01 00:00:00.000000Z))\
\n (threads\
\n (((name thread-50) (thread_id (50)) (priority 0)\
\n (state\
\n (Helper\
\n ((in_pool 0) (state In_use) (thread <opaque>)
(default_name thread-50)\
\n (default_priority 0))))\
\n (unfinished_work 0)\
\n (work_queue\
\n ((ev_q ()) (maxsize 1) (mutex <opaque>) (not_empty
<opaque>)\
\n (not_full <opaque>) (finally <fun>))))\
\n ((name thread-49) (thread_id (49)) (priority 0)\
\n (state\
\n (Helper\
\n ((in_pool 0) (state In_use) (thread <opaque>)
(default_name thread-49)\
\n (default_priority 0))))\
\n (unfinished_work 0)\
\n (work_queue\

[snipped state of the remaining 48 threads]


My theory on what causes this issue is detailed here:

https://github.com/janestreet/async_unix/issues/8

Unfortunately, I don't have a patch yet, but I think a decent
workaround would be to let the user free the thread manually by
exposing the call to finished_with_helper_thread in the interface of
In_thread.Helper_thread. One could even remove the finalizer and add docs
saying you have to relinquish the helper thread manually; the result
would at least meet user expectations.
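
To make the "relinquish manually" idea concrete, here is a rough sketch
of the pattern using only the OCaml standard library. It is a toy model
and not async_unix code: the pool type and the names acquire, release
and with_slot are all made up for illustration, and it says nothing
about how finished_with_helper_thread actually works internally.

```ocaml
(* Toy model (hypothetical, not async_unix): a fixed-size pool whose slots
   are handed back by an explicit call instead of by a GC finalizer. *)
type pool = { max_slots : int; mutable in_use : int }

type slot = { pool : pool; mutable released : bool }

(* Take a slot, or report an error when the pool is exhausted. *)
let acquire pool =
  if pool.in_use >= pool.max_slots then Error "could not get a slot"
  else begin
    pool.in_use <- pool.in_use + 1;
    Ok { pool; released = false }
  end

(* Hand the slot back. Idempotent, so a double release cannot corrupt
   the in-use count. *)
let release slot =
  if not slot.released then begin
    slot.released <- true;
    slot.pool.in_use <- slot.pool.in_use - 1
  end

(* Run [f] with a slot and release the slot afterwards, even if [f] raises. *)
let with_slot pool f =
  match acquire pool with
  | Error msg -> Error msg
  | Ok slot ->
    Fun.protect ~finally:(fun () -> release slot) (fun () -> Ok (f slot))

let () =
  let pool = { max_slots = 2; in_use = 0 } in
  for i = 1 to 5 do
    match with_slot pool (fun _ -> i * i) with
    | Ok v -> Printf.printf "job %d -> %d\n" i v
    | Error msg -> Printf.printf "job %d failed: %s\n" i msg
  done;
  Printf.printf "slots in use at exit: %d\n" pool.in_use
```

Because each slot is returned as soon as its job is done, the pool never
fills up, regardless of when (or whether) the GC runs.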

Stephen Weeks

Nov 3, 2015, 2:11:53 PM
to ocaml...@googlegroups.com
Thanks for the bug report Owen. I confirm the bug. Your test isn't
quite a proof of the existence of the bug, though, because one must
call [Gc.full_major] to give finalizers a chance to run.

I've got a simple fix internally and a test to make sure it doesn't
happen again. The fix should hit in the next public release or two.
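
For readers wondering why [Gc.full_major] matters here: a finalizer
does not run at the moment a value becomes unreachable, only once the
GC gets around to collecting it. A minimal stdlib-only sketch of that
behavior follows. It is not async_unix code, and make_resource and
finalized are hypothetical names chosen for the example.

```ocaml
(* Set by the finalizer, so we can observe when it actually runs. *)
let finalized = ref false

(* Allocate a heap value, attach a finalizer to it, and drop the only
   reference to it by returning unit. *)
let make_resource () =
  let r = Bytes.create 16 in
  Gc.finalise (fun _ -> finalized := true) r

let () =
  make_resource ();
  (* The value is unreachable here, but no major collection has been
     forced, so the finalizer has typically not run yet. *)
  Printf.printf "before Gc.full_major: finalized = %b\n" !finalized;
  (* Force a full major collection, which gives finalizers a chance to run. *)
  Gc.full_major ();
  Printf.printf "after Gc.full_major: finalized = %b\n" !finalized
```

By the same reasoning, a repro for the helper-thread bug would need to
call Gc.full_major () somewhere in its loop; otherwise hitting the
50-thread limit could just mean that no collection happened in time.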

Owen Gunden

Nov 3, 2015, 8:16:53 PM
to ocaml...@googlegroups.com
On 11/03/2015 02:11 PM, Stephen Weeks wrote:
> Thanks for the bug report Owen. I confirm the bug. Your test isn't
> quite a proof of the existence of the bug, though, because one must
> call [Gc.full_major] to give finalizers a chance to run.

Yeah sorry, we did try this but I forgot to include it in my report.

> I've got a simple fix internally and a test to make sure it doesn't
> happen again. The fix should hit in the next public release or two.

Fantastic!