Re: [erlang-questions] nif driver c code must be reentrant?

Gleb Peregud

May 21, 2013, 3:54:31 AM
to 韦崇焜, erlang-q...@erlang.org
I believe there is currently neither an API nor any technical means to
make NIFs reentrant. Hence you should refrain from writing NIFs that can
run for a long time.

But if you want to make NIFs friendlier to the scheduler, you can
call erlang:bump_reductions/1 from the Erlang wrapper around the NIF call.

On Tue, May 21, 2013 at 4:14 AM, 韦崇焜 <wckbl...@yahoo.com.cn> wrote:
> Hi,
>
> I have been trying to write some NIF code recently, and I am reading the book
> "Erlang and OTP in Action". In chapter 12, "Integrating with foreign code
> using ports and NIFs", page 313, section "Making a linked driver",
>
> I found this passage:
> "
> Because all these instances execute
> within the memory of the Erlang VM, possibly running as different threads, the driver
> code must be designed to be reentrant—executable by multiple simultaneous callers—
> and must not depend on global variables or locks.
> "
>
> The book then gives some ways to avoid this problem, such as
> not using global variables and using driver_alloc and driver_free to allocate
> memory dynamically.
>
> Of course, this passage appears in the section "Making a linked
> driver".
>
> My questions are:
>
> 1) Does NIF C code have the same problem? That is, must NIF C code
> be designed to be reentrant?
> 2) I know that a NIF call blocks the VM until the C code
> returns. Are there other problems we must watch for when using NIFs?
>
> And if NIF C code must be designed to be reentrant, then I have
> another question.
> I read this on erlang.org:
>
> "
> NIFs are called directly by the same scheduler thread that executed the
> calling Erlang code. The calling scheduler will thus be blocked from doing any
> other work until the NIF returns
> "
>
> That is, the NIF runs on the same thread as the calling scheduler, so the
> caller is blocked.
> Then why should we design our NIF code to be reentrant?
>
>
> This e-mail is a little long. Thanks for reading; I hope to receive some
> helpful replies.
> Thanks!
>
>
>
>
> Best Regards
> ckwei
> _______________________________________________
> erlang-questions mailing list
> erlang-q...@erlang.org
> http://erlang.org/mailman/listinfo/erlang-questions

Rapsey

May 21, 2013, 4:33:20 AM
to Gleb Peregud, erlang-q...@erlang.org
My experience with R16B and frequent NIF calls with bump_reductions is
very negative. Without bump_reductions it is even worse. At the moment I
would not use NIFs for anything. Everything looks OK, but once the
server gets busy it stops working completely.


Sergej

Attila Rajmund Nohl

May 21, 2013, 4:39:16 AM
to 韦崇焜, erlang-questions
2013/5/21 韦崇焜 <wckbl...@yahoo.com.cn>:
[...]
> "
> NIFs are called directly by the same scheduler thread that executed the
> calling Erlang code. The calling scheduler will thus be blocked from doing any
> other work until the NIF returns
> "
>
> That is, the NIF runs on the same thread as the calling scheduler, so the
> caller is blocked.
> Then why should we design our NIF code to be reentrant?

I think on SMP systems you can have multiple schedulers, so the same NIF
can be called from several scheduler threads at the same time.
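
To illustrate why multiple schedulers matter, here is a minimal sketch in plain C (no erl_nif.h, so it compiles on its own). The function names and the byte-summing task are hypothetical, chosen only for illustration. The first version keeps its working state in a global, so two threads calling it at once would overwrite each other's totals; the second keeps everything in locals and arguments, which is what "reentrant" asks for.

```c
#include <stddef.h>

/* NOT reentrant: the running total lives in a global, so two threads
 * calling this at the same time would corrupt each other's sums. */
static long g_total;

long sum_bytes_global(const unsigned char *buf, size_t len)
{
    g_total = 0;
    for (size_t i = 0; i < len; i++)
        g_total += buf[i];
    return g_total;
}

/* Reentrant: all state is in locals and caller-supplied arguments, so any
 * number of threads can run it simultaneously on their own buffers. */
long sum_bytes_reentrant(const unsigned char *buf, size_t len)
{
    long total = 0;
    for (size_t i = 0; i < len; i++)
        total += buf[i];
    return total;
}
```

Called from a single thread both versions return the same value; the difference only shows up under concurrent callers, which is the situation a NIF faces when several schedulers are running.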

Max Lapshin

May 21, 2013, 4:53:01 AM
to Attila Rajmund Nohl, erlang-questions
Rapsey: I remember that you do a lot of NIF work and deliver
large amounts of video from a NIF thread.

How do you solve it now?

Rapsey

May 21, 2013, 6:08:09 AM
to Max Lapshin, erlang-questions
I keep my streaming servers on R14, where it works great. The calls
generally scan a buffer and do a write to a non-blocking socket. I
don't know why R16 has such an issue with it.


Sergej

Dan Gudmundsson

May 21, 2013, 9:31:01 AM
to 韦崇焜, erlang-q...@erlang.org
> 1) Does NIF C code have the same problem? That is, must NIF C code
> be designed to be reentrant?

Yes

> 2) I know that a NIF call blocks the VM until the C code
> returns. Are there other problems we must watch for when using NIFs?


Dan Gudmundsson

May 21, 2013, 9:34:46 AM
to Gleb Peregud, erlang-q...@erlang.org
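If you have lengthy work that you can break into several parts,
you can easily do it like this:

lengthy_work(Data) ->
    case lengthy_work_impl(Data) of
        {continue, _State} = Cont ->
            %% Charge reductions so the scheduler can run other
            %% processes before the next slice of work.
            erlang:bump_reductions(1000000),
            %% Loop until the NIF reports that it has finished.
            lengthy_work(Cont);
        {finished, Result} ->
            Result
    end.

%% Stub that is replaced by the NIF when the library is loaded.
lengthy_work_impl(_) -> erlang:nif_error(undefined).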
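
For the C side of the wrapper above, here is a minimal sketch in plain C (deliberately without erl_nif.h, so it compiles standalone). It assumes the work is summing an array in bounded slices. The names work_state, lengthy_work_step and lengthy_work_run and the budget parameter are hypothetical. In a real NIF the state would travel back to Erlang inside the {continue, State} term, and the loop in lengthy_work_run would be the Erlang wrapper rather than C.

```c
#include <stddef.h>

/* Hypothetical continuation state: where to resume and the partial result. */
typedef struct {
    size_t next;   /* index of the first unprocessed element */
    long   acc;    /* partial sum so far */
} work_state;

enum work_status { WORK_CONTINUE, WORK_FINISHED };

/* Process at most `budget` elements, then return so the caller can yield. */
enum work_status lengthy_work_step(const int *data, size_t len,
                                   size_t budget, work_state *st)
{
    if (budget == 0)
        budget = 1;                      /* always make some progress */
    size_t stop = st->next + budget;
    if (stop > len || stop < st->next)   /* clamp, and guard against wrap */
        stop = len;
    for (; st->next < stop; st->next++)
        st->acc += data[st->next];
    return st->next < len ? WORK_CONTINUE : WORK_FINISHED;
}

/* Stand-in for the Erlang wrapper: call the step function until it is done.
 * Stores the number of steps taken in *steps and returns the final sum. */
long lengthy_work_run(const int *data, size_t len, size_t budget,
                      size_t *steps)
{
    work_state st = {0, 0};
    *steps = 0;
    do {
        (*steps)++;
    } while (lengthy_work_step(data, len, budget, &st) == WORK_CONTINUE);
    return st.acc;
}
```

The fixed budget is the simplest choice for a sketch; how large a slice should be in practice depends on how expensive each element is, which this example does not try to measure.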





Scott Lystig Fritchie

May 21, 2013, 2:13:57 PM
to Dan Gudmundsson, erlang-q...@erlang.org
Dan Gudmundsson <dan...@gmail.com> wrote:

dg> [...]
dg> erlang:bump_reductions(1000000)
dg> [...]

Dan, that prompts a good question that has been bugging several people
(including me). Does this 1 million reduction bump have any more effect
than a 2000 reduction bump? IIRC, a process is scheduled out after
2000 reductions. And my (poor) memory says that the reduction count
starts at 2000, that count is decremented, and a scheduling decision is
made when the reduction count goes negative.

Does it really make a difference if the count goes negative by a little
(e.g., only one) or by a lot (e.g., 999,990)?

-Scott
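
The question can be made concrete with a toy model in plain C. It encodes only the behaviour recalled in the message above (the count starts at 2000, is decremented, and the process is scheduled out once it goes negative), plus the assumption that the count is reset when the process is scheduled in again. It is not the ERTS implementation, and all names are hypothetical.

```c
#include <stdbool.h>

#define TOY_CONTEXT_REDS 2000  /* value recalled in the thread; an assumption */

typedef struct { int reds_left; } toy_process;

/* Assumption: every time a process is scheduled in, its count is reset. */
void toy_schedule_in(toy_process *p)
{
    p->reds_left = TOY_CONTEXT_REDS;
}

/* Charge n reductions; returns true if the process should be scheduled out,
 * i.e. the remaining count has gone negative. */
bool toy_bump(toy_process *p, int n)
{
    p->reds_left -= n;
    return p->reds_left < 0;
}
```

Under these assumptions, overshooting by one and overshooting by 999,990 lead to the same scheduling decision, and both processes start their next timeslice with the same count. Whether the real scheduler carries any of the overshoot forward is exactly what the model cannot answer.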

Dan Gudmundsson

May 22, 2013, 3:31:32 AM
to Scott Lystig Fritchie, erlang-q...@erlang.org
I believe you are correct. I just grabbed a large enough number that the example will keep working even if things change.

Even better would be to bump accordingly inside the NIF with enif_consume_timeslice, but then it wouldn't be a short example anymore.

/Dan

Paul Davis

May 25, 2013, 6:14:42 AM
to Scott Lystig Fritchie, erlang-q...@erlang.org
I'm not entirely sure how this function works but the implementation
of bump_reductions seems to cap the reduction bump to CONTEXT_REDS
here:

https://github.com/erlang/otp/blob/maint/erts/emulator/beam/bif.c#L4628

One place I've seen reductions calculated is in sending distributed
messages which suggests that the CONTEXT_REDS threshold is only
specific to erlang:bump_reductions/1. Though I'm also not sure if
there's a reason to bump more than 2000.

https://github.com/erlang/otp/blob/maint/erts/emulator/beam/dist.c#L1847

Also, I've been wondering lately if anyone has tried erlang:yield/0 to
address the NIF problem on newer Erlang versions. Most of the stuff I
work on is still on R14, so I haven't had to invest much time in
addressing the reductions issue. But forcing a yield seems to be mostly
what people are trying to accomplish, with maybe a bit less than a
yield for smaller operations.