[erlang-questions] Dirty NIF - classifying as CPU or I/O bound

Roger Lipscombe

Oct 14, 2018, 8:42:11 AM

to erlang-questions

The documentation (http://erlang.org/doc/man/erl_nif.html#dirty_nifs) says:

"It is important to classify the dirty job correct [sic]. An I/O bound job should be classified as such, and a CPU bound job should be classified as such. If you should classify CPU bound jobs as I/O bound jobs, dirty I/O schedulers might starve ordinary schedulers. I/O bound jobs are expected to either block waiting for I/O, and/or spend a limited amount of time moving data."

If I *don't know* whether the job is going to be CPU bound or I/O bound (it executes arbitrary code provided by a third party), am I safest to just classify the dirty job as CPU-bound? Or is this warning hinting at a disaster of biblical proportions[1] if I even *think* about fudging the classification?

Thanks,

Roger.

[1] Dr. Peter Venkman, 1984.

Stanislaw Klekot

Oct 14, 2018, 8:49:37 AM

to Roger Lipscombe, erlang-questions

If you execute third-party code you have no control over, you don't
execute it as a NIF, but as a separate program through a port (or maybe as
a C node). NIFs are not a solution for every problem in Erlang.

--
Stanislaw Klekot
_______________________________________________
erlang-questions mailing list
erlang-q...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions

Roger Lipscombe

Oct 14, 2018, 9:03:34 AM

to Stanislaw Klekot, erlang-questions

On 14 October 2018 at 13:49, Stanislaw Klekot <erlan...@jarowit.net> wrote:

If you execute third-party code you have no control over, you don't
execute it as a NIF, but as a separate program through a port (or maybe as
a C node). NIFs are not a solution for every problem in Erlang.

I'm aware of the risks. Assume that I'm accepting them.

Stanislaw Klekot

Oct 14, 2018, 9:45:11 AM

to Roger Lipscombe, erlang-questions

Why do you insist on using the ill-suited mechanism of NIFs, which will
most probably destabilize your BEAM machine? Ports were designed
specifically so you can easily run foreign code without causing problems
to the VM.

Jesper Louis Andersen

Oct 14, 2018, 10:47:40 AM

to Roger Lipscombe, Erlang (E-mail)

On Sun, Oct 14, 2018 at 2:42 PM Roger Lipscombe <ro...@differentpla.net> wrote:

If I *don't know* whether the job is going to be CPU bound or I/O bound (it executes arbitrary code provided by a third party), am I safest to just classify the dirty job as CPU-bound? Or is this warning hinting at a disaster of biblical proportions[1] if I even *think* about fudging the classification?

Either classification risks being wrong, so you can't really choose either of them safely. The two classifications exist because IO resources and CPU resources tend to be consumed orthogonally: if we have many IO bound jobs, we can still run CPU bound jobs, and vice versa. But if you don't know a priori what kind of job you are looking at, you have no way to classify it correctly.

You *could* run it with some kind of sandbox environment and then make a guess: If it were IO bound last time, it probably still is. But this is more involved.
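The guess described above could look something like the following. This is only a minimal sketch under stated assumptions: the `job_profile` struct, the `classify_job` function, and the 10% CPU-to-wall-time cutoff are all hypothetical, and the `job_class` values merely stand in for the real `erl_nif.h` flags `ERL_NIF_DIRTY_JOB_CPU_BOUND` and `ERL_NIF_DIRTY_JOB_IO_BOUND`. A job that has never been observed falls back to CPU bound, as the original question proposes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for ERL_NIF_DIRTY_JOB_CPU_BOUND and
 * ERL_NIF_DIRTY_JOB_IO_BOUND from erl_nif.h. */
typedef enum { JOB_CPU_BOUND, JOB_IO_BOUND } job_class;

/* Timings recorded from an earlier (sandboxed) run of the same job.
 * have_sample == 0 means the job has never been observed. */
typedef struct {
    int have_sample;
    double cpu_seconds;   /* CPU time the job consumed */
    double wall_seconds;  /* elapsed time for the same run */
} job_profile;

/* Assumed cutoff: a job that was on the CPU for less than 10% of its
 * elapsed time is treated as mostly waiting on I/O. */
#define IO_BOUND_MAX_CPU_RATIO 0.10

job_class classify_job(const job_profile *p)
{
    assert(p != NULL);

    /* No usable history: fall back to CPU bound. */
    if (!p->have_sample || p->wall_seconds <= 0.0)
        return JOB_CPU_BOUND;

    double ratio = p->cpu_seconds / p->wall_seconds;
    return ratio < IO_BOUND_MAX_CPU_RATIO ? JOB_IO_BOUND : JOB_CPU_BOUND;
}
```

The result would then pick which dirty-job flag is passed when the NIF is scheduled. The cutoff is arbitrary and would need tuning against real workloads.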

Roger Lipscombe

Oct 14, 2018, 10:58:53 AM

to Stanislaw Klekot, erlang-questions

On 14 October 2018 at 14:44, Stanislaw Klekot <erlan...@jarowit.net> wrote:

Why do you insist on using the ill-suited mechanism of NIFs, which will
most probably destabilize your BEAM machine? Ports were designed
specifically so you can easily run foreign code without causing problems
to the VM.

Why do you insist on *telling me something that I already know*? Assume that there are other things going on (that I'm not going to tell you about, because they're not relevant to the question) that make using a NIF the right trade-off even with the stability risks.

Roger Lipscombe

Oct 14, 2018, 11:08:05 AM

to Jesper Louis Andersen, Erlang (E-mail)

On 14 October 2018 at 15:46, Jesper Louis Andersen <jesper.lou...@gmail.com> wrote:

On Sun, Oct 14, 2018 at 2:42 PM Roger Lipscombe <ro...@differentpla.net> wrote:
If I *don't know* whether the job is going to be CPU bound or I/O bound (it executes arbitrary code provided by a third party), am I safest to just classify the dirty job as CPU-bound? Or is this warning hinting at a disaster of biblical proportions[1] if I even *think* about fudging the classification?

Either classification risks being wrong, so you can't really choose either of them safely. The two classifications exist because IO resources and CPU resources tend to be consumed orthogonally: if we have many IO bound jobs, we can still run CPU bound jobs, and vice versa. But if you don't know a priori what kind of job you are looking at, you have no way to classify it correctly.

Thanks Jesper, I guess my question is rooted in this statement in the docs:

"If you should classify CPU bound jobs as I/O bound jobs, dirty I/O schedulers might starve ordinary schedulers."

This, to me, implies that I should probably classify unknown jobs as CPU bound, rather than I/O bound, because the documentation only mentions bad things happening one way round.

Based on my limited knowledge of how dirty schedulers work, my instinct tells me that classifying jobs as CPU bound when they're I/O bound will probably just be less efficient, whereas classifying jobs as I/O bound when they're CPU bound will result in trying to run too many jobs at once. But I'm just guessing.

Cheers,

Roger.

Steve Vinoski

Oct 14, 2018, 11:09:23 AM

to Jesper Louis Andersen, erlang-questions

I agree with Jesper, and I'll add that to the best of my knowledge no disaster of biblical proportions will happen if you guess incorrectly.

--steve

Steve Vinoski

Oct 14, 2018, 11:29:36 AM

to Roger Lipscombe, erlang-questions

On Sun, Oct 14, 2018 at 11:08 AM Roger Lipscombe <ro...@differentpla.net> wrote:

On 14 October 2018 at 15:46, Jesper Louis Andersen <jesper.lou...@gmail.com> wrote:
On Sun, Oct 14, 2018 at 2:42 PM Roger Lipscombe <ro...@differentpla.net> wrote:
If I *don't know* whether the job is going to be CPU bound or I/O bound (it executes arbitrary code provided by a third party), am I safest to just classify the dirty job as CPU-bound? Or is this warning hinting at a disaster of biblical proportions[1] if I even *think* about fudging the classification?

Either classification risks being wrong, so you can't really choose either of them safely. The two classifications exist because IO resources and CPU resources tend to be consumed orthogonally: if we have many IO bound jobs, we can still run CPU bound jobs, and vice versa. But if you don't know a priori what kind of job you are looking at, you have no way to classify it correctly.

Thanks Jesper, I guess my question is rooted in this statement in the docs:

"If you should classify CPU bound jobs as I/O bound jobs, dirty I/O schedulers might starve ordinary schedulers."

According to git, Rickard Green wrote this, so I'd take it as advice you shouldn't ignore.

This, to me, implies that I should probably classify unknown jobs as CPU bound, rather than I/O bound, because the documentation only mentions bad things happening one way round.

That's probably a good approach. One way to mitigate guessing incorrectly would be to teach your jobs to cooperatively yield, if possible. If there are points within the tasks where you can get them to reschedule themselves, then regardless of where they're running, they'll be giving other jobs a chance to run.
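The cooperative-yield idea can be sketched roughly as below, assuming the work can be cut into chunks at all (which may not hold for arbitrary third-party code). The `sum_job` struct, `sum_job_step`, and `CHUNK_SIZE` are hypothetical names invented for this illustration, and a plain return value stands in for the real rescheduling step. In an actual NIF the "return now, continue later" part would go through `enif_schedule_nif`, possibly guided by `enif_consume_timeslice`.

```c
#include <assert.h>
#include <stddef.h>

/* Resumable state for a hypothetical job: summing an array of ints. */
typedef struct {
    const int *items;  /* input data */
    size_t count;      /* number of items */
    size_t next;       /* index of the first unprocessed item */
    long sum;          /* running total so far */
} sum_job;

/* Assumed chunk size; a real NIF would size this so that one step
 * stays well within a scheduler timeslice. */
#define CHUNK_SIZE 1000

/* Process at most CHUNK_SIZE items, then return so that other work can
 * run. Returns 1 when the job is finished, 0 when it should be
 * rescheduled to continue from job->next. */
int sum_job_step(sum_job *job)
{
    assert(job != NULL);

    size_t end = job->next + CHUNK_SIZE;
    if (end > job->count)
        end = job->count;

    while (job->next < end)
        job->sum += job->items[job->next++];

    return job->next == job->count;
}
```

A caller would keep invoking `sum_job_step` until it returns 1. Because each step is short, a wrong guess about the job's classification ties up a scheduler thread only briefly.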

Based on my limited knowledge of how dirty schedulers work, my instinct tells me that classifying jobs as CPU bound when they're I/O bound will probably just be less efficient, whereas classifying jobs as I/O bound when they're CPU bound will result in trying to run too many jobs at once. But I'm just guessing.

It would be good if Rickard or Sverker could weigh in here, as I think they know this code best.

--steve

Jesper Louis Andersen

Oct 14, 2018, 11:37:50 AM

to Steve Vinoski, Erlang (E-mail)

On Sun, Oct 14, 2018 at 5:29 PM Steve Vinoski <vin...@ieee.org> wrote:

On Sun, Oct 14, 2018 at 11:08 AM Roger Lipscombe <ro...@differentpla.net> wrote:

"If you should classify CPU bound jobs as I/O bound jobs, dirty I/O schedulers might starve ordinary schedulers."

According to git, Rickard Green wrote this, so I'd take it as advice you shouldn't ignore.

If we think about the two kinds of jobs:

CPU bound jobs will hog a CPU for their whole operation. Thus, we want them to live on their own threads next to the scheduler threads, so they can't stop the scheduler threads.

IO bound jobs tend to use the CPU very little but hog the IO resources quite a lot (in particular the disk). Because of this, one might want to run many of these threads, far more than there are CPU cores in the machine, since they'll just block immediately and never use the CPU resources.

However, if you accidentally or deliberately assign CPU bound jobs to the IO threads, you might have few scheduler threads and many IO threads in the system. In this case, the kernel might decide to give fair priority to all the threads, in which case the scheduler threads will suffer because of their fewer numbers.
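To put rough numbers on that argument, here is a small sketch of the arithmetic. It assumes an idealized kernel that splits CPU time evenly among runnable threads and that every thread counted is CPU-hungry. The function name `per_thread_core_share` and the thread counts in the note below are illustrative assumptions, not measurements.

```c
#include <assert.h>

/* Fraction of one core that each runnable, CPU-hungry thread receives
 * if the kernel divides the available cores evenly among them. */
double per_thread_core_share(int cores, int runnable_threads)
{
    assert(cores > 0 && runnable_threads > 0);

    /* With no more threads than cores, every thread gets a full core. */
    if (runnable_threads <= cores)
        return 1.0;

    return (double)cores / (double)runnable_threads;
}
```

For example, on 4 cores with 4 scheduler threads and nothing else busy, each scheduler thread keeps a whole core. If 10 threads in the dirty I/O pool are all running CPU bound jobs, 14 threads compete for 4 cores and each scheduler thread is left with 4/14, a little under 29% of a core.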

Rickard and Sverker will know if there are differences in the strategies of the two kinds of job types, but the above would happen even if they were treated the same.

Rickard Green

Oct 15, 2018, 5:00:12 AM

to Jesper Louis Andersen, erlang-questions@erlang.org Questions, Steve Vinoski

The above is true. That is, if you are not sure of the job's characteristics, you want to classify it as CPU bound. If you are wrong, and it is actually IO bound and blocks for a very long time, it might cause problems for dirty CPU bound jobs, but those problems are typically not as severe as the problems experienced the other way around.

Regards,

Rickard

--

Rickard Green, Erlang/OTP, Ericsson AB
