How to limit heavy computations

Enrique Artal

Nov 25, 2016, 5:24:49 PM
to sage-support
At my university, most math labs are run with SageMath; for this purpose we have two PCs running the sagenb service, which students access remotely. In general it works smoothly: three classrooms with 20 students each, plus people working from home at the same time, do not saturate the servers. Sometimes, though, students write infinite loops or attempt heavy computations. The notebook is launched with ulimits, but they do not seem to take effect: processes with 30 GB of resident memory appear and usually hang the notebook. Killing the process is not enough; one must restart the notebook.
We have tried some limits on memlock and CPU time in /etc/security/limits.conf, but they seem to be too strict. I am not sure whether this is the reason, but an open worksheet seems to consume CPU time even when no computation is running, and then the notebook becomes completely inaccessible.
What is a good strategy to avoid this? The systems run Ubuntu 16.04 and SageMath 7.3.
Thanks, Enrique Artal.

Nils Bruin

Nov 25, 2016, 8:49:09 PM
to sage-support
On Friday, November 25, 2016 at 2:24:49 PM UTC-8, Enrique Artal wrote:
> In my University, most math labs are done using Sagemath; for this purpose we have two PC's with sagenb service to which students access remotely. In general, it works smoothly; three classrooms with 20 students each, and people working at home simultaneously does not saturate the servers. Sometimes, students program some infinite loops or try heavy computations. The notebook is launched with ulimits but they do not seem to work, and 30GB process (resident memory) arise, and they usually hang the notebook; killing the process is not enough, one must restart the notebook.

Unless

https://trac.sagemath.org/ticket/9398

has reared its head again, ulimits should be respected. However, you probably run the notebook in a "server pool" setup. You need to set the ulimit on those accounts in order to limit the memory use of the compute processes.
 

Enrique Artal

Nov 26, 2016, 5:50:12 AM
to sage-support
I forgot to mention that we use server_pool (there is one master server, and the ssh connections go to accounts on both the master server and the secondary server). All the users have unlimited ulimit; we pass the ulimit as an option to the notebook.
For one notebook we put limits on the users; the problem is that when they reach the limit, their processes are stopped but no one can access the notebook anymore. The same happens if we manually kill the big processes in the notebooks where the users have no limits.

Any help will be welcome. If we set no limits, we may lose all access to the computer and the only option is to power it off and on; if we set them, we do not lose access, but the notebook is stopped too soon.
Enrique.

Enrique Artal

Nov 26, 2016, 5:59:02 AM
to sage-support
By the way, with server_pool, is there a way to know which notebook user is using which server_pool user?

Jori Mäntysalo

Nov 27, 2016, 12:07:05 AM
to sage-support
On Sat, 26 Nov 2016, Enrique Artal wrote:

> By the way, using server_pool, is there a way to know which user of the
> notebook is using a user of the server_pool? 

AFAIK not really. I can list files in /tmp and run fuser for them, i.e.

fuser /tmp/*

and see, for example

/tmp/tmpwIR2_j: 7589c

and in that file I can see a line like

DATA = . . . /sage_notebook.sagenb/home/jm58660

and hence user jm58660 is behind the process #7589.

This could be scripted of course, but I would prefer a better NB designed
with the admin's viewpoint in mind.
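
The manual steps above could be scripted roughly as follows. This is only a
minimal sketch: the function names are hypothetical, and it assumes that fuser
output is seen as "<path>: <pid><flags>" and that each temporary file contains
a DATA line ending in .sagenb/home/<notebook user>, as in the example.

```python
import re

# Assumption: each temporary file holds a line such as
#   DATA = ... /sage_notebook.sagenb/home/<notebook user>
_DATA_USER = re.compile(r"^\s*DATA\s*=.*?\.sagenb/home/([^/'\"\s]+)", re.MULTILINE)
# Assumption: fuser output looks like "/tmp/tmpwIR2_j: 7589c".
_FUSER_LINE = re.compile(r"^(?P<path>\S+):\s+(?P<pids>.+)$")


def parse_fuser_line(line):
    """Return (path, [pids]) for one line of fuser output, or None."""
    match = _FUSER_LINE.match(line.strip())
    if match is None:
        return None
    # Drop the access-type letters that fuser appends to each PID.
    pids = [int(p) for p in re.findall(r"(\d+)[a-zA-Z]*", match.group("pids"))]
    return match.group("path"), pids


def notebook_user(file_text):
    """Return the notebook user named in a DATA line, or None if absent."""
    match = _DATA_USER.search(file_text)
    return match.group(1) if match else None


def map_pids_to_users(fuser_output, read_file):
    """Map each PID to a notebook user; read_file(path) returns the file's text."""
    result = {}
    for line in fuser_output.splitlines():
        parsed = parse_fuser_line(line)
        if parsed is None:
            continue
        path, pids = parsed
        user = notebook_user(read_file(path))
        if user is not None:
            for pid in pids:
                result[pid] = user
    return result
```

In practice fuser_output would be captured from running fuser on /tmp/* (fuser
writes the file names to stderr and the PIDs to stdout, so both streams need to
be merged), and read_file would open the file on the machine where the worker
process runs.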

--
Jori Mäntysalo

Enrique Artal

Nov 27, 2016, 6:04:48 AM
to sage-support
Thanks. As you say, something more direct would be better, but your approach is a big improvement for my needs.
By the way, I switched our experimental notebook from 7.4 to 7.3 and the limits work: they stop the process and the notebook keeps running.

Enrique.

Jori Mäntysalo

Nov 27, 2016, 8:10:07 AM
to sage-support
On Sun, 27 Nov 2016, Enrique Artal wrote:

> Thanks, As you say, it would be better something more direct, but your
> approach is a strong improvement for my needs. By the way, I changed in our
> experimental notebook 7.4 -> 7.3 and the limits work: they stop the process
> and the notebook is still running.

OK, so there is a regression somewhere. Hopefully some of the developers
will read this and can do something about it.

In general it is not easy to limit resource usage in Linux. That's due to
subprocesses, overcommit and the way programs are usually written. And of
course having something like SageNB above the process level makes this even
more complicated.

--
Jori Mäntysalo

Jeroen Demeyer

Nov 27, 2016, 8:13:53 AM
to sage-s...@googlegroups.com
On 2016-11-27 14:10, Jori Mäntysalo wrote:
> In general it is not easy to limit resource usage in Linux.

It's not easy, that's true. But that's independent of the fact that
ulimit within SageNB simply doesn't work due to some SageNB bug. I have
known about this bug for years but never bothered to fix it (mainly
because there are a lot of hoops to jump through to develop anything for
SageNB).

Jori Mäntysalo

Nov 27, 2016, 9:53:33 AM
to sage-s...@googlegroups.com
On Sun, 27 Nov 2016, Jeroen Demeyer wrote:

>> In general it is not easy to limit resource usage in Linux.
>
> It's not easy, that's true. But that's independent of the fact that
> ulimit within SageNB simply doesn't work due to some SageNB bug.

Yes, bugs are one thing. But the other is limiting resources for some
users. And that is much harder when the resource use comes through SageNB
or some similar system.

I don't expect to see good solutions to this in the near future.

--
Jori Mäntysalo

Dima Pasechnik

Nov 27, 2016, 2:54:30 PM
to sage-support
In our setup some 5 years ago (we did a lab-based course for 200 undergraduates using Sage), if I recall right,
we used one user for the notebook server, and a different user, "worker", to run computations; there were various ulimits on the
latter account (one sometimes forgotten is a limit on the file size, as people tend to leave computations running and producing
a lot of output...).
One more trick: we ran "worker" at a high niceness level, which makes the server more responsive.
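
To illustrate the kind of limits described above, here is a minimal Python sketch that starts a command with caps on memory, CPU time and file size, and with raised niceness. It is not the setup that was actually used (there the limits belonged to the "worker" account); run_limited and its parameters are hypothetical names, and Linux is assumed.

```python
import os
import resource
import subprocess


def _lower_limit(kind, value):
    """Set soft and hard limit for `kind`, never above the current hard limit."""
    _, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, value))


def run_limited(cmd, mem_bytes, cpu_seconds, file_bytes, niceness=10):
    """Run cmd in a child process with resource limits and raised niceness."""

    def apply_limits():
        # Runs in the child just before exec, so the parent is unaffected.
        _lower_limit(resource.RLIMIT_AS, mem_bytes)       # address space
        _lower_limit(resource.RLIMIT_CPU, cpu_seconds)    # CPU time
        _lower_limit(resource.RLIMIT_FSIZE, file_bytes)   # size of written files
        os.nice(niceness)                                 # lower scheduling priority

    return subprocess.run(cmd, preexec_fn=apply_limits,
                          capture_output=True, text=True)
```

The file-size cap corresponds to the limit that is "sometimes forgotten": a runaway computation that keeps writing output is stopped once a file reaches file_bytes.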

Nils Bruin

Nov 27, 2016, 3:23:33 PM
to sage-support
On Sunday, November 27, 2016 at 3:04:48 AM UTC-8, Enrique Artal wrote:
> Thanks, As you say, it would be better something more direct, but your approach is a strong improvement for my needs.
> By the way, I changed in our experimental notebook 7.4 -> 7.3 and the limits work: they stop the process and the notebook is still running.

For Sage 7.5beta(?), setting ulimits does have an effect: with

sh$ ulimit  -v 10000000
sh$ sage -c 'L=[1]
for i in [1..1000]:
  L = L+L
  print i'

I get a memory error after "28" has been printed (and without the limit, it continues longer), and if I take the bound much lower Sage will not even start.

So if you configure the "worker" user to have such a ulimit, I'd expect memory problems to be significantly reduced. People who try to use more memory should see their kernel die before it's causing problems for other people.

Given that there's no way of controlling which notebook user gets mapped to which worker uid, I don't think there's much mileage to be had from configuring multiple worker uids (other than having them on multiple machines to load-balance a little bit).
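
The same experiment can be reproduced in plain Python, without Sage. The sketch below is an illustration under assumptions (Linux, where RLIMIT_AS is enforced; last_doubling_before_failure is a hypothetical name): the child process caps its own address space after start-up instead of inheriting the cap from the shell, then doubles a list until an allocation fails.

```python
import subprocess
import sys

# Child script: cap its own address space, then double a list until
# an allocation fails with MemoryError.
_CHILD = """
import resource, sys
limit = int(sys.argv[1])
_, hard = resource.getrlimit(resource.RLIMIT_AS)
if hard != resource.RLIM_INFINITY:
    limit = min(limit, hard)
resource.setrlimit(resource.RLIMIT_AS, (limit, hard))
L = [1]
for i in range(1, 1001):
    L = L + L
    print(i, flush=True)
"""


def last_doubling_before_failure(limit_bytes):
    """Return the last iteration printed before the child ran out of memory."""
    proc = subprocess.run([sys.executable, "-c", _CHILD, str(limit_bytes)],
                          capture_output=True, text=True)
    lines = proc.stdout.split()
    return int(lines[-1]) if lines else 0
```

Calling last_doubling_before_failure with a smaller limit should return a smaller iteration count; the exact number depends on the Python build and platform.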

Enrique Artal

Nov 27, 2016, 3:55:06 PM
to sage-support
It seems to work now with the ulimits for the server_pool users. If they turn out to be too strict, we (or more precisely Miguel Marco) will try the worker user approach. We will let you know. Thanks for the help!

Enrique Artal

Jan 13, 2017, 12:30:04 PM
to sage-support
Putting limits in /etc/security/limits.conf (or in files in limits.d) works correctly up to Sage 7.3. Namely, if a user runs a heavy computation (in memory or CPU time), the system stops the computation when the limit is reached; usually one needs to quit the worksheet, but the notebook can still be used. With 7.4 and 7.5, when the limit is reached the notebook becomes unusable and the only way to continue working is to kill and restart it. Some change between 7.3 and 7.4 may be the cause.

Dima Pasechnik

Jan 13, 2017, 1:34:15 PM
to sage-support


On Friday, January 13, 2017 at 5:30:04 PM UTC, Enrique Artal wrote:
> Putting limits in /etc/security/limits.conf (or in files in limits.d) works right up to Sage 7.3. Namely, if a user performs a strong computation (memory or CPU time), the system stops the computation when the limit is reached; usually one needs to quit the worksheet, but it is possible to reuse the notebook. With 7.4 and 7.5, when the limit is reached the notebook becomes unusable and the only possibility to work is to kill and restart it. Some change between 7.3 and 7.4 may cause it.

"the notebook"? Which one? sagenb, or jupyter?

Enrique Artal

Jan 14, 2017, 3:40:41 AM
to sage-support
sagenb

Dima Pasechnik

Jan 14, 2017, 5:13:37 AM
to sage-support
Oh, right. This certainly has nothing to do with sagenb; what is not stopped is the Sage worker that does the computation itself, right?

Could it be that you updated your OS in the meantime, and if you roll back to sage 7.3 you would still see the same behaviour?

On Sage's side it might be the IPython update, although I am guessing.

There is a debugging technique, git bisect, that would let you find the git commit that caused the change you noticed. It might take a full day of work or so, simply because there were a few hundred commits and you would need to run 'make build' a dozen times or so, testing after each rebuild...
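
The idea behind bisection can be sketched in a few lines of Python. This is an illustration of the principle only, not how git implements it (git bisect works on the commit graph and is driven with "git bisect start", "git bisect good" and "git bisect bad"); first_bad and is_bad are hypothetical names, and is_bad stands for "rebuild this commit and check whether the problem occurs".

```python
def first_bad(commits, is_bad):
    """Return (first bad commit, number of commits tested).

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    tested = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tested += 1
        if is_bad(commits[mid]):
            bad = mid      # the culprit is at mid or earlier
        else:
            good = mid     # the culprit is after mid
    return commits[bad], tested
```

Each test halves the remaining range, so the number of rebuilds grows only with the logarithm of the number of commits; that is why a few hundred commits need on the order of ten builds rather than hundreds.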

Enrique Artal

Jan 14, 2017, 1:18:32 PM
to sage-support
I have installed 7.3, 7.4 and 7.5 and we use several instances of sagenb (it is for teaching and we use different addresses for different studies). With 7.3, if a user reaches the limit of CPU time, his worksheet stops and all the data in memory is lost, but quitting and opening the worksheet again causes no problem, and it does not affect other users. With 7.5, the instance of the notebook must be restarted because it does not work for any user of this instance (no problem for the other ones). 
Is there some documentation for the debugging technique? Is it possible to take beta releases? Enrique. 

Enrique Artal

Feb 1, 2017, 5:27:35 PM
to sage-support
If I am not mistaken, the problem was introduced in sage-7.3.beta3. I can try the individual patches between beta2 and beta3. Can anyone suggest a good order in which to do that?

Enrique Artal

Feb 4, 2017, 5:07:43 PM
to sage-support
I was wrong; I am now quite sure that the issue first appears in sage-7.4.beta0.

Enrique Artal

Feb 12, 2017, 4:53:13 AM
to sage-support
The final debugging process (IPython 5.0 seems to be responsible) is described in the sage-devel group.