Why does SageMath create so many files in TMPDIR, and how can I prevent this?

Nasser M. Abbasi

Aug 15, 2023, 6:33:06 AM
to sage-devel
Each time I run a SageMath script, tens of thousands of files are created in my TMPDIR, and I have to keep deleting them manually.

I am using SageMath on Linux under VirtualBox.

In my .bashrc I set

export TMPDIR=/home/me/TMP/

Each time I start SageMath and run commands in a loop, it starts creating temporary files and folders. These go away when I exit SageMath.

But my SageMath script runs for days, and Linux starts to complain that I am running low on disk space. Here is an example:

>pwd
/home/me/TMP

>ls tmp*
ls: cannot access 'tmp*': No such file or directory

Now I start SageMath in a separate terminal. Without my doing anything in that terminal, new tmp folders are created:

>ls tmp*
tmpesr6wcuy:

tmpzglyalu9:

Once I exit SageMath, these go away.

Is there a setting to prevent this? These temp folders all seem to be empty.

>which sage
/home/me/TMP/sage-10.0/sage
>sage --version
SageMath version 10.0, Release Date: 2023-05-20
>uname -a
Linux me-virtualbox 6.1.31-2-MANJARO #1 SMP PREEMPT_DYNAMIC Sun Jun  4 12:31:46 UTC 2023 x86_64 GNU/Linux

Thanks
--Nasser

Michael Orlitzky

Aug 15, 2023, 9:28:39 AM
to sage-...@googlegroups.com
On Tue, 2023-08-15 at 03:33 -0700, 'Nasser M. Abbasi' via sage-devel
wrote:
> Each time I run a sagemath script, I see 10's of thousands of files created
> in my TMPDIR which I have to keep manually deleting.

There aren't too many parts of sage that use temporary files. What's
the script doing?

Nasser M. Abbasi

Aug 15, 2023, 11:09:49 AM
to sage-devel

Maybe it is Python's multiprocessing, then? The script runs a loop that calls the integrate command.

But it makes each call in a separate process, using Python's multiprocessing module.

Each command seems to create a few folders. I just ran one command
and saw these created:

drwx------  2 me me            4096 Aug 15 09:53 tmpmyvujuyg
drwx------  2 me me            4096 Aug 15 09:53 tmpd14tubax
drwx------  2 me me            4096 Aug 15 09:53 tmpmp7uvinu
drwx------  2 me me            4096 Aug 15 09:53 tmpe7tycr65
drwx------  2 me me            4096 Aug 15 09:53 tmpg0mt3zb4

Since the script computes tens of thousands of integrals, and I have 3 instances running at the same time, my TMP
fills up, Linux runs short of inodes, and I get a warning that disk space is low (even though
the folders are empty; it is the inodes that run out).
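
The inode shortage described above can be checked directly. Here is a small sketch, assuming a Unix-like system: `os.statvfs` reports the inode totals for the filesystem that holds a directory (the same figures `df -i` prints), and a directory scan counts the leftover `tmp*` folders. The helper names `inode_usage` and `count_tmp_dirs` are made up for illustration.

```python
import os


def inode_usage(path):
    """Return (total, free) inode counts for the filesystem containing path."""
    st = os.statvfs(path)
    return st.f_files, st.f_ffree


def count_tmp_dirs(path, prefix="tmp"):
    """Count directories directly under path whose names start with prefix."""
    with os.scandir(path) as entries:
        return sum(
            1
            for entry in entries
            if entry.name.startswith(prefix)
            and entry.is_dir(follow_symlinks=False)
        )


if __name__ == "__main__":
    # Falls back to /tmp when TMPDIR is not set.
    tmpdir = os.environ.get("TMPDIR", "/tmp")
    total, free = inode_usage(tmpdir)
    print(f"{tmpdir}: {free} of {total} inodes free, "
          f"{count_tmp_dirs(tmpdir)} tmp* folders present")
```

Running this before and after one integration call would show how many folders each call leaves behind and how quickly the free inode count drops.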


I use multiprocessing since that is the only way I know to set a timeout on the
integrate command, because SageMath has no built-in support for a timeout on a
function call.

Here is the basic flow of the script: (this is not the real script but
a stripped down version)
----------------------------------------------------------------
#!/usr/bin/env sage

import os, sys, time, datetime, ntpath
import multiprocessing as mp
from sage.all import *

def doTheIntegration(q1,q2):
    # Runs in the child process: read one problem, integrate it, send the result back.
    problem = q1.get()
    integrand = SR(problem[0])
    theVar = SR(problem[1])
    anti = integrate(integrand,theVar)
    q2.put(anti)

if __name__ == "__main__":
   
    integrandAsString = "cos(x)"
    mp.set_start_method('spawn')        
    q1= mp.Queue()
    q2= mp.Queue()

    q1.put([integrandAsString,"x"]) #integrand, variable
    process = mp.Process(target=doTheIntegration, args=(q1,q2,))  
    process.start()                            
    print("after process start()..waiting to finish")                      

    try:
        anti = q2.get(True,4*60) #4 minutes timeout                                                        
    except Exception as ee:
        print("Exception in call to queue.get ",ee)        
        print("type(exception).__name__=",type(ee).__name__)
                                   
    del(q1)    
    del(q2)    
    process.terminate()            
------------------------------------------------

I run this using  

sage ./script.sage

and I see these tmp folders created. Since the real script runs in a loop, the tmp folders
remain until the script finishes, which takes days.

Is there a way not to create these tmp folders?

Thanks
--Nasser

John H Palmieri

Aug 15, 2023, 1:53:44 PM
to sage-devel
For what it's worth, the `alarm` function (provided by the cysignals package) allows for interrupting a command after a given amount of time.

sage: alarm?
Signature:      alarm(seconds)
Call signature: alarm(*args, **kwargs)
Type:           cython_function_or_method
String form:    <cyfunction alarm at 0x103938040>
File:          
Docstring:    
   Raise an "AlarmInterrupt" exception in a given number of seconds.
   This is useful for automatically interrupting long computations and
   can be trapped using exception handling.

   Use "cancel_alarm()" to cancel a previously scheduled alarm.

   INPUT:

   * "seconds" -- positive number, may be floating point

   OUTPUT: None

   EXAMPLES:

      >>> from cysignals.alarm import alarm, AlarmInterrupt
      >>> from time import sleep
      >>> try:
      ...     alarm(0.5)
      ...     sleep(2)
      ... except AlarmInterrupt:
      ...     print("alarm!")
      alarm!
      >>> alarm(0)
      Traceback (most recent call last):
      ...
      ValueError: alarm() time must be positive
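
A minimal sketch of how the `alarm` call above could wrap a single computation so that it gives up after a deadline, without starting a separate process. `alarm`, `cancel_alarm` and `AlarmInterrupt` are the cysignals names from the docstring; `call_with_timeout` is a hypothetical helper. The `except ImportError` branch is only a stand-in built on Python's `signal` module (Unix, main thread only) so that the sketch also runs where cysignals is not installed. Whether every Sage computation can be interrupted this way is not something this thread establishes.

```python
try:
    # Preferred: the cysignals functions named in the docstring above.
    from cysignals.alarm import alarm, cancel_alarm, AlarmInterrupt
except ImportError:
    # Stand-in (an assumption, not cysignals itself): a SIGALRM-based timer.
    import signal

    class AlarmInterrupt(Exception):
        pass

    def _raise_alarm(signum, frame):
        raise AlarmInterrupt()

    def alarm(seconds):
        signal.signal(signal.SIGALRM, _raise_alarm)
        signal.setitimer(signal.ITIMER_REAL, seconds)

    def cancel_alarm():
        signal.setitimer(signal.ITIMER_REAL, 0)


def call_with_timeout(func, seconds, *args):
    """Return (True, result), or (False, None) if func ran longer than seconds."""
    try:
        alarm(seconds)
        result = func(*args)
    except AlarmInterrupt:
        return False, None
    finally:
        # Always clear the timer so it cannot fire during later code.
        cancel_alarm()
    return True, result
```

In a Sage session this might be called as `call_with_timeout(lambda: integrate(f, x), 240)`, with `f` and `x` standing for the integrand and variable.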

Michael Orlitzky

Aug 15, 2023, 3:27:16 PM
to sage-...@googlegroups.com
On Tue, 2023-08-15 at 08:09 -0700, 'Nasser M. Abbasi' via sage-devel
wrote:
> Here is the basic flow of the script: (this is not the real script but
> a stripped down version)
>

For now at least, initializing the sage library creates one directory
under /tmp where all of sage's other temporary files live. That
directory is removed when the process exits. Is your loop allowing the
processes to exit? The integration routine is not using any additional
directories.

In any case, I would suggest that you use the @parallel decorator for
this instead of writing your own multiprocessing code:

https://doc.sagemath.org/html/en/reference/parallel/sage/parallel/decorate.html

I think its "fork" iterator will work a lot better for you, and it
takes a "timeout" parameter.
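
The decorator itself needs a Sage session. As a standard-library illustration of the same fork-plus-timeout idea, here is a rough sketch under stated assumptions: a Unix-like system where the "fork" start method is available, and temporary directories that are created through Python's `tempfile` module or by code that honours `TMPDIR`. A forked child inherits the parent's already-imported modules rather than importing them again. The child is also pointed at a private scratch directory that the parent deletes once the child has finished or been killed. `run_forked`, `TIMED_OUT` and `leaky_square` are hypothetical names, and this is not how `@parallel` is implemented.

```python
import multiprocessing as mp
import os
import queue
import tempfile

TIMED_OUT = object()  # sentinel returned when the deadline passes


def _child(func, arg, scratch, out):
    # Redirect temp-file creation in this child to its private scratch directory.
    os.environ["TMPDIR"] = scratch
    tempfile.tempdir = scratch
    out.put(func(arg))


def run_forked(func, arg, timeout, tmp_root=None):
    """Run func(arg) in a forked child; return its result or TIMED_OUT."""
    ctx = mp.get_context("fork")  # assumption: a Unix-like OS such as Linux
    with tempfile.TemporaryDirectory(dir=tmp_root) as scratch:
        out = ctx.Queue()
        proc = ctx.Process(target=_child, args=(func, arg, scratch, out))
        proc.start()
        try:
            result = out.get(timeout=timeout)
        except queue.Empty:
            result = TIMED_OUT
            proc.terminate()
        proc.join()  # reap the child before the scratch directory is removed
    # Leaving the with-block deletes scratch and anything the child left in it.
    return result


def leaky_square(x):
    # Stand-in for a computation that leaves a temporary directory behind.
    tempfile.mkdtemp()
    return x * x
```

For example, `run_forked(leaky_square, 3, timeout=240)` returns 9 and leaves no directory behind. A child that raises an exception sends nothing back, so in this sketch it is reported as `TIMED_OUT` once the deadline passes.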