Python garbage collection: Memory no longer used is not released to OS!

shinto cv

Apr 15, 2016, 6:27:46 AM
to Python Pune
I have written an application with Flask and use Celery for a long-running task. While load testing I noticed that the Celery tasks are not releasing memory even after completing the task. So I googled and found this group discussion:

https://groups.google.com/forum/#!topic/celery-users/jVc3I3kPtlw

In that discussion it says that's just how Python works.

Also the article at https://hbfs.wordpress.com/2013/01/08/python-memory-management-part-ii/ says:

"But from the OS’s perspective, your program’s size is the total (maximum) memory allocated to Python. Since Python returns memory to the OS on the heap (that allocates other objects than small objects) only on Windows, if you run on Linux, you can only see the total memory used by your program increase."

And I use Linux, so I wrote the script below to verify it.

    import gc
    import sys
    from time import sleep

    def memory_usage_psutil():
        # print the resident memory usage in MB
        # (ru_maxrss is reported in kilobytes on Linux)
        import resource
        print 'Memory usage: %s (MB)' % (resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1000.0)

    def fileopen(fname):
        memory_usage_psutil()  # 10 MB
        f = open(fname)
        memory_usage_psutil()  # 10 MB
        content = f.read()
        memory_usage_psutil()  # 14 MB

    def fun(fname):
        memory_usage_psutil()  # 10 MB
        fileopen(fname)
        gc.collect()
        memory_usage_psutil()  # 14 MB

    if __name__ == '__main__':
        fun(sys.argv[1])
        for _ in range(60):
            gc.collect()
            memory_usage_psutil()  # 14 MB ...
            sleep(1)
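
(The function above is named memory_usage_psutil but actually uses the resource module. If psutil is installed, a roughly equivalent check, just a sketch assuming `pip install psutil`, would be:)

    import os
    import psutil

    def memory_usage_psutil():
        # resident set size (RSS) of the current process, in MB
        rss_mb = psutil.Process(os.getpid()).memory_info().rss / (1024.0 * 1024.0)
        print 'Memory usage: %s (MB)' % rss_mb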

The input was a 4 MB file. Even after returning from the 'fileopen' function, the 4 MB of memory was not released. I checked the htop output while the loop was running; the resident memory stays at 14 MB. So unless the process is stopped, the memory stays with it.

So if the Celery worker is not killed after its task is finished, it is going to keep the memory for itself. I know I can use the **max_tasks_per_child** config value to kill the process and spawn a new one. **Is there any other way to return the memory to the OS from a Python process?**
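
For reference, the config I mean would look roughly like this (the setting is CELERYD_MAX_TASKS_PER_CHILD in Celery 3.x, worker_max_tasks_per_child in newer versions; the broker URL here is just an example):

    from celery import Celery

    app = Celery('tasks', broker='redis://localhost:6379/0')  # example broker URL

    # Replace each worker child process after it has run one task,
    # so its memory is returned to the OS when the child exits.
    app.conf.update(CELERYD_MAX_TASKS_PER_CHILD=1)

    # Or on the command line (Celery 3.x):
    #   celery worker -A tasks --maxtasksperchild=1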