freeing up memory after graph


Michal Galdzicki

Nov 2, 2016, 6:58:41 PM
to rdflib-dev
I'm not having much luck freeing memory after a largish graph (IOMemory store) is loaded and then deleted. In the test code below I load a new graph and then delete it, but a 46 MB footprint is left somewhere in RAM. To show the effect, I print the total memory used before the load, after the load, and after the del, along with the total size of the graph object (80.0 MB). After deletion some reference(s) must remain, since garbage collection does not free the memory, and I was unable to find what they are. Ideally I would like to free most, if not all, of the remaining footprint. This is particularly important for larger data sets of 1-2 GB. Note that triggering garbage collection manually is important to observe the change.

thanks for any help,
mike

Output:
before load
38.0MB
after load
122.3MB
80.0MB
after del
83.9MB

from __future__ import print_function
import os, gc, rdflib, psutil
from pympler.asizeof import asizeof

process = psutil.Process(os.getpid())

g = rdflib.Graph()
print("before load")
print("{:.1f}{}".format(process.memory_info().rss / 1024.0**2, 'MB'))

# Load the UniProt VoID description into the in-memory (IOMemory) store
g.load('ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/void.rdf')
print("after load")
print("{:.1f}{}".format(process.memory_info().rss / 1024.0**2, 'MB'))
print("{:.1f}{}".format(asizeof(g) / 1024.0**2, 'MB'))  # deep size of the graph object

del g  # recovers ~37 MB once the collector runs
gc.collect()  # force collection so the drop is visible immediately
print("after del")
print("{:.1f}{}".format(process.memory_info().rss / 1024.0**2, 'MB'))
# print("{:.1f}{}".format(asizeof(g) / 1024.0**2, 'MB'))  # g no longer exists here

Gunnar Aastrand Grimnes

Nov 3, 2016, 3:12:01 AM
to rdfli...@googlegroups.com
I am not saying there definitely isn't a problem - but if there is, I
think your graph is too small to show the effect. Python makes some
effort to intern short strings, and RDF is pretty much only short
strings. I guess these interned strings stick around, waiting for
someone to create the same string again.
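
As a rough illustration of what interning means here (the URI is just
an example string, and exactly which strings CPython interns
automatically is an implementation detail; sys.intern makes the
mechanism explicit):

import sys

# Two equal strings built independently normally get separate objects...
a = "http://purl.uniprot.org/core/" + "protein".title()
b = "http://purl.uniprot.org/core/Protein"
print(a is b)                          # typically False: two copies in memory

# ...but interning maps them onto a single shared, cached copy.
print(sys.intern(a) is sys.intern(b))  # True: both names refer to one object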

A better test would be to create and delete the graph, say, 10 times
and see if memory keeps growing - something like the sketch below.
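
A rough sketch of that kind of test (reusing the void.rdf URL and the
measurement code from your script; the iteration count is arbitrary):

import os, gc, rdflib, psutil

process = psutil.Process(os.getpid())
SOURCE = 'ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/void.rdf'

def rss_mb():
    return process.memory_info().rss / 1024.0**2

print("start: {:.1f}MB".format(rss_mb()))
for i in range(10):
    g = rdflib.Graph()
    g.load(SOURCE)
    del g
    gc.collect()
    # If RSS levels off after the first iteration or two, the leftover
    # memory is being reused rather than leaked.
    print("after round {}: {:.1f}MB".format(i + 1, rss_mb()))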

I have long-running rdflib processes that throw lots of graphs around,
and they don't grow arbitrarily large.

- Gunnar



--
http://gromgull.net