Memory leak problem


Joachim De Beule

Sep 16, 2013, 11:17:14 AM
to clo...@googlegroups.com
Dear List,

I'm experiencing a memory leak and I don't understand why.

I have 50 files on disk called "data-1.edn" through "data-50.edn". I run the following code:

(def all-processed-data
  (reduce (fn [ret f]
            (merge ret (process-data (clojure.tools.reader.edn/read-string (slurp f)))))
          {}
          file-list))

"file-list" is a sequence of java.io.File objects pointing towards the 50 data files.
"process-data" is a function that produces a map from the data read.

Although each individual data file is about 100 megabytes, the result of processing one file is a map of only about 200 KB, so the map returned by the entire expression above is only about 10 megabytes. Nevertheless, executing the code above fills up memory and eventually stops Clojure from functioning. Why?

Thanks a lot for any suggestions!

Joachim.

Andy Fingerhut

Sep 16, 2013, 11:40:27 AM
to clo...@googlegroups.com
Are you using a version of Java earlier than 7u6?  If so, this *might* be related to the conversation from a few days ago about functions like subs, re-find, etc. returning short strings that keep references to the longer strings they were created from:

    https://groups.google.com/forum/#!topic/clojure/PeHTCWNgrL0

Even if you are using such a Java version, I have not confirmed whether the function clojure.tools.reader.edn/read-string returns substrings of the input string.
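
For illustration, a minimal sketch of the usual workaround on Java versions before 7u6, where a substring shares the character array of the string it was cut from. The helper name detached-subs is hypothetical and not part of any library. Whether this applies here is unconfirmed, since it is not known that the EDN reader returns substrings at all.

```clojure
;; Hypothetical helper: like subs, but copies the result into a fresh String.
;; On Java before 7u6, (subs s ...) shares the backing char[] of s, so a short
;; substring can keep a very large parent string reachable. The String copy
;; constructor allocates a new array sized to the substring only.
;; On Java 7u6 and later this is just a harmless extra copy.
(defn detached-subs
  [^String s start end]
  (String. ^String (subs s start end)))

;; Usage: keep a short key from a large string without retaining the large one.
(def short-key (detached-subs "data-1.edn and a lot more text" 0 6))
```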

One quick thing to try that might help is to call clojure.tools.reader.edn/read on the File object f directly, without calling slurp.  That will at least avoid the behavior of creating big strings with the entire contents of each file.  It should also avoid the "small substrings holding references to large original strings", if that is indeed what you are experiencing.
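
A rough sketch of that suggestion, reading each file through a reader instead of slurping it into one big string. It uses clojure.edn (bundled with Clojure 1.5 and later) as a stand-in for clojure.tools.reader.edn so that it runs without extra dependencies. The names read-edn-file and process-all are made up for this example, and process-data is passed in as an argument because its definition is not shown in the thread.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.java.io :as io])

;; Hypothetical helper: read the first EDN form from a file via a
;; PushbackReader, so the whole file is never held as a single String.
;; with-open closes the reader once the form has been read.
(defn read-edn-file
  [f]
  (with-open [r (java.io.PushbackReader. (io/reader f))]
    (edn/read r)))

;; Same shape as the original reduce, with slurp + read-string replaced.
(defn process-all
  [process-data file-list]
  (reduce (fn [ret f]
            (merge ret (process-data (read-edn-file f))))
          {}
          file-list))
```

Each file's raw data becomes garbage as soon as process-data returns, provided the map it returns does not hold references back into that data.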

Andy


--
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Brian Craft

Sep 16, 2013, 1:46:13 PM
to clo...@googlegroups.com
A symptom of this would be jmap or visualvm reporting [C or char[] as the largest allocation, by class.

Joachim De Beule

Sep 17, 2013, 5:11:08 AM
to clo...@googlegroups.com
Dear Andy,

Thanks for your reply. I am using java version 1.6.0_51. I'm a bit reluctant to upgrade however, so I was wondering how sure you are that this is indeed the problem? The problem persists after calling clojure.tools.reader.edn/read on (java.io.PushbackReader. (clojure.java.io/reader f)) by the way, so maybe it's something else after all?

Joachim.

Joachim De Beule

Sep 17, 2013, 5:14:20 AM
to clo...@googlegroups.com
Dear Brian,

Thanks for your reply. I tried to use jmap, unfortunately it fails:

bash-3.2$ jmap -F -dump:file=heap.bin 58708
Attaching to process ID 58708, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 20.51-b01-457
Dumping heap to heap.bin ...
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at sun.tools.jmap.JMap.runTool(JMap.java:179)
at sun.tools.jmap.JMap.main(JMap.java:110)
Caused by: sun.jvm.hotspot.oops.UnknownOopException

As for visualvm: it never seems to finish producing a heap dump :( 

Joachim. 

Andy Fingerhut

Sep 17, 2013, 9:32:23 AM
to clo...@googlegroups.com
If the problem persists after that change, then at least we know that it isn't the large strings produced by 'slurp' that are causing the problem, but something else.  Based on what you have shown, I don't have any guesses about what that might be.  I've seen your message about difficulties using tools like jmap and visualvm -- I'd recommend doing Google searches for the name of the tool and the error message you get.  That often turns up questions where someone else has a similar error, and sometimes a solution is given to the problem.

Andy


Brian Craft

Sep 17, 2013, 10:56:09 AM
to clo...@googlegroups.com
I'm learning that the tooling for the jvm covers a spectrum from pathetically broken to non-existent.

I've had some luck running jmap like "jmap -histo:live <pid>". Pipe it through head and run it with watch, and you have a crude real-time monitor. E.g. running against the process started from a SNAPSHOT uberjar:

watch jmap -histo:live $(ps axu | grep SNA[P] | sed -e 's/ \+/\t/g' | cut -f2) \| head -40
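
If external tools cannot attach at all, a similarly crude check can be done from inside the REPL. The sketch below is only an assumption about how one might do it: used-heap-mb and reduce-with-heap-log are hypothetical names, and process-file stands for whatever reads one file and returns its small result map. It prints the used heap after each file, which shows whether memory grows per file but not which class is responsible.

```clojure
;; Approximate used heap in megabytes. System/gc is only a hint to the JVM,
;; so the numbers are rough.
(defn used-heap-mb []
  (let [rt (Runtime/getRuntime)]
    (System/gc)
    (quot (- (.totalMemory rt) (.freeMemory rt)) (* 1024 1024))))

;; Same reduce as in the original question, logging heap usage per file.
;; process-file is a hypothetical function: file -> result map.
(defn reduce-with-heap-log
  [process-file files]
  (reduce (fn [ret f]
            (let [merged (merge ret (process-file f))]
              (println (str f) "-> used heap (MB):" (used-heap-mb))
              merged))
          {}
          files))
```

A steadily rising number across files suggests that something reachable from the accumulated map, or from process-file itself, is holding on to the per-file data.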