"Too many open files" while executing iterative task


angshu rai

Jan 3, 2013, 6:09:35 AM
to spark...@googlegroups.com
Hi,
I am getting the error below while executing an iterative task in which I reuse a few RDDs inside the loop.

Can someone please point out what might be causing this problem?

Exception in thread "main" java.io.IOException: Class not found
at org.objectweb.asm.ClassReader.a(Unknown Source)
at org.objectweb.asm.ClassReader.<init>(Unknown Source)
at spark.ClosureCleaner$.spark$ClosureCleaner$$getClassReader(ClosureCleaner.scala:15)
at spark.ClosureCleaner$$anonfun$clean$2.apply(ClosureCleaner.scala:89)
at spark.ClosureCleaner$$anonfun$clean$2.apply(ClosureCleaner.scala:88)
at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59)
at scala.collection.immutable.List.foreach(List.scala:76)
at spark.ClosureCleaner$.clean(ClosureCleaner.scala:88)
at spark.SparkContext.clean(SparkContext.scala:563)
at spark.RDD.map(RDD.scala:170)
at spark.api.java.JavaRDDLike$class.map(JavaRDDLike.scala:60)
at spark.api.java.JavaRDD.map(JavaRDD.scala:7)
at spark.examples.JavaCKJMPR.main(JavaCKJMPR.java:281)
Exception in thread "delete Spark temp dir /tmp/spark-297c2d12-88d2-4f9f-9154-f1e26967a360" java.lang.NoClassDefFoundError: spark/Utils$$anonfun$deleteRecursively$1
at spark.Utils$.deleteRecursively(Utils.scala:271)
at spark.Utils$$anon$2.run(Utils.scala:87)
Caused by: java.lang.ClassNotFoundException: spark.Utils$$anonfun$deleteRecursively$1
at java.net.URLClassLoader$1.run(URLClassLoader.java:199)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 2 more
Caused by: java.io.FileNotFoundException: /home/angshu/spark-0.6.0/core/target/scala-2.9.2/classes/spark/Utils$$anonfun$deleteRecursively$1.class (Too many open files)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:120)
at sun.misc.URLClassPath$FileLoader$1.getInputStream(URLClassPath.java:1005)
at sun.misc.Resource.cachedInputStream(Resource.java:61)
at sun.misc.Resource.getByteBuffer(Resource.java:144)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:256)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
... 7 more
Exception in thread "delete Spark temp dir /tmp/spark-16e20a2b-b1e5-45c1-93e0-3e84ffe39eb6" java.lang.NoClassDefFoundError: spark/Utils$$anonfun$deleteRecursively$1
at spark.Utils$.deleteRecursively(Utils.scala:271)
at spark.Utils$$anon$2.run(Utils.scala:87)
Exception in thread "delete Spark local dirs" java.lang.NoClassDefFoundError: spark/storage/DiskStore$$anon$1$$anonfun$run$1
at spark.storage.DiskStore$$anon$1.run(DiskStore.scala:175)
Caused by: java.lang.ClassNotFoundException: spark.storage.DiskStore$$anon$1$$anonfun$run$1
at java.net.URLClassLoader$1.run(URLClassLoader.java:199)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 1 more
Caused by: java.io.FileNotFoundException: /home/angshu/spark-0.6.0/core/target/scala-2.9.2/classes/spark/storage/DiskStore$$anon$1$$anonfun$run$1.class (Too many open files)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:120)
at sun.misc.URLClassPath$FileLoader$1.getInputStream(URLClassPath.java:1005)
at sun.misc.Resource.cachedInputStream(Resource.java:61)
at sun.misc.Resource.getByteBuffer(Resource.java:144)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:256)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
... 6 more

Matei Zaharia

Jan 3, 2013, 1:18:05 PM
to spark...@googlegroups.com
Try adding ulimit -n 16000 to your conf/spark-env.sh. The default limit on open files is pretty small (a few hundred).
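For example (assuming the standard conf/spark-env.sh that ships with the 0.6.x distribution):

# conf/spark-env.sh
ulimit -n 16000

You can check the limit currently in effect with ulimit -n, and the hard limit with ulimit -Hn; if the hard limit on your machine is lower than the value you want, it may need to be raised first (e.g. in /etc/security/limits.conf).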

Matei

angshu rai

Jan 3, 2013, 11:09:55 PM
to spark...@googlegroups.com
Hi Matei,
This seems to solve the problem for now, but I suspect that as the number of iterations grows, the problem will resurface.

I am trying to understand what might be causing this issue, because my job performs no explicit file operations (apart from a few calls to HDFS to read data, which is a fairly common requirement).

Is the internal implementation opening many files on the system (for bookkeeping, maybe) that keep growing in number? Is there a plan to address this issue in a future version of Spark?
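For reference, a rough way I can check whether descriptors are really piling up is to watch the driver process across iterations (here <driver-pid> is just a placeholder for the driver's actual process id):

lsof -p <driver-pid> | wc -l
ls /proc/<driver-pid>/fd | wc -l    (Linux only)

If those counts grow steadily with every iteration, that would confirm that something is leaking handles.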

It would be great to know.

Thanks
Angshu

Matei Zaharia

Jan 3, 2013, 11:14:58 PM
to spark...@googlegroups.com
There is apparently a problem with the way we read class files in the ClosureCleaner (the class that threw the exception): it reopens the same .class file many times and leaves some of the file handles open. We recently received a fix for it in a pull request. Since this only opens a fixed number of files per iteration, just increasing the ulimit will most likely let you run for a very long time in the meantime. We'll include the fix in the next minor release.
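To illustrate the general pattern (this is only a simplified sketch, not the actual patch, and the class and method names below are made up for the example): reading a class file through a stream that is never closed pins one file descriptor per call, while buffering the bytes and closing the stream in a finally block releases the descriptor right away.

// Illustration only: not the actual Spark patch; names are invented for the example.
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ClassBytes {

    // Leaky pattern: the stream returned by getResourceAsStream() is never
    // closed, so each call keeps one file descriptor open until the stream
    // is eventually garbage-collected.
    static byte[] readLeaky(Class<?> cls) throws IOException {
        InputStream in = cls.getResourceAsStream(
                "/" + cls.getName().replace('.', '/') + ".class");
        return readFully(in);
    }

    // Fixed pattern: same read, but the stream is closed in a finally block
    // once the bytes are buffered in memory, so repeated calls do not
    // accumulate open descriptors.
    static byte[] readClosed(Class<?> cls) throws IOException {
        InputStream in = cls.getResourceAsStream(
                "/" + cls.getName().replace('.', '/') + ".class");
        try {
            return readFully(in);
        } finally {
            in.close();
        }
    }

    // Drains the stream into a byte array (assumes the class file exists).
    private static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}

Either way, once the stream is closed eagerly after the bytes are read, repeated closure cleaning no longer accumulates open descriptors.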

Matei

angshu rai

Jan 3, 2013, 11:38:51 PM
to spark...@googlegroups.com
Thanks, Matei, for the clarification!

Looking forward to future versions.

Angshu

Vyatkin Andrei

Nov 27, 2013, 7:58:35 AM
to spark...@googlegroups.com
Is this problem still present? I encountered similar problems with Spark 0.8.0:
Exception in thread "main" java.io.IOException: Class not found
at org.objectweb.asm.ClassReader.a(Unknown Source)
at org.objectweb.asm.ClassReader.<init>(Unknown Source)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$getClassReader(ClosureCleaner.scala:37)
at org.apache.spark.util.ClosureCleaner$.getInnerClasses(ClosureCleaner.scala:84)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:107)
at org.apache.spark.SparkContext.clean(SparkContext.scala:820)
at org.apache.spark.rdd.RDD.filter(RDD.scala:258)
