My jobs are hitting "too many open files" exceptions. I am using 0.8.0, and the cause seems to be a very large number of shuffle files (in /tmp/spark-local*) that remain open across successive iterations. With each successive run the number of files there grows until it reaches my ulimit, at which point the job crashes. I don't remember this problem in 0.7. This seems to be related to: https://groups.google.com/d/msg/spark-users/EkbeCGFDaAQ/mL6t9lXsPYcJ
Is there a workaround for this problem?
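For what it's worth, my (possibly wrong) understanding is that the shuffle writes one file per map task per reduce partition, so the file count grows roughly as map tasks x reduce partitions. The only stopgaps I can think of are raising the descriptor limit (ulimit -n) on the workers, or explicitly capping the reducer count; a rough sketch of the latter, with a hypothetical input path and made-up partition counts:

// Assumption: shuffle files ~= map tasks x reduce partitions, so e.g.
// 200 maps x 2000 reducers ~= 400,000 files, while 200 x 200 ~= 40,000.
val pairs = sc.textFile("hdfs:///some/input").map(line => (line, 1))  // hypothetical path
val counts = pairs.reduceByKey(_ + _, 200)  // cap the reduce partition count explicitly
counts.count

But that only delays hitting the ulimit if the files really do stay open across iterations.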
// Repro attempt in the 0.8.0 shell: 2 map partitions shuffling into 1000 reduce partitions.
sc.parallelize(1 to 1000, 2).map(x => (x, x)).reduceByKey(_+_, 1000).count
rxin @ dhcpx-193-92 : ~/Downloads/spark-0.8.0-incubating-bin-hadoop1
> lsof -p 2099 | wc -l
88 // before the job started
rxin @ dhcpx-193-92 : ~/Downloads/spark-0.8.0-incubating-bin-hadoop1
> lsof -p 2099 | wc -l
305 // when the job was running
rxin @ dhcpx-193-92 : ~/Downloads/spark-0.8.0-incubating-bin-hadoop1
> lsof -p 2099 | wc -l
88 // after the job finished
It looks like the files were closed properly.
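One thing that might help narrow it down: count what is actually sitting under the spark-local directories between your iterations, to separate files left on disk from descriptors held open (lsof only shows the latter). A quick sketch you could paste into the driver shell, assuming the default /tmp local dir:

// Counts files under /tmp/spark-local-* (assumes the default spark.local.dir).
def countShuffleFiles(root: java.io.File = new java.io.File("/tmp")): Int = {
  def walk(f: java.io.File): Int =
    if (f.isDirectory) Option(f.listFiles).map(_.map(walk).sum).getOrElse(0)
    else 1
  Option(root.listFiles).getOrElse(Array.empty[java.io.File])
    .filter(_.getName.startsWith("spark-local"))
    .map(walk).sum
}
countShuffleFiles()  // call after each iteration and compare

If that number keeps growing while lsof stays flat, the problem is cleanup of the files on disk rather than leaked descriptors.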
It would be great to learn more about your environment and app.
Pretty standard CentOS distro:
> 2.6.18-308.0.0.0.1.el5xen #1 SMP Sat Feb 25 16:26:29 EST 2012 x86_64 x86_64 x86_64 GNU/Linux
My JVM is
> java version "1.6.0_34"
> Java(TM) SE Runtime Environment (build 1.6.0_34-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 20.9-b04, mixed mode)