state of master/too many open files


Stephen Haberman

May 28, 2013, 8:59:45 PM
to spark...@googlegroups.com
Hey,

Is anyone heavily using/testing master?

I assume so, but after updating our local Spark build today (previously we were on an old 0.7-ish snapshot), I'm seeing a lot of "too many open files" exceptions.

There was a post about this back in January where Matei suggested setting "ulimit -n 16000" in spark-env.sh, which I've done, although I think that is actually lower than the default ulimit on the machine, which was 32000.
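For concreteness, that's just a line like the following near the top of conf/spark-env.sh (assuming the standard setup where the worker processes source that file on startup):

    # raise the per-process open file limit for Spark processes
    ulimit -n 16000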

I noticed a few large-ish changes in Spark; could they be leaking files/connections? Any hints about where to poke around? Or should I try looking somewhere else?

This seems to be happening fairly quickly, within the first 20 minutes of the job running.

Thanks!

- Stephen



Reynold Xin

May 29, 2013, 2:32:34 AM
to spark...@googlegroups.com
There have been a lot of changes to the Spark 0.8 master branch lately. In particular, one change I made altered the way shuffle files are handled.

Can you run "lsof" to figure out which files are open? Are they shuffle files?
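For example, something along these lines against the worker or executor process (the exact process to inspect depends on your deployment; <pid> is a placeholder):

    # list the files held open by a given Spark process
    lsof -p <pid>

    # count how many of them look like shuffle files
    lsof -p <pid> | grep shuffle | wc -l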





Stephen Haberman

May 30, 2013, 1:34:38 PM
to spark...@googlegroups.com
Hi Reynold,

> Can you run "lsof" to figure out which files are open? Are they
> shuffle files?

Sure; give me a bit. We're going back to 0.7-ish for now, but I'll
give master a try again soon and let you know.

- Stephen
