zmq: too many open files


andy

unread,
Jun 28, 2013, 7:08:24 PM6/28/13
to storm...@googlegroups.com
Folks:

We are deploying a topology with 470 workers with a simple DAG:
  
     spout ---> bolt1 ---> bolt3
       |                     ^
       |                     |
       +------> bolt2 -------+

This topology works fine if we declare # of tasks as below:
     - spout ... 150
     - bolt1 ... 100
     - bolt2 ... 100
     - bolt3 ... 940

If we raise the # of tasks as below, Worker throws "org.zeromq.ZMQException: Too many open files"
     - spout ... 150
     - bolt1 ... 150
     - bolt2 ... 150
     - bolt3 ... 2000

2013-06-27 11:41:43 worker [ERROR] Error when processing event
org.zeromq.ZMQException: Too many open files(0x18)
        at org.zeromq.ZMQ$Socket.construct(Native Method)
        at org.zeromq.ZMQ$Socket.<init>(ZMQ.java:886)
        at org.zeromq.ZMQ$Context.socket(ZMQ.java:276)
        at zilch.mq$socket.invoke(mq.clj:52)
        at backtype.storm.messaging.zmq.ZMQContext.connect(zmq.clj:62)
        at backtype.storm.daemon.worker$mk_refresh_connections$this__4289$iter__4296__4300$fn__4301.invoke(worker.clj:244)
        at clojure.lang.LazySeq.sval(LazySeq.java:42)
        at clojure.lang.LazySeq.seq(LazySeq.java:60)
        at clojure.lang.Cons.next(Cons.java:39)
        at clojure.lang.RT.next(RT.java:587)
        at clojure.core$next.invoke(core.clj:64)
        at clojure.core$dorun.invoke(core.clj:2726)
        at clojure.core$doall.invoke(core.clj:2741)
        at backtype.storm.daemon.worker$mk_refresh_connections$this__4289.invoke(worker.clj:238)
        at backtype.storm.daemon.worker$mk_refresh_connections$this__4289.invoke(worker.clj:218)
        at backtype.storm.timer$mk_timer$fn__1820$fn__1821.invoke(timer.clj:33)
        at backtype.storm.timer$mk_timer$fn__1820.invoke(timer.clj:26)
        at clojure.lang.AFn.run(AFn.java:24)
        at java.lang.Thread.run(Thread.java:722)
2013-06-27 11:41:43 util [INFO] Halting process: ("Error when processing an event")

I suspect that the Storm worker's refresh-connections timer is causing the problem.
We are using the default task.refresh.poll.secs (10 sec). When a machine is loaded
with many task threads (12 workers), a timer invocation for bolt1/bolt2 may not
be able to completely establish ZMQ connections to the 400+ bolt3 tasks before
another timer invocation kicks in, eventually causing "too many open files".

Nathan, do you see other potential causes? I assume that we don't need to worry
about thread safety of ZMQ here.

Andy

Nathan Marz

unread,
Jun 28, 2013, 9:20:27 PM6/28/13
to storm-user
The refresh function doesn't open new connections unless there's been a reassignment. Have you tried increasing the open files settings on those boxes? The defaults for Linux are pitifully low.
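On Linux, the per-process open-files limit can be inspected and raised from inside a process; here is a minimal Python sketch (a permanent cluster-wide fix belongs in /etc/security/limits.conf or the supervisor's init script, not in code):

```python
import resource

# Current open-files limit for this process: (soft, hard).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft =", soft, "hard =", hard)

# An unprivileged process may raise its soft limit up to the hard cap;
# raising the hard cap itself requires root (or limits.conf).
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```

The same numbers are what `ulimit -n` reports in the shell that launches the worker.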



--
You received this message because you are subscribed to the Google Groups "storm-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to storm-user+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 



--
Twitter: @nathanmarz
http://nathanmarz.com

andy

unread,
Jun 29, 2013, 6:49:21 PM6/29/13
to storm...@googlegroups.com, nat...@nathanmarz.com
worker.clj seems to maintain (1) C1, a list of required connections, and (2) C2, a list of current connections.
A timer thread refreshes connections if C2 does not contain all required connections.
We got the "too many open files" exception within that timer thread.

Our server has the following settings:
   * Open files : 32k
   * Max process : 32k
The total # of Storm workers is certainly less than 1k in our cluster.
I don't think that we truly need more than 32k open files.

How many file descriptors does each worker need (ZMQ, Zookeeper, etc.)?
Does each executor thread need its own file descriptor? I don't
understand why the problem starts to occur when the number of executors
increases (although the # of workers stays unchanged).
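A back-of-envelope FD budget helps frame the question. Assuming each worker keeps roughly one 0MQ connection per remote worker it sends to, plus a small fixed overhead (the overhead figure below is an assumption, not a measured value):

```python
# Rough FD estimate for one Storm worker. Assumptions: one 0MQ socket
# per remote worker it must reach, plus a handful of descriptors for
# the Zookeeper session, log files, jars, and JVM internals.
total_workers = 470          # workers in this topology (from the thread)
overhead_fds = 20            # assumed overhead, not measured

fds_per_worker = (total_workers - 1) + overhead_fds
print(fds_per_worker)        # ~489: nowhere near a 32k ulimit

# 0MQ 2.x also enforces its own compile-time per-context cap,
# which is far closer to this estimate than the OS ulimit is.
zmq_max_sockets = 512        # default in 0MQ 2.x src/config.hpp
print(fds_per_worker < zmq_max_sockets)
```

This suggests the OS limit of 32k is not the binding constraint; something with a much smaller ceiling is.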


Andy

Nathan Marz

unread,
Jun 30, 2013, 5:04:40 AM6/30/13
to storm-user
That's strange, executors don't use files. Files are only used at the worker level. Does the error occur when lots of workers are being reassigned (when lots of new connections would be made)?

andy

unread,
Jun 30, 2013, 12:36:05 PM6/30/13
to storm...@googlegroups.com, nat...@nathanmarz.com
The error happens when a large topology is being launched.  The stack trace states:

  backtype.storm.daemon.worker$mk_refresh_connections$this__4289.invoke(worker.clj:238)
  backtype.storm.daemon.worker$mk_refresh_connections$this__4289.invoke(worker.clj:218)
  backtype.storm.timer$mk_timer$fn__1820$fn__1821.invoke(timer.clj:33)

I suspect that the issue is caused by this scheduling call:

    (schedule-recurring (:refresh-connections-timer worker) 0 (conf TASK-REFRESH-POLL-SECS) refresh-connections)

I assume that the above statement makes the function built by (mk-refresh-connections ...) run every 10 seconds. Within that function,
new connections are created for any required connections that are missing. I am not sure whether a new invocation could be launched
while a previous one has not yet finished execution. If so, that could cause various problems.

Luan Cestari

unread,
Jun 30, 2013, 8:59:34 PM6/30/13
to storm...@googlegroups.com
Hi

We might need some more internal details of your application. I would suspect that the bolts hold connections to databases or other external systems. I would look in /proc/[PID]/fd for socket:[inode] entries and correlate them with the "netstat -ae" output to find out whether it is those connections that are making you run out of file descriptors (open files). Did you verify /proc/[PID]/limits? (Maybe some profile or other setting is changing the open-files limit.)

Best regards,
Luan
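The checks Luan describes can be scripted against /proc; a small Linux-only sketch (the helper names are ours, not from any Storm tooling):

```python
import os

def open_fd_count(pid="self"):
    """Count open file descriptors for a process via /proc (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))

def nofile_soft_limit(pid="self"):
    """Parse the soft 'Max open files' value out of /proc/<pid>/limits."""
    with open(f"/proc/{pid}/limits") as f:
        for line in f:
            if line.startswith("Max open files"):
                # columns: "Max open files", soft, hard, units
                return int(line.split()[3])
    return None

print(open_fd_count(), "fds open, soft limit", nofile_soft_limit())
```

Run with the worker's PID in place of "self" to see how close it is to its limit; the socket:[inode] symlinks under /proc/[PID]/fd can then be matched against netstat output.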

Nathan Marz

unread,
Jul 1, 2013, 2:41:04 AM7/1/13
to storm-user
schedule-recurring doesn't launch any new threads. Every invocation of refresh-connections happens on the same thread. How many workers are you running per machine?
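A minimal Python analogue of such a single-threaded recurring timer (an illustration of the concurrency claim, not Storm's implementation): because each invocation runs to completion on one dedicated thread before the next sleep begins, two invocations can never overlap.

```python
import threading
import time

def schedule_recurring(interval_s, fn, stop_event):
    """Run fn every interval_s seconds on a single dedicated thread.

    Each call to fn finishes before the next wait starts, so
    invocations of fn can never overlap one another.
    """
    def loop():
        while not stop_event.is_set():
            fn()
            stop_event.wait(interval_s)
    t = threading.Thread(target=loop, daemon=True)
    t.start()
    return t

# usage sketch: record which thread each invocation runs on
calls = []
stop = threading.Event()
schedule_recurring(0.01, lambda: calls.append(threading.get_ident()), stop)
time.sleep(0.1)
stop.set()
assert len(set(calls)) == 1   # every invocation ran on the same thread
```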

andy

unread,
Jul 9, 2013, 5:46:59 PM7/9/13
to storm...@googlegroups.com, nat...@nathanmarz.com
I found the root cause. It's in the 0MQ source code (src/config.hpp):

    max_sockets = 512

To support a large cluster, we will need to raise that value and rebuild.
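The relevant fragment in 0MQ 2.x's src/config.hpp looks roughly like this (the surrounding code is paraphrased; 2048 is an illustrative replacement value, not a recommendation):

```cpp
//  src/config.hpp (ZeroMQ 2.x) -- compile-time tuning constants.
//  max_sockets caps the number of sockets per 0MQ context; the stock
//  value of 512 is the ceiling the workers were hitting here.
max_sockets = 2048,   //  was: max_sockets = 512
```

For what it's worth, in ZeroMQ 3.2 and later this cap became a runtime context option, ZMQ_MAX_SOCKETS, settable via zmq_ctx_set, so no rebuild is needed on those versions.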

andy

unread,
Jul 9, 2013, 6:19:18 PM7/9/13
to storm...@googlegroups.com, nat...@nathanmarz.com
The revised 0MQ code is available at
https://github.com/anfeng/zeromq