Listen::Internals::ThreadPool Thread dies silently on a INotify::IN_Q_OVERFLOW


Patrick Schnetger

Dec 19, 2014, 5:52:37 AM
to guar...@googlegroups.com
Hi there,

before creating an issue I would like to start a small discussion with you.

Over the last two days we faced an issue where a Ruby daemon using Listen
was still running but had stopped processing file change events.

During our investigation process we updated to the latest releases of listen 
and rb-inotify.

We listen to a folder containing ~25000 XML files, all under git revision
control. In order to clean up logs we ran git gc. That pushed every file
into the inotify queue until the maximum number of queued events was
reached (see `cat /proc/sys/fs/inotify/max_queued_events`), at which point
@worker.run within the Linux adapter class raised the following exception:
 
raise Exception.new("inotify event queue has overflowed.")

Since a plain Exception is raised, the rescue block in
Listen::Adapter::Base#start does not catch it
(a bare rescue only catches exceptions that inherit from StandardError).
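
To illustrate the rescue behaviour described above, here is a minimal standalone sketch. It does not use Listen or rb-inotify; the QueueOverflowError class below is a local stand-in defined only for this example, and caught_by_bare_rescue? is a hypothetical helper.

```ruby
# Local stand-in for the newer rb-inotify error class (assumption: defined
# here only for illustration, not loaded from the gem).
class QueueOverflowError < RuntimeError; end

# Hypothetical helper: runs the block and reports whether a bare `rescue`
# caught whatever the block raised.
def caught_by_bare_rescue?
  begin
    yield
  rescue                 # bare rescue is equivalent to `rescue StandardError`
    return true
  end
  false
rescue Exception         # anything that slipped past the bare rescue ends up here
  false
end

old_style = caught_by_bare_rescue? { raise Exception.new("inotify event queue has overflowed.") }
new_style = caught_by_bare_rescue? { raise QueueOverflowError, "inotify event queue has overflowed." }

puts "plain Exception caught by bare rescue:    #{old_style}"
puts "QueueOverflowError caught by bare rescue: #{new_style}"
```

The first call reports false and the second true, which is why switching to a RuntimeError subclass is enough to get a log entry from an existing bare rescue.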

As a result, the thread within the thread pool stopped working, but nothing aborted.
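
A plain-Ruby sketch of that effect, without Listen or Celluloid (observe_dying_worker is a hypothetical helper, and report_on_exception assumes Ruby 2.4 or later): the worker dies, the rest of the process carries on, and the exception only surfaces if someone joins the thread.

```ruby
# Hypothetical helper: starts a worker that dies from an unhandled Exception
# and returns what the rest of the process can observe afterwards.
def observe_dying_worker
  worker = Thread.new do
    # Newer Rubies print a warning when a thread dies; silence it here to
    # mimic the silent failure described above (assumption: Ruby >= 2.4).
    Thread.current.report_on_exception = false
    raise Exception, "inotify event queue has overflowed."
  end

  sleep 0.01 while worker.alive?   # the calling thread keeps running meanwhile

  status = worker.status           # nil means "terminated with an exception"

  message =
    begin
      worker.join                  # join re-raises the exception stored in the thread
      nil
    rescue Exception => e
      e.message
    end

  [status, message]
end

status, message = observe_dying_worker
puts "worker status after failure: #{status.inspect}"
puts "exception seen on join:      #{message}"
```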

What we changed in order to get at least a log entry:

We used the latest revision of rb-inotify because the latest commit uses a 
QueueOverflowError < RuntimeError instead of an Exception.

It would be up to them to make a release out of it... I think this is a significant and important change.

In order to kill the whole script we did the following:

We forked Listen and raised the exception on the main thread (Thread.main) instead of on the current thread in Listen::Adapter::Base#start.
That causes our script to crash as we would have expected, and makes it possible for us to react.
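
Roughly, the idea looks like the sketch below. This is not the actual change to Listen; forward_errors_to and wait_for_forwarded_error are hypothetical names, and the target thread is a parameter (defaulting to Thread.main) so the demo can point it at whichever thread is waiting.

```ruby
# Hypothetical helper: runs the block in a worker thread and re-raises any
# failure on `target` instead of letting the worker die quietly.
def forward_errors_to(target = Thread.main)
  Thread.new do
    begin
      yield
    rescue Exception => e
      target.raise(e)              # Thread#raise delivers the exception asynchronously
    end
  end
end

# Demo: the calling thread plays the role of the "main" thread of the script.
def wait_for_forwarded_error
  forward_errors_to(Thread.current) do
    raise Exception, "inotify event queue has overflowed."
  end
  sleep 5                          # interrupted as soon as the worker forwards its error
  "no error reached the target thread"
rescue Exception => e
  "target thread received: #{e.message}"
end

puts wait_for_forwarded_error
```

Without a rescue around the waiting code, the forwarded exception would terminate the script, which is the crash we wanted.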

Q:

Is the explained 'problem' a desired behavior of the library? (threads stop working silently)
What do you think of our changes/problems?
What issues might arise from using Thread.main.raise instead of a plain raise in Listen::Adapter::Base#start?
Should I create an issue?

I am looking forward to your reply.

Patrick


Cezary Baginski

Dec 21, 2014, 1:13:05 PM
to guar...@googlegroups.com
Just submit a PR to guard/listen.

A few notes:

I added the rescue just to log errors, because Celluloid tries to "recover" and doesn't show errors unless it has the right log level.
So, I don't object to catching every kind of exception in Listen there. Probably except Interrupt I guess.

This means you don't have to bother waiting for a new release for rb-inotify just to change the exception type.
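
A rough sketch of what "catch everything except Interrupt, and log it" could look like in plain Ruby. run_and_log is a hypothetical helper, not Listen's actual code, and the logger is a standard library Logger writing to a string for the demo.

```ruby
require "logger"
require "stringio"

# Hypothetical helper: runs the block, logs any exception (including ones
# that do not inherit from StandardError), but lets Interrupt through so
# Ctrl-C still works.
def run_and_log(logger)
  yield
rescue Interrupt
  raise
rescue Exception => e
  logger.error("#{e.class}: #{e.message}")
  nil
end

log_output = StringIO.new
logger = Logger.new(log_output)

run_and_log(logger) { raise Exception, "inotify event queue has overflowed." }
puts log_output.string
```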

To be honest, I don't even know how to abort Celluloid from there - from what I remember raising AbortError was the way to do it.

(Especially since Celluloid tries to restart actors by itself).

I have no idea what side effects Thread.main.raise would have - and I probably wouldn't care much if I had to debug Celluloid to get things working.

(Listen is sometimes run in a thread, so Thread.main may not be the thread where you expect to raise the error.)

Another thing - you should consider watching only the directories you need (first parameter to Listen.to()). It just doesn't make sense for every change in log/* or db/* or .git/* to trigger events going through Ruby and Celluloid.

Don't hesitate to open issues on github for this - closing invalid issues is no problem.

I hope that helps!

(We can continue this on GitHub.)