FileNotFoundException (Too many open files)

ca...@distillanalytics.ca

Jun 9, 2016, 2:35:30 PM
to OpenRefine
Hello!

I tried googling, searching the group's topics and the wiki and I am happy to report that I didn't find anything for this!

I'm starting to get a stream of these exceptions in my log:

java.io.FileNotFoundException: /mnt/refine/workspace.temp.json (Too many open files)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:171)
        at java.io.FileWriter.<init>(FileWriter.java:90)
        at com.google.refine.io.FileProjectManager.saveToFile(FileProjectManager.java:274)
        at com.google.refine.io.FileProjectManager.saveWorkspace(FileProjectManager.java:243)
        at com.google.refine.ProjectManager.save(ProjectManager.java:212)
        at com.google.refine.RefineServlet$AutoSaveTimerTask.run(RefineServlet.java:94)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
17:55:16.616 [       FileProjectManager] Failed to save workspace (4200016ms)

So I investigated how many files I had open and what my ulimits are!

ulimit -Hn
4096

ulimit -Hs
unlimited

ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 257614
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 257614
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

I used the list of file descriptors under /proc/{pid}/fd to count the file descriptors... They seem to be climbing! When I started composing this message I was at 349, now they've risen to 1136!
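
For anyone retracing this, here is a minimal sketch of that check, assuming a Linux /proc file system and bash. The function names are made up for illustration. It counts the descriptors, lists what they point to (useful for spotting whether one file is being opened over and over), and reads the soft limit that actually applies to the running process.

```shell
#!/usr/bin/env bash
# Hypothetical helper names; Linux-only because they read /proc.

# Count the open file descriptors of a process.
count_fds() {
    local pid="$1"
    ls "/proc/${pid}/fd" 2>/dev/null | wc -l
}

# Show which targets the descriptors point to, most frequent first.
fd_targets() {
    local pid="$1" fd
    for fd in "/proc/${pid}/fd"/*; do
        readlink "$fd" 2>/dev/null
    done | sort | uniq -c | sort -rn | head -n 20
}

# Soft "Max open files" limit in effect for the process.
soft_nofile_limit() {
    local pid="$1"
    awk '/^Max open files/ {print $4}' "/proc/${pid}/limits"
}

# Defaults to the current shell; pass the Java process PID instead.
pid="${1:-$$}"
printf '%s descriptors open, soft limit %s\n' \
    "$(count_fds "$pid")" "$(soft_nofile_limit "$pid")"
```

Running `fd_targets` a few minutes apart should show which entries are growing.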

In the past, I've had success trying to up the memory for java in order to accommodate large datasets... Should I also be upping the number of files it is possible to open?
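
The output above shows a soft limit of 1024 and a hard limit of 4096, so the soft limit can be raised without root. A sketch, assuming bash; `run_with_raised_nofile` is a made-up name, and `./refine` in the usage comment stands in for however the server is actually launched. This only buys headroom and would not fix a descriptor leak.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: raise the soft open-files limit to the hard limit,
# then run the given command. The subshell keeps the change from leaking
# into the calling shell.
run_with_raised_nofile() {
    (
        ulimit -Sn "$(ulimit -Hn)" || exit 1
        exec "$@"
    )
}

# Usage (launch command is an assumption):
#   run_with_raised_nofile ./refine
```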

Cheers,
Caleb
PS: I've got a large custom text facet that includes a hefty regexp, and I'm trying to add a column based on the matches of said regexp. I can describe the regexp if you'd find it useful for debugging.

qi cui

Jun 16, 2016, 3:31:07 PM
to OpenRefine
It's not about your OS limit. It's more about why the process(es) trigger so many "save" actions.

You can check how often you see this exception. It would also be helpful if you could provide the size of the data set, the action you tried to apply, etc.

qi cui

Jun 16, 2016, 3:35:14 PM
to OpenRefine
One more thing: which file system is the workspace attached to? If the mount point detached silently, you may come across this issue.
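
A rough way to check for that, assuming Linux and bash: look the directory up in /proc/mounts. `is_mounted` is a made-up helper name, and the `/mnt/refine` path in the usage comment is inferred from the stack trace. Whether the workspace sits directly on a mount point is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given directory is currently listed
# as a mount point in /proc/mounts. Paths containing spaces appear escaped
# there (\040), so this sketch only handles simple paths.
is_mounted() {
    local dir="$1"
    awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' /proc/mounts
}

# Usage (path is an assumption based on the log above):
#   is_mounted /mnt/refine || echo "workspace mount is missing"
```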


Caleb Buxton

Jun 17, 2016, 2:48:05 PM
to openr...@googlegroups.com
Thanks for these tips! The next time it happens I'll check all of these. However, as I haven't been able to reproduce it recently, I'm going to lean toward the file system detaching silently. *shrug*
