We are getting this error:
2006-12-04 04:47:04 ApplicationDispatcher[]: Servlet internal-error is currently unavailable
2006-12-04 04:48:24 StandardWrapperValve[bitstream]: Servlet.service() for servlet bitstream threw exception
java.io.FileNotFoundException: /l1/dspace/repository/prod/assetstore/14/58/45/145845832063558580850369699477251654488 (Too many open files)
2006-12-04 04:46:06 StandardWrapperValve[bitstream]: Servlet.service() for servlet bitstream threw exception
java.net.SocketException: Connection reset
2006-12-04 04:45:02 StandardWrapperValve[bitstream]: Servlet.service() for servlet bitstream threw exception
java.net.SocketException: Connection reset
in the evening when Google is crawling our repository, and I believe these errors are bringing Tomcat down. I get the impression that some PDF files are not being closed in the /tmp dir, and this is bringing Tomcat down. When Tomcat is restarted, all works well.
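One quick way to test the leaked-descriptor theory is to watch the file-descriptor count of the Tomcat process over time. A minimal sketch, assuming a Linux host; it counts descriptors for the current shell ($$), and Tomcat's PID would be substituted in practice:

```shell
# A process's open file descriptors are listed under /proc/<pid>/fd on
# Linux. Here we count them for the current shell; substituting Tomcat's
# PID shows whether the count climbs steadily (a leak) during crawls.
pid=$$
fd_count=$(ls /proc/$pid/fd | wc -l)
echo "process $pid holds $fd_count open file descriptors"

# The per-process ceiling it runs under; "Too many open files" is thrown
# when this is exhausted:
grep 'Max open files' /proc/$pid/limits
```

If the count grows without bound between restarts, something is failing to close streams; if it plateaus near the limit only during heavy crawling, raising the limit or throttling the crawler is the likelier fix.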
Does anyone have any suggestions?
Thanks!
Jose
A day ago I posted that we were getting “too many files open” error and I found this thread today discussing it:
http://sourceforge.net/mailarchive/forum.php?forum_id=39921&max_rows=25&style=flat&viewmonth=200408
I’m a bit confused as to what I need to do. I have version 1.4 of DSpace, and I’m not sure what version of Lucene I have. Can someone tell me how I can find that out? Do I need to get the latest version of Lucene and run ./filter-media with a -f switch to force all items to be re-indexed, to create compound files and get rid of this error?
Thanks!
Jose
Mark:
Thanks for answering this question.
We run index-all nightly, and when I go to the <dspace>/search dir, this is what I see:
-bash-3.00$ pwd
/l1/dspace/repository/prod/search
-bash-3.00$ ls -la
total 2102880
drwxr-xr-x 2 dspace dspace 4096 Dec 5 06:07 .
drwxr-xr-x 13 dspace dspace 4096 Dec 1 10:52 ..
-rw-r--r-- 1 dspace dspace 4 Dec 5 06:07 deletable
-rw-r--r-- 1 dspace dspace 2151226568 Dec 5 06:07 _s12.cfs
-rw-r--r-- 1 dspace dspace 29 Dec 5 06:07 segments
Does this look OK to you?
Thanks!!
From: Mark Diggory [mailto:mdig...@MIT.EDU]
Sent: Tuesday, December 05, 2006 3:33 PM
To: Jose Blanco
Cc: dspac...@lists.sourceforge.net; dspace-...@MIT.EDU
Subject: Re: [Dspace-general] too many open files
FilterMedia doesn't actually interact with Lucene directly, only indirectly, in that any generated text bitstreams will get picked up later when "index-all" is called. So, no, running filter-media will not solve your too-many-open-files issue.
The current version of <dspace>/bin/index-all will rebuild your entire Lucene search index (and it will be completely optimized as well). The usual suggestion is to run it nightly in a cron job on your DSpace server. If you look in <dspace>/search and see many "segment" files there, that may suggest that your index is not optimized.
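The segment check Mark describes can be scripted. A sketch, using a synthetic scratch directory in place of the real <dspace>/search; the cron path is an assumption based on the paths shown earlier in this thread:

```shell
# count_segments DIR -> number of Lucene .cfs compound segment files in DIR.
count_segments() {
  ls "$1" 2>/dev/null | grep -c '\.cfs$'
}

# Demo against a scratch directory standing in for <dspace>/search:
demo=$(mktemp -d)
touch "$demo/_s12.cfs" "$demo/segments" "$demo/deletable"
count_segments "$demo"    # prints 1: a single .cfs file, i.e. an optimized index

# An example nightly cron entry (run as the dspace user) to rebuild the index:
# 0 3 * * * /l1/dspace/bin/index-all
```

A freshly optimized index shows one (or very few) .cfs files, as in Jose's listing above; dozens of segment files suggest index-all has not been running.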
Cheers,
Mark
_______________________________________________
Dspace-general mailing list
Mark R. Diggory
~~~~~~~~~~~~~
DSpace Systems Manager
MIT Libraries, Systems and Technology Services
Massachusetts Institute of Technology
-------------------------------------------------------------------------
DSpace-tech mailing list
So why do you think we are getting the “too many open files” error? It seems to happen when Google is crawling our site. It also seems like this error message has to do with the kernel limits on the number of open files, which by default is 1024; that should be enough, no? And we do run ./filter-media nightly.
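For reference, the 1024 default mentioned here is the per-process limit, which is separate from the kernel-wide fs.file-max discussed later in the thread. Checking and raising it might look like this; the dspace user name and the 8192 figure are illustrative assumptions, not recommendations:

```shell
# Per-process open-file limits for the current shell:
ulimit -Sn   # soft limit (often 1024 by default)
ulimit -Hn   # hard limit

# One common way to raise it for the user running Tomcat is an entry in
# /etc/security/limits.conf (takes effect on the next login):
#   dspace  soft  nofile  8192
#   dspace  hard  nofile  8192
```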
Thanks for your thoughts on this.
From: dspace-te...@lists.sourceforge.net [mailto:dspace-te...@lists.sourceforge.net] On Behalf Of Mark Diggory
Sent: Tuesday, December 05, 2006 3:44 PM
To: Jose Blanco
Cc: dspac...@lists.sourceforge.net; dspace-...@MIT.EDU
Subject: Re: [Dspace-tech] [Dspace-general] too many open files
Yes, that looks like an optimized search index. An unoptimized index would have many more files in it.
Here are the results:
-bash-3.00$ df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda7 128520 27978 100542 22% /
/dev/sda1 64256 47 64209 1% /boot
/dev/sda8 5013504 10060 5003444 1% /l
/dev/sdb1 60325888 1881916 58443972 4% /l1
/dev/sdc1 58621952 119448 58502504 1% /l2
none 223864 1 223863 1% /dev/shm
/dev/sda5 262144 176 261968 1% /tmp
/dev/sda2 3074176 107051 2967125 4% /usr
/dev/sda3 262144 1278 260866 1% /var
AFS 9000000 0 9000000 0% /afs
From: Jose Blanco [mailto:bla...@umich.edu]
Sent: Tuesday, December 05, 2006 4:15 PM
To: 'Mark Diggory'
Cc: 'dspac...@lists.sourceforge.net'
Subject: RE: [Dspace-tech] [Dspace-general] too many open files
Mark:
Does this help?
-bash-3.00$ uname -a
Linux “server_name” 2.6.17-1.2142_FC4smp #1 SMP Tue Jul 11 22:57:02 EDT 2006 i686 i686 i386 GNU/Linux
-bash-3.00$
Fedora core 4
What do you mean by “allocating for open files”?
Thanks!
From: dspace-te...@lists.sourceforge.net [mailto:dspace-te...@lists.sourceforge.net] On Behalf Of Mark Diggory
Sent: Tuesday, December 05, 2006 4:03 PM
To: Jose Blanco
Cc: dspac...@lists.sourceforge.net; dspace-...@MIT.EDU
Subject: Re: [Dspace-tech] [Dspace-general] too many open files
I'm going to stop dual-posting to both lists in my next email and just post to dspace-tech for this issue.
I think there are one or two things you can look into in this situation.
1.) Throttle your webserver activity so that crawlers cannot open more than a limited number of connections at any one time.
And/or
2.) Increase the number of inodes available for open files on your filesystem. You want to look at the following settings and see where they are:
cat /proc/sys/fs/file-max
Post them. file-max has the maximum number of inodes allowed on the system for open files; every open file requires one or more inodes. I think file-nr is the current number being used. There's a lot of documentation on the web about the subject. Here's a brief overview of the trade-offs of increasing inodes; it's Redhat/Fedora centric.
-Mark
On Dec 5, 2006, at 4:14 PM, Jose Blanco wrote:
Mark:
Does this help?
-bash-3.00$ uname -a
Linux “server_name” 2.6.17-1.2142_FC4smp #1 SMP Tue Jul 11 22:57:02 EDT 2006 i686 i686 i386 GNU/Linux
-bash-3.00$
Fedora core 4
What do you mean by “allocating for open files”?
Mark R. Diggory
~~~~~~~~~~~~~
DSpace Systems Manager
MIT Libraries, Systems and Technology Services
Massachusetts Institute of Technology
Mark:
We do have a throttle in place, but it does not work on the number of connections opened; it’s more to block access to items, so perhaps this is an issue. I will talk this over with our sysadmin person.
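On the crawler-throttling side, a robots.txt hint is the lightest-weight option. This is a sketch only: the bitstream path is a hypothetical example, and note that Crawl-delay is honored by some crawlers but not by Googlebot, whose rate has to be set through Google's webmaster tools instead:

```shell
# Write a minimal robots.txt asking crawlers to pace themselves and skip
# the bitstream servlet (the Disallow path is hypothetical; adjust it to
# the actual URLs your DSpace instance serves):
cat > robots.txt <<'EOF'
User-agent: *
Crawl-delay: 10
Disallow: /dspace/bitstream/
EOF
cat robots.txt
```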
Here is the output you wanted me to post:
-bash-3.00$ cat /proc/sys/fs/file-max
206034
-bash-3.00$ cat /proc/sys/fs/file-nr
4224 0 206034
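Reading that output: file-nr's three fields are allocated handles, allocated-but-unused handles, and the ceiling (which mirrors file-max), so roughly 4,224 handles are in use against a ceiling of 206,034; the system-wide limit is nowhere near exhausted here. Checking and raising the ceiling might look like this (the 300000 figure is purely illustrative):

```shell
# System-wide ceiling and current usage of file handles:
cat /proc/sys/fs/file-max
awk '{print "in use:", $1, " ceiling:", $3}' /proc/sys/fs/file-nr

# Raising the ceiling (as root) for the running kernel:
#   sysctl -w fs.file-max=300000
# and persistently, via a line in /etc/sysctl.conf:
#   fs.file-max=300000
```

Since the system-wide numbers look healthy, the per-process limit of the Tomcat JVM (ulimit -n) is the more likely culprit for the error in the original post.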
I’ll do some reading on this.
Thanks!
Jose
From: dspace-te...@lists.sourceforge.net [mailto:dspace-te...@lists.sourceforge.net] On Behalf Of Mark Diggory
Sent: Tuesday, December 05, 2006 4:31 PM
To: Jose Blanco
Cc: dspac...@lists.sourceforge.net
~ $ cat /proc/sys/fs/file-max
359563
~ $ cat /proc/sys/fs/file-nr
640 0 359563
~ $ cat /proc/sys/fs/inode-
inode-nr     inode-state
~ $ cat /proc/sys/fs/inode-nr
50087 15640
~ $ cat /proc/sys/fs/inode-state
50090 15640 0 0 0 0 0
~ $ df -i
Filesystem                    Inodes   IUsed    IFree IUse% Mounted on
/dev/md/0                    1224000    8659  1215341    1% /
udev                          217552     745   216807    1% /dev
/dev/mapper/vg-usr           1310720  275633  1035087   22% /usr
/dev/mapper/vg-home          2621440   57411  2564029    3% /home
/dev/mapper/vg-var           1310720   38530  1272190    3% /var
/dev/mapper/vg-tmp            655360      98   655262    1% /tmp
shm                           217552       1   217551    1% /dev/shm
192.168.0.13:/mnt/staging   10134720  358260  9776460    4% /mnt/staging
192.168.0.13:/mnt/assetstore 10134720 358260  9776460    4% /mnt/prod/assetstore1
192.168.0.13:/mnt/assetstore0 10134720 358260 9776460    4% /mnt/prod/assetstore0
Mark:
We suspect that memory is being saturated during indexing and that this is causing trouble with Tomcat. Are there any guidelines on how to set memory parameters for repositories based on size?
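There were no official sizing formulas for DSpace at the time; the usual first step is simply to give the JVMs explicit heap bounds and observe. A sketch with illustrative numbers, not recommendations:

```shell
# An explicit heap for the Tomcat JVM, e.g. in catalina.sh or a setenv
# script (Xms = initial heap, Xmx = maximum heap):
export JAVA_OPTS="-Xms256m -Xmx512m"
echo "$JAVA_OPTS"

# index-all runs in its own JVM, so it needs its own setting; whether the
# bin scripts honor JAVA_OPTS is an assumption about this DSpace version:
#   JAVA_OPTS="-Xmx512m" /l1/dspace/bin/index-all
```

If the box has limited RAM, an oversized Tomcat heap plus a concurrent index-all JVM can push the machine into swap, which matches the "saturated during indexing" symptom described above.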
Here is a bit of information about our server:
Linux “server_name” 2.6.17-1.2142_FC4smp #1 SMP Tue Jul 11 22:57:02 EDT 2006 i686 i686 i386 GNU/Linux
Fedora core 4
Number of items: 32,710
We have two assetstore dirs with the following sizes:
assetstore 0 : 86,119,348 K
assetstore 1 : 21,411,068 K
Here is what we have in our search dir:
-bash-3.00$ ls -la
total 2103528
drwxr-xr-x 2 dspace dspace 4096 Dec 7 14:07 .
drwxr-xr-x 13 dspace dspace 4096 Dec 1 10:52 ..
-rw-r--r-- 1 dspace dspace 4 Dec 7 14:07 deletable
-rw-r--r-- 1 dspace dspace 2151226568 Dec 7 08:56 _s12.cfs
-rw-r--r-- 1 dspace dspace 4095 Dec 7 14:07 _s12.del
-rw-r--r-- 1 dspace dspace 444666 Dec 7 11:51 _s1u.cfs
-rw-r--r-- 1 dspace dspace 10 Dec 7 13:56 _s1u.del
-rw-r--r-- 1 dspace dspace 81333 Dec 7 11:52 _s2e.cfs
-rw-r--r-- 1 dspace dspace 85694 Dec 7 11:53 _s2y.cfs
-rw-r--r-- 1 dspace dspace 27933 Dec 7 14:07 _s3e.cfs
-rw-r--r-- 1 dspace dspace 65 Dec 7 14:07 segments
Thank you!
Jose