v1.4.21 - Too many open files!

Taffy

Aug 7, 2016, 8:13:51 PM
to OpenLiteSpeed Development
Hi,

Using latest v1.4.21, and am getting logs full of:

2016-08-08 12:05:01.007 [ERROR] Failed to open the real time report!
2016-08-08 12:05:01.192 [ERROR] [*:80] HttpListener::acceptConnection(): Accept failed:Too many open files!
2016-08-08 12:05:01.512 [ERROR] [*:80] HttpListener::acceptConnection(): Accept failed:Too many open files!
2016-08-08 12:05:01.759 [ERROR] [*:443] HttpListener::acceptConnection(): Accept failed:Too many open files!
2016-08-08 12:05:02.060 [ERROR] Failed to open the real time report!
2016-08-08 12:05:02.230 [ERROR] [*:80] HttpListener::acceptConnection(): Accept failed:Too many open files!
2016-08-08 12:05:03.031 [ERROR] Failed to open the real time report!
2016-08-08 12:05:03.710 [ERROR] [*:443] HttpListener::acceptConnection(): Accept failed:Too many open files!
2016-08-08 12:05:04.011 [ERROR] Failed to open the real time report!
2016-08-08 12:05:04.212 [ERROR] [*:443] HttpListener::acceptConnection(): Accept failed:Too many open files!

These errors are repeated over and over until essentially the hard drive fills up and the server just goes to a 503 error.  This has been happening since at least v1.4.19, or at least this is when it started to fill the logs within minutes and make the server fail.

Environment is 64bit Amazon Linux, ulimit is 5120 soft limit and 16k hard limit.  I can't find any specific reference to file limits in the docs or config.
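
For reference, the open-file limits can also be read from inside a process instead of from the shell. The following is a minimal Python sketch, assuming a Unix-like system; the helper names `nofile_limits` and `describe` are made up for illustration. It reports the limits of the process running the script, which may differ from those of the OpenLiteSpeed worker.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


def describe(limit):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"open files: soft={describe(soft)} hard={describe(hard)}")
```

A daemon started by an init script does not necessarily inherit the limits of an interactive shell, so the numbers printed here are only a starting point.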

David

Aug 7, 2016, 9:22:47 PM
to openlitespee...@googlegroups.com
Sorry to hear that.
This bug was supposed to be fixed in release 1.4.21. Did you install from source or from the RPM repo?
If you installed from source and still get these errors, there must be some case that the fix did not cover.
We will figure it out and fix it soon.
Thanks.
David
--
You received this message because you are subscribed to the Google Groups "OpenLiteSpeed Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openlitespeed-deve...@googlegroups.com.
To post to this group, send email to openlitespee...@googlegroups.com.
Visit this group at https://groups.google.com/group/openlitespeed-development.
For more options, visit https://groups.google.com/d/optout.

Taffy

Aug 7, 2016, 9:55:11 PM
to OpenLiteSpeed Development
Yes, it was installed from source, not a package.

David

Aug 8, 2016, 9:38:26 AM
to openlitespee...@googlegroups.com
I checked the code; one of the problems is probably the file permissions of the real-time report.
Please check the permissions of these directories and the file:
/dev/
/dev/shm/
/dev/shm/ols/
/dev/shm/ols/.rtreport

Because the OpenLiteSpeed workers run as the nobody user, if the file or any of these directories cannot be accessed by nobody, you will get the "Failed to open the real time report!" error.
As for the other errors, I am still checking.
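
One way to review that chain of permissions in one go is to print the mode and owner of every component of the path. This is a minimal Python sketch; `ancestors` and `report` are illustrative names. It only lists modes and owners and does not decide whether nobody can actually get in, since group membership and ACLs also matter.

```python
import os
import stat


def ancestors(path):
    """Return every component of an absolute path, from '/' down to path."""
    path = os.path.abspath(path)
    chain = []
    while True:
        chain.append(path)
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return list(reversed(chain))


def report(path):
    """Return (component, mode string or None, owner info or error) rows."""
    rows = []
    for component in ancestors(path):
        try:
            st = os.stat(component)
        except OSError as exc:
            rows.append((component, None, str(exc)))
            continue
        owner = f"uid={st.st_uid} gid={st.st_gid}"
        rows.append((component, stat.filemode(st.st_mode), owner))
    return rows


if __name__ == "__main__":
    for component, mode, info in report("/dev/shm/ols/.rtreport"):
        print(f"{mode or '??????????'}  {info}  {component}")
```

A directory needs the execute bit for the relevant user to be traversed, and the report file itself needs read and write access, so those are the bits to look at in the output.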

Thanks.
David



George Wang

Aug 8, 2016, 9:55:39 AM
to openlitespee...@googlegroups.com
Hi Taffy,

Can you please run "lsof -p <pid_of_an_openlitespeed_worker>" and send
us the output? You can send it to bug@litespeed.... if needed.

You can let it run for a little while, do not have to wait for the error
to appear.

We need to know which file was not closed.
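
As a rough alternative to lsof on Linux, the open descriptors of a process can be tallied from /proc/<pid>/fd. This is only a sketch: the function names and the `proc_root` parameter are hypothetical, and reading the descriptors of another user's process normally requires root.

```python
import os
from collections import Counter


def open_fd_targets(pid, proc_root="/proc"):
    """Return the link target of every open descriptor of a process."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


def summarize(targets, top=10):
    """Return the most frequently opened targets with their counts."""
    return Counter(targets).most_common(top)
```

Calling `summarize(open_fd_targets(pid))` with the pid of a worker a few times, some minutes apart, should show whether the number of descriptors keeps growing and which paths dominate.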

Best regards,
George Wang


Yang

Aug 8, 2016, 10:21:51 PM
to OpenLiteSpeed Development
I have some ideas. If you upgraded OpenLiteSpeed rather than doing a fresh install, you must stop OpenLiteSpeed and delete the old /dev/shm/ols directory, then start OpenLiteSpeed again and reload; the real-time report should then work.
I think you can use cat /proc/<your openlitespeed pid>/limits to show the open file limits of your OpenLiteSpeed process if you use CentOS 7.x.
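
The same check can be scripted. Below is a minimal Python sketch that pulls the "Max open files" row out of the text of /proc/<pid>/limits, assuming the usual Linux layout of that file; `max_open_files` and `read_limits` are illustrative names, not part of any OpenLiteSpeed tooling.

```python
def max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row, or None if absent."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns after the three-word name: soft limit, hard limit, units.
            fields = line.split()
            return fields[3], fields[4]
    return None


def read_limits(pid):
    """Read the limits file of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as fh:
        return fh.read()
```

For a running worker, `max_open_files(read_limits(pid))` returns the two values as strings, since either of them may be the word "unlimited".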

On Monday, August 8, 2016 at 8:13:51 AM UTC+8, Taffy wrote:

Taffy

Aug 9, 2016, 12:54:27 AM
to OpenLiteSpeed Development
The 'failed to open realtime report' error is not the issue; it only starts once the server's disk is full. Once the logs are removed and the service restarted, there are no errors until the 'Too many open files' error recurs.

Once the error re-appears, I will do as George suggests below and send the output of the lsof command.



George Wang

Aug 9, 2016, 11:13:34 AM
to openlitespee...@googlegroups.com
Hi,

We have not received the email yet. Please make sure to use the full
domain name, "litespeedtech.com", when you send the email. I did not give
the full domain to avoid it being collected by spam bots. :-)

Best regards,
George Wang

Taffy

Aug 9, 2016, 4:21:01 PM
to OpenLiteSpeed Development
I haven't sent an email as yet :)  If it's of any use until I can grab an output for you (I had to put a script in place to delete logs and restart server, to stop site going down):

There's a LOT of these:
litespeed 3673 nobody  DEL    REG                0,5           3342576105 /dev/zero
litespeed 3673 nobody  DEL    REG                0,5           3342576104 /dev/zero
litespeed 3673 nobody  DEL    REG                0,5           3342576103 /dev/zero
litespeed 3673 nobody  DEL    REG                0,5           3342576102 /dev/zero
litespeed 3673 nobody  DEL    REG                0,5           3342576065 /dev/zero
litespeed 3673 nobody  DEL    REG                0,5           3342576064 /dev/zero
litespeed 3673 nobody  DEL    REG                0,5           3342576063 /dev/zero

The majority of the rest of the lsof output is like this:

litespeed 3673 nobody 4088r   REG              202,1     44387     677921 /home/(userfolder)/public_html/assets/images/site/(imagename).jpg
litespeed 3673 nobody 4089r   REG              202,1     24383     734870 /home/(userfolder)/public_html/assets/images/site/(imagename).jpg
litespeed 3673 nobody 4090r   REG              202,1     30102     684496 /home/(userfolder)/public_html/assets/images/site/(imagename).jpg
litespeed 3673 nobody 4091r   REG              202,1    150256     708987 /home/(userfolder)/public_html/assets/images/site/(imagename).jpg
litespeed 3673 nobody 4092r   REG              202,1     49191     719539 /home/(userfolder)/public_html/assets/images/site/(imagename).jpg

(Identifying data removed)

Apart from a handful of files, the vast majority are images, all in this state.

George Wang

Aug 9, 2016, 4:40:14 PM
to openlitespee...@googlegroups.com

> The majority of the rest of the lsof output is like this:
>
> litespeed 3673 nobody 4088r REG 202,1 44387
> 677921 /home/(userfolder)/public_html/assets/images/site/(imagename).jpg
> litespeed 3673 nobody 4089r REG 202,1 24383
> 734870 /home/(userfolder)/public_html/assets/images/site/(imagename).jpg
> litespeed 3673 nobody 4090r REG 202,1 30102
> 684496 /home/(userfolder)/public_html/assets/images/site/(imagename).jpg
> litespeed 3673 nobody 4091r REG 202,1 150256
> 708987 /home/(userfolder)/public_html/assets/images/site/(imagename).jpg
> litespeed 3673 nobody 4092r REG 202,1 49191
> 719539 /home/(userfolder)/public_html/assets/images/site/(imagename).jpg
Yes, it is the section showing the problem.
There should be multiple entries for the same file, right?
How long does it take for duplicate entries for the same file to start
appearing after you restart OLS?
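
Checking for duplicates by hand is tedious, so here is a small Python sketch that counts how often each regular file appears in saved lsof output. It assumes the default lsof column layout (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME); the function names are made up for illustration.

```python
from collections import Counter


def open_regular_files(lsof_text):
    """Count open regular files by name in default-format lsof output."""
    counts = Counter()
    for line in lsof_text.splitlines():
        fields = line.split()
        if len(fields) < 9:
            # Rows without a SIZE/OFF value (such as DEL entries) are shorter.
            continue
        fd, ftype = fields[3], fields[4]
        # Keep only numbered descriptors (e.g. 4088r) that point at regular files.
        if ftype != "REG" or not fd[0].isdigit():
            continue
        counts[" ".join(fields[8:])] += 1
    return counts


def duplicates(counts):
    """Return only the files that are open more than once."""
    return {name: n for name, n in counts.items() if n > 1}
```

Feeding it the text of `lsof -p <pid>` and printing `duplicates(open_regular_files(text))` gives an empty dict when every open file is distinct.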

We will investigate.

Best regards,
George Wang

Taffy

Aug 9, 2016, 10:13:52 PM
to openlitespee...@googlegroups.com
They don't appear to be the same name, I've done a few greps on the results and it looks like they are all different file names.

Edit: as per below, server recently crashed and restarted so might not show duplicates just yet.

David

Aug 9, 2016, 10:17:01 PM
to openlitespee...@googlegroups.com
So they are maybe thousands of different files, right?
Do you know the size range of these files? Such as bigger than 40KB and smaller than 40MB.




Taffy

Aug 9, 2016, 10:17:49 PM
to OpenLiteSpeed Development
Probably unrelated, but same server process has just done this:

[New LWP 3673]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `openlitesp'.
Program terminated with signal 6, Aborted.
#0  0x00007fdcbde615f7 in raise () from /lib64/libc.so.6
#0  0x00007fdcbde615f7 in raise () from /lib64/libc.so.6
#1  0x00007fdcbde62ce8 in abort () from /lib64/libc.so.6
#2  0x00007fdcbde5a566 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007fdcbde5a612 in __assert_fail () from /lib64/libc.so.6
#4  0x00000000004c375a in FileCacheDataEx::getCacheData (this=0x2f85108, offset=0, wanted=wanted@entry=@0x7ffc7d57b6f8: 21525, pBuf=0x9f9180 <HttpResourceManager::g_aBuf> "(filename removed)", len=len@entry=16384) at staticfilecachedata.cpp:251
#5  0x00000000004e8f98 in HttpSession::sendStaticFileEx (this=this@entry=0x3016880, pData=pData@entry=0x3016d88) at httpsession.cpp:3701
#6  0x00000000004e92eb in HttpSession::sendStaticFile (this=this@entry=0x3016880, pData=pData@entry=0x3016d88) at httpsession.cpp:3736
#7  0x00000000004e9669 in HttpSession::flushBody (this=this@entry=0x3016880) at httpsession.cpp:3072
#8  0x00000000004e9f48 in HttpSession::flush (this=this@entry=0x3016880) at httpsession.cpp:3154
#9  0x00000000004ea394 in HttpSession::endResponse (this=this@entry=0x3016880, success=success@entry=1) at httpsession.cpp:3043
#10 0x00000000004ea44b in endResponse (success=1, this=0x3016880) at httpsession.cpp:3015
#11 HttpSession::sendDefaultErrorPage (this=this@entry=0x3016880, pAdditional=pAdditional@entry=0x0) at httpsession.cpp:441
#12 0x00000000004ebd7c in HttpSession::sendHttpError (this=this@entry=0x3016880, pAdditional=pAdditional@entry=0x0) at httpsession.cpp:1864
#13 0x00000000004ebff2 in HttpSession::httpError (this=this@entry=0x3016880, code=code@entry=50, pAdditional=pAdditional@entry=0x0) at httpsession.cpp:454
#14 0x00000000004eb547 in HttpSession::smProcessReq (this=this@entry=0x3016880) at httpsession.cpp:4216
#15 0x00000000004ec543 in HttpSession::onReadEx (this=0x3016880) at httpsession.cpp:1971
#16 0x00000000004d8005 in NtwkIOLink::handleEvents (this=0x307c920, evt=<optimized out>) at ntwkiolink.cpp:390
#17 0x00000000005384ae in epoll::waitAndProcessEvents (this=0x27c93b0, iTimeoutMilliSec=100) at epoll.cpp:214
#18 0x00000000004cc5b1 in EventDispatcher::run (this=this@entry=0x2778e68) at eventdispatcher.cpp:235
#19 0x00000000004ae440 in HttpServerImpl::start (this=0x2778e40) at httpserver.cpp:500
#20 0x00000000004b73b9 in HttpServer::start (this=<optimized out>) at httpserver.cpp:3669
#21 0x000000000048c690 in LshttpdMain::main (this=this@entry=0x2778bf0, argc=argc@entry=1, argv=argv@entry=0x7ffc7d57bab8) at lshttpdmain.cpp:936
#22 0x000000000048c5c7 in main (argc=1, argv=0x7ffc7d57bab8) at main.cpp:109

Taffy

Aug 9, 2016, 10:22:10 PM
to OpenLiteSpeed Development
Yes, most would be in that range; only one or two would be above 1 MB. The typical range would be 40 KB to 1 MB.



David

Aug 11, 2016, 3:40:35 PM
to openlitespee...@googlegroups.com
Hi Taffy,

We just released 1.4.22 on GitHub. Can you install it on your site and verify whether the bug is fixed?
In this version, besides that bug, we also fixed a bug in the server-level suexec user/group setting for extApp and added some checks on the directory permissions of the real-time report.
Please let us know if you need any help.

Thanks.
David

Taffy

Aug 22, 2016, 12:46:55 AM
to OpenLiteSpeed Development
Hi,

Only just got around to installing the release, so far so good, will keep you informed.

Taffy

Aug 30, 2016, 6:00:43 PM
to OpenLiteSpeed Development
Hi,

Just to confirm, this bug is fixed now, no recurrence in over a week.

Thanks very much, appreciate your work greatly!

David

Aug 30, 2016, 8:56:13 PM
to openlitespee...@googlegroups.com
Hi Taffy ,
Thank you for letting me know.
David

Matteo

Feb 15, 2017, 12:23:49 AM
to OpenLiteSpeed Development
If possible, please reopen this bug. I have the same problem on CentOS 7.x, fully updated, with the latest version of OpenLiteSpeed installed from your RPM repository. The same errors fill the logs until the hard disk is full and the server returns a 502.