Server instability apparently caused by errant PHP app (SITR 2.0.4)

9 views
Skip to first unread message

Lewis G Rosenthal

unread,
Oct 29, 2009, 4:52:40 PM10/29/09
to Apache2 Mailing List
Greetings, all...

My server uptime is now 15 days and counting; no Apache hiccups.
Essentially, it is/was an errant PHP app (SITR 2.0.4) which appears to
have been causing my difficulties all along. Limiting access to a single
listening port in httpd.conf simply forestalled the inevitable, and
thanks to Steve's keen eye and ability to merge multiple log files to
find common factors, it seemed that what was getting accessed regularly
was my knowledgebase, which currently runs on a rather rough PHP app,
SITR (Support Information TRacker) version 2.0.4
(http://sitr.sourceforge.net ).

After the last crash, I moved SITR out of my webspace and restarted the
server, which has been running ever since. The plan is to migrate the
data in SITR to MediaWiki, as I've seen some nice knowledgebases set up
in that fashion, with multi-tiered authentication (publicly-viewable
pages and private ones). However, as seems to usually be the case, the
project has just been stuck in/on my to-do list...

It may be that the stats table in the db (which had grown quite large
over the years) was returning data which PHP found objectionable. It may
have been able to deal with this at one time, but perhaps the addition
of the VOICE sites and the other PHP apps consumed too many resources
for the current configuration, causing the situation to just go over the
edge of the proverbial cliff, or it could simply be that the bad data
was posted to the table aroudn the same time as I was setting up the new
sites. I've emptied the contents of the stats table, and will put the
app back into production as soon as I am reasonably certain that SITR
was the problem (and I am rapidly moving toward that conclusion, simply
based upon observation of the system).

Anyway, I'll keep everyone posted, but I wanted to take a few minutes to
thank everyone for all of the assistance I've had with this these past
few months (wow!), and all of the patience which has been afforded me here.

Cheers/2

--
Lewis
-------------------------------------------------------------
Lewis G Rosenthal, CNA, CLP, CLE
Rosenthal & Rosenthal, LLC www.2rosenthals.com
Need a managed Wi-Fi hotspot? www.hautspot.com
Secure, stable, operating system www.ecomstation.com
-------------------------------------------------------------

Steven Levine

unread,
Oct 29, 2009, 8:14:07 PM10/29/09
to apa...@googlegroups.com
In <4AEA0098...@2rosenthals.com>, on 10/29/09
at 04:52 PM, Lewis G Rosenthal <lgros...@2rosenthals.com> said:

Hi,

>My server uptime is now 15 days and counting; no Apache hiccups.

This is more like it. :-)

>Limiting access to a single
>listening port in httpd.conf simply forestalled the inevitable,

Did you ever re-enable the second listen port?

>find common factors, it seemed that what was getting accessed regularly
>was my knowledgebase, which currently runs on a rather rough PHP app,
>SITR (Support Information TRacker) version 2.0.4
>(http://sitr.sourceforge.net ).

We don't really know if it was php or the mysql module that was the true
culprit. This will have to wait until you get the test server off the
wishlist.

>edge of the proverbial cliff, or it could simply be that the bad data
>was posted to the table aroudn the same time as I was setting up the new
>sites.

I suspect that if we ever do the analysis, we will find the data is fine
and that the quantity of data is the source of the problem. All it takes
is one missing limit check.

If you have bad data or corrupted indexes, you would have had trouble
deleting the records.

Steven

--
----------------------------------------------------------------------
"Steven Levine" <ste...@earthlink.net> eCS/Warp/DIY etc.
www.scoug.com www.ecomstation.com
----------------------------------------------------------------------

Lewis G Rosenthal

unread,
Oct 29, 2009, 9:27:21 PM10/29/09
to apa...@googlegroups.com
Hi...

On 10/29/09 08:14 pm, Steven Levine thus wrote :


> In <4AEA0098...@2rosenthals.com>, on 10/29/09
> at 04:52 PM, Lewis G Rosenthal <lgros...@2rosenthals.com> said:
>
> Hi,
>
>
>> My server uptime is now 15 days and counting; no Apache hiccups.
>>
>
> This is more like it. :-)
>
>

Indeed. The box feels good, too, with no odd issues (such as inability
to open command prompts and other "out of memory"-like conditions).


>> Limiting access to a single
>> listening port in httpd.conf simply forestalled the inevitable,
>>
>
> Did you ever re-enable the second listen port?
>
>

Not yet. I'm going to wait until this weekend, I guess, and then add the
second port. I suspect that whoever was constantly hittong on SITR was
coming in on port 81 (perhaps just due to Google's caching of the
server), and funneling that through the firewall took some load off of
the PHP process, allowing us to go the few extra days as we did. Of
course, that's just a guess, but it sort of fits with what we do know,
at this point.


>> find common factors, it seemed that what was getting accessed regularly
>> was my knowledgebase, which currently runs on a rather rough PHP app,
>> SITR (Support Information TRacker) version 2.0.4
>> (http://sitr.sourceforge.net ).
>>
>
> We don't really know if it was php or the mysql module that was the true
> culprit. This will have to wait until you get the test server off the
> wishlist.
>
>

:-)


>> edge of the proverbial cliff, or it could simply be that the bad data
>> was posted to the table aroudn the same time as I was setting up the new
>> sites.
>>
>
> I suspect that if we ever do the analysis, we will find the data is fine
> and that the quantity of data is the source of the problem. All it takes
> is one missing limit check.
>
>

Indeed, true, and we know what the SITR code for 2.0.4 looks like.


> If you have bad data or corrupted indexes, you would have had trouble
> deleting the records.
>
>

My thoughts exactly. phpMyAdmin (running local to MySQL on the NetWare
6.5 box) had no trouble emptying the table or even dumping it every
night (I have a cron job dump all the tables to separate .sql files each
night, so that I can pick them up during the regular tape backup,
because for some reason, Novell never saw fit to create a backup module
- TSA - for MySQL when they did the initial port of it to NetWare).

Lewis G Rosenthal

unread,
Oct 29, 2009, 11:05:06 PM10/29/09
to apa...@googlegroups.com
Quick follow-up, with some surprising info...

On 10/29/09 09:27 pm, Lewis G Rosenthal thus wrote :


> Hi...
>
> On 10/29/09 08:14 pm, Steven Levine thus wrote :
>
>> In <4AEA0098...@2rosenthals.com>, on 10/29/09
>> at 04:52 PM, Lewis G Rosenthal <lgros...@2rosenthals.com> said:
>>
>>

<snip>


>>> Limiting access to a single
>>> listening port in httpd.conf simply forestalled the inevitable,
>>>
>>>
>> Did you ever re-enable the second listen port?
>>
>>
>>
> Not yet. I'm going to wait until this weekend, I guess, and then add the
> second port. I suspect that whoever was constantly hittong on SITR was
> coming in on port 81 (perhaps just due to Google's caching of the
> server), and funneling that through the firewall took some load off of
> the PHP process, allowing us to go the few extra days as we did. Of
> course, that's just a guess, but it sort of fits with what we do know,
> at this point.
>

Okay, I admit it: I do more at that server's console than I really
should, and more at one time than I really should, for a box tuned for
background performance and not desktop activity. I have been working to
prune some of my messages from my burgeoning Inbox folder by moving them
to a set of archive folders in my IMAP tree. Unfortunately, CGP can get
confused when working with thousands of messages at one time, and ends
up making multiple copies. Thus, a move of 8,000 messages can wind up
(if one is not careful) with a folder containing a couple hundred
thousand messages, consuming huge amounts of disk space. Cleaning up
afterward is no fun, either, as even the best of duplicate message
removers still has to utilize CGP's IMAP interface, sometimes leading to
unintended consequences (like the server getting so bogged that it
literally stops).

In my case tonight, I was cleaning out my rsync backup of my archived
messages folder (del *) while deleting dupes from the "live" archive
folder via IMAP (from another machine). The server decided to stop. Ugh.
So, when it came back up, I figured I'd might as well add that second
Apache listener, which I did.

The firewall is still handling incoming port 81 requests, however, from
a machine behind the firewall, I could access 81 directly. I started one
session with the server-status page up (port 80) and then went to 81 (by
appending ":81" to my existing url in the same browser tab). I then
switched back and forth a couple of times (back & forward buttons,
hitting refresh in Firefox), and sure enough, the daemon restarted
itself. A couple more times, and it did it again. I was able to repeat
this - on that single page - about four times. Uptime between restarts
was about 30 seconds. Mind you, there could have been *no* other traffic
on 81 except mine, as all incoming traffic from the other side of the
firewall would have been PATted to 80. I haven't touched the SITR
directory at all, either, so that didn't play a role in this.

Since turning off the second port and bouncing the daemon, it's been up
for 20 minutes.

How's that for weird? At least it's an experiment easily duplicated on
other systems.

Cheers/2

Steven Levine

unread,
Oct 30, 2009, 3:32:01 PM10/30/09
to apa...@googlegroups.com
In <4AEA57E2...@2rosenthals.com>, on 10/29/09
at 11:05 PM, Lewis G Rosenthal <lgros...@2rosenthals.com> said:

Hi,

>Quick follow-up, with some surprising info...

Hmmm.

>thousand messages, consuming huge amounts of disk space. Cleaning up
>afterward is no fun, either, as even the best of duplicate message
>removers still has to utilize CGP's IMAP interface, sometimes leading to
>unintended consequences (like the server getting so bogged that it
>literally stops).

Since you are using mdir format, can't you bypass IMAP and let it rebuild
the indexes later?

>The firewall is still handling incoming port 81 requests, however, from
>a machine behind the firewall, I could access 81 directly. I started one
>session with the server-status page up (port 80) and then went to 81 (by
>appending ":81" to my existing url in the same browser tab). I then
>switched back and forth a couple of times (back & forward buttons,
>hitting refresh in Firefox), and sure enough, the daemon restarted
>itself. A couple more times, and it did it again.

Hmmm/2. Works for me. I also tried with phpinfo.html and a couple of
other pages.

Do the access logs agree that the last page accessed before the exception
is what you think it is?

Lewis G Rosenthal

unread,
Nov 1, 2009, 1:17:56 PM11/1/09
to apa...@googlegroups.com
Sorry for the late follow-up...and for the length of this post (for
those who haven't been following this thread, there are a few log
excerpts you might want to just skip).

On 10/30/09 03:32 pm, Steven Levine thus wrote :
> In <4AEA57E2...@2rosenthals.com>, on 10/29/09
> at 11:05 PM, Lewis G Rosenthal <lgros...@2rosenthals.com> said:
>
> Hi,
>
>
>> Quick follow-up, with some surprising info...
>>
>
> Hmmm.
>
>
>> thousand messages, consuming huge amounts of disk space. Cleaning up
>> afterward is no fun, either, as even the best of duplicate message
>> removers still has to utilize CGP's IMAP interface, sometimes leading to
>> unintended consequences (like the server getting so bogged that it
>> literally stops).
>>
>
> Since you are using mdir format, can't you bypass IMAP and let it rebuild
> the indexes later?
>
>
True. I'd need to do some regex matching to find the dupes that way, but
in the long run, it may well be worth the effort. It seems that every
time I attempt one of these mass move operations, the law of unintended
consequences applies.
>> The firewall is still handling incoming port 81 requests, however, from
>> a machine behind the firewall, I could access 81 directly. I started one
>> session with the server-status page up (port 80) and then went to 81 (by
>> appending ":81" to my existing url in the same browser tab). I then
>> switched back and forth a couple of times (back & forward buttons,
>> hitting refresh in Firefox), and sure enough, the daemon restarted
>> itself. A couple more times, and it did it again.
>>
>
> Hmmm/2. Works for me. I also tried with phpinfo.html and a couple of
> other pages.
>
> Do the access logs agree that the last page accessed before the exception
> is what you think it is?
>
>
Well, first the error log:

[Thu Oct 29 22:21:47 2009] [notice] caught SIGTERM, shutting down
[Thu Oct 29 22:21:54 2009] [notice] Apache/2.2.13 (OS/2) PHP/5.2.11
configured -- resuming normal operations
[Thu Oct 29 22:25:54 2009] [error] (OS 10035)Resource temporarily
unavailable: apr_socket_accept
[Thu Oct 29 22:25:55 2009] [error] (70007)The timeout specified has
expired: apr_socket_accept
[Thu Oct 29 22:25:58 2009] [notice] caught SIGTERM, shutting down
[Thu Oct 29 22:26:04 2009] [notice] Apache/2.2.13 (OS/2) PHP/5.2.11
configured -- resuming normal operations
[Thu Oct 29 22:28:48 2009] [error] (OS 10035)Resource temporarily
unavailable: apr_socket_accept
[Thu Oct 29 22:28:49 2009] [error] (70007)The timeout specified has
expired: apr_socket_accept
[Thu Oct 29 22:28:49 2009] [notice] caught SIGTERM, shutting down
[Thu Oct 29 22:28:56 2009] [notice] Apache/2.2.13 (OS/2) PHP/5.2.11
configured -- resuming normal operations
[Thu Oct 29 22:41:14 2009] [error] (OS 10035)Resource temporarily
unavailable: apr_socket_accept
[Thu Oct 29 22:41:14 2009] [error] caught exception
(XCPT_ACCESS_VIOLATION) in worker thread, initiating child shutdown
pid=107
[Thu Oct 29 22:41:14 2009] [error] caught exception in worker
thread, initiating child shutdown pid=107
[Thu Oct 29 22:41:15 2009] [error] (70007)The timeout specified has
expired: apr_socket_accept
[Thu Oct 29 22:41:15 2009] [error] caught exception
(XCPT_ACCESS_VIOLATION) in worker thread, initiating child shutdown
pid=106
[Thu Oct 29 22:41:15 2009] [error] caught exception in worker
thread, initiating child shutdown pid=106
[Thu Oct 29 22:41:15 2009] [error] caught exception
(XCPT_ACCESS_VIOLATION) in worker thread, initiating child shutdown
pid=106
[Thu Oct 29 22:41:15 2009] [error] caught exception in worker
thread, initiating child shutdown pid=106
(OS 10048)Address already in use: make_sock: could not bind to
address 0.0.0.0:80
[Thu Oct 29 22:41:16 2009] [alert] no listening sockets available,
shutting down
[Thu Oct 29 22:41:21 2009] [notice] caught SIGTERM, shutting down
[Thu Oct 29 22:41:29 2009] [notice] Apache/2.2.13 (OS/2) PHP/5.2.11
configured -- resuming normal operations
[Thu Oct 29 22:42:40 2009] [error] (OS 10035)Resource temporarily
unavailable: apr_socket_accept
[Thu Oct 29 22:42:40 2009] [error] caught exception
(XCPT_ACCESS_VIOLATION) in worker thread, initiating child shutdown
pid=110
[Thu Oct 29 22:42:40 2009] [error] caught exception in worker
thread, initiating child shutdown pid=110
[Thu Oct 29 22:42:41 2009] [error] (70007)The timeout specified has
expired: apr_socket_accept
[Thu Oct 29 22:42:41 2009] [error] caught exception
(XCPT_ACCESS_VIOLATION) in worker thread, initiating child shutdown
pid=111
[Thu Oct 29 22:42:41 2009] [error] caught exception in worker
thread, initiating child shutdown pid=111
(OS 10048)Address already in use: make_sock: could not bind to
address 0.0.0.0:80
[Thu Oct 29 22:42:42 2009] [alert] no listening sockets available,
shutting down
[Thu Oct 29 22:42:42 2009] [notice] caught SIGTERM, shutting down
[Thu Oct 29 22:42:49 2009] [notice] Apache/2.2.13 (OS/2) PHP/5.2.11
configured -- resuming normal operations
[Thu Oct 29 22:43:53 2009] [error] (70007)The timeout specified has
expired: apr_socket_accept
[Thu Oct 29 22:43:53 2009] [error] caught exception
(XCPT_ACCESS_VIOLATION) in worker thread, initiating child shutdown
pid=114
[Thu Oct 29 22:43:53 2009] [error] caught exception in worker
thread, initiating child shutdown pid=114
[Thu Oct 29 22:44:12 2009] [notice] caught SIGTERM, shutting down
[Thu Oct 29 22:44:19 2009] [notice] Apache/2.2.13 (OS/2) PHP/5.2.11
configured -- resuming normal operations

Interesting to note are the references to port 80 not being available.
With both ports listening (and 81 listed first in httpd.conf), one would
think that we should see the same message for 81, or at least the 81 and
not the 80 (as that would be the first one not yet closed). Anyway, the
excerpt above starts with the first crash and ends with the final
restart, after dropping the second port.

Now, for the access log (again, SITR is currently unavailable, moved out
of the webspace; naturally this doesn't stop the bots from trying to
reach it, however). I've started this exceerpt well before the observed
crashes, only to give a taste for what was going on:

192.168.100.201 - - [29/Oct/2009:22:23:08 -0400] "GET
/sitr/KnowledgeBase/form_browse_documents.php?usr_application_ids=%7C78%7C45%7C79%7C22%7C91%7C49%7C90&usr_application_id=-91&function=1
HTTP/1.1" 404 242
192.168.100.201 - - [29/Oct/2009:22:28:07 -0400] "GET
/sitr/KnowledgeBase/form_browse_documents.php?usr_application_ids=%7C86%7C22%7C77&usr_application_id=31&function=1
HTTP/1.1" 404 242
192.168.100.201 - - [29/Oct/2009:22:30:38 -0400] "GET
/sitr/KnowledgeBase/form_browse_documents.php?usr_application_ids=%7C22%7C34&usr_application_id=21&function=1
HTTP/1.1" 404 242
192.168.100.201 - - [29/Oct/2009:22:33:09 -0400] "GET
/sitr/KnowledgeBase/form_browse_documents.php?usr_application_ids=%7C86%7C22%7C31&usr_application_id=77&function=1
HTTP/1.1" 404 242
192.168.100.201 - - [29/Oct/2009:22:35:41 -0400] "GET
/sitr/KnowledgeBase/form_browse_documents.php?usr_application_ids=%7C86%7C41%7C44&usr_application_id=20&function=1
HTTP/1.1" 404 242
192.168.100.201 - - [29/Oct/2009:22:38:13 -0400] "GET
/sitr/KnowledgeBase/form_browse_documents.php?usr_application_ids=%7C22%7C31&usr_application_id=77&function=1
HTTP/1.1" 404 242
192.168.100.201 - - [29/Oct/2009:22:40:23 -0400] "GET /server-status
HTTP/1.0" 401 401
192.168.100.201 - myusername [29/Oct/2009:22:40:30 -0400] "GET
/server-status HTTP/1.0" 200 8490
192.168.100.201 - - [29/Oct/2009:22:40:30 -0400] "GET /favicon.ico
HTTP/1.0" 404 209
192.168.100.201 - - [29/Oct/2009:22:40:33 -0400] "GET /favicon.ico
HTTP/1.0" 404 209
192.168.100.201 - - [29/Oct/2009:22:41:01 -0400] "GET /server-status
HTTP/1.1" 401 401
192.168.100.201 - myusername [29/Oct/2009:22:41:08 -0400] "GET
/server-status HTTP/1.1" 200 8768
192.168.100.201 - - [29/Oct/2009:22:41:08 -0400] "GET /favicon.ico
HTTP/1.1" 404 209
192.168.100.201 - - [29/Oct/2009:22:41:11 -0400] "GET /favicon.ico
HTTP/1.1" 404 209
192.168.100.201 - myusername [29/Oct/2009:22:41:37 -0400] "GET
/server-status HTTP/1.0" 200 4976
192.168.100.201 - myusername [29/Oct/2009:22:41:44 -0400] "GET
/server-status HTTP/1.0" 200 5015
192.168.100.201 - myusername [29/Oct/2009:22:42:06 -0400] "GET
/server-status HTTP/1.0" 200 5835
192.168.100.201 - myusername [29/Oct/2009:22:42:08 -0400] "GET
/server-status HTTP/1.0" 200 5826
192.168.100.201 - myusername [29/Oct/2009:22:42:13 -0400] "GET
/server-status HTTP/1.1" 200 6653
192.168.100.201 - myusername [29/Oct/2009:22:42:18 -0400] "GET
/server-status HTTP/1.0" 200 6877
192.168.100.201 - myusername [29/Oct/2009:22:42:22 -0400] "GET
/server-status HTTP/1.1" 200 6881
192.168.100.201 - myusername [29/Oct/2009:22:42:24 -0400] "GET
/server-status HTTP/1.1" 200 6884
192.168.100.201 - myusername [29/Oct/2009:22:42:25 -0400] "GET
/server-status HTTP/1.1" 200 6883
192.168.100.201 - myusername [29/Oct/2009:22:42:28 -0400] "GET
/server-status HTTP/1.1" 200 7135
192.168.100.201 - myusername [29/Oct/2009:22:42:29 -0400] "GET
/server-status HTTP/1.1" 200 7135
192.168.100.201 - myusername [29/Oct/2009:22:42:32 -0400] "GET
/server-status HTTP/1.0" 200 7377
192.168.100.201 - myusername [29/Oct/2009:22:42:34 -0400] "GET
/server-status HTTP/1.0" 200 7377
192.168.100.201 - myusername [29/Oct/2009:22:42:35 -0400] "GET
/server-status HTTP/1.0" 200 7377
192.168.100.201 - myusername [29/Oct/2009:22:42:36 -0400] "GET
/server-status HTTP/1.0" 200 7377
192.168.100.201 - myusername [29/Oct/2009:22:42:52 -0400] "GET
/server-status HTTP/1.1" 200 5256
192.168.100.201 - myusername [29/Oct/2009:22:42:56 -0400] "GET
/server-status HTTP/1.0" 200 7123
192.168.100.201 - - [29/Oct/2009:22:43:17 -0400] "GET
/sitr/KnowledgeBase/form_browse_documents.php?usr_application_ids=%7C78%7C79%7C22%7C91%7C83%7C-45&usr_application_id=-79&function=1
HTTP/1.1" 404 242
192.168.100.201 - myusername [29/Oct/2009:22:45:10 -0400] "GET
/server-status HTTP/1.1" 200 5797
192.168.100.201 - myusername [29/Oct/2009:22:45:12 -0400] "GET
/server-status HTTP/1.1" 200 5792
192.168.100.201 - myusername [29/Oct/2009:22:45:13 -0400] "GET
/server-status HTTP/1.1" 200 5794
192.168.100.201 - myusername [29/Oct/2009:22:45:14 -0400] "GET
/server-status HTTP/1.1" 200 5794
192.168.100.201 - myusername [29/Oct/2009:22:45:16 -0400] "GET
/server-status HTTP/1.0" 200 6052
192.168.100.201 - myusername [29/Oct/2009:22:45:22 -0400] "GET
/server-status HTTP/1.0" 200 6053
192.168.100.201 - myusername [29/Oct/2009:22:45:37 -0400] "GET
/server-status HTTP/1.1" 200 6303
192.168.100.201 - myusername [29/Oct/2009:22:45:47 -0400] "GET
/server-status HTTP/1.0" 200 6554
192.168.100.201 - - [29/Oct/2009:22:45:49 -0400] "GET
/sitr/KnowledgeBase/form_browse_documents.php?usr_application_ids=%7C-78%7C21&usr_application_id=41&function=1
HTTP/1.1" 404 242
192.168.100.201 - - [29/Oct/2009:22:48:13 -0400] "GET
/sitr/KnowledgeBase/form_browse_documents.php?usr_application_ids=%7C78%7C45%7C79%7C22%7C91%7C34%7C56&usr_application_id=41&function=1
HTTP/1.1" 404 242
192.168.100.201 - - [29/Oct/2009:22:53:10 -0400] "GET
/sitr/KnowledgeBase/form_browse_documents.php?usr_application_ids=%7C78%7C41%7C79&usr_application_id=32&function=1
HTTP/1.1" 404 242
192.168.100.201 - - [29/Oct/2009:22:56:17 -0400] "GET
/sitr/KnowledgeBase/form_browse_documents.php?usr_application_ids=%7C56%7C20%7C94&usr_application_id=-94&function=1
HTTP/1.1" 404 242
192.168.100.201 - - [29/Oct/2009:22:59:34 -0400] "GET
/sitr/KnowledgeBase/form_browse_documents.php?usr_application_ids=%7C78%7C45%7C79%7C22%7C91%7C56%7C41&usr_application_id=-45&function=1
HTTP/1.1" 404 242
192.168.100.201 - myusername [29/Oct/2009:22:59:56 -0400] "GET
/server-status?refresh=10 HTTP/1.0" 200 8127
192.168.100.201 - myusername [29/Oct/2009:23:00:07 -0400] "GET
/server-status?refresh=10 HTTP/1.0" 200 9789
192.168.100.201 - myusername [29/Oct/2009:23:00:18 -0400] "GET
/server-status?refresh=10 HTTP/1.0" 200 9738
192.168.100.201 - myusername [29/Oct/2009:23:00:28 -0400] "GET
/server-status?refresh=10 HTTP/1.0" 200 9735
192.168.100.201 - myusername [29/Oct/2009:23:00:38 -0400] "GET
/server-status?refresh=10 HTTP/1.0" 200 9729
192.168.100.201 - myusername [29/Oct/2009:23:00:48 -0400] "GET
/server-status?refresh=10 HTTP/1.0" 200 9731

PMMail was being accessed:

192.168.100.201 - - [29/Oct/2009:22:31:33 -0400] "GET
/index.php?title=Release_Notes HTTP/1.1" 304 -
192.168.100.201 - - [29/Oct/2009:22:41:05 -0400] "GET
/index.php?title=Special:RecentChanges HTTP/1.0" 200 2699
192.168.100.201 - - [29/Oct/2009:22:41:09 -0400] "GET
/skins/common/shared.css?207 HTTP/1.0" 200 6690
192.168.100.201 - - [29/Oct/2009:22:41:11 -0400] "GET
/skins/common/commonPrint.css?207 HTTP/1.0" 200 4852
192.168.100.201 - - [29/Oct/2009:22:41:12 -0400] "GET
/skins/monobook/main.css?207 HTTP/1.0" 200 27459
192.168.100.201 - - [29/Oct/2009:22:41:14 -0400] "GET
/index.php?title=MediaWiki:Common.css&usemsgcache=yes&ctype=text%2Fcss&smaxage=18000&action=raw&maxage=18000
HTTP/1.0" 200 2920
192.168.100.201 - - [29/Oct/2009:22:42:05 -0400] "GET
/index.php?title=MediaWiki:Print.css&usemsgcache=yes&ctype=text%2Fcss&smaxage=18000&action=raw&maxage=18000
HTTP/1.0" 200 2918
192.168.100.201 - - [29/Oct/2009:22:42:09 -0400] "GET
/index.php?title=MediaWiki:Monobook.css&usemsgcache=yes&ctype=text%2Fcss&smaxage=18000&action=raw&maxage=18000
HTTP/1.0" 200 85
192.168.100.201 - - [29/Oct/2009:22:42:11 -0400] "GET
/index.php?title=-&action=raw&maxage=18000&gen=css HTTP/1.0" 200 98
192.168.100.201 - - [29/Oct/2009:22:46:58 -0400] "GET
/index.php?title=Talk:PMMail_for_OS/2&action=edit&redlink=1
HTTP/1.0" 200 2848
192.168.100.201 - - [29/Oct/2009:22:48:34 -0400] "GET
/index.php?title=DevTeam:Home HTTP/1.0" 200 3024
192.168.100.201 - - [29/Oct/2009:22:49:17 -0400] "GET
/index.php?title=VOICE:Home HTTP/1.0" 200 3028
192.168.100.201 - - [29/Oct/2009:23:00:27 -0400] "GET
/index.php?title=WishList HTTP/1.0" 200 2992

and OS/2 Voice:

192.168.100.201 - - [29/Oct/2009:22:41:03 -0400] "GET / HTTP/1.1"
200 13522
192.168.100.201 - - [29/Oct/2009:22:41:04 -0400] "GET /about.html
HTTP/1.1" 200 11644
192.168.100.201 - - [29/Oct/2009:22:41:05 -0400] "GET /contact.html
HTTP/1.1" 200 6856
192.168.100.201 - - [29/Oct/2009:22:41:06 -0400] "GET /livehelp.html
HTTP/1.1" 200 7595
192.168.100.201 - - [29/Oct/2009:22:41:07 -0400] "GET /about.html
HTTP/1.1" 200 11644
192.168.100.201 - - [29/Oct/2009:22:41:08 -0400] "GET /contact.html
HTTP/1.1" 200 6856
192.168.100.201 - - [29/Oct/2009:22:41:09 -0400] "GET /livehelp.html
HTTP/1.1" 200 7595
192.168.100.201 - - [29/Oct/2009:22:41:10 -0400] "GET /style.css
HTTP/1.1" 200 4611
192.168.100.201 - - [29/Oct/2009:22:41:11 -0400] "GET /menust.css
HTTP/1.1" 200 3702
192.168.100.201 - - [29/Oct/2009:22:41:12 -0400] "GET
/VNL/past_issues/VNL0206H/img/content-14.png HTTP/1.1" 200 144
192.168.100.201 - - [29/Oct/2009:22:41:12 -0400] "GET
/VNL/past_issues/VNL0206H/img/down-14.png HTTP/1.1" 200 228
192.168.100.201 - - [29/Oct/2009:22:41:13 -0400] "GET
/VNL/past_issues/VNL0206H/img/flag-14.png HTTP/1.1" 200 128
192.168.100.201 - - [29/Oct/2009:22:41:13 -0400] "GET
/VNL/past_issues/VNL0206H/img/legend_out.png HTTP/1.1" 200 247
192.168.100.201 - - [29/Oct/2009:22:41:13 -0400] "GET
/VNL/past_issues/VNL0206H/img/next-14.png HTTP/1.1" 200 654
192.168.100.201 - - [29/Oct/2009:22:41:13 -0400] "GET
/VNL/past_issues/VNL0206H/img/previous-14.png HTTP/1.1" 200 649
192.168.100.201 - - [29/Oct/2009:22:41:13 -0400] "GET
/VNL/past_issues/VNL0206H/img/shim.gif HTTP/1.1" 404 235
192.168.100.201 - - [29/Oct/2009:22:41:13 -0400] "GET
/VNL/past_issues/VNL0206H/img/shim.gif HTTP/1.1" 404 235
192.168.100.201 - - [29/Oct/2009:22:41:13 -0400] "GET
/VNL/past_issues/VNL0206H/img/shim.gif HTTP/1.1" 404 235
192.168.100.201 - - [29/Oct/2009:22:41:13 -0400] "GET
/VNL/past_issues/VNL0206H/img/shim.gif HTTP/1.1" 404 235
192.168.100.201 - - [29/Oct/2009:22:41:13 -0400] "GET
/VNL/past_issues/VNL0206H/img/up.gif HTTP/1.1" 200 88
192.168.100.201 - - [29/Oct/2009:22:41:14 -0400] "GET
/VNL/past_issues/VNL0206H/img/voicebg-14.png HTTP/1.1" 200 149569
192.168.100.201 - - [29/Oct/2009:22:42:40 -0400] "GET
/warpcast/2000-10/39E05B7C.htm HTTP/1.1" 200 3474
192.168.100.201 - - [29/Oct/2009:22:42:52 -0400] "GET
/VNL/past_issues/VNL1006H/editorial.html HTTP/1.0" 200 10566
192.168.100.201 - - [29/Oct/2009:22:42:52 -0400] "GET
/VNL/past_issues/VNL1006H/css/scaffold.css HTTP/1.0" 200 7553
192.168.100.201 - - [29/Oct/2009:22:42:54 -0400] "GET
/VNL/past_issues/VNL1006H/css/body.css HTTP/1.0" 200 13297
192.168.100.201 - - [29/Oct/2009:22:42:54 -0400] "GET
/VNL/past_issues/VNL1006H/css/print.css HTTP/1.0" 200 656
192.168.100.201 - - [29/Oct/2009:22:42:55 -0400] "GET
/VNL/past_issues/VNL1006H/css/small.css HTTP/1.0" 200 111
192.168.100.201 - - [29/Oct/2009:22:42:55 -0400] "GET
/VNL/past_issues/VNL1004H/vnewsf.htm HTTP/1.0" 200 3400
192.168.100.201 - - [29/Oct/2009:22:42:56 -0400] "GET
/VNL/past_issues/VNL1006H/css/medium.css HTTP/1.0" 200 1342
192.168.100.201 - - [29/Oct/2009:22:42:57 -0400] "GET
/VNL/past_issues/VNL1006H/css/large.css HTTP/1.0" 200 1303
192.168.100.201 - - [29/Oct/2009:22:43:07 -0400] "GET
/VNL/past_issues/VNL1000H/start.htm HTTP/1.0" 200 2360
192.168.100.201 - - [29/Oct/2009:22:43:09 -0400] "GET
/VNL/past_issues/VNL0708H/feature_7.html HTTP/1.0" 404 237
192.168.100.201 - - [29/Oct/2009:22:44:46 -0400] "GET
/vnewsarc/bn2008121.html HTTP/1.0" 200 42674
192.168.100.201 - - [29/Oct/2009:22:45:01 -0400] "GET
/vnewsarc/bn2007031.html HTTP/1.0" 200 25261
192.168.100.201 - - [29/Oct/2009:22:45:03 -0400] "GET
/VNL/past_issues/VNL0599H/vnewsf1.htm HTTP/1.0" 200 10051
192.168.100.201 - - [29/Oct/2009:22:45:27 -0400] "GET
/warpcast/1998-12/3687BB1A.htm HTTP/1.1" 200 1555
192.168.100.201 - - [29/Oct/2009:22:45:34 -0400] "GET
/warpcast/1998-08/35D81804.htm HTTP/1.0" 200 1424
192.168.100.201 - - [29/Oct/2009:22:45:50 -0400] "GET
/logs/SpeakUp_051898.LOG.html HTTP/1.0" 200 54410
192.168.100.201 - - [29/Oct/2009:22:45:56 -0400] "GET /robots.txt
HTTP/1.1" 200 128
192.168.100.201 - - [29/Oct/2009:22:47:31 -0400] "GET
/VNL/past_issues/VNL1103H/vnewso.htm HTTP/1.0" 200 25486
192.168.100.201 - - [29/Oct/2009:22:48:00 -0400] "GET /rssfeed.xml
HTTP/1.1" 200 26755
192.168.100.201 - - [29/Oct/2009:22:48:15 -0400] "GET / HTTP/1.1"
200 13522
192.168.100.201 - - [29/Oct/2009:22:48:15 -0400] "GET
/VNL/past_issues/VNL0500H/www.serenity-systems.com HTTP/1.1" 404 247
192.168.100.201 - - [29/Oct/2009:22:48:29 -0400] "GET
/VNL/past_issues/VNL0206H/feature_3.html HTTP/1.0" 200 14628
192.168.100.201 - - [29/Oct/2009:22:49:05 -0400] "GET / HTTP/1.1"
200 13522
192.168.100.201 - - [29/Oct/2009:22:49:07 -0400] "GET
/VNL/past_issues/VNL0402H/vnewsf8.htm HTTP/1.0" 200 5905
192.168.100.201 - - [29/Oct/2009:22:49:22 -0400] "GET /robots.txt
HTTP/1.1" 200 128
192.168.100.201 - - [29/Oct/2009:22:49:27 -0400] "GET
/codeofethics.html HTTP/1.1" 200 6055
192.168.100.201 - - [29/Oct/2009:22:50:31 -0400] "GET
/VNL/past_issues/VNL0405H/christian...@os2voice.org HTTP/1.0"
404 254
192.168.100.201 - - [29/Oct/2009:22:51:03 -0400] "GET
/warpcast/1998-09/3603D67D.htm HTTP/1.1" 200 1301
192.168.100.201 - - [29/Oct/2009:22:51:04 -0400] "GET /robots.txt
HTTP/1.1" 200 128
192.168.100.201 - - [29/Oct/2009:22:51:04 -0400] "GET / HTTP/1.1"
200 13522
192.168.100.201 - - [29/Oct/2009:22:51:18 -0400] "GET /index.php
HTTP/1.0" 404 207
192.168.100.201 - - [29/Oct/2009:22:51:53 -0400] "GET
/VNL/past_issues/VNL0999H/start.htm HTTP/1.0" 200 2469
192.168.100.201 - - [29/Oct/2009:22:52:01 -0400] "GET
/VNL/past_issues/VNL0699H/vnewsft.htm HTTP/1.0" 200 16433
192.168.100.201 - - [29/Oct/2009:22:52:53 -0400] "GET
/VNL/past_issues/VNL0708H/feature_1.html HTTP/1.0" 200 27087
192.168.100.201 - - [29/Oct/2009:22:53:14 -0400] "GET
/VNL/past_issues/VNL1103H/start.htm HTTP/1.0" 200 3031
192.168.100.201 - - [29/Oct/2009:22:53:25 -0400] "GET
/VNL/past_issues/VNL0206H/start.html HTTP/1.0" 200 10071
192.168.100.201 - - [29/Oct/2009:22:53:50 -0400] "GET
/warpcast/1999-01/3697FBF9.htm HTTP/1.1" 200 1828
192.168.100.201 - - [29/Oct/2009:22:54:19 -0400] "GET
/vnewsarc/bn2006091.html HTTP/1.0" 200 26168
192.168.100.201 - - [29/Oct/2009:22:54:32 -0400] "GET
/vnewsarc/bn2008091.html HTTP/1.0" 200 30203
192.168.100.201 - - [29/Oct/2009:22:54:51 -0400] "GET
/warpcast/1999-08/index.htm HTTP/1.0" 200 14852
192.168.100.201 - - [29/Oct/2009:22:55:33 -0400] "GET / HTTP/1.1"
200 13522
192.168.100.201 - - [29/Oct/2009:22:55:48 -0400] "GET
/vnewsarc/bn2006052.html HTTP/1.0" 200 33672
192.168.100.201 - - [29/Oct/2009:22:56:00 -0400] "GET
/VNL/past_issues/VNL0708H/vacpp_02.gif HTTP/1.1" 200 10593
192.168.100.201 - - [29/Oct/2009:22:56:39 -0400] "GET
/warpcast/1998-12/36742CB6.htm HTTP/1.1" 200 1270
192.168.100.201 - - [29/Oct/2009:22:57:40 -0400] "GET
/VNL/past_issues/VNL0606H/membership.html HTTP/1.0" 200 6774
192.168.100.201 - - [29/Oct/2009:22:58:29 -0400] "GET
/searchnl.html?words=help&intMaxHits=20&paging=1%2B0&do=now
HTTP/1.1" 200 6473
192.168.100.201 - - [29/Oct/2009:22:58:47 -0400] "GET
/VNL/past_issues/VNL0405H/vnewsfl.htm HTTP/1.0" 200 11999
192.168.100.201 - - [29/Oct/2009:22:59:26 -0400] "GET
/VNL/past_issues/VNL1103H/vnewsnu.htm HTTP/1.1" 200 9137

...but I see nothing here out of the ordinary, aside from my rapidly
switching from 80 to 81 while watching the server status page.

Thanks.

Steven Levine

unread,
Nov 1, 2009, 9:24:51 PM11/1/09
to apa...@googlegroups.com
In <4AEDD0D4...@2rosenthals.com>, on 11/01/09
at 01:17 PM, Lewis G Rosenthal <lgros...@2rosenthals.com> said:

Hi,

>Well, first the error log:

FWIW, these get wrapped on send because you are not putting them after the
tear bar.

> [Thu Oct 29 22:21:47 2009] [notice] caught SIGTERM, shutting down
> [Thu Oct 29 22:21:54 2009] [notice] Apache/2.2.13 (OS/2) PHP/5.2.11
> configured -- resuming normal operations
> [Thu Oct 29 22:25:54 2009] [error] (OS 10035)Resource temporarily
> unavailable: apr_socket_accept
> [Thu Oct 29 22:25:55 2009] [error] (70007)The timeout specified has
> expired: apr_socket_accept
> [Thu Oct 29 22:25:58 2009] [notice] caught SIGTERM, shutting down

You might want to extract more from access_log. The access logs you
posted do not seem to show the request(s) that triggered the first
SIGTERM. Also, I want to believe that there were other messages before
the 22:21:47 SIGTERM. There was the initial startup message, at the
least.

>Interesting to note are the references to port 80 not being available.

This might just be a timing issue. It takes some time for the sockets to
die in the stack. I may be that port 81 is not tried if the bind to 80
failed.

>With both ports listening (and 81 listed first in httpd.conf), one would
>think that we should see the same message for 81, or at least the 81 and
>not the 80 (as that would be the first one not yet closed).

Hard to say. You might want to add a netstat -s to your wrapper and
append the results to a file.

>Now, for the access log (again, SITR is currently unavailable, moved out
>of the webspace; naturally this doesn't stop the bots from trying to
>reach it, however). I've started this exceerpt well before the observed
>crashes, only to give a taste for what was going on:

I think you need to show me access items from earlier in time. The first
error is

[Thu Oct 29 22:21:47 2009] [notice] caught SIGTERM, shutting down

The first access is

192.168.100.201 - - [29/Oct/2009:22:23:08 -0400] "GET
/sitr/KnowledgeBase/form_browse_documents.php?usr_application_ids=%7C78%7C45%7C79%7C22%7C91%7C49%7C90&usr_application_id=-91&function=1
HTTP/1.1" 404 242

which is over a minute after the error was logged.
Reply all
Reply to author
Forward
0 new messages