[AOLSERVER] "Signal 11" and "alloc: invalid block" errors in AOLserver 4.5.1


Wolfgang Winkler

Apr 26, 2012, 9:10:07 AM
to aolserv...@lists.sourceforge.net
Hi!

We have various high-volume servers with AOLserver installations. Most
of them run stably, but all of our latest installs have stability
problems, especially on startup.

I've recompiled all sources on a clean (tcl free) installation with:

* tcl 8.5.11 (and 8.5.9 on another machine)
* postgres 9.0.1 (and 9.1.3 on another machine)
* tls 1.6
* thread 2.6.6
* aolserver 4.5.1 (src package and latest github version)

I've checked all loaded libraries with strace and there is only one tcl
version on each of the boxes.

When I start nsd, I get the following backtrace in about 3 out of 10
starts with the github version; the package version just reports
"received fatal signal 11" or "alloc: invalid block":


[18/Apr/2012:10:48:28][25051.1675499280][-nssock:driver-] Notice:
nssock: listening on 91.118.87.98:8000
*** glibc detected *** /usr/local/aolserver/bin/nsd: double free or
corruption (fasttop): 0x0000000001203760 ***
======= Backtrace: =========
/lib/libc.so.6(+0x71ad6)[0x7fa79ad69ad6]
/lib/libc.so.6(cfree+0x6c)[0x7fa79ad6e84c]
/usr/lib/libcrypto.so(CRYPTO_free+0x1d)[0x7fa793d66aad]
/usr/lib/libcrypto.so(OBJ_NAME_add+0x92)[0x7fa793d69952]
/usr/lib/libssl.so.0.9.8(SSL_library_init+0x11)[0x7fa793430e71]
/usr/local/lib/tls1.6/libtls1.6.so(Tls_Init+0x83)[0x7fa7936463d3]
/usr/local/lib/libtcl8.5.so(+0xa2f40)[0x7fa79b9b3f40]
/usr/local/lib/libtcl8.5.so(+0x32411)[0x7fa79b943411]
/usr/local/lib/libtcl8.5.so(+0x32bc9)[0x7fa79b943bc9]
/usr/local/lib/libtcl8.5.so(Tcl_EvalEx+0x16)[0x7fa79b9442f6]
/usr/local/lib/libtcl8.5.so(TclEvalObjEx+0x41f)[0x7fa79b9449ef]
/usr/local/lib/libtcl8.5.so(+0x3de6b)[0x7fa79b94ee6b]
/usr/local/lib/libtcl8.5.so(+0x32411)[0x7fa79b943411]
/usr/local/lib/libtcl8.5.so(+0x78b74)[0x7fa79b989b74]
/usr/local/lib/libtcl8.5.so(TclObjInterpProcCore+0x10b)[0x7fa79b9cb69b]
/usr/local/lib/libtcl8.5.so(+0x32411)[0x7fa79b943411]
/usr/local/lib/libtcl8.5.so(+0x32bc9)[0x7fa79b943bc9]
/usr/local/lib/libtcl8.5.so(Tcl_EvalEx+0x16)[0x7fa79b9442f6]
/usr/local/lib/libtcl8.5.so(TclEvalObjEx+0x41f)[0x7fa79b9449ef]
/usr/local/lib/libtcl8.5.so(+0x3de6b)[0x7fa79b94ee6b]
/usr/local/lib/libtcl8.5.so(+0x32411)[0x7fa79b943411]
/usr/local/lib/libtcl8.5.so(+0x78b74)[0x7fa79b989b74]
/usr/local/lib/libtcl8.5.so(+0x80a51)[0x7fa79b991a51]
/usr/local/lib/libtcl8.5.so(TclEvalObjEx+0x85)[0x7fa79b944655]
/usr/local/lib/libtcl8.5.so(+0x443ea)[0x7fa79b9553ea]
/usr/local/lib/libtcl8.5.so(+0x32411)[0x7fa79b943411]
/usr/local/lib/libtcl8.5.so(+0x32bc9)[0x7fa79b943bc9]
/usr/local/lib/libtcl8.5.so(Tcl_EvalEx+0x16)[0x7fa79b9442f6]
/usr/local/aolserver_4.5.1/lib/libnsd.so(NsTclICtlObjCmd+0x5eb)[0x7fa79be796db]
/usr/local/lib/libtcl8.5.so(+0x32411)[0x7fa79b943411]
/usr/local/lib/libtcl8.5.so(+0x32bc9)[0x7fa79b943bc9]
/usr/local/lib/libtcl8.5.so(Tcl_EvalEx+0x16)[0x7fa79b9442f6]
/usr/local/aolserver_4.5.1/lib/libnsd.so(+0x44eef)[0x7fa79be78eef]
/usr/local/aolserver_4.5.1/lib/libnsd.so(+0x450d0)[0x7fa79be790d0]
/usr/local/aolserver_4.5.1/lib/libnsd.so(+0x45ea1)[0x7fa79be79ea1]
/usr/local/aolserver_4.5.1/lib/libnsd.so(Ns_TclAllocateInterp+0x9)[0x7fa79be79f79]
/usr/local/aolserver_4.5.1/lib/libnsd.so(Ns_TclEval+0x2c)[0x7fa79be7a0bc]
/usr/local/aolserver_4.5.1/lib/libnsd.so(+0x3bb91)[0x7fa79be6fb91]
/usr/local/aolserver_4.5.1/lib/libnsthread.so(+0x6704)[0x7fa79bc31704]
/lib/libpthread.so.0(+0x68ba)[0x7fa79b4f78ba]
/lib/libc.so.6(clone+0x6d)[0x7fa79adc702d]
======= Memory map: ========
00400000-00401000 r-xp 00000000 08:01 2261630
/usr/local/aolserver_4.5.1/bin/nsd
00600000-00601000 rw-p 00000000 08:01 2261630
/usr/local/aolserver_4.5.1/bin/nsd
0115f000-030d6000 rw-p 00000000 00:00 0 [heap]
7fa761fe0000-7fa761fe1000 ---p 00000000 00:00 0
7fa761fe1000-7fa763de2000 rw-p 00000000 00:00 0
7fa763de2000-7fa763de3000 ---p 00000000 00:00 0
7fa763de3000-7fa765be4000 rw-p 00000000 00:00 0
7fa765be4000-7fa765be5000 ---p 00000000 00:00 0
7fa765be5000-7fa7679e6000 rw-p 00000000 00:00 0
7fa7679e6000-7fa7679e7000 ---p 00000000 00:00 0
7fa7679e7000-7fa7697e8000 rw-p 00000000 00:00 0
7fa7697e8000-7fa7697e9000 ---p 00000000 00:00 0
7fa7697e9000-7fa76b5ea000 rw-p 00000000 00:00 0
7fa76b5ea000-7fa76b5eb000 ---p 00000000 00:00 0
7fa76b5eb000-7fa76d3ec000 rw-p 00000000 00:00 0
7fa76d3ec000-7fa76d3ed000 ---p 00000000 00:00 0
7fa76d3ed000-7fa76f1ee000 rw-p 00000000 00:00 0
7fa76f1ee000-7fa76f1ef000 ---p 00000000 00:00 0
7fa76f1ef000-7fa770ff0000 rw-p 00000000 00:00 0
7fa770ff0000-7fa770ff1000 ---p 00000000 00:00 0
7fa770ff1000-7fa772df2000 rw-p 00000000 00:00 0
7fa772df2000-7fa772df3000 ---p 00000000 00:00 0
7fa772df3000-7fa774bf4000 rw-p 00000000 00:00 0
7fa774bf4000-7fa774bf5000 ---p 00000000 00:00 0
7fa774bf5000-7fa7769f6000 rw-p 00000000 00:00 0
7fa7769f6000-7fa7769f7000 ---p 00000000 00:00 0
7fa7769f7000-7fa7787f8000 rw-p 00000000 00:00 0
7fa7787f8000-7fa7787f9000 ---p 00000000 00:00 0
7fa7787f9000-7fa77a5fa000 rw-p 00000000 00:00 0
7fa77a5fa000-7fa77a5fb000 ---p 00000000 00:00 0
7fa77a5fb000-7fa77c3fc000 rw-p 00000000 00:00 0
7fa77c3fc000-7fa77c3fd000 ---p 00000000 00:00 0
7fa77c3fd000-7fa77e1fe000 rw-p 00000000 00:00 0
7fa77e1fe000-7fa77e1ff000 ---p 00000000 00:00 0
7fa77e1ff000-7fa780000000 rw-p 00000000 00:00 0
7fa780000000-7fa780021000 rw-p 00000000 00:00 0
7fa780021000-7fa784000000 ---p 00000000 00:00 0
7fa785634000-7fa785635000 ---p 00000000 00:00 0
7fa785635000-7fa787436000 rw-p 00000000 00:00 0
7fa787436000-7fa787437000 ---p 00000000 00:00 0
7fa787437000-7fa789238000 rw-p 00000000 00:00 0
7fa789238000-7fa789239000 ---p 00000000 00:00 0
7fa789239000-7fa78b03a000 rw-p 00000000 00:00 0
7fa78b03a000-7fa78b03b000 ---p 00000000 00:00 0
7fa78b03b000-7fa78ce3c000 rw-p 00000000 00:00 0
7fa78ce3c000-7fa78ce3d000 ---p 00000000 00:00 0
7fa78ce3d000-7fa78ec3e000 rw-p 00000000 00:00 0
7fa78ec3e000-7fa78ec3f000 ---p 00000000 00:00 0
7fa78ec3f000-7fa790a40000 rw-p 00000000 00:00 0
7fa790a40000-7fa790a42000 r-xp 00000000 08:01 1524953 /lib/libutil-2.11.2.so
7fa790a42000-7fa790c41000 ---p 00002000 08:01 1524953 /lib/libutil-2.11.2.so
7fa790c41000-7fa790c42000 r--p 00001000 08:01 1524953 /lib/libutil-2.11.2.so
7fa790c42000-7fa790c43000 rw-p 00002000 08:01 1524953 /lib/libutil-2.11.2.so
7fa790c43000-7fa790d7a000 r-xp 00000000 08:01 2050113
/usr/lib/libpython2.5.so.1.0
7fa790d7a000-7fa790f79000 ---p 00137000 08:01 2050113
/usr/lib/libpython2.5.so.1.0
7fa790f79000-7fa790fac000 rw-p 00136000 08:01 2050113
/usr/lib/libpython2.5.so.1.0
7fa790fac000-7fa790fb4000 rw-p 00000000 00:00 0
7fa790fb4000-7fa790fb7000 r-xp 00000000 08:01 2130882
/usr/local/lib/tclpython/tclpython.so.4.1
7fa790fb7000-7fa7911b6000 ---p 00003000 08:01 2130882
/usr/local/lib/tclpython/tclpython.so.4.1
7fa7911b6000-7fa7911b7000 rw-p 00002000 08:01 2130882
/usr/local/lib/tclpython/tclpython.so.4.1
7fa7911b7000-7fa7911e8000 r-xp 00000000 08:01 2130186
/usr/local/lib/libGeoIP.so.1.4.6
7fa7911e8000-7fa7913e8000 ---p 00031000 08:01 2130186
/usr/local/lib/libGeoIP.so.1.4.6
7fa7913e8000-7fa7913e9000 rw-p 00031000 08:01 2130186
/usr/local/lib/libGeoIP.so.1.4.6
7fa7913e9000-7fa7913ec000 r-xp 00000000 08:01 2089151
/usr/lib/tclgeoip0.2/libtclgeoip0.2.so
7fa7913ec000-7fa7915eb000 ---p 00003000 08:01 2089151
/usr/lib/tclgeoip0.2/libtclgeoip0.2.so
7fa7915eb000-7fa7915ec000 rw-p 00002000 08:01 2089151
/usr/lib/tclgeoip0.2/libtclgeoip0.2.so
7fa7915ec000-7fa7915ed000 ---p 00000000 00:00 0
7fa7915ed000-7fa7933ee000 rw-p 00000000 00:00 0
7fa7933ee000-7fa79343d000 r-xp 00000000 08:01 2050453
/usr/lib/libssl.so.0.9.8
7fa79343d000-7fa79363c000 ---p 0004f000 08:01 2050453
/usr/lib/libssl.so.0.9.8
7fa79363c000-7fa793643000 rw-p 0004e000 08:01 2050453
/usr/lib/libssl.so.0.9.8
Aborted


These are the last lines of the strace output:

getegid() = 1001
getgid() = 1001
geteuid() = 1001
getuid() = 1001
write(2, "[18/Apr/2012:10:18:25][21888.112"...,
120[18/Apr/2012:10:18:25][21888.112105216][-main-] Notice: nsmain:
security info: uid=1001, euid=1001, gid=1001, egid=1001
) = 120
futex(0x17ad2a4, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x1782460, 6) = 3
futex(0x1782460, FUTEX_WAKE_PRIVATE, 1) = 1
[18/Apr/2012:10:18:25][21888.9258768][-sched-] Notice: sched: starting
futex(0x1777bc0, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource
temporarily unavailable)
write(2, "[18/Apr/2012:10:18:25][21888.112"...,
81[18/Apr/2012:10:18:25][21888.112105216][-main-] Notice: driver:
starting: nssock
) = 81
futex(0x1777bc0, FUTEX_WAKE_PRIVATE, 1) = 0
mmap(NULL, 31465472, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f90cc5da000
mprotect(0x7f90cc5da000, 4096, PROT_NONE) = 0
clone(child_stack=0x7f90ce3daff0,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x7f90ce3db9e0, tls=0x7f90ce3db710,
child_tidptr=0x7f90ce3db9e0) = 21916
futex(0x1bd5f04, FUTEX_WAIT_PRIVATE, 1,
NULL[18/Apr/2012:10:18:25][21888.18446744072874735376][-nssock:driver-]
Notice: nssock: listening on 91.118.87.98:8000
) = 0
futex(0x17b2f00, FUTEX_WAKE_PRIVATE,
1[18/Apr/2012:10:18:25][21888.64059152][-socks-] Notice: socks: starting
) = 0
rt_sigtimedwait([HUP INT TERM], NULL, NULL,
8[18/Apr/2012:10:18:25][21888.9258768][-sched-] Fatal: received fatal
signal 11
[18/Apr/2012:10:18:25][21888.18446744073663354640][-thread-46196976-]
Fatal: received fatal signal 11
<unfinished ...>
+++ killed by SIGABRT +++

According to the backtrace, there is a problem with the tls package. I'd
be very grateful if anybody could give me a hint.

Thanks,

Wolfgang Winkler

--
digital concepts OG
Software & Design
Landstrasse 68 / 5. Stock
A - 4020 Linz

Büro: +43 732 99711772
Mobil: +43 699 19971172

_______________________________________________
aolserver-talk mailing list
aolserv...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/aolserver-talk

Victor Guerra

Apr 26, 2012, 9:44:17 AM
to Wolfgang Winkler, aolserv...@lists.sourceforge.net
Dear Wolfgang, 

Take a look at this aolserver conversation here: 

I guess the patch described there would 
--
-vg

Victor Guerra

Apr 26, 2012, 9:46:17 AM
to Wolfgang Winkler, aolserv...@lists.sourceforge.net
sorry.. prematurely sent the email... 

as I was saying... the patch described there would be useful for you if tls is causing the crashes. It may be that the tls code you have is not thread-safe.

Best,
--
-vg

Fenton, Brian

Nov 26, 2012, 10:46:34 AM
to aolserv...@lists.sourceforge.net
Hello

We've just had some reports of this error re-appearing on some systems. Was there ever a solution found?

These PROPFIND/OPTIONS methods seem to be something to do with WebDav, which as far as I'm aware, we're not using. Any idea what would cause them to appear in the logs?

Thanks
Brian


-----Original Message-----
From: Tom Jackson
Sent: 14 April 2009 14:10
To: aolserv...@lists.sourceforge.net
Subject: [AOLSERVER] Tracked down bug with PROPFIND / OPTIONS methods

Over the last few years some users have noticed that their servers
suddenly stop responding, and the error log has entries similar to this:

[-conn:965-] Error: return: failed to redirect
'PROPFIND /global/file-not-found.tcl': exceeded recursion limit of 3
[-conn:965-] Error: return: failed to redirect
'PROPFIND /global/server-error.tcl': exceeded recursion limit of 3

The second error.log line then repeats hundreds or thousands of times
until the server stops responding.

The question is why this happens, and what these log entries tell
us.

I think I have found out the answer, but the fix isn't apparent.

The first error message indicates the recursion limit code is working
correctly: after three tries, the HTTP status code goes from 404 to 500.
The second error message indicates a similar recursion limit is reached
looking for a 500 handler. Unfortunately there is a loop here:

Ns_ConnReturnInternalError
executes
ReturnRedirect
which executes
Ns_ConnRedirect
which executes
Ns_ConnReturnInternalError

This loop accounts for the remaining error log entries.

One problem is that ReturnRedirect uses the redirects configured like
this:

ns_section "ns/server/farid/redirects"
ns_param 404 "/fnf-tmpl.tcl"
ns_param 403 "global/forbidden.html"
ns_param 500 "global/server-error.tcl"

What is missing, it seems to me, is the method of the request. The method
is used in Ns_AuthorizeRequest and Ns_ConnRunRequest.

The quick fix is to not configure a 500 redirect.

tom jackson
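
The loop described above can be modelled in a few lines of Python. This is a sketch of the control flow only, not the code in nsd: the functions are simplified stand-ins named after the C functions in the message, `return_not_found` and the `MAX_LOG_LINES` cap are assumptions added so that the model terminates, and the redirect URLs are taken from the log excerpt.

```python
RECURSION_LIMIT = 3   # "exceeded recursion limit of 3" in the error log
MAX_LOG_LINES = 6     # safety cap for this model only; the server has none


class Conn:
    def __init__(self, redirects):
        self.redirects = redirects  # status code -> configured redirect URL
        self.recursion = 0
        self.status = None
        self.log = []


def return_redirect(conn, status):
    """Follow the configured redirect for `status`; False if none is set."""
    url = conn.redirects.get(status)
    if url is None:
        return False
    conn_redirect(conn, url)
    return True


def return_not_found(conn):
    if not return_redirect(conn, 404):
        conn.status = 404  # built-in 404 response


def return_internal_error(conn):
    if not return_redirect(conn, 500):
        conn.status = 500  # built-in 500 response


def conn_redirect(conn, url):
    conn.recursion += 1
    if conn.recursion > RECURSION_LIMIT:
        conn.log.append("failed to redirect %r: exceeded recursion limit" % url)
        if len(conn.log) >= MAX_LOG_LINES:
            conn.status = "capped"   # the real loop would simply keep going
            return
        return_internal_error(conn)  # this call closes the loop
    else:
        # Assumption: no handler exists for the request method,
        # so the redirected URL is reported as not found again.
        return_not_found(conn)


def simulate(redirects):
    conn = Conn(redirects)
    return_not_found(conn)
    return conn.status, conn.log
```

With both a 404 and a 500 redirect configured, `simulate` stops only at the cap, after one log line for the 404 page followed by repeated lines for the 500 page. With the 500 entry left out, it ends with status 500 after a single log line, which corresponds to the quick fix.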


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to <list...@listserv.aol.com> with the
body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: field of your email blank.


Fenton, Brian

Nov 26, 2012, 11:18:00 AM
to Don Baccus, aolserv...@lists.sourceforge.net
Hi Don

thanks for the info. Actually one of these systems isn't even on the internet, so maybe we've got some "internal" hackers ;-)

Brian

________________________________________
From: Don Baccus [dho...@pacifier.com]
Sent: 26 November 2012 16:11
To: Fenton, Brian
Cc: aolserv...@lists.sourceforge.net
Subject: Re: [AOLSERVER] Tracked down bug with PROPFIND / OPTIONS methods

On Nov 26, 2012, at 7:46 AM, Fenton, Brian wrote:

> Hello
>
> We've just had some reports of this error re-appearing on some systems. Was there ever a solution found?
>
> These PROPFIND/OPTIONS methods seem to be something to do with WebDav, which as far as I'm aware, we're not using. Any idea what would cause them to appear in the logs?

Hackers probing you.

I don't know of a solution for the recursion issue regarding returning a custom error page.

Don Baccus

Nov 26, 2012, 11:11:59 AM
to Fenton, Brian, aolserv...@lists.sourceforge.net

On Nov 26, 2012, at 7:46 AM, Fenton, Brian wrote:

> Hello
>
> We've just had some reports of this error re-appearing on some systems. Was there ever a solution found?
>
> These PROPFIND/OPTIONS methods seem to be something to do with WebDav, which as far as I'm aware, we're not using. Any idea what would cause them to appear in the logs?

Hackers probing you.

I don't know of a solution for the recursion issue regarding returning a custom error page.


Jeff Rogers

Nov 26, 2012, 12:13:43 PM
to Fenton, Brian, aolserv...@lists.sourceforge.net
If this is the bug I think it is, a checkin from 10/2011 fixed this bug.

handle internal error from redirect recursion overflow
directly instead of redirecting to internal error page. Prevents
error displaying error page from crashing server.

If updating to a more recent server isn't an option, you should be able
to apply the patch standalone:

http://aolserver.cvs.sourceforge.net/viewvc/aolserver/aolserver/nsd/op.c?r1=1.18&r2=1.19

Also, it doesn't necessarily need to be hackers probing you; some versions
of Windows are very eager about discovering shares on their local network.

-J
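
The effect of that check-in can be sketched as follows. The model is based only on the check-in comment quoted above, not on the actual diff to op.c; `serve_missing_url`, its parameter, and the `CAP` safety limit are hypothetical. It assumes that both a 404 and a 500 redirect are configured and that no handler is ever found for the request method.

```python
RECURSION_LIMIT = 3   # limit reported in the error log
CAP = 50              # safety cap for this model only


def serve_missing_url(handle_overflow_directly):
    """Return (final_status, log_lines) for a request that never finds a handler."""
    recursion = 0
    log_lines = 0
    while True:
        recursion += 1                 # follow the configured error-page redirect
        if recursion <= RECURSION_LIMIT:
            continue                   # the error page is "not found" as well
        log_lines += 1                 # "exceeded recursion limit" gets logged
        if handle_overflow_directly:
            return 500, log_lines      # patched: answer with a plain 500 right here
        if log_lines >= CAP:
            return None, log_lines     # unpatched: the loop never ends by itself
        # unpatched: go around again via the 500 redirect
```

In this model `serve_missing_url(True)` finishes with a 500 after one log line, while `serve_missing_url(False)` only stops because of the artificial cap.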

Dave Bauer

Nov 26, 2012, 12:22:21 PM
to Jeff Rogers, aolserv...@lists.sourceforge.net, Fenton, Brian
You can register a filter for those methods to return a 405 Method Not Allowed response.

These requests can come from Microsoft products checking the capabilities of your web server, so it is not necessarily a hacker, although it can indicate that as well.

Peter Sadlon

Nov 26, 2012, 2:39:32 PM
to da...@thedesignexperience.org, dv...@diphi.com, aolserv...@lists.sourceforge.net
I had a similar issue in the past. I believe I tracked it down to a toolbar or some desktop application probing the site for some reason; I don't remember exactly.
Here is my filter, which I put in /servers/my_server/modules/tcl/filters.tcl:

ns_register_filter preauth OPTIONS * options_na
proc options_na { why } {
  ns_return 405 "text/html; charset=iso-8859-1" "OPTIONS method is not allowed on this url"
  return filter_return
}

To add the same for PROPFIND, just copy the first line, replace OPTIONS with PROPFIND, and put it after the first line; then restart your server.
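
The idea behind the filter can also be modelled outside the server. The sketch below is plain Python, not AOLserver code: `preauth_filter` and `handle` are hypothetical names, and the list of allowed methods is an assumption. It shows a check that runs before normal request handling and answers 405 for both methods, so the request never reaches the 404/500 redirect logic. It also adds an `Allow` header, which the HTTP specification requires on a 405 response.

```python
BLOCKED_METHODS = {"OPTIONS", "PROPFIND"}
ALLOWED_METHODS = ("GET", "HEAD", "POST")  # assumption for this sketch


def preauth_filter(method):
    """Return a (status, headers, body) response to stop early, or None to continue."""
    if method in BLOCKED_METHODS:
        headers = {
            "Allow": ", ".join(ALLOWED_METHODS),
            "Content-Type": "text/html; charset=iso-8859-1",
        }
        return 405, headers, "%s method is not allowed on this url" % method
    return None


def handle(method, url):
    """Run the filter first; only unfiltered requests reach the normal handler."""
    early = preauth_filter(method)
    if early is not None:
        return early
    return 200, {"Content-Type": "text/html"}, "served %s" % url
```

Here `handle("PROPFIND", "/")` returns the 405 response, while `handle("GET", "/index.html")` falls through to the normal handler.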


Date: Mon, 26 Nov 2012 12:22:21 -0500
From: da...@thedesignexperience.org
To: dv...@diphi.com
CC: aolserv...@lists.sourceforge.net; Brian....@quest.ie
Subject: Re: [AOLSERVER] Tracked down bug with PROPFIND / OPTIONS methods

Fenton, Brian

Nov 27, 2012, 9:12:44 AM
to Peter Sadlon, da...@thedesignexperience.org, dv...@diphi.com, aolserv...@lists.sourceforge.net

Hi Peter

That’s a nice work-around. Thanks a lot for replying.

Best wishes
Brian
