[AOLSERVER] AOLserver 4.5.1 crashing with max connections per thread messages


Björn Þór Jónsson

Nov 18, 2010, 11:13:03 AM
to AOLS...@listserv.aol.com
Hi,

After recently upgrading from AOLserver 4.5.0 to 4.5.1 and from nspostgres-4.0 to nspostgres-4.1 the server is repeatedly crashing (when it gets hammered by the google bots).  The error.log has many entries like these before the server dies:


[17/Nov/2010:02:18:42][700.3218660208][-default:6195-] Notice: exiting: exceeded max connections per thread
[17/Nov/2010:02:18:43][700.3217636208][-default:6193-] Notice: exiting: exceeded max connections per thread
[17/Nov/2010:02:18:44][700.3219172208][-default:6196-] Notice: exiting: exceeded max connections per thread
[17/Nov/2010:02:18:45][700.3218148208][-default:6194-] Error: Tcl exception:
adp flush failed: connection closed
    abort exception raised
    while processing connection #31907:
        GET ...
        Host: localhost:8006
...
nsthreads: pthread_create failed in NsCreateThread: Resource temporarily unavailable   [this is the last line in the log before the crash]


This is the database section of the AOLserver config file:

ns_section "ns/db/drivers"
ns_param postgres nspostgres.so

ns_section ns/db/pools
    ns_param   pool1              "Pool 1"
    ns_param   pool2              "Pool 2"
    ns_param   pool3              "Pool 3"

ns_section ns/db/pool/pool1
    ns_param   maxidle            1000000000
    ns_param   maxopen            1000000000
    ns_param   connections        5
    ns_param   extendedtableinfo  true
    ns_param   driver             postgres
    ns_param   datasource         localhost::${db_name}
    ns_param   user               $user_account

ns_section ns/db/pool/pool2
    ns_param   maxidle            1000000000
    ns_param   maxopen            1000000000
    ns_param   connections        5
    ns_param   extendedtableinfo  true
    ns_param   driver             postgres
    ns_param   datasource         localhost::${db_name}
    ns_param   user               $user_account

ns_section ns/db/pool/pool3
    ns_param   maxidle            1000000000
    ns_param   maxopen            1000000000
    ns_param   connections        5
    ns_param   extendedtableinfo  true
    ns_param   driver             postgres
    ns_param   datasource         localhost::${db_name}
    ns_param   user               $user_account

ns_section ns/server/${server}/db
    ns_param   pools              "*"
    ns_param   defaultpool        pool1


The server is running on Ubuntu 10.04.1 LTS
2.6.32-25-generic-pae #45-Ubuntu SMP Sat Oct 16 21:01:33 UTC 2010 i686 GNU/Linux


Is there anything I should configure differently, or does anyone have other ideas about what might be causing this?


/Björn


--
Björn Þór Jónsson
http://bthj.is

-- AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to <list...@listserv.aol.com> with the body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: field of your email blank.

Alexey Pechnikov

Nov 18, 2010, 11:42:19 AM
to AOLS...@listserv.aol.com
On my Debian lenny/squeeze 32- and 64-bit hosts, this works fine with PostgreSQL 8.1/8.4:

# Database drivers
ns_section "ns/db/drivers"
ns_param   postgres nspostgres.so  ;# An internal driver

#ns_section "ns/db/driver/postgres"
#ns_param        pgbin           /usr/lib/postgresql/8.1/bin/psql

ns_section "ns/db/pools"
ns_param   maindb  "Main database Pool"
ns_param   session "Session database Pool"

ns_section "ns/db/pool/maindb"
ns_param   driver          postgres
ns_param   datasource      $env(PGHOST):$env(PGPORT):$env(PGDBNAME)
ns_param   user            $env(PGUSER)
ns_param   password        ""
ns_param   connections     20
ns_param   logsqlerrors    true      ;# Verbose SQL query error logging
ns_param   verbose         false     ;# Verbose error logging
ns_param   maxidle         600       ;# Max time to keep idle db conn open
ns_param   maxopen         3600      ;# Max time to keep active db conn open

ns_section "ns/db/pool/session"
ns_param   driver          postgres
ns_param   datasource      $env(PGHOST):$env(PGPORT):$env(PGDBNAME)
ns_param   user            $env(PGUSER)
ns_param   password        ""
ns_param   connections     20
ns_param   logsqlerrors    true      ;# Verbose SQL query error logging
ns_param   verbose         false     ;# Verbose error logging
ns_param   maxidle         600       ;# Max time to keep idle db conn open
ns_param   maxopen         3600      ;# Max time to keep active db conn open

# Accessing DB pools
ns_section "ns/server/${servername}/db"
ns_param pools          *            ;# Wildcard gives access to all
ns_param defaultpool    maindb


On Ubuntu, you can try the aolserver4 and aolserver4-nspostgres packages from my repository:
deb http://mobigroup.ru/debian/ squeeze main non-free
deb-src http://mobigroup.ru/debian/ squeeze main non-free

--
Best regards, Alexey Pechnikov.
http://pechnikov.tel/

Tom Jackson

Nov 18, 2010, 11:47:06 AM
to AOLS...@listserv.aol.com
My guess is that this is caused by the way the current threadpool code
works. I'll have to see if I can find my test data, but here is what I
remember off the top of my head.

There are a few causes:

1. Use of Ns_CondSignal instead of Ns_CondBroadcast to wake up
threadpool threads. This usually results in the thread that just sent
the signal waking up, grabbing the mutex again, and servicing another
request. As a result, that particular thread reaches the max number of
requests per thread and tries to exit.
2. The exiting thread starts up a replacement thread under certain
conditions. Sometimes, with many requests coming in, this new thread
will grab the mutex repeatedly and get into the same condition as #1.
However, the thread exiting from #1 hasn't yet exited, and now its
parent is also trying to exit.
3. The basic problem is detecting when to allow threads to exit. For
instance, a thread might exit because it has been sitting around for
too long. Say it has serviced 10 connections and is supposed to exit
at 50. By exiting early, it removes the ability to handle the
remaining 40 requests. The visible result of this is the inability of
the server to maintain threads between the min and max specified for a
particular threadpool. (Replacing a thread at thread exit patches over
this problem, but causes a different problem.)

Note that it is hard to demonstrate the bug; I only found it by
hammering the server. But the bug inevitably shows up and crashes the
server. I also added additional logging code so I was able to track
what thread was servicing requests and the configuration of the thread
during each request (how many conns had been serviced, etc.).

tom jackson

2010/11/18 Björn Þór Jónsson <ban...@bthj.is>:


Tom Jackson

Nov 18, 2010, 12:06:21 PM
to AOLS...@listserv.aol.com
BTW, I don't think the issue you are seeing has anything to do with
the database pools; the problem is the connection threadpools (you are
using the default threadpool).

tom jackson

Björn Þór Jónsson

Nov 19, 2010, 5:52:59 AM
to AOLS...@listserv.aol.com
Are there any good examples of proper connection threadpool configuration available?

I've looked at http://openacs.org/forums/message-view?message_id=1146218 and am a bit confused.  (BTW this is not an OpenACS site, just plain .adp pages).

Thanks, Alexey, for the Ubuntu packages. Before checking them out, I'll try to get the AOLserver version I just compiled to stay up :)

/Björn

2010/11/18 Tom Jackson <t...@rmadilo.com>

--
Björn Þór Jónsson
http://bthj.is

Tom Jackson

Nov 19, 2010, 12:14:38 PM
to AOLS...@listserv.aol.com
Here is one example: See the full configuration at:

http://junom.com/document/aolserver/startup/errors.txt

ns_section ns/server/junom/pool/default
ns_param maxconnections 100
ns_param minthreads 4
ns_param maxthreads 10
ns_param threadtimeout 240
ns_param map {GET /}
ns_param map {POST /}

ns_section ns/server/junom/pool/fast
ns_param maxconnections 100
ns_param minthreads 2
ns_param maxthreads 10
ns_param threadtimeout 120
ns_param map {GET /*-thumb.jpg}

ns_section ns/server/junom/pools
ns_param default {default pool}
ns_param fast {fast pool}

There is also a pools.tcl file which goes in the global tcl library
directory (should be in the current AOLserver version).

Right now, this works only to configure the default pool. I have
an updated pools.tcl file, but it also relies on a C patch which
creates a separate namespace for each virtual server. Otherwise, any
pool named "default" will overwrite the global pool, also named
default.

Here is what is in my pools.tcl file, which might give hints for using ns_pools:


set cfgsection "ns/server/[ns_info server]"

set minthreads [ns_config $cfgsection minthreads 0]
set maxthreads [ns_config $cfgsection maxthreads 10]
set maxconns [ns_config $cfgsection maxconnections 0]
set timeout [ns_config $cfgsection threadtimeout 0]

#ns_pools set default -minthreads $minthreads -maxthreads $maxthreads -maxconns $maxconns -timeout $timeout

ns_log Notice "Default Pool: [ns_pools get default]"

# Setup optional threadpools

set poolSection $cfgsection/pools

set poolSet [ns_configsection $poolSection]

if {"$poolSet" ne ""} {

    set poolSize [ns_set size $poolSet]
    for {set i 0} {$i < $poolSize} {incr i} {
        set poolName [ns_set key $poolSet $i]
        set poolDescription [ns_set value $poolSet $i]
        set poolConfigSection "ns/server/[ns_info server]/pool/$poolName"
        set poolConfigSet [ns_configsection $poolConfigSection]
        if {"$poolConfigSet" eq ""} {
            continue
        }
        set poolMinthreads [ns_config $poolConfigSection minthreads $minthreads]
        set poolMaxthreads [ns_config $poolConfigSection maxthreads $maxthreads]
        set poolMaxconns [ns_config $poolConfigSection maxconnections $maxconns]
        set poolTimeout [ns_config $poolConfigSection threadtimeout $timeout]

        ns_pools set $poolName -minthreads $poolMinthreads -maxthreads $poolMaxthreads \
            -maxconns $poolMaxconns -timeout $poolTimeout
        ns_log Notice "$poolName Pool: [ns_pools get [ns_info server]-$poolName]"
        set poolConfigSize [ns_set size $poolConfigSet]
        for {set j 0} {$j < $poolConfigSize} {incr j} {
            if {[string tolower [ns_set key $poolConfigSet $j]] eq "map"} {
                set mapList [split [ns_set value $poolConfigSet $j]]
                set poolMethod [lindex $mapList 0]
                set poolPattern [lindex $mapList 1]
                ns_pools register ${poolName} [ns_info server] $poolMethod $poolPattern
                ns_log Notice "ns_pools registered $poolName [ns_info server] $poolMethod $poolPattern"
            }
        }
    }
}

(You can use the ns_pools command anywhere, even after startup or from
the control port)

I think the above script can be fixed by modifying [ns_pools set] to this:

ns_pools set [ns_info server]-$poolName -minthreads $poolMinthreads \
    -maxthreads $poolMaxthreads -maxconns $poolMaxconns -timeout $poolTimeout

(anywhere you see $poolName, replace with [ns_info server]-$poolName):

ns_pools register [ns_info server]-${poolName} [ns_info server] \
    $poolMethod $poolPattern
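
Since ns_pools can be used after startup, the same calls can be tried by hand, for example from the control port, before committing values to the config file. The following is only a sketch: it reuses the `ns_pools get` and `ns_pools set` forms and option names that appear in the script above, and the numbers are illustrative values taken from the example config, not a recommendation.

```tcl
# Log the current settings of the default connection pool.
ns_log Notice "Default pool before: [ns_pools get default]"

# Adjust the default pool at runtime (illustrative values only).
ns_pools set default -minthreads 4 -maxthreads 10 \
    -maxconns 100 -timeout 240

# Log the settings again to confirm the change took effect.
ns_log Notice "Default pool after: [ns_pools get default]"
```

Settings changed this way presumably last only until the next restart, so anything that proves useful still has to be written back to the config file.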

tom jackson

(you can post or send me directly the config info during startup if
you have problems)

2010/11/19 Björn Þór Jónsson <ban...@bthj.is>:



Gustaf Neumann

Nov 22, 2010, 2:54:49 AM
to AOLS...@listserv.aol.com
Dear Björn,

The error indicates that pthread_create returned EAGAIN
(paraphrased as "Resource temporarily unavailable").
This error indicates that

   "the system lacked the necessary resources to create
    another thread, or the system-imposed limit on the total
    number of threads in a process {PTHREAD_THREADS_MAX}
    would be exceeded. " (from http://linux.die.net/man/3/pthread_create).

So, for the user under which the server runs, check "ulimit -u",
limits.conf, etc. Could it be that you switched to a new machine with lower
limits than before?

What is your setting for maxthreads?
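
One way to look at the limits mentioned above from Tcl itself is to read the Linux /proc files. This is a minimal sketch, assuming a Linux kernel that provides /proc/self/limits and /proc/sys/kernel/threads-max; the helper names `max_processes_limit` and `read_file` are made up for this example. Run from tclsh, it reports the limits of tclsh, so to see what nsd gets it would have to be evaluated inside the server (for example from the control port) or pointed at /proc/<pid of nsd>/limits.

```tcl
# Extract the "Max processes" row from text in /proc/<pid>/limits format.
# Returns a two-element list {soft hard}, or an empty list if the row is absent.
proc max_processes_limit {limitsText} {
    foreach line [split $limitsText "\n"] {
        if {[regexp {^Max processes\s+(\S+)\s+(\S+)} $line -> soft hard]} {
            return [list $soft $hard]
        }
    }
    return {}
}

# Read a whole file into a string.
proc read_file {path} {
    set fh [open $path r]
    set data [read $fh]
    close $fh
    return $data
}

# Linux only: print the per-user process/thread limit of the current process
# and the system-wide thread limit, when the /proc files are available.
if {[file readable /proc/self/limits]} {
    puts "Max processes (soft hard): [max_processes_limit [read_file /proc/self/limits]]"
}
if {[file readable /proc/sys/kernel/threads-max]} {
    puts "kernel.threads-max: [string trim [read_file /proc/sys/kernel/threads-max]]"
}
```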

-gustaf neumann

On 18.11.10 17:13, Björn Þór Jónsson wrote:
-- 
Univ.Prof. Dr. Gustaf Neumann
Institute of Information Systems and New Media
WU Vienna
Augasse 2-6, A-1090 Vienna, AUSTRIA

Björn Þór Jónsson

Dec 1, 2010, 10:25:29 AM
to AOLS...@listserv.aol.com
Hi,

The server has been rock stable since I changed

ns_param   maxconnections     5

to

ns_param   maxconnections     100


and from

ns_param   maxthreads         5

to

ns_param   maxthreads         10
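
Put together, the two changes would look roughly like this in the config file. This is a sketch that assumes the parameters live in the ns/server/${server} section, which is where the pools.tcl script quoted earlier in the thread reads them from; the message does not say which section was edited.

```tcl
ns_section "ns/server/${server}"
    # Was 5. The pools.tcl script quoted earlier passes this value to
    # ns_pools as -maxconns.
    ns_param   maxconnections     100
    # Was 5. Upper bound on connection threads in the default pool.
    ns_param   maxthreads         10
```

A very low maxconnections would be consistent with the frequent "exceeded max connections per thread" notices in the original log, although the thread does not confirm that this was the root cause.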


"ulimit -u" says:  unlimited
and /etc/security/limits.conf has all lines commented out.


Thanks for the help :)
/Björn

2010/11/22 Gustaf Neumann <neu...@wu.ac.at>

Gustaf Neumann

Dec 2, 2010, 4:50:02 AM
to AOLS...@listserv.aol.com
Dear Björn,

While I don't see a direct connection between your changed settings and the EAGAIN error, there are apparently misconfigurations in the snippet of your config file that are related to the changed settings:

1) If you have maxthreads defined as 10, then your first db-pool should have at least 10 db-connections. Note that your app might also have scheduled procs that use db-connections as well; 15 might be a good value.

2) In the pools parameter, you have to list your pools (pools "*" is not allowed in newer AOLservers; in your example, use "pool1,pool2,pool3" instead of "*").

For an example of the config file, see:
http://cvs.openacs.org/browse/OpenACS/openacs-4/etc/config.tcl?r=HEAD
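
Applied to the database section posted at the start of the thread, the two points above might look like the fragment below. It is only a sketch: the section and pool names are copied from that original config, and the value 15 is the figure suggested above.

```tcl
ns_section ns/db/pool/pool1
    # At least maxthreads (10), plus headroom for scheduled procs;
    # the other pool1 parameters stay as they were.
    ns_param   connections        15

ns_section ns/server/${server}/db
    # Explicit list of pools instead of the "*" wildcard.
    ns_param   pools              "pool1,pool2,pool3"
    ns_param   defaultpool        pool1
```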

-gustaf neumann