Our web app is showing a critical error: We're seeing multiple copies
of the Decimal class -- the id's of the classes themselves are
different.
Printing stuff in decimal.py's __new__:
decimal.Decimal == <class 'decimal.Decimal'>
id(decimal.Decimal) == 147710692
value.__class__ == <class 'decimal.Decimal'>
id(value.__class__) == 144723308
We can get it to happen in requests about once every 10 minutes,
intermittently. And it's happening when value comes from psycopg2
(from a Postgres NUMERIC).
When it occurs it means that psycopg2 can't convert to Decimal,
because Decimal.__new__ uses isinstance(value, Decimal) -- which says,
"No, it's not Decimal" because the id's of the classes are different,
even though it clearly is a decimal.Decimal instance.
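A minimal sketch of the symptom, for anyone who wants to reproduce it outside Apache. It does not use the real decimal module: `SOURCE`, `load_copy` and the module name `fake_decimal` are made up for illustration. The same class body is simply executed twice, the way a module ends up existing twice when it is loaded more than once.

```python
import types

# Hypothetical stand-in for decimal.py; only the class identity matters here.
SOURCE = '''
class Decimal(object):
    def __init__(self, value):
        self.value = value
'''


def load_copy(name):
    """Execute SOURCE in a fresh module object, like a second load of one file."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module


first = load_copy("fake_decimal")
second = load_copy("fake_decimal")

value = first.Decimal("1.5")

# Both classes have the same name and module, but they are distinct objects.
same_name = first.Decimal.__name__ == second.Decimal.__name__
same_object = first.Decimal is second.Decimal

# isinstance only succeeds against the copy that created the instance.
check_first = isinstance(value, first.Decimal)
check_second = isinstance(value, second.Decimal)

print(same_name, same_object, check_first, check_second)
```

The two classes print identically, which matches the output above where both show as `<class 'decimal.Decimal'>` with different ids.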
Ideas, anyone?
Notes:
* We're running web.py 0.21 with Python 2.4.4, Apache 2.2.3, mod_wsgi,
psycopg 2.0.5.1, and PostgreSQL 8.1.4.
* We've reduced Apache's number of processes and threads to 1 (one),
and it still occurs.
* To "print" that stuff out, we're actually setting local vars so we
see those things in the Django error traceback.
Thanks,
Ben.
--
Ben Hoyt, +64 21 331 841
http://www.benhoyt.com/
Pardon my ignorance, but what package is decimal.py from?
> We can get it to happen in requests about once every 10 minutes,
> intermittently. And it's happening when value comes from psycopg2
> (from a Postgres NUMERIC).
>
> When it occurs it means that psycopg2 can't convert to Decimal,
> because Decimal.__new__ uses isinstance(value, Decimal) -- which says,
> "No, it's not Decimal" because the id's of the classes are different,
> even though it clearly is a decimal.Decimal instance.
>
> Ideas, anyone?
>
> Notes:
> * We're running web.py 0.21 with Python 2.4.4, Apache 2.2.3, mod_wsgi,
> psycopg 2.0.5.1, and PostgreSQL 8.1.4.
> * We've reduced Apache's number of processes and threads to 1 (one),
> and it still occurs.
> * To "print" that stuff out, we're actually setting local vars so we
> see those things in the Django error traceback.
Would need more information about what code is in what files and
whether you are relying on mod_wsgi ability to reload the WSGI
application script file, but similar issues to this are often seen
when module reloaders are used. The issue in relation to pickling is
described for mod_wsgi in:
http://code.google.com/p/modwsgi/wiki/IssuesWithPickleModule
Do you see your problem if you do a full restart of Apache and then
use your application without changing any code, or have you only been
seeing it after you have been trying to change code and rely on some
automatic module reloader?
Graham
Hmm, hasn't shown up there yet.
> Pardon my ignorance, but what package is decimal.py from?
>
> It's a standard library module (on its own -- not in a package), but it's
> new in Python 2.4: http://docs.python.org/lib/module-decimal.html
Ahh, my box only has Python 2.3, so no wonder I couldn't find it. :-)
> Would need more information about what code is in what files and
>
> > whether you are relying on mod_wsgi ability to reload the WSGI
> > application script file, but similar issues to this are often seen
> > when module reloaders are used.
>
> Okay, we've dug a bit deeper: Decimal was pretty much a red herring. It
> seemed awfully like a Decimal or __new__ problem, so we didn't look at
> mod_wsgi and reloading till late today.
>
> It definitely has to do with module reloading (but, Aaron, we have
> web.reloader off, so it's not a web.py issue). We actually didn't realise
> mod_wsgi was doing any reloading at all, but then looking at the docs more
> closely it says reloading's on by default (WSGIScriptReloading On).
>
> We're not 100% sure, but our main problem seems to be this:
>
> "For many WSGI applications [the mod_wsgi reloading] mechanism will
> generally work fine, however there are a few limitations on what is
> reloaded, plus some Python C extension modules can be incompatible with this
> feature." (from the bottom of http://code.google.com/p/modwsgi/wiki/ApplicationIssues)
>
> We're using psycopg2, which is a Python module written mostly in C, and the
> C module is doing an import decimal. We see further down in that mod_wsgi
> wiki page that we should have been "cautious if using any Python database
> client".
>
> Graham, we realise it's not exactly a mod_wsgi problem, but because of the
> weird nature of the bugs that crop up, and because using Python database
> clients is a fairly common thing in web apps, we think it'd be *much* safer
> to have WSGIScriptReloading Off by default. What do you think?
I need to do a better job in the documentation of covering the purpose
of the WSGI application script file. In the main, it should be
regarded only as a stepping stone into the actual application, which
would be stored in a normal set of Python modules or packages outside
of your Apache document tree. This is how one would use it with
Django, TurboGears, Trac, MoinMoin etc. The web.py package is a bit
different, as applications are small enough that they can fit in the
WSGI application script file.
Being a stepping stone, it is reloaded on changes to make it easier
for people, especially in a web hosting environment, to get a second
chance to get their configuration correct if they stuff it up the
first time. If reloading wasn't enabled, they would have to get the
whole of Apache restarted instead, thus potentially having to annoy
site admins.
How much actual caching of global data and application logic do you
have in the script file?
> Do you see your problem if you do a full restart of Apache and then
>
> > use your application without changing any code ...
>
> Yeah, that's the one thing we still don't understand -- why the problem
> happened in the first place. It was happening after a full restart of Apache
> and without editing or changing any code. Why was reloading kicking in at
> all? Or are we still missing something?
Maybe.
What I would suggest is setting:
LogLevel info
in your main Apache configuration file. This will turn on additional
logging from mod_wsgi, including messages when it is loading and/or
reloading WSGI script files and when it is creating interpreters for
the first time.
That may help to understand the issue better.
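To make the cross-referencing easier, something along these lines could go at the top of the WSGI script file. This is only a sketch: `load_marker` is a hypothetical helper, not part of mod_wsgi, and it relies on the usual behaviour that output written to stderr ends up in the Apache error log.

```python
import decimal
import os
import sys
import time


def load_marker(stream=sys.stderr):
    """Write one line recording when this file was executed and by which process."""
    line = "[pid %d] script loaded at %s; id(decimal.Decimal)=%d\n" % (
        os.getpid(),
        time.strftime("%Y-%m-%d %H:%M:%S"),
        id(decimal.Decimal),
    )
    stream.write(line)
    return line


# Runs every time the script file is loaded or reloaded.
load_marker()
```

If the id reported here differs from the one seen in a failing request handled by the same pid, more than one copy of the class exists in that process.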
Graham
Do note that that section in the documentation only applies when you
have explicitly set:
WSGIReloadMechanism Interpreter
The default is to only reload the WSGI script file, not destroy and
recreate the whole interpreter. Thus, unless you had gone and set that
directive, the specific problems mentioned in there would not be an
issue.
As I previously suggested, try increasing the log level so that
exactly when the script file is being reloaded is displayed and try
and cross reference any behaviour to that.
Graham
> What I would suggest is setting "LogLevel info"
> How much actual caching of global data and application logic do you
> have in the script file.
Do you think it's a mod_wsgi issue, or something with our setup? We're running a pretty standard mod_wsgi config -- the only config directive we have is WSGIScriptAlias and "WSGIScriptReloading Off".
If you are also using psycopg2 and the problem relates to use of the
decimal module in conjunction with that somehow, then I would
speculate that it could be psycopg2 that is the problem.
The particular problem I am thinking of derives from the fact that C
extension modules are only loaded once per process. Thus, where there
are multiple Python sub interpreters they all use the same version of
the internal C extension module within psycopg2. This is a problem for
two reasons.
The first problem is that different sub interpreters cannot use
different versions of a C extension module. If they do use different
versions then the version loaded first will take precedence. When a
second version is then loaded from another sub interpreter, the C
extension module already loaded is used instead, and the Python
wrappers for the alternate version may not work with that first C
extension module.
The second problem is if the C extension module hasn't been written so
that it will work if used from multiple sub interpreters at the same
time. The sort of issue that can arise here is where the C extension
module caches a Python object injected into it from one sub
interpreter, and then tries to reuse that cached Python object when
executing code for a different sub interpreter.
In this problem of caching, imagine that the Decimal class type is
being cached from one sub interpreter, but is then being used in
checks across multiple sub interpreters. As you might be able to see,
that would then cause isinstance checks to fail if the cached Decimal
class type is used in calls from sub interpreters other than the one
in which the Decimal class type originated.
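The caching scenario can be simulated in pure Python. This is a sketch under loud assumptions: `FakeExtension` stands in for a C extension with one shared state per process, `new_interpreter_decimal` stands in for each sub interpreter importing its own copy of decimal, and the toy `Decimal.__new__` only mirrors the isinstance check mentioned earlier in the thread. None of it is real psycopg2 or decimal code.

```python
import types

# Hypothetical, much reduced Decimal with the isinstance check in __new__.
SOURCE = '''
class Decimal(object):
    def __new__(cls, value):
        if isinstance(value, Decimal):
            return value
        if isinstance(value, str):
            self = object.__new__(cls)
            self.text = value
            return self
        raise TypeError("Cannot convert %r to Decimal" % (value,))
'''


def new_interpreter_decimal():
    """Each simulated sub interpreter gets its own copy of the Decimal class."""
    module = types.ModuleType("fake_decimal")
    exec(SOURCE, module.__dict__)
    return module.Decimal


class FakeExtension(object):
    """Stand-in for a C extension: loaded once, state shared by the process."""

    cached_type = None

    @classmethod
    def init(cls, decimal_type):
        # Only the first interpreter to initialise the module wins.
        if cls.cached_type is None:
            cls.cached_type = decimal_type

    @classmethod
    def cast(cls, text):
        return cls.cached_type(text)


decimal_a = new_interpreter_decimal()
decimal_b = new_interpreter_decimal()
FakeExtension.init(decimal_a)
FakeExtension.init(decimal_b)  # ignored, the type from A is already cached

value = FakeExtension.cast("1.50")

# Interpreter A recognises the value as one of its own.
ok_in_a = decimal_a(value) is value

# Interpreter B gets an object built from A's class and rejects it.
try:
    decimal_b(value)
    failed_in_b = False
except TypeError:
    failed_in_b = True

print(ok_in_a, failed_in_b)
```

Which interpreter sees the failure depends only on which one initialised the extension first, which would fit with the problem looking intermittent.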
If you can post your two VirtualHost definitions I could give a more
accurate example, but what I suggest you do is use the daemon mode of
mod_wsgi to delegate the running of your demo site WSGI application to
a dedicated daemon sub process, thus ensuring it isn't running in the
same process as your main site. That should ensure there is no
possibility for conflict. Thus:
    <VirtualHost *:80>
    ServerName demo.micropledge.com
    ErrorLog logs/demo.micropledge.com-errors_log
    WSGIDaemonProcess demo.micropledge.com threads=15
    WSGIProcessGroup demo.micropledge.com
    WSGIScriptAlias / /some/path/demo.wsgi
    ...
    </VirtualHost>

    <VirtualHost *:80>
    ServerName micropledge.com
    ServerAlias www.micropledge.com
    WSGIScriptAlias / /some/path/production.wsgi
    ...
    </VirtualHost>

The ErrorLog directive will ensure that all mod_wsgi messages related
to the demo site WSGI application go into its own log file.
I would still really like to see your VirtualHost configurations
though and particularly whether you are using a wildcard in your
ServerAlias directive. I saw a recent problem with mod_python which I
could not explain, and the only strange thing about it that made it
different was that a wildcard was being used with ServerAlias
directive, with the wildcard pattern overlapping the ServerName for an
alternate VirtualHost container.
Graham
This is definitely a problem in psycopg2 and I probably should log an
issue against that package telling them that their package is unusable
from multiple applications in different interpreters under mod_python
or mod_wsgi. The problem code in psycopg2 is:
PyObject *decimalType = NULL;

/* psyco_decimal_init
   Initialize the module's pointer to the decimal type. */
void
psyco_decimal_init(void)
{
#ifdef HAVE_DECIMAL
    PyObject *decimal = PyImport_ImportModule("decimal");
    if (decimal) {
        decimalType = PyObject_GetAttrString(decimal, "Decimal");
    }
    else {
        PyErr_Clear();
        decimalType = (PyObject *)&PyFloat_Type;
        Py_INCREF(decimalType);
    }
#endif
}

/** DECIMAL - cast any kind of number into a Python Decimal object **/
#ifdef HAVE_DECIMAL
static PyObject *
typecast_DECIMAL_cast(char *s, Py_ssize_t len, PyObject *curs)
{
    PyObject *res = NULL;
    char *buffer;

    if (s == NULL) {Py_INCREF(Py_None); return Py_None;}
    if ((buffer = PyMem_Malloc(len+1)) == NULL)
        PyErr_NoMemory();
    strncpy(buffer, s, (size_t) len); buffer[len] = '\0';
    res = PyObject_CallFunction(decimalType, "s", buffer);
    PyMem_Free(buffer);
    return res;
}
#else
#define typecast_DECIMAL_cast typecast_FLOAT_cast
#endif
In other words, they do exactly like I suspected, which is to cache a
reference to the Decimal type object and then use that from multiple
sub interpreters.
The only solution as I already described is to use daemon mode of
mod_wsgi to force applications to run in different processes.
Will be fun to see what the psycopg2 people have to say. :-)
Graham
http://www.initd.org/tracker/psycopg/ticket/192
Graham
On Jul 14, 3:44 pm, Graham Dumpleton <Graham.Dumple...@gmail.com>
wrote:
> This is definitely a problem in psycopg2 and I probably should log an
> issue against that package telling them that their package is unusable
> PyObject *decimal = PyImport_ImportModule("decimal");
> if (decimal) {
> decimalType = PyObject_GetAttrString(decimal, "Decimal");
> }
> Will be fun to see what the psycopg2 people have to say. :-)
Correct, if using just mod_python there wouldn't be a way around it
except for fixing psycopg2. At least in mod_wsgi one can use daemon
mode to delegate applications to separate processes thereby avoiding
the issue.
I am actually quite surprised that this issue hasn't come up
previously with mod_python as the code in psycopg2 has been there for
many years.
FWIW, I have extended my documentation on application issues when
using mod_wsgi to add a section on problems with multiple
interpreters.
http://code.google.com/p/modwsgi/wiki/ApplicationIssues
Described there are general problems with extension modules when using
multiple sub interpreters, plus this specific issue.
Look forward to seeing how you go with daemon mode of mod_wsgi as a
way of getting around this.
Graham
so i've hit this problem as well though i think it might have been
with a datetime object, not a decimal.... i've switched to pgdb for
now, but prefer psycopg2. is it confirmed that psycopg2 works in
daemon mode? i am ignorant of how that works, would it be restricted
to processes=1 threads=1 in order to avoid similar problems?
thanks,
-brent
The issue is not number of processes and/or threads, but which sub
interpreter the application runs in and whether there are other
applications in same process running in other sub interpreters.
In short, delegate each application to a separate daemon process using
mod_wsgi daemon mode. Then force each application to run in the main
interpreter of its respective process, by setting:
    WSGIApplicationGroup %{GLOBAL}
So, one application per process group and all running in the main
Python interpreter and not a secondary sub interpreter.
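One way to check where an application actually ends up is a throwaway WSGI script that echoes what mod_wsgi reports about itself. This is a sketch: it assumes the `mod_wsgi.process_group` and `mod_wsgi.application_group` keys that mod_wsgi puts in the WSGI environ, and as far as I know an empty application group string is how the main interpreter shows up. Under any other WSGI server both values will simply be None.

```python
def application(environ, start_response):
    """Report the mod_wsgi process group and application group for this request."""
    process_group = environ.get("mod_wsgi.process_group")
    application_group = environ.get("mod_wsgi.application_group")

    lines = [
        "process_group=%r" % (process_group,),
        "application_group=%r" % (application_group,),
    ]
    body = "\n".join(lines).encode("utf-8")

    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

With one daemon process group per application and WSGIApplicationGroup %{GLOBAL}, each site should report its own process group and an empty application group.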
Graham