Recently, we started getting "Fatal Python error: Cannot recover from
stack overflow." in the Python 3 tests (when running setup.py test).
It is immediately followed by "Abort trap: 6", which kills Python.
In Mac OS X, this generates a problem report automatically with
logging information. I've uploaded the test output and the logging
info at https://gist.github.com/2317869. This has shown up in at
least two other people's SymPy-Bot tests (see
https://github.com/sympy/sympy/pull/1208), so I'm pretty sure it's not
a problem on my end.
Does anyone have any idea what could be causing this? I don't see how
it could be anything but a Python bug. I'm currently in the process
of bisecting the history to find the offending commit. It's taking a
while, though, as I have to rerun ./bin/use2to3 and setup.py test to
reproduce it (just running ./bin/test on the failing test file does
not reproduce the error). I'll report back when I get it.
Aaron Meurer
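The manual bisect described above could in principle be handed to `git bisect run`. The following is only a sketch under assumptions: the helper names (`classify`, `main`) are made up for illustration, and it assumes both commands can be run from the repository root with the same interpreter. The thread does not say where the translated tree ends up, so the working directory may need adjusting. The exit codes 0 (good), 1 (bad) and 125 (skip) are the ones `git bisect run` understands, and "Abort trap: 6" corresponds to the child being killed by SIGABRT.

```python
import signal
import subprocess
import sys

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125

# Assumption: both commands are run from the repository root.
PREPARE = ["./bin/use2to3"]
RUN_TESTS = [sys.executable, "setup.py", "test"]


def classify(returncode):
    """Map the return code of the test run to a bisect verdict.

    On POSIX, subprocess reports a child killed by a signal as the
    negative signal number, so an abort shows up as -SIGABRT.
    Ordinary test failures are treated as "good" here, because the
    bisect is hunting the interpreter crash, not failing tests.
    """
    if returncode == -signal.SIGABRT:
        return BAD
    return GOOD


def main():
    # If the translation step itself fails, this commit cannot be judged.
    if subprocess.call(PREPARE) != 0:
        return SKIP
    return classify(subprocess.call(RUN_TESTS))
```

To use it as a bisect script, one would add `sys.exit(main())` at the bottom of the file and pass the file to `git bisect run`.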
commit 0856119bd7399a416c21e1692855a1077164f21c
Author: Aaron Meurer <asme...@gmail.com>
Date:   Mon Mar 12 23:05:30 2012 -0600

    Factor out setup.py test into run_all_tests() in sympy/utilities/runtests.py

    This way, we can easily get the functionality of running all the tests in a
    forward compatible way, but still have the ability to pass arguments and
    keyword arguments to the various test functions.

This would explain why bin/test does not generate the error. The
problem has something to do with setup.py.
I don't know how this causes a problem. I don't think it's related to
the translation with 2to3. A diff on the Python 2 file and the
translated file reveals that the only changes it made to the contents
of that commit are to add () to the end of the "print" lines. I think
it has something to do with calling a function from the sympy module
inside setup.py somehow.
If necessary, this commit can be reverted. Nothing else depends on this change.
By the way, apparently this failure actually showed up in the pull
request implementing this (https://github.com/sympy/sympy/pull/1115),
but it went unnoticed. I think we need to make it clearer in the
bot test summary when there were failures, because a failed run looks
too much like a passing one at a glance.
Aaron Meurer
I created https://github.com/sympy/sympy-bot/issues/106 for this.
Aaron Meurer
My first bet would be an endless recursion, possibly in SymPy's test
routines, possibly in the Python runtime.
The stack depth is a bit small for that though.
    @XFAIL
    def test_complex_2899():
        # infinite recursion in coth,
        # https://code.google.com/p/sympy/issues/detail?id=2899
        a,b = symbols('a,b', real=True)
        for deep in [True,False]:
            for func in [sinh, cosh, tanh, coth]:
                assert func(a).expand(complex=True,deep=deep) == func(a)
The Python stack is never supposed to overflow, though. It should
raise a RuntimeError ("maximum recursion depth exceeded") instead.
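For reference, this is the behavior one would normally expect from runaway recursion in pure Python: an ordinary, catchable exception rather than a crash. The snippet is a minimal sketch and does not involve SymPy; the function names are made up. Catching RuntimeError also covers RecursionError, which is a subclass of it in Python 3.5 and later.

```python
def recurse(n=0):
    # Deliberately unbounded recursion.
    return recurse(n + 1)


def hits_recursion_limit():
    """Return True if the interpreter stops the recursion with an exception."""
    try:
        recurse()
    except RuntimeError:
        # The interpreter's recursion check fired and the stack was
        # unwound normally, so the program can carry on.
        return True
    return False
```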
A print statement (actually print function, since this is Python 3)
reveals that the error happens with deep=True, func=coth, which is
actually the first one to cause a recursion error.
I wonder if my patch caused the error simply because it added another
function call to the stack. I can't see how else it would affect it.
Aaron Meurer
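The idea that one extra function call moves the point where the limit is reached can be checked in isolation. This is a sketch with made-up helper names, not code from SymPy or setup.py: it measures how deep a recursive function gets before the recursion check stops it, once when called directly and once through a wrapper that occupies one more frame.

```python
def max_depth(depth=1):
    """Recurse until the interpreter refuses, then report the depth reached."""
    try:
        return max_depth(depth + 1)
    except RuntimeError:
        # The innermost frame that failed to recurse further reports
        # its own depth; outer frames just pass the value back up.
        return depth


def via_wrapper():
    # Same measurement, but with one extra frame already on the stack.
    return max_depth()


direct = max_depth()
wrapped = via_wrapper()
print(direct, wrapped)
```

With the wrapper in place the recursion is cut off earlier, so whatever code happens to be running when the limit is reached can be different from before.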
This could be a bug in Python's stack overflow detection.
That wouldn't be a surprise: detecting stack overflows is a trade-off
between being accurate and being fast, and the bugs are hard to trigger
reliably, so it's not easy to test all potential cases. That
combination can start producing bugs at the drop of a hat.
In the meantime, we need to do one of the following so that the tests
can be run in Python 3 again:
- Revert the above commit
- Fix the recursion error bug
- Comment out the XFAIL test
If we choose either of the last two options, there may be other similar
XFAIL tests that we'll have to do the same for.
Aaron Meurer
In the meantime, if I understand the issue correctly, it should be
enough to just fix this one recursion error. So I think we should
just fix issue 2899, and even if there are other recursion problems
they shouldn't show up (unless we are very unlucky). We need to fix
that issue anyway.
Aaron Meurer
I would just comment out that particular test, since it is XFAILing anyway.
Ondrej
As far as I understand the situation, we have an infinite recursion on
the SymPy side, which also crashes the Python side of things.
So we need to fix the recursion anyway.
Python crashing means we get a stack dump and all tests further down the
line get ignored. So the consequence is that some recursion bugs (which
need to be fixed anyway) need a higher priority than they would have
gotten otherwise, is it?
Yes, this is right. The test is an XFAIL test, so it's just testing
the broken behavior. But we should fix it.
> Python crashing means we get a stack dump and all tests further down the
> line get ignored. So the consequence is that some recursion bugs (which need
> to be fixed anyway) need a higher priority than they would have gotten
> otherwise, is it?
From my understanding, some C-level function in CPython was suppressing
the exception. That is usually a more subtle bug, but in this case it
kept the recursion check from stopping the Python stack before it
overflowed. I think
the chances of hitting this particular bug are pretty slim. You have
to call whatever Python function was suppressing the error exactly
when the Python stack fills up. That's why my patch made the error
show up: it added another function call to the stack, making the error
show up in a different place. Also, if it were easy to hit, someone
else would have found it by now.
Aaron Meurer