looking for a numerical expression to fail this test

Chris Smith

Sep 19, 2023, 9:14:35 AM
to sympy
Given
```
from sympy import S

def f(self):
    from sympy.core.numbers import int_valued
    r = self.round(2)
    i = int(r)
    if int_valued(r):
        # non-integer self should pass one of these tests
        if (self > i) is S.true:
            return i
        if (self < i) is S.true:
            return i - 1
        # is it safe to assume that self == i?
    return i
```

Can anyone think of a real numerical expression that rounds up, then fails to compare with self, and so causes f to return an integer that is 1 too large?

Chris Smith

Sep 19, 2023, 10:48:17 AM
to sympy
`eq = cos(2)**2 + sin(2)**2 - 1/S(10**120)` rounds to 1, yet neither `eq < 1` nor `eq > 1` evaluates, even though the correct value of `int(eq)` is 0.
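The failure mode can be reproduced without SymPy. The sketch below is only an analogy: it assumes an exact `Fraction` equal to `1 - 10**-120` as a stand-in for `eq` (since `cos(2)**2 + sin(2)**2` is exactly 1), and it uses a `float` conversion as a stand-in for a comparison that cannot be decided at low precision. `naive_int` is a hypothetical name, not SymPy code.

```python
from fractions import Fraction


def naive_int(x):
    """Mimic the structure of f: round to 2 places, then try to compare."""
    r = round(x, 2)
    i = int(r)
    if r == i:
        # Stand-in for an undecided comparison: a float cannot
        # distinguish 1 - 10**-120 from 1.
        approx = float(x)
        if approx > i:
            return i
        if approx < i:
            return i - 1
        # Neither test decided anything; fall through, as f does.
    return i


x = 1 - Fraction(1, 10**120)  # exact value, strictly less than 1

print(round(x, 2))    # rounds up to 1
print(naive_int(x))   # 1, which is one too large
print(int(x))         # 0, the correct integer part
```

Both comparisons fail because the low-precision value is indistinguishable from `i`, so the fall-through `return i` silently assumes equality.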

/c

Aaron Meurer

Sep 19, 2023, 1:26:31 PM
to sy...@googlegroups.com
This is somewhat off-topic, but perhaps related to what you are trying
to figure out, and anyway I see that this code is part of a PR
https://github.com/sympy/sympy/pull/25699. I don't like how this code
depends on the fact that self < i automatically evaluates to true or
false. This is a great example of code that ends up being dependent on
automatic evaluation behavior which makes that behavior harder to
remove. Whether a < b can be computed and whether it should evaluate
automatically are separate considerations. It would be better if there
were an explicit method, like say (a > b).doit(), to tell an
inequality to try to evaluate to true or false.
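To make the distinction concrete, here is a toy model of an inequality that never evaluates when constructed and only computes a truth value when asked. `UnevaluatedGt` is a hypothetical class invented for illustration; it is not SymPy's relational API, and `doit()` here only mirrors the method name suggested above.

```python
from fractions import Fraction


class UnevaluatedGt:
    """Toy relational: construction stores the operands and nothing else."""

    def __init__(self, lhs, rhs):
        # No comparison happens here, so construction is always cheap.
        self.lhs = lhs
        self.rhs = rhs

    def doit(self):
        """Explicitly request evaluation; returns a plain bool in this model."""
        return self.lhs > self.rhs

    def __repr__(self):
        return f"{self.lhs!r} > {self.rhs!r}"


rel = UnevaluatedGt(Fraction(7, 3), 2)
print(rel)         # still an unevaluated object
print(rel.doit())  # True, computed only on request
```

The point of the design is that the cost of deciding the inequality is paid only by callers who ask for it.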

Things like this are one of the main reasons the core is slower than
it needs to be. It's the reason why, for instance, the expression from
https://github.com/sympy/sympy/issues/24565 takes over 20 seconds just
to construct.

There are three types of automatic evaluation that are bad, in terms
of performance:

1. Creating expressions that are larger than the original
2. Using assumptions.
3. Using evalf (this is often implicitly done as part of using assumptions)

Using evalf is by far the worst of these. It's the thing that causes
SymPy to hang on very simple things that should return instantly (see
https://github.com/sympy/sympy/issues/10800 for another example of
this). We need to move to a model where evalf is never called
automatically, except for cases where it is known that it will be very
cheap.

Aaron Meurer

Chris Smith

Sep 22, 2023, 5:16:37 PM
to sympy
> This is a great example of code that ends up being dependent on
> automatic evaluation behavior which makes that behavior harder to
> remove.

If that is removed then the test will be changed to whatever mechanism will allow the sign of `self - i` to be determined. And for an arbitrary function I don't know how you can do that without evaluation (even if that evaluation does something like ball/interval arithmetic to determine that the range of possible values allows the inequality to be determined). Computing with `n(2)` is much faster than trying to infer the value of a complicated expression.

Do you have ideas about how we can tell if `sqrt(log(100)+log(log(100))) > 2` that is faster than evaluation?

I don't really appreciate the preference for using `is_gt(a,b)` instead of `(a > b) is S.true` or else I may have used the former. (I forgot until writing this that we had that function.) I can change that in the code if that would make something clearer.

/c


Aaron Meurer

Sep 22, 2023, 5:28:52 PM
to sy...@googlegroups.com
I'm not saying that this sort of calculation shouldn't be doable, just
that the user needs to be explicit about asking for it to happen.

Also keep in mind that some evalf() calls can be very slow. This can
obviously happen if the expression is large, but it can also happen
for some Integrals and Sums in the current code.

Here's an example (from https://github.com/sympy/sympy/issues/10800)

Integral(sin(exp(x)), (x, 1, oo)).evalf() # hangs

Obviously we need to fix that to evaluate faster, but the point is
that even if we did that, evalf is inherently a very expensive
operation in general.

On Fri, Sep 22, 2023 at 3:16 PM Chris Smith <smi...@gmail.com> wrote:
>
> > This is a great example of code that ends up being dependent on
> automatic evaluation behavior which makes that behavior harder to
> remove.
>
> If that is removed then the test will be changed to whatever mechanism will allow the sign of `self - i` to be determined. And for an arbitrary function I don't know how you can do that without evaluation (even if that evaluation does something like ball/interval arithmetic to determine that the range of possible values allows the inequality to be determined). Computing with `n(2)` is much faster than trying to infer the value of a complicated expression.
>
> Do you have ideas about how we can tell if `sqrt(log(100)+log(log(100))) > 2` that is faster than evaluation?

Computing whether a > b is a hard problem. It is computationally
undecidable in general by Richardson's theorem (a = b is equivalent to
~(a > b) & ~(a < b)). If you have something like
sin(1)**2 + cos(1)**2 > 1, and you try to compute it numerically, you
cannot get a definite answer.

So the only reasonable approach is to use something that is fast, but
might give an indeterminate answer, and give users an option to extend
to something more computationally expensive.
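A rough sketch of that approach, using only Python's `decimal` module rather than SymPy: a comparison that returns True, False, or None, and where the caller chooses how much precision to spend. The name `try_gt`, the precision ladder, and the 5-digit guard band are all assumptions made for illustration; the guard band is a heuristic, not rigorous interval arithmetic.

```python
from decimal import Decimal, localcontext


def try_gt(lhs, rhs, precisions=(28, 60, 130)):
    """Return True/False if lhs() > rhs() looks decidable, else None.

    lhs and rhs are zero-argument callables so that they are
    re-evaluated under each working precision.
    """
    for prec in precisions:
        with localcontext() as ctx:
            ctx.prec = prec
            a, b = lhs(), rhs()
            diff = a - b
            scale = max(abs(a), abs(b), Decimal(1))
            # Differences inside this band are treated as rounding noise.
            tol = scale * Decimal(10) ** (5 - prec)
            if abs(diff) > tol:
                return diff > 0
    return None  # budget exhausted without a definite answer


def near_one():
    return Decimal(1) - Decimal(10) ** -120


def one():
    return Decimal(1)


def chris():
    # sqrt(log(100) + log(log(100))), the expression asked about earlier
    return (Decimal(100).ln() + Decimal(100).ln().ln()).sqrt()


print(try_gt(chris, lambda: Decimal(2)))                          # decided cheaply
print(try_gt(near_one, one, precisions=(28,)))                    # None: too little precision
print(try_gt(near_one, one))                                      # decided at 130 digits
print(try_gt(lambda: Decimal(2).sqrt() ** 2, lambda: Decimal(2)))  # None at every precision
```

The last case plays the role of `sin(1)**2 + cos(1)**2 > 1`: the two sides are mathematically equal, so no amount of numerical precision separates them and the only honest answer is "indeterminate".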

I know the code here is coming from __int__, which can only return a
number or raise an exception. I would say here that it should just
raise an exception when too much computation is required, and there
should be a separate SymPy API that can return an unevaluated
expression and has options for controlling how much computation is
done.
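As an illustration of the "raise instead of guessing" behavior, here is a minimal sketch in plain Python with `decimal`. `strict_int` and `TooExpensive` are hypothetical names, and the precision budget and guard band are arbitrary assumptions; this is not how SymPy's `__int__` is implemented.

```python
from decimal import Decimal, localcontext


class TooExpensive(ArithmeticError):
    """Hypothetical error: integer part not decidable within the budget."""


def strict_int(value, precisions=(28, 60)):
    """Truncate value() toward zero, or raise if it is too close to an integer.

    value is a zero-argument callable, re-evaluated at each precision.
    """
    for prec in precisions:
        with localcontext() as ctx:
            ctx.prec = prec
            v = value()
            nearest = v.to_integral_value()
            tol = max(abs(v), Decimal(1)) * Decimal(10) ** (5 - prec)
            if abs(v - nearest) > tol:
                # Clearly on one side of the nearest integer: safe to truncate.
                return int(v)
    raise TooExpensive("integer part undecided within the precision budget")


def near_one():
    return Decimal(1) - Decimal(10) ** -120


print(strict_int(lambda: Decimal(2).sqrt()))            # 1
print(strict_int(near_one, precisions=(28, 60, 130)))   # 0, with a larger budget
try:
    strict_int(near_one)                                 # default budget is too small
except TooExpensive as exc:
    print("raised:", exc)
```

With the default budget the function refuses to answer rather than returning 1, and the caller can opt in to more work by passing a longer precision ladder.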

But if your model is "compute everything immediately with automatic
evaluation", you have absolutely no control over how much computation
is done. And you end up with defaults that try to do way too much
computation, which ends up slowing things down.

Aaron Meurer