
fyi, MIT Integration Bee problems added to CAS integration tests


Nasser M. Abbasi

Mar 3, 2023, 1:57:11 PM

FYI

MIT Integration Bee problems are now included in the
CAS integration tests.

These problems came from https://math.mit.edu/~yyao1/integrationbee.html

I updated the summer 2022 edition of the CAS integration
tests pages to include these problems, showing the results
for all CAS systems currently supported. They can be
found under the link called

"links to individual test reports"

Starting at file #211 in the list

<https://12000.org/my_notes/CAS_integration_tests/reports/summer_2022/index.htm/>

at the very bottom of the page
(one file per year starting from 2010, and one per
competition held), so they match the order shown on the MIT
page above.

A number (maybe half) of the MIT integration problems
are definite integrals. Those were solved as indefinite
integrals only, as that is the only mode supported.

316 new integrals were added. The total number of
integrals now is 85,795.

These are the percentages solved by each CAS just
for the MIT problems section (i.e. 316 problems).

=============

1. Mathematica 13.2.1 98.73 %
2. Fricas 1.3.8/sage 9.8 96.52 %
3. Maple 2022.2 94.3 %
4. Rubi 4.16.1 93.35 %
5. Maxima 5.46/sage 9.8 92.41 %
6. Giac 1.9.0-37/sage 9.8 91.77 %
7. Mupad Matlab 2021a 89.56 %
8. Sympy 1.11.1 82.28 %
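
The percentages above can be turned back into problem counts. The following is a minimal sketch, assuming each figure is simply solved/316 rounded to two decimals (the helper name `solved_count` is made up for illustration and is not part of the test suite):

```python
# Published "% solved" figures for the 316 MIT problems (copied from the list above).
TOTAL = 316
percent_solved = {
    "Mathematica": 98.73,
    "Fricas": 96.52,
    "Maple": 94.3,
    "Rubi": 93.35,
    "Maxima": 92.41,
    "Giac": 91.77,
    "Mupad": 89.56,
    "Sympy": 82.28,
}


def solved_count(percent, total=TOTAL):
    """Nearest whole number of problems matching a rounded percentage (assumption)."""
    return round(percent * total / 100)


counts = {name: solved_count(p) for name, p in percent_solved.items()}
for name, solved in counts.items():
    print(f"{name:12s} solved {solved:3d}, unsolved {TOTAL - solved:3d}")
```

Under that assumption, the unsolved count for each CAS is just 316 minus the solved count, which gives a quick sanity check against the length of any per-CAS failure list.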

If you find any problems or issues, please let me know so I can fix them.

--Nasser

Nasser M. Abbasi

Mar 4, 2023, 3:33:11 AM

On 3/3/2023 12:57 PM, Nasser M. Abbasi wrote:

>
> "links to individual test reports"
>
> Starting at file #211 in the list
>
> <https://12000.org/my_notes/CAS_integration_tests/reports/summer_2022/index.htm/>
>
> At the very bottom of the page.
>
...
>
> =============
>
> 1. Mathematica 13.2.1 98.73 %
> 2. Fricas 1.3.8/sage 9.8 96.52 %
> 3. Maple 2022.2 94.3 %
> 4. Rubi 4.16.1 93.35 %
> 5. Maxima 5.46/sage 9.8 92.41 %
> 6. Giac 1.9.0-37/sage 9.8 91.77 %
> 7. Mupad Matlab 2021a 89.56 %
> 8. Sympy 1.11.1 82.28 %
>


Fyi;

The following are the integrals from the MIT integration
problems which were not solved, broken down by CAS.

The syntax used for the integrand below is that
from the Maple file which might need small modification
(if any) to make it run on each specific CAS. In all, the
integration variable is x.

The Hall of fame problem that no CAS could solve is this one:

(2018*x^2017+2017*x^2016)/(x^4036+2*x^4035+x^4034+1)

Its anti-derivative is arctan(x^2017+x^2018), as verified below:

integrand := (2018*x^2017+2017*x^2016)/(x^4036+2*x^4035+x^4034+1);
anti := arctan(x^2017+x^2018);

simplify(diff(anti,x)-integrand)

0
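
The same check can be done without a CAS, since d/dx arctan(u) = u'/(1+u^2) with u = x^2017+x^2018. The following is a small sketch using plain Python dictionaries (exponent -> coefficient) as polynomials; the helper names are hypothetical and only serve this illustration:

```python
from collections import defaultdict


def poly_mul(p, q):
    """Multiply two sparse polynomials stored as {exponent: coefficient}."""
    out = defaultdict(int)
    for ep, cp in p.items():
        for eq, cq in q.items():
            out[ep + eq] += cp * cq
    return {e: c for e, c in out.items() if c != 0}


def poly_add(p, q):
    """Add two sparse polynomials."""
    out = defaultdict(int)
    for poly in (p, q):
        for e, c in poly.items():
            out[e] += c
    return {e: c for e, c in out.items() if c != 0}


def poly_diff(p):
    """Differentiate a sparse polynomial term by term."""
    return {e - 1: e * c for e, c in p.items() if e != 0}


u = {2017: 1, 2018: 1}                           # u = x^2017 + x^2018
numerator = {2017: 2018, 2016: 2017}             # numerator of the integrand
denominator = {4036: 1, 4035: 2, 4034: 1, 0: 1}  # denominator of the integrand

# d/dx arctan(u) = u' / (1 + u^2): compare both parts with the integrand.
derivative_matches = (poly_diff(u) == numerator
                      and poly_add(poly_mul(u, u), {0: 1}) == denominator)
print(derivative_matches)
```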

Mathematica:
-------------
(2018*x^2017+2017*x^2016)/(x^4036+2*x^4035+x^4034+1)
(-x^4+4*x^3-7*x^2+6*x+1)^11
(x+exp(1)+1)*x^exp(x)*exp(x)
ln(x/Pi)/(ln(x)^ln(exp(1)*Pi))

Fricas:
-------
ln(x+1)/(x^2+1)
sin(101*x)*sin(x)^99
x*(1-x)^2014 Exception raised: RecursionError >> maximum recursion depth exceeded
(2018*x^2017+2017*x^2016)/(x^4036+2*x^4035+x^4034+1) Timed out
cos(3*x)+sin(2*x)*(-sin(2019*x)+cos(3*x)) Timed out
x*(1-x)^2020 Exception raised: RecursionError >> maximum recursion depth exceeded
(1-x)^3+(-x^2+x)^3+(x^2-1)^3-3*(1-x)*(-x^2+x)*(x^2-1)
(x+exp(1)+1)*x^exp(x)*exp(x)
(3*x^3+2*x^2+1)/(x^2+1)^(1/3)
(1/x*ln(1/x))^(1/2)
2^(1/2)*ln(x)^(1/2)+1/2*2^(1/2)/ln(x)^(1/2)

Maple:
----------
3*x^2*(x^3+1)^2*exp(-x^6-2*x^3)
cos(x)^(cos(x)+1)*tan(x)*(1+ln(cos(x)))
(2018*x^2017+2017*x^2016)/(x^4036+2*x^4035+x^4034+1)
1/(x^(41/25)+x^(9/25))
x^(1/ln(x))
(x+exp(1)+1)*x^exp(x)*exp(x)
(2*x^2022+1)/(x^2023+x)
ln(x/Pi)/(ln(x)^ln(exp(1)*Pi))
(1-(-1/2*Pi+arcsin(sin(x)))^2)^(1/2)
(1/x*ln(1/x))^(1/2)
arcsin(x)*arccos(x)
2^(1/2)*ln(x)^(1/2)+1/2*2^(1/2)/ln(x)^(1/2)
x^(-ln(x))
((x+1)^(1/2)-x^(1/2))^Pi
sin(4*arctan(x))
tan(x)^(1/3)/(cos(x)+sin(x))^2
(1+x^2+(x^4+x^2+1)^(1/2))^(1/2)

Rubi:
-----
1/(sin(x)+sec(x))
(cos(x)*ln(x)-sin(x)/x)/ln(x)^2
exp(exp(x)+exp(-x)+x)-exp(exp(x)+exp(-x)-x)
(1+2*x*exp(x^2))*cos(x)-(x+exp(x^2))*sin(x)
cos(x)*cosh(x)+sin(x)*sinh(x)
(2018*x^2017+2017*x^2016)/(x^4036+2*x^4035+x^4034+1)
x^(x^2+1)*(1+2*ln(x))
sin(x+sin(x))-sin(x-sin(x))
(1+ln(x))*ln(ln(x))
(x+exp(1)+1)*x^exp(x)*exp(x)
sin(x)/(2*exp(x)+cos(x)+sin(x))
(1-(-1/2*Pi+arcsin(sin(x)))^2)^(1/2)
(cos(x)-sin(x))/(2+sin(2*x))
arcsin(x)*arccos(x)
ln(3^(1/2)+tan(x))
exp(-2*x)*sin(3*x)/x
exp(x+1/x)*(x^6+x^4-x^2-1)/x^4
x^(-ln(x))
exp(cos(x))*cos(2*x+sin(x))
sin(4*arctan(x))
(1+x^2+(x^4+x^2+1)^(1/2))^(1/2)


Maxima:
-------
1/(sin(x)+sec(x))
(x/(-x^3+1))^(1/2)
sin(101*x)*sin(x)^99
1/(1+x^(1/2))/(-x^2+x)^(1/2)
x^(1/2)/((2012-x)^(1/2)+x^(1/2))
(-1+x)/(x+1)/(x^3+x^2+x)^(1/2)
(x+(x^2+1)^(1/2))^(1/2)
(1+sin(x))^(1/2)
(2018*x^2017+2017*x^2016)/(x^4036+2*x^4035+x^4034+1)
1/(x^(41/25)+x^(9/25))
1/(x^(3/2)-x^2)^(1/2)
x/((-1+x)^(1/2)+(x+1)^(1/2))
exp(exp(x))-exp(-x+exp(x))
(x+exp(1)+1)*x^exp(x)*exp(x)
ln(x/Pi)/(ln(x)^ln(exp(1)*Pi))
(3*x^3+2*x^2+1)/(x^2+1)^(1/3)
(cos(x)-sin(x))/(2+sin(2*x))
(1/x*ln(1/x))^(1/2)
1/((x+1)^3*(-1+x))^(1/2)
sin(23*x)/sin(x)
x^(-ln(x))
exp(cos(x))*cos(2*x+sin(x))
((x+1)^(1/2)-x^(1/2))^Pi
(1+x^2+(x^4+x^2+1)^(1/2))^(1/2)

Giac:
-----
ln(x+1)/(x^2+1)
(csc(x)-sin(x))^(1/2)
1/x^2/(x^4+1)^(3/4)
x^(1/2)/((2012-x)^(1/2)+x^(1/2))
(-1+x)/(x+1)/(x^3+x^2+x)^(1/2)
(csc(x)-sin(x))^(1/2)
(x+(x^2+1)^(1/2))^(1/2)
(csc(x)-sin(x))^(1/2)
sin(x+1/4*Pi)^2/exp(x^2)
cos(x)^(cos(x)+1)*tan(x)*(1+ln(cos(x)))
(2018*x^2017+2017*x^2016)/(x^4036+2*x^4035+x^4034+1)
exp(-2019/4/x^2)/x^2
x^(x^2+1)*(1+2*ln(x))
exp(exp(x))-exp(-x+exp(x))
(x+exp(1)+1)*x^exp(x)*exp(x)
ln(x/Pi)/(ln(x)^ln(exp(1)*Pi))
(1-(-1/2*Pi+arcsin(sin(x)))^2)^(1/2)
(3*x^3+2*x^2+1)/(x^2+1)^(1/3)
(1/x*ln(1/x))^(1/2)
sin(1/x^11)
x*(exp(-x)+1)/(exp(x)-1)
ln(3^(1/2)+tan(x))
x^(1/3)*(1-x)^(2/3)
x*cot(x)
((x+1)^(1/2)-x^(1/2))^Pi
(1+x^2+(x^4+x^2+1)^(1/2))^(1/2)

Mupad:
------
ln(x+1)/(x^2+1)
(x/(-x^3+1))^(1/2)
1/(1+x^(1/2))/(-x^2+x)^(1/2)
x*arcsin(x)/(-x^2+1)^(1/2)
sin(x)*(1+tan(x)^2)^(1/2)
sin(x)*ln(sin(x))
1/(1-ln(1-x))
(x+(x^2+1)^(1/2))^(1/2)
sin(x+1/4*Pi)^2/exp(x^2)
(2018*x^2017+2017*x^2016)/(x^4036+2*x^4035+x^4034+1)
1/(x^(3/2)-x^2)^(1/2)
exp(-x^40)
arcsin(x)/x^3
(arctan(x)+arccot(x))/x
(x+exp(1)+1)*x^exp(x)*exp(x)
(2*x^2022+1)/(x^2023+x)
ln(x/Pi)/(ln(x)^ln(exp(1)*Pi))
(1-(-1/2*Pi+arcsin(sin(x)))^2)^(1/2)
(3*x^3+2*x^2+1)/(x^2+1)^(1/3)
(sec(1+ln(x))^2-tan(1+ln(x)))/x^2
(1/x*ln(1/x))^(1/2)
(4-(x+1)^2)^(1/2)-3^(1/2)-(-x^2+4)^(1/2)
arcsin(x)*arccos(x)
sin(1/x^11)
x*(exp(-x)+1)/(exp(x)-1)
ln(3^(1/2)+tan(x))
x^(1/3)*(1-x)^(2/3)
x*cot(x)
x^(-ln(x))
cos(1/2*Pi*x^2*2^(1/2))^2
((x+1)^(1/2)-x^(1/2))^Pi
tan(x)^(1/3)/(cos(x)+sin(x))^2
(1+x^2+(x^4+x^2+1)^(1/2))^(1/2)

Sympy:
------
sqrt(tan(x))
ln(x+1)/(x^2+1)
x^(1/2)/(x^(1/2)-x^(1/3))
1/(sin(x)+sec(x))
1/(1+exp(x)+exp(2*x))^(1/2)
(csc(x)-sin(x))^(1/2)
1/(9*cos(x)^2+4*sin(x)^2)
(x/(-x^3+1))^(1/2)
sin(101*x)*sin(x)^99
((1-x)/(x+1))^(1/2)
1/(1+x^(1/2))/(-x^2+x)^(1/2)
x^(1/2)/((2012-x)^(1/2)+x^(1/2))
(-1+x)/(x+1)/(x^3+x^2+x)^(1/2)
(csc(x)-sin(x))^(1/2)
sin(x)*(1+tan(x)^2)^(1/2)
x*sec(4*x)^2
1/(1-ln(1-x))
exp(sin(x))/tan(x)/csc(x)
(csc(x)-sin(x))^(1/2)
1/(sin(x)^4+cos(x)^4)
(1+2*x*exp(x^2))*cos(x)-(x+exp(x^2))*sin(x)
arccosh(x)
tanh(x)/exp(x)
(1+sin(x))^(1/2)
sin(x+1/4*Pi)^2/exp(x^2)
cos(x)/(1-cos(2*x))
(2018*x^2017+2017*x^2016)/(x^4036+2*x^4035+x^4034+1)
1/(x^(41/25)+x^(9/25))
1/(x^(3/2)-x^2)^(1/2)
exp(x+exp(x))+exp(x-exp(x))
(sin(20*x)+sin(19*x))/(cos(20*x)+cos(19*x))
(arctan(x)+arccot(x))/x
x/((-1+x)^(1/2)+(x+1)^(1/2))
sin(x+sin(x))-sin(x-sin(x))
1/(1+sin(x))+1/(1+cos(x))+1/(tan(x)+1)+1/(1+cot(x))+1/(1+sec(x))+1/(1+csc(x))
(x+exp(1)+1)*x^exp(x)*exp(x)
x^2/(-x^2+2)+2^(1/2)*(x/(x+1))^(1/2)
(2*x^2022+1)/(x^2023+x)
(1-(-1/2*pi+arcsin(sin(x)))^2)^(1/2)
(cos(x)-sin(x))/(2+sin(2*x))
(sec(1+ln(x))^2-tan(1+ln(x)))/x^2
(1/x*ln(1/x))^(1/2)
x*(exp(-x)+1)/(exp(x)-1)
ln(3^(1/2)+tan(x))
((sin(20*x)+3*sin(21*x)+sin(22*x))^2+(cos(20*x)+3*cos(21*x)+cos(22*x))^2)^(1/2)
exp(-2*x)*sin(3*x)/x
x*cot(x)
1/((x+1)^3*(-1+x))^(1/2)
x^(-ln(x))
exp(cos(x))*cos(2*x+sin(x))
sin(4*arctan(x))
tan(x)^(1/3)/(cos(x)+sin(x))^2
sin(2*x)^2*sin(3*x)^2*sin(5*x)^2*sin(30*x)^2/sin(x)^2/sin(6*x)^2/sin(10*x)^2/sin(15*x)^2
(1+x^2+(x^4+x^2+1)^(1/2))^(1/2)

--Nasser

Sam Blake

Mar 4, 2023, 5:01:00 AM

Nice one, Nasser! Very interesting to see the performance of each CAS on these integrals.

Sam

Nasser M. Abbasi

Mar 5, 2023, 10:44:01 AM

On 3/4/2023 2:33 AM, Nasser M. Abbasi wrote:
> On 3/3/2023 12:57 PM, Nasser M. Abbasi wrote:
>
>>
>> "links to individual test reports"
>>
>> Starting at file #211 in the list
>>
>> <https://12000.org/my_notes/CAS_integration_tests/reports/summer_2022/index.htm/>
>>
>> At the very bottom of the page.
>>
>> =============
>>
>> 1. Mathematica 13.2.1 98.73 %
>> 2. Fricas 1.3.8/sage 9.8 96.52 %
>> 3. Maple 2022.2 94.3 %
>> 4. Rubi 4.16.1 93.35 %
>> 5. Maxima 5.46/sage 9.8 92.41 %
>> 6. Giac 1.9.0-37/sage 9.8 91.77 %
>> 7. Mupad Matlab 2021a 89.56 %
>> 8. Sympy 1.11.1 82.28 %
>>
>

Fyi,

Small update.

I've merged all 20 or so MIT files into a single
file to make it easier to manage and to make the
statistics easier to see. The statistics are per file,
so before this it was hard to see overall stats while
the problems were split across different files.

I've also added a zip file with all these problems in
Mathematica, Maple, sagemath and Python format. Link to
the zip file is at top of main page, in the
introduction section.

So for all the MIT bee integration problems, the following
are additional stats to the ones given above

A Grading:
============
1. Mathematica 87.03 %
2. Rubi 86.08 %
3. Fricas 76.95 %
4. Maple 75.63 %
5. Maxima 73.42 %
6. Giac 70.89 %
7. Sympy 61.08 %
8. Mupad N/A (not graded)

Average time used per integral
================================
1. Rubi 0.04 sec
2. Mathematica 0.1 sec
3. Mupad 0.19 sec
4. Maxima 0.43 sec
5. Maple 0.53 sec (note: Maple tries all algorithms)
6. Giac 0.64 sec
7. Fricas 1.33 sec
8. Sympy 2.16 sec

This is the direct link to these problems:

<https://12000.org/my_notes/CAS_integration_tests/reports/summer_2022/test_cases/11_MIT/report.htm/>

Again, these originally were collected and typed in manually from
the MIT math dept website <https://math.mit.edu/~yyao1/integrationbee.html/>
then converted to different formats after that.

--Nasser




Nasser M. Abbasi

Apr 28, 2023, 2:56:57 PM

FYI,

Made a new build of the MIT integration bee test file.

<https://12000.org/my_notes/CAS_integration_tests/reports/summer_2022/test_cases/11_MIT/report.htm>

Thanks to Albert Rich for adding a few more integrals. I rebuilt
this file using the latest Mathematica 13.2.1 and Maple 2023
versions. There are now 321 problems in this file, up from 316.

Current result for % solved is

Mathematica 99.38
Fricas 96.26
Maple 95.33
Rubi 94.39
Maxima 92.52
Giac 91.59
Mupad 90.03
Sympy 82.24

In terms of % of getting an A grade for quality of antiderivatives
that were solved, the result is

Rubi 89.10
Mathematica 87.54
Maple 80.06
Fricas 77.88
Maxima 75.88
Giac 73.52
Sympy 62.31
Mupad Not graded.

Any problems please let me know.

--Nasser




nob...@nowhere.invalid

Apr 29, 2023, 12:07:33 PM

"Nasser M. Abbasi" wrote:
This is an interesting problem collection.

Many of the integrands on which Rubi fails are of the "unnatural" type
Waldek used to test his non-algebraic Risch integrator: derivatives of
more or less randomly assembled "elementary" operators. But others
would not automatically be expected to defeat Rubi, such as #20, #180,
#257, #265, #298, #307:

<https://12000.org/my_notes/CAS_integration_tests/reports/summer_2022/test_cases/11_MIT/reportsubsection5.htm#x9-180002.1.1>

In these cases Rubi fails on:

Int[(Sec[x] + Sin[x])^(-1), x]
Int[Cos[x]*Cosh[x] + Sin[x]*Sinh[x], x]
Int[(Cos[x] - Sin[x])/(2 + Sin[2*x]), x]
Int[ArcCos[x]*ArcSin[x], x]
Int[(E^(x^(-1) + x)*(-1 - x^2 + x^4 + x^6))/x^4, x]
Int[x^(-Log[x]), x]

The last integral could just be a terminal rule, I suppose; I am not so
sure what to do about the next-to-last one. Some of the integrands
causing Rubi to fail may be considered ill-posed, like #250, #317:

Int[Sqrt[1 - ArcCos[Sin[x]]^2], x]
Int[Sin[4*ArcTan[x]], x]

They should perhaps be preprocessed by a strong simplifier at the
user's discretion.

Also note that FriCAS did not really fail on #287:

<https://12000.org/my_notes/CAS_integration_tests/reports/summer_2022/test_cases/11_MIT/reportsubsection9.htm#x13-220002.1.5>

Here, a polynomial integrand equals zero:

integrate((1-x)^3+(-x^2+x)^3+(x^2-1)^3-3*(1-x)*(-x^2+x)*(x^2-1), x)

Martin.

Nasser M. Abbasi

Apr 29, 2023, 3:51:28 PM

On 4/29/2023 11:09 AM, clicl...@freenet.de wrote:

>
> Also note that FriCAS did not really fail on #287:
>
> <https://12000.org/my_notes/CAS_integration_tests/reports/summer_2022/test_cases/11_MIT/reportsubsection9.htm#x13-220002.1.5>
>
> Here, a polynomial integrand equals zero:
>
> integrate((1-x)^3+(-x^2+x)^3+(x^2-1)^3-3*(1-x)*(-x^2+x)*(x^2-1), x)
>
> Martin.

Thanks. Yes, I knew about this but forgot to fix it.

I finally added code to handle this special case for Fricas.

Fricas has a bug where it can return zero on non-zero integrands

<https://groups.google.com/g/fricas-devel/c/OcHBQgoBONM>

----------------------
>fricas
FriCAS Computer Algebra System
Version: FriCAS 1.3.8

(1) -> integrate(x/sqrt(1-x^3),x)
(1) 0
1) -> integrate(x/(-x^3+1)^(1/2),x)
(1) 0
(2) -> integrate(x/(-x^3+1)^(1/2),x)
(2) 0
(3) -> integrate(1/2*(log(a*x-1)-2*log(-(a*x-1)^(1/2)))/pi/(a*x-1)^(1/2),x)
(3) 0
------------------------------

Hopefully the above will be fixed in Fricas 1.3.9.

So the program was checking whether the integrand was
not zero while the anti-derivative was zero, and marking the problem as failed.

The test program should first fully simplify the integrand
and only then do the checking. It was not doing this.

So I just fixed this and updated the page.

--------from Fricas sagemath script -----

if anti.full_simplify()==0:
    if integrand.full_simplify()==0: #add full_simplify()
        return passed
    else:
        return failed
----------------------------------------
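
The idea behind the fix can be sketched without Sage. The following is not the test suite's code: it replaces `full_simplify()` with an exact rational evaluation that only works for polynomial integrands, and all function names are hypothetical. It shows that the #287 integrand vanishes identically, so a zero anti-derivative is the correct answer:

```python
from fractions import Fraction


def integrand_287(x):
    """The MIT problem #287 integrand, written as a^3 + b^3 + c^3 - 3abc."""
    a, b, c = 1 - x, -x**2 + x, x**2 - 1
    return a**3 + b**3 + c**3 - 3 * a * b * c


def is_zero_polynomial(f, max_degree):
    """A polynomial of degree <= max_degree that vanishes at
    max_degree + 1 distinct points is identically zero."""
    points = [Fraction(k, 3) for k in range(max_degree + 1)]
    return all(f(x) == 0 for x in points)


def grade_zero_antiderivative(integrand_is_zero):
    """Stand-in for the pass/fail decision when the CAS returns 0."""
    return "passed" if integrand_is_zero else "failed"


# The integrand has degree at most 6, so 7 sample points are enough.
result = grade_zero_antiderivative(is_zero_polynomial(integrand_287, max_degree=6))
print(result)
```

The integrand is zero because a + b + c = 0 for a = 1-x, b = -x^2+x, c = x^2-1, and a^3+b^3+c^3-3abc always has a+b+c as a factor.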

Now Fricas gets a pass on this one with A grade.

<https://12000.org/my_notes/CAS_integration_tests/reports/summer_2022/test_cases/11_MIT/reportsubsection303.htm#x313-3220003.3.87>

I also fixed a minor issue in one of Rubi's results.
So this is the current table of passing percentages, where
Fricas got a very small improvement in its score.

<https://12000.org/my_notes/CAS_integration_tests/reports/summer_2022/test_cases/11_MIT/report.htm>

MIT integration bee passing score
----------------------------
Mathematica 99.38
Fricas 96.57
Maple 95.33
Rubi 94.08
Maxima 92.52
Giac 91.59
Mupad 90.03
Sympy 82.24

any other problems please let me know.

--Nasser

Nasser M. Abbasi

Apr 29, 2023, 7:24:29 PM

On 4/29/2023 11:09 AM, clicl...@freenet.de wrote:
>
> <https://12000.org/my_notes/CAS_integration_tests/reports/summer_2022/test_cases/11_MIT/reportsubsection5.htm#x9-180002.1.1>
>
> In these cases Rubi fails on:
>
> Int[(Sec[x] + Sin[x])^(-1), x]
> Int[Cos[x]*Cosh[x] + Sin[x]*Sinh[x], x]
> Int[(Cos[x] - Sin[x])/(2 + Sin[2*x]), x]
> Int[ArcCos[x]*ArcSin[x], x]
> Int[(E^(x^(-1) + x)*(-1 - x^2 + x^4 + x^6))/x^4, x]
> Int[x^(-Log[x]), x]
>
> The last integral could just be a terminal rule, I suppose; I am not so
> sure what to do about the next-to-last one. Some of the integrands
> causing Rubi may be considered ill-posed, like #250, #317:
>
> Int[Sqrt[1 - ArcCos[Sin[x]]^2], x]
> Int[Sin[4*ArcTan[x]], x]
>
> They should perhaps be preprocessed by a strong simplifier at the
> user's discretion.
>

Yes, for example, for the second one above

Int[Cos[x]*Cosh[x] + Sin[x]*Sinh[x], x]

Rubi seems not to have a rule for the product of
circular trig functions with the hyperbolic trig functions:

Int[Cos[x]*Cosh[x], x]

Returns unsolved. But after first converting the integrand to exponentials,
Rubi can solve it:

Int[ TrigToExp[Cos[x]*Cosh[x]] , x] // FullSimplify

1/2 (Cosh[x] Sin[x] + Cos[x] Sinh[x])

Which is the same result given by other CAS systems directly:

Integrate[Cos[x]*Cosh[x], x]

1/2 Cosh[x] Sin[x] + 1/2 Cos[x] Sinh[x]

But the CAS integration test program does not do any special
preprocessing or simplification on the input or the output
when running these tests.
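
For reference, the anti-derivative quoted above for Cos[x]*Cosh[x] can be checked by hand, and the same product rule gives a closed form for the full #180 integrand. This is a short worked derivation, not output from any of the tested systems:

```latex
\begin{align*}
\frac{d}{dx}\,\tfrac12\bigl(\cosh x\,\sin x + \cos x\,\sinh x\bigr)
  &= \tfrac12\bigl(\sinh x\,\sin x + \cosh x\,\cos x - \sin x\,\sinh x + \cos x\,\cosh x\bigr)
   = \cos x\,\cosh x, \\
\frac{d}{dx}\,\bigl(\sin x\,\cosh x\bigr)
  &= \cos x\,\cosh x + \sin x\,\sinh x .
\end{align*}
```

So sin(x)*cosh(x) is an anti-derivative of the integrand Cos[x]*Cosh[x] + Sin[x]*Sinh[x].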

--Nasser

Albert Rich

Apr 30, 2023, 4:52:08 AM

In his post above, Martin points out Rubi fails on many “unnatural” integrands. Or as I prefer to call them: “contrived” integration problems.

In order to keep this project at least theoretically finite in nature, Rubi does not worry about indefinite integrals having relatively simple antiderivatives but complicated integrands. Such gotcha problems are easily generated by differentiating simple expressions to produce complicated integrands.

Rather, the goal for Rubi is to produce optimal antiderivatives for ALL members of a fixed set of general forms. For example, one such form is

P[x] (d+e x^n)^q (a+b x^n+c x^(2 n))^p

where P[x] is any polynomial in x and the exponents n, p, q can be integer, fractional or symbolic. This is a huge class of expressions requiring hundreds of rules to integrate.

In short, I contend that a symbolic integrator should strive to get such coherent sets of real-world integrands before worrying about contrived ones…

Martin listed 6 MIT integration problems of the “natural” type that Rubi should be able to integrate. The version of Rubi currently under development is able to find optimal antiderivatives for

#20 – 1/(Sin[x]+Sec[x])
#265 – ArcSin[x]*ArcCos[x]
#307 – x^(-Log[x])
#317 – Sin[4*ArcTan[x]]

And non-optimal antiderivatives for

#180 – Cos[x]*Cosh[x] + Sin[x]*Sinh[x]
#257 – (Cos[x]-Sin[x]) / (2+Sin[2*x])

Not quite sure why he considers “ill-posed” the MIT problem

#250 – Sqrt[1-ArcCos[Sin[x]]^2]

How should it be posed?

Albert

Nasser M. Abbasi

May 9, 2023, 9:48:23 PM

Fyi,

For exploration only, ChatGPT was added to this one file
of integration problems (MIT problems). Same link as above.

The following is the result.

ChatGPT version 3.5 was used. It is known that ChatGPT is not
meant to be used for solving math problems, as its results
can be inaccurate. It says so on the OpenAI web site:

"ChatGPT may produce inaccurate information about
people, places, or facts"

But given all that, I just wanted to see how current
A.I. based on neural networks does on solving
math integration problems.

This was done by directly using ChatGPT and issuing
manually each integrate command for each problem
and then copying the result from the web page.

It was actually able to produce a result for almost 90% of the
problems; however, when the results were verified, many
did not verify and hence were counted as not solved.

The following is the final % solved score

Mathematica 13.2.1 99.38
Fricas 1.3.8 96.57
Maple 2023 95.33
Rubi 4.16.1 94.08
Maxima 5.46 92.52
Giac 1.9.0-41 91.59
Mupad Matlab 2021a 90.03
Sympy 1.11.1 82.24
ChatGPT 3.5 14.33

Still, I think scoring 14% is not bad, considering ChatGPT
did this without running an actual integration algorithm
but by just training on textual data collected from
the internet.

Maybe in a few more years, with more training, it can score much higher?

One interesting thing I noticed, is that issuing the
same integrate command many times resulted in slightly
different output.

So most of the time, I just used the first result; otherwise,
I would never be able to finish this. I am not sure why this happens.

Someone mentioned at another forum that ChatGPT tries to
purposely slightly change its answer each time when asked the
same question. I do not know if this is true or not.

Either way, I think the above result shows that A.I. is not going
to replace CAS systems anytime soon, at least when it comes to
solving integration problems.

Any problems, please let me know.

--Nasser



Nasser M. Abbasi

May 10, 2023, 3:37:13 AM

Fyi,

There is a paper on arXiv on ChatGPT doing mathematics;
it also has a small section about solving integration problems using
ChatGPT

https://arxiv.org/pdf/2301.13867.pdf

--Nasser

nob...@nowhere.invalid

Jun 4, 2023, 7:44:40 AM

"Nasser M. Abbasi" wrote:
>
> On 4/29/2023 11:09 AM, clicl...@freenet.de wrote:
>
> >
> > Also note that FriCAS did not really fail on #287:
> >
> > <https://12000.org/my_notes/CAS_integration_tests/reports/summer_2022/test_cases/11_MIT/reportsubsection9.htm#x13-220002.1.5>
> >
> > Here, a polynomial integrand equals zero:
> >
> > integrate((1-x)^3+(-x^2+x)^3+(x^2-1)^3-3*(1-x)*(-x^2+x)*(x^2-1), x)
> >
>
> Thanks. Yes, I knew about this but forget to fix it.
>
> I finally added code to handle this special case for Fricas.
>
> Fricas has bug where it can return zero on non-zero integrands
>
> <https://groups.google.com/g/fricas-devel/c/OcHBQgoBONM>
>
> ----------------------
> >fricas
> FriCAS Computer Algebra System
> Version: FriCAS 1.3.8
>
> (1) -> integrate(x/sqrt(1-x^3),x)
> (1) 0
> 1) -> integrate(x/(-x^3+1)^(1/2),x)
> (1) 0
> (2) -> integrate(x/(-x^3+1)^(1/2),x)
> (2) 0
> (3) -> integrate(1/2*(log(a*x-1)-2*log(-(a*x-1)^(1/2)))/pi/(a*x-1)^(1/2),x)
> (3) 0
> ------------------------------
>
> Hopefully the above will be fixed in Fricas 1.3.9.
>

There is little hope in my impression: according to a message of 05 May
2023 on <fricas-devel>, Waldek cannot afford the required extended
commitment for some unspecified time to come:

<https://www.mail-archive.com/fricas...@googlegroups.com/msg15297.html>

Compare this with his wide-ranging "Developement plans" of 27 Aug 2022:

<https://www.mail-archive.com/fricas...@googlegroups.com/msg14888.html>


> So the program was checking if the integrand is
> not zero but the anti-derivative was zero, and making it failed.
>
> The test program should first fully simplify the integrand
> and only then do the checking. It was not doing this.
>
> So I just fixed this and updated the page.
>
> --------from Fricas sagemath script -----
>
> if anti.full_simplify()==0:
> if integrand.full_simplify()==0: #add full_simplify()
> return passed
> else:
> return failed
> ----------------------------------------
>
> Now Fricas gets a pass on this one with A grade.
>
> [...]
>

Martin.

Nasser M. Abbasi

Jun 25, 2023, 2:59:13 PM

On 4/29/2023 2:51 PM, Nasser M. Abbasi wrote:
>
> Fricas has bug where it can return zero on non-zero integrands
>
> <https://groups.google.com/g/fricas-devel/c/OcHBQgoBONM>
>
> ----------------------
>> fricas
> FriCAS Computer Algebra System
> Version: FriCAS 1.3.8
>
> (1) -> integrate(x/sqrt(1-x^3),x)
> (1) 0
> 1) -> integrate(x/(-x^3+1)^(1/2),x)
> (1) 0
> (2) -> integrate(x/(-x^3+1)^(1/2),x)
> (2) 0
> (3) -> integrate(1/2*(log(a*x-1)-2*log(-(a*x-1)^(1/2)))/pi/(a*x-1)^(1/2),x)
> (3) 0
> ------------------------------
>
> Hopefully the above will be fixed in Fricas 1.3.9.
>

fyi, 2 of the above are fixed in the 1.3.9 pre-release

>fricas
FriCAS Computer Algebra System
Version: FriCAS 2023-06-17

(1) -> integrate(x/sqrt(1-x^3),x)
          2 weierstrassZeta(0,4,weierstrassPInverse(0,4,x))
   (1)  - -------------------------------------------------
                                           +---+
                                          \|- 1

(2) -> integrate(x/(-x^3+1)^(1/2),x)
          2 weierstrassZeta(0,4,weierstrassPInverse(0,4,x))
   (2)  - -------------------------------------------------
                                           +---+
                                          \|- 1

(3) -> integrate(1/2*(log(a*x-1)-2*log(-(a*x-1)^(1/2)))/pi/(a*x-1)^(1/2),x)
(3) 0


--Nasser


