AI achieves silver-medal standard solving International Mathematical Olympiad problems


Georgi Guninski

Jul 27, 2024, 11:16:56 AM
to sage-...@googlegroups.com
https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/
AI achieves silver-medal standard solving International Mathematical
Olympiad problems

Today, we present AlphaProof, a new reinforcement-learning based
system for formal math reasoning, and AlphaGeometry 2, an improved
version of our geometry-solving system. Together, these systems solved
four out of six problems from this year’s International Mathematical
Olympiad (IMO), achieving the same level as a silver medalist in the
competition for the first time.

ED: There is a catch: the problems were translated by a human:

First, the problems were manually translated into formal mathematical
language for our systems to understand. In the official competition,
students submit answers in two sessions of 4.5 hours each. Our systems
solved one problem within minutes and took up to three days to solve
the others.

Kwankyu Lee

Jul 29, 2024, 3:01:43 AM
to sage-devel
Down to earth: ChatGPT on my phone surprised me by solving my final-exam statistics problem from just a photo of the exam sheet in Korean :-) Problems with a "standard" procedural answer seem to be a piece of cake for it.

parisse

Aug 2, 2024, 3:16:24 PM
to sage-devel
The geometry problem was not fully solved, in the sense that AlphaGeometry2 does not prove that the construction is feasible. As far as I understand, they do not construct points X and Y; they assume that such a construction exists:
" coll x b c; x != c; foot d i b c; perp x t1 t1 i; cong i t1 i d; para x t1 a c;"
" We manually construct an approximate diagram. "
I was very surprised to see that they claim their solution would earn a 7/7 mark, especially since the construction of points X and Y is not complicated.
When they announced AlphaGeometry version 1, they claimed to have solved 25 IMO geometry problems posed since 2000. In fact, for two of them, they only proved an implication instead of an equivalence.
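The gap between an implication and a full equivalence can be sketched in Lean 4 (a toy illustration with made-up statements, not the actual IMO problems in question):

```lean
-- Toy example: one direction of an "iff" is strictly weaker
-- than the equivalence itself.
example (n : Nat) : n = 2 → n % 2 = 0 := by
  intro h
  subst h
  rfl

-- A full equivalence would also require the converse,
-- n % 2 = 0 → n = 2, which is false here (4 is even but ≠ 2).
-- Crediting the implication as if it were the equivalence
-- therefore overstates what was proved.
```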

AlphaGeometry2 is also presented as two orders of magnitude more efficient than AlphaGeometry. That looks impressive, but it is not that good: it is still about three orders of magnitude less efficient than Giac/Xcas. I have put up a page with solutions to 41 of the 42 IMO geometry problems posed since 2000; they run in the browser in at most a couple of seconds each: https://xcas.univ-grenoble-alpes.fr/forum/viewtopic.php?f=32&t=2891
The solutions are also available in the check directory of the Giac source code (./chk_oim).