You do not have permission to delete messages in this group
Copy link
Report message
Show original message
Either email addresses are anonymous for this group or you need the view member email addresses permission to view the original message
to Automated Mathematics India
---------- Forwarded message --------- From: Siddhartha Gadgil<gad...@iisc.ac.in> Date: Fri, 26 Jul 2024 at 09:29 Subject: AI at the Maths Olympiad To: ALL @ MATH <all....@iisc.ac.in> Cc: Rajesh Sundaresan <raj...@iisc.ac.in>
The system ran for 2 days (so beyond Olympiad time limits). It got 4 out of 6 problems, getting 7/7 in each of them. This meant its score was at the top end of the silver medal range (taking more time is not a big deal as the system
will surely get faster).
In addition to "AlphaGeometry" for solving the geometry problems, the system included "AlphaProof", which uses generative AI with the Lean Theorem Prover.
Indeed, one of the top experts in Lean (Eric Weiser) was part of the team and there was deep integration with Lean including "metaprogramming"
Crucially, the actual problems were solved. A lot of false hype with AI solving Olympiad problems has been where a problem says, "find the number n such that ..." and the people running the system change this to "show that 1324 is
the number such that ..." , which often makes it a much easier problem. This was not done here - the system guessed answers and then tried to prove the guesses in Lean, accepting an answer if it was proved.
That Generative AI paired with Interactive Theorem Provers is what is needed for mathematical reasoning (in a broad sense) is something a large number of people have concluded independently. This seems to show that this is correct
and that such a system has promise.
P.S. My opinion: the mathematics community (in a broad sense, including applied mathematics that is theorem-proof based at least) has two choices,
Work on and think about tools and workflows that will allow us to use AI (probably coupled with Lean) in our research (and teaching).
Continue as usual and let AI in mathematics develop in parallel. Essentially this means betting that a system that can autonomously do mathematics at a level equal to or better than
us (come up with questions from the literature, answer some, decide which are non-trivial, write a paper and submit it) is sufficiently far in the future (i.e., many decades rather than, say, 10 years).
Whether or not a system can be a Gromov or Deligne, if it can do work better than most papers of most mathematicians it will already be disruptive.