AlphaGeometry and the future of symbolic computation

Sangyub Lee

Aug 18, 2024, 8:34:15 PM
to sympy
AI achieves silver-medal standard solving International Mathematical Olympiad problems - Google DeepMind

Recently, Google announced that their AI models, AlphaProof and AlphaGeometry, reached a silver-medal standard on IMO problems. Their system is a hybrid of neural and symbolic components, and it uses the proof assistant Lean as a backend, which guarantees that the proofs can be verified automatically.
ChatGPT had the problem that it can hallucinate proof steps, leaving humans to keep verifying its results and deciphering the steps, so expressing proofs as formal statements is a gain.

I think this research reinforces that solve, simplify, and integrate are losing the competition, because many of them are built on heuristics that won't win against AI, and we also have concerns that the code around them is getting messy.

I think that if we want to avoid losing this competition, and make symbolic computation work collaboratively with AI systems, it should focus on solving a narrower set of 'formal' problems with 100% precision and speed.

I have already noticed research connecting Wu's method to AlphaGeometry:
[2404.06405] Wu's Method can Boost Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry (arxiv.org)
Although symbolic systems may no longer be a competitive solution for general math problems, 'formal' symbolic systems can still be valuable. (I have also heard that AlphaGeometry2 uses Wu's method, but I am still trying to verify the sources.)

I also think that such advances in AI systems raise concerns about software engineering careers and the educational system, which may be interesting for some readers of this forum.

For example, math exams may become pointless in the future, even for identifying and training good scientists or engineers, because trained AI models can beat IMO contestants. I think that in the AI age, education should change so that it is not about enduring boring and repetitive drills, which do not even reflect the capability of future engineers or scientists.

Also, I notice that software engineering is changing, because AI models can complete a lot of code, their precision is improving, and people are getting better at prompting.
This also seems to be devaluing code-sharing efforts in open source communities, because code can be generated rather than shared.

Peter Stahlecker

Aug 18, 2024, 11:26:52 PM
to sympy
I am not a trained mathematician.
My question: would Gödel's incompleteness theorem not set a limit to what can be proven 'automatically'? 

Maaz Muhammad

Aug 19, 2024, 3:18:57 PM
to sympy
"research reinforces that solve or simplify, or integral is losing competition"

I really don't think so. It's way easier to use a computer algebra system to do the things that these functions do than to use Lean. Every step in Lean has to be "handwritten" by the user since it's an interactive theorem prover, i.e. each step has to be justified by some appeal to a ring axiom or similar. Simplifying algebraic expressions in Lean is much more tedious than just calling "simplify" in SymPy, because SymPy will do a lot of things automatically. As well, Lean works by using "goals", i.e. you have to pre-specify the end result, as it's made to prove theorems. Much more could be said, but overall my point is that Lean and SymPy are meant for different purposes.
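To make the contrast concrete, here is a minimal SymPy sketch of the kind of simplification that takes one call here but would need several justified rewrite steps in an interactive prover:

```python
from sympy import symbols, simplify, sin, cos

x = symbols('x')

# One call applies the Pythagorean identity automatically; in Lean
# each rewrite would need its own explicit, justified step.
result = simplify(sin(x)**2 + cos(x)**2)
print(result)  # 1
```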

And for what it's worth, ChatGPT Premium uses SymPy as a backend when asked to do mathematical tasks, because a computer algebra system is guaranteed to be correct.

Oscar Benjamin

Aug 19, 2024, 4:22:27 PM
to sy...@googlegroups.com
It is a lot easier to use a computer algebra system than lean but the
goals are different. I was thinking about the idea of making an AI
that could talk to a theorem prover like lean. The idea would be that
you have an AI that hallucinates lots of proofs that may or may not be
correct but then you filter out the correct ones with lean. Then the
answer given to the user is generated by the generative AI but also
proven correct by the theorem prover. It is also presumably
straightforward to train a generative AI that writes proofs if you can
score its output with a theorem prover.
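A minimal sketch of this filter step, with SymPy's differentiation standing in for the theorem prover and a hand-written candidate list standing in for the AI's hallucinated proposals:

```python
from sympy import symbols, diff, exp, simplify

x = symbols('x')
f = x*exp(x)  # the function we want an antiderivative of

# Hand-written stand-ins for an AI's proposed antiderivatives:
# some correct, some hallucinated.
candidates = [x*exp(x), (x - 1)*exp(x), exp(x)]

# The filter: keep only candidates whose derivative really is f.
verified = [g for g in candidates if simplify(diff(g, x) - f) == 0]
print(verified)  # [(x - 1)*exp(x)]
```

The user only ever sees output that passed the check, regardless of how wildly the generator guesses.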

I haven't read the cited paper yet but it sounds like they have done
something similar to what I was imagining. The flip side is that
suddenly lean becomes a much easier thing to use because you don't
have to write the lean code yourself: the AI does it and translates
the results to/from natural language. Then you're using the theorem
prover for proving theorems and a computer algebra system for doing
algebra and the large language model can focus on language-based tasks
as it should. The LLM doesn't replace lean/SymPy: it uses them much
like humans do.
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/dda943a9-f23d-41b4-9014-5b5cd50c0f70n%40googlegroups.com.

Sangyub Lee

Aug 19, 2024, 4:43:56 PM
to sympy
>  ChatGPT Premium uses SymPy as a backend if asked to do mathematical tasks because a computer algebra system is guaranteed to be correct.

Actually, while ChatGPT still depends on SymPy for some features, Google apparently chose Lean and other solutions over SymPy, and the reasons behind that choice are something we should worry about:
whether Google's engineers decided not to use SymPy's algebraic steps because it can be slow, buggy, or difficult to customize,
or whether they skipped computer algebra systems entirely and connected Lean and the AI system directly to achieve the same algebraic capability.

Aaron Meurer

Aug 19, 2024, 5:00:45 PM
to sy...@googlegroups.com
On Sun, Aug 18, 2024 at 6:34 PM Sangyub Lee <syle...@gmail.com> wrote:
>
> AI achieves silver-medal standard solving International Mathematical Olympiad problems - Google DeepMind
>
> Recently, Google announced that their AI models, AlphaProof and AlphaGeometry, reached a silver-medal standard on IMO problems. Their system is a hybrid of neural and symbolic components, and it uses the proof assistant Lean as a backend, which guarantees that the proofs can be verified automatically.
> ChatGPT had the problem that it can hallucinate proof steps, leaving humans to keep verifying its results and deciphering the steps, so expressing proofs as formal statements is a gain.
>
> I think this research reinforces that solve, simplify, and integrate are losing the competition, because many of them are built on heuristics that won't win against AI, and we also have concerns that the code around them is getting messy.
>
> I think that if we want to avoid losing this competition, and make symbolic computation work collaboratively with AI systems, it should focus on solving a narrower set of 'formal' problems with 100% precision and speed.

I tend to agree. I think there are really two types of computing,
precise computing and fuzzy computing. The traditional computing
paradigm is precise, where the computer computes exactly what it is
told to and always gives the exact same answer. This works really well
for problems where there is an exact correct answer. For example,
arithmetic falls into this category because there is only one correct
answer to any given arithmetic expression.

Fuzzy computing is where there isn't necessarily a single "correct"
answer, and the "best" thing that could be returned is subjective.
This has traditionally been done by applying heuristics on top of
precise computing, but with the advent of machine learning and now
more advanced AI models, one can often get results that are better
than any possible precisely defined heuristic. Many language and image
processing tasks fall into this category, which is why those are the
most popular uses of AI models. But there are also a lot of
"heuristics" that get applied even in a computer algebra system. For
example, to compute an indefinite integral of a function f(x), one
just needs to "guess" somehow a function g(x) such that g'(x) = f(x)
(or more generally, guess an ansatz g(x, a1, ..., an) with (typically
linear) parameters a1, ..., an such that the parameters a1, ..., an
can be solved in the equation d/dx g(x, a1, ..., an) = f(x)). There
are precise algorithms for making these guesses in cases where f(x)
has a specific form (for example, Risch applies whenever f(x) is
elementary), and one can definitely use smart mathematics to make
better guesses. But there's also an "art" to guessing that can't
really be encoded in heuristics. But a sufficiently trained AI model
could learn this art and potentially make better guesses.

The great thing about indefinite integration is that it doesn't matter
how you get your "guess" g(x). As long as it differentiates back to
f(x), you know it's the correct answer. This also applies to other
types of problems where heuristics are applied like solving equations,
solving ODEs, and simplification.
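A small sketch of this guess-and-verify pattern (the ansatz and target function here are illustrative choices, not SymPy's actual integration internals):

```python
from sympy import symbols, exp, diff, solve, Poly, cancel

x, a0, a1 = symbols('x a0 a1')

f = x*exp(x)

# Hypothetical linear ansatz: g(x) = (a1*x + a0)*exp(x).
g = (a1*x + a0)*exp(x)

# d/dx g - f must vanish identically: factor out exp(x) and set
# every polynomial coefficient in x to zero.
residual = cancel((diff(g, x) - f) / exp(x))
sol = solve(Poly(residual, x).coeffs(), [a0, a1])
print(sol)  # a0 = -1, a1 = 1

# Verify: the guess differentiates back to f, so it is correct
# no matter how it was found.
g_found = g.subs(sol)
assert (diff(g_found, x) - f).expand() == 0
print(g_found)  # (x - 1)*exp(x)
```

The final check is the important part: any guesser, heuristic or neural, plugs into the same verification.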

So I would say that SymPy should focus on the "precise" parts of
symbolic computing, i.e., the parts that you would use to actually
verify the output of an AI model. Maybe one day a theorem prover like
Lean will be able to do all of this and SymPy won't be needed. That
would be hard to achieve, though, because Lean is a formal theorem
prover, so everything in it has to be completely mathematically
correct down to the last detail. A symbolic system like SymPy is much
more lenient. For example, SymPy can sometimes return an expression
that is only technically correct for some subset of the input domain
because of things like division by 0 or branch cuts. In Lean, you'd
have to formally account for all these cases for the proof to verify.
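A couple of small examples of that leniency:

```python
from sympy import symbols, cancel, sqrt

x = symbols('x')

# cancel returns x + 1, which is only valid away from x = 1,
# where the original expression is the undefined form 0/0.
print(cancel((x**2 - 1)/(x - 1)))  # x + 1

# sqrt(x**2) == x only holds for nonnegative x, so SymPy leaves a
# plain symbol alone; with an assumption it simplifies. A formal
# prover would have to discharge this side condition explicitly.
p = symbols('p', positive=True)
print(sqrt(x**2))  # stays sqrt(x**2)
print(sqrt(p**2))  # p
```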

With that being said, some algorithms like the Risch algorithm, and
even some of the more sophisticated heuristic type algorithms in SymPy
are still useful because they employ more sophisticated mathematical
knowledge than just plain guessing. A good AI model would more or less
have to internalize the equivalent of Risch to match its performance,
for instance.

I'm also hopeful that AI systems can benefit the development of SymPy
itself. I was happy to see that SymPy was one of the projects whose
issues are part of the SWE-Bench dataset, which is a dataset to
benchmark how well AI systems can solve real issues in open source
projects. If a bot can automatically produce fixes for some of the
more trivial SymPy issues, that could improve the quality of the code.
(But it needs to be done with human intervention, not by just
spamming AI-generated PRs. It needs to reduce maintainer time and
energy, not increase it.)

>
> I have already noticed research connecting Wu's method to AlphaGeometry:
> [2404.06405] Wu's Method can Boost Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry (arxiv.org)
> Although symbolic systems may no longer be a competitive solution for general math problems, 'formal' symbolic systems can still be valuable. (I have also heard that AlphaGeometry2 uses Wu's method, but I am still trying to verify the sources.)
>
> I also think that such advances in AI systems raise concerns about software engineering careers and the educational system, which may be interesting for some readers of this forum.
>
> For example, math exams may become pointless in the future, even for identifying and training good scientists or engineers, because trained AI models can beat IMO contestants. I think that in the AI age, education should change so that it is not about enduring boring and repetitive drills, which do not even reflect the capability of future engineers or scientists.

It can still be worth it for people to learn mathematics, because it
provides a better understanding of the world, just as we still teach
children how to do addition and multiplication even though pocket
calculators have existed for decades.

>
> Also, I notice that software engineering is changing, because AI models can complete a lot of code, their precision is improving, and people are getting better at prompting.
> This also seems to be devaluing code-sharing efforts in open source communities, because code can be generated rather than shared.

It certainly hasn't reached that point yet. I think the reasons for
code sharing still apply even in a world where AI could rewrite an
entire framework from scratch. For instance, common code and protocols
allow systems to communicate with one another.

At the same time, there are a lot of software systems that are very
centralized right now because they are effectively the only real
option. For example, everyone on this thread (so far) uses Gmail as
their email system. If AI means that things can become more
decentralized, I think that would ultimately be a good thing.

Aaron Meurer


Jorge Ramos

Aug 23, 2024, 7:30:13 AM
to sy...@googlegroups.com
I find your English pretty ambiguous. It's not clear to me, for example, what you mean when you say:

"I think that the research reinforces that solve or simplify, or integral is losing competition. Because a lot of them are written with heuristics that won't win with AI,"

Do you mean that Sympy is losing competitiveness or attractiveness?



--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/e7898bdb-d1e4-49fd-94c7-66ba8a840511n%40googlegroups.com.


--
----------------------------------------------------------------------
Then maybe we are gods after all,....
baby gods only just now waking up to our true power

Sangyub Lee

Aug 23, 2024, 8:32:29 AM
to sympy
> Do you mean that Sympy is losing competitiveness or attractiveness?

I am not saying anything that definite.
I know that voicing strong opinions on this in a subjective or religious tone
would decrease my professional credibility, so I deliberately avoid such statements.

I already highlighted parts like 'heuristics', 'solve', and 'simplify',
which narrows down the topic and provides some background,
and another poster has already explained the general debate between fuzzy computing and precise computing in software engineering.
I am not making a 100% pro or anti statement about anything.

The problem is that you did not pay attention to details like that,
but instead chose to take issue with my English skills;
that will not make for a productive or pleasant discussion or debate.

Sangyub Lee

Aug 23, 2024, 6:29:51 PM
to sympy

Okay, I apologize for the tone of the conversation and for any offense caused to Jorge Ramos.

Please feel free to ask any questions. I’m happy to explain my opinion in more detail:

Not all expectations of SymPy come from strictly applying algebraic algorithms to specific problems.
Many people use SymPy as a flexible tool to solve mathematical problems, and in that regard, it overlaps with the goals of AI.

Certainly, this overlap could reduce SymPy's "attractiveness" or "competitiveness" to some extent.

I've occasionally participated in product design meetings in the industry.
It’s increasingly difficult to engage people with ideas that rely solely on SymPy or similar computer algebra systems.
People are already seeing the successes of AI products, and their expectations for these products are rising,
especially when new developments come from companies like Google or OpenAI.

Maybe Jorge Ramos or Maaz Muhammad are particularly passionate about SymPy
and feel unhappy with my statements (or maybe not).
I completely understand—it's similar to how someone might react if I were critical of their favorite musician or athlete.
However, I want to note that I’ve spent a significant part of my life working with SymPy.
I rarely make statements based on misunderstandings, and I have no intention of shaming a project that I’ve been deeply involved with.

I’m always trying to learn new things and prepare for changes in the world.
Our existing knowledge can become obsolete with new tools,
and I have already noticed that there is a vast amount of knowledge outside SymPy
that could be used to understand or improve SymPy.
That is why I am always watching developments in AI, as well as in Lean.

To be more productive: SymPy itself could enhance its geometry capabilities by incorporating Wu's method or a deductive database approach,
both of which are useful for addressing geometric problems.
I have also shared an implementation of the Area method with SymPy, which can be found by searching.

Oscar Benjamin

Aug 23, 2024, 7:24:36 PM
to sy...@googlegroups.com
On Fri, 23 Aug 2024 at 23:29, Sangyub Lee <syle...@gmail.com> wrote:
>
> To be more productive: SymPy itself could enhance its geometry capabilities by incorporating Wu's method or a deductive database approach,
> both of which are useful for addressing geometric problems.
> I have also shared an implementation of the Area method with SymPy, which can be found by searching.

I know that you have said that this can be searched but can you share
a link to what you are referring to?

I think that would be useful to others reading this.

--
Oscar

Chris Smith

Aug 23, 2024, 7:30:18 PM
to sympy
Message has been deleted

Sangyub Lee

Aug 23, 2024, 9:12:38 PM
to sympy

I had a few discussions after Christopher Smith's comments, which I believe could be useful to share:

  • The Area Method uses constructive geometry, so it might be beneficial to provide a tool that can draw actual diagrams, in addition to solving the geometry problems (for example, the ability to draw random points, lines, circles).
  • There were discussions between myself and others regarding the choice between the Wu Method and the Area Method. However, this ultimately led to the conclusion that the Area Method should be used due to the poor performance of sparse multivariate polynomials at that time. (Intuitively, we considered using separate x and y symbols for every point in the geometry.)
  • Maybe things have improved by 2024, but I'm not 100% sure. Nonetheless, the performance of sparse multivariate polynomials could clearly contribute to improvements in SymPy as well.
  • I also note that the performance of the 'cancel' function and algebraic number operations could be improved. I believe that many of the datasets from the book on the Area Method could also be useful for providing test cases for these issues. Some geometry problems took dozens of seconds to compute or were not feasible to implement due to performance bottlenecks with algebraic numbers.
  • I note that the debate between using geometry invariants (like the Area Method) and polynomials (like the Wu Method or Gröbner basis) has not yet been concluded. The problem is that geometric identities lead to very inefficient representations of polynomials (think of writing a matrix determinant for every triangle, which is obviously inefficient). This has motivated the use of a symbolic system that represents the determinant expression directly and efficiently.
  • I believe that research on this topic was advanced by people like Hongbo Li (Invariant Algebras and Geometric Reasoning), but I haven't fully read his work yet. It could offer improvements.
  • I also note that the Area Method could potentially be implemented with Galgebra due to its relevance, but I previously skipped this because Galgebra was more focused on geometric analysis (such as computing derivatives), which was not relevant to the topic. However, I think suggestions from the Galgebra community could be valuable here.
  • I also had access to a full version of the Area Method (including Pythagorean identities) from my previous employer. While it may not be possible to share the code, it's fairly easy to figure out from the original paper I cited.
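As a toy illustration of the coordinate translation mentioned in the bullets above (separate x and y symbols per point, with a geometric statement becoming a polynomial identity), here is a sketch for the fact that the midpoint of a segment is collinear with its endpoints:

```python
from sympy import symbols, Matrix, simplify

# One (x, y) symbol pair per point: the coordinate translation used
# when geometry statements become polynomial identities.
ax, ay, bx, by = symbols('ax ay bx by')

# Midpoint M of segment AB.
mx, my = (ax + bx)/2, (ay + by)/2

# A, B, M are collinear exactly when this determinant vanishes.
det = Matrix([
    [ax, ay, 1],
    [bx, by, 1],
    [mx, my, 1],
]).det()

print(simplify(det))  # 0
```

Even this trivial fact already produces a multivariate polynomial in four symbols; realistic theorems blow this up quickly, which is the performance concern with the purely polynomial (Wu/Gröbner) route.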

Alan Bromborsky

Aug 23, 2024, 9:40:26 PM
to sy...@googlegroups.com
This is another approach, also by Hongbo Li:

https://arxiv.org/abs/1507.06634

On 8/23/24 9:12 PM, Sangyub Lee wrote:
> Hongbo Li


Francesco Bonazzi

Aug 28, 2024, 3:03:59 PM
to sympy
On Monday, August 19, 2024 at 2:34:15 a.m. UTC+2 syle...@gmail.com wrote:
AI achieves silver-medal standard solving International Mathematical Olympiad problems - Google DeepMind

Recently, Google announced that their AI models, AlphaProof and AlphaGeometry, reached a silver-medal standard on IMO problems. Their system is a hybrid of neural and symbolic components, and it uses the proof assistant Lean as a backend, which guarantees that the proofs can be verified automatically.
ChatGPT had the problem that it can hallucinate proof steps, leaving humans to keep verifying its results and deciphering the steps, so expressing proofs as formal statements is a gain.

I am a bit skeptical about claims from closed source software developed by big tech. Does that code really generalize or has it been optimized for that specific task?
 
So far, Project Numina has won a Kaggle competition about solving Math Olympiad problems with AI... and they won by generating SymPy code to solve the problems (maybe I'll open a separate thread on this).

I think that if we want to avoid losing this competition, and make symbolic computation work collaboratively with AI systems, it should focus on solving a narrower set of 'formal' problems with 100% precision and speed.

Maybe we could use AI to generate our algorithms, though I believe this is a really hard task. Current LLMs are good at tasks they have seen in their training data; they are not as good when you ask them to generalize to algorithms they have never seen.

But... can we try to set up an agent that generates the nodes of the Risch algorithm by repeatedly interacting with an LLM, feeding it the proper literature, and giving it feedback to correct the code it has generated so far?

I also think that such advances in AI systems raise concerns about software engineering careers and the educational system, which may be interesting for some readers of this forum.

That's a good question. LLMs can do most of the coding stuff, but one should always be careful that the generated code may be wrong, so experience is still required. Furthermore, in corporations software engineers usually spend a lot of time discussing project ideas with business, dealing with security policy and legal auditing of software, handling the bureaucracy of requesting database access, understanding the mess they find in databases, privacy regulation concerns, and so on. It's not just writing code... these jobs will survive AI code generators.
 
Also, I notice that software engineering is changing, because AI models can complete a lot of code, their precision is improving, and people are getting better at prompting.

It will take some time; many developers still aren't used to using AI.
 
It also seems to be devaluing code-sharing efforts in open source communities, because code can be generated rather than shared.

AI can help you to find other projects that are developing something useful for you. In many cases, people don't use the proper open source tool because they are not aware of its existence.