ChatGPT is an expert in SageMath too


Kwankyu Lee

Mar 23, 2023, 6:03:21 AM
to sage-devel
I asked "in sagemath, how can i compute the genus of a curve", and this is its reply:
[Attached image: Screenshot 2023-03-23 at 6.59.49 PM.png]
which is perfect. Impressive!

Dima Pasechnik

Mar 23, 2023, 7:03:01 AM
to sage-...@googlegroups.com
On Thu, Mar 23, 2023 at 10:03 AM Kwankyu Lee <ekwa...@gmail.com> wrote:
>
> I asked "in sagemath, how can i compute the genus of a curve", and this is its reply:
>
> which is perfect. Impressive!

did you try asking whether it's geometric genus, or not?
>
> --
> You received this message because you are subscribed to the Google Groups "sage-devel" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sage-devel/b1b2ace0-12d2-45c0-bc32-39d66ada13efn%40googlegroups.com.

John Cremona

Mar 23, 2023, 7:22:33 AM
to sage-...@googlegroups.com
Can this specific example be found in the online documentation?


Georgi Guninski

Mar 23, 2023, 8:12:09 AM
to sage-...@googlegroups.com
I can't reproduce the AI script.
First, 'y' is not defined.
Second, sage doesn't like this constructor of Curve
with error:
TypeError: ambient space must be either an affine or projective space

David Ayotte

Mar 23, 2023, 9:54:03 AM
to sage-devel
I think that ChatGPT is a very interesting tool, but we are still a long way from calling it a SageMath "expert":

[Attached image: Capture d’écran 2023-03-23 094944.png]

("eisenstein_series_weight_n" is not a built-in function)

Kwankyu Lee

Mar 23, 2023, 11:40:37 PM
to sage-devel
> Can this specific example be found in the online documentation?

I also guess that the example is from our documentation. I didn't try to spot the source.

> did you try asking whether it's geometric genus, or not?

No. We understand what it did. It didn't "think" that there are two kinds of genus and did not ask your question back to me :-)

 

Kwankyu Lee

Mar 23, 2023, 11:42:39 PM
to sage-devel
> I can't reproduce the AI script.
> First, 'y' is not defined.
> Second, sage doesn't like this constructor of Curve ...

Ah, I missed that. So its answer is less than perfect. 

William Stein

Mar 24, 2023, 4:04:48 PM
to sage-...@googlegroups.com
In CoCalc there is now a "Help me fix this..." button, so when Sage outputs an error message, you can click that button and it will automatically create a chat with ChatGPT to suggest a fix:

[Attached image: image.png]

Where it says "I ran the following SageMath 9.8 code..." on the right, that's automatically written based on the kernel you're using in Jupyter when you click the Help button below the error.

The code it suggests next doesn't work, but to me that seems like a bug in Sage (?). Omitting the P entirely does work.

 -- William



--
William (http://wstein.org)

Nils Bruin

Mar 24, 2023, 4:45:22 PM
to sage-devel
On Friday, 24 March 2023 at 13:04:48 UTC-7 William Stein wrote:

> The code it suggests next doesn't work, but to me that seems like a bug in Sage (?). Omitting the P entirely does work.
>
>  -- William

Not a bug in sage, but a more insidious error in the example ChatGPT originally created. It's basically doing something along the lines:

sage: R.<x,y>=GF(7)[]
sage: P = ProjectiveSpace(2,GF(7))
sage: Curve( P, x^3+y^3-x-y+1)

However, the documentation of Curve says that the *second* argument should be the ambient space (if provided). So:

Curve(x^3+y^3-x-y+1, AffineSpace(R))

does work. The syntactical correction Curve(x^3+y^3-x-y+1,P) still doesn't work because we have an affine equation and are specifying an unrelated projective space.
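One way to see the mismatch: an equation defining a curve in a projective plane must be homogeneous in three coordinates, and x^3+y^3-x-y+1 is neither. Here is a minimal sketch in plain Python (not Sage, so it runs without a Sage installation). The dict-of-exponents representation and the helper names `total_degree`, `is_homogeneous` and `homogenize` are made up for this illustration and are not Sage API.

```python
# Plain-Python sketch (not Sage). A polynomial is a dict mapping
# exponent tuples to coefficients; here (i, j) stands for x^i * y^j.
affine_eq = {(3, 0): 1, (0, 3): 1, (1, 0): -1, (0, 1): -1, (0, 0): 1}  # x^3+y^3-x-y+1


def total_degree(poly):
    """Largest sum of exponents over all terms with nonzero coefficient."""
    return max(sum(exps) for exps, coeff in poly.items() if coeff != 0)


def is_homogeneous(poly):
    """True if every nonzero term has the same total degree."""
    degrees = {sum(exps) for exps, coeff in poly.items() if coeff != 0}
    return len(degrees) <= 1


def homogenize(poly):
    """Pad each term with a power of a new last variable z up to the total degree."""
    d = total_degree(poly)
    return {exps + (d - sum(exps),): coeff for exps, coeff in poly.items() if coeff != 0}


# x^3 + y^3 - x*z^2 - y*z^2 + z^3, with exponent tuples now read as (i, j, k) for x^i*y^j*z^k
projective_eq = homogenize(affine_eq)
print(is_homogeneous(affine_eq), is_homogeneous(projective_eq))
```

In Sage itself, the variants reported to work in this thread are passing AffineSpace(R) as the second argument or omitting the ambient space altogether.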

I think the conclusion should be that ChatGPT pretends to be a SageMath expert with great confidence but .. gasp ... isn't!


William Stein

Mar 24, 2023, 5:14:33 PM
to sage-...@googlegroups.com
Thanks for the clarification.

ChatGPT4 (the more high end version of ChatGPT) gets this same question right for me:

[Attached image: image.png]

When you ask about the arithmetic genus it does this:

[Attached image: image.png]

which doesn't work because C.degree() isn't implemented.  It would probably be a reasonable thing to implement though (e.g., magma has https://magma.maths.usyd.edu.au/magma/handbook/text/1411#15877)

William



Nils Bruin

Mar 24, 2023, 5:48:13 PM
to sage-devel
On Friday, 24 March 2023 at 14:14:33 UTC-7 William Stein wrote:
> [...] which doesn't work because C.degree() isn't implemented.  It would probably be a reasonable thing to implement though (e.g., magma has https://magma.maths.usyd.edu.au/magma/handbook/text/1411#15877)
>
> William


For schemes in ordinary projective space, having a degree would make perfect sense. For varieties in affine space, less so: the affine variety could be a patch of some non-ordinary projective space.

Kwankyu Lee

Mar 25, 2023, 12:03:14 AM
to sage-devel
> which doesn't work because C.degree() isn't implemented.  It would probably be a reasonable thing to implement though (e.g., magma has https://magma.maths.usyd.edu.au/magma/handbook/text/1411#15877)

This

sage: C.projective_closure().degree()

3

does work.
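Once the degree d of the projective closure is known, and assuming C is a plane curve like the cubic discussed earlier in this thread, the arithmetic genus follows from the standard formula (d-1)(d-2)/2 for a projective plane curve of degree d. A minimal plain-Python sketch (the function name `plane_curve_arithmetic_genus` is hypothetical and is not a Sage method):

```python
def plane_curve_arithmetic_genus(d):
    """Arithmetic genus (d-1)(d-2)/2 of a projective plane curve of degree d >= 1."""
    if d < 1:
        raise ValueError("degree must be a positive integer")
    # (d-1)(d-2) is a product of consecutive integers, so it is always even.
    return (d - 1) * (d - 2) // 2


# Degree 3 is the value reported by C.projective_closure().degree() above.
print(plane_curve_arithmetic_genus(3))
```

This equals the geometric genus only when the projective curve is nonsingular, which is the distinction raised at the start of the thread.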

Kwankyu Lee

Mar 25, 2023, 12:13:17 AM
to sage-devel

> ... The syntactical correction Curve(x^3+y^3-x-y+1,P) still doesn't work because we have an affine equation and are specifying an unrelated projective space.

That is the most serious defect in the answer. I missed that.

> I think the conclusion should be that ChatGPT pretends to be a SageMath expert with great confidence but .. gasp ... isn't!

and also I was not careful enough to claim that.
 

William Stein

Apr 20, 2023, 1:02:18 PM
to sage-...@googlegroups.com
Hi,

I don't know whether or not ChatGPT is trained on the source code of
SageMath, but one of the biggest publicly available training sets of
code is described here: https://arxiv.org/abs/2211.15533
In that training set, they explicitly remove any GPL'd code (e.g.,
SageMath): "Permissive license dataset: We develop a dataset of source
code with only permissive licenses, i.e., with minimal restrictions on
how the software can be copied, modified, and redistributed. We first
provide the list of licenses which we classified as permissive in
Appendix A. Note that we intentionally exclude copyleft licenses like
GPL, as this community has strongly expressed the concern of machine
learning models and inferred outputs violating the terms of their
licenses Kuhn (2022)."

As things unfold in the years to come with people trying to use LLM's
in the context of mathematical software, our academic community's
choice of license could explain why LLM's output code that's much more
like Sympy (say) than SageMath's, and perhaps why LLM's are not as
good at using Sage.

William

P.S. In case anybody is curious, Sage was originally GPL'd because it
is a derived work of Pari, and Pari is GPL'd. I made the choice to
use Pari heavily in the implementation of Sage, rather than starting
from scratch. I asked Henri Cohen why he GPL'd Pari and he told me
that Richard Stallman personally "strongly encouraged" him to do so,
because Pari depends on GNU Readline, and GNU Readline is GPL'd.
Despite anything mentioned above, the GPL still seems like the right
license for Sage, given that all the other competitors (Magma,
Mathematica, etc.) are much more restrictive in their licenses.



--
William (http://wstein.org)

Michael Orlitzky

Apr 20, 2023, 3:02:14 PM
to sage-...@googlegroups.com
On 2023-04-20 10:01:36, William Stein wrote:
>
> As things unfold in the years to come with people trying to use LLM's
> in the context of mathematical software, our academic community's
> choice of license could explain why LLM's output code that's much more
> like Sympy (say) than SageMath's, and perhaps why LLM's are not as
> good at using Sage.
>

Violating the license in this case is only a problem because the
researchers have morals. Microsoft and OpenAI are free of such
limitations:

https://githubcopilotlitigation.com/

William Stein

Apr 20, 2023, 3:19:00 PM
to sage-...@googlegroups.com
That's also an argument that violating the license may lead to other
problems (financial and business harm).

Setting aside any moral questions, it is not clear to me whether or
not training models using GPL code from GitHub will ultimately be
considered fair use under US law. I think it is more likely that it
ultimately will be allowed, if for no other reason than the
organizations that want it to be allowed are more powerful than the
ones who don't.

https://www.theverge.com/2023/1/28/23575919/microsoft-openai-github-dismiss-copilot-ai-copyright-lawsuit

For the rest of the world, it's going to be pretty chaotic:

https://www.searchenginejournal.com/chatgpt-legal-woes/484323/

The relevance to Sage, is that LLM models are incredibly powerful, and
the math community may have to put in extra effort to train a model
that knows Sage, analogous to the extra work repl.it is putting in:

https://blog.replit.com/llm-training

Of course, the math community is sometimes slow at realizing that
significant technical innovations are happening...

Dima Pasechnik

Apr 20, 2023, 3:37:54 PM
to sage-devel


On Thu, 20 Apr 2023, 20:19 William Stein, <wst...@gmail.com> wrote:
> On Thu, Apr 20, 2023 at 12:02 PM Michael Orlitzky <mic...@orlitzky.com> wrote:
> >
> > On 2023-04-20 10:01:36, William Stein wrote:
> > >
> > > As things unfold in the years to come with people trying to use LLM's
> > > in the context of mathematical software, our academic community's
> > > choice of license could explain why LLM's output code that's much more
> > > like Sympy (say) than SageMath's, and perhaps why LLM's are not as
> > > good at using Sage.
> > >
> >
> > Violating the license in this case is only a problem because the
> > researchers have morals. Microsoft and OpenAI are free of such
> > limitations:
> >
> >   https://githubcopilotlitigation.com/
>
> That's also an argument that violating the license may lead to other
> problems (financial and business harm).
>
> Setting aside any moral questions, it is not clear to me whether or
> not training models using GPL code from GitHub will ultimately be
> considered fair use under US law. I think it is more likely that it
> ultimately will be allowed, if for no other reason than the
> organizations that want it to be allowed are more powerful than the
> ones who don't.
>
> https://www.theverge.com/2023/1/28/23575919/microsoft-openai-github-dismiss-copilot-ai-copyright-lawsuit

A cursory reading of this wish to dismiss the case sounds to me as the usual M$ chutzpah.
Of course they want it gone, as it hurts their profits.

> For the rest of the world, it's going to be pretty chaotic:
>
> https://www.searchenginejournal.com/chatgpt-legal-woes/484323/
>
> The relevance to Sage, is that LLM models are incredibly powerful, and
> the math community may have to put in extra effort to train a model
> that knows Sage, analogous to the extra work repl.it is putting in:
>
> https://blog.replit.com/llm-training
>
> Of course, the math community is sometimes slow at realizing that
> significant technical innovations are happening...
>
>  -- William
>
> --
> William (http://wstein.org)


Michael Orlitzky

Apr 20, 2023, 5:35:12 PM
to sage-...@googlegroups.com
On Thu, 2023-04-20 at 20:37 +0100, Dima Pasechnik wrote:
> >
> > https://www.theverge.com/2023/1/28/23575919/microsoft-openai-github-dismiss-copilot-ai-copyright-lawsuit
>
>
> A cursory reading of this wish to dismiss the case sounds to me as the
> usual M$ chutzpah.
> Of course they want it gone, as it hurts their profits.
>

Sadly it's not. The American legal system isn't built for this. The
fact that they're clearly doing something illegal and that it's hurting
people isn't grounds for a third-party lawsuit. The victims can file
suits, but like Microsoft's lawyers said, the victims have to be able
to demonstrate injury. Then for a suit to be worthwhile, that injury
has to outweigh your legal fees. In practice this makes it legal for a
corporation to steal $1 from each of a billion people. See also: online
privacy violations; spam email.

Adding onto the pile in this scenario is how difficult it would be to
prove that *your* code was copied, considering that they've assimilated
most of the available copyrighted material on Earth and that the AI are
black boxes.

Dima Pasechnik

Apr 21, 2023, 3:22:36 AM
to sage-devel


On Thu, 20 Apr 2023, 22:35 Michael Orlitzky, <mic...@orlitzky.com> wrote:
> On Thu, 2023-04-20 at 20:37 +0100, Dima Pasechnik wrote:
> > >
> > > https://www.theverge.com/2023/1/28/23575919/microsoft-openai-github-dismiss-copilot-ai-copyright-lawsuit
> >
> >
> > A cursory reading of this wish to dismiss the case sounds to me as the
> > usual M$ chutzpah.
> > Of course they want it gone, as it hurts their profits.
> >
>
> Sadly it's not. The American legal system isn't built for this. The
> fact that they're clearly doing something illegal and that it's hurting
> people isn't grounds for a third-party lawsuit. The victims can file
> suits, but like Microsoft's lawyers said, the victims have to be able
> to demonstrate injury.

Copilot makes money which does not go to the producers of the content it sells. How is it not injury?
Copyright is copyright, it is violated here.

> Then for a suit to be worthwhile, that injury
> has to outweigh your legal fees. In practice this makes it legal for a
> corporation to steal $1 from each of a billion people. See also: online
> privacy violations; spam email.
>
> Adding onto the pile in this scenario is how difficult it would be to
> prove that *your* code was copied, considering that they've assimilated
> most of the available copyrighted material on Earth and that the AI are
> black boxes.


Georgi Guninski

Apr 21, 2023, 4:47:55 AM
to sage-...@googlegroups.com
I know I am in the minority, but I recommend not using GitHub (owned by M$).

IMHO m$ are evil and technically incompetent.
They buy stuff and later spoil it.

William Stein

Apr 21, 2023, 3:36:31 PM
to sage-...@googlegroups.com
Hi,

There is a discussion right now on HN about LLM's trained on code

https://news.ycombinator.com/item?id=35657982

One of the comments https://news.ycombinator.com/item?id=35658118
points out that most of the non-GPL super permissive licenses require
explicit attribution when creating derived works. If the output of an
LLM is a derived work (and not just some fair use of that input), then
there is legally nothing particularly special about GPL in the context
of training LLM's. That I think successfully undercuts my point in
starting this thread.

-- William



--
William (http://wstein.org)

Oscar Benjamin

Apr 21, 2023, 4:29:34 PM
to sage-...@googlegroups.com
On Fri, 21 Apr 2023 at 20:36, William Stein <wst...@gmail.com> wrote:
>
> There is a discussion right now on HN about LLM's trained on code
>
> https://news.ycombinator.com/item?id=35657982
>
> One of the comments https://news.ycombinator.com/item?id=35658118
> points out that most of the non-GPL super permissive licenses require
> explicit attribution when creating derived works. If the output of an
> LLM is a derived work (and not just some fair use of that input), then
> there is legally nothing particularly special about GPL in the context
> of training LLM's. That I think successfully undercuts my point in
> starting this thread.

I don't think GPL or other licenses really matter here. It won't be
long before these models can produce code that is sufficiently
original/distinct that it would not be considered "derived" anyway.
The fact that the model happened to learn in part from looking at lots
of code with different licenses is not really that different from the
way that humans learn programming. If I happen to look at the Sage
source code and learn something in the process it does not mean that
Sage's GPL conditions apply to all future code I write after gaining
that knowledge. There is a spectrum from using knowledge learned from
code through to adapting the code and in the extreme just copying the
code. Pretty soon these models will be able to position themselves
wherever you want on that spectrum.

--
Oscar

Tobia...@gmx.de

Apr 22, 2023, 6:41:13 AM
to sage-devel
I assume that, in the short term, one of the most interesting applications of (Chat)GPT for Sage might be the "Copilot for Docs" project from GitHub: https://githubnext.com/projects/copilot-for-docs/ They train the model on the official docs (and source code?) and are thus able to provide better results. This should take care of the above-mentioned issue that the generated code is not for Sage but for scipy/mathematica/...

Michael Orlitzky

Apr 22, 2023, 7:56:42 AM
to sage-...@googlegroups.com
On Fri, 2023-04-21 at 08:22 +0100, Dima Pasechnik wrote:
> >
> >
> > Sadly it's not. The American legal system isn't built for this. The
> > fact that they're clearly doing something illegal and that it's hurting
> > people isn't grounds for a third-party lawsuit. The victims can file
> > suits, but like Microsoft's lawyers said, the victims have to be able
> > to demonstrate injury.
>
> Copilot makes money which does not go to the producers of the content it
> sells. How is it not injury?
> Copyright is copyright, it is violated here.
>

There's injury, but only the people who are actually injured can file
suit.

Suppose copilot is willing to reproduce the source code of block_ldlt()
but without the GPL. What's the dollar amount that I'm harmed by that?
I can sue for damages, or I can sue for the portion of the profits
attributable to the infringement. Microsoft's lawyers are going to
claim that both numbers are zero. How much will it cost me in legal
fees to fight that, and for what potential benefit?

It's a losing proposition for me, and by extension, for almost everyone
writing free software.

Emmanuel Charpentier

Apr 22, 2023, 2:39:04 PM
to sage-devel

“Le droit du pauvre est un mot creux” (“Rights of the poor” is a hollow phrase) Eugène Pottier, L’internationale (1871).

I am somewhat skeptical about the odds of any legal action succeeding against a potential multi-billion-dollar business prospect. Except if it folds, of course... ;-)
