Cremona's database of elliptic curves over the rationals: compilation?


Julien Puydt

Jul 19, 2019, 9:19:32 AM
to sage-devel
Hi,

I'm considering packaging this database for Debian.

I have upstream files: https://github.com/JohnCremona/ecdata

I know sagemath wants a cremona.db sqlite3 database, as that's what is
in the src/ for the pkg.

What I'm missing is the script that turns the data files into the database!

For the smaller database, there is:
https://git.sagemath.org/sage.git/tree/build/pkgs/elliptic_curves/spkg-install.py

But as far as I can tell the two databases don't even have the same format:
https://git.sagemath.org/sage.git/tree/src/sage/databases/cremona.py

Can someone lend a hand?

JP

John Cremona

Jul 19, 2019, 9:54:22 AM
to SAGE devel
The code to create the database is in Sage itself.  See https://github.com/sagemath/sage/blob/master/src/sage/databases/cremona.py.

I recommend that you hold off for a short while since I am about to update the database to include all conductors up to 500000 (currently it is 400000).  This is thanks to Andrew Sutherland + Simons Foundation + Google CP, which ran 80,000 simultaneous jobs on 600,000 cores last night.  So right now I am drowning in data but it will all be sorted before long.

John

--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.
To post to this group, send email to sage-...@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
To view this discussion on the web visit https://groups.google.com/d/msgid/sage-devel/922ea2c1-1525-7060-4bbb-06a268f64f5b%40laposte.net.
For more options, visit https://groups.google.com/d/optout.

Julien Puydt

Jul 19, 2019, 10:21:26 AM
to John Cremona, sage-devel
On 19/07/2019 at 15:53, John Cremona wrote:
> The code to create the database is in Sage itself. 
> See https://github.com/sagemath/sage/blob/master/src/sage/databases/cremona.py.

Oh, dear... I'll see if I can't turn it into a much shorter piece of code...

> I recommend that you hold off for a short while since I am about to
> update the database to include all conductors up to 500000 (now it is
> 400000).  This is thanks to Andrew Sutherland + Simon Foundation +
> Google CP, which ran 80,000 simultaneous jobs on 600,000 cores last
> night.  So right now I am drowning in data but it will all be sorted
> before long.

Oh! That sounds like an amazing project!

I can still try to prepare the package with the old data, and only
upload to Debian when the new data is there.

Thanks!

JP

John Cremona

Jul 19, 2019, 10:42:10 AM
to Julien Puydt, sage-devel
On Fri, 19 Jul 2019 at 16:21, Julien Puydt <julien...@laposte.net> wrote:
> On 19/07/2019 at 15:53, John Cremona wrote:
> > The code to create the database is in Sage itself.
> > See https://github.com/sagemath/sage/blob/master/src/sage/databases/cremona.py.
>
> Oh, dear... I'll see if I can't turn it into a much shorter piece of code...

Not written by me, just used by me to update Sage's optional package.  William would remember who wrote it.

> > I recommend that you hold off for a short while since I am about to
> > update the database to include all conductors up to 500000 (now it is
> > 400000).  This is thanks to Andrew Sutherland + Simon Foundation +
> > Google CP, which ran 80,000 simultaneous jobs on 600,000 cores last
> > night.  So right now I am drowning in data but it will all be sorted
> > before long.
>
> Oh! That sounds like an amazing project!
>
> I can still try to prepare the package with the old data, and only
> upload to Debian when the new data is there.

That sounds like a good plan.  Thanks!

Julien Puydt

Jul 20, 2019, 4:08:17 PM
to sage-devel
On 19/07/2019 at 16:41, John Cremona wrote:

> Not written by me, just used by me to update Sage's optional package. 
> William would remember who wrote it.

I combined the build/pkgs/elliptic_curves and
src/sage/databases/cremona.py files and obtained the following build script.


I'm a bit annoyed because the elliptic_curve pkg ships a 470M
cremona.db, and I get a 479M cremona.db; but perhaps that's because I
pack everything (no bound on the genus: everything in the files!)?

Does it look good?

JP
build_sql3db.py
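
The attached script is not reproduced in this thread. As a rough illustration only, a standalone conversion could look something like the sketch below. It assumes each data line holds six whitespace-separated fields (conductor, class letter, curve number, a-invariants, rank, torsion order), and the table and column names are modelled loosely on Sage's mini database; both are assumptions here, not the contents of build_sql3db.py.

```python
import sqlite3


def parse_allcurves_line(line):
    """Split one record into typed fields (assumed six-field layout)."""
    conductor, iso, number, ainvs, rank, torsion = line.split()
    return int(conductor), iso, int(number), ainvs, int(rank), int(torsion)


def build_db(lines, path=":memory:"):
    """Create two tables (hypothetical schema) and load every record."""
    con = sqlite3.connect(path)
    con.execute(
        "CREATE TABLE t_class "
        "(conductor INTEGER, class TEXT PRIMARY KEY, rank INTEGER)"
    )
    con.execute(
        "CREATE TABLE t_curve "
        "(class TEXT, curve TEXT PRIMARY KEY, eqn TEXT UNIQUE, tors INTEGER)"
    )
    con.execute("CREATE INDEX i_conductor ON t_class (conductor)")
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        n, iso, num, ainvs, rank, tors = parse_allcurves_line(line)
        cls = f"{n}{iso}"
        # Several curves share one isogeny class: insert the class only once.
        con.execute(
            "INSERT OR IGNORE INTO t_class VALUES (?, ?, ?)", (n, cls, rank)
        )
        con.execute(
            "INSERT INTO t_curve VALUES (?, ?, ?, ?)",
            (cls, f"{cls}{num}", ainvs, tors),
        )
    con.commit()
    return con
```

Feeding it the lines of the data files (for instance via `open(...)`) and a real output path instead of `":memory:"` would produce a database file; whether that file matches what Sage expects depends on the real schema in cremona.py.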

John Cremona

Jul 21, 2019, 7:20:08 AM
to SAGE devel
"genus" --> "conductor".  This would be explained (I think) by the fact that the shipped spkg goes up to conductor 400k while ecdata on github now goes up to 410k.  That's all.  I will soon have finished to 500k and then the spkg will presumably be around 500M.


 


Julien Puydt

Jul 21, 2019, 11:06:08 AM
to sage-...@googlegroups.com
On 21/07/2019 at 13:19, John Cremona wrote:
>
> On Sat, 20 Jul 2019 at 21:08, 'Julien Puydt' via sage-devel
> <sage-...@googlegroups.com> wrote:
>
> > On 19/07/2019 at 16:41, John Cremona wrote:
> >
> > > Not written by me, just used by me to update Sage's optional package.
> > > William would remember who wrote it.
> >
> > I crossed the build/pkgs/elliptic_curves and
> > src/sage/databases/cremona.py files and obtained the following build
> > script.
> >
> > I'm a bit annoyed because the elliptic_curve pkg ships a 470M
> > cremona.db, and I get a 479M cremona.db ; but perhaps it's because I
> > pack everything (no bound on the genus : everything in the files!)?
>
> "genus" --> "conductor".  This would be explained (I think) by the fact
> that the shipped spkg goes up to conductor 400k while ecdata on github
> now goes up to 410k.  That's all.  I will soon have finished to 500k and
> then the spkg will presumably be around 500M.

Yes, conductor.
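
One way to check the conductor explanation directly is to ask each cremona.db how far it goes. The sketch below is only an illustration: the table names `t_class` and `t_curve` and the `conductor` column are assumptions about the schema, so adjust them to whatever the real file contains.

```python
import sqlite3


def conductor_summary(con):
    """Return (largest conductor, number of classes, number of curves).

    Assumes tables named t_class (with a conductor column) and t_curve.
    """
    max_n, n_classes = con.execute(
        "SELECT MAX(conductor), COUNT(*) FROM t_class"
    ).fetchone()
    (n_curves,) = con.execute("SELECT COUNT(*) FROM t_curve").fetchone()
    return max_n, n_classes, n_curves


# Usage against a file on disk (the path is a placeholder):
#   con = sqlite3.connect("cremona.db")
#   print(conductor_summary(con))
```

If the shipped file reports a maximum of about 400000 and the rebuilt one about 410000, the size difference is accounted for.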

Could you add a nice LICENSE file to clarify things? I think it's public
domain since it's a list of mathematical objects... but I'd rather have
a nice file from the author saying so :-P

Thanks,

JP

John Cremona

Jul 22, 2019, 6:45:04 AM
to SAGE devel
Certainly. I think that CC0 is most appropriate.  The significant part of the repo is the data, not the code.  (The most important code used to compute this data is in another repository, eclib; here there are just some utility scripts).  But github does not list any CC licenses in its "add a license" menu.  Any advice?
 


Dima Pasechnik

Jul 22, 2019, 7:47:38 AM
to sage-devel
On Mon, Jul 22, 2019 at 11:45 AM John Cremona <john.c...@gmail.com> wrote:
> Certainly. I think that CC0 is most appropriate. The significant part of the repo is the data, not the code. (The most important code used to compute this data is in another repository, eclib; here there are just some utility scripts). But github does not list any CC licenses in its "add a license" menu. Any advice?

Nothing prevents one from adding LICENSE.md (or LICENSE) with any content
one sees fit.
Did you consider FDL, by the way?


John Cremona

Jul 22, 2019, 9:45:34 AM
to SAGE devel
OK, fair enough.

> Did you consider FDL, by the way?

No.  That is for documentation, while this is data.
 

Dima Pasechnik

Jul 22, 2019, 11:39:00 AM
to sage-devel
How about https://opensource.org/licenses/Artistic-2.0 ?
This is what e.g. the SmallGrp GAP package is under.
https://gap-packages.github.io/smallgrp/README.html
(and the authors of the latter thought long and hard about the question)

Basically, the rationale is that you'd like your data to be
an authoritative source, so you'd rather not let someone create
a derivative for the purpose of self-promotion.
You probably don't want a company with a name starting with M
to offer a copy of your data without any reference to the original source,
and I gather that's what CC0 would allow...

HTH
Dima





John Cremona

Aug 20, 2019, 11:02:55 AM
to SAGE devel
Julien,

[By the way, the optional spkg which is relevant here is database_cremona_ellcurve and *not* elliptic_curves.]

I have tried your new py code to produce a new version of the database file: it appears to work perfectly, and I will compare its output with what the existing Sage code creates.  I know rather little about SQLite -- should the resulting files be identical?  Even if not, I will try using both output db files and running all Sage tests.  The db files have the same size (613924864 bytes -- the new version will contain over 500000 more curves than the old) so perhaps the only difference is the order in which data was added.  I'll test.
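
On the question of identical files: two SQLite files holding the same rows need not be byte-for-byte identical, since the on-disk layout can depend on things like insertion order. A content-level comparison is more telling. The following is a minimal sketch of one way to do that, not the test that was actually run; it loads whole tables into memory, so for a file of this size a per-table streaming comparison may be preferable.

```python
import sqlite3


def table_names(con):
    """List the user tables of a database, sorted by name."""
    rows = con.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    )
    return [r[0] for r in rows]


def same_content(con_a, con_b):
    """True if both databases have the same tables holding the same rows,
    regardless of the order in which the rows were inserted."""
    names = table_names(con_a)
    if names != table_names(con_b):
        return False
    for name in names:
        query = f'SELECT * FROM "{name}"'
        # key=repr gives a total order even if some columns contain NULLs.
        rows_a = sorted(con_a.execute(query).fetchall(), key=repr)
        rows_b = sorted(con_b.execute(query).fetchall(), key=repr)
        if rows_a != rows_b:
            return False
    return True
```

Opening both files with `sqlite3.connect(...)` and calling `same_content` on the two connections would then answer the question independently of file layout.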

I also needed to patch sage/databases/cremona.py since the build() function requires SAGE_SHARE to be imported.  Running that, I see that (as you may have noticed) there are many files in the ecdata repository / tarball which Sage does not use at all.  I adjusted the tar command to only extract the ones used.  (It would be possible for Sage to extract more and use more, but that is an issue for another day).
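
For a standalone script, the same selective extraction can be done without shelling out to tar. The sketch below uses Python's tarfile module in place of the tar command mentioned above; the directory names passed in are placeholders, since the exact set of ecdata directories Sage reads is not listed in this thread.

```python
import tarfile


def wanted_members(tar, wanted_dirs):
    """Yield only regular files that sit under one of `wanted_dirs`."""
    for member in tar.getmembers():
        parts = member.name.split("/")
        # parts[:-1] are the directory components of the member's path.
        if member.isfile() and any(d in parts[:-1] for d in wanted_dirs):
            yield member


def extract_subset(archive_path, dest, wanted_dirs):
    """Extract just the selected directories from the archive into `dest`."""
    with tarfile.open(archive_path) as tar:
        tar.extractall(dest, members=wanted_members(tar, wanted_dirs))


# Usage (archive name and directory names are illustrative only):
#   extract_subset("ecdata.tar.gz", "build", {"allcurves", "allgens"})
```

On recent Python versions `extractall` also accepts a `filter` argument for safer extraction, which is worth setting when the archive comes from the network.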

So I will soon post a new spkg at trac #28372.  Other than that, what is the plan?  I would like to add JP's python script to the ecdata/scripts directory for future use anyway.  I have wondered exactly why Sage's codebase contained this code for creating a specific new optional spkg.  Of course in the same file (src/sage/databases/cremona.py) is code for creating the mini version of the database, but that is something which never needs to be updated.

John

Julien Puydt

Aug 21, 2019, 7:21:53 AM
to sage-...@googlegroups.com, John Cremona
Hi,

On 20/08/2019 at 17:02, John Cremona wrote:
> Julien,
>
> [By the way, the optional spkg which is relevant here is
> database_cremona_ellcurve and *not* elliptic_curves.]

Yes; in Debian:
- sagemath's standard elliptic_curves is
  sagemath-database-elliptic-curves (I'm the maintainer);
- sagemath's optional database_cremona_ellcurve will be
  sagemath-database-cremona-elliptic-curves (I'll be the maintainer).

> I have tried your new py code to produce a new version of the database
> file: it appears to work perfectly, and will compare it with what the
> existing Sage code creates.  I know rather little about SQLITE -- should
> the resulting files be identical?  Even if not, I will try using both
> output db files and running all Sage tests.  The db files have the same
> size (613924864 bytes -- the new version will contain over 500000 more
> curves than the old) so perhaps the only difference is the order in
> which data was added.  I'll test.

Thanks!

> I did also  need to patch sage/databases/cremona.py  since the build()
> function requires SAGE_SHARE to be imported.  Running that, I see that
> (as you may have noticed) there are many files in the ecdata repository
> / tarball which Sage does not use at all.  I adjusted the tar command to
> only extract the ones used.  (It would be possible for Sage to extract
> more and use more, but that is an issue for another day).

Yes, I was pondering just not shipping them.

> So I will soon post a new spkg at trac #28372.  Other than that, what is
> the plan?  I would like to add JP's python script to the ecdata/scripts
> directory for future use anyway.  I have wondered exactly why Sage's
> codebase contained this code for creating a specific new optional spkg. 
> Of course in the same file (src/databases/cremona) is code for creating
> the mini version of the database, but that is something which never
> needs to be updated.

I'm all for using a simple standalone Python script instead of depending
on all of sagemath...

Cheers,

JP

John Cremona

Aug 21, 2019, 7:43:45 AM
to Julien Puydt, SAGE devel
I suggest that you ship the whole thing so that a debian (or derivative) user who installs the package gets everything they could also get by either git-cloning the ecdata repository (but without the git history) or the tarball from github.

It would be quite possible for the Sage interface to extract more of the data than it does.  Since William and I first set up this spkg in about 2006 the extent of the data stored in ecdata has grown -- not only the number of curves, but also more data with each curve.  For example: the integral points on each curve.  I do not plan to do this myself, partly because we are working on a Sage-LMFDB interface which would allow this additional data (and more) to be downloaded directly from the LMFDB, making this spkg redundant except for those who need offline access to the database.

By the way, trac #28372 is ready for review.  I think the version of cremona.db I put in there was the one your script produced.
 

> > So I will soon post a new spkg at trac #28372.  Other than that, what is
> > the plan?  I would like to add JP's python script to the ecdata/scripts
> > directory for future use anyway.  I have wondered exactly why Sage's
> > codebase contained this code for creating a specific new optional spkg.
> > Of course in the same file (src/databases/cremona) is code for creating
> > the mini version of the database, but that is something which never
> > needs to be updated.
>
> I'm all for using a simple standalone Python script instead of depending
> on all of sagemath...

OK.  Can I add your script into the scripts directory of ecdata?

John

Julien Puydt

Aug 21, 2019, 11:50:11 AM
to John Cremona, SAGE devel
Hi,

On 21/08/2019 at 13:43, John Cremona wrote:
> I suggest that you ship the whole thing so that a debian (or derivative)
> user who installs the package gets everything they could also get by
> either git-cloning the ecdata repository (but without the git history)
> or the tarball from github.

Well, my source package has files, but the binary only has the .db file,
so I don't have much motivation to ship a complete source...

There's also the fact that I only need good copyright+license
assignments on files in the source package - so trimming it down is
attractive.

> It would be quite possible for the Sage interface to extract more of the
> data than it does.  Since William and I first set up this spkg in about
> 2006 the extent of the data stored in ecdata has grown -- not only the
> number for curves, also more date with each curve.   For example:  the
> integral points on each curve.  I do not plan to do this myself, partly
> because we are working on a Sage-LMFDB interface which would allow this
> additional data (and more) to be downloaded directly from the LMFDB,
> making this spkg redundant except for those need offline access to the
> database.

The motivation for the packaging is sagemath; if it starts using more of
ecdata, there will still be time to complete my source package.

Cheers,

JP