Loading datasets into sage cells

Steven Clontz

Mar 22, 2021, 4:08:16 PM
to PreTeXt development
An author had started to write a text with sage cells that loaded datasets for analysis in R. They were unpleasantly caught off-guard today: https://twitter.com/TChihMath/status/1374075742267404290 I pointed them to https://groups.google.com/g/pretext-announce/c/Y_C5WDgJlhY

I know we can't control how Sage Cells function (and the security issue is certainly real), but I thought it worth discussing on this end what work might need to be done for books that already exist and depend on such functionality.

Rob Beezer

Mar 22, 2021, 4:24:04 PM
to prete...@googlegroups.com
Thanks, Steven, for monitoring the Twitter-verse. This is not the first
instance in my inbox already.

I hope some sort of whitelisting happens. There's already a suggestion of
whitelisting GitHub, which might be enough?

Working on it...

Rob
> --
> You received this message because you are subscribed to the Google Groups
> "PreTeXt development" group.
> To unsubscribe from this group and stop receiving emails from it, send an email
> to pretext-dev...@googlegroups.com
> <mailto:pretext-dev...@googlegroups.com>.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/pretext-dev/9f99b596-eb6a-48d5-9a7f-5c005413ccd8n%40googlegroups.com
> <https://groups.google.com/d/msgid/pretext-dev/9f99b596-eb6a-48d5-9a7f-5c005413ccd8n%40googlegroups.com?utm_medium=email&utm_source=footer>.

David Farmer

Mar 22, 2021, 5:00:53 PM
to prete...@googlegroups.com

Who would whitelisting be done?

What if only Get was allowed?

What if there were a computer available, if someone were
willing to set up a sage cell server?

Rob Beezer

Mar 22, 2021, 5:32:45 PM
to prete...@googlegroups.com
On 3/22/21 2:00 PM, David Farmer wrote:
> Who would whitelisting be done?

How? GitHub, or by textbook sites?

> What if only Get was allowed?

That might solve part of the problem.

> What if there were a computer available, if someone were
> willing to set up a sage cell server?

(A) I get the impression it is not that easy to administer.

(B) Who is going to patrol the crypto miners?

Rob

Steven Clontz

Mar 22, 2021, 5:48:20 PM
to PreTeXt development
I think encouraging the admins of https://sagecell.sagemath.org/ to whitelist GET requests to github.io (GitHub Pages) resources would be a big help. Maybe they could also make common datasets, e.g. https://vincentarelbundock.github.io/Rdatasets/datasets.html , available as local resources (those may be available as an R package, but I've used them in Python scripts as well).
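
To make the proposal concrete, here is a minimal sketch of what a GET-only allowlist for GitHub Pages hosts could look like. The function name `is_request_allowed` and the suffix list are hypothetical and are not part of SageCell; how the real server filters outgoing traffic is not described in this thread.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: only GitHub Pages hosts (an assumption for illustration).
ALLOWED_HOST_SUFFIXES = (".github.io",)


def is_request_allowed(method: str, url: str) -> bool:
    """Return True only for HTTPS GET requests to an allowlisted host."""
    if method.upper() != "GET":
        return False
    parts = urlsplit(url)
    if parts.scheme != "https" or parts.hostname is None:
        return False
    host = parts.hostname.lower()
    # Matching on the leading dot keeps look-alike hosts such as
    # "evilgithub.io" from passing the check.
    return any(host.endswith(suffix) for suffix in ALLOWED_HOST_SUFFIXES)
```

Under these assumptions, a GET for the Rdatasets page above would pass, while a POST to the same host or a GET to any other domain would be refused.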

Andrey Novoseltsev

Mar 22, 2021, 10:20:01 PM
to PreTeXt development
On Monday, 22 March 2021 at 15:32:45 UTC-6 Rob Beezer wrote:
> On 3/22/21 2:00 PM, David Farmer wrote:
>> Who would whitelisting be done?
>
> How? GitHub, or by textbook sites?
>
>> What if only Get was allowed?
>
> That might solve part of the problem.

It is likely that connections to a list of IP addresses will be allowed and others will not. If you have concrete directions on how to implement something else, or better yet, you are willing to help implement it, please do let me know!

>> What if there were a computer available, if someone were
>> willing to set up a sage cell server?
>
> (A) I get the impression it is not that easy to administer.

For those who can "administer a server" it should be relatively straightforward:

> (B) Who is going to patrol the crypto miners?

That is a trickier question, but on your own server it is completely up to you how to configure a firewall, if at all.

Steven Clontz

Apr 13, 2021, 10:41:00 AM
to PreTeXt development
Got another ping from a [well the same] frustrated author. Would it be somehow possible to send a dataset over the pipe to the SageCell, rather than having the SageCell request the dataset from a server? Or would SageCell be willing to whitelist requests to a trusted server that we can whitelist contributors to? (e.g. could David wire up datasets.aimath.org to serve up datasets from GitHub repos we trust? or better yet, can we decide that it's okay to request files from *.github.io GitHub Pages?)

This is a huge blocker for development of statistics texts in PreTeXt, unless anyone can suggest another workaround.
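
One workaround that needs no network access at all is to paste a small dataset directly into the cell source and parse it there. This is not the "over the pipe" mechanism asked about above, which would need changes on the SageCell side; it is only a minimal sketch, and the column names, values, and the helper `load_inline_csv` are made up for illustration. The same idea should carry over to R cells via the `text` argument of `read.csv`.

```python
import csv
import io

# Hypothetical dataset pasted directly into the cell source, so the cell
# never has to make a network request.
CSV_TEXT = """\
height_cm,weight_kg
150,52
160,58
170,66
"""


def load_inline_csv(text: str) -> list:
    """Parse CSV text into a list of dict rows with float values."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: float(value) for key, value in row.items()} for row in reader]


rows = load_inline_csv(CSV_TEXT)
mean_height = sum(row["height_cm"] for row in rows) / len(rows)
print(mean_height)
```

The obvious cost is that the data lives in the book source, so this is only reasonable for datasets small enough to read comfortably on the page.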

David Farmer

Apr 13, 2021, 10:46:43 AM
to PreTeXt development

I think I am okay with setting up something like

pretextbook.org/data/*

as a place to host moderate amounts of data for use in Sage cells,
if that could be whitelisted. My preference would be to do it for
Sage cells in PreTeXt books (and I do consider stat books a priority).

We would have to discuss policies for who manages it.

David

Sean Fitzpatrick

Apr 13, 2021, 10:49:57 AM
to PreTeXt development
Not a short term solution since it's not implemented, but long term, could Sage cells be replaced by R shiny apps?

Someone would have to run a shiny server to keep costs down.

Steven Clontz

Apr 13, 2021, 11:03:32 AM
to PreTeXt development
I'm not sure what an "R shiny app" is, but since Sage Cells already support R it'd be wise to weigh the technical debt of adding another mechanism for running R code.

Sean Fitzpatrick

Apr 13, 2021, 11:24:09 AM
to PreTeXt development
Shiny is an RStudio package that lets you make fancy web apps powered by R.

https://shiny.rstudio.com/

I think the last time I tried running some R code in Sage cells (some plotting stuff my wife was using for intro stats) there were things that worked in Jupyter but not in a Sage cell.
(Sorry that's not specific. My knowledge of R, and stats generally, doesn't go much beyond the fact that it exists.)

Andrey Novoseltsev

Apr 13, 2021, 12:22:28 PM
to PreTeXt development
Trying to summarize an answer to the recent comments:

GitHub should be accessible from sagecell.sagemath.org at the moment; if it does not work for you, please let me know the exact URL/code you are trying to use.

There is now a system to whitelist (manually) certain IP addresses, so as long as you are on some network with fixed IPs (e.g. you are NOT behind Cloudflare) we can whitelist your specific server/network.
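
As a rough illustration of why fixed IPs matter here, an address-based whitelist boils down to a membership test like the one below. This is only a sketch: the networks are reserved documentation ranges, not real SageCell entries, and the name `is_ip_allowed` is hypothetical. A server whose address changes from one request to the next would have no stable entry to match.

```python
import ipaddress

# Hypothetical allowlist built from reserved documentation ranges (RFC 5737);
# the real SageCell list is not given in this thread.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),      # a whole fixed network
    ipaddress.ip_network("198.51.100.17/32"),  # a single fixed server
]


def is_ip_allowed(address: str) -> bool:
    """Return True if the address falls inside any allowlisted network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in ALLOWED_NETWORKS)
```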

Sending data to cells in some way, allowing particular HTTP requests, setting up other apps for R code: all of these can be done in principle, but they would require substantial time and effort, so they are not likely to happen, especially in the near future.

Andrey

Steven Clontz

Apr 13, 2021, 12:32:32 PM
to prete...@googlegroups.com
You've made at least one author very happy. https://twitter.com/TChihMath/status/1382007836188803074?s=20
