R's package system includes a facility for downloading packages from repositories.
That’s why you need curl. At least in theory.
I'm afraid that we don't have much say in the matter: the R core development team has chosen to rely on curl, and their build system will fail without it. Furthermore, among those 9402 packages, a lot of them may have chosen to follow R's "guidance" and use curl.
If we choose to use another library, we are effectively forking R and a $#!+load of those packages. Somehow, I doubt that we have the workforce to do that credibly...
So, before coming to these extremities, I'd like to explore two avenues:
Question the R Core Team to know how they reconcile their GPL 2-3 license and their use of OpenSSL, and discuss if we can use the same loophole as they did. In which case, all is fine and dandy...
On Thursday, October 27, 2016 at 12:22:37 PM UTC+2, Emmanuel Charpentier wrote:
> I'm afraid that we don't have much say in the matter: the R core development team has chosen to rely on curl, and their build system will fail without it. Furthermore, among those 9402 packages, a lot of them may have chosen to follow R's "guidance" and use curl.
> If we choose to use another library, we are effectively forking R and a $#!+load of those packages. Somehow, I doubt that we have the workforce to do that credibly...
> So, before coming to these extremities, I'd like to explore two avenues:
> Question the R Core Team to know how they reconcile their GPL 2-3 license and their use of OpenSSL, and discuss if we can use the same loophole as they did. In which case, all is fine and dandy...
I don't know how much they thought about it, or rather how many lawyers they paid, but if they ship binaries including libcurl and openssl, that might be risky.
Some people invoke the special exception clause of the GPL...
See:
* https://people.gnome.org/~markmc/openssl-and-the-gpl.html
* https://groups.google.com/forum/#!searchin/sage-devel/openssl|sort:relevance/sage-devel/mbGbpRz96q0/8nsKRNyLs3oJ (OpenSSL license issue)
* https://groups.google.com/forum/#!searchin/sage-devel/openssl|sort:relevance/sage-devel/Jl11JxIb2E8/-1NrQmjbMKIJ (why GnuTLS, which is GPL, is not a drop-in replacement)
If they rely on users having their own libcurl and openssl, then there is no problem.
On 2016-10-27 12:24, Francois Bissey wrote:
> While not configurable by the user in R-3.2.x it would build if
> libcurl wasn’t found or missing https support.
> The change to bail out if you don’t fulfil all the conditions appears
> deliberate to me. And I see the point from a support point of view.
>
> My conclusion is that R developers would file it as "not a bug".
> We really want this.
Can we please develop software without the a priori assumption that
upstream won't cooperate?
It is a most interesting point because it explains why
the R binary installed from epel for RH7.1 and family
isn’t linked to openssl.
If they don’t have a rock solid argument, and even if they
have, there may not be any more official R packages from big
league binary distros. Unless they patch R, their legal arm
will forbid it.
So either they will stop distributing R or they will patch
en masse.
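Whether a given R build is linked to openssl can be checked rather than guessed. Below is a minimal sketch, assuming a Linux system where `ldd` is available; the helper names and the list of library prefixes are my own choices for illustration, not anything R or Sage provides.

```python
import subprocess

# Library name prefixes treated as "OpenSSL" here; an assumption for this sketch.
OPENSSL_PREFIXES = ("libssl.so", "libcrypto.so")


def openssl_libs(ldd_output):
    """Return the OpenSSL shared libraries named in `ldd` output."""
    found = []
    for line in ldd_output.splitlines():
        fields = line.split()
        # The first field of each ldd line is the library's soname.
        if fields and fields[0].startswith(OPENSSL_PREFIXES):
            found.append(fields[0])
    return found


def linked_openssl_libs(binary_path):
    """Run `ldd` on a binary or shared library and report OpenSSL dependencies."""
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True, check=True
    )
    return openssl_libs(result.stdout)
```

One would call `linked_openssl_libs()` on the R shared library (its path varies by distribution, so none is hard-coded here). Note that `ldd` also lists indirect dependencies, so an OpenSSL pulled in through libcurl would show up as well.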
On Thursday, October 27, 2016 at 13:10:17 UTC+2, François wrote:
> It is a most interesting point because it explains why
> the R binary installed from epel for RH7.1 and family
> isn’t linked to openssl.
How old is RH7.1? The introduction of the libcurl requirement dates from the R 3.3.0 release (May 3, 2016).
I note that Debian (notoriously twitchy about license issues) didn't seem to have any qualms packaging 3.3.0 or 3.3.1.
On Thursday, October 27, 2016 at 16:04:52 UTC+2, Jean-Pierre Flori wrote:
> On Thursday, October 27, 2016 at 3:59:26 PM UTC+2, Emmanuel Charpentier wrote:
> > I note that Debian (notoriously twitchy about license issues) didn't seem to have any qualms packaging 3.3.0 or 3.3.1.
> I'd say Debian links curl to GnuTLS.
Probably not: my system's curl-config --protocols says that HTTPS is supported. A trial of curl's configure --without-ssl --with-gnutls does not claim to support it (in the summary printed at the end of ./configure...).
However, I'm not sure: Debian's blurb about libcurl3-gnutls says that it supports https, imaps, ldaps, smtps and pop3s.
To be sure, I'll have to set up a virtual machine with a minimal environment (no openssl, of course) and try to set up libcurl (and, if successful, Sage) in that environment. This is time consuming, so not quite real soon...
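The curl-config check mentioned above can be scripted. This is only a sketch, assuming `curl-config` is on the PATH and prints one protocol name per line; the function names are hypothetical.

```python
import subprocess


def parse_protocols(output):
    """Turn `curl-config --protocols` output (one name per line) into a set."""
    return {line.strip().upper() for line in output.splitlines() if line.strip()}


def curl_supports_https(curl_config="curl-config"):
    """Ask curl-config which protocols the installed libcurl was built with."""
    result = subprocess.run(
        [curl_config, "--protocols"], capture_output=True, text=True, check=True
    )
    return "HTTPS" in parse_protocols(result.stdout)
```

This only reports what the installed libcurl claims to support; it does not say which TLS library (OpenSSL, GnuTLS, ...) provides that support.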
On Thursday, October 27, 2016 at 4:49:23 PM UTC+2, Emmanuel Charpentier wrote:
> Probably not: my system's curl-config --protocols says that HTTPS is supported. A trial of curl's configure --without-ssl --with-gnutls does not claim to support it (in the summary printed at the end of ./configure...).
Are you sure you installed the gnutls dev headers?
GnuTLS provides an SSL implem, not just SSL.
> However, I'm not sure: Debian's blurb about libcurl3-gnutls says that it supports https, imaps, ldaps, smtps and pop3s.
> To be sure, I'll have to set up a virtual machine with a minimal environment (no openssl, of course) and try to set up libcurl (and, if successful, Sage) in that environment. This is time consuming, so not quite real soon...
You mean with gnutls?
Or with no ssl/tls at all?
I already did the latter and it works (modulo hacking R's configure).
But you're right, by default Debian links to openssl: https://packages.debian.org/sid/libcurl3
And indeed curl is not GPL anyway: https://curl.haxx.se/docs/copyright.html
Groumpf.
See https://curl.haxx.se/legal/distro-dilemma.html for more rumbling.
Hi,
<useless rant>
We've been down this road before with Sage, and it's pretty annoying.
I've personally wasted hundreds of hours on it (GNUtls, openssl, etc.)
Programmers playing lawyers have ended up with a broken and
inconsistent legal foundation. There is no easy way out, since only
copyright owners can change licenses. Because this is volunteer open
source, much of the generation that got us into this mess is MIA (or
even dead in some cases). Sigh...
</useless rant>
> So either they will stop distributing R or they will patch
> en masse.
Somehow, I doubt it.
> Somehow, I doubt it.
Probably nobody even bothered to notice or notify e.g. Debian?
Thanks for working on this; how annoying.
I just checked (by installation on a virtual machine) that a *virgin* (base + desktop + usual utilities) Debian stable (jessie) has openssl installed. Tentatively asking for its removal (apt-get remove -s openssl) shows that it would remove a ton of system utilities.
I think you are making it more difficult than it is. I'm pretty sure our binaries already depend on openssl being installed, and we do this under the GPL system library exception. We just can't ship our own openssl (nor would I want to).
Hi,
Regarding the openssl dependency issue, the standard way people
justify getting around it is the "system library exemption", which
allows for GPL'd programs to link in system libraries that are not
GPL'd (otherwise, things like GPL software on MS Windows would be
impossible!). Some links here:
http://opensource.stackexchange.com/questions/2233/gpl-v3-with-openssl-exception
As the person who chose to add R to Sage in the first place, my
instinct on this is that we should **completely and totally remove R
from Sage**. Why?
- Our pexpect based interface to R sucks. It was mostly written by
Mike Hansen and me, so I take the blame. In SageMathCloud Sage
worksheets
we just switched to making the %r mode be implemented using Jupyter's
R kernel, which is way more robust.
- It's easy enough to install R in other ways outside of Sage.
I've heard of a lot of people installing Sage in order to install X
(where X is say Pari or Singular or even Cython at one point); I've
*never* heard of anybody installing Sage in order to get R.
- The main technical reason for installing R into Sage, as opposed
to just finding a system-wide R install, is to ensure that rpy2 -- the
C-level bindings to R -- actually work.
> we should **completely and totally remove R
> from Sage**.
> It's ridiculous that we spend no effort on pandas/statsmodels, and all this
> effort on R.
> For example, I recall that there are some issues involving pandas +
> statsmodels + the sage preparser.
Hi all,
The latest R versions depend on libcurl, and actually more than that: on a libcurl with https support.
So we might want to build our own libcurl with https support (see #21767), but we then need an SSL/TLS implementation, which Sage currently provides only optionally through OpenSSL because of license issues. So we can:
[1] either make R depend on libcurl, which depends on openssl, so that they all become optional,
[2] or make R depend on libcurl, make them standard, and add an SSL/TLS implementation and its development headers as a prereq,
[3] or make libcurl with https support (and development headers) a prereq, which basically means adding an SSL/TLS implementation as a prereq as well,
[4] or make R a prereq,
[5] or drop R support,
[6] or patch R not to use curl,
[7] or patch R to use curl but without https support,
[8] or wait until the end of times,
[9] or a mix of all of this,
[10] or do something else.
What do you think?
Best,
JPF
Dear William,
Sorry to have sounded frightened by *you*: I'm frightened by the amount of *my* ignorance...
Since it seems that I'm (almost) alone among Sage users in being interested in the development of an interface to R, I'm exploring the options. And I'm trying to understand things I was used to *using* (as you use your car without thinking about the thousands of engineering problems that had to be solved to make it "just run"). So I try to list all the possible solutions and rank them, also by a very personal criterion: the amount of new knowledge I will need to use them... Your description of Jupyter kernels makes me put it at the far end of my list...
You do not have to convince me that pexpect is a poor way to do it; I'm convinced: the data interchange involves a serious amount of playing fast and loose with Python and R dumps. Not quite streamlined...
Sincerely,
--
Emmanuel Charpentier
On Thursday, October 27, 2016 at 10:03:03 UTC+2, Jean-Pierre Flori wrote:
> The latest R versions depend on libcurl, and actually more than that: on a libcurl with https support.
> What do you think?
On Mon, Oct 31, 2016 at 6:58 PM, kcrisman <kcri...@gmail.com> wrote:
> All I know is this:
>
> 1) We have "sold" a lot of Sage by saying it's all in there, and at least
> some people have used Sage+R effectively. Estimates of how many vary
> wildly. But non-zero.
Nobody is suggesting deprecating the potential to use Sage+R.
> 2) rpy2 might be there, but as far as I can tell most people who've used
> Sage+R use it via the "dumb pexpect interface".
rpy2 is vastly better than anything else when large amounts of data
need to be shared back and forth between Sage and R. It's the only
option in that case.
> 3) All those thousands of packages are darn useful and can't be easily
> replaced with pandas or anything else.
This is the same as 1.
> 4) kernels are nice but how do you get Sage command line to do that? I
> personally would only know how to do it, sort of, in Jupyter notebook, and
> not at all clear how to send stuff back and forth between Sage and R there.
>
In SageMathCloud we've been supporting actual users of Sage, R, etc.,
for a few years now. People frequently request that we install R
packages. As far as I can tell, 100% of the time they are using R
directly, via scripts, via %r mode in a notebook (either Sagews or
Jupyter).
I've never once heard of anybody actually trying to use
Sage + R yet. As far as I can tell, Sage simply doesn't add anything
of value that people care or know about when they are using R... If
people are using Python to do stats they use the Python tools; if they
are using R, then R is already good enough for what they are doing,
and Sage doesn't add anything.
> My conclusion:
>
> 1) Definitely don't remove R from Sage, unless we go back on the mission
> statement.
The mission statement is to create a viable open source alternative to
Magma, Maple, Mathematica, and Matlab. Not including R in Sage does
not mean going back on that mission statement.
> We want to be able to easily use all those extra packages with
> algebra and graph theory. (Not to mention current users.)
> 2) Finding a way to deprecate (!!!) current R behavior is viable, but should
> be a longer-than-normal period. Including graphics and other stuff. If
> rpy2 is the answer, great; I recall it being deemed not as satisfactory at
> some point but maybe no one ever tried since pexpect was "good enough".
We could remove R from sage-7.5 and the current behavior would likely
work exactly as is (via pexpect) as long as R is installed somewhere
on the computer. In fact, things would be better, since all install
issues involving R itself would be the responsibility of the R
project.
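To make "as long as R is installed somewhere on the computer" concrete, here is a minimal sketch of locating a system-wide R from Python. It is an illustration under my own assumptions (R reachable through PATH, hypothetical function names), not a description of how Sage's pexpect interface actually finds R.

```python
import shutil
import subprocess


def find_system_r(path=None):
    """Return the full path of an `R` executable found on PATH, or None."""
    # path=None makes shutil.which fall back to the PATH environment variable.
    return shutil.which("R", path=path)


def r_version_banner(r_executable):
    """Return the first line printed by `R --version`."""
    result = subprocess.run(
        [r_executable, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()[0]
```

A caller would check `find_system_r()` for `None` first and only then ask for the version banner, so that a missing R gives a clear message instead of a failed subprocess.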
> 3) Excellent idea to integrate pandas or whatever the current flavor of the
> month is far better!
pandas is not just some "flavor of the month" stats library. It's
been around since at least 2010, and is a sort of de facto standard at
this point. There's at least one O'Reilly book on it.
pandas consists of the following things:
- A set of labeled array data structures, the primary of which are Series and DataFrame
- Index objects enabling both simple axis indexing and multi-level / hierarchical axis indexing
- An integrated group by engine for aggregating and transforming data sets
- Date range generation (date_range) and custom date offsets enabling the implementation of customized frequencies
- Input/Output tools: loading tabular data from flat files (CSV, delimited, Excel 2003), and saving and loading pandas objects from the fast and efficient PyTables/HDF5 format.
- Memory-efficient “sparse” versions of the standard data structures for storing data that is mostly missing or mostly constant (some fixed value)
- Moving window statistics (rolling mean, rolling standard deviation, etc.)
- Static and moving window linear and panel regression
And that's all. Data management, a whiff of vanilla descriptive statistics.
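For readers who have not used pandas, a few of the items in that list look like this in practice. This is a small sketch on made-up toy data (the column names and values are hypothetical), using `date_range`, `DataFrame`, `groupby` and `rolling` from a reasonably recent pandas.

```python
import pandas as pd

# Hypothetical toy data: six daily measurements alternating between two groups.
dates = pd.date_range("2016-10-01", periods=6, freq="D")
df = pd.DataFrame(
    {
        "group": ["a", "b", "a", "b", "a", "b"],
        "value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    },
    index=dates,
)

# Group-by engine: mean of "value" within each group.
group_means = df.groupby("group")["value"].mean()

# Moving window statistics: mean over a 3-row window.
# The first two rows have no complete window and come out as NaN.
rolling_mean = df["value"].rolling(window=3).mean()
```

This covers the data management and descriptive side only; model fitting is what statsmodels (or R) would be used for.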
> 4) But if no one is willing/able/whatever to make these new interfaces,
> keeping pexpect is definitely far better than simply jettisoning R.
> (Note e.g. people who have used Sage cell server to embed R graphics in web
> pages - not sure how that would be affected, if it's a separate R process or
> via Sage?)
Andrey would type "apt-get install R" (or whatever) and be done.
- Python stats have come a *LONG* way in the last 10 years, with
libraries like Pandas. Why use rpy2 when you can much more
effectively use pandas and statsmodels and so on?
In my opinion, it would be way, way better to completely remove R from
Sage and instead do the following:
1. Include the R jupyter kernel config files.
2. Include the modern Python stats libraries pandas and
statsmodels in Sage.
Our time would be much better spent supporting 2 than 1. It's
ridiculous that we spend no effort on pandas/statsmodels, and all this
effort on R. That was a strategy that made sense 10 years ago, but
not today.
For example, I recall that there are some issues involving pandas +
statsmodels + the sage preparser. We could put effort into
addressing those, like Robert Bradshaw did with numpy (which used to
be very unhappy with Sage integers, reals, etc.). Fixing this stuff
probably wouldn't be hard, and would make Sage a better environment
for stats. There may be similar remarks around machine learning,
where Python has really come into its own recently (e.g., see
tensorflow).
> Nobody is suggesting deprecating the potential to use Sage+R.
That's replacing "It's there" by "It can be done". Not the same thing... unless you're a theologian or a politician.
A third point, distinct from the previous two, is that William deems the current pexpect interface to be insufficient. Having suffered with it, I tend to concur. But I think that the pexpect interface is *still* useful.
> I've never once heard of anybody actually trying to use
> Sage + R yet.
Can you hear me now?
It adds a lot: availability of tools much better at expressing *structures* than anything available on the R side (think graph theory, symbolic computation, etc.). That's not as popular among applied statisticians as computing "test p-values" (the present), but remains the core problem of statistics ("given some data, what is the 'right' structure that accounts for them?", i.e. the future...).
I would say we even have a shorter-term solution: close the tickets for including curl (modulo adding a SAGE_FAT_BINARY mode to avoid overlinking)
and updating R (modulo adding a 6-line patch to not check for https support;
note that most of the time https will be supported anyway).