Small issues with MRO R packages


Shaun Walbridge

Mar 22, 2018, 11:22:40 PM
to conda - Public
Hello,

Great to see the MRO packages and the continued work on putting up more of the R ecosystem!

I recently installed a small collection of packages via the `r` channel and noticed a couple of possible issues with the mro-feedstock. The bigger of the two is that each of the packages I installed included an info/recipe/parent directory of significant size, mainly because of the 40MB of test data. It looks like the behavior introduced by conda-build PR #2687 probably isn't wanted here, or the test data may need to be omitted because of its size. This doesn't affect the environment itself, but it does waste space with each extracted package in `pkgs`, which adds up quickly, and it makes the package downloads larger than necessary. It looks like it should be possible to have these data files downloaded from GitHub when the tests are executed instead of including them directly in the recipe itself.
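For anyone who wants to check this on their own machine, here is a minimal sketch that totals the size of each extracted package's info/recipe/parent directory. The function names and the `miniconda3/pkgs` location are assumptions for illustration, not part of conda itself; point it at wherever your package cache actually lives.

```python
import os
from pathlib import Path


def dir_size(path: Path) -> int:
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = Path(root) / name
            # Skip symlinks so only files stored in this tree are counted.
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def recipe_parent_sizes(pkgs_dir: Path) -> dict:
    """Map each extracted package name to the size of its info/recipe/parent dir."""
    sizes = {}
    if not pkgs_dir.is_dir():
        return sizes
    for pkg in sorted(pkgs_dir.iterdir()):
        parent = pkg / "info" / "recipe" / "parent"
        if parent.is_dir():
            sizes[pkg.name] = dir_size(parent)
    return sizes


if __name__ == "__main__":
    # Hypothetical cache location; adjust to your own installation.
    pkgs = Path.home() / "miniconda3" / "pkgs"
    for name, size in recipe_parent_sizes(pkgs).items():
        print(f"{name}: {size / 1e6:.1f} MB")
```

Packages without an info/recipe/parent directory are simply left out of the result, so the output lists only the packages affected.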

The second potential issue: the base package creates duplicate DLLs. "Rlapack.dll.mkl" is also copied as the (used) "Rlapack.dll", there are nomkl versions as well, and the same goes for Rblas. Perhaps these are part of a pattern still in progress, but from what I can tell there is no "nomkl" context here, and omitting these would save ~60MB.
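One way to confirm that two DLLs really are byte-for-byte copies is to group files by content hash. This is a rough sketch using only the Python standard library; `find_duplicates` is a hypothetical helper, and the directory you pass it (for example the folder holding the R DLLs) is up to you.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(directory: Path) -> list:
    """Return lists of paths under directory whose contents are identical."""
    groups = defaultdict(list)
    for path in sorted(directory.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Note that identical content does not by itself mean doubled disk usage: hardlinked files also hash the same, so compare inode numbers if that distinction matters.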

Thanks again for all the great work, and if you'd like an issue (or even a PR), let me know; it wasn't clear whether the aggregateMRO repository was intended to be public-facing.

Cheers,
Shaun


Ray Donnelly

Mar 23, 2018, 4:57:31 AM
to Shaun Walbridge, conda - Public
Omitting test data due to size isn't something I will do. Disk space is cheap and 40MB is not so much (yes, it unpacks to more, but still, the package cache is unpacked only once and hardlinks are used). We want our tests to work in "air-gapped" situations too, especially for something as important as the MRO R interpreter. Sorry, I will resist any suggestion to change this at all.
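To illustrate the hardlink point in general terms (this is not conda's own code, just a sketch of the filesystem behavior it relies on): a hardlinked file is a second name for the same data, so it does not consume the space twice. The file names below are made up for the demonstration.

```python
import os
import tempfile
from pathlib import Path


def demo_hardlink() -> tuple:
    """Create a file plus a hardlink to it; report whether they share storage."""
    with tempfile.TemporaryDirectory() as tmp:
        cached = Path(tmp) / "pkgs_copy.bin"  # stands in for a file in the package cache
        cached.write_bytes(b"x" * 4096)
        linked = Path(tmp) / "env_copy.bin"  # stands in for the same file in an environment
        os.link(cached, linked)  # hardlink: a new name, no new data blocks
        same_file = os.path.samefile(cached, linked)
        link_count = cached.stat().st_nlink
        return same_file, link_count


if __name__ == "__main__":
    print(demo_hardlink())
```

On a filesystem that supports hardlinks, both names resolve to the same file and the link count is 2. Across different drives or volumes a hardlink cannot be created, in which case a real copy would be needed.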

Rlapack.dll.mkl comes with MRO. This is not a pattern in progress; it is a deliberate decision on our part to include both the MKL-optimized libs and the GPL ones for GPL compliance (and benchmarking) reasons. GPL-wise, if you uninstall r-revoutilsmath then you will have redistributable software, and the mro-base package itself is fully GPL compliant.

aggregateMRO is intended to be public-facing (and PRs for many things would be considered) but I will reject any PRs to change these along the lines you are suggesting.


Thanks,

Ray.



Ray Donnelly

Mar 23, 2018, 5:01:34 AM
to Shaun Walbridge, conda - Public
Oh, actually, I didn't read carefully enough. OK, 40MB per package is a bit more of an issue (but it's not for every R package, just the ones that come from that recipe, i.e. the 30-odd MS repacks). I will think about that one some more then.

Shaun Walbridge

Mar 23, 2018, 4:46:01 PM
to Ray Donnelly, conda - Public
Got it on the MKL split. I figured I'd mention it since it doesn't map directly onto the model used with Python in conda and e.g. the nomkl package.

In terms of the testing, yes, it makes sense to retain the data with the package so that the tests will work in an air-gapped environment. A possible compromise would be to just trim the dataset. I don't see anything in the tests that specifically requires that volume of data to confirm the software is functioning correctly, but of course I could be missing something. Thanks for taking a look. Yes, as you mentioned, it's only really an issue in aggregate: installing here pulled in 31 MRO packages, for a total of 1240MB of disk usage.