HPC Multi Node/Cluster Install Question


Aug 10, 2023, 9:13:54 AM
to sage-s...@googlegroups.com
I've done some research, googled around, searched through Ask Sage, looked at some of the thematic tutorials, and have finally come to this Google group. (Ask Sage, in all fairness, never approved this post and deleted both it and me, and I have no idea why, since it was from my corporate email address. But "ask a question" on the Sage page leads there...)

Can anyone point me towards some documentation, how-tos, or a git gist on installing Sage on a multi-node HPC cluster? We have a job manager (PBS family, so like PBS Pro or Torque) and multiple compute nodes, and we use modules for the environment variables.

That said, the best approach to laying this down is not clear to me. Some software lends itself to a shared-folder install (though that is often built from source or installed on a master node), other software to a local install on each node. Maybe it is some kind of mental block, but I have installed others without issue and have sliced and diced this question before.

Just curious about the best approach to installing Sage. (I did find this bit for MPI support, sage -pip install mpi4py, but I can't imagine there isn't more information that I am somehow just not searching right for.)

Even a new direction or place to search would be a great answer.



Nils Bruin

Aug 10, 2023, 2:07:49 PM
to sage-support
This response is very much in the "new direction or place to search" category.

Sagemath's build process has been undergoing quite some changes. It used to be the case that sage-the-distribution kept virtually everything in-house, so that an install on a shared folder would work great on a cluster with many (identical) nodes. The build system now has a much stronger preference for relying on system-supplied components. As long as all nodes are still identical *and have the same system-supplied components in place*, it should still work.
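One rough way to check the "identical nodes" condition is to compute a small fingerprint on each node and compare the results. The following is only a sketch under stated assumptions: the `TOOLS` list is hypothetical (Sage's actual system dependencies are not listed here), and it compares tool locations, not versions, so matching digests are a hint and not a guarantee.

```python
import hashlib
import platform
import shutil

# Hypothetical list of system tools to look for; replace with whatever
# your build actually relies on.
TOOLS = ["gcc", "gfortran", "make", "python3"]


def node_fingerprint(tools=TOOLS):
    """Return (details, digest) summarizing this node's platform and tool paths."""
    details = {
        "machine": platform.machine(),
        "system": platform.system(),
        "release": platform.release(),
        "python": platform.python_version(),
    }
    for tool in tools:
        # shutil.which returns None when the tool is not on PATH.
        details["tool:" + tool] = shutil.which(tool) or "missing"
    # Sort the keys so the digest does not depend on insertion order.
    blob = "\n".join(f"{key}={value}" for key, value in sorted(details.items()))
    return details, hashlib.sha256(blob.encode()).hexdigest()


if __name__ == "__main__":
    details, digest = node_fingerprint()
    # Print hostname and digest; differing digests point at differing nodes.
    print(platform.node(), digest)
```

Running this on each compute node (for example from a small PBS job) and comparing the printed digests should flag nodes whose platform or tool paths differ from the rest.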

Sagemath is now quite viably buildable in conda, where conda is used to supply many components. So if a plain install isn't working on a cluster (or looks potentially problematic), it could be very instructive to read up on how conda deals with such situations -- that might well be applicable to sagemath (built in conda) now as well.

Nathan Dunfield

Aug 10, 2023, 2:22:33 PM
to sage-support
I have used Sage extensively on a couple HPC clusters.  While I used to build it from source, I now use conda/mamba to install Sage and strongly recommend this approach today.

As for a local or shared install, Sage opens a staggeringly large number of files on startup.  On a shared file system, this can result in very slow startup times (30 seconds or even more) even though starting Sage takes just a few seconds on one's laptop.
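To see how much the file system matters on a given cluster, one can time the startup of the same command from a shared install and from a node-local one. A minimal sketch, assuming nothing beyond the Python standard library; the helper name `time_startup` and the script name in the usage note are made up for illustration.

```python
import subprocess
import sys
import time


def time_startup(cmd, repeats=3):
    """Run cmd `repeats` times and return the wall-clock time of each run in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard output; we only care how long the process takes to start and exit.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Default to timing the Python interpreter itself so the script runs anywhere;
    # pass a different command on the command line to time that instead.
    cmd = sys.argv[1:] or [sys.executable, "-c", "pass"]
    for i, seconds in enumerate(time_startup(cmd), start=1):
        print(f"run {i}: {seconds:.2f} s")
```

For example, `python time_startup.py sage -c "print(1)"` would time a trivial Sage invocation. The first run is often the slowest because later runs benefit from file caching, so it is worth looking at all the timings and not just the average.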




Aug 10, 2023, 3:13:15 PM
to sage-support
Thanks. That makes sense (even if it tweaks my sense of consistency), and it might make more sense since MPI is built in via Python as well... We do have conda on the cluster, and it puts them more or less in control of their process.
