Hi,
I'm using Conda in CVMFS and am running into rapidly growing space usage in the CVMFS repository (filesystem). This is in part because CVMFS does not support cross-directory hardlinks, so installing packages into environments results in multiple copies of the same package contents.
CVMFS is an HTTP-based, FUSE-implemented, read-only filesystem designed for software distribution. It stores the "master" copy of a repository in content-addressable storage (CAS), i.e. hashed file chunks on a local filesystem, on a host called a stratum 0. Changes are propagated out from stratum 0 servers via HTTP and ultimately reach clients, which mount the repository via FUSE. When you make changes to the repository on the stratum 0 (i.e. begin a transaction), CVMFS first mounts the CAS read-only using the CVMFS FUSE client. It then uses AUFS to mount a writable filesystem that is unioned with the read-only mount. You make your changes and, when done, publish them. The difference between the read-write set and the read-only set becomes a snapshot.
This AUFS filesystem supports cross-directory hardlinks just fine. But when the changes are published, cross-directory hardlinks are broken and become multiple copies of the same file.
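To put a rough number on the effect, here is a minimal Python sketch that walks a directory tree and estimates how many extra bytes the tree would contain if every hardlink group spanning several directories turned into independent copies. The function name `estimate_hardlink_duplication` is made up for this illustration. It assumes hardlinks within a single directory survive publishing and that one extra copy appears per additional directory; it says nothing about how CVMFS stores the resulting files internally.

```python
import os
import stat
from collections import defaultdict


def estimate_hardlink_duplication(root):
    """Estimate extra bytes if cross-directory hardlinks under `root` became copies.

    Assumption (not verified against CVMFS): links inside one directory are
    preserved, and each additional directory costs one more full copy.
    """
    # Map (device, inode) -> list of (path, size) for multiply-linked regular files.
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat so symlinks are not followed
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                groups[(st.st_dev, st.st_ino)].append((path, st.st_size))

    extra_bytes = 0
    for entries in groups.values():
        # Count the distinct directories that share this inode.
        directories = {os.path.dirname(path) for path, _size in entries}
        if len(directories) > 1:
            size = entries[0][1]  # all links to one inode have the same size
            extra_bytes += (len(directories) - 1) * size
    return extra_bytes
```

Running it on the conda install prefix inside an open transaction (before publishing) would give an estimate of the growth caused by broken hardlinks alone.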
Conda falls back to symlinks if hardlinks are impossible, but unfortunately, this doesn't work in the CVMFS case, since conda is able to create hardlinks at runtime and doesn't know they will be broken later.
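The following Python sketch illustrates the general "try a hardlink, fall back to a symlink" pattern and why it never reaches the fallback here. It is not conda's actual code; `link_with_fallback` is a hypothetical helper written only for this illustration.

```python
import os


def link_with_fallback(src, dst):
    """Hardlink `src` to `dst`; use a symlink only if the hardlink fails.

    Returns "hardlink" or "symlink" to show which branch was taken.
    """
    try:
        os.link(src, dst)
        return "hardlink"
    except OSError:
        # Reached only when the filesystem refuses the hardlink at call time.
        os.symlink(src, dst)
        return "symlink"
```

On the writable AUFS mount, `os.link` succeeds, so a helper like this returns "hardlink" every time. Nothing at link time signals that the link will be broken when the transaction is published.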
So, is it possible to force conda to use symlinks? The only config option I can see related to soft/hardlinks is to disable the symlink fallback. I need it to prefer symlinks to hardlinks, not the other way around.
Thanks,
--nate