A common mechanism for provisioning the storage required for Data Science & AI Workbench persistence involves the use of a Network File System (NFS) server. This includes many cloud offerings, such as Amazon EFS and Google Filestore, and many on-premises NAS/SAN implementations. In this section, we provide specific recommendations for server- and client-side configuration. These recommendations should be used to augment the general storage requirements offered on our install requirements page.
Increase the number of threads created by the NFS daemon to at least 64, to reduce the likelihood of contention between multiple users. For information on how to do this, see your operating system documentation; for instance, this Red Hat article.
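To confirm that a change to the thread count took effect, the current value can be read back on the server. The following is a minimal sketch, assuming a Linux server running the kernel NFS daemon, which exposes the count in `/proc/fs/nfsd/threads`; the helper names are illustrative and not part of any Anaconda tooling.

```python
from pathlib import Path

RECOMMENDED_MIN_THREADS = 64  # minimum suggested in the text above


def parse_thread_count(raw: str) -> int:
    """Parse the content of the nfsd threads file (a single integer)."""
    return int(raw.strip())


def has_enough_threads(raw: str, minimum: int = RECOMMENDED_MIN_THREADS) -> bool:
    """Return True when the reported thread count meets the minimum."""
    return parse_thread_count(raw) >= minimum


if __name__ == "__main__":
    # Assumed location; only present on a Linux host running the kernel NFS server.
    proc_file = Path("/proc/fs/nfsd/threads")
    try:
        raw_value = proc_file.read_text()
    except OSError as exc:
        print(f"Could not read {proc_file}: {exc}")
    else:
        count = parse_thread_count(raw_value)
        status = "OK" if has_enough_threads(raw_value) else "too low"
        print(f"nfsd threads: {count} ({status})")
```

How the value is changed persistently differs between distributions, so the operating system documentation remains the reference for that step.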
If possible, use this file server as your administration server as well. This is a great way to manage and administer this persistence. If this is not possible, make sure to export the volume to the administration server as well as the Kubernetes cluster.
If you intend to use the same server for both anaconda-storage and anaconda-persistence, then you should consolidate to a single PersistentVolume, as discussed in the general storage requirements.
In many environments, the performance of the volume (e.g., IOPS) is tightly coupled to the size of the disk. For this reason, Anaconda recommends over-provisioning the size of the disk to take advantage of this. In some environments, IOPS can be provisioned separately, but it can still be cost-effective to over-provision size instead.
Anaconda recommends against the use of the root_squash option. While it seems a sensible option for security reasons, in practice we find that it too often leads to unexpected permissions issues. That said, a similar and more reliable option is to use the all_squash option along with anonuid and anongid. This effectively forces all remote access to be translated to the same UID and GID on the server. In summary, in order of preference, Anaconda recommends:
When mounting the NFS share, Anaconda recommends overriding the default read and write block sizes by using the options rsize=65536,wsize=65536. Smaller block sizes are preferred because the creation of conda environments frequently involves the manipulation of thousands of smaller files. Large block sizes result in significant inefficiency.
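The server-side squash options and the client-side block sizes discussed above can be sketched as a pair of small helpers that render an `/etc/exports` entry and an `/etc/fstab` entry. This is an illustration only: the function names, paths, client range, and the `rw,sync` base options are assumptions, and treating "no root_squash" as the `no_root_squash` export option is our reading of the recommendation rather than something the text spells out.

```python
from typing import Optional


def exports_line(path: str, client: str,
                 anon_uid: Optional[int] = None,
                 anon_gid: Optional[int] = None) -> str:
    """Render one /etc/exports entry.

    With no anonymous IDs, root squashing is disabled (no_root_squash).
    With both IDs given, all remote access is mapped to that UID/GID
    (all_squash with anonuid/anongid).
    """
    if (anon_uid is None) != (anon_gid is None):
        raise ValueError("set both anon_uid and anon_gid, or neither")
    options = ["rw", "sync"]  # assumed base options, adjust as needed
    if anon_uid is None:
        options.append("no_root_squash")
    else:
        options += ["all_squash", f"anonuid={anon_uid}", f"anongid={anon_gid}"]
    return f"{path} {client}({','.join(options)})"


def mount_options(block_size: int = 65536) -> str:
    """Render the read/write block size mount options."""
    return f"rsize={block_size},wsize={block_size}"


def fstab_line(server: str, export_path: str, mount_point: str) -> str:
    """Render an /etc/fstab entry that mounts the share with those options."""
    return f"{server}:{export_path} {mount_point} nfs {mount_options()} 0 0"


if __name__ == "__main__":
    # Hypothetical server name, export path, and client network.
    print(exports_line("/srv/anaconda", "10.0.0.0/24"))
    print(exports_line("/srv/anaconda", "10.0.0.0/24", anon_uid=1000, anon_gid=1000))
    print(fstab_line("nfs.example.com", "/srv/anaconda", "/mnt/anaconda"))
```

When the `all_squash` form is used, the GID passed as `anon_gid` is the value that the rest of the configuration must agree on.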
: you can give this any name you wish, or adhere to our conventions of anaconda-storage and/or anaconda-persistence. This name will ultimately be supplied to the Helm chart values. Note that this name appears in three places; use the same value for all.
: this is the group ID which has write access to the volume. As discussed above, the recommended value is 0; but if you are forced to use root_squash or all_squash, make sure this has the value of the selected GID. The quotes must be preserved.
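To illustrate why the quotes around the group ID matter, here is a rough sketch of a generic NFS-backed PersistentVolume built as a Python dictionary. It is not the template these placeholders belong to: the use of the `pv.beta.kubernetes.io/gid` annotation to carry the GID, the capacity, the access mode, and the server and path values are all assumptions made for the example. Kubernetes annotation values must be strings, which is why the GID is converted with `str()` and ends up quoted in the output.

```python
import json


def persistent_volume(name: str, server: str, path: str, gid: int,
                      size: str = "500Gi") -> dict:
    """Build a generic NFS PersistentVolume manifest (illustrative only)."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {
            "name": name,
            # Annotation values must be strings, so the GID stays quoted.
            "annotations": {"pv.beta.kubernetes.io/gid": str(gid)},
        },
        "spec": {
            "capacity": {"storage": size},  # hypothetical size
            "accessModes": ["ReadWriteMany"],
            "mountOptions": ["rsize=65536", "wsize=65536"],
            "nfs": {"server": server, "path": path},
        },
    }


if __name__ == "__main__":
    # Hypothetical server and export path; GID 0 as recommended in the text.
    manifest = persistent_volume(
        "anaconda-persistence", "nfs.example.com", "/srv/anaconda", gid=0
    )
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(manifest, indent=2))
```

If the GID were emitted as a bare number instead of a string, the manifest would not satisfy the string-valued annotation requirement, which is the practical reason for preserving the quotes.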