Notice of maintenance outage: Huckleberry to be offline Monday - Friday, April 15 - 19, 2019


ARC

Apr 12, 2019, 12:07:38 PM
to arc_huckleb...@vt.edu
The Huckleberry cluster will be taken offline on Monday, April 15 at 5:00 AM to update and reprovision the machines in the cluster. The planned updates are comprehensive and include IBM firmware, NVIDIA P100 GPU firmware, CUDA software, and a Linux operating system release sufficient to run the latest IBM PowerAI and other ML/DL software. Based on vendor advertisements and supported by our own tests, these updates will provide major improvements in functionality, performance, and throughput for many workloads, including ML/DL applications.

Slurm scheduler and Linux operating system configurations will also be modified to align Huckleberry with the standards established on ARC's Dragonstooth and Cascades clusters.

ARC is coordinating with IBM to host a hackathon that will capitalize on the updates and improvements being made to the Huckleberry cluster. More information about the hackathon will be provided in a separate announcement.

~ARC
