Dear Astra-sim Team,
My name is Zhongkai Yu, and I am a second-year Ph.D. student in Computer Science and Engineering at UC San Diego. I am currently working on deploying Astra-sim to simulate the performance of the NVL72 system with 72 GPUs.
However, I noticed that the simulation relies on trace files generated from PyTorch, and the only trace files I could find in the GitHub repo are for systems with 2 to 16 GPUs. Since we do not have access to a 72-GPU system, could you suggest how we might generate or obtain a suitable trace file for a 72-GPU setup?
Thank you for your time, and for providing such a well-designed tool; it has saved us a great deal of effort in our research.
Best regards,
Zhongkai Yu