[ANNOUNCEMENT] Kubeflow Trainer v2.1.0 is released!


Andrey Velichkevich

Nov 7, 2025, 9:20:49 AM
to kubeflow-discuss, wg-b...@kubernetes.io
Hi Folks,

We’re excited to announce that Kubeflow Trainer v2.1.0 is officially released!

$ helm install kubeflow-trainer oci://ghcr.io/kubeflow/charts/kubeflow-trainer --version 2.1.0

$ pip install -U kubeflow
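
Once the controller is installed, training jobs are submitted as TrainJob resources that reference a runtime. The manifest below is a minimal sketch, assuming the `torch-distributed` runtime shipped with the chart; the job name, image, and node count are illustrative placeholders, not values from this release.

```yaml
# Minimal TrainJob sketch (v1alpha1 API); runtime name, image, and
# numNodes are illustrative assumptions — adjust for your cluster.
apiVersion: trainer.kubeflow.org/v1alpha1
kind: TrainJob
metadata:
  name: example-trainjob
spec:
  runtimeRef:
    name: torch-distributed   # one of the runtimes bundled with the chart
  trainer:
    image: docker.io/example/train:latest  # placeholder training image
    numNodes: 2                            # number of training nodes
```

Apply it with `kubectl apply -f trainjob.yaml` and watch progress with `kubectl get trainjobs`.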


Major Highlights:

🚀 Distributed Data Cache
Load massive datasets efficiently in-memory, maximize GPU utilization with zero-copy transfer, and minimize I/O for large-scale pre- or post-training distributed AI workloads.

⏱️ Enhanced Kueue Integration
Topology Aware Scheduling for optimized Pod placement and reduced inter-node network traffic, which is crucial for large-scale training on advanced GPUs like the GB200.

🔥 MLX Runtime with CUDA Support
Fine-tune and evaluate LLMs across multiple GPUs using MLX and mlx-lm with Kubernetes + OpenMPI.

🧩 Official Volcano Scheduler Support
Network topology awareness and advanced scheduling for improved TrainJob orchestration.

🧠 LLM Post-Training Enhancements
Supports LoRA, QLoRA, and DoRA for parameter-efficient fine-tuning with BuiltinTrainers.
Adds new post-training runtime for Qwen2.5.

📣 Release announcement: https://bit.ly/4qO8iM1

📝 Release notes: https://github.com/kubeflow/trainer/releases/tag/v2.1.0


Massive thanks to all contributors who made this possible!



Regards,
Andrey