Bug#1064257: ITP: rocm-tensile -- ROCm tool for generating and benchmarking assembly kernels

Cordell Bloor

Feb 19, 2024, 1:00:05 AM
Package: wnpp
Severity: wishlist
Owner: Cordell Bloor <cg...@slerp.xyz>
X-Debbugs-Cc: debian...@lists.debian.org, cg...@slerp.xyz, debi...@lists.debian.org

* Package name    : rocm-tensile
  Version         : 6.0.2
* URL             : https://github.com/ROCm/Tensile
* License         : Expat
  Programming Lang: Python, HIP
  Description     : ROCm tool for generating and benchmarking assembly kernels

Tensile is a set of tools and libraries primarily for selecting
parameters of GPU kernels implementing the general matrix multiply
(GEMM) operation. There are three components that comprise Tensile:
.
1. A command-line tool for generating kernels, benchmarking them, and
   saving the parameters used for generating the best kernels (a.k.a.
   "solutions") in YAML files.
2. A build system component that reads YAML solution files, generates
   kernel source files, and invokes the compiler to turn them into code
   object files. The kernels are indexed by their parameters in either
   YAML or MessagePack format within a TensileLibrary file.
3. A runtime library for loading and executing the best available
   solution for a given set of GEMM input parameters (a.k.a. "a problem").
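
To make the "problem" to "solution" mapping in the third component concrete, here is a toy sketch in Python. It is not Tensile's API, file format, or actual selection logic: the Problem type, the kernel names, the TUNED table, and the nearest-size rule are all assumptions made up for illustration. It only shows the general idea of picking, for a given GEMM size, the kernel that was found best for the closest benchmarked size.

```python
import math
from typing import NamedTuple


class Problem(NamedTuple):
    """GEMM sizes for C[m x n] = A[m x k] * B[k x n] (hypothetical type)."""
    m: int
    n: int
    k: int


# Hypothetical benchmark results: problem size -> name of the fastest kernel.
# In the real tool this information would come from the tuned solution files.
TUNED = {
    Problem(128, 128, 64): "kernel_mt32x32",
    Problem(1024, 1024, 1024): "kernel_mt64x64",
    Problem(4096, 4096, 4096): "kernel_mt128x128",
}


def select_solution(problem, tuned=TUNED):
    """Return the kernel tuned for the benchmarked size nearest to `problem`.

    Distance is Euclidean in log2 space, so a 2x change in any dimension
    counts the same at small and large sizes. This metric is an assumption.
    """
    target = [math.log2(v) for v in problem]

    def distance(benchmarked):
        return math.dist([math.log2(v) for v in benchmarked], target)

    return tuned[min(tuned, key=distance)]


if __name__ == "__main__":
    print(select_solution(Problem(3000, 3000, 3000)))
```

Under these assumptions, adding a tuned kernel for a new problem size amounts to adding one more entry to the table, which is why a packaged tuning tool matters for sizes the existing table covers poorly.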

The rocm-tensile library sources are currently packaged as part of
rocblas in a multi-upstream tarball package, but they should be split
out so that the command-line tool can be packaged. Tensile kernels are a
vital part of the performance of the rocblas library. It is often
necessary to add tuned kernels for particular problem sizes to achieve
optimal performance in a new application or on a new hardware
architecture. The command-line tool is therefore an important
development tool for BLAS performance on AMD GPUs.

A fork of the Tensile library is also used by hipblaslt. Splitting
Tensile out from the rocblas package may be helpful in preventing the
duplication of embedded copies. The Tensile library can also be used by
MIOpen.

This package is part of AMD's ROCm stack and will be maintained under
the Debian AI team umbrella.