Compiling Magma for arm architecture


Ashar Alam

Nov 11, 2020, 12:23:00 PM
to MAGMA User

Is it possible to cross-compile and use MAGMA for ARM architecture and embedded processors; or is only x64 architecture supported?



Stanimire Tomov

Nov 11, 2020, 2:54:48 PM
to Ashar Alam, MAGMA User
Hi Ashar,

Yes, it is possible to cross-compile and use MAGMA on ARM.
We have done it successfully on NVIDIA’s Tegra devices.
I assume you would still want to use a GPU where the host is powered by ARM.

Best regards,


Ashar Alam

Nov 11, 2020, 9:59:32 PM
to MAGMA User, Ashar Alam
Hi Stan,

Thank you so much for your reply. I think I want to use it on the AGX Xavier. Is there a guide or document on the compilation procedure?



Stanimire Tomov

Nov 11, 2020, 11:27:37 PM
to Ashar Alam, MAGMA User

You may have to look at the AGX Xavier development kit documentation to see how to compile for it,
how to get the compilers, etc.
After that, compiling MAGMA is the same as on any other system; see the instructions in the README.
There are a number of example make.inc files included, but none for arm.
We need to add one, or if you manage to compile it and are willing to contribute your make.inc,
that would be great.
The last time I compiled for an arm machine I used a make.inc like this:
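As a rough sketch of what the README's build flow looks like (the version number and example file name here are placeholders; pick whichever example make.inc is closest to your setup):

```shell
# Sketch of the usual MAGMA build flow; adjust paths for your system.
cd magma-2.x.x
cp make.inc-examples/make.inc.openblas make.inc   # start from the closest example
# edit make.inc: compilers, GPU_TARGET, BLAS/LAPACK and CUDA paths
make -j 8                                         # builds lib/libmagma.a (and .so if FPIC is set)
make install prefix=/usr/local/magma
```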

#   -- MAGMA (version 2.0) --
#      Univ. of Tennessee, Knoxville
#      Univ. of California, Berkeley
#      Univ. of Colorado, Denver
#      @date

# GPU_TARGET contains one or more of Fermi, Kepler, Maxwell, Pascal, Volta
# to specify for which GPUs you want to compile MAGMA:
#     Fermi   - NVIDIA compute capability 2.x cards
#     Kepler  - NVIDIA compute capability 3.x cards
#     Maxwell - NVIDIA compute capability 5.x cards
#     Pascal  - NVIDIA compute capability 6.x cards
#     Volta   - NVIDIA compute capability 7.x cards
# The default is "Kepler Maxwell Pascal".
# Note that NVIDIA no longer supports 1.x cards, as of CUDA 6.5.
#GPU_TARGET ?= Kepler Maxwell Pascal

# --------------------
# programs

CC        = armclang++
CXX       = armclang++
NVCC      = nvcc
FORT      = armflang

ARCH      = ar
RANLIB    = ranlib

# --------------------
# flags

# Use -fPIC to make shared (.so) and static (.a) library;
# can be commented out if making only static library.
FPIC      = -fPIC

CFLAGS    = -O3 $(FPIC) -DNDEBUG -DADD_ -Wall -fopenmp
FFLAGS    = -O3 $(FPIC) -DNDEBUG -DADD_ -Wall -Wno-unused-dummy-argument
F90FLAGS  = -O3 $(FPIC) -DNDEBUG -DADD_ -Wall -Wno-unused-dummy-argument -x f95-cpp-input
NVCCFLAGS = -O3         -DNDEBUG -DADD_       -Xcompiler "$(FPIC)" -std=c++11
LDFLAGS   =     $(FPIC)                       -fopenmp

# C++11 (gcc >= 4.7) is not required, but has benefits like atomic operations
CXXFLAGS := $(CFLAGS) -std=c++11
CFLAGS   += -std=c99

# --------------------
# libraries
BLASmp = /sw/wombat/ARM_Compiler/19.3/armpl-19.3.0_ThunderX2CN99_RHEL-7_arm-hpc-compiler_19.3_aarch64-linux/lib/libarmpl_lp64_mp.a
LAPACKmp = /sw/wombat/ARM_Compiler/19.3/armpl-19.3.0_ThunderX2CN99_RHEL-7_arm-hpc-compiler_19.3_aarch64-linux/lib/libarmpl_lp64_mp.a

# Arm Performance Libraries (provide both BLAS and LAPACK)
LIB       = $(BLASmp) $(LAPACKmp)

LIB      += -lcublas -lcusparse -lcudart -lcudadevrt -lflang-omp -lomp

# --------------------
# directories

# define library directories preferably in your environment, or here.
#OPENBLASDIR ?= /usr/local/openblas
#-include make.check-openblas
#-include make.check-cuda

LIBDIR    = -L$(CUDADIR)/lib64

INC       = -I$(CUDADIR)/include
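For the AGX Xavier specifically, note that its integrated GPU is Volta-class (compute capability 7.2), and on a JetPack install CUDA typically lives under /usr/local/cuda. A sketch of the lines you would likely change for a native build on the board (paths and compiler choices are assumptions; verify them on your device):

```make
# Assumed settings for an AGX Xavier with JetPack; verify paths on your board.
GPU_TARGET = sm_72              # Xavier's Volta GPU, compute capability 7.2
CUDADIR   ?= /usr/local/cuda

# On Jetson the host compilers are the stock aarch64 GNU toolchain:
CC   = gcc
CXX  = g++
FORT = gfortran
```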

Note that you will need BLAS and LAPACK built for arm. Usually this will be OpenBLAS; you have
to obtain arm builds or compile it for arm yourself. In the example above, the vendor's Arm
Performance Libraries provided them.
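If you go the OpenBLAS route, it can be built natively on the arm board. A sketch (the TARGET value and install prefix are assumptions; pick the TARGET matching your core):

```shell
# Build OpenBLAS natively on an aarch64 board (sketch; adjust TARGET and paths).
git clone https://github.com/xianyi/OpenBLAS.git
cd OpenBLAS
make TARGET=ARMV8 USE_OPENMP=1 -j 8
make install PREFIX=/usr/local/openblas
# Then point LIB in make.inc at it, e.g.:
#   LIB = -L/usr/local/openblas/lib -lopenblas
```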

Hope this helps. Let us know how it goes and whether you run into problems (in which case we can
try to find a similar system and see exactly how to compile there).
