Arrayfire cross compilation

Damien LEFEVRE

Apr 13, 2018, 10:14:21 AM
to ArrayFire Users
Hi,

I've built an OpenEmbedded SDK image for the Tegra TX2 and I'm trying to build ArrayFire with its cross compiler on Linux.

Here are my steps:
> source /opt/sdk/environment-setup-aarch64-poky-linux
> cd arrayfire
> git checkout v3.5.1
> mkdir distribution
> mkdir build && cd build
> cmake -DCMAKE_TOOLCHAIN_FILE=/opt/sdk/sysroots/x86_64-pokysdk-linux/usr/share/cmake/OEToolchainConfig.cmake -DBUILD_CPU=OFF -DBUILD_CUDA=ON -DBUILD_OPENCL=OFF -DBUILD_EXAMPLES=OFF -DBUILD_TEST=OFF -DBUILD_GRAPHICS=OFF -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/home/user/library_builds/arrayfire/build/distribution ..
-- The C compiler identification is GNU 6.4.0
-- The CXX compiler identification is GNU 6.4.0
-- Check for working C compiler: /opt/sdk/sysroots/x86_64-pokysdk-linux/usr/bin/aarch64-poky-linux/aarch64-poky-linux-gcc
-- Check for working C compiler: /opt/sdk/sysroots/x86_64-pokysdk-linux/usr/bin/aarch64-poky-linux/aarch64-poky-linux-gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /opt/sdk/sysroots/x86_64-pokysdk-linux/usr/bin/aarch64-poky-linux/aarch64-poky-linux-g++
-- Check for working CXX compiler: /opt/sdk/sysroots/x86_64-pokysdk-linux/usr/bin/aarch64-poky-linux/aarch64-poky-linux-g++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found FREEIMAGE: /opt/sdk/sysroots/aarch64-poky-linux/usr/include  
CMake Error at /opt/sdk/sysroots/x86_64-pokysdk-linux/usr/share/cmake-3.7/Modules/FindPackageHandleStandardArgs.cmake:138 (message):
  Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR CUDA_NVCC_EXECUTABLE
  CUDA_INCLUDE_DIRS CUDA_CUDART_LIBRARY) (Required is at least version "7.0")
Call Stack (most recent call first):
  /opt/sdk/sysroots/x86_64-pokysdk-linux/usr/share/cmake-3.7/Modules/FindPackageHandleStandardArgs.cmake:378 (_FPHSA_FAILURE_MESSAGE)
  /opt/sdk/sysroots/x86_64-pokysdk-linux/usr/share/cmake-3.7/Modules/FindCUDA.cmake:1012 (find_package_handle_standard_args)
  src/backend/cuda/CMakeLists.txt:3 (FIND_PACKAGE)


-- Configuring incomplete, errors occurred!
See also "/home/user/library_builds/arrayfire/build/CMakeFiles/CMakeOutput.log".
See also "/home/user/library_builds/arrayfire/build/CMakeFiles/CMakeError.log".



CMake cannot find CUDA, although libfreeimage is found in the same root filesystem. The CUDA libraries are installed, as this search shows:

> /opt/sdk/sysroots $ sudo find -name "*cuda*"
./aarch64-poky-linux/usr/include/linux/cuda.h
./aarch64-poky-linux/usr/include/unicode/icudataver.h
./aarch64-poky-linux/usr/src/debug/icu/58.2-r0/icu/source/common/icudataver.c
./aarch64-poky-linux/usr/lib/libcuda.so
./aarch64-poky-linux/usr/lib/libcuda.so.1.1
./aarch64-poky-linux/usr/lib/libicudata.so
./aarch64-poky-linux/usr/lib/libicudata.so.58.2
./aarch64-poky-linux/usr/lib/pkgconfig/cudart-8.0.pc
./aarch64-poky-linux/usr/lib/libicudata.so.58
./aarch64-poky-linux/usr/lib/libcuda.so.1
./aarch64-poky-linux/usr/lib/.debug/libicudata.so.58.2
./aarch64-poky-linux/usr/local/cuda-8.0
./aarch64-poky-linux/usr/local/cuda-8.0/include/cuda_device_runtime_api.h
./aarch64-poky-linux/usr/local/cuda-8.0/include/cuda_runtime.h
./aarch64-poky-linux/usr/local/cuda-8.0/include/cuda_runtime_api.h
./aarch64-poky-linux/usr/local/cuda-8.0/extras/Debugger/include/cudadebugger.h
./aarch64-poky-linux/usr/local/cuda-8.0/extras/Debugger/include/cudacoredump.h
./aarch64-poky-linux/usr/local/cuda-8.0/extras/Debugger/include/cuda_stdint.h
./aarch64-poky-linux/usr/local/cuda-8.0/extras/Debugger/include/libcudacore.h
./aarch64-poky-linux/usr/local/cuda-8.0/extras/Debugger/lib64/libcudacore.a
./aarch64-poky-linux/usr/local/cuda-8.0/bin/cuda-gdb
./aarch64-poky-linux/usr/local/cuda-8.0/bin/cudafe++
./aarch64-poky-linux/usr/local/cuda-8.0/bin/cudafe
./aarch64-poky-linux/usr/local/cuda-8.0/bin/cuda-gdbserver
./aarch64-poky-linux/usr/local/cuda-8.0/bin/cuda-memcheck
./aarch64-poky-linux/usr/local/cuda-8.0/nvvm/libnvvm-samples/cuda-c-linking
./aarch64-poky-linux/usr/local/cuda-8.0/nvvm/libnvvm-samples/cuda-c-linking/cuda-c-linking.cpp
./aarch64-poky-linux/usr/local/cuda-8.0/lib/libcudart.so.8.0.84
./aarch64-poky-linux/usr/local/cuda-8.0/lib/libcudart.so.8.0
./aarch64-poky-linux/usr/local/cuda-8.0/lib/libcudart.so
./aarch64-poky-linux/usr/share/licenses/cuda-cublas
./aarch64-poky-linux/usr/share/licenses/cuda-cusparse
./aarch64-poky-linux/usr/share/licenses/cuda-command-line-tools
./aarch64-poky-linux/usr/share/licenses/cuda-cudart
./aarch64-poky-linux/usr/share/licenses/cuda-npp
./aarch64-poky-linux/usr/share/licenses/cuda-nvrtc
./aarch64-poky-linux/usr/share/licenses/cuda-curand
./aarch64-poky-linux/usr/share/licenses/cuda-core
./aarch64-poky-linux/usr/share/licenses/cuda-cufft
./aarch64-poky-linux/usr/share/licenses/cuda-cusolver

Whenever I set the missing paths mentioned in CMake's log, new errors appear, which makes me think I'm doing something wrong. Can anyone spot a missing configuration step?

Thanks
-Damien

Umar Arshad

Apr 13, 2018, 10:30:38 AM
to ArrayFire Users
Hey Damien,

This has been on my todo list for a while but I haven't gotten around to
setting up cross-compilation in CMake for ArrayFire. Normally you wouldn't
need to do anything but it doesn't always work out that way.

From the look of your directory structure, you are missing a few CUDA files.
Is that the entire list of CUDA files or just the first few? Maybe the toolkit
installation failed.

Umar

Damien LEFEVRE

Apr 16, 2018, 3:25:27 AM
to ArrayFire Users
Hey Umar,

Now that you mention it, I see that the static libraries ("*.a") are missing. I'll check that today.

Thanks
-Damien

Damien LEFEVRE

May 3, 2018, 8:09:39 AM
to ArrayFire Users
Hey Umar,

I've been digging a bit into this and here are my findings. 

For cross compilation to work, you need both the native SDK (x64) and the target SDK (arm64) installed side by side. The native SDK utilities are used to detect versions and the like, while the libraries from the target installation should be used for linking.
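
The side-by-side requirement can be checked before running CMake. The following is a minimal sketch, not part of ArrayFire or Yocto: `check_cross_cuda` is a hypothetical helper, and the files it probes (`bin/nvcc` in the native root, `include/cuda_runtime.h` and `lib*/libcudart.so*` in the target root) are assumptions based on the CUDA 8.0 layout listed earlier in this thread.

```shell
# Hypothetical helper: report whether a native (host) CUDA root and a target
# (aarch64) CUDA root contain the pieces each one is expected to provide.
# Arguments: $1 = native CUDA root, $2 = target CUDA root.
check_cross_cuda() {
    native_root=$1
    target_root=$2
    status=0
    # nvcc runs on the build machine, so it must come from the native root.
    if [ ! -x "$native_root/bin/nvcc" ]; then
        echo "missing: $native_root/bin/nvcc"
        status=1
    fi
    # Headers and the runtime library are used for the target binaries.
    if [ ! -f "$target_root/include/cuda_runtime.h" ]; then
        echo "missing: $target_root/include/cuda_runtime.h"
        status=1
    fi
    if ! ls "$target_root"/lib*/libcudart.so* >/dev/null 2>&1; then
        echo "missing: libcudart under $target_root/lib*"
        status=1
    fi
    return $status
}
```

It could be called with the two CUDA roots of the SDK, for example the `usr/local/cuda-8.0` directory under each sysroot. A non-zero return value means at least one expected file was not found.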

I'm currently building from a Yocto recipe that uses the meta-tegra layer (https://github.com/madisongh/meta-tegra), which does some of the heavy lifting. Here is my recipe, in case you are interested:

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
#####################################################################################
## Arrayfire library 
#####################################################################################

DESCRIPTION = "Arrayfire library"
SECTION = "libs"

SRCREV = "0a675e87f42455e65cef7ea09ccd3200a3dd1db7"

LIC_FILES_CHKSUM = "file://COPYRIGHT.md;beginline=23;endline=23;md5=36c365f981e36a20caa006f389db83d5"

DEPENDS = "freeimage"

LICENSE = "BSD"
EXTRA_OECMAKE += "-DBUILD_CPU=OFF -DBUILD_CUDA=ON -DBUILD_OPENCL=OFF -DBUILD_EXAMPLES=OFF -DBUILD_TEST=OFF -DBUILD_GRAPHICS=OFF -DCMAKE_BUILD_TYPE=Release"

S = "${WORKDIR}/git"

inherit cmake cuda 

EXTRA_OECMAKE_append_tegra186 = ' -DCUDA_ARCH_BIN="6.1" -DCUDA_ARCH_PTX=""'

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


When building, CMake fails with these traces:

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Log data follows:
| DEBUG: Executing shell function do_configure
| -- The C compiler identification is GNU 5.4.0
| -- The CXX compiler identification is GNU 5.4.0
| -- Check for working C compiler: /home/damien/stack/build-jetson-tx2/tmp/work/aarch64_tegra186-poky-linux/arrayfire/3.5.1-r0/recipe-sysroot-native/usr/bin/aarch64-poky-linux/aarch64-poky-linux-gcc
| -- Check for working C compiler: /home/damien/stack/build-jetson-tx2/tmp/work/aarch64_tegra186-poky-linux/arrayfire/3.5.1-r0/recipe-sysroot-native/usr/bin/aarch64-poky-linux/aarch64-poky-linux-gcc -- works
| -- Detecting C compiler ABI info
| -- Detecting C compiler ABI info - done
| -- Detecting C compile features
| -- Detecting C compile features - done
| -- Check for working CXX compiler: /home/damien/stack/build-jetson-tx2/tmp/work/aarch64_tegra186-poky-linux/arrayfire/3.5.1-r0/recipe-sysroot-native/usr/bin/aarch64-poky-linux/aarch64-poky-linux-g++
| -- Check for working CXX compiler: /home/damien/stack/build-jetson-tx2/tmp/work/aarch64_tegra186-poky-linux/arrayfire/3.5.1-r0/recipe-sysroot-native/usr/bin/aarch64-poky-linux/aarch64-poky-linux-g++ -- works
| -- Detecting CXX compiler ABI info
| -- Detecting CXX compiler ABI info - done
| -- Detecting CXX compile features
| -- Detecting CXX compile features - done
| -- Looking for pthread.h
| -- Looking for pthread.h - found
| -- Looking for pthread_create
| -- Looking for pthread_create - not found
| -- Looking for pthread_create in pthreads
| -- Looking for pthread_create in pthreads - not found
| -- Looking for pthread_create in pthread
| -- Looking for pthread_create in pthread - found
| -- Found Threads: TRUE
| -- Found FREEIMAGE: /home/damien/stack/build-jetson-tx2/tmp/work/aarch64_tegra186-poky-linux/arrayfire/3.5.1-r0/recipe-sysroot/usr/include
| -- Found CUDA: /home/damien/stack/build-jetson-tx2/tmp/work/aarch64_tegra186-poky-linux/arrayfire/3.5.1-r0/recipe-sysroot-native/usr/local/cuda-8.0 (found suitable version "8.0", minimum required is "7.0")
| -- Found NVRTC: /home/damien/stack/build-jetson-tx2/tmp/work/aarch64_tegra186-poky-linux/arrayfire/3.5.1-r0/recipe-sysroot/usr/local/cuda-8.0/include
| -- /home/damien/stack/build-jetson-tx2/tmp/work/aarch64_tegra186-poky-linux/arrayfire/3.5.1-r0/git/CMakeModules/cuda_compute_capability.cpp
| CMake Error: TRY_RUN() invoked in cross-compiling mode, please set the following cache variables appropriately:
|    RUN_RESULT_VAR (advanced)
|    RUN_RESULT_VAR__TRYRUN_OUTPUT (advanced)
| For details see /home/damien/stack/build-jetson-tx2/tmp/work/aarch64_tegra186-poky-linux/arrayfire/3.5.1-r0/build/TryRunResults.cmake
| -- CUDA Compute Detection Output:
| -- CUDA Compute Detection Return: PLEASE_FILL_OUT-FAILED_TO_RUN
| -- CUDA Compute Detection Failed
| -- You can use -DCOMPUTES_DETECTED_LIST="AB;XY" (semicolon separated list of CUDA Compute versions to enable the specified computes
| -- Individual compute versions flags are also available under CMake Advance options
| -- No computes detected. Fall back to 30, 50, 60
| -- Number of Computes Detected = 3
| -- Setting Compute 30 to ON
| -- Setting Compute 50 to ON
| -- Setting Compute 60 to ON
| -- Try OpenMP C flag = [-fopenmp]
| -- Performing Test OpenMP_FLAG_DETECTED
| -- Performing Test OpenMP_FLAG_DETECTED - Success
| -- Try OpenMP CXX flag = [-fopenmp]
| -- Performing Test OpenMP_FLAG_DETECTED
| -- Performing Test OpenMP_FLAG_DETECTED - Success
| -- Found OpenMP: -fopenmp
| CMake Error at src/backend/cuda/CMakeLists.txt:316 (MESSAGE):
|   CMake CUDA Variable CUDA_CUDA_LIBRARY Not found.
| CUDA Driver Library (libcuda.so/libcuda.dylib/cuda.lib) cannot be found.
| CMake Error at src/backend/cuda/CMakeLists.txt:337 (MESSAGE):
|   Ending CMake configuration because of missing CUDA_CUDA_LIBRARY
| -- Configuring incomplete, errors occurred!
| See also "/home/damien/stack/build-jetson-tx2/tmp/work/aarch64_tegra186-poky-linux/arrayfire/3.5.1-r0/build/CMakeFiles/CMakeOutput.log".
| See also "/home/damien/stack/build-jetson-tx2/tmp/work/aarch64_tegra186-poky-linux/arrayfire/3.5.1-r0/build/CMakeFiles/CMakeError.log".
| WARNING: exit code 1 from a shell command.
| ERROR: Function failed: do_configure (log file is located at /home/damien/stack/build-jetson-tx2/tmp/work/aarch64_tegra186-poky-linux/arrayfire/3.5.1-r0/temp/log.do_configure.613)

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

I inspected the CMakeLists.txt file and it seems it tries to find the libraries in CUDA_TOOLKIT_ROOT_DIR. Now CUDA_TOOLKIT_ROOT_DIR is the native directory and does not have the lib folder. Instead, cmake should be looking at the libraries in CUDA_TOOLKIT_TARGET_DIR which is the variable defined for the target binaries.
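
To see which of the two directories actually holds the driver library, the search can be reproduced outside CMake. This is a minimal sketch under stated assumptions: `find_cuda_driver_lib` is a hypothetical helper, and the sub-directories it tries (`lib`, `lib64`, `lib/stubs`, `lib64/stubs`) are guesses at where `libcuda.so` may live, not the list CMake itself uses.

```shell
# Hypothetical helper: print the first libcuda.so found under the given
# directories, trying them in the order given on the command line.
find_cuda_driver_lib() {
    for root in "$@"; do
        for sub in lib lib64 lib/stubs lib64/stubs; do
            candidate="$root/$sub/libcuda.so"
            if [ -f "$candidate" ]; then
                echo "$candidate"
                return 0
            fi
        done
    done
    # Nothing found in any of the directories.
    return 1
}
```

Passing the target directory first and the native directory second prefers the target copy, which is the order a cross build would want. The printed path could then be handed to CMake as `-DCUDA_CUDA_LIBRARY=...`, the variable named in the error above.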

I'm not good at CMake. I cannot figure out where CUDA_CUDA_LIBRARY is defined or where CMakeLists.txt tries to detect the libraries. If someone can point me to it, I might be able to fix the issue.

Thanks,
-Damien

Umar Arshad

May 3, 2018, 9:36:30 AM
to ArrayFire Users
Hey Damien,

I was actually able to cross-compile to target the TX2 a couple of days ago. I have made a commit which should slightly improve the current experience but there is still much work to be done. I have committed my changes to ee21c79. Here is what I did.

1. Download and install the latest Jetpack

2. Create a build_native folder which targets the host architecture (i.e. not the Tegra)

mkdir build_native
cd build_native
cmake ..


3. Make the bin2cpp target. This binary is executed during the build process to generate some internal headers

make bin2cpp


4. Create another folder, build_cross, which will hold the cross-compiled build

cd ..
mkdir build_cross
cd build_cross


5. Run the following CMake command

cmake \
-DCMAKE_TOOLCHAIN_FILE=../CMakeModules/TegraCrossToolchain.cmake \
-DAF_BUILD_CPU=OFF    \
-DAF_BUILD_OPENCL=OFF  \
-DCUDA_<cuda_lib>_LIBRARIES=/usr/local/cuda/targets/aarch64-linux/lib/stubs/lib<cuda_lib>.so \
-DNATIVE_BIN_DIR=../build_native \
-DCUDA_architecture_build_targets=6.2 \
-DCMAKE_EXE_LINKER_FLAGS=-Wl,--unresolved-symbols=ignore-in-shared-libs \
..


where <cuda_lib> is one of [cublas, cufft, cusolver, cusparse]. You will need an entry for each of those libraries because CMake cannot find those files automatically. Note these are in the lib/stubs folder. These files are only used to make the linker happy. The actual libraries are located on the Jetson.

6. Make

make -j8


This *should* work. I did this on a fresh install of Ubuntu 16.04 and JetPack 3.2. Most of these instructions are based on my memory so beware of bugs.
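
The per-library flags from step 5 can be generated with a small loop. This is a minimal sketch: `stub_flags` is a hypothetical helper, and the variable names it prints are the ones used in the follow-up command later in this thread, so whether they match a given CMake version's FindCUDA module is an assumption.

```shell
# Hypothetical helper: print one -D flag per CUDA library, each pointing at
# the stub copy of that library inside the directory given as $1.
stub_flags() {
    stubs_dir=$1
    for lib in cublas cufft cusolver cusparse; do
        # Variable names as spelled in the follow-up command in this thread.
        case $lib in
            cublas)   var=CUDA_CUBLAS_LIBRARIES ;;
            cufft)    var=CUDA_CUFFT_LIBRARIES ;;
            cusolver) var=CUDA_cusolver_LIBRARY ;;
            cusparse) var=CUDA_cusparse_LIBRARY ;;
        esac
        printf '%s\n' "-D${var}=${stubs_dir}/lib${lib}.so"
    done
}
```

For example, `cmake $(stub_flags /usr/local/cuda/targets/aarch64-linux/lib/stubs) ... ..` would expand to the four flags, provided the stubs path contains no spaces.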

The main takeaways are:

1. Create a native build directory so we can build the bin2cpp executable.
2. In a separate directory, specify the Tegra toolchain file, which sets the compiler (you might have to update this file for your distro).
3. Set the -DNATIVE_BIN_DIR variable to point to the native build directory from step 1.
4. Build with the -Wl,--unresolved-symbols=ignore-in-shared-libs linker flag so the tests are built correctly.

Let me know if you run into any issues. I would like to streamline this experience so any feedback is appreciated.

Umar

Damien LEFEVRE

May 21, 2018, 7:50:09 AM
to ArrayFire Users
Hey Umar,

Regarding CMake not finding the CUDA libraries and having to add the paths manually: I think it could be made easier. The issue is that FindCUDA.cmake does not look into the stubs folder.

OpenCV for example has fixed FindCUDA and ships it. See https://github.com/opencv/opencv/blob/master/cmake/FindCUDA.cmake. Maybe you could do something similar and we would need to make a copy of the target rootfs on the host machine.

I'm trying to build the new 3.6 release with CUDA 8 and l4t-r28.1.

Which cross compiler do you use? I am testing with Linaro 5.5-2017.10. The CMAKE_C_COMPILER and CMAKE_CXX_COMPILER names set in TegraCrossToolchain.cmake do not match my setup, so I had to change them to point to the full Linaro path.

In your build system, do you make a copy of the target rootfs on the host machine, and do you give its path to CMake?

The CMake configure step fails first. Here is the output:

~/tmp/arrayfire/build_cross $ cmake \
> -DCMAKE_TOOLCHAIN_FILE=../CMakeModules/TegraCrossToolchain.cmake \
> -DAF_BUILD_CPU=OFF    \
> -DAF_BUILD_OPENCL=OFF  \
> -DAF_BUILD_EXAMPLES=OFF \
> -DAF_BUILD_TEST=OFF \
> -DAF_WITH_GRAPHICS=OFF \
> -DCMAKE_BUILD_TYPE=Release \
> -DCUDA_CUBLAS_LIBRARIES=/usr/local/cuda/targets/aarch64-linux/lib/stubs/libcublas.so \
> -DCUDA_CUFFT_LIBRARIES=/usr/local/cuda/targets/aarch64-linux/lib/stubs/libcufft.so \
> -DCUDA_cusolver_LIBRARY=/usr/local/cuda/targets/aarch64-linux/lib/stubs/libcusolver.so \
> -DCUDA_cusparse_LIBRARY=/usr/local/cuda/targets/aarch64-linux/lib/stubs/libcusparse.so \
> -DCUDA_rt_LIBRARY=/usr/lib/aarch64-linux-gnu/librt.so \
> -DNATIVE_BIN_DIR=../build_native \
> -DCUDA_architecture_build_target=6.2 \
> -DCMAKE_EXE_LINKER_FLAGS=-Wl,--unresolved-symbols=ignore-in-shared-libs \
> ..
-- The C compiler identification is GNU 5.5.0
-- The CXX compiler identification is GNU 5.5.0
-- Check for working C compiler: /opt/linaro/gcc-linaro-5.5.0-2017.10-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc
-- Check for working C compiler: /opt/linaro/gcc-linaro-5.5.0-2017.10-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /opt/linaro/gcc-linaro-5.5.0-2017.10-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-g++
-- Check for working CXX compiler: /opt/linaro/gcc-linaro-5.5.0-2017.10-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-g++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for CL_VERSION_2_0
-- Looking for CL_VERSION_2_0 - not found
-- Looking for CL_VERSION_1_2
-- Looking for CL_VERSION_1_2 - not found
-- Looking for CL_VERSION_1_1
-- Looking for CL_VERSION_1_1 - not found
-- Looking for CL_VERSION_1_0
-- Looking for CL_VERSION_1_0 - not found
-- Could NOT find OpenCL (missing:  OpenCL_LIBRARY OpenCL_INCLUDE_DIR) 
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found CUDA: /usr/local/cuda (found suitable version "8.0", minimum required is "7.0") 
-- Could NOT find OpenCL (missing:  OpenCL_LIBRARY OpenCL_INCLUDE_DIR) (Required is at least version "1.2")
-- Could NOT find OpenGL (missing:  OPENGL_gl_LIBRARY) 
-- Found OpenMP_C: -fopenmp (found version "4.0") 
-- Found OpenMP_CXX: -fopenmp (found version "4.0") 
-- Could NOT find FreeImage (missing:  FreeImage_INCLUDE_DIR FreeImage_LINK_LIBRARY) 
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.1") 
-- Checking for module 'fftw3'
--   No package 'fftw3' found
-- Could NOT find FFTW (missing:  FFTW_INCLUDE_DIR FFTW_LIBRARY FFTWF_LIBRARY) 
-- Checking for module 'cblas'
--   No package 'cblas' found
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of void*
-- Check size of void* - done
-- Checking for [Accelerate]
-- Checking for [vecLib]
-- Checking for [cblas - atlas]
-- Checking for [openblas]
-- Checking for [cblas]
-- Checking for [blas]
-- CBLAS library not found.
-- Could NOT find LAPACK (missing:  LAPACK_INCLUDE_DIR LAPACK_LIBRARIES) 
-- Could NOT find Doxygen (missing:  DOXYGEN_EXECUTABLE) 
-- Check size of int
-- Check size of int - done
-- Could NOT find MKL (missing:  MKL_INCLUDE_DIR MKL_Core_LINK_LIBRARY) 
-- Found FORGE: /usr/local/include  
-- Could NOT find Boost
-- BOOST_COMPUTE: /home/damien/tmp/arrayfire/build_cross/boost_compute-prefix/src/boost_compute
-- Automatic GPU detection failed. Building for common architectures.
-- CUDA Architectures: 3.0;3.5;5.0;5.2;6.0;6.1;6.1+PTX
-- CUDA driver library missing. Looking for libcuda stub.
-- CUDA driver stub FOUND: /usr/local/cuda/lib64/stubs/libcuda.so
-- Found PythonInterp: /usr/bin/python (found version "2.7.12") 
-- Check if compiler accepts -pthread
CMake Error: TRY_RUN() invoked in cross-compiling mode, please set the following cache variables appropriately:
   THREADS_PTHREAD_ARG (advanced)
For details see /home/damien/tmp/arrayfire/build_cross/TryRunResults.cmake
-- Check if compiler accepts -pthread - no
-- Configuring incomplete, errors occurred!
See also "/home/damien/tmp/arrayfire/build_cross/CMakeFiles/CMakeOutput.log".
See also "/home/damien/tmp/arrayfire/build_cross/CMakeFiles/CMakeError.log".

I attached the log files.

Then I tried forcing THREADS_PTHREAD_ARG to 0, as suggested in /home/damien/tmp/arrayfire/build_cross/TryRunResults.cmake:
cmake \
-DCMAKE_TOOLCHAIN_FILE=../CMakeModules/TegraCrossToolchain.cmake \
-DAF_BUILD_CPU=OFF    \
-DAF_BUILD_OPENCL=OFF  \
-DAF_BUILD_EXAMPLES=OFF \
-DAF_BUILD_TEST=OFF \
-DAF_WITH_GRAPHICS=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DCUDA_CUBLAS_LIBRARIES=/usr/local/cuda/targets/aarch64-linux/lib/stubs/libcublas.so \
-DCUDA_CUFFT_LIBRARIES=/usr/local/cuda/targets/aarch64-linux/lib/stubs/libcufft.so \
-DCUDA_cusolver_LIBRARY=/usr/local/cuda/targets/aarch64-linux/lib/stubs/libcusolver.so \
-DCUDA_cusparse_LIBRARY=/usr/local/cuda/targets/aarch64-linux/lib/stubs/libcusparse.so \
-DCUDA_rt_LIBRARY=/usr/lib/aarch64-linux-gnu/librt.so \
-DNATIVE_BIN_DIR=../build_native \
-DCUDA_architecture_build_target=6.2 \
-DCMAKE_EXE_LINKER_FLAGS=-Wl,--unresolved-symbols=ignore-in-shared-libs \
-DTHREADS_PTHREAD_ARG=0 \
..

The build goes on for a while but fails at 76%.

[ 75%] Building CXX object src/backend/cuda/CMakeFiles/afcuda.dir/__/__/api/cpp/ycbcr_rgb.cpp.o
[ 75%] Building CXX object src/backend/cuda/CMakeFiles/afcuda.dir/__/common/ArrayInfo.cpp.o
[ 75%] Building CXX object src/backend/cuda/CMakeFiles/afcuda.dir/__/common/SparseArray.cpp.o
[ 75%] Building CXX object src/backend/cuda/CMakeFiles/afcuda.dir/__/common/cblas.cpp.o
[ 75%] Building CXX object src/backend/cuda/CMakeFiles/afcuda.dir/__/common/constants.cpp.o
[ 75%] Building CXX object src/backend/cuda/CMakeFiles/afcuda.dir/__/common/dim4.cpp.o
[ 75%] Building CXX object src/backend/cuda/CMakeFiles/afcuda.dir/__/common/dispatch.cpp.o
[ 76%] Building CXX object src/backend/cuda/CMakeFiles/afcuda.dir/__/common/err_common.cpp.o
[ 76%] Building CXX object src/backend/cuda/CMakeFiles/afcuda.dir/__/common/host_memory.cpp.o
make[2]: *** No rule to make target '/usr/lib/aarch64-linux-gnu/librt.so', needed by 'src/backend/cuda/libafcuda.so.3.6.0'.  Stop.
make[2]: *** Waiting for unfinished jobs....
[ 76%] Building CXX object src/backend/cuda/CMakeFiles/afcuda.dir/__/common/util.cpp.o
CMakeFiles/Makefile2:1116: recipe for target 'src/backend/cuda/CMakeFiles/afcuda.dir/all' failed
make[1]: *** [src/backend/cuda/CMakeFiles/afcuda.dir/all] Error 2
Makefile:160: recipe for target 'all' failed
make: *** [all] Error 2

Have you seen these issues on your machine?

There seems to be some setup issue on my side, especially if it builds on your computer. For example, CMake picks up the lib64 CUDA library.
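
The "No rule to make target '/usr/lib/aarch64-linux-gnu/librt.so'" error means make was handed a library path that does not exist on the build machine. A pre-flight check over the `-D` arguments can catch that before configuring. This is a minimal sketch; `missing_libs` is a hypothetical helper and only looks at arguments whose value is an absolute path ending in `.so` or `.a`.

```shell
# Hypothetical pre-flight check: for each argument of the form
# -DNAME=/abs/path.so (or .a), report the paths that do not exist here.
# Returns 1 if any path is missing, 0 otherwise.
missing_libs() {
    rc=0
    for arg in "$@"; do
        case $arg in
            -D*=/*.so|-D*=/*.a)
                path=${arg#*=}      # text after the first '='
                if [ ! -e "$path" ]; then
                    echo "missing: $path (from ${arg%%=*})"
                    rc=1
                fi
                ;;
        esac
    done
    return $rc
}
```

Running it with the same arguments as the cmake command above would flag `-DCUDA_rt_LIBRARY` whenever that path is absent on the host.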

-Damien
CMakeOutput.log
CMakeError.log

Damien LEFEVRE

May 21, 2018, 10:51:11 AM
to ArrayFire Users
Some more digging today.

I've tried a different approach because yours didn't work for me:

1. Installed latest linaro:
sudo mkdir -p /opt/linaro
sudo chmod -R 775 /opt/linaro
sudo chown -R $USER /opt/linaro
cd /opt/linaro
tar -xf gcc-linaro-7.2.1-2017.11-x86_64_aarch64-linux-gnu.tar.xz

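2. Made a copy of the target rootfs on the host. I use the following script, which I create in the JetPack installation directory:
cd ~/jetpack-3.1/
# The quoted 'EOL' keeps ${1:-...} and $ip literal, so they are expanded when the script runs, not when it is written.
cat > updatesysroot.sh <<'EOL'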
#!/bin/bash
ip=${1:-192.168.1.100}
rsync -aAXv --delete \
--exclude={"/bin/*",\
"/boot/*",\
"/dev/*",\
"/etc/*",\
"/home/*",\
"/lost+found",\
"/media/*",\
"/mnt/*",\
"/opt/*",\
"/proc/*",\
"/root/*",\
"/run/*",\
"/sbin/*",\
"/snap/*",\
"/srv/*",\
"/sys/*",\
"/tmp/*",\
"/var/*",\
"/usr/3rdparty/*",\
"/usr/games/*",\
"/usr/local/bin/*",\
"/usr/local/etc/*",\
"/usr/local/games/*",\
"/usr/local/man/*",\
"/usr/local/sbin/*",\
"/usr/local/share/*",\
"/usr/local/src/*",\
"/usr/sbin/*",\
"/usr/src/*",\
"/usr/bin/*",\
"/usr/libaarch64-linux-gnu/*",\
"/usr/locale/*",\
"/usr/share/*"} \
-e ssh ubuntu@$ip:/ target_rootfs
EOL
chmod +x updatesysroot.sh
./updatesysroot.sh

3. Patched /usr/share/cmake-3.5/Modules/FindCUDA.cmake
macro(cuda_find_library_local_first_with_path_ext _var _names _doc _path_ext )
  if(CMAKE_SIZEOF_VOID_P EQUAL 8)
    # CUDA 3.2+ on Windows moved the library directories, so we need the new
    # and old paths.
    set(_cuda_64bit_lib_dir "${_path_ext}lib/x64" "${_path_ext}lib64" "${_path_ext}libx64" )
  endif()
  if(CMAKE_CROSSCOMPILING AND (ARM OR AARCH64))
    set(_cuda_cross_arm_lib_dir "${_path_ext}lib/stubs")
  endif()
  if(CUDA_VERSION VERSION_GREATER "6.0")
    set(_cuda_static_lib_names "")
    foreach(name ${_names})
      list(APPEND _cuda_static_lib_names "${name}_static")
    endforeach()
  endif()
  # CUDA 3.2+ on Windows moved the library directories, so we need to new
  # (lib/Win32) and the old path (lib).
  find_library(${_var}
    NAMES ${_names} ${_cuda_static_lib_names}
    PATHS "${CUDA_TOOLKIT_TARGET_DIR}" "${CUDA_TOOLKIT_ROOT_DIR}"
    ENV CUDA_PATH
    ENV CUDA_LIB_PATH
    PATH_SUFFIXES ${_cuda_64bit_lib_dir} ${_cuda_cross_arm_lib_dir} "${_path_ext}lib/Win32" "${_path_ext}lib" "${_path_ext}libWin32"
    DOC ${_doc}
    NO_DEFAULT_PATH
    )
  # Search default search paths, after we search our own set of paths.
  find_library(${_var}
    NAMES ${_names}
    PATHS "/usr/lib/nvidia-current"
    DOC ${_doc}
    )
endmacro()

4. I modified TegraCrossToolchain.cmake as:
set(COMPILER_PATH  /opt/linaro/gcc-linaro-7.2.1-2017.11-x86_64_aarch64-linux-gnu)

set(CMAKE_SYSROOT /home/damien/jetpack-3.1/target_rootfs)

set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR aarch64)

set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE ONLY)

set(CMAKE_C_COMPILER ${COMPILER_PATH}/bin/aarch64-linux-gnu-gcc)
set(CMAKE_CXX_COMPILER ${COMPILER_PATH}/bin/aarch64-linux-gnu-g++)

set(CUDA_VERSION 8.0)

set(CUDA_TOOLKIT_TARGET_DIR ${CMAKE_SYSROOT}/usr/local/cuda-${CUDA_VERSION})
set(CUDA_TOOLKIT_TARGET_DIR_INTERNAL ${CMAKE_SYSROOT}/usr/local/cuda-${CUDA_VERSION})
set(CUDA_TOOLKIT_ROOT_DIR /usr/local/cuda-${CUDA_VERSION})
set(CUDA_TOOLKIT_ROOT_DIR_INTERNAL /usr/local/cuda-${CUDA_VERSION})

5. My CMake command looks like this:
cd build_cross
cmake \
-DCMAKE_TOOLCHAIN_FILE=../CMakeModules/TegraCrossToolchain.cmake \
-DAF_BUILD_CPU=OFF    \
-DAF_BUILD_OPENCL=OFF  \
-DAF_BUILD_EXAMPLES=OFF \
-DAF_BUILD_TEST=OFF \
-DAF_WITH_GRAPHICS=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DNATIVE_BIN_DIR=../build_native \
-DCUDA_architecture_build_targets=6.2 \
..

And my CMake output looks like this:
-- The C compiler identification is GNU 7.2.1
-- The CXX compiler identification is GNU 7.2.1
-- Check for working C compiler: /opt/linaro/gcc-linaro-7.2.1-2017.11-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc
-- Check for working C compiler: /opt/linaro/gcc-linaro-7.2.1-2017.11-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /opt/linaro/gcc-linaro-7.2.1-2017.11-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-g++
-- Check for working CXX compiler: /opt/linaro/gcc-linaro-7.2.1-2017.11-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-g++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for CL_VERSION_2_0
-- Looking for CL_VERSION_2_0 - not found
-- Looking for CL_VERSION_1_2
-- Looking for CL_VERSION_1_2 - not found
-- Looking for CL_VERSION_1_1
-- Looking for CL_VERSION_1_1 - not found
-- Looking for CL_VERSION_1_0
-- Looking for CL_VERSION_1_0 - not found
-- Could NOT find OpenCL (missing:  OpenCL_LIBRARY OpenCL_INCLUDE_DIR) 
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found CUDA: /home/damien/jetpack-3.1/target_rootfs/usr/local/cuda (found suitable version "8.0", minimum required is "7.0") 
-- Could NOT find OpenCL (missing:  OpenCL_LIBRARY OpenCL_INCLUDE_DIR) (Required is at least version "1.2")
-- Found OpenGL: /home/damien/jetpack-3.1/target_rootfs/usr/lib/aarch64-linux-gnu/libGL.so  
-- Could NOT find OpenMP_C (missing:  OpenMP_gomp_LIBRARY) (found version "4.5")
-- Could NOT find OpenMP_CXX (missing:  OpenMP_gomp_LIBRARY) (found version "4.5")
-- Found FreeImage: /home/damien/jetpack-3.1/target_rootfs/usr/include  
-- Could NOT find Boost
-- BOOST_COMPUTE: /home/damien/tmp/arrayfire2/build_cross/boost_compute-prefix/src/boost_compute
-- CUDA Architectures: 6.2
CMake Error at src/backend/cuda/CMakeLists.txt:461 (message):
  OpenMP is required to compile CUDA Backend

So I might be one CMake error away =)

The Jetson board compiler version is GNU 5.4. So I also tried with Linaro 5.5 from here:

The error becomes 
-- Could NOT find OpenMP_C (missing:  OpenMP_C_FLAGS OpenMP_C_LIB_NAMES) (found version "1.0")
-- Could NOT find OpenMP_CXX (missing:  OpenMP_CXX_FLAGS OpenMP_CXX_LIB_NAMES) (found version "1.0")

But the Jetson board output is 
-- Found OpenMP_C: -fopenmp (found version "4.0") 
-- Found OpenMP_CXX: -fopenmp (found version "4.0") 

To me this looks like the detected OpenMP version is too low.
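
One possible reading of `missing: OpenMP_gomp_LIBRARY` is that CMake could not locate libgomp in the places it is allowed to search, which with CMAKE_FIND_ROOT_PATH_MODE_LIBRARY set to ONLY is the sysroot. That is an assumption, not something confirmed in this thread. A minimal sketch to test it follows; `has_gomp` is a hypothetical helper.

```shell
# Hypothetical check: succeed if any libgomp.so* file exists under the
# directory given as $1 (for example the copied target rootfs).
has_gomp() {
    # find prints one line per match; grep -q . turns "any output" into
    # a zero exit status and "no output" into a non-zero one.
    find "$1" -name 'libgomp.so*' 2>/dev/null | grep -q .
}
```

Something like `has_gomp /home/damien/jetpack-3.1/target_rootfs || echo "libgomp not in sysroot"` would show whether the rsync copy contains the library at all.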

I did manage to build ArrayFire directly on the Jetson board, which seemed like a good way forward. I then added the prebuilt libraries to the Yocto build system and flashed an image to my development board. When I use ArrayFire from arrayfire-python, I get a segmentation fault when calling randu.

Running gdb from the command line, I can see that the fault is related to glibc. The Poky image ships glibc 2.25 and the JetPack Ubuntu ships 2.23. My best guess is that the two libc versions are not compatible, which means I need to get cross compilation working to ensure that all dependencies and libraries are aligned.
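
One way to examine the glibc guess is to look at which versioned glibc symbols a prebuilt library requires. This is a minimal sketch: `max_glibc_version` is a hypothetical helper that reads `objdump -T`-style text on stdin. It only covers symbol versions, so it would not reveal other kinds of ABI difference.

```shell
# Hypothetical helper: read text on stdin and print the highest
# GLIBC_x.y[.z] symbol version mentioned in it.
max_glibc_version() {
    grep -o 'GLIBC_[0-9][0-9.]*' |
        sed 's/^GLIBC_//' |
        # Sort numerically on each dot-separated field so 2.4 < 2.17.
        sort -u -t. -k1,1n -k2,2n -k3,3n |
        tail -n 1
}
```

A call such as `objdump -T libafcuda.so | max_glibc_version` prints the newest glibc symbol version the library asks for, which can be compared with the glibc version on the image.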

I'm running out of ideas. Any help is greatly appreciated at this stage.

-Damien