installation from source, cuda backend

gopalakr

Nov 13, 2014, 7:05:06 PM
to arrayfi...@googlegroups.com
Installation seemed normal, but the example fails at runtime. What am I missing?

ArrayFire v3.0.200 (CUDA, 64-bit Linux, build ca95e1d)
Platform: CUDA Toolkit 6, Driver: 331.62
[0] Tesla K40c, 11520 MB, CUDA Compute 3.5
Create a 5-by-3 matrix of random floats on the GPU
A Internal error in XXXX/arrayfire/src/backend/cuda/kernel/transpose.hpp(99):
CUDA Error (9): invalid configuration argument

Segmentation fault (core dumped)
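
For readers unfamiliar with this message: CUDA error 9 (cudaErrorInvalidConfiguration) is what the runtime reports when a kernel launch asks for grid or block dimensions the device cannot provide. The thread does not establish which dimension goes wrong in transpose.hpp, so the Python sketch below only illustrates the kind of check involved. `Limits`, `launch_problems` and the two presets are hypothetical names; the numbers are the documented launch limits for compute capability 2.x and 3.x devices.

```python
from collections import namedtuple

# Hypothetical container for the launch limits of one device generation.
Limits = namedtuple("Limits", "max_threads_per_block max_block_dim max_grid_dim")

# Documented limits for compute capability 2.x ...
CC_2X = Limits(1024, (1024, 1024, 64), (65535, 65535, 65535))
# ... and 3.x (the Tesla K40c above is 3.5), which allows a larger grid in x.
CC_3X = Limits(1024, (1024, 1024, 64), (2**31 - 1, 65535, 65535))


def launch_problems(grid, block, limits):
    """Return a list of reasons why a (grid, block) launch would be rejected."""
    problems = []
    checks = (("grid", grid, limits.max_grid_dim),
              ("block", block, limits.max_block_dim))
    for kind, dims, maxima in checks:
        for axis, value, maximum in zip("xyz", dims, maxima):
            if value < 1:
                problems.append("%s.%s is %d, must be at least 1" % (kind, axis, value))
            elif value > maximum:
                problems.append("%s.%s is %d, limit is %d" % (kind, axis, value, maximum))
    # The per-axis limits are not enough: the product is capped as well.
    threads = block[0] * block[1] * block[2]
    if threads > limits.max_threads_per_block:
        problems.append("block has %d threads, limit is %d"
                        % (threads, limits.max_threads_per_block))
    return problems
```

For example, `launch_problems((0, 1, 1), (32, 8, 1), CC_2X)` reports a zero-sized grid, which is one way a launch can end up invalid when a dimension is computed from the array size.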

George Trigeorgis

Nov 13, 2014, 7:12:20 PM
to arrayfi...@googlegroups.com
+1

Shehzan Mohammed

Nov 13, 2014, 7:21:18 PM
to arrayfi...@googlegroups.com
Hi

We apologize for the confusion.
The installers available on this website are for older commercial versions of ArrayFire (v2.1). ***THESE ARE NOT OPEN SOURCE*** and require licensing.
We are actively working on releasing installers for the open source versions. Until then, please try building from source by following the instructions on our GitHub wiki.
We are here to help you solve any build issues that you may run into.

Shehzan Mohammed
ArrayFire

gopalakr

Nov 13, 2014, 7:50:05 PM
to arrayfi...@googlegroups.com
I was in fact trying to build from source.

Building the CPU backend fails with this:

cc1plus: error: unrecognized command line option ‘-std=c++11’
make[2]: *** [CMakeFiles/bin2cpp.dir/common/bin2cpp.cpp.o] Error 1

After changing -std=c++11 to -std=c++0x, the build went further but broke again:


XXXX/arrayfire/src/backend/cpu/blas.cpp:47:1: error: expected unqualified-id before ‘using’
XXXX/arrayfire/src/backend/cpu/blas.cpp:54:1: error: expected unqualified-id before ‘using’
XXXX/arrayfire/src/backend/cpu/blas.cpp:68:1: error: ‘gemm_func_def’ does not name a type
:

So I set BUILD_CPU and BUILD_OPENCL to OFF. The CUDA backend then compiles fine but breaks at runtime.

Let me know if I can provide more info.

Shehzan Mohammed

Nov 13, 2014, 7:58:23 PM
to arrayfi...@googlegroups.com
Thanks for the info.

Your CPU compilation is failing because you do not have a C++11-capable version of gcc (4.7 or higher).
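
The version requirement above can be turned into a quick check. This is a minimal Python sketch, assuming the compiler version is available as the string printed by `gcc -dumpversion`; `parse_gcc_version` and `accepts_std_cxx11` are hypothetical helper names. It encodes only the rule stated in this thread (gcc 4.7 or higher): GCC accepts the -std=c++11 spelling from 4.7 on, while older releases only recognise -std=c++0x.

```python
import re


def parse_gcc_version(dumpversion):
    """Return (major, minor) from `gcc -dumpversion` output such as '4.6.3' or '9'."""
    match = re.match(r"\s*(\d+)(?:\.(\d+))?", dumpversion)
    if match is None:
        raise ValueError("no version number found in %r" % dumpversion)
    # Newer compilers may print only the major version; treat the minor as 0.
    return int(match.group(1)), int(match.group(2) or 0)


def accepts_std_cxx11(dumpversion):
    """True if this gcc release is 4.7 or newer, the rule quoted in the thread."""
    return parse_gcc_version(dumpversion) >= (4, 7)
```

Passing -std=c++0x to an older gcc gets past the flag error but, as the blas.cpp errors in this thread show, does not make the missing language features available.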

As for CUDA, if you are running the helloworld example as is, it should not be calling the transpose kernel at all.
Can you post a code snippet if you have modified it?

Can you run the tests (using make test) and let me know which tests fail?

We test internally on CUDA 6.5 so this will help us clear the bugs ASAP.

Thanks
Shehzan Mohammed
ArrayFire

gopalakr

Nov 14, 2014, 1:17:44 AM
to arrayfi...@googlegroups.com
I can only run the CUDA option, so compiling works only with the following settings in CMakeLists.txt:

OPTION(BUILD_TEST "Build Tests" OFF)
OPTION(BUILD_EXAMPLES "Build Examples" ON)
OPTION(BUILD_GTEST "Download gtest and check for updates. Necessary if you change compilers" OFF)
OPTION(BUILD_CPU "Build ArrayFire with a CPU backend" OFF)
OPTION(BUILD_CUDA "Build ArrayFire with a CUDA backend" ON)
OPTION(BUILD_OPENCL "Build ArrayFire with a OpenCL backend" OFF)


Performing update step (SVN update) for 'googletest'
Unknown command: 'patch'
Type 'svn help' for usage.
make[2]: *** [third_party/src/googletest-stamp/googletest-patch] Error 1
make[2]: *** Waiting for unfinished jobs....
At revision 660.
make[1]: *** [test/CMakeFiles/googletest.dir/all] Error 2

My svn is the latest version available to me, yet it does not have an *svn patch* subcommand.


G

Shehzan Mohammed

Nov 14, 2014, 9:20:12 AM
to arrayfi...@googlegroups.com
To build tests you need to turn BUILD_GTEST on too. But let's leave that for now.

Can you send me the code you are running? The build ID in your output does not exist, so I have to assume that you have made some commits of your own.
Without looking at the relevant parts of the code, especially the example, I will not be able to comment on it.
So please send me the example code as well as any other relevant changes that you have made so that I can help you out further.

Shehzan

Shehzan Mohammed

Nov 14, 2014, 9:27:56 AM
to arrayfi...@googlegroups.com
My bad. I forgot to check the master branch. The build ID is there.
Can you send me the code from the example?

Shehzan

gopalakr

Nov 14, 2014, 9:30:33 AM
to arrayfi...@googlegroups.com
I have not made any changes to the code or any commits myself; I am only trying to compile things as they are.

It is surprising that it would use transpose!

Shehzan Mohammed

Nov 14, 2014, 9:35:00 AM
to arrayfi...@googlegroups.com
Yes it is surprising. It shouldn't be doing that.
Can you check out the devel branch and build that? We made a few bug fixes.

Shehzan

Shehzan Mohammed

Nov 14, 2014, 9:46:53 AM
to arrayfi...@googlegroups.com
Ok I've got it.
Transpose is called when the print happens. So it is not surprising.
If you comment out all the af_prints, it shouldn't crash.
I'm not sure why it's failing though. I'm working on it.



gopalakr

Nov 14, 2014, 9:55:06 AM
to arrayfi...@googlegroups.com

I cloned devel and tried building it with the same CUDA backend. If you want me to update g++ and move to CUDA 6.5, I'll do that!

$ ./examples/helloworld_cuda
ArrayFire v3.0.200 (CUDA, 64-bit Linux, build 6d797e6)
Platform: CUDA Toolkit 6, Driver: 331.62
[0] Tesla K40c, 11520 MB, CUDA Compute 3.5
Create a 5-by-3 matrix of random floats on the GPU
Element-wise arithmetic
Negate the first three elements of second column
Internal error in XXXX/arrayfire/src/backend/cuda/jit.cpp(262):
CU Error (209)

XXXX/arrayfire/src/array/array.cpp:570: AF_ERROR 1
terminate called after throwing an instance of 'af::exception'
Aborted (core dumped)

gopalakr

Nov 14, 2014, 9:55:55 AM
to arrayfi...@googlegroups.com
I commented out the af_print calls to get this output.

Shehzan Mohammed

Nov 14, 2014, 10:07:20 AM
to arrayfi...@googlegroups.com
So I've just built it with CUDA 6.0 and it's working fine at my end.
Just to be sure, did you clone with "git clone --recursive"? (There should be files in arrayfire/src/backend/cuda/ptx/*).
If you did not, run these 3 commands:

git submodule init
git submodule update
git submodule foreach git pull origin master

gopalakr

Nov 14, 2014, 10:21:44 AM
to arrayfi...@googlegroups.com
Yes, I do see these directories in ptx.

The error seems to suggest that the CUDA kernel is being launched incorrectly.

A Internal error in /crtx0/gopalakr/projects/arrayfire/src/backend/cuda/kernel/transpose.hpp(99):
CUDA Error (9): invalid configuration argument

With the af_print commented out, it's the CU error from jit.cpp.
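
The two numeric codes reported so far can be looked up in the CUDA headers. Below is a small Python sketch of that lookup. It assumes that the "CUDA Error" prefix denotes a runtime API status (cudaError_t) and "CU Error" a driver API status (CUresult); the thread does not state this, and `KNOWN_CODES` and `explain` are hypothetical names.

```python
import re

# Only the two codes quoted in this thread. The names and descriptions are the
# enumerators from the CUDA headers. Reading "CUDA Error" as a runtime API code
# and "CU Error" as a driver API code is an assumption based on the prefixes.
KNOWN_CODES = {
    ("CUDA", 9): ("cudaErrorInvalidConfiguration",
                  "invalid configuration argument"),
    ("CU", 209): ("CUDA_ERROR_NO_BINARY_FOR_GPU",
                  "no kernel image is available for execution on the device"),
}

_PATTERN = re.compile(r"\b(CUDA|CU) Error \((\d+)\)")


def explain(line):
    """Return 'NAME: description' for a recognised error line, else None."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    key = (match.group(1), int(match.group(2)))
    if key not in KNOWN_CODES:
        return "unknown %s error code %d" % key
    return "%s: %s" % KNOWN_CODES[key]
```

If that reading is right, code 209 would mean the PTX or binary being loaded was not built for the GPU in use, which is worth checking against the compute capability the library was compiled for. The thread itself does not confirm this.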

Shehzan Mohammed

Nov 14, 2014, 10:50:39 AM
to arrayfi...@googlegroups.com
I tried building this for a Tesla K40c with compute 35 and even that works fine.
Can you try this? It's a long shot though.

I've built libafcuda.so and ptx_headers (links attached) with gcc 4.6, CUDA 6.0 (compute 30) and c++0x.
Replace your libafcuda and ptx headers with the extracted ones. They should go into build/src/backend/cuda/

Note: my source commit is 6d797e6 from the devel branch.

Shehzan

gopalakr

Nov 14, 2014, 11:20:17 AM
to arrayfi...@googlegroups.com
OK, I did this. It seems to have gone further (past the af_print part, maybe) but is stuck on the JIT error.

./examples/helloworld_cuda
ArrayFire v3.0.200 (CUDA, 64-bit Linux, build 6d797e6)
Platform: CUDA Toolkit 6, Driver: 331.62
[0] Tesla K40c, 11520 MB, CUDA Compute 3.5
Create a 5-by-3 matrix of random floats on the GPU
A [5 3 1 1]
   Offsets: [0 0 0 0]
   Strides: [1 5 15 15]
    0.0001     0.0001     0.0001
    0.0001     0.0001     0.0001
    0.0001     0.0001     0.0001
    0.0001     0.0001     0.0001
    0.0001     0.0001     0.0001

Element-wise arithmetic
B Internal error in /home/shehzan/area51/arrayfire/src/backend/cuda/jit.cpp(262):
CU Error (209)

Segmentation fault (core dumped)

srmuniz

Nov 15, 2014, 9:16:22 PM
to arrayfi...@googlegroups.com
Hello Shehzan

I'm observing a similar error to gopalakr, as shown below:

=====================================
./examples/helloworld_cuda
ArrayFire v3.0.200 (CUDA, 64-bit Linux, build ca95e1d)
Platform: CUDA Toolkit 6.5, Driver: 340.29
[0] GeForce GTX 590, 1536 MB, CUDA Compute 2.0
-1- GeForce GTX 590, 1536 MB, CUDA Compute 2.0

Create a 5-by-3 matrix of random floats on the GPU
A Internal error in /home/SrM/ArrayFire/arrayfire/src/backend/cuda/kernel/transpose.hpp(99):

CUDA Error (9): invalid configuration argument

[5 3 1 1]

Element-wise arithmetic
B Internal error in /home/SrM/ArrayFire/arrayfire/src/backend/cuda/kernel/transpose.hpp(99):

CUDA Error (9): invalid configuration argument

Segmentation fault (core dumped)
=====================================


As you can see, it happens on a different GPU (GTX 590) and with CUDA 6.5, so it does not seem to be related to the GPU model or the toolkit version.
Curiously, 67 of the 68 ctest tests still pass (all except Test_reduce_cuda), as shown below:


=====================================
ctest
Test project /home/SrM/ArrayFire/arrayfire/build
      Start  1: Test_morph_cpu
 1/68 Test  #1: Test_morph_cpu ...................   Passed    1.16 sec
      Start  2: Test_transpose_cpu
 2/68 Test  #2: Test_transpose_cpu ...............   Passed    0.45 sec
      Start  3: Test_diff2_cpu
 3/68 Test  #3: Test_diff2_cpu ...................   Passed    0.01 sec
      Start  4: Test_scan_cpu
 4/68 Test  #4: Test_scan_cpu ....................   Passed    0.05 sec
      Start  5: Test_resize_cpu
 5/68 Test  #5: Test_resize_cpu ..................   Passed    0.84 sec
      Start  6: Test_regions_cpu
 6/68 Test  #6: Test_regions_cpu .................   Passed    0.08 sec
      Start  7: Test_gradient_cpu
 7/68 Test  #7: Test_gradient_cpu ................   Passed    0.03 sec
      Start  8: Test_random_cpu
 8/68 Test  #8: Test_random_cpu ..................   Passed   82.56 sec
      Start  9: Test_blas_cpu
 9/68 Test  #9: Test_blas_cpu ....................   Passed    0.01 sec
      Start 10: Test_moddims_cpu
10/68 Test #10: Test_moddims_cpu .................   Passed    0.01 sec
      Start 11: Test_reduce_cpu
11/68 Test #11: Test_reduce_cpu ..................   Passed    0.03 sec
      Start 12: Test_convolve_cpu
12/68 Test #12: Test_convolve_cpu ................   Passed    0.07 sec
      Start 13: Test_bilateral_cpu
13/68 Test #13: Test_bilateral_cpu ...............   Passed    3.68 sec
      Start 14: Test_sort_by_key_cpu
14/68 Test #14: Test_sort_by_key_cpu .............   Passed    0.03 sec
      Start 15: Test_approx2_cpu
15/68 Test #15: Test_approx2_cpu .................   Passed    1.45 sec
      Start 16: Test_meanshift_cpu
16/68 Test #16: Test_meanshift_cpu ...............   Passed    9.06 sec
      Start 17: Test_sort_cpu
17/68 Test #17: Test_sort_cpu ....................   Passed    0.02 sec
      Start 18: Test_basic_cpu
18/68 Test #18: Test_basic_cpu ...................   Passed    0.03 sec
      Start 19: Test_info_cpu
19/68 Test #19: Test_info_cpu ....................   Passed    0.01 sec
      Start 20: Test_where_cpu
20/68 Test #20: Test_where_cpu ...................   Passed    0.01 sec
      Start 21: Test_reorder_cpu
21/68 Test #21: Test_reorder_cpu .................   Passed    0.07 sec
      Start 22: Test_shift_cpu
22/68 Test #22: Test_shift_cpu ...................   Passed    0.03 sec
      Start 23: Test_assign_cpu
23/68 Test #23: Test_assign_cpu ..................   Passed    0.01 sec
      Start 24: Test_sort_index_cpu
24/68 Test #24: Test_sort_index_cpu ..............   Passed    0.02 sec
      Start 25: Test_histogram_cpu
25/68 Test #25: Test_histogram_cpu ...............   Passed    0.01 sec
      Start 26: Test_index_cpu
26/68 Test #26: Test_index_cpu ...................   Passed    0.02 sec
      Start 27: Test_medfilt_cpu
27/68 Test #27: Test_medfilt_cpu .................   Passed    0.01 sec
      Start 28: Test_fft_cpu
28/68 Test #28: Test_fft_cpu .....................   Passed    0.02 sec
      Start 29: Test_tile_cpu
29/68 Test #29: Test_tile_cpu ....................   Passed   12.41 sec
      Start 30: Test_approx1_cpu
30/68 Test #30: Test_approx1_cpu .................   Passed    0.24 sec
      Start 31: Test_set_cpu
31/68 Test #31: Test_set_cpu .....................   Passed    0.01 sec
      Start 32: Test_imageio_cpu
32/68 Test #32: Test_imageio_cpu .................   Passed    0.06 sec
      Start 33: Test_diff1_cpu
33/68 Test #33: Test_diff1_cpu ...................   Passed    0.01 sec
      Start 34: Test_rotate_cpu
34/68 Test #34: Test_rotate_cpu ..................   Passed    5.93 sec
      Start 35: Test_morph_cuda
35/68 Test #35: Test_morph_cuda ..................   Passed    0.34 sec
      Start 36: Test_transpose_cuda
36/68 Test #36: Test_transpose_cuda ..............   Passed    0.78 sec
      Start 37: Test_diff2_cuda
37/68 Test #37: Test_diff2_cuda ..................   Passed    0.19 sec
      Start 38: Test_scan_cuda
38/68 Test #38: Test_scan_cuda ...................   Passed    0.34 sec
      Start 39: Test_resize_cuda
39/68 Test #39: Test_resize_cuda .................   Passed    1.01 sec
      Start 40: Test_regions_cuda
40/68 Test #40: Test_regions_cuda ................   Passed    0.21 sec
      Start 41: Test_gradient_cuda
41/68 Test #41: Test_gradient_cuda ...............   Passed    0.19 sec
      Start 42: Test_random_cuda
42/68 Test #42: Test_random_cuda .................   Passed    0.25 sec
      Start 43: Test_blas_cuda
43/68 Test #43: Test_blas_cuda ...................   Passed    0.26 sec
      Start 44: Test_moddims_cuda
44/68 Test #44: Test_moddims_cuda ................   Passed    0.17 sec
      Start 45: Test_reduce_cuda
45/68 Test #45: Test_reduce_cuda .................***Failed    0.19 sec
      Start 46: Test_convolve_cuda
46/68 Test #46: Test_convolve_cuda ...............   Passed    0.30 sec
      Start 47: Test_bilateral_cuda
47/68 Test #47: Test_bilateral_cuda ..............   Passed    0.24 sec
      Start 48: Test_sort_by_key_cuda
48/68 Test #48: Test_sort_by_key_cuda ............   Passed    0.45 sec
      Start 49: Test_approx2_cuda
49/68 Test #49: Test_approx2_cuda ................   Passed    1.71 sec
      Start 50: Test_meanshift_cuda
50/68 Test #50: Test_meanshift_cuda ..............   Passed    0.28 sec
      Start 51: Test_sort_cuda
51/68 Test #51: Test_sort_cuda ...................   Passed    0.51 sec
      Start 52: Test_basic_cuda
52/68 Test #52: Test_basic_cuda ..................   Passed    0.27 sec
      Start 53: Test_info_cuda
53/68 Test #53: Test_info_cuda ...................   Passed    0.29 sec
      Start 54: Test_where_cuda
54/68 Test #54: Test_where_cuda ..................   Passed    0.17 sec
      Start 55: Test_reorder_cuda
55/68 Test #55: Test_reorder_cuda ................   Passed    0.28 sec
      Start 56: Test_shift_cuda
56/68 Test #56: Test_shift_cuda ..................   Passed    0.22 sec
      Start 57: Test_assign_cuda
57/68 Test #57: Test_assign_cuda .................   Passed    0.20 sec
      Start 58: Test_sort_index_cuda
58/68 Test #58: Test_sort_index_cuda .............   Passed    0.44 sec
      Start 59: Test_histogram_cuda
59/68 Test #59: Test_histogram_cuda ..............   Passed    0.20 sec
      Start 60: Test_index_cuda
60/68 Test #60: Test_index_cuda ..................   Passed    0.20 sec
      Start 61: Test_medfilt_cuda
61/68 Test #61: Test_medfilt_cuda ................   Passed    0.18 sec
      Start 62: Test_fft_cuda
62/68 Test #62: Test_fft_cuda ....................   Passed    0.29 sec
      Start 63: Test_tile_cuda
63/68 Test #63: Test_tile_cuda ...................   Passed   12.18 sec
      Start 64: Test_approx1_cuda
64/68 Test #64: Test_approx1_cuda ................   Passed    0.41 sec
      Start 65: Test_set_cuda
65/68 Test #65: Test_set_cuda ....................   Passed    0.20 sec
      Start 66: Test_imageio_cuda
66/68 Test #66: Test_imageio_cuda ................   Passed    0.21 sec
      Start 67: Test_diff1_cuda
67/68 Test #67: Test_diff1_cuda ..................   Passed    0.19 sec
      Start 68: Test_rotate_cuda
68/68 Test #68: Test_rotate_cuda .................   Passed    6.25 sec

99% tests passed, 1 tests failed out of 68

Total Test time (real) = 148.09 sec

The following tests FAILED:
     45 - Test_reduce_cuda (Failed)
Errors while running CTest
=====================================
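
With 68 tests, a single failure is easy to miss in a listing like the one above. Here is a minimal Python sketch for pulling the non-passing entries out of ctest output, assuming the line layout shown in this thread. `failed_tests` is a hypothetical helper, not part of CTest or ArrayFire.

```python
import re

# One result line looks like:
#   45/68 Test #45: Test_reduce_cuda .................***Failed    0.19 sec
_RESULT = re.compile(
    r"^\s*\d+/\d+\s+Test\s+#\d+:\s+(\S+)\s+\.+\s*(.*?)\s+[\d.]+\s+sec\s*$"
)


def failed_tests(ctest_output):
    """Return (test_name, status) pairs for every result line that is not 'Passed'."""
    failures = []
    for line in ctest_output.splitlines():
        match = _RESULT.match(line)
        if match and match.group(2) != "Passed":
            # ctest marks failures with a leading '***'; drop it for readability.
            failures.append((match.group(1), match.group(2).lstrip("*")))
    return failures
```

Saving the ctest output to a file and passing its contents to `failed_tests` gives a short list to quote in a bug report.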


In addition, I could build the CPU and CUDA backends but had trouble with OpenCL. To complete the installation I had to turn the OpenCL backend OFF, using:

cmake .. -DCMAKE_BUILD_TYPE=Release -DBUILD_CUDA=ON -DBUILD_OPENCL=OFF

When trying to build with OpenCL, I get this error:

=====================================
...
[ 29%] Built target helloworld_cpu
[ 54%] Built target afopencl
Linking CXX executable helloworld_opencl
/home/SrM/ArrayFire/clBLAS/build/package/lib64/libclBLAS.so: undefined reference to `clRetainContext@OPENCL_1.0'
...
... ## LOTS OF SIMILAR ERRORS!! (undefined references to ...)##
...
/home/SrM/ArrayFire/clBLAS/build/package/lib64/libclBLAS.so: undefined reference to `clCreateCommandQueue@OPENCL_1.0'
collect2: error: ld returned 1 exit status
make[2]: *** [examples/helloworld_opencl] Error 1
make[1]: *** [examples/CMakeFiles/helloworld_opencl.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 77%] Built target afcuda
At revision 660.
[ 77%] Performing configure step for 'googletest'
-- Configuring done
-- Generating done
-- Build files have been written to: /home/SrM/ArrayFire/arrayfire/build/third_party/src/googletest-build
[ 77%] Performing build step for 'googletest'
[ 50%] Built target gtest
[100%] Built target gtest_main
[ 77%] No install step for 'googletest'
[ 78%] Completed 'googletest'
[ 79%] Built target googletest
make: *** [all] Error 2
=====================================


I hope this information helps you figure it out. Let me know if you need more info.

Cheers!
SrM.

Pavan Yalamanchili

Nov 15, 2014, 9:52:59 PM
to srmuniz, arrayfi...@googlegroups.com
Hi SrM,

The OpenCL problem occurs when clBLAS and ArrayFire are using different versions of libOpenCL.so. Can you make sure that the same OpenCL library is being used for both?

--
Pavan


Shehzan Mohammed

Nov 15, 2014, 9:58:01 PM
to arrayfi...@googlegroups.com
Hi

We are aware of the reduction failure on CUDA and are investigating it. You can follow it on our github issues page.
We have also updated the master branch with the latest updates. You can try to build that.
Let me know if you run into the same problems again.

Shehzan

srmuniz

Nov 15, 2014, 10:06:59 PM
to arrayfi...@googlegroups.com, srmun...@gmail.com
Hello Pavan.
Thanks for your superfast reply!

I saw some mention of this on the "Known issues" page, but I'm not sure how to determine that. I followed the instructions in the wiki and did not notice anything strange while installing clBLAS, which apparently worked fine. Later I looked into the Makefile produced by cmake, and it seems to point to the right places. In addition, I would expect the cmake scripts to find the same version of libOpenCL.so, as they are basically running from the same place under the same conditions (folder, session, etc.). But, of course, I might be wrong about that.
Any suggestion on how to make sure?

All the best!

Pavan Yalamanchili

Nov 15, 2014, 10:18:17 PM
to srmuniz, arrayfi...@googlegroups.com
Hi,

FindOpenCL.cmake is not part of the default CMake modules. clBLAS and ArrayFire ship with their own versions of this file, and they look for files in a different order of preference. If you have more than one OpenCL SDK installed on your system, you may face issues like this.

We will update the wiki to reflect this as a possible issue. To use CUDA's OpenCL SDK on Linux, you need to do the following.

export OpenCL_ROOT=$CUDA_PATH
export LD_LIBRARY_PATH=$CUDA_PATH/lib64:$LD_LIBRARY_PATH

After doing the above, delete the build directories and start from scratch.
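
One way to check which libOpenCL.so each library actually resolves to, assuming a Linux system where `ldd` is available, is to compare the `ldd` output for libclBLAS.so with that of the ArrayFire OpenCL library (the afopencl target in the build log). A rough Python sketch follows; the helper names are hypothetical, and it only inspects text you have already captured from `ldd`.

```python
import re

# Matches an ldd line such as:
#   libOpenCL.so.1 => /some/dir/libOpenCL.so.1 (0x00007f...)
# Lines reading "=> not found" are deliberately not matched.
_OPENCL_LINE = re.compile(r"\s*libOpenCL\.so\S*\s+=>\s+(/\S+)")


def opencl_library(ldd_output):
    """Return the resolved path of libOpenCL.so in ldd output, or None."""
    for line in ldd_output.splitlines():
        match = _OPENCL_LINE.match(line)
        if match:
            return match.group(1)
    return None


def same_opencl(ldd_a, ldd_b):
    """True only if both ldd outputs resolve libOpenCL.so to the same path."""
    path_a = opencl_library(ldd_a)
    path_b = opencl_library(ldd_b)
    return path_a is not None and path_a == path_b
```

If the two paths differ, the libraries were linked against different OpenCL SDKs, which is the situation described above. Note that `ldd` reflects the current LD_LIBRARY_PATH, so run it in the same shell session used for the build.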
 

Shehzan Mohammed

Nov 15, 2014, 10:31:25 PM
to arrayfi...@googlegroups.com, srmun...@gmail.com
Hey Muniz

Can you run the tests with "CTEST_OUTPUT_ON_FAILURE=ON make test"?
This will print the details of the reduction failure, which would be helpful.

Shehzan

srmuniz

Nov 15, 2014, 11:08:13 PM
to arrayfi...@googlegroups.com
Shehzan:

I updated the local distribution (git pull), as you suggested, and rebuilt it.
Unfortunately, I still get a similar error (although not exactly the same one as before):

==================================
./examples/helloworld_cuda
ArrayFire v3.0.200 (CUDA, 64-bit Linux, build 89efa49)

Platform: CUDA Toolkit 6.5, Driver: 340.29
[0] GeForce GTX 590, 1536 MB, CUDA Compute 2.0
-1- GeForce GTX 590, 1536 MB, CUDA Compute 2.0
Create a 5-by-3 matrix of random floats on the GPU
A Internal error in /home/SrM/ArrayFire/arrayfire/src/backend/cuda/kernel/transpose.hpp(99):
CUDA Error (9): invalid configuration argument

Segmentation fault (core dumped)
==================================



As for your second suggestion, the detailed test results follow:

==================================
CTEST_OUTPUT_ON_FAILURE=ON make test
Running tests...

Test project /home/SrM/ArrayFire/arrayfire/build
      Start  1: Test_morph_cpu
 1/68 Test  #1: Test_morph_cpu ...................   Passed    1.16 sec
      Start  2: Test_transpose_cpu
 2/68 Test  #2: Test_transpose_cpu ...............   Passed    0.46 sec
      Start  3: Test_diff2_cpu
 3/68 Test  #3: Test_diff2_cpu ...................   Passed    0.01 sec
      Start  4: Test_scan_cpu
 4/68 Test  #4: Test_scan_cpu ....................   Passed    0.05 sec
      Start  5: Test_resize_cpu
 5/68 Test  #5: Test_resize_cpu ..................   Passed    0.83 sec
      Start  6: Test_regions_cpu
 6/68 Test  #6: Test_regions_cpu .................   Passed    0.08 sec
      Start  7: Test_gradient_cpu
 7/68 Test  #7: Test_gradient_cpu ................   Passed    0.03 sec
      Start  8: Test_random_cpu
 8/68 Test  #8: Test_random_cpu ..................   Passed   82.57 sec
      Start  9: Test_blas_cpu
 9/68 Test  #9: Test_blas_cpu ....................   Passed    0.01 sec
      Start 10: Test_moddims_cpu
10/68 Test #10: Test_moddims_cpu .................   Passed    0.01 sec
      Start 11: Test_reduce_cpu
11/68 Test #11: Test_reduce_cpu ..................   Passed    0.03 sec
      Start 12: Test_convolve_cpu
12/68 Test #12: Test_convolve_cpu ................   Passed    0.06 sec
      Start 13: Test_bilateral_cpu
13/68 Test #13: Test_bilateral_cpu ...............   Passed    3.93 sec
      Start 14: Test_sort_by_key_cpu
14/68 Test #14: Test_sort_by_key_cpu .............   Passed    0.02 sec
      Start 15: Test_approx2_cpu
15/68 Test #15: Test_approx2_cpu .................   Passed    1.52 sec
      Start 16: Test_meanshift_cpu
16/68 Test #16: Test_meanshift_cpu ...............   Passed    9.27 sec
      Start 17: Test_sort_cpu
17/68 Test #17: Test_sort_cpu ....................   Passed    0.02 sec
      Start 18: Test_basic_cpu
18/68 Test #18: Test_basic_cpu ...................   Passed    0.03 sec
      Start 19: Test_info_cpu
19/68 Test #19: Test_info_cpu ....................   Passed    0.01 sec
      Start 20: Test_where_cpu
20/68 Test #20: Test_where_cpu ...................   Passed    0.01 sec
      Start 21: Test_reorder_cpu
21/68 Test #21: Test_reorder_cpu .................   Passed    0.07 sec
      Start 22: Test_shift_cpu
22/68 Test #22: Test_shift_cpu ...................   Passed    0.03 sec
      Start 23: Test_assign_cpu
23/68 Test #23: Test_assign_cpu ..................   Passed    0.01 sec
      Start 24: Test_sort_index_cpu
24/68 Test #24: Test_sort_index_cpu ..............   Passed    0.02 sec
      Start 25: Test_histogram_cpu
25/68 Test #25: Test_histogram_cpu ...............   Passed    0.01 sec
      Start 26: Test_index_cpu
26/68 Test #26: Test_index_cpu ...................***Exception: SegFault  0.15 sec
Running main() from gtest_main.cc
[==========] Running 181 tests from 19 test cases.
[----------] Global test environment set-up.
[----------] 5 tests from Indexing1D/0, where TypeParam = float
[ RUN      ] Indexing1D/0.Continious
[       OK ] Indexing1D/0.Continious (0 ms)
[ RUN      ] Indexing1D/0.ContiniousReverse
[       OK ] Indexing1D/0.ContiniousReverse (0 ms)
[ RUN      ] Indexing1D/0.Strided
[       OK ] Indexing1D/0.Strided (0 ms)
[ RUN      ] Indexing1D/0.StridedReverse
[       OK ] Indexing1D/0.StridedReverse (0 ms)
[ RUN      ] Indexing1D/0.Span
[       OK ] Indexing1D/0.Span (0 ms)
[----------] 5 tests from Indexing1D/0 (0 ms total)

[----------] 5 tests from Indexing1D/1, where TypeParam = double
[ RUN      ] Indexing1D/1.Continious
[       OK ] Indexing1D/1.Continious (0 ms)
[ RUN      ] Indexing1D/1.ContiniousReverse
[       OK ] Indexing1D/1.ContiniousReverse (0 ms)
[ RUN      ] Indexing1D/1.Strided
[       OK ] Indexing1D/1.Strided (0 ms)
[ RUN      ] Indexing1D/1.StridedReverse
[       OK ] Indexing1D/1.StridedReverse (0 ms)
[ RUN      ] Indexing1D/1.Span
[       OK ] Indexing1D/1.Span (0 ms)
[----------] 5 tests from Indexing1D/1 (0 ms total)

[----------] 5 tests from Indexing1D/2, where TypeParam = int
[ RUN      ] Indexing1D/2.Continious
[       OK ] Indexing1D/2.Continious (0 ms)
[ RUN      ] Indexing1D/2.ContiniousReverse
[       OK ] Indexing1D/2.ContiniousReverse (0 ms)
[ RUN      ] Indexing1D/2.Strided
[       OK ] Indexing1D/2.Strided (0 ms)
[ RUN      ] Indexing1D/2.StridedReverse
[       OK ] Indexing1D/2.StridedReverse (0 ms)
[ RUN      ] Indexing1D/2.Span
[       OK ] Indexing1D/2.Span (0 ms)
[----------] 5 tests from Indexing1D/2 (0 ms total)

[----------] 5 tests from Indexing1D/3, where TypeParam = unsigned int
[ RUN      ] Indexing1D/3.Continious
[       OK ] Indexing1D/3.Continious (0 ms)
[ RUN      ] Indexing1D/3.ContiniousReverse
[       OK ] Indexing1D/3.ContiniousReverse (0 ms)
[ RUN      ] Indexing1D/3.Strided
[       OK ] Indexing1D/3.Strided (0 ms)
[ RUN      ] Indexing1D/3.StridedReverse
[       OK ] Indexing1D/3.StridedReverse (0 ms)
[ RUN      ] Indexing1D/3.Span
[       OK ] Indexing1D/3.Span (0 ms)
[----------] 5 tests from Indexing1D/3 (0 ms total)

[----------] 5 tests from Indexing1D/4, where TypeParam = char
[ RUN      ] Indexing1D/4.Continious
[       OK ] Indexing1D/4.Continious (0 ms)
[ RUN      ] Indexing1D/4.ContiniousReverse
[       OK ] Indexing1D/4.ContiniousReverse (0 ms)
[ RUN      ] Indexing1D/4.Strided
[       OK ] Indexing1D/4.Strided (0 ms)
[ RUN      ] Indexing1D/4.StridedReverse
[       OK ] Indexing1D/4.StridedReverse (0 ms)
[ RUN      ] Indexing1D/4.Span
[       OK ] Indexing1D/4.Span (0 ms)
[----------] 5 tests from Indexing1D/4 (0 ms total)

[----------] 5 tests from Indexing1D/5, where TypeParam = unsigned char
[ RUN      ] Indexing1D/5.Continious
[       OK ] Indexing1D/5.Continious (0 ms)
[ RUN      ] Indexing1D/5.ContiniousReverse
[       OK ] Indexing1D/5.ContiniousReverse (0 ms)
[ RUN      ] Indexing1D/5.Strided
[       OK ] Indexing1D/5.Strided (0 ms)
[ RUN      ] Indexing1D/5.StridedReverse
[       OK ] Indexing1D/5.StridedReverse (0 ms)
[ RUN      ] Indexing1D/5.Span
[       OK ] Indexing1D/5.Span (0 ms)
[----------] 5 tests from Indexing1D/5 (0 ms total)

[----------] 18 tests from Indexing2D/0, where TypeParam = float
[ RUN      ] Indexing2D/0.ColumnContinious
[       OK ] Indexing2D/0.ColumnContinious (1 ms)
[ RUN      ] Indexing2D/0.ColumnContiniousReverse
[       OK ] Indexing2D/0.ColumnContiniousReverse (0 ms)
[ RUN      ] Indexing2D/0.ColumnStrided
[       OK ] Indexing2D/0.ColumnStrided (0 ms)
[ RUN      ] Indexing2D/0.ColumnStridedReverse
[       OK ] Indexing2D/0.ColumnStridedReverse (0 ms)
[ RUN      ] Indexing2D/0.RowContinious
[       OK ] Indexing2D/0.RowContinious (0 ms)
[ RUN      ] Indexing2D/0.RowContiniousReverse
[       OK ] Indexing2D/0.RowContiniousReverse (0 ms)
[ RUN      ] Indexing2D/0.RowStrided
[       OK ] Indexing2D/0.RowStrided (0 ms)
[ RUN      ] Indexing2D/0.RowStridedReverse
[       OK ] Indexing2D/0.RowStridedReverse (0 ms)
[ RUN      ] Indexing2D/0.ContiniousContinious
[       OK ] Indexing2D/0.ContiniousContinious (0 ms)
[ RUN      ] Indexing2D/0.ContiniousReverse
[       OK ] Indexing2D/0.ContiniousReverse (0 ms)
[ RUN      ] Indexing2D/0.ContiniousStrided
[       OK ] Indexing2D/0.ContiniousStrided (0 ms)
[ RUN      ] Indexing2D/0.ContiniousStridedReverse
[       OK ] Indexing2D/0.ContiniousStridedReverse (0 ms)
[ RUN      ] Indexing2D/0.ReverseContinious
[       OK ] Indexing2D/0.ReverseContinious (0 ms)
[ RUN      ] Indexing2D/0.ReverseReverse
[       OK ] Indexing2D/0.ReverseReverse (0 ms)
[ RUN      ] Indexing2D/0.ReverseStrided
[       OK ] Indexing2D/0.ReverseStrided (1 ms)
[ RUN      ] Indexing2D/0.ReverseStridedReverse
[       OK ] Indexing2D/0.ReverseStridedReverse (0 ms)
[ RUN      ] Indexing2D/0.StridedContinious
[       OK ] Indexing2D/0.StridedContinious (0 ms)
[ RUN      ] Indexing2D/0.StridedStrided
[       OK ] Indexing2D/0.StridedStrided (0 ms)
[----------] 18 tests from Indexing2D/0 (2 ms total)

[----------] 18 tests from Indexing2D/1, where TypeParam = double
[ RUN      ] Indexing2D/1.ColumnContinious
[       OK ] Indexing2D/1.ColumnContinious (0 ms)
[ RUN      ] Indexing2D/1.ColumnContiniousReverse
[       OK ] Indexing2D/1.ColumnContiniousReverse (0 ms)
[ RUN      ] Indexing2D/1.ColumnStrided
[       OK ] Indexing2D/1.ColumnStrided (0 ms)
[ RUN      ] Indexing2D/1.ColumnStridedReverse
[       OK ] Indexing2D/1.ColumnStridedReverse (0 ms)
[ RUN      ] Indexing2D/1.RowContinious
[       OK ] Indexing2D/1.RowContinious (0 ms)
[ RUN      ] Indexing2D/1.RowContiniousReverse
[       OK ] Indexing2D/1.RowContiniousReverse (0 ms)
[ RUN      ] Indexing2D/1.RowStrided
[       OK ] Indexing2D/1.RowStrided (0 ms)
[ RUN      ] Indexing2D/1.RowStridedReverse
[       OK ] Indexing2D/1.RowStridedReverse (0 ms)
[ RUN      ] Indexing2D/1.ContiniousContinious
[       OK ] Indexing2D/1.ContiniousContinious (0 ms)
[ RUN      ] Indexing2D/1.ContiniousReverse
[       OK ] Indexing2D/1.ContiniousReverse (0 ms)
[ RUN      ] Indexing2D/1.ContiniousStrided
[       OK ] Indexing2D/1.ContiniousStrided (1 ms)
[ RUN      ] Indexing2D/1.ContiniousStridedReverse
[       OK ] Indexing2D/1.ContiniousStridedReverse (0 ms)
[ RUN      ] Indexing2D/1.ReverseContinious
[       OK ] Indexing2D/1.ReverseContinious (0 ms)
[ RUN      ] Indexing2D/1.ReverseReverse
[       OK ] Indexing2D/1.ReverseReverse (0 ms)
[ RUN      ] Indexing2D/1.ReverseStrided
[       OK ] Indexing2D/1.ReverseStrided (0 ms)
[ RUN      ] Indexing2D/1.ReverseStridedReverse
[       OK ] Indexing2D/1.ReverseStridedReverse (0 ms)
[ RUN      ] Indexing2D/1.StridedContinious
[       OK ] Indexing2D/1.StridedContinious (0 ms)
[ RUN      ] Indexing2D/1.StridedStrided
[       OK ] Indexing2D/1.StridedStrided (0 ms)
[----------] 18 tests from Indexing2D/1 (1 ms total)

[----------] 18 tests from Indexing2D/2, where TypeParam = int
[ RUN      ] Indexing2D/2.ColumnContinious
[       OK ] Indexing2D/2.ColumnContinious (0 ms)
[ RUN      ] Indexing2D/2.ColumnContiniousReverse
[       OK ] Indexing2D/2.ColumnContiniousReverse (0 ms)
[ RUN      ] Indexing2D/2.ColumnStrided
[       OK ] Indexing2D/2.ColumnStrided (0 ms)
[ RUN      ] Indexing2D/2.ColumnStridedReverse
[       OK ] Indexing2D/2.ColumnStridedReverse (0 ms)
[ RUN      ] Indexing2D/2.RowContinious
[       OK ] Indexing2D/2.RowContinious (0 ms)
[ RUN      ] Indexing2D/2.RowContiniousReverse
[       OK ] Indexing2D/2.RowContiniousReverse (0 ms)
[ RUN      ] Indexing2D/2.RowStrided
[       OK ] Indexing2D/2.RowStrided (0 ms)
[ RUN      ] Indexing2D/2.RowStridedReverse
[       OK ] Indexing2D/2.RowStridedReverse (0 ms)
[ RUN      ] Indexing2D/2.ContiniousContinious
[       OK ] Indexing2D/2.ContiniousContinious (0 ms)
[ RUN      ] Indexing2D/2.ContiniousReverse
[       OK ] Indexing2D/2.ContiniousReverse (0 ms)
[ RUN      ] Indexing2D/2.ContiniousStrided
[       OK ] Indexing2D/2.ContiniousStrided (0 ms)
[ RUN      ] Indexing2D/2.ContiniousStridedReverse
[       OK ] Indexing2D/2.ContiniousStridedReverse (0 ms)
[ RUN      ] Indexing2D/2.ReverseContinious
[       OK ] Indexing2D/2.ReverseContinious (0 ms)
[ RUN      ] Indexing2D/2.ReverseReverse
[       OK ] Indexing2D/2.ReverseReverse (0 ms)
[ RUN      ] Indexing2D/2.ReverseStrided
[       OK ] Indexing2D/2.ReverseStrided (0 ms)
[ RUN      ] Indexing2D/2.ReverseStridedReverse
[       OK ] Indexing2D/2.ReverseStridedReverse (0 ms)
[ RUN      ] Indexing2D/2.StridedContinious
[       OK ] Indexing2D/2.StridedContinious (0 ms)
[ RUN      ] Indexing2D/2.StridedStrided
[       OK ] Indexing2D/2.StridedStrided (0 ms)
[----------] 18 tests from Indexing2D/2 (1 ms total)

[----------] 18 tests from Indexing2D/3, where TypeParam = unsigned int
[ RUN      ] Indexing2D/3.ColumnContinious
[       OK ] Indexing2D/3.ColumnContinious (0 ms)
[ RUN      ] Indexing2D/3.ColumnContiniousReverse
[       OK ] Indexing2D/3.ColumnContiniousReverse (0 ms)
[ RUN      ] Indexing2D/3.ColumnStrided
[       OK ] Indexing2D/3.ColumnStrided (1 ms)
[ RUN      ] Indexing2D/3.ColumnStridedReverse
[       OK ] Indexing2D/3.ColumnStridedReverse (0 ms)
[ RUN      ] Indexing2D/3.RowContinious
[       OK ] Indexing2D/3.RowContinious (0 ms)
[ RUN      ] Indexing2D/3.RowContiniousReverse
[       OK ] Indexing2D/3.RowContiniousReverse (0 ms)
[ RUN      ] Indexing2D/3.RowStrided
[       OK ] Indexing2D/3.RowStrided (0 ms)
[ RUN      ] Indexing2D/3.RowStridedReverse
[       OK ] Indexing2D/3.RowStridedReverse (0 ms)
[ RUN      ] Indexing2D/3.ContiniousContinious
[       OK ] Indexing2D/3.ContiniousContinious (0 ms)
[ RUN      ] Indexing2D/3.ContiniousReverse
[       OK ] Indexing2D/3.ContiniousReverse (0 ms)
[ RUN      ] Indexing2D/3.ContiniousStrided
[       OK ] Indexing2D/3.ContiniousStrided (0 ms)
[ RUN      ] Indexing2D/3.ContiniousStridedReverse
[       OK ] Indexing2D/3.ContiniousStridedReverse (0 ms)
[ RUN      ] Indexing2D/3.ReverseContinious
[       OK ] Indexing2D/3.ReverseContinious (0 ms)
[ RUN      ] Indexing2D/3.ReverseReverse
[       OK ] Indexing2D/3.ReverseReverse (0 ms)
[ RUN      ] Indexing2D/3.ReverseStrided
[       OK ] Indexing2D/3.ReverseStrided (0 ms)
[ RUN      ] Indexing2D/3.ReverseStridedReverse
[       OK ] Indexing2D/3.ReverseStridedReverse (0 ms)
[ RUN      ] Indexing2D/3.StridedContinious
[       OK ] Indexing2D/3.StridedContinious (0 ms)
[ RUN      ] Indexing2D/3.StridedStrided
[       OK ] Indexing2D/3.StridedStrided (1 ms)
[----------] 18 tests from Indexing2D/3 (2 ms total)

[----------] 18 tests from Indexing2D/4, where TypeParam = char
[ RUN      ] Indexing2D/4.ColumnContinious
[       OK ] Indexing2D/4.ColumnContinious (0 ms)
[ RUN      ] Indexing2D/4.ColumnContiniousReverse
[       OK ] Indexing2D/4.ColumnContiniousReverse (0 ms)
[ RUN      ] Indexing2D/4.ColumnStrided
[       OK ] Indexing2D/4.ColumnStrided (0 ms)
[ RUN      ] Indexing2D/4.ColumnStridedReverse
[       OK ] Indexing2D/4.ColumnStridedReverse (0 ms)
[ RUN      ] Indexing2D/4.RowContinious
[       OK ] Indexing2D/4.RowContinious (0 ms)
[ RUN      ] Indexing2D/4.RowContiniousReverse
[       OK ] Indexing2D/4.RowContiniousReverse (0 ms)
[ RUN      ] Indexing2D/4.RowStrided
[       OK ] Indexing2D/4.RowStrided (0 ms)
[ RUN      ] Indexing2D/4.RowStridedReverse
[       OK ] Indexing2D/4.RowStridedReverse (0 ms)
[ RUN      ] Indexing2D/4.ContiniousContinious
[       OK ] Indexing2D/4.ContiniousContinious (0 ms)
[ RUN      ] Indexing2D/4.ContiniousReverse
[       OK ] Indexing2D/4.ContiniousReverse (0 ms)
[ RUN      ] Indexing2D/4.ContiniousStrided
[       OK ] Indexing2D/4.ContiniousStrided (0 ms)
[ RUN      ] Indexing2D/4.ContiniousStridedReverse
[       OK ] Indexing2D/4.ContiniousStridedReverse (0 ms)
[ RUN      ] Indexing2D/4.ReverseContinious
[       OK ] Indexing2D/4.ReverseContinious (0 ms)
[ RUN      ] Indexing2D/4.ReverseReverse
[       OK ] Indexing2D/4.ReverseReverse (1 ms)
[ RUN      ] Indexing2D/4.ReverseStrided
[       OK ] Indexing2D/4.ReverseStrided (0 ms)
[ RUN      ] Indexing2D/4.ReverseStridedReverse
[       OK ] Indexing2D/4.ReverseStridedReverse (0 ms)
[ RUN      ] Indexing2D/4.StridedContinious
[       OK ] Indexing2D/4.StridedContinious (0 ms)
[ RUN      ] Indexing2D/4.StridedStrided
[       OK ] Indexing2D/4.StridedStrided (0 ms)
[----------] 18 tests from Indexing2D/4 (1 ms total)

[----------] 18 tests from Indexing2D/5, where TypeParam = unsigned char
[ RUN      ] Indexing2D/5.ColumnContinious
[       OK ] Indexing2D/5.ColumnContinious (0 ms)
[ RUN      ] Indexing2D/5.ColumnContiniousReverse
[       OK ] Indexing2D/5.ColumnContiniousReverse (0 ms)
[ RUN      ] Indexing2D/5.ColumnStrided
[       OK ] Indexing2D/5.ColumnStrided (0 ms)
[ RUN      ] Indexing2D/5.ColumnStridedReverse
[       OK ] Indexing2D/5.ColumnStridedReverse (0 ms)
[ RUN      ] Indexing2D/5.RowContinious
[       OK ] Indexing2D/5.RowContinious (0 ms)
[ RUN      ] Indexing2D/5.RowContiniousReverse
[       OK ] Indexing2D/5.RowContiniousReverse (0 ms)
[ RUN      ] Indexing2D/5.RowStrided
[       OK ] Indexing2D/5.RowStrided (0 ms)
[ RUN      ] Indexing2D/5.RowStridedReverse
[       OK ] Indexing2D/5.RowStridedReverse (0 ms)
[ RUN      ] Indexing2D/5.ContiniousContinious
[       OK ] Indexing2D/5.ContiniousContinious (0 ms)
[ RUN      ] Indexing2D/5.ContiniousReverse
[       OK ] Indexing2D/5.ContiniousReverse (1 ms)
[ RUN      ] Indexing2D/5.ContiniousStrided
[       OK ] Indexing2D/5.ContiniousStrided (0 ms)
[ RUN      ] Indexing2D/5.ContiniousStridedReverse
[       OK ] Indexing2D/5.ContiniousStridedReverse (0 ms)
[ RUN      ] Indexing2D/5.ReverseContinious
[       OK ] Indexing2D/5.ReverseContinious (0 ms)
[ RUN      ] Indexing2D/5.ReverseReverse
[       OK ] Indexing2D/5.ReverseReverse (0 ms)
[ RUN      ] Indexing2D/5.ReverseStrided
[       OK ] Indexing2D/5.ReverseStrided (0 ms)
[ RUN      ] Indexing2D/5.ReverseStridedReverse
[       OK ] Indexing2D/5.ReverseStridedReverse (0 ms)
[ RUN      ] Indexing2D/5.StridedContinious
[       OK ] Indexing2D/5.StridedContinious (0 ms)
[ RUN      ] Indexing2D/5.StridedStrided
[       OK ] Indexing2D/5.StridedStrided (0 ms)
[----------] 18 tests from Indexing2D/5 (1 ms total)

[----------] 7 tests from Indexing/0, where TypeParam = float
[ RUN      ] Indexing/0.4D_to_4D
/home/SrM/ArrayFire/arrayfire/test/testHelpers.hpp:69: Failure
Failed
TEST FILE NOT FOUND


      Start 27: Test_medfilt_cpu
27/68 Test #27: Test_medfilt_cpu .................   Passed    0.01 sec
      Start 28: Test_fft_cpu
28/68 Test #28: Test_fft_cpu .....................   Passed    0.02 sec
      Start 29: Test_tile_cpu
29/68 Test #29: Test_tile_cpu ....................   Passed   12.23 sec

      Start 30: Test_approx1_cpu
30/68 Test #30: Test_approx1_cpu .................   Passed    0.24 sec
      Start 31: Test_set_cpu
31/68 Test #31: Test_set_cpu .....................   Passed    0.01 sec
      Start 32: Test_imageio_cpu
32/68 Test #32: Test_imageio_cpu .................   Passed    0.05 sec

      Start 33: Test_diff1_cpu
33/68 Test #33: Test_diff1_cpu ...................   Passed    0.01 sec
      Start 34: Test_rotate_cpu
34/68 Test #34: Test_rotate_cpu ..................   Passed    6.00 sec
      Start 35: Test_morph_cuda
35/68 Test #35: Test_morph_cuda ..................   Passed    0.33 sec

      Start 36: Test_transpose_cuda
36/68 Test #36: Test_transpose_cuda ..............   Passed    0.78 sec
      Start 37: Test_diff2_cuda
37/68 Test #37: Test_diff2_cuda ..................   Passed    0.19 sec
      Start 38: Test_scan_cuda
38/68 Test #38: Test_scan_cuda ...................   Passed    0.34 sec
      Start 39: Test_resize_cuda
39/68 Test #39: Test_resize_cuda .................   Passed    1.18 sec
      Start 40: Test_regions_cuda
40/68 Test #40: Test_regions_cuda ................   Passed    0.22 sec

      Start 41: Test_gradient_cuda
41/68 Test #41: Test_gradient_cuda ...............   Passed    0.19 sec
      Start 42: Test_random_cuda
42/68 Test #42: Test_random_cuda .................   Passed    0.25 sec
      Start 43: Test_blas_cuda
43/68 Test #43: Test_blas_cuda ...................   Passed    0.27 sec
      Start 44: Test_moddims_cuda
44/68 Test #44: Test_moddims_cuda ................   Passed    0.16 sec

      Start 45: Test_reduce_cuda
45/68 Test #45: Test_reduce_cuda .................***Failed    0.19 sec
Running main() from gtest_main.cc
[==========] Running 50 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 50 tests from Reduce
[ RUN      ] Reduce.Test_sum_float
[       OK ] Reduce.Test_sum_float (129 ms)
[ RUN      ] Reduce.Test_sum_double
[       OK ] Reduce.Test_sum_double (1 ms)
[ RUN      ] Reduce.Test_sum_int
[       OK ] Reduce.Test_sum_int (0 ms)
[ RUN      ] Reduce.Test_sum_cfloat
[       OK ] Reduce.Test_sum_cfloat (0 ms)
[ RUN      ] Reduce.Test_sum_cdouble
[       OK ] Reduce.Test_sum_cdouble (1 ms)
[ RUN      ] Reduce.Test_sum_unsigned
[       OK ] Reduce.Test_sum_unsigned (0 ms)
[ RUN      ] Reduce.Test_sum_uchar
[       OK ] Reduce.Test_sum_uchar (0 ms)
[ RUN      ] Reduce.Test_min_float
[       OK ] Reduce.Test_min_float (0 ms)
[ RUN      ] Reduce.Test_min_double
[       OK ] Reduce.Test_min_double (1 ms)
[ RUN      ] Reduce.Test_min_int
[       OK ] Reduce.Test_min_int (0 ms)
[ RUN      ] Reduce.Test_min_cfloat
[       OK ] Reduce.Test_min_cfloat (1 ms)
[ RUN      ] Reduce.Test_min_cdouble
[       OK ] Reduce.Test_min_cdouble (0 ms)
[ RUN      ] Reduce.Test_min_unsigned
[       OK ] Reduce.Test_min_unsigned (0 ms)
[ RUN      ] Reduce.Test_min_uchar
[       OK ] Reduce.Test_min_uchar (1 ms)
[ RUN      ] Reduce.Test_max_float
[       OK ] Reduce.Test_max_float (0 ms)
[ RUN      ] Reduce.Test_max_double
[       OK ] Reduce.Test_max_double (0 ms)
[ RUN      ] Reduce.Test_max_int
[       OK ] Reduce.Test_max_int (0 ms)
[ RUN      ] Reduce.Test_max_cfloat
[       OK ] Reduce.Test_max_cfloat (1 ms)
[ RUN      ] Reduce.Test_max_cdouble
[       OK ] Reduce.Test_max_cdouble (0 ms)
[ RUN      ] Reduce.Test_max_unsigned
[       OK ] Reduce.Test_max_unsigned (1 ms)
[ RUN      ] Reduce.Test_max_uchar
[       OK ] Reduce.Test_max_uchar (0 ms)
[ RUN      ] Reduce.Test_anytrue_float
[       OK ] Reduce.Test_anytrue_float (0 ms)
[ RUN      ] Reduce.Test_anytrue_double
[       OK ] Reduce.Test_anytrue_double (1 ms)
[ RUN      ] Reduce.Test_anytrue_int
[       OK ] Reduce.Test_anytrue_int (0 ms)
[ RUN      ] Reduce.Test_anytrue_cfloat
[       OK ] Reduce.Test_anytrue_cfloat (0 ms)
[ RUN      ] Reduce.Test_anytrue_cdouble
[       OK ] Reduce.Test_anytrue_cdouble (1 ms)
[ RUN      ] Reduce.Test_anytrue_unsigned
[       OK ] Reduce.Test_anytrue_unsigned (0 ms)
[ RUN      ] Reduce.Test_anytrue_uchar
[       OK ] Reduce.Test_anytrue_uchar (0 ms)
[ RUN      ] Reduce.Test_alltrue_float
[       OK ] Reduce.Test_alltrue_float (0 ms)
[ RUN      ] Reduce.Test_alltrue_double
[       OK ] Reduce.Test_alltrue_double (1 ms)
[ RUN      ] Reduce.Test_alltrue_int
[       OK ] Reduce.Test_alltrue_int (0 ms)
[ RUN      ] Reduce.Test_alltrue_cfloat
[       OK ] Reduce.Test_alltrue_cfloat (1 ms)
[ RUN      ] Reduce.Test_alltrue_cdouble
[       OK ] Reduce.Test_alltrue_cdouble (0 ms)
[ RUN      ] Reduce.Test_alltrue_unsigned
[       OK ] Reduce.Test_alltrue_unsigned (0 ms)
[ RUN      ] Reduce.Test_alltrue_uchar
[       OK ] Reduce.Test_alltrue_uchar (0 ms)
[ RUN      ] Reduce.Test_count_float
[       OK ] Reduce.Test_count_float (1 ms)
[ RUN      ] Reduce.Test_count_double
[       OK ] Reduce.Test_count_double (0 ms)
[ RUN      ] Reduce.Test_count_int
[       OK ] Reduce.Test_count_int (0 ms)
[ RUN      ] Reduce.Test_count_cfloat
[       OK ] Reduce.Test_count_cfloat (1 ms)
[ RUN      ] Reduce.Test_count_cdouble
[       OK ] Reduce.Test_count_cdouble (0 ms)
[ RUN      ] Reduce.Test_count_unsigned
[       OK ] Reduce.Test_count_unsigned (0 ms)
[ RUN      ] Reduce.Test_count_uchar
[       OK ] Reduce.Test_count_uchar (1 ms)
[ RUN      ] Reduce.Test_Reduce_Big0
/home/SrM/ArrayFire/arrayfire/test/reduce.cpp:68: Failure
Value of: outData[elIter]
  Actual: 25003
Expected: currGoldBar[elIter]
Which is: 49747
at: 0 for dim 0

[  FAILED  ] Reduce.Test_Reduce_Big0 (6 ms)
[ RUN      ] Reduce.Test_Reduce_Big1
[       OK ] Reduce.Test_Reduce_Big1 (9 ms)
[ RUN      ] Reduce.Test_sum_CPP
[       OK ] Reduce.Test_sum_CPP (1 ms)
[ RUN      ] Reduce.Test_min_CPP
[       OK ] Reduce.Test_min_CPP (0 ms)
[ RUN      ] Reduce.Test_max_CPP
[       OK ] Reduce.Test_max_CPP (0 ms)
[ RUN      ] Reduce.Test_anytrue_CPP
[       OK ] Reduce.Test_anytrue_CPP (0 ms)
[ RUN      ] Reduce.Test_alltrue_CPP
[       OK ] Reduce.Test_alltrue_CPP (0 ms)
[ RUN      ] Reduce.Test_count_CPP
[       OK ] Reduce.Test_count_CPP (0 ms)
[----------] 50 tests from Reduce (160 ms total)

[----------] Global test environment tear-down
[==========] 50 tests from 1 test case ran. (160 ms total)
[  PASSED  ] 49 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] Reduce.Test_Reduce_Big0

 1 FAILED TEST


      Start 46: Test_convolve_cuda
46/68 Test #46: Test_convolve_cuda ...............   Passed    0.30 sec
      Start 47: Test_bilateral_cuda
47/68 Test #47: Test_bilateral_cuda ..............   Passed    0.24 sec
      Start 48: Test_sort_by_key_cuda
48/68 Test #48: Test_sort_by_key_cuda ............   Passed    0.46 sec
      Start 49: Test_approx2_cuda
49/68 Test #49: Test_approx2_cuda ................   Passed    1.64 sec
      Start 50: Test_meanshift_cuda
50/68 Test #50: Test_meanshift_cuda ..............   Passed    0.29 sec

      Start 51: Test_sort_cuda
51/68 Test #51: Test_sort_cuda ...................   Passed    0.51 sec
      Start 52: Test_basic_cuda
52/68 Test #52: Test_basic_cuda ..................   Passed    0.26 sec
      Start 53: Test_info_cuda
53/68 Test #53: Test_info_cuda ...................   Passed    0.31 sec

      Start 54: Test_where_cuda
54/68 Test #54: Test_where_cuda ..................   Passed    0.17 sec
      Start 55: Test_reorder_cuda
55/68 Test #55: Test_reorder_cuda ................   Passed    0.29 sec

      Start 56: Test_shift_cuda
56/68 Test #56: Test_shift_cuda ..................   Passed    0.22 sec
      Start 57: Test_assign_cuda
57/68 Test #57: Test_assign_cuda .................   Passed    0.20 sec
      Start 58: Test_sort_index_cuda
58/68 Test #58: Test_sort_index_cuda .............   Passed    0.43 sec
      Start 59: Test_histogram_cuda
59/68 Test #59: Test_histogram_cuda ..............   Passed    0.19 sec
      Start 60: Test_index_cuda
60/68 Test #60: Test_index_cuda ..................***Exception: SegFault  0.33 sec
Running main() from gtest_main.cc
[==========] Running 181 tests from 19 test cases.
[----------] Global test environment set-up.
[----------] 5 tests from Indexing1D/0, where TypeParam = float
[ RUN      ] Indexing1D/0.Continious
[       OK ] Indexing1D/0.Continious (131 ms)
[ RUN      ] Indexing1D/0.ContiniousReverse
[       OK ] Indexing1D/0.ContiniousReverse (0 ms)
[ RUN      ] Indexing1D/0.Strided
[       OK ] Indexing1D/0.Strided (0 ms)
[ RUN      ] Indexing1D/0.StridedReverse
[       OK ] Indexing1D/0.StridedReverse (0 ms)
[ RUN      ] Indexing1D/0.Span
[       OK ] Indexing1D/0.Span (0 ms)
[----------] 5 tests from Indexing1D/0 (131 ms total)

[----------] 5 tests from Indexing1D/1, where TypeParam = double
[ RUN      ] Indexing1D/1.Continious
[       OK ] Indexing1D/1.Continious (0 ms)
[ RUN      ] Indexing1D/1.ContiniousReverse
[       OK ] Indexing1D/1.ContiniousReverse (0 ms)
[ RUN      ] Indexing1D/1.Strided
[       OK ] Indexing1D/1.Strided (1 ms)
[ RUN      ] Indexing1D/1.StridedReverse
[       OK ] Indexing1D/1.StridedReverse (0 ms)
[ RUN      ] Indexing1D/1.Span
[       OK ] Indexing1D/1.Span (0 ms)
[----------] 5 tests from Indexing1D/1 (1 ms total)

[----------] 5 tests from Indexing1D/2, where TypeParam = int
[ RUN      ] Indexing1D/2.Continious
[       OK ] Indexing1D/2.Continious (0 ms)
[ RUN      ] Indexing1D/2.ContiniousReverse
[       OK ] Indexing1D/2.ContiniousReverse (0 ms)
[ RUN      ] Indexing1D/2.Strided
[       OK ] Indexing1D/2.Strided (0 ms)
[ RUN      ] Indexing1D/2.StridedReverse
[       OK ] Indexing1D/2.StridedReverse (0 ms)
[ RUN      ] Indexing1D/2.Span
[       OK ] Indexing1D/2.Span (0 ms)
[----------] 5 tests from Indexing1D/2 (0 ms total)

[----------] 5 tests from Indexing1D/3, where TypeParam = unsigned int
[ RUN      ] Indexing1D/3.Continious
[       OK ] Indexing1D/3.Continious (0 ms)
[ RUN      ] Indexing1D/3.ContiniousReverse
[       OK ] Indexing1D/3.ContiniousReverse (1 ms)
[ RUN      ] Indexing1D/3.Strided
[       OK ] Indexing1D/3.Strided (0 ms)
[ RUN      ] Indexing1D/3.StridedReverse
[       OK ] Indexing1D/3.StridedReverse (0 ms)
[ RUN      ] Indexing1D/3.Span
[       OK ] Indexing1D/3.Span (0 ms)
[----------] 5 tests from Indexing1D/3 (1 ms total)

[----------] 5 tests from Indexing1D/4, where TypeParam = char
[ RUN      ] Indexing1D/4.Continious
[       OK ] Indexing1D/4.Continious (0 ms)
[ RUN      ] Indexing1D/4.ContiniousReverse
[       OK ] Indexing1D/4.ContiniousReverse (0 ms)
[ RUN      ] Indexing1D/4.Strided
[       OK ] Indexing1D/4.Strided (0 ms)
[ RUN      ] Indexing1D/4.StridedReverse
[       OK ] Indexing1D/4.StridedReverse (0 ms)
[ RUN      ] Indexing1D/4.Span
[       OK ] Indexing1D/4.Span (0 ms)
[----------] 5 tests from Indexing1D/4 (0 ms total)

[----------] 5 tests from Indexing1D/5, where TypeParam = unsigned char
[ RUN      ] Indexing1D/5.Continious
[       OK ] Indexing1D/5.Continious (0 ms)
[ RUN      ] Indexing1D/5.ContiniousReverse
[       OK ] Indexing1D/5.ContiniousReverse (1 ms)
[ RUN      ] Indexing1D/5.Strided
[       OK ] Indexing1D/5.Strided (0 ms)
[ RUN      ] Indexing1D/5.StridedReverse
[       OK ] Indexing1D/5.StridedReverse (0 ms)
[ RUN      ] Indexing1D/5.Span
[       OK ] Indexing1D/5.Span (0 ms)
[----------] 5 tests from Indexing1D/5 (1 ms total)

[----------] 18 tests from Indexing2D/0, where TypeParam = float
[ RUN      ] Indexing2D/0.ColumnContinious
[       OK ] Indexing2D/0.ColumnContinious (0 ms)
[ RUN      ] Indexing2D/0.ColumnContiniousReverse
[       OK ] Indexing2D/0.ColumnContiniousReverse (0 ms)
[ RUN      ] Indexing2D/0.ColumnStrided
[       OK ] Indexing2D/0.ColumnStrided (1 ms)
[ RUN      ] Indexing2D/0.ColumnStridedReverse
[       OK ] Indexing2D/0.ColumnStridedReverse (0 ms)
[ RUN      ] Indexing2D/0.RowContinious
[       OK ] Indexing2D/0.RowContinious (0 ms)
[ RUN      ] Indexing2D/0.RowContiniousReverse
[       OK ] Indexing2D/0.RowContiniousReverse (0 ms)
[ RUN      ] Indexing2D/0.RowStrided
[       OK ] Indexing2D/0.RowStrided (0 ms)
[ RUN      ] Indexing2D/0.RowStridedReverse
[       OK ] Indexing2D/0.RowStridedReverse (1 ms)
[ RUN      ] Indexing2D/0.ContiniousContinious
[       OK ] Indexing2D/0.ContiniousContinious (0 ms)
[ RUN      ] Indexing2D/0.ContiniousReverse
[       OK ] Indexing2D/0.ContiniousReverse (0 ms)
[ RUN      ] Indexing2D/0.ContiniousStrided
[       OK ] Indexing2D/0.ContiniousStrided (0 ms)
[ RUN      ] Indexing2D/0.ContiniousStridedReverse
[       OK ] Indexing2D/0.ContiniousStridedReverse (0 ms)
[ RUN      ] Indexing2D/0.ReverseContinious
[       OK ] Indexing2D/0.ReverseContinious (0 ms)
[ RUN      ] Indexing2D/0.ReverseReverse
[       OK ] Indexing2D/0.ReverseReverse (1 ms)
[ RUN      ] Indexing2D/0.ReverseStrided
[       OK ] Indexing2D/0.ReverseStrided (0 ms)
[ RUN      ] Indexing2D/0.ReverseStridedReverse
[       OK ] Indexing2D/0.ReverseStridedReverse (0 ms)
[ RUN      ] Indexing2D/0.StridedContinious
[       OK ] Indexing2D/0.StridedContinious (0 ms)
[ RUN      ] Indexing2D/0.StridedStrided
[       OK ] Indexing2D/0.StridedStrided (1 ms)
[----------] 18 tests from Indexing2D/0 (4 ms total)

[----------] 18 tests from Indexing2D/1, where TypeParam = double
[ RUN      ] Indexing2D/1.ColumnContinious
[       OK ] Indexing2D/1.ColumnContinious (0 ms)
[ RUN      ] Indexing2D/1.ColumnContiniousReverse
[       OK ] Indexing2D/1.ColumnContiniousReverse (0 ms)
[ RUN      ] Indexing2D/1.ColumnStrided
[       OK ] Indexing2D/1.ColumnStrided (0 ms)
[ RUN      ] Indexing2D/1.ColumnStridedReverse
[       OK ] Indexing2D/1.ColumnStridedReverse (0 ms)
[ RUN      ] Indexing2D/1.RowContinious
[       OK ] Indexing2D/1.RowContinious (0 ms)
[ RUN      ] Indexing2D/1.RowContiniousReverse
[       OK ] Indexing2D/1.RowContiniousReverse (1 ms)
[ RUN      ] Indexing2D/1.RowStrided
[       OK ] Indexing2D/1.RowStrided (0 ms)
[ RUN      ] Indexing2D/1.RowStridedReverse
[       OK ] Indexing2D/1.RowStridedReverse (0 ms)
[ RUN      ] Indexing2D/1.ContiniousContinious
[       OK ] Indexing2D/1.ContiniousContinious (0 ms)
[ RUN      ] Indexing2D/1.ContiniousReverse
[       OK ] Indexing2D/1.ContiniousReverse (0 ms)
[ RUN      ] Indexing2D/1.ContiniousStrided
[       OK ] Indexing2D/1.ContiniousStrided (1 ms)
[ RUN      ] Indexing2D/1.ContiniousStridedReverse
[       OK ] Indexing2D/1.ContiniousStridedReverse (0 ms)
[ RUN      ] Indexing2D/1.ReverseContinious
[       OK ] Indexing2D/1.ReverseContinious (0 ms)
[ RUN      ] Indexing2D/1.ReverseReverse
[       OK ] Indexing2D/1.ReverseReverse (0 ms)
[ RUN      ] Indexing2D/1.ReverseStrided
[       OK ] Indexing2D/1.ReverseStrided (0 ms)
[ RUN      ] Indexing2D/1.ReverseStridedReverse
[       OK ] Indexing2D/1.ReverseStridedReverse (0 ms)
[ RUN      ] Indexing2D/1.StridedContinious
[       OK ] Indexing2D/1.StridedContinious (1 ms)
[ RUN      ] Indexing2D/1.StridedStrided
[       OK ] Indexing2D/1.StridedStrided (0 ms)
[----------] 18 tests from Indexing2D/1 (3 ms total)

[----------] 18 tests from Indexing2D/2, where TypeParam = int
[ RUN      ] Indexing2D/2.ColumnContinious
[       OK ] Indexing2D/2.ColumnContinious (0 ms)
[ RUN      ] Indexing2D/2.ColumnContiniousReverse
[       OK ] Indexing2D/2.ColumnContiniousReverse (0 ms)
[ RUN      ] Indexing2D/2.ColumnStrided
[       OK ] Indexing2D/2.ColumnStrided (1 ms)
[ RUN      ] Indexing2D/2.ColumnStridedReverse
[       OK ] Indexing2D/2.ColumnStridedReverse (0 ms)
[ RUN      ] Indexing2D/2.RowContinious
[       OK ] Indexing2D/2.RowContinious (0 ms)
[ RUN      ] Indexing2D/2.RowContiniousReverse
[       OK ] Indexing2D/2.RowContiniousReverse (0 ms)
[ RUN      ] Indexing2D/2.RowStrided
[       OK ] Indexing2D/2.RowStrided (0 ms)
[ RUN      ] Indexing2D/2.RowStridedReverse
[       OK ] Indexing2D/2.RowStridedReverse (0 ms)
[ RUN      ] Indexing2D/2.ContiniousContinious
[       OK ] Indexing2D/2.ContiniousContinious (1 ms)
[ RUN      ] Indexing2D/2.ContiniousReverse
[       OK ] Indexing2D/2.ContiniousReverse (0 ms)
[ RUN      ] Indexing2D/2.ContiniousStrided
[       OK ] Indexing2D/2.ContiniousStrided (0 ms)
[ RUN      ] Indexing2D/2.ContiniousStridedReverse
[       OK ] Indexing2D/2.ContiniousStridedReverse (0 ms)
[ RUN      ] Indexing2D/2.ReverseContinious
[       OK ] Indexing2D/2.ReverseContinious (0 ms)
[ RUN      ] Indexing2D/2.ReverseReverse
[       OK ] Indexing2D/2.ReverseReverse (1 ms)
[ RUN      ] Indexing2D/2.ReverseStrided
[       OK ] Indexing2D/2.ReverseStrided (0 ms)
[ RUN      ] Indexing2D/2.ReverseStridedReverse
[       OK ] Indexing2D/2.ReverseStridedReverse (0 ms)
[ RUN      ] Indexing2D/2.StridedContinious
[       OK ] Indexing2D/2.StridedContinious (0 ms)
[ RUN      ] Indexing2D/2.StridedStrided
[       OK ] Indexing2D/2.StridedStrided (0 ms)
[----------] 18 tests from Indexing2D/2 (3 ms total)

[----------] 18 tests from Indexing2D/3, where TypeParam = unsigned int
[ RUN      ] Indexing2D/3.ColumnContinious
[       OK ] Indexing2D/3.ColumnContinious (1 ms)
[ RUN      ] Indexing2D/3.ColumnContiniousReverse
[       OK ] Indexing2D/3.ColumnContiniousReverse (0 ms)
[ RUN      ] Indexing2D/3.ColumnStrided
[       OK ] Indexing2D/3.ColumnStrided (0 ms)
[ RUN      ] Indexing2D/3.ColumnStridedReverse
[       OK ] Indexing2D/3.ColumnStridedReverse (0 ms)
[ RUN      ] Indexing2D/3.RowContinious
[       OK ] Indexing2D/3.RowContinious (0 ms)
[ RUN      ] Indexing2D/3.RowContiniousReverse
[       OK ] Indexing2D/3.RowContiniousReverse (1 ms)
[ RUN      ] Indexing2D/3.RowStrided
[       OK ] Indexing2D/3.RowStrided (0 ms)
[ RUN      ] Indexing2D/3.RowStridedReverse
[       OK ] Indexing2D/3.RowStridedReverse (0 ms)
[ RUN      ] Indexing2D/3.ContiniousContinious
[       OK ] Indexing2D/3.ContiniousContinious (0 ms)
[ RUN      ] Indexing2D/3.ContiniousReverse
[       OK ] Indexing2D/3.ContiniousReverse (0 ms)
[ RUN      ] Indexing2D/3.ContiniousStrided
[       OK ] Indexing2D/3.ContiniousStrided (0 ms)
[ RUN      ] Indexing2D/3.ContiniousStridedReverse
[       OK ] Indexing2D/3.ContiniousStridedReverse (0 ms)
[ RUN      ] Indexing2D/3.ReverseContinious
[       OK ] Indexing2D/3.ReverseContinious (0 ms)
[ RUN      ] Indexing2D/3.ReverseReverse
[       OK ] Indexing2D/3.ReverseReverse (0 ms)
[ RUN      ] Indexing2D/3.ReverseStrided
[       OK ] Indexing2D/3.ReverseStrided (0 ms)
[ RUN      ] Indexing2D/3.ReverseStridedReverse
[       OK ] Indexing2D/3.ReverseStridedReverse (0 ms)
[ RUN      ] Indexing2D/3.StridedContinious
[       OK ] Indexing2D/3.StridedContinious (1 ms)
[ RUN      ] Indexing2D/3.StridedStrided
[       OK ] Indexing2D/3.StridedStrided (0 ms)
[----------] 18 tests from Indexing2D/3 (4 ms total)

[----------] 18 tests from Indexing2D/4, where TypeParam = char
[ RUN      ] Indexing2D/4.ColumnContinious
[       OK ] Indexing2D/4.ColumnContinious (0 ms)
[ RUN      ] Indexing2D/4.ColumnContiniousReverse
[       OK ] Indexing2D/4.ColumnContiniousReverse (0 ms)
[ RUN      ] Indexing2D/4.ColumnStrided
[       OK ] Indexing2D/4.ColumnStrided (0 ms)
[ RUN      ] Indexing2D/4.ColumnStridedReverse
[       OK ] Indexing2D/4.ColumnStridedReverse (0 ms)
[ RUN      ] Indexing2D/4.RowContinious
[       OK ] Indexing2D/4.RowContinious (0 ms)
[ RUN      ] Indexing2D/4.RowContiniousReverse
[       OK ] Indexing2D/4.RowContiniousReverse (0 ms)
[ RUN      ] Indexing2D/4.RowStrided
[       OK ] Indexing2D/4.RowStrided (0 ms)
[ RUN      ] Indexing2D/4.RowStridedReverse
[       OK ] Indexing2D/4.RowStridedReverse (0 ms)
[ RUN      ] Indexing2D/4.ContiniousContinious
[       OK ] Indexing2D/4.ContiniousContinious (1 ms)
[ RUN      ] Indexing2D/4.ContiniousReverse
[       OK ] Indexing2D/4.ContiniousReverse (0 ms)
[ RUN      ] Indexing2D/4.ContiniousStrided
[       OK ] Indexing2D/4.ContiniousStrided (0 ms)
[ RUN      ] Indexing2D/4.ContiniousStridedReverse
[       OK ] Indexing2D/4.ContiniousStridedReverse (0 ms)
[ RUN      ] Indexing2D/4.ReverseContinious
[       OK ] Indexing2D/4.ReverseContinious (0 ms)
[ RUN      ] Indexing2D/4.ReverseReverse
[       OK ] Indexing2D/4.ReverseReverse (0 ms)
[ RUN      ] Indexing2D/4.ReverseStrided
[       OK ] Indexing2D/4.ReverseStrided (0 ms)
[ RUN      ] Indexing2D/4.ReverseStridedReverse
[       OK ] Indexing2D/4.ReverseStridedReverse (0 ms)
[ RUN      ] Indexing2D/4.StridedContinious
[       OK ] Indexing2D/4.StridedContinious (0 ms)
[ RUN      ] Indexing2D/4.StridedStrided
[       OK ] Indexing2D/4.StridedStrided (0 ms)
[----------] 18 tests from Indexing2D/4 (3 ms total)

[----------] 18 tests from Indexing2D/5, where TypeParam = unsigned char
[ RUN      ] Indexing2D/5.ColumnContinious
[       OK ] Indexing2D/5.ColumnContinious (1 ms)
[ RUN      ] Indexing2D/5.ColumnContiniousReverse
[       OK ] Indexing2D/5.ColumnContiniousReverse (0 ms)
[ RUN      ] Indexing2D/5.ColumnStrided
[       OK ] Indexing2D/5.ColumnStrided (0 ms)
[ RUN      ] Indexing2D/5.ColumnStridedReverse
[       OK ] Indexing2D/5.ColumnStridedReverse (0 ms)
[ RUN      ] Indexing2D/5.RowContinious
[       OK ] Indexing2D/5.RowContinious (0 ms)
[ RUN      ] Indexing2D/5.RowContiniousReverse
[       OK ] Indexing2D/5.RowContiniousReverse (0 ms)
[ RUN      ] Indexing2D/5.RowStrided
[       OK ] Indexing2D/5.RowStrided (1 ms)
[ RUN      ] Indexing2D/5.RowStridedReverse
[       OK ] Indexing2D/5.RowStridedReverse (0 ms)
[ RUN      ] Indexing2D/5.ContiniousContinious
[       OK ] Indexing2D/5.ContiniousContinious (0 ms)
[ RUN      ] Indexing2D/5.ContiniousReverse
[       OK ] Indexing2D/5.ContiniousReverse (0 ms)
[ RUN      ] Indexing2D/5.ContiniousStrided
[       OK ] Indexing2D/5.ContiniousStrided (0 ms)
[ RUN      ] Indexing2D/5.ContiniousStridedReverse
[       OK ] Indexing2D/5.ContiniousStridedReverse (1 ms)
[ RUN      ] Indexing2D/5.ReverseContinious
[       OK ] Indexing2D/5.ReverseContinious (0 ms)
[ RUN      ] Indexing2D/5.ReverseReverse
[       OK ] Indexing2D/5.ReverseReverse (0 ms)
[ RUN      ] Indexing2D/5.ReverseStrided
[       OK ] Indexing2D/5.ReverseStrided (0 ms)
[ RUN      ] Indexing2D/5.ReverseStridedReverse
[       OK ] Indexing2D/5.ReverseStridedReverse (0 ms)
[ RUN      ] Indexing2D/5.StridedContinious
[       OK ] Indexing2D/5.StridedContinious (0 ms)
[ RUN      ] Indexing2D/5.StridedStrided
[       OK ] Indexing2D/5.StridedStrided (1 ms)
[----------] 18 tests from Indexing2D/5 (4 ms total)

[----------] 7 tests from Indexing/0, where TypeParam = float
[ RUN      ] Indexing/0.4D_to_4D
/home/SrM/ArrayFire/arrayfire/test/testHelpers.hpp:69: Failure
Failed
TEST FILE NOT FOUND


      Start 61: Test_medfilt_cuda
61/68 Test #61: Test_medfilt_cuda ................   Passed    0.18 sec
      Start 62: Test_fft_cuda
62/68 Test #62: Test_fft_cuda ....................   Passed    0.30 sec
      Start 63: Test_tile_cuda
63/68 Test #63: Test_tile_cuda ...................   Passed   12.31 sec
      Start 64: Test_approx1_cuda
64/68 Test #64: Test_approx1_cuda ................   Passed    0.42 sec
      Start 65: Test_set_cuda
65/68 Test #65: Test_set_cuda ....................   Passed    0.19 sec
      Start 66: Test_imageio_cuda
66/68 Test #66: Test_imageio_cuda ................   Passed    0.22 sec

      Start 67: Test_diff1_cuda
67/68 Test #67: Test_diff1_cuda ..................   Passed    0.19 sec
      Start 68: Test_rotate_cuda
68/68 Test #68: Test_rotate_cuda .................   Passed    6.26 sec

96% tests passed, 3 tests failed out of 68

Total Test time (real) = 149.03 sec

The following tests FAILED:
     26 - Test_index_cpu (SEGFAULT)
     45 - Test_reduce_cuda (Failed)
     60 - Test_index_cuda (SEGFAULT)
Errors while running CTest
make: *** [test] Error 8
==================================


As you can see, it seems that the updates produced some other changes (apparently introducing a couple of new errors).

Cheers!

Shehzan Mohammed

Nov 15, 2014, 11:12:30 PM11/15/14
to arrayfi...@googlegroups.com
Thanks for that info.

For the index tests, we added some data, so go ahead and run

git submodule update
git submodule foreach git pull origin master

We are working actively on the other issues.

-Shehzan
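
A minimal sketch, assuming the CTest output was saved to a file: after updating the submodules, it may be quicker to rerun only the tests named under "The following tests FAILED:" instead of the whole suite. The helper name `failed_tests` and the file name `ctest.log` are hypothetical; `ctest -R` and `--output-on-failure` are standard CTest options.

```shell
#!/bin/sh
# Print the names of the tests listed under "The following tests FAILED:"
# in CTest output read from stdin, one name per line.
failed_tests() {
    awk '
        /The following tests FAILED:/ { in_list = 1; next }
        # Summary lines look like: "     45 - Test_reduce_cuda (Failed)"
        in_list && $2 == "-" { print $3 }
    '
}

# Hypothetical usage, run from the build directory:
#   ctest 2>&1 | tee ctest.log
#   ctest --output-on-failure -R "$(failed_tests < ctest.log | paste -sd'|' -)"
```

The names are joined with `|` so that `-R` receives a single regular expression matching every failed test.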

srmuniz

Nov 15, 2014, 11:19:57 PM11/15/14
to arrayfi...@googlegroups.com, srmun...@gmail.com
Hi Pavan.

After following your suggestions (below), building ArrayFire with OpenCL gets much further, but it still does not work. Some files build, but towards the end linking fails with similar errors for basically all of the OpenCL test executables:

...
[ 94%] [ 94%] [ 94%] Building CXX object test/CMakeFiles/approx1_opencl.dir/approx1.cpp.o
Building CXX object test/CMakeFiles/bilateral_opencl.dir/bilateral.cpp.o
Building CXX object test/CMakeFiles/convolve_opencl.dir/convolve.cpp.o
Building CXX object test/CMakeFiles/diff1_opencl.dir/diff1.cpp.o
Building CXX object test/CMakeFiles/approx2_opencl.dir/approx2.cpp.o
Linking CXX executable basic_opencl

/home/SrM/ArrayFire/clBLAS/build/package/lib64/libclBLAS.so: undefined reference to `clRetainContext@OPENCL_1.0'
...
...
/home/SrM/ArrayFire/clBLAS/build/package/lib64/libclBLAS.so: undefined reference to `clCreateCommandQueue@OPENCL_1.0'
collect2: error: ld returned 1 exit status
make[2]: *** [test/basic_opencl] Error 1
make[1]: *** [test/CMakeFiles/basic_opencl.dir/all] Error 2

make[1]: *** Waiting for unfinished jobs....
Linking CXX executable approx1_opencl

/home/SrM/ArrayFire/clBLAS/build/package/lib64/libclBLAS.so: undefined reference to `clRetainContext@OPENCL_1.0'
...
...
collect2: error: ld returned 1 exit status
make[2]: *** [test/approx2_opencl] Error 1
make[1]: *** [test/CMakeFiles/approx2_opencl.dir/all] Error 2
...

collect2: error: ld returned 1 exit status
make[2]: *** [test/assign_opencl] Error 1
make[1]: *** [test/CMakeFiles/assign_opencl.dir/all] Error 2
...

collect2: error: ld returned 1 exit status
make[2]: *** [test/diff1_opencl] Error 1
make[1]: *** [test/CMakeFiles/diff1_opencl.dir/all] Error 2

...
/home/SrM/ArrayFire/clBLAS/build/package/lib64/libclBLAS.so: undefined reference to `clCreateCommandQueue@OPENCL_1.0'
collect2: error: ld returned 1 exit status
make[2]: *** [test/convolve_opencl] Error 1
make[1]: *** [test/CMakeFiles/convolve_opencl.dir/all] Error 2

make: *** [all] Error 2
==================================


Any other ideas??

srmuniz

Nov 15, 2014, 11:40:06 PM
to arrayfi...@googlegroups.com
Yep, that last update fixed the other two (new) errors. Now only test 45 (Test_reduce_cuda) fails, as before...

In any case, thank you very much (to you and Pavan) for the incredibly fast replies and updates!!
You guys are amazing.

Thanks a lot,
Sergio.

Shehzan Mohammed

Nov 16, 2014, 12:51:51 AM
to arrayfi...@googlegroups.com
Gopalkr, Sergio,

Can both of you print out the number of threads and blocks when transpose is called?
Add the print statements to src/backend/cuda/kernel/transpose.hpp before calling the kernel (line 93).

printf("threads %d %d\n", threads.x, threads.y);
printf("blocks %d %d\n", blocks.x, blocks.y);

The errors you are both seeing are related to invalid launch sizes, so this information would help.

Shehzan



srmuniz

Nov 16, 2014, 11:34:49 AM
to arrayfi...@googlegroups.com
Shehzan:

Here is the output:


ArrayFire v3.0.200 (CUDA, 64-bit Linux, build 89efa49)
Platform: CUDA Toolkit 6.5, Driver: 340.29
[0] GeForce GTX 590, 1536 MB, CUDA Compute 2.0
-1- GeForce GTX 590, 1536 MB, CUDA Compute 2.0
Create a 5-by-3 matrix of random floats on the GPU
A threads 32 8
blocks 1 1
Internal error in .../ArrayFire/arrayfire/src/backend/cuda/kernel/transpose.hpp(102):

CUDA Error (9): invalid configuration argument

Segmentation fault (core dumped)
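
The launch sizes printed above (threads 32 8, blocks 1 1) can be compared against the basic per-launch limits of the device. What follows is only a minimal sketch under stated assumptions: `Dim3`, `LaunchLimits`, `kComputeCapability20` and `isValidLaunch` are hypothetical names, not ArrayFire or CUDA API, and the limit values are the documented ones for compute capability 2.0 (on a real system they would be read from `cudaGetDeviceProperties`). It does not model shared memory or register limits, which can also make a launch fail.

```cpp
// Hypothetical stand-in for CUDA's dim3, so the sketch builds without the CUDA toolkit.
struct Dim3 {
    unsigned x, y, z;
    Dim3(unsigned x_ = 1, unsigned y_ = 1, unsigned z_ = 1) : x(x_), y(y_), z(z_) {}
};

// Hypothetical container for the basic launch limits of one device.
struct LaunchLimits {
    unsigned maxThreadsPerBlock;
    Dim3 maxBlockDim;
    Dim3 maxGridDim;
};

// Documented limits for compute capability 2.0 (assumed here, not queried).
static const LaunchLimits kComputeCapability20 = {
    1024, Dim3(1024, 1024, 64), Dim3(65535, 65535, 65535)};

// Returns true when the configuration passes the basic dimension checks.
bool isValidLaunch(const Dim3& threads, const Dim3& blocks, const LaunchLimits& lim) {
    // A zero in any dimension is never a valid launch.
    if (threads.x == 0 || threads.y == 0 || threads.z == 0) return false;
    if (blocks.x == 0 || blocks.y == 0 || blocks.z == 0) return false;

    // Each block dimension has its own upper bound.
    if (threads.x > lim.maxBlockDim.x || threads.y > lim.maxBlockDim.y ||
        threads.z > lim.maxBlockDim.z) return false;

    // Each grid dimension has its own upper bound.
    if (blocks.x > lim.maxGridDim.x || blocks.y > lim.maxGridDim.y ||
        blocks.z > lim.maxGridDim.z) return false;

    // The product of the block dimensions is bounded as well.
    unsigned long long total = 1ULL * threads.x * threads.y * threads.z;
    return total <= lim.maxThreadsPerBlock;
}
```

Under these assumptions, `isValidLaunch(Dim3(32, 8), Dim3(1, 1), kComputeCapability20)` passes, since 32 * 8 = 256 threads is well within 1024. That suggests the dimensions printed here are not, by themselves, what triggers the error.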



Shehzan Mohammed

Nov 16, 2014, 12:18:31 PM
to arrayfi...@googlegroups.com
Thanks for that.

Can you call af::sync() between the A = randu() call and af_print()?
This will help check whether the failure is happening before transpose and is only showing up there.
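
The reasoning behind this request can be illustrated with a toy model. This is a sketch only, not the CUDA runtime and not ArrayFire code: `ToyRuntime` and `whereErrorSurfaces` are hypothetical names. The model imitates one documented behaviour of `cudaGetLastError()`, namely that it returns the last recorded error and then resets it, so an unchecked failure in an earlier launch gets reported at whichever later check runs first.

```cpp
#include <string>

// Toy model only: NOT the CUDA runtime.
class ToyRuntime {
public:
    // Pretend to launch a kernel. A bad launch records error code 9
    // ("invalid configuration argument") but does not stop the program.
    void launch(bool configIsValid) {
        if (!configIsValid) lastError_ = 9;
    }

    // Return the pending error code (0 means success) and reset it.
    int getLastError() {
        int err = lastError_;
        lastError_ = 0;
        return err;
    }

private:
    int lastError_ = 0;
};

// Reports at which check site the error becomes visible.
std::string whereErrorSurfaces(bool checkAfterEveryLaunch) {
    ToyRuntime rt;

    rt.launch(false);  // hypothetical: the first kernel has a bad configuration
    if (checkAfterEveryLaunch && rt.getLastError() != 0) {
        return "after first kernel";
    }

    rt.launch(true);   // the second kernel itself is fine
    if (rt.getLastError() != 0) {
        return "after second kernel";  // the error is blamed on the wrong site
    }
    return "no error";
}
```

In this model, checking only after the second launch reports the failure "after second kernel", while checking after every launch reports it "after first kernel". A synchronization and error check placed right after the first operation serves the same purpose: it narrows down which call actually failed.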

srmuniz

Nov 17, 2014, 8:36:03 AM
to arrayfi...@googlegroups.com
Still gets exactly the same output...

However, I noticed that if I comment out all the af_print() commands, the program gets a few steps further (see below), up to the line
B(seq(0, 2), 1) = B(seq(0, 2), 1) * -1;
where it throws an error at memcopy, instead of transpose.
This seems to indicate that the arrays A and B are created okay, and A's values can even be accessed for certain operations (as in sin(A)), but one gets an error when launching kernels with copy operations. What do you think? Any new clue here, or is it something you were expecting before?
Let me know if there is something else I could try here.


ArrayFire v3.0.200 (CUDA, 64-bit Linux, build 89efa49)
Platform: CUDA Toolkit 6.5, Driver: 340.29
[0] GeForce GTX 590, 1536 MB, CUDA Compute 2.0
-1- GeForce GTX 590, 1536 MB, CUDA Compute 2.0
Create a 5-by-3 matrix of random floats on the GPU
Element-wise arithmetic
Negate the first three elements of second column
Internal error in .../ArrayFire/arrayfire/src/backend/cuda/kernel/memcopy.hpp(215):

CUDA Error (9): invalid configuration argument

.../ArrayFire/arrayfire/src/array/array.cpp:335: AF_ERROR 1

terminate called after throwing an instance of 'af::exception'
Aborted (core dumped)


Cheers,
[S]

Shehzan Mohammed

Nov 17, 2014, 11:31:12 AM
to arrayfi...@googlegroups.com
Can you post the output you got when you had af::sync() added in the code?
We made another bunch of bug fixes yesterday to the devel branch. Do you mind testing that? Also, when configuring cmake, set CMAKE_BUILD_TYPE to "Debug".

Which distribution of Linux are you running this on?

I appreciate your patience with us. We have refactored the code from older versions, and we rely on users like yourself to report issues beyond our testing and help us solve them.

Shehzan

srmuniz

Nov 17, 2014, 1:41:03 PM
to arrayfi...@googlegroups.com
Shehzan:

The output with af::sync() was, as I said, exactly the same as before that change:
==========================================

ArrayFire v3.0.200 (CUDA, 64-bit Linux, build 89efa49)
Platform: CUDA Toolkit 6.5, Driver: 340.29
[0] GeForce GTX 590, 1536 MB, CUDA Compute 2.0
-1- GeForce GTX 590, 1536 MB, CUDA Compute 2.0
Create a 5-by-3 matrix of random floats on the GPU
A threads 32 8
blocks 1 1
Internal error in .../ArrayFire/arrayfire/src/backend/cuda/kernel/transpose.hpp(102):
CUDA Error (9): invalid configuration argument

Segmentation fault (core dumped)
=======================================

The output in my last message was obtained by keeping af::sync() (where you suggested) and, in addition, commenting out all the af_print() commands below it.

I'm running Ubuntu 14.04.1 LTS (GNU/Linux 3.13.0-36-generic x86_64).

As for the repo updates, I have Good News!
After a simple "git pull" and changing the cmake call (setting the build type to "Debug"), helloworld_cuda finally works, i.e. without the previous errors!
Way to go!!

==========================================

ArrayFire v3.0.200 (CUDA, 64-bit Linux, build 89efa49)
Platform: CUDA Toolkit 6.5, Driver: 340.29
[0] GeForce GTX 590, 1536 MB, CUDA Compute 2.0
-1- GeForce GTX 590, 1536 MB, CUDA Compute 2.0
Create a 5-by-3 matrix of random floats on the GPU
A [5 3 1 1]
   Offsets: [0 0 0 0]
   Strides: [1 5 15 15]
    0.7463     0.8253     0.3343
    0.3231     0.1930     0.9372
    0.3211     0.2078     0.1938
    0.6949     0.2090     0.3660
    0.1895     0.7009     0.9071

Element-wise arithmetic
B [5 3 1 1]

   Offsets: [0 0 0 0]
   Strides: [1 5 15 15]
    2.1789     2.2347     1.8281
    1.8176     1.6918     2.3059
    1.8156     1.7063     1.6926
    2.1403     1.7075     1.8579
    1.6883     2.1449     2.2877


Negate the first three elements of second column
B [5 3 1 1]

   Offsets: [0 0 0 0]
   Strides: [1 5 15 15]
    2.1789    -2.2347     1.8281
    1.8176    -1.6918     2.3059
    1.8156    -1.7063     1.6926
    2.1403     1.7075     1.8579
    1.6883     2.1449     2.2877

Fourier transform the result
C [5 3 1 1]

   Offsets: [0 0 0 0]
   Strides: [1 5 15 15]
    9.6408 0.0000i    -1.7805 0.0000i     9.9722 0.0000i
    0.0619 0.0680i    -2.0957 5.6556i     0.3752 0.0798i
    0.5651 -0.3847i    -2.6009 -0.9916i    -0.7911 -0.1679i
    0.5651 0.3847i    -2.6009 0.9916i    -0.7911 0.1679i
    0.0619 -0.0680i    -2.0957 -5.6556i     0.3752 -0.0798i

Grab last row
c [1 3 1 1]
   Offsets: [4 0 0 0]

   Strides: [1 5 15 15]
    0.0619 -0.0680i    -2.0957 -5.6556i     0.3752 -0.0798i
Create 2-by-3 matrix from host data
D [2 3 1 1]

   Offsets: [0 0 0 0]
   Strides: [1 2 6 6]
    1.0000     3.0000     5.0000
    2.0000     4.0000     6.0000

Copy last column onto first
D [2 3 1 1]

   Offsets: [0 0 0 0]
   Strides: [1 2 6 6]
    5.0000     3.0000     5.0000
    6.0000     4.0000     6.0000

Sort A and print sorted array and corresponding indices
vals [5 3 1 1]

   Offsets: [0 0 0 0]
   Strides: [1 5 15 15]
    0.1895     0.1930     0.1938
    0.3211     0.2078     0.3343
    0.3231     0.2090     0.3660
    0.6949     0.7009     0.9071
    0.7463     0.8253     0.9372

inds [5 3 1 1]

   Offsets: [0 0 0 0]
   Strides: [1 5 15 15]
         4          1          2
         2          2          0
         1          3          3
         3          4          4
         0          0          1
==========================================


However, the error in test 45 (Test_reduce_cuda) still persists when running ctest...
Also, the ctests now take much longer to complete (about twice as long, mostly due to CPU).
In any case, Progress!! ;o)

Would you mind letting us know what was the problem?

Thanks,
Sérgio.

srmuniz

Nov 17, 2014, 3:21:50 PM
to arrayfi...@googlegroups.com
Ok, not so fast...
Here is another piece of information that you may find useful, or at least amusing...
I just rebuilt it, setting CMAKE_BUILD_TYPE back to "Release" instead of "Debug". Guess what happens?
...Yeah, the error is back(!):


ArrayFire v3.0.200 (CUDA, 64-bit Linux, build 89efa49)
Platform: CUDA Toolkit 6.5, Driver: 340.29
[0] GeForce GTX 590, 1536 MB, CUDA Compute 2.0
-1- GeForce GTX 590, 1536 MB, CUDA Compute 2.0
Create a 5-by-3 matrix of random floats on the GPU
A Internal error in .../ArrayFire/arrayfire/src/backend/cuda/kernel/transpose.hpp(102):

CUDA Error (9): invalid configuration argument

If I go back to "Debug", doing
cmake .. -DCMAKE_BUILD_TYPE=Debug -DBUILD_CUDA=ON -DBUILD_OPENCL=OFF
it works again... (as in my last message)!

So, why do you think this happens?
What is the main difference between the two?

Cheers,
Sérgio.

Pradeep Garigipati

Nov 17, 2014, 3:36:48 PM
to arrayfi...@googlegroups.com
Hey Sergio,

This is Pradeep from ArrayFire. In principle, there should be no difference between the two. My first guess is that some uninitialized variables are being taken care of by the debug build, whereas the release build bails out because no such checks are carried out. We will look into it and let you know as soon as we fix it.

Thank you for updating us with your new results.

Pradeep.
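
The guess above is not confirmed anywhere in this thread. Purely as an illustration of that class of bug, here is a minimal sketch assuming a hypothetical launch-parameter struct (`RawParams`, `SafeParams` and `threadsPerBlock` are made-up names, not ArrayFire code). A struct without initializers can hold indeterminate values, and reading them is undefined behaviour that may happen to work in one build type and fail in another.

```cpp
// Hypothetical parameter struct with no initializers.
// Declaring `RawParams p;` as a local leaves both fields indeterminate;
// reading them is undefined behaviour, so Debug and Release builds
// are free to behave differently.
struct RawParams {
    unsigned threadsX;
    unsigned threadsY;
};

// Same fields with default member initializers (C++11), so every
// instance starts from known values regardless of build type.
struct SafeParams {
    unsigned threadsX = 32;
    unsigned threadsY = 8;
};

// Threads per block for a fully initialized parameter set.
unsigned threadsPerBlock(const SafeParams& p) {
    return p.threadsX * p.threadsY;
}
```

Note that value-initializing the raw struct with `RawParams r{};` zeroes its fields, which removes the undefined behaviour but yields dimensions of 0, and a kernel launch with a zero dimension is itself an invalid configuration. Explicit, meaningful defaults are the safer choice in a sketch like this.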

gopalakr

Nov 17, 2014, 7:39:52 PM
to arrayfi...@googlegroups.com
OK, so I got the latest repo, compiled for Debug, and it does work. Note, though, that I also changed my CUDA to 6.5 and gcc to 4.9.

I can work with this now.

Thanks 
Gopala

Pradeep Garigipati

Nov 18, 2014, 8:55:19 AM
to arrayfi...@googlegroups.com
Great! Pavan pushed a couple of bug fixes yesterday evening. Are you still getting the error in release mode after pulling the latest changes?

Regards,
Pradeep.

srmuniz

Nov 19, 2014, 10:30:56 AM
to arrayfi...@googlegroups.com
Even with the latest (i.e. today's) updates, I still get the error at helloworld_cuda when using "Release", and the same ("undefined reference") errors while linking libclBLAS when trying to build with OpenCL. However, turning off OpenCL and using "Debug" still does the trick for me.

Shehzan Mohammed

Nov 21, 2014, 11:23:22 AM
to arrayfi...@googlegroups.com
Hi

Can you create another thread for this issue?
This thread has become long and has drifted from the original question.

Thanks
Shehzan