This is the release note of v8.0.0b5. See here for the complete list of solved issues and merged PRs.
CUB is now bundled with CuPy so that everyone can use it out of the box (thanks @leofang!). This release also introduces the CUPY_ACCELERATORS environment variable, a mechanism for enabling acceleration through different libraries. For example, you can enable CUB and cuTENSOR with export CUPY_ACCELERATORS=cub,cutensor.
The new features include an implementation of the SciPy ndimage filters contributed by @coderforlife and the introduction of the cupy_backends library, used to decouple the CUDA ecosystem APIs from CuPy itself.
Currently, cupy_backends is considered an undocumented API and is subject to further refactoring. In the meantime, you can continue to use the cupy.cuda.* APIs.
As announced previously, we dropped support for CUDA 8.0 and 9.1. We are also going to drop support for NumPy 1.15 and SciPy 1.2 or earlier in the upcoming release.
CUB is now bundled in the source tree. As a consequence, gcc-6 or later is required for the CuPy v8 build. If you are building CuPy from source on systems with legacy gcc, follow the instructions below. These steps are not necessary for general users using wheel packages.
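Before following the steps for your distribution, it may help to check whether the default compiler already meets the gcc-6 requirement. The snippet below is a minimal sketch: `gcc_is_new_enough` is a hypothetical helper written for this note, not part of CuPy, and it only compares the major version reported by `gcc -dumpversion`.

```shell
#!/bin/sh
# Hypothetical helper (not part of CuPy): succeeds when the given gcc
# version string has a major version of 6 or greater.
gcc_is_new_enough() {
    major="${1%%.*}"   # keep the text before the first dot: "5.4.0" -> "5"
    [ "$major" -ge 6 ]
}

if command -v gcc >/dev/null 2>&1; then
    version="$(gcc -dumpversion)"
    if gcc_is_new_enough "$version"; then
        echo "gcc $version: new enough to build CuPy v8 from source"
    else
        echo "gcc $version: too old, install gcc-6 or later"
    fi
else
    echo "gcc not found on PATH"
fi
```

If the check reports an old compiler, the distribution-specific steps apply; if you install from wheel packages, none of this is needed.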
### Ubuntu 16
$ sudo add-apt-repository ppa:ubuntu-toolchain-r/test
$ sudo apt-get update
$ sudo apt-get install g++-6
$ export NVCC="nvcc --compiler-bindir gcc-6"
### CentOS 6 and 7
$ sudo yum install centos-release-scl
$ sudo yum install devtoolset-7-gcc-c++
$ source /opt/rh/devtoolset-7/enable
CUB-related environment variables (CUB_PATH, CUB_DISABLED) are no longer effective. To enable CUB, set the environment variable CUPY_ACCELERATORS=cub; this boosts reduction kernels and several functions such as min, max, sum, and scan.
With the introduction of CUPY_ACCELERATORS, cuTENSOR must also be enabled explicitly, by specifying CUPY_ACCELERATORS=cutensor.
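As a rough sketch of the migration described above, a shell setup might look like the following. The script name `my_script.py` in the comment is hypothetical, and unsetting the old variables is optional cleanup rather than a documented requirement.

```shell
# CUB_PATH and CUB_DISABLED are no longer effective; unsetting them is
# optional, but avoids confusion about which settings are in use.
unset CUB_PATH CUB_DISABLED

# Enable both CUB and cuTENSOR for every process started from this shell.
export CUPY_ACCELERATORS=cub,cutensor

# Alternatively, enable an accelerator for a single command only
# (my_script.py is a hypothetical script name):
#   CUPY_ACCELERATORS=cub python my_script.py

echo "CUPY_ACCELERATORS=$CUPY_ACCELERATORS"
```

Because environment variables are inherited by child processes, the variable should be exported before the Python process that imports CuPy is launched.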
- RawModule instance (#3534)
- CHAINER_SEED (#3674)
- sum_duplicate parameter in sparse min/max/argmin/argmax (#3676)
- cupy.fuse (#2734, thanks @xuzijian629!)
- cupy.convolve (#3371, thanks @Dahlia-Chehata!)
- cupy_backends namespace (#3386)
- choose_conv_method (#3464, thanks @Dahlia-Chehata!)
- cupy.poly1d (#3466, thanks @Dahlia-Chehata!)
- cusolverDn<t>syevj and cusolverDn<t>syevjBatched (#3488, thanks @dmargala!)
- ndimage rank-based filters (#3500, thanks @coderforlife!)
- ndimage common linear filters (#3505, thanks @coderforlife!)
- flatiter.__iter__() (#3508)
- has_sorted_indices, has_canonical_format, sort(ed)_indices() for sparse matrices (#3509)
- cupy.correlate (#3525, thanks @Dahlia-Chehata!)
- cupyx.scipy.sparse.kron() (#3528)
- ncclSend / ncclRecv from NCCL 2.7 (#3567)
- cupyx.scipy.fft.next_fast_len (#3571)
- ndimage generic filters (#3614, thanks @coderforlife!)
- cupy.cuda.cub module by default (#2584)
- CUPY_CUB_BLOCK_REDUCTION_DISABLED and CUB_DISABLED (#3461)
- axis=None in sparse min/max (#3515)
- _prepare_mask_indexing_single (#3539)
- compute_30 when CUDA 11 (#3578)
- einsum not to use cuTENSOR when accelerator is not set (#3592)
- CHAINER_SEED (#3674)
- cupy.sum (#2939)
- numpy.ndarray creation in cuTENSOR operation preparation (#3393)
- _ArgInfo init (#3549)
- _fft_convolve (#3560)
- poly1d instantiation (#3563, thanks @Dahlia-Chehata!)
- convolve/correlate (#3587)
- cupy.fft.fftfreq and cupy.fft.rfftfreq (#3653, thanks @grlee77!)
- cupyx.scipy.ndimage.sum taking zero-dimensional input (#3425)
- CUSPARSE_VERSION instead of CUDA_VERSION (#3491)
- min/max to return sparse matrix (#3536)
- ndarray and fix possible error in __del__ at fft (#3543)
- cupy.percentile type assignment in asarray (#3570)
- __name__ to custom kernels (#3626)
- argmin/argmax return shape (#3639)
- cupy.show_config (#3642)
- sum_duplicate parameter in sparse min/max/argmin/argmax (#3676)
- cupy.cuda.* (#3685)
- .data() for std::vector (#3022)
- cupy.cuda.cub reusable (#3546)
- CUPY_ACCELERATORS (#3596)
- sum_duplicates (#3624)
- cupy_cub.cu in package data (#3572)
- scipy.fft when available (#3032)
- _cub_reduction (#3462)
- cupy.cuda.cub is used (#3467)
- testing.slow correctly (#3501)
- flatiter tests (#3514)
- slogdet tests to check dtypes of return values (#3577)
- test_helper (#3579)
- numpy_cupy_array_list_equal (#3582)
- numpy_cupy_array_equal instead of numpy_cupy_array_list_equal (#3599)
- testing.numpy_cupy_* (#3621)
- axis=None (#3638)
- min/max/argmin/argmax tests (#3656)
- ValueError for invalid order (#3498)
- ValueError for invalid clipmode (#3499)
- TypeError for invalid subscripts in einsum (#3502)