CuPy v8.0.0rc1 has been released


ecas...@preferred.jp

Aug 27, 2020, 4:03:18 AM
to CuPy Japanese User Group
CuPy v8.0.0rc1 has been released! The release notes are as follows.


This is the release note of v8.0.0rc1. See here for the complete list of solved issues and merged PRs.

We are planning to release the final v8.0.0 on October 1st. Please start testing your workload with this release. See the Upgrade Guide for the list of possible breaking changes.

Highlights

  • This release adds support for CUDA 11, NumPy 1.19, and SciPy 1.5.
  • Several performance improvements for cuTENSOR, sparse matrix indexing, and matrix multiplication on CUDA 11 using TF32.
  • Compatibility with numpy.poly is being increased thanks to our GSoC student @Dahlia-Chehata!
  • Added an interface (#3126) to support using external memory allocators such as the PyTorch one (pytorch/pytorch#33860).

Notes on Wheel Packages

  • CuPy for CUDA 11.0 (cupy-cuda110) wheel packages are currently available only for Windows. We are going to publish Linux wheels once we get approval from the PyPI team. In the meantime, Linux wheels can be downloaded from the Assets section of the GitHub release page, or installed with pip install cupy-cuda110 -f https://github.com/cupy/cupy/releases/tag/v8.0.0rc1. Those wheels will be removed once we publish the package on PyPI.
  • CuPy for CUDA 10.1 (cupy-cuda101), 10.2 (cupy-cuda102), and 11.0 (cupy-cuda110) packages are built with cuDNN v8 support but without bundled cuDNN shared libraries (see #3724 for the discussion). To use cuDNN features, you need to download the cuDNN library with the following command: python -m cupyx.tools.install_library --library cudnn --cuda X.X.
    It is also possible to install cuDNN v8.0.x via the system package manager (e.g., apt install libcudnn8 or yum install libcudnn8), or to install it manually and set the LD_LIBRARY_PATH environment variable.
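
The cuDNN download step above can be scripted. The following is only a sketch and not part of CuPy: the helper name cudnn_install_command and the WHEEL_TO_CUDA table are hypothetical, and the table covers just the three wheel packages named in this note. The code builds the command line; running it is left commented out.

```python
import sys

# Wheel package name -> value passed to --cuda. Limited to the packages
# mentioned in this release note (hypothetical helper table).
WHEEL_TO_CUDA = {
    "cupy-cuda101": "10.1",
    "cupy-cuda102": "10.2",
    "cupy-cuda110": "11.0",
}


def cudnn_install_command(wheel_name):
    """Build the argv list for `python -m cupyx.tools.install_library`."""
    cuda_version = WHEEL_TO_CUDA[wheel_name]  # KeyError for unknown wheels
    return [
        sys.executable, "-m", "cupyx.tools.install_library",
        "--library", "cudnn", "--cuda", cuda_version,
    ]


command = cudnn_install_command("cupy-cuda102")
print(" ".join(command[1:]))
# To actually download cuDNN, the list could be handed to subprocess:
# import subprocess; subprocess.run(command, check=True)
```

Using sys.executable rather than a bare "python" keeps the download tied to the interpreter that has CuPy installed.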

Changes without compatibility

Deprecate cupy.sparse package (#3839, #3856)

CuPy's sparse matrix support was initially implemented in the cupy.sparse package. It was moved to the cupyx.scipy.sparse namespace in CuPy v5, while keeping the cupy.sparse one for backward compatibility.
Since there is no equivalent package in NumPy, it was decided that cupy.sparse will be deprecated and eventually removed.
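
For code that still refers to cupy.sparse, moving to cupyx.scipy.sparse is mostly a rename. Below is a rough sketch of a text-based rewrite. The function name migrate_sparse_references is hypothetical, and the regular expressions cover only the two simplest spellings, so combined imports such as "from cupy import sparse, cuda" are left untouched and need manual attention.

```python
import re

# (pattern, replacement) pairs, applied in order.
_REWRITES = [
    # A line consisting only of "from cupy import sparse".
    (re.compile(r"(?m)^([ \t]*)from[ \t]+cupy[ \t]+import[ \t]+sparse[ \t]*$"),
     r"\1from cupyx.scipy import sparse"),
    # Dotted references such as "import cupy.sparse" or "cupy.sparse.csr_matrix".
    (re.compile(r"\bcupy\.sparse\b"), "cupyx.scipy.sparse"),
]


def migrate_sparse_references(source):
    """Rewrite cupy.sparse references to cupyx.scipy.sparse in source text."""
    for pattern, replacement in _REWRITES:
        source = pattern.sub(replacement, source)
    return source


old_code = "import cupy.sparse\nm = cupy.sparse.csr_matrix(x)\n"
print(migrate_sparse_references(old_code))
```

The rewrite is idempotent: text that already uses cupyx.scipy.sparse does not match either pattern.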

Deprecate *_enabled flags under cupy.cuda (#3732)

Previously, cupy.cuda.nccl_enabled and similar flags could be used to detect whether NCCL, cuTENSOR, or other optional CUDA libraries were available. This pull request introduces per-module flags (cupy.cuda.nccl.available, cupy.cuda.cutensor.available) that provide the same information.
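
A check based on the per-module flag might look like the sketch below. The helper optional_library_available is hypothetical. It assumes that each optional library is reachable as an attribute of cupy.cuda and exposes an available attribute, as described above, and it treats every failure (CuPy missing, unknown library name) as "not available".

```python
def optional_library_available(name):
    """Return True if cupy.cuda.<name>.available is truthy, else False."""
    try:
        import cupy.cuda as cuda
        module = getattr(cuda, name, None)
    except Exception:
        # CuPy is not installed, cannot be imported on this machine,
        # or the attribute lookup itself failed.
        return False
    return bool(getattr(module, "available", False))


for lib in ("nccl", "cutensor"):
    print(lib, optional_library_available(lib))
```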

Bump version in Docker images (#3733)

The current base Docker images have been updated from Ubuntu 16.04, CUDA 9.2, and Python 3.5 to Ubuntu 18.04, CUDA 10.2, and Python 3.6.

New Features

  • Add cupy.ndim (#3060)
  • Add PythonFunctionAllocator (#3126)
  • Compressed Sparse Inner Indexing (#3486)
  • Add cupy.polyadd (#3548)
  • Add cupy.polymul (#3590)
  • Add cupy.polysub (#3593)
  • Add most of scipy.linalg.special_matrices (#3641)
  • Add scipy.signal functions that are simple wrappers of ndimage functions (#3645)
  • Add cupyx.scipy.ndimage.fourier_shift, fourier_gaussian, and fourier_uniform (#3654)
  • Add 2D Sparse Slicing (#3657)
  • Add 2D Sparse Slicing + Row Indexing (#3658)
  • Add 2D Sparse Slicing + Row & Column Indexing (#3659)
  • Add cupy.roots for Hermitian or symmetric matrix (#3703)
  • Add cupy.polyval (#3725)
  • Support __cuda_array_interface__ in cupy.poly1d (#3729)
  • Implement library preloading for wheels (#3731)
  • Add cupy.poly1d.__pow__ (#3734)
  • Add scipy.signal.convolve and correlate functions (#3748)
  • Add trimcoef (#3793)
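
As an illustration of the new polynomial routines listed above, here is a small sketch using cupy.polyadd, cupy.polymul, and cupy.polyval. It assumes these accept the same array arguments as the NumPy functions they mirror. So that it can also be tried on a machine without a GPU, the hypothetical helper pick_array_module falls back to NumPy; that fallback is purely for illustration and is not something CuPy provides.

```python
def pick_array_module():
    """Return cupy if it imports and can allocate on a GPU, else numpy, else None."""
    try:
        import cupy
        cupy.zeros(1)  # fails if no usable CUDA device is present
        return cupy
    except Exception:
        pass
    try:
        import numpy
        return numpy
    except ImportError:
        return None


def polynomial_demo(xp):
    # Coefficients are ordered from the highest degree down, as in NumPy.
    a = xp.array([1.0, 2.0])      # x + 2
    b = xp.array([1.0, 3.0])      # x + 3
    total = xp.polyadd(a, b)      # 2x + 5
    product = xp.polymul(a, b)    # x^2 + 5x + 6
    # Evaluate the product polynomial at x = 0, 1, 2.
    values = xp.polyval(product, xp.array([0.0, 1.0, 2.0]))
    return total.tolist(), product.tolist(), values.tolist()


xp = pick_array_module()
if xp is not None:
    total, product, values = polynomial_demo(xp)
    print(xp.__name__, total, product, values)
```

Passing the array module in as xp keeps the same function usable with either backend.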

Enhancements

  • Avoid disk I/O in compiler (#3164)
  • Add check for method in RandomState seed (#3282)
  • Support negative axis in sparse min/max/argmin/argmax (#3497)
  • Mark nonzero parameters experimental in sparse min/max (#3583)
  • Add a compile method for RawKernel and RawModule (#3644)
  • Handle __cuda_array_interface__ in asnumpy (#3718)
  • Use cublasGemmEx in tensordot_core when CUDA11 (#3719)
  • Deprecate *_enabled flags under cupy.cuda (#3732)
  • Fix handle types to intptr_t (#3746)
  • Support TF32 (#3810)
  • Deprecate cupy.sparse package (#3839)
  • Add path and readonly options to cupyx.optimizing.optimize (#3845)
  • Adding a workaround for even-length inputs to scipy.signal.sepfir2d (#3750)
  • Add multi-axis support to cupy.flip (#3742)

Performance Improvements

  • Speed up cupy.vdot (#3678)
  • Improve cupy.cutensor (#3700)
  • More improvement of cupy.cutensor (#3744)
  • Improve 2D sparse row slicing (#3782)
  • Improve median_filter, rank_filter and percentile_filter (#3813)
  • Improve CSR matrix getrow, getcol and some slicing (#3851)

Bug Fixes

  • Fix float16 ndarray input in histogram with CUB (#3617)
  • Support order argument in cupy.ones, cupy.full and cupy.eye (#3655)
  • Work around a known CUB SpMV bug (#3679)
  • Fix broken message format (#3691)
  • Fix can_use_device_segmented_reduce() for incompatible axes (#3740)
  • Fix circular imports (#3743)
  • Skip FFT input checks for some CUDA >= 10.1 cases (#3763)
  • Fix CUDA 11 multi-GPU FFT bug (#3775)
  • Temporary fixes for cudnn v8 (#3790)
  • Fix cupy.correlate (#3801)
  • Copy input by default for C2R transform (#3848)
  • Fix cupy.sparse.* deprecation (#3856)
  • Fix cub not bundled in wheels (#3879)
  • Fix wheel not loading bundled cuDNN on Windows (#3880)
  • Add option to include wheel metadata (#3881)
  • Fix not to use cupy.cuda.* from CuPy codebase (#3883)

Code Fixes

  • Add cupy_backends/cuda/libs/cutensor.pxd (#3595)
  • Refactor _make_decorator in helper.py (#3697)
  • Refactor cupy.poly1d tests (#3704)
  • Remove unnecessary imports in cupy._sorting (#3706)
  • Rename cupy.binary submodule to cupy._binary (#3707)
  • Rename cupy.creation submodule to cupy._creation (#3708)
  • Rename cupy.functional submodule to cupy._functional (#3710)
  • Rename cupy.indexing submodule to cupy._indexing (#3711)
  • Remove unnecessary imports of cupy.linalg (#3714)
  • Rename cupy.misc submodule to cupy._misc (#3726)
  • Rename cupy.padding submodule to cupy._padding (#3727)
  • Rename submodules under cupy.random package (#3772)
  • Refactor logical routines from core.pyx (#3804)
  • Refactor binary-op routines from core.pyx (#3816)
  • Fix typo (#3850)
  • Resolve circular imports between cupy and cupyx.scipy (#3854)

Documentation

  • Correct format of docstrings in creation routines (#3752)
  • Update docs for v8 (#3802)
  • Fix a broken document (#3807)
  • Add cupy-cuda110 package to README (#3817)
  • Fix documents to reflect CUPY_ACCELERATORS (#3818)
  • Support Optuna v2 (install docs) (#3842)
  • Add upgrade guide for v8 (#3863)
  • Fix broken link in the installation guide (#3864)

Installation

  • Bump version in Docker images (#3733)
  • Update classifiers in setup.py (#3814)
  • Install SciPy and Optuna to Docker image (#3844)

Tests

  • Fix wrong test file name (#3722)
  • Fix test to run without NCCL (#3735)
  • Avoid mutation of os.environ (#3749)
  • Relax tolerance in TestArrayElementwiseOp::test_doubly_broadcasted_pow (#3758)
  • More on using unittest.mock (#3791)
  • Fix test to run without cuDNN (#3846)

Others

  • Bump version to v8.0.0rc1 (#3882)
  • Make nvrtc getPTX use bytes instead of unicode (#3237)
  • Add hiprtc support (#3238)
  • Fix build and import errors for ROCm (#3786)

Contributors

The CuPy Team would like to thank all those who contributed to this release!

@anaruse, @cjnolet, @coderforlife, @Dahlia-Chehata, @jakirkham, @leofang, @niteya-shah, @pentschev
