CuPy v9.0.0a1 released

Emilio Castillo

Oct 29, 2020, 3:31:35 AM
to CuPy Japanese User Group

CuPy v9.0.0a1 has been released! The release notes are as follows.

This is the release note of v9.0.0a1. See here for the complete list of solved issues and merged PRs.

Highlights

CUDA 11.1 Support

Support for CUDA 11.1 was added in #4184. With CUDA 11.1, GeForce RTX 30 series and Quadro RTX series GPUs can now be used in CuPy.

Notes on Wheel Packages

CuPy for CUDA 11.1 (cupy-cuda111) wheel packages are currently available only for Windows. We will publish Linux wheels once we get approval from the PyPI team. Meanwhile, Linux wheels can be downloaded from the Assets section below, or installed with pip install cupy-cuda111 -f https://github.com/cupy/cupy/releases/tag/v9.0.0a1.
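As a rough illustration of the wheel naming used above, the following sketch builds the pip command for a given CUDA version. The helper names wheel_package_name and pip_install_command are hypothetical, and the name pattern is an assumption extrapolated from cupy-cuda111; it is not an official tool.

```python
import sys


def wheel_package_name(cuda_version: str) -> str:
    """Map a CUDA version such as "11.1" to a wheel name such as "cupy-cuda111".

    Assumption: the name is "cupy-cuda" followed by the major and minor
    version digits, as in cupy-cuda111.
    """
    major, minor = cuda_version.split(".")[:2]
    return f"cupy-cuda{int(major)}{int(minor)}"


def pip_install_command(cuda_version, find_links=None):
    """Build (but do not run) the pip command line as a list of arguments."""
    cmd = [sys.executable, "-m", "pip", "install", wheel_package_name(cuda_version)]
    if find_links:
        # -f / --find-links makes pip look for wheel files on the given page.
        cmd += ["-f", find_links]
    return cmd


print(" ".join(pip_install_command("11.1")))
```

To install from the release page instead of PyPI, pass the release page URL given in the note above as find_links and hand the resulting list to subprocess.run.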

New Features
  • Add compressed sparse __setitem__ (#3533)
  • Add cupy.polyfit (#3747)
  • Support sparse pointwise division by vectors or matrices (#3838)
  • Add cudaGetDeviceProperties (#3858)
  • Support sparse pointwise maximum and minimum (#3860)
  • Add all binary morphology functions to cupyx.scipy.ndimage (#3907)
  • Support cublasXgetrsBatched and add cupy.cublas.batched_gesv (#3936)
  • Add cupy.testing.shaped_sparse_random (#3944)
  • Add sparse pointwise equality & inequality functions (#3945)
  • Add remaining grayscale morphology operations to cupyx.scipy.ndimage (#3946)
  • Add histogram2d and histogramdd (#3947)
  • Add cupy.gradient (#3963)
  • Add several functions to cupyx.scipy.ndimage.measurements (#3979)
  • Add cupyx.scipy.linalg.lu (#3995)
  • Add cupy.apply_along_axis (#4008)
  • Add cupyx.scipy.sparse.linalg.norm (#4017)
  • Add missing sparse matrix constructors (#4052)
  • Add cupy.cusolver.gels (#4064)
  • Add @ operator support to cupyx.scipy.sparse (#4075)
  • Add cupy.nancumsum and cupy.nancumprod (#4077)
  • Add order option in cupy.testing.shaped_random (#4091)
  • Add cupy.nanmedian (#4092)
  • Add complex dtype support in cupy.nanmin and cupy.nanmax (#4097)
  • Add cupy.append and cupy.resize (#4112)
  • Add cupyx.scipy.sparse.linalg.eigsh (#4138)
  • Add support for CUDA 11.1 (#4184)
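
A few of the new array functions listed above can be tried as in the following minimal sketch. It assumes CuPy v9.0.0a1 or later with a usable GPU, and falls back to NumPy (which provides functions of the same names) so that the snippet also runs on a machine without one. The input values are made up for illustration.

```python
try:
    import cupy as xp                   # GPU path
    xp.cuda.runtime.getDeviceCount()    # raises if no usable GPU is found
except Exception:
    import numpy as xp                  # CPU fallback with the same function names

x = xp.arange(5, dtype=xp.float64)      # [0, 1, 2, 3, 4]
y = 2.0 * x + 1.0                       # points on the line y = 2x + 1

coeffs = xp.polyfit(x, y, 1)            # least-squares line fit (#3747)
grad = xp.gradient(y)                   # finite differences (#3963)

data = xp.array([1.0, float("nan"), 3.0, 5.0])
med = xp.nanmedian(data)                # median ignoring NaN (#4092)
csum = xp.nancumsum(data)               # cumulative sum, NaN treated as 0 (#4077)

joined = xp.append(x, xp.array([5.0]))  # returns a new, longer array (#4112)

print(coeffs, grad, med, csum, joined)
```

The fit should recover a slope close to 2 and an intercept close to 1, and the NaN entry is skipped by nanmedian and counted as zero by nancumsum.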
Enhancements
  • Support list bins with histogram (#3542)
  • Add a cuFFT plan cache (#3730)
  • Support transforming NumPy arrays with multi-GPU Plan1d (#3766)
  • Show numpy and scipy versions in show_config (#3768)
  • Add cuTENSOR 1.2 support (#3884)
  • Update FP16 header to CUDA 11.0 Update 1 (11.0.3) (#3888)
  • Check format of sparse matrix in numpy_cupy_array_equal (#3897)
  • Improve accuracy of cupy.around (#3904)
  • Bump cuDNN version to v8.0.3 (#3985)
  • Add complex dtype support to cupyx.scipy.linalg.lu_factor/solve (#4002)
  • Add cython bindings to cuSPARSE csrsv2/csrsm2 related functions (#4031)
  • Support pickling cupy.RawKernel (#4055)
  • Allow non-contiguous array input to binary morphology functions (#4058)
  • Improve performance of binary morphology for fully nonzero structuring elements (#4059)
  • Bump cuDNN to v8.0.4 (#4065)
  • Add *svdjBatched prototypes (#4071)
  • Defer import in cupy/_environment.py (#4162)
  • Record Cython build and runtime versions (#4164)
Performance Improvements
  • Use cuTENSOR in cupy.prod, cupy.max, cupy.min, cupy.ptp and cupy.mean (#3765)
  • Use _csr_row_index for CSR matrix major-axis slicing with step (#3852)
  • Improve CSR matrix column fancy indexing (#3886)
  • Use LU-decomposition based solver in cupy.linalg.solve (#3942)
  • Improve cupyx.scipy.sparse int x int indexing (#3981)
  • Avoid using CUlinkState unless absolutely necessary (#3992)
  • Improve cupy.in1d (#4018)
  • Improve cupy.cuda.cub.device_segmented_reduce() (#4161)
Bug Fixes
  • Fix cooperative kernel launch (#3894)
  • Fix dtype in CSR matrix division (#3905)
  • Fix csr2csc for zero-size matrix (#3919)
  • Handle transfer to cupy view (#3928)
  • Fix _compressed_sparse_matrix._minor_slice for step > 1 case (#3948)
  • Fix csr_matrix._get_intXslice for step < 0 case (#3951)
  • Fix sparse.__getitem__ not to return view of input (#3975)
  • ROCm: fix rocBLAS and rocSOLVER version displays (#3988)
  • Add a kernel for integer GEMM (#3994)
  • Fix typos in cupy.cuda.cufft (#4014)
  • Fix managed memory leak (#4015)
  • Fix potential segfault when reduction axis is empty (#4024)
  • Use __dealloc__ instead of __del__ for cdef class (#4036)
  • Fix typo in _binary_erosion (#4038)
  • Fix CUB block reduction for F-order arrays with ndim > 2 (#4062)
  • Add work-around for issue in cutensorReduction of cuTENSOR 1.2.1 (#4081)
  • Handle np.nan and np.inf constant values properly in ndimage functions (#4083)
  • Fix argmax and argmin for F-order inputs (#4084)
  • Workaround cudaPointerGetAttributes error in CUDA 10.2+ (#4085)
  • Fix argmax/argmin in CUB block reduction for F-order arrays with ndim > 1 (#4096)
  • Fix getDeviceProperties for HIP (#4108)
  • Add compute capability checking for cublasGemmEx() (#4114)
  • Fix 64-bit int types in type_dispatcher.cuh (#4124)
  • Fix mode='opencv' case in cupyx.scipy.ndimage.affine_transform (#4130)
  • Add compute_35 for CUDA 11.0+ (#4137)
  • Fix device properties for CUDA 9.2 (#4142)
  • Fix cupyx.seterr() when linalg not supplied (#4150)
  • Fix broadcasting behavior in ndimage.measurements functions (#4151)
  • Fix argwhere for 0d inputs (#4167)
  • Fix nonzero for 0d inputs (#4168)
  • Fix to use current stream properly with CUDA-related libraries (#4173)
Code Fixes
  • Split cupy cuda header (#3616)
  • Rename cupy.io submodule to cupy._io (#3712)
  • Rename cupy.logic submodule to cupy._logic (#3715)
  • Rename cupy.manipulation submodule to cupy._manipulation (#3716)
  • Rename cupy.math submodule to cupy._math (#3717)
  • Rename submodules under cupy.linalg package (#3741)
  • Rename cupy.statistics submodule to cupy._statistics (#3774)
  • Rename cupy.util submodule to cupy._util (#3779)
  • Rename submodules under cupyx.linalg package (#3784)
  • Refactor CSR sparse matrix row fancy indexing (#3865)
  • Rename submodule under cupy.prof package (#3869)
  • Rename submodule under cupy.fft package (#3870)
  • Hide private names in cupy/__init__.py (#3871)
  • Rename cupyx.rsqrt submodule (#3873)
  • Rename cupyx.runtime submodule (#3874)
  • Rename cupyx.scatter submodule (#3875)
  • Rename submodule under cupyx.scipy.fft (#3899)
  • Rename submodule under cupyx.scipy.fftpack (#3900)
  • Rename submodules under cupyx.scipy.sparse (#3901)
  • Rename submodules under cupyx.scipy.special (#3902)
  • Hide private names in cupyx/scipy/__init__.py (#3912)
  • Hide private names in cupyx.time (#3965)
  • Hide private names in cupy.cudnn (#3966)
  • Hide private names in cupy.cusolver (#3967)
  • Hide private names in cupy.cusparse (#3968)
  • Hide private names in cupy.cutensor (#3969)
  • Move _normalize_axis_index to cupy/core/internal.pyx (#4057)
  • Move matmul from core.pyx to _routine_linalg.pyx (#4060)
Documentation
  • Fix wrong curand enum names (#3840)
  • Add cupy.searchsorted to doc (#3908)
  • Update cupyx.scipy API documentation (#3954)
  • Fix docs of cupyx.scipy.linalg.lu_factor (#4011)
  • Improve the plan cache documentation (#4013)
  • Update README and docs for unified tagline (#4047)
  • Simplify ROCm install guide (#4048)
  • Fix typo (#4053)
  • Add note about starting nvprof with profiling off (#4144)
  • Fix docstrings of cupyx.scipy.ndimage.{minimum,maximum}_position (#4146)
Installation
  • Add CUDA_VERSION define for Cython compilation (#3877)
Tests
  • Code fix on tests for cupyx.scipy.ndimage stats functions (#3426)
  • Add different dtype input test in histogram (#3618)
  • Fix 32-bit boundary test to run on Windows (#3859)
  • Fix cupy.ndim test style (#3890)
  • Fix test fail when cudnn is unavailable (#3906)
  • Add v8 to list of known branch in FlexCI script (#3911)
  • Fix side effects in some tests (#3934)
  • Fix some tests to check compatibility with SciPy's behavior (#3955)
  • Refactor sparse indexing tests (#3958)
  • Require SciPy 1.2 for sparse comparison (#4033)
  • Add generate_matrix to cupy.testing (#4070)
  • Make parameterized dtype test skip by pytest.skip (#4094)
  • ROCm gpg url changed (#4127)
  • Fix tests that have side effects (#4149)
  • Enhance dtype error message in testing helpers (#4156)
  • Fix polyfit tests tolerance (#4159)
  • Use testing.assert_warns (#4169)
HIP/ROCm
  • ROCm: Fix bugs and test suites to make ROCm/HIP happy - Part 1 (#3823)
  • ROCm: Fix bugs and test suites to make ROCm/HIP happy - Part 2 (#3835)
  • ROCm: Support rocTX (#3843)
  • ROCm: Support rocFFT/hipFFT (#3896)
  • ROCm: Support more hipBLAS/rocBLAS and rocSOLVER functions (#3950)
  • ROCm: Support hipCUB/rocPRIM (#4027)
  • ROCm: Support RCCL (#4099)
  • ROCm: Build on latest ROCm (#4110)
Others
  • Disable github checks annotations of Codecov (#4020)
  • Bump version to v9.0.0a1 (#4194)
Contributors

The CuPy Team would like to thank all those who contributed to this release!

@anaruse @carterbox @cjnolet @Dahlia-Chehata @garanews @grlee77 @kalvdans @leofang @mrkwjc @saswatpp
