CuPy v8.0.0b3 has been released


ecas...@preferred.jp

May 29, 2020, 2:05:23 AM
to CuPy Japanese User Group

CuPy v8.0.0b3 has been released! The release notes are as follows.



This is the release note of v8.0.0b3. See here for the complete list of solved issues and merged PRs.

As announced in the previous release, we are dropping support for CUDA 8.0 / 9.1 in v8 releases (#3301). Based on the feedback from users, we will continue to provide cuDNN support (#3303).


Highlights


CuPy v8.0.0b3 introduces a mechanism that uses Optuna to optimize internal parameters when launching reduction kernels. Depending on your GPU and the kernels you execute, this feature can improve the performance of your code by letting Optuna automatically find the best parameters for your GPU.
To take advantage of this, call the functions that perform reductions inside the following context manager:

with cupyx.optimizing.optimize(key=None):
    # cupy reduction function
    y = cupy.sum(x)
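
For readers who want to try this end to end, the following is a minimal sketch, not an official example. It assumes that cupyx.optimizing.optimize accepts key=None as shown above, and that CuPy, Optuna, and a CUDA device are available. The helper get_backend and its NumPy fallback are hypothetical additions so that the script also runs on a machine without a GPU.

```python
import contextlib

import numpy as np


def get_backend():
    """Return (array module, optimize context factory).

    Hypothetical helper: falls back to NumPy and a no-op context
    when CuPy, Optuna, or a CUDA device is not available.
    """
    try:
        import cupy
        import cupyx.optimizing
        import optuna  # noqa: F401  (the optimizer relies on Optuna)

        cupy.cuda.runtime.getDeviceCount()  # raises without a usable GPU
        return cupy, cupyx.optimizing.optimize
    except Exception:
        return np, lambda key=None: contextlib.nullcontext()


xp, optimize = get_backend()
x = xp.arange(1000, dtype=xp.int64)

with optimize(key=None):
    # Reductions called inside the context are the ones Optuna can tune.
    y = xp.sum(x)

total = int(y)  # 0 + 1 + ... + 999
```

Whether the tuning pays off depends on the GPU and the kernel, as noted above, so it is worth timing the reduction with and without the context before relying on it.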


CuPy is also taking part in GSoC 2020 and we keep adding new functions to improve our compatibility with NumPy.


New Features

  • Optimize kernel launch parameters using Optuna (#2731)
  • Support cuSPARSE generic API (#3242)
  • Implement flatiter.base property (#3250)
  • Implement flatiter.__len__() special method (#3251)
  • Implement flatiter.__next__() special method (#3252)
  • Implement putmask function (#3261, thanks @rushabh-v!)
  • Show versions of CUB and cuTENSOR on cupy.show_config (#3271)
  • Enable getting R2C/C2R FFT plans from get_fft_plan() (#3293, thanks @leofang!)
  • Support surface memory in RawKernel (#3294, thanks @leofang!)
  • Add cupy.bartlett (#3307, thanks @niteya-shah!)
  • Add mean for sparse matrices (#3333)
  • Support max_duration argument in cupyx.time.repeat (#3357)
  • Support OptimizeContext serialization (#3367)
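
Several of the functions above mirror existing NumPy APIs. As a rough illustration of the behaviour they are meant to match, the sketch below uses NumPy itself. It assumes (this is not stated in the release note) that swapping numpy for cupy on a machine with a GPU gives equivalent results for putmask, bartlett, and the flatiter members.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# putmask: overwrite, in place, the elements where the mask is True.
np.putmask(a, a > 2, -1)

# bartlett: triangular window with zeros at both ends.
w = np.bartlett(5)

# flatiter: 1-D iterator over an array, with base, len() and next().
it = a.flat
first = next(it)
n = len(it)
same_base = it.base is a
```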

Enhancements

  • Support primitive complex scalar in RawKernel (#2606)
  • Fix the internal streams in multi-GPU Plan1d (#3260, thanks @leofang!)
  • Support additional dtypes and axis sequences in cupy.median (#3280, thanks @grlee77!)
  • Support multiple architectures in CUPY_NVCC_GENERATE_CODE (#3330, thanks @leofang!)
  • Fix too small max_total_time_per_trial (#3365)

Performance Improvements

  • Rewrite cupyx.scipy.ndimage.interpolation using ElementwiseKernel (#3166, thanks @grlee77!)
  • Improve ElementwiseKernel CPU time (#3298)
  • Performance improvements to blackman, hanning and hamming methods (#3312, thanks @niteya-shah!)
  • Use local cache in cupy.RawKernel (#3341, thanks @leofang!)
  • Reduce memory usage of cupy.linalg.svd (#3347)

Bug Fixes

  • Fix SciPy version check in cupyx.scipy.fft (#3311, thanks @grlee77!)
  • Ensure runtime context on a per-device basis (#3321, thanks @leofang!)
  • Fix put when using scalars (#3328)
  • Assign a work space to ormqr functions in _solve (#3331)
  • Fix linalg.svd for 0-sized matrices (#3354)
  • Fix wrong parameter names in kernel launch optimizers (#3364)
  • Fix cupy.around behaving differently from NumPy for EVEN_NUMBER+0.5 (#3335)
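
For context on the last item, NumPy rounds values that lie exactly halfway between two integers to the nearest even integer. The sketch below shows that reference behaviour with NumPy only. That cupy.around matches it after #3335 is an assumption based on the item's title.

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5, -0.5, -1.5])

# NumPy rounds exact halves to the nearest even integer
# (round-half-to-even), not away from zero.
rounded = np.around(halves)
```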

Code Fixes

  • Add alias of shape type (#3310)
  • Use shape_t instead of tuple (#3315)

Documentation

  • Add PFN to the README (#3276)
  • Remove upper restrictions for numpy and scipy in doc build (#3337)

Tests

  • Add tests for optimizer for kernel launch parameters (#3363)

Others

  • Bump version to v8.0.0b3 (#3376)