Global Arrays version 5.8

Palmer, Bruce J

Nov 2, 2020, 3:22:39 PM



The Global Arrays team is pleased to announce the release of version 5.8. The notes for this release are attached below. You can access the latest release at


Release Notes:

  • Known Bugs
    • The MPI RMA port remains unreliable for many MPI implementations. Open MPI
      still reports many failures in the test suite. Intel MPI is better but still
      reports several failures. It is recommended to use the latest MPI
      implementations available.
  • Added
    • Version function that can be used to report the version, subversion,
      and patch numbers of the current release
    • Overlay option for creating new GAs on top of existing GAs
    • The number of progress ranks per node in the progress ranks runtime is now
    • Functions for duplicating process groups and returning a process group that
      only contains the calling process
    • 64-bit versions of block-cyclic data distribution functions to the
      C interface
    • Non-blocking test function
    • Read-only property based on caching
    • GA name can be recovered from handle
    • Added profiling capabilities to the GA branch that automatically generate
      a log file in the running directory. This can be controlled with the
      GAW_FILE_PREFIX environment variable, which adds a prefix to the log files,
      and the GAW_FMT environment variable, which selects CSV or human-readable
      output. The default format is human-readable.
      • For autotools, add --enable-profile=1 to the configure line
      • For CMake, add -DENABLE_PROFILING=ON
  • Changed
    • Non-blocking handle management was completely revamped. This simplifies
      the implementation and removes some bugs. The maximum number of outstanding
      non-blocking calls was increased to 256
    • Modified the internal function that computes the rank of processors on the
      world communicator so that it does not use MPI_Comm_translate_ranks.
      MPI_Comm_translate_ranks is implemented with a loop that scales as the
      square of the number of processors and is very slow at large processor
      counts
    • Modified internal iterators so that block-cyclic data distributions work on
      processor groups
    • Improved CMake build
    • Modified ga_print_distribution so that it works on block-cyclic data
      distributions
  • Fixed
    • Fixed a non-blocking error that was showing up in nbtest.x
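
As a rough illustration of the profiling options listed under "Added", the sketch below collects the two build flags and the run-time prefix variable in one place. Only --enable-profile=1, -DENABLE_PROFILING=ON, GAW_FILE_PREFIX and GAW_FMT come from the notes. The prefix value "myrun" and the source-path placeholder are hypothetical, and because the notes do not list the accepted GAW_FMT values, the variable is left unset here so that the default human-readable format applies. The script only prints the build commands rather than running them.

```shell
#!/bin/sh
# Build-time switches quoted from the release notes; use the one that
# matches your build system. Any other configure/cmake arguments are omitted.
AUTOTOOLS_PROFILE_FLAG="--enable-profile=1"
CMAKE_PROFILE_FLAG="-DENABLE_PROFILING=ON"

# Run-time control: prefix for the generated log files.
# "myrun" is a made-up example value, not something the notes prescribe.
GAW_FILE_PREFIX="myrun"
export GAW_FILE_PREFIX

# GAW_FMT is intentionally not set: the notes name the variable but not its
# accepted values, and the default format is human-readable.

# Print the commands instead of executing them.
echo "./configure ${AUTOTOOLS_PROFILE_FLAG}"
echo "cmake ${CMAKE_PROFILE_FLAG} <path-to-ga-source>"
```

With profiling compiled in, running an application in this environment should, per the notes, leave a log file in the running directory whose name carries the chosen prefix.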



Bruce Palmer

Senior Research Scientist

Pacific Northwest National Laboratory

Richland, WA 99352

(509) 375-3899

