This release added optimized variants of RandomAccess that use a Linear Congruential Generator (LCG) for random number generation. A global reduction was added to the error calculation in MPI FFT to produce more accurate error estimates. The benchmarks were reordered so that the HPL component runs last and may be aborted if the performance of the other components is not satisfactory. RandomAccess now runs first to assist in tuning the code. Assorted bugs were fixed.
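As an illustration of the LCG-based variant mentioned above, here is a minimal sketch of a 64-bit linear congruential generator driving RandomAccess-style XOR updates. The constants are Knuth's MMIX values and are an assumption, as are the function names and the choice to index the table with the high-order bits; the release notes do not specify any of these.

```python
MASK64 = (1 << 64) - 1

# Assumed constants (Knuth's MMIX LCG); not taken from the benchmark source.
LCG_MUL = 6364136223846793005
LCG_ADD = 1442695040888963407


def lcg_next(state):
    """Advance the generator one step: state * MUL + ADD (mod 2**64)."""
    return (LCG_MUL * state + LCG_ADD) & MASK64


def lcg_skip(state, n):
    """Jump ahead n steps in O(log n) by repeatedly squaring the affine map."""
    mul, add = LCG_MUL, LCG_ADD
    acc_mul, acc_add = 1, 0  # identity map x -> 1*x + 0
    while n > 0:
        if n & 1:
            acc_mul = (acc_mul * mul) & MASK64
            acc_add = (acc_add * mul + add) & MASK64
        add = (mul * add + add) & MASK64  # uses the old mul
        mul = (mul * mul) & MASK64
        n >>= 1
    return (acc_mul * state + acc_add) & MASK64


def apply_updates(table, log2_size, n_updates, seed=1):
    """XOR pseudo-random values into pseudo-random table slots, in place."""
    ran = seed
    for _ in range(n_updates):
        ran = lcg_next(ran)
        # Use the high-order bits as the index: the low-order bits of a
        # power-of-two-modulus LCG have short periods.
        table[ran >> (64 - log2_size)] ^= ran
```

Because XOR is its own inverse, applying the same update stream twice restores the table, which gives a simple correctness check. The jump-ahead helper shows how each process in a parallel run could start at its own offset in the stream without generating the preceding values.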
This is a bugfix release that fixes two bugs introduced in version 1.3.0, in the PTRANS and FFT components of the code. The version of HPL was updated. The 32-bit pseudo-random number generator (PRNG) was replaced with a 64-bit one. Three numerical checks of the solution residual were replaced with a single one. Support was added for 64-bit systems with large memory sizes. A limit on the FFT vector size was introduced so that the size fits in a 32-bit integer (this applies only when using FFTW version 2).
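To make the idea of a single residual check concrete, the sketch below computes one commonly used scaled residual for a linear solve, ||Ax - b|| / (eps * (||A|| * ||x|| + ||b||) * n), using infinity norms. The exact formula and threshold used by this release are not stated here, so the formula, the threshold value, and all function names should be read as assumptions.

```python
import sys

# Assumed pass/fail threshold; chosen for illustration only.
THRESHOLD = 16.0


def inf_norm_vec(v):
    """Infinity norm of a vector: largest absolute entry."""
    return max(abs(x) for x in v)


def inf_norm_mat(a):
    """Infinity norm of a matrix: largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)


def scaled_residual(a, x, b):
    """Residual of A x = b, scaled by machine epsilon and problem size."""
    n = len(b)
    r = [sum(a[i][j] * x[j] for j in range(n)) - b[i] for i in range(n)]
    eps = sys.float_info.epsilon
    denom = eps * (inf_norm_mat(a) * inf_norm_vec(x) + inf_norm_vec(b)) * n
    return inf_norm_vec(r) / denom


def solution_passes(a, x, b):
    """Single numerical check: the scaled residual must stay below the threshold."""
    return scaled_residual(a, x, b) < THRESHOLD
```

A single scaled quantity like this is independent of the magnitude of the matrix entries and of the problem size, which is one reason a lone check can stand in for several separately normalized ones.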
This version contains many bugfixes, major
features, and minor enhancements, many of which
were contributed by users. The major focus of this
release was to improve the accuracy of the reported
performance results and ensure scalability of the
code on the largest supercomputer installations
with hundreds of thousands of computational cores.
The following features were added: simplified and/or
automatic configuration, OpenMP support in FFT and
RandomAccess, a time bound for MPI-RandomAccess, parallel
verification in MPI-RandomAccess, and improved
reporting of results.
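The time bound listed above can be pictured as an update loop that stops early once a wall-clock budget is spent. The following is a minimal single-process sketch; the function name, the batch size, and the injectable clock are all hypothetical, and a real MPI version would additionally need the processes to agree on when to stop, which is not shown.

```python
import time


def run_time_bounded(update, max_updates, time_limit_s,
                     clock=time.monotonic, check_every=1024):
    """Call update() up to max_updates times, stopping once the time limit is hit.

    The clock is read only once per batch so that timing overhead stays small.
    Returns the number of updates actually performed.
    """
    start = clock()
    done = 0
    while done < max_updates:
        batch = min(check_every, max_updates - done)
        for _ in range(batch):
            update()
        done += batch
        if clock() - start >= time_limit_s:
            break
    return done
```

Returning the number of completed updates matters because a rate such as updates per second has to be computed from the work actually performed, not from the number originally requested.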