Solving General (full) Linear Equation Systems
The Linpack benchmark is a popular way to compare the floating-point performance of computers. It solves a linear equation system with a large number of unknowns (equal to the matrix dimension). We use the routines DGETRF (LU factorization with partial pivoting) and DGETRS (solution of the system using the computed factors), which are part of the LAPACK library, typically provided by the manufacturers. The most compute-intensive part is the matrix multiplication routine DGEMM, which is called by DGETRF. This routine can be tuned heavily for optimal cache usage by means of a suitable block algorithm. To squeeze out the maximum performance, the matrix dimension is normally chosen such that the available memory is fully used. As a consequence, a high percentage of the runtime is spent in the DGEMM routine, but the total runtime of the benchmark also increases with the amount of memory used.
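The two-step solve described above (factor once, then solve with the factors) can be illustrated with a minimal sketch in pure Python. It mirrors the roles of DGETRF and DGETRS only conceptually: the function names `lu_factor_sketch` and `lu_solve_sketch` are hypothetical, the code is unblocked, and it does not reflect how a vendor LAPACK is implemented or tuned.

```python
def lu_factor_sketch(a):
    """Factor a square matrix as P*A = L*U using partial pivoting.

    Returns (lu, piv): lu holds U on and above the diagonal and the
    multipliers of L below it; piv[k] is the row that was swapped with
    row k in step k. The input matrix is left unchanged.
    """
    n = len(a)
    lu = [list(row) for row in a]  # work on a copy
    piv = list(range(n))
    for k in range(n):
        # Pivot: pick the row with the largest absolute entry in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        piv[k] = p
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
        # Eliminate the entries below the diagonal in column k.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            m = lu[i][k]
            for j in range(k + 1, n):
                lu[i][j] -= m * lu[k][j]
    return lu, piv


def lu_solve_sketch(lu, piv, b):
    """Solve A*x = b given the factors returned by lu_factor_sketch."""
    n = len(lu)
    x = list(b)
    # Apply the recorded row swaps to the right-hand side.
    for k in range(n):
        p = piv[k]
        if p != k:
            x[k], x[p] = x[p], x[k]
    # Forward substitution with the unit lower triangular factor L.
    for i in range(n):
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    # Back substitution with the upper triangular factor U.
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


if __name__ == "__main__":
    A = [[2.0, 1.0, 1.0], [4.0, -6.0, 0.0], [-2.0, 7.0, 2.0]]
    b = [5.0, -2.0, 9.0]
    lu, piv = lu_factor_sketch(A)
    print(lu_solve_sketch(lu, piv, b))
```

In a tuned library the elimination step is reorganized into blocks so that most of the arithmetic becomes matrix-matrix multiplication, which is why DGEMM dominates the runtime.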
We used the Intel MKL for the Woodcrest-based machine and the Sun Performance Library for the Opteron- and UltraSPARC-based machines. Here, to keep the runtime of the benchmark short, the matrix dimension is only 5000, which leads to a memory footprint of only about 200 MB.
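The footprint figure follows from storing 5000 x 5000 double-precision values of 8 bytes each. The sketch below reproduces that arithmetic and shows how an MFlop/s rate could be derived from a measured runtime. It assumes the conventional Linpack operation count of 2n^3/3 + 2n^2, which the text above does not state explicitly. The function names are hypothetical, and the 10-second runtime in the usage example is a made-up placeholder, not a measured result.

```python
def matrix_footprint_bytes(n, bytes_per_element=8):
    """Memory needed for a dense n x n matrix of double-precision values."""
    return n * n * bytes_per_element


def linpack_flops(n):
    """Conventional Linpack operation count (an assumption, see lead-in)."""
    return 2.0 * n**3 / 3.0 + 2.0 * n**2


def mflops(n, seconds):
    """Rate in MFlop/s for solving an n x n system in the given time."""
    return linpack_flops(n) / seconds / 1.0e6


if __name__ == "__main__":
    n = 5000
    print(matrix_footprint_bytes(n) / 1.0e6, "MB")
    # Placeholder runtime, purely for illustration.
    print(mflops(n, seconds=10.0), "MFlop/s")
```

For n = 5000 the matrix occupies 200,000,000 bytes, which matches the footprint of about 200 MB quoted above.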
The following graph compares the performance in MFlop/s of this Linpack benchmark with 5000 unknowns when run with an increasing number of threads.
It can clearly be seen that the UltraSPARC T2 processor has only limited floating-point performance compared to the other machines we benchmarked. Because the algorithm is extremely cache-friendly, all processors can be fed with data fast enough to keep all floating-point units busy.
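The cache-friendliness comes from blocking: the matrices are processed tile by tile, so that each tile is reused many times while it is still in cache. The following sketch shows only the loop structure of such a blocked matrix multiplication. The function name `blocked_matmul` and the default tile size of 64 are assumptions chosen for illustration. Pure Python will not show the performance effect, and this is not how MKL or the Sun Performance Library implement DGEMM.

```python
def blocked_matmul(a, b, block=64):
    """Return the product a*b, computed tile by tile.

    a is an n x k matrix and b is a k x m matrix, both given as lists
    of rows. block is the tile edge length (an illustrative choice).
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, block):
        for k0 in range(0, k, block):
            for j0 in range(0, m, block):
                # Multiply one tile of a by one tile of b and accumulate
                # the result into the matching tile of c.
                for i in range(i0, min(i0 + block, n)):
                    row_c = c[i]
                    for kk in range(k0, min(k0 + block, k)):
                        aik = a[i][kk]
                        row_b = b[kk]
                        for j in range(j0, min(j0 + block, m)):
                            row_c[j] += aik * row_b[j]
    return c


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    print(blocked_matmul(a, b, block=1))
```

The tile size would normally be chosen so that the tiles in use fit into the cache level being targeted; tuned libraries select it per processor.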