Level-3 Basic Linear Algebra Subroutines (BLAS-3) are building blocks for solving many numerical problems (Cholesky factorization, Gram-Schmidt orthonormalization, LU decomposition, ...). Their efficient implementation on a given parallel machine is a key issue for the maximal exploitation of the system's computational power. In this work we refer to a massively parallel processing SIMD machine (the APE100/Quadrics) and to the adoption of the hyper-systolic method [3, 6, 4] to efficiently implement BLAS-3 on such a machine. The results we achieved (nearly 60-70% of peak performance for large matrices) demonstrate the validity of the proposed approach. The work is structured as follows: section 1 reviews BLAS-3; section 2 recalls the hyper-systolic method; section 3 describes the target machine; section 4 presents the hyper-systolic (HS) implementation; finally, section 5 gives some experimental results.
|Publication status||Published - 1998|
|Event||4th International Workshop on Applied Parallel Computing, PARA 1998 - Umea, Sweden|
Duration: 1 Jan 1998 → …
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Science(all)
Coletta, M., Lippert, T., & Palazzari, P. (1998). Hyper-systolic implementation of BLAS-3 routines on the APE100/Quadrics machine. Paper presented at the 4th International Workshop on Applied Parallel Computing, PARA 1998, Umea, Sweden.