Hyper-systolic implementation of BLAS-3 routines on the APE100/Quadrics machine

Marco Coletta, Thomas Lippert, Paolo Palazzari

Research output: Contribution to conference (Paper)

Abstract

Basic Linear Algebra Subroutines (BLAS-3) [1] are building blocks for solving many numerical problems (Cholesky factorization, Gram-Schmidt orthonormalization, LU decomposition, ...). Their efficient implementation on a given parallel machine is key to fully exploiting the system's computational power. In this work we consider a massively parallel SIMD machine (the APE100/Quadrics [2]) and adopt the hyper-systolic method [3, 6, 4] to implement BLAS-3 efficiently on it. The results we achieved (roughly 60-70% of peak performance for large matrices) demonstrate the validity of the proposed approach. The work is structured as follows: section 1 reviews BLAS-3; section 2 recalls the hyper-systolic method; section 3 describes the target machine; section 4 presents the hyper-systolic (HS) implementation; finally, section 5 gives some experimental results.
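
As a point of reference for what a BLAS-3 routine computes, the following is a minimal sequential sketch of the general matrix-matrix multiply (GEMM), C ← αAB + βC, in plain Python. It is not the hyper-systolic implementation described in the paper and assumes nothing about the APE100/Quadrics; the function name `gemm` and the list-of-rows matrix layout are illustrative choices, not taken from the source.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices stored as lists of rows.

    Illustrative reference only: a is m x k, b is k x n, c is m x n.
    """
    m, k, n = len(a), len(b), len(b[0])
    shapes_ok = (
        all(len(row) == k for row in a)
        and all(len(row) == n for row in b)
        and len(c) == m
        and all(len(row) == n for row in c)
    )
    if not shapes_ok:
        raise ValueError("incompatible matrix shapes")

    # Start from the scaled copy of c, then accumulate alpha * a * b into it.
    result = [[beta * c[i][j] for j in range(n)] for i in range(m)]
    for i in range(m):
        for p in range(k):
            scaled = alpha * a[i][p]
            for j in range(n):
                result[i][j] += scaled * b[p][j]
    return result
```

The triple loop performs O(n³) arithmetic on O(n²) data for square matrices. This high ratio of computation to data is why BLAS-3 routines are attractive targets for parallel machines, where moving data between processors is the main cost.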
Original language: English
Publication status: Published - 1998
Externally published: Yes
Event: 4th International Workshop on Applied Parallel Computing, PARA 1998 - Umea, Sweden
Duration: 1 Jan 1998 → …

Conference

Conference: 4th International Workshop on Applied Parallel Computing, PARA 1998
Country: Sweden
City: Umea
Period: 1/1/98 → …

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Coletta, M., Lippert, T., & Palazzari, P. (1998). Hyper-systolic implementation of BLAS-3 routines on the APE100/Quadrics machine. Paper presented at 4th International Workshop on Applied Parallel Computing, PARA 1998, Umea, Sweden.