ScaLAPACK Key Ideas
- Use block-partitioned algorithms to maximize data reuse in the upper
  levels of the memory hierarchy (registers and cache); see the
  blocked-multiply sketch after this list.
  - reduce cache misses
  - reduce the frequency of communication
- Use the Parallel BLAS (PBLAS) as the main computational building blocks.
- Use the Basic Linear Algebra Communication Subprograms (BLACS) to
  perform communication.
  - Hide parallelism within the PBLAS.
- Matrices and vectors are global objects.
- Fine-tune performance by adjusting the data layout parameters (block
  sizes and process grid shape); these appear in the PBLAS sketch at the
  end of this slide.
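
The first bullet is the cache-blocking idea exploited by the Level 3 BLAS.
Below is a minimal illustrative sketch in C (not ScaLAPACK code) of a
block-partitioned multiply C = C + A*B; the block size NB, the problem size,
and the row-major layout are assumptions chosen for the example.

```c
#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>

#define NB 64  /* block size: tune to the cache of the target processor */

/* C = C + A*B for n-by-n row-major matrices, processed in NB-by-NB blocks
 * so that each block is reused from cache instead of refetched from memory */
static void blocked_gemm(size_t n, const double *A, const double *B, double *C)
{
    for (size_t ii = 0; ii < n; ii += NB)
        for (size_t kk = 0; kk < n; kk += NB)
            for (size_t jj = 0; jj < n; jj += NB)
                /* multiply one NB-by-NB block of A by one block of B */
                for (size_t i = ii; i < ii + NB && i < n; i++)
                    for (size_t k = kk; k < kk + NB && k < n; k++) {
                        double aik = A[i * n + k];
                        for (size_t j = jj; j < jj + NB && j < n; j++)
                            C[i * n + j] += aik * B[k * n + j];
                    }
}

int main(void)
{
    size_t n = 256;
    double *A = malloc(n * n * sizeof *A);
    double *B = malloc(n * n * sizeof *B);
    double *C = calloc(n * n, sizeof *C);
    for (size_t i = 0; i < n * n; i++) { A[i] = 1.0; B[i] = 1.0; }
    blocked_gemm(n, A, B, C);
    printf("C[0] = %g (expect %zu)\n", C[0], n);   /* each entry equals n */
    free(A); free(B); free(C);
    return 0;
}
```

Each NB-by-NB block is reused NB times while it sits in cache, which is why
the block size is a key tuning parameter on every target machine.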
Important:
The PBLAS make use of the sequential BLAS, for which assembly-coded
versions exist for many processors.
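
To make this concrete, here is a hedged sketch of how a distributed multiply
looks from C: the BLACS build the process grid, descinit_ attaches array
descriptors so that the locally stored pieces behave as global matrices, and
a single pdgemm_ call performs the multiply with all communication hidden
inside the PBLAS. The 2x2 grid, the global size N, the block size NB, and the
hand-written prototypes are assumptions of the example; header names and
linking details vary between ScaLAPACK distributions.

```c
/* Sketch only: link against ScaLAPACK/BLACS/MPI and run on exactly the
 * 4 processes assumed by the 2x2 grid below.  Error checking is omitted. */
#include <stdlib.h>

extern void Cblacs_pinfo(int *mypnum, int *nprocs);
extern void Cblacs_get(int ictxt, int what, int *val);
extern void Cblacs_gridinit(int *ictxt, char *order, int nprow, int npcol);
extern void Cblacs_gridinfo(int ictxt, int *nprow, int *npcol,
                            int *myrow, int *mycol);
extern void Cblacs_gridexit(int ictxt);
extern void Cblacs_exit(int notdone);
extern int  numroc_(int *n, int *nb, int *iproc, int *isrcproc, int *nprocs);
extern void descinit_(int *desc, int *m, int *n, int *mb, int *nb,
                      int *irsrc, int *icsrc, int *ictxt, int *lld, int *info);
extern void pdgemm_(char *transa, char *transb, int *m, int *n, int *k,
                    double *alpha, double *a, int *ia, int *ja, int *desca,
                    double *b, int *ib, int *jb, int *descb,
                    double *beta, double *c, int *ic, int *jc, int *descc);

int main(void)
{
    int iam, nprocs, ictxt, myrow, mycol;
    int nprow = 2, npcol = 2;           /* process grid shape (assumed)   */
    int N = 1024, NB = 64;              /* global size and block size     */
    int izero = 0, ione = 1, info;
    double one = 1.0, zero = 0.0;

    Cblacs_pinfo(&iam, &nprocs);
    Cblacs_get(-1, 0, &ictxt);          /* default system context         */
    Cblacs_gridinit(&ictxt, "Row", nprow, npcol);
    Cblacs_gridinfo(ictxt, &nprow, &npcol, &myrow, &mycol);

    /* local dimensions of the block-cyclically distributed pieces */
    int mloc = numroc_(&N, &NB, &myrow, &izero, &nprow);
    int nloc = numroc_(&N, &NB, &mycol, &izero, &npcol);
    int lld  = mloc > 1 ? mloc : 1;

    double *A = calloc((size_t)mloc * nloc, sizeof *A);
    double *B = calloc((size_t)mloc * nloc, sizeof *B);
    double *C = calloc((size_t)mloc * nloc, sizeof *C);
    for (int i = 0; i < mloc * nloc; i++) { A[i] = 1.0; B[i] = 2.0; }

    /* descriptors make the local pieces usable as global objects */
    int descA[9], descB[9], descC[9];
    descinit_(descA, &N, &N, &NB, &NB, &izero, &izero, &ictxt, &lld, &info);
    descinit_(descB, &N, &N, &NB, &NB, &izero, &izero, &ictxt, &lld, &info);
    descinit_(descC, &N, &N, &NB, &NB, &izero, &izero, &ictxt, &lld, &info);

    /* global operation C = 1.0*A*B + 0.0*C; communication is hidden inside
     * the PBLAS, which call the BLACS and the sequential BLAS */
    pdgemm_("N", "N", &N, &N, &N, &one, A, &ione, &ione, descA,
            B, &ione, &ione, descB, &zero, C, &ione, &ione, descC);

    free(A); free(B); free(C);
    Cblacs_gridexit(ictxt);
    Cblacs_exit(0);
    return 0;
}
```

The only layout decisions the caller makes are the process grid shape and the
blocking factor NB passed to descinit_, which is where the "fine-tune
performance by adjusting data layout parameters" point above enters.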