Automatically Tuned Linear Algebra Software (ATLAS)
This document gives an overview of high-performance BLAS implementations, primarily ATLAS (Automatically Tuned Linear Algebra Software).
Important Note
ATLAS is presently not officially supported by LRZ. While the software is still available on some systems, there are no plans to keep the software up-to-date or to install it on new platforms.
Please use the Intel Math Kernel Library (MKL) if you need optimized BLAS/LAPACK routines.
Overview and Purpose
ATLAS is an approach to the automatic generation and optimization of numerical software for processors with deep memory hierarchies and pipelined functional units. Producing such software for machines ranging from desktop workstations to embedded processors can be a tedious and time-consuming task, and ATLAS has been designed to automate much of this process. Initial work focused on general matrix multiply (DGEMM); the v3 releases also cover parts of LAPACK.
- ATLAS now incorporates Bo Kagstrom and Per Ling's Superscalar GEMM-based Level 3 BLAS in order to supply complete BLAS Level 3.
- ATLAS now supports the Level 1 BLAS operation SCAL, in order to examine usefulness of generating Level 1 ops.
Authors of the ATLAS Library
R. Clint Whaley, Jack Dongarra
Computer Science Department
University of Tennessee
Knoxville, TN 37996-1301
and
Mathematical Sciences Section
Oak Ridge National Laboratory
Oak Ridge, TN 37831
Supported Platforms at LRZ
The library itself is highly machine dependent. It is therefore extremely important to use the appropriate library for the respective machine; otherwise the results may be unpredictable or even wrong.
Linux versions of ATLAS
Since the Linux cluster consists of heterogeneous hardware, including Pentium II, Pentium III, Pentium III/Cascades and Pentium 4 processors, the correct version of ATLAS must be linked depending on the processor the program will run on. However, it does not matter on which machine the linking itself is performed.
The link step consists of specifying a library path and the required libraries; the user stores the latter in the environment variable $BLAS. For IA32 systems, please use
pgf90 -o ./myprog <all_my_objects> -L/usr/local/lib $BLAS
For IA64 systems you might specify
efc -o ./myprog <all_my_objects> -L/usr/local/sys/lib/atlas/3.4.1/lib $BLAS
Furthermore, a version of LAPACK with ATLAS-optimized LU and Cholesky decomposition routines is available in the same path given above. The following table lists the values of $BLAS and $LAPACK for the various processors available.
| Processor | $BLAS setting | $LAPACK setting | Remarks |
|---|---|---|---|
| Pentium III | -lf77blas_PIII_256K -latlas_PIII_256K | -llapack_ATLAS_PIII_256K | Version 3.2 |
| Pentium III/Cascades | -lf77blas_PIII_2048K -latlas_PIII_2048K | -llapack_ATLAS_PIII_2048K | No SSE available yet. |
| Pentium 4 | -lf77blas_P4 -latlas_P4 | -llapack_ATLAS_P4 | Version 3.4.1 with SSE2 support (this doubles performance for D and Z calls). Suitable for 256 and 512 KByte L2 Cache processor versions |
| Itanium2 | -lf77blas -latlas | -llapack | |

Please note that this table applies to the Fortran compiler. If you use ATLAS from C, you have to replace the Fortran interface library libf77blas_* with the corresponding libcblas_*. Furthermore, only the PGI Fortran compiler is presently supported with ATLAS; if you use the Intel compiler, LRZ recommends that you use the Intel Math Kernel Library instead.
To find out which processor type a given machine has, please see the introductory documentation for the Linux Cluster.
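
The settings in the table above can be selected in a small shell helper, as in the following minimal sketch. The function names (atlas_flags, atlas_lapack_flags) and the processor labels (PIII, PIII_CASCADES, P4, ITANIUM2) are hypothetical and only serve this illustration; the flag values are taken from the table. The sketch prints the IA32 link line instead of running it. It places $LAPACK before $BLAS, which is the usual order for static linking, but that ordering is an assumption and is not stated on this page.

```shell
#!/bin/sh
# Sketch only: function names and processor labels are hypothetical.
# The flag strings themselves come from the $BLAS/$LAPACK table.

# Print the $BLAS setting for a processor label.
atlas_flags() {
    case "$1" in
        PIII)          printf '%s\n' "-lf77blas_PIII_256K -latlas_PIII_256K" ;;
        PIII_CASCADES) printf '%s\n' "-lf77blas_PIII_2048K -latlas_PIII_2048K" ;;
        P4)            printf '%s\n' "-lf77blas_P4 -latlas_P4" ;;
        ITANIUM2)      printf '%s\n' "-lf77blas -latlas" ;;
        *)             echo "unknown processor: $1" >&2; return 1 ;;
    esac
}

# Print the $LAPACK setting for a processor label.
atlas_lapack_flags() {
    case "$1" in
        PIII)          printf '%s\n' "-llapack_ATLAS_PIII_256K" ;;
        PIII_CASCADES) printf '%s\n' "-llapack_ATLAS_PIII_2048K" ;;
        P4)            printf '%s\n' "-llapack_ATLAS_P4" ;;
        ITANIUM2)      printf '%s\n' "-llapack" ;;
        *)             echo "unknown processor: $1" >&2; return 1 ;;
    esac
}

# Example: a program that will run on a Pentium 4 node.
BLAS=$(atlas_flags P4)
LAPACK=$(atlas_lapack_flags P4)
export BLAS LAPACK

# Print (do not execute) the IA32 link line; <all_my_objects> is the
# placeholder used in the text above.
printf '%s\n' "pgf90 -o ./myprog <all_my_objects> -L/usr/local/lib $LAPACK $BLAS"
```

Because the choice depends on the processor the program will run on, not on the machine where linking happens, the label has to be set by hand in such a script rather than detected automatically.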
An Alternative to ATLAS: The FLAME BLAS
Availability
Kazushige Goto (Visiting Scientist, FLAME project, UT-Austin) has developed a high-performance BLAS implementation that is available for various architectures and in many cases outperforms ATLAS as well as the commercial BLAS implementations.
If you want to make use of the Goto BLAS on LRZ platforms, please make sure you include a citation (see Web Page for details).
The Goto BLAS is available on the Itanium2 platforms (Linux-Cluster
Usage
The libraries are generally available only as shared libraries; they are always installed in the directory GOTO=/usr/local/sys/lib/goto_blas. Linkage is then done via
$F90 -o myprog ... -L/usr/local/sys/lib/goto_blas -lgoto_$ARCH
where $F90 and $ARCH are appropriately chosen from the following table.
| Platform | Fortran Compiler $F90 | Architecture $ARCH | Remarks |
|---|---|---|---|
| Pentium 4 | ifc | p4_512-r0.6 | also link in $GOTO/xerbla_ifc.o |
| Pentium 4 | ifc | p4_256-r0.6 | this would apply to lxsrv13-18, which only have 256 KByte L2 Cache. |
| Itanium2 | efc | it2-r0.7 | also link in $GOTO/xerbla_efc.o |
| Itanium2 multi-threaded | efc | it2p-r0.7 | also link in $GOTO/xerbla_efc.o |
| Power4 | xlf_r | power4 | also link in -lessl_r since only DGEMM and SGEMM are implemented. |

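
As an illustration of how the link line and the table above fit together, the following sketch assembles the command for a given platform. The function name goto_link_cmd and the platform labels (P4_512, P4_256, IT2, IT2P, POWER4) are hypothetical; compilers, architecture strings and extra link items are copied from the table. The placement of the extra object file before the library flags, and of -lessl_r after them, is an assumption. The "..." stands for your own object files, as in the link line above, and the command is only printed, not executed.

```shell
#!/bin/sh
# Sketch only: goto_link_cmd and the platform labels are hypothetical.
# Compiler, architecture and remarks follow the Goto BLAS table.

goto_link_cmd() {
    goto_dir=/usr/local/sys/lib/goto_blas
    objs=""   # extra object files named in the Remarks column
    libs=""   # extra libraries named in the Remarks column
    case "$1" in
        P4_512) f90=ifc;   arch=p4_512-r0.6; objs="$goto_dir/xerbla_ifc.o" ;;
        # The table lists no extra object file for the 256 KByte variant.
        P4_256) f90=ifc;   arch=p4_256-r0.6 ;;
        IT2)    f90=efc;   arch=it2-r0.7;    objs="$goto_dir/xerbla_efc.o" ;;
        IT2P)   f90=efc;   arch=it2p-r0.7;   objs="$goto_dir/xerbla_efc.o" ;;
        POWER4) f90=xlf_r; arch=power4;      libs="-lessl_r" ;;
        *)      echo "unknown platform: $1" >&2; return 1 ;;
    esac

    # "..." is the placeholder for the user's own objects.
    cmd="$f90 -o myprog ..."
    if [ -n "$objs" ]; then cmd="$cmd $objs"; fi
    cmd="$cmd -L$goto_dir -lgoto_$arch"
    if [ -n "$libs" ]; then cmd="$cmd $libs"; fi
    printf '%s\n' "$cmd"
}

# Example: single-threaded Itanium2 build.
goto_link_cmd IT2
```

Keeping the per-platform details in one place like this makes it harder to mix up the architecture strings, for example linking the 512 KByte Pentium 4 library for nodes that only have 256 KByte of L2 cache.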
Contact Point at LRZ
For questions concerning the usage of the ATLAS or Goto BLAS libraries contact the LRZ Support Team: