Automatically Tuned Linear Algebra Software (ATLAS)

This document gives an overview of high-performance BLAS implementations, primarily ATLAS (Automatically Tuned Linear Algebra Software) and, as an alternative, the Goto BLAS.

Important Note

ATLAS is presently not officially supported by LRZ. While the software is still available on some systems, there are no plans to keep the software up-to-date or to install it on new platforms.

Please use the Intel Math Kernel Library (MKL) if you need optimized BLAS/LAPACK routines.

Overview and Purpose

ATLAS is an approach for the automatic generation and optimization of numerical software for processors with deep memory hierarchies and pipelined functional units. Producing such software for machines ranging from desktop workstations to embedded processors can be a tedious and time-consuming task, and ATLAS has been designed to automate much of this process. Initial work focused on the general matrix multiply, DGEMM. The v3 releases also cover parts of LAPACK.

  • ATLAS now incorporates Bo Kagstrom and Per Ling's Superscalar GEMM-based Level 3 BLAS in order to supply a complete Level 3 BLAS.

  • ATLAS now supports the Level 1 BLAS operation SCAL, in order to examine the usefulness of generating Level 1 operations.

Authors of the ATLAS Library

R. Clint Whaley, Jack Dongarra
Computer Science Department
University of Tennessee
Knoxville, TN 37996-1301
        and
Mathematical Sciences Section
Oak Ridge National Laboratory
Oak Ridge, TN 37831

Supported Platforms at LRZ

The library itself is highly machine dependent. It is therefore extremely important to use the appropriate library for the respective machine; otherwise the results may be unpredictable or even wrong.

Linux versions of ATLAS

Since the Linux cluster consists of heterogeneous hardware, including Pentium II, Pentium III, Pentium III/Cascades and Pentium 4 processors, the correct version of ATLAS must be linked depending on the processor the program will run on. However, it does not matter on which machine the linking itself is performed.

The link step specifies a library path and the required libraries. The user stores the list of required libraries in an environment variable $BLAS. For IA32 systems, please use

pgf90 -o ./myprog <all_my_objects> -L/usr/local/lib $BLAS

For IA64 systems you might specify

efc -o ./myprog <all_my_objects> -L/usr/local/sys/lib/atlas/3.4.1/lib $BLAS

Furthermore, a version of LAPACK with ATLAS-optimized LU and Cholesky decomposition routines is also available; it is located in the same paths given above. The following table lists the values of $BLAS and $LAPACK for the various processors available.

Processor             $BLAS setting                            $LAPACK setting            Remarks
--------------------  ---------------------------------------  -------------------------  -------
Pentium III           -lf77blas_PIII_256K -latlas_PIII_256K    -llapack_ATLAS_PIII_256K   Version 3.2
Pentium III/Cascades  -lf77blas_PIII_2048K -latlas_PIII_2048K  -llapack_ATLAS_PIII_2048K  No SSE available yet.
Pentium 4             -lf77blas_P4 -latlas_P4                  -llapack_ATLAS_P4          Version 3.4.1 with SSE2 support (this doubles performance for D and Z calls). Suitable for 256 and 512 KByte L2 Cache processor versions.
Itanium2              -lf77blas -latlas                        -llapack

Please note that this table applies to the Fortran compiler. If you use ATLAS from C, you have to replace the Fortran interface library libf77blas_* by the corresponding libcblas_*. Furthermore, ATLAS presently supports only the PGI Fortran compiler; if you use the Intel compiler, LRZ recommends that you use the Math Kernel Library from Intel.

Please consult the introductory documentation for the Linux Cluster to find the processor type associated with each machine.
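
The choice of $BLAS and $LAPACK per processor can be scripted. The following is only a minimal sketch: the function name atlas_flags and the short processor keys (piii, cascades, p4, itanium2) are hypothetical and not part of the LRZ installation; the flag values themselves are copied from the table above.

```shell
#!/bin/sh
# Sketch: set BLAS and LAPACK for the processor the program will RUN on.
# atlas_flags and its keys are illustrative names (assumptions);
# the library flags are taken from the table in this document.
atlas_flags() {
    case "$1" in
        piii)
            BLAS="-lf77blas_PIII_256K -latlas_PIII_256K"
            LAPACK="-llapack_ATLAS_PIII_256K" ;;
        cascades)
            BLAS="-lf77blas_PIII_2048K -latlas_PIII_2048K"
            LAPACK="-llapack_ATLAS_PIII_2048K" ;;
        p4)
            BLAS="-lf77blas_P4 -latlas_P4"
            LAPACK="-llapack_ATLAS_P4" ;;
        itanium2)
            BLAS="-lf77blas -latlas"
            LAPACK="-llapack" ;;
        *)
            # Refuse to guess: a wrong library may give wrong results.
            echo "unknown processor key: $1" >&2
            return 1 ;;
    esac
    export BLAS LAPACK
}

# Example: settings for a Pentium 4 node.
atlas_flags p4
echo "BLAS=$BLAS"
echo "LAPACK=$LAPACK"
```

With the variables set this way, the IA32 link line shown earlier can be used unchanged. If LAPACK routines are needed as well, $LAPACK would normally be placed before $BLAS on the link line, since LAPACK itself calls BLAS routines.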

An Alternative to ATLAS: The FLAME BLAS

Availability

Kazushige Goto (Visiting Scientist, FLAME project, UT-Austin) has developed a high-performance BLAS implementation which is available for various architectures and in many cases outperforms ATLAS as well as the commercial BLAS implementations.

If you want to make use of the Goto BLAS on LRZ platforms, please make sure you include a citation (see the Goto BLAS web page for details).

The Goto BLAS is available on the Itanium2 platforms (Linux-Cluster

Usage

The libraries are generally available only as shared libraries; they are always installed in the directory /usr/local/sys/lib/goto_blas, referred to as $GOTO below. Linking is then done via

$F90 -o myprog ... -L/usr/local/sys/lib/goto_blas -lgoto_$ARCH

where $F90 and $ARCH are appropriately chosen from the following table.

Platform                 Fortran Compiler $F90  Architecture $ARCH  Remarks
-----------------------  ---------------------  ------------------  -------
Pentium 4                ifc                    p4_512-r0.6         also link in $GOTO/xerbla_ifc.o
Pentium 4                ifc                    p4_256-r0.6         this would apply to lxsrv13-18, which only have 256 KByte L2 Cache. Again, link in $GOTO/xerbla_ifc.o
Itanium2                 efc                    it2-r0.7            also link in $GOTO/xerbla_efc.o
Itanium2 multi-threaded  efc                    it2p-r0.7           also link in $GOTO/xerbla_efc.o
Power4                   xlf_r                  power4              also link in -lessl_r since only DGEMM and SGEMM are implemented.
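
To illustrate how the table entries combine with the link line above, here is a small sketch that prints (but does not run) the resulting command. The function name goto_link_cmd, the platform keys and the object file myprog.o are hypothetical placeholders; the compiler names, $ARCH values, xerbla objects and -lessl_r come from the table. Placing the xerbla object before the libraries and -lessl_r after the Goto library is an assumption about a sensible link order, not something prescribed by LRZ.

```shell
#!/bin/sh
# Installation directory as given in this document.
GOTO=/usr/local/sys/lib/goto_blas

# Sketch: print the Goto BLAS link command for a platform key.
# Keys and myprog.o are illustrative (assumptions).
goto_link_cmd() {
    OBJ=""     # extra object file to link (xerbla), if any
    LIBS=""    # extra libraries to append, if any
    case "$1" in
        p4_512) F90=ifc;   ARCH=p4_512-r0.6; OBJ="$GOTO/xerbla_ifc.o" ;;
        p4_256) F90=ifc;   ARCH=p4_256-r0.6; OBJ="$GOTO/xerbla_ifc.o" ;;
        it2)    F90=efc;   ARCH=it2-r0.7;    OBJ="$GOTO/xerbla_efc.o" ;;
        it2p)   F90=efc;   ARCH=it2p-r0.7;   OBJ="$GOTO/xerbla_efc.o" ;;
        power4) F90=xlf_r; ARCH=power4;      LIBS="-lessl_r" ;;
        *)      echo "unknown platform key: $1" >&2; return 1 ;;
    esac
    cmd="$F90 -o myprog myprog.o"
    if [ -n "$OBJ" ]; then cmd="$cmd $OBJ"; fi
    cmd="$cmd -L$GOTO -lgoto_${ARCH}"
    if [ -n "$LIBS" ]; then cmd="$cmd $LIBS"; fi
    echo "$cmd"
}

# Example: single-threaded Itanium2 and Power4.
goto_link_cmd it2
goto_link_cmd power4
```

Printing the command first makes it easy to check the chosen library against the target machine before actually linking.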

Contact Point at LRZ

For questions concerning the usage of the ATLAS or Goto BLAS libraries contact the LRZ Support Team:

Further Information