
CHPC Software: Math Libraries

In this document, a math library is a software package that provides functions for particular mathematical operations. The term is broad, so the list below is not exhaustive, but it covers the math functions most commonly used in scientific and engineering computations.

Math libraries can be roughly divided into general libraries, which provide a wide range of functionality, and specialized libraries, which provide specific functionality. The general libraries include the Intel Math Kernel Library (MKL) and the GNU Scientific Library (GSL). The specialized libraries include the BLAS and LAPACK linear algebra libraries, the FFTW Fast Fourier Transform library, and others. The general libraries often provide optimized versions of the specialized libraries, or use them underneath.

The libraries listed below are the most common ones that we provide. If you don't see the one you need, please contact us.

MKL contains highly optimized math routines. It includes fully optimized BLAS and LAPACK, sparse solvers, a vector math library, random number generators, and fast Fourier transform routines (including FFTW wrappers). For more information, consult the Intel Math Kernel Library Documentation.

MKL is supplied as an independent module that depends on a compiler. For example, to use MKL with the Intel compiler, load both the compiler module and the MKL module:

module load intel-oneapi-compilers intel-oneapi-mkl
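All link lines in this section rely on the MKLROOT environment variable, so it can be worth confirming that it is set after loading the modules. The following is a minimal sketch of such a check; the helper name mkl_env_ok is hypothetical and is not provided by the MKL module.

```shell
# Hypothetical helper (not part of the MKL module): succeeds only if
# MKLROOT is set and names an existing directory.
mkl_env_ok() {
    [ -n "${MKLROOT:-}" ] && [ -d "$MKLROOT" ]
}

if mkl_env_ok; then
    echo "MKLROOT is set to $MKLROOT"
else
    echo "MKLROOT is not usable; load the MKL module first"
fi
```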

Compilation instructions:

The examples below (diagonalization of a symmetric matrix) require the source files lapack1.f90 and lapack1.c

Intel Fortran (using dynamic linking)

ifort lapack1.f90 -o lapack1_ifort -L$MKLROOT/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -Wl,-rpath=$MKLROOT/lib/intel64

Intel C/C++ (using dynamic linking)

icc lapack1.c -o lapack1_icc -L$MKLROOT/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -Wl,-rpath=$MKLROOT/lib/intel64

If you use the C++ compiler, replace icc with icpc and change the suffix .c to .cc in the previous command.

It is also possible to incorporate OpenMP-threaded MKL into an OpenMP or mixed MPI/OpenMP code. To do so, parallelize your code with OpenMP but leave the MKL calls unthreaded, and link the threaded MKL library instead, for example:

icc lapack1.c -o lapack1_icc_mt -L$MKLROOT/lib/intel64 -Wl,-rpath=$MKLROOT/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread

Then run as you usually would, with OMP_NUM_THREADS set to the desired thread count; the MKL calls will run over that many threads as well.
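A job script typically sets the thread count just before launching the program. The sketch below shows one way to do that. It assumes the job runs under Slurm, which this document does not state, and the helper name omp_threads is hypothetical. Slurm sets SLURM_CPUS_PER_TASK only when --cpus-per-task was requested, so the helper falls back to a single thread otherwise.

```shell
# Hypothetical helper: print the thread count to use for OMP_NUM_THREADS.
# Assumption: a Slurm job; SLURM_CPUS_PER_TASK is unset unless
# --cpus-per-task was requested, in which case 1 thread is used.
omp_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

export OMP_NUM_THREADS="$(omp_threads)"
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

After this, launching ./lapack1_icc_mt runs both the OpenMP regions of the program and the threaded MKL calls with that many threads.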

For distributed (MPI) parallel linear algebra, ScaLAPACK is fully implemented inside MKL, and we recommend it over the reference ScaLAPACK distribution. Since release 11.2 (2015), MKL also includes cluster sparse matrix solvers based on PARDISO. Large sparse eigenproblems can be solved, to a given tolerance, using PRIMME, which can be linked to MKL for LAPACK/BLAS.

These and other advanced MKL routines require relatively complex link lines, which are best generated with the MKL Link Line Advisor page. The Link Line Advisor also generates link flags for the GNU and PGI compilers; we recommend MKL with these compilers as well, since it generally provides superior performance. To use the GNU or PGI compilers with MKL, first load the intel module, then the GNU or PGI module, and then any other libraries to be used with that compiler.
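For the two simple cases shown earlier (sequential and OpenMP-threaded MKL with the LP64 interface), the link flags differ only in the threading layer. The sketch below collects them in one shell function so a build script can switch between the two. The function name mkl_link_flags is hypothetical, and the flags are copied from the examples above rather than generated by the Link Line Advisor.

```shell
# Hypothetical helper: print the MKL link flags used in the examples above.
# Argument: "sequential" (default) or "threaded".
# Requires MKLROOT to be set, e.g. by loading the MKL module.
mkl_link_flags() {
    mkl_mode="${1:-sequential}"
    if [ -z "${MKLROOT:-}" ]; then
        echo "MKLROOT is not set" >&2
        return 1
    fi
    mkl_libdir="$MKLROOT/lib/intel64"
    case "$mkl_mode" in
        sequential)
            mkl_libs="-lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread" ;;
        threaded)
            mkl_libs="-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread" ;;
        *)
            echo "unknown mode: $mkl_mode" >&2
            return 1 ;;
    esac
    echo "-L$mkl_libdir -Wl,-rpath=$mkl_libdir $mkl_libs"
}
```

A possible use is icc lapack1.c -o lapack1_icc_mt $(mkl_link_flags threaded), where the command substitution is left unquoted so that the shell splits it into separate flags.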

MKL also includes an interface for FFTW, a commonly used Fast Fourier Transform library. This interface is especially advantageous when building binaries for multiple CPU architectures with the -ax Intel compiler flag. The header files for the FFTW interface are in $MKLROOT/include/fftw.

GSL is a numerical library for C/C++ that provides a wide range of mathematical routines, such as random number generators, special functions, and least-squares fitting. It contains over 1000 functions in total, with an extensive test suite. While GSL is not parallel, it is reasonably thread safe, and its routines should be callable from parallel code sections. One can also link a parallel BLAS library such as MKL or ACML and use the shared-memory parallelism it provides.

GNU gcc

module load gcc/8.5.0 gsl
gcc source.c -o executable -I$GSL_ROOT/include -L$GSL_ROOT/lib -lgsl -lgslcblas -lm -Wl,-rpath=$GSL_ROOT/lib

This links with a generic, unoptimized version of BLAS. $GSL_ROOT is an environment variable defined by the gsl module.

Intel C/C++

module load intel-oneapi-compilers gsl intel-oneapi-mkl
icc source.c -O3 -axCORE-AVX2,AVX,SSE4.2 -o executable -I$GSL_ROOT/include -L$GSL_ROOT/lib -lgsl -L$MKLROOT/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -Wl,-rpath=$GSL_ROOT/lib -Wl,-rpath=$MKLROOT/lib/intel64

For C++, use icpc instead of icc.

This links with the threaded MKL BLAS library for optimal performance and OpenMP parallelism.

OpenBLAS is an optimized BLAS library based on GotoBLAS2. Its advantage is relative simplicity; its disadvantage is lower maturity. Some of the applications we build link to OpenBLAS for simplicity, but we recommend that everyone use MKL instead. OpenBLAS is available as a module, e.g. module load gcc/8.5.0 openblas. To link it, add the following to the link line: -Wl,-rpath=$OPENBLAS_ROOT/lib -L$OPENBLAS_ROOT/lib -lopenblas.

LAPACK (Linear Algebra PACKage) provides routines for solving systems of simultaneous linear equations, least-squares solutions of linear systems of equations, eigenvalue problems, and singular value problems. The reference implementation runs on a single processor only. The CentOS 7 operating system comes with reference LAPACK (and BLAS), but for optimal performance we highly recommend Intel MKL, which includes the full LAPACK. Linking LAPACK from MKL is the same as linking BLAS, described above.

The ScaLAPACK (or Scalable LAPACK) library includes a subset of LAPACK routines redesigned for distributed memory MIMD parallel computers. It is written in a Single-Program-Multiple-Data style using explicit message passing for interprocessor communication. It assumes matrices are laid out in a two-dimensional block cyclic decomposition.

The fundamental building blocks of the ScaLAPACK library are distributed memory versions (PBLAS) of the Level 1, 2 and 3 BLAS, and a set of Basic Linear Algebra Communication Subprograms (BLACS) for communication tasks that arise frequently in parallel linear algebra computations. In the ScaLAPACK routines, all interprocessor communication occurs within the PBLAS and the BLACS. One of the design goals of ScaLAPACK was to have the ScaLAPACK routines resemble their LAPACK equivalents as much as possible.
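To make the two-dimensional block cyclic decomposition more concrete, the sketch below maps a global index to the owning process coordinate and the local index in one dimension; the same mapping is applied independently to rows and columns. It is an illustration only and not ScaLAPACK code: the function name block_cyclic_map is hypothetical, indices are 0-based (ScaLAPACK itself uses 1-based Fortran indices), and the first block is assumed to live on process coordinate 0.

```shell
# Hypothetical helper: 1D block-cyclic mapping with 0-based indices.
# Arguments: global index, block size, number of processes in this dimension.
# Prints: "<process coordinate> <local index>".
# Assumption: the first block is stored on process coordinate 0.
block_cyclic_map() {
    g="$1"; nb="$2"; nprocs="$3"
    # Blocks are dealt out to processes in round-robin order.
    owner=$(( (g / nb) % nprocs ))
    # Full rounds completed before this block, plus the offset inside it.
    local_idx=$(( (g / (nb * nprocs)) * nb + g % nb ))
    echo "$owner $local_idx"
}

# Element (7, 4) of a matrix with 2x2 blocks on a 3x2 process grid.
echo "row:    $(block_cyclic_map 7 2 3)"   # prints "row:    0 3"
echo "column: $(block_cyclic_map 4 2 2)"   # prints "column: 0 2"
```

In this example, element (7, 4) is stored on the process at grid position (0, 0), at local position (3, 2).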
 
Intel MKL provides the full ScaLAPACK, and we recommend using it together with the Intel compilers and the Intel MPI library for optimal performance. See the Intel MKL Link Line Advisor for the correct compiler and linker flags. The ILP64 interface expects 64-bit integer arguments in MKL calls, so the program must declare them accordingly. The following links a hybrid MPI/OpenMP program with ScaLAPACK from MKL, using Intel Fortran and Intel MPI, with 64-bit integer (ILP64) support for large data sizes:

module load intel-oneapi-compilers intel-oneapi-mpi intel-oneapi-mkl
mpiifort -qopenmp -o executable program.f90 -I$MKLROOT/include -Wl,-rpath=$MKLROOT/lib/intel64 -L$MKLROOT/lib/intel64 -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_core -lmkl_intel_thread -lmkl_blacs_intelmpi_ilp64 -liomp5 -lpthread -lm

FFTW (the Fastest Fourier Transform in the West) is a high-performance Fast Fourier Transform (FFT) library. Besides being optimized for most PC architectures, it also supports OpenMP and MPI parallelism. The latest serial and OpenMP-threaded builds for the three compilers that we support (GNU, Intel and NVHPC) can be accessed through their respective modules. To link serial FFTW with, e.g., the Intel compiler, add -L$FFTW_ROOT/lib -lfftw3 to the link line. To link OpenMP FFTW, add -lfftw3_omp to the serial link line.

For example, for the Intel compiler with OpenMP:

module load intel-oneapi-compilers fftw
icc -qopenmp myprog.c -o myprog.exe -I$FFTW_ROOT/include -L$FFTW_ROOT/lib -Wl,-rpath=$FFTW_ROOT/lib -lfftw3_omp -lfftw3

Please note that FFTW version 2, which is incompatible with FFTW 3, is still used by some codes. It is available as the module fftw/2.1.5.
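When building an unfamiliar code, the FFTW header it includes usually indicates which version it targets: FFTW 3 code includes fftw3.h, while FFTW 2 code includes fftw.h or rfftw.h, sometimes with an s or d precision prefix. The sketch below applies that rule to a single source file. The function name fftw_api_version is hypothetical, and the check is a heuristic that looks only at include lines.

```shell
# Hypothetical helper: guess the FFTW major version a C source file targets
# from the header it includes. Prints 3, 2, or "unknown".
fftw_api_version() {
    if grep -Eq '#[[:space:]]*include[[:space:]]*[<"]fftw3(-mpi)?\.h[>"]' "$1"; then
        echo 3
    elif grep -Eq '#[[:space:]]*include[[:space:]]*[<"][sd]?r?fftw(_mpi|_threads)?\.h[>"]' "$1"; then
        echo 2
    else
        echo unknown
    fi
}
```

For example, fftw_api_version myprog.c printing 2 suggests loading fftw/2.1.5 rather than the default fftw module.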

Also note that Intel MKL includes FFTW wrappers whose FFT performance is on par with FFTW; for information on how to link them, see our MKL documentation.

 

Last Updated: 3/22/22