OpenBLAS

OpenBLAS is an open-source implementation of the BLAS (Basic Linear Algebra Subprograms) and LAPACK APIs with many hand-crafted optimizations for specific processor types. It is developed at the Lab of Parallel Software and Computational Science, ISCAS.

OpenBLAS adds optimized implementations of linear algebra kernels for several processor architectures, including Intel Sandy Bridge and Loongson. It claims to achieve performance comparable to the Intel MKL: this mostly holds true on the BLAS part, while the LAPACK part falls behind. On machines that support the AVX2 instruction set, OpenBLAS can achieve similar performance to MKL, but there are currently almost no open source libraries comparable to MKL on CPUs with the AVX512 instruction set.

OpenBLAS is a fork of GotoBLAS2, which was created by Kazushige Goto at the Texas Advanced Computing Center.

History and present
OpenBLAS was developed by the parallel software group led by Professor Yunquan Zhang from the Chinese Academy of Sciences.

OpenBLAS was initially only for the Loongson CPU platform. Dr. Xianyi Zhang contributed a lot of work. Since GotoBLAS was abandoned, the successor OpenBLAS is now developed as an open source BLAS library for multiple platforms, including x86, ARMv8, MIPS, and RISC-V platforms, and is respected for its excellent portability.

The parallel software group is modernizing OpenBLAS to meet current computing needs. For example, OpenBLAS's level-3 computations were primarily optimized for large and square matrices (often considered as regular-shaped matrices). And now irregular-shaped matrix multiplication are also supported, such as tall and skinny matrix multiplication (TSMM), which supports faster deep learning calculations on the CPU. TSMM is one of the core calculations in deep learning operations. Besides this, the compact function and small GEMM will also be supported by OpenBLAS.