OneAPI (compute acceleration)

oneAPI is an open standard, adopted by Intel, for a unified application programming interface (API) intended to be used across different computing accelerator (coprocessor) architectures, including GPUs, AI accelerators and field-programmable gate arrays. It is intended to eliminate the need for developers to maintain separate code bases, multiple programming languages, tools, and workflows for each architecture.

oneAPI competes with other GPU computing stacks: CUDA by Nvidia and ROCm by AMD.

Specification
The oneAPI specification extends existing developer programming models to support multiple hardware architectures. It comprises a data-parallel language, a set of library APIs, and a low-level hardware interface for cross-architecture programming. It builds upon industry standards and provides an open, cross-platform developer stack.

Data Parallel C++
Data Parallel C++ (DPC++) is the programming language implementation of oneAPI, built upon the ISO C++ and Khronos Group SYCL standards. It implements SYCL together with extensions proposed for inclusion in the SYCL standard, including unified shared memory, group algorithms, and sub-groups.
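As an illustration of the unified shared memory model mentioned above, the following is a minimal sketch of an element-wise vector addition written against the SYCL 2020 API. The helper name vector_add and the HAVE_SYCL macro are hypothetical and chosen only for this example. The sketch assumes a SYCL 2020 toolchain (for instance, the Intel oneAPI DPC++/C++ Compiler invoked as icpx -fsycl) and a device that supports shared allocations; when no SYCL header is found, it falls back to a plain host loop so that it still compiles with an ordinary C++ compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Detect a SYCL 2020 header. HAVE_SYCL is a hypothetical macro for this sketch.
#if defined(__has_include)
#  if __has_include(<sycl/sycl.hpp>)
#    include <sycl/sycl.hpp>
#    define HAVE_SYCL 1
#  endif
#endif
#ifndef HAVE_SYCL
#  define HAVE_SYCL 0
#endif

// Hypothetical helper: returns the element-wise sum of two equally sized vectors.
std::vector<float> vector_add(const std::vector<float>& a,
                              const std::vector<float>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    std::vector<float> out(n);
    if (n == 0) return out;

#if HAVE_SYCL
    // A default-constructed queue targets whichever device the runtime selects.
    sycl::queue q;

    // Unified shared memory: the same pointers are valid on host and device.
    // Assumption: the selected device supports shared allocations.
    float* x = sycl::malloc_shared<float>(n, q);
    float* y = sycl::malloc_shared<float>(n, q);
    float* z = sycl::malloc_shared<float>(n, q);
    assert(x != nullptr && y != nullptr && z != nullptr);

    for (std::size_t i = 0; i < n; ++i) {
        x[i] = a[i];
        y[i] = b[i];
    }

    // Launch one work-item per element and wait for the kernel to finish.
    q.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
        const std::size_t idx = i[0];
        z[idx] = x[idx] + y[idx];
    }).wait();

    for (std::size_t i = 0; i < n; ++i) out[i] = z[i];

    sycl::free(x, q);
    sycl::free(y, q);
    sycl::free(z, q);
#else
    // Host-only fallback used when no SYCL toolchain is available.
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
#endif
    return out;
}
```

Calling vector_add({1, 2, 3}, {10, 20, 30}) yields a vector holding 11, 22 and 33 on either path. The same kernel source can target a CPU, GPU or other accelerator depending on which device the queue selects, which is the single-code-base property the specification aims for.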

Libraries
The set of APIs spans several domains, including libraries for linear algebra, deep learning, machine learning, video processing, and others.

The source code of parts of the above libraries is available on GitHub.

The oneAPI documentation also lists the "Level Zero" API, which defines the low-level direct-to-metal interfaces, and a set of ray tracing components with their own APIs.

Hardware abstraction layer
oneAPI Level Zero, the low-level hardware interface, defines a set of capabilities and services that a hardware accelerator needs to interface with compiler runtimes and other developer tools.

Implementations
Intel has released oneAPI production toolkits that implement the specification and add CUDA code migration, analysis, and debug tools. These include the Intel oneAPI DPC++/C++ Compiler, Intel Fortran Compiler, Intel VTune Profiler and multiple performance libraries.

Codeplay has released an open-source layer to allow oneAPI and SYCL/DPC++ to run atop Nvidia GPUs via CUDA.

The University of Heidelberg has developed a SYCL/DPC++ implementation for both AMD and Nvidia GPUs.

Huawei has released a DPC++ compiler for its Ascend AI chipset.

Fujitsu has created an open-source ARM version of the oneAPI Deep Neural Network Library (oneDNN) for the A64FX CPU used in its Fugaku supercomputer.

Unified Acceleration Foundation (UXL) and the future for oneAPI
The Unified Acceleration Foundation (UXL) is a technology consortium that is working on the continuation of the oneAPI initiative. Its goal is to create a new open-standard accelerator software ecosystem, together with related open standards and specification projects, through Working Groups and Special Interest Groups (SIGs). The initiative aims to compete with Nvidia's CUDA. The main companies behind it are Intel, Google, ARM, Qualcomm, Samsung, Imagination, and VMware.