Direct Rendering Infrastructure

The Direct Rendering Infrastructure (DRI) is the framework of the modern Linux graphics stack that allows unprivileged user-space programs to issue commands to graphics hardware without conflicting with other programs. The main use of DRI is to provide hardware acceleration for the Mesa implementation of OpenGL. DRI has also been adapted to provide OpenGL acceleration on a framebuffer console without a display server running.

The DRI implementation is scattered across the X Server and its associated client libraries, Mesa 3D, and the Direct Rendering Manager kernel subsystem. All of its source code is free software.

Overview
In the classic X Window System architecture the X Server is the only process with exclusive access to the graphics hardware, and therefore the one which does the actual rendering on the framebuffer. All that X clients do is communicate with the X Server to dispatch rendering commands. Those commands are hardware independent, meaning that the X11 protocol provides an API that abstracts the graphics device, so the X clients don't need to know or worry about the specifics of the underlying hardware. Any hardware-specific code lives inside the Device Dependent X (DDX), the part of the X Server that manages each type of video card or graphics adapter, which is also often called the video or graphics driver.

The rise of 3D rendering exposed the limits of this architecture. 3D graphics applications tend to produce large amounts of commands and data, all of which must be dispatched to the X Server for rendering. As the inter-process communication (IPC) between the X client and the X Server increased, 3D rendering performance suffered to the point that X driver developers concluded that, in order to take advantage of the 3D hardware capabilities of the latest graphics cards, a new IPC-less architecture was required. X clients should access the graphics hardware directly rather than relying on another process to do so, saving all the IPC overhead. This approach is called "direct rendering", as opposed to the "indirect rendering" provided by the classical X architecture. The Direct Rendering Infrastructure was initially developed to allow any X client to perform 3D rendering using this "direct rendering" approach.

Nothing prevents DRI from being used to implement accelerated 2D direct rendering within an X client; simply, no one has needed to do so, because 2D indirect rendering performance was good enough.

Software architecture
The basic architecture of the Direct Rendering Infrastructure involves three main components:
 * the DRI client (for example, an X client performing "direct rendering") needs a hardware-specific driver able to manage the current video card or graphics adapter in order to render on it. These DRI drivers are typically provided as shared libraries to which the client is dynamically linked. Since DRI was conceived to take advantage of 3D graphics hardware, the libraries are normally presented to clients as hardware-accelerated implementations of a 3D API such as OpenGL, provided by either the 3D-hardware vendor itself or a third party such as the Mesa 3D free software project.
 * the X Server provides an X11 protocol extension (the DRI extension) that DRI clients use to coordinate with both the windowing system and the DDX driver. It is quite common for the X Server process itself, as part of the DDX driver, to dynamically link to the same DRI driver that the DRI clients use, but in order to provide hardware-accelerated 3D rendering to X clients using the GLX extension for indirect rendering (for example, remote X clients that can't use direct rendering). For 2D rendering, the DDX driver must also take into account the DRI clients using the same graphics device.
 * access to the video card or graphics adapter is regulated by a kernel component called the Direct Rendering Manager (DRM). Both the X Server's DDX driver and each X client's DRI driver must use DRM to access the graphics hardware. DRM provides synchronization for the shared resources of the graphics hardware (resources such as the command queue, the card registers, the video memory and the DMA engines), ensuring that the concurrent access of multiple competing user-space processes doesn't cause them to interfere with each other. DRM also serves as a basic security enforcer that doesn't allow any X client to access the hardware beyond what it needs to perform the 3D rendering.
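To make the division of labor concrete, here is a deliberately simplified Python sketch of the arbiter role described above: clients must be authenticated before they may submit commands, and submissions to the shared command queue are serialized. All names here (ToyDRM, authenticate, submit) are invented for illustration; this is not the real DRM API.

```python
import threading

class ToyDRM:
    """Toy model of the DRM role: gatekeeper and serializer for a shared device."""

    def __init__(self):
        self._lock = threading.Lock()   # models serialized access to shared hardware state
        self._authenticated = set()
        self.command_log = []           # stands in for the hardware command queue

    def authenticate(self, client_id):
        # In reality authentication involves the X Server; here it is just a set.
        self._authenticated.add(client_id)

    def submit(self, client_id, commands):
        # Basic security enforcement: unknown clients are rejected.
        if client_id not in self._authenticated:
            raise PermissionError(f"{client_id} is not authenticated")
        with self._lock:                # only one submitter touches the queue at a time
            self.command_log.extend((client_id, c) for c in commands)
```

Both the DDX driver and every DRI client would play the role of a `client_id` here, funnelling all hardware access through the one kernel arbiter.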

DRI1
In the original DRI architecture, due to the limited memory size of video cards at that time, there was a single instance of the screen front buffer and back buffer (and also of the ancillary depth buffer and stencil buffer), shared by all the DRI clients and the X Server. All of them rendered directly onto the back buffer, which was swapped with the front buffer at vertical blanking (VBLANK) interval time. In order to render to the back buffer, a DRI process had to ensure that the rendering was clipped to the area reserved for its window.

Synchronization with the X Server was done through signals and a shared memory buffer called the SAREA. Access to the DRM device was exclusive, so any DRI client had to lock it at the beginning of a rendering operation. Other users of the device (including the X Server) were blocked in the meantime, and had to wait until the lock was released at the end of the current rendering operation, even when there would be no conflict between the operations. Another drawback was that the device didn't retain memory allocations after the current DRI process released its lock, so any data uploaded to graphics memory, such as textures, was lost to upcoming operations, causing a significant impact on graphics performance.
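The two bottlenecks just described can be illustrated with a toy Python model (all names are invented, not a real API): a single exclusive device lock, and graphics memory that does not survive the release of that lock.

```python
class ToyDRI1Device:
    """Toy model of the DRI1 bottlenecks: one exclusive lock for the whole
    device, and no memory retention across lock holders."""

    def __init__(self):
        self.holder = None
        self.texture_memory = {}

    def lock(self, client):
        # Exclusive access: everyone else, X Server included, must wait.
        if self.holder is not None:
            raise RuntimeError(f"device busy: held by {self.holder}")
        self.holder = client

    def upload_texture(self, client, name, data):
        assert self.holder == client, "must hold the device lock to upload"
        self.texture_memory[name] = data

    def unlock(self, client):
        assert self.holder == client
        self.holder = None
        self.texture_memory.clear()   # uploads are lost once the lock drops
```

In the real DRI1 the "lost" uploads had to be re-sent on the next lock acquisition, which is exactly the performance impact the text describes.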

Nowadays DRI1 is considered completely obsolete and must not be used.

DRI2
Due to the increasing popularity of compositing window managers like Compiz, the Direct Rendering Infrastructure had to be redesigned so that X clients could also support redirection to "offscreen pixmaps" while doing direct rendering. Regular X clients already respected the redirection to a separate pixmap provided by the X Server as a render target (the so-called offscreen pixmap), but DRI clients continued to render directly into the shared back buffer, effectively bypassing the compositing window manager. The ultimate solution was to change the way DRI handled the render buffers, which led to a completely different DRI extension with a new set of operations, and also to major changes in the Direct Rendering Manager. The new extension was named "DRI2", although it is not a later version but a different extension, not even compatible with the original DRI (in fact, both have coexisted within the X Server for a long time).

In DRI2, instead of a single shared (back) buffer, every DRI client gets its own private back buffer (along with its associated depth and stencil buffers) in which to render its window content using the hardware acceleration. The client then swaps it with a fake "front buffer", which the compositing window manager uses as one of the sources to compose (build) the final screen back buffer to be swapped at the VBLANK interval with the real front buffer.
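A minimal Python sketch of this buffer flow (invented names, no real rendering): each client draws into its private back buffer and swaps it with its fake front, and the compositor builds the screen contents only from the fake fronts, so unswapped work is never shown.

```python
class ToyDRI2Client:
    """Toy model of a DRI2 client's private buffer pair."""

    def __init__(self, name):
        self.name = name
        self.back = None        # private back buffer, rendered into directly
        self.fake_front = None  # what the compositor samples from

    def render(self, frame):
        self.back = f"{self.name}:{frame}"

    def swap(self):
        # Publish the finished frame by exchanging back and fake front.
        self.fake_front, self.back = self.back, self.fake_front

def composite(clients):
    """The compositing window manager builds the screen back buffer
    from every client's fake front buffer."""
    return [c.fake_front for c in clients if c.fake_front is not None]
```

Note how a client that has rendered but not yet swapped contributes its previous frame, which is precisely why partially drawn content no longer bypasses the compositor.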

To handle all these new buffers, the Direct Rendering Manager had to incorporate new functionality, specifically a graphics memory manager. DRI2 was initially developed using the experimental TTM memory manager, but it was later rewritten to use GEM after it was chosen as the definitive DRM memory manager. The new DRI2 internal buffer management model also solved two major performance bottlenecks present in the original DRI implementation:


 * DRI2 clients no longer lock the entire DRM device while using it for rendering, since now each client gets a separate render buffer independent from the other processes.
 * DRI2 clients can allocate their own buffers (with textures, vertex lists, ...) in the video memory and keep them as long as they want, which significantly reduces video memory bandwidth consumption.

In DRI2, the allocation of the private offscreen buffers for a window (back buffer, fake front buffer, depth buffer, stencil buffer, ...) is done by the X Server itself. DRI clients retrieve those buffers in order to render into the window by calling operations such as DRI2GetBuffers and DRI2GetBuffersWithFormat available in the DRI2 extension. Internally, DRI2 uses GEM names (a type of global handle provided by the GEM API that allows two processes accessing a DRM device to refer to the same buffer) to pass "references" to those buffers through the X11 protocol. The X Server is in charge of allocating the render buffers of a window because the GLX extension allows multiple X clients to do OpenGL rendering cooperatively in the same window; this way, the X Server manages the whole lifecycle of the render buffers along the entire rendering process and knows when it can safely recycle or discard them. When the window is resized, the X Server is also responsible for allocating new render buffers matching the new window size, and for notifying the DRI client(s) rendering into the window of the change using an InvalidateBuffers event, so that they retrieve the GEM names of the new buffers.
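The server-side allocation and resize protocol can be sketched as a toy Python model (invented names; plain integers stand in for GEM names): the server owns the buffers, hands out their names through a `get_buffers` call that stands in for DRI2GetBuffers, and invalidates them on resize.

```python
import itertools

class ToyDRI2Server:
    """Toy model of DRI2's server-side buffer management for one window."""

    _names = itertools.count(1)   # stands in for globally unique GEM names

    def __init__(self, width, height):
        self.size = (width, height)
        self._alloc()
        self.invalidated = False

    def _alloc(self):
        # The server allocates every offscreen buffer of the window itself.
        self.buffers = {kind: next(self._names)
                        for kind in ("back", "fake_front", "depth", "stencil")}

    def get_buffers(self):
        """Stands in for DRI2GetBuffers: the client fetches current names."""
        self.invalidated = False
        return dict(self.buffers)

    def resize(self, width, height):
        """New size means new buffers, and an InvalidateBuffers-style event."""
        self.size = (width, height)
        self._alloc()
        self.invalidated = True
```

A client that keeps rendering into the old names after a resize would draw into stale buffers, which is why it must refetch after every invalidation.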

The DRI2 extension provides other core operations for the DRI clients, such as finding out which DRM device and driver they should use (DRI2Connect) or getting authenticated by the X Server in order to be able to use the rendering and buffer facilities of the DRM device (DRI2Authenticate). The presentation of the rendered buffers on the screen is performed using the DRI2CopyRegion and DRI2SwapBuffers requests. DRI2CopyRegion can be used to copy between the fake front buffer and the real front buffer, but it doesn't provide any synchronization with the vertical blanking interval, so it can cause tearing. DRI2SwapBuffers, on the other hand, performs a VBLANK-synchronized swap between the back and front buffers, if that is supported and both buffers have the same size, or a copy (blit) otherwise.
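The swap-versus-blit decision just described can be condensed into a small illustrative function (a sketch of the rule, not the actual implementation):

```python
def toy_swap_buffers(back_size, front_size, flip_supported):
    """Toy decision mirroring DRI2SwapBuffers: a VBLANK-synchronized page
    flip only when the driver supports it and both buffers match in size,
    otherwise a copy (blit)."""
    if flip_supported and back_size == front_size:
        return "flip"   # swap back and front buffers at the VBLANK interval
    return "blit"       # fall back to copying the back buffer's contents
```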

DRI3
Although DRI2 was a significant improvement over the original DRI, the new extension also introduced some new issues. In 2013, a third iteration of the Direct Rendering Infrastructure known as DRI3 was developed in order to fix those issues.

The main differences of DRI3 compared to DRI2 are:
 * DRI3 clients allocate their own render buffers, instead of relying on the X Server to do the allocation as in DRI2.
 * DRI3 gets rid of the old, insecure GEM buffer sharing mechanism based on GEM names (global GEM handles) for passing buffer objects between a DRI client and the X Server, in favor of a more secure and versatile mechanism based on PRIME DMA-BUFs, which uses file descriptors instead.

Buffer allocation on the client side breaks GLX assumptions, in the sense that it is no longer possible for multiple GLX applications to render cooperatively into the same window. On the plus side, the fact that the DRI client is in charge of its own buffers throughout their lifetime brings many advantages. For example, it is easy for the DRI3 client to ensure that the size of the render buffers always matches the current size of the window, thereby eliminating the artifacts caused by the lack of synchronization of buffer sizes between client and server that plagued window resizing in DRI2. Better performance is also achieved, because DRI3 clients now save the extra round trip spent waiting for the X Server to send the render buffers. As another performance optimization, DRI3 clients, and especially compositing window managers, can keep older buffers of previous frames and reuse them as the basis on which to render only the damaged parts of a window. The DRI3 extension no longer needs to be modified to support new buffer formats, since these are now handled directly between the DRI client driver and the DRM kernel driver. The use of file descriptors, on the other hand, allows the kernel to safely clean up any unused GEM buffer object (one with no remaining reference to it).

Technically, DRI3 consists of two different extensions, the "DRI3" extension and the "Present" extension. The main purpose of the DRI3 extension is to implement the mechanism to share direct-rendered buffers between DRI clients and the X Server. DRI clients allocate and use GEM buffer objects as rendering targets, while the X Server represents these render buffers using a type of X11 object called a "pixmap". DRI3 provides two operations, DRI3PixmapFromBuffer and DRI3BufferFromPixmap: one to create a pixmap (in "X Server space") from a GEM buffer object (in "DRI client space"), and the other to do the reverse and get a GEM buffer object from an X pixmap. In these DRI3 operations, GEM buffer objects are passed as DMA-BUF file descriptors instead of GEM names. DRI3 also provides a way to share synchronization objects between the DRI client and the X Server, allowing both serialized access to the shared buffers. Unlike DRI2, the initial DRI3Open operation (the first request every DRI client must make to find out which DRM device to use) returns an already open file descriptor to the device node instead of the device node's filename, with any required authentication already performed in advance by the X Server.
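The key idea of passing buffers as file descriptors rather than global names can be demonstrated with ordinary POSIX facilities in Python (3.9+, Unix only): here a pipe stands in for a DMA-BUF, and its fd is transferred over a Unix socket much as DRI3PixmapFromBuffer transfers a buffer fd. The function and message names are invented for the sketch.

```python
import os
import socket

def share_buffer():
    """Toy analogue of DRI3 buffer sharing: instead of a global GEM name,
    a buffer is referenced by a file descriptor passed, kernel-mediated,
    over a Unix socket."""
    client, server = socket.socketpair()   # "DRI client" <-> "X Server"
    r, w = os.pipe()                       # the pipe stands in for a DMA-BUF
    os.write(w, b"RGBA pixels")            # "render" into the buffer
    os.close(w)
    # Send the fd itself, as DRI3PixmapFromBuffer sends a DMA-BUF fd.
    socket.send_fds(client, [b"pixmap-from-buffer"], [r])
    os.close(r)                            # the sender's copy can go away...
    msg, fds, _, _ = socket.recv_fds(server, 64, 1)
    data = os.read(fds[0], 64)             # ...the receiver's fd still works
    os.close(fds[0])
    client.close()
    server.close()
    return msg, data
```

Unlike a global name, the received fd is private to the receiving process and is reference-counted by the kernel, which is what makes the safe cleanup described above possible.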

DRI3 provides no mechanism to show the rendered buffers on the screen, but relies on another extension, the Present extension, to do so. Present is so named because its main task is to "present" buffers on the screen, meaning that it handles the update of the framebuffer using the contents of the rendered buffers delivered by client applications. Screen updates have to be done at the proper time, normally during the VBLANK interval in order to avoid display artifacts such as tearing. Present also handles the synchronization of screen updates to the VBLANK interval. It also keeps the X client informed about the instant each buffer is really shown on the screen using events, so the client can synchronize its rendering process with the current screen refresh rate.

Present accepts any X pixmap as the source for a screen update. Since pixmaps are standard X objects, Present can be used not only by DRI3 clients performing direct rendering, but also by any X client rendering on a pixmap by any means. For example, most existing non-GL based GTK+ and Qt applications used to do double buffered pixmap rendering using XRender. The Present extension can also be used by these applications to achieve efficient and non-tearing screen updates. This is the reason why Present was developed as a separate standalone extension instead of being part of DRI3.

Apart from allowing non-GL X clients to synchronize with VBLANK, Present brings other advantages. DRI3 graphics performance is better because Present is more efficient than DRI2 in swapping buffers. A number of OpenGL extensions that weren't available with DRI2 are now supported based on new features provided by Present.

Present provides two main operations to X clients: updating a region of a window using part or all of the contents of a pixmap (PresentPixmap), and selecting which presentation events related to a certain window the client wants to be notified about (PresentSelectInput). There are three presentation events a window can notify an X client about: when an ongoing presentation operation (normally from a call to PresentPixmap) has completed (PresentCompleteNotify), when a pixmap used by a PresentPixmap operation is ready to be reused (PresentIdleNotify), and when the window configuration (mostly the window size) changes (PresentConfigureNotify). Whether a PresentPixmap operation performs a direct copy (blit) onto the front buffer or a swap of the entire back buffer with the front buffer is an internal detail of the Present implementation, rather than an explicit choice of the X client as it was in DRI2.
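As a rough illustration of a PresentPixmap round trip, the following Python toy (invented names, with plain dictionaries standing in for X objects) performs the screen update and then delivers only the events the client selected, as it would via PresentSelectInput:

```python
def toy_present_pixmap(pixmap, window, selected_events):
    """Toy sketch of a PresentPixmap round trip. Whether the update is a
    flip or a copy stays internal, as in the real Present extension; only
    the resulting events are visible to the client."""
    window["contents"] = pixmap["contents"]   # the screen update itself
    events = []
    if "complete" in selected_events:
        # The presentation has actually reached the screen.
        events.append(("PresentCompleteNotify", window["id"]))
    if "idle" in selected_events:
        # The pixmap may now be reused for rendering the next frame.
        events.append(("PresentIdleNotify", pixmap["id"]))
    return events
```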

Adoption
Several open source DRI drivers have been written, including ones for ATI Mach64, ATI Rage128, ATI Radeon, 3dfx Voodoo3 through Voodoo5, Matrox G200 through G400, SiS 300-series, Intel i810 through i965, S3 Savage, VIA UniChrome graphics chipsets, and nouveau for Nvidia. Some graphics vendors have written closed-source DRI drivers, including ATI and PowerVR Kyro.

The various versions of DRI have been implemented by various operating systems, amongst others by the Linux kernel, FreeBSD, NetBSD, OpenBSD, and OpenSolaris.

History
The project was started by Jens Owen and Kevin E. Martin from Precision Insight (funded by Silicon Graphics and Red Hat). It was first made widely available as part of XFree86 4.0 and is now part of the X.Org Server. It is currently maintained by the free software community.

Work on DRI2 started at the 2007 X Developers' Summit from a proposal by Kristian Høgsberg. Høgsberg himself wrote the new DRI2 extension and the modifications to Mesa and GLX. In March 2008 DRI2 was mostly done, but it couldn't make it into X.Org Server version 1.5 and had to wait until version 1.6, released in February 2009. The DRI2 extension was officially included in the X11R7.5 release of October 2009. The first public version of the DRI2 protocol (2.0) was announced in April 2009. Since then there have been several revisions, the most recent being version 2.8 from July 2012.

Due to several limitations of DRI2, a new extension called DRI-Next was proposed by Keith Packard and Emma Anholt at the X.Org Developer's Conference 2012. The extension was proposed again as DRI3000 at linux.conf.au 2013. The DRI3 and Present extensions were developed during 2013 and merged into the X.Org Server 1.15 release of December 2013. The first and only version of the DRI3 protocol (1.0) was released in November 2013.