Radeon 9000 series

The R300 GPU, developed by ATI Technologies and introduced in August 2002, is the company's third generation of GPU used in Radeon graphics cards. It features 3D acceleration based on Direct3D 9.0 and OpenGL 2.0, a major improvement in features and performance over the preceding R200 design. R300 was the first fully Direct3D 9-capable consumer graphics chip. The processors also include 2D GUI acceleration, video acceleration, and multiple display outputs.

The first graphics card released with the R300 was the Radeon 9700. It marked the first time that ATI marketed its GPU as a Visual Processing Unit (VPU). R300 and its derivatives would form the basis for ATI's consumer and professional product lines for over three years.

AGP (9xxx series)

 * All models are manufactured with a 150 nm fabrication process
 * All models include DirectX 8.1 and OpenGL 1.4


 * 1 Pixel shaders : Vertex shaders : Texture mapping units : Render output units

IGP (9xxx series)

 * All models are manufactured with a 150 nm fabrication process
 * All models include DirectX 8.1 and OpenGL 1.4
 * Based on the Radeon 9200


 * 1 Pixel shaders : Vertex shaders : Texture mapping units : Render output units

AGP (9xxx series)

 * All models include DirectX 9.0 and OpenGL 2.0


 * 1 Pixel shaders : Vertex shaders : Texture mapping units : Render output units
 * 2 The 256-bit version of the 9800 SE, when unlocked to 8 pixel pipelines with third-party driver modifications, should perform close to a full 9800 Pro.

Development
ATI had held the lead for a while with the Radeon 8500, but NVIDIA retook the performance crown with the launch of the GeForce4 Ti line. A new high-end refresh part, the 8500XT (R250), was supposedly in the works, ready to compete against NVIDIA's high-end offerings, particularly the top-of-the-line Ti 4600. Pre-release information listed a 300 MHz core and RAM clock speed for the R250 chip. ATI, perhaps mindful of what had happened to 3dfx when it took focus off its Rampage processor, abandoned the R250 in favor of finishing its next-generation R300 card. This proved to be a wise move, as it enabled ATI to take the lead in development for the first time instead of trailing NVIDIA. The R300, with a next-generation architecture giving it unprecedented features and performance, would have been superior to any R250 refresh.

The R3xx chip was designed by ATI's west coast team (formerly ArtX Inc.), and the first product to use it was the Radeon 9700 PRO (internal ATI code name: R300; internal ArtX code name: Khan), launched in August 2002. The architecture of R300 differed from that of its predecessor, the Radeon 8500 (R200), in nearly every way. The core of the 9700 PRO was manufactured on a 150 nm chip fabrication process, like the Radeon 8500. However, refined design and manufacturing techniques enabled a doubling of transistor count and a significant clock speed gain.

One major change in the manufacturing of the core was the use of flip-chip packaging, a technology not previously used on video cards. Flip-chip packaging allows far better cooling of the die by flipping it and exposing it directly to the cooling solution, which let ATI achieve higher clock speeds. Radeon 9700 PRO launched at 325 MHz, ahead of the originally projected 300 MHz. With a transistor count of 110 million, it was the largest and most complex GPU of its time. A slower chip, the 9700, was launched a few months later, differing only in its lower core and memory speeds. Despite its complexity, the Radeon 9700 PRO was clocked significantly higher than the Matrox Parhelia 512, a card released only months before R300 and considered the pinnacle of graphics chip manufacturing (with 80 million transistors at 220 MHz) until R300's arrival.

Architecture
The chip adopted an architecture consisting of 8 pixel pipelines, each with 1 texture mapping unit (an 8x1 design). While this differed from older chips, which used 2 (or 3 for the original Radeon) texture units per pipeline, it did not mean R300 performed multi-texturing less efficiently than older chips. Its texture units could perform a new loopback operation which allowed them to sample up to 16 textures per geometry pass. The textures could be any combination of one, two, or three dimensions with bilinear, trilinear, or anisotropic filtering. This was part of the new DirectX 9 specification, along with more flexible floating-point-based Shader Model 2.0+ pixel shaders and vertex shaders. Equipped with 4 vertex shader units, R300 possessed over twice the geometry processing capability of the preceding Radeon 8500 and the GeForce4 Ti 4600, in addition to a greater feature set than DirectX 8 shaders offered.
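As a rough illustration of what the 8x1 layout means for throughput, the following sketch computes theoretical peak fillrates from the pipeline count, the texture units per pipeline, and the core clock. The 325 MHz figure is the Radeon 9700 PRO launch clock given earlier. The function names are hypothetical, the results are theoretical peaks rather than measured performance, and the loopback helper assumes for simplicity that each texture sample costs one trip through a pipeline's texture unit.

```python
import math


def peak_fillrates(pipelines: int, tmus_per_pipeline: int, core_mhz: int):
    """Return (megapixels/s, megatexels/s) as theoretical peaks.

    Assumes one pixel per pipeline per clock and one texel per
    texture unit per clock (a simplification for illustration).
    """
    pixel_rate = pipelines * core_mhz
    texel_rate = pipelines * tmus_per_pipeline * core_mhz
    return pixel_rate, texel_rate


def loopback_iterations(textures: int, tmus_per_pipeline: int) -> int:
    """Trips through the texture stage needed to sample `textures`
    textures in one geometry pass (hypothetical cost model)."""
    return math.ceil(textures / tmus_per_pipeline)


# Radeon 9700 PRO: 8 pipelines x 1 texture unit at 325 MHz.
pixel_rate, texel_rate = peak_fillrates(8, 1, 325)
print(pixel_rate, texel_rate)          # 2600 2600
print(loopback_iterations(16, 1))      # 16
```

Under this model an 8x1 design has equal pixel and texel rates, and sampling the maximum of 16 textures reuses the single texture unit rather than requiring extra geometry passes.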

ATI demonstrated part of what was possible with Pixel Shader 2.0 in its Rendering with Natural Light demo. The demo was a real-time implementation of noted 3D graphics researcher Paul Debevec's paper on high dynamic range rendering. A noteworthy limitation is that all R300-generation chips were designed for a maximum floating-point precision of 96-bit (FP24, 24 bits per component), instead of DirectX 9's maximum of 128-bit (FP32). DirectX 9.0 specified FP24 as the minimum level required for full precision. This trade-off in precision offered the best combination of transistor usage and image quality for the manufacturing process at the time. It did cause a loss of quality under heavy blending, though one that was usually imperceptible. ATI's Radeon chips did not go above FP24 until R520.
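The 96-bit and 128-bit figures correspond to four components at 24 or 32 bits each. The sketch below shows that arithmetic and approximates the precision gap by rounding a value to a given number of mantissa bits. It assumes the commonly described FP24 layout of 1 sign, 7 exponent, and 16 mantissa bits, a detail not stated in this article. The helper `quantize_mantissa` is hypothetical and ignores exponent range limits.

```python
import math

COMPONENTS = 4  # e.g. red, green, blue, alpha


def vector_bits(bits_per_component: int) -> int:
    """Total width of a four-component value."""
    return COMPONENTS * bits_per_component


def quantize_mantissa(x: float, mantissa_bits: int) -> float:
    """Round x to `mantissa_bits` explicit mantissa bits.

    Exponent range is ignored, so this models precision only.
    """
    if x == 0.0 or not math.isfinite(x):
        return x
    m, e = math.frexp(x)                 # x == m * 2**e, 0.5 <= |m| < 1
    scale = 2 ** (mantissa_bits + 1)     # +1 for the implicit leading bit
    return math.ldexp(round(m * scale) / scale, e)


FP24_MANTISSA = 16  # assumed layout: 1 sign + 7 exponent + 16 mantissa
FP32_MANTISSA = 23  # IEEE 754 single precision

print(vector_bits(24), vector_bits(32))   # 96 128

x = 1.0 + 2.0 ** -18                      # a small increment above 1.0
print(quantize_mantissa(x, FP24_MANTISSA) == 1.0)  # True: increment is lost
print(quantize_mantissa(x, FP32_MANTISSA) == x)    # True: increment survives
```

Small increments like this are what accumulate over many blending steps, which is consistent with the quality loss described above being confined to heavy blending.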

The R300 was the first board to truly take advantage of a 256-bit memory bus. Matrox had released its Parhelia 512 several months earlier, but that board did not show great gains from its 256-bit bus. ATI, however, had not only doubled its bus to 256-bit, but also integrated an advanced crossbar memory controller, somewhat similar to NVIDIA's memory technology. Using four individual load-balanced 64-bit memory controllers, ATI's memory implementation achieved high bandwidth efficiency by maintaining adequate granularity of memory transactions and thus working around memory latency limitations. R300 was also given the latest refinement of ATI's HyperZ memory bandwidth and fillrate saving technology, HyperZ III. The 8x1 architecture required more bandwidth than the 128-bit bus designs of the previous generation because it had double the texture and pixel fillrate.
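To make the bus arithmetic concrete, here is a minimal sketch of how four 64-bit controllers add up to a 256-bit bus and how peak bandwidth scales with bus width. The function names are hypothetical, the two-transfers-per-clock default assumes DDR memory, and the 300 MHz memory clock is a placeholder chosen for illustration, not a specification taken from this article.

```python
def bus_width_bits(controllers: int, bits_per_controller: int) -> int:
    """Total memory bus width from independent controllers."""
    return controllers * bits_per_controller


def peak_bandwidth_gb_s(bus_bits: int, mem_clock_mhz: float,
                        transfers_per_clock: int = 2) -> float:
    """Theoretical peak bandwidth in GB/s (10**9 bytes per second).

    transfers_per_clock=2 assumes DDR signalling.
    """
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * mem_clock_mhz * 1e6 * transfers_per_clock / 1e9


PLACEHOLDER_CLOCK_MHZ = 300.0  # illustrative only, not an R300 spec

wide = bus_width_bits(4, 64)   # four 64-bit controllers
narrow = 128                   # previous-generation bus width

ratio = (peak_bandwidth_gb_s(wide, PLACEHOLDER_CLOCK_MHZ)
         / peak_bandwidth_gb_s(narrow, PLACEHOLDER_CLOCK_MHZ))
print(wide, ratio)
```

At any fixed memory clock, the 256-bit bus doubles the theoretical peak of a 128-bit bus. How much of that peak is usable depends on the controller design, which is where the crossbar arrangement matters.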

Radeon 9700 introduced ATI's multi-sample gamma-corrected anti-aliasing scheme. The chip offered sparse sampling in modes including 2×, 4×, and 6×. Multi-sampling offered vastly superior performance over the supersampling method on older Radeons, and superior image quality compared to NVIDIA's offerings at the time. Anti-aliasing was, for the first time, a fully usable option even in the newest and most demanding titles of the day. The R300 also offered advanced anisotropic filtering, which incurred a much smaller performance hit than the anisotropic solution of the GeForce4 and other competitors' cards, while offering significantly improved quality over the Radeon 8500's anisotropic filtering implementation, which was highly angle-dependent.
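The idea behind gamma-corrected anti-aliasing can be sketched as follows: samples are converted to linear light before averaging and converted back afterwards, instead of averaging the stored gamma-encoded values directly. This is a simplified model and not ATI's hardware implementation. The function names are hypothetical, and a pure power-law display gamma of 2.2 is assumed.

```python
def resolve_naive(samples):
    """Average gamma-encoded sample values directly."""
    return sum(samples) / len(samples)


def resolve_gamma_corrected(samples, gamma: float = 2.2):
    """Average in linear light, then re-encode.

    gamma=2.2 is an assumed display gamma.
    """
    linear = [s ** gamma for s in samples]      # decode to linear light
    average = sum(linear) / len(linear)         # blend the samples
    return average ** (1.0 / gamma)             # encode for display


# A pixel on an edge, half covered by white (1.0) and half by black (0.0).
edge_samples = [0.0, 1.0]
naive = resolve_naive(edge_samples)
corrected = resolve_gamma_corrected(edge_samples)
print(naive, round(corrected, 2))
```

The naive average stores 0.5, which a gamma 2.2 display shows as much darker than half brightness. The corrected resolve stores a higher value (about 0.73 under this assumption), so edge pixels appear closer to their true coverage.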

On March 14, 2008, AMD released the 3D Register Reference for R3xx.

Performance
Radeon 9700's architecture was very efficient and considerably more powerful than its older peers of 2002. Under normal conditions it beat the GeForce4 Ti 4600, the previous top-end card, by 15–20%. With anti-aliasing (AA) and/or anisotropic filtering (AF) enabled, however, it beat the Ti 4600 by anywhere from 40% to 100%. At the time, this was astonishing, and it resulted in the widespread acceptance of AA and AF as critical, truly usable features.

Besides the advanced architecture, reviewers also took note of ATI's change in strategy. The 9700 would be the second of ATI's chips (after the 8500) to be shipped to third-party manufacturers instead of ATI producing all of its graphics cards, though ATI would still produce cards based on its highest-end chips. This freed up engineering resources that were channeled towards driver improvements, and the 9700 performed very well at launch as a result. id Software technical director John Carmack had the Radeon 9700 run the E3 Doom 3 demonstration.

The performance and quality increases offered by the R300 GPU are considered to be among the greatest in the history of 3D graphics, alongside the achievements of the GeForce 256 and Voodoo Graphics. Furthermore, NVIDIA's response in the form of the GeForce FX 5800 was both late to market and somewhat unimpressive, especially when pixel shading was used. R300 would become one of the GPUs with the longest useful lifetime in history, allowing playable performance in new games at least three years after its launch.

Further releases
A few months later, the 9500 and 9500 PRO were launched. The 9500 PRO had half the memory bus width of the 9700 PRO, and the 9500 also had half of its pixel processing units and its hierarchical Z-buffer optimization unit (part of HyperZ III) disabled. With its full 8 pipelines and efficient architecture, the 9500 PRO outperformed all of NVIDIA's products except the Ti 4600. Meanwhile, the 9500 also became popular because it could in some cases be modified into the much more powerful 9700. ATI intended the 9500 series only as a temporary solution to fill the gap for the 2002 Christmas season, prior to the release of the 9600. Since all of the R300 chips were based on the same physical die, ATI's margins on 9500 products were low. The Radeon 9500 was one of ATI's shortest-lived products, later replaced by the Radeon 9600 series. The logo and box packaging of the 9500 were resurrected in 2004 to market the unrelated and slower Radeon 9550 (a derivative of the 9600).

Refreshed
In early 2003, the 9700 cards were replaced by the 9800 (R350). These were R300s with higher clock speeds and with improvements to the shader units and memory controller which enhanced anti-aliasing performance. They were designed to maintain a performance lead over the recently launched GeForce FX 5800 Ultra, which they managed to do without difficulty. The 9800 still held its own against the revised FX 5900, primarily (and significantly) in tasks involving heavy SM2.0 pixel shading. Another selling point for the 9800 was that it was still a single-slot card, compared to the dual-slot requirements of the FX 5800 and FX 5900. A later version of the 9800 Pro with 256 MiB of memory used GDDR2. The other two variants were the 9800, which was simply a lower-clocked 9800 Pro, and the 9800 SE, which had half the pixel processing units disabled (these could sometimes be re-enabled). Official ATI specifications dictate a 256-bit memory bus for the 9800 SE, but most manufacturers used a 128-bit bus. The 9800 SE with a 256-bit memory bus was usually called "9800 SE Ultra" or "9800 SE Golden Version".

Alongside the 9800, the 9600 (RV350) series was rolled out in early 2003. While the 9600 PRO did not outperform the 9500 PRO that it was supposed to replace, it was much more economical for ATI to produce, thanks to a 130 nm process (all of ATI's cards since the 7500/8500 had been 150 nm) and a simplified design. Radeon 9600's RV350 core was basically a 9800 Pro cut in half, with exactly half of the same functional units, making it a 4×1 architecture with 2 vertex shaders. It also lost part of HyperZ III with the removal of the hierarchical Z-buffer optimization unit, the same as the Radeon 9500. The 130 nm process also helped push up the core clock speed. The 9600 series, already clocked high by default, was shown by overclockers to have considerable headroom (achieving over 500 MHz, from 400 MHz on the Pro model). While the 9600 series was less powerful than the 9500 and 9500 Pro it replaced, it did largely manage to maintain the 9500's lead over NVIDIA's GeForce FX 5600 Ultra, and it was ATI's cost-effective answer to the long-time mainstream performance board, the GeForce4 Ti 4200.
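The "cut in half" relationship and the overclocking headroom can be expressed as a small sketch. The `ChipConfig` type and both helpers are hypothetical. The 9800 Pro configuration (8×1 with 4 vertex shaders) is inferred here by doubling the RV350 figures stated above, and the clocks are the 400 MHz default and 500 MHz overclock quoted for the 9600 Pro.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChipConfig:
    pixel_pipelines: int
    tmus_per_pipeline: int
    vertex_shaders: int


def halve(cfg: ChipConfig) -> ChipConfig:
    """Halve the pipeline and vertex shader counts.

    The per-pipeline texture unit count stays the same.
    """
    return ChipConfig(cfg.pixel_pipelines // 2,
                      cfg.tmus_per_pipeline,
                      cfg.vertex_shaders // 2)


def headroom_percent(default_mhz: float, overclocked_mhz: float) -> float:
    """Clock gain over the default, as a percentage."""
    return (overclocked_mhz - default_mhz) / default_mhz * 100


r350 = ChipConfig(pixel_pipelines=8, tmus_per_pipeline=1, vertex_shaders=4)
rv350 = halve(r350)
print(rv350)                        # 4 pipelines, 1 TMU each, 2 vertex shaders
print(headroom_percent(400, 500))   # 25.0
```

An overclock from 400 MHz to 500 MHz is a 25% gain, which still leaves the halved design with fewer pipeline-clocks per second than an 8-pipeline part at 325 MHz.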

During the summer of 2003, the Mobility Radeon 9600 was launched, based upon the RV350 core. Being the first laptop chip to offer DirectX 9.0 shaders, it enjoyed the same success as the previous Mobility Radeons. The Mobility Radeon 9600 was originally planned to use a RAM technology called GDDR2-M. The company developing that memory went bankrupt and the RAM never arrived, so ATI was forced to use regular DDR SDRAM. GDDR2-M would likely have brought power savings and perhaps performance gains. In fall 2004, a slightly faster variant, the Mobility Radeon 9700, was launched (it was still based upon the RV350, and not the older R300 of the desktop Radeon 9700, despite the naming similarity).

Later in 2003, three new cards were launched: the 9800 XT (R360), the 9600 XT (RV360), and the 9600 SE (RV350). The 9800 XT was slightly faster than the 9800 PRO had been, while the 9600 XT competed well with the newly launched GeForce FX 5700 Ultra. The RV360 chip on the 9600 XT was ATI's first graphics chip to use low-k chip fabrication, which allowed even higher clocking of the 9600 core (500 MHz default). The 9600 SE was ATI's answer to NVIDIA's GeForce FX 5200 Ultra, managing to outperform the 5200 while also being cheaper. Another RV350 board followed in early 2004: the Radeon 9550, which was a Radeon 9600 with a lower core clock (though an identical memory clock and bus width).

Worthy of note regarding the R300-based generation is that the entire lineup utilized single-slot cooling solutions. It was not until the R420 generation's Radeon X850 XT Platinum Edition, in December 2004, that ATI would adopt an official dual-slot cooling design.