The GeForce FX features DDR, DDR2 or GDDR-3 memory, a 130 nm fabrication process, and vertex and pixel shaders compliant with Shader Model 2.0/2.0a. The FX series is fully compliant and compatible with DirectX 9.0b. The GeForce FX also included an improved VPE (Video Processing Engine), which was first deployed in the GeForce4 MX. Its main upgrade was per-pixel video deinterlacing, a feature first offered in ATI's Radeon but little used until Microsoft's DirectX-VA and VMR (Video Mixing Renderer) APIs matured. Another feature was an improved anisotropic filtering algorithm that, unlike that of the competing Radeon 9700/9800 series, was not angle-dependent; it offered better quality at some cost in performance. Although Nvidia reduced the filtering quality in its drivers for a while, the company eventually restored it, and this feature remains one of the high points of the GeForce FX family. (Nvidia dropped this method of anisotropic filtering with the GeForce 6 series for performance reasons and reintroduced it with the GeForce 8 series.)
The advertising campaign for the GeForce FX featured the Dawn fairy demo, the work of several veterans of the computer-animated film Final Fantasy: The Spirits Within. Nvidia touted it as "The Dawn of Cinematic Computing", while critics noted that it was the most blatant use yet of sex appeal to sell graphics cards. It is probably still the best known of the Nvidia demos.
A second reason was Nvidia's commitment to Microsoft. Many of Nvidia’s best engineers were working on the Xbox contract, developing a motherboard solution that included the APU (audio processing unit) used as part of the SoundStorm platform and the graphics processor (NV2A). The Xbox venture diverted most of Nvidia's engineers, not only during the NV2A's initial design cycle but also during the mid-life product revisions needed to discourage hackers. The Xbox contract made no allowance for manufacturing costs falling as process technology improved, so Microsoft sought to renegotiate its terms, withholding the DirectX 9 specifications as leverage. As a result, relations between Nvidia and Microsoft, which had previously been very good, deteriorated. (Both parties later settled the dispute through arbitration, and the terms were not released to the public.)
Due to the Xbox dispute, Nvidia was not consulted when the DirectX 9 specification was drawn up, while ATI designed the Radeon 9700 to that specification. The Radeon 9700's rendering color support was limited to 24-bit floating point per channel, and shader performance had been emphasized throughout its development, since this was to be the main focus of DirectX 9. Microsoft's shader compiler was also built using the Radeon 9700 as the base card instead of Nvidia's offering. In contrast, Nvidia’s cards offered 16-bit and 32-bit floating-point modes, which gave either lower visual quality than the competition or slow performance. The 32-bit support also made the chips much more expensive to manufacture, because it required a higher transistor count. Shader performance was often half or less of that provided by ATI's competing products. Having made its reputation by providing easy-to-manufacture, DirectX-compatible parts, Nvidia had misjudged Microsoft’s next standard and was to pay a heavy price for the error. As more and more games started to rely on DirectX 9 features, the poor shader performance of the GeForce FX series became ever more obvious. With the exception of the FX 5700 series (a late revision), the FX series lacked performance compared to equivalent ATI parts.
Finally, Nvidia's transition to a 130 nm manufacturing process encountered unexpected difficulties. Nvidia had ambitiously selected TSMC's then state-of-the-art (but unproven) low-k dielectric 130 nm process node. After sample silicon wafers exhibited abnormally high defect rates and poor circuit performance, Nvidia was forced to re-tool the NV30 for a conventional (FSG) 130 nm process node. (Nvidia's manufacturing difficulties with TSMC spurred the company to search for a second foundry. Nvidia selected IBM to fabricate several future GeForce chips, citing IBM's process technology leadership. Curiously, however, Nvidia avoided IBM's low-k process.)
Hardware enthusiasts saw the GeForce FX series as a disappointment because it did not live up to expectations. Nvidia had aggressively hyped the card throughout the summer and autumn of 2002 to counter ATI Technologies' autumn release of the powerful Radeon 9700. ATI's very successful Shader Model 2 card arrived several months earlier than Nvidia's first NV30 board, the GeForce FX 5800.
When the FX 5800 finally launched, testing by hardware analysts showed that the NV30 was no match for the Radeon 9700's R300 core, especially when pixel shading was involved. Additionally, the 5800 had roughly a 30% memory bandwidth deficit caused by its comparatively narrow 128-bit memory bus (ATI and other companies had moved to 256-bit). Nvidia had planned to compensate by using the new, state-of-the-art GDDR-2, which supported much higher clock rates, but the memory could not be clocked high enough to match the bandwidth of a 256-bit bus.
While the NV30's direct competitor, the R300 core, was capable of 8 pixels per clock with its 8 pipelines, the NV30 architecture was unable to render 8 color + Z pixels per clock. It is therefore more accurately described as a 4 × 2 design capable of 8 Z pixels, 8 stencil operations, 8 textures, and 8 shader operations per clock. This limited its pixel fill rate in the majority of 3D applications. In games that make heavy use of stencil shadows, such as those based on the Doom 3 engine, the NV30 did benefit from its 8 pixels/operations per clock, because the engine performs a Z-only pass. This was not a typical rendering scenario, however.
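The fill-rate argument above can be made concrete with a little arithmetic. The following is a minimal sketch, not vendor data: the NV30 per-clock figures come from the paragraph above and the 500 MHz clock from the FX 5800 Ultra row of the model table, while the 325 MHz R300 clock (Radeon 9700 Pro) and the `Core` helper class are assumptions introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    """Hypothetical helper: theoretical fill rates from per-clock throughput."""
    name: str
    core_clock_mhz: float        # core clock in MHz
    color_pixels_per_clock: int  # color + Z pixels written per clock
    z_pixels_per_clock: int      # Z-only (or stencil) pixels per clock

    def color_fill_mpixels(self) -> float:
        # Mpixel/s = MHz * pixels per clock
        return self.core_clock_mhz * self.color_pixels_per_clock

    def z_fill_mpixels(self) -> float:
        return self.core_clock_mhz * self.z_pixels_per_clock


# NV30: 4 color pixels and 8 Z pixels per clock (from the text),
# 500 MHz core clock of the FX 5800 Ultra (from the model table).
nv30 = Core("NV30 (FX 5800 Ultra)", 500, 4, 8)

# R300: 8 pixels per clock (from the text). The 325 MHz clock is an
# ASSUMPTION (Radeon 9700 Pro); it is not stated in this article.
# The sketch also assumes R300 gains nothing extra in a Z-only pass.
r300 = Core("R300 (Radeon 9700 Pro)", 325, 8, 8)

for core in (nv30, r300):
    print(f"{core.name}: {core.color_fill_mpixels():.0f} Mpixel/s color, "
          f"{core.z_fill_mpixels():.0f} Mpixel/s Z-only")
```

Under these assumptions the NV30 trails in ordinary color rendering (2000 versus 2600 Mpixel/s) despite its higher clock, and pulls ahead only in a Z-only pass (4000 Mpixel/s), which matches the stencil-shadow exception described above.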
The initial version of the GeForce FX (the 5800) was one of the first cards to come equipped with a large dual-slot cooling solution. Called "Flow FX", the cooler was conspicuously bulky in comparison to ATI's small single-slot cooler on the 9700 series. It was also very loud and drew complaints from gamers and developers alike. It was jokingly nicknamed the 'Dustbuster', and loud graphics cards are still often compared to the GeForce FX 5800 for this reason.
Firstly, the chips were designed for use with a mixed-precision programming methodology. A 64-bit "FP16" mode (16 bits per color channel) would be used in situations where high-precision math was seen as unnecessary to maintain image quality. In other cases, where mathematical accuracy was more important, a 128-bit "FP32" mode (32 bits per channel) would be used. The ATI R300-based cards did not benefit from partial precision because they always operated at Shader Model 2's required minimum for full precision, 96-bit FP24 (24 bits per channel). For a game title to use FP16, the programmer had to specify which effects used the lower precision by placing "hints" in the shader code. Because ATI hardware did not benefit from the lower precision, because the R300 performed far better on shaders overall, and because optimizing shader code for the lower precision took extra effort, the NV3x hardware was usually left running at full precision all the time, which crippled its performance.
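The precision trade-off described above can be illustrated numerically. This is a small sketch using Python's standard `struct` module, assuming that IEEE 754 half and single precision are acceptable stand-ins for the hardware's FP16 and FP32 formats; the helper names and the sample value are illustrative only, and FP24 is listed for its width but not simulated because it has no standard encoding.

```python
import struct

CHANNELS = 4  # one value each for red, green, blue and alpha

# Per-pixel width = bits per channel * channels, which is where the
# 64-bit, 96-bit and 128-bit figures in the text come from.
bits_per_pixel = {name: bits * CHANNELS
                  for name, bits in {"FP16": 16, "FP24": 24, "FP32": 32}.items()}


def round_to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


value = 1000.3  # hypothetical shader intermediate, e.g. a texel coordinate
print(bits_per_pixel)            # {'FP16': 64, 'FP24': 96, 'FP32': 128}
print(round_to_fp16(value))      # 1000.5 (FP16 steps are 0.5 wide near 1000)
print(abs(round_to_fp32(value) - value) < 1e-4)  # True
```

Values near 1000 can only be represented in steps of 0.5 at half precision, which is harmless for many color calculations but visible in others. That is why the lower precision had to be opted into effect by effect rather than applied globally.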
The NV3x chips also used a processor architecture that relied heavily on the effectiveness of the video card driver's shader compiler. Proper ordering and composition of the instructions in shader code could dramatically boost the chip's computational efficiency. Compiler development is a long and difficult task, and this was a major challenge that Nvidia tried to overcome during most of NV3x's lifetime. Nvidia released several guidelines for creating GeForce FX-optimized code and worked with Microsoft to create a special shader model called "Shader Model 2.0a". This model leveraged the design of NV30 in order to extract greater performance and flexibility. Controversially, Nvidia would also rewrite game shader code in its drivers and force the game to use Nvidia's version instead of what the developer had written. Such replacement code often resulted in lower final image quality.
Valve Software then came forward with its experience of using the hardware with its upcoming game, Half-Life 2. Using a pre-release build of the highly anticipated game, powered by the Source engine, Valve published benchmarks revealing a complete generational gap (80–120% or more) between the GeForce FX 5900 Ultra and the ATI Radeon 9800 Pro. In game levels that used 2.0 shaders, Nvidia's top-of-the-line FX 5900 Ultra performed about as fast as ATI's mainstream Radeon 9600, which cost approximately a third as much as the Nvidia card. Valve had initially planned to support partial floating-point precision (FP16) to optimize for NV3x, but eventually found that this would take far too long to accomplish. ATI's cards did not benefit from FP16 mode, so all of the work would have been solely for Nvidia's NV3x cards, a niche too small to justify the time and effort. When Half-Life 2 was released a year later, Valve made all GeForce FX hardware default to the game's DirectX 8 shader code in order to get adequate performance from the Nvidia cards.
It is possible to force Half-Life 2 to run in DirectX 9 mode on all cards with a simple tweak to a configuration file. Users and reviewers who attempted this noted a significant performance loss on NV3x cards; only the top-of-the-line variants (5900 and 5950) remained playable. Later, two "fan patches" appeared that made Half-Life 2 run better on GeForce FX cards. The first used an application called 3DAnalyze to force partial precision (FP16) on all shaders while running the game. This allowed users of lower-end GeForce FX cards (such as the 5600 and 5700) to run the game acceptably and significantly improved performance on the FX 5800 and 5900/5950 series, but it degraded image quality in several areas of the game. The second was a patch developed by a fan using the Source SDK, which re-ordered and re-arranged the shaders to better suit the GeForce FX architecture and added partial-precision hints to most of the game's shaders (in contrast to the earlier method, which forced partial precision everywhere). This patch brought a similarly significant performance increase on the GeForce FX 5700/5800/5900 series without any loss of image quality.
Nvidia has historically been known for the performance and quality of its OpenGL drivers, and the FX series maintained this. With regard to image quality in both Direct3D and OpenGL, however, the company began to apply aggressive and questionable optimization techniques not seen before. It started with filtering optimizations that changed how trilinear filtering operated on game textures, reducing its accuracy and thus visual quality. Anisotropic filtering was also dramatically tweaked to limit its use to as few textures as possible, saving memory bandwidth and fill rate. Tweaks to these types of texture filtering can often be spotted in games as a shimmering of floor textures as the player moves through the environment (often a sign of poor transitions between mip-maps). Changing the driver settings to "High Quality" can alleviate this at the cost of performance.
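The effect of a reduced trilinear filter can be sketched in a few lines. The code below is an illustration of the general idea only and is not Nvidia's driver logic: the `band` parameter, the function names, and the one-value-per-mip-level simplification are all assumptions made for this example.

```python
import math


def trilinear_weight(lod: float) -> float:
    """Full trilinear: blend weight toward the next mip level is frac(lod)."""
    return lod - math.floor(lod)


def reduced_trilinear_weight(lod: float, band: float = 0.25) -> float:
    """Hypothetical reduced filter: blend only inside a window of width
    `band` centered between two mip levels; elsewhere use a single level."""
    f = lod - math.floor(lod)
    lo, hi = 0.5 - band / 2, 0.5 + band / 2
    if f <= lo:
        return 0.0          # pure bilinear from the nearer, sharper level
    if f >= hi:
        return 1.0          # pure bilinear from the next, blurrier level
    return (f - lo) / (hi - lo)


def sample(mips, lod, weight_fn):
    """Blend two adjacent mip levels; each entry of `mips` stands in for an
    already bilinear-filtered texel value at that level."""
    level = int(math.floor(lod))
    w = weight_fn(lod)
    return (1.0 - w) * mips[level] + w * mips[level + 1]


mips = [1.0, 0.5, 0.25]  # toy texel values for mip levels 0, 1 and 2
print(sample(mips, 1.2, trilinear_weight))          # approximately 0.45
print(sample(mips, 1.2, reduced_trilinear_weight))  # 0.5
```

Whenever the weight is exactly 0 or 1, only one mip level has to be fetched, which is where the bandwidth saving comes from. The price is a compressed transition between levels, which can show up as the visible banding or shimmering on floor textures described above.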
Nvidia also began to clandestinely replace pixel shader code in applications with hand-coded, optimized versions of lower accuracy, by detecting which program was being run. These "tweaks" were especially noticed in benchmark software from Futuremark. In 3DMark03, Nvidia was found to have gone to extremes to limit the complexity of the scenes, using driver shader replacements and aggressive hacks that prevented parts of the scene from being rendered at all. This artificially boosted the scores the FX series received. Side-by-side analysis of screenshots from games and 3DMark03 showed noticeable differences between what a Radeon 9800/9700 displayed and what the FX series displayed. Nvidia also publicly attacked the usefulness of these programs and the techniques used within them in order to undermine their influence on consumers. ATI, however, also created a software profile for 3DMark03. Such profiles are in fact common for other software as well, such as games, where they work around bugs and performance quirks. In response, Futuremark began updating 3DMark and screening driver releases for these optimizations.
Both Nvidia and ATI have historically optimized their drivers for tests like this, but Nvidia went to a new extreme with the FX series. Both companies still optimize their drivers for specific applications today (2008), but a now better educated and more aware user community keeps a close watch on the results of these optimizations.
In late April 2003, Nvidia introduced the GeForce FX 5600 and GeForce FX 5200 models to address the mid-range and budget segments respectively. Each had an "Ultra" variant and a slower, cheaper non-Ultra variant. With conventional single-slot cooling and a mid-range price tag, the 5600 Ultra had respectable performance but failed to measure up to its direct competitor, the Radeon 9600 Pro. The GeForce FX 5600 parts did not even advance performance over the GeForce4 Ti chips they were designed to replace: in DirectX 8 applications, the 5600 lost to or merely matched the GeForce4 Ti 4200. Likewise, the entry-level FX 5200 did not perform as well as the DirectX 7.0-generation GeForce4 MX 440, despite possessing a notably better 'checkbox' feature set, and it was easily outperformed by the older Radeon 9000. The utility of the DirectX 9 pixel shader 2.0 performance of these parts was questionable at best.
In May 2003, Nvidia launched a new top-end model, the GeForce FX 5900 Ultra. This card, based on the heavily revised NV35 GPU, fixed many of the shortcomings of the 5800, which had been quietly discontinued. While the 5800 used fast but hot and expensive GDDR-2 on a 128-bit memory bus, the 5900 moved to slower and cheaper DDR SDRAM and more than made up for it with a wider 256-bit memory bus. The 5900 Ultra performed somewhat better than the Radeon 9800 Pro in everything that did not make heavy use of shaders, and it had a quieter cooling system than the 5800, but most cards based on the 5900 still occupied two slots.
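The bus-width trade-off above is easy to check with the memory clocks listed in the model table. The sketch below assumes that both GDDR-2 and DDR SDRAM transfer data twice per memory clock, and it computes theoretical peak figures only; the function name is introduced here for illustration.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, mem_clock_mhz: float,
                        transfers_per_clock: int = 2) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    transfers_per_clock defaults to 2 for double-data-rate memory.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = mem_clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


# Memory clocks taken from the model table in this article.
fx5800_ultra = peak_bandwidth_gb_s(bus_width_bits=128, mem_clock_mhz=500)
fx5900_ultra = peak_bandwidth_gb_s(bus_width_bits=256, mem_clock_mhz=425)

print(f"FX 5800 Ultra: {fx5800_ultra:.1f} GB/s")  # 16.0 GB/s
print(f"FX 5900 Ultra: {fx5900_ultra:.1f} GB/s")  # 27.2 GB/s
```

Even with memory clocked 15% lower, doubling the bus width gives the 5900 Ultra 70% more theoretical bandwidth than the 5800 Ultra.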
In October 2003, Nvidia brought out a more potent mid-range card that used technology from NV35: the GeForce FX 5700, built on the new NV36 core. The FX 5700 was ahead of the Radeon 9600 Pro and XT in games with light use of pixel shaders. In December 2003, Nvidia launched the 5900 XT, a board identical to the 5900 but with a lower core clock and slower memory. It defeated the Radeon 9600 XT more soundly, but was still behind in a few shader-heavy scenarios.
The final GeForce FX model released was the 5950 Ultra, a 5900 Ultra with higher clock speeds. The board was fairly competitive with the Radeon 9800 XT, again as long as pixel shaders were lightly used.
|Card Model||Codename||Core Design||Clocks core/mem (MHz)||Memory Bus||Architecture Information|
|FX 5200||NV34||2:4:4||250/200||64 or 128 Bit||Entry-level chip. Replacement for the GeForce4 MX family. The Quadro FX 330, 500 and 600 are based on the GeForce FX 5200. The GeForce FX 5100 is an uncommon cut-down FX 5200 available in 64 and 128 MB sizes; it was available only in AGP and used a lower-clocked NV34 core. Lacked IntelliSample technology. No lossless color compression or Z compression. PCX uses an AGP-to-PCIe bridge chip for use on PCIe motherboards. Has 2 pixel pipelines if pixel shading is used, but a "fast" 4x1 mode exists as well. Each pixel pipe = 1 FP32 ALU handling 2 TMUs + 2 FX12 Mini-ALU (each one can do 2 MULs or 1 ADD or 1 MAD)|
|FX 5100||NV34||2:4:4||250/200||64 Bit|
|FX 5200 Ultra||NV34||2:4:4||325/325||128 Bit|
|PCX 5300||NV34||2:4:4||250/200||64 or 128 Bit|
|FX 5500||NV34||2:4:4||270/200||64 or 128 Bit|
|FX 5600||NV31||2:4:4||325/275||64 or 128 Bit||Midrange chip. Sometimes slower than the GeForce4 Ti 4200. No Quadro equivalent. Actually has 3 vertex shaders, but 2 are defective. Has 2 pixel pipelines if pixel shading is used, but a "fast" 4x1 mode exists as well. Each pixel pipe = 1 FP32 ALU handling 2 TMUs + 2 FX12 Mini-ALU (each one can do 2 MULs or 1 ADD or 1 MAD). Two 5600 Ultras exist; the "flipchip" version used a new production process common to the 5900 series, allowing higher clock speeds.|
|FX 5600 Ultra||NV31||2:4:4||350/350||128 Bit|
|FX 5600 Ultra Flipchip||NV31||2:4:4||400/400||128 Bit|
|FX 5600 XT||NV31||2:4:4||235/200||64 or 128 Bit|
|FX 5700||NV36||3:4:4||425/250||128 Bit||NV36, like NV35, swapped hardwired DirectX 7 T&L Units + DirectX 8 integer pixel shader units for DirectX 9 floating point units. Again, like NV31 and NV34, NV36 is a 2 pipeline design but with a special 4x1 mode for some situations. Quadro equivalent is the Quadro FX 1100. Later models were equipped with GDDR3, which was also clocked higher than the DDR or GDDR2 modules previously used. On Ultra, RAM speed of 475 MHz also seen. PCX uses AGP to PCIe bridge chip for use on PCIe motherboards. Has 2 pixel pipelines if pixel shading is used. Each pixel pipe = 1 FP32 ALU handling 2 TMUs + 2 FP32 mini ALU (each one can do 1 MUL or 1 ADD or 1 FP16 MAD).|
|FX 5700 LE||NV36||3:4:4||250/200||64 or 128 Bit|
|FX 5700 Ultra||NV36||3:4:4||475/450||128 Bit (GDDR2/GDDR3)|
|PCX 5750||NV36||3:4:4||425/250||128 Bit|
|FX 5800||NV30||3:8:4||400/400||128 Bit (GDDR2)||Production was troubled by the migration to 130 nm processes at TSMC. Produced a lot of heat. Cooler nicknamed the 'Dustbuster', 'Vacuum Cleaner', or 'Hoover' by some sites; Nvidia later released a video mocking the cooler. Due to manufacturing delays it was quickly replaced by the on-schedule NV35. Its Quadro siblings, the Quadro FX 1000 and 2000, were somewhat more successful. Double Z fillrate (helps shadowing). Each pixel pipe = 1 FP32 ALU handling 2 TMUs + 2 FX12 Mini-ALU (each one can do 2 MULs or 1 ADD or 1 MAD)|
|FX 5800 Ultra||NV30||3:8:4||500/500||128 Bit (GDDR2)|
|FX 5900||NV35||3:8:4||400/425||256 Bit||Swapped hardwired DirectX 7 T&L Units + DirectX 8 integer pixel shader units for DirectX 9 floating point units. Introduced a new feature called 'UltraShadow', upgraded to CineFX 2.0 Specification. Removed the noisy cooler, but still stole the PCI slot adjacent to the card by default. Quadro equivalent is QuadroFX 700, 3000. PCX uses AGP to PCIe bridge chip for use on PCIe motherboards. Double Z fillrate (helps shadowing). Each pixel pipe = 1 FP32 ALU handling 2 TMUs + 2 FP32 mini ALU (each one can do 1 MUL or 1 ADD or 1 FP16 MAD).|
|FX 5900 Ultra||NV35||3:8:4||450/425||256 Bit|
|PCX 5900||NV35||3:8:4||350/275||256 Bit|
|FX 5900 XT||NV35||3:8:4||400/350||256 Bit|
|FX 5950||NV38||3:8:4||475/475||256 Bit||Essentially a speed bumped GeForceFX 5900. Some antialiasing and shader unit tweaks in hardware. PCX uses AGP to PCIe bridge chip for use on PCIe motherboards. Quadro equivalent is QuadroFX 1300.|
|PCX 5950||NV38||3:8:4||350/475||256 Bit|