The great advantage of fuzz testing is that the test design is extremely simple, and free of preconceptions about system behavior. Fuzz testing was developed at the University of Wisconsin-Madison in 1988/89 by Professor Barton Miller and the students in his graduate Advanced Operating Systems class.
Fuzz testing is also used as a gross measurement of a large software system's reliability, availability and security.
Fuzz testing is thought to enhance software security and software safety because it often finds odd oversights and defects which human testers would fail to find, and even careful human test designers would fail to create tests for.
However, fuzz testing is not a substitute for exhaustive testing or formal methods: it can only provide a random sample of the system's behavior, and in many cases passing a fuzz test may only demonstrate that a piece of software handles exceptions without crashing, rather than behaving correctly. Thus, fuzz testing can only be regarded as a bug-finding tool rather than an assurance of quality.
Modern software has several different types of inputs:
There are at least two different forms of fuzz testing:
By using all of these techniques in combination, fuzz-generated randomness can test the un-designed behavior surrounding a wider range of designed system states.
Fuzz testing may use tools to simulate all of these domains.
On the other hand, bugs found using fuzz testing are frequently severe, exploitable bugs that could be used by a real attacker. This has become even more true as fuzz testing has become more widely known, as the same techniques and tools are now used by attackers to exploit deployed software. This is a major advantage over binary or source auditing, or even fuzzing's close cousin, fault injection, which often relies on artificial fault conditions that are difficult or impossible to exploit.
The randomness of inputs used in fuzzing is often seen as a disadvantage, as catching a boundary value condition with random inputs is highly unlikely. Robustness testing was introduced by PROTOS researchers in 1999 to increase the test efficiency through systematic tests. Robustness testing is a model based fuzzing technique, an extension of syntax testing, that systematically will explore the input space defined by various communication interfaces or data formats, and will generate intelligent test cases that find crash-level flaws and other failures in software.
The most common problem with an event-driven program is that it will often simply use the data in the queue, without even crude validation. To succeed in a fuzz-tested environment, software must validate all fields of every queue entry, decode every possible binary value, and then ignore impossible requests.
One of the more interesting issues with real-time event handling is that if error reporting is too verbose, simply providing error status can cause resource problems or a crash. Robust error detection systems will report only the most significant, or most recent error over a period of time.
One common problem with a character driven program is a buffer overrun, when the character data exceeds the available buffer space. This problem tends to recur in every instance in which a string or number is parsed from the data stream and placed in a limited-size area.
Another is that decode tables or logic may be incomplete, not handling every possible binary value.
Database fuzz is controversial,because input and comparison constraints reduce the invalid data in a database. However, often the database is more tolerant of odd data than its client software, and a general-purpose interface is available to users. Since major customer and enterprise management software is starting to be open-source, database-based security attacks are becoming more credible.
A common problem with fuzz databases is buffer overflow. A common data dictionary, with some form of automated enforcement is quite helpful and entirely possible. To enforce this, normally all the database clients need to be recompiled and retested at the same time. Another common problem is that database clients may not understand the binary possibilities of the database field type, or, legacy software might have been ported to a new database system with different possible binary values. A normal, inexpensive solution is to have each program validate database inputs in the same fashion as user inputs. The normal way to achieve this is to periodically "clean" production databases with automated verifiers.