Static code analysis is the analysis of computer software that is performed without actually executing programs built from that software (analysis performed on executing programs is known as dynamic analysis). In most cases the analysis is performed on some version of the source code and in the other cases some form of the object code. The term is usually applied to the analysis performed by an automated tool, with human analysis being called program understanding or program comprehension.
The sophistication of the analysis performed by tools varies from those that only consider the behavior of individual statements and declarations, to those that include the complete source code of a program in their analysis. Uses of the information obtained from the analysis vary from highlighting possible coding errors (e.g., the lint tool) to formal methods that mathematically prove properties about a given program (e.g., its behavior matches that of its specification).
Some people consider software metrics and reverse engineering to be forms of static analysis.
A growing commercial use of static analysis is in the verification of properties of software used in safety-critical computer systems and
locating potentially vulnerable code.
Formal methods is the term applied to the analysis of software
) whose results are obtained purely through the use of rigorous mathematical methods. The mathematical techniques used include denotational semantics
, axiomatic semantics
, operational semantics
, and abstract interpretation
It has been proven that, barring some hypothesis that the state space of programs is finite and small, finding possible run-time errors, or more generally any kind of violation of a specification on the final result of a program, is undecidable: there is no mechanical method that can always answer truthfully whether a given program may or may not exhibit runtime errors. This result dates from the works of Church, Gödel and Turing in the 1930s (see the halting problem and Rice's theorem). As with most undecidable questions, one can still attempt to give useful approximate solutions.
Some of the implementation techniques of formal static analysis include:
- Model checking considers systems that have finite state or may be reduced to finite state by abstraction;
- Data-flow analysis is a lattice-based technique for gathering information about the possible set of values;
- Abstract interpretation models the effect that every statement has on the state of an abstract machine (i.e., it 'executes' the software based on the mathematical properties of each statement and declaration). This abstract machine overapproximates the behaviours of the system: the abstract system is thus made simpler to analyze, at the expense of incompleteness (not every property true of the original system is true of the abstract system). If properly done, though, abstract interpretation is sound (every property true of the abstract system can be mapped to a true property of the original system).
- Use of assertions in program code as first suggested by Hoare logic. There is tool support for some programming languages (e.g., the SPARK programming language (a subset of Ada) and the Java Modeling Language — JML — using ESC/Java and ESC/Java2).