For our readers not yet familiar with the bits and bobs of static analysis, we’d like to make a brief introduction on the meaning of soundness and why it makes a difference.
Program analysis infers and proves properties on software. This is normally done by the developer who tests her/his code, that is, executes the software with some specific inputs and environment conditions.
At the end of the test, the developer might
- have found a bug, or
- be sure that with those inputs and environment conditions the program behaves correctly.
Static vs. Dynamic
Testing is also called dynamic program analysis. However, one cannot test a program on all the possible inputs and environment conditions. In particular, when dealing with industrial software a developer can test only a minimal part of the program.
Static program analysis is the dual approach. This approach looks at the software without executing it. Compilers widely apply static analysis. For instance, the Java compiler discovers that a local variable might have not been initialized, or that a wrong value is assigned to a variable (e.g., if the program tries to assign 1 to a string).
These errors are discovered statically without actually executing the code. An additional property of some static analyses is soundness. This guarantees that, if the static analyzer tells that a program satisfies a given property, then all the executions of the program will respect it. For instance, a static analyzer might prove that a program cannot raise a
Sound static analysis
But how does sound static analysis work? Briefly, it builds up a computable overapproximation of all possible executions. How exactly?
Let’s start with a simple example. Imagine to have a very simple program that receives an integer
i and returns
i+1. You can test it with 0, 1, 2, … and so on. But in order to discover it might cause an overflow you have to test it with
MAX_VALUE. Instead, static analysis approximates the input with some abstract value. Consider for instance to approximate numerical values with intervals. Then you’ll know that the input variable is in the interval
[MIN_VALUE..MAX_VALUE], and if you add 1 to
MAX_VALUE you get an overflow.
So summarizing: sound static analysis finds all bugs, unsound analysis finds some bugs!
Now we imagine you are wondering about what’s the price to pay to get this unbelievable result. The short answer is: false alarms! For the details… Stay tuned!