Precise Analysis of String Expressions
Aske Simon Christensen
February 2003
Abstract:
We perform static analysis of Java programs to answer a simple
question: which values may occur as results of string expressions? The
answers are summarized for each expression by a regular language that is
guaranteed to contain all possible values. We present several applications of
this analysis, including statically checking the syntax of dynamically
generated expressions, such as SQL queries. Our analysis constructs flow
graphs from class files and generates a context-free grammar with a
nonterminal for each string expression. The language of this grammar is then
widened into a regular language through a variant of an algorithm previously
used for speech recognition. The collection of resulting regular languages is
compactly represented as a special kind of multi-level automaton from which
individual answers may be extracted. If a program error is detected, examples
of invalid strings are automatically produced. We present extensive
benchmarks demonstrating that the analysis is efficient and produces results
of useful precision.
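
As a rough illustration of the question the analysis answers (this sketch is not the analysis itself, and every name in it is hypothetical), consider a Java method that builds an SQL query in a loop. The set of strings it can return is infinite, but it is contained in a regular language, written here by hand as a java.util.regex pattern. The analysis described above would derive such a regular approximation automatically from the class files; the sketch only checks by execution that a few concrete values fall inside the hand-written one.

```java
import java.util.regex.Pattern;

public class StringExprSketch {
    // Hypothetical string expression: builds a query with n placeholders.
    static String buildQuery(int n) {
        String q = "SELECT * FROM t WHERE id IN (";
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                q += ",";
            }
            q += "?";
        }
        return q + ")";
    }

    // Hand-written regular language assumed to contain every value of
    // buildQuery(n) for n >= 0 (an assumption of this sketch, not a
    // result produced by the analysis).
    static final Pattern APPROX =
        Pattern.compile("SELECT \\* FROM t WHERE id IN \\((\\?(,\\?)*)?\\)");

    // True if the value produced for this n lies in the approximation.
    static boolean covered(int n) {
        return APPROX.matcher(buildQuery(n)).matches();
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 3; n++) {
            System.out.println(buildQuery(n) + " -> " + covered(n));
        }
    }
}
```

Note that the pattern also admits the empty list "IN ()", which is not valid SQL in many dialects. Comparing the regular language of possible values against the language of well-formed queries is the kind of syntax check the abstract mentions, and "IN ()" would be an example of an invalid string reported to the programmer.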
Available as PostScript, PDF.