In this thesis, we aim at designing both theoretically and experimentally methods for the automatic detection of potential bugs in software – or the proof of the absence thereof. This detection is done statically by analyzing programs' source code without running them. We rely on the abstract interpretation framework to derive sound, computable semantics. In particular, we focus on analyzing dynamic programming languages. The target of this work is the analysis of Python programs combined with native C libraries.
The abstract interpretation workflow requires a concrete semantics, which will be approximated to create sound computable analyses. Our first step is thus to define the uncomputable, collecting semantics of Python formally. This semantics relies on previous work and retro-engineering from the source code of the reference interpreter. We took special care to make the semantics explainable by providing references to the actual implementation’s source code for each case of the semantics.
We have implemented all the analyses presented in this thesis within the Mopsa framework. From a static analysis developer’s perspective, Mopsa has a double goal. It aims at reducing the cost of lifting an abstract domain from a toy language to a real-world one. It also aspires to define relational analyses composed of loosely coupled abstract domains that can cooperate. We take particular care to define all our abstract domains independently from each other. Mopsa’s core is language agnostic, making Mopsa suitable to analyze different programming languages. The analyses rely on some general use domains, such as numerical domains, the recency abstraction of dynamic memory allocation, and the loop iterator.
We present a first analysis aiming at detecting type-related errors in Python programs. It does so by inferring both the nominal and structural types of objects. Python’s type annotations can be reused to support libraries. This analysis is refined into a value analysis which is more precise and infers numerical invariants for the manipulated builtin numeric values. These analyses rely on abstractions of dynamic memory allocation and data structures that we have adapted to the case of dynamic programming languages. Both analyses scale to real-world benchmarks a few thousand lines long used by Python developers.
Most Python programs rely on libraries written in C for the sake of performance or code reuse. We define a multilanguage analysis that can detect runtime errors in every part of a multilanguage program: in the Python part, in the C part, and at the boundary between the languages. Before defining this analysis, we provide an overview of the different interoperability mechanisms available between Python and C, and we define a collecting semantics for the interoperability mechanism provided by the reference interpreter. This analysis can tackle tests of popular, real-world libraries a few thousand lines of C and Python long in a few minutes.