Abstract-Interpretation-Based Language Server for Python

Most IDEs for Python development follow traditional code analysis approaches that are known to work well for statically typed languages such as Java or C++. However, a lot of modern Python code is untyped, and these conservative tools tend to assign the “Any” type to variables that lack explicit type annotations. Programmers, in contrast, can usually deduce types (and often concrete values) by reading the surrounding code and following function definitions and call sites.

Abstract interpretation is a mathematical tool that frames the problem of program analysis in terms of solving constraint systems on lattices. Program states are abstractly represented as elements of a state lattice that encode knowledge about possible values of all variables and memory locations at a particular point in the program. Dataflow relationships are encoded as constraints that link program states at different points in the program.
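As a rough illustration of this idea, the following minimal sketch analyses a four-point control-flow graph for a tiny untyped snippet. It is not the design proposed for the thesis. All names (`join`, `assign_const`, `assign_copy`, `EDGES`, `solve`) are hypothetical, and the lattice is assumed to be the simplest useful one: each variable maps to a set of possible type names, ordered by inclusion, with set union as the join.

```python
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple

# Abstract value: set of possible type names (powerset lattice, join = union).
AbsVal = FrozenSet[str]
# Abstract state: one abstract value per variable; a missing variable is bottom.
State = Dict[str, AbsVal]
Transfer = Callable[[State], State]
Edge = Tuple[int, int, Transfer]

def join(a: State, b: State) -> State:
    """Least upper bound of two states, computed variable by variable."""
    empty: AbsVal = frozenset()
    return {v: a.get(v, empty) | b.get(v, empty) for v in a.keys() | b.keys()}

def assign_const(var: str, ty: str) -> Transfer:
    """Transfer function for `var = <expression of type ty>`."""
    return lambda s: {**s, var: frozenset({ty})}

def assign_copy(dst: str, src: str) -> Transfer:
    """Transfer function for `dst = src`."""
    return lambda s: {**s, dst: s.get(src, frozenset())}

# Control-flow edges (source point, target point, transfer function) for:
#     x = 1
#     while cond:
#         x = str(x)
#     y = x
EDGES: List[Edge] = [
    (0, 1, assign_const("x", "int")),  # x = 1
    (1, 2, lambda s: dict(s)),         # enter the loop body
    (2, 1, assign_const("x", "str")),  # x = str(x), back to the loop head
    (1, 3, assign_copy("y", "x")),     # leave the loop, y = x
]

def solve(edges: List[Edge], num_points: int, entry: int = 0) -> List[Optional[State]]:
    """Worklist iteration: find the least states satisfying, for every edge,
    state[dst] >= transfer(state[src]). None marks an unreached point."""
    states: List[Optional[State]] = [None] * num_points
    states[entry] = {}
    worklist = [entry]
    while worklist:
        point = worklist.pop()
        for src, dst, transfer in edges:
            if src != point:
                continue
            out = transfer(states[src])
            new = out if states[dst] is None else join(states[dst], out)
            if new != states[dst]:  # the constraint was violated: propagate
                states[dst] = new
                worklist.append(dst)
    return states

if __name__ == "__main__":
    for point, state in enumerate(solve(EDGES, 4)):
        print(point, state)
```

Each edge contributes one constraint linking the states at two program points, and the worklist loop repeats until no constraint is violated. In this sketch the state after the loop records that both `x` and `y` may be `int` or `str`, which is more precise than falling back to “Any”.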

Task

The goal of this thesis is to adapt techniques from abstract interpretation and use them to improve the analysis of untyped code in Python IDEs. This should result in more accurate code completion, go-to-definition, inlay hints, and similar features, and should make it possible to insert inferred type annotations automatically.

Contact

Jonathan Brachthäuser

Philipp Schuster