Note (unrelated to Psyche-C)
C language draft proposal: Enabling Generic Functions and Parametric Types in C (prototype).
Psyche-C is a platform for implementing static analysis of C programs. At its core, it includes a C compiler frontend that performs both syntactic and semantic analysis. Unlike actual C compilers, however, Psyche-C doesn't build a symbol table during parsing. Even so, with zero setup or in a broken build setup, Psyche-C still offers accurate syntax analysis (through syntax disambiguation) and partial semantic analysis.
Below are the main characteristics of Psyche-C:
- Clean separation between the syntactic and semantic compiler phases.
- Algorithmic and heuristic syntax disambiguation.
- Optional type inference as a recovery mechanism from #include failures (not yet in master).
- API inspired by that of the Roslyn .NET compiler and LLVM's Clang.
Psyche-C is written as a C++ library, but it comes with a builtin driver: cnippet. You can invoke it either with the typical command line arguments of an actual compiler or with the actual compiler's whole invocation as a subcommand.
Example with cnippet only:
cnip -analysis /path/to/analysis.dylib -I/path/to/whatever file.c
Example with compiler's invocation as a subcommand:
cnip -analysis /path/to/analysis.dylib -- gcc -I/path/to/whatever file.c
See psychec-analysis for a trivial example of how to implement an analysis.
Psyche-C began as a type inference tool for C, aimed at enabling static analysis of incomplete programs. However, the compiler frontend at its core wasn't good enough, so I decided to rewrite it pretty much from scratch. I also used this rewrite as an opportunity to extend Psyche-C into a platform for static analysis in general. The result of this work is what exists today in the master branch, which doesn't yet include a port of the type inference from the original branch.
With type inference enabled, if you "compile" the snippet below with cnippet, Psyche-C will infer T and synthesize a declaration for it.
void f()
{
    T v = 0;
    v->value = 42;
    v->next = v;
}
Synthesized declaration for T:
typedef struct TYPE_2__ TYPE_1__;
struct TYPE_2__
{
    int value;
    struct TYPE_2__* next;
};
typedef TYPE_1__* T;
You might want to use this functionality to:
- Enable, on incomplete programs, analyses that depend on complete programs.
- Generate test-input/mocks to validate functions in isolation.
- Prototype an algorithm while only sketching its data structures.
- Compile a snippet for inspection of static properties of its object code.
Documentation and resources:
- The Doxygen-generated API.
- A contributor's wiki.
- An online interface that offers a glimpse of Psyche-C's type inference.
- Articles/blogs:
To build:
cmake CMakeLists.txt && make -j 4
To run the tests:
./test-suite
Publications about Psyche-C itself:
- Type Inference for C: Applications to the Static Analysis of Incomplete Programs. ACM Transactions on Programming Languages and Systems (TOPLAS), Volume 42, Issue 3, Article No. 15, Dec. 2020.
- Inference of static semantics for incomplete C programs. Proceedings of the ACM on Programming Languages, Volume 2, Issue POPL, Jan. 2018, Article No. 29.
Publications that use Psyche-C:
- SLaDe: A Portable Small Language Model Decompiler for Optimized Assembly. Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2024.
- AnghaBench: a Suite with One Million Compilable C Benchmarks for Code-Size Reduction. Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2021.
- Generation of in-bounds inputs for arrays in memory-unsafe languages. Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization (CGO), Feb. 2019, pp. 136-148.
- Automatic annotation of tasks in structured code. Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT), Nov. 2018, Article No. 31.