Abstract
Decompilation is currently a widely used tool in reverse engineering and exploit detection in binaries. Ghidra, developed by the National Security Agency, is one of the most popular decompilers. It decompiles binaries to high P-Code, from which the final decompilation output in C code is generated. Ghidra allows users to work with P-Code, so users can analyze the intermediate representation directly. Several projects make use of this to build tools that perform verification, decompilation, taint analysis and emulation, to name a few. P-Code lacks a formal semantics, and its documentation is limited. It has a notoriously subtle semantics, which makes it hard to do any sort of analysis on P-Code. We show that P-Code, as-is, cannot be given an executable semantics. In this paper, we augment P-Code and define a complete, executable, formal semantics for it. This is done by looking at the documentation and the decompilation results of binaries with known source code. The development of a formal P-Code semantics uncovered several issues in Ghidra, P-Code, and the documentation. We show that these issues affect projects that rely on Ghidra and P-Code. We evaluate the executability of our semantics by building a P-Code interpreter that directly uses our semantics. Our work uncovered several issues in Ghidra and allows Ghidra users to better leverage P-Code.
Original language | English |
---|---|
Title of host publication | Verified Software. Theories, Tools and Experiments |
Subtitle of host publication | 14th International Conference, VSTTE 2022 |
Editors | Akash Lal, Stefano Tonetta |
Publisher | Springer, Cham |
Pages | 111-128 |
Number of pages | 18 |
Edition | 1 |
ISBN (Electronic) | 9783031258039 |
ISBN (Print) | 9783031258022 |
DOIs | |
Publication status | Published - 2023 |
Event | 14th International Conference on Verified Software. Theories, Tools and Experiments, VSTTE 2022 - Trento, Italy Duration: 17 Oct 2022 → 18 Oct 2022 https://vstte22.fbk.eu/ |
Publication series
Series | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 13800 LNCS |
ISSN | 0302-9743 |
Conference
Conference | 14th International Conference on Verified Software. Theories, Tools and Experiments, VSTTE 2022 |
---|---|
Country/Territory | Italy |
City | Trento |
Period | 17/10/22 → 18/10/22 |
Internet address |
Keywords
- Decompilation
- Formal semantics
- P-Code