A Formal Semantics for P-Code

Nico Naus*, Freek Verbeek, Dale Walker, Binoy Ravindran

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference Article in proceedingAcademicpeer-review


Decompilation is currently a widely used tool in reverse engineering and exploit detection in binaries. Ghidra, developed by the National Security Agency, is one of the most popular decompilers. It decompiles binaries to high P-Code, from which the final decompilation output in C code is generated. Ghidra allows users to work with P-Code, so users can analyze the intermediate representation directly. Several projects make use of this to build tools that perform verification, decompilation, taint analysis and emulation, to name a few. P-Code lacks a formal semantics, and its documentation is limited. It has a notoriously subtle semantics, which makes it hard to do any sort of analysis on P-Code. We show that P-Code, as-is, cannot be given an executable semantics. In this paper, we augment P-Code and define a complete, executable, formal semantics for it. This is done by looking at the documentation and the decompilation results of binaries with known source code. The development of a formal P-Code semantics uncovered several issues in Ghidra, P-Code, and the documentation. We show that these issues affect projects that rely on Ghidra and P-Code. We evaluate the executability of our semantics by building a P-Code interpreter that directly uses our semantics. Our work uncovered several issues in Ghidra and allows Ghidra users to better leverage P-Code.

Original languageEnglish
Title of host publicationVerified Software. Theories, Tools and Experiments
Subtitle of host publication14th International Conference, VSTTE 2022
EditorsAkash Lal, Stefano Tonetta
PublisherSpringer, Cham
Number of pages18
ISBN (Electronic)9783031258039
ISBN (Print)9783031258022
Publication statusPublished - 2023
Event14th International Conference on Verified Software. Theories, Tools and Experiments, VSTTE 2022 - Trento, Italy
Duration: 17 Oct 202218 Oct 2022

Publication series

SeriesLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume13800 LNCS


Conference14th International Conference on Verified Software. Theories, Tools and Experiments, VSTTE 2022
Internet address


  • Decompilation
  • Formal semantics
  • P-Code


Dive into the research topics of 'A Formal Semantics for P-Code'. Together they form a unique fingerprint.

Cite this