A Formal Semantics for P-Code

Nico Naus*, Freek Verbeek, Dale Walker, Binoy Ravindran

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference Article in proceedingAcademicpeer-review

Abstract

Decompilation is currently a widely used tool in reverse engineering and exploit detection in binaries. Ghidra, developed by the National Security Agency, is one of the most popular decompilers. It decompiles binaries to high P-Code, from which the final decompilation output in C code is generated. Ghidra allows users to work with P-Code, so users can analyze the intermediate representation directly. Several projects make use of this to build tools that perform verification, decompilation, taint analysis and emulation, to name a few. P-Code lacks a formal semantics, and its documentation is limited. It has a notoriously subtle semantics, which makes it hard to do any sort of analysis on P-Code. We show that P-Code, as-is, cannot be given an executable semantics. In this paper, we augment P-Code and define a complete, executable, formal semantics for it. This is done by looking at the documentation and the decompilation results of binaries with known source code. The development of a formal P-Code semantics uncovered several issues in Ghidra, P-Code, and the documentation. We show that these issues affect projects that rely on Ghidra and P-Code. We evaluate the executability of our semantics by building a P-Code interpreter that directly uses our semantics. Our work uncovered several issues in Ghidra and allows Ghidra users to better leverage P-Code.

Original languageEnglish
Title of host publicationVerified Software. Theories, Tools and Experiments
Subtitle of host publication14th International Conference, VSTTE 2022
EditorsAkash Lal, Stefano Tonetta
PublisherSpringer, Cham
Pages111-128
Number of pages18
Edition1
ISBN (Electronic)9783031258039
ISBN (Print)9783031258022
DOIs
Publication statusPublished - 2023
Event14th International Conference on Verified Software. Theories, Tools and Experiments, VSTTE 2022 - Trento, Italy
Duration: 17 Oct 202218 Oct 2022
https://vstte22.fbk.eu/

Publication series

SeriesLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume13800 LNCS
ISSN0302-9743

Conference

Conference14th International Conference on Verified Software. Theories, Tools and Experiments, VSTTE 2022
Country/TerritoryItaly
CityTrento
Period17/10/2218/10/22
Internet address

Keywords

  • Decompilation
  • Formal semantics
  • P-Code

Fingerprint

Dive into the research topics of 'A Formal Semantics for P-Code'. Together they form a unique fingerprint.

Cite this