Abstract
Even though heavily researched, a full formal model of the x86-64 instruction set is still not available. We present libLISA, a tool for automated discovery and analysis of the ISA of a CPU. This produces the most extensive formal x86-64 model to date, with over 118000 different instruction groups. The process requires as little human specification as possible: specifically, we do not rely on a human-written (dis)assembler to dictate which instructions are executable on a given CPU, or what their in- and outputs are. The generated model is CPU-specific: behavior that is "undefined"is synthesized for the current machine. Producing models for five different x86-64 machines, we mutually compare them, discover undocumented instructions, and generate instruction sequences that are CPU-specific. Experimental evaluation shows that we enumerate virtually all instructions within scope, that the instructions' semantics are correct w.r.t. existing work, and that we improve existing work by exposing bugs in their handwritten models.
Original language | English |
---|---|
Article number | 283 |
Number of pages | 29 |
Journal | Proceedings of the ACM on Programming Languages |
Volume | 8 |
Issue number | OOPSLA2 |
DOIs | |
Publication status | Published - 8 Oct 2024 |
Keywords
- instruction enumeration
- instruction semantics
- synthesis