Semantic Apparatus – Obtaining High-Level Semantic Information from Binary Code

Cited by Lee Sonogan

Abstract by Javier Escalada Gómez

In software reverse engineering, decompilation is the process of recovering source code from binary files. Decompilers are used when it is necessary to understand or analyze software for which the source code is not available. Although existing decompilers commonly obtain source code with the same behavior as the binaries, that source code is usually hard to interpret and certainly differs from the original code written by the programmer. This is because obtaining the original source code from a binary file is an undecidable problem. The cause is that the compiler discards high-level information in the translation process, such as type information, that cannot be recovered in the inverse process.
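To make the point about discarded type information concrete, here is a small Python sketch. It is an illustration added for this post, not something taken from the thesis. It reads the same four bytes as a float, a signed integer, and an unsigned integer; nothing in the bytes themselves says which type the programmer had in mind. The byte string and the helper name `interpretations` are assumptions chosen for the example.

```python
import struct


def interpretations(raw: bytes) -> dict:
    """Read the same 4 little-endian bytes under three candidate C types."""
    return {
        "float": struct.unpack("<f", raw)[0],
        "int32": struct.unpack("<i", raw)[0],
        "uint32": struct.unpack("<I", raw)[0],
    }


# Hypothetical 4-byte value as it might sit in a binary's data section.
raw = bytes.fromhex("0000803f")
views = interpretations(raw)
print(views)
```

All three readings are equally valid from the binary's point of view, which is why a decompiler has to guess the type from how the surrounding instructions use the value.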

Existing decompilers associate binary patterns with high-level language constructs so that binary code can be decompiled. However, those binary patterns strongly depend on different variables such as the compiler used to generate the binaries, target microprocessor and operating system, and the compiler options. If the value of any of these variables changes, so do the binary patterns.
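The fragility of fixed patterns can be sketched with a toy matcher. This is a simplified illustration, not the method from the thesis or the rule set of any real decompiler. The table `PROLOGUE_PATTERNS`, the setting names, and the mnemonic sequences are all hypothetical; they loosely mimic how an x86 function prologue looks with and without a frame pointer.

```python
# Toy pattern table: compiler setting -> mnemonic sequence expected at function start.
# Settings and sequences are illustrative assumptions, not real decompiler rules.
PROLOGUE_PATTERNS = {
    "frame-pointer": ("push", "mov", "sub"),
    "omit-frame-pointer": ("sub",),
}


def starts_with_prologue(mnemonics, setting):
    """Return True if the instruction stream begins with the pattern for this setting."""
    pattern = PROLOGUE_PATTERNS[setting]
    return tuple(mnemonics[:len(pattern)]) == pattern


# Mnemonics of a hypothetical function built with a frame pointer.
with_fp = ["push", "mov", "sub", "mov", "add", "leave", "ret"]
# The same hypothetical function built without one.
without_fp = ["sub", "mov", "add", "add", "ret"]

print(starts_with_prologue(with_fp, "frame-pointer"))          # True
print(starts_with_prologue(without_fp, "frame-pointer"))       # False
print(starts_with_prologue(without_fp, "omit-frame-pointer"))  # True
```

A pattern written for one build setting misses the same function under another, so a hand-written pattern set has to be extended for every combination of compiler, target, and options.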

Publication: Computer Science Thesis (Peer-Reviewed Journal)

Pub Date: May 2021 Doi:

Keywords: Big code, decompilation, machine learning, binary patterns, language constructs, assembly patterns, big data, Cnerator

(Plenty more sections and references in this research article)
