Cited by Lee Sonogan
Abstract by Javier Escalada Gómez
In software reverse engineering, decompilation is the process of recovering source code from binary files. Decompilers are used when it is necessary to understand or analyze software for which the source code is not available. Although existing decompilers commonly obtain source code with the same behavior as the binaries, that source code is usually hard to interpret and certainly differs from the original code written by the programmer. This is because obtaining the original source code from a binary file is an undecidable problem. The cause is that the compiler discards high-level information in the translation process, such as type information, that cannot be recovered in the inverse process.
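The loss of high-level information can be seen in a small analogy. The sketch below is an illustration only, not taken from the thesis: it assumes CPython bytecode as a stand-in for a native binary, and the function names are hypothetical. It compares three functions that differ in type annotations, comments, and identifier names, and checks whether they compile to the same instruction stream.

```python
# Illustrative analogy only: CPython bytecode stands in for a native binary.
# The function names below are hypothetical and not from the thesis.

def add_typed(a: int, b: int) -> int:
    total: int = a + b  # annotated local variable, with a comment
    return total

def add_untyped(a, b):
    total = a + b
    return total

def add_renamed(x, y):
    s = x + y
    return s

def instructions(func):
    """Return the raw bytecode instruction stream of a function."""
    return func.__code__.co_code

# Compare the instruction streams: annotations, comments, and local names
# do not appear in the instructions themselves.
typed_equals_untyped = instructions(add_typed) == instructions(add_untyped)
typed_equals_renamed = instructions(add_typed) == instructions(add_renamed)
print(typed_equals_untyped, typed_equals_renamed)
```

Because several distinct sources map to one instruction stream, the instructions alone cannot tell a decompiler which source the programmer wrote. The analogy is loose: Python keeps names and signature annotations in metadata next to the code, whereas a stripped native binary typically keeps neither.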
Existing decompilers associate binary patterns with high-level language constructs so that binary code can be decompiled. However, those binary patterns strongly depend on different variables such as the compiler used to generate the binaries, the target microprocessor and operating system, and the compiler options. If the value of any one of these variables changes, so do the binary patterns.
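The dependence on compiler options can be sketched with a similar analogy. This is an assumption-laden illustration rather than the thesis's setup: it uses Python's built-in `compile` and its `optimize` argument in place of a native compiler and its flags, and `SOURCE` and `bytecode_for` are hypothetical names.

```python
# Illustrative analogy only: Python's built-in compile() stands in for a
# native compiler, and its optimize argument for the compiler options.

SOURCE = "assert x > 0\ny = x + 1\n"

def bytecode_for(optimize_level):
    """Compile the same source at the given optimization level."""
    code = compile(SOURCE, "<example>", "exec", optimize=optimize_level)
    return code.co_code

unoptimized = bytecode_for(0)  # keeps the assert statement
optimized = bytecode_for(2)    # drops assert statements

# Same source, different option, different instruction stream.
options_change_output = unoptimized != optimized
print(len(unoptimized), len(optimized), options_change_output)
```

A pattern written against the unoptimized instruction stream would not match the optimized one, which is the kind of fragility the abstract attributes to hand-written binary patterns.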
Publication: Computer Science Thesis (Peer-Reviewed Journal)
Pub Date: May 2021 Doi: https://www.reflection.uniovi.es//ortin/theses/escalada.pdf
Keywords: Big code, decompilation, machine learning, binary patterns, language constructs, assembly patterns, big data, Cnerator
https://www.reflection.uniovi.es//ortin/theses/escalada.pdf (Plenty more sections and references in this research article)