Greg Hoglund: Active Reversing: The Next Generation of Reverse Engineering




Black Hat Briefings, USA 2007 [Video] Presentations from the security conference.

Summary: Most people think of reverse engineering as a tedious process of reading disassembled CPU instructions and attempting to predict or deduce what the original C code was supposed to look like. This process is difficult, time-consuming, and expensive, but it doesn't need to be. Software programs can be made to reverse engineer themselves. Software, as a machine, can be understood by active observation, as opposed to static decompilation and prediction. In other words, you can reverse engineer software by using it, as opposed to reading code. Code is nothing more than an abstraction of runtime states. When software operates, it reverse engineers itself by design, exposing its conceptual abstraction to the CPU and memory. The problem is that computers only need to know the current state, and because of that, they discard this veritable treasure trove of information. Observation of software behavior provides no less data than static reverse engineering, and in fact provides a great deal more information that is easier to understand and costs less to obtain. Human reverse engineers need tools and methods to capture and analyze this data. Traditional debugging tools don't tie run-time information to abstract functionality because all this state information is too complex. But what the debugger doesn't see is precisely what the reverse engineer does see while running the program. The human mind grasps abstract functionality, the intent behind the seething mass of code and data. This is why automated program analysis can never replace the human mind. Humans use software at a high layer of abstraction, while the computer sees only the fine grains of detail. The challenge for the reverse engineer is to join the two extremes. Historically, this chasm between total abstraction and microscopic granularity has been bridged by static disassembly, and this is the reason most people haven't tackled reverse engineering.
In truth, most people who are daunted by this barrier could, in fact, be excellent reverse engineers. This is a terrible shame because there are many tools and techniques available for reverse engineering that do not, or at least should not, require reading disassembled instructions. And even though the tools can't go from fine grains to mountains automatically, proper usage can reveal the links between user action and execution under the hood. This talk introduces a new method of reverse engineering called Active Reversing. Active Reversing combines debugging tools with techniques of use such as substring scanning, access breakpoints, dataflow tracing, behavioral set operations, run tracing, data sampling, proximity browsing, comparative memory scans, hit counters, and more. Some of the tools and techniques have been in use for quite some time; others are new concepts. In either case, the techniques have never before been formally presented together as one methodology. Active Reversing is a fresh look at an old subject.
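
As a rough illustration of two of the techniques named above, behavioral set operations and hit counters, here is a minimal Python sketch. It is one plausible reading of those terms, not the tooling presented in the talk. It assumes a tracer has already recorded each run as a list of executed basic-block addresses. The function names (isolate_feature, hit_counts) and all addresses are hypothetical.

```python
from collections import Counter


def isolate_feature(baseline_runs, feature_runs):
    """Return addresses hit in every feature run but in no baseline run.

    Each run is an iterable of basic-block addresses (assumed input format).
    """
    if not feature_runs:
        return set()
    # Everything executed while the feature was NOT exercised.
    background = set().union(*baseline_runs)
    # Only what was executed in ALL runs where the feature WAS exercised.
    common = set.intersection(*(set(run) for run in feature_runs))
    # Set difference: code tied to the user action, minus background noise.
    return common - background


def hit_counts(trace):
    """Count how many times each address appears in a single trace."""
    return Counter(trace)


# Hypothetical traces: two runs where the user only idles,
# and two runs where the user triggers a "save" action.
idle_a = [0x401000, 0x401010, 0x401000]
idle_b = [0x401000, 0x401020]
save_a = [0x401000, 0x402000, 0x402040, 0x401010]
save_b = [0x401000, 0x402000, 0x402040, 0x403000]

candidates = isolate_feature([idle_a, idle_b], [save_a, save_b])
print(sorted(hex(addr) for addr in candidates))
print(hit_counts(idle_a).most_common(1))
```

The set difference is what ties a user action to code: blocks that run only when the action is performed are candidates for the feature's implementation, and they can be found without reading any disassembly. Intersecting several feature runs first filters out blocks that appeared in one run by coincidence.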