Declarative Demand-Driven Reverse Engineering
Yihao Sun∗, Jeffrey Ching†, and Kristopher Micinski‡
Department of Electrical Engineering and Computer Science, Syracuse University
Email: ∗[email protected], †[email protected], ‡[email protected]

Abstract: Binary reverse engineering is a challenging task because it often necessitates reasoning using both domain-specific knowledge (e.g., understanding entrypoint idioms common to an ABI) and logical inference (e.g., reconstructing interprocedural control flow). To help perform these tasks, reverse engineers often use toolkits (such as IDA Pro or Ghidra) that allow them to interactively explicate properties of binaries. We argue that deductive databases serve as a natural abstraction for interfacing between visualization-based binary analysis tools and high-performance logical inference engines that compute facts about binaries. In this paper, we present a vision for the future in which reverse engineers use a visualization-based tool to understand binaries while simultaneously querying a logical-inference engine to perform arbitrarily complex deductive inference tasks. We call our vision declarative demand-driven reverse engineering (D³RE for short), and sketch a formal semantics whose goal is to mediate interaction between a logical-inference engine (such as Soufflé) and a reverse engineering tool. We describe a prototype tool, d3re, which we are using to explore the D³RE vision. While still a prototype, we have used d3re to reimplement several common querying tasks on binaries. Our evaluation demonstrates that d3re enables both better performance and more succinct implementation of these common RE tasks.
I. INTRODUCTION

Binary reverse engineering (henceforth RE) is the process by which we start with some input binary (a sequence of bytes) and employ various reasoning principles to explicate its behavior when executed as code. While RE tasks are often partially automated (e.g., via decompilation), full automation is often impossible: the extreme semantic expressivity afforded to binaries (including encrypted code, stripped symbol tables, etc.) often necessitates open-ended exploration and case-specific reasoning. Recent literature suggests that many practitioners follow an iterative approach involving several rounds of hypothesis formation and validation or falsification, often assisted by a combination of static and dynamic analysis [1]–[3].

To rapidly interact with a binary, RE practitioners often use reverse engineering tools such as Ghidra [4], IDA Pro [5], or Radare2 [6]. The goal of these tools is to allow an RE to quickly explore the binary and visualize it (typically interactively, via a GUI or CLI) in a variety of ways. (Footnote: When unambiguous, we use the term RE to mean both the process of reverse engineering and a reverse engineering practitioner.) For example, a reverse engineer looking for a time bomb may first search for calls to the system’s time function, and then walk backwards to understand whether each call is associated with legitimate or malicious behavior. In doing so, the RE may need to reason about, e.g., indirect control flow, or even identify the time function (in a stripped binary). Because REs are expert users, and often skilled programmers, RE tools provide programmatic interfaces that enable REs to systematize reasoning tasks via extensions.
A broad range of popular extensions exist for several tools, performing such tasks as loading the results of static analyses [7], [8], interacting with debuggers [9], and identifying common cryptography-relevant code [10].

In this paper, we argue that deductive databases (e.g., Datalog) serve as a natural abstraction boundary between RE tools and logical inference tasks over binaries. We envision a future in which a reverse engineer interactively explores a binary using an RE tool while simultaneously querying arbitrarily complex logical properties written in a terse declarative style. We call this Declarative Demand-Driven Reverse Engineering (henceforth D³RE). In D³RE, an RE interacts with a deductive database by giving inputs (e.g., the currently highlighted address) to a rule-based deductive inference system written in a declarative language such as Datalog. Rules inductively compute relations over facts about the binary. As an example, consider a relation direct_call ⊆ Addr × Addr which relates callsite addresses (offsets within the binary) to procedure invocation target addresses. In our vision, D³RE allows REs to interactively compute with and visualize the results of queries over these deductive rules.

We see D³RE as a natural extension of several observations about the state of the art. First, many existing RE tools assemble databases to index various properties (e.g., addresses, symbols, etc.) of binaries for quick exploration. Deductive databases further allow REs to write arbitrary logical queries which are evaluated efficiently via, e.g., compilation to relational algebra kernels as done in Soufflé. Deductive databases have also enabled several recent advances in binary analysis demonstrating both efficiency and robustness over conventional techniques.
For example, the Datalog-based disassembler ddisasm achieves both faster and more precise disassembly than other state-of-the-art disassemblers, and OOAnalyzer uses Prolog to enable declarative recovery of classes from compiled C++ code.

In this short paper we describe our progress in implementing a prototype tool, d3re, which we are building to realize the D³RE vision. d3re allows REs to interactively define and calculate queries of arbitrary complexity over large production binaries and then visualize their results using Ghidra. To implement d3re we have designed an interface, which we call the mediator, that sits between a traditional Datalog solver and an RE tool. We briefly formalize this interaction between the RE tool and logic solver in Section III, and go on to describe our prototype Ghidra extension that enables visualizing the results of binary analyses in our tool. Using this formalism, we describe how d3re readily enables a broad range of binary analyses and sketch a vision for how we believe D³RE will prove to be a natural, ergonomic fit for reverse engineering.

We have measured the robustness of d3re in several ways. First, we wanted to know whether d3re could truly live up to our vision of being a natural replacement for the kinds of scripts REs already use in their day-to-day work. To evaluate this, we reimplemented a set of existing Ghidra scripts. We happily observed that d3re offered not only an ergonomic advantage (allowing us to write succinct but obviously correct queries) but also a performance advantage. For example, many Ghidra scripts play tricks to avoid the unnecessary complexity that would arise in a straightforward implementation, e.g., iterating over a set of functions in a loop to check a property, resulting in super-linear complexity. In d3re, the Datalog solver was naturally able to compile and organize this work efficiently. We discuss this and other results in Section IV.
We conclude with a brief overview of related work and our outlook on future directions in Section V.

Specifically, we claim the following three contributions:
• A formalization of our metadatabase as a database of databases used to optimize subsequent invocations of the Datalog solver (Section III).
• A prototype tool, d3re, consisting of a server which wraps ddisasm with logic to enable chaining multiple subsequent calls via the metadatabase. Also included in d3re is an extension to the Ghidra RE toolkit to enable visualizing results computed using d3re.
• An evaluation of d3re on a set of benchmarks demonstrating positive initial results indicating that d3re could replace present-day binary analysis infrastructure (e.g., Ghidra scripts) and directly enable more efficient and succinct implementation.

II. OVERVIEW
In this section, we demonstrate the vision and application of D³RE by illustrating how a reverse engineer might explicate a vulnerability due to an uninitialized global variable. We consider a particular binary, CROMU_00038, from DARPA’s Cyber Grand Challenge, which contains a function pointer that is left uninitialized when an invalid flag is set in the metadata portion of an input file [11], [12]. We demonstrate how our prototype tool, d3re, can be used to build a declarative query to find uninitialized function entry points and visualize them within Ghidra. We do not claim that d3re can immediately or automatically discover vulnerabilities. In this section we focus instead on how its declarative reasoning enables rapidly exploring a binary to uncover some property.

The vulnerable segment of code is a use-before-definition bug shown in Figure 1. The swap_word function pointer is initialized inside of the main function based on a value parsed from a TIFF header. If the flag matches neither expected value, the function pointer is left uninitialized and the final call in Figure 1 crashes.

    // swap_short and swap_word are only initialized within the if
    if (tiff_hdr->Byte_Order == 0x4949) {
        printf("Intel formatted integers\n");
        swap_word = intel_swap_word;
    } else if (tiff_hdr->Byte_Order == 0x4d4d) {
        printf("Motorola formatted integers\n");
        swap_word = motorola_swap_word;
    } else {
        printf("Invalid header values\n");
        _terminate(-1);
    }
    // might cause an uninitialized variable bug here
    offset = swap_word(tiff_hdr->Offset_to_IFD);

Fig. 1: Uninitialized variable vulnerability in CROMU_00038 source code.

    >>> load dl/use_def_global.dl
    >>> run dl/use_def_global.dl
    >>> load dl/uninitialized.dl
    >>> run dl/uninitialized.dl
    >>> highlight
    >>> comment
    >>> query use_before_def_global
    00004feb 0000a180 swap_short
    00005017 0000a188 swap_word
    ...
    0000515e 0000a180 swap_short

Fig. 2: d3re REPL session used in this overview.
Loading the binary: To begin an analysis of a binary, an RE will load the binary into a reverse engineering tool. In our current implementation of d3re, a user opens two processes simultaneously: a GUI-based instance of Ghidra, and a terminal running d3re’s REPL. The user can explore the binary using all of the normal features of Ghidra and use all of its conventional analyses (e.g., to recover entrypoints). However, d3re’s REPL communicates with Ghidra so that when d3re’s analysis finishes, Ghidra’s views update as appropriate.
Initial processing: It is conventional that reverse engineering tools will apply a set of analyses to a binary to disassemble it and index various items such as entrypoints and callsites. In d3re, the user builds queries in Datalog starting from a large initial set of Datalog rules that build on top of ddisasm, a Datalog-based disassembly engine [13]. Analogously to the indexing and analysis operations provided by Ghidra (and other RE tools), d3re invokes ddisasm once to build an initial database.

Building on top of ddisasm was initially a strategic choice: ddisasm already includes facilities to parse object files and transform them into input databases in the style required by Soufflé. Initially, we extended ddisasm’s set of rules with additional user-specific queries. This was a slow process, as ddisasm can take several minutes to run on large binaries.

    def_global(EA,dest) :-
        code(EA),
        instruction_get_dest_op(EA,Index,_),
        pc_relative_operand(EA,Index,dest),
        defined_symbol(dest,_,"OBJECT","GLOBAL",_,_).

    used_global(EA,dest,Index) :-
        code(EA),
        instruction_get_src_op(EA,Index,_),
        pc_relative_operand(EA,Index,dest),
        defined_symbol(dest,_,"OBJECT","GLOBAL",_,_).

    def_used_global(EA_def,GA,EA_used,Index) :-
        used_global(EA_used,GA,Index),
        block_last_def_global(EA_used,EA_def,GA).

    def_used_global(EA_def,GA,EA_used,Index) :-
        last_def_global(Block,EA_def,GA),
        code_in_block(EA_used,Block),
        used_global(EA_used,GA,Index),
        !block_last_def_global(EA_used,_,GA).
Fig. 3: Global variable use-def analysis.

This was at odds with our goal of enabling rapid real-time feedback to users of d3re.

D³RE builds upon a key observation that we have found crucial to enable efficient interactive binary analyses in practice: because Datalog is monotonic, we can evaluate an extended program (i.e., a program extended with a set of additional rules or queries) by using the database resulting from the calculation of the previous program. Thus, running ddisasm once allows pre-populating a large set of inferred relations for a wide range of interesting facts about binaries, including intraprocedural reachability and calling conventions.

When a binary is loaded, d3re invokes ddisasm with one slight modification: every Datalog relation in ddisasm’s rule database (used by ddisasm to build a disassembly) is modified to be an output relation. In unmodified ddisasm, only disassembly-relevant relations are output, and internal relations (e.g., those that relate to intraprocedural reachability) are not. By marking all of ddisasm’s relations as output relations, d3re provides them to the user as primitives with which to build queries over binaries. After the binary is loaded, all relations declared in ddisasm are available for querying. Additionally, ddisasm is run only once, even if the user uploads the same binary several times. All facts generated in this step are stored in a temporary folder on disk managed by the metadatabase (described in Section III).

Designing a query to explicate use-before-define:
In the D³RE approach, REs interactively build queries to highlight various portions of the program that match certain properties. They then manually inspect the results of their queries and use their intuition to build subsequent queries. Along the way, the RE may choose to add comments to various instructions, functions, or other forms and browse those instructions in Ghidra. In d3re, the communication between the logical rules and the state of the RE tool is reconciled by input and output tables: RE users can write queries that consume the state of the RE tool (such as currentAddress, the currently selected address) as input relations, perform logical inference, and leave their output in relations such as comment(addr,“vuln”).

(Footnote: A relevant analogy might be that ddisasm is the standard library of d3re.)

Like ddisasm, d3re uses the Soufflé Datalog engine to perform logical inference over binaries. Users of d3re can incrementally build up more rules in the interactive REPL (shown in Figure 2). Currently, our REPL allows adding rules only by loading new files. We plan to add direct support for entering new rules, along with error-reporting feedback, soon.

Knowing there was an uninitialized global function pointer being used, a user of d3re might first define a set of relations to build up def-use chains of global variables. Datalog code to implement these queries is illustrated in Figure 3. The last two rules build up a relation def_used_global(EA_def,GA,EA_used,Index), which infers that the global variable at address GA is defined at address EA_def and used at address EA_used at operand index Index. While this is a relatively coarse query, we envision the user could run the query on the binary to visualize a large answer set. In our setting, this can be done using the highlight or comment commands, which display the data marked to be highlighted by the most recent result computation.

Based on our definitions in Figure 3, we can define a relation for variables which are possibly used before they are defined. We demonstrate this in Figure 4. The first rule says that if a global variable is used at some address and no definition reaches that use, then we consider the variable uninitialized there. The second rule says that even when a use of a global variable has some definition associated with it, if that definition is nullptr, we still consider there to be a use-before-def vulnerability.

    use_before_def_global(EA_used,GA,Name) :-
        used_global(EA_used,GA,Index),
        !def_used_global(_,GA,EA_used,_),
        defined_symbol(GA,_,"OBJECT","GLOBAL",_,Name).

    use_before_def_global(EA_used,GA,Name) :-
        used_global(EA_used,GA,Index),
        def_used_global(EA_def,GA,EA_used,_),
        !def_null_global(EA_def,GA),
        defined_symbol(GA,_,"OBJECT","GLOBAL",_,Name).

Fig. 4: Uninitialized variables.

Refining the query: In d3re, users can easily access the result of a rule, and all facts generated by ddisasm, through the GUI by writing into output tables using d3re rules. Unfortunately, our above query produces over 50 possible results, and checking each occurrence would still be a time-consuming endeavor. Next, we narrow down the query space to the range of just the main function. We use an auxiliary predicate, code_in_range, which we seed with constants for the beginning and end of the main function, obtained by inspecting the binary in Ghidra.

    code_in_range(19490,21704).

    use_before_def_global(EA_used,GA,Name) :-
        code_in_range(from,to),
        EA_used >= from,
        EA_used < to,
        used_global(EA_used,GA,Index),
        !def_used_global(_,GA,EA_used,_),
        defined_symbol(GA,_,"OBJECT","GLOBAL",_,Name).

After the new rules are applied, the output of the program becomes empty. However, this does not imply that the program is free from the vulnerability. First, because of our constraint, only the main function is searched, so bugs may still hide in other functions.
Second, even if every use of a variable has a reaching definition, a null-pointer error can still occur: programmers may initialize a variable to NULL and use several non-total branches to initialize the pointer, leaving the pointer uninitialized at the join point when no branch fires. We modify our rules to account for this:

    def_null_global(EA,GA) :-
        def_global(EA,GA),
        instruction_get_src_op(EA,_,Op),
        op_immediate(Op,offset),
        offset=0.

    use_before_def_global(EA_used,GA,Name) :-
        code_in_range(from,to),
        EA_used >= from,
        EA_used < to,
        used_global(EA_used,GA,Index),
        def_used_global(EA_def,GA,EA_used,_),
        !def_null_global(EA_def,GA),
        defined_symbol(GA,_,"OBJECT","GLOBAL",_,Name).
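To make the two kinds of findings concrete, the following Python sketch replays the same reasoning over a handful of toy facts. It is a minimal illustration only: the addresses, the flattened relation shapes, and the names `use_before_def` and `code_range` are hypothetical, not part of d3re or ddisasm. It follows the prose reading of the second rule (a use is flagged when its reaching definition stores null), which is an assumption about the intended semantics.

```python
# Toy facts with hypothetical addresses; relation shapes are simplified
# from the Datalog rules in the text.
used_global = {(0x5017, 0xA188), (0x4FEB, 0xA180), (0x6000, 0xA190)}    # (EA_used, GA)
def_used_global = {(0x4000, 0xA188, 0x5017), (0x4100, 0xA190, 0x6000)}  # (EA_def, GA, EA_used)
def_null_global = {(0x4000, 0xA188)}                                    # (EA_def, GA)
symbol_name = {0xA180: "swap_short", 0xA188: "swap_word", 0xA190: "counter"}


def use_before_def(used, def_used, def_null, names, code_range=None):
    """Flag uses with no reaching definition, or whose reaching definition is null."""
    results = set()
    for ea_used, ga in used:
        # Optional restriction to one address range, like code_in_range.
        if code_range is not None and not (code_range[0] <= ea_used < code_range[1]):
            continue
        defs = {d for (d, g, u) in def_used if g == ga and u == ea_used}
        no_def = not defs                                   # first rule
        null_def = any((d, ga) in def_null for d in defs)   # second rule (prose reading)
        if no_def or null_def:
            results.add((ea_used, ga, names[ga]))
    return results


everywhere = use_before_def(used_global, def_used_global, def_null_global, symbol_name)
in_range = use_before_def(used_global, def_used_global, def_null_global,
                          symbol_name, code_range=(0x5000, 0x5100))
```

On these facts, the unrestricted query flags `swap_short` (no definition) and `swap_word` (null definition), while the range-restricted query keeps only `swap_word`. A Datalog engine expresses the same thing as joins and stratified negation rather than explicit loops.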
This change results in 19 addresses to search. Combining these results with the use-def information from the previous step and the intraprocedural control-flow graph in Ghidra, we can fairly easily infer where the global variable swap_word is initialized and that both conditional jumps fail, and observe a subsequent use of swap_word which will trigger a crash. At any stage in our process, we can sync Ghidra’s UI with the current database using several REPL commands (an example is shown in Figure 5).

In a fully fledged implementation of d3re, we hope to have UI gadgets (or templates) to help users interactively build queries. For example, we may allow the user to select a region of the binary and build a rule that applies only to that region, or right-click on a function and build a rule specific to callers of that function. We believe this will need to be informed by a combination of interviews with expert users, participatory design, and (perhaps) user studies. This is work we plan to undertake now that we have seen initial success with d3re.

We conclude this section by remarking upon the nature of our analyses. Our analyses would be considered naive by the standards of industrial static analyses. Indeed, our reasoning is not even sound: we can restrict ourselves to looking at results for only one function, or ignore complex behavior. Still, we believe that this iterative ad-hoc reasoning is a technique many reverse engineers already employ. The vision of D³RE is to harmoniously leverage state-of-the-art deductive reasoning engines while performing human-guided RE tasks.

III. DESIGN AND IMPLEMENTATION
In this section, we present a formal semantics for D³RE and describe our implementation of d3re. The high-level architecture of d3re is outlined in Figure 6. Conceptually, the key idea of our semantics is to maintain a metadatabase to allow efficient incremental reuse of previously computed databases. In d3re, this metadatabase takes the form of a server which accepts Datalog programs to run to a fixed point.

The metadatabase (server) interacts with both the REPL process and Ghidra to render output databases into view annotations (e.g., highlights or comments) in the Ghidra UI based on REPL commands. Our visualization is currently limited to printing to Ghidra’s console, highlighting a set of lines (typically some output relation), or annotating a line with a comment (whose contents may be dynamically determined via a Datalog query). We plan to investigate adding comments to other Ghidra UI elements (such as inferred classes) and other visual integration as future work.
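The reuse idea can be illustrated with a small Python sketch of naive bottom-up evaluation. Everything here is a toy assumption: the `direct_call` facts, the two rules, and the `fixpoint` helper are hypothetical stand-ins, not d3re or Soufflé code. It shows that evaluating an extended positive program from a cached database reaches the same result as starting from the input facts, in fewer rounds.

```python
# Facts are (relation, a, b) tuples; each rule maps the current fact set
# to the facts it can derive in one step.

def base_reach(facts):
    """reach(a, b) :- direct_call(a, b)."""
    return {("reach", a, b) for (rel, a, b) in facts if rel == "direct_call"}


def transitive_reach(facts):
    """reach(a, c) :- reach(a, b), direct_call(b, c)."""
    reach = [(a, b) for (rel, a, b) in facts if rel == "reach"]
    calls = [(a, b) for (rel, a, b) in facts if rel == "direct_call"]
    return {("reach", a, c) for (a, b) in reach for (b2, c) in calls if b == b2}


def fixpoint(rules, facts):
    """Naive evaluation: apply all rules until no new facts appear."""
    db = set(facts)
    rounds = 0
    while True:
        new = set().union(*(rule(db) for rule in rules)) - db
        if not new:
            return db, rounds
        db |= new
        rounds += 1


edb = {("direct_call", 1, 2), ("direct_call", 2, 3), ("direct_call", 3, 4)}

# The first program is run once and its output database is kept.
cached, _ = fixpoint([base_reach], edb)

# The extended program is evaluated from scratch and from the cached database.
cold, cold_rounds = fixpoint([base_reach, transitive_reach], edb)
warm, warm_rounds = fixpoint([base_reach, transitive_reach], cached)
```

Both runs produce the same set of `reach` facts, and the warm run needs fewer rounds because the work of the earlier program is not repeated. The sketch covers only positive rules; with negation, reuse would also have to respect stratification.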
A. Formal semantics of D³RE

Due to space restrictions, we present only a sketch of a formal semantics for D³RE. Semantics of Datalog programs are typically phrased in terms of an extensional database (EDB), an extensionally enumerated set of ground facts, and an intensional database (IDB), the set of rules defining the program [14]. Datalog’s semantics is given by a least fixed point of an “immediate consequence” operator over the rules for the program. Because Datalog programs have a finite Herbrand base (set of atoms), this fixed point necessarily exists (though in practice Datalog engines allow extra-logical behavior such as arithmetic). Datalog’s conventional semantics is monotonic, in the sense that facts are only accumulated, never retracted, as the fixed-point computation evolves. Negation is allowed only when it may be stratified.

Fig. 5: Ghidra with highlights and comments declaratively specified to output results inferred via d3re for our example.

Fig. 6: High-level components and their interactions in d3re.

TABLE I: Script size (lines of code) of Ghidra script (Python) vs. d3re Datalog.

    Script       Ghidra (Python)   d3re (Datalog)
    non-xor      33                8
    overflow     60                18
    basicblk     37                4
    findcrypto   166               45

We define an
EDB metadatabase as a graph of EDBs with labeled edges, $(\Delta, \xrightarrow{P})$, where $\Delta$ is a set of EDBs, each EDB enumerating tuples for a given set of relations, and $\xrightarrow{P}$ is a relation in $\Delta \times \mathit{Rules} \times \Delta$. When we process a program, $P$, using an input EDB, we traverse the graph $(\Delta, \xrightarrow{P})$ to find the best compatible EDB from which to start execution of $P$. Aided by Datalog’s monotonicity, we define an EDB as compatible if it was produced by a subset of the rules (or facts) from the input program / EDB. We conclude our formalism sketch by remarking that $(\Delta, \xrightarrow{P})$, given our usage, also forms a lattice.

B. Implementation of d3re

d3re is implemented in two parts: a REPL that communicates with Ghidra’s GUI and a background service to manage the metadatabase and run the Datalog engine. The REPL currently communicates with Ghidra via a third-party extension named ghidra_bridge [15], which we plan to replace imminently with an extension using protocol buffers.

To execute a Datalog program P, d3re analyzes the file using the logic sketched in the above section to determine the best compatible EDB to use. In the common case, a user will gradually accumulate a stream of programs P, P′, P′′ consisting of a mix of rules and assumptions. In the future, we envision that certain assumptions (e.g., about calling conventions) may be implemented as GUI extensions rather than, e.g., manually enumerated facts. After each run, the metadatabase will index the output facts and associate them with the program P, establishing an edge in the aforementioned graph. In our experiments, we refer to this as “caching.”

TABLE II: Running time of Ghidra scripts vs. equivalent implementation in d3re (all numbers in seconds).

    Script    Tool     bison   souffle   gzip    re2c    redis   rsync
    non-xor   Ghidra   3.569   107.5     2.205   3.903   10.52   3.050
    non-xor   d3re

TABLE III: Runtime of successive invocations to d3re with (C) and without (S) rule caching.

    Binary    Mode   ddisasm   stack_var   heap_var   static_var   unl_static
    souffle   C      170       11.88       58.35      5.008        0.039
    souffle   S      170       11.79       66.02      67.00        66.52
    bison     C      7         0.932       1.409      0.545        0.022
    bison     S      7         0.934       1.916      2.122        2.075
    re2c      C      9         1.457       4.417      0.704        0.025
    re2c      S      9         1.494       5.257      5.449        5.458
    redis     C      11        1.918       2.544      1.302        0.025
    redis     S      11        1.919       3.525      3.712        3.726
    rsync     C      8         0.766       0.908      0.481        0.028
    rsync     S      8         0.783       1.325      1.423        1.384
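The lookup performed by the metadatabase in Section III can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the d3re server: the class `MetaDatabase` and its methods are hypothetical, rules are represented by opaque names, cached EDBs are keyed by the accumulated rule set rather than stored as an explicit labeled graph, and "best" is taken to mean the compatible entry produced by the largest rule subset.

```python
class MetaDatabase:
    """Cache of previously computed EDBs, keyed by the rules that produced them."""

    def __init__(self):
        self._cache = {}  # frozenset of rule names -> frozenset of facts

    def store(self, rules, facts):
        """Record the database produced by running exactly `rules`."""
        self._cache[frozenset(rules)] = frozenset(facts)

    def best_compatible(self, program):
        """Return (reused_rules, facts) for the largest cached subset of `program`."""
        wanted = frozenset(program)
        compatible = [rules for rules in self._cache if rules <= wanted]
        if not compatible:
            return frozenset(), frozenset()
        best = max(compatible, key=len)
        return best, self._cache[best]

    def remaining_rules(self, program):
        """Rules that still have to be evaluated on top of the reused EDB."""
        reused, _ = self.best_compatible(program)
        return frozenset(program) - reused


meta = MetaDatabase()
meta.store({"ddisasm"}, {"code(0x1000)"})
meta.store({"ddisasm", "stack_var"}, {"code(0x1000)", "stack_var(0x1000)"})

reused, facts = meta.best_compatible({"ddisasm", "stack_var", "heap_var"})
todo = meta.remaining_rules({"ddisasm", "stack_var", "heap_var"})
```

Here the query that adds `heap_var` starts from the database already computed for `ddisasm` and `stack_var`, so only the new rule remains to be evaluated. Choosing by subset size is one plausible ordering; a real implementation might instead weigh the cost of the remaining rules.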
IV. EVALUATION
We evaluated d3re both qualitatively, by implementing several queries, and quantitatively, by measuring its performance in benchmarks. While d3re is still a work in progress, we had several hypotheses we aimed to test as we designed and conducted these experiments. First, we wanted to understand whether d3re provided the necessary building blocks to enable replacing existing Ghidra scripts. Second, we wanted to understand whether d3re could offer performance competitive with the kinds of Ghidra scripts that reverse engineers typically use. Last, we wanted to understand the performance of d3re across several repeated queries that might mirror a realistic end-to-end workload.

Ghidra Script Replication Study: We wanted to determine whether d3re could realistically be used to accomplish the kinds of tasks that reverse engineers face on a day-to-day basis. This is an admittedly challenging question, which we plan to eventually evaluate in several ways, including user studies. However, as initial work in this direction we arbitrarily selected four Ghidra scripts listed in the awesome-ghidra GitHub repository [16]. The scripts we chose are listed in Table I, along with their corresponding lines of code in Python and Datalog. While Ghidra scripts may consist of a mix of Python and Java, our experience is that most scripts use a small subset of the Python API. The first three are relatively small and find instructions that match a specific template, e.g., non-xor finds xor instructions that aren’t zeroing registers, and overflow heuristically searches for potential overflows in calls to common functions such as strcpy. Our largest was findcrypto, which looks for common cryptographic constants.
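As a rough picture of what a template-matching script such as non-xor checks, here is a minimal Python sketch over toy instruction records. It is an assumption-laden illustration: the `Instruction` type, the sample listing, and `non_zeroing_xors` are hypothetical and use neither the Ghidra API nor ddisasm’s relations.

```python
from typing import NamedTuple


class Instruction(NamedTuple):
    address: int
    mnemonic: str
    operands: tuple


def non_zeroing_xors(instructions):
    """Keep xor instructions whose operands differ, i.e. not `xor r, r`."""
    return [
        ins for ins in instructions
        if ins.mnemonic.lower() == "xor" and len(set(ins.operands)) > 1
    ]


listing = [
    Instruction(0x1000, "xor", ("eax", "eax")),     # zeroing idiom, filtered out
    Instruction(0x1002, "xor", ("eax", "ebx")),     # kept
    Instruction(0x1004, "mov", ("eax", "ebx")),     # not an xor
    Instruction(0x1006, "XOR", ("ecx", "0x5a5a")),  # kept, mnemonic case-insensitive
]

hits = non_zeroing_xors(listing)
```

In a declarative setting the same filter becomes a single rule joining an instruction relation with an inequality on its operands, which is consistent with the short Datalog line counts reported in Table I.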
Qualitative Results of our Replication Study: Our experience using d3re to replace Ghidra scripts must be understood in the context that we are expert users and the developers of d3re. However, we were pleasantly surprised that d3re enabled us to succinctly write equivalent implementations of each Ghidra script: we rewrote each script in substantially less Datalog code. This is because the declarative nature of Datalog eliminates the need for much of the conventional ceremony that we found in our evaluation scripts around, e.g., looping over instructions and checking against a type. Key to D³RE’s success, we believe, is its ability to directly use relations from ddisasm: we found that relations covering much of the necessary work of, e.g., filtering instructions by their type or operand were very useful in achieving succinct Datalog in practice. We are in the initial planning stages of developing a reverse engineering tutorial (or mini-course) around d3re, and are hoping to use this to recruit developers to get a more realistic assessment of d3re’s usability by professional REs.
Quantitative Results of our Replication Study: We hoped that d3re, being based on a high-performance Datalog solver, would offer performance competitive with Ghidra’s current scripts. Each of our evaluation scripts processed the entire binary and would highlight or label certain instructions. To time the Ghidra scripts, we called Python’s standard time function before and after the script’s work. We evaluated the corresponding Datalog program by using Soufflé’s internal performance timers. We then benchmarked Ghidra vs. d3re on a corpus of six binaries (all smaller than 10MB), five from ddisasm’s test suite and Soufflé, shown at the top of Table II. We used the latest versions of each pre-built in the latest Arch Linux, but we used a pre-built version of Soufflé. For each script, we waited for all of Ghidra’s typical analyses to finish, and similarly we ran ddisasm to build up the initial input database for d3re.

The body of Table II compares the runtime of each Ghidra script versus its corresponding implementation in d3re. The single occurrence of – indicates that Ghidra did not finish within an hour. Broadly, we found that d3re outperformed Ghidra for each of the scripts in our replication study. As we had hoped, d3re’s design allowed us to leverage useful relations from ddisasm. We found that many scripts do things like naive loops over sets of functions or symbols to locate some property. By contrast, the declarative style of d3re allowed us to write these not only more succinctly (e.g., Datalog naturally aggregates results) but also more efficiently: Soufflé compiles input programs to efficient relational algebra kernels that loop only when necessary. We did observe various ways in which d3re’s limitations could cause performance issues. For example, the findcrypto script scans the binary for 256-byte segments of code. d3re is built on Soufflé, which supports 64-bit primitive ints, but not 256-byte sequences. Thus, we had to build up sequences via a set of Datalog rules, causing an inefficient memory representation due to the duplication needed to represent subsequences as Datalog facts.
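The cost of representing subsequences as facts can be estimated with a short Python sketch. It is only an illustration of the duplication effect under an assumed encoding (one fact per byte offset, each holding a full window); the helpers `window_facts` and `stored_bytes` are hypothetical, and the encoding actually used in d3re, built from 64-bit values via Datalog rules, may differ.

```python
def window_facts(data: bytes, width: int):
    """One (offset, window) fact for every `width`-byte window of `data`."""
    return {
        (offset, data[offset:offset + width])
        for offset in range(len(data) - width + 1)
    }


def stored_bytes(facts):
    """Total payload bytes held across all window facts."""
    return sum(len(window) for _, window in facts)


data = bytes(range(256)) * 4           # 1024 bytes of toy input
facts = window_facts(data, 256)

num_facts = len(facts)                 # 1024 - 256 + 1 windows
payload = stored_bytes(facts)          # each input byte is copied into many windows
blowup = payload / len(data)
```

For this 1024-byte input there are 769 windows holding 196,864 payload bytes in total, roughly 192 times the size of the input. Overlapping windows repeat almost every byte, which is the kind of duplication the paragraph above describes.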
Evaluating End-to-End Behavior in Subsequent Invocations: To understand the effect of caching via repeated calls to d3re, we ran four subsequent analysis queries in a row, both with our caching-based approach and without caching (wherein each query started only with the results of ddisasm). Our results are shown in Table III: the time of the cached run (C) is shown above the time for the corresponding sequential run (S). As each query builds on the previous, we expect caching to reduce the amount of work and commensurately reduce the runtime. stack_var finds stack-allocated variables, while heap_var calculates stack variables holding pointers to heap values based on stack_var. static_var and unl_static attempt to find uninitialized global variables. Overall, we found that rule caching was especially important on larger binaries compared with sequential runs, justifying our choice to structure the metadatabase as a graph.

V. RELATED WORK AND CONCLUSION
We conclude with a brief discussion of proximately related work that lies at the synthesis of reverse engineering and static / dynamic analysis, and contextualize this work in terms of our aspirations for the future of D³RE. There has been extensive work using logic programming, and in particular Datalog, for static analysis of higher-level languages such as Java [17]–[19]. The success of the Soufflé Datalog engine has inspired recent adoption of logic programming within the binary analysis community. For example, Datalog Disassembly uses Soufflé to achieve both faster and more precise disassembly than the state-of-the-art disassembler Ramblr [13]. Similarly, OOAnalyzer uses XSB-Prolog, a version of Prolog implemented as a library [20]. We are currently reimplementing OOAnalyzer in d3re, targeting C++ Linux binaries. We feel particularly excited about this direction because we believe Soufflé will be immediately more scalable than XSB-Prolog.

While there is a broad range of plugins for Ghidra and IDA Pro to load the results of static analyses, we believe d3re is the first to focus on the combination of open-ended deductive logical inference and rapid interactivity (enabled by our metadatabase). We believe the most closely related work is Ponce [21], which enables GUI-based symbolic execution. We plan to integrate symbolic execution into d3re as a long-term goal, inspired by the recent work of Formulog [22].

Our goal in this work was to introduce a new vision for reverse engineering, D³RE, wherein expert users rapidly query high-performance logical inference engines to help them accomplish their day-to-day work in RE, vulnerability construction, and penetration testing. Visualization-based tools such as Ghidra are of immense value in understanding a binary, but have fundamentally different design considerations than high-performance logical inference engines (such as Soufflé). Recent work in compiling Datalog to parallel relational algebra (e.g., Gilray et al. [23]) has enabled a new frontier in the scale of Datalog-based analyses. We hope that developments such as these will someday enable fully realizing the vision of D³RE to help reverse engineers perform powerful static binary analyses at unprecedented scale.
REFERENCES

[1] D. Votipka, S. Rabin, K. Micinski, J. S. Foster, and M. L. Mazurek, “An observational investigation of reverse engineers’ processes,” in USENIX Security 2020, pp. 1875–1892, 2020.
[2] J. Smith, B. Johnson, E. Murphy-Hill, B. Chu, and H. Richter Lipford, “Questions developers ask while diagnosing potential security vulnerabilities with static analysis,” pp. 248–259, Aug. 2015.
[3] B. Johnson, Y. Song, E. Murphy-Hill, and R. Bowdidge, “Why don’t software developers use static analysis tools to find bugs?,” pp. 672–681, May 2013.
[4] “Ghidra released by National Security Agency.” https://ghidra-sre.org/.
[5] Hexray, Hex-rays: The IDA Pro disassembler and debugger.
[6] “Radare2.” https://github.com/radareorg/radare2.
[7] E. Schulte, J. Dorn, A. Flores-Montoya, A. Ballman, and T. Johnson, “GTIRB: Intermediate representation for binaries,” July 2019.
[8] Grammatech, “Gtirb.” https://github.com/GrammaTech/gtirb-ghidra-plugin.
[9] “ret-sync.” https://github.com/bootleg/ret-sync.
[10] “py-findcrypt-ghidra.” https://github.com/AllsafeCyberSecurity/py-findcrypt-ghidra.
[11] “Grammatech’s Cyber Grand Challenge program repository.” https://github.com/GrammaTech/cgc-cbs. Accessed: 2020-01-10.
[12] “Qualifier challenge - CROMU_00038.” https://github.com/GrammaTech/cgc-cbs/tree/master/cqe-challenges/CROMU_00038. Accessed: 2020-01-10.
[13] A. Flores-Montoya and E. Schulte, “Datalog disassembly,” in USENIX Security Symposium (USENIX Security 20), 2020.
[14] S. Ceri, G. Gottlob, and L. Tanca, “What you always wanted to know about Datalog (and never dared to ask),” IEEE Transactions on Knowledge and Data Engineering, vol. 1, no. 1, pp. 146–166, 1989.
[15] “Ghidra bridge.” https://github.com/justfoxing/ghidra_bridge. Accessed: 2020-01-10.
[16] “Awesome ghidra.” https://github.com/AllsafeCyberSecurity/awesome-ghidra.
[17] Y. Smaragdakis, G. Kastrinis, and G. Balatsouras, “Introspective analysis: context-sensitivity, across the board,” in Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 485–495, 2014.
[18] H. Jordan, B. Scholz, and P. Subotić, “Soufflé: On synthesis of program analyzers,” in International Conference on Computer Aided Verification, pp. 422–430, Springer, 2016.
[19] B. Scholz, H. Jordan, P. Subotić, and T. Westmann, “On fast large-scale program analysis in Datalog,” in Proceedings of the 25th International Conference on Compiler Construction, CC 2016, (New York, NY, USA), pp. 196–206, Association for Computing Machinery, 2016.
[20] E. J. Schwartz, C. F. Cohen, M. Duggan, J. Gennari, J. S. Havrilla, and C. Hines, “Using logic programming to recover C++ classes and methods from compiled executables,” in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, CCS ’18, (New York, NY, USA), pp. 426–441, Association for Computing Machinery, 2018.
[21] “Ponce (IDA Pro plugin).” https://github.com/illera88/Ponce. Accessed: 2020-01-10.
[22] A. Bembenek, M. Greenberg, and S. Chong, “Formulog: Datalog for SMT-based static analysis,” Proc. ACM Program. Lang., vol. 4, Nov. 2020.
[23] T. Gilray and S. Kumar, “Distributed relational algebra at scale,” pp. 12–22, 2019.