Monet: A User-oriented Behavior-based Malware Variants Detection System for Android
Mingshen Sun, Xiaolei Li, John C.S. Lui, Fellow, IEEE, ACM, Richard T.B. Ma, Zhenkai Liang
Technical Report, arXiv [cs.CR]
Abstract: Android, the most popular mobile OS, has around 78% of the mobile market share. Due to its popularity, it attracts many malware attacks. In fact, around one million new malware samples are discovered per quarter [1], and it was reported [2] that over 98% of these new malware samples are in fact “derivatives” (or variants) of existing malware families. In this paper, we first show that the runtime behaviors of malware’s core functionalities are in fact similar within a malware family. Hence, we propose a framework that combines “runtime behavior” with “static structures” to detect malware variants. We present the design and implementation of MONET, which has a client module and a backend server module. The client module is a lightweight, in-device app for behavior monitoring and signature generation, and we realize this using two novel interception techniques. The backend server is responsible for large-scale malware detection. We collect malware samples and the top 500 benign apps to carry out extensive experiments on detecting malware variants and defending against malware transformation. Our experiments show that MONET can achieve around 99% accuracy in detecting malware variants. Furthermore, it can defend against 10 different obfuscation and transformation techniques while incurring only low performance and battery overhead. More importantly, MONET automatically alerts users with intrusion details so as to prevent further malicious behaviors.
Mingshen Sun and John C.S. Lui are with the Department of Computer Science and Engineering, The Chinese University of Hong Kong. Part of this work was done during Mingshen’s internship at National University of Singapore. Email: {mssun, cslui}@cse.cuhk.edu.hk. Xiaolei Li, Richard T.B. Ma, and Zhenkai Liang are with the School of Computing, National University of Singapore. Email: [email protected], {tbma, liangzk}@comp.nus.edu.sg.
I. INTRODUCTION
Android is a mobile operating system from Google, and Android-powered mobile devices dominated the smartphone OS market in the first quarter of 2016 [3]. Android applications (apps for short) can be downloaded not only from Google’s official market, Google Play, but also from third-party markets [4], [5], forums [6] and web sites. Although Google Play scans uploaded apps to reduce malware [7], other markets and sites usually do not have sufficient malware screening, and they have become the main hotbeds for spreading Android malware. As a result, Android attracts millions of malware samples. It is reported that 97% of mobile malware is on the Android platform [8].
Android provides various security mechanisms, such as the permission mechanism [9] and app verification [10]. The permission mechanism constrains the functionalities of an app: apps can only use permissions that are explicitly declared in their manifest files. When installing an app, users can review the requested permissions to decide whether to install the app or not. The permission system makes it difficult for attackers to obtain arbitrary privileges, but it does not help if the user accepts dangerous permissions requested by malware (and unfortunately, many users do exactly that). In addition, because of the permission abuse problem [11]–[13], malware can still find its way onto many Android devices. Furthermore, researchers have also proposed a number of novel attack methods [14]–[18] targeting Android.
Malware detection is the key to providing Android security. Due to differences in architecture, application structure and distribution channels, Android is very different from traditional platforms, hence conventional detection methods cannot be easily adapted to Android systems. To detect Android malware, a number of systems have been proposed by industry and the research community. A widely deployed solution is to scan apps in the Android application market, e.g., the Bouncer scanner [7] in the Google Play Store. This helps to reduce (but not eliminate) malware in the Google Play market. However, due to the openness of the Android ecosystem, users often install apps from other markets or download them directly from other sites (e.g., web forums). Hence, it is important to have in-device detection systems to target malicious apps.
Broadly speaking, there are two types of in-device malware detection systems. The first type performs static malware detection. Systems of this type [11], [19]–[21] use static information, such as API calling information and control flow graphs, to generate signatures for detection. For example, anti-virus engines scan the files of apps after their installation. However, studies [22], [23] have shown that these types of anti-virus engines can be easily bypassed using transformation attacks (i.e., code obfuscation techniques such as package name substitution and reflection). Furthermore, sophisticated signature generation and signature matching techniques based on control flow analysis incur considerable computation overhead and consume energy on mobile devices with limited battery resources, which prevents them from being adopted as in-device detection systems.
The second type of in-device detection system is the dynamic intrusion prevention system, as seen in several products [24]–[26] and research studies [27]–[29]. These systems work in the background and monitor apps at runtime. Once they discover any suspicious behavior, a notification pops up to alert the user. Note that suspicious behaviors are usually defined in terms of sensitive APIs. Many benign apps (e.g., text message management apps) may also invoke these APIs (e.g., the API for sending text messages) for legitimate reasons. Therefore, this type of system may raise false alerts, which makes intrusion notifications annoying and less useful. Moreover, a study [27] also shows that existing products in the market can be easily circumvented.
According to a survey [2], it was reported that over
98% of new malware samples are in fact derivatives (or variants) of existing malware families. These malware variants use more sophisticated techniques, such as dynamic code loading, manifest cheating, and string and call graph obfuscation, to hide themselves from existing detection systems. Although these techniques can help malware hide its malicious logic, we observe that the “runtime behaviors” of malware’s core functionalities, such as unauthorized subscription to premium services or privilege escalation at runtime, remain unchanged. The runtime behaviors of a new malware variant and its earlier generation are usually very similar. A detection system based on the runtime behaviors of malware will therefore be able to detect most malware and their variants more reliably. In addition, the static structures of malware are often similar within a malware family.
With this observation, we present the design and implementation of MONET, an Android malware detection system that combines “static logic structures” and “dynamic runtime information”. MONET consists of a client module and a backend server module. The client module is a lightweight, in-device app for malware behavior monitoring and signature generation using two novel interception techniques, while the backend server module is responsible for malware signature detection. Our system can accurately describe the behaviors of an app to detect and classify malware variants and to defend against obfuscation attacks. We focus on classifying malware based on behavior similarity. MONET’s client module can be easily deployed on any Android mobile device. Moreover, it has low computational overhead and low demand on battery resources. Specifically, we make the following contributions:
• We design and implement a runtime behavior signature which can represent both the logic structures and the runtime behaviors of an app. Our runtime behavior signature is effective in detecting malware variants and transformed malware.
• We implement a lightweight, in-device malware detection system for Android devices. We propose two novel interception techniques, and show that the system is easy to deploy and provides informative alerts to users.
• We implement the solution, and demonstrate its effectiveness and its low overhead, both on CPU and battery resources.
The rest of the paper is organized as follows. Section II introduces the necessary background on Android. In Section III, we present the design of the runtime behavior signature. In Section IV, we describe the MONET system and its implementation details. In Section V, we evaluate the effectiveness and performance of MONET. Section VI presents related work, and Section VII concludes.
II. BACKGROUND
In this section, we introduce the essential background knowledge of Android malware variants and their evolution. We also discuss the intent interface and the binder mechanism, which are needed to understand the design of our interception techniques.
A. Malware Variants and Evolution
To circumvent detection and to quickly deploy malware, hackers usually do not develop new malware from scratch, but rather improve existing logic or add new malicious logic to existing malware. They also repackage apps: they use disassembly tools [30], [31] to disassemble a benign app, inject malicious logic into it, and then repackage it as a new but malicious app. We call a set of malware with similar logic a malware family. Moreover, if anti-virus engines can successfully detect this malware, malware writers will update parts of the logic of the original malware using obfuscation techniques. The newly generated malware will behave similarly to the original. We call such malware a “variant” within the malware family. According to a report [2], many Android malware samples are variations of existing malware. For example, the DroidKungfu family has four variants. They use native code, string obfuscation and encryption to make the malware more complicated and more difficult to detect. Studies [22], [23] have shown that anti-virus engines can be easily bypassed using simple transformations. We call static and automatic transformation techniques, such as string obfuscation, junk instruction insertion and class renaming, “transformation attacks”. Therefore, detecting malware variants and defending against transformation attacks are challenging problems.
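To make the notion of a transformation attack concrete, the following minimal sketch shows why a signature computed over static identifiers changes when packages and classes are renamed, while the sequence of sensitive actions stays the same. It is an illustration only and not part of MONET: the toy app dictionaries and the names `rename_transform`, `static_signature` and `behavior_trace` are hypothetical.

```python
import hashlib

# Toy model of an app: static identifiers plus the sensitive actions it performs.
# The values are made up for illustration and are not taken from a real sample.
original = {
    "package": "com.example.kungfu",
    "classes": ["com.example.kungfu.Main", "com.example.kungfu.Root"],
    "actions": ["execve:root_exploit", "socket:c2_server", "sms:send_premium"],
}

def rename_transform(app, new_package):
    """Simulate a package/class renaming transformation."""
    old = app["package"]
    return {
        "package": new_package,
        "classes": [c.replace(old, new_package) for c in app["classes"]],
        "actions": list(app["actions"]),  # runtime behavior is left untouched
    }

def static_signature(app):
    """Hash over static identifiers only, as a naive static scanner might do."""
    data = "|".join([app["package"]] + sorted(app["classes"]))
    return hashlib.sha256(data.encode()).hexdigest()

def behavior_trace(app):
    """The behaviors observed at runtime, independent of identifier names."""
    return tuple(app["actions"])

variant = rename_transform(original, "org.sample.game")
print(static_signature(original) == static_signature(variant))  # False
print(behavior_trace(original) == behavior_trace(variant))      # True
```

In this toy setting, a detector keyed on `static_signature` would miss the renamed variant, whereas one keyed on `behavior_trace` would still recognize it.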
B. Intent & Binder Mechanism
There are four types of components in an Android app: activity, service, content provider and broadcast receiver. An activity represents a screen on the device that can interact with users. A service is a long-running background component that has no user interface; its function is to support tasks running in the background (such as playing music). Android provides many system-level services in the framework layer, for example, the activity manager and the SMS manager. Developers can also define services in their apps to provide functions for other apps. Content providers manage structured data, such as SQLite databases, for apps. Broadcast receivers listen to events from other components, such as boot-completed events and SMS-received events.
Because each component has its own functionality and is isolated from other components, Android provides an interface called the intent to connect these components. An intent is a messaging object that allows a component to request an action from another component. Normally, a component can use intents to start an activity, start a service or deliver a broadcast. There are two types of intents: explicit intents and implicit intents. An explicit intent starts a component by specifying its full class name. For instance, knowing the names of the classes, developers can use an explicit intent to start an activity or service in their own apps. An implicit intent, in contrast, does not need the name of a component; it declares a general action to perform, and other components that are capable of performing this action will handle the intent. For example, if an app wants to make a phone call, it can use an implicit intent with a dialing action (i.e., android.intent.action.DIAL) to start a dialer activity. However, if there is more than one dialer app, the system will pop up a dialog for the user to choose.
From the operating system’s perspective, one intent call involves three steps, which we illustrate in Figure 1. Suppose that activity A in an app wants to start the service S using an intent. Firstly, A requests the Service Manager to provide the address of the Activity Manager, which is responsible for activity-related operations (e.g., starting activities and services). Then, A requests the Activity Manager to start the service S. In the final step, the Activity Manager tells the app to start the service S.
Fig. 1. Intent and binder mechanism.
Because each app runs in a sandbox within the Android system, components belonging to different apps cannot directly communicate with each other in user space. Instead, the Android system provides a kernel driver, called the binder, in kernel space for inter-process communication. We want to emphasize that the intent is a high-level abstraction in the application framework layer, and the implementation of intents utilizes the binder driver in the kernel layer. Figure 1 illustrates the work flow of the intent call in the previous example. All the communications in the three steps mentioned above need to go through the binder driver. We call a binder communication a binder transaction. There are several attributes in each binder transaction. The binder descriptor is a string that represents the target of the transaction. The transaction code is an integer indicating the action of the transaction. For instance, in the binder transaction from an app to the Activity Manager for starting an activity, the descriptor is android.app.IActivityManager and the transaction code is 3. Besides intent calls, other APIs that need inter-process communication also utilize the binder mechanism. For example, to send a text message, an app will use the binder to request the SMS Manager to send a message through the SMS driver. In summary, binder calls can represent all inter-process communication, including the intent calls between apps.
III. SYSTEM DESIGN
In this section, we first state our problem, and then we discuss the system design of MONET, in particular, the design of the runtime behavior signature generation and the malware detection algorithm.
A. Problem Statement
One way to quickly mutate Android malware is to use obfuscation methods to transform the original code and hide its malicious logic. Conventional methods for PCs cannot be directly adapted to Android. Existing in-device solutions have limited capability to recognize malware, especially under the constraints of CPU resources and battery power. Our aim is to design a novel user-oriented approach to malware detection that achieves the following goals: (1) resistant to malware variants and transformation attacks, (2) user-oriented and easy to deploy, and (3) highly efficient and scalable to detect a large number of malware variants.
• Resistant to Malware Variants and Transformation Attacks. MONET should detect malware variants which have similar runtime behavior. In addition, the transformation of static features such as package names, strings and instruction order should not affect our detection results.
• User-oriented and Easy Deployment. MONET’s client module is designed to protect common mobile device users, rather than app marketplaces, against malware. It should be easy to deploy, e.g., without modifying the existing Android firmware. Moreover, after MONET is installed on a mobile device, it should not consume much battery.
• High Efficiency and Scalability. After generating the signature, MONET’s client module sends it to MONET’s backend server for signature detection. The backend server needs to be efficient and scalable to support a large number of real-time signature detection requests.
We would like to mention that many current user-oriented anti-virus programs rely only on static signatures, which are generated from disassembled code and other static resources (e.g., package names or unique strings within a malware family). In addition, many current dynamic analysis systems are designed only to help researchers better understand the dynamic behaviors of malware. Current in-device intrusion prevention systems cannot determine the maliciousness of suspicious apps for users. Furthermore, mobile devices usually have constrained battery and computation resources, so conventional host-based intrusion prevention systems may not be appropriate.
B. Overview of Monet
Our system, MONET, determines the runtime behavior signature of malware, which combines both the static logic structure and the runtime information. Runtime behavior is difficult to change, and this provides additional information for effective malware variant detection. Using this runtime behavior signature, MONET can detect malware variants and defend against malware transformation attacks. We design two interception techniques to realize our system, so that users can easily install MONET’s client module on Android devices to provide malware protection.
MONET uses the following four steps to extract runtime information and perform malware detection: (1) static behavior graph generation, (2) runtime information collection, (3) runtime behavior signature generation, and (4) signature detection. Figure 2 illustrates the work flow of MONET for detecting malware on Android devices.
When a user installs a new app on a device, MONET monitors the installation event in the background and extracts the static information, including component information from the app’s manifest file and static logic from the disassembled code. Then, MONET generates a static behavior graph based on the static structure of the app before the app is launched. After the app is launched, MONET monitors and collects runtime information, including binder transactions as well as some important system calls (e.g., the socket() and execve() system calls). If the system detects an intrusive action, it pops up a warning dialog to alert the user about the suspicious action. If the user cannot determine the maliciousness of this action, the system conducts further malware detection. MONET generates a runtime behavior graph for the app using the static behavior graph and the collected binder call information, and suspicious system calls are also recorded for detection. Finally, MONET uses both the runtime behavior graph and the suspicious system call set as the runtime behavior signature, which is sent to the backend detection server for further analysis. MONET’s backend detection server matches the uploaded signature against the existing malware signatures in its database, returns the result to the mobile device, and notifies the user of the detection result.
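The client-side part of this work flow can be sketched as a small data flow. This is a simplified illustration under our own assumptions, not MONET's actual implementation: the tuple layouts, the function names `complete_graph` and `build_signature`, the file paths and the address are hypothetical, and the monitored system calls are limited to the two examples named in the text.

```python
# Hypothetical sketch of assembling a runtime behavior signature on the client.
SUSPICIOUS_SYSCALLS = {"socket", "execve"}  # the two examples named in the text

def complete_graph(static_edges, binder_calls):
    """Add binder calls observed at runtime to the static behavior graph."""
    edges = set(static_edges)    # (caller, target, label) from static analysis
    edges.update(binder_calls)   # same tuple shape, collected at runtime
    return edges

def build_signature(static_edges, binder_calls, syscalls):
    """Signature = runtime behavior graph + suspicious system call set."""
    rbg = complete_graph(static_edges, binder_calls)
    sss = {(name, arg) for name, arg in syscalls if name in SUSPICIOUS_SYSCALLS}
    return {"rbg": sorted(rbg), "sss": sorted(sss)}

static_edges = [("MainActivity", "AdminService", "startService")]
binder_calls = [("MainActivity", "android.app.IActivityManager", "code=3")]
syscalls = [
    ("socket", "203.0.113.5:80"),        # documentation-range address
    ("read", "/proc/self/status"),       # not in the suspicious set, dropped
    ("execve", "/data/local/tmp/tool"),  # made-up path
]

signature = build_signature(static_edges, binder_calls, syscalls)
print(signature["sss"])
```

The resulting dictionary stands in for the payload that the client would upload; how the real system serializes and transmits the signature is not specified here.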
C. Runtime Behavior Signature
MONET uses the runtime behavior signature (RBS) for malware detection. The runtime behavior signature includes both the runtime behavior graph (RBG) and the suspicious system call set (SSS). The RBG not only contains the high-level logic structure of an app, but also describes the calling actions among these logic structures at runtime. The SSS contains execution information about sensitive system calls at runtime.
The RBG is one of the basic elements of our behavior-based detection system. The RBG of an app is a directed graph defined over two sets, C and B. C is a set of components: the app components, which are all the components used within the app, and the system components, which are the system services used by the app. B is a set of binder calls. The vertices of the graph correspond to the components in C, and the edges correspond to the binder calls in B between two vertices. The label of a vertex contains the name and property of the corresponding component. The label of an edge consists of the binder transaction code, which represents the purpose of the call, and the binder content, which contains essential information. For an implicit intent call in the RBG, because we do not know which component will handle the action of the intent, we treat the action itself as a node in the RBG. The property in the vertex label indicates whether a node is an app component or a system component. In summary, because the RBG describes the high-level logic structure within an app and the runtime interactions with other functional system components, we can use an RBG for behavior-based malware detection.
To further explain the runtime behavior graph, we use the RBG of a malware sample (o5android) as an example. This malware registers itself as a device administrator to prevent uninstallation, and it also uses the Google Cloud Messaging service to communicate with its command-and-control server to avoid detection. Figure 3 illustrates a part of the RBG of this malware. The black circles in the graph represent app components (i.e., the properties of the nodes) in the malware, and beside the nodes are their names (i.e., the class names of the components). The white circles, on the other hand, represent system services which were requested by the malware at runtime, and the names of these nodes are the descriptors of the system services. A link between two nodes indicates a binder call between them. The label of the link contains the transaction code and content of the binder call. In the left oval of the graph, there is a binder call from com.google.elements.AdminActivity to android.app.action.ADD_DEVICE_ADMIN. The transaction code on this link represents an action to start an activity. Because the malware uses an implicit intent to start the device administration app, the intent action is treated as a vertex in the RBG, which is the white node in the left oval. This part of the RBG describes a malicious behavior of the malware, namely registering the service as a device administrator. In the right dotted oval, there are two nodes and a link from com.google.elements.MainActivity to com.google.android.c2dm.intent.REGISTER. The behavior represented in this dotted oval is the initiation of the Google Cloud Messaging service. We will illustrate the generation method of the RBG in the following subsections.
The RBG utilizes the specific app structure and communication mechanism of Android to record runtime behaviors. The RBG contains two pieces of important information. The first is the calling relation between components inside an app, which we call the logic structure, e.g., the activity MainActivity starts the service AdminService. The second is what we call the runtime behaviors, e.g., the activity MainActivity obtains the device’s unique ID through a telephony manager. Combining the logic structure and the runtime behaviors, the RBG can accurately describe the characteristics of malware. This is fundamentally different from existing static approaches [32], which simply use static features for malware detection. Next, we further elaborate on how to use an RBG as a malware signature for detection.
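The RBG definition above maps naturally onto a small labeled directed graph. The sketch below is one possible representation under our own assumptions: the class names `Vertex`, `Edge` and `RBG`, the helper `system_targets`, and the transaction code value are hypothetical. The code value 3 is borrowed from the start-activity example in Section II and used here only as a placeholder, since the actual codes on the edges of Fig. 3 are not given in the text.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Vertex:
    name: str   # class name, service descriptor, or intent action
    kind: str   # vertex property: "app" or "system"

@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    code: int          # binder transaction code (purpose of the call)
    content: str = ""  # essential content carried by the call

@dataclass
class RBG:
    vertices: dict = field(default_factory=dict)  # name -> Vertex (set C)
    edges: set = field(default_factory=set)       # binder calls (set B)

    def add_vertex(self, name, kind):
        self.vertices[name] = Vertex(name, kind)

    def add_call(self, src, dst, code, content=""):
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both endpoints must be known components")
        self.edges.add(Edge(src, dst, code, content))

    def system_targets(self, src):
        """System-side nodes reached by binder calls from one app component."""
        return sorted(e.dst for e in self.edges
                      if e.src == src and self.vertices[e.dst].kind == "system")

PLACEHOLDER_CODE = 3  # assumption: reused from the start-activity example

rbg = RBG()
rbg.add_vertex("com.google.elements.AdminActivity", "app")
rbg.add_vertex("com.google.elements.MainActivity", "app")
# Implicit intent actions are modeled as white (system-side) nodes, as in Fig. 3.
rbg.add_vertex("android.app.action.ADD_DEVICE_ADMIN", "system")
rbg.add_vertex("com.google.android.c2dm.intent.REGISTER", "system")
rbg.add_call("com.google.elements.AdminActivity",
             "android.app.action.ADD_DEVICE_ADMIN", PLACEHOLDER_CODE)
rbg.add_call("com.google.elements.MainActivity",
             "com.google.android.c2dm.intent.REGISTER", PLACEHOLDER_CODE)

print(rbg.system_targets("com.google.elements.AdminActivity"))
```

Keeping the vertex property separate from the vertex name makes it straightforward to tell app components from system components when the graph is later processed.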
Role of Suspicious System Call Set (SSS): The SSS is a set of potentially dangerous system calls. For example, the set includes socket and execve, because malware can use socket to download malicious executable files and use execve to launch those programs. Firstly, malware may use a socket (i.e., the network) to communicate with its command-and-control server. MONET captures the address of the connected server. Secondly, some trojans execute root tools at runtime to gain root access and privileges. For example, DroidKungfu is a trojan which executes the secbino binary to exploit system vulnerabilities. Because in the kernel layer we can only obtain the calling process (i.e., the app) of a system call, rather than the calling component, we keep the SSS as a separate element of the runtime behavior signature and record these system calls in the SSS at runtime.
Fig. 2. Overview of MONET. The runtime behavior signature is generated through static behavior graph generation, runtime information collection and runtime behavior graph generation.
Fig. 3. Example of runtime behavior graph.
Together, the RBG and the SSS constitute the runtime behavior signature of the app, and we use it for malware detection. There are several reasons why the RBG and the SSS are suited as a basis for malware detection. Firstly, every component in an app has to use the binder mechanism to communicate with other components, so binder calls can accurately represent an app’s runtime behavior. Also, for network behavior and binary execution, the SSS captures the suspicious system calls as supplementary runtime behaviors. Secondly, the logic structures of a malware family are usually very similar. Although malware may use static obfuscation methods to avoid detection by static analysis, malware variants have runtime behavior similar to that of the original malware. Therefore, with an accurate representation of static structure and runtime behaviors, the RBG and the SSS can be used as a runtime behavior signature to detect malware variants and transformation attacks.
To generate an app’s RBG, we need to extract the logic structure and the runtime behaviors. One can extract the app component information from the disassembled resources. However, we also need to execute the app to obtain the calling relations between components. Moreover, the calling relations depend on the execution routines of an app. To accomplish this, we propose to first use the static behavior graph (SBG), which can represent the static logic structure before the app is launched.
In essence, the SBG is a simplified RBG which only includes the app components and their static calling relations. In summary, the SBG describes the skeleton (i.e., logic structure) of an app, and the connections within the skeleton are provided by the runtime information recorded in the RBG. Specifically, there are two phases to generate an app’s RBG: (1) static behavior graph generation and (2) runtime behavior completion, which we explain as follows.
(1) Static Behavior Graph Generation: To generate an RBG, we first use the static information to generate the static behavior graph (SBG). The SBG is a subgraph of the RBG, and it does not contain runtime information. There are two steps to generate the SBG. The first step is to extract the app components from the app’s manifest file (i.e., the AndroidManifest.xml file). The second step is to find intent calls between components, i.e., cases where one app component starts another app component.
Note that for the second step, due to the limitations of static data-flow analysis, it is impossible to find all intent calls. For example, malware can hide an intent call within native code or obfuscate the action string of an implicit intent call. Moreover, traditional static analysis methods have high computational complexity. MONET uses an alternative method to statically extract intent calls. Firstly, MONET uses the disassembled code to generate the control flow graph (CFG) for each class. Secondly, it searches for all intent call methods (i.e., the startActivity and startService APIs) in the CFG. Because several attributes of these intent call methods indicate the caller and the target, we can keep track of these variables. Here, we use the reaching definition algorithm [33] to locate the caller and the target. Lastly, we determine the intent call and add a link to the SBG.
We want to point out that a full CFG and reaching definition analysis for an app costs a lot of computation resources, which is not feasible for battery-constrained mobile devices. Therefore, we build a CFG and use the reaching definition algorithm only within a component class. Binder calls which cannot be found by the SBG generation process are obtained at runtime.
Figure 4 depicts an example of statically finding an intent call, which is initiated from activity A to activity B. We first locate the startActivity API call. Its parameter i is the intent object. Then, by using the reaching definition algorithm, we can find the definition of i. Note that i is defined by the intent constructor. The parameters of the intent constructor are the caller class and the target class of the intent call. Therefore, we locate the caller variable (i.e., v1) and the target variable (i.e., v2) of this intent call from the constructor method of the intent. Then, we find the definitions of v1 and v2.
Lastly, the system determines that there is an intent call from activity A (i.e., this) to activity B, and this edge is added to the SBG of the app. Using the above algorithm, most of the intent calls can be found and added to the SBG, which represents the skeleton of the app. Because we only run the reaching definition algorithm within each component’s logic, we cannot locate a binder call if its definitions reside in other classes. Moreover, some binder calls may be hidden inside native code. Therefore, the remaining calls are recorded at runtime.
Fig. 4. Data-flow analysis for generating the static behavior graph (SBG).
(2) Runtime Behavior Completion: Because the SBG is based on static resources, it only possesses limited logic structure information. For example, malware samples may hide malicious logic using obfuscation and reflection techniques. To recover this hidden logic, we need to capture runtime information. While the app is executing, MONET collects runtime binder calls. Then MONET uses these calls to complete the SBG and generate an RBG. After generating the RBG, which is a part of the signature of the suspicious app, MONET sends it and the SSS to the backend detection server for malware detection. In Section IV, we discuss in detail how we implement the runtime behavior collection process in MONET.
D. Malware Detection
When MONET’s backend detection server receives the uploaded runtime behavior signature of a suspicious app, it executes the signature matching algorithm to determine whether this suspicious app is malware. The detection algorithm involves three parts: (1) graph decoupling, (2) malware signature generation and (3) signature matching.
(1) Graph Decoupling: Because repackaged malware contains both benign logic and malicious logic, we need to perform graph decoupling on every uploaded RBG to separate the two kinds of logic for malware detection. Figure 5 illustrates the process of graph decoupling. Suppose we have the RBG of a repackaged malware sample. There are two steps to achieve graph decoupling. Firstly, we remove all nodes which are system components, together with the edges connected to these nodes (e.g., the white nodes in the figure). This yields several disconnected subgraphs of the original RBG. Secondly, for each disconnected subgraph, we add back the removed system component nodes which have links with nodes in this subgraph, and re-link the added nodes to the nodes in the subgraph. As a result, we obtain several individual graphs (e.g., the two graphs in the upper circle and the lower dotted circle shown in the figure), each of which contains the logic structure and runtime behavior belonging to it. By using graph decoupling, we can easily separate malicious logic and runtime behaviors from the original mixed RBG.
Fig. 5. The graph decoupling process.
(2) Malware Signature Generation:
Because malicious behaviors are captured at runtime, some of them can only be triggered by certain events. Moreover, automatic app-behavior triggering is still an open research problem, and existing studies [34], [35] cannot effectively trigger all malicious behaviors. To make detection more accurate, a malware analyst should trigger the malicious events manually at runtime. Therefore, before any uploaded suspicious signature is matched, the analyst launches the captured malware samples in MONET and triggers the malicious behavior manually. MONET generates the RBG and SSS for this malware and then performs graph decoupling on the RBG to obtain a set of individual RBGs. The analyst determines which of these RBGs contain malicious behaviors, and these malicious RBGs are stored as malware signatures in the signature database. In Section IV, we elaborate on the implementation of our signature database.

(3) Signature Matching:
Signature matching compares the uploaded suspicious runtime behavior signature (including the SSS and the RBG) with the existing malware signatures in the database to determine whether an app is malware. The process consists of SSS matching and RBG matching. For the SSS, suspicious system calls can indicate malware. For instance, a suspicious SSS may contain a connection to a well-known remote command and control server, or the execution of a root exploit binary. RBG matching involves two steps. In the first step, we use the graph decoupling algorithm to separate the suspicious RBG into a set of decoupled RBGs ($D$). In the second step, the backend detection server executes a graph similarity algorithm to compare the graphs in $D$ with the graphs in the malware RBG set ($M$). We say that there is a match if there exist a $d \in D$ and an $m \in M$ such that the similarity between $d$ and $m$ exceeds a threshold ($T$).

In the MONET backend detection server, we use the graph edit distance algorithm to measure the similarity between two RBGs. The similarity of two runtime behavior graphs $G_1$ and $G_2$ is

$$\mathrm{sim}(G_1, G_2) = 1 - \frac{\min(|V_i| + |V_d| + |E_i| + |E_d|)}{|V_1| + |V_2| + |E_1| + |E_2|},$$

where $|V_i|$ and $|V_d|$ are the numbers of vertex insertions and vertex deletions needed to transform $G_1$ into $G_2$, and $|E_i|$ and $|E_d|$ are the corresponding numbers of edge insertions and edge deletions. The numerator is the minimum number of operations that transforms $G_1$ into $G_2$, and the denominator $|V_1| + |V_2| + |E_1| + |E_2|$ quantifies the maximum number of operations between $G_1$ and $G_2$. Therefore, a high similarity score between two RBGs implies that only a small number of transformations is needed to turn one into the other. Figure 6 illustrates an example of graph edit distance between two RBGs, $G_1$ and $G_2$. Both of them have six nodes and six edges. They have the same graph structure except that one edge points to a different node (i.e., the dotted link in the figure). The number of edge-edit operations from $G_1$ to $G_2$ is 2, because we have to delete one edge and insert a new edge. Therefore, the similarity score between $G_1$ and $G_2$ is $1 - (0 + 0 + 1 + 1)/(6 + 6 + 6 + 6) \approx 0.92$. In other words, the two graphs are highly similar.

Fig. 6. Example of graph edit distance.

IV. IMPLEMENTATION OF MONET
In this section, we present the implementation of MONET. The system consists of two parts: a client app (which can be installed on any Android device) to capture the behavior and generate signatures, and a backend detection server to determine whether a suspicious app is a malware variant.
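As a rough illustration of the graph decoupling and RBG matching steps from Section III-D, the following Python sketch represents an RBG as a set of directed (caller, target) edges. It rests on a simplifying assumption that is not part of MONET's description: nodes are matched by component name, so the minimum edit count reduces to set differences instead of a full graph edit distance search. The function names, node labels and the threshold passed to `is_match` are hypothetical.

```python
from collections import defaultdict


def nodes_of(edges):
    """Collect every node that appears in a set of (caller, target) edges."""
    return {n for edge in edges for n in edge}


def decouple(edges, system_nodes):
    """Split an RBG into per-logic subgraphs (graph decoupling sketch)."""
    system_nodes = set(system_nodes)
    app_nodes = nodes_of(edges) - system_nodes
    # Step 1: drop system components and keep app-to-app links only.
    adj = defaultdict(set)
    for u, v in edges:
        if u in app_nodes and v in app_nodes:
            adj[u].add(v)
            adj[v].add(u)
    parts, seen = [], set()
    for start in sorted(app_nodes):
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:  # depth-first walk over one disconnected subgraph
            n = stack.pop()
            if n not in comp:
                comp.add(n)
                stack.extend(adj[n] - comp)
        seen |= comp
        # Step 2: re-link the system nodes that touch this subgraph.
        parts.append({(u, v) for u, v in edges if u in comp or v in comp})
    return parts


def similarity(e1, e2):
    """sim = 1 - edits / (|V1| + |V2| + |E1| + |E2|), label-matched nodes."""
    v1, v2 = nodes_of(e1), nodes_of(e2)
    edits = len(v1 ^ v2) + len(e1 ^ e2)  # insertions plus deletions
    total = len(v1) + len(v2) + len(e1) + len(e2)
    return 1.0 - edits / total if total else 1.0


def is_match(suspicious_parts, malware_parts, threshold):
    """True if any decoupled graph is similar enough to a stored signature."""
    return any(similarity(d, m) >= threshold
               for d in suspicious_parts for m in malware_parts)


# Two six-node, six-edge graphs that differ in one edge, as in Fig. 6.
g1 = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f"), ("a", "f")}
g2 = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f"), ("a", "e")}
score = similarity(g1, g2)  # evaluates 1 - 2/24

# A repackaged app: benign and injected logic share only a system service.
rbg = {("Main", "PackageManager"), ("Main", "Settings"),
       ("Evil", "PackageManager"), ("Evil", "EvilService")}
parts = decouple(rbg, system_nodes={"PackageManager"})
```

With unique component names the set-difference shortcut is exact; if a variant renames its classes, a real implementation would have to search over node mappings, which is where the computational cost of graph edit distance comes from.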
A. Client App
The MONET client app generates an SBG for each newly installed app. At runtime, it monitors intrusive transactions and system calls. Once a suspicious behavior is detected, the client uses the collected runtime information to generate the RBG and the SSS for the executed app, and then sends them, as the monitored behavior of that app, to the backend detection server for malware detection. In our implementation, the client app consists of three main components: (1) SBG generator, (2) runtime information collector and (3) RBG and SSS generator.

(1) SBG generator: The MONET client app monitors app installation events (i.e., the PACKAGE_INSTALL and PACKAGE_ADDED actions). Once an app is installed, the SBG generator uses the smali/baksmali library [30] to disassemble it. The output is a set of disassembled code files. In addition, the SBG generator translates the compiled binary AndroidManifest.xml file into a plain text file. As we discussed in Section III, to generate an SBG, the SBG generator first builds a control flow graph (CFG) for each component class. Secondly, it extracts component information from AndroidManifest.xml. With the CFG and the component information, it uses data flow analysis, specifically the reaching definition algorithm from compiler theory, to generate a static behavior graph. The algorithm is depicted in Algorithm 1. Its input is the CFG of an app component class generated from the disassembled code. In this algorithm, GEN[B] is the set of definitions within the code block B, and KILL[B] is the set of definitions that are redefined (i.e., assigned other values) in block B. After calculating the reaching definitions, we obtain two sets of definitions for each block: IN[B] contains the definitions that reach B's entry, and OUT[B] contains the definitions that reach B's exit. For example, if we want to find the definition of variable i in the startActivity(i) block (b), the reaching definition algorithm gives us the definitions that reach block b in the IN[b] list. If there is a definition of i in the list, we can find which statement defines i and then determine the value of i in that statement. In summary, this algorithm statically finds binder calls (links) between app components (nodes) to generate an SBG. The complexity of the reaching definition algorithm is O(n), where n is the number of blocks in a CFG. For all the apps and malware we tested, the value of n is between and .

Algorithm 1
Reaching definition algorithm
Input: Control flow graph CFG = (N, E, ENTRY, EXIT)
Output: IN[B] and OUT[B] sets
  OUT[ENTRY] ← ∅
  for all basic blocks B other than ENTRY do
    OUT[B] ← ∅
  end for
  while changes to any OUT occur do
    for all blocks B other than ENTRY do
      IN[B] ← ∪ OUT[p], for all predecessors p of B
      OUT[B] ← GEN[B] ∪ (IN[B] − KILL[B])
    end for
  end while

(2) Runtime Information Collector: The runtime information collector runs in the background on an Android device. It collects all binder transactions to generate an RBG, and specific system calls to generate an SSS. We implement it using two interception techniques, one for binder calls and one for system calls. Figure 7 illustrates our implementation, which contains two functional parts: the binder call interception and the system call interception.

Fig. 7. Implementation of the MONET runtime information collector.

• Binder Call Interception. MONET needs to collect binder call information, including the binder transaction code, the transaction descriptors and various additional attributes. The MONET client app hooks binder calls: in essence, it injects libraries into apps and system services to hook binder transactions. The first hooking point is the JNI interface, where we intercept the upper-level binder-related APIs between the Java layer and the native layer. Using this method, we can intercept all binder calls that the app initiates from the Java layer. The second hooking point is the Service Manager. Because all binder requests first go through the Service Manager, the MONET client app also intercepts calls to the Service Manager, so that malware cannot evade monitoring by using native code to initiate malicious binder calls. Figure 7 depicts the technical details of our binder call interception. For example, if a malware uses the sendTextMessage() API to send a premium message, this API call goes through several lower-layer APIs in the Android SDK. In the end, the method call is handled by a binder object, which calls the transact() JNI method to invoke the native function. MONET captures this binder transaction and records the binder call. In addition, the MONET client app can obtain the runtime call stack trace of this JNI method to find out which component initiated the binder call. Because this binder call is an intrusive transaction, we can then notify users about the intrusive event. Note that MONET also generates an RBG from the runtime information collected so far and sends it to the detection server for malware detection. Table I lists some binder call records of the o5android malware. Each record includes the caller component name, the target component name, the binder call code and the corresponding action of the code. For example, the com.google.elements.WorkService component requests the device ID from the PhoneSubInfo component at runtime. These binder records are used to complete the SBG and generate the RBG for detection.

TABLE I
BINDER CALL INFORMATION AT RUNTIME.
Caller Component    Target Component       Code    Code Action
*.MainActivity      PackageManager                 getPackageInfo
*.WorkService       ConnectivityManager            getActiveNetworkInfo
*.WorkService       PhoneSubInfo                   getDeviceId
*.AdminService      DevicePolicyManager            isAdminActive
...                 ...                    ...     ...
* Package name: com.google.elements

• System Call Interception.
To intercept system calls, we implement a loadable kernel module (LKM) for the Linux kernel. The kernel module first searches for the address of the sys_call_table structure, which stores the pointers to the system call implementations. In the MONET client app, we obtain the sys_call_table address from the vector_swi handler [36]. Using this method, we can determine the address of sys_call_table for different build versions of the Linux kernel. Then, to intercept system calls, we replace the system call addresses in sys_call_table with the addresses of our own functions. Inside these functions, the MONET client app writes the calling information, including the caller process ID and the system call parameters, to a device driver (/dev/monet) that passes the information to the user-layer app. At the end of each function, MONET calls the original function to continue the original logic of the app.

In our current implementation, we intercept two system calls: socket() (i.e., sys_call_table[__NR_socket]) and execve() (i.e., sys_call_table[__NR_execve]). By replacing their entries in the system call table, we redirect these two system calls to our interception functions first and then return to the original system calls. For execve(), the kernel adds a wrapper that adjusts the parameter r3 before performing the actual execve task. The wrapper points r3 to a stack location calculated from the stack pointer sp. Therefore, we must guarantee that sp is not corrupted during our interception.

Intercepting these two system calls can expose most of the malicious behavior in apps. Firstly, malware may use the network to communicate with its remote command and control servers. By intercepting the socket() system call in the kernel layer, MONET obtains the network connection information regardless of whether the connection is made through Java APIs or native code. Secondly, many malware families (e.g., DroidKungfu) attempt to execute a root exploit at launch. Therefore, execve() is another dangerous behavior that we need to track.

We would like to point out that the interception technique for binder calls is easy to deploy on Android devices. The deployment needs root privilege to inject libraries into apps and system services. There are several tools that provide root privilege management for apps; they also prevent malware from abusing root privilege, which keeps the device secure. As for the interception of system calls, the kernel of the current Android system is stable and will not undergo many modifications, so a loadable kernel module is compatible with current systems and easy to deploy. Furthermore, using the above hooking techniques, MONET can be deployed on a wide variety of Android-based mobile devices.

(3) RBG and SSS Generator:
With the collected binder call and system call information, MONET builds an RBG and an SSS. The RBG is based on the SBG that was generated when the app was installed. From the runtime information collector, we obtain the sequence of runtime binder calls as a vector, with the caller class names, binder call descriptors and binder call content. With this information, we complete the SBG to generate an RBG. For suspicious system calls, MONET reads the calling information from the kernel space via the device driver and adds the system calls that belong to the current app process to the SSS.
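The client-side graph construction described in this subsection can be sketched in a few lines of Python. The first function is a direct transcription of Algorithm 1 (it also visits ENTRY, which has no predecessors and empty GEN/KILL sets, so its OUT set stays empty); the second merges runtime binder call records into the SBG edges to obtain the RBG edges. The block names, definition labels (d1, d2), the static edge and the record layout are hypothetical, and the real client works on disassembled code instead of hand-written sets.

```python
def reaching_definitions(blocks, preds, gen, kill):
    """Iterate IN/OUT sets to a fixed point, following Algorithm 1."""
    out = {b: set() for b in blocks}
    inn = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # IN[B] is the union of OUT[p] over all predecessors p of B.
            inn[b] = set().union(*(out[p] for p in preds.get(b, [])))
            new_out = gen[b] | (inn[b] - kill[b])
            if new_out != out[b]:
                out[b], changed = new_out, True
    return inn, out


def complete_sbg(sbg_edges, binder_calls):
    """Add the (caller, target) pair of each runtime binder call to the SBG."""
    rbg = set(sbg_edges)
    for caller, target, _action in binder_calls:
        rbg.add((caller, target))
    return rbg


# Hypothetical CFG: b1 defines i (d1); one branch (b2) redefines i (d2);
# both branches join at b4, which calls startActivity(i).
blocks = ["ENTRY", "b1", "b2", "b3", "b4"]
preds = {"b1": ["ENTRY"], "b2": ["b1"], "b3": ["b1"], "b4": ["b2", "b3"]}
gen = {"ENTRY": set(), "b1": {"d1"}, "b2": {"d2"}, "b3": set(), "b4": set()}
kill = {"ENTRY": set(), "b1": {"d2"}, "b2": {"d1"}, "b3": set(), "b4": set()}
inn, out = reaching_definitions(blocks, preds, gen, kill)
# inn["b4"] now holds every definition of i that may reach startActivity(i).

# A hypothetical static edge, plus two runtime records shaped like Table I.
sbg = {("MainActivity", "WorkService")}
calls = [("WorkService", "PhoneSubInfo", "getDeviceId"),
         ("WorkService", "ConnectivityManager", "getActiveNetworkInfo")]
rbg = complete_sbg(sbg, calls)
```

Keeping the RBG as a plain edge set makes completion a set union, so repeated binder calls between the same pair of components do not inflate the graph.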
B. Backend Detection Server
The backend detection server is responsible for storing malware signatures in the database and for performing malware detection using our signature matching algorithm. Because an SSS is used to detect blacklisted network addresses and root exploit binaries, SSS matching is based on a traditional hash matching implementation. Note that usually we only need the RBG, with its logic structure and runtime information, for detection. RBG matching requires graph similarity computation, and graph comparison is computationally expensive. Therefore, based on the properties of the runtime behavior graph, we use a B+ tree to index malware signatures and speed up detection. In the current implementation, we use the number of app components as the key of the B+ tree, since this information is easy to derive from RBGs. Inserting a record into the B+ tree requires only O(log_b n) operations, and a range query that returns k elements requires O(log_b n + k) operations, where n is the number of nodes in the B+ tree and b is the maximum number of children of an internal node. With the B+ tree, we only need to compare RBGs within a range. For example, to check an RBG with n nodes, we only query and compare the malware RBGs in our database that have between n − α and n + α nodes, where α is a constant integer set in MONET. In our experiments, we set α = 5. If the number of nodes of a malware RBG in the database is not in [n − α, n + α], then with high probability its similarity score with the uploaded RBG is low. This method reduces the comparison cost of malware detection.

Overall, the detection workflow is as follows: (1) MONET detects suspicious transaction calls by monitoring IPC; (2) a warning dialog pops up for the user and, at the same time, the signature is sent to the server for evaluation; (3) because these two operations are asynchronous, users can wait for the detection result and then decide whether to block the malicious events.
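The range filter on the number of app components can be illustrated with a small Python sketch. It substitutes a sorted list searched with `bisect` for the B+ tree: range lookups are logarithmic plus the size of the result, as with the B+ tree, but insertion into a Python list is linear, so this is not the data structure MONET uses. The class name, method names and signature identifiers are hypothetical.

```python
import bisect


class SignatureIndex:
    """Malware RBG signatures keyed by their number of app components."""

    def __init__(self):
        self._keys = []   # sorted component counts
        self._sigs = []   # signature ids, kept parallel to _keys

    def insert(self, node_count, sig_id):
        pos = bisect.bisect_right(self._keys, node_count)
        self._keys.insert(pos, node_count)
        self._sigs.insert(pos, sig_id)

    def candidates(self, n, alpha=5):
        """Return signatures whose node count lies in [n - alpha, n + alpha]."""
        lo = bisect.bisect_left(self._keys, n - alpha)
        hi = bisect.bisect_right(self._keys, n + alpha)
        return self._sigs[lo:hi]


index = SignatureIndex()
for count, name in [(4, "sig-a"), (9, "sig-b"), (12, "sig-c"), (30, "sig-d")]:
    index.insert(count, name)
nearby = index.candidates(10)  # only these are passed to graph comparison
```

Only the signatures returned by `candidates` need the expensive graph similarity computation; everything outside the window is skipped.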
Because some detections may occur without a network connection, we pre-load the signatures of widely detected malware on the device for offline detection.

V. EVALUATION
In this section, we first present our experimental setup and dataset. Then, we present the evaluation results on the accuracy and effectiveness of MONET in detecting malware variants and defending against malware transformation. We also present the battery consumption of MONET's client module.
A. Experimental Setup & Dataset
In our experiment, we use an LG Nexus 5 mobile phone to test our client app. The test phone runs the official Google Android firmware, KitKat 4.4.4, with build number KTU84P and kernel version 3.4.0. Our backend detection server is a Dual-core . 10 GHz PC and memory.

We collected 3,723 malware samples from the Android Malware Genome Project [20], DroidAnalytics [37] and the contagio minidump forum [38]. In addition, we downloaded the top 500 apps from the Google Play market (ranked by download count). We need these legitimate apps to evaluate MONET's true negative rate, as well as to explore the number of nodes within an RBG.

To analyze the characteristics of these apps, we execute each of them for one minute and generate the corresponding RBGs. Figure 8 depicts the distribution of the number of nodes in an RBG for malware and for benign apps. From the figure, we see that most of the apps contain less than nodes in their RBGs. In Section III, we discussed that many graph similarity algorithms are computationally expensive. Because the number of nodes in an RBG is small, the cost of graph comparison is acceptable. We present the performance evaluation of the backend detection server in later experiments.

Fig. 8. Distribution of the number of nodes in RBG for top apps and malware samples. (Axes: number of nodes in RBG vs. distribution frequency; series: Top 500 Apps, Malware Samples.)

B. Evaluation on Detection Capability

MONET uses the runtime behavior signature for malware detection. It can detect existing malware samples and their variants, as well as malware that uses transformation techniques. Let us present our results.
Experiment 1 (Accuracy and Effectiveness on Detecting Malware Variants): DroidKungfu is a popular repackaged malware. It injects malicious classes into benign apps, including tools and games. There are four variants of DroidKungfu (DKF1, DKF2, DKF3 and DKF4). The original malware (DKF1) listens to the battery change and boot complete actions. When these actions are triggered, DKF1 performs several behaviors, including reading/writing data in an XML file, starting another service, installing a new app and gaining root privilege. As for the evolved variants, DKF2 uses native code to execute the root exploit, DKF3 uses string obfuscation and AES encryption to hide malicious string signatures, and DKF4 uses the same package name as the host benign app to hide its static signatures.

We performed experiments to evaluate how effectively MONET can use one malware signature (e.g., DKF1) to detect other variants within the same malware family (e.g., DKF2 to DKF4). Table II shows the detection results for each variant of the DroidKungfu malware family. We use DKF1, DKF2, DKF3 and DKF4 samples for detection. We measure the true positive (TP), false positive (FP), true negative (TN) and false negative (FN) counts, as well as the accuracy (ACC = (TP + TN)/(TP + TN + FP + FN)), for each DroidKungfu variant using the SSS only, the RBG only, or their combination as the signature. We set the threshold T to be . for our detection server. For example, we first use one sample of DroidKungfu1 to generate a runtime behavior signature. Then, we install all other samples and benign apps on our test phone with MONET, and run each app for one minute. To simulate user interactions, we use monkey [39] to generate pseudorandom system/user events such as clicks, touches and gestures. More sophisticated triggering methods or real user interactions would help our system capture runtime behavior more thoroughly.

From our experiments, we found that out of are detected as DKF1 malware, and so our true positive rate is / . There is one DKF1 sample which is not detected as malware, so our false negative rate is / . We manually reviewed the disassembled code of this sample and found that it declares the malicious component name in the manifest file but does not contain any malicious logic. Current anti-virus engines may depend on this unique static component name for detection, whereas MONET is based on runtime behaviors, so this app is not detected as malware. All benign apps are classified correctly and none of them is reported as malware, so our true negative rate is 100% and our false positive rate is 0%. We also found that most malicious logic is initiated at the startup time of the malware samples. Therefore, a one-minute running time is enough for this effectiveness evaluation. However, longer monitoring windows can help the system capture runtime behaviors more completely for detection.

Let us illustrate the effectiveness of MONET using the RBG and SSS for detection. From our experiment, we see that when the runtime malware signature including both the RBG and the SSS is used for detection, the average accuracy of detecting the four DroidKungfu variants is around 99%. Secondly, if we only use the RBG for detection, the accuracy is . , which drops a little but is still very effective for malware variant detection. The reason is that some malicious binder calls and system calls are not triggered in the automatic triggering process. The average detection time on our test detection server is about . seconds. The data transfer time over a Wi-Fi network is about two seconds. In summary, the total detection time for each malware sample is less than three seconds under a stable network.

Besides detecting existing malware within one variant, MONET is also effective at detecting evolved malware variants. To illustrate this capability, we use a runtime behavior signature from one variant of the DroidKungfu family to detect other variants. Figure 9 illustrates the accuracy of our detection using different signatures. For example, we first use the DroidKungfu1 (DKF1) signature to detect samples of the other variants (DKF2, DKF3, DKF4). The accuracy for the next-generation variant (DKF2) is still high. Because some samples of the DKF3 and DKF4 variants change how they interact with the command and control server, the detection accuracy drops a little. In summary, the detection accuracy between two consecutive variants is above 90%.

Experiment 2 (Defending Against Malware Transformation):
Transformation attacks use static obfuscation tools to hide malicious logic. Traditional feature-based anti-virus engines rely heavily on specific patterns of malware for detection, but string obfuscation and encryption can change these patterns and bypass such traditional engines. Moreover, obfuscation makes the logic so complicated that malware researchers cannot easily analyze it. Instead of relying on string patterns, MONET uses malicious behaviors for detection, because malicious behaviors are difficult to transform. In this experiment, we use a self-made malware (o5android). This malware requests device administrator privilege, sends text messages, obtains the device ID, etc. Moreover, hackers generated a set of malware samples that carry a random configuration file, so their MD5 values differ. We also use two transformation tools (ADAM [22] and DroidChameleon [23]) to generate obfuscated apps from three original malware. In addition, we implement reflection and dynamic loading techniques to complement the existing methods. In total, we use twelve types of transformation techniques in the experiment, which are described in Table III. We install these 50 transformed malware samples on the device with the MONET client module, and 45 out of 50 are detected as o5android malware by our system. Because some techniques used by the transformation tools may corrupt the logic of the malware, five of them crash after transformation, so we cannot consider them in the experiment. We also conduct an experiment on a real-world malware family called FakeAV, which uses a simple transformation method to generate a large number of samples. We successfully detect all nine collected malware samples, which have different hash values (e.g., SHA1).

TABLE II
DETECTION RESULTS FOR DroidKungfu MALWARE FAMILY WITH BENIGN APPS FROM GOOGLE PLAY.
(Columns: malware variant, signature usage*, TPR, FNR, TNR, FPR, ACC; rows: DKF1, DKF2, DKF3, DKF4, Total.)
* Runtime behavior signature usage: SSS only, RBG only (H), or SSS and RBG together.

Fig. 9. Detecting DroidKungfu Malware Variants. (Axes: DroidKungfu variants DKF1 to DKF4 vs. accuracy; series: DKF1, DKF2, DKF3 and DKF4 signatures.)

TABLE III
DESCRIPTIONS OF TRANSFORMATION TECHNIQUES.
Transformation Techniques
1. renaming classes
2. reversing bytecode order
3. string encryption
4. arrays encryption
5. removing debug information
6. reordering instructions
7. inserting non-trivial junk instructions
8. inserting NOP instructions
9. renaming method
10. renaming fields
11. reflection
12. dynamic loading
Total 50 45

TABLE IV
BENCHMARK RESULTS.
Test Baseline MONET Overhead
CPU 21 043 20 015 4 .
Memory 14 201 13 019 8 .
I/O . 325 311 4 . .
Composite .
Experiment 3 (Performance and Battery Overhead):
We use Quadrant Standard Edition v2.1.1 [40] to run a general-purpose benchmark covering CPU, memory, I/O, and 2D and 3D graphics. Table IV shows the benchmark results. Because MONET intercepts binder calls and system calls, we have around overhead in the memory and I/O benchmarks. We also measure the battery overhead introduced by MONET. We first check the battery overhead in standby mode: we leave a fully charged test phone in standby mode for hours. The device with MONET installed has only . battery overhead compared with the device without MONET. Then, we use the phone for one hour under heavy usage, including minutes of game playing, minutes of web surfing and minutes of telephone calls. We monitor the battery capacity by reading the /sys/class/power_supply/battery/capacity file. The battery overhead of MONET for a heavy user is about . . In summary, MONET has a low impact on battery life.
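For reference, overhead figures of this kind can be computed as relative differences. The sketch below assumes, without confirmation from the text, that benchmark overhead is (baseline − score) / baseline and that battery overhead is the difference between the percentage drained with and without MONET; the function names are hypothetical. The CPU and memory scores are the ones that appear in Table IV.

```python
from pathlib import Path

CAPACITY_FILE = Path("/sys/class/power_supply/battery/capacity")


def relative_overhead(baseline, with_monet):
    """Fraction of the baseline benchmark score lost with MONET running."""
    return (baseline - with_monet) / baseline


def read_capacity(path=CAPACITY_FILE):
    """Read the remaining battery capacity (percent) from sysfs."""
    return int(Path(path).read_text().strip())


def battery_overhead(drain_with_monet, drain_without_monet):
    """Extra percentage points of battery drained with MONET installed."""
    return drain_with_monet - drain_without_monet


cpu = relative_overhead(21043, 20015)
memory = relative_overhead(14201, 13019)
```

Reading the capacity file before and after a usage session on each device gives the two drain values passed to `battery_overhead`.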
Experiment 4 (Capability to Alert Users):
Figure 10 shows two screenshots of MONET. When users launch the o5android malware, MONET detects a malicious behavior: the app asks the user to add it as a device administrator. In the left screenshot, MONET shows a popup dialog indicating that the app is starting the device manager for the ADD_DEVICE_ADMIN action. The content of this intent is a message in Russian which means “encrypt application data”; o5android uses this message to deceive users into accepting the ADD_DEVICE_ADMIN request. At the same time as this alert, MONET sends the runtime behavior signature to the backend detection server. In the right screenshot, the alert dialog shows the detection result, and users can click the “Deny” button to avoid executing the malicious behavior.

Fig. 10. Screenshots of MONET.

VI. RELATED WORK
With the emergence of malware in the Android ecosystem, researchers have proposed a number of systems that detect Android malware based on static resources such as permission information, disassembled code and other resources. Zhou et al. [41] and Asokan [42] systematically analyze the evolution of Android malware. DroidMOSS [20], Juxtapp [43], DNADroid [44], AnDarwin [45], MassVet [46], ViewDroid [47], Dendroid [48], ResDroid [49] and DroidEagle [50] aim at detecting repackaged and cloned malware. DroidRanger [21] uses permission-based footprinting and heuristic schemes to detect existing malware. RiskRanker [32] can automatically uncover malicious behaviors of zero-day malware. DREBIN [51], DroidSIFT [52] and ICCDetector [53] use machine learning algorithms to detect malware. A number of works [19], [54]–[57] use static dataflow analysis to identify malicious logic in Android apps and classify existing malware. To prevent malware from exploiting capability leak and content leak vulnerabilities, some systems [11], [20] aim at detecting such loopholes in apps. All these systems are based on static features of malware. However, current malware uses advanced obfuscation methods to bypass disassembly tools, or hides the malicious logic in native code. Moreover, learning-based detection algorithms are not computationally efficient, and their effectiveness strongly depends on feature selection. In contrast, our system uses both static features and dynamic runtime information to describe malicious behavior, and MONET is effective in defending against logic transformation.

To analyze sophisticated malware, researchers have proposed a number of dynamic analysis systems. TaintDroid [58], TaintART [59], DroidScope [60], VetDroid [61], CopperDroid [62] and DroidBox [63] detect malicious behavior using dynamic analysis. Marvin [64] combines static and dynamic analysis to classify malicious apps. In addition, some systems [65] track information flow to prevent privacy leakage. However, these systems are designed for malware analysts, and it is difficult for regular mobile device users to install them on their devices to detect and prevent malware. Therefore, several systems [27]–[29], [66]–[70] have been proposed to prevent intrusion on the devices of regular users. However, these systems can only warn users about suspicious behaviors at runtime, and users cannot easily determine whether a suspicious behavior comes from malware. Our system is designed for regular mobile users. If an intrusion from a suspicious app is detected, MONET can effectively identify the malware from its runtime behavior and alert the user.

There are a number of malware detection systems for mobile devices that are based on dynamic behavior or runtime information. Bose et al. [71] propose a behavior-based detection system for Symbian OS, which is now an outdated mobile system. At that time, malware on mobile devices was rare and simple. pBMDS [72] and DroidScribe [73] use machine learning methods to classify the behaviors of apps. However, the model only works on keyboard inputs, while most interactions with devices nowadays happen on the touchscreen. Crowdroid [74] and MADAM [75] use system call sequences as malware behavior for detection. System calls contain less semantic information and cannot accurately represent a malicious behavior. MONET captures both binder transactions and system calls, because together they contain more semantic information and can accurately describe the runtime behavior.

VII. CONCLUSION
In this paper, we present the design and implementation of MONET, which detects malware variants and defends against transformation attacks. MONET generates a runtime behavior signature, consisting of an RBG and an SSS, that accurately represents the runtime behavior of a malware. Our system includes a backend detection server and a client app that is easy to deploy on mobile devices. Our experiments show that MONET can accurately detect malware variants and defend against transformation attacks with only minimal performance and battery overhead. Note that Google recently released Android 5.0 Lollipop, which replaces the Dalvik virtual machine with ART. The ART runtime abandons the virtual machine mechanism and uses ahead-of-time compilation instead. Therefore, our current implementation of binder interception may not be directly applicable to the ART runtime. However, because the application package structure and the binder mechanism remain unchanged, one can easily extend MONET to the ART runtime. This is our future work.

REFERENCES
NDSS, 2012.
[12] S. Bugiel, L. Davi, A. Dmitrienko, T. Fischer, A.-R. Sadeghi, and B. Shastry, “Towards taming privilege-escalation attacks on android,” in NDSS, 2012.
[13] A. P. Felt, H. J. Wang, A. Moshchuk, S. Hanna, and E. Chin, “Permission re-delegation: Attacks and defenses,” in USENIX Security Sym., 2011.
[14] B. Saltaformaggio, R. Bhatia, Z. Gu, X. Zhang, and D. Xu, “Guitar: Piecing together android app guis from memory images,” in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, 2015.
[15] B. Cooley, H. Wang, and A. Stavrou, “Activity spoofing and its defense in android smartphones,” in ACNS, 2014.
[16] C. Ren, Y. Zhang, H. Xue, T. Wei, and P. Liu, “Towards discovering and understanding task hijacking in android,” in USENIX Security, 2015.
[17] H. Huang, S. Zhu, K. Chen, and P. Liu, “From system services freezing to system server shutdown in android: All you need is a loop in an app,” in CCS, 2015.
[18] C. Mulliner, W. Robertson, and E. Kirda, “Virtualswindle: An automated attack against in-app billing on android,” in ASIACCS, 2014.
[19] M. Zhang, Y. Duan, H. Yin, and Z. Zhao, “Semantics-aware android malware classification using weighted contextual api dependency graphs,” in Proceedings of the 21st ACM Conference on Computer and Communications Security, 2014.
[20] W. Zhou, Y. Zhou, X. Jiang, and P. Ning, “Detecting repackaged smartphone applications in third-party android marketplaces,” in CODASPY, 2012.
[21] Y. Zhou, Z. Wang, W. Zhou, and X. Jiang, “Hey, you, get off of my market: Detecting malicious apps in official and alternative android markets,” in NDSS, 2012.
[22] M. Zheng, P. P. Lee, and J. C. S. Lui, “Adam: an automatic and extensible platform to stress test android anti-virus systems,” in DIMVA, 2013.
[23] V. Rastogi, Y. Chen, and X. Jiang, “Droidchameleon: evaluating android anti-malware against transformation attacks,” in ASIA CCS
ACSAC, 2014.
[28] R. Xu, H. Saïdi, and R. Anderson, “Aurasium: Practical policy enforcement for android applications,” in USENIX Security Sym., 2012.
[29] B. Davis and H. Chen, “Retroskeleton: Retrofitting android apps,” in MobiSys, 2013.
[30] “smali/baksmali,” https://code.google.com/p/smali/.
[31] “Apktool: A tool for reverse engineering android apk files,” 2012.
[32] M. Grace, Y. Zhou, Q. Zhang, S. Zou, and X. Jiang, “Riskranker: scalable and accurate zero-day android malware detection,” in MobiSys. ACM, 2012.
[33] A. V. Aho, Compilers: Principles, Techniques and Tools, 2003.
[34] V. Rastogi, Y. Chen, and W. Enck, “Appsplayground: automatic security analysis of smartphone applications,” in CODASPY, 2013.
[35] C. Zheng, S. Zhu, S. Dai, G. Gu, X. Gong, X. Han, and W. Zou, “Smartdroid: an automatic system for revealing ui-based trigger conditions in android applications,” in SPSM
TrustCom, 2013.
[38] “Contagio mobile malware mini dump,” http://contagiominidump.blogspot.com.
[39] “monkey,” http://developer.android.com/tools/help/monkey.html.
[40] “Aurora softworks quadrant standard edition,” https://play.google.com/store/apps/details?id=com.aurorasoftworks.quadrant.ui.standard.
[41] Y. Zhou and X. Jiang, “Dissecting android malware: Characterization and evolution,” in IEEE Sym. on Security and Privacy, 2012.
[42] N. Asokan, “On mobile malware infections,” in Proceedings of the 2014 ACM conference on Security and privacy in wireless & mobile networks. ACM, 2014.
[43] S. Hanna, L. Huang, E. Wu, S. Li, C. Chen, and D. Song, “Juxtapp: A scalable system for detecting code reuse among android applications,” in DIMVA, 2013.
[44] J. Crussell, C. Gibler, and H. Chen, “Attack of the clones: Detecting cloned applications on android markets,” in ESORICS, 2012.
[45] J. Crussell, C. Gibler, and H. Chen, “Andarwin: Scalable detection of semantically similar android applications,” in ESORICS 2013, 2013.
[46] K. Chen, P. Wang, Y. Lee, X. Wang, N. Zhang, H. Huang, W. Zou, and P. Liu, “Finding unknown malice in 10 seconds: Mass vetting for new threats at the google-play scale,” in USENIX Security, 2015.
[47] F. Zhang, H. Huang, S. Zhu, D. Wu, and P. Liu, “Viewdroid: towards obfuscation-resilient mobile application repackaging detection,” in Proceedings of the 2014 ACM conference on Security and privacy in wireless & mobile networks, 2014.
[48] G. Suarez-Tangil, J. E. Tapiador, P. Peris-Lopez, and J. Blasco, “Dendroid: A text mining approach to analyzing and classifying code structures in android malware families,” Expert Systems with Applications, 2014.
[49] Y. Shao, X. Luo, and C. Qian, “Towards a scalable resource-driven approach for detecting repackaged android applications,” in ACSAC, 2014.
[50] M. Sun, M. Li, and J. C. S. Lui, “Droideagle: Seamless detection of visually similar android apps,” in Proceedings of the 8th ACM Conference on Security & Privacy in Wireless and Mobile Networks, ser. WiSec ’15, 2015.
[51] D. Arp, M. Spreitzenbarth, M. Hübner, H. Gascon, K. Rieck, and C. Siemens, “Drebin: Effective and explainable detection of android malware in your pocket,” in Proc. of the Network and Distributed System Security Symposium, 2014.
[52] S. Roy, J. DeLoach, Y. Li, N. Herndon, D. Caragea, X. Ou, V. P. Ranganath, H. Li, and N. Guevara, “Experimental study with real-world data for android app security analysis using machine learning,” in Proceedings of the 31st Annual Computer Security Applications Conference. ACM, 2015.
[53] K. Xu, Y. Li, and R. H. Deng, “Iccdetector: Icc-based malware detection on android,” IEEE Transactions on Information Forensics and Security, 2016.
[54] W. Enck, D. Octeau, P. McDaniel, and S. Chaudhuri, “A study of android application security,” in USENIX Security Symposium, 2011.
[55] L. Lu, Z. Li, Z. Wu, W. Lee, and G. Jiang, “Chex: statically vetting android apps for component hijacking vulnerabilities,” in Proceedings of the 2012 ACM conference on Computer and communications security. ACM, 2012.
[56] S. Arzt, S. Rasthofer, C. Fritz, E. Bodden, A. Bartel, J. Klein, Y. Le Traon, D. Octeau, and P. McDaniel, “Flowdroid: Precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps,” in . ACM, 2014.
[57] C. Yang, Z. Xu, G. Gu, V. Yegneswaran, and P. Porras, “Droidminer: Automated mining and characterization of fine-grained malicious behaviors in android applications,” in ESORICS 2014, 2014.
[58] W. Enck, P. Gilbert, B.-G. Chun, L. P. Cox, J. Jung, P. McDaniel, and A. N. Sheth, “Taintdroid: an information flow tracking system for real-time privacy monitoring on smartphones,” Communications of the ACM, 2014.
[59] M. Sun, T. Wei, and J. C. S. Lui, “Taintart: A practical multi-level information-flow tracking system for android runtime,” in Proceedings of the 23rd ACM Conference on Computer and Communications Security, ser. CCS ’16, 2016.
[60] L.-K. Yan and H. Yin, “Droidscope: Seamlessly reconstructing the os and dalvik semantic views for dynamic android malware analysis,” in USENIX Security Symposium, 2012.
[61] Y. Zhang, M. Yang, B. Xu, Z. Yang, G. Gu, P. Ning, X. S. Wang, and B. Zang, “Vetting undesirable behaviors in android apps with permission use analysis,” in Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security. ACM, 2013.
[62] K. Tam, S. J. Khan, A. Fattori, and L. Cavallaro, “Copperdroid: Automatic reconstruction of android malware behaviors,” in NDSS, 2015.
[63] “Droidbox,” https://code.google.com/p/droidbox/.
[64] M. Lindorfer, M. Neugschwandtner, and C. Platzer, “Marvin: Efficient and comprehensive mobile app classification through static and dynamic analysis,” in Computer Software and Applications Conference (COMPSAC), 2015 IEEE 39th Annual. IEEE, 2015.
[65] M. I. Gordon, D. Kim, J. H. Perkins, L. Gilham, N. Nguyen, and M. C. Rinard, “Information flow analysis of android applications in droidsafe,” in NDSS, 2015.
[66] S. Bugiel, S. Heuser, and A.-R. Sadeghi, “Flexible and fine-grained mandatory access control on android for diverse security and privacy policies,” in USENIX Security, 2013.
[67] C. Wu, Y. Zhou, K. Patel, Z. Liang, and X. Jiang, “Airbag: Boosting smartphone resistance to malware infection,” in NDSS, 2014.
[68] X. Li, H. Hu, G. Bai, Y. Jia, Z. Liang, and P. Saxena, “Droidvault: A trusted data vault for android devices,” in ICECCS. IEEE, 2014.
[69] X. Wang, K. Sun, Y. Wang, and J. Jing, “Deepdroid: Dynamically enforcing enterprise policy on android devices,” in NDSS, 2015.
[70] M. Sun, J. C. S. Lui, and Y. Zhou, “Blender: Self-randomizing address space layout for android apps,” in Proceedings of the 19th International Symposium on Research in Attacks, Intrusions and Defenses, ser. RAID ’16, 2016.
[71] A. Bose, X. Hu, K. G. Shin, and T. Park, “Behavioral detection of malware on mobile handsets,” in MobiSys. ACM, 2008.
[72] L. Xie, X. Zhang, J.-P. Seifert, and S. Zhu, “pbmds: a behavior-based malware detection system for cellphone devices,” in Proc. of the rd ACM Conf. on Wireless network security, 2010.
[73] S. K. Dash, G. Suarez-Tangil, S. Khan, K. Tam, M. Ahmadi, J. Kinder, and L. Cavallaro, “Droidscribe: Classifying android malware based on runtime behavior,” 2016.
[74] I. Burguera, U. Zurutuza, and S. Nadjm-Tehrani, “Crowdroid: behavior-based malware detection system for android,” in