Toward A Network-Assisted Approach for Effective Ransomware Detection
Tianrou Xia, Yuanyi Sun, Sencun Zhu, Zeeshan Rasheed, Khurram Shafique
Abstract: Ransomware is a kind of malware that uses cryptography to prevent victims from using their computers normally. As a result, victims lose access to their files and desktops unless they pay a ransom to the attackers. By the end of 2019, ransomware attacks had caused more than 10 billion dollars of financial loss to enterprises and individuals. In this work, we propose a Network-Assisted Approach (NAA), which combines local detection and network-level detection, to help users determine whether a machine has been infected by ransomware. To evaluate its performance, we built 100 containers in Docker to simulate network scenarios. A hybrid ransomware sample that is close to real-world ransomware is deployed on the simulated infected machines. The experimental results show that our two network-level detection mechanisms are applicable to WAN and LAN scenarios, respectively.
I. INTRODUCTION
Ransomware is a type of malware that blocks users' access to their data or systems by encrypting important files on their computers. Victims have to pay the requested ransom to obtain decryption keys from the attackers so that they can recover their data and systems. Sometimes the files cannot be recovered even if the ransom is paid, either because the victim accidentally destroys the file that contains the decryption key or because the attacker breaks the promise. Since a ransomware attack is easy to implement and attackers can extort a large amount of money once it succeeds, many ransomware families have emerged in recent years and caused huge losses worldwide.

Here are some examples of ransomware attacks. Petya [29] is a family of ransomware first discovered in March 2016. It targeted Microsoft Windows-based systems and encrypted a hard drive's file system table to prevent the system from booting. Victims had to pay the ransom in Bitcoin in order to regain access to the system. In June 2017, a derivative of Petya called NotPetya [29] launched a global attack on Microsoft Windows systems via the EternalBlue exploit and caused more than 10 billion dollars of financial losses in total. In October 2017, a new ransomware attack named Bad Rabbit [1] was discovered in Russia and Ukraine, which followed a pattern similar to Petya's. It encrypted the Windows user's file tables and then demanded a Bitcoin payment to decrypt them. Some researchers believed that Bad Rabbit had been distributed through a bogus update to Adobe Flash software. At that time, many agencies were affected by this ransomware, including Interfax, Odessa International Airport, Kiev Metro and the Ministry of Infrastructure of Ukraine.

In 2018 and 2019, ransomware still played an important role among malware families and exerted a significant impact on computer users worldwide, especially users of the Microsoft Windows operating system. GandCrab, Hermes 2.1, Ryuk, Scarab, LockerGoga, etc. are all ransomware that emerged during these two years and targeted Microsoft Windows, since it is the operating system most commonly used by enterprises and organizations, which are potential blackmail targets that can afford large ransoms.

As the Linux operating system has become increasingly popular in recent years and more businesses than ever now run on Linux, Linux-oriented ransomware has sprung up to attack Linux users for exorbitant profits. In 2017, the KillDisk [2] ransomware encrypted files, demanded Bitcoin ransoms and left Linux systems unbootable. The Erebus [3] ransomware affected about 3,400 of NAYANA's clients via malware-containing advertisements. In 2019, the Lilocked [4] ransomware targeted Linux servers and gained root access to encrypt files with extensions such as PHP, HTML and CSS. The victims were directed to the dark web to make a payment in Bitcoin in order to recover their files. The mechanism behind this ransomware is still unknown; researchers are looking for a sample in order to find a way to decrypt the affected files. Compared with ransomware targeting the Microsoft Windows operating system, Linux-oriented ransomware has not yet made a huge impact on enterprises and individuals. However, this situation could change in the near future because ransomware makers are always driven by profit. It is inevitable that more companies and individuals in industry will adopt Linux due to its security, stability and open-source nature, which will lead to the emergence of more ransomware targeting the Linux operating system.

Among all types of ransomware, the cryptoworm is one of the most troublesome. It spreads in the form of a worm, which means it can replicate itself and spread to other computers. Thus, once it is successfully designed and put into use by attackers, a cryptoworm can have more serious overall consequences than other kinds of ransomware. WannaCry [30] is an example of a cryptoworm, which broke out in May 2017. It used the EternalBlue exploit to gain access to Microsoft Windows operating systems. As soon as the cryptoworm infected a computer, it encrypted data on the computer and later extorted Bitcoin from the victim. Many organizations' systems were infected and helped spread WannaCry at that time because those systems had not applied the latest patches released by Microsoft. This attack affected about 200,000 computers across 150 countries and resulted in total damages ranging from hundreds of millions to billions of dollars.

Since ransomware attacks emerge endlessly, people all around the world are suffering from unanticipated threats to their property. To help individuals and organizations avoid this kind of financial loss, ransomware detection is an indispensable research topic.

To mitigate the damage of ransomware attacks, especially cryptoworm attacks, we propose a Network-Assisted Approach (NAA) for ransomware detection, which combines local detection and network-level detection that successively give the user a local report and a comprehensive report on their respective detection results. The comprehensive detection report uses the wisdom of the crowd to help computer users determine whether they are undergoing a ransomware attack so that they can take timely action and avoid ransomware extortion.

We designed a local detection algorithm that is applicable to all kinds of operating systems and implemented a local detection mechanism prototype targeting Linux. In the local detection algorithm, we consider three features displayed on local hosts, one of which, to the best of our knowledge, has never been used by previous works, to generate a local report accurately and promptly.

For network-level detection, we adapted the ant colony optimization (ACO) algorithm to our problem and implemented an ACO-based Mechanism (ACOM), which collects sufficient information from other machines so that a comprehensive report can be generated to help the user determine whether the local host is attacked by ransomware. We also implemented a simple method named Broadcasting Mechanism (BM), which exhaustively collects information and uses the wisdom of the crowd to help the user determine the current safety state. These two network-level detection mechanisms are suitable for ransomware detection in WAN and LAN, respectively.

To estimate the performance of NAA, we set up 100 containers in Docker and applied a Linux ransomware sample, GonnaCry, to the simulated infected containers to mimic network scenarios. Then, we launched NAA in each container to evaluate accuracy, message overhead and latency.

The main contributions of this paper are:
(1) We propose a ransomware detection approach, NAA, that especially targets cryptoworms and combines local detection and network-level detection to generate a report for the user's reference.
(2) We present a local detection algorithm applicable to all operating systems and implement a prototype on Linux.
(3) We apply the ACO algorithm to network-level detection to implement a sufficient and reliable network-level detection mechanism, ACOM, that collects information from the network.
(4) We build a network scenario by setting up 100 containers in Docker and launching a ransomware sample on simulated infected machines to estimate the performance of NAA.

The rest of this paper is organized as follows: Section 2 describes the background knowledge of both ransomware and ransomware detection approaches; Section 3 explains our motivations and outlines NAA; Section 4 describes the design and implementation of our local detection mechanism; Section 5 describes the details of ACOM and BM; Section 6 evaluates NAA's performance in terms of accuracy, message overhead and latency; Section 7 concludes the paper and discusses future work.
II. RELATED WORK
Cryptographic ransomware, also called crypto ransomware, encrypts user files and then extorts cryptocurrency from users before providing the decryption key. It is favored by attackers because digital currencies such as Ukash, Bitcoin and other cryptocurrencies provide strong anonymity, making it difficult to trace and prosecute the perpetrators based on ransom payment transactions. Previous works on ransomware detection mainly focused on checking the features that ransomware behaviors display on local hosts. Moreover, they were designed for the Microsoft Windows operating system because Windows is more vulnerable and is the target of most crypto ransomware.

In 2015, Ahmadian et al. [9] proposed a comprehensive ransomware taxonomy and presented a connection-monitor and connection-breaker technique for detecting highly survivable ransomware in the key exchange protocol step. In 2016, Paik et al. [23] proposed a storage-level detection method, which detects the existence of ransomware based on storage-access activities, e.g., the number of files accessed and the read/write frequency. Scaife et al. [25] presented an early-warning detection system that alerts users during suspicious file activity using a set of behavior indicators such as entropy, file differences, magic bytes and read/write frequency. K. Cabaj et al. [12] analyzed the behavior of a popular ransomware named CryptoWall and proposed two real-time mitigation methods using an SDN-based algorithm. C. Moore [20] investigated ransomware detection methods that use canary files to monitor changes in folders and found that canary files offer limited value, as there is no way to influence the ransomware to access the folder containing the monitored files. Sgandurra et al. [26] presented a machine learning approach for dynamically analyzing and classifying ransomware. It monitors application behaviors and checks characteristic signs of ransomware, including file extension, read/write frequency and function calls.

In 2017, Y. Feng et al. [16] proposed a new approach based on deception and behavior monitoring to detect crypto ransomware with no loss. Their approach creates decoy files so that ransomware operates on the decoy files first, and then monitors the decoy files to check whether they have been encrypted by comparing the Shannon entropy, file type and sdhash of the original and changed files. Chadha et al. [13] discussed several machine learning algorithms for discovering DGA domains and analyzed their performance. Kirda et al. [18] presented a dynamic analysis system that automatically generates an artificial environment and detects when ransomware interacts with user data. In their system, entropy, removed files and read/write frequency are monitored. Chen et al. [15] monitored the actual behaviors of software to generate API call flow graphs and used data mining techniques to build a detection model for deciding whether the software is benign or ransomware. Kharraz et al. [17] proposed a defense approach that maintains a transparent buffer for all storage I/O and then monitors the I/O request patterns of applications on a per-process basis for signs of ransomware-like behaviors, including entropy, removed files, file extension and read/write frequency.

In 2018, Khashif et al. [28] presented a hybrid approach that combines static and dynamic analysis to generate a set of features that characterizes ransomware behavior. This approach analyzes the software binary code first, and then checks entropy, canary files, read/write frequency and function calls. Alaam et al. [10] presented a detection tool that uses an artificial neural network and the Fast Fourier Transform (FFT) to detect ransomware by checking functions and frequency. Quinkert et al. [24] presented a defense system that learns features of malicious domains by observing the domains involved in known ransomware attacks and then monitors newly registered domains to identify potentially malicious ones. Moussaileb et al. [22] presented a graph-based ransomware countermeasure that uses the per-thread file system to highlight malicious behaviors such as modification of canary files and accesses to a large number of directories in a short time period. Morato et al. [21] proposed an algorithm that detects ransomware activity on shared documents by placing a network traffic inspection device between local users and shared volumes. The inspection device extracts SMB protocol commands from every access to the shared volumes it monitors and analyzes the SMB traffic to determine whether the network volumes shared over SMB are under ransomware attack.

In 2019, A. O. Almashhadani et al. [11] presented a comprehensive behavioral analysis of crypto ransomware network activities, including DGA, SMB traffic and general traffic, for ransomware detection. Lee et al. [19] proposed a method that uses an entropy technique to measure a characteristic of encrypted files. Machine learning is applied to classify infected files based on file entropy analysis.

The above literature covers almost all features that characterize ransomware behaviors. Our local detection mechanism also uses some of these features to help determine whether a local host is a suspected victim, and additionally considers a brand new feature, the "read/write pattern", to help make an accurate diagnosis on local hosts. Moreover, our approach includes network-level detection to offer more accurate detection results. While the papers mentioned above designed defense methods for Windows, our prototype of the local detection mechanism targets Linux, which is the next most popular target of ransomware attacks, although our approach is applicable to both Windows and Linux. A Windows version can easily be derived using the same design but different system libraries and tools.
III. OUR NETWORK-ASSISTED APPROACH
A. Background Knowledge
1) Characteristics of Ransomware Behaviors:
Ransomware attacks always gain access to the victim's operating system in some way and automatically encrypt a large number of user files, system files or both in a short time. During this process, the infected system behaves differently from how it would without a ransomware attack. This common trace provides various kinds of useful information that can be extracted from a suspected victim while ransomware is running. Although different ransomware differ in their key generation and key preservation strategies, we can still identify the following common features, which differ so clearly between safe and infected conditions that they can be used for ransomware detection.

(1) Keywords
If a piece of software is ransomware, it probably contains some keywords that are commonly used in ransomware binaries. For example, bitcoin, crypto and ransom are strings that frequently appear in ransomware binaries. By inspecting software binaries, we can identify suspicious software even before a ransomware attack happens.
(2) Function Calls
Since ransomware needs to encrypt files, it always calls functions related to cryptographic algorithms, including key generation, encryption and decryption functions. These functions may be written by the attacker or invoked from existing libraries. We can inspect binaries to locate the software that calls these functions.
(3) Data Information
Once a file is encrypted, we can observe some changes to this file. The file extension may be changed to a specific extension designated by the ransomware. The entropy of the file increases due to the randomness of the data after encryption. The magic bytes of the file differ from the original bytes because they are encrypted. Some files are even deleted because the ransomware creates new files to store the encrypted versions. All of these features provide useful information for ransomware detection.
(4) Metadata Information
Metadata information refers to indirect information that we can collect during ransomware attacks, as opposed to information from file contents. Ransomware is an automatic program that, in most cases, encrypts a large number of files in a very short time thanks to the high computation speed of computers. So, when ransomware is working, it accesses many files and directories and then performs read and write operations on these files in short time periods, which leads to a high file/directory access rate and a high read/write frequency on a computer. This phenomenon also indicates a potential ransomware attack.
(5) Network Traffic
Some ransomware generates and stores its keys on a remote server so that victims cannot obtain the decryption keys without paying the ransom. This kind of ransomware must contact the remote server to get the encryption key during the attack. Thus, unknown network traffic that is not produced by the user of the local host can be observed, which helps ransomware detection.

All the features listed above can be used to judge whether a computer is in an abnormal condition and thus help determine whether ransomware is running on it. However, a single feature considered alone is insufficient for accurate detection. So, most detection approaches pick several features and combine their checking results to decide whether to alert the user to a ransomware attack.
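As an illustration of the entropy feature listed under Data Information, the following Python sketch computes the Shannon entropy of a byte string in bits per byte. The function name shannon_entropy is hypothetical and the sketch is an assumption made for illustration, not code from the prototype. It only shows why random-looking data (encrypted or compressed) scores close to 8 while repetitive data scores close to 0.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    # H = -sum(p * log2(p)) over the byte values that actually occur.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A detector built on this idea would still need a threshold, and any concrete cutoff would have to be chosen empirically.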
2) Wisdom of the Crowd:
The wisdom of the crowd [31] is a collective opinion produced by a group of people instead of an individual. Some experiments showed that the collective knowledge of ordinary people is more precise than that of an expert. The reason for this phenomenon is that there is idiosyncratic noise associated with each individual judgment, and taking the average over a large number of responses goes some way toward canceling the effect of this noise. Thus, this notion has been applied to many social information sites such as Quora, Wikipedia and Yahoo! Answers.

For high accuracy of ransomware detection results, we also use the wisdom of the crowd in the network-level detection of NAA to generate a comprehensive report that users can consult to determine whether they need to take further action against potential ransomware attacks.
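The noise-canceling argument can be illustrated with a small simulation. The sketch below is not part of NAA: it assumes that each individual judgment equals the true value plus independent Gaussian noise, and the names noisy_judgments and crowd_estimate are hypothetical.

```python
import random
import statistics


def noisy_judgments(truth: float, n: int, noise_std: float, seed: int = 0) -> list:
    """Simulate n individual judgments: the true value plus independent Gaussian noise."""
    rng = random.Random(seed)
    return [truth + rng.gauss(0.0, noise_std) for _ in range(n)]


def crowd_estimate(judgments: list) -> float:
    """Aggregate individual judgments by simple averaging."""
    return statistics.mean(judgments)


if __name__ == "__main__":
    truth = 100.0
    for n in (1, 10, 1000):
        estimate = crowd_estimate(noisy_judgments(truth, n, noise_std=20.0))
        # Print the crowd size and the absolute error of the averaged estimate.
        print(n, abs(estimate - truth))
```

Under this independence assumption, the standard deviation of the average shrinks in proportion to 1/sqrt(n), which is why a larger crowd tends to give a more reliable aggregate.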
B. Our Motivation
As we can observe in ransomware attack cases, the majority of ransomware share one property: their appearance and diffusion are related to the network. If one machine is infected by some ransomware, the others in the same local area network (LAN) are potential victims. That is because a LAN is deployed by entities such as enterprises, laboratories and schools to interconnect computers within a small area. Computers in one LAN are often equipped with the same operating system and the same version. So, if a computer is attacked by ransomware via some exploit, the others may have been attacked or may be attacked soon because they share the same vulnerability. Worse, if the ransomware is a cryptoworm that actively scans the local network to compromise other machines, the computers in the same LAN are at high risk. Even in a wide area network (WAN), a cryptoworm can spread at high speed because it is self-propagating, which means one infected computer can infect almost all computers that communicate with it, resulting in a fast increase in the number of infected computers.

So, network-related information that reflects the conditions of other computers in the network is very useful in ransomware detection, especially in cryptoworm detection. However, existing approaches for ransomware detection only consider the characteristics of ransomware behaviors on local hosts as the parameters of their detection tools. One exception is the work [21] that analyzes file sharing traffic in a volume sharing scenario to detect possible ransomware. However, this work still does not consider the security information of other computers in the network.

Motivated by the above observations, we propose a network-assisted approach that contains both local detection and network-level detection. Local detection is responsible for checking local features of a machine to make a preliminary diagnosis, whereas network-level detection collects the security conditions of other machines in a specific area to help determine whether the local host is infected. Since one machine can be in both a LAN and a WAN, our network-level detection has two separate schemes for these two scenarios. To obtain the security conditions of other machines from the WAN, we design an ACO-based Mechanism (ACOM), which uses the ant colony optimization algorithm to efficiently collect the maximum amount of information with minimum network resource consumption and report its detection result to the user. To obtain the desired information from the LAN, we design the Broadcasting Mechanism (BM). It directly collects information from all machines in the LAN and uses the wisdom of the crowd to report a comprehensive detection result. With the information provided by ACOM and BM, NAA can accurately detect ransomware, especially cryptoworms, and help the user judge whether the local host is infected.
C. Outline of NAA
NAA is a ransomware detection application that performs both local detection and network-level detection. The local detection checks local features, while the network-level detection collects the security conditions of other machines from the network to provide information for the user to judge whether the local host is attacked. Figure 1 shows the workflow of NAA.

Fig. 1. Workflow of NAA.

First of all, we run the local detection mechanism on each local host. If the local detection mechanism finds anomalous tasks, it suspends them using the "kill -STOP pid" command and raises an alert to the end user. Then, based on his or her knowledge, the user should tell NAA whether the anomalous behaviors are caused by a legitimate user operation (e.g., when the user is encrypting files with a special tool). If they are, NAA resumes the suspended processes using the "kill -CONT pid" command and continues local detection. If the user indicates that these behaviors are anomalous (either because they are truly anomalous or because the user has no idea what is going on), network-level detection is launched. During network-level detection, ACOM is responsible for collecting information from the WAN and BM is responsible for collecting information from the LAN. Once both mechanisms finish their work, a comprehensive report describing the current network-wide situation is sent to the user. The user can then see the fraction of computers in the LAN that are also in the anomalous state and how ACOM views the current state of this local host. Based on this information, the user can judge whether this local host is in danger. If the answer is yes, NAA finishes its work; otherwise, the computer is considered safe and NAA resumes the suspended tasks and continues with local detection.

In the following sections, we explain in detail how local detection and network-level detection work and generate reports.
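The suspend-and-resume step of this workflow can be sketched in Python as follows. The helper names suspend_task, resume_task and handle_user_verdict, as well as the returned strings, are hypothetical. The sketch simply sends the same SIGSTOP and SIGCONT signals that "kill -STOP pid" and "kill -CONT pid" send, and it assumes a POSIX system and permission to signal the target process.

```python
import os
import signal


def suspend_task(pid: int) -> None:
    """Pause a process, equivalent to `kill -STOP pid`."""
    os.kill(pid, signal.SIGSTOP)


def resume_task(pid: int) -> None:
    """Let a paused process continue, equivalent to `kill -CONT pid`."""
    os.kill(pid, signal.SIGCONT)


def handle_user_verdict(pid: int, legitimate: bool) -> str:
    """Resume the task if the user vouches for it; otherwise leave it
    suspended and report that network-level detection should start."""
    if legitimate:
        resume_task(pid)
        return "resumed"
    return "network-level detection"
```

Keeping the task suspended while the user decides means that a real attack cannot continue encrypting files during the network-level check.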
IV. LOCAL DETECTION
A. Design of Local Detection Algorithm
Recall that local detection checks some common features on local hosts that display different characteristics under safe and infected conditions. Section 3.1 introduced the characteristics of ransomware behaviors and listed the features that could be considered in local detection.

In our local detection algorithm, we pick entropy, read/write frequency and read/write pattern as input parameters to diagnose the local host because the combination of these three parameters provides both high accuracy and efficiency. Among them, entropy and read/write frequency are classic features used by previous methods, while read/write pattern is a brand new feature first proposed in this paper.

Entropy is a measure of randomness originally used in thermodynamics. In 1948, Claude E. Shannon applied entropy to digital communications in his paper A Mathematical Theory of Communication [27]. After that, people started to use entropy to describe the degree of randomness of a digital file. Encrypted files and compressed files tend to have higher entropy than normal files because their bytes are more random. So, we can use entropy to help determine whether a file is in a normal condition. However, even if we find files with high entropy in a system, we cannot conclude that the system is infected by ransomware because there are two exceptions: 1) the files with high entropy are compressed files instead of encrypted files; 2) the files are encrypted, but they were encrypted by authorized users. Thus, this feature alone is not sufficient to produce an accurate ransomware detection result. We use further features to make more accurate judgements.

Read/write frequency describes the frequency of read and write operations on a machine. Ransomware always encrypts many files in a short time because it does not want to be detected before it finishes its work. Moreover, it wants to encrypt as many files as possible so that the attackers are more likely to get a ransom from the victim. File encryption necessarily involves read and write operations. So, if ransomware is working, we can probably observe high read and write frequencies on a system. However, we still cannot accurately diagnose whether a system is a potential victim with these two features because there are still some exceptions, such as batch file compression, which behaves the same as a ransomware attack when only entropy and read/write frequency are considered.

To distinguish the behaviors of a ransomware attack from other normal behaviors that also result in high file entropy and high read/write frequency, such as batch file compression, we take read/write patterns as the third feature, since different tasks usually have different read/write patterns. To the best of our knowledge, this feature has not been used in prior work. We use it to distinguish the user's normal behaviors from ransomware activities. Here, read/write patterns refer to the relationship between the read and write operations that occur on a system. For example, if there is a read operation right after a write operation, we can use {write, read} to describe their relation during this period. If there is a read operation before a write operation, but a close operation exists between them, we can use {read, ..., write} to describe the read/write pattern, which means there are one or more other operations between the read and the write. When ransomware encrypts a file, it always reads the original file first and writes the ciphertext into a newly created file. Then, the original file is deleted so that only the encrypted file is left. Ransomware encrypts files one after another, which makes pairs of read and write operations appear at intervals. So, the pattern of ransomware activity can be summarized as {read, write}. In contrast, a batch file compression task continuously reads each file in a specified directory and finally writes the compressed data into the compressed file after closing the original files. There are no adjacent read and write operations in its pattern. Other tasks also have their own read/write patterns that are usually different from those of ransomware activities. Thus, read/write patterns can help us filter out some benign behaviors when detecting ransomware attacks.

Our local detection algorithm considers these three features together to decide whether the local host is anomalous. This algorithm is applicable to all operating systems because, no matter which operating system the ransomware is working on, it has common behaviors that cause common characteristics. In our implementation, we used this algorithm to build a local detection mechanism prototype for Linux as an example.

B. Implementation of Local Detection Mechanism
This subsection describes the outline and details of the local detection mechanism, which is implemented to support the network-level detection methods. Since the local detection algorithm described in Section 4.1 is applicable to any operating system, we selected Linux as an example to implement a prototype.
1) Overview:
As we already know, all ransomware encrypts files to extort victims. Thus, all ransomware activities are related to operations on the file system. To monitor the related operations on the Linux file system, we use inotify [5], a Linux kernel subsystem that monitors file system events and reports changes. Inotify events include IN_OPEN, IN_ACCESS, IN_MODIFY, IN_DELETE, etc., among which IN_ACCESS indicates a read operation and IN_MODIFY indicates a write operation. We can use several system calls provided by the inotify API to monitor a specified directory. To monitor the entire file system, we can use "/." as the name of the directory to be monitored, which represents the root directory of the Linux file system. Once inotify starts to work, all events that occur in the directory tree can be captured, and an event handler defined by us deals with these events according to the detection requirements.

Our local detection mechanism prototype uses inotify to monitor the Linux file system and combines the three features mentioned in Section 4.1 (entropy, read/write frequency, read/write pattern) to determine whether the local host is anomalous. Figure 2 shows the workflow of the local detection mechanism.
Fig. 2. Workflow of local detection mechanism.
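As a rough illustration of this kind of monitoring, the Python sketch below calls the inotify system calls through ctypes and watches a single directory. It is an assumption-based sketch rather than the prototype's code: the function names watch_directory and read_events are hypothetical, only the mask values and the event record layout come from the Linux inotify API, and it runs on Linux only. According to the inotify documentation, a watch on a directory is not recursive, so this sketch covers one directory and a tree-wide monitor would need one watch per directory.

```python
import ctypes
import os
import struct

# Event mask bits from <sys/inotify.h>.
IN_ACCESS = 0x001  # file was read
IN_MODIFY = 0x002  # file was written
IN_OPEN = 0x020
IN_CREATE = 0x100
IN_DELETE = 0x200

_libc = ctypes.CDLL(None, use_errno=True)
_libc.inotify_add_watch.argtypes = [ctypes.c_int, ctypes.c_char_p, ctypes.c_uint32]
_HEADER = struct.Struct("iIII")  # wd, mask, cookie, length of the name field


def watch_directory(path: str, mask: int) -> int:
    """Create an inotify instance watching one directory; return its file descriptor."""
    fd = _libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    if _libc.inotify_add_watch(fd, os.fsencode(path), mask) < 0:
        err = ctypes.get_errno()
        os.close(fd)
        raise OSError(err, "inotify_add_watch failed")
    return fd


def read_events(fd: int) -> list:
    """Block until events arrive, then return them as (mask, file name) tuples."""
    buf = os.read(fd, 4096)
    events, offset = [], 0
    while offset < len(buf):
        _wd, mask, _cookie, name_len = _HEADER.unpack_from(buf, offset)
        offset += _HEADER.size
        # The name field is NUL-padded up to name_len bytes.
        name = os.fsdecode(buf[offset:offset + name_len].rstrip(b"\0"))
        offset += name_len
        events.append((mask, name))
    return events
```

An event handler in the spirit of the text would map IN_ACCESS to "read" and IN_MODIFY to "write" before passing the events to the pattern check.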
At the very beginning, we add a watch to the root directory so that we can monitor the entire file system, and then start inotify. We first check read/write patterns because this check can be done instantly whenever a new event is observed. If a pattern matches the anomalous pattern, that is, the checking result of the first module is "anomalous", we start a new thread for further detection. The pattern checking module keeps working regardless of the result, because inotify keeps monitoring the file system and we do not want to miss any upcoming anomalous patterns. When we start the new thread, we also pass it the path of the file where the anomalous write operation happened.

The new thread checks the other features. It first checks the entropy of the potentially encrypted file whose path was passed by the pattern checking module when the thread was created. If the file entropy is too high to be normal, that is, the checking result of the second module is "anomalous", the thread proceeds to the next module to check read/write frequency. Otherwise, the new thread stops because the local host is currently in a safe state. The reason for this judgement is that the modified file, where the anomalous pattern was discovered, has a normal entropy value, which means it is not encrypted. This cannot happen if the local host is undergoing a ransomware attack.

In the third module, we check read/write frequency. If the current read/write frequency on the system is too high to be normal, that is, the checking result of the third module is "anomalous", the local detection mechanism diagnoses this machine as anomalous because it shows anomalous characteristics in all three aspects. Otherwise, the new thread stops because the local host is safe. Note that the local host has an initial state: safe. If the local detection mechanism cannot find evidence that this machine is in an anomalous state, we consider it safe by default.

The rest of this subsection elaborates on how each module is implemented.
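The three-module workflow above can be summarized in a short Python sketch. This is only an illustration under stated assumptions, not the prototype's code: the names `further_diagnosis` and `on_anomalous_pattern` are hypothetical, and the entropy and frequency checks are passed in as callables because their details are given later in this subsection.

```python
import threading
from typing import Callable


def further_diagnosis(path: str,
                      entropy_is_anomalous: Callable[[str], bool],
                      frequency_is_anomalous: Callable[[], bool]) -> str:
    """Modules 2 and 3: the host is reported anomalous only if both checks fail."""
    if not entropy_is_anomalous(path):
        return "safe"          # the modified file is not encrypted
    if not frequency_is_anomalous():
        return "safe"          # read/write frequency is normal
    return "anomalous"         # all three features are anomalous


def on_anomalous_pattern(path: str,
                         entropy_is_anomalous: Callable[[str], bool],
                         frequency_is_anomalous: Callable[[], bool],
                         report: Callable[[str], None]) -> threading.Thread:
    """Module 1 found {read, write}: diagnose in a new thread so that
    pattern checking can keep running in the monitoring thread."""
    worker = threading.Thread(
        target=lambda: report(
            further_diagnosis(path, entropy_is_anomalous, frequency_is_anomalous)
        )
    )
    worker.start()
    return worker
```

The default outcome is "safe", which mirrors the rule that a host is considered safe unless every module reports an anomaly.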
2) Check Read/Write Patterns:
According to the working procedure of common ransomware, ransomware automatically encrypts files one after another. For each file, the encryption task consists of several file operations: (1) open the original file; (2) create a new file; (3) open the new file; (4) read plaintext from the original file; (5) write ciphertext to the new file; (6) close the original file; (7) close the new file; (8) delete the original file. During this process, we can observe adjacent read and write operations, with the read before the write. To distinguish the read/write pattern of the file encryption task from that of other tasks, we also observed the read/write patterns of some common user behaviors. By adding a watch to a particular directory, we can observe the events in this directory.

Table I lists the file operations during file encryption and other normal tasks. According to this table, the read/write pattern of the file encryption task is {read, write}, which indicates a single pair of read and write operations with the read before the write. This {read, write} pair can appear many times, but other operations exist between two adjacent pairs. File modification and compression tasks have the read/write pattern {read, ..., write}, which means that other operations occur between the read and write operations. When we decompress a file, only read operations occur. The most confusing task is browsing a webpage, because its read/write patterns are similar to those of file encryption: we can observe adjacent read and write operations as well. However, there are either consecutive read operations before a write operation or iterated read/write pairs. We therefore mark the read/write patterns of browsing a webpage as {read*, write} and {read, write}*, which are different from the read/write pattern of the file encryption task.

Thus, we consider {read, write} an anomalous read/write pattern indicating file encryption activities. Only when a read operation occurs right before a write operation, and other file operations precede them, do we say that we have found an anomalous pattern. We set the judgement condition that if {read, write} exists on a monitored system, the local host is potentially at risk and further diagnosis is needed. Otherwise, the local host is safe. Since inotify monitors the entire file system in our implementation, operations from different tasks may sometimes be interleaved. That is, inotify may capture an operation from task A after one operation from task B but before another operation from task B, which may produce an anomalous pattern even though there is no anomalous behavior. In this case, the pattern checking module causes false positives, which is why we need further diagnosis to check the other features.

To detect anomalous read/write patterns in time, we customize the inotify event handler in the following way: record all monitored events in order in an event list; upon encountering a write operation, check the last two operations in the event list. If the last one is a read and the second-to-last one is neither a read nor a write, the anomalous read/write pattern is found; otherwise, empty the event list and continue to add monitored events to the list. Figure 3 shows the code of our event handler.

Once an anomalous read/write pattern {read, write} is discovered on a system, the checking result of the first module is "anomalous". We then start a new thread for further diagnosis and pass it the path of the file where the anomalous write operation happened, so that the second module can directly locate the file it needs to check.
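The event-list rule described above can be illustrated with a small Python sketch. It is not the event handler of Figure 3: the class name `PatternChecker`, the helper `scan`, and the use of plain strings for operations are assumptions made for illustration, and emptying the list after a hit as well as after a miss is also an assumption, since the text only specifies the miss case.

```python
class PatternChecker:
    """Tracks file operations and flags the {read, write} pattern."""

    def __init__(self):
        self.events = []  # operations observed since the last write

    def handle(self, op: str) -> bool:
        """Feed one operation; return True if the anomalous pattern is found."""
        if op != "write":
            self.events.append(op)
            return False
        last_two = self.events[-2:]
        found = (
            len(last_two) == 2
            and last_two[1] == "read"                    # read right before the write
            and last_two[0] not in ("read", "write")     # preceded by another operation
        )
        # The text empties the list when no pattern is found;
        # emptying it after a hit too is an assumption of this sketch.
        self.events = []
        return found


def scan(ops) -> bool:
    """Return True if any write in the sequence completes the anomalous pattern."""
    checker = PatternChecker()
    hits = [checker.handle(op) for op in ops]
    return any(hits)
```

Applied to the operation sequences of Table I, `scan` flags the "Encrypt a file" sequence and does not flag the "Modify a file" or "Browse a webpage 1" sequences.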
3) Check File Entropy:
There is an existing algorithm for file entropy calculation [6]. Given a file, this algorithm traverses the target file to get the frequency count of each byte value and then uses the following formula to cumulatively calculate the entropy of the entire file:

$$\text{entropy} = \text{entropy} + \text{freq} \cdot \log_2 \frac{1}{\text{freq}} \tag{1}$$

Here, the variable "entropy" is initialized to 0 and gradually increases until the terms for all byte values are included, and the variable "freq" represents the relative frequency of each byte value. With this algorithm, we can easily calculate the final entropy value of a target file.

To distinguish normal files from encrypted files through file entropy, we ran an experiment to calculate the entropy values of various kinds of normal and encrypted files. Table II lists the entropy values of different types of files in the normal state and the encrypted state.

We can observe from Table II that text files consisting of English words have relatively low entropy in the normal state. The entropy of such files ranges from 4.0 to 5.0, while that of the corresponding encrypted files ranges from 7.0 to 8.0 in the Linux file system. Other types of files, such as pictures and audio files, have relatively high entropy even in the normal state. After being encrypted, their entropy values tend to 8. We therefore treat different kinds of files differently. For a text file, we set the threshold to 6.00. For a non-text file, the threshold is set to 7.99. We can then determine whether a file is anomalous by checking its entropy.

First, we check the file extension of the target file. If the file extension is unknown to us, we consider the file to be encrypted by ransomware, because ransomware typically modifies the file extension after encrypting a file. If we can recognize the file extension, we calculate the file entropy and compare it with the appropriate threshold. If the entropy of the inspected file is greater than or equal to the threshold, the file is considered to have an anomalous entropy value; that is, the checking result of the second module is "anomalous". Then, the third feature, read/write frequency, is checked for the final detection result. Otherwise, this is not an encrypted file, hence not a ransomware attack.

Fig. 3. Event handler for local detection.

| Tasks | File operations | Read/write patterns |
|---|---|---|
| Encrypt a file | open, create, open, read, write, close, close, delete | {read, write} |
| Modify a file | open, read, close, open, create, open, close, write, close | {read, ..., write} |
| Compress a file | open, create, open, read, close, write, close | {read, ..., write} |
| Decompress a file | open, read, close | {read} |
| Browse a webpage 1 | create, open, write, close, read, ..., read, write | {read*, write} |
| Browse a webpage 2 | ..., read, write, read, write, ..., read, write | {read, write}* |

TABLE I. READ/WRITE PATTERNS OF DIFFERENT TASKS

| File types | Normal state | Encrypted state |
|---|---|---|
| .txt | 4.62 | 7.98 |
| .log | 4.76 | 7.83 |
| .conf | 4.47 | 7.92 |
| .pgn | 7.91 | 8.00 |
| .jpeg | 7.94 | 8.00 |
| .pptx | 7.94 | 8.00 |
| .mp3 | 7.95 | 8.00 |

TABLE II. FILE ENTROPY OF DIFFERENT FILES IN NORMAL STATE AND ENCRYPTED STATE
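The entropy check can be sketched in Python as follows. This is a minimal illustration, not the algorithm of [6] verbatim: it assumes that the entropy is the Shannon entropy in bits per byte (base-2 logarithm over relative byte frequencies, so that values lie between 0 and 8 as in Table II), and it assumes that the set of "known" extensions is just the one listed in Table II. The function names are hypothetical; the thresholds 6.00 and 7.99 are the ones stated above.

```python
import math
import os
from collections import Counter

# Assumption: only the extensions of Table II are treated as known.
TEXT_EXTENSIONS = {".txt", ".log", ".conf"}
NON_TEXT_EXTENSIONS = {".pgn", ".jpeg", ".pptx", ".mp3"}
TEXT_THRESHOLD = 6.00
NON_TEXT_THRESHOLD = 7.99


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    entropy = 0.0
    for count in Counter(data).values():
        freq = count / total                   # relative frequency of one byte value
        entropy += freq * math.log2(1 / freq)  # cumulative update, one term per byte value
    return entropy


def entropy_is_anomalous(path: str, data: bytes) -> bool:
    """Second module: unknown extension or entropy at/above the threshold."""
    extension = os.path.splitext(path)[1].lower()
    if extension in TEXT_EXTENSIONS:
        return byte_entropy(data) >= TEXT_THRESHOLD
    if extension in NON_TEXT_EXTENSIONS:
        return byte_entropy(data) >= NON_TEXT_THRESHOLD
    return True  # unknown extension is treated as encrypted
```

A buffer in which every byte value occurs equally often reaches the maximum of 8 bits per byte, which is the value that encrypted files approach in Table II.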
4) Check Read/Write Frequency:
The final checkpoint concerns the read/write frequency on the local host. Once a read or write operation is observed by inotify, the event handler records the time at which it occurred, as shown in Figure 3. In addition, stale entries in the time list are removed at the beginning of the new thread, so that only the read and write operations that occurred after the {read, write} pattern are recorded in the time list. Since further diagnosis runs in a new thread, the event handler can continue to record the times of upcoming read and write operations. With the information recorded in the time list, we can calculate the read/write frequency in the system after the anomalous pattern is found, which is defined as the average number of read/write operations per second:

$$\text{read/write frequency} = \frac{\text{operation counts}}{\text{duration}}, \tag{2}$$

where "operation counts" is the total number of read and write operations recorded after an anomalous read/write pattern, and "duration" is the time interval between the first and the last recorded operation times in the time list. We obtain "operation counts" by counting the elements in the time list and "duration" by computing the difference between its first and last elements.

To distinguish normal read/write frequency from the anomalous read/write frequency caused by ransomware activities, we conducted two experiments that measure the read/write frequency during simulated ransomware activities and during normal user behaviors, respectively.

In the first experiment, we use AES ciphers and the RSA cipher from the openssl library to encrypt files whose sizes range from 1 KB to 1 MB. In each test, given a cipher type and a file size, 100 files are encrypted automatically. Table III shows the experiment results. For a given file size, the read/write frequency hardly changes across ciphers. For a given cipher type, larger files tend to cause a higher read/write frequency. The RSA cipher can only encrypt small files due to the limitation of its encryption key length in the openssl library, so we did not obtain results for relatively large files with RSA. However, this does not matter because real-world ransomware uses RSA to encrypt keys, whose length is relatively small. In the tests, we also recorded the number of read and write operations that occurred during the file encryption tasks. By analyzing the data in Table III, we found that the read/write frequency on a system undergoing a ransomware attack should be over 600 operations per second. Even if the ransomware is encrypting files smaller than 1 KB, the read/write frequency would not be smaller than 600 op/sec. The reason is that, when the file size is 1 KB, a total of 200 read and write operations occur on 100 files; that is, there is only one read and one write operation during the encryption of one file. So, when ransomware works on files smaller than 1 KB, the number of read/write operations does not change whereas the time consumption can be smaller than that of encrypting 1 KB files, which makes the read/write frequency larger than 600 op/sec. Therefore, we can set the lower bound of the read/write frequency during ransomware activity at 600 op/sec.

Then, we use another experiment to measure the read/write frequency during normal user behaviors. Table IV shows the experiment results. For example, when we use Firefox, the maximum read/write frequency on the machine is 322 op/sec and the average is 95 op/sec. When we watch a video on YouTube, the maximum frequency is 342 op/sec while the average is only 105 op/sec. We can observe that the upper bound of the read/write frequency during normal user activities is smaller than 400 op/sec.

Since the upper bound of the normal read/write frequency is lower than 400 op/sec while the lower bound of the anomalous read/write frequency is higher than 600 op/sec, we picked the midpoint, 500 op/sec, as the threshold. If the currently observed read/write frequency is greater than or equal to 500 op/sec, the checking result of the third module is "anomalous", and the local detection mechanism finishes its work with an "anomalous" detection result. Otherwise, since the read/write frequency is normal, the machine is considered safe.

In summary, the local detection mechanism uses inotify to keep monitoring the local host and checking read/write patterns. An anomalous read/write pattern triggers further diagnosis. If all features show anomalous checking results, the local detection mechanism sends an alert to the user reporting the anomalous state of the machine and the suspicious tasks that are performing anomalous behaviors. After that, all running tasks on the machine are suspended and the network-level detection is triggered to collect information from other machines.

C. Validation of Local Detection Mechanism
As we mentioned in Section 3.1, using one feature alone to detect ransomware is not sufficient, because single-feature methods cause many false positives and false negatives. For example, if we use file entropy as the only feature to determine whether a machine is infected, compressed files will be mistaken for encrypted files and result in false positives. To validate our local detection mechanism, we applied it to two machines under two different scenarios.

In the first scenario, both test machines are safe. We ran our local detection mechanism on them for two days and used them as usual, for example doing course projects, reading papers, writing assignments, watching movies, and playing computer games. In the second scenario, we also ran our local detection mechanism on the two test machines for two days, but during this period we ran ransomware samples on them 48 times at random times and observed the detection results.

Table V shows the test results. There were 3 false positives on Machine1 but no false negatives during the experiment. That is, when the test machines were in a safe state, our local detection mechanism reported "anomalous" detection results three times on Machine1. When the test machines were under ransomware attack, all attacks were correctly detected and reported by our local detection mechanism. We also found the reason for these 3 false positives: they were caused by file encryption performed by authorized users.

Sometimes, although there is no ransomware attack, users' ransomware-like behaviors cause false positives. That is why we need network-level detection to help correct some false positives of local detection and to provide users with more accurate information for judging whether there is indeed a ransomware attack.

| Cipher | 1 KB | 10 KB | 100 KB | 500 KB | 1 MB |
|---|---|---|---|---|---|
| AES 128 CBC | 742 op/sec | 1379 op/sec | 8318 op/sec | 33876 op/sec | 43363 op/sec |
| AES 256 CBC | 724 op/sec | 1437 op/sec | 8642 op/sec | 33920 op/sec | 43780 op/sec |
| AES 128 ECB | 749 op/sec | 1440 op/sec | 8758 op/sec | 34162 op/sec | 43027 op/sec |
| AES 256 ECB | 788 op/sec | 1380 op/sec | 8546 op/sec | 33697 op/sec | 43998 op/sec |
| RSA | 651 op/sec | - | - | - | - |
| Op counts | 200 | 400 | 2600 | 12400 | 24600 |

TABLE III. READ/WRITE FREQUENCY DURING BATCH FILE ENCRYPTION.

| Applications | Max Frequency | Average Frequency |
|---|---|---|
| Firefox | 322 op/sec | 95 op/sec |
| Text editor | 210 op/sec | 88 op/sec |
| LibreOffice writer | 310 op/sec | 35 op/sec |
| YouTube | 342 op/sec | 105 op/sec |
| Amazon | 281 op/sec | 121 op/sec |
| Gmail | 253 op/sec | 74 op/sec |

TABLE IV. READ/WRITE FREQUENCY DURING NORMAL BEHAVIORS.

V. NETWORK-LEVEL DETECTION
The network-level detection collects the security conditions of other machines from the network and generates a comprehensive report to help the user determine whether a ransomware attack exists. It can help correct some false positives made by local detection, and it is especially effective when there is a cryptoworm attack.

The general idea of network-level detection is that, if multiple machines manifest similar anomalous behavior at about the same time, a cryptoworm attack is likely. If only a few machines are anomalous, these machines may have been misdiagnosed by local detection, because a cryptoworm spreads swiftly and causes a large number of infected machines. It is easy to learn the number of anomalous machines in a LAN by collecting information from all the peers. However, this idea is hard to put into practice in a WAN because it is difficult to collect useful information efficiently. Querying all machines in the WAN for their security conditions would consume too much time and too many network resources. If we only pick several machines as representatives, their information may not be reliable because the information of a few machines cannot reveal the condition of the entire WAN. To solve this dilemma, we apply the ACO algorithm to the network-level detection so that we can collect the most useful information in the least time.

To sum up, we use the ACO-based Mechanism (ACOM) to collect information from selected machines in the WAN and the Broadcasting Mechanism (BM) to collect information from all machines in the same LAN. Then, we use the wisdom of the crowd to provide the user with the collected data for reference and to help the user determine whether to treat this machine as an infected one.
A. ACO-based Mechanism
1) Ant Colony Optimization:
Ant colony optimization (ACO) is an optimization technique inspired by the path-finding behavior of ants searching for food [7]. In nature, ants use pheromone to communicate with each other. They leave pheromone along the path on which they find food so that other ants can also find food by following the pheromone trails. When there are multiple pheromone paths ahead, ants decide depending on the strength of the pheromone trails. Most ants choose the strongest pheromone trail and only a small number of ants choose other ways. Over time, pheromone trails gradually evaporate. This means that pheromone trails which no longer lead to a food source eventually stop being used, prompting ants to find new paths and new food sources. Figure 4 gives an example of how ants search for food.
Fig. 4. Path finding behavior of ants searching for food.
Suppose the food source is on the left side and the ant colony is on the right side. There are two paths between the food source and the ant colony. Path A is shorter and path B is longer. At the very beginning, both paths may be chosen by ants from the colony, and pheromone trails are left on both paths. Since path A is shorter, the ants on path A spend less time going back and forth, which makes the pheromone trail on this path stronger than that on path B. The stronger pheromone trail on path A attracts more ants to this path. Over time, almost all ants choose path A instead of path B. This is how ants find the shortest path between two places. For this reason, the ACO algorithm is often applied to optimization problems such as the travelling salesman problem and various scheduling and routing problems. It has also been applied to detect network intrusions and botnet servers [14].

Our problem is similar to the travelling salesman problem. Instead of finding the shortest way to go through all cities, we want to find the shortest way to collect the most information from other machines in the WAN. So, we use the ACO algorithm for network-level detection in the WAN scenario, so that we can provide the user with a helpful report without consuming too many network resources.

| Machine | Number of false positives | Number of false negatives |
|---|---|---|
| Machine1 | 3 | 0 |
| Machine2 | 0 | 0 |

TABLE V. FALSE POSITIVES AND FALSE NEGATIVES CAUSED BY LOCAL DETECTION MECHANISM
2) Design of ACOM:
There are two key elements in ACO: ants and pheromone. To apply ACO to the network-level detection, we first decide what roles these two elements play in our approach. Since we want to collect the most information from other machines in the WAN, we use ants to collect and transmit information among machines, just as they do when searching for food. Each anomalous machine creates an ant and sends it into the network. Each time an ant passes an anomalous machine, it records the security condition of this machine and shares the information it has collected with the next machine it reaches. We treat the number of anomalous machines each ant has collected as pheromone, which is left on the machines that the ant has passed. In this manner, as ants travel in the WAN, machines gain increasing knowledge of the number of anomalous machines in the WAN.

Then, according to the records in an ant when it finishes its work and the level of pheromone left on the machine, ACOM generates a report telling the user the current situation in the WAN. Figure 5 shows the pseudo code of ACOM, which describes the working procedure of this network-level detection mechanism.

Once ACOM is launched, the anomalous local host creates an ant and sends it into the network. The first destination of the ant is randomly selected from all machines this local host can contact. Then, ACOM enters a while loop. In this loop, the ant first notifies the current local host to perform local detection again if this local host is not already doing so. Then they exchange information with each other. The local host here is the machine that the ant is currently on. For example, if machine A created an ant and sent it to machine B, the information exchange happens between the ant and machine B. After the information exchange, ACOM checks whether the ant has achieved its goal, which is the number of anomalous machines it needs to collect during its travel.
If the ant has collected enough anomalous machines to indicate a cryptoworm attack, it goes back to the machine that created it and reports to the user that "At least T users in WAN think you are in high risk". Here, T is replaced by the threshold value determined for the given network environment. If the ant has not achieved its goal but has reached the upper bound of its capability, it goes back as well but reports that "We inquired 20 users in WAN, only A user(s) think(s) you are in risk." Here, A is replaced by the number of anomalous machines known to the ant. Both cases lead to the end of ACOM, since it has provided the user with the wisdom of the crowd for reference. Otherwise, the ant continues to work. The local host it is currently on decides the next stop of the ant according to pheromone information and sends the ant to the next stop. The procedure in the while loop iterates until the ant goes back to its original local host and reports our judgement. This is the entire workflow of ACOM. The detailed implementation of ACOM is described in the following subsection.

Fig. 5. Pseudo code describing the work procedure of ACOM.
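To make the loop concrete, the following Python sketch simulates one ant in memory. It is not the pseudo code of Fig. 5 and makes several simplifying assumptions, all of them hypothetical: the WAN is a dictionary mapping machine names to local detection results, no pheromone has been deposited yet (so the goal equals the threshold T), only the ant's side of the information exchange is modeled, and the next stop is drawn uniformly from unvisited machines rather than by pheromone weights.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Ant:
    home: str                                      # machine that created the ant
    goal: int                                      # anomalous machines still to be found
    limit: int                                     # upper bound N on visited machines
    visited: list = field(default_factory=list)
    anomalous: set = field(default_factory=set)


def run_acom(home, local_results, threshold, limit, rng):
    """Simulate one ant. `local_results` maps machine name -> True if anomalous.
    Assumes at least one machine other than `home` is reachable."""
    ant = Ant(home=home, goal=threshold, limit=limit)
    candidates = [m for m in local_results if m != home]
    current = rng.choice(candidates)               # first stop is chosen at random
    while True:
        ant.visited.append(current)
        if local_results[current]:                 # exchange information with the host
            ant.anomalous.add(current)
        if len(ant.anomalous) >= ant.goal:         # goal achieved: go home and alert
            return f"At least {threshold} users in WAN think you are in high risk"
        remaining = [m for m in candidates if m not in ant.visited]
        if len(ant.visited) >= ant.limit or not remaining:   # capability exhausted
            return (f"We inquired {len(ant.visited)} users in WAN, "
                    f"only {len(ant.anomalous)} user(s) think(s) you are in risk")
        current = rng.choice(remaining)            # assumption: uniform next stop
```

For example, `run_acom("m0", results, 3, 20, random.Random(0))` walks at most 20 machines of `results` and returns one of the two report strings quoted above.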
3) Implementation of ACOM:
In the workflow of ACOM, there are three important functions: CreateAnt(), ExchangeInformation(), and DecideDirection(). The details of these three functions are explained below.

Key Function 1: CreateAnt()

Ants help anomalous machines collect the security condition information of other machines from the network. In ACOM, anomalous machines create their own ants and send them into the network to collect this information. When a local host creates an ant, it needs to tell the ant three main things: goal, home, and (upper) limit.

From a global perspective, we need to set a threshold T that determines the upper bound of the number of anomalous machines in a safe scenario. That is, if ACOM on an anomalous machine obtains information about at least T anomalous machines from the WAN, it alerts the user to a potential high risk. If ACOM finds fewer than T anomalous machines in the WAN, it concludes that there is no cryptoworm attack and reports its judgement to the user. An ant's goal is related to the threshold T. It is defined as the number of anomalous machines that the ant needs to collect during its travel. Let the value of the goal be G:

$$G = T - P'. \tag{3}$$

In equation (3), P' is the number of anomalous machines known to the local host that created the ant, and it is treated as the pheromone level. We explain pheromone in more detail under the next function, ExchangeInformation(). The goal equals the difference between the threshold and the pheromone level because, before a specific ant is created, other ants may have travelled through this local host and deposited information about other anomalous machines observed during their traversals. By leveraging such information, the new ant does not need to start from scratch to reach the threshold. If the ant can find G anomalous machines in the WAN, we consider this machine probably infected by a cryptoworm. Otherwise, we report that this machine is probably not infected.
That is, ACOM reports our judgement according to the ant's detection results.

The second thing the local host needs to tell the created ant is the home address, which is the IP address of this local host. With this address, the ant can return and report the detection results when it finishes its work.

The system parameter limit stipulates that each ant can travel through at most N machines. We set this limit because we do not want the ant to traverse so many machines that it consumes a great amount of time and network resources.

Key Function 2: ExchangeInformation()

When an ant arrives at a new machine, it exchanges information with the current local host so that both the ant and the local host can enrich their knowledge of the security condition in the WAN. On the one hand, the ant gives the local host a list containing all anomalous machines it has collected so far as well as the count of anomalous machines, which is treated as pheromone. This mimics the behavior of ants in nature, which leave pheromone trails on their way to food sources. On the other hand, the local host tells the ant its local detection result: whether it is anomalous or not. So, after the information exchange, the ant may have collected one more record while the local host has received pheromone.

We also mimic the property that pheromone evaporates over time. We use this property because machines do not need to keep very old information, since the conditions of other machines in the WAN may change over time. In our model, the pheromone value remains unchanged for the first 10 seconds after it reaches the local host. Then, it decreases at a rate of 10% per second. Suppose the original amount of pheromone is p; the pheromone p' left on a machine after t seconds is

$$p'(t) = \left\lfloor 0.9^{\,t-10} \cdot p \right\rfloor, \quad t \ge 10. \tag{4}$$

Recalling the goal of each ant in function CreateAnt(), the value of p' obtained from equation (4) is used as the variable P' in equation (3) to calculate the goal of each ant when it is created.

After exchanging information, the ant can decide whether it should go back home and report its detection result. If it has not finished its work, the local host helps the ant decide its direction, that is, which machine to visit as the next stop.

Key Function 3: DecideDirection()

In nature, ants decide their directions depending on the strength of the pheromone trails ahead. In ACOM, the next destination of an ant is likewise decided according to the pheromone information left on the current local host. Since we want the ant to achieve its goal in a shorter time if anomalous machines exist in the WAN, the optimal direction of the ant is an anomalous machine, so that it can finish its work earlier.

To help an ant choose the next stop according to the pheromone information on the current local host, our strategy is to assign weights to the other machines that the current local host can contact. Since the local host holds the pheromone information left by all ants that have passed, it already knows some anomalous machines in the WAN. So, it assigns larger weights to these known anomalous machines, just as known shorter paths in nature have stronger pheromone trails. It assigns smaller weights to unknown machines, just as uncertain paths to food sources in nature have weaker pheromone trails. In our implementation, the larger weights are set to 2 and the smaller weights are set to 1, to simply distinguish known anomalous machines from unknown machines. The stops that an ant has previously passed are assigned weight 0 because the ant does not need to return to previous stops to gather information.

With the weights set, the current local host can calculate the probability of each machine being chosen as the next stop.
The anomalous machines, which have larger weights, are more likely to be selected as the destination of the ant. Suppose there are n machines in reach in total; the probability of a machine being chosen equals the weight of this machine over the total weight of all machines in reach:

$$\text{probability}(k) = \frac{\text{weight}(k)}{\sum_{i=1}^{n} \text{weight}(i)}, \quad 1 \le k \le n. \tag{5}$$

In this way, the next stop of the ant is chosen randomly, but not uniformly at random. The ant is more likely to be sent to an anomalous machine so that, if a cryptoworm attack exists, it can collect enough anomalous machines to prove a risky condition as soon as possible. Meanwhile, the ant may also go to an undiscovered machine, just like an ant in nature opening up a new path. Thus, the information collected by ants is intended to be representative enough to characterize the current situation in the network, while ACOM consumes very limited network resources and time.
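The quantities used by the three key functions can be sketched in Python as follows. The function names are hypothetical, and the evaporation rule assumes the reading given in the text: the pheromone is unchanged during the first 10 seconds and is then multiplied by 0.9 for each further second, rounded down. The weights 2, 1 and 0 are the ones stated above.

```python
import math
import random


def ant_goal(threshold: int, pheromone: int) -> int:
    """Equation (3): anomalous machines the new ant still has to find."""
    return threshold - pheromone


def evaporated_pheromone(p: int, t: float) -> int:
    """Equation (4): pheromone left t seconds after it was deposited."""
    if t < 10:
        return p                               # no evaporation in the first 10 seconds
    return math.floor(0.9 ** (t - 10) * p)     # 10% decay per second afterwards


def direction_probabilities(reachable, known_anomalous, visited):
    """Equation (5): weight 2 for known anomalous machines, 1 for unknown
    machines, 0 for machines the ant has already visited."""
    weights = {}
    for machine in reachable:
        if machine in visited:
            weights[machine] = 0
        elif machine in known_anomalous:
            weights[machine] = 2
        else:
            weights[machine] = 1
    total = sum(weights.values())
    if total == 0:
        return {}                              # nowhere left to go
    return {machine: w / total for machine, w in weights.items()}


def decide_direction(reachable, known_anomalous, visited, rng):
    """Draw the next stop according to the probabilities of equation (5)."""
    probabilities = direction_probabilities(reachable, known_anomalous, visited)
    if not probabilities:
        return None
    machines = list(probabilities)
    return rng.choices(machines, weights=[probabilities[m] for m in machines], k=1)[0]
```

As a worked case, with reachable machines a, b and c, where a is known to be anomalous and c has already been visited, the weights are 2, 1 and 0, so a is chosen with probability 2/3 and b with probability 1/3.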
B. Broadcasting Mechanism
While ACOM is designed for collecting security condition information from the WAN, another network-level detection method, called the Broadcasting Mechanism (BM), is designed specifically for detection in a LAN. It exhaustively queries all machines in the LAN and uses the wisdom of the crowd to help the user determine whether the local host is infected. This process does not consume too many network resources since the number of machines in a LAN is limited, yet it provides an overall view of the security condition in the LAN.

Once BM is launched on a local host, it broadcasts the anomalous condition of the local host to all other machines in the LAN, and it receives the same kind of information from other anomalous machines, so that it has a general idea of the number of anomalous machines in the LAN at this point. Then, it generates a comprehensive report telling the user the current security condition in the LAN. For example, if there are 100 machines in total and 80 of them are anomalous, BM generates a report saying that "80% machines in LAN also experience anomalies, so your computer is in high risk of cryptoworm attack." Based on the reports from ACOM and BM, users can judge for themselves whether to treat their computer as an infected machine.
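The report step of BM amounts to a simple share computation, sketched below in Python. The function name `bm_report` and the dictionary input are hypothetical, and the broadcast itself is not modeled. Because the text does not state above which share BM calls the risk "high", the sketch reports only the observed share and leaves the judgement to the user.

```python
def bm_report(lan_results: dict) -> str:
    """Summarize broadcast answers. `lan_results` maps every machine in the LAN
    (an assumption of this sketch) to True if it reported an anomaly."""
    total = len(lan_results)
    if total == 0:
        return "No machines in LAN answered"
    anomalous = sum(1 for flag in lan_results.values() if flag)
    percent = 100 * anomalous / total
    return f"{percent:.0f}% machines in LAN also experience anomalies"
```

With 100 machines of which 80 are anomalous, the function returns the "80% machines in LAN also experience anomalies" part of the example report.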
VI. EVALUATION OF NETWORK-ASSISTED APPROACHES
In this section, we describe how we established a test environment in which 100 Docker containers are used to simulate a real-world network scenario and a Linux ransomware sample called GonnaCry [8] is run on simulated infected machines to evaluate the performance of NAA.

Although NAA is an integrated approach, we compared the accuracy, message overhead, and latency of the local detection mechanism, ACOM, and BM to verify whether network-level detection can improve local detection and to verify the applicability of ACOM and BM in different scenarios. To distinguish the local detection mechanism used by ACOM and BM from the same mechanism when treated as an independent mechanism, we name the independent local detection mechanism Direct Report (DR). In the rest of this section, we compare DR, ACOM, and BM to obtain a comprehensive evaluation of the performance of each part of NAA. Note that DR directly uses the detection result of the local detection mechanism as the final result; ACOM is supported by the local detection mechanism and further uses the ACO algorithm to perform network-level detection to produce a final report; BM also uses the local detection mechanism as a baseline and then collects information from all machines in the simulated network to make a final report according to the number of anomalous machines.
A. Experiment Environment
Docker is a platform that provides resources and services for application development and testing. It uses OS-level virtualization to deliver software in packages called containers. A container can be regarded as a simplified virtual machine: each container has its own configuration files and libraries, but all containers are run by a single operating system kernel, which results in lower resource demands. Containers can communicate with each other through well-defined channels while remaining isolated from one another. We therefore use Docker containers rather than virtual machines to simulate a real-world network scenario, because containers offer the required functionality with less complexity. In our experiment, we established 100 containers, each equipped with DR, ACOM and BM, to simulate a network of 100 machines that can communicate with each other when needed. When testing a specific mechanism, we run it on all 100 containers 10 times and observe its average performance.

To simulate scenarios in which some specified machines are attacked by ransomware, we run a Linux ransomware sample called GonnaCry on these containers and then execute a detection mechanism on each container to test its performance in this situation. GonnaCry employs a hybrid scheme, used by most real-world ransomware nowadays, that combines asymmetric and symmetric encryption. To make the ransomware more secure from the attacker's perspective, GonnaCry contacts a remote server that keeps a pair of RSA keys for it, in addition to the ransomware's own RSA key pair, so that victims cannot obtain the decryption key directly from their local hosts. GonnaCry works as follows. The remote server generates a pair of RSA keys. The public key S_pub is hardcoded in GonnaCry, while the private key S_priv is kept on the remote server. When GonnaCry starts, it generates its own RSA key pair on the local host. The public key is called C_pub and the private key is called C_priv. It then uses the AES cipher to encrypt the local private key C_priv with the server's public key S_pub, and also uses the AES cipher to encrypt the target files with the local public key C_pub. In this case, anyone who wants to recover the encrypted files must first obtain the server's private key S_priv to recover the local private key C_priv, and then use C_priv to decrypt the files. Since S_priv is stored on the remote server, the victim has to pay the ransom to obtain this key. We deploy GonnaCry on the simulated infected machines because of its realism.

We simulated 11 different scenarios with increasing numbers of infected machines and decreasing numbers of safe machines, while the total number is always 100. In each scenario, we apply the three mechanisms to the containers and repeat the test 10 times to obtain reasonable average results for accuracy, message overhead and latency.

To determine the values of the limit N and the threshold T, we tried many different values under this 100-machine scenario. We finally chose N = 20 and T = 3 because this setting gives the best balance between accuracy and efficiency, where efficiency covers both time consumption and network resource consumption.
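The scenario sweep described in this subsection can be sketched as follows. This is a minimal illustration, not the authors' test harness. The step of 10 infected machines between scenarios is an assumption (the text only states that there are 11 scenarios with 100 machines in total), and `build_scenarios`, `average_metric` and the `run_once` callback are hypothetical names introduced here.

```python
from statistics import mean

TOTAL_MACHINES = 100      # containers in the simulated network (from the text)
RUNS_PER_SCENARIO = 10    # repetitions averaged per scenario (from the text)


def build_scenarios(total=TOTAL_MACHINES, step=10):
    """Return (infected, safe) pairs whose sum is always `total`.

    ASSUMPTION: infected counts go 0, 10, ..., 100. The text only says
    there are 11 scenarios with an increasing number of infected machines.
    """
    return [(infected, total - infected)
            for infected in range(0, total + 1, step)]


def average_metric(run_once, infected, safe, runs=RUNS_PER_SCENARIO):
    """Average one metric (accuracy, messages or latency) over repeated runs.

    `run_once(infected, safe, run_index)` is a hypothetical callback that
    performs a single test and returns the measured value.
    """
    return mean(run_once(infected, safe, i) for i in range(runs))
```

With the default arguments, `build_scenarios()` yields 11 pairs, from `(0, 100)` to `(100, 0)`, and each mechanism would be evaluated once per pair through `average_metric`.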
B. Accuracy
Accuracy is defined as the fraction of correctly reported cases among all cases, that is, accuracy = (true positives + true negatives) / (true positives + false positives + true negatives + false negatives). Since BM just reports the facts it observed using the wisdom of the crowd instead of reporting its own judgement, we only evaluate the accuracy of DR and ACOM. The result is shown in Figure 6, where the x-axis represents the number of infected machines and the y-axis indicates the accuracy of DR and ACOM.
Fig. 6. Accuracy Comparison of ACOM and DR
We can observe that ACOM has a greater advantage over DR when there are only a few infected machines. As the number of infected machines increases, ACOM no longer shows evident superiority, but it is still more accurate than DR in most cases. This result shows that network-level detection can help improve the accuracy of local detection. Together with the comprehensive report from BM, the user can make an even more precise decision about whether the local host is attacked by ransomware. If the ransomware is a cryptoworm, it can be detected at the very beginning when NAA is deployed, owing to the high accuracy of ACOM when only a few machines are infected.
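The accuracy definition used in this subsection translates directly into code. The sketch below is only an illustration of that formula. The function name `accuracy` and the counts in the usage note are hypothetical and are not taken from the experiments.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of correctly reported cases among all cases.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives reported by one mechanism.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("no cases to evaluate")
    return (tp + tn) / total
```

For example, with hypothetical counts of 8 true positives, 2 false positives, 88 true negatives and 2 false negatives over 100 machines, `accuracy(8, 2, 88, 2)` evaluates to 0.96.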
C. Message Overheads
Message overhead is another important factor, since we do not want ransomware detection to consume too many network resources. If a ransomware detection approach consumed more resources than the damage caused by the ransomware itself, it should not be put into practice. The three mechanisms we put forward clearly do not cause heavy resource consumption compared with the expensive extortion fee of ransomware, but we still want to measure their message overhead to see which mechanism is best in this respect.

We define message overhead as the extra messages produced by a ransomware detection approach. In ACOM, machines need to send and receive ants during the detection process. In BM, machines need to send and receive news about whether a specific machine is anomalous or not. So, both ACOM and BM produce extra messages when they are running. Figure 7 shows the message overhead of DR, BM and ACOM. The x-axis indicates the number of infected machines and the y-axis indicates the number of messages produced during each detection process.
Fig. 7. Message overhead of three mechanisms.
We can observe that DR performs best in terms of message overhead because it directly uses the results of the local detection mechanism, which does not produce any additional messages. BM produces more message overhead than ACOM in most cases. As the number of infected machines increases, the message overhead of BM grows drastically while that of ACOM grows only slightly. The reason is that BM requires each anomalous machine to send messages to all peers, while ACOM only allows each ant to go through at most 20 machines. Thus, applying BM to the LAN scenario is a reasonable arrangement from the perspective of message overhead, since the limited number of machines in a LAN keeps the message overhead of BM manageable.
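The scaling argument above can be made concrete with a rough counting model. This is a simplified sketch under stated assumptions, not a reproduction of Figure 7: it assumes that every anomalous machine in BM sends exactly one message to each other machine, that every anomalous machine in ACOM launches exactly one ant, and that each hop of an ant counts as one message. The function names are hypothetical.

```python
def bm_messages(anomalous, total):
    """Messages sent in BM: each anomalous machine contacts all peers.

    ASSUMPTION: one message per (anomalous machine, peer) pair.
    """
    return anomalous * (total - 1)


def acom_max_messages(anomalous, limit=20):
    """Upper bound on messages in ACOM.

    ASSUMPTION: one ant per anomalous machine and one message per hop,
    with each ant visiting at most `limit` machines (N = 20 in the text).
    """
    return anomalous * limit
```

Under these assumptions, 10 anomalous machines out of 100 give 990 messages for BM and at most 200 for ACOM, which matches the qualitative trend described above: the cost of BM depends on the network size, while the cost of ACOM is capped per ant.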
D. Latency
Latency is the time that each mechanism needs to complete its task. We calculated the average latency over all machines in each test. Figure 8 shows the latency of DR, BM and ACOM. The x-axis indicates the number of infected machines and the y-axis indicates the average number of seconds that each mechanism takes.
Fig. 8. Latency of three mechanisms.
As for DR, its latency is approximately 0 because it only performs local detection, which completes in a very short time. As for BM, no matter how many victims exist, the anomalous machines always broadcast a message and receive messages from other anomalous peers, and a report is then sent to the user depending on the number of anomalous machines in the LAN. All machines follow this procedure in parallel, which makes the runtime over all machines similar to the runtime of one randomly picked machine. So, the average latency of BM fluctuates only slightly as the number of infected machines increases. As for ACOM, each anomalous machine creates an ant that goes through at least 3 machines one after another. As the number of infected machines increases, more ants are created, which increases the average runtime. So, ACOM has the worst latency among the three mechanisms, while Direct Report (DR) has almost no latency. However, the high latency of ACOM does not cause extra damage to infected machines, because all suspicious tasks are suspended before ACOM is launched, so the ransomware cannot encrypt files while network-level detection is working.
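One way to read the hop bounds of an ant (at least 3 machines here, at most 20 in the message overhead discussion) is as a sequential walk with two stopping conditions. The sketch below is an assumption made for illustration, not the specification of ACOM: it supposes that an ant stops once it has seen T = 3 anomalous machines or has visited N = 20 machines, whichever comes first. The function name `ant_walk` and the boolean route encoding are hypothetical.

```python
def ant_walk(route, limit=20, threshold=3):
    """Simulate one ant visiting machines one after another.

    `route` is an iterable of booleans, True when the visited machine
    is anomalous. Returns (hops, anomalous_seen).

    ASSUMPTION: the ant stops after `threshold` anomalous machines have
    been seen or after `limit` machines have been visited.
    """
    hops = 0
    anomalous_seen = 0
    for is_anomalous in route:
        if hops >= limit or anomalous_seen >= threshold:
            break
        hops += 1
        if is_anomalous:
            anomalous_seen += 1
    return hops, anomalous_seen
```

Because the hops are sequential, the runtime of one ant in this model is proportional to the number of machines it visits, between 3 and 20 with the default parameters. This is consistent with ACOM being slower than the parallel broadcast in BM.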
E. Loss Assessment
In this section, we estimate the damage that ransomware can cause on a machine before it is detected by our ransomware detection approach NAA, that is, how many files can be encrypted before the ransomware is detected and terminated.

We can learn from the test results shown above that ACOM has a relatively long delay before reporting the diagnosis to the user. However, this does not result in additional damage, because all suspicious tasks are suspended before ACOM is launched and stay suspended until the user takes further action. So, the number of files encrypted during local detection is exactly the loss on this machine. Figure 9 shows the average number of encrypted files on a victim machine when NAA is applied. The x-axis is the number of infected machines in the LAN and the y-axis is the average number of encrypted files.
Fig. 9. Average number of encrypted files.
We can observe that no matter how many machines are infected, the number of encrypted files on each machine ranges from 15 to 30, which is an acceptable loss owing to the quick response of our local detection mechanism.

Based on our evaluation results concerning accuracy, message overhead, latency and loss assessment, we find that network-level detection can indeed help improve the accuracy of local detection. From the point of view of message overhead, ACOM is applicable to the WAN scenario while BM is applicable to the LAN scenario for network-level detection. Moreover, NAA performs especially well in detecting cryptoworm attacks, since our network-level detection can provide the user with a very accurate alert in the early stage of a cryptoworm attack.
II. CONCLUSION AND FUTURE WORK
In this paper, we propose a network-assisted approach called NAA for ransomware detection, which combines local detection and network-level detection. We first describe a local detection mechanism which uses three local features to judge whether the local host is anomalous. In network-level detection, we implement ACOM to efficiently collect information in the WAN scenario and put forward BM, which exhaustively queries all machines in a LAN. The network-level detection then uses the wisdom of the crowd to provide the user with a comprehensive report, so that the user can easily make a judgement based on the information we offer. To evaluate our approach, we use Docker to establish the experiment environment and GonnaCry to simulate ransomware attacks. The test results show that NAA is more accurate than local-only detection and is especially applicable to cryptoworm detection, while the loss of files during the working procedure of NAA is acceptable.

However, because few Linux ransomware samples are available, we only used GonnaCry to simulate ransomware attacks in our evaluation experiments. In the future, we will test the performance of NAA using other Linux ransomware samples, especially Linux cryptoworm samples, when they become accessible.