Shifu Hou | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Shifu Hou is active.

Explore More

Publication

Featured researches published by Shifu Hou.

web intelligence | 2016

Deep4MalDroid: A Deep Learning Framework for Android Malware Detection Based on Linux Kernel System Call Graphs

Shifu Hou; Aaron Saas; Lifei Chen; Yanfang Ye

With explosive growth of Android malware and due to its damage to smart phone users (e.g., stealing user credentials, resource abuse), Android malware detection is one of the cyber security topics that are of great interests. Currently, the most significant line of defense against Android malware is anti-malware software products, such as Norton, Lookout, and Comodo Mobile Security, which mainly use the signature-based method to recognize threats. However, malware attackers increasingly employ techniques such as repackaging and obfuscation to bypass signatures and defeat attempts to analyze their inner mechanisms. The increasing sophistication of Android malware calls for new defensive techniques that are harder to evade, and are capable of protecting users against novel threats. In this paper, we propose a novel dynamic analysis method named Component Traversal that can automatically execute the code routines of each given Android application (app) as completely as possible. Based on the extracted Linux kernel system calls, we further construct the weighted directed graphs and then apply a deep learning framework resting on the graph based features for newly unknown Android malware detection. A comprehensive experimental study on a real sample collection from Comodo Cloud Security Center is performed to compare various malware detection approaches. Promising experimental results demonstrate that our proposed method outperforms other alternative Android malware detection techniques. Our developed system Deep4MalDroid has also been integrated into a commercial Android anti-malware software.

knowledge discovery and data mining | 2017

HinDroid: An Intelligent Android Malware Detection System Based on Structured Heterogeneous Information Network

Shifu Hou; Yanfang Ye; Yangqiu Song; Melih Abdulhayoglu

With explosive growth of Android malware and due to the severity of its damages to smart phone users, the detection of Android malware has become increasingly important in cybersecurity. The increasing sophistication of Android malware calls for new defensive techniques that are capable against novel threats and harder to evade. In this paper, to detect Android malware, instead of using Application Programming Interface (API) calls only, we further analyze the different relationships between them and create higher-level semantics which require more effort for attackers to evade the detection. We represent the Android applications (apps), related APIs, and their rich relationships as a structured heterogeneous information network (HIN). Then we use a meta-path based approach to characterize the semantic relatedness of apps and APIs. We use each meta-path to formulate a similarity measure over Android apps, and aggregate different similarities using multi-kernel learning. Then each meta-path is automatically weighted by the learning algorithm to make predictions. To the best of our knowledge, this is the first work to use structured HIN for Android malware detection. Comprehensive experiments on real sample collections from Comodo Cloud Security Center are conducted to compare various malware detection approaches. Promising experimental results demonstrate that our developed system HinDroid outperforms other alternative Android malware detection techniques.

web age information management | 2016

DroidDelver: An Android Malware Detection System Using Deep Belief Network Based on API Call Blocks

Shifu Hou; Aaron Saas; Yanfang Ye; Lifei Chen

Because of the explosive growth of Android malware and due to the severity of its damages, the detection of Android malware has become an increasing important topic in cyber security. Currently, the major defense against Android malware is commercial mobile security products which mainly use signature-based method for detection. However, attackers can easily devise methods, such as obfuscation and repackaging, to evade the detection, which calls for new defensive techniques that are harder to evade. In this paper, resting on the analysis of Application Programming Interface (API) calls extracted from the smali files, we further categorize the API calls which belong to the some method in the smali code into a block. Based on the generated code blocks, we then apply a deep learning framework (i.e., Deep Belief Network) for newly unknown Android malware detection. Using a real sample collection from Comodo Cloud Security Center, a comprehensive experimental study is performed to compare various malware detection approaches. Promising experimental results demonstrate that DroidDelver which integrates our proposed method outperform other alternative Android malware detection techniques.

Knowledge and Information Systems | 2018

DeepAM: a heterogeneous deep learning framework for intelligent malware detection

Yanfang Ye; Lingwei Chen; Shifu Hou; William Hardy; Xin Li

With computers and the Internet being essential in everyday life, malware poses serious and evolving threats to their security, making the detection of malware of utmost concern. Accordingly, there have been many researches on intelligent malware detection by applying data mining and machine learning techniques. Though great results have been achieved with these methods, most of them are built on shallow learning architectures. Due to its superior ability in feature learning through multilayer deep architecture, deep learning is starting to be leveraged in industrial and academic research for different applications. In this paper, based on the Windows application programming interface calls extracted from the portable executable files, we study how a deep learning architecture can be designed for intelligent malware detection. We propose a heterogeneous deep learning framework composed of an AutoEncoder stacked up with multilayer restricted Boltzmann machines and a layer of associative memory to detect newly unknown malware. The proposed deep learning model performs as a greedy layer-wise training operation for unsupervised feature learning, followed by supervised parameter fine-tuning. Different from the existing works which only made use of the files with class labels (either malicious or benign) during the training phase, we utilize both labeled and unlabeled file samples to pre-train multiple layers in the heterogeneous deep learning framework from bottom to up for feature learning. A comprehensive experimental study on a real and large file collection from Comodo Cloud Security Center is performed to compare various malware detection approaches. Promising experimental results demonstrate that our proposed deep learning framework can further improve the overall performance in malware detection compared with traditional shallow learning methods, deep learning methods with homogeneous framework, and other existing anti-malware scanners. The proposed heterogeneous deep learning framework can also be readily applied to other malware detection tasks.

annual computer security applications conference | 2017

SecureDroid: Enhancing Security of Machine Learning-based Detection against Adversarial Android Malware Attacks

Lingwei Chen; Shifu Hou; Yanfang Ye

With smart phones being indispensable in peoples everyday life, Android malware has posed serious threats to their security, making its detection of utmost concern. To protect legitimate users from the evolving Android malware attacks, machine learning-based systems have been successfully deployed and offer unparalleled flexibility in automatic Android malware detection. In these systems, based on different feature representations, various kinds of classifiers are constructed to detect Android malware. Unfortunately, as classifiers become more widely deployed, the incentive for defeating them increases. In this paper, we explore the security of machine learning in Android malware detection on the basis of a learning-based classifier with the input of a set of features extracted from the Android applications (apps). We consider different importances of the features associated with their contributions to the classification problem as well as their manipulation costs, and present a novel feature selection method (named SecCLS) to make the classifier harder to be evaded. To improve the system security while not compromising the detection accuracy, we further propose an ensemble learning approach (named SecENS) by aggregating the individual classifiers that are constructed using our proposed feature selection method SecCLS. Accordingly, we develop a system called SecureDroid which integrates our proposed methods (i.e., SecCLS and SecENS) to enhance security of machine learning-based Android malware detection. Comprehensive experiments on the real sample collections from Comodo Cloud Security Center are conducted to validate the effectiveness of SecureDroid against adversarial Android malware attacks by comparisons with other alternative defense methods. Our proposed secure-learning paradigm can also be readily applied to other malware detection tasks.

ieee international conference semantic computing | 2015

Cluster-oriented ensemble classifiers for intelligent malware detection

Shifu Hou; Lifei Chen; Egemen Tas; Igor Demihovskiy; Yanfang Ye

With explosive growth of malware and due to its damage to computer security, malware detection is one of the cyber security topics that are of great interests. Many research efforts have been conducted on developing intelligent malware detection systems applying data mining techniques. Such techniques have successes in clustering or classifying particular sets of malware samples, but they have limitations that leave a large room for improvement. Specifically, based on the analysis of the file contents extracted from the file samples, existing researches apply only specific clustering or classification methods, but not integrate them together. Actually, the learning of class boundaries for malware detection between overlapping class patterns is a difficult problem. In this paper, resting on the analysis of Windows Application Programming Interface (API) calls extracted from the file samples, we develop the intelligent malware detection system using cluster-oriented ensemble classifiers. To the best of our knowledge, this is the first work of applying such method for malware detection. A comprehensive experimental study on a real and large data collection from Comodo Cloud Security Center is performed to compare various malware detection approaches. Promising experimental results demonstrate that the accuracy and efficiency of our proposed method outperform other alternate data mining based detection techniques.

knowledge discovery and data mining | 2018

Gotcha - Sly Malware!: Scorpion A Metagraph2vec Based Malware Detection System

Yujie Fan; Shifu Hou; Yiming Zhang; Yanfang Ye; Melih Abdulhayoglu

Due to its severe damages and threats to the security of the Internet and computing devices, malware detection has caught the attention of both anti-malware industry and researchers for decades. To combat the evolving malware attacks, in this paper, we first study how to utilize both content- and relation-based features to characterize sly malware; to model different types of entities (i.e., file, archive, machine, API, DLL ) and the rich semantic relationships among them (i.e., file-archive, file-machine, file-file, API-DLL, file-API relations), we then construct a structural heterogeneous information network (HIN) and present meta-graph based approach to depict the relatedness over files. To measure the relatedness over files on the constructed HIN, since malware detection is a cost-sensitive task, it calls for efficient methods to learn latent representations for HIN. To address this challenge, based on the built meta-graph schemes, we propose a new HIN embedding model metagraph2vec on the first attempt to learn the low-dimensional representations for the nodes in HIN, where both the HIN structures and semantics are maximally preserved for malware detection. A comprehensive experimental study on the real sample collections from Comodo Cloud Security Center is performed to compare various malware detection approaches. The promising experimental results demonstrate that our developed system Scorpion which integrate our proposed method outperforms other alternative malware detection techniques. The developed system has already been incorporated into the scanning tool of Comodo Antivirus product.

international joint conference on artificial intelligence | 2018

Make Evasion Harder: An Intelligent Android Malware Detection System

Shifu Hou; Yanfang Ye; Yangqiu Song; Melih Abdulhayoglu

To combat the evolving Android malware attacks, in this paper, instead of only using Application Programming Interface (API) calls, we further analyze the different relationships between them and create higher-level semantics which require more efforts for attackers to evade the detection. We represent the Android applications (apps), related APIs, and their rich relationships as a structured heterogeneous information network (HIN). Then we use a meta-path based approach to characterize the semantic relatedness of apps and APIs. We use each meta-path to formulate a similarity measure over Android apps, and aggregate different similarities using multi-kernel learning to make predictions. Promising experimental results based on real sample collections from Comodo Cloud Security Center demonstrate that our developed system HinDroid outperforms other alternative Android malware detection techniques.

advances in social networks analysis and mining | 2017

Deep Neural Networks for Automatic Android Malware Detection

Shifu Hou; Aaron Saas; Lingwei Chen; Yanfang Ye; Thirimachos Bourlai

Because of the explosive growth of Android malware and due to the severity of its damages, the detection of Android malware has become an increasing important topic in cybersecurity. Currently, the major defense against Android malware is commercial mobile security products which mainly use signature-based method for detection. However, attackers can easily devise methods, such as obfuscation and repackaging, to evade the detection, which calls for new defensive techniques that are harder to evade. In this paper, resting on the analysis of Application Programming Interface (API) calls extracted from the smali files, we further categorize the API calls which belong to the some method in the smali code into a block. Based on the generated API call blocks, we then explore deep neural networks (i.e., Deep Belief Network (DBN) and Stacked AutoEncoders (SAEs)) for newly unknown Android malware detection. Using a real sample collection from Comodo Cloud Security Center, a comprehensive experimental study is performed to compare various malware detection approaches. The experimental results demonstrate that (1) our proposed feature extraction method (i.e., using API call blocks) outperforms using API calls directly in Android malware detection; (2) DBN works better than SAEs in this application; and (3) the detection performance of deep neural networks is better than shallow learning architectures.

web age information management | 2017

An Adversarial Machine Learning Model Against Android Malware Evasion Attacks

Lingwei Chen; Shifu Hou; Yanfang Ye; Lifei Chen

With explosive growth of Android malware and due to its damage to smart phone users, the detection of Android malware is one of the cybersecurity topics that are of great interests. To protect legitimate users from the evolving Android malware attacks, systems using machine learning techniques have been successfully deployed and offer unparalleled flexibility in automatic Android malware detection. Unfortunately, as machine learning based classifiers become more widely deployed, the incentive for defeating them increases. In this paper, we explore the security of machine learning in Android malware detection on the basis of a learning-based classifier with the input of Application Programming Interface (API) calls extracted from the smali files. In particular, we consider different levels of the attackers’ capability and present a set of corresponding evasion attacks to thoroughly assess the security of the classifier. To effectively counter these evasion attacks, we then propose a robust secure-learning paradigm and show that it can improve system security against a wide class of evasion attacks. The proposed model can also be readily applied to other security tasks, such as anti-spam and fraud detection.

Explore More