Paolo Palumbo | Researchain

Publication

Featured research published by Paolo Palumbo.

Trust, Security and Privacy in Computing and Communications | 2015

Efficient Detection of Zero-day Android Malware Using Normalized Bernoulli Naive Bayes

Luiza Sayfullina; Emil Eirola; Dmitry Komashinsky; Paolo Palumbo; Yoan Miche; Amaury Lendasse; Juha Karhunen

According to a recent F-Secure report, 97% of mobile malware is designed for the Android platform, which has a growing number of consumers. To protect consumers from downloading malicious applications, there should be an effective malware classification system that can detect previously unseen viruses. In this paper, we present a scalable and highly accurate method for malware classification based on features extracted from Android application package (APK) files. We explored several techniques for tackling the independence assumptions in Naive Bayes and propose a Normalized Bernoulli Naive Bayes classifier that yields improved class separation and higher accuracy. We conducted a set of experiments on a large, up-to-date dataset of APKs provided by F-Secure and achieved a 0.1% false positive rate with an overall accuracy of 91%.
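For orientation, the following is a minimal sketch of a standard Bernoulli Naive Bayes classifier over binary features. It is not the Normalized variant proposed in the paper, whose normalization step the abstract does not specify. The function names, the toy feature vectors, and the Laplace smoothing constant `alpha` are assumptions made for illustration.

```python
import math

def train_bernoulli_nb(samples, labels, alpha=1.0):
    """Fit a standard Bernoulli Naive Bayes model on binary feature vectors.

    samples: list of equal-length lists of 0/1 values (hypothetical APK features).
    labels:  list of class labels, e.g. "malware" / "benign".
    alpha:   Laplace smoothing constant (an assumption, not from the paper).
    """
    n_features = len(samples[0])
    model = {}
    for c in sorted(set(labels)):
        rows = [x for x, y in zip(samples, labels) if y == c]
        prior = len(rows) / len(samples)
        # Smoothed probability that each feature is present in class c.
        probs = [
            (sum(row[j] for row in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (math.log(prior), probs)
    return model

def predict_bernoulli_nb(model, x):
    """Return the class with the highest log-posterior for feature vector x."""
    best_class, best_score = None, -math.inf
    for c, (log_prior, probs) in model.items():
        score = log_prior
        for xj, pj in zip(x, probs):
            # Present features contribute log p, absent ones log(1 - p).
            score += math.log(pj) if xj else math.log(1.0 - pj)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Toy data: three hypothetical binary features per application.
X = [[1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 1]]
y = ["malware", "malware", "benign", "benign"]
nb_model = train_bernoulli_nb(X, y)
```

Calling `predict_bernoulli_nb(nb_model, [1, 1, 0])` classifies a new vector. Working in log space avoids numerical underflow when the number of features is large.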

Computers & Security | 2017

A pragmatic Android malware detection procedure

Paolo Palumbo; Luiza Sayfullina; Dmitriy Komashinskiy; Emil Eirola; Juha Karhunen

The academic security research community has studied the Android malware detection problem extensively. Machine learning methods proposed in previous work typically achieve high reported detection performance on fixed datasets. Some of them also report reasonably fast prediction times. However, most of them are not suitable for real-world deployment because the requirements for malware detection go beyond these figures of merit. In this paper, we introduce several important requirements for deploying Android malware detection systems in the real world. One such requirement is that candidate approaches should be tested against a stream of continuously evolving data. Such streams represent the continuous flow of unknown file objects received for categorization, and provide a more reliable and realistic estimate of detection performance once deployed in a production environment. As a case study, we designed and implemented an ensemble approach for automatic Android malware detection that meets the real-world requirements we identified. The atomic Naive Bayes classifiers used as inputs for the Support Vector Machine ensemble are based on different APK feature categories, providing high speed and additional reliability against attackers due to diversification. Our case study with several malware families showed that different families are detected by different atomic classifiers. To the best of our knowledge, our work contains the first publicly available results generated against evolving data streams of nearly 1 million samples with a model trained over a massive sample set of 120,000 samples.
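The requirement of testing against an evolving data stream can be sketched as scoring a fixed, already trained classifier on consecutive time-ordered batches, so that any drift in detection performance becomes visible. This is only an illustration under stated assumptions: the function name, the batch layout, and the toy predictor are hypothetical and are not the paper's evaluation harness.

```python
def evaluate_on_stream(predict, stream, batch_size):
    """Score a fixed classifier on time-ordered batches of labelled samples.

    predict:    callable mapping a feature vector to True (malware) or False.
    stream:     list of (features, is_malware) pairs, oldest first.
    batch_size: number of consecutive samples per reporting window.
    Returns one (true_positive_rate, false_positive_rate) pair per batch;
    a rate is None when the batch has no samples of the relevant class.
    """
    results = []
    for start in range(0, len(stream), batch_size):
        batch = stream[start:start + batch_size]
        tp = fp = pos = neg = 0
        for features, is_malware in batch:
            flagged = bool(predict(features))
            if is_malware:
                pos += 1
                tp += flagged
            else:
                neg += 1
                fp += flagged
        tpr = tp / pos if pos else None
        fpr = fp / neg if neg else None
        results.append((tpr, fpr))
    return results

# Toy predictor and stream (both hypothetical): flag a sample when its
# first feature is set. The second batch mimics data the model has not seen.
toy_predict = lambda features: features[0] == 1
toy_stream = [([1], True), ([0], False), ([0], True), ([1], False)]
stream_rates = evaluate_on_stream(toy_predict, toy_stream, batch_size=2)
```

Reporting rates per batch rather than one aggregate figure is the point of the exercise: a model that looks strong on a fixed dataset may degrade on later batches.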

International Conference on Machine Learning and Applications | 2016

Android Malware Detection: Building Useful Representations

Luiza Sayfullina; Emil Eirola; Dmitry Komashinsky; Paolo Palumbo; Juha Karhunen

The problem of proactively detecting Android malware has proven to be a challenging one. The challenges stem from a variety of issues, but recent literature has shown that this task is hard to solve with high accuracy when only a restricted set of features, such as permissions or similar fixed sets of features, is used. The opposite approach of including all available features is also problematic, as it causes the feature space to grow beyond a reasonable size. In this paper, we focus on finding an efficient way to select a representative feature space while preserving its discriminative power on unseen data. We go beyond traditional approaches like Principal Component Analysis, which is too heavy for large-scale problems with millions of features. In particular, we show that many feature groups that can be extracted from Android application packages, such as features extracted from the manifest file or strings extracted from the Dalvik Executable (DEX), should be filtered and used in classification separately. Our proposed dimensionality reduction scheme is applied to each group separately and consists of raw string preprocessing, feature selection via log-odds, and finally random projections. With the size of the feature space growing exponentially as a function of the training set size, our approach decreases the size of the feature space by several orders of magnitude, which in turn makes accurate classification possible in a real-world scenario. After reducing the dimensionality, we use the feature groups in a lightweight ensemble of logistic classifiers. We evaluated the proposed classification scheme on real malware data provided by an antivirus vendor and achieved a state-of-the-art 88.24% true positive rate and a reasonably low 0.04% false positive rate with a significantly compressed feature space on a balanced test set of 10,000 samples.
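The two later stages of the reduction scheme, log-odds feature selection followed by random projection, can be sketched as follows for a single feature group. The abstract does not give the exact formulas, so this sketch assumes a smoothed log-odds ratio of feature presence between the two classes and a random sign projection matrix. All function names, parameters, and toy data are hypothetical.

```python
import math
import random

def log_odds_scores(samples, labels, alpha=1.0):
    """Smoothed log-odds ratio of each binary feature: malware vs. benign."""
    mal = [x for x, y in zip(samples, labels) if y]
    ben = [x for x, y in zip(samples, labels) if not y]
    scores = []
    for j in range(len(samples[0])):
        p_mal = (sum(x[j] for x in mal) + alpha) / (len(mal) + 2 * alpha)
        p_ben = (sum(x[j] for x in ben) + alpha) / (len(ben) + 2 * alpha)
        scores.append(math.log(p_mal / (1 - p_mal)) - math.log(p_ben / (1 - p_ben)))
    return scores

def select_top_features(scores, k):
    """Indices of the k features with the largest absolute log-odds score."""
    ranked = sorted(range(len(scores)), key=lambda j: abs(scores[j]), reverse=True)
    return sorted(ranked[:k])

def random_projection(vectors, out_dim, seed=0):
    """Project vectors to out_dim dimensions with a random +/-1 sign matrix."""
    rng = random.Random(seed)
    in_dim = len(vectors[0])
    matrix = [[rng.choice((-1.0, 1.0)) for _ in range(in_dim)] for _ in range(out_dim)]
    scale = 1.0 / math.sqrt(out_dim)  # keeps expected squared norms unchanged
    return [
        [scale * sum(m * v for m, v in zip(row, vec)) for row in matrix]
        for vec in vectors
    ]

# Toy feature group: four binary features, label True means malware.
group = [[1, 0, 1, 0], [1, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 1]]
is_malware = [True, True, False, False]

scores = log_odds_scores(group, is_malware)
keep = select_top_features(scores, k=2)
reduced = [[x[j] for j in keep] for x in group]
projected = random_projection(reduced, out_dim=3)
```

In a full pipeline, each feature group would be reduced this way independently and the projected vectors passed to that group's classifier. The selection step discards features with no class signal before the projection compresses what remains.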

Archive | 2013