Babak Rahbarinia | Researchain

Publication

Featured researches published by Babak Rahbarinia.

workshop on information security applications | 2014

PeerRush: Mining for unwanted P2P traffic

Babak Rahbarinia; Roberto Perdisci; Andrea Lanzi

In this paper we present PeerRush, a novel system for the identification of unwanted P2P traffic. Unlike most previous work, PeerRush goes beyond P2P traffic detection, and can accurately categorize the detected P2P traffic and attribute it to specific P2P applications, including malicious applications such as P2P botnets. PeerRush achieves these results without the need of deep packet inspection, and can accurately identify applications that use encrypted P2P traffic. We implemented a prototype version of PeerRush and performed an extensive evaluation of the system over a variety of P2P traffic datasets. Our results show that we can detect all the considered types of P2P traffic with up to 99.5% true positives and 0.1% false positives. Furthermore, PeerRush can attribute the P2P traffic to a specific P2P application with a misclassification rate of 0.68% or less.

dependable systems and networks | 2015

Segugio: Efficient Behavior-Based Tracking of Malware-Control Domains in Large ISP Networks

Babak Rahbarinia; Roberto Perdisci; Manos Antonakakis

In this paper, we propose Segugio, a novel defense system that allows for efficiently tracking the occurrence of new malware-control domain names in very large ISP networks. Segugio passively monitors the DNS traffic to build a machine-domain bipartite graph representing who is querying what. After labelling nodes in this query behavior graph that are known to be either benign or malware-related, we propose a novel approach to accurately detect previously unknown malware-control domains. We implemented a proof-of-concept version of Segugio and deployed it in large ISP networks that serve millions of users. Our experimental results show that Segugio can track the occurrence of new malware-control domains with up to 94% true positives (TPs) at less than 0.1% false positives (FPs). In addition, we provide the following results: (1) we show that Segugio can also detect control domains related to new, previously unseen malware families, with 85% TPs at 0.1% FPs, (2) Segugio's detection models learned on traffic from a given ISP network can be deployed into a different ISP network and still achieve very high detection accuracy, (3) new malware-control domains can be detected days or even weeks before they appear in a large commercial domain name blacklist, and (4) we show that Segugio clearly outperforms Notos, a previously proposed domain name reputation system.
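The machine-domain bipartite graph described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the log format, the function names, and the "fraction of querying machines that also query a known malware domain" signal are hypothetical and are not taken from the paper, which does not list its features in the abstract.

```python
from collections import defaultdict


def build_query_graph(dns_log):
    """Build a machine-domain bipartite graph from (machine_id, domain) pairs."""
    machines_to_domains = defaultdict(set)
    domains_to_machines = defaultdict(set)
    for machine, domain in dns_log:
        machines_to_domains[machine].add(domain)
        domains_to_machines[domain].add(machine)
    return machines_to_domains, domains_to_machines


def infected_fraction(domain, domains_to_machines, machines_to_domains, known_malware):
    """Fraction of machines querying `domain` that also query a known malware domain.

    Hypothetical signal for illustration only, not the paper's feature set.
    """
    machines = domains_to_machines.get(domain, set())
    if not machines:
        return 0.0
    infected = sum(
        1 for m in machines
        if (machines_to_domains[m] - {domain}) & known_malware
    )
    return infected / len(machines)


# Toy DNS log: which machine queried which domain (all names are made up).
dns_log = [
    ("m1", "bad.example"), ("m1", "new.example"),
    ("m2", "bad.example"), ("m2", "new.example"),
    ("m3", "good.example"), ("m3", "new.example"),
    ("m4", "good.example"),
]
m2d, d2m = build_query_graph(dns_log)
score = infected_fraction("new.example", d2m, m2d, {"bad.example"})
```

In this toy log, two of the three machines that query `new.example` also query the labeled malware domain, so the unknown domain receives a high score, while `good.example` receives zero.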

european symposium on research in computer security | 2013

Measuring and Detecting Malware Downloads in Live Network Traffic

Phani Vadrevu; Babak Rahbarinia; Roberto Perdisci; Manos Antonakakis

In this paper, we present AMICO, a novel system for measuring and detecting malware downloads in live web traffic. AMICO learns to distinguish between malware and benign file downloads from the download behavior of the network users themselves. Given a labeled dataset of past benign and malware file downloads, AMICO learns a provenance classifier that can accurately detect future malware downloads based on information about where the downloads originated from. The main intuition is that to avoid current countermeasures, malware campaigns need to use an “agile” distribution infrastructure, e.g., frequently changing the domains and/or IPs of the malware download servers. We engineer a number of statistical features that aim to capture these fundamental characteristics of malware distribution campaigns.
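The "agile" distribution intuition can be illustrated with a small sketch that counts how many distinct domains and server IPs have served the same file. The record format, the function name `provenance_features`, and the two counts are assumptions chosen for illustration; the abstract does not enumerate AMICO's actual features.

```python
from collections import defaultdict


def provenance_features(downloads):
    """Count distinct serving domains and IPs per file.

    downloads: iterable of (file_hash, domain, server_ip) tuples
    (a hypothetical record format, not the paper's).
    """
    domains = defaultdict(set)
    ips = defaultdict(set)
    for file_hash, domain, server_ip in downloads:
        domains[file_hash].add(domain)
        ips[file_hash].add(server_ip)
    return {
        h: {"n_domains": len(domains[h]), "n_ips": len(ips[h])}
        for h in domains
    }


# Toy download log using documentation-reserved names and addresses.
downloads = [
    ("h1", "a.example", "192.0.2.1"),
    ("h1", "b.example", "192.0.2.2"),
    ("h1", "c.example", "192.0.2.2"),
    ("h2", "cdn.example", "198.51.100.7"),
    ("h2", "cdn.example", "198.51.100.7"),
]
features = provenance_features(downloads)
```

A file served from many short-lived domains (like `h1` here) would look more "agile" than one served repeatedly from a single stable host; a classifier trained on labeled downloads could then consume such counts as inputs.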

international conference on detection of intrusions and malware and vulnerability assessment | 2013

PeerRush: mining for unwanted p2p traffic

Babak Rahbarinia; Roberto Perdisci; Andrea Lanzi

In this paper we present PeerRush, a novel system for the identification of unwanted P2P traffic. Unlike most previous work, PeerRush goes beyond P2P traffic detection, and can accurately categorize the detected P2P traffic and attribute it to specific P2P applications, including malicious applications such as P2P botnets. PeerRush achieves these results without the need of deep packet inspection, and can accurately identify applications that use encrypted P2P traffic. We implemented a prototype version of PeerRush and performed an extensive evaluation of the system over a variety of P2P traffic datasets. Our results show that we can detect all the considered types of P2P traffic with up to 99.5% true positives and 0.1% false positives. Furthermore, PeerRush can attribute the P2P traffic to a specific P2P application with a misclassification rate of 0.68% or less.

ACM Transactions on Privacy and Security (TOPS) | 2016

Efficient and Accurate Behavior-Based Tracking of Malware-Control Domains in Large ISP Networks

Babak Rahbarinia; Roberto Perdisci; Manos Antonakakis

In this article, we propose Segugio, a novel defense system that allows for efficiently tracking the occurrence of new malware-control domain names in very large ISP networks. Segugio passively monitors the DNS traffic to build a machine-domain bipartite graph representing who is querying what. After labeling nodes in this query behavior graph that are known to be either benign or malware-related, we propose a novel approach to accurately detect previously unknown malware-control domains. We implemented a proof-of-concept version of Segugio and deployed it in large ISP networks that serve millions of users. Our experimental results show that Segugio can track the occurrence of new malware-control domains with up to 94% true positives (TPs) at less than 0.1% false positives (FPs). In addition, we provide the following results: (1) we show that Segugio can also detect control domains related to new, previously unseen malware families, with 85% TPs at 0.1% FPs; (2) Segugio’s detection models learned on traffic from a given ISP network can be deployed into a different ISP network and still achieve very high detection accuracy; (3) new malware-control domains can be detected days or even weeks before they appear in a large commercial domain-name blacklist; (4) Segugio can be used to detect previously unknown malware-infected machines in ISP networks; and (5) we show that Segugio clearly outperforms domain-reputation systems based on Belief Propagation.

machine learning and data mining in pattern recognition | 2018

An Efficient Two-Layer Classification Approach for Hyperspectral Images

Semih Dinc; Babak Rahbarinia; Luis Cueva-Parra

Unlike regular RGB images, which store only red, green, and blue band values for each pixel, hyperspectral images are rich with information from a large portion of the spectrum, storing numerous spectral band values within each pixel. An efficient, two-layer region detection framework for hyperspectral images is introduced in this paper. The proposed framework aims to automatically identify various regions within a hyperspectral image by providing a classification for each pixel of the image, associating each pixel with a distinct region. The first layer of the system includes two new classifiers, and is responsible for generating probability scores as the “new feature set” of the original dataset. The second layer works as an ensemble classifier and combines the newly generated features to estimate the region of the sample. Experimental results show that the proposed system can produce accurate classifications with an average area under the ROC curve of 0.98 over all regions. This result indicates the higher accuracy of the proposed system compared to some other well-known classifiers.
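The two-layer idea (first-layer classifiers emit per-region scores that become a new feature set, and a second layer combines them) can be sketched as follows. This is not the paper's method: the two first-layer scorers (distance-based and spectral-angle-based), the fixed averaging used as the second layer, and the centroid values are all stand-ins assumed for illustration, and pixel spectra are assumed to be non-negative and non-zero.

```python
import math


def centroid_scores(pixel, centroids):
    """Layer 1, scorer A: softmax over negative Euclidean distances to centroids."""
    exps = [math.exp(-math.dist(pixel, c)) for c in centroids]
    total = sum(exps)
    return [e / total for e in exps]


def angle_scores(pixel, centroids):
    """Layer 1, scorer B: normalized cosine similarity (spectral angle) to centroids."""
    sims = []
    for c in centroids:
        dot = sum(p * q for p, q in zip(pixel, c))
        sims.append(dot / (math.hypot(*pixel) * math.hypot(*c)))
    total = sum(sims)
    return [s / total for s in sims]


def classify(pixel, centroids):
    """Layer 2: combine the layer-1 scores (the new feature set) and pick a region.

    A fixed average stands in for a trained ensemble classifier.
    """
    k = len(centroids)
    features = centroid_scores(pixel, centroids) + angle_scores(pixel, centroids)
    combined = [(features[i] + features[k + i]) / 2 for i in range(k)]
    return max(range(k), key=combined.__getitem__)


# Two hypothetical region centroids over four spectral bands.
centroids = [(0.1, 0.2, 0.9, 0.8), (0.8, 0.7, 0.2, 0.1)]
label = classify((0.15, 0.25, 0.85, 0.75), centroids)
```

The pixel above lies close to the first centroid under both scorers, so the combined score assigns it to region 0. In a real system the second layer would be trained on the layer-1 outputs rather than fixed in advance.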

computer and communications security | 2018

Augmenting Telephone Spam Blacklists by Mining Large CDR Datasets

Jienan Liu; Babak Rahbarinia; Roberto Perdisci; Haitao Du; Li Su

Telephone spam has become an increasingly prevalent problem in many countries all over the world. For example, the US Federal Trade Commission's (FTC) National Do Not Call Registry's number of cumulative complaints of spam/scam calls reached 30.9 million submissions in 2016. Naturally, telephone carriers can play an important role in the fight against spam. However, due to the extremely large volume of calls that transit across large carrier networks, it is challenging to mine their vast amounts of call detail records (CDRs) to accurately detect and block spam phone calls. This is because CDRs only contain high-level metadata (e.g., source and destination numbers, call start time, call duration, etc.) related to each phone call. In addition, ground truth about both benign and spam-related phone numbers is often very scarce (only a tiny fraction of all phone numbers can be labeled). More importantly, telephone carriers are extremely sensitive to false positives, as they need to avoid blocking any non-spam calls, making the detection of spam-related numbers even more challenging. In this paper, we present a novel detection system that aims to discover telephone numbers involved in spam campaigns. Given a small seed of known spam phone numbers, our system uses a combination of unsupervised and supervised machine learning methods to mine new, previously unknown spam numbers from large datasets of call detail records (CDRs). Our objective is not to detect all possible spam phone calls crossing a carrier's network, but rather to expand the list of known spam numbers while aiming for zero false positives, so that the newly discovered numbers may be added to a phone blacklist, for example. To evaluate our system, we have conducted experiments over a large dataset of real-world CDRs provided by a leading telephony provider in China, while tuning the system to produce no false positives. The experimental results show that our system is able to greatly expand on the initial seed of known spam numbers by up to about 250%.

dependable systems and networks | 2017

Exploring the Long Tail of (Malicious) Software Downloads

Babak Rahbarinia; Marco Balduzzi; Roberto Perdisci

In this paper, we present a large-scale study of global trends in software download events, with an analysis of both benign and malicious downloads, and a categorization of events for which no ground truth is currently available. Our measurement study is based on a unique, real-world dataset collected at Trend Micro containing more than 3 million in-the-wild web-based software download events involving hundreds of thousands of Internet machines, collected over a period of seven months. Somewhat surprisingly, we found that despite our best efforts and the use of multiple sources of ground truth, more than 83% of all downloaded software files remain unknown, i.e., cannot be classified as benign or malicious, even two years after they were first observed. If we consider the number of machines that have downloaded at least one unknown file, we find that more than 69% of the entire machine/user population downloaded one or more unknown software files. Because the accuracy of malware detection systems reported in the academic literature is typically assessed only over software files that can be labeled, our findings raise concerns about their actual effectiveness in large-scale real-world deployments, and about their ability to defend the majority of Internet machines from infection. To better understand what these unknown software files may be, we perform a detailed analysis of their properties. We then explore whether it is possible to extend the labeling of software downloads by building a rule-based system that automatically learns from the available ground truth and can be used to identify many more benign and malicious files with very high confidence. This allows us to greatly expand the number of software files that can be labeled with high confidence, thus providing results that can benefit the evaluation of future malware detection systems.

computer and communications security | 2016