Nir Nissim
Ben-Gurion University of the Negev
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Nir Nissim.
intelligence and security informatics | 2008
Robert Moskovitch; Dima Stopel; Clint Feher; Nir Nissim; Yuval Elovici
Todaypsilas signature-based anti-viruses are very accurate, but are limited in detecting new malicious code. Currently, dozens of new malicious codes are created every day, and this number is expected to increase in the coming years. Recently, classification algorithms were used successfully for the detection of unknown malicious code. These studies used a test collection with a limited size where the same malicious-benign-file ratio in both the training and test sets, which does not reflect real-life conditions. In this paper we present a methodology for the detection of unknown malicious code, based on text categorization concepts. We performed an extensive evaluation using a test collection that contains more than 30,000 malicious and benign files, in which we investigated the imbalance problem. In real-life scenarios, the malicious file content is expected to be low, about 10% of the total files. For practical purposes, it is unclear as to what the corresponding percentage in the training set should be. Our results indicate that greater than 95% accuracy can be achieved through the use of a training set that contains below 20% malicious file content.
Journal in Computer Virology | 2009
Robert Moskovitch; Dima Stopel; Clint Feher; Nir Nissim; Nathalie Japkowicz; Yuval Elovici
The recent growth in network usage has motivated the creation of new malicious code for various purposes. Today’s signature-based antiviruses are very accurate for known malicious code, but can not detect new malicious code. Recently, classification algorithms were used successfully for the detection of unknown malicious code. But, these studies involved a test collection with a limited size and the same malicious: benign file ratio in both the training and test sets, a situation which does not reflect real-life conditions. We present a methodology for the detection of unknown malicious code, which examines concepts from text categorization, based on n-grams extraction from the binary code and feature selection. We performed an extensive evaluation, consisting of a test collection of more than 30,000 files, in which we investigated the class imbalance problem. In real-life scenarios, the malicious file content is expected to be low, about 10% of the total files. For practical purposes, it is unclear as to what the corresponding percentage in the training set should be. Our results indicate that greater than 95% accuracy can be achieved through the use of a training set that has a malicious file content of less than 33.3%.
knowledge discovery and data mining | 2009
Robert Moskovitch; Nir Nissim; Yuval Elovici
The recent growth in network usage has motivated the creation of new malicious code for various purposes, including economic and other malicious purposes. Currently, dozens of new malicious codes are created every day and this number is expected to increase in the coming years. Todays signature-based anti-viruses and heuristic-based methods are accurate, but cannot detect new malicious code. Recently, classification algorithms were used successfully for the detection of malicious code. We present a complete methodology for the detection of unknown malicious code, inspired by text categorization concepts. However, this approach can be exploited further to achieve a more accurate and efficient acquisition method of unknown malicious files. We use an Active-Learning framework that enables the selection of the unknown files for fast acquisition. We performed an extensive evaluation of a test collection consisting of more than 30,000 files. We present a rigorous evaluation setup, consisting of real-life scenarios, in which the malicious file content is expected to be low, at about 10% of the files in the stream. We define specific evaluation measures based on the known precision and recall measures, which show the accuracy of the acquisition process and the improvement in the classifier resulting from the efficient acquisition process.
Computers & Security | 2015
Nir Nissim; Aviad Cohen; Chanan Glezer; Yuval Elovici
Initial penetration is one of the first steps of an Advanced Persistent Threat (APT) attack, and it is considered one of the most significant means of initiating cyber-attacks aimed at organizations. Such an attack usually results in the loss of sensitive and confidential information. Because email communication is an integral part of daily business operations, APT attackers frequently leverage email as an attack vector for initial penetration of the targeted organization. Emails allow the attacker to deliver malicious attachments or links to malicious websites. Attackers usually use social engineering in order to make the recipient open the malicious email, open the attachment, or press a link. Existing defensive solutions within organizations prevent executables from entering organizational networks via emails, therefore, recent APT attacks tend to attach non-executable files (PDF, MS Office etc.) which are widely used in organizations and mistakenly considered less suspicious or malicious. This article surveys existing academic methods for the detection of malicious PDF files. The article outlines an Active Learning framework and highlights the correlation between structural incompatibility of PDF files and their likelihood of maliciousness. Finally, we provide comparisons, insights and conclusions, as well as avenues for future research in order to enhance the detection of malicious PDFs.
KI '07 Proceedings of the 30th annual German conference on Advances in Artificial Intelligence | 2007
Robert Moskovitch; Nir Nissim; Dima Stopel; Clint Feher; Roman Englert; Yuval Elovici
Detecting unknown worms is a challenging task. Extant solutions, such as anti-virus tools, rely mainly on prior explicit knowledge of specific worm signatures. As a result, after the appearance of a new worm on the Web there is a significant delay until an update carrying the worms signature is distributed to anti-virus tools. We propose an innovative technique for detecting the presence of an unknown worm, based on the computer operating system measurements. We monitored 323 computer features and reduced them to 20 features through feature selection. Support vector machines were applied using 3 kernel functions. In addition we used active learning as a selective sampling method to increase the performance of the classifier, exceeding above 90% mean accuracy, and for specific unknown worms 94% accuracy.
intelligence and security informatics | 2007
Robert Moskovitch; Nir Nissim; Yuval Elovici
Detection of known malicious code is commonly performed by anti-virus tools. These tools detect the known malicious code using signature detection methods. Each time a new malicious code is found the anti-virus vendors create a new signature and update their clients. During the period between the appearance of a new unknown malicious code and the update of the signature base of the anti-virus clients, millions of computers might be infected. In order to cope with this problem, new solutions must be found for detecting unknown malicious code at the entrance of a clients computer. We presented here the use of active learning in the acquisition of unknown malicious code. Preliminary Results are encouraging. We are currently in the process of creating a wide test collection of more than 30,000 benign and malicious files to evaluate several active learning criterions.
intelligence and security informatics | 2014
Nir Nissim; Aviad Cohen; Robert Moskovitch; Assaf Shabtai; Mattan Edry; Oren BarAd; Yuval Elovici
Email communication carrying malicious attachments or links is often used as an attack vector for initial penetration of the targeted organization. Existing defense solutions prevent executables from entering organizational networks via emails, therefore recent attacks tend to use non-executable files such as PDF. Machine learning algorithms have recently been applied for detecting malicious PDF files. These techniques, however, lack an essential element - they cannot be updated daily. In this study we present ALPD, a framework that is based on active learning methods that are specially designed to efficiently assist anti-virus vendors to focus their analytical efforts. This is done by identifying and acquiring new PDF files that are most likely malicious, as well as informative benign PDF documents. These files are used for retraining and enhancing the knowledge stores. Evaluation results show that in the final day of the experiment, Combination, one of our AL methods, outperformed all the others, enriching the anti-viruss signature repository with almost seven times more new PDF malware while also improving the detection models performance on a daily basis.
artificial intelligence in medicine in europe | 2015
Nir Nissim; Mary Regina Boland; Robert Moskovitch; Nicholas P. Tatonetti; Yuval Elovici; Yuval Shahar; George Hripcsak
Understanding condition severity, as extracted from Electronic Health Records (EHRs), is important for many public health purposes. Methods requiring physicians to annotate condition severity are time-consuming and costly. Previously, a passive learning algorithm called CAESAR was developed to capture severity in EHRs. This approach required physicians to label conditions manually, an exhaustive process. We developed a framework that uses two Active Learning (AL) methods (Exploitation and Combination_XA) to decrease manual labeling efforts by selecting only the most informative conditions for training. We call our approach CAESAR-Active Learning Enhancement (CAESAR-ALE). As compared to passive methods,CAESAR-ALE’s first AL method, Exploitation, reduced labeling efforts by 64% and achieved an equivalent true positive rate, while CAESAR-ALE’s second AL method, Combination_XA, reduced labeling efforts by 48% and achieved equivalent accuracy. In addition, both these AL methods outperformed the traditional AL method (SVM-Margin). These results demonstrate the potential of AL methods for decreasing the labeling efforts of medical experts, while achieving greater accuracy and lower costs.
Knowledge and Information Systems | 2016
Nir Nissim; Robert Moskovitch; Oren BarAd; Lior Rokach; Yuval Elovici
Many new unknown malwares aimed at compromising smartphones are created constantly. These widely used smartphones are very dependent on anti-virus solutions due to their limited resources. To update the anti-virus signature repository, anti-virus vendors must deal with vast quantities of new applications daily in order to identify new unknown malwares. Machine learning algorithms have been used to address this task, yet they must also be efficiently updated on a daily basis. To improve detection and updatability, we introduce a new framework, “ALDROID” and active learning (AL) methods on which ALDROID is based. Our methods are aimed at selecting only new informative applications (benign and especially malicious), thus reducing the labeling efforts of security experts, and enable a frequent and efficient process of enhancing the framework’s detection model and Android’s anti-virus software. Results indicate that our AL methods outperformed other solutions including the existing AL method and heuristic engine. Our AL methods acquired the largest number and percentage of new malwares, while preserving the detection models’ detection capabilities (high TPR and low FPR rates). Specifically, our methods acquired more than double the amount of new malwares acquired by the heuristic engine and 6.5 times more malwares than the existing AL method.
IEEE Transactions on Information Forensics and Security | 2017
Nir Nissim; Aviad Cohen; Yuval Elovici
Attackers increasingly take advantage of innocent users who tend to casually open email messages assumed to be benign, carrying malicious documents. Recent targeted attacks aimed at organizations utilize the new Microsoft Word documents (*.docx). Anti-virus software fails to detect new unknown malicious files, including malicious docx files. In this paper, we present ALDOCX, a framework aimed at accurate detection of new unknown malicious docx files that also efficiently enhances the framework’s detection capabilities over time. Detection relies upon our new structural feature extraction methodology (SFEM), which is performed statically using meta-features extracted from docx files. Using machine-learning algorithms with SFEM, we created a detection model that successfully detects new unknown malicious docx files. In addition, because it is crucial to maintain the detection model’s updatability and incorporate new malicious files created daily, ALDOCX integrates our active-learning (AL) methods, which are designed to efficiently assist anti-virus vendors by better focusing their experts’ analytical efforts and enhance detection capability. ALDOCX identifies and acquires new docx files that are most likely malicious, as well as informative benign files. These files are used for enhancing the knowledge stores of both the detection model and the anti-virus software. The evaluation results show that by using ALDOCX and SFEM, we achieved a high detection rate of malicious docx files (94.44% TPR) compared with the anti-virus software (85.9% TPR)—with very low FPR rates (0.19%). ALDOCX’s AL methods used only 14% of the labeled docx files, which led to a reduction of 95.5% in security experts’ labeling efforts compared with the passive learning and the support vector machine (SVM)-Margin (existing active-learning method). Our AL methods also showed a significant improvement of 91% in number of unknown docx malware acquired, compared with the passive learning and the SVM-Margin, thus providing an improved updating solution for the detection model, as well as the anti-virus software widely used within organizations.