Alon Schclar

Ben-Gurion University of the Negev

Publication

Featured research published by Alon Schclar.

Information Sciences | 2012

User identity verification via mouse dynamics

Clint Feher; Yuval Elovici; Robert Moskovitch; Lior Rokach; Alon Schclar

Identity theft is a crime in which hackers perpetrate fraudulent activity under stolen identities by using credentials, such as passwords and smartcards, unlawfully obtained from legitimate users or by using logged-on computers that are left unattended. User verification methods provide a security layer in addition to the username and password by continuously validating the identity of logged-on users based on their physiological and behavioral characteristics. We introduce a novel method that continuously verifies users according to characteristics of their interaction with the mouse. The contribution of this work is threefold: first, user verification is derived based on the classification results of each individual mouse action, in contrast to methods which aggregate mouse actions. Second, we propose a hierarchy of mouse actions from which the features are extracted. Third, we introduce new features to characterize the mouse activity which are used in conjunction with features proposed in previous work. The proposed algorithm outperforms current state-of-the-art methods by achieving higher verification accuracy while reducing the response time of the system.

Expert Systems With Applications | 2014

Ensemble methods for multi-label classification

Lior Rokach; Alon Schclar; Ehud Itach

Ensemble methods have been shown to be an effective tool for solving multi-label classification tasks. In the RAndom k-labELsets (RAKEL) algorithm, each member of the ensemble is associated with a small randomly-selected subset of k labels. Then, a single-label classifier is trained according to each combination of elements in the subset. In this paper we adopt a similar approach; however, instead of randomly choosing subsets, we select the minimum required subsets of k labels that cover all labels and meet additional constraints such as coverage of inter-label correlations. Construction of the cover is achieved by formulating the subset selection as a minimum set covering problem (SCP) and solving it by using approximation algorithms. Every cover needs to be prepared only once by offline algorithms. Once prepared, a cover may be applied to the classification of any given multi-label dataset whose properties conform with those of the cover. The contribution of this paper is two-fold. First, we introduce SCP as a general framework for constructing label covers while allowing the user to incorporate cover construction constraints. We demonstrate the effectiveness of this framework by proposing two construction constraints whose enforcement produces covers that improve the prediction performance of random selection by achieving better coverage of labels and inter-label correlations. Second, we provide theoretical bounds that quantify the probabilities of random selection to produce covers that meet the proposed construction criteria. The experimental results indicate that the proposed methods improve multi-label classification accuracy and stability compared to the RAKEL algorithm and to other state-of-the-art algorithms.
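
The idea of covering all labels with k-label subsets can be illustrated with the classic greedy approximation for set cover. This is only a minimal sketch: the function name `greedy_label_cover` is hypothetical, the greedy heuristic is a textbook stand-in rather than the approximation algorithm used in the paper, and the inter-label correlation constraints mentioned in the abstract are not modelled.

```python
from itertools import combinations


def greedy_label_cover(labels, k):
    """Greedily choose k-label subsets until every label is covered.

    Hypothetical helper for illustration; it enumerates all k-subsets,
    so it is only practical for small label sets.
    """
    labels = sorted(set(labels))
    if not 1 <= k <= len(labels):
        raise ValueError("k must be between 1 and the number of labels")

    candidates = [frozenset(c) for c in combinations(labels, k)]
    uncovered = set(labels)
    cover = []
    while uncovered:
        # Pick the subset that covers the most still-uncovered labels.
        best = max(candidates, key=lambda s: len(s & uncovered))
        cover.append(best)
        uncovered -= best
    return cover


if __name__ == "__main__":
    for subset in greedy_label_cover(["a", "b", "c", "d", "e"], k=3):
        print(sorted(subset))
```

In a RAKEL-style ensemble, each returned subset would then be assigned to one ensemble member, which is trained on the label combinations within that subset.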

Multiple Classifier Systems | 2013

Improving Simple Collaborative Filtering Models Using Ensemble Methods

Ariel Bar; Lior Rokach; Guy Shani; Bracha Shapira; Alon Schclar

In this paper we examine the effect of applying ensemble learning to the performance of collaborative filtering methods. We present several systematic approaches for generating an ensemble of collaborative filtering models based on a single collaborative filtering algorithm (single-model or homogeneous ensemble). We present an adaptation of several popular ensemble techniques in machine learning for the collaborative filtering domain, including bagging, boosting, fusion and randomness injection. We evaluate the proposed approach on several types of collaborative filtering base models: k-NN, matrix factorization and a neighborhood matrix factorization model. Empirical evaluation shows a prediction improvement compared to all base CF algorithms. In particular, we show that the performance of an ensemble of simple (weak) CF models such as k-NN is competitive compared with a single strong CF model (such as matrix factorization) while requiring an order of magnitude less computational cost.

International Conference on Enterprise Information Systems | 2009

Random Projection Ensemble Classifiers

Alon Schclar; Lior Rokach

We introduce a novel ensemble model based on random projections. The contribution of using random projections is two-fold. First, the randomness provides the diversity which is required for the construction of an ensemble model. Second, random projections embed the original set into a space of lower dimension while preserving the dataset’s geometrical structure to a given distortion. This reduces the computational complexity of the model construction as well as the complexity of the classification. Furthermore, dimensionality reduction removes noisy features from the data and also represents the information which is inherent in the raw data by using a small number of features. The noise removal increases the accuracy of the classifier.
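
A toy version of this construction can be sketched as follows. Everything here is an assumption made for illustration: the Gaussian projection matrix, the nearest-centroid base classifier, the majority vote, and all function names. The abstract does not specify the base classifier or the combination rule.

```python
import random
from collections import Counter


def random_matrix(d_in, d_out, rng):
    """Gaussian random projection matrix with d_out rows and d_in columns."""
    scale = 1.0 / d_out ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d_in)]
            for _ in range(d_out)]


def project(matrix, x):
    """Embed the vector x into the lower-dimensional space."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


def fit_centroids(X, y):
    """Base classifier (assumed): one mean vector per class."""
    sums, counts = {}, Counter(y)
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}


def predict_centroid(centroids, x):
    """Return the class whose centroid is closest to x."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(centroids[c], x)))


def fit_ensemble(X, y, n_members=5, d_out=2, seed=0):
    """Train each member on a differently projected copy of the data."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        m = random_matrix(len(X[0]), d_out, rng)
        members.append((m, fit_centroids([project(m, x) for x in X], y)))
    return members


def predict_ensemble(members, x):
    """Majority vote over the members' predictions."""
    votes = Counter(predict_centroid(c, project(m, x)) for m, c in members)
    return votes.most_common(1)[0][0]
```

Each member sees a different random embedding, which supplies the diversity, and works in `d_out` dimensions instead of the original dimension, which reduces the cost of training and classification.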

Conference on Recommender Systems | 2009

Ensemble methods for improving the performance of neighborhood-based collaborative filtering

Alon Schclar; Alexander Tsikinovsky; Lior Rokach; Amnon Meisels; Liat Antwarg

Recommender systems provide consumers with ratings of items. These ratings are based on a set of ratings that were obtained from a wide scope of users. Predicting the ratings can be formulated as a regression problem. Ensemble regression methods are effective tools that improve the results of simple regression algorithms by iteratively applying the simple algorithm to a diverse set of inputs. The present paper describes a simple and effective ensemble regressor for the prediction of missing ratings in recommender systems. The ensemble method is an adaptation of the AdaBoost regression algorithm for recommendation tasks. In all iterations, interpolation weights for all nearest neighbors are simultaneously derived by minimizing the root mean squared error. From iteration to iteration instances that are hard to predict are reinforced by manipulating their weights in the goal function that needs to be minimized. The experimental evaluation demonstrates that the ensemble methodology significantly improves the predictive performance of single neighborhood-based collaborative filtering.

Systems, Man, and Cybernetics | 2012

User Authentication Based on Representative Users

Alon Schclar; Lior Rokach; Adi Abramson; Yuval Elovici

User authentication based on username and password is the most common means to enforce access control. This form of access restriction is prone to hacking since stolen usernames and passwords can be exploited to impersonate legitimate users in order to commit malicious activity. Biometric authentication incorporates additional user characteristics such as the manner by which the keyboard is used in order to identify users. We introduce a novel approach for user authentication based on the keystroke dynamics of the password entry. A classifier is tailored to each user and the novelty lies in the manner by which the training set is constructed. Specifically, only the keystroke dynamics of a small subset of users, which we refer to as representatives, are used along with the password entry keystroke dynamics of the examined user. The contribution of this approach is twofold: it reduces the possibility of overfitting, while allowing scalability to a high volume of users. We propose two strategies to construct the subset for each user. The first selects the users whose keystroke profiles govern the profiles of all the users, while the second strategy chooses the users whose profiles are the most similar to the profile of the user for whom the classifier is constructed. The results are promising, reaching in some cases 90% area under the curve. In many cases, a higher number of representatives deteriorates the accuracy, which may imply overfitting. An extensive evaluation was performed using a dataset containing over 780 users.
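
The second strategy, choosing the users whose profiles are most similar to the examined user, might be sketched as below. The representation of a keystroke profile as a vector of mean timings, the Euclidean distance, and the function name `most_similar_users` are all assumptions for illustration; the abstract does not define the similarity measure.

```python
import math


def most_similar_users(profiles, target, n):
    """Return the n users whose profile vectors are closest to the target's.

    `profiles` maps a user name to a list of timing features (assumed
    representation). The target user is excluded from the result.
    """
    target_vec = profiles[target]
    others = [user for user in profiles if user != target]
    # Sort the remaining users by Euclidean distance to the target profile.
    others.sort(key=lambda user: math.dist(profiles[user], target_vec))
    return others[:n]


if __name__ == "__main__":
    profiles = {
        "alice": [0.10, 0.20],
        "bob": [0.11, 0.21],
        "carol": [0.50, 0.60],
        "dave": [0.12, 0.19],
    }
    print(most_similar_users(profiles, "alice", n=2))
```

The selected representatives would then supply the negative examples when training the classifier tailored to the target user.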

Intelligent Data Analysis | 2017

Ensembles of classifiers based on dimensionality reduction

Alon Schclar; Lior Rokach; Amir Amit

We present a novel approach for the construction of ensemble classifiers based on dimensionality reduction. Dimensionality reduction methods represent datasets using a small number of attributes while preserving the information conveyed by the original dataset. The ensemble members are trained based on dimension-reduced versions of the training set. These versions are obtained by applying dimensionality reduction to the original training set using different values of the input parameters. This construction meets both the diversity and accuracy criteria which are required to construct an ensemble classifier, where the former criterion is obtained by the various input parameter values and the latter is achieved due to the decorrelation and noise reduction properties of dimensionality reduction. In order to classify a test sample, it is first embedded into the dimension-reduced space of each individual classifier by using an out-of-sample extension algorithm. Each classifier is then applied to the embedded sample and the classification is obtained via a voting scheme. We present three variations of the proposed approach based on the Random Projections, the Diffusion Maps and the Random Subspaces dimensionality reduction algorithms. We also present a multi-strategy ensemble which combines AdaBoost and Diffusion Maps. A comparison is made with the Bagging, AdaBoost and Rotation Forest ensemble classifiers and also with the base classifier which does not incorporate dimensionality reduction. Our experiments used seventeen benchmark datasets from the UCI repository. The results obtained by the proposed algorithms were superior in many cases to other algorithms.

Information Sciences | 2016

XML-AD

Eitan Menahem; Alon Schclar; Lior Rokach; Yuval Elovici

Many information systems use XML documents to store data and to interact with other systems. Abnormal documents, which can be the result of either an on-going cyber attack or the actions of a benign user, can potentially harm the interacting systems and are therefore regarded as a threat. In this paper we address the problem of anomaly detection and localization in XML documents using machine learning techniques. We present XML-AD, a new XML anomaly detection framework. Within this framework, an automatic method for extraction of features from XML documents as well as a practical method for transforming XML features into vectors of fixed dimensionality were developed. With these two methods in place, the XML-AD framework makes it possible to utilize general learning algorithms for anomaly detection. The core of the framework consists of a novel multi-univariate anomaly detection algorithm, ADIFA. The framework was evaluated using four XML document datasets which were obtained from real information systems. It achieved over 89% true positive detection rate with less than 0.2% of false positives.

International Journal of Granular Computing, Rough Sets and Intelligent Systems | 2012

k-anonymised reducts

Lior Rokach; Alon Schclar

Privacy-preserving data mining aims to prevent the exposure of sensitive information as a result of mining algorithms. This is commonly achieved by data anonymisation. One way to anonymise data is by adherence to the k-anonymity concept which requires that the probability to identify an individual by linking databases does not exceed 1/k. In this paper, we propose an algorithm which utilises rough set theory to achieve k-anonymity. The basic idea is to partition the original dataset into several disjoint reducts such that each one of them adheres to k-anonymity. We show that it is easier to make each reduct comply with k-anonymity if it does not contain all quasi-identifier attributes. Moreover, our procedure ensures that even if the attacker attempts to rejoin the reducts, the k-anonymity is still preserved. Unlike other algorithms that achieve k-anonymity, the proposed method requires no prior knowledge of the domain hierarchy taxonomy.
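
The k-anonymity condition itself, and the observation that it is easier to satisfy on fewer quasi-identifier attributes, can be shown with a small check. This sketch only tests the property on a toy table; it is not the reduct-based algorithm proposed in the paper, and the function name, column names and data are made up for illustration.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Check that every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


# Toy table (fabricated for illustration only).
ROWS = [
    {"age": "30-39", "zip": "841**", "sex": "F"},
    {"age": "30-39", "zip": "841**", "sex": "M"},
    {"age": "30-39", "zip": "841**", "sex": "F"},
    {"age": "30-39", "zip": "841**", "sex": "M"},
    {"age": "40-49", "zip": "842**", "sex": "F"},
    {"age": "40-49", "zip": "842**", "sex": "M"},
]

if __name__ == "__main__":
    # A projection onto a subset of the quasi-identifiers is 2-anonymous...
    print(is_k_anonymous(ROWS, ["age", "zip"], k=2))
    # ...while the full set of quasi-identifiers is not.
    print(is_k_anonymous(ROWS, ["age", "zip", "sex"], k=2))
```

With all three attributes, the two rows in the 40-49 age group become unique, so the table fails the check for k = 2 even though its projection onto age and zip passes.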

Granular Computing | 2010

k-Anonymized Reducts

Lior Rokach; Alon Schclar

Privacy-preserving data mining aims to prevent the violation of privacy that might result from mining of sensitive data. This is commonly achieved by data anonymization. One way to anonymize data is by adherence to the k-anonymity concept, which requires that the probability to identify an individual by linking databases does not exceed 1/k. In this paper we propose an algorithm which utilizes rough set theory to achieve k-anonymity. The basic idea is to partition the original dataset into several disjoint reducts such that each one of them adheres to k-anonymity. We show that it is easier to make each reduct comply with k-anonymity if it does not contain all quasi-identifier attributes. Moreover, our procedure ensures that even if the attacker attempts to rejoin the reducts, the k-anonymity is still preserved.
