Riyad Alshammari
Dalhousie University
Publication
Featured research published by Riyad Alshammari.
computational intelligence and security | 2009
Riyad Alshammari; A. Nur Zincir-Heywood
The objective of this work is to assess the robustness of machine learning based traffic classification for encrypted traffic, where SSH and Skype are taken as good representatives of encrypted traffic. Here, robustness means that the classifiers are trained on data from one network but tested on data from an entirely different network. To this end, five learning algorithms (AdaBoost, Support Vector Machine, Naïve Bayesian, RIPPER and C4.5) are evaluated using flow based features, where IP addresses, source/destination ports and payload information are not employed. Results indicate that the C4.5 based approach performs much better than the other algorithms at identifying both SSH and Skype traffic on totally different networks.
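The train-on-one-network, test-on-another protocol described above can be sketched as follows. This is only an illustration under stated assumptions: the one-feature threshold rule is a hypothetical stand-in for the learning algorithms named in the abstract (it is not C4.5), and the two tiny data sets are made-up values, not measurements from the paper.

```python
def train_threshold(samples):
    """Pick the threshold on a single feature that maximises training accuracy.

    samples: list of (feature_value, is_target) pairs.
    Learned rule: predict "target" when feature_value >= threshold.
    """
    best_threshold, best_correct = None, -1
    for candidate in sorted({value for value, _ in samples}):
        correct = sum((value >= candidate) == label for value, label in samples)
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


def accuracy(threshold, samples):
    """Fraction of samples the threshold rule labels correctly."""
    hits = sum((value >= threshold) == label for value, label in samples)
    return hits / len(samples)


# Made-up toy data: one flow feature per sample, True = target application.
network_a = [(120, False), (150, False), (400, True), (420, True)]
network_b = [(100, False), (410, False), (430, True), (500, True)]

threshold = train_threshold(network_a)                    # fit on network A only
same_network_accuracy = accuracy(threshold, network_a)    # optimistic estimate
cross_network_accuracy = accuracy(threshold, network_b)   # robustness estimate
```

Comparing the two accuracy figures is the point of the protocol: a model that only scores well on the network it was trained on is not robust in the sense used here.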
Computer Networks | 2011
Riyad Alshammari; A. Nur Zincir-Heywood
Identifying encrypted application traffic represents an important issue for many network tasks including quality of service, firewall enforcement and security. Solutions should ideally be both simple – therefore efficient to deploy – and accurate. This paper presents a machine learning based approach employing simple packet header feature sets and statistical flow feature sets without using the IP addresses, source/destination ports and payload information to unveil encrypted application tunnels in network traffic. We demonstrate the effectiveness of our approach as a forensic analysis tool on two encrypted applications, Secure SHell (SSH) and Skype, using traces captured from entirely different networks. Results indicate that it is possible to identify encrypted traffic tunnels with high accuracy without inspecting payload, IP addresses and port numbers. Moreover, it is also possible to identify which services run in encrypted tunnels.
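As a rough illustration of what a statistical flow feature set might look like, the sketch below summarises one flow using only packet timestamps and sizes. The function name `flow_features` and the particular features chosen are assumptions made for this example; the abstract does not list the paper's actual features.

```python
from statistics import mean, pstdev


def flow_features(packets):
    """Summarise one flow from (timestamp_seconds, packet_length_bytes) pairs.

    Only timing and size are used; no IP address, port, or payload is read.
    """
    if not packets:
        raise ValueError("a flow needs at least one packet")
    packets = sorted(packets)  # order by timestamp
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    gaps = [b - a for a, b in zip(times, times[1:])]  # inter-arrival times
    return {
        "duration": times[-1] - times[0],
        "packet_count": len(packets),
        "total_bytes": sum(sizes),
        "mean_packet_len": mean(sizes),
        "std_packet_len": pstdev(sizes),
        "mean_iat": mean(gaps) if gaps else 0.0,
    }


# Illustrative flow of three packets (values are made up).
example = [(0.0, 100), (0.5, 200), (1.5, 300)]
features = flow_features(example)
```

A dictionary like `features` could then be fed to any classifier as one training or test instance.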
conference on communication networks and services research | 2007
Riyad Alshammari; Sumalee Sonamthiang; Mohsen Teimouri; Denis Riordan
One of the major problems of Intrusion Detection Systems (IDS) at present is the high rate of false alerts that the systems produce. These alerts force human analysts to analyze false alerts repeatedly and intensively before initiating appropriate actions. We demonstrate the advantages of using a hybrid neuro-fuzzy approach to reduce the number of false alarms. The neuro-fuzzy approach was tested with different background knowledge sets on the DARPA 1999 network traffic dataset. The approach was evaluated and compared with the RIPPER algorithm. The results show that the neuro-fuzzy approach reduces the number of false alarms significantly more than the RIPPER algorithm and requires fewer background knowledge sets.
congress on evolutionary computation | 2010
Riyad Alshammari; A. Nur Zincir-Heywood
The classification of encrypted traffic, namely Skype, from network traffic represents a particularly challenging problem. Solutions should ideally be both simple, and therefore efficient to deploy, and accurate. Recent advances in team-based Genetic Programming provide the opportunity to decompose the original problem into a subset of classifiers with non-overlapping behaviors. Thus, in this work we have investigated the identification of Skype encrypted traffic using the Symbiotic Bid-Based (SBB) paradigm of team-based Genetic Programming (GP), based on flow features and without using IP addresses, port numbers and payload data. Evaluation of SBB-GP against C4.5 and AdaBoost, which represent current best practice, indicates that SBB-GP is capable of providing simpler solutions in terms of the number of features used and the complexity of the solution/model without sacrificing accuracy.
conference on network and service management | 2010
Riyad Alshammari; A. Nur Zincir-Heywood
The classification of encrypted traffic on the fly from network traces represents a particularly challenging application domain. Recent advances in machine learning provide the opportunity to decompose the original problem into a subset of classifiers with non-overlapping behaviors, in effect providing further insight into the problem domain. Thus, the objective of this work is to classify VoIP encrypted traffic, where Gtalk and Skype applications are taken as good representatives. To this end, three different machine learning based approaches, namely C4.5, AdaBoost and Genetic Programming (GP), are evaluated under data sets common and independent from the training condition. In this case, flow based features are employed without using the IP addresses, source/destination ports and payload information. Results indicate that the C4.5 based machine learning approach has the best performance.
2009 IEEE Symposium on Computational Intelligence in Cyber Security | 2009
Riyad Alshammari; A. Nur Zincir-Heywood
The objective of this work is to discover generalized signatures for identifying encrypted traffic, where SSH is taken as an example application. By generalized signatures we mean that the signatures learned by training on one network are still valid when they are applied to traffic coming from a totally different network. We identified 13 signatures and 14 flow attributes for SSH traffic classification where IP addresses, source/destination ports and payload information are not employed. The signatures are able to identify encrypted traffic with a high detection rate (DR) and a low false positive rate (FPR). We can achieve up to 97% DR and 0.8% FPR for identifying SSH traffic.
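For readers unfamiliar with the two metrics, here is a minimal sketch that assumes the usual definitions: detection rate is TP / (TP + FN) and false positive rate is FP / (FP + TN). The function name and the label strings are hypothetical, and the example labels are made up rather than taken from the paper.

```python
def detection_and_false_positive_rate(actual, predicted, positive="SSH"):
    """Return (DR, FPR) for one target class.

    DR  = TP / (TP + FN): share of true target flows that were flagged.
    FPR = FP / (FP + TN): share of non-target flows that were wrongly flagged.
    """
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    dr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return dr, fpr


# Made-up labels: 4 SSH flows (3 detected) and 4 other flows (1 false alarm).
actual = ["SSH", "SSH", "SSH", "SSH", "other", "other", "other", "other"]
predicted = ["SSH", "SSH", "SSH", "other", "SSH", "other", "other", "other"]
dr, fpr = detection_and_false_positive_rate(actual, predicted)
```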
genetic and evolutionary computation conference | 2009
Riyad Alshammari; Peter Lichodzijewski; Malcolm I. Heywood; A. Nur Zincir-Heywood
The classification of encrypted traffic, namely Secure Shell (SSH), on the fly from network TCP traffic represents a particularly challenging application domain for machine learning. Solutions should ideally be both simple, and therefore efficient to deploy, and accurate. Recent advances in team-based Genetic Programming provide the opportunity to decompose the original problem into a subset of classifiers with non-overlapping behaviors, in effect providing further insight into the problem domain and increasing the throughput of solutions. Thus, in this work we have investigated the identification of SSH encrypted traffic based on packet header features without using IP addresses, port numbers and payload data. Evaluation of C4.5 and AdaBoost, which represent current best practice, against the Symbiotic Bid-Based (SBB) paradigm of team-based Genetic Programming (GP) under data sets common and independent from the training condition indicates that SBB based GP is capable of providing simpler solutions without sacrificing accuracy.
CISIS | 2009
Riyad Alshammari; A. Nur Zincir-Heywood
The objective of this work is the comparison of two types of feature sets for the classification of encrypted traffic such as SSH. To this end, two learning algorithms (RIPPER and C4.5) are employed using packet header and flow-based features. Traffic classification is performed without using features such as IP addresses, source/destination ports and payload information. Results indicate that the feature set based on packet header information is comparable with the flow-based feature set in terms of a high detection rate and a low false positive rate.
security and trust management | 2009
Riyad Alshammari; A. Nur Zincir-Heywood; Abdel Aziz Farrag
The objective of this work is the classification of encrypted traffic where SSH is taken as an example application. To this end, four learning algorithms (AdaBoost, RIPPER, C4.5 and Rough Set) are evaluated using flow based features to extract the minimum features/rules set required to classify SSH traffic. Results indicate that the C4.5 based classifier performs better than the other three. We have also identified 15 features that are important for classifying encrypted traffic, namely SSH.
congress on evolutionary computation | 2011
Riyad Alshammari; A. Nur Zincir-Heywood
Traffic classification becomes more challenging since traditional techniques such as port numbers or deep packet inspection are ineffective against voice over IP (VoIP) applications, which use non-standard ports and encryption. Statistical information based on the network layer, combined with machine learning (ML), can achieve high classification accuracy and produce transportable signatures. However, the ability of ML to find transportable signatures depends mainly on the training data sets. In this paper, we explore the importance of sampling training data sets for the ML algorithms, specifically Genetic Programming, C5.0, Naive Bayesian and AdaBoost, to find transportable signatures. To this end, we employed two techniques for sampling network training data sets, namely random sampling and consecutive sampling. Results show that random sampling and 90-minute consecutive sampling have the best performance in terms of accuracy using C5.0 and SBB, respectively. In terms of complexity, the size of C5.0 solutions increases as the training size increases, whereas SBB finds simpler solutions.
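The two sampling techniques named above can be sketched roughly as follows. The flow records, the `start_time` field, and both function names are assumptions for illustration; the abstract does not describe how the paper implemented either sampler.

```python
import random


def random_sample(flows, k, seed=0):
    """Draw k flows uniformly at random, without replacement."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return rng.sample(flows, k)


def consecutive_sample(flows, start, window_seconds):
    """Keep every flow that starts inside one contiguous time window."""
    end = start + window_seconds
    return [f for f in flows if start <= f["start_time"] < end]


# Made-up trace: one flow every 10 minutes over roughly 3.3 hours.
flows = [{"id": i, "start_time": i * 600} for i in range(21)]

random_training_set = random_sample(flows, k=5)
consecutive_training_set = consecutive_sample(flows, start=0, window_seconds=90 * 60)
```

Random sampling spreads the training flows across the whole capture, while consecutive sampling keeps a single contiguous slice such as the 90-minute window mentioned in the abstract.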