A new method for flow-based network intrusion detection using the inverse Potts model
Camila Pontes, Manuela Souza, João Gondim, Matt Bishop, Marcelo Marotta
Abstract—Network Intrusion Detection Systems (NIDS) play an important role as tools for identifying potential network threats. In the context of ever-increasing traffic volume on computer networks, flow-based NIDS arise as good solutions for real-time traffic classification. In recent years, different flow-based classifiers have been proposed using Machine Learning (ML) algorithms. Nevertheless, classical ML-based classifiers have some limitations. For instance, they require large amounts of labeled data, which might be difficult to obtain. Additionally, most ML-based classifiers are not capable of domain adaptation, i.e., after being trained on a specific data distribution, they are not general enough to be applied to other related data distributions. Finally, many of the models inferred by these algorithms are black boxes, hard to understand in detail. To overcome these limitations, we propose a new algorithm, called the Energy-based Flow Classifier (EFC). This anomaly-based classifier uses inverse statistics to infer a statistical model based on labeled benign examples. We show that EFC is capable of accurately performing binary flow classification and is more adaptable to new domains than classical ML-based classifiers. Given the positive results obtained on three different datasets (CIDDS-001, CICIDS17, and CICDDoS19), we consider EFC a promising algorithm for robust flow-based traffic classification.
Index Terms—Flow-based Network Intrusion Detection, Anomaly-based Network Intrusion Detection, Network Flow Classification, Network Intrusion Detection Systems, Energy-based Flow Classifier, Inverse Potts Model, Domain Adaptation.
I. INTRODUCTION

SYMANTEC's Internet Security Threat Report [1] points out a 56% increase in the number of web attacks in 2019. Network scans, denial of service, and brute force attacks are among the most common threats. Such malicious activities threaten not only individuals, but also collective organizations such as public health, financial, and government institutions. In this context, Network Intrusion Detection Systems (NIDSs) play an important role as tools for identifying potential threats [2].

There are two main approaches for NIDSs regarding the kind of data analyzed: packet-based and flow-based. In the former, deep packet inspection is performed, taking into account individual packet payloads as well as header information [3]. In the latter, flows, i.e., packet collections, are analyzed regarding their properties, e.g., duration, number of packets, number of bytes, and source/destination port [3]. To perform classification in real-time, a massive volume of data must
C. Pontes, M. Souza, J. Gondim and M. A. Marotta are with the University of Brasilia, Brazil, emails: [email protected], [email protected], [email protected]; M. Bishop is with the University of California at Davis, Davis, USA, email: [email protected].

be analyzed, which makes deep packet inspection too costly to be applied regarding processing and energy consumption. Since flow-based approaches can classify the whole traffic inspecting an equivalent of 0.1% of the total volume, NIDSs based on flow analysis arise as good solutions for real-time traffic classification [4].

In recent years, different flow-based classifiers have been proposed based on both shallow and deep learning [5]. According to the report in [5], the best flow-based classifiers achieve around 99% accuracy. Although quite accurate, classical Machine Learning (ML)-based classifiers require labeled malicious traffic samples to perform training. However, real traffic labeling might be difficult, especially in the case of malicious traffic. In addition, ML-based classifiers, after being trained on a specific data distribution, usually do not work well when applied to other data with a related distribution, i.e., they have low domain adaptation capability [6], [7]. Moreover, most ML algorithms are well-known black-box mechanisms, challenging to understand and readjust in detail [8]. In this regard, there is a clear need for a new flow-based classifier for NIDSs, which generates an understandable model (white box), is based solely on benign examples, and is adaptable to different domains.

In this work, we propose a novel classifier called the Energy-based Flow Classifier (EFC), which is a network flow classifier based on the inverse Potts model. EFC performs one-class, anomaly-based classification, i.e., as long as it can learn the properties of benign flows, it will be able to discriminate between benign and malicious flows.
Moreover, it is a white-box algorithm, producing a statistical model that can be analyzed in detail regarding individual parameter values. Here, we compare the performance of EFC against a variety of classifiers using three different datasets, i.e.,
CIDDS-001 [9], CICIDS17 [10], and CICDDoS19 [11]. Our results show that classifiers based on classical ML are more sensitive to changes in data distribution than EFC. Our main contributions are:
• The proposal and implementation of a flow classifier based on the inverse Potts model to be employed in NIDSs;
• A performance comparison of the proposed classifier with classical ML-based classifiers using three different datasets;
• An analysis of how different classifiers perform when trained within one domain and tested in another related domain.
The rest of this paper is structured as follows. In Section II, we briefly present the state-of-the-art in flow-based NIDSs. In Section III, we describe the structure of network flows with a preliminary analysis of the datasets considered here. In Section IV, we introduce the proposed statistical model and the classifier implementation. In Section V, we present the results obtained regarding the analysis of the statistical model and the classification experiments performed. Finally, in Section VI, we present our conclusions and future work.

II. RELATED WORK
In this section, we briefly review the state-of-the-art in flow-based network intrusion detection. We show some early work in the field, as well as recent advances. Finally, we review previous work on the CIDDS-001, CICIDS17, and CICDDoS19 datasets.

Several ML-based flow classifiers have been explored over the last 15 years for network intrusion detection. There are recent comprehensive surveys in which ML-based classifiers used in this context are reviewed [5], [12], [13], [14]. Among the algorithms evaluated in these surveys, Random Forest (RF) performs especially well and has been applied in most of the recently proposed NIDSs [15], [16], [17]. In this work, we deploy most of the ML classifiers covered in recent surveys to serve as baselines against which we compare our classifier.

Flow-based intrusion detection has also been explored in modern contexts, i.e.,
Internet of Things (IoT) networks [18], [19] and cloud environments [20], [21]. The proposed solutions for intrusion detection in IoT and cloud environments achieved satisfactory classification accuracy and feasible running times. However, their domain adaptation capability is still a matter of investigation. Most of the proposed solutions from the literature assume that training sets will be available in all contexts, which is not necessarily true. In this regard, we propose a flow-classifier solution that is adaptable to different domains without retraining.

Since malicious data frequently changes its characteristics when new attack types arise, domain adaptation becomes a major issue for intrusion detection. Bartos et al. [6] and Li et al. [7] proposed similar approaches to cope with domain change by applying data transformations to reduce differences in data features across domains. Here, we propose a classifier that is intrinsically adaptable to different domains, since its learning phase is based solely on benign data. Hence, there is no need to transform data to adapt data features, making our approach simpler and more straightforward.

To assess EFC's performance, one of the datasets we use is CIDDS-001. This dataset was used by Verma and Ranga (2018) [22] to assess the performance of the K-Nearest Neighbors (KNN) and k-means clustering algorithms when classifying traffic. Both algorithms achieved over 99% accuracy. Also, Ring et al. [23] explored slow port scan detection using CIDDS-001. The approach they proposed is capable of accurately recognizing the attacks with a low false alarm rate. Finally, Abdulhammed et al. [24] also performed flow-based classification on CIDDS-001 and proposed an approach that is robust to imbalanced network traffic. In summary, CIDDS-001 is an updated and relevant dataset for network flow-classification solutions, being one of our dataset choices for assessing the performance of EFC.
The other two datasets we use in this work are CICIDS17 and CICDDoS19, from the Canadian Institute for Cybersecurity. Recently, Yulianto, Sukarno, and Suwastika [25] used CICIDS17 to assess the performance of an Adaboost-based classifier. Aksu et al. [26] did the same in 2018 with different ML classifiers. CICIDS17 contains benign traffic as well as the most up-to-date common attacks, resembling true real-world data, making it a relevant dataset for flow-based traffic classification.

CICDDoS19, in turn, is a very recent dataset with a focus on DDoS attacks. A very recent work [27] proposes a real-time entropy-based NIDS for the detection of volumetric DDoS in IoT and performs tests over the CICDDoS19 dataset, among others. Another recent work [28] obtained over 99% accuracy on the CICDDoS19 dataset using a Convolutional Neural Network (CNN). Finally, Novaes et al. [29] proposed a system for intrusion detection based on fuzzy logic, which had its performance assessed on CICDDoS19. The rising popularity of this dataset serves as proof of its relevance for assessing the performance of different NIDSs. Hence, we use CICDDoS19 and two other up-to-date datasets to test our classifier and compare it to the performance of classical ML classifiers.

III. PRELIMINARIES
A network flow is a set of packets that traverses intermediary nodes between end-points within a given time interval. From the perspective of an intermediary node, i.e., an observation point, all packets belonging to a given flow share a set of common features called flow keys. This means that flow keys do not change for packets belonging to the same flow, while the remaining features might vary. FlowScan [30] is an example of a tool capable of collecting data from a set of packets and extracting flow features to be later exported in different formats, such as NetFlow and IPFIX. Since NetFlow is the most commonly used format, its main features are listed below:
• Source/Destination IP (flow keys) - determine the origin and destination of a given flow in the network;
• Source/Destination port (flow keys) - characterize different kinds of network services, e.g., the ssh service uses port 22;
• Protocol (flow key) - characterizes flows regarding the transport protocol used, e.g., TCP, UDP, ICMP;
• Number of packets (feature) - total number of packets captured in a flow;
• Number of bytes (feature) - total number of bytes in a flow;
• Duration (feature) - total duration of a flow in seconds;
• Initial timestamp (feature) - system time when the flow started to be captured.
Other features such as TCP Flags and Type of Service might also be exported in some cases. The combination of different flow keys and features characterizes a flow and determines its particular behavior.

Flow-based approaches are seen as suitable alternatives to precede packet inspection in real-time NIDSs. The idea is to deeply inspect only the packets belonging to flows considered suspicious by the flow-based classifier. A two-step approach would notably reduce the amount of data analyzed while maintaining high classification accuracy [4]. In this work, we are only concerned with the first step, the flow classification. We evaluate the performance of our algorithm, EFC, compared to other ML algorithms using three different datasets. We also evaluate the performance of the algorithms by training with data from one part of a dataset and testing with other parts of it. Although both parts of the data come from the same dataset, their distributions differ, which characterizes domain adaptation. In the following, we briefly describe the datasets used for testing and characterize what constitutes a domain adaptation in each of them.
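As a concrete illustration, a NetFlow-style record combining the flow keys and features listed above can be represented as a simple data structure. The field names below are illustrative, not a formal NetFlow schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Flow:
    """One NetFlow-style record: flow keys plus aggregate features."""
    src_ip: str        # flow key
    dst_ip: str        # flow key
    src_port: int      # flow key
    dst_port: int      # flow key
    protocol: str      # flow key, e.g. "TCP", "UDP", "ICMP"
    packets: int       # feature: total packets in the flow
    num_bytes: int     # feature: total bytes in the flow
    duration: float    # feature: total duration in seconds
    first_seen: float  # feature: start timestamp

# An ssh connection attempt (destination port 22) seen at the observation point
f = Flow("10.0.0.5", "10.0.0.9", 51432, 22, "TCP", 12, 1840, 0.8, 1622000000.0)
```

All packets of this flow share the five key fields; the aggregate features (packets, bytes, duration) vary with the flow's behavior.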
A. CIDDS-001
CIDDS-001 [9] is a relatively recent dataset composed of a set of flow samples captured within a simulated OpenStack environment and another set of flow samples obtained from a real server. The former contains only simulated traffic, while the latter includes both real and simulated traffic. Each sample collected within these two environments has one of the labels described in Table I.
Table I
LABELS WITHIN CIDDS-001 DATASET

Environment      Labels
OpenStack        normal, DoS, portScan, pingScan, bruteForce
External server  normal, DoS, bruteForce, unknown, suspicious
Simulated benign flows are labeled as normal, while simulated malicious flows are labeled as dos, portScan, pingScan, or bruteForce, depending on the type of attack simulated. The labels suspicious and unknown, in turn, are used for real traffic. The external server is open to user access through ports 80 and 443. Hence, flows directed at these ports were labeled as unknown, since they could be either benign or malicious. All flows directed at other ports were labeled as suspicious. Traffic was sampled in both the simulated and the external environment for a period of four weeks. For the simulated environment, we consider only traffic captured in the second week to reduce the amount of data to be analyzed. Similarly, only external traffic captured within the third week was assessed. These weeks were selected because they have the fairest proportion between the different types of malicious flows. Within this dataset, a change from the simulated data distribution to the external server data distribution is a domain change, requiring the classifiers to adapt.

CIDDS-001 dataset flow features are shown in Table II. All features were taken into account for characterization and classification except for Src IP, Dest IP, and Date first seen. These exceptions are because the latter is intrinsically not informative for differentiating flows, and the former two are made up in the context of the simulated network and might be confounding.
Table II
FEATURES WITHIN CIDDS-001 DATASET

#   Feature          Description
1   Src IP           Source IP address
2   Src Port         Source port
3   Dest IP          Destination IP address
4   Dest Port        Destination port
5   Proto            Transport protocol (e.g., ICMP, TCP, or UDP)
6   Date first seen  Start time flow first seen
7   Duration         Duration of the flow
8   Bytes            Number of transmitted bytes
9   Packets          Number of transmitted packets
10  Flags            OR concatenation of all TCP Flags
B. CICIDS17
The CICIDS17 [10] dataset contains benign traffic and the most up-to-date common attacks, resembling real-world data. This dataset was built using the abstract behavior of 25 users based on the HTTP, HTTPS, FTP, SSH, and email protocols. The data was captured during one week in July 2017. The attacks implemented include Brute Force FTP, Brute Force SSH, DoS, Heartbleed, Web Attack, Infiltration, Botnet, and DDoS. They were executed both morning and afternoon on Tuesday, Wednesday, Thursday, and Friday (see Table III). Within this dataset, a change from one day's data distribution to that of another day is a domain change, requiring the classifiers to adapt.
Table III
ATTACKS WITHIN CICIDS17 DATASET

Week day   Attacks
Monday     -
Tuesday    FTP-Patator, SSH-Patator
Wednesday  DoS slowloris, DoS Slowhttptest, DoS Hulk, DoS GoldenEye, Heartbleed Port 444
Thursday   Brute Force, XSS, Sql Injection, Dropbox download, Cool disk
Friday     Botnet ARES, Port Scan, DDoS LOIT
Flow features in this dataset were extracted using CICFlowMeter [31]. There are in total 88 features, which are not listed here due to limited space. All features were considered except for Flow ID, Source IP, Destination IP, and Timestamp. These exceptions were made because these features are either intrinsically not informative or made up within a simulated environment.
C. CICDDoS19
CICDDoS19 [11] contains benign traffic and the most up-to-date common DDoS attacks (volumetric and application: low volume, slow rate), resembling real-world data. This dataset contains different modern reflective DDoS attacks such as PortMap, NetBIOS, LDAP, MSSQL, UDP, UDP-Lag, SYN, NTP, DNS, and SNMP. The traffic was captured in January (first day) and March (second day) of 2019. Attacks were executed during this period (see Table IV). Within this dataset, a change from one day's data distribution to that of another day is a domain change, requiring the classifiers to adapt.
Table IV
ATTACKS WITHIN CICDDoS19 DATASET

Day     Attacks
First   PortMap, NetBIOS, LDAP, MSSQL, UDP, UDP-Lag, SYN
Second  NTP, DNS, LDAP, MSSQL, NetBIOS, SNMP, SSDP, UDP, UDP-Lag, WebDDoS, SYN, TFTP
Flow features in this dataset were extracted using CICFlowMeter [31]. All features were considered except for Flow ID, Source IP, Destination IP, and Timestamp. These exceptions were made because these features are either intrinsically not informative or made up within a simulated environment.

IV. STATISTICAL MODEL
The main task of inverse statistics is to infer a statistical distribution based on a sample of it [32]. Methods using inverse statistics have been successfully applied to problems in other disciplines, e.g., the problem of predicting protein contacts in Biophysics [32], [33]. Here, the statistical inference is based on the Potts model [34]. This model provides a mathematical description of interacting spins on a crystalline lattice. Within the model framework, interacting spins are mapped into a graph G(η, ε) (see Figure 1A), where each node i ∈ η = {1, ..., N} has an associated spin a_i, which can assume one value from a set Ω that contains all possible individual quantum states. Each node i also has an associated local field h_i(a_i) that is a function of a_i's state. Meanwhile, each edge (i, j) ∈ ε, i, j ∈ η, has an associated coupling value e_ij(a_i, a_j) that is a function of the states of the spins a_i and a_j associated with nodes i and j. A specific system configuration has an associated total energy, determined by the Hamiltonian function H(a_1 ... a_N), which depends on all spin states.

Figure 1. A) Interacting spins on a crystalline lattice. B) Network flow mapped into a graph structure.
In this work, we adapt the Potts model to characterize network flows (see Figure 1B). An individual flow k is represented by a specific graph configuration G_k(η, ε). Instead of spins, each node represents a selected feature i ∈ η = {SrcPort, ..., Flags}. Within a given flow k, each feature i assumes one value a_ki from the set Ω_i that contains all possible values for this feature. As in the Potts model, each feature i has an associated local field h_i(a_ki). Meanwhile, ε = {(i, j) | i, j ∈ η; i ≠ j} is the set of edges determined by all possible pairs of features. Each edge has an associated coupling value determined by the function e_ij(a_ki, a_kj). Since the values of the local fields and couplings depend on the values assumed by the features within a given flow, each distinct flow will have a different combination of these quantities. As in the Potts model, local fields and couplings determine the total "energy" H(a_k1 ... a_kN) of each flow. For instance, in Figure 1B, the total "energy" of the flow is obtained by summing up all values associated with the edges and the nodes, resulting in a total of -3. Note that what we call energy is analogous to the notion of the Hamiltonian in Quantum Mechanics. It is important to note that the model described here is discrete; therefore, continuous features must be discretized. The classes for continuous feature discretization are shown in the Supplementary Information. In the following, we present the framework applied to perform the statistical model inference and the subsequent energy-based flow classification.

A. Model inference
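Before inference, continuous features (e.g., duration or byte counts) must be mapped to a discrete alphabet. The paper's actual discretization classes are given in its Supplementary Information; the quantile-based binning below is only our minimal sketch of the idea (function name and bin choice are ours):

```python
import numpy as np

def discretize(values, q):
    """Map a continuous feature to symbols {1, ..., q} using quantile bins.
    A sketch; the paper's actual bin boundaries are in its Supplementary
    Information."""
    values = np.asarray(values, dtype=float)
    # q - 1 interior quantile edges define q bins
    edges = np.quantile(values, np.linspace(0, 1, q + 1)[1:-1])
    return np.digitize(values, edges) + 1  # np.digitize yields 0..q-1

durations = [0.01, 0.02, 0.5, 1.2, 3.4, 10.0, 0.03, 7.7]
symbols = discretize(durations, 4)
```

Each feature is discretized independently, so afterward every feature takes values in the same alphabet Ω = {1, ..., Q}, as required by the model below.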
In this section, a statistical model is inferred in terms of coupling and local field values to perform energy-based flow classification. The main idea consists in extracting a statistical model from benign flow samples to infer coupling and local field values that characterize this type of traffic. When calculating the energies of unlabeled flows using the inferred values, it is expected that benign flows will have lower energies than malicious flows.

Let (A_1 ... A_N) be an N-tuple of features, which can be instantiated for flow k as (a_k1 ... a_kN), with a_k1 ∈ Ω_1, ..., a_kN ∈ Ω_N. Each feature value a_ki is encoded by an integer from the set Ω = {1, 2, ..., Q}, i.e., all feature alphabets are the same, Ω_i = Ω, of size Q. If a given feature can only assume M values and M < Q, it is considered that the values M+1, ..., Q are possible, but will never be observed empirically. For instance, suppose the only possible values for the feature protocol are {'TCP', 'UDP'} and Q = 4. In this case, we would have the mapping {'TCP': 1, 'UDP': 2, '': 3, '': 4} and feature values 3 and 4 would never occur.

Now, let K be the set of all possible flows, i.e., all possible combinations of feature values (K = Ω^N), and let S ⊂ K be a sample of flows. We can use inverse statistical physics to infer a statistical model associating a probability P(a_k1 ... a_kN) to each flow k ∈ K based on sample S. The global statistical model P is inferred following the Entropy Maximization Principle [35]:

max_P − Σ_{k ∈ K} P(a_k1 ... a_kN) log(P(a_k1 ... a_kN))   (1)

s.t. Σ_{k ∈ K | a_ki = a_i} P(a_k1 ... a_kN) = f_i(a_i),   (2)
∀i ∈ η; ∀a_i ∈ Ω;

Σ_{k ∈ K | a_ki = a_i, a_kj = a_j} P(a_k1 ... a_kN) = f_ij(a_i, a_j),   (3)
∀(i, j) ∈ η² | i ≠ j; ∀(a_i, a_j) ∈ Ω²;

where f_i(a_i) is the empirical frequency of value a_i of feature i and f_ij(a_i, a_j) is the empirical joint frequency of the pair of values (a_i, a_j) of features i and j. Note that constraints (2) and (3) force model P to reproduce the single and joint empirical frequency counts as marginals. This way, the model is guaranteed to be coherent with the empirical data.

The single and joint empirical frequencies f_i(a_i) and f_ij(a_i, a_j) are obtained from set S by counting occurrences of a given feature value a_i or feature value pair (a_i, a_j), respectively, and dividing by the total number of flows in S. Since the set S is finite and much smaller than K, inferences based on S are subject to undersampling effects.
Following the theoretical framework proposed in [33], we add pseudocounts to the empirical frequencies to limit undersampling effects by performing the following operations:

f_i(a_i) ← (1 − α) f_i(a_i) + α/Q   (4)

f_ij(a_i, a_j) ← (1 − α) f_ij(a_i, a_j) + α/Q²   (5)

where (a_i, a_j) ∈ Ω² and 0 ≤ α ≤ 1, i.e., S is extended with a fraction α of flows with uniformly sampled features.

The proposed maximization can be solved using a Lagrangian function as presented in [35], yielding the following Boltzmann-like distribution:

P*(a_k1 ... a_kN) = e^{−H(a_k1 ... a_kN)} / Z   (6)

where

H(a_k1 ... a_kN) = − Σ_{i,j | i<j} e_ij(a_ki, a_kj) − Σ_i h_i(a_ki)   (7)

is the Hamiltonian of flow k and Z (eq. (6)) is the partition function that normalizes the distribution. Since in this work we are not interested in obtaining individual flow probabilities, Z is not required and, as a consequence, its calculation is omitted. Our objective is to calculate individual flow energies, i.e., individual Hamiltonians as determined in eq. (7).

Note that the Hamiltonian, as presented above, is fully determined by the Lagrange multipliers e_ij(·) and h_i(·) associated with constraints (3) and (2), respectively. Within the Potts model framework, the Lagrange multipliers have a special meaning, with {e_ij(a_i, a_j) | (a_i, a_j) ∈ Ω²} being the set of all possible coupling values between features i and j, and {h_i(a_i) | a_i ∈ Ω} the set of possible local fields associated with feature i.

Inferring the local fields and pairwise couplings is difficult since the number of parameters exceeds the number of independent constraints. Due to the physical properties of interacting spins, it is possible to infer pairwise coupling values e_ij(a_i, a_j) using a Gaussian approximation.
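The regularized frequency counts of eqs. (4)-(5) can be computed directly from a sample of discretized benign flows. The sketch below uses our own array conventions (symbols encoded as 1..Q, flows stacked into an (S, N) array); the function names echo Algorithm 1 but are not the authors' released code:

```python
import numpy as np

def site_freq(flows, q, alpha):
    """Single-site frequencies f_i(a_i) with pseudocounts, eq. (4).
    flows: (S, N) integer array with symbols in {1, ..., q}."""
    S, N = flows.shape
    f = np.zeros((N, q))
    for i in range(N):
        f[i] = np.bincount(flows[:, i] - 1, minlength=q) / S
    return (1 - alpha) * f + alpha / q

def pair_freq(flows, q, alpha):
    """Joint frequencies f_ij(a_i, a_j) with pseudocounts, eq. (5)."""
    S, N = flows.shape
    f = np.zeros((N, N, q, q))
    for i in range(N):
        for j in range(N):
            for a, b in zip(flows[:, i] - 1, flows[:, j] - 1):
                f[i, j, a, b] += 1
    return (1 - alpha) * f / S + alpha / q**2
```

With the pseudocounts applied, each f_i still sums to 1 over the Q symbols and each f_ij sums to 1 over the Q² symbol pairs, so the regularized counts remain valid marginal distributions.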
Assuming that the same properties apply to flow features, we infer coupling values as follows:

e_ij(a_i, a_j) = −(C⁻¹)_ij(a_i, a_j),   (8)
∀(i, j) ∈ η² | i ≠ j, ∀(a_i, a_j) ∈ Ω², a_i, a_j ≠ Q

where

C_ij(a_i, a_j) = f_ij(a_i, a_j) − f_i(a_i) f_j(a_j)   (9)

is the covariance matrix obtained from the single and joint empirical frequencies. Taking the inverse of the covariance matrix is a well-known procedure in statistics to remove the effect of indirect correlations in the data [36]. Now, it is important to clarify that the number of independent constraints in eqs. (2) and (3) is actually N(N−1)/2 · (Q−1)² + N(Q−1), even though the model in eq. (6) has N(N−1)/2 · Q² + NQ parameters. So, without loss of generality, we set:

e_ij(a_i, Q) = e_ij(Q, a_j) = h_i(Q) = 0,   (10)

i.e., e_ij(a_i, a_j) = 0 in case a_i or a_j is equal to Q [33]. Afterwards, the local fields h_i(a_i) can be inferred using a mean-field approximation [37]:

f_i(a_i) / f_i(Q) = exp( h_i(a_i) + Σ_{j, a_j} e_ij(a_i, a_j) f_j(a_j) ),   (11)
∀i ∈ η, a_i ∈ Ω, a_i ≠ Q

where f_i(Q) is the frequency of the last element a_i = Q for any feature i, used for normalization. It is also worth mentioning that the element Q is arbitrarily selected and could be replaced by any other value in {1, ..., Q}, as long as the selected element is kept the same for the calculations of the local fields of every feature i ∈ η. Note that in eq. (11) the empirical single frequencies f_i(a_i) and the coupling values e_ij(a_i, a_j) are known, yielding:

h_i(a_i) = ln( f_i(a_i) / f_i(Q) ) − Σ_{j, a_j} e_ij(a_i, a_j) f_j(a_j)   (12)

In the mean-field approximation presented above, the interaction of a feature with its neighbors is replaced by an approximate interaction with an averaged feature, yielding an approximate value for the associated local field. For further details about these calculations, please refer to [32].
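Eqs. (8), (9), and (12) amount to building the covariance matrix restricted to the first Q−1 symbols, inverting it, and reading off the fields. The sketch below shows this mean-field inversion under our own array conventions (f_i of shape (N, Q), f_ij of shape (N, N, Q, Q), zero-gauge on symbol Q); it is an illustration of the calculation, not the authors' released code:

```python
import numpy as np

def couplings_and_fields(f_i, f_ij, q):
    """Infer e_ij (eq. 8) and h_i (eq. 12) in the gauge
    h_i(Q) = e_ij(., Q) = e_ij(Q, .) = 0."""
    N = f_i.shape[0]
    d = q - 1  # only symbols 1..q-1 carry free parameters
    # covariance matrix restricted to the first q-1 symbols, eq. (9)
    C = np.zeros((N * d, N * d))
    for i in range(N):
        for j in range(N):
            C[i*d:(i+1)*d, j*d:(j+1)*d] = (
                f_ij[i, j, :d, :d] - np.outer(f_i[i, :d], f_i[j, :d]))
    e = -np.linalg.inv(C).reshape(N, d, N, d)   # eq. (8)
    for i in range(N):
        e[i, :, i, :] = 0.0                     # discard self-couplings
    # local fields via the mean-field relation, eq. (12)
    h = np.log(f_i[:, :d] / f_i[:, d:d+1])
    for i in range(N):
        h[i] -= np.einsum('ajb,jb->a', e[i], f_i[:, :d])
    return e, h
```

The pseudocounts of eqs. (4)-(5) also serve a practical purpose here: they keep every f_i(Q) strictly positive and help make C invertible.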
Now that all model parameters are known, it is possible to calculate a given flow's energy according to eq. (7). In the following, we present the implementation of this theoretical framework to perform two-class, i.e., benign and malicious, flow classification.

B. Energy-based flow classification
The energy of a given flow can be calculated according to eq. (7) based on the values of its features and the parameters of the statistical model inferred in the last section. In simple terms, a given flow's energy is the negative sum of the couplings and local fields associated with its features, according to a given statistical model. This means that a flow resembling the ones used to infer the model is likely to have low energy. Since EFC is an anomaly-based classifier, the statistical model used for classification is inferred based only on benign flow samples. We would then expect the energies of benign samples to be lower than the energies of malicious samples. In this sense, it is possible to classify flow samples as benign or malicious based on a chosen energy threshold. The classification is performed by stating that samples with energy smaller than the threshold are benign, and samples with energy greater than or equal to the threshold are malicious. Note that the threshold for classification can be chosen in different ways, and it can be static or dynamic. In this work, we consider a static threshold.
Algorithm 1 Energy-based Flow Classifier

Input: benign_flows (K × N), Q, α, cutoff
1:  import all model inference functions
2:  f_i ← SiteFreq(benign_flows, Q, α)
3:  f_ij ← PairFreq(benign_flows, f_i, Q, α)
4:  e_ij ← Couplings(f_i, f_ij, Q)
5:  h_i ← LocalFields(e_ij, f_i, Q)
6:  while scanning the network do
7:      flow ← wait_for_incoming_flow()
8:      e ← 0
9:      for i ← 1 to N do
10:         a_i ← flow[i]
11:         for j ← i + 1 to N do
12:             a_j ← flow[j]
13:             if a_i ≠ Q and a_j ≠ Q then
14:                 e ← e − e_ij[i, a_i, j, a_j]
15:             end if
16:         end for
17:         if a_i ≠ Q then
18:             e ← e − h_i[i, a_i]
19:         end if
20:     end for
21:     if e ≥ cutoff then
22:         stop_flow()
23:         forward_to_DPI()
24:     else
25:         release_flow()
26:     end if
27: end while

Algorithm 1 shows the implementation of EFC. In lines 2-5, the statistical model for the sampled flows is inferred, as described by eqs. (4), (5), (8), and (12). Afterward, in lines 6-27, the classifier monitors the network waiting for a captured flow. When a flow is captured, its energy is calculated in lines 9-20, according to the Hamiltonian in eq. (7). The computed flow energy is compared to a known threshold (cutoff) value in line 21. In case the energy falls above the threshold, the flow is classified as malicious and should be forwarded to deep packet inspection (line 23) for assessment. Otherwise, the flow is released, and the classifier waits for another flow.

It is essential to highlight that the time complexity of the training step of EFC is O((M × Q)³ + N × M² × Q²), where N is the number of samples, M is the number of features, and Q is the size of the alphabet. Meanwhile, the complexity of the classification step for each sample is O(M²). This means that, in both steps, the complexity depends mostly on the number of features chosen, which can be kept small by using a feature selection mechanism, e.g., Principal Component Analysis (PCA).
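The per-flow energy computation of Algorithm 1 (lines 9-20) can be written compactly in Python. The array layouts here (e_ij indexed as [i, a_i−1, j, a_j−1] and h_i as [i, a_i−1], with symbols 1..Q and symbol Q contributing nothing under the zero-gauge) are our own convention, not the authors' code:

```python
def flow_energy(flow, e_ij, h_i, q):
    """Hamiltonian of eq. (7) for one discretized flow, mirroring
    lines 9-20 of Algorithm 1. flow: list of symbols in {1, ..., q};
    e_ij: (N, q-1, N, q-1); h_i: (N, q-1). Symbol q carries zero
    couplings and fields (gauge choice), so it is skipped."""
    N = len(flow)
    e = 0.0
    for i in range(N):
        a_i = flow[i]
        if a_i == q:
            continue  # gauge: h_i(Q) = e_ij(Q, .) = 0
        for j in range(i + 1, N):
            a_j = flow[j]
            if a_j != q:
                e -= e_ij[i][a_i - 1][j][a_j - 1]
        e -= h_i[i][a_i - 1]
    return e
```

A flow is then flagged as malicious when its energy meets or exceeds the cutoff, exactly as in line 21 of Algorithm 1.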
Therefore, EFC has a low computational cost when compared to ML-based classifiers such as Artificial Neural Networks (ANN), Support Vector Machines (SVM), and RF. Considering the implementation shown in this section, we next present the results obtained when EFC is used to perform flow classification.

V. RESULTS
In this section, we present the results obtained for EFC and classical ML-based classifiers in different binary classification experiments considering three different datasets, i.e., CIDDS-001, CICIDS17, and CICDDoS19. First, we show that EFC can separate benign from malicious flows based on their energies, a result that is consistent across all the considered datasets. Then, we present EFC's classification performance and compare it to that of classical ML-based classifiers in different test scenarios within each dataset.

It is important to highlight that the classification experiments performed in this work were designed not only to assess the performance of different classifiers, but also to investigate their capability of adapting to different domains, i.e., data distributions. Hence, we considered each day/context within a given dataset to be a different domain and performed two kinds of experiments: training/testing in the same domain, and training/testing in different domains. In the case of training/testing in the same domain, 10-fold cross-validation (CV) was performed. Afterward, the models inferred in each of the ten steps of the CV were used to classify samples coming from another domain. This classification was done to investigate each classifier's capability for domain adaptation. EFC's cutoff was defined to be at the 95th percentile of the energy distribution obtained in the training phase.
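The static cutoff used in the experiments is simply the 95th percentile of the benign training energies; a small sketch (function names and the toy energy values are ours):

```python
import numpy as np

def choose_cutoff(train_energies, percentile=95):
    """Static threshold: a percentile of the benign (training) energy
    distribution; the experiments use the 95th percentile."""
    return np.percentile(train_energies, percentile)

def is_malicious(energies, cutoff):
    """Flag samples whose energy meets or exceeds the cutoff."""
    return np.asarray(energies) >= cutoff

# toy benign training energies standing in for real model output
benign = np.random.default_rng(1).normal(100.0, 10.0, size=1000)
cutoff = choose_cutoff(benign)
```

By construction, roughly 5% of the benign training flows fall at or above their own 95th percentile, which fixes the false-positive rate on the training distribution.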
A. EFC characterization
To investigate whether EFC can correctly classify benign and malicious traffic flow samples, we inferred a model based on benign samples from simulated traffic within the CIDDS-001 dataset. This model was then used to calculate the energy of benign and malicious flow samples coming from simulated traffic, and also from the external server traffic. The first plot of Figure 2 shows the energy values of 5,000 randomly sampled flows labeled as normal and 5,000 randomly sampled flows labeled as malicious from the simulated traffic contained in the CIDDS-001 dataset. The statistical model used to calculate the energies was inferred based on 4,500 flows randomly sampled from the simulated traffic. The separation between the two flow classes is clear, i.e., the energy distribution of normal flows is clearly shifted to the left relative to that of malicious flows.

The energy values of 5,000 randomly sampled flows labeled as unknown and 5,000 randomly sampled flows labeled as suspicious from the external server traffic in CIDDS-001 are shown in the second plot of Figure 2. Traffic labeled as unknown is traffic coming from external users with destination port 80 or 443, i.e., expected traffic. In this sense, we consider this traffic to be analogous to benign traffic. Traffic labeled as suspicious, on the other hand, is traffic coming from external users aimed at ports other than 80 and 443, i.e., unexpected
traffic. Hence, this traffic is considered analogous to malicious traffic. Note that the separation between these two classes, i.e., unknown and suspicious, is also evident. In Figure 2, we can see that a portion of unknown traffic is mixed with suspicious traffic, which might indicate that this traffic, even with expected destination ports, is malicious. It is also important to note that the same energy threshold (around 140) can be applied to separate the two classes, i.e., benign and malicious, and the results are consistent for all the datasets considered.

Figure 2. Energy histogram of flow samples from simulated traffic (above; CIDDS-001 train/test simulated) and from the external server traffic (below; CIDDS-001 train simulated, test external). The energy classification threshold, defined as the 95th percentile of the training distribution, is shown in red.

In addition to that,
we observe that the classification threshold, although defined on a specific training distribution, can be applied to a different data distribution or domain. Such an observation supports the claim that EFC is adaptable to different domains and does not overfit the data. In the following subsection, classification results are shown for different classifiers and compared with the results obtained for EFC.

Figure 3. Energy histograms for classification tests performed on samples coming from the same day as training (first row) and samples coming from a different day (second row) for both the CICIDS17 (first column; Friday working hours) and CICDDoS19 (second column; DrDoS NTP) datasets. The energy classification threshold, defined as the 95th percentile of the training distribution, is shown in red.
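As a concrete illustration of the energy-based decision rule discussed above, below is a minimal sketch of a Potts-style energy over discretized flow features. The containers `fields` and `couplings` are hypothetical representations of the inferred local fields h_i(a) and pairwise couplings J_ij(a, b); the inference procedure itself is described in the paper's methods and is not shown here.

```python
# Illustrative energy function for a Potts-like model over
# discretized flow features. Missing (feature, value) pairs
# default to zero contribution.
def flow_energy(flow, fields, couplings):
    """Lower energy -> more similar to the benign training traffic."""
    e = 0.0
    n = len(flow)
    for i in range(n):
        # Local field term for feature i taking value flow[i].
        e -= fields[i].get(flow[i], 0.0)
        for j in range(i + 1, n):
            # Pairwise coupling term for the (i, j) feature pair.
            e -= couplings[i][j].get((flow[i], flow[j]), 0.0)
    return e

def classify(flow, fields, couplings, cutoff):
    # Flows with energy above the cutoff are flagged as malicious.
    return flow_energy(flow, fields, couplings) > cutoff
```

A flow whose feature values (and value pairs) were frequent in the benign training set accumulates large field and coupling contributions, and therefore a low energy; rare combinations receive little "support" and end up above the cutoff.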
B. Comparative analysis of EFC’s performance
We compared EFC to seven different ML classifiers found in [12], whose implementations are available online on GitHub (https://github.com/vinayakumarr/Network-Intrusion-Detection). The classifiers considered here are: K-Nearest Neighbors (KNN) [38], Decision Tree (DT) [39], [40], Adaboost [41], Random Forest (RF) [42], ANN [43], Naive Bayes (NB) [44], and SVM [45], all deployed with their default scikit-learn configurations. Flow features were discretized only for EFC (see Supplementary Information: Table XI), since discretization would probably impair the performance of most ML algorithms. The metrics used to compare the results were the F1 score and the area under the ROC curve (AUC). The first metric, the F1 score, is the harmonic mean of Precision and Recall, i.e.,

F1 = 2 · (Precision · Recall) / (Precision + Recall),    (13)

where Precision = TP / (TP + FP) and Recall = TP / (TP + FN); TP are the true positives, i.e., malicious traffic classified as malicious; FP are the false positives, i.e., benign traffic classified as malicious; and FN are the false negatives, i.e., malicious traffic classified as benign. The second metric, the AUC, is one of the most widespread evaluation metrics for binary classifiers [46], [47]. The ROC curve is constructed by plotting the true positive rate (TPR) against the false positive rate (FPR) at different classification thresholds. The AUC corresponds to the probability that a randomly chosen positive example will receive a higher score than a randomly chosen negative one. One of the main advantages of the AUC is that it is invariant to changes in class distribution, i.e., the ROC curve will not change if the class distribution changes in a test set but the underlying conditional distributions from which the data are drawn stay the same [48], [47]. Since we are interested in evaluating domain adaptation, this metric is particularly well suited to this work.
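Both metrics can be computed directly with scikit-learn, which the compared classifiers already rely on. The labels and scores below are toy values for illustration only, with 1 denoting the malicious (positive) class.

```python
# F1 from hard predictions, AUC from ranking scores.
from sklearn.metrics import f1_score, roc_auc_score

y_true   = [1, 1, 1, 0, 0, 0]
y_pred   = [1, 1, 0, 0, 0, 1]               # hard labels -> F1
y_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]   # ranking scores -> AUC

# TP = 2, FP = 1, FN = 1, so Precision = Recall = 2/3 and F1 = 2/3.
f1 = f1_score(y_true, y_pred)

# 8 of the 9 (positive, negative) pairs are ranked correctly: AUC = 8/9.
auc = roc_auc_score(y_true, y_scores)
```

Note that the AUC is computed from continuous scores (for EFC, the flow energies could serve this role), whereas the F1 score requires a fixed classification threshold.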
1) CIDDS-001:
To evaluate the performance of EFC compared to different ML algorithms, we constructed two test sets using a subset of the CIDDS-001 dataset. Test set I is composed solely of simulated traffic flow samples, while test set II is composed of external traffic flow samples (with no samples in common between the two sets). Dataset undersampling was performed to obtain a more homogeneous distribution of the different malicious traffic subclasses. Details about how this undersampling was performed are given in Supplementary Information: Appendix A.
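The undersampling idea can be illustrated with a minimal per-class sketch. The exact procedure used by the authors is described in their supplementary material; the function below, including its name and parameters, is a hypothetical simplification of equalizing subclass counts.

```python
# Cap every subclass at `per_class` samples, drawn at random.
import random
from collections import defaultdict

def undersample(flows, labels, per_class, seed=0):
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for flow, label in zip(flows, labels):
        by_class[label].append(flow)
    sampled = []
    for label, items in sorted(by_class.items()):
        # Classes smaller than the cap are kept whole.
        k = min(per_class, len(items))
        sampled.extend((f, label) for f in rng.sample(items, k))
    return sampled
```

This prevents a single over-represented attack subclass (e.g., one DoS variant) from dominating the training and test distributions.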
Table V. Classification results: performance of different classifiers trained with CIDDS-001 - simulated traffic
Train/Test simulated | Train simulated, test external
Classifier | F1 score | AUC | F1 score | AUC

It is clear from Table V that most classifiers achieve high values of both F1 score and AUC over the test set containing simulated traffic. For instance, DT, Adaboost, and RF achieved an F1 score and AUC above 99%. EFC also achieved good results, with an F1 score of 0.957.
2) CICIDS17:
Two different experiments were performed to evaluate the performance of the classifiers over CICIDS17. In each experiment, two test sets were constructed: one comprising traffic of one specific day, and the other comprising traffic of all other days except the chosen one (see Supplementary Information: Appendix B). Days in which there were not enough samples to compose a test set were left out.
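The per-day split just described can be sketched as follows; `flows_by_day` is a hypothetical mapping from day name to a list of flow records, not a structure from the paper.

```python
# Build the two test sets for one experiment: one from the chosen
# day, the other from all remaining days combined.
def make_test_sets(flows_by_day, chosen_day, min_samples=1):
    same_day = flows_by_day[chosen_day]
    other_days = [f for day, flows in flows_by_day.items()
                  if day != chosen_day for f in flows]
    # Days without enough samples would be skipped entirely.
    if len(same_day) < min_samples or len(other_days) < min_samples:
        return None
    return same_day, other_days
```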
Table VI. Classification results: performance of different classifiers trained with CICIDS17 - Friday working hours
Train/Test same day | Train/Test different days
Classifier | F1 score | AUC | F1 score | AUC

Table VI shows the results obtained when the training was performed on Friday. When tested on data from the same day, DT, Adaboost, and RF obtained the best results, with both F1 score and AUC above 99%. EFC also performed well, achieving an F1 score of 0.952.

Table VII. Classification results: performance of different classifiers trained with CICIDS17 - Wednesday working hours
Train/Test same day | Train/Test different days
Classifier | F1 score | AUC | F1 score | AUC

Despite the favorable results obtained in the first experiment, Table VII shows that Adaboost and RF outperformed EFC when trained on Wednesday. Adaboost and RF achieved better results both when training and testing on the same day and when training and testing on different days. Such results might be due to the greater diversity of attack types present on Wednesday compared to Friday, which gives more information to the ML algorithms. However, even if it was not the best classifier, EFC performed well in both tests, achieving over 99% AUC in the first test and over 95% AUC in the second. It is also worth mentioning that, unlike the other ML classifiers, EFC does not require malicious samples to achieve these results. Therefore, these results are aligned with the others presented so far, indicating that EFC adapts easily to different domains with less training information, i.e., without malicious samples.
3) CICDDoS19:
Three separate experiments were performed to evaluate the performance of different classifiers over the CICDDoS19 dataset. In each experiment, two test sets were constructed: one comprising traffic of one specific day, and the other comprising traffic of all other days except the chosen one (see Supplementary Information: Appendix C). Days in which there were not enough samples to compose a test set were left out.
Table VIII. Classification results: performance of different classifiers trained with CICDDoS19 - DDoS NTP
Train/Test same day | Train/Test different days
Classifier | F1 score | AUC | F1 score | AUC

Table VIII shows the classification results obtained when training with only attacks of the type DDoS NTP. With training and testing in the same context, once more DT, Adaboost, and RF outperform the other classifiers, achieving over 99% on both the F1 score and AUC. EFC achieved an F1 score of 0.968.

Table IX. Classification results: performance of different classifiers trained with CICDDoS19 - Syn

Train/Test same day | Train/Test different days
Classifier | F1 score | AUC | F1 score | AUC

In contrast, when trained with only the Syn (Table IX) or TFTP (Table X) attack types, EFC does not outperform the other classifiers. This is expected in this dataset, since different types of volumetric DoS attacks share similar characteristics, making domain adaptation easier for classical ML algorithms. In the experiments presented, Adaboost and RF were the classifiers that obtained the best results, both when training and testing in the same context and when training in one context and testing in another. It is worth mentioning that, despite not being the best algorithm, EFC achieved very good results in all tests performed, using only half of the information in the training phase.
4) Average results:
Finally, Table XI shows the average performance of each classifier, considering all tests performed with all datasets. It is possible to observe that RF outperforms the other classifiers when trained and tested in the same domain. However, for the case where the classifiers are trained in one
Table X. Classification results: performance of different classifiers trained with CICDDoS19 - TFTP
Train/Test same day | Train/Test different days
Classifier | F1 score | AUC | F1 score | AUC

domain and tested in another, EFC outperforms the classical ML-based classifiers, achieving the best average F1 score and AUC.

Table XI. Classification results: average performance of different classifiers

Train/Test same domain | Train/Test different domains
Classifier | F1 score | AUC | F1 score | AUC

Taken as a whole, the results presented in this subsection show that, on average, EFC is better at adapting to other domains than classical ML-based classifiers (see Table XI). In addition, EFC achieves AUC values similar to those of the best ML algorithms, showing that it is capable of performing well even when trained with only half of the information that the other classifiers use. Not using malicious samples in the training phase is likely the reason why EFC adapts so well to other domains. On the other hand, this feature might also contribute to lower performance in specific scenarios, e.g., when malicious traffic shares many characteristics with benign traffic. Moreover, EFC's increased capability for domain adaptation when there is a significant change in data distribution is a highly desirable trait in network flow-based classifiers, since changes in traffic composition are expected to be very frequent and new kinds of attacks are generated continuously. In the next section, we present our conclusions and future work directions.

VI. CONCLUSION
In this work, we presented a new flow-based classifier for network intrusion detection called the Energy-based Flow Classifier (EFC). In EFC's training phase, a statistical model is inferred based solely on benign traffic samples. Afterward, this statistical model is used to classify network flows as benign or malicious based on "energy" values. Our results show that EFC is capable of correctly performing network flow binary classification on three different datasets. The F1 score (96% at best) and AUC (99% at best) values obtained using EFC are comparable to the values obtained with classical ML-based classifiers, such as Random Forest, K-Nearest Neighbors, and Artificial Neural Networks, even though EFC uses only half of the information in the training phase compared to the other algorithms.

In addition, we analyzed the different classifiers' capabilities for domain adaptation and observed that EFC is better suited to it than classical ML-based algorithms. In three out of the six experiments performed to evaluate this over different datasets, EFC outperformed the other classifiers. In the cases in which EFC was outperformed, the adaptation was not difficult for most of the algorithms, meaning that the data distributions were not that different across domains. We understand that EFC's capability for domain adaptation is probably linked to the fact that the model inference performed in the training phase is based only on benign samples, which prevents overfitting.

Considering the advantages presented, we believe EFC to be a promising algorithm to perform flow-based traffic classification. Nevertheless, despite the promising results achieved, there is still room for further testing and improvement. For instance, to obtain a more homogeneous distribution of different attack types, we performed dataset undersampling, which might have had some effect on the results.
Hence, in future work, we aim at performing a more comprehensive investigation of EFC's applicability to real-world data and different contexts, such as fraud analysis in bank data.

ACKNOWLEDGMENT

The authors would like to thank Luís Paulo Faina Garcia for helping with dataset analysis. Matt Bishop was supported by the National Science Foundation under Grant Number OAC-1739025. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. João Gondim gratefully acknowledges the support from the project "EAGER: USBRCCR: Collaborative: Securing Networks in the Programmable Data Plane Era", funded by NSF (National Science Foundation) and RNP (Brazilian National Research Network).

REFERENCES
[2] Securing the Internet of Things: Concepts, Methodologies, Tools, and Applications. IGI Global, 2020, pp. 481–497.
[3] M. Ring, S. Wunderlich, D. Scheuring, D. Landes, and A. Hotho, "A survey of network-based intrusion detection data sets," Computers & Security, 2019.
[4] A. Sperotto, G. Schaffrath, R. Sadre, C. Morariu, A. Pras, and B. Stiller, "An overview of IP flow-based intrusion detection," IEEE Communications Surveys and Tutorials, vol. 12, no. 3, pp. 343–356, 2010. [Online]. Available: http://ieeexplore.ieee.org/document/5455789/
[5] M. F. Umer, M. Sher, and Y. Bi, "Flow-based intrusion detection: Techniques and challenges," Computers and Security, vol. 70, pp. 238–254, Sep. 2017. [Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/S0167404817301165
[6] K. Bartos, M. Sofka, and V. Franc, "Optimized invariant representation of network traffic for detecting unseen malware variants," in USENIX Security Symposium (USENIX Security 16), 2016, pp. 807–822.
[7] H. Li, Z. Chen, R. Spolaor, Q. Yan, C. Zhao, and B. Yang, "Dart: Detecting unseen malware variants using adaptation regularization transfer learning," in ICC 2019 - 2019 IEEE International Conference on Communications (ICC). IEEE, 2019, pp. 1–6.
[8] C. Rudin, "Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead," Nature Machine Intelligence, vol. 1, no. 5, pp. 206–215, 2019.
[9] M. Ring, S. Wunderlich, D. Grüdl, D. Landes, and A. Hotho, "Flow-based benchmark data sets for intrusion detection," in Proceedings of the 16th European Conference on Cyber Warfare and Security. ACPI, 2017, pp. 361–369.
[10] I. Sharafaldin, A. H. Lashkari, and A. A. Ghorbani, "Toward generating a new intrusion detection dataset and intrusion traffic characterization," in ICISSP, 2018, pp. 108–116.
[11] I. Sharafaldin, A. H. Lashkari, S. Hakak, and A. A. Ghorbani, "Developing realistic distributed denial of service (ddos) attack dataset and taxonomy," in . IEEE, 2019, pp. 1–8.
[12] R. Vinayakumar, K. Soman, and P. Poornachandran, "Evaluating effectiveness of shallow and deep networks to intrusion detection system," in . IEEE, 2017, pp. 1282–1289.
[13] S. Khan, E. Sivaraman, and P. B. Honnavalli, "Performance evaluation of advanced machine learning algorithms for network intrusion detection system," in Proceedings of International Conference on IoT Inclusive Life (ICIIL 2019), NITTTR Chandigarh, India. Springer, 2020, pp. 51–59.
[14] A. M. Mahfouz, D. Venugopal, and S. G. Shiva, "Comparative analysis of ml classifiers for network intrusion detection," in Fourth International Congress on Information and Communication Technology. Springer, 2020, pp. 193–207.
[15] X. Tan, S. Su, Z. Huang, X. Guo, Z. Zuo, X. Sun, and L. Li, "Wireless sensor networks intrusion detection based on smote and the random forest algorithm," Sensors, vol. 19, no. 1, p. 203, 2019.
[16] J. Kazemitabar, R. Taheri, and G. Kheradmandian, "A novel technique for improvement of intrusion detection via combining random forest and genetic algorithm," 2019.
[17] T. T. Bhavani, M. K. Rao, and A. M. Reddy, "Network intrusion detection system using random forest and decision tree machine learning techniques," in First International Conference on Sustainable Technologies for Computational Intelligence. Springer, 2020, pp. 637–643.
[18] N. Moustafa, B. Turnbull, and K.-K. R. Choo, "An ensemble intrusion detection technique based on proposed statistical flow features for protecting network traffic of internet of things," IEEE Internet of Things Journal, 2018.
[19] B. A. Tama and K.-H. Rhee, "Attack classification analysis of iot network via deep learning approach," Research Briefs on Information & Communication Technology Evolution (ReBICTE), vol. 3, pp. 1–9, 2017.
[20] M. Idhammad, K. Afdel, and M. Belouch, "Distributed intrusion detection system for cloud environments based on data mining techniques," Procedia Computer Science, vol. 127, pp. 35–41, 2018.
[21] ——, "Detection system of http ddos attacks in a cloud environment based on information theoretic entropy and random forest," Security and Communication Networks, vol. 2018, 2018.
[22] A. Verma and V. Ranga, "Statistical analysis of cidds-001 dataset for network intrusion detection systems using distance-based machine learning," Procedia Computer Science, vol. 125, pp. 709–716, 2018.
[23] M. Ring, D. Landes, and A. Hotho, "Detection of slow port scans in flow-based network traffic," PLoS ONE, vol. 13, no. 9, p. e0204507, 2018.
[24] R. Abdulhammed, M. Faezipour, A. Abuzneid, and A. AbuMallouh, "Deep and machine learning approaches for anomaly-based intrusion detection of imbalanced network traffic," IEEE Sensors Letters, vol. 3, no. 1, pp. 1–4, Jan. 2019. [Online]. Available: https://ieeexplore.ieee.org/document/8526292/
[25] A. Yulianto, P. Sukarno, and N. A. Suwastika, "Improving adaboost-based intrusion detection system (ids) performance on cic ids 2017 dataset," in Journal of Physics: Conference Series, vol. 1192, no. 1. IOP Publishing, 2019, p. 012018.
[26] D. Aksu, S. Üstebay, M. A. Aydin, and T. Atmaca, "Intrusion detection with comparative analysis of supervised learning techniques and fisher score feature selection algorithm," in International Symposium on Computer and Information Sciences. Springer, 2018, pp. 141–149.
[27] J. Li, M. Liu, Z. Xue, X. Fan, and X. He, "Rtvd: A real-time volumetric detection scheme for ddos in the internet of things," IEEE Access, vol. 8, pp. 36191–36201, 2020.
[28] Y. Jia, F. Zhong, A. Alrawais, B. Gong, and X. Cheng, "Flowguard: An intelligent edge defense mechanism against iot ddos attacks," IEEE Internet of Things Journal, 2020.
[29] M. P. Novaes, L. F. Carvalho, J. Lloret, and M. L. Proença, "Long short-term memory and fuzzy logic for anomaly detection and mitigation in software-defined network environment," IEEE Access, vol. 8, pp. 83765–83781, 2020.
[30] D. Plonka, "Flowscan: A network traffic flow reporting and visualization tool," in LISA, 2000, pp. 305–317.
[31] A. H. Lashkari, G. Draper-Gil, M. S. I. Mamun, and A. A. Ghorbani, "Characterization of tor traffic using time based features," in ICISSP, 2017, pp. 253–262.
[32] S. Cocco, C. Feinauer, M. Figliuzzi, R. Monasson, and M. Weigt, "Inverse statistical physics of protein sequences: A key issues review," Reports on Progress in Physics, vol. 81, no. 3, p. 032601, Mar. 2018. [Online]. Available: http://stacks.iop.org/0034-4885/81/i=3/a=032601?key=crossref.353cf55f4345afafde1886d057be92bd
[33] F. Morcos, A. Pagnani, B. Lunt, A. Bertolino, D. S. Marks, C. Sander, R. Zecchina, J. N. Onuchic, T. Hwa, and M. Weigt, "Direct-coupling analysis of residue coevolution captures native contacts across many protein families," Proceedings of the National Academy of Sciences of the United States of America
[34] Reviews of Modern Physics, vol. 54, no. 1, pp. 235–268, Jan. 1982. [Online]. Available: https://link.aps.org/doi/10.1103/RevModPhys.54.235
[35] E. T. Jaynes, "Information theory and statistical mechanics. II," Physical Review, vol. 108, no. 2, pp. 171–190, 1957. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRev.106.620
[36] B. Giraud, J. M. Heumann, and A. S. Lapedes, "Superadditive correlation," Physical Review E, vol. 59, no. 5, p. 4983, 1999.
[37] A. Georges and J. S. Yedidia, "How to expand around mean-field theory using high-temperature expansions," Journal of Physics A: Mathematical and General, vol. 24, no. 9, p. 2173, 1991.
[38] J. A. Hartigan and M. A. Wong, "Algorithm AS 136: A k-means clustering algorithm," Journal of the Royal Statistical Society, Series C (Applied Statistics), vol. 28, no. 1, pp. 100–108, 1979.
[39] J. R. Quinlan, "Simplifying decision trees," International Journal of Man-Machine Studies, vol. 27, no. 3, pp. 221–234, 1987.
[40] P. H. Swain and H. Hauska, "The decision tree classifier: Design and potential," IEEE Transactions on Geoscience Electronics, vol. 15, no. 3, pp. 142–147, 1977.
[41] Y. Freund, R. E. Schapire et al., "Experiments with a new boosting algorithm," in ICML, vol. 96. Citeseer, 1996, pp. 148–156.
[42] T. K. Ho, "Random decision forests," in Proceedings of 3rd International Conference on Document Analysis and Recognition, vol. 1. IEEE, 1995, pp. 278–282.
[43] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115–133, 1943.
[44] D. D. Lewis, "Naive (Bayes) at forty: The independence assumption in information retrieval," in European Conference on Machine Learning. Springer, 1998, pp. 4–15.
[45] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
[46] N. Japkowicz and M. Shah, Evaluating Learning Algorithms: A Classification Perspective. Cambridge University Press, 2011.
[47] D. Brzezinski and J. Stefanowski, "Prequential AUC: properties of the area under the ROC curve for data streams with concept drift," Knowledge and Information Systems, vol. 52, no. 2, pp. 531–562, 2017.
[48] S. Wu, P. Flach, and C. Ferri, "An improved model selection heuristic for AUC," in European Conference on Machine Learning. Springer, 2007, pp. 478–489.
Camila F. T. Pontes is a student at the University of Brasilia (UnB), Brasilia, DF, Brazil. She received her M.Sc. degree in Molecular Biology in 2016 from UnB and is currently an undergraduate student at the Department of Computer Science (CIC/UnB). Her research interests are Computational and Theoretical Biology and Network Security.

Manuela M. C. de Souza is an undergraduate Computer Science student at the University of Brasilia (UnB), Brasilia, DF, Brazil. Her research interest is Network Security.

João J. C. Gondim was awarded an M.Sc. in Computing Science at Imperial College, University of London, in 1987 and a Ph.D. in Electrical Engineering at UnB (University of Brasilia, 2017). He is an adjunct professor at the Department of Computing Science (CIC) at UnB, where he is a tenured member of faculty. His research interests are network, information, and cyber security.

Matt Bishop received his Ph.D. in computer science from Purdue University, where he specialized in computer security, in 1984. His main research area is the analysis of vulnerabilities in computer systems. The second edition of his textbook, Computer Security: Art and Science, was published in 2018 by Addison-Wesley Professional. He is currently a co-director of the Computer Security Laboratory at the University of California, Davis.