Traversing Virtual Network Functions from the Edge to the Core: An End-to-End Performance Analysis
Emmanouil Fountoulakis, Qi Liao, Manuel Stein, Nikolaos Pappas
∗Department of Science and Technology, Linköping University, Sweden
†Nokia Bell Labs, Stuttgart, Germany
E-mails: {emmanouil.fountoulakis, nikolaos.pappas}@liu.se, {qi.liao, manuel.stein}@nokia-bell-labs.com

Abstract—Future mobile networks supporting the Internet of Things are expected to provide both high throughput and low latency to user-specific services. One way to overcome this challenge is to adopt network function virtualization and Multi-access Edge Computing (MEC). In this paper, we analyze an end-to-end communications system that consists of both MEC servers and a server at the core network hosting different types of virtual network functions. We develop a queueing model for the performance analysis of the system, consisting of both processing and transmission flows. We provide analytical approximations of performance metrics such as the system drop rate and the average number of tasks in the system. Simulation results show that our approximations perform quite well. By evaluating the system under different scenarios, we provide insights for decision making on traffic flow control and its impact on critical performance metrics.
I. INTRODUCTION
In future communications systems, mission-critical mobile applications, e.g., augmented reality, connected vehicles, and eHealth, will provide services that require ultra-low latency [1], [2]. To satisfy the low latency requirements, Multi-access Edge Computing (MEC) has been proposed as a key solution [1]. The idea of MEC is to locate more computational resources closer to the users, e.g., at the base stations. Besides latency constraints, these services may have strict function chaining requirements. In other words, each service has to be processed by a set of network functions (e.g., firewalls, transcoders, load balancers) in a specific order [3]. Furthermore, the requirements of 5G networks for flexibility and elasticity inspire the idea of Network Function Virtualization (NFV) [3], [4]. The idea of NFV is to decouple the network functions from dedicated hardware equipment: instead, general-purpose servers can host one or more types of network functions. However, the computational capabilities and the available resources of MEC servers are still limited compared to the high-end servers in the cloud. Therefore, it is interesting to further investigate the cooperation between the edge and the core, as well as the cooperation among MEC servers.

Recently, the Virtual Network Function (VNF) placement and resource allocation problem has attracted a lot of attention, e.g., [5], [6]. In these works, the authors formulate the VNF placement problem as a mixed integer linear program under the assumption of known traffic demand. For a dynamic environment with unknown traffic, the authors in [7] develop dynamic algorithms that control the flow by applying Lyapunov optimization theory. There are few works on analyzing such networks and deriving key performance metrics such as delay. The authors in [8] analyze the end-to-end delay for embedded VNF chains. They consider two types of services that traverse different VNF chains and provide the delay analysis for each chain. However, this work considers a specific system model where multiple VNF chains are embedded on a common predetermined network path, while routing and flow control are not considered. Furthermore, the authors in [9] and [10] estimate the end-to-end delay in a Software Defined Network (SDN) environment by using local node measurements for the single-flow and multi-flow cases, respectively.

In this paper, we model and analyze an end-to-end communications system, consisting of two MEC servers at the edge network and one server at the core network hosting different types of VNFs, by applying tools from queueing theory. In order to simplify the analysis, we introduce the approach of decomposing the system into subsystems, which can be further applied to the analysis of scaled-up systems with an arbitrary number of servers. We provide analytical expressions for key performance metrics, such as the average number of tasks in the system and the system drop rate, for each subsystem. Simulation results validate our analysis and show that our analytical model is accurate. Furthermore, by evaluating the system under different scenarios, we provide insights into how the routing decision affects the key performance metrics of interest.

This work has been supported by the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 643002.

II. SYSTEM MODEL
We consider an end-to-end communications system consisting of a mobile device, two MEC servers, and one server located in the core network, as depicted in Fig. 1. A task traverses a service chain of two consecutive VNFs: VNF 1 and VNF 2. In this system, an MEC server, called Server 1, is co-located with the base station and hosts one copy of VNF 1 as the primary MEC server. A secondary MEC server, called Server 2, is located nearby and also hosts a copy of VNF 1. In addition, Server 3 in the core network hosts VNF 2 and has more advanced computational capabilities than Servers 1 and 2. We assume a slotted time system. At each time slot, the device transmits a task, in the form of a packet, to the base station over a wireless channel.

Fig. 1: The system model. The blue dashed lines group the queues located in the same server. [The figure shows the flow controller at the base station, MEC Servers 1 and 2, each with a CPU (processing) queue for VNF 1 and a transmission (Tx) queue, and Server 3 hosting VNF 2, connected by wireless and wired links.]

Because of the presence of fading in the wireless channel, transmissions may face errors. A task is successfully transmitted to the base station with a probability $p$ that captures fading, attenuation, noise, etc. The device attempts a new task transmission only if the previous task has been successfully received at the base station. The received tasks need to be distributed between the queue for local processing and the queue for transmission to the secondary MEC server. Thus, there are two possible routes through the service chain. A flow controller at the base station randomly decides the routing for each task.¹ With probability $\alpha$ the task is processed by Server 1 first and then forwarded to Server 3. With probability $1-\alpha$ the task is forwarded to Server 2, to be processed by VNF 1, and then forwarded to Server 3 to be processed by VNF 2.

Each task that arrives at a server first waits in a queue to be processed by a VNF. Then, after the processing, it is stored in a transmission queue, waiting to be forwarded to and processed by the next VNF. Let $Q_i$ denote the $i$-th queue, where $i \in \mathcal{K}$ and $\mathcal{K}$ is the set of the queues in the system. Note that the queues follow an early departure-late arrival model: at the beginning of the slot the departure takes place, and a new arrival can enter the queue at the end of the slot. The queues for task transmission are $Q_2$, $Q_3$, and $Q_5$, and the queues for task processing are $Q_1$, $Q_4$, and $Q_6$. The arrival rates for queues $Q_1$ and $Q_2$ are $p\alpha$ and $p(1-\alpha)$, respectively. We denote by $\mu_i$, $i \in \mathcal{K}$, the service rates of the queues, and we assume that the service times are geometrically distributed. Furthermore, given that $Q_1$, $Q_2$, and $Q_4$ are non-empty, the arrival rates of $Q_3$, $Q_4$, and $Q_5$ are equal to the service rates of $Q_1$, $Q_2$, and $Q_4$ (i.e., $\mu_1$, $\mu_2$, and $\mu_4$), respectively.

Furthermore, the queues at Servers 1 and 2 are assumed to have finite buffers. Let $M_i$ denote the buffer size of each queue $i \in \mathcal{K} \setminus \{6\}$.
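As a concrete illustration of the slotted dynamics, the model can be mimicked by a short Monte Carlo sketch. This is a simplified illustration, not the simulator used in the paper; it assumes the queue numbering above (Q1/Q3 for Server 1's processing and transmission queues, Q2/Q4/Q5 for the path through Server 2, Q6 at Server 3) and that a task arriving at a full finite queue with no simultaneous departure is dropped, as specified next.

```python
import random

def simulate(p, alpha, mu, M, slots=100_000, seed=1):
    """Slot-level Monte Carlo sketch of the six-queue model.

    p: wireless success probability; alpha: routing probability;
    mu: service rates {1..6}; M: buffer sizes {1..5} (Q6 is unbounded).
    Early departure-late arrival: departures are drawn at the start of
    the slot, arrivals join (or are dropped) at the end of the slot.
    Returns the fraction of slots in which a task is dropped.
    """
    random.seed(seed)
    q = {i: 0 for i in range(1, 7)}
    dropped = 0
    for _ in range(slots):
        # Departures at the beginning of the slot.
        dep = {i: q[i] > 0 and random.random() < mu[i] for i in q}
        arrivals = {i: 0 for i in q}
        # Successful wireless transmission, then probabilistic routing.
        if random.random() < p:
            arrivals[1 if random.random() < alpha else 2] += 1
        # Internal flows: Q1->Q3, Q2->Q4, Q4->Q5, Q3->Q6, Q5->Q6.
        for src, dst in [(1, 3), (2, 4), (4, 5), (3, 6), (5, 6)]:
            if dep[src]:
                q[src] -= 1
                arrivals[dst] += 1
        if dep[6]:          # a task finishes service at VNF 2 and leaves
            q[6] -= 1
        # Late arrivals: join the queue, or are dropped if it is full.
        for i, a in arrivals.items():
            for _ in range(a):
                if i == 6 or q[i] < M[i]:
                    q[i] += 1
                else:
                    dropped += 1
    return dropped / slots
```

For instance, with perfectly reliable links and unit service rates ($p = 1$, $\alpha = 1$, $\mu_i = 1$) no task is ever dropped, while lowering the service rates yields a positive drop rate.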
If a queue is full and no task departs at the same time that a new one arrives, the new task is dropped and removed from the system. However, the queue of Server 3 (where $Q_6$ is located) is assumed to have a buffer of infinite length. In practice, the buffer in the core network has a limited size, which is usually quite large; our analysis based on the infinite buffer size assumption can capture this scenario with minor modifications. Moreover, by performing the analysis under the infinite-buffer assumption, we can extract insights on how to choose an appropriate queue size.

¹Probabilistic routing is a common strategy widely used in the literature; see, for example, [11].

III. PERFORMANCE ANALYSIS
In this section, we perform the modeling and the performance analysis that allow us to derive the critical performance metrics. We model the considered queueing system as a Discrete Time Markov Chain (DTMC). Modeling the whole system as one Markov chain leads to a quite complicated system that is difficult to analyze in terms of closed-form expressions. Thus, in order to simplify the analysis, we decompose the system into subsystems. We consider the following four subsystems: 1) $Q_1$ and $Q_3$, 2) $Q_2$ and $Q_4$, 3) $Q_5$, and 4) $Q_6$. The performance metrics of the whole system are then approximated by the analytical expressions derived for the subsystems.

A. Subsystems 1 and 2: Two queues in tandem
The two queues in tandem, $Q_1$ and $Q_3$, are considered as a subsystem. The arrival rate for $Q_1$ is $\lambda_1 = p\alpha$. The Markov chain $\{(X_n, Y_n)\}$ is described by $P_{i,j:u,k} = \Pr\{X_{n+1} = i, Y_{n+1} = j \mid X_n = u, Y_n = k\}$, where $X_n$ and $Y_n$ denote the states (in terms of queue length) of $Q_1$ and $Q_3$ at the $n$-th time slot, respectively, and $i$ and $j$ are referred to as the level and the phase, respectively. The Markov chain is a Quasi-Birth-and-Death (QBD) DTMC [12]. Since a QBD moves at most one level up or down per transition, the transition matrix has the block partitioned form

$$P = \begin{bmatrix} B & C & & & \\ E & A_1 & A_0 & & \\ & A_2 & A_1 & A_0 & \\ & & \ddots & \ddots & \ddots \\ & & & A_2 & A_1 + A_0 \end{bmatrix}.$$

For the sake of simplicity, given the probability of an event, denoted by $p$, we denote the probability of its complementary event by $\bar{p} \triangleq 1 - p$. To derive the block matrices $B$, $C$, $E$, and $A_i$, for $i = 0, 1, 2$, we first define the matrices

$$P_1^{(1)} = \begin{bmatrix} 1 & & & \\ \mu_3 & \bar{\mu}_3 & & \\ & \ddots & \ddots & \\ & & \mu_3 & \bar{\mu}_3 \end{bmatrix}, \qquad P_1^{(2)} = \begin{bmatrix} \mu_3 & \bar{\mu}_3 & & \\ & \mu_3 & \bar{\mu}_3 & \\ & & \ddots & \ddots \\ & & & 1 \end{bmatrix}.$$

Then, the block matrices of the transition matrix are calculated as $B = \bar{\lambda}_1 P_1^{(1)}$, $C = \lambda_1 P_1^{(1)}$, $E = \bar{\lambda}_1 \mu_1 P_1^{(2)}$, $A_0 = \lambda_1 \bar{\mu}_1 P_1^{(1)}$, $A_1 = \bar{\lambda}_1 \bar{\mu}_1 P_1^{(1)} + \lambda_1 \mu_1 P_1^{(2)}$, and $A_2 = \bar{\lambda}_1 \mu_1 P_1^{(2)}$. Following the steps described above and utilizing the properties of a QBD DTMC, we can construct the transition matrix of Subsystem 1 for arbitrary finite buffer sizes.

Fig. 2: The Markov chain for $Q_6$.

Our goal is to derive the steady-state distribution of the Markov chain defined above. We can apply direct methods to find the steady-state distribution [12, Chapter 4]; there are several efficient algorithms for this purpose, e.g., the logarithmic reduction method. We denote the steady-state distribution of Subsystem 1 by the row vector $\pi^{(1)} = \left[\pi^{(1)}_{0,0}, \pi^{(1)}_{0,1}, \ldots, \pi^{(1)}_{0,M_3}, \pi^{(1)}_{1,0}, \ldots, \pi^{(1)}_{M_1,M_3}\right]$. We find $\pi^{(1)}$ by solving the linear system of equations $\pi^{(1)} P = \pi^{(1)}$, $\pi^{(1)}\mathbf{1} = 1$, where $\mathbf{1}$ denotes the column vector of ones.
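This construction can be checked numerically. The sketch below is an illustrative reimplementation, not the authors' code: it enumerates the states $(i, j)$ of the tandem directly instead of assembling the block matrices, assumes an arriving task is served no earlier than the next slot and that an overflow task is dropped (boundary conventions at empty or full queues may differ slightly from the matrices above), and recovers the stationary vector as the left eigenvector of $P$ for eigenvalue 1, which is equivalent to solving $\pi P = \pi$, $\pi \mathbf{1} = 1$.

```python
import itertools
import numpy as np

def tandem_matrix(lam, mu1, mu3, M1, M3):
    """Transition matrix of the (Q1, Q3) tandem over states (i, j)."""
    states = list(itertools.product(range(M1 + 1), range(M3 + 1)))
    idx = {s: k for k, s in enumerate(states)}
    P = np.zeros((len(states), len(states)))
    for i, j in states:
        # a: external arrival to Q1; d1: Q1 departure (feeds Q3);
        # d3: Q3 departure (leaves the subsystem).
        for a, pa in ((1, lam), (0, 1 - lam)):
            for d1, p1 in (((1, mu1), (0, 1 - mu1)) if i else ((0, 1.0),)):
                for d3, p3 in (((1, mu3), (0, 1 - mu3)) if j else ((0, 1.0),)):
                    ni = min(i - d1 + a, M1)   # overflow at Q1 is dropped
                    nj = min(j - d3 + d1, M3)  # overflow at Q3 is dropped
                    P[idx[i, j], idx[ni, nj]] += pa * p1 * p3
    return states, P

def stationary(P):
    """Left eigenvector of P for eigenvalue 1, normalised to sum to 1."""
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1))])
    return pi / pi.sum()
```

The arrival rate of the next queue then follows directly, e.g., $\lambda_3 = \Pr\{Q_1 > 0\}\mu_1$ is `mu1 * sum(pi[k] for k, (i, j) in enumerate(states) if i > 0)`.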
Hereafter we use $\pi^{(n)}$ to denote the steady-state distribution vector of the $n$-th subsystem, for $n = 1, 2, 3, 4$.

Furthermore, the arrival rate of $Q_3$ depends on the service rate of $Q_1$: an arrival to $Q_3$ occurs with probability $\mu_1$ if and only if $Q_1$ is non-empty. Therefore, the arrival rate of $Q_3$ is $\lambda_3 = \Pr\{Q_1 > 0\}\mu_1 = \left(\sum_{j=0}^{M_3}\sum_{i=1}^{M_1} \pi^{(1)}_{i,j}\right)\mu_1$. Similarly, we can construct the transition matrix and the steady-state distribution $\pi^{(2)}$ of the second subsystem, consisting of $Q_2$ and $Q_4$. The arrival rates of $Q_2$ and $Q_4$ are $\lambda_2 = p(1-\alpha)$ and $\lambda_4 = \Pr\{Q_2 > 0\}\mu_2 = \left(\sum_{j=0}^{M_4}\sum_{i=1}^{M_2} \pi^{(2)}_{i,j}\right)\mu_2$, respectively.

B. Subsystem 3: $Q_5$ with finite buffer

We consider $Q_5$ as an independent subsystem with buffer size $M_5$. We first define the arrival rate of $Q_5$: $\lambda_5 = \Pr\{Q_4 > 0\}\mu_4 = \left(\sum_{i=0}^{M_2}\sum_{j=1}^{M_4} \pi^{(2)}_{i,j}\right)\mu_4$. We model the subsystem as a Markov chain whose transition matrix is

$$P = \begin{bmatrix} \bar{\lambda}_5 & \lambda_5 & & & \\ \bar{\lambda}_5\mu_5 & \lambda_5\mu_5 + \bar{\lambda}_5\bar{\mu}_5 & \lambda_5\bar{\mu}_5 & & \\ & \ddots & \ddots & \ddots & \\ & & \bar{\lambda}_5\mu_5 & \lambda_5\mu_5 + \bar{\lambda}_5\bar{\mu}_5 & \lambda_5\bar{\mu}_5 \\ & & & \bar{\lambda}_5\mu_5 & \bar{\lambda}_5\bar{\mu}_5 + \lambda_5 \end{bmatrix}.$$

We denote the steady-state distribution of Subsystem 3 by $\pi^{(3)} = \left[\pi^{(3)}_0, \pi^{(3)}_1, \ldots, \pi^{(3)}_{M_5}\right]$. To derive $\pi^{(3)}$, we solve the linear system of equations $\pi^{(3)} P = \pi^{(3)}$, $\pi^{(3)}\mathbf{1} = 1$. Using the balance equations, we obtain

$$\pi^{(3)}_i = \frac{\lambda_5^i\,\bar{\mu}_5^{\,i-1}}{\bar{\lambda}_5^i\,\mu_5^i}\,\pi^{(3)}_0, \quad 1 \le i \le M_5, \qquad \pi^{(3)}_0 = \left[1 + \sum_{i=1}^{M_5} \frac{\lambda_5^i\,\bar{\mu}_5^{\,i-1}}{\bar{\lambda}_5^i\,\mu_5^i}\right]^{-1}.$$

C. Subsystem 4: $Q_6$ with infinite buffer size

The arrival rate of $Q_6$ depends on the service rates of $Q_3$ and $Q_5$ and on the probability that these queues are non-empty. Note that the departures from $Q_3$ and $Q_5$ can be considered independent stochastic processes. The arrival rates due to $Q_3$ and $Q_5$ are $\lambda_{6,1} = \Pr\{Q_3 > 0\}\mu_3$ and $\lambda_{6,2} = \Pr\{Q_5 > 0\}\mu_5$, respectively. The arrival rate of $Q_6$ is $\lambda_6 = \lambda_{6,1} + \lambda_{6,2}$. We model the subsystem as the Markov chain shown in Fig.
2, where the transition probabilities out of state $0$ are $p_0 = \bar{\lambda}_{6,1}\bar{\lambda}_{6,2}$, $p_1 = \lambda_{6,1}\bar{\lambda}_{6,2} + \bar{\lambda}_{6,1}\lambda_{6,2}$, and $p_2 = \lambda_{6,1}\lambda_{6,2}$, and those out of the states $i \ge 1$ are
$$p'_0 = \bar{\lambda}_{6,1}\bar{\lambda}_{6,2}\bar{\mu}_6 + \bar{\lambda}_{6,1}\lambda_{6,2}\mu_6 + \lambda_{6,1}\bar{\lambda}_{6,2}\mu_6, \qquad p'_1 = \lambda_{6,1}\bar{\lambda}_{6,2}\bar{\mu}_6 + \bar{\lambda}_{6,1}\lambda_{6,2}\bar{\mu}_6 + \lambda_{6,1}\lambda_{6,2}\mu_6,$$
$$p'_2 = \lambda_{6,1}\lambda_{6,2}\bar{\mu}_6, \qquad p_{-1} = \bar{\lambda}_{6,1}\bar{\lambda}_{6,2}\mu_6.$$

The transition matrix that describes this Markov chain is

$$P = \begin{bmatrix} a_0 & b_0 & & & \\ a_1 & b_1 & b_0 & & \\ a_2 & b_2 & b_1 & b_0 & \\ & b_3 & b_2 & b_1 & \ddots \\ & & b_3 & b_2 & \ddots \\ & & & \ddots & \ddots \end{bmatrix},$$

where $a_0 = p_0$, $a_1 = p_1$, $a_2 = p_2$, $b_0 = p_{-1}$, $b_1 = p'_0$, $b_2 = p'_1$, and $b_3 = p'_2$; the transition matrix is a lower Hessenberg matrix. We denote the steady-state distribution of Subsystem 4 by $\pi^{(4)} = \left[\pi^{(4)}_0, \pi^{(4)}_1, \ldots\right]$. The general expression for the equilibrium equations of the states is given by the $i$-th term of
$$\pi^{(4)}_i = a_i \pi^{(4)}_0 + \sum_{j=1}^{i+1} b_{i-j+1}\, \pi^{(4)}_j,$$
with $a_i = 0$ for $i > 2$ and $b_k = 0$ for $k \notin \{0, 1, 2, 3\}$. Since the DTMC has an infinite state space, we apply the $z$-transform approach to solve the state equations. The $z$-transforms of the state transition probabilities $a_i$ and $b_i$ are $A(z) = \sum_{i=0}^{2} a_i z^{-i}$ and $B(z) = \sum_{i=0}^{3} b_i z^{-i}$, respectively. The $z$-transform of the steady-state distribution vector $\pi^{(4)}$ is
$$\Pi(z) = \sum_{i=0}^{\infty} \pi^{(4)}_i z^{-i} = \pi^{(4)}_0\, \frac{z^{-1}A(z) - B(z)}{z^{-1} - B(z)}.$$
The solution for $\pi^{(4)}_i$ is given by
$$\pi^{(4)}_0 = \frac{1 + B'(1)}{1 + B'(1) - A'(1)}, \qquad \pi^{(4)}_i = c_i + \sum_{j=1}^{m} r_j\, (p_j)^{i-1}, \quad i > 0,$$
where $r_j$, $p_j$, and $c_i$ are the residues, poles, and direct terms of the partial fraction expansion of $\Pi(z)$, respectively. Since $Q_6$ has an infinite buffer size, we need to characterize the conditions under which the queue is stable. Loynes' theorem states that if the arrival and service processes of a queue are strictly jointly stationary and the average arrival rate is less than the average service rate, then the queue is stable. Therefore, $Q_6$ is stable if and only if the following inequality holds: $\lambda_6 < \mu_6$.

D. Discussion on the analysis of scaled-up systems
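The decomposition idea can be sketched programmatically: each subsystem exports its departure rate, which becomes the arrival rate of the next subsystem. The sketch below is illustrative only (generic rates, a plain chain of finite queues rather than the paper's topology) and reuses the closed-form stationary distribution of a single finite queue derived above.

```python
import numpy as np

def chain_decomposition(lam0, mus, Ms):
    """Approximate a chain of finite discrete-time queues by analyzing
    them one at a time: the arrival rate of queue k+1 is the departure
    rate Pr{Q_k > 0} * mu_k of queue k (illustrative sketch)."""
    rates = [lam0]
    for mu, M in zip(mus, Ms):
        lam = rates[-1]
        lb, mb = 1 - lam, 1 - mu
        # Closed-form stationary distribution of one finite queue.
        w = np.array([1.0] + [lam**i * mb**(i - 1) / (lb**i * mu**i)
                              for i in range(1, M + 1)])
        pi = w / w.sum()
        rates.append((1 - pi[0]) * mu)   # departure rate of this queue
    return rates
```

Because part of the traffic is dropped at each finite buffer, the propagated rates are non-increasing along the chain.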
In this work, we analyze a simple end-to-end system that consists of three connected servers. Systems with an arbitrary number of servers can be analyzed in the same way: we decompose the system into subsystems, analyze each subsystem individually, and then combine the per-subsystem results to derive the analytical expressions for the whole system. A full version of this work can be found in [13].

Fig. 3: Objective: to minimize the system drop rate. (a) Optimal values of $\alpha$. (b) System drop rate. The remaining service rates and $p$ are fixed; $M_i = 10$ for $1 \le i \le 5$.

IV. KEY PERFORMANCE METRICS
In this section, we provide analytical expressions for the performance metrics of interest, i.e., the system drop rate and the average number of tasks in the system, utilizing the results of the previous section. The probabilities of dropping a task at a given time slot at $Q_1$–$Q_5$ are, respectively,
$$P_{D_1} = \lambda_1 \bar{\mu}_1 \sum_{j=0}^{M_3} \pi^{(1)}_{M_1,j}, \qquad P_{D_3} = \lambda_3 \bar{\mu}_3 \sum_{i=1}^{M_1} \pi^{(1)}_{i,M_3},$$
$$P_{D_2} = \lambda_2 \bar{\mu}_2 \sum_{j=0}^{M_4} \pi^{(2)}_{M_2,j}, \qquad P_{D_4} = \lambda_4 \bar{\mu}_4 \sum_{i=1}^{M_2} \pi^{(2)}_{i,M_4},$$
$$P_{D_5} = \lambda_5 \bar{\mu}_5\, \pi^{(3)}_{M_5},$$
where $P_{D_i}$ is the probability of dropping a task at queue $i$. The average length of each queue is given by
$$\bar{Q}_1 = \sum_{i=0}^{M_1} \sum_{j=0}^{M_3} i\,\pi^{(1)}_{i,j}, \qquad \bar{Q}_3 = \sum_{j=0}^{M_3} \sum_{i=0}^{M_1} j\,\pi^{(1)}_{i,j},$$
$$\bar{Q}_2 = \sum_{i=0}^{M_2} \sum_{j=0}^{M_4} i\,\pi^{(2)}_{i,j}, \qquad \bar{Q}_4 = \sum_{j=0}^{M_4} \sum_{i=0}^{M_2} j\,\pi^{(2)}_{i,j},$$
$$\bar{Q}_5 = \sum_{i=0}^{M_5} i\,\pi^{(3)}_i, \qquad \bar{Q}_6 = \sum_{i=0}^{\infty} i\,\pi^{(4)}_i.$$
Therefore, the system drop rate and the average number of tasks in the system can be described as
$$P_D = \sum_{i \in \mathcal{K} \setminus \{6\}} P_{D_i} \qquad \text{and} \qquad \bar{Q} = \sum_{i \in \mathcal{K}} \bar{Q}_i,$$
respectively.

V. NUMERICAL RESULTS
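Before turning to the results, note that the metric expressions of Sec. IV reduce to simple array manipulations once the stationary vectors are known. The helper below is an illustrative sketch with hypothetical names: it handles one tandem subsystem, computing the drop probability of its first (level) queue and the mean lengths of both queues; the phase-queue drop term is analogous.

```python
import numpy as np

def tandem_metrics(pi, M_level, M_phase, lam, mu):
    """Drop probability of the level queue and mean lengths of both
    queues, for a tandem subsystem whose stationary vector pi is
    indexed as pi[(M_phase+1)*i + j] for level i and phase j."""
    pi = np.asarray(pi).reshape(M_level + 1, M_phase + 1)
    # Drop: arrival (lam) to a full level queue with no departure (1-mu).
    p_drop = lam * (1 - mu) * pi[M_level, :].sum()
    mean_level = sum(i * pi[i, :].sum() for i in range(M_level + 1))
    mean_phase = sum(j * pi[:, j].sum() for j in range(M_phase + 1))
    return p_drop, mean_level, mean_phase
```

Summing such per-queue terms over the subsystems yields the system drop rate and the average number of tasks, as defined above.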
In this section, we evaluate the accuracy of our derived mathematical model in terms of the key performance metrics by comparing the analytical with the simulation results. Furthermore, we provide results regarding the system performance under different setups. We developed a MATLAB-based behavioural simulator, and each case was run for a large number of time slots.

A. Effect of $\mu_1$ and $\mu_3$ on the drop rate in systems with small buffers

In this subsection, we observe the performance of the system in terms of the drop rate when the size of the buffers is small. In Fig. 3a, we provide the optimal values of $\alpha$ (the probabilistic routing decision) for different values of $\mu_1$ and $\mu_3$; the optimal $\alpha$ for each pair of values is obtained by brute force. We observe that for small values of $\mu_1$ and $\mu_3$, the value of $\alpha$ is small, i.e., the flow controller routes the traffic to the secondary MEC server (Server 2). Furthermore, the optimal value of $\alpha$ is determined by the smaller of the transmission rate and the processing rate: the queue with the smallest transmission or computation capacity becomes the bottleneck of the subsystem. This could be the case, for example, when the connection between the MEC server and the server in the core network is weak. Fig. 3b depicts the system drop rate for the corresponding optimal $\alpha$'s.

B. Effect of $\mu_1$ and $\mu_3$ on the number of tasks in systems with large buffers

In this subsection, we provide results for the performance of the system in terms of the average number of tasks. Our objective is to minimize the average number of tasks in the system when the buffer size is large. In Fig. 4a, the optimal $\alpha$'s for different values of $\mu_1$ and $\mu_3$ are shown.

Fig. 4: Objective: to minimize the average number of tasks. (a) Optimal values of $\alpha$. (b) Average number of tasks. The remaining service rates and $p$ are fixed; $M_i = 50$ for $1 \le i \le 5$.

We observe that when these service rates are small, the optimal value of $\alpha$ is equal to 1, i.e., the flow controller routes the whole traffic to the first server. This decision is optimal in terms of minimizing the average number of tasks in the system, but it significantly increases the system drop rate, because a large percentage of the tasks are dropped and never served. We also observe that the smaller of $\mu_1$ and $\mu_3$ acts as the bottleneck of the subsystem and, subsequently, of the whole system.

C. Trade-off between system drop rate and average queue length: simulation vs. analytical results

In this subsection, we provide results that show the trade-off between the system drop rate and the average queue length for different routing decisions, and we compare the analytical with the simulation results to evaluate the accuracy of our model; due to space limitations, we show only one scenario (Fig. 5). We observe that intermediate values of $\alpha$ provide the best trade-off. In addition, an interesting result is observed when $\alpha$ takes extreme values: although the system drop rate is almost the same in the two extreme cases, the average queue length is quite different. The reason is that when the first path is selected with higher probability, the traffic flow traverses a smaller number of queues than on the second path, and three of the queues, i.e., $Q_2$, $Q_4$, and $Q_5$, are lightly loaded. On the other hand, when the probability of selecting the second path is high, i.e., for smaller values of $\alpha$, the traffic traverses a larger number of queues; more queues are then heavily loaded and the average queue length increases.

Fig. 5: System drop rate and average queue length trade-off: analytical vs. simulation results. The service rates $\mu_i$, $1 \le i \le 5$, and $\mu_6$ are fixed, and the buffer sizes are equal to 42.

Fig. 6: Performance region. $\mu_i = \mu$ for $1 \le i \le 5$ and $M_i = M$ for $1 \le i \le 5$, with $p$ and $\mu_6$ fixed.

D. Effect of different buffer capacities and service rates on throughput and delay
In order to further investigate the performance of the system, we provide simulation results that show how different setups of the system affect the system throughput and delay. Note that the analytical expressions for the throughput and the delay can be calculated by using the results for the system drop rate and the average queue length, respectively. We omit this analysis due to space limitations and provide it in an extended version of this work; however, it is interesting to show some preliminary results. In Fig. 6, we provide results for the throughput, the delay, and the corresponding system drop rate for different values of $\mu$ and $M$. We obtain the throughput and delay as follows: we fix the service rate $\mu$ and change the capacity of the buffers $M$, which creates each black horizontal line with the stars. The colormap represents the values of the system drop rate.

We observe that the system performance improves significantly when we increase the service rates. On the other hand, when the service rates are low but the capacities of the buffers are large, the system performance is not improved. From these preliminary results, we observe that increasing the service rates is more important than increasing the capacity of the buffers.

VI. CONCLUSIONS & FUTURE DIRECTIONS
In this work, we consider a network topology with two MEC servers, a high-end server at the core network, and VNF chains embedded in the servers. We model the network and provide an analytical study of the system performance in terms of the system drop rate and the average number of tasks in the system. It is shown, through simulation results, that the approximate model performs well. Numerical results also provide useful insights for the design of such systems and for the resource allocation at each server. Furthermore, we numerically investigate the routing policy that optimizes different objective functions.

REFERENCES
[1] Y. C. Hu, M. Patel, D. Sabella, N. Sprecher, and V. Young, "Mobile edge computing–A key technology towards 5G," ETSI white paper, vol. 11, no. 11, pp. 1–16, 2015.
[2] T. Taleb, K. Samdanis, B. Mada, H. Flinck, S. Dutta, and D. Sabella, "On multi-access edge computing: A survey of the emerging 5G network edge cloud architecture and orchestration," IEEE Communications Surveys & Tutorials, vol. 19, no. 3, pp. 1657–1681, 2017.
[3] R. Mijumbi, J. Serrat, J.-L. Gorricho, N. Bouten, F. De Turck, and R. Boutaba, "Network function virtualization: State-of-the-art and research challenges," IEEE Communications Surveys & Tutorials, vol. 18, no. 1, pp. 236–262, 2016.
[4] M. S. Bonfim, K. L. Dias, and S. F. L. Fernandes, "Integrated NFV/SDN architectures: A systematic literature review," ACM Comput. Surv., vol. 51, no. 6, pp. 114:1–114:39, 2019.
[5] R. Cohen, L. Lewin-Eytan, J. S. Naor, and D. Raz, "Near optimal placement of virtual network functions," in Proc. IEEE INFOCOM, pp. 1346–1354, 2015.
[6] L. Wang, Z. Lu, X. Wen, R. Knopp, and R. Gupta, "Joint optimization of service function chaining and resource allocation in network function virtualization," IEEE Access, vol. 4, pp. 8084–8094, 2016.
[7] H. Feng, J. Llorca, A. M. Tulino, and A. F. Molisch, "Optimal dynamic cloud network control," IEEE/ACM Transactions on Networking, no. 99, pp. 1–14, 2018.
[8] Q. Ye, W. Zhuang, X. Li, and J. Rao, "End-to-end delay modeling for embedded VNF chains in 5G core networks," IEEE Internet of Things Journal, pp. 1–1, 2018.
[9] H.-N. Nguyen, T. Begin, A. Busson, and I. G. Lassous, "Approximating the end-to-end delay using local measurements: A preliminary study based on conditional expectation," in Proc. IEEE ISNCC, pp. 1–6, 2016.
[10] ——, "Evaluation of an end-to-end delay estimation in the case of multiple flows in SDN networks," in Proc. IEEE CNSM, pp. 336–341, 2016.
[11] M. Ploumidis, N. Pappas, and A. Traganitis, "Flow allocation for maximum throughput and bounded delay on multiple disjoint paths for random access wireless multihop networks," IEEE Transactions on Vehicular Technology, vol. 66, no. 1, pp. 720–733, 2016.
[12] A. S. Alfa,