Traversing Virtual Network Functions from the Edge to the Core: An End-to-End Performance Analysis
Emmanouil Fountoulakis, Qi Liao, Manuel Stein, Nikolaos Pappas
∗Department of Science and Technology, Linköping University, Sweden
†Nokia Bell Labs, Stuttgart, Germany
E-mails: {emmanouil.fountoulakis, nikolaos.pappas}@liu.se, {qi.liao, manuel.stein}@nokia-bell-labs.com

Abstract—Future mobile networks supporting the Internet of Things are expected to provide both high throughput and low latency to user-specific services. One way to overcome this challenge is to adopt network function virtualization and Multi-access Edge Computing (MEC). In this paper, we analyze an end-to-end communications system that consists of both MEC servers and a server at the core network hosting different types of virtual network functions. We develop a queueing model for the performance analysis of the system, consisting of both processing and transmission flows. We provide analytical approximations of performance metrics such as the system drop rate and the average number of tasks in the system. Simulation results show that our approximations perform quite well. By evaluating the system under different scenarios, we provide insights for decision making on traffic flow control and its impact on critical performance metrics.
I. INTRODUCTION
In future communications systems, mission-critical mobile applications, e.g., augmented reality, connected vehicles, and eHealth, will provide services that require ultra-low latency [1], [2]. To satisfy the low latency requirements, Multi-access Edge Computing (MEC) has been proposed as a key solution [1]. The idea of MEC is to locate more computational resources closer to the users, e.g., at the base stations. Besides latency constraints, these services may have strict function chaining requirements. In other words, each service has to be processed by a set of network functions (e.g., firewalls, transcoders, load balancers) in a specific order [3]. Furthermore, the requirements of 5G networks for flexibility and elasticity inspire the idea of Network Function Virtualization (NFV) [3], [4]. The idea of NFV is to decouple the network functions from dedicated hardware equipment: instead, general-purpose servers can host one or more types of network functions. However, the computational capabilities and the available resources of MEC servers are still limited compared to the high-end servers in the cloud. Therefore, it is interesting to further investigate the cooperation between the edge and the core, as well as the cooperation among MEC servers.

Recently, the Virtual Network Function (VNF) placement and resource allocation problem has attracted a lot of attention, e.g., [5], [6]. In these works, the authors formulate the VNF placement problem as a mixed integer linear program under the assumption of known traffic demand. For a dynamic environment with unknown traffic, the authors in [7] develop dynamic algorithms that control the flow by applying Lyapunov optimization theory. There are few works on analyzing such networks and deriving key performance metrics such as delay. The authors in [8] analyze the end-to-end delay for embedded VNF chains. They consider two types of services that traverse different VNF chains and provide the delay analysis for each chain. However, this work considers a specific system model where multiple VNF chains are embedded on a common predetermined network path, while routing and flow control are not considered. Furthermore, the authors in [9] and [10] estimate the end-to-end delay in a Software Defined Network (SDN) environment by using local node measurements for the single-flow and multi-flow cases, respectively.

In this paper, we model and analyze an end-to-end communications system, consisting of two MEC servers at the edge network and one server at the core network hosting different types of VNFs, by applying tools from queueing theory. In order to simplify the analysis, we introduce the approach of decomposing the system into subsystems, which can be further applied to the analysis of scaled-up systems with an arbitrary number of servers. We provide analytical expressions for key performance metrics, such as the average number of tasks in the system and the system drop rate, for each subsystem. Simulation results validate our analysis and show that our analytical model is accurate. Furthermore, by evaluating the system under different scenarios, we provide insights into how the routing decision affects the key performance metrics of interest.

This work has been supported by the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 643002.

II. SYSTEM MODEL
We consider an end-to-end communications system consisting of a mobile device, two MEC servers, and one server located in the core network, as depicted in Fig. 1. A task traverses a service chain of two consecutive VNFs: VNF 1 and VNF 2. In this system, an MEC server, called Server 1, is co-located with the base station and hosts one copy of VNF 1 as the primary MEC server. A secondary MEC server, called Server 2, is located nearby and also hosts a copy of VNF 1. In addition, Server 3 in the core network hosts VNF 2 and has more advanced computational capabilities than Servers 1 and 2. We assume a slotted time system. At each time slot, the device transmits a task, in the form of a packet, to the base station over a wireless channel.

Fig. 1: The system model. The blue dashed lines group the queues located in the same server. [The figure shows the flow controller at the base station, MEC Servers 1 and 2, each with a CPU (processing) queue for VNF 1 and a transmission (Tx) queue, and Server 3 hosting VNF 2, connected by wireless and wired links.]

Because of the presence of fading in the wireless channel, transmissions may face errors. A task is successfully transmitted to the base station with a probability $p$ that captures fading, attenuation, noise, etc. The device attempts a new task transmission only if the previous task has been successfully received at the base station. The received tasks need to be distributed between the queue for local processing and the queue for transmission to the secondary MEC server. Thus, there are two possible routes through the service chain. A flow controller at the base station randomly decides the routing for each task.¹ With probability $\alpha$ the task is processed by Server 1 first and then forwarded to Server 3. With probability $1-\alpha$ the task is forwarded to Server 2, to be processed by VNF 1, and then forwarded to Server 3 to be processed by VNF 2.

Each task that arrives at a server first waits in a queue to be processed by a VNF. Then, after the processing, it is stored in a transmission queue, waiting to be forwarded to and processed by the next VNF. Let $Q_i$ denote the $i$-th queue, where $i \in \mathcal{K}$ and $\mathcal{K}$ is the set of the queues in the system. Note that the queues follow an early departure-late arrival model: at the beginning of the slot the departure takes place, and a new arrival can enter the queue at the end of the slot. The queues for task transmission are $Q_2$, $Q_3$, and $Q_5$, and the queues for task processing are $Q_1$, $Q_4$, and $Q_6$. The arrival rates for queues $Q_1$ and $Q_2$ are $p\alpha$ and $p(1-\alpha)$, respectively. We denote by $\mu_i$, $i \in \mathcal{K}$, the service rates of the queues, and we assume that the service times are geometrically distributed. Furthermore, given that $Q_1$, $Q_2$, and $Q_4$ are non-empty, the arrival rates of $Q_3$, $Q_4$, and $Q_5$ are equal to the service rates of $Q_1$, $Q_2$, and $Q_4$ (i.e., $\mu_1$, $\mu_2$, and $\mu_4$), respectively.

Furthermore, the queues at Servers 1 and 2 are assumed to have finite buffers. Let $M_i$ denote the buffer size of each queue $i \in \mathcal{K} \setminus \{6\}$.
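As a concrete illustration of the slotted dynamics, the model can be mimicked by a short Monte Carlo sketch. This is a simplified illustration, not the simulator used in the paper; it assumes the queue numbering above (Q1/Q3 for Server 1's processing and transmission queues, Q2/Q4/Q5 for the path through Server 2, Q6 at Server 3) and that a task arriving at a full finite queue with no simultaneous departure is dropped, as specified next.

```python
import random

def simulate(p, alpha, mu, M, slots=100_000, seed=1):
    """Slot-level Monte Carlo sketch of the six-queue model.

    p: wireless success probability; alpha: routing probability;
    mu: service rates {1..6}; M: buffer sizes {1..5} (Q6 is unbounded).
    Early departure-late arrival: departures are drawn at the start of
    the slot, arrivals join (or are dropped) at the end of the slot.
    Returns the fraction of slots in which a task is dropped.
    """
    random.seed(seed)
    q = {i: 0 for i in range(1, 7)}
    dropped = 0
    for _ in range(slots):
        # Departures at the beginning of the slot.
        dep = {i: q[i] > 0 and random.random() < mu[i] for i in q}
        arrivals = {i: 0 for i in q}
        # Successful wireless transmission, then probabilistic routing.
        if random.random() < p:
            arrivals[1 if random.random() < alpha else 2] += 1
        # Internal flows: Q1->Q3, Q2->Q4, Q4->Q5, Q3->Q6, Q5->Q6.
        for src, dst in [(1, 3), (2, 4), (4, 5), (3, 6), (5, 6)]:
            if dep[src]:
                q[src] -= 1
                arrivals[dst] += 1
        if dep[6]:          # a task finishes service at VNF 2 and leaves
            q[6] -= 1
        # Late arrivals: join the queue, or are dropped if it is full.
        for i, a in arrivals.items():
            for _ in range(a):
                if i == 6 or q[i] < M[i]:
                    q[i] += 1
                else:
                    dropped += 1
    return dropped / slots
```

For instance, with perfectly reliable links and unit service rates ($p = 1$, $\alpha = 1$, $\mu_i = 1$) no task is ever dropped, while lowering the service rates yields a positive drop rate.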
If a queue is full and no task departs at the same time that a new one arrives, the new task is dropped and removed from the system. However, the queue of Server 3 (where $Q_6$ is located) is assumed to have a buffer of infinite length. In practice, the buffer in the core network has a limited size, which is usually quite large; our analysis based on the infinite buffer size assumption can capture this scenario with minor modifications. Moreover, by performing the analysis under the infinite-buffer assumption, we can extract insights on how to choose an appropriate queue size.

¹Probabilistic routing is a common strategy widely used in the literature; see, for example, [11].

III. PERFORMANCE ANALYSIS
In this section, we perform the modeling and the performance analysis that allow us to derive the critical performance metrics. We model the considered queueing system as a Discrete Time Markov Chain (DTMC). Modeling the whole system as one Markov chain leads to a quite complicated system that is difficult to analyze in terms of closed-form expressions. Thus, in order to simplify the analysis, we decompose the system into subsystems. We consider the following four subsystems: 1) $Q_1$ and $Q_3$, 2) $Q_2$ and $Q_4$, 3) $Q_5$, and 4) $Q_6$. The performance metrics of the whole system are then approximated by the analytical expressions derived for the subsystems.

A. Subsystems 1 and 2: Two queues in tandem
The two queues in tandem, $Q_1$ and $Q_3$, are considered as a subsystem. The arrival rate for $Q_1$ is $\lambda_1 = p\alpha$. The Markov chain $\{(X_n, Y_n)\}$ is described by $P_{i,j:u,k} = \Pr\{X_{n+1} = i, Y_{n+1} = j \mid X_n = u, Y_n = k\}$, where $X_n$ and $Y_n$ denote the states (in terms of queue length) of $Q_1$ and $Q_3$ at the $n$-th time slot, respectively, and $i$ and $j$ are referred to as the level and the phase, respectively. The Markov chain is a Quasi-Birth-and-Death (QBD) DTMC [12]. Since a QBD moves at most one level up or down per transition, the transition matrix has the block partitioned form

$$P = \begin{bmatrix} B & C & & & \\ E & A_1 & A_0 & & \\ & A_2 & A_1 & A_0 & \\ & & \ddots & \ddots & \ddots \\ & & & A_2 & A_1 + A_0 \end{bmatrix}.$$

For the sake of simplicity, given the probability of an event, denoted by $p$, we denote the probability of its complementary event by $\bar{p} \triangleq 1 - p$. To derive the block matrices $B$, $C$, $E$, and $A_i$, for $i = 0, 1, 2$, we first define the matrices

$$P_1^{(1)} = \begin{bmatrix} 1 & & & \\ \mu_3 & \bar{\mu}_3 & & \\ & \ddots & \ddots & \\ & & \mu_3 & \bar{\mu}_3 \end{bmatrix}, \qquad P_1^{(2)} = \begin{bmatrix} \mu_3 & \bar{\mu}_3 & & \\ & \mu_3 & \bar{\mu}_3 & \\ & & \ddots & \ddots \\ & & & 1 \end{bmatrix}.$$

Then, the block matrices of the transition matrix are calculated as $B = \bar{\lambda}_1 P_1^{(1)}$, $C = \lambda_1 P_1^{(1)}$, $E = \bar{\lambda}_1 \mu_1 P_1^{(2)}$, $A_0 = \lambda_1 \bar{\mu}_1 P_1^{(1)}$, $A_1 = \bar{\lambda}_1 \bar{\mu}_1 P_1^{(1)} + \lambda_1 \mu_1 P_1^{(2)}$, and $A_2 = \bar{\lambda}_1 \mu_1 P_1^{(2)}$. Following the steps described above and utilizing the properties of a QBD DTMC, we can construct the transition matrix of Subsystem 1 for arbitrary finite buffer sizes.

Fig. 2: The Markov chain for $Q_6$.

Our goal is to derive the steady-state distribution of the Markov chain defined above. We can apply direct methods to find the steady-state distribution [12, Chapter 4]; there are several efficient algorithms for this purpose, e.g., the logarithmic reduction method. We denote the steady-state distribution of Subsystem 1 by the row vector $\pi^{(1)} = \left[\pi^{(1)}_{0,0}, \pi^{(1)}_{0,1}, \ldots, \pi^{(1)}_{0,M_3}, \pi^{(1)}_{1,0}, \ldots, \pi^{(1)}_{M_1,M_3}\right]$. We find $\pi^{(1)}$ by solving the linear system of equations $\pi^{(1)} P = \pi^{(1)}$, $\pi^{(1)}\mathbf{1} = 1$, where $\mathbf{1}$ denotes the column vector of ones.
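This construction can be checked numerically. The sketch below is an illustrative reimplementation, not the authors' code: it enumerates the states $(i, j)$ of the tandem directly instead of assembling the block matrices, assumes an arriving task is served no earlier than the next slot and that an overflow task is dropped (boundary conventions at empty or full queues may differ slightly from the matrices above), and recovers the stationary vector as the left eigenvector of $P$ for eigenvalue 1, which is equivalent to solving $\pi P = \pi$, $\pi \mathbf{1} = 1$.

```python
import itertools
import numpy as np

def tandem_matrix(lam, mu1, mu3, M1, M3):
    """Transition matrix of the (Q1, Q3) tandem over states (i, j)."""
    states = list(itertools.product(range(M1 + 1), range(M3 + 1)))
    idx = {s: k for k, s in enumerate(states)}
    P = np.zeros((len(states), len(states)))
    for i, j in states:
        # a: external arrival to Q1; d1: Q1 departure (feeds Q3);
        # d3: Q3 departure (leaves the subsystem).
        for a, pa in ((1, lam), (0, 1 - lam)):
            for d1, p1 in (((1, mu1), (0, 1 - mu1)) if i else ((0, 1.0),)):
                for d3, p3 in (((1, mu3), (0, 1 - mu3)) if j else ((0, 1.0),)):
                    ni = min(i - d1 + a, M1)   # overflow at Q1 is dropped
                    nj = min(j - d3 + d1, M3)  # overflow at Q3 is dropped
                    P[idx[i, j], idx[ni, nj]] += pa * p1 * p3
    return states, P

def stationary(P):
    """Left eigenvector of P for eigenvalue 1, normalised to sum to 1."""
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1))])
    return pi / pi.sum()
```

The arrival rate of the next queue then follows directly, e.g., $\lambda_3 = \Pr\{Q_1 > 0\}\mu_1$ is `mu1 * sum(pi[k] for k, (i, j) in enumerate(states) if i > 0)`.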
Hereafter we use $\pi^{(n)}$ to denote the steady-state distribution vector of the $n$-th subsystem, for $n = 1, 2, 3, 4$.

Furthermore, the arrival rate of $Q_3$ depends on the service rate of $Q_1$: an arrival to $Q_3$ occurs with probability $\mu_1$ if and only if $Q_1$ is non-empty. Therefore, the arrival rate of $Q_3$ is $\lambda_3 = \Pr\{Q_1 > 0\}\mu_1 = \left(\sum_{j=0}^{M_3}\sum_{i=1}^{M_1} \pi^{(1)}_{i,j}\right)\mu_1$. Similarly, we can construct the transition matrix and the steady-state distribution $\pi^{(2)}$ of the second subsystem, consisting of $Q_2$ and $Q_4$. The arrival rates of $Q_2$ and $Q_4$ are $\lambda_2 = p(1-\alpha)$ and $\lambda_4 = \Pr\{Q_2 > 0\}\mu_2 = \left(\sum_{j=0}^{M_4}\sum_{i=1}^{M_2} \pi^{(2)}_{i,j}\right)\mu_2$, respectively.

B. Subsystem 3: $Q_5$ with finite buffer

We consider $Q_5$ as an independent subsystem with buffer size $M_5$. We first define the arrival rate of $Q_5$: $\lambda_5 = \Pr\{Q_4 > 0\}\mu_4 = \left(\sum_{i=0}^{M_2}\sum_{j=1}^{M_4} \pi^{(2)}_{i,j}\right)\mu_4$. We model the subsystem as a Markov chain whose transition matrix is

$$P = \begin{bmatrix} \bar{\lambda}_5 & \lambda_5 & & & \\ \bar{\lambda}_5\mu_5 & \lambda_5\mu_5 + \bar{\lambda}_5\bar{\mu}_5 & \lambda_5\bar{\mu}_5 & & \\ & \ddots & \ddots & \ddots & \\ & & \bar{\lambda}_5\mu_5 & \lambda_5\mu_5 + \bar{\lambda}_5\bar{\mu}_5 & \lambda_5\bar{\mu}_5 \\ & & & \bar{\lambda}_5\mu_5 & \bar{\lambda}_5\bar{\mu}_5 + \lambda_5 \end{bmatrix}.$$

We denote the steady-state distribution of Subsystem 3 by $\pi^{(3)} = \left[\pi^{(3)}_0, \pi^{(3)}_1, \ldots, \pi^{(3)}_{M_5}\right]$. To derive $\pi^{(3)}$, we solve the linear system of equations $\pi^{(3)} P = \pi^{(3)}$, $\pi^{(3)}\mathbf{1} = 1$. Using the balance equations, we obtain

$$\pi^{(3)}_i = \frac{\lambda_5^i\,\bar{\mu}_5^{\,i-1}}{\bar{\lambda}_5^i\,\mu_5^i}\,\pi^{(3)}_0, \quad 1 \le i \le M_5, \qquad \pi^{(3)}_0 = \left[1 + \sum_{i=1}^{M_5} \frac{\lambda_5^i\,\bar{\mu}_5^{\,i-1}}{\bar{\lambda}_5^i\,\mu_5^i}\right]^{-1}.$$

C. Subsystem 4: $Q_6$ with infinite buffer size

The arrival rate of $Q_6$ depends on the service rates of $Q_3$ and $Q_5$ and on the probability that these queues are non-empty. Note that the departures from $Q_3$ and $Q_5$ can be considered independent stochastic processes. The arrival rates due to $Q_3$ and $Q_5$ are $\lambda_{6,1} = \Pr\{Q_3 > 0\}\mu_3$ and $\lambda_{6,2} = \Pr\{Q_5 > 0\}\mu_5$, respectively. The arrival rate of $Q_6$ is $\lambda_6 = \lambda_{6,1} + \lambda_{6,2}$. We model the subsystem as the Markov chain shown in Fig.
2, where the transition probabilities out of state $0$ are $p_0 = \bar{\lambda}_{6,1}\bar{\lambda}_{6,2}$, $p_1 = \lambda_{6,1}\bar{\lambda}_{6,2} + \bar{\lambda}_{6,1}\lambda_{6,2}$, and $p_2 = \lambda_{6,1}\lambda_{6,2}$, and those out of the states $i \ge 1$ are
$$p'_0 = \bar{\lambda}_{6,1}\bar{\lambda}_{6,2}\bar{\mu}_6 + \bar{\lambda}_{6,1}\lambda_{6,2}\mu_6 + \lambda_{6,1}\bar{\lambda}_{6,2}\mu_6, \qquad p'_1 = \lambda_{6,1}\bar{\lambda}_{6,2}\bar{\mu}_6 + \bar{\lambda}_{6,1}\lambda_{6,2}\bar{\mu}_6 + \lambda_{6,1}\lambda_{6,2}\mu_6,$$
$$p'_2 = \lambda_{6,1}\lambda_{6,2}\bar{\mu}_6, \qquad p_{-1} = \bar{\lambda}_{6,1}\bar{\lambda}_{6,2}\mu_6.$$

The transition matrix that describes this Markov chain is

$$P = \begin{bmatrix} a_0 & b_0 & & & \\ a_1 & b_1 & b_0 & & \\ a_2 & b_2 & b_1 & b_0 & \\ & b_3 & b_2 & b_1 & \ddots \\ & & b_3 & b_2 & \ddots \\ & & & \ddots & \ddots \end{bmatrix},$$

where $a_0 = p_0$, $a_1 = p_1$, $a_2 = p_2$, $b_0 = p_{-1}$, $b_1 = p'_0$, $b_2 = p'_1$, and $b_3 = p'_2$; the transition matrix is a lower Hessenberg matrix. We denote the steady-state distribution of Subsystem 4 by $\pi^{(4)} = \left[\pi^{(4)}_0, \pi^{(4)}_1, \ldots\right]$. The general expression for the equilibrium equations of the states is given by the $i$-th term of
$$\pi^{(4)}_i = a_i \pi^{(4)}_0 + \sum_{j=1}^{i+1} b_{i-j+1}\, \pi^{(4)}_j,$$
with $a_i = 0$ for $i > 2$ and $b_k = 0$ for $k \notin \{0, 1, 2, 3\}$. Since the DTMC has an infinite state space, we apply the $z$-transform approach to solve the state equations. The $z$-transforms of the state transition probabilities $a_i$ and $b_i$ are $A(z) = \sum_{i=0}^{2} a_i z^{-i}$ and $B(z) = \sum_{i=0}^{3} b_i z^{-i}$, respectively. The $z$-transform of the steady-state distribution vector $\pi^{(4)}$ is
$$\Pi(z) = \sum_{i=0}^{\infty} \pi^{(4)}_i z^{-i} = \pi^{(4)}_0\, \frac{z^{-1}A(z) - B(z)}{z^{-1} - B(z)}.$$
The solution for $\pi^{(4)}_i$ is given by
$$\pi^{(4)}_0 = \frac{1 + B'(1)}{1 + B'(1) - A'(1)}, \qquad \pi^{(4)}_i = c_i + \sum_{j=1}^{m} r_j\, (p_j)^{i-1}, \quad i > 0,$$
where $r_j$, $p_j$, and $c_i$ are the residues, poles, and direct terms of the partial fraction expansion of $\Pi(z)$, respectively. Since $Q_6$ has an infinite buffer size, we need to characterize the conditions under which the queue is stable. Loynes' theorem states that if the arrival and service processes of a queue are strictly jointly stationary and the average arrival rate is less than the average service rate, then the queue is stable. Therefore, $Q_6$ is stable if and only if the following inequality holds: $\lambda_6 < \mu_6$.

D. Discussion on the analysis of scaled-up systems
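The decomposition idea can be sketched programmatically: each subsystem exports its departure rate, which becomes the arrival rate of the next subsystem. The sketch below is illustrative only (generic rates, a plain chain of finite queues rather than the paper's topology) and reuses the closed-form stationary distribution of a single finite queue derived above.

```python
import numpy as np

def chain_decomposition(lam0, mus, Ms):
    """Approximate a chain of finite discrete-time queues by analyzing
    them one at a time: the arrival rate of queue k+1 is the departure
    rate Pr{Q_k > 0} * mu_k of queue k (illustrative sketch)."""
    rates = [lam0]
    for mu, M in zip(mus, Ms):
        lam = rates[-1]
        lb, mb = 1 - lam, 1 - mu
        # Closed-form stationary distribution of one finite queue.
        w = np.array([1.0] + [lam**i * mb**(i - 1) / (lb**i * mu**i)
                              for i in range(1, M + 1)])
        pi = w / w.sum()
        rates.append((1 - pi[0]) * mu)   # departure rate of this queue
    return rates
```

Because part of the traffic is dropped at each finite buffer, the propagated rates are non-increasing along the chain.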
In this work, we analyze a simple end-to-end system that consists of three connected servers. Systems with an arbitrary number of servers can be analyzed in the same way: we decompose the system into subsystems, analyze each subsystem individually, and then combine the per-subsystem results to derive the analytical expressions for the whole system. A full version of this work can be found in [13].

Fig. 3: Objective: to minimize the system drop rate. (a) Optimal values of $\alpha$. (b) System drop rate. The remaining service rates and $p$ are fixed; $M_i = 10$ for $1 \le i \le 5$.

IV. KEY PERFORMANCE METRICS
In this section, we provide analytical expressions for the performance metrics of interest, i.e., the system drop rate and the average number of tasks in the system, utilizing the results of the previous section. The probabilities of dropping a task at a given time slot at $Q_1$–$Q_5$ are, respectively,
$$P_{D_1} = \lambda_1 \bar{\mu}_1 \sum_{j=0}^{M_3} \pi^{(1)}_{M_1,j}, \qquad P_{D_3} = \lambda_3 \bar{\mu}_3 \sum_{i=1}^{M_1} \pi^{(1)}_{i,M_3},$$
$$P_{D_2} = \lambda_2 \bar{\mu}_2 \sum_{j=0}^{M_4} \pi^{(2)}_{M_2,j}, \qquad P_{D_4} = \lambda_4 \bar{\mu}_4 \sum_{i=1}^{M_2} \pi^{(2)}_{i,M_4},$$
$$P_{D_5} = \lambda_5 \bar{\mu}_5\, \pi^{(3)}_{M_5},$$
where $P_{D_i}$ is the probability of dropping a task at queue $i$. The average length of each queue is given by
$$\bar{Q}_1 = \sum_{i=0}^{M_1} \sum_{j=0}^{M_3} i\,\pi^{(1)}_{i,j}, \qquad \bar{Q}_3 = \sum_{j=0}^{M_3} \sum_{i=0}^{M_1} j\,\pi^{(1)}_{i,j},$$
$$\bar{Q}_2 = \sum_{i=0}^{M_2} \sum_{j=0}^{M_4} i\,\pi^{(2)}_{i,j}, \qquad \bar{Q}_4 = \sum_{j=0}^{M_4} \sum_{i=0}^{M_2} j\,\pi^{(2)}_{i,j},$$
$$\bar{Q}_5 = \sum_{i=0}^{M_5} i\,\pi^{(3)}_i, \qquad \bar{Q}_6 = \sum_{i=0}^{\infty} i\,\pi^{(4)}_i.$$
Therefore, the system drop rate and the average number of tasks in the system can be described as
$$P_D = \sum_{i \in \mathcal{K} \setminus \{6\}} P_{D_i} \qquad \text{and} \qquad \bar{Q} = \sum_{i \in \mathcal{K}} \bar{Q}_i,$$
respectively.

V. NUMERICAL RESULTS
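Before turning to the results, note that the metric expressions of Sec. IV reduce to simple array manipulations once the stationary vectors are known. The helper below is an illustrative sketch with hypothetical names: it handles one tandem subsystem, computing the drop probability of its first (level) queue and the mean lengths of both queues; the phase-queue drop term is analogous.

```python
import numpy as np

def tandem_metrics(pi, M_level, M_phase, lam, mu):
    """Drop probability of the level queue and mean lengths of both
    queues, for a tandem subsystem whose stationary vector pi is
    indexed as pi[(M_phase+1)*i + j] for level i and phase j."""
    pi = np.asarray(pi).reshape(M_level + 1, M_phase + 1)
    # Drop: arrival (lam) to a full level queue with no departure (1-mu).
    p_drop = lam * (1 - mu) * pi[M_level, :].sum()
    mean_level = sum(i * pi[i, :].sum() for i in range(M_level + 1))
    mean_phase = sum(j * pi[:, j].sum() for j in range(M_phase + 1))
    return p_drop, mean_level, mean_phase
```

Summing such per-queue terms over the subsystems yields the system drop rate and the average number of tasks, as defined above.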
In this section, we evaluate the accuracy of our derived mathematical model in terms of the key performance metrics by comparing the analytical with the simulation results. Furthermore, we provide results regarding the system performance under different setups. We developed a MATLAB-based behavioural simulator, and each case was run for a large number of time slots.

A. Effect of $\mu_1$ and $\mu_3$ on the drop rate in systems with small buffers

In this subsection, we observe the performance of the system in terms of the drop rate when the size of the buffers is small. In Fig. 3a, we provide the optimal values of $\alpha$ (the probabilistic routing decision) for different values of $\mu_1$ and $\mu_3$; the optimal $\alpha$ for each pair of values is obtained by brute force. We observe that for small values of $\mu_1$ and $\mu_3$, the value of $\alpha$ is small, i.e., the flow controller routes the traffic to the secondary MEC server (Server 2). Furthermore, the optimal value of $\alpha$ is determined by the smaller of the transmission rate and the processing rate: the queue with the smallest transmission or computation capacity becomes the bottleneck of the subsystem. This could be the case, for example, when the connection between the MEC server and the server in the core network is weak. Fig. 3b depicts the system drop rate for the corresponding optimal $\alpha$'s.

B. Effect of $\mu_1$ and $\mu_3$ on the number of tasks in systems with large buffers

In this subsection, we provide results for the performance of the system in terms of the average number of tasks. Our objective is to minimize the average number of tasks in the system when the buffer size is large. In Fig. 4a, the optimal $\alpha$'s for different values of $\mu_1$ and $\mu_3$ are shown.

Fig. 4: Objective: to minimize the average number of tasks. (a) Optimal values of $\alpha$. (b) Average number of tasks. The remaining service rates and $p$ are fixed; $M_i = 50$ for $1 \le i \le 5$.

We observe that when these service rates are small, the optimal value of $\alpha$ is equal to 1, i.e., the flow controller routes the whole traffic to the first server. This decision is optimal in terms of minimizing the average number of tasks in the system, but it significantly increases the system drop rate, because a large percentage of the tasks are dropped and never served. We also observe that the smaller of $\mu_1$ and $\mu_3$ acts as the bottleneck of the subsystem and, subsequently, of the whole system.

C. Trade-off between system drop rate and average queue length: simulation vs. analytical results

In this subsection, we provide results that show the trade-off between the system drop rate and the average queue length for different routing decisions, and we compare the analytical with the simulation results to evaluate the accuracy of our model; due to space limitations, we show only one scenario (Fig. 5). We observe that intermediate values of $\alpha$ provide the best trade-off. In addition, an interesting result is observed when $\alpha$ takes extreme values: although the system drop rate is almost the same in the two extreme cases, the average queue length is quite different. The reason is that when the first path is selected with higher probability, the traffic flow traverses a smaller number of queues than on the second path, and three of the queues, i.e., $Q_2$, $Q_4$, and $Q_5$, are lightly loaded. On the other hand, when the probability of selecting the second path is high, i.e., for smaller values of $\alpha$, the traffic traverses a larger number of queues; more queues are then heavily loaded and the average queue length increases.

Fig. 5: System drop rate and average queue length trade-off: analytical vs. simulation results. The service rates $\mu_i$, $1 \le i \le 5$, and $\mu_6$ are fixed, and the buffer sizes are equal to 42.

Fig. 6: Performance region. $\mu_i = \mu$ for $1 \le i \le 5$ and $M_i = M$ for $1 \le i \le 5$, with $p$ and $\mu_6$ fixed.

D. Effect of different buffer capacities and service rates on throughput and delay
In order to further investigate the performance of the system, we provide simulation results that show how different setups of the system affect the system throughput and delay. Note that the analytical expressions for the throughput and the delay can be calculated by using the results for the system drop rate and the average queue length, respectively. We omit this analysis due to space limitations and provide it in an extended version of this work; however, it is interesting to show some preliminary results. In Fig. 6, we provide results for the throughput, the delay, and the corresponding system drop rate for different values of $\mu$ and $M$. We obtain the throughput and delay as follows: we fix the service rate $\mu$ and change the capacity of the buffers $M$, which creates each black horizontal line with the stars. The colormap represents the values of the system drop rate.

We observe that the system performance improves significantly when we increase the service rates. On the other hand, when the service rates are low but the capacities of the buffers are large, the system performance is not improved. From these preliminary results, we observe that increasing the service rates is more important than increasing the capacity of the buffers.

VI. CONCLUSIONS & FUTURE DIRECTIONS
In this work, we consider a network topology with two MEC servers, a high-end server at the core network, and VNF chains embedded in the servers. We model the network and provide an analytical study of the system performance in terms of the system drop rate and the average number of tasks in the system. It is shown, through simulation results, that the approximate model performs well. Numerical results also provide useful insights for the design of such systems and for the resource allocation at each server. Furthermore, we numerically investigate the routing policy that optimizes different objective functions.

REFERENCES
[1] Y. C. Hu, M. Patel, D. Sabella, N. Sprecher, and V. Young, "Mobile edge computing–A key technology towards 5G," ETSI white paper, vol. 11, no. 11, pp. 1–16, 2015.
[2] T. Taleb, K. Samdanis, B. Mada, H. Flinck, S. Dutta, and D. Sabella, "On multi-access edge computing: A survey of the emerging 5G network edge cloud architecture and orchestration," IEEE Communications Surveys & Tutorials, vol. 19, no. 3, pp. 1657–1681, 2017.
[3] R. Mijumbi, J. Serrat, J.-L. Gorricho, N. Bouten, F. De Turck, and R. Boutaba, "Network function virtualization: State-of-the-art and research challenges," IEEE Communications Surveys & Tutorials, vol. 18, no. 1, pp. 236–262, 2016.
[4] M. S. Bonfim, K. L. Dias, and S. F. L. Fernandes, "Integrated NFV/SDN architectures: A systematic literature review," ACM Comput. Surv., vol. 51, no. 6, pp. 114:1–114:39, 2019.
[5] R. Cohen, L. Lewin-Eytan, J. S. Naor, and D. Raz, "Near optimal placement of virtual network functions," in Proc. IEEE INFOCOM, pp. 1346–1354, 2015.
[6] L. Wang, Z. Lu, X. Wen, R. Knopp, and R. Gupta, "Joint optimization of service function chaining and resource allocation in network function virtualization," IEEE Access, vol. 4, pp. 8084–8094, 2016.
[7] H. Feng, J. Llorca, A. M. Tulino, and A. F. Molisch, "Optimal dynamic cloud network control," IEEE/ACM Transactions on Networking, no. 99, pp. 1–14, 2018.
[8] Q. Ye, W. Zhuang, X. Li, and J. Rao, "End-to-end delay modeling for embedded VNF chains in 5G core networks," IEEE Internet of Things Journal, pp. 1–1, 2018.
[9] H.-N. Nguyen, T. Begin, A. Busson, and I. G. Lassous, "Approximating the end-to-end delay using local measurements: A preliminary study based on conditional expectation," in Proc. IEEE ISNCC, pp. 1–6, 2016.
[10] ——, "Evaluation of an end-to-end delay estimation in the case of multiple flows in SDN networks," in Proc. IEEE CNSM, pp. 336–341, 2016.
[11] M. Ploumidis, N. Pappas, and A. Traganitis, "Flow allocation for maximum throughput and bounded delay on multiple disjoint paths for random access wireless multihop networks," IEEE Transactions on Vehicular Technology, vol. 66, no. 1, pp. 720–733, 2016.
[12] A. S. Alfa,