Dynamic VNF Placement, Resource Allocation and Traffic Routing in 5G
Morteza Golkarifard, Carla Fabiana Chiasserini, Francesco Malandrino, Ali Movaghar
Abstract — 5G networks are going to support a variety of vertical services, with a diverse set of key performance indicators (KPIs), by using enabling technologies such as software-defined networking and network function virtualization. It is the responsibility of the network operator to efficiently allocate the available resources to the service requests so as to honor KPI requirements, while accounting for the limited quantity of available resources and their cost. A critical challenge is that requests may be highly varying over time, requiring a solution that accounts for their dynamic generation and termination. With this motivation, we seek to make joint decisions for request admission, resource activation, VNF placement, resource allocation, and traffic routing. We do so by considering real-world aspects such as the setup times of virtual machines, with the goal of maximizing the mobile network operator profit. To this end, we first formulate a one-shot optimization problem which can attain the optimum solution for small-size problems, given complete knowledge of the arrival and departure times of requests over the entire system lifespan. We then propose an efficient and practical heuristic solution that only requires this knowledge for the next time period and works for realistically-sized scenarios. Finally, we evaluate the performance of these solutions using real-world services and large-scale network topologies. Results demonstrate that our heuristic solution performs better than state-of-the-art online algorithms and close to the optimum.
I. INTRODUCTION
5G networks are envisioned to support a variety of services belonging to vertical industries (e.g., autonomous driving, media, and entertainment) with a diverse set of requirements. Services are defined as a directed graph of virtual network functions (VNFs) with specific and varying key performance indicators (KPIs), e.g., throughput and delay. Requests for these services arrive over time, and mobile network operators (MNOs) are responsible for efficiently satisfying such demand, fulfilling the associated KPIs while minimizing their own cost.

As a result of the softwarization of 5G-and-beyond networks, enabled by software-defined networking (SDN) and network function virtualization (NFV), it is now feasible to use general-purpose resources (e.g., virtual machines) to implement the VNFs required by the different services. The decision on which resources to associate with which VNF and service is made by a network component called orchestrator, as standardized by ETSI [1]. Without loss of generality, we focus only on computational and communication resources (e.g., virtual machines and the links connecting them); notice, however, that our proposed framework is applicable to other resource types (e.g., storage).

M. Golkarifard and A. Movaghar are with Sharif University of Technology, Iran. F. Malandrino and C. F. Chiasserini are with CNR-IEIIT and CNIT, Italy. C. F. Chiasserini is also with Politecnico di Torino, Italy.

The network orchestrator makes the following decisions [1]:
• admission of requests;
• activation/deactivation of VMs;
• placement of VNF instances therein;
• assignment of CPU to VMs for running the hosted VNF instances;
• routing of traffic through physical links.
These decisions are clearly mutually dependent, and therefore should be made jointly, in order to account for the – often nontrivial – ways in which they influence one another.
The focus of this paper is thus to consider the joint request admission, VM activation/deactivation, VNF placement, CPU assignment, and traffic routing problem in order to maximize the MNO profit, while considering:
• the properties of each VNF;
• the KPI requirements of each service;
• the capabilities of VMs and PoPs (points of presence, e.g., datacenters) and their latency;
• the capacity and latency of physical links;
• the VM setup times;
• the arrival and departure times of service requests.
As better discussed in Sec. II, some of these factors are simplified, or even neglected, in existing works on 5G orchestration. Notably, we account for the VM setup time, which becomes a significant factor in (for example) IoT applications, where requests are often short-lived. Ignoring setup (and tear-down) times can reduce the optimality of existing solutions. Furthermore, we account for the fact that different VNFs may have different levels of complexity; therefore, different quantities of computational resources may be needed to attain the same KPI target. Inspired by several works in the literature [2], we model individual VNFs as queues and services as queuing networks. Critically, unlike traditional queuing networks, the quantity of traffic (i.e., the number of clients in queues) can change across queues, as VNFs can drop some packets (e.g., firewalls) or change the quantity thereof (e.g., video transcoders). Our model accounts for this important aspect by replacing traditional flow conservation constraints with a generalized flow conservation law, allowing us to describe arbitrary services with arbitrary VNF graphs.

Given this model, we formulate a one-shot optimization problem which, assuming perfect knowledge of future requests, allows us to maximize the MNO profit. Given the NP-hardness of such a problem and the fact that knowledge of future requests is usually not available, we propose MaxSR, an
efficient heuristic algorithm which is invoked periodically, based on the knowledge of the requests within each time period. The proposed method can achieve a near-optimal solution for large-scale network scenarios. We evaluate MaxSR against the optimum and other benchmarks using real-world services and different network scenarios.

In summary, the main contributions of this paper are as follows:
• we propose a complete model for the main components of 5G, both in terms of vertical services (dynamic requests, VNFs, and service KPIs) and in terms of resources (e.g., VMs and links);
• our model accounts for the time variations of service requests, and dynamically allocates the computational and network resources while considering VM setup times. It can also accommodate a diverse set of VNFs in terms of computational complexity and KPI requirements, multiple VNF instances, and arbitrary VNF graphs with several ingress and egress VNFs, rather than a simple chain or directed acyclic graph (DAG);
• we formulate a one-shot optimization problem as a Mixed-Integer Program (MIP) to make a joint decision on VM state, VNF placement, CPU assignment, and traffic routing based on the complete request statistics over the entire system lifespan;
• we propose MaxSR, an efficient near-optimal heuristic algorithm that solves the aforementioned problem for large-scale network scenarios based on knowledge of the near future only;
• finally, we compare MaxSR with the optimum and with the online Best-Fit approach, through extensive experiments using synthetic services and requests, and different network scenarios.
The rest of the paper is organized as follows. Sec. II reviews related work. Sec. III describes the system model and problem formulation, while Sec. IV clarifies our solution strategy. Finally, Sec. V presents our numerical evaluation under different network scenarios, and Sec. VI concludes the paper.

II. RELATED WORK
Several works have addressed VNF placement and traffic routing, as exemplified by the survey paper [3]. In most of these works, the problem is formulated as a Mixed-Integer Linear Program (MILP) with different sets of objectives and constraints. Such an approach can yield exact solutions, but only works for small instances; therefore, heuristic algorithms that offer a near-optimal solution have also been presented.

In particular, a first body of works provides a one-time VNF placement, given the incoming service requests. Since this method leaves already placed VNFs intact, it can lead to a sub-optimal solution when the traffic varies over time. Examples of such an approach can be found in [4], [5], [6], [7], [8], [9], which aim at minimizing a cost function, e.g., operational cost, QoS degradation cost, server utilization, or a combination thereof, and assume that there are always enough resources to serve the incoming requests. Among them, Cohen et al. [4] propose an approximation algorithm to place sets of VNFs in an optimal manner, while approximating the constraints by a constant factor. Pham et al. [7] introduce a distributed solution based on a Markov approximation technique to place chains of VNFs, where the cost includes the delay cost in addition to the traffic and server costs. [8], instead, addresses the same problem but aims at minimizing the energy consumption, given constraints on end-to-end latency for each flow and on server utilization. Pei et al. [9] propose an online heuristic for this problem, by which VNF instances are deployed and connected using the shortest-path algorithm, in order to minimize the number of VNF instances while satisfying their end-to-end delay constraint.

Another thread of works focuses on an efficient admission policy that maximizes the throughput or revenue of admitted requests [10], [11], [12], [13]. In particular, Sallam et al.
[10] formulate a joint VNF placement and resource allocation problem to maximize the number of fully served flows, considering budget and capacity constraints. They leverage the submodularity property of a relaxed version of the problem and propose two heuristics with a constant approximation ratio. [11] studies the joint VNF placement and service chain embedding problem, so as to maximize the revenue from the admitted requests. A similar problem is tackled in [13] and [12], but for an online setting where the requests should be admitted and served upon their arrival. Zhou et al. [12], on the other hand, first formulate a one-shot optimization problem over the entire system lifespan and then leverage the primal-dual method to design an online solution with a theoretically proved upper bound on the competitive ratio.

A different approach is adopted in [14], [15], [16], [17], [18], [19], [20], where VNF placement can be readjusted through VNF sharing and migration, to optimally fit time-varying service demands. [14] and [15] propose algorithms that properly scale over-utilized or under-utilized VNF instances based on the estimation of future service demands. Jia et al. [16] propose an online algorithm with a bounded competitive ratio that dynamically deploys delay-constrained service function chains across geo-distributed datacenters, minimizing operational costs.

Request admission control has instead been considered in [17], [18], [19], [20]. More in detail, Li et al. [17] propose a proactive algorithm that dynamically provisions resources to admit as many requests as possible with a timing guarantee. Similarly, [18] admits requests and places their VNFs in the peak interval, but minimizes the energy cost of VNF instances by migrating and turning off empty ones in the off-peak interval. Liu et al.
[19] envision an algorithm that maximizes the service provider's profit by periodically admitting new requests and rearranging the currently served ones, while accounting for the operational overhead of migration. Finally, leveraging VNF migration and sharing, [20] proposes an online algorithm to maximize throughput while minimizing service cost and meeting latency constraints.

Relevant to our work are also studies that specifically target 5G systems, although they merely consider the link delay and neglect processing delays in the servers. An example can be found in [2], which models VMs as M/M/1 PS queues, and proposes a MILP and a heuristic solution to minimize the average service delay, while meeting the constraints on the link and host capacities. The works in [21] and [22] aim instead to minimize, respectively, the operational cost and the energy consumption of VMs and links while ensuring the end-to-end delay KPI. [22] also allows for VNF sharing and studies the impact of applying priorities to different services within a shared VNF. Zhang et al. [23] tackle the request admission problem to maximize the total throughput, neglecting instead the queuing delay at VMs.

We remark that most of the above works present proactive approaches, and only deal with either cost minimization or request admission. On the contrary, we focus on dynamic resource activation, VNF placement, and CPU assignment to maximize the revenue from admitted requests over the entire system lifespan, while minimizing the deployment costs and accounting for several practical issues. Our proactive MILP formulation of the problem extends existing models by accounting for the maximum end-to-end delay as the main KPI, while our heuristic is a practical and scalable solution, which periodically admits new requests and readjusts the existing VNF deployment. To the best of our knowledge, this is the first dynamic solution for service orchestration in 5G networks.

III. SYSTEM MODEL AND PROBLEM FORMULATION
In this section, we first describe our system model, supported by a simple example. Then, we formulate the joint request admission, VM activation, VNF placement, CPU assignment, and traffic routing problem; a discussion of the problem time complexity follows. The frequently used notation is summarized in Table I.
A. System Model
Physical infrastructure.
Let G = (M, E) be a directed graph representing the physical infrastructure network, where each node 𝑚 ∈ M is either a VM or a network node (i.e., a router or a switch). A VM 𝑚 has maximum computational capacity 𝐶^vm(𝑚). Set E denotes the physical links connecting the network nodes. We define 𝐵(𝑒) and 𝐷^phy(𝑒) as, respectively, the bandwidth and delay of physical link 𝑒 ∈ E. Time is discretized into steps, T = {1, 2, . . . , 𝑇}, and we assume that at every time step a VM may be in one of the following states: terminated, turning-on, or active. Specifically, VMs can only be used when they are active, and they need to be turned on one time step before being active. Based on the measurements reported in [15], we also consider the traffic flow migration time to be negligible with respect to the VM setup time.

Each VM can host one VNF and belongs to a datacenter 𝑑 ∈ D; we denote the available amount of computational resources in datacenter 𝑑 by 𝐶^dc(𝑑) and the set of VMs within 𝑑 by M_𝑑. In the physical graph G, physical links within datacenters are assumed to be ideal, i.e., they have no capacity limit and zero delay. Let a logical link 𝑙 ∈ L be a sequence of physical links connecting two VMs, src(𝑙) and dst(𝑙); then we define an end-to-end path 𝑝 ∈ P as a sequence of logical links.

Services.
We represent each service 𝑠 ∈ S with a VNF Forwarding Graph (VNFFG), where the nodes are VNFs 𝑞 ∈ Q, and the directed edges show how traffic traverses the VNFs. The VNFFG can be any general graph, possibly with several ingress and egress VNFs. We denote the total new traffic entering the ingress VNFs of service 𝑠 by 𝜆^new(𝑠). A traffic packet of service 𝑠, processed at VNF 𝑞₁, is forwarded to VNF 𝑞₂ with probability P(𝑠,𝑞₁,𝑞₂). Similarly, P(𝑠,◦,𝑞) is the probability that a new traffic packet of service 𝑠 starts getting service at ingress VNF 𝑞, and P(𝑠,𝑞,◦) is the probability that a traffic packet of service 𝑠, already served at egress VNF 𝑞, departs service 𝑠. For each service 𝑠, we consider its target delay, 𝐷^QoS(𝑠), as the most critical KPI, specifying the maximum tolerable end-to-end delay for the traffic packets of 𝑠.

VNFs can have different processing requirements depending on their computational complexity. We denote by 𝜔(𝑞) the computational capability that VNF 𝑞 needs to process one unit of traffic. Some VNFs may not find sufficient resources on a single VM to completely serve the traffic while satisfying the target delay. Thus, multiple instances can be created, with 𝑁(𝑠,𝑞) being the maximum number of instances of VNF 𝑞 at each point in time. Instances of the same VNF can be deployed either within the same datacenter or at different datacenters; in the latter case, the traffic between each pair of VNFs must be split over different logical links that connect the VMs running the corresponding VNF instances.

Different requests for the same service may arrive over time; we denote with K_𝑠 the set of all service requests for service 𝑠, and characterize the generic service request 𝑘 ∈ K with its arrival time 𝑡^arv(𝑘) and departure time 𝑡^dpr(𝑘). Due to slice isolation requirements [24], we assume that the VNF instances of a service request are not shared with other service requests.

Example.
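To make the service model above concrete, a service can be captured by a small record type; the sketch below is our own illustration, with field names chosen to mirror the paper's symbols (none of these names are part of the formulation).

```python
from dataclasses import dataclass, field

@dataclass
class Service:
    name: str
    lam_new: float   # λ^new(s): total new traffic entering the ingress VNFs
    d_qos: float     # D^QoS(s): maximum tolerable end-to-end delay
    p: dict = field(default_factory=dict)      # (q1, q2) -> P(s, q1, q2)
    p_in: dict = field(default_factory=dict)   # q -> P(s, ◦, q), ingress split
    p_out: dict = field(default_factory=dict)  # q -> P(s, q, ◦), egress probability
    n_max: dict = field(default_factory=dict)  # q -> N(s, q), max instances

    def ingress_ok(self) -> bool:
        # Every new packet enters the service at exactly one ingress VNF,
        # so the ingress probabilities must sum to 1.
        return abs(sum(self.p_in.values()) - 1.0) < 1e-9
```

For a two-VNF chain such as the VCD service of Fig. 1a, for instance, one would set `p = {("firewall", "collision detector"): 1.0}` with a single ingress and a single egress VNF.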
Fig. 1 represents a possible deployment of two sample services, vehicle collision detection (VCD) and video on-demand (VoD), on the physical graph (Fig. 1c) in a single time step. VCD is a low-latency service with a very low target delay 𝐷^QoS, and VoD is a traffic-intensive service with a high 𝜆^new. Fig. 1a and Fig. 1b depict the VNFFGs of the VCD and VoD services, respectively, where the numbers on the edges represent the transition probabilities of traffic packets between the corresponding VNFs. The physical graph contains a set of datacenters D = {𝑑₁, 𝑑₂, 𝑑₃} with computational capability 𝐶^dc. Datacenters are connected to each other using a switch and physical links with bandwidth 𝐵 and latency 𝐷^phy. The VMs within each datacenter, each with computational capability 𝐶^vm, are denoted by the sets M_{𝑑₁} = {𝑚₁, 𝑚₂}, M_{𝑑₂} = {𝑚₃, 𝑚₄}, and M_{𝑑₃} = {𝑚₅, 𝑚₆, 𝑚₇}. As depicted in Fig. 1c, service VCD is deployed within a single datacenter to avoid inter-datacenter network latency, while service VoD is deployed across one datacenter and a third-party one. The VNF transcoder, having high computational complexity 𝜔, requires two instances to fully serve the traffic.

B. Problem Formulation
In this section, we first describe the decisions that have to be made to map the service requests onto network resources. Then
TABLE I: Notation (sets, variables, and parameters)
Sets:
D: set of datacenters
E: set of physical links
K: set of service requests
L: set of logical links
M: set of VMs
P: set of end-to-end paths
Q: set of VNFs
S: set of services
T: set of time steps
W_𝑠: set of paths from ingress VNFs to egress VNFs in the VNF graph of service 𝑠

Variables:
𝐴(𝑘,𝑚,𝑞,𝑡): whether to deploy VNF 𝑞 of service request 𝑘 at VM 𝑚 at time 𝑡
𝐷(𝑘,𝑚,𝑞,𝑡): traffic departing VM 𝑚 for VNF 𝑞 of service request 𝑘 at time 𝑡
𝐹(𝑘,𝑙,𝑞₁,𝑞₂,𝑡): equal to 1 when 𝜌(𝑘,𝑙,𝑞₁,𝑞₂,𝑡) > 0
𝐼(𝑘,𝑚,𝑞,𝑡): traffic entering VM 𝑚 for VNF 𝑞 of service request 𝑘 at time 𝑡
𝐿(𝑒,𝑡): traffic on physical link 𝑒 at time 𝑡
𝑂(𝑚,𝑡): whether VM 𝑚 is active at time 𝑡
𝑅(𝑚,𝑡): average time for a request to be processed at VM 𝑚 at time 𝑡
𝑈(𝑚,𝑡): whether VM 𝑚 is turning-on at time 𝑡
𝑉(𝑘,𝑡): whether service request 𝑘 is active at time 𝑡
𝜇(𝑘,𝑚,𝑞,𝑡): service rate to assign to VM 𝑚 for VNF 𝑞 of service request 𝑘 at time 𝑡
𝜌(𝑘,𝑙,𝑞₁,𝑞₂,𝑡): fraction of traffic from VNF 𝑞₁ to 𝑞₂ of service request 𝑘, through logical link 𝑙 at time 𝑡

Parameters:
𝐵(𝑒): bandwidth of physical link 𝑒
𝐶^dc(𝑑): computational capacity of datacenter 𝑑
𝐶^vm(𝑚): computational capacity of VM 𝑚
𝐷^QoS(𝑠): target delay for service 𝑠
𝐷^log(𝑙): delay of logical link 𝑙
𝐷^phy(𝑒): delay of physical link 𝑒
𝑁(𝑠,𝑞): maximum number of instances for VNF 𝑞 of service 𝑠
𝑋^cpu(𝑚): cost for VM 𝑚 to process one unit of computation in one time step
𝑋^idle(𝑚): fixed cost incurred when VM 𝑚 is turning-on or active in one time step
𝑋^link(𝑒): cost of data transmission through physical link 𝑒 in one time step
𝑋^rev(𝑠): revenue from serving one traffic unit of service 𝑠
Λ(𝑠,𝑞₁,𝑞₂): traffic from VNF 𝑞₁ to 𝑞₂ for service 𝑠
P(𝑠,𝑞₁,𝑞₂): probability that traffic processed at VNF 𝑞₁ is forwarded to VNF 𝑞₂ of service 𝑠
𝛼(𝑠,𝑞): ratio of outgoing traffic to incoming traffic for VNF 𝑞 of service 𝑠
𝜆^new(𝑠): new traffic for service 𝑠
𝜔(𝑞): computational capability required for one traffic unit at VNF 𝑞
𝑡^arv(𝑘): arrival time of service request 𝑘
𝑡^dpr(𝑘): departure time of service request 𝑘
Fig. 1: VNFFG of (a) the vehicle collision detection (VCD) service and (b) the video on-demand (VoD) service. The number on each edge represents the transition probability of traffic packets. (c) Physical graph including three datacenters connected using a switch.

we formalize the system constraints and the objective using the model presented in Sec. III-A, along with the decision variables we define. In general, given the knowledge of the future arrival and departure times of service requests, we should make the following decisions:
• service request activation, i.e., when service requests get served;
• VM activation/deactivation, i.e., when VMs are set up or terminated;
• VNF instance placement, i.e., which VMs have to run VNF instances;
• CPU assignment, i.e., how much computational capability shall be assigned to a VM to run the deployed VNF;
• traffic routing, i.e., how traffic between VNFs is routed through physical links.

Service request activation.
Let binary variable 𝑉(𝑘,𝑡) ∈ {0,1} denote whether service request 𝑘 is being served at time 𝑡. Once admitted, a service request has to be provided for all its lifetime duration. Given the service request arrival time 𝑡^arv(𝑘) and departure time 𝑡^dpr(𝑘), this translates into:

𝑉(𝑘,𝑡) = 0, ∀𝑘 ∈ K, 𝑡 ∈ T : 𝑡 < 𝑡^arv(𝑘) ∨ 𝑡 ≥ 𝑡^dpr(𝑘). (1)

VNF instances.
The following constraint limits the number of deployed instances of VNF 𝑞 of any service request 𝑘 ∈ K_𝑠 to be no more than 𝑁(𝑠,𝑞) at any point in time:

Σ_{𝑚∈M} 𝐴(𝑘,𝑚,𝑞,𝑡) ≤ 𝑁(𝑠,𝑞), ∀𝑡 ∈ T, 𝑠 ∈ S, 𝑘 ∈ K_𝑠, 𝑞 ∈ Q, (2)

where binary variable 𝐴(𝑘,𝑚,𝑞,𝑡) represents whether VNF 𝑞 of service request 𝑘 is placed on VM 𝑚 at time 𝑡. The network slice isolation property of 5G networks prevents VNF sharing among requests for different services. In addition, at most one VNF instance can be deployed on any VM, i.e.,

Σ_{𝑘∈K} Σ_{𝑞∈Q} 𝐴(𝑘,𝑚,𝑞,𝑡) ≤ 1, ∀𝑚 ∈ M, 𝑡 ∈ T. (3)

VM states.
We define two binary variables 𝑈(𝑚,𝑡) and 𝑂(𝑚,𝑡) to represent whether VM 𝑚 is turning-on or active at time 𝑡, respectively. A simple constraint prevents VMs from being concurrently turning-on and active at any time, i.e.,

𝑂(𝑚,𝑡) + 𝑈(𝑚,𝑡) ≤ 1, ∀𝑚 ∈ M, 𝑡 ∈ T. (4)

The following constraint enforces that VM 𝑚 can be active at time 𝑡 only if it has been turning-on or active in the previous time step:

𝑂(𝑚,𝑡) ≤ 𝑂(𝑚,𝑡−1) + 𝑈(𝑚,𝑡−1), ∀𝑚 ∈ M, 𝑡 ∈ T. (5)

VMs are able to run VNFs only when they are active, i.e.,

Σ_{𝑘∈K} Σ_{𝑞∈Q} 𝐴(𝑘,𝑚,𝑞,𝑡) ≤ 𝑂(𝑚,𝑡), ∀𝑚 ∈ M, 𝑡 ∈ T. (6)

Computational capacity.
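The VM state constraints (4) and (5) above can be checked mechanically over a per-VM schedule of state flags; the helper below is our own illustration, and it additionally assumes that VMs start in the terminated state, which the formulation does not state explicitly.

```python
def valid_vm_schedule(O, U):
    """O[t], U[t]: 0/1 flags saying whether one VM is active / turning-on
    at time step t. Checks constraint (4) (never both states at once) and
    constraint (5) (active at t requires turning-on or active at t-1)."""
    for t in range(len(O)):
        if O[t] + U[t] > 1:                       # constraint (4)
            return False
        prev = O[t - 1] + U[t - 1] if t >= 1 else 0   # assumed initial state
        if O[t] > prev:                           # constraint (5)
            return False
    return True
```

For example, the schedule U = [0,1,0,0], O = [0,0,1,1] turns the VM on at step 1 so that it can serve traffic from step 2 onward, matching the one-step setup time of the model.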
Let real variable 𝜇(𝑘,𝑚,𝑞,𝑡) represent the service rate assigned to VM 𝑚 to run VNF 𝑞 of service request 𝑘 at time 𝑡. Multiplying it by 𝜔(𝑞) yields the amount of computational capability assigned to VM 𝑚 to run VNF 𝑞 at time 𝑡. The limited computational capability of datacenters and VMs, denoted, respectively, by 𝐶^dc(𝑑) and 𝐶^vm(𝑚), should not be exceeded at any point in time. We describe such a limitation by imposing:

Σ_{𝑚∈M_𝑑} Σ_{𝑘∈K} Σ_{𝑞∈Q} 𝜇(𝑘,𝑚,𝑞,𝑡) · 𝜔(𝑞) ≤ 𝐶^dc(𝑑), ∀𝑡 ∈ T, 𝑑 ∈ D, (7)

where the sum on the left-hand side of the inequality is over all VMs within datacenter 𝑑. Similarly, for the VMs we have:

𝜇(𝑘,𝑚,𝑞,𝑡) · 𝜔(𝑞) ≤ 𝐴(𝑘,𝑚,𝑞,𝑡) · 𝐶^vm(𝑚), ∀𝑡 ∈ T, 𝑘 ∈ K, 𝑞 ∈ Q, 𝑚 ∈ M, (8)

where 𝐴(𝑘,𝑚,𝑞,𝑡) on the right-hand side of the inequality enforces a zero service rate for VM 𝑚 when no VNF is placed therein.

KPI target fulfillment.
Whenever a service request is being served, i.e., 𝑉(𝑘,𝑡) = 1, all the traffic in the corresponding VNFFG should be carried by the underlying physical links. The following constraint ensures this condition for the traffic between each pair of VNFs at any point in time:

Σ_{𝑙∈L} 𝜌(𝑘,𝑙,𝑞₁,𝑞₂,𝑡) ≥ 𝑉(𝑘,𝑡), ∀𝑡 ∈ T, 𝑠 ∈ S, 𝑘 ∈ K_𝑠, 𝑞₁,𝑞₂ ∈ Q : P(𝑠,𝑞₁,𝑞₂) > 0. (9)

Real variable 𝜌(𝑘,𝑙,𝑞₁,𝑞₂,𝑡) is the fraction of traffic from VNF 𝑞₁ to 𝑞₂ of service request 𝑘 that is routed through logical link 𝑙 at time 𝑡. As mentioned, the traffic flow from VNF 𝑞₁ to VNF 𝑞₂ may be split over several logical links (see Eq. (2)). Moreover, since we consider multi-path routing, there may be multiple logical links between each pair of VNF instances. Therefore, constraint (9) implies that, for any service request 𝑘 requiring traffic from VNF 𝑞₁ to 𝑞₂ (i.e., P(𝑠,𝑞₁,𝑞₂) > 0), the sum of all traffic fractions going through the logical links must equal 1 at any time when the service request is being served.

The above constraint does not include ingress and egress traffic. To account for such contributions, we need to introduce dummy nodes in the VNFFG and in the physical graph. We add an end-point dummy VNF, ◦, in every VNFFG, which is directly connected to all ingress and egress VNFs, and a dummy VM in the physical graph, which is directly connected to all VMs. We define L◦ as the set of dummy logical links which start from or end at the dummy VM. We assume that dummy logical links are ideal, i.e., they have no capacity limit and zero delay and cost.
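As an illustration, constraint (9) amounts to the following feasibility check on a candidate routing; this helper is our own, with ρ given per logical link and VNF pair for one request and one time step.

```python
def routing_covers_request(rho, v, p, links):
    """Constraint (9): if request k is served at time t (v == 1), the traffic
    fractions between every VNF pair (q1, q2) with P(s, q1, q2) > 0, summed
    over all logical links, must reach 1.
    rho: {(l, q1, q2): fraction}; p: {(q1, q2): transition probability}."""
    for (q1, q2), prob in p.items():
        if prob > 0 and v == 1:
            total = sum(rho.get((l, q1, q2), 0.0) for l in links)
            if total < 1.0 - 1e-9:   # split paths must add up to the whole flow
                return False
    return True
```

Constraints (10) and (11) below are the same check applied to the dummy logical links in L◦ for the ingress and egress traffic.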
We can now formulate the associated traffic constraints as:

Σ_{𝑙∈L◦} 𝜌(𝑘,𝑙,◦,𝑞,𝑡) ≥ 𝑉(𝑘,𝑡), ∀𝑡 ∈ T, 𝑠 ∈ S, 𝑘 ∈ K_𝑠, 𝑞 ∈ Q : P(𝑠,◦,𝑞) > 0, (10)

Σ_{𝑙∈L◦} 𝜌(𝑘,𝑙,𝑞,◦,𝑡) ≥ 𝑉(𝑘,𝑡), ∀𝑡 ∈ T, 𝑠 ∈ S, 𝑘 ∈ K_𝑠, 𝑞 ∈ Q : P(𝑠,𝑞,◦) > 0, (11)

where 𝜌(𝑘,𝑙,◦,𝑞,𝑡) and 𝜌(𝑘,𝑙,𝑞,◦,𝑡) are the fraction of new traffic entering ingress VNF 𝑞 and the fraction of traffic departing from egress VNF 𝑞, respectively, going through logical link 𝑙 at time 𝑡.

Placement.
We can now correlate the routing decisions 𝜌 and the placement decisions 𝐴 as:

𝜌(𝑘,𝑙,𝑞₁,𝑞₂,𝑡) ≤ 𝐴(𝑘,𝑚,𝑞₂,𝑡), ∀𝑡 ∈ T, 𝑘 ∈ K, 𝑞₁ ∈ Q ∪ {◦}, 𝑞₂ ∈ Q, 𝑚 ∈ M, 𝑙 ∈ L ∪ L◦ : dst(𝑙) = 𝑚. (12)

The above constraint implies that, whenever there is incoming traffic to VNF 𝑞₂ through logical link 𝑙 whose destination is VM 𝑚, i.e., dst(𝑙) = 𝑚, VNF 𝑞₂ is deployed at VM 𝑚. Similarly, whenever there is outgoing traffic from VNF 𝑞₁ through logical link 𝑙 whose source is VM 𝑚, i.e., src(𝑙) = 𝑚, VNF 𝑞₁ is deployed at VM 𝑚:

𝜌(𝑘,𝑙,𝑞₁,𝑞₂,𝑡) ≤ 𝐴(𝑘,𝑚,𝑞₁,𝑡), ∀𝑡 ∈ T, 𝑘 ∈ K, 𝑞₁ ∈ Q, 𝑞₂ ∈ Q ∪ {◦}, 𝑚 ∈ M, 𝑙 ∈ L ∪ L◦ : src(𝑙) = 𝑚. (13)

System stability.
Let 𝜆(𝑠,𝑞) denote the total incoming traffic of VNF 𝑞 of service 𝑠. 𝜆(𝑠,𝑞) equals the sum of the ingress traffic and the traffic coming from other VNFs to VNF 𝑞 of service 𝑠:

𝜆(𝑠,𝑞) = 𝜆^new(𝑠) · P(𝑠,◦,𝑞) + Σ_{𝑞′∈Q\{𝑞}} 𝜆(𝑠,𝑞′) · P(𝑠,𝑞′,𝑞). (14)

Using 𝜆(𝑠,𝑞), the amount of traffic from VNF 𝑞₁ to VNF 𝑞₂ of service 𝑠 can be represented as:

Λ(𝑠,𝑞₁,𝑞₂) = 𝜆(𝑠,𝑞₁) · P(𝑠,𝑞₁,𝑞₂). (15)

We can now define an auxiliary variable to represent the incoming traffic of VNF 𝑞 of service request 𝑘, which enters VM 𝑚 at time 𝑡:

𝐼(𝑘,𝑚,𝑞,𝑡) = Σ_{𝑞′∈Q∪{◦}} Σ_{𝑙∈L∪L◦ : dst(𝑙)=𝑚} 𝜌(𝑘,𝑙,𝑞′,𝑞,𝑡) · Λ(𝑠,𝑞′,𝑞), 𝑡 ∈ T, 𝑠 ∈ S, 𝑘 ∈ K_𝑠, 𝑞 ∈ Q, 𝑚 ∈ M, (16)

where the summation is over all logical links ending at VM 𝑚. Finally, we describe the system stability requirement, which imposes that the incoming traffic not exceed the assigned service rate for each VNF 𝑞 of service request 𝑘 on VM 𝑚, at any point in time:

𝐼(𝑘,𝑚,𝑞,𝑡) ≤ 𝜇(𝑘,𝑚,𝑞,𝑡), ∀𝑡 ∈ T, 𝑘 ∈ K, 𝑞 ∈ Q, 𝑚 ∈ M. (17)

Generalized flow conservation.
Our model captures the possibility of having VNFs for which, due to processing, the amounts of incoming and outgoing traffic differ. We define the scaling factor 𝛼(𝑠,𝑞) as the ratio of outgoing traffic to incoming traffic for VNF 𝑞 of service 𝑠:

𝛼(𝑠,𝑞) = [ Σ_{𝑞′∈Q∪{◦}} Λ(𝑠,𝑞,𝑞′) ] / [ Σ_{𝑞′∈Q∪{◦}} Λ(𝑠,𝑞′,𝑞) ], 𝑠 ∈ S, 𝑞 ∈ Q. (18)

We also define the auxiliary variable 𝐷(𝑘,𝑚,𝑞,𝑡) to represent the outgoing traffic of VNF 𝑞 of service request 𝑘 departing VM 𝑚 at time 𝑡:

𝐷(𝑘,𝑚,𝑞,𝑡) = Σ_{𝑞′∈Q∪{◦}} Σ_{𝑙∈L∪L◦ : src(𝑙)=𝑚} 𝜌(𝑘,𝑙,𝑞,𝑞′,𝑡) · Λ(𝑠,𝑞,𝑞′), 𝑡 ∈ T, 𝑠 ∈ S, 𝑘 ∈ K_𝑠, 𝑞 ∈ Q, 𝑚 ∈ M, (19)

where the right-hand side enfolds all traffic flowing through logical links starting from VM 𝑚. We can then formulate the generalized flow conservation law for each VNF 𝑞 of service request 𝑘 on VM 𝑚 at time 𝑡:

𝐷(𝑘,𝑚,𝑞,𝑡) = 𝛼(𝑠,𝑞) · 𝐼(𝑘,𝑚,𝑞,𝑡), ∀𝑡 ∈ T, 𝑠 ∈ S, 𝑘 ∈ K_𝑠, 𝑞 ∈ Q, 𝑚 ∈ M, (20)

which implies that, for each VNF 𝑞 of service request 𝑘 on VM 𝑚, at any time, the outgoing traffic equals the incoming traffic multiplied by the scaling factor 𝛼(𝑠,𝑞).

Latency.
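Equations (14), (15), and (18) above can be evaluated numerically for a given VNFFG. Since (14) defines λ(s,·) implicitly, one option, sketched below with illustrative data structures of our own, is a fixed-point iteration; it converges when the VNFFG contains no cycle of total probability 1.

```python
def vnf_arrival_rates(vnfs, lam_new, p_in, p, iters=1000, tol=1e-12):
    """Eq. (14): lam[q] = lam_new * P(s, ◦, q) + Σ_{q' ≠ q} lam[q'] * P(s, q', q),
    solved by fixed-point iteration."""
    lam = {q: 0.0 for q in vnfs}
    for _ in range(iters):
        new = {q: lam_new * p_in.get(q, 0.0)
                  + sum(lam[q2] * p.get((q2, q), 0.0) for q2 in vnfs if q2 != q)
               for q in vnfs}
        if max(abs(new[q] - lam[q]) for q in vnfs) < tol:
            return new
        lam = new
    return lam

def traffic_matrix(lam, p):
    # Eq. (15): Λ(s, q1, q2) = λ(s, q1) · P(s, q1, q2)
    return {(q1, q2): lam[q1] * prob for (q1, q2), prob in p.items()}

def scaling_factor(q, Lam, lam_new, p_in, p_out, lam):
    """Eq. (18): ratio of outgoing to incoming traffic of VNF q, counting
    the dummy end-point ◦ on both sides via p_in and p_out."""
    incoming = lam_new * p_in.get(q, 0.0) + sum(
        v for (q1, q2), v in Lam.items() if q2 == q)
    outgoing = lam[q] * p_out.get(q, 0.0) + sum(
        v for (q1, q2), v in Lam.items() if q1 == q)
    return outgoing / incoming if incoming > 0 else 0.0
```

For a two-VNF chain where the first VNF forwards half of the packets and drops the rest, this yields α = 0.5 for the first VNF and α = 1 for the second, mirroring the firewall example above.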
End-to-end network latency for a traffic packet of a service request is the time it takes the packet to be served by all VNFs along the path from the ingress to the egress VNFs. Such a latency includes two contributions, namely, the network delay between pairs of VMs on which subsequent VNFs are deployed and the processing time at the VNFs themselves. The former can be defined based on the delay of logical link 𝑙, denoted by 𝐷^log(𝑙). Such a delay is the sum of the delays of the underlying physical links:

𝐷^log(𝑙) = Σ_{𝑒∈𝑙} 𝐷^phy(𝑒). (21)

We also introduce the binary variable 𝐹(𝑘,𝑙,𝑞₁,𝑞₂,𝑡) to represent whether logical link 𝑙 is used for routing the traffic from VNF 𝑞₁ to 𝑞₂ of service request 𝑘 at time 𝑡. 𝐹 can be described as:

𝜌(𝑘,𝑙,𝑞₁,𝑞₂,𝑡) ≤ 𝐹(𝑘,𝑙,𝑞₁,𝑞₂,𝑡), ∀𝑡 ∈ T, 𝑘 ∈ K, 𝑞₁,𝑞₂ ∈ Q, 𝑙 ∈ L. (22)

The traffic packets in the VNFFG follow a path 𝑝 of logical links in the underlying physical graph, which connects all VNFs in the VNFFG. Let 𝑤 ∈ W_𝑠 be a sequence of VNFs, from an ingress VNF to an egress VNF, in the VNFFG of service 𝑠. The network delay of traffic packets of service request 𝑘, which traverse the VNFs as specified by 𝑤 and go through the links belonging to 𝑝, is given by:

Σ_{(𝑞₁,𝑞₂)∈𝑤} Σ_{𝑙∈𝑝} 𝐹(𝑘,𝑙,𝑞₁,𝑞₂,𝑡) · 𝐷^log(𝑙). (23)

The processing time of VM 𝑚, denoted by 𝑅(𝑚,𝑡), is the time it takes for a traffic packet to be completely processed in the VM. Modeling each VM as a queue with discipline PS (or, equivalently, FIFO), the processing time of VM 𝑚 at time 𝑡 is [2]:

𝑅(𝑚,𝑡) = 1 / [ Σ_{𝑘∈K} Σ_{𝑞∈Q} (𝜇(𝑘,𝑚,𝑞,𝑡) − 𝐼(𝑘,𝑚,𝑞,𝑡)) ], 𝑚 ∈ M, 𝑡 ∈ T. (24)

Then, the processing time incurred by the traffic packets following the VNF sequence 𝑤 is given by:

Σ_{𝑞∈𝑤} Σ_{𝑚∈𝑝} 𝐴(𝑘,𝑚,𝑞,𝑡) · 𝑅(𝑚,𝑡). (25)

Finally, the experienced delay must be less than the target delay, i.
e.,

Σ_{(𝑞₁,𝑞₂)∈𝑤} Σ_{𝑙∈𝑝} 𝐹(𝑘,𝑙,𝑞₁,𝑞₂,𝑡) · 𝐷^log(𝑙) + Σ_{𝑞∈𝑤} Σ_{𝑚∈𝑝} 𝐴(𝑘,𝑚,𝑞,𝑡) · 𝑅(𝑚,𝑡) ≤ 𝐷^QoS(𝑠), ∀𝑡 ∈ T, 𝑠 ∈ S, 𝑘 ∈ K_𝑠, 𝑤 ∈ W_𝑠, 𝑝 ∈ P. (26)

Link capacity.
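Numerically, the delay constraint (26) combines the per-VM processing time of Eq. (24) with the per-path sums (23) and (25); a minimal sketch follows, where our reading of (24) assumes a unit numerator, i.e., the M/M/1-PS sojourn time 1/(μ − λ).

```python
def processing_time(mu_assigned, traffic_in):
    """Eq. (24), as reconstructed: R(m,t) = 1 / Σ_{k,q} (μ − I), i.e., the
    reciprocal of the VM's spare service capacity."""
    spare = sum(mu_assigned) - sum(traffic_in)
    if spare <= 0:
        raise ValueError("unstable VM: constraint (17) violated")
    return 1.0 / spare

def within_target(link_delays, vm_times, d_qos):
    """Constraint (26): network delay over the logical links of path p plus
    processing time at the VMs hosting the VNF sequence w."""
    return sum(link_delays) + sum(vm_times) <= d_qos
```

For example, a VM with aggregate assigned service rate 10 and aggregate arrival rate 6 contributes a processing time of 0.25 time units to every path that traverses it.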
The traffic on any physical link should not exceed the maximum link capacity 𝐵(𝑒). To formalize this constraint, we define the auxiliary variable 𝐿(𝑒,𝑡) to represent the traffic on physical link 𝑒 at time 𝑡. This variable equals the total traffic between each pair of VNFs that goes through a logical link 𝑙 containing the physical link 𝑒:

𝐿(𝑒,𝑡) = Σ_{𝑠∈S} Σ_{𝑘∈K_𝑠} Σ_{𝑞₁,𝑞₂∈Q} Σ_{𝑙∈L : 𝑒∈𝑙} Λ(𝑠,𝑞₁,𝑞₂) · 𝜌(𝑘,𝑙,𝑞₁,𝑞₂,𝑡). (27)

The link capacity constraint is expressed as:

𝐿(𝑒,𝑡) ≤ 𝐵(𝑒), ∀𝑒 ∈ E, 𝑡 ∈ T. (28)

Objective.
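Equations (27) and (28) above translate directly into an aggregation over the logical links that traverse each physical link; the sketch below uses a data layout of our own choosing.

```python
def link_loads(Lam, rho, contains):
    """Eq. (27): L(e,t) for one time step and one service request.
    Lam: {(q1, q2): traffic}; rho: {(l, q1, q2): fraction routed on l};
    contains: {l: set of physical links e traversed by logical link l}."""
    load = {}
    for (l, q1, q2), frac in rho.items():
        for e in contains.get(l, ()):
            load[e] = load.get(e, 0.0) + Lam.get((q1, q2), 0.0) * frac
    return load

def capacity_respected(load, B):
    # Constraint (28): no physical link may exceed its bandwidth B(e).
    return all(load.get(e, 0.0) <= B[e] for e in B)
```

Note that a single unit of inter-VNF traffic is charged once per physical link it crosses, so longer logical links are naturally more expensive in both load and, via (30), cost.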
The goal of the optimization problem is to maximize the service revenue while minimizing the total cost. The revenue obtained by serving one unit of traffic of service 𝑠 is indicated as 𝑋^rev(𝑠); we assume such a quantity to be inversely proportional to the target delay of service 𝑠, i.e., proportional to 1/𝐷^QoS(𝑠). This implies that serving services with a lower target delay yields a higher revenue for the MNO. The total revenue is expressed as:

𝑅 = Σ_{𝑡∈T} Σ_{𝑠∈S} Σ_{𝑘∈K_𝑠} 𝑋^rev(𝑠) · 𝑉(𝑘,𝑡) · 𝜆^new(𝑠). (29)

The total cost is the sum of the transmission cost on the physical links and of the computational and idle costs of the VMs, which are expressed, respectively, as:

𝐶^link = Σ_{𝑡∈T} Σ_{𝑒∈E} 𝑋^link(𝑒) · 𝐿(𝑒,𝑡), (30)

𝐶^cpu = Σ_{𝑡∈T} Σ_{𝑚∈M} Σ_{𝑘∈K} Σ_{𝑞∈Q} 𝑋^cpu(𝑚) · 𝜇(𝑘,𝑚,𝑞,𝑡) · 𝜔(𝑞), (31)

𝐶^idle = Σ_{𝑡∈T} Σ_{𝑚∈M} 𝑋^idle(𝑚) · (𝑈(𝑚,𝑡) + 𝑂(𝑚,𝑡)). (32)

The above costs are expressed per unit of time and depend, respectively, on a proportional cost 𝑋^link(𝑒) paid for each physical link 𝑒 per unit of traffic, a proportional cost 𝑋^cpu(𝑚) paid for each VM 𝑚 per unit of computation, and a fixed cost 𝑋^idle(𝑚) paid for each VM 𝑚 that is turning-on or active. Finally, we write our objective as:

max [ 𝑅 − (𝐶^link + 𝐶^cpu + 𝐶^idle) ]. (33)

C. Problem Complexity
The problem formulated above, which jointly makes decisions about VM activation, VNF placement, CPU assignment, and traffic routing, contains both integer and real decision variables; hence, it is non-convex. In the following, we prove that the problem is NP-hard through a reduction from the weight-constrained shortest path problem (WCSPP) to a simplified version of our problem.
Theorem 1:
The problem mentioned in Sec. III-A is NP-hard when the objective value is greater than zero.
Proof:
We reduce an NP-hard problem, the weight-constrained shortest path problem (WCSPP) [25], to our problem. Given a graph G(V,E), with a cost and a weight associated with each edge, the WCSPP asks for the minimum-cost route between two specified nodes such that the total weight does not exceed a given value. We consider a special case of our problem where only one service request, with a chain of two VNFs, arrives at t = 0 and departs at the next time step. We set the maximum number of instances for both VNFs to one. There are only two VMs in the physical infrastructure, with C_vm(m) = ∞ and X_cpu(m) = X_idle(m) = 0; the remaining nodes are network nodes. We set C_dc(d) = ∞, ∀d ∈ D. It is then easy to see that the WCSPP is equivalent to this special case of our problem when the objective value is greater than zero.

Besides its complexity, solving the problem formulated in Sec. III-B requires complete knowledge of the arrival and departure times of all service requests, which is not realistic in many scenarios. As detailed below, to cope with this issue, our strategy is to periodically solve the problem, with each problem instance leveraging only the information about past and current service requests.

IV. THE MAXSR SOLUTION
In light of the problem complexity discussed above, we propose a heuristic solution called MaxSR, which makes decisions (i) concerning only a subsequent time interval encompassing the present and the near future, which can be predicted with high accuracy [26], and (ii) based on the knowledge of the service requests occurring within such an interval. More precisely, starting from time step t, MaxSR makes decisions concerning the current service requests, accounting for a time horizon H, i.e., extending until t + H. After τ time steps, where τ ≤ H, MaxSR is executed again, accounting for the next time interval, i.e., [t + τ, t + τ + H). Note that, although decisions are made accounting for a time horizon equal to H, they are enacted only until the next execution of MaxSR, i.e., they hold, in practice, only for τ time steps. Even with such a limited time horizon, directly solving the problem defined in Sec. III-B is still NP-hard. To work around this limitation, at every execution MaxSR processes the service requests received in the last τ time steps sequentially, i.e., one request at a time. In the following, we provide an overview of MaxSR in Sec. IV-A, and we detail the algorithms composing our heuristic in Sec. IV-B.

A. Overview
At every execution, MaxSR first considers the service requests in decreasing order of the corresponding service revenue. It then activates the VMs necessary to serve the first service request, trying to map the VNF sequence w onto a path p connecting the VMs deemed to host the required VNFs. While doing so, more than one instance can be created for a VNF, if necessary, to meet the service target delay. To this end, we associate with each VNF a delay budget, which is proportional to the VNF computational complexity ω(q). Such a budget, however, is flexible, since the delay contribution of a VNF exceeding its budget may be compensated for by a subsequent VNF on w that is deployed on a VM able to process traffic faster than indicated by its own budget. Additionally, MaxSR exploits a backtracking approach: in case of a lack of sufficient resources at a certain point of the current path p, the algorithm can go back to the last successfully deployed VNF and look for an alternative deployment (hence path), leaving more spare budget for the subsequent VNFs. Nonetheless, it may prove impossible to find enough resources to accommodate the traffic and delay constraints of a given VNF instance; in this case, the service request is rejected. The decisions that MaxSR makes are summarized below.

Placement.
MaxSR aims to minimize the placement cost. This implies that the number of deployed VNF instances should be low, and the selected VMs should have a low cost. The algorithm thus starts with one instance and chooses the lowest-cost VM among the available ones. If this placement is not feasible, it tries the highest-capacity VM, to avoid the use of an extra instance. If the latter strategy is also infeasible, it increases the number of instances and repeats the process until a successful deployment is found or the limit on the maximum number of instances is reached (Alg. 2 and Alg. 3).
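This escalation can be sketched as follows; the `try_place` callback is a hypothetical stand-in for the VPTR/CA calls of Alg. 2, and all names are illustrative rather than taken from the authors' implementation.

```python
# Sketch of MaxSR's placement escalation: one instance with the cheapest VM
# first, then the largest VM, then progressively more instances, up to the
# per-VNF limit. try_place(n, strategy) is a hypothetical feasibility check.

def escalate_placement(max_instances, try_place):
    for n in range(1, max_instances + 1):
        for strategy in ("cheapest", "largest"):
            if try_place(n, strategy):
                return n, strategy     # first successful deployment wins
    return None                        # no feasible deployment: VNF fails
```

The same cheapest-then-largest ordering reappears in Alg. 2 (Lines 5-8).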
Routing.
Recall that each VNF may have several instances and that such instances may be deployed on VMs connected through multiple logical links. MaxSR adopts a water-filling approach to route the traffic between each pair of VNFs through the different logical links connecting a pair of VMs. To limit the processing time at each VM, the traffic entering each VM is properly set based on the VM's available capacity (Alg. 3).
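A toy sketch of this water-filling routing follows; all capacities, intake limits, and names are hypothetical, and the candidate links are assumed to be already sorted by the chosen strategy (cf. Alg. 3, Lines 15-18).

```python
# Sketch of water-filling routing: each candidate logical link l is filled
# with the remaining unserved traffic, capped by the link's residual capacity,
# the destination VM's residual compute (vm_cap / omega), and a per-VM intake
# limit. All inputs are illustrative toy values.

def water_fill(demand, links, link_cap, vm_cap, omega, intake_limit):
    """demand: unserved traffic; links: ordered list of (link, dst_vm) pairs.
    Returns (routed traffic per link, remaining unserved demand)."""
    routed, remaining = {}, demand
    vm_used = {vm: 0.0 for _, vm in links}
    for l, vm in links:
        cap = min(link_cap[l], vm_cap[vm] / omega)        # joint link/VM cap
        room = min(cap, intake_limit[vm] - vm_used[vm])   # per-VM intake cap
        x = min(remaining, max(room, 0.0))
        if x > 0:
            routed[l] = x
            remaining -= x
            vm_used[vm] += x
            link_cap[l] -= x
            vm_cap[vm] -= x * omega                       # compute consumed
    return routed, remaining
```

A nonzero `remaining` corresponds to the failure case in which the selected entities cannot fit the incoming traffic.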
CPU assignment.
MaxSR aims to keep the service rate of the used VMs as low as possible, in order to reduce the consumption of computing resources, hence the cost. This means setting the lowest service rate compatible with the per-VNF delay budget, except when we have to compensate for a VNF exceeding its delay budget; in the latter case, the algorithm opts for the maximum service rate on the VM (Alg. 4).
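The per-VNF delay budgets underpinning this rule can be sketched as below, following the proportional split Δ(s,q) = ω(q)/Σ_j ω(Q_s(j)) of Alg. 2 (Line 2); function and variable names are illustrative.

```python
# Sketch of the proportional delay-budget split: each VNF in the chain gets
# a share of the budget proportional to its computational complexity omega(q).

def delay_budgets(omegas):
    """omegas: computational complexities of the VNFs along the chain.
    Returns the normalized budget shares Delta(s, q)."""
    total = sum(omegas)
    return [w / total for w in omegas]
```

A more complex VNF thus receives a proportionally larger slice of the end-to-end delay budget.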
Algorithm 1:
Main body of MaxSR algorithm
Input: t, H, K_{t,H} ← {k ∈ K : [t, t+H) ∩ [t_arv(k), t_dpr(k)) ≠ ∅}
Output: result sets R_p := {μ(k,m,q)}, R_r := {r(k,l,q1,q2)}, VM states
1: R_p ← ∅, R_r ← ∅
2: R(k) ← X_rev(s) · (min{t+H, t_dpr(k)} − max{t, t_arv(k)}) · λ_new(s), ∀s ∈ S, k ∈ K_s ∩ K_{t,H}
3: sort k ∈ K_{t,H} by R(k) in descending order
4: forall k ∈ K_{t,H} do
5:   call BSRD(k) and update R_p and R_r
6: VM-Activation(R_p)

B. Algorithms
Alg. 1.
It is the main body of the MaxSR heuristic, taking as input the time horizon H, the current time step t, and the set K_{t,H} of service requests that should be served in the time horizon [t, t+H). Line 2 calculates the service revenue R(k) for each request k, based on the expected traffic to be served in the time horizon and the expected revenue, i.e., X_rev(s) for service s. The algorithm sorts the service requests in Line 3 in descending order of R(k). It then calls BSRD for each request, in order to determine whether and how to serve it within the time horizon. If the request can be served, the resulting VNF placement/CPU assignment and routing decisions are stored in R_p and R_r, respectively. For each served request, R_p will then contain a tuple for each VNF instance, specifying the allocated VM and its assigned service rate, while R_r will contain a tuple for each pair of VNF instances, determining the amount of traffic on their connecting logical link(s). Finally, the VMs required for running the service request are activated if not already active; we recall that it takes one time step to activate them (turning-on state), and they remain up until the service departure time.
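The horizon selection and revenue-based prioritization described above can be sketched as follows; request lifetimes are assumed to be given as [t_arv, t_dpr) intervals, and all identifiers are illustrative rather than taken from the authors' code.

```python
# Sketch of Alg. 1's input selection and prioritization: K_{t,H} collects the
# requests whose lifetime overlaps the horizon [t, t+H); each request k is
# weighted by its expected in-horizon revenue
# R(k) = X_rev(s) * (min(t+H, t_dpr) - max(t, t_arv)) * lambda_new(s).

def requests_in_horizon(lifetimes, t, H):
    """lifetimes: dict k -> (t_arv, t_dpr). Returns the set K_{t,H}."""
    return {k for k, (arv, dpr) in lifetimes.items()
            if max(t, arv) < min(t + H, dpr)}  # nonempty interval overlap

def prioritize(reqs, t, H):
    """reqs: dict k -> (x_rev, lam_new, t_arv, t_dpr); highest R(k) first."""
    def revenue(k):
        x_rev, lam, arv, dpr = reqs[k]
        return x_rev * max(min(t + H, dpr) - max(t, arv), 0) * lam
    in_horizon = requests_in_horizon(
        {k: (v[2], v[3]) for k, v in reqs.items()}, t, H)
    return sorted(in_horizon, key=revenue, reverse=True)
```

Each element of the returned list would then be handed to BSRD in turn.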
Given service request k for service s as input, the goal of Alg. 2 is to check whether all the VNFs of s can be deployed with the available resources. If so, the request is served and the result sets R_p and R_r are returned.

Algorithm 2: Backtracking-based service request deployment (BSRD)
Input: service request k of service s
Output: R_p, R_r
1: i ← 1; status ← normal; can-backtrack ← false; C ← ∅; R_p ← ∅; R_r ← ∅
2: Δ(s,q) ← ω(q) / Σ_{j=1}^{|Q_s|} ω(Q_s(j)), ∀q in the VNF chain of s
3: while i ≤ number of VNFs do
4:   if status is normal then
5:     for n ← 1 to N(s, Q_s(i)) do
6:       for strategy ∈ {cheapest, largest} do
7:         call VPTR(k, i, n, strategy) and CA(k, i)
8:         if deployment is successful then break
9:   else if status is critical then
10:    can-backtrack ← false
11:    call VPTR(k, i, N(s, Q_s(i)), largest) and CA(k, i)
12:  if the i-th VNF is successfully deployed then
13:    if status is normal then can-backtrack ← true
14:    update R_p, R_r; status ← normal; i ← i + 1
15:  else   ⊲ i-th VNF is not deployed
16:    status ← critical
17:    if can-backtrack then
18:      discard R_p, R_r for the (i−1)-th VNF; i ← i − 1
19:    else if the failure is due to the delay budget then
20:      update R_p, R_r; i ← i + 1
21:    else   ⊲ failure is due to traffic
22:      terminate and discard R_p and R_r
23:  if the result sets are not feasible then
24:    terminate and discard R_p and R_r

The global boolean variables status and can-backtrack represent the deployment status and the possibility of backtracking, respectively. status is critical if the last VNF deployment has failed, and normal otherwise. The global cache C is a set of results that facilitates the backtracking operation (see Alg. 3). The algorithm starts in normal mode; clearly, backtracking is not allowed for the first VNF in the VNFFG, and cache C is empty (Line 1). The algorithm first assigns a delay budget to each VNF of the service, proportional to the VNF computational complexity (Line 2), where Q_s(j) denotes the j-th VNF in the VNFFG. Then, it goes across the sequence of VNFs, starting from the ingress VNF, and deploys them one by one. For each VNF, Lines 4-11 decide on the number of required instances and the VM selection strategy, based on the deployment status.
The strategy can be cheapest or largest: the algorithm selects VMs with the lowest cost when the strategy is cheapest, and with the highest capacity when the strategy is largest. The first part (Lines 5-8) deploys the VNF in normal mode. Since the algorithm aims to keep the number of required VNF instances as low as possible, it starts with one instance and the cheapest strategy, and calls VPTR to determine placement and routing, and CA to determine the CPU assignment. The deployment is successful if neither of these algorithms fails. If the cheapest strategy does not yield a successful deployment for the VNF, the algorithm keeps the number of instances fixed and tries the largest strategy. If both strategies fail, the number of instances is increased by one and the process is repeated. The loop ends whenever a successful deployment is found (Line 8) or the maximum number of instances is reached. Lines 12-22 decide how to proceed along the VNF sequence, according to the result of the deployment, status, and can-backtrack. If the deployment is successful (Line 12), the algorithm updates the result sets, sets status to normal, and proceeds to the next VNF in the VNFFG (Line 14). can-backtrack is also updated in Line 13: backtracking is allowed for the next VNFs only when the current VNF has been successfully deployed in normal mode, which prevents the algorithm from backtracking again to a VNF that has already been deployed in critical mode. Otherwise (Line 15), status is set to critical and the algorithm proceeds as follows. As a first attempt, it tries to refine the placement made in the previous step. Thus, if backtracking is allowed, it reverts the result sets related to the previous VNF in the VNFFG and goes back to deploy it again (Line 18).
When the deployment fails, backtracking is not possible, and the failure is due to a violation of the delay budget, the algorithm preserves the current deployment in the result sets and proceeds to the next VNF, hoping to compensate for the exceeded delay budget (Line 20). If neither option is viable, the algorithm decides not to serve the current service request and reverts all the result sets related to its deployment (Line 22). Lines 10-11 deploy the VNF when status is critical, i.e., when the previous VNF deployment has failed. This VNF is either the next VNF in the VNFFG, when the algorithm is in the backtracking phase, or the previous one, when the algorithm is going to compensate for the exceeded delay budget with the current deployment. In either case, the algorithm chooses the fastest option to deploy the VNF, regardless of the cost, using the maximum number of instances and the largest strategy. Finally, the algorithm checks the feasibility of the decisions made with regard to the datacenter capacity and the service target delay after each VNF deployment (Line 23). For the former, it is enough to check that the total computational capability assigned to the VMs within each datacenter does not exceed its maximum capacity, i.e., for each datacenter d,

\[
\sum_{\mu(k,m,q)\in\mathcal{R}_p\colon m\in\mathcal{M}_d} \mu(k,m,q)\cdot\omega(q) \le C_{\mathrm{dc}}(d). \tag{34}
\]

Traffic packets belonging to a service may traverse different end-to-end paths in the physical network and thus experience different end-to-end delays. We define δ̄(k,m,q) as the maximum end-to-end delay that traffic packets belonging to service request k experience from the ingress VNF until they depart from VM m, which hosts an instance of VNF q. Thus, after deploying VNF q of service request k ∈ K_s, it is enough to check that this delay, for any VM m hosting an instance of q, does not exceed the service target delay:

\[
\bar{\delta}(k,m,q) \le D_{\mathrm{QoS}}(s). \tag{35}
\]

Algorithm 3:
VNF placement and traffic routing (VPTR)
Input: k ∈ K_s, i, n, strategy
1: (q1, q2) ← (Q_s(i−1), Q_s(i)); R_r ← ∅; Λ′ ← Λ(s, q1, q2)
2: B′_log(l) ← remaining capacity of l, ∀l ∈ L
3: L′ ← {l ∈ L ∪ L° : q1 is on src(l) ∧ dst(l) is free ∧ B′_log(l) > 0}
4: if C ≠ ∅ then   ⊲ cache is not empty
5:   fill l ∈ L′ : dst(l) = m, considering limit D(k,m,q)/α(s,q), ∀D(k,m,q) ∈ C : q = q2
6:   update R_r, Λ′, n, L′; C ← ∅
7: if strategy is cheapest then
8:   sort l ∈ L′ by ω(q2)·X_cpu(dst(l)) + Σ_{e∈l} X_link(e) in ascending order
9: else if strategy is largest then
10:  sort l ∈ L′ by min{B′_log(l), C_vm(dst(l))/ω(q2)} in descending order
11: L_top ← pick as many top l ∈ L′ as possible such that |{dst(l) : l ∈ L_top}| = n
12: M_top ← {dst(l) : l ∈ L_top}
13: Î(k,m,q2) ← (C_vm(m) / Σ_{m′∈M_top} C_vm(m′)) · Λ′, ∀m ∈ M_top
14: C′_vm(m) ← C_vm(m), ∀m ∈ M_top
15: forall l ∈ L_top do
16:   c(l) ← min{B′_log(l), C′_vm(dst(l))/ω(q2)}
17:   r(k,l,q1,q2) ← fill l with the remaining outgoing traffic of q1 on src(l), considering c(l) and limit Î(k,dst(l),q2)
18:   update Λ′, B′_log(l), C′_vm(dst(l)); R_r ← R_r ∪ {r(k,l,q1,q2)}
19: if Λ′ > 0 then
20:   preserve D(k,m,q1) in cache C such that q1 is on m
21:   return fail, ∅
22: return success, R_r
It determines the placement and traffic routing for the i-th VNF of request k of service s, using n instances and the given strategy. Line 1 initializes (q1, q2) to the i-th pair of VNFs in the VNFFG of service s, the routing result set R_r to ∅, and the remaining unserved traffic between q1 and q2, i.e., Λ′, to Λ(s, q1, q2). The first pair of VNFs is (◦, q2), under the assumption that the dummy VNF ◦ is placed on the dummy VM. In Lines 2-3, first the remaining capacity of each logical link l is computed and stored in B′_log(l); then the links that have a remaining capacity greater than zero, host VNF q1 on their source VM, and host no VNF on their destination are picked and stored in the set L′. The links in L′ and their destination VMs are the only potential candidates for placing instances of the i-th VNF and accommodating its incoming traffic Λ′. In other words, in the rest of the algorithm, we consider a logical link jointly with its destination VM as one entity, and pick the best entities according to strategy and n. If the selected entities cannot fit the incoming traffic, the placement fails; nonetheless, we still preserve the amount of satisfied traffic in the cache and exploit this information in the backtracking phase. The speed of the backtracking operation is greatly improved by caching. Specifically, when Alg. 3 is called in the backtracking phase to refine the placement of the i-th VNF, the cache contains results that determine the routing of the portion of the outgoing traffic of the i-th VNF toward the (i+1)-th VNF that was satisfied by the previous deployment of the (i+1)-th VNF in the VNFFG. Lines 4-6 exploit the cached results and accommodate the unserved portion of the incoming traffic using different instances, which helps the next deployment of the (i+1)-th VNF fully serve its traffic.
For instance, assuming α(s, Q_s(i+1)) = 1 and that the placement of the (i+1)-th VNF has failed with Λ′ unserved traffic, the backtracking step will have to accommodate only Λ′ traffic on extra VMs, i.e., the routing and placement results for the already-served traffic portion, D(k,m,q) ∈ C, will not change. The pairs of logical links and connected VMs are selected for placement and routing based on the given strategy. If the strategy is cheapest, they are sorted in ascending order of the cost of the logical link plus the VM CPU cost (Line 8). If the strategy is largest, we sort them in descending order of the minimum between the remaining capacity of the logical link and that of the VM (Line 10). Line 11 picks the largest set of top logical links such that the number of unique destination VMs is equal to the number of instances, i.e., n, and stores them in L_top. Note that there may be multiple logical links with the same destination VM in this set, and we therefore pick the largest such set to increase the chance of fitting the traffic. If the number of unique destination VMs is less than n, L_top will be empty and the placement fails. Otherwise, we store the destination VMs corresponding to the logical links l ∈ L_top in the set M_top (Line 12). To avoid an exceedingly high processing time, Line 13 introduces a limit on the amount of traffic entering a given VM m ∈ M_top, proportional to the VM's maximum computational capacity. Notice that all logical links ending at the same destination VM have the same limit. The remaining computational capacity of each selected VM, C′_vm(m), is initialized to its maximum C_vm(m) (Line 14). The algorithm adopts a water-filling approach to fill the logical links in Lines 15-18. First, for each logical link l and its connected VM dst(l), the remaining capacity, i.e., the minimum of the remaining capacities of l and dst(l), is stored in c(l) (Line 16).
Then, logical link l is filled with the remaining unserved outgoing traffic of VNF q1 on VM src(l), so that neither the c(l) limit on the capacity of logical link l nor the Î(k,m,q2) limit on the incoming traffic of VM dst(l) is violated. Line 18 updates the remaining unserved traffic from q1 to q2 (Λ′), the remaining capacity of logical link l (B′_log(l)), the remaining capacity of the destination VM (C′_vm(dst(l))), and the routing result set (R_r). Finally, if there is still some unserved traffic from VNF q1 to q2 (i.e., not all the traffic can be served), the algorithm returns fail (Lines 19-21). Line 20 preserves the satisfied outgoing traffic of each VM m hosting an instance of VNF q1, i.e., D(k,m,q1), in the cache, so as to use it later in case of backtracking. Otherwise, the algorithm returns success with the routing result set R_r.

Algorithm 4:
CPU assignment (CA)
Input: k ∈ K_s, i, R_r
1: (q1, q2) ← (Q_s(i−1), Q_s(i)); R_p ← ∅
2: L_dep ← {l ∈ L : ∃ r(k′, l, q1′, q2′) ∈ R_r : k′ = k ∧ q1′ = q1 ∧ q2′ = q2 ∧ r(k′, l, q1′, q2′) > 0}
3: M_dep ← {m ∈ M : ∃l ∈ L_dep : dst(l) = m}
4: for m ∈ M_dep do
5:   I(k,m,q2) ← Σ_{r(k,l,q1,q2)∈R_r : dst(l)=m} r(k,l,q1,q2)
6:   δ̌(k,m,q2) ← max_{l∈L_dep : dst(l)=m} ( δ̄(k,src(l),q1) + D_log(l) )
7:   if status is critical then
8:     μ(k,m,q2) ← C_vm(m)/ω(q2)
9:   else   ⊲ status is normal
10:    μ(k,m,q2) ← I(k,m,q2) + 1 / ( Σ_{j=1}^{i} Δ(s,Q_s(j)) − δ̌(k,m,q2) )
11:    if μ(k,m,q2) ∉ ( I(k,m,q2), C_vm(m)/ω(q2) ] then
12:      μ(k,m,q2) ← C_vm(m)/ω(q2)
13:   R_p ← R_p ∪ {μ(k,m,q2)}
14:   δ̄(k,m,q2) ← δ̌(k,m,q2) + 1/( μ(k,m,q2) − I(k,m,q2) )
15: if max_{m∈M_dep} δ̄(k,m,q2) > Σ_{j=1}^{i} Δ(s,Q_s(j)) then
16:   return fail, R_p
17: return success, R_p

Alg. 4. It is called in Lines 7 and 11 of Alg. 2, when the deployment of VNF q2 by Alg. 3 is successful. Given the result set R_r, this algorithm is responsible for assigning the service rates to the VMs running the deployed instances of VNF q2. After the initialization, Line 2 defines L_dep as the set of logical links used for routing a part of the traffic from any instance of VNF q1 to any instance of VNF q2. We store the VMs on which VNF q2 is already deployed in the set M_dep (Line 3). Then, for each m ∈ M_dep, we compute the incoming traffic as the sum of the traffic on all logical links ending in VM m, and store it in I(k,m,q2) (Line 5). δ̌(k,m,q2) represents the maximum end-to-end delay that traffic packets experience from the ingress VM to VM m, which hosts an instance of VNF q2, before being processed by m.
For each logical link l ∈ L_dep with dst(l) = m, this delay equals the sum of the maximum end-to-end delay of traffic packets after being processed by VNF q1 on VM src(l), i.e., δ̄(k,src(l),q1), and the delay of logical link l, i.e., D_log(l). Taking the maximum over all such logical links, we obtain δ̌(k,m,q2) in Line 6. Similarly to the VNF deployment in Alg. 3, the algorithm assigns service rates to VMs based on the deployment status. In critical mode, the algorithm aims to reduce the delay contribution, which depends on the logical link delays and on the processing time at the VMs. The logical links have already been selected by the VPTR algorithm; thus, here we assign the maximum possible service rate to the VM in order to reduce the processing time (Line 8). Instead, when the algorithm is in normal mode, it chooses the minimum possible service rate for VM m (Line 10), such that the VNF delay budgets are not violated, i.e.,

\[
\sum_{j=1}^{i} \Delta(s, Q_s(j)) - \check{\delta}(k,m,q_2) = \frac{1}{\mu(k,m,q_2) - I(k,m,q_2)}. \tag{36}
\]

In the above equation, the right- and left-hand sides represent the processing time at VM m and the remaining delay budget of the VNFs, respectively. To compute the latter, the total delay budget of the VNFs up to the i-th one (i.e., the current one) is first calculated; then, the maximum end-to-end delay of traffic packets before being processed by VNF q2 on VM m, i.e., δ̌(k,m,q2), is subtracted from it. The computed service rate for VM m may be invalid because (i) no delay budget is left to process the current VNF on VM m, i.e., the left-hand side of (36) becomes non-positive, or (ii) the assigned service rate exceeds the maximum capability of the VM. In both cases the CA algorithm fails; however, the VM is assigned its maximum computational capability to process the VNF (Line 12). Recall that, although the CPU assignment failed for the current VNF, the algorithm keeps the results to be used in Alg. 2 (Line 19) when backtracking is not allowed.
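A toy illustration of this service-rate choice, with the fallback to the VM's maximum rate in the two failure cases, is sketched below; all values and names are hypothetical.

```python
# Sketch of CA's normal-mode service-rate computation, Eq. (36): the remaining
# delay budget equals the M/M/1-style processing delay 1/(mu - I). If no budget
# is left, or the required rate exceeds the VM's capability, the rate is pinned
# to the maximum C_vm/omega and a failure is flagged. Toy values only.

def service_rate(delay_budgets, delta_check, incoming, vm_cap, omega):
    """delay_budgets: Delta(s, Q_s(j)) for j = 1..i; delta_check: delay accrued
    before processing on this VM; incoming: I(k, m, q). Returns (mu, ok)."""
    mu_max = vm_cap / omega                    # maximum feasible service rate
    slack = sum(delay_budgets) - delta_check   # remaining delay budget
    if slack <= 0:                             # case (i): no budget left
        return mu_max, False
    mu = incoming + 1.0 / slack                # Eq. (36) solved for mu
    if mu > mu_max:                            # case (ii): exceeds VM capability
        return mu_max, False
    return mu, True
```

The `False` flag mirrors the CA failure that Alg. 2 may later absorb by compensating on a subsequent VNF.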
In this case, the algorithm continues with the next VNF and tries to compensate for the exceeded delay budget. Line 13 stores the results, and Line 14 updates δ̄(k,m,q2) for this VM, which gives the maximum end-to-end delay after the packets are processed by VM m. Finally, when all service rates have been assigned, the algorithm returns fail if the remaining delay budget is violated for at least one VM (Line 15), and success otherwise.

C. Computational Complexity
The MaxSR heuristic takes as input the set of physical links E, the service requests K and their VNFFGs Q_s, the VMs M, and the logical links L. Note that L is considered an input since it is computed once for all executions of the MaxSR algorithm. Below, we prove that this algorithm has a worst-case polynomial complexity in the input parameters.

Theorem 2:
The MaxSR algorithm has a worst-case polynomial computational complexity.
Proof:
First, we determine the complexity of the
VPTR and CA algorithms. VPTR constructs and sorts the set L′ in O(|L| log |L|) and adopts water filling to fill the logical links in O(|L|); thus, the total time complexity of this algorithm is O(|L| log |L|). CA also has O(|L|) complexity; hence, the total computational complexity of VPTR and CA together remains equal to that of VPTR. Alg. 1 sorts the service requests in O(|K| log |K|) and calls BSRD for each service request. In the worst case,
BSRD tries every possible number of instances and every strategy for all the VNFs in the VNFFG of the given service request. Let N and Q be upper bounds on the maximum number of instances, i.e., N(s,q), ∀s ∈ S, q ∈ Q, and on the number of VNFs in a VNFFG, i.e., |Q_s|, ∀s ∈ S, respectively. Thus, the total time complexity of BSRD is O(NQ|L| log |L|), and the total time complexity of Alg. 1 is O(|K| NQ |L| log |L| + |K| log |K|). Therefore, the worst-case total time complexity is polynomial in the input parameters. In other words, the complexity of the heuristic depends primarily on the number of service requests, the number of VNFs in each VNFFG, the number of deployment attempts for each VNF, and the number of logical links.

V. NUMERICAL RESULTS
We now present the results of the numerical experiments we conducted, and show that our proposed scheme consistently performs better than state-of-the-art approaches and close to the optimum. We compare our heuristic algorithm against the following benchmarks:
• Global optimum. The solution of the optimization problem defined in Sec. III-B, obtained by brute-force search, assuming exact knowledge of the arrival and departure times of all service requests.
• Best-Fit. An online algorithm that decides about each service request upon its arrival, without any information about future service requests. Best-Fit deploys the VNFs of a service request one by one, using a single instance of each VNF and the cheapest strategy. If the request can be served, the selected resources are dedicated to the service request until its departure.
In our performance evaluation, we use the following performance metrics:
• Service revenue, defined as the sum of the revenues achieved by serving service requests. For a single request of service s, this metric equals the amount of served traffic multiplied by X_rev(s).
• Cost/traffic, which reflects the average cost incurred to serve one unit of traffic.
In the following, we first consider a small-scale network scenario, for which the optimum solution can be obtained in a reasonable time. This scenario gives interesting and easy-to-interpret insights into how each service type impacts the revenue and the cost/traffic ratio. Then, we run MaxSR and Best-Fit in a large-scale real network scenario, where obtaining the optimum solution is impractical. Table II summarizes the services we consider in our performance evaluation, inspired by real-world 5G applications. The revenue gained from serving one unit of traffic of service s, i.e., X_rev(s), is set inversely proportional to the service target delay.
We assume that service requests arrive according to a Poisson process and that the duration of requests follows an exponential distribution. In both scenarios, we study the impact of traffic and delay on the performance metrics by multiplying the traffic arrival rates λ_new and the physical link delays D_phy(e) by different factors. We run each experiment – times and report the average value for each point in the figures. In general, MaxSR, taking advantage of backtracking, achieves a service revenue close to the optimum and better than Best-Fit. However, the cost/traffic ratio depends on how tight the target delay is. When the target delay is small, the chance of backtracking increases; therefore, MaxSR incurs a higher cost to serve the requests.

TABLE II: List of services
Service | D_QoS (ms) | λ_new (Mb/s) | X_rev (€/Gb) | Application
s1 | 10 | 3 | 100 | safety apps (e.g., vehicle collision detection)
s2 | 45 | 10 | 22.– | real-time apps (e.g., gaming)
s3 | 80 | 15 | 12.– | soft real-time apps
s4 | – | – | – | delay-tolerant apps (e.g., video streaming)

TABLE III: Different VM types in datacenters
VM type | C_vm (MIPS) | X_cpu (€/MIPS/hour) | X_idle (€/hour)
Small | 600 | 2×10^– | –
Medium | – | – | –
Large | – | – | –

A. Small-scale Scenario
We consider two pairs of VMs of different types, i.e., small and medium, as described in Table III. The VMs within each pair are connected through a physical link: physical links between small-type and medium-type VMs have a cost of – €/Gb and – €/Gb per hour, respectively, while their latency varies from – ms to – ms, with the default value set to – ms; we disregard the link capacity. The time needed to set up a VM is one minute. We consider two simple services, each having a chain of two VNFs, with target delays of – ms and – ms and input traffic rates of – Mb/s and – Mb/s, respectively (as summarized in Table II). In this scenario, we set N(s,q) = – for all VNFs and an average duration of – minutes for each service, and we assign the requests randomly to the arrival points of a Poisson process with an average rate of – requests per minute, while the total system lifespan is set to – minutes.

Impact of Physical Link Latency and Arrival Traffic.
Fig. 2 shows the impact of the traffic arrival intensity on the service revenue and the cost/traffic ratio. MaxSR matches the optimum, and Best-Fit performs close to the optimum in terms of both service revenue and cost/traffic ratio. Since it has no backtracking mechanism, Best-Fit does not serve a request whenever any of its VNFs cannot be served within its delay budget, i.e., it has no budget flexibility; therefore, it achieves a lower service revenue than the optimum. While the cost of physical links increases proportionally to the traffic, the cost of VMs in turning-on mode remains constant, and their cost in active mode increases less than proportionally with the traffic; the resulting effect is that the cost/traffic ratio decreases with the traffic, which conforms to the intuitive notion that serving larger amounts of traffic is more cost-efficient. Best-Fit incurs a higher cost compared to MaxSR and the optimum because it does not support VNF migration, causing a VNF to keep running on a high-cost VM even if a low-cost VM becomes available. The excess VM CPU cost and transmission cost scale with the traffic, whereas the excess VM idle cost
Fig. 2: Small-scale scenario. Impact of service request arrival traffic on the absolute value of service revenue and cost/traffic ratio. Physical link delay = – ms.
remains constant; therefore, the difference between Best-Fit and the optimum becomes smaller as the traffic increases.

Fig. 3: Small-scale scenario. Impact of physical link latency on the absolute value of service revenue and cost/traffic ratio. Arrival traffic multiplier = –.

Fig. 3 shows the impact of the physical link latency on the service revenue and the cost/traffic ratio. For all latency values, MaxSR is still able to achieve the optimum service revenue. As shown in Fig. 4a, no strategy (not even the optimum) can serve all requests when the physical link delay is – ms, especially for the low-delay service. The reason is that, when the number of concurrent requests becomes larger than two, both the optimum and MaxSR give priority to the high-revenue service, and requests for the other service are processed only if resources are available. When the physical link delay increases, service requests need more computational capacity on the VMs to meet their target delay, in order to offset the longer network delays. Specifically, when the physical link delay is – ms, requests of the low-delay type can only be served on high-capacity VMs and, therefore, concurrent requests of this type cannot be served. This is confirmed by the degradation of the optimum in Fig. 3a and in Fig. 4b, where the fraction of served requests of this type drops when the physical link delay is – ms. Best-Fit gains a substantially lower service revenue compared to the others, especially for higher values of the physical link delay. As shown in Fig. 4b, this is because Best-Fit cannot deploy requests of the low-delay type in those cases. This, in turn, is due to the fact that it does not support backtracking: when the delay budget for the second VNF in the chain is violated, no corrective action is taken and the whole request fails. Fig. 3b shows that MaxSR has a higher cost/traffic ratio when the physical link delay is over – ms. The reason is that the need for backtracking increases with the physical link delay, and VMs become more likely to be scaled to their maximum
s e r v i ce r e q s . Low cost resourcesHigh cost resources
OptimumMaxSRBest-Fit (a) Physical link delay = ms. s s s s s F r ac . o f d e p . s e r v i ce r e q s . Low cost resourcesHigh cost resources
OptimumMaxSRBest-Fit (b) Phsysical link delay = ms. Fig. 4: Fraction of deployed service requests for each serviceand algorithm. Arrival traffic multiplier = 1.Fig. 5: Cogent Network Topology.capacity, which results in a higher cost. As one might expect,the cost/traffic ratio for Best-Fit decreases when physical linkdelay ≥ ms because it does not serve requests of highercost service 𝑠 . Recall that the cost of a service depends onthe amount of required CPU on VMs, and therefore serviceswith lower target delays incur more costs to serve one unit oftraffic. Large-scale scenario.
We consider the real-world inter-datacenter network Cogent, a tier Internet service provider (Fig. 5). This network topology contains access nodes with physical links and datacenters. We set the cost of the links connecting the datacenters to . €/GB. The delay of the logical links connecting the datacenters is set proportional to their geographical length, while the links inside each datacenter are assumed to be ideal, with no capacity limit, latency, or cost. We assume that each datacenter hosts VMs, each of which is connected to some edge switches. We categorize the VMs within each datacenter into small, medium, and large types according to their capacity and cost, as described in Table III. We assume VMs need one minute of setup time before becoming active.

We consider the four different services described in Table II, each of which is representative of a category of real 5G applications. In this scenario, we assume that the VNFFG of each service is a chain of five VNFs. We further assume that the computational complexity, i.e., 𝜔, and the maximum number of instances, i.e., 𝑁(𝑠, 𝑞), for two VNFs of service 𝑠 are three, while the other VNFs have 𝜔 = and 𝑁(𝑠, 𝑞) = . Similar to the previous scenario, we consider an equal number of requests for each service, where requests arrive across time steps with an average inter-arrival time of three minutes and end after an average duration of two hours; the total system lifespan is assumed to be one day. In this experiment, we set 𝐻 to minutes and 𝜏 to minutes.
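The request pattern described above can be sketched as follows. This is an illustrative reconstruction, not the authors' simulator: the three-minute mean inter-arrival time, two-hour mean duration, one-day lifespan, and equal per-service request counts come from the text, while the exponential distributions and the round-robin service assignment are our assumptions.

```python
import random

# Illustrative request trace for the large-scale scenario: requests
# for four services arrive with an average inter-arrival time of
# 3 minutes and last 2 hours on average, over a one-day lifespan.
MEAN_INTERARRIVAL = 3.0   # minutes (from the text)
MEAN_DURATION = 120.0     # minutes, i.e., two hours (from the text)
LIFESPAN = 24 * 60.0      # minutes, i.e., one day (from the text)
SERVICES = ["s1", "s2", "s3", "s4"]  # hypothetical service labels

def generate_trace(seed=0):
    """Return a list of (service, arrival, departure) tuples."""
    rng = random.Random(seed)
    trace, t, i = [], 0.0, 0
    while True:
        # Exponential inter-arrivals (assumed; the paper states averages only).
        t += rng.expovariate(1.0 / MEAN_INTERARRIVAL)
        if t >= LIFESPAN:
            break
        # Equal number of requests per service: assign round-robin.
        service = SERVICES[i % len(SERVICES)]
        trace.append((service, t, t + rng.expovariate(1.0 / MEAN_DURATION)))
        i += 1
    return trace

trace = generate_trace()
# Roughly one day / 3 minutes ≈ 480 arrivals on average.
print(len(trace))
```

A sliding-horizon orchestrator would consume such a trace window by window, looking only at the requests arriving within the next period of length 𝐻.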
Fig. 6: Large-scale scenario. Impact of service request arrival traffic on the absolute value of service revenue and cost/traffic ratio. Physical link delay multiplier = . (a) Service revenue (K€). (b) Cost/traffic (€/Gb).

Fig. 7: Fraction of deployed service requests for each service and algorithm. Physical link delay multiplier = . (a) Traffic multiplier = . (b) Traffic multiplier = .

Impact of Physical Link Latency and Arrival Traffic.
As explained above, the optimum values cannot be obtained for this scenario in a reasonable time; therefore, we rely on the results for MaxSR and Best-Fit. Fig. 6a shows the effect of the arrival traffic on the service revenue, while Fig. 7 shows the fraction of requests of each service that can be successfully deployed. We observe that the service revenue for MaxSR changes almost proportionally with the traffic, because increasing the traffic has almost no impact on the fraction of requests served by this algorithm. Best-Fit serves a lower fraction of service requests, and therefore achieves lower revenue. Besides, Best-Fit shows a drop-off in service revenue when the arrival traffic multiplier is . : as confirmed by Fig. 7b, this is because Best-Fit does not serve requests of the high-traffic service 𝑠 when the traffic multiplier is over . , due to its lack of support for multiple VNF instances.

Fig. 6b shows the impact of the arrival traffic on the cost/traffic ratio. Best-Fit has a lower cost/traffic ratio when the arrival traffic multiplier is less than . , since MaxSR must use resources with higher cost and higher computational capability to serve more service requests; in other words, Best-Fit serves less traffic, but that traffic is served cheaply. Similar to the service revenue, the cost/traffic ratio for Best-Fit rises significantly when the arrival traffic multiplier is . : as confirmed by Fig. 8, when the traffic multiplier increases from . to . , Best-Fit is no longer able to serve a significant fraction of the total traffic. As shown by Fig. 7, the traffic Best-Fit is unable to serve mainly belongs to the low-cost service 𝑠, which results in a higher cost for the served traffic.

Fig. 8: Fraction of offered traffic deployed on each resource type for each algorithm. low, med, high, and not dep. denote the fraction of offered traffic served on low-cost resources, medium-cost resources, high-cost resources, and not served at all, respectively. Physical link delay multiplier = . (a) Traffic multiplier = . (b) Traffic multiplier = .

Fig. 9a shows the impact of the physical link latency on the service revenue. Similar to the small-scale scenario, MaxSR outperforms Best-Fit, especially for higher values of the physical link delay, because the latter cannot serve the requests for services with low target delay (hence, higher revenue). As these service types have a higher cost, they also cause the cost/traffic ratio for Best-Fit to be lower than MaxSR's, as shown in Fig. 9b.

Fig. 9: Large-scale scenario. Impact of physical link delay on the absolute value of service revenue and cost/traffic ratio. Arrival traffic multiplier = 1. (a) Service revenue (K€). (b) Cost/traffic (€/Gb).
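The Best-Fit shortfall discussed in both scenarios comes from greedy chain placement without backtracking: once one VNF in the chain violates its delay budget, the whole request fails instead of revisiting earlier placement choices. The contrast can be sketched as follows; the VM types, delays, and costs are toy values of our own, not the paper's parameters, and the function names are illustrative.

```python
# Toy sketch of VNF-chain placement with and without backtracking.
# Each placed VNF adds its VM's processing delay plus a fixed link
# delay; the whole chain must fit within the end-to-end delay budget.
VM_TYPES = [                 # (name, processing delay, cost) -- assumed values
    ("small", 4.0, 1.0),
    ("large", 1.0, 3.0),
]
LINK_DELAY = 1.0             # assumed per-hop link delay

def place_chain(n_vnfs, budget, backtrack):
    """Return a list of VM types, one per VNF, or None on failure."""
    def helper(i, remaining, plan):
        if i == n_vnfs:
            return plan
        # Greedy: try the cheapest VM type first.
        for name, proc, cost in sorted(VM_TYPES, key=lambda v: v[2]):
            left = remaining - proc - LINK_DELAY
            # Skip this VM type if the delay budget is already exceeded.
            if left < 0:
                continue
            result = helper(i + 1, left, plan + [name])
            if result is not None:
                return result
            if not backtrack:    # Best-Fit: no corrective action,
                return None      # the whole request fails.
        return None
    return helper(0, budget, [])

# A 3-VNF chain with a tight budget: the greedy choice of a "small"
# VM for the first VNF must be revised by later, costlier choices.
print(place_chain(3, 10.0, backtrack=False))  # None
print(place_chain(3, 10.0, backtrack=True))   # ['small', 'large', 'large']
```

The backtracking variant escapes the dead end by retrying with more expensive, higher-capacity VM types, which mirrors why MaxSR serves more requests but exhibits a higher cost/traffic ratio at large link delays.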
Running Time.
We run our experiments on a server with a -core Intel Xeon E5-2690 v2 3.00 GHz CPU and GB of memory. To compare the running times of the different algorithms, we consider the case where the arrival traffic and physical link delay multipliers are equal to one. For each scenario, we run the algorithms times and report the average running time in Table IV. MaxSR and Best-Fit are substantially faster than brute-force in the small-scale scenario. The prohibitively long running time of brute-force highlights its poor scalability and makes it inapplicable, in practice, to the large-scale scenario. The results for the large-scale scenario show that, although MaxSR has a higher running time than Best-Fit due to backtracking, both of them are scalable and adequately fast for large-scale networks.

VI. CONCLUSION
We proposed a dynamic service deployment strategy for 5G networks, accounting for real-world aspects such as VM setup times, and jointly making all the required decisions. We first formulated the problem of joint request admission, VM activation, VNF placement, resource allocation, and traffic routing as a MILP, based on the complete knowledge of request arrival and departure times. We took the MNO profit as the main objective to be optimized over the entire system lifespan, leveraging a queueing model to ensure that all requests adhere to their latency targets. Our model also accounts for the key features of 5G services, such as complex VNF graphs and arbitrary input traffic.

Due to the problem complexity, we further proposed a heuristic, MaxSR, which has polynomial complexity and attains near-optimal solutions, while only needing the knowledge/prediction of the upcoming service requests over a short time horizon. The algorithm works in a sliding-horizon fashion, rearranging the currently-served requests across the existing VMs to reduce the deployment costs, and admitting new ones as they arrive at the system. Furthermore, the parameters of MaxSR allow for different tradeoffs between solution optimality and running time. We demonstrated the effectiveness and efficiency of our approach through a numerical evaluation including different network scenarios.

Fig. 10: Fraction of deployed service requests for each service and algorithm. Arrival traffic multiplier = . (a) Physical link delay multiplier = . (b) Physical link delay multiplier = .

TABLE IV: Running time (in seconds)
Scenario      Brute-force   MaxSR   Best-Fit
Small-scale   399           0.      0.
Large-scale   –             21      2

ACKNOWLEDGMENTS
This work was supported by the EU 5GROWTH project (Grant No. 856709).

REFERENCES
[1] ETSI, "Network Functions Virtualisation (NFV); Management and Orchestration," 2017.
[2] S. Agarwal, F. Malandrino, C. F. Chiasserini, and S. De, "VNF placement and resource allocation for the support of vertical services in 5G networks," IEEE/ACM Transactions on Networking, vol. 27, no. 1, pp. 433–446, Feb. 2019.
[3] B. Yi, X. Wang, K. Li, M. Huang et al., "A comprehensive survey of network function virtualization," Computer Networks, vol. 133, pp. 212–262, 2018.
[4] R. Cohen, L. Lewin-Eytan, J. S. Naor, and D. Raz, "Near optimal placement of virtual network functions," in IEEE Conference on Computer Communications (INFOCOM), Apr. 2015, pp. 1346–1354.
[5] L. Gu, S. Tao, D. Zeng, and H. Jin, "Communication cost efficient virtualized network function placement for big data processing," in IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Apr. 2016, pp. 604–609.
[6] M. Mechtri, C. Ghribi, and D. Zeghlache, "A scalable algorithm for the placement of service function chains," IEEE Transactions on Network and Service Management, vol. 13, no. 3, pp. 533–546, Sep. 2016.
[7] C. Pham, N. H. Tran, S. Ren, W. Saad, and C. S. Hong, "Traffic-aware and energy-efficient VNF placement for service chaining: Joint sampling and matching approach," IEEE Transactions on Services Computing, pp. 1–1, 2017.
[8] M. M. Tajiki, S. Salsano, L. Chiaraviglio, M. Shojafar, and B. Akbari, "Joint energy efficient and QoS-aware path allocation and VNF placement for service function chaining," IEEE Transactions on Network and Service Management, vol. 16, no. 1, pp. 374–388, Mar. 2019.
[9] J. Pei, P. Hong, K. Xue, and D. Li, "Efficiently embedding service function chains with dynamic virtual network function placement in geo-distributed cloud system," IEEE Transactions on Parallel and Distributed Systems, vol. 30, no. 10, pp. 2179–2192, Oct. 2019.
[10] G. Sallam and B. Ji, "Joint placement and allocation of virtual network functions with budget and capacity constraints," in IEEE Conference on Computer Communications (INFOCOM), Apr. 2019, pp. 523–531.
[11] M. A. Tahmasbi Nejad, S. Parsaeefard, M. A. Maddah-Ali, T. Mahmoodi, and B. H. Khalaj, "vSPACE: VNF simultaneous placement, admission control and embedding," IEEE Journal on Selected Areas in Communications, vol. 36, no. 3, pp. 542–557, Mar. 2018.
[12] R. Zhou, Z. Li, and C. Wu, "An efficient online placement scheme for cloud container clusters," IEEE Journal on Selected Areas in Communications, vol. 37, no. 5, pp. 1046–1058, May 2019.
[13] T. Kuo, B. Liou, K. C. Lin, and M. Tsai, "Deploying chains of virtual network functions: On the relation between link and server usage," IEEE/ACM Transactions on Networking, vol. 26, no. 4, pp. 1562–1576, Aug. 2018.
[14] X. Fei, F. Liu, H. Xu, and H. Jin, "Adaptive VNF scaling and flow routing with proactive demand prediction," in IEEE Conference on Computer Communications (INFOCOM), Apr. 2018, pp. 486–494.
[15] H. Tang, D. Zhou, and D. Chen, "Dynamic network function instance scaling based on traffic forecasting and VNF placement in operator data centers," IEEE Transactions on Parallel and Distributed Systems, vol. 30, no. 3, pp. 530–543, Mar. 2019.
[16] Y. Jia, C. Wu, Z. Li, F. Le, and A. Liu, "Online scaling of NFV service chains across geo-distributed datacenters," IEEE/ACM Transactions on Networking, vol. 26, no. 2, pp. 699–710, Apr. 2018.
[17] Y. Li, L. T. X. Phan, and B. T. Loo, "Network functions virtualization with soft real-time guarantees," in IEEE Conference on Computer Communications (INFOCOM), 2016, pp. 1–9.
[18] V. Eramo, E. Miucci, M. Ammar, and F. G. Lavacca, "An approach for service function chain routing and virtual function network instance migration in network function virtualization architectures," IEEE/ACM Transactions on Networking, vol. 25, no. 4, pp. 2008–2025, 2017.
[19] J. Liu, W. Lu, F. Zhou, P. Lu, and Z. Zhu, "On dynamic service function chain deployment and readjustment," IEEE Transactions on Network and Service Management, vol. 14, no. 3, pp. 543–553, Sep. 2017.
[20] M. Huang, W. Liang, Y. Ma, and S. Guo, "Maximizing throughput of delay-sensitive NFV-enabled request admissions via virtualized network function placement," IEEE Transactions on Cloud Computing, pp. 1–1, 2019.
[21] F. Malandrino, C. F. Chiasserini, C. Casetti, G. Landi, and M. Capitani, "An optimization-enhanced MANO for energy-efficient 5G networks," IEEE/ACM Transactions on Networking, vol. 27, no. 4, pp. 1756–1769, Aug. 2019.
[22] F. Malandrino, C.-F. Chiasserini, G. Einziger, and G. Scalosub, "Reducing service deployment cost through VNF sharing," IEEE/ACM Transactions on Networking, vol. PP, Oct. 2019.
[23] Q. Zhang, F. Liu, and C. Zeng, "Adaptive interference-aware VNF placement for service-customized 5G network slices," in IEEE Conference on Computer Communications (INFOCOM), Apr. 2019, pp. 2449–2457.
[24] NGMN Alliance, "Description of network slicing concept," NGMN 5G P, vol. 1, p. 1, 2016.
[25] I. Dumitrescu and N. Boland, "Algorithms for the weight constrained shortest path problem," International Transactions in Operational Research, vol. 8, no. 1, pp. 15–29, 2001.
[26] D. Bega, M. Gramaglia, M. Fiore, A. Banchs, and X. Costa-Pérez, "DeepCog: Optimizing resource provisioning in network slicing with AI-based capacity forecasting,"