Joint Planning of Network Slicing and Mobile Edge Computing in 5G Networks
Bin Xiang, Jocelyne Elias, Fabio Martignon, Elisabetta Di Nitto
Abstract—Multi-access Edge Computing (MEC) facilitates the deployment of critical applications with stringent QoS requirements, latency in particular. Our paper considers the problem of jointly planning the availability of computational resources at the edge, the slicing of mobile network and edge computation resources, and the routing of heterogeneous traffic types to the various slices. These aspects are intertwined and must be addressed together to provide the desired QoS to all mobile users and traffic types while still keeping costs under control. We formulate our problem as a mixed-integer nonlinear program (MINLP) and we define a heuristic, named Neighbor Exploration and Sequential Fixing (NESF), to facilitate the solution of the problem. The approach allows network operators to fine-tune the network operation cost and the total latency experienced by users. We evaluate the performance of the proposed model and heuristic against two natural greedy approaches. We show the impact of the variation of all the considered parameters (viz., different types of traffic, tolerable latency, network topology and bandwidth, computation and link capacity) on the defined model. Numerical results demonstrate that NESF is very effective, achieving near-optimal planning and resource allocation solutions in a very short computing time, even for large-scale network scenarios.
Index Terms—Edge computing, network planning, node placement, network slicing, joint allocation.
1 INTRODUCTION

The fifth-generation (5G) networks aim to meet different users' Quality of Service (QoS) requirements in several demanding application scenarios and use cases. Among them, controlling latency is certainly one of the key QoS requirements that mobile operators have to deal with. In fact, the classification devised by the International Telecommunication Union Radiocommunication Sector (ITU-R) shows that mission-critical services depend on strong latency constraints. For example, in some use cases (e.g., autonomous driving), the tolerable latency is expected to reach less than 1 ms [1].

To address such constraints, various ingredients are emerging. First of all, through Network Slicing, the physical network infrastructure can be split into several isolated logical networks, each dedicated to applications with specific latency requirements, thus enabling an efficient and dynamic use of network resources [2]. Second,
Multi-access Edge Computing (MEC) provides an IT service environment and cloud-computing capabilities at the edge of the mobile network, within the Radio Access Network and in close proximity to mobile subscribers [3]. Through this approach, the latency experienced by mobile users can be considerably reduced. However, the computation power that can be offered by an edge cloud is quite limited in comparison with a remote cloud. Considering that 5G networks will likely be built in an ultra-dense manner, the edge clouds attached to 5G base stations will also be massively deployed and connected to each other in a specific topology. In this way, cooperation among multiple edge clouds provides a solution for the problem of limited computation resources on a single MEC unit.

In this line, we study the case of a complex network organized in multiple edge clouds, each of which may be connected to the Radio Access Network of a certain location. All such edge clouds are connected through an arbitrary topology. This way, each edge cloud can serve end-user traffic by relying not only on its own resources, but also by offloading some traffic to its neighbors when needed. We specifically consider multiple classes of traffic and corresponding requirements, including voice, video and web, among others.

• B. Xiang and E. Di Nitto are with the Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy, 20133. E-mail: {bin.xiang, elisabetta.dinitto}@polimi.it.
• J. Elias is with the Department of Computer Science and Engineering (DISI), University of Bologna, Bologna, Italy, 40126. E-mail: [email protected].
• F. Martignon is with the Department of Management, Information and Production Engineering, University of Bergamo, Bergamo, Italy, 24044. E-mail: [email protected].
For every class of traffic incoming from the corresponding Radio Access Network, the edge cloud decides whether to serve it or offload it to some other edge cloud. This decision depends on the QoS requirements associated with the specific class of traffic and on the current status of the edge cloud.

Our main objective is to ensure that the infrastructure is able to serve all possible types of traffic within the boundaries of their QoS requirements and of the available resources. In this work we therefore propose a complete approach, named
Joint Planning and Slicing of mobile Network and edge Computation resources (JPSNC), which solves the problem of operating cost-efficient edge networks. The approach jointly takes into account the overall budget that the operator uses in order to allocate and operate computing capabilities in its edge network, and allocates resources, aiming at minimizing the network operation cost and the total traffic latency of transmitting, outsourcing and processing user traffic, under constraints of user-tolerable latency for each class of traffic. This turns out to be a mixed-integer nonlinear programming (MINLP) optimization problem, which is NP-hard [4]. To tackle this challenge, we transform it into an equivalent mixed-integer quadratically constrained programming (MIQCP) problem, which can be solved more efficiently through the Branch and Bound method. Based on this reformulation, we further propose an effective heuristic, named
Neighbor Exploration and Sequential Fixing (NESF), that permits obtaining near-optimal solutions in a very short computing time, even for the large-scale scenarios we considered in our numerical analysis. Furthermore, we propose two simple heuristics, based on a greedy approach, that provide benchmarks for our algorithms; they obtain (slightly) sub-optimal solutions with respect to NESF, while still being very fast. Finally, we systematically analyze and discuss, with a thorough numerical evaluation, the impact of all considered parameters (viz., the overall planning budget of the operator, different types of traffic, tolerable latency, network topology and bandwidth, computation and link capacity) on the optimal and approximate solutions obtained from our proposed model and heuristics. Numerical results demonstrate that our proposed model and heuristics can provide very efficient resource allocation and network planning solutions for multiple edge networks. This work takes root from a previous paper [5], where we focused exclusively on minimizing the latency of traffic in a hierarchical network, keeping the network and computation capacity fixed. In this paper, we have completely revised our optimization model to cope with a joint network planning, slicing and edge computing problem, aimed at minimizing both the total latency and the operation cost for arbitrary network topologies.

The remainder of this paper is organized as follows. Section 2 introduces the network system architecture we consider. Section 3 provides an intuitive overview of the proposed approach by using a simple example. Section 4 illustrates the proposed mathematical model and Section 5 the heuristics. Section 6 discusses numerical results in a set of typical network topologies and scenarios. Section 7 discusses related work. Finally, Section 8 concludes the paper.
2 SYSTEM ARCHITECTURE
Figure 1 illustrates our reference network architecture. Weconsider an edge network composed of
Edge Nodes. Each of such nodes can be equipped with any of the following three capabilities:
• the ability of acquiring traffic from mobile devices through the Remote Radio Head (RRH); such nodes are those we call Ingress Nodes;
• the ability of executing network- or application-level services requiring computational power; this is done thanks to the availability of an Edge Cloud on the node;
• the ability to route traffic to other nodes.
Not all nodes must have all three capabilities, so, in this respect, the edge network can be constituted of heterogeneous nodes.

Each link (i, j) between any two edge nodes, i and j, has a fixed bandwidth, denoted by B_ij. Each Ingress Node k has a specific ingress network capacity C_k, which is a measure of its ability to accept traffic incoming from mobile devices. Nodes able to perform some computation have a computation capacity S_i. One of the objectives of the planning model presented in this paper is to determine the optimal value of the computation capacity that must be made available at each node.

We assume that users' incoming data in each Ingress Node is aggregated according to the corresponding traffic type n ∈ N. Examples of traffic types can be video, game, data from sensors, and the like. In Figure 1, traffic of different types is shown as arrows of different colors. From each Ingress Node, traffic can be split and processed on all edge clouds in the network; the dashed arrows shown in the figure represent possible outsourcing paths of the traffic pieces from different Ingress Nodes. Different slices of the ingress network capacity C_k and the edge cloud computation capacity S_i can be allocated to serve the different types of traffic based on the corresponding Service Level Agreements (SLAs), which, in this paper, are focused on keeping latency under control.
Thus, another objective of our model is to find the allocation of traffic to edge clouds that allows us to minimize the total latency, which is expressed in terms of the latency at the ingress node, due to the limitations of the wireless network, plus the latency due to the traffic processing computation, plus the latency occurring in the communication links internal to the network system architecture.
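To make the model elements introduced above concrete (links with bandwidth B_ij, ingress capacities C_k, computation capacities S_i), the edge network can be represented in code. The following minimal Python sketch is ours, not an implementation from the paper; all names (EdgeNetwork, add_link) and the sample values are illustrative.

```python
# Illustrative sketch (ours, not the authors' code) of the edge network model
# of Section 2: directed links with bandwidth B_ij, ingress capacities C_k,
# and per-node computation capacities S_i (0 when no edge cloud is planned).
from collections import defaultdict

class EdgeNetwork:
    def __init__(self):
        self.bandwidth = {}                         # B_ij (Gb/s), keyed by directed link (i, j)
        self.ingress_capacity = {}                  # C_k (Gb/s), only for ingress nodes
        self.compute_capacity = defaultdict(float)  # S_i (Gb/s), defaults to 0
        self.neighbors = defaultdict(set)

    def add_link(self, i, j, b_gbps):
        """Add a directed link (i, j) with fixed bandwidth B_ij."""
        self.bandwidth[(i, j)] = b_gbps
        self.neighbors[i].add(j)

net = EdgeNetwork()
net.add_link("n1", "n3", 100.0)
net.add_link("n3", "n1", 100.0)
net.ingress_capacity["n1"] = 50.0   # C_n1, as in the Section 3 toy example
net.compute_capacity["n3"] = 40.0   # an edge cloud planned at an intermediate level

print(net.bandwidth[("n1", "n3")])  # → 100.0
```

Nodes absent from `compute_capacity` implicitly have S_i = 0, mirroring the fact that not every edge node needs to host an Edge Cloud.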
Fig. 1: Network system architecture.

We assume that the edge network is controlled by a management component which is in charge of achieving the optimal utilization of its resources, in terms of network and computation, while still guaranteeing the SLA associated with each traffic type accepted by the network. This component monitors the network by periodically computing the network capacity of each ingress node (through broadcast messages exchanged in the network) and the bandwidth of each link in the network topology. Moreover, it knows the maximum available computation capacity of all computation nodes. With these pieces of information as input, and knowing the SLA associated with each traffic type, the management component periodically solves an optimization problem that provides as output the identification of a proper network configuration and traffic allocation. In particular, it will identify: i) the amount of computational capacity to be assigned to each node so that, with the foreseen traffic, the
node usage remains below a certain level of its capacity; ii) which node is taking care of which traffic type; and iii) the nodes through which each traffic type must be routed toward its destination.

(a) Minimizing both latency and computation costs. (b) With the same settings, but with λ_n2,t2 increased to 40 Gb/s.

Fig. 2: Toy example for a network with 10 nodes and 20 edges (average degree: 4.0).

For simplicity, the optimization problem is based on the assumption that the system is time-slotted, i.e., time is divided into equal-length short slots (short periods where network parameters can be considered as fixed and traffic shows only small variations). We observe that our proposed heuristic (NESF) exhibits a short computing time, so that it is feasible to run the problem periodically and to adjust the configuration of the network system based on the actual evolution of the traffic.

In the next section, we give an intuition of the solution applied by the management component in the case of a simple network, while in Section 4 we formalize the optimization problem and in Section 5 we present some heuristics that make the problem tractable in realistic cases.
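The time-slotted operation of the management component can be sketched as a simple control loop. The sketch below is ours: `monitor`, `solve_planning_problem` and `apply_configuration` are hypothetical placeholders for the monitoring, optimization and reconfiguration mechanisms described in the text, not a real API.

```python
# Hedged sketch (ours) of the management component's time-slotted loop:
# once per slot, observe the network state, re-solve the joint planning and
# allocation problem, and push the resulting configuration.

def run_management_loop(slots, monitor, solve_planning_problem, apply_configuration):
    """Run one monitor/solve/apply cycle per time slot."""
    for t in range(slots):
        state = monitor(t)                      # C_k, B_ij, max S_i, traffic estimates
        config = solve_planning_problem(state)  # e.g., via the NESF heuristic (Section 5)
        apply_configuration(config)

# Minimal dry run with stub callbacks:
log = []
run_management_loop(
    slots=3,
    monitor=lambda t: {"slot": t},
    solve_planning_problem=lambda s: {"slot": s["slot"], "plan": "unchanged"},
    apply_configuration=log.append,
)
print(len(log))  # → 3
```

The feasibility of such a loop rests on the solver finishing well within one slot, which is precisely what motivates the short computing time of NESF.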
3 OVERVIEW OF PLANNING AND ALLOCATION
In this section we refer to a simple but still meaningful edge network and we show how the management component behaves in the presence of two types of traffic. The example we consider is shown in Figure 2 and consists of 10 nodes connected together with an average degree of 4, two of which are ingress nodes (labeled as n1 and n2 in the figure and colored in orange). For simplicity, we assume that the bandwidth of all links is B_l = 100 Gb/s, and that the wireless network capacity of the two ingress nodes is, respectively, C_n1 = 50 Gb/s and C_n2 = 60 Gb/s. Every node in the network has a computation capacity that can take one of the following values: 0 Gb/s (i.e., no computation capacity is made available at the current time), D1 = 30 Gb/s, D2 = 40 Gb/s, and D3 = 50 Gb/s. Given the above edge network, let us assume the management component estimates that node n1 will receive traffic of type t1 at rate λ_n1,t1 = 25 Gb/s and type t2 at rate λ_n1,t2 = 20 Gb/s, while node n2 will receive the two types of traffic at rates λ_n2,t1 = 15 Gb/s and λ_n2,t2 = 35 Gb/s, respectively. Finally, let us assume that the network operator has set an upper bound on the power budget to be used (i.e., the total amount of computational power) P = 300 Gb/s and has defined in its SLA a tolerable latency for the two types of traffic, set respectively to τ_t1 = 1 ms and τ_t2 = 2 ms. (Note that computation capacity is often expressed in cycles/s. As discussed in Section 6, for homogeneity with the other values, we have transformed it into Gb/s.)

In this case, the computed optimal configuration is shown in Figure 2(a). The management component will assign at ingress node n1 a wireless network capacity slice to t1 and one to t2, and will do the same at ingress node n2. Moreover, it will assign a computation capacity level to three of the nodes, while it will switch off the computation capacity of the other nodes. This leads to a total computation capacity which is well below the available computation capacity budget P. Given that t1 is the traffic type with the most demanding constraint in terms of latency, the management component decides to use the full capacity of n1 to process traffic t1 from n1. Applying the same strategy within node n2 would result in a waste of resources, because the t1 traffic of n2 would take only a small share of the available computation capacity, and the remaining one would not be sufficient to handle the expected total amount of t2 traffic. Since moving the t2 traffic by one hop would still allow the system to fulfill the SLA, the decision is then to configure the network to route such traffic to an intermediate node. The reason for choosing that node is mainly that it is one of the nearest neighbors of both n1 and n2 (with 2 hops to n1 and 1 hop to n2) and that, with its capacity, it can handle the t2 traffic offloaded by both ingress nodes; its computation capacity is sliced accordingly between the two traffic pieces. The t1 traffic from n2 is, instead, processed locally at n2 itself.

Let us now assume that the management component observes a change in the λ_n2,t2 traffic rate, which increases to λ_n2,t2 = 40 Gb/s. Based on this, the management component runs again the optimization algorithm, which will output the configuration illustrated in Figure 2(b).
The slicing of the wireless network capacity for ingress node n1 does not vary, while for ingress node n2 a slice is assigned to t2 and, as a consequence, a slice smaller than before is assigned to t1. Moreover, computation capacity is allocated to n1, which processes t1 locally, and to the neighbor node that handles the t2 traffic from n1; likewise, capacity is allocated to n2, to process t1 locally, and to the neighbor node that processes the t2 traffic incoming from n2. Both ingress nodes select the nearest 1-hop neighbor to offload the traffic. Notice that, by manually analyzing the initial configuration of Figure 2(a), we might think that a better solution would be to simply increase the computation capacity of the node shared in Figure 2(a), as in this way the network remains almost the same as before and the total computation capacity is smaller than that of Figure 2(b). However, a more in-depth analysis shows that, even if this solution is certainly feasible, it performs worse than the one of Figure 2(b) in terms of total latency. The main reason is that traffic t2 from node n2 suffers a larger latency in the wireless ingress network due to a smaller allocated slice and, in the scenario where both n1 and n2 rely on the same node for offloading some traffic, it also suffers a relatively high latency due to the traffic computation on that node. This second component of the latency is reduced in the case of Figure 2(b), where traffic t2 from node n2 has the computation capacity of its serving node entirely dedicated to it. Thus, the total latency for t2 is smaller in the case of Figure 2(b) than in the other case. In Section 4 we show how such values are computed and, in general, the optimization model that computes the optimal allocation of computational and network resources as well as the optimal routing paths.

4 PROBLEM FORMULATION
In this section we provide the mathematical formulation of our
Joint Planning and Slicing of mobile Network and edge Computation resources (JPSNC) model. Table 1 summarizes the notation used throughout this section. For brevity, we simplify the expression ∀n ∈ N as ∀n, and apply the same rule to other set symbols like E, K, L, etc. throughout the rest of this paper, unless otherwise specified.

TABLE 1: Summary of used notations.

Parameters | Definition
N      | Set of traffic types
E      | Set of edge nodes in the edge networks
K      | Set of ingress nodes, where K ⊆ E
L      | Set of directed links in the networks
B_ij   | Bandwidth of the link from node i to j, where (i, j) ∈ L
C_k    | Network capacity of ingress edge node k ∈ K
D_a    | Levels of computation capacities (a ∈ A = {1, 2, ...})
P      | Planning budget of computation capacity
λ_kn   | User traffic rate of type n in ingress node k
τ_n    | Tolerable delay for serving the total traffic of type n
κ_i    | Cost of using one unit of computation capacity on node i
w      | Weight to balance between total latency and operation cost

Variables | Definition
c_kn   | Slice of the network capacity for traffic kn
b_kni  | Whether traffic kn is processed on node i or not
α_kni  | Percentage of traffic kn processed on node i
β_kni  | Percentage of i's computation capacity sliced to traffic kn
δ_ai   | Decision for planning computation capacity on node i
R_kni  | Set of links for routing the traffic piece α_kni from k to i

The goal of our formulation is to minimize a weighted sum of the total latency and network operation cost for serving several types of user traffic, under the constraints of users' maximum tolerable latency and network planning budget. This allows the network operator to fine-tune its needs in terms of quality of service provided to its users and cost of the planned network. Different types of traffic, with heterogeneous requirements, need to be accommodated, and may enter the network from different ingress nodes. In the following, we first focus on the network planning issue and its related cost, as well as on the traffic routing issue; we then detail all components that contribute to the overall latency experienced by users, which we capture in our model.
Network Planning: We assume that, in each edge node, some processing capacity can be made available, thus enabling MEC capabilities. This results in an operation cost that increases with the amount of processing capacity. To model real network scenarios more closely, we assume that only a discrete set of capacity values can be chosen by the network operator and made available. Therefore, we adopt a piecewise-constant function S_i for the processing capacity of an edge node, in line with [6]. This is defined as:

S_i = Σ_{a∈A} δ_ai D_a,   ∀i,   (1)

where D_a is a capacity level (a ∈ A) and δ_ai ∈ {0, 1} is a binary decision variable for capacity planning, satisfying the following constraint (only one level of capacity can be made available on a node, including zero, i.e., no processing capability):

Σ_{a∈A} δ_ai = 1 − δ_0i,   ∀i,   (2)

where δ_0i is a binary variable that indicates whether node i is left without computation power. This constraint implies that S_i can be set either to 0 (no computation power) or to exactly one capacity level D_a.

To save on operation costs, in case an edge node is not supposed to be exploited to process some traffic, no processing capacity is made available on it. We introduce the binary variable b_kni to indicate whether traffic kn is processed on node i (for brevity, we will use the expression "traffic kn" in the following to indicate the user traffic of type n from ingress point k). Then the following constraint should be satisfied:

b_kni ⩽ 1 − δ_0i ⩽ Σ_{k′∈K} Σ_{n′∈N} b_k′n′i,   ∀k, ∀n, ∀i.   (3)

We also consider a total planning budget, P, for the available computation capacity, introducing the following constraint:

Σ_{i∈E} S_i ⩽ P.   (4)
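Constraints (1)-(4) can be illustrated with a small numeric check. The sketch below is ours (all function names are hypothetical); it uses the capacity levels of the Section 3 toy example and encodes "exactly one level or none" by mapping each node to a chosen level or to None.

```python
# Illustrative check (ours, not the authors' code) of constraints (1)-(4):
# each node either receives exactly one capacity level D_a or none at all,
# and the total planned capacity must stay within the budget P.

D = {1: 30.0, 2: 40.0, 3: 50.0}   # capacity levels D_a (Gb/s), as in Section 3
P = 300.0                          # planning budget (Gb/s)

def planned_capacity(level_choice):
    """level_choice maps node -> chosen level a, or None (no computation).
    Returns S_i per node, i.e. S_i = sum_a delta_ai * D_a under the
    one-level-per-node rule of constraints (1)-(2)."""
    return {i: (D[a] if a is not None else 0.0) for i, a in level_choice.items()}

def within_budget(level_choice):
    """Constraint (4): sum_i S_i <= P."""
    return sum(planned_capacity(level_choice).values()) <= P

choice = {"n1": 3, "n2": 2, "n3": 3, "n4": None}
S = planned_capacity(choice)
print(S["n1"], within_budget(choice))  # → 50.0 True
```

Representing the choice as "level or None" sidesteps the binary δ variables while preserving their meaning: a node with None corresponds to δ_0i = 1.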
Then, the total operation cost can be expressed as:

J = Σ_{i∈E} κ_i S_i,   (5)

where κ_i is the cost of using one unit of computation capacity (in the example of Section 3, this is expressed in Gb/s) on node i.

Network Routing: We assume that each type of traffic can be split into multiple pieces only at its ingress node. Each piece can then be offloaded to another edge computing node independently of the other pieces, but it cannot be further split (we say that each piece is unsplittable). Each link l ∈ L may carry different traffic pieces α_kni (we denote by α_kni the percentage of traffic kn processed at node i, and by β_kni the percentage of computation capacity S_i sliced for traffic kn). Then, the flow of traffic kn on l, f_knl, can be expressed as the sum of all pieces of traffic that pass through such link:

f_knl = Σ_{i∈E: l∈R_kni} α_kni,   ∀k, ∀n, ∀l,   (6)

where R_kni ⊂ L denotes a routing path (set of traversed links) for the traffic piece α_kni λ_kn from ingress k to node i. The following constraint ensures that the total traffic on each link does not exceed its capacity:

B_ij > Σ_{k∈K} Σ_{n∈N} f_knij λ_kn,   ∀(i, j) ∈ L.   (7)

The traffic flow conservation constraint is enforced as follows:

Σ_{j∈I_i} f_knji − Σ_{j∈O_i} f_knij = α_kni − 1 if i = k, and α_kni otherwise,   ∀k, ∀n, ∀i,   (8)

where I_i = {j ∈ E | (j, i) ∈ L} and O_i = {j ∈ E | (i, j) ∈ L} are the sets of nodes connected by the incoming and outgoing links of node i, respectively. The fulfillment of this constraint guarantees the continuity of the routing path. Moreover, the routing path R_kni should be acyclic.

The latency in each ingress edge node is modeled as the sum of the wireless network latency and the outsourcing latency which, in turn, is composed of the processing latency in some edge cloud and the link latency between edge clouds.
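The flow conservation constraint (8) can be verified numerically on a toy routing. The sketch below is ours, with hypothetical names; flows are expressed, as in the model, as fractions of the (normalized) traffic kn.

```python
# Sanity check (our sketch) of flow conservation, eq. (8): at the ingress k
# the net inflow equals alpha_kni - 1, at any other node it equals alpha_kni,
# where alpha_kni is the fraction of traffic kn processed at node i.

def conservation_residual(node, k, alpha, flow_in, flow_out):
    """Left-hand side minus right-hand side of (8) for one node; ~0 if satisfied."""
    rhs = alpha.get(node, 0.0) - (1.0 if node == k else 0.0)
    return (flow_in.get(node, 0.0) - flow_out.get(node, 0.0)) - rhs

# Toy instance: ingress n1 keeps 40% of traffic kn and offloads 60% to n3.
alpha    = {"n1": 0.4, "n3": 0.6}
flow_in  = {"n3": 0.6}   # flow f on link (n1, n3)
flow_out = {"n1": 0.6}

ok = all(abs(conservation_residual(n, "n1", alpha, flow_in, flow_out)) < 1e-9
         for n in ("n1", "n3"))
print(ok)  # → True
```

A non-zero residual at any node would reveal a routing that either loses or duplicates part of a traffic piece, which constraint (8) forbids.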
Wireless Network Latency: We model the transmission of traffic at each user ingress point as an M|M|1 processing queue. The wireless network latency for transmitting the user traffic of type n from ingress point k, denoted by t_W^kn, can therefore be expressed as:

t_W^kn = 1 / (c_kn − λ_kn),   ∀k, ∀n,   (9)

where c_kn is the capacity of the network slice allocated for traffic kn in the ingress edge network (a decision variable in our model) and λ_kn is the traffic rate. The following constraints ensure that the capacity of all slices does not exceed the total capacity C_k of each ingress edge node, and that c_kn is higher than the corresponding λ_kn value:

Σ_{n∈N} c_kn ⩽ C_k,   ∀k,   (10)
λ_kn < c_kn,   ∀k, ∀n.   (11)

Processing Latency: We assume that each type of traffic can be segmented and processed on different edge clouds, and each edge cloud can slice its computation capacity to serve different types of traffic from different ingress nodes. As introduced before, we indicate with α_kni the percentage of traffic kn processed at node i, and with β_kni the percentage of computation capacity S_i sliced for traffic kn. The processing of user traffic is described by an M|M|1 model. Let t_P^kn,i denote the processing latency of edge cloud i for traffic kn. Then, based on the computational capacity β_kni S_i sliced for traffic kn, with an amount α_kni λ_kn to be served, ∀k, ∀n, ∀i, t_P^kn,i is expressed as:

t_P^kn,i = 1 / (β_kni S_i − α_kni λ_kn) if α_kni > 0, and 0 otherwise.   (12)

In the above equation, when traffic kn is not processed on edge cloud i, the corresponding value is 0; at the same time, no computation resource of i should be sliced to traffic kn (i.e., β_kni = 0). The corresponding constraint is written as:

α_kni λ_kn < β_kni S_i if α_kni > 0; α_kni = β_kni = 0 otherwise.   (13)

α_kni and β_kni also have to fulfill the following consistency constraints:

Σ_{i∈E} α_kni = 1,   ∀k, ∀n,   (14)
Σ_{k∈K} Σ_{n∈N} β_kni ⩽ 1,   ∀i.   (15)
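Equations (9) and (12) are standard M|M|1 waiting-time expressions. The following small Python sketch (ours; function names are hypothetical) illustrates how the latency reacts to the slice sizes, in the model's time unit.

```python
# Numerical sketch (ours) of the M|M|1 latency terms (9) and (12): a slice
# must be strictly larger than the load it serves (constraints (11)/(13)),
# and the latency grows sharply as the slice fills up.

def wireless_latency(c_kn, lam_kn):
    """t_W^kn = 1 / (c_kn - lambda_kn), valid only when lambda_kn < c_kn."""
    if lam_kn >= c_kn:
        raise ValueError("slice too small: constraint (11) violated")
    return 1.0 / (c_kn - lam_kn)

def processing_latency(beta, S_i, alpha, lam_kn):
    """t_P^kn,i per (12): 0 if no piece of traffic kn is processed at i."""
    if alpha <= 0.0:
        return 0.0
    return 1.0 / (beta * S_i - alpha * lam_kn)

# A slice of 30 Gb/s serving 25 Gb/s gives t_W = 1/5 (model time units);
# widening the slice to 35 Gb/s halves the wireless latency.
print(wireless_latency(30.0, 25.0), wireless_latency(35.0, 25.0))  # → 0.2 0.1
```

The same headroom effect drives the toy example of Section 3: shrinking the slice for t1 at n2 when t2 grows is exactly what inflates the first latency component there.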
Link Latency: Let t_L^kn,i denote the link latency for routing traffic kn to node i. In each ingress node, the incoming traffic is routed in a multi-path way, i.e., different types or pieces of the traffic may be dispatched to different nodes via different paths. ∀k, ∀n, ∀i, t_L^kn,i is defined as:

t_L^kn,i = Σ_{l∈R_kni} 1 / (B_l − Σ_{k′∈K} Σ_{n′∈N} f_k′n′l λ_k′n′) if α_kni > 0 and i ≠ k, and 0 otherwise.   (16)

Recall that R_kni is a routing path for the traffic piece α_kni λ_kn from ingress k to node i. The link latency is accounted for only if a certain traffic piece is processed on node i (i.e., α_kni > 0) and i ≠ k.

Total Latency: We can now define the outsourcing latency for traffic kn, which depends on the longest serving time among edge clouds:

t_PL^kn = max_{i∈E} { t_P^kn,i + t_L^kn,i },   ∀k, ∀n.   (17)

The latency experienced by each type of traffic coming from the ingress nodes can therefore be defined as t_W^kn + t_PL^kn, and it should also respect the tolerable latency requirement:

t_W^kn + t_PL^kn ⩽ τ_n,   ∀k, ∀n.   (18)

For each traffic type n, we consider the maximum value among different ingress nodes with respect to the wireless network latency and outsourcing latency, i.e., max_{k∈K} { t_W^kn + t_PL^kn }. Then, we define the total latency as follows:

T = Σ_{n∈N} max_{k∈K} { t_W^kn + t_PL^kn }.   (19)

Our goal in the
Joint Planning and Slicing of mobile Network and edge Computation resources (JPSNC) problem is to minimize the total latency and the operation cost, under the constraints of the maximum tolerable delay for each traffic type coming from ingress nodes and of the total planning budget for making available processing-capable nodes:

P:   min_{c_kn, b_kni, α_kni, β_kni, δ_ai, R_kni}  T + wJ,   s.t. (1)–(19),

where w ⩾ 0 is a weight that permits setting the desired balance between the total latency and the operation cost. Problem P contains both nonlinear and indicator constraints; therefore, it is a mixed-integer nonlinear programming (MINLP) problem, which is hard to solve directly [4], as discussed in Section 4.4.

Problem P, formulated in Section 4, cannot be solved directly and efficiently for the following reasons:
• We aim at identifying the optimal routing (the routing path R_kni is a variable in our model, since many paths may exist from each ingress node k to a generic node i in the network); furthermore, we must ensure that such routing is acyclic and guarantees continuity and unsplittability of traffic pieces.
• Variables R_kni and α_kni are reciprocally dependent: to find the optimal routing, the percentage of traffic processed at each node i should be known; at the same time, to solve the optimal traffic allocation, the routing path should be known.
• The processing latency, defined in the previous sections, depends on three decision variables in our model, and the corresponding formula (12) is (highly) nonlinear.
• P contains indicator functions and constraints, e.g., (12), (13) and (16), which cannot be directly and easily processed by most solvers.

To deal with the above issues, we propose an equivalent reformulation of Problem P, which can be solved very efficiently with the Branch and Bound method. Moreover, the reformulated problem can be further relaxed; based on that, we propose in the next section a heuristic algorithm which can obtain near-optimal solutions in a shorter computing time. More specifically, in the reformulated problem we first recast the processing latency and link latency constraints (viz., constraints (12) and (16)), dealing at the same time with the computation planning problem. Then, we handle the difficulties related to the variables R_kni and the corresponding routing constraints. Appendix A contains all details about the problem reformulation. Since some constraints are quadratic while the others are linear, the reformulated problem is a mixed-integer quadratically constrained programming (MIQCP) problem, for which commercial and freely available solvers can be used, as we will illustrate in the numerical evaluation section.

5 HEURISTICS
Hereafter, we illustrate our proposed heuristic, named
Neighbor Exploration and Sequential Fixing (NESF), which proceeds by exploring and utilizing the neighbors of each ingress node for hosting (a part of) the traffic along an objective descent direction, that is, by trying to minimize the objective function (which, we recall, is a weighted sum of the total latency and the operation cost). During each step where we explore potential candidates for computation offloading, we partially fix the main binary decision variables in the reformulated problem and then solve the so-reduced problem using the Branch and Bound method. Our exploration strategy provides excellent results in practice, achieving near-optimal solutions in many network scenarios, as we will illustrate in the Numerical Results section.

The detailed exploration strategy is illustrated in Figure 3, which shows three typical variation paths of the objective function value versus the number of computing nodes made available in the network (note that these three trends are independent from each other, in the sense that any of them, or a combination of them, can be experienced in
a given network instance).

Fig. 3: Three typical variations of the objective function value versus the number of computing nodes made available.

Point A represents the stage where a minimum required number of computing nodes (x_A) is opened to ensure the feasibility of the problem. For instance, if the ingress nodes can host all the traffic under all the constraints, x_A = |K|. Point E indicates the maximum number of computing nodes that can be made available in the network; any point above x_E would violate the computation budget or the tolerable latency constraints.

During the search phase of our heuristic, which is executed in Algorithms 1 and 3, detailed hereafter, we first try to obtain (or get as close as possible to) point A and the corresponding objective value y_A. If A cannot be found within the computation budget, the problem is infeasible. Otherwise, we continue to explore computation candidates among the h-hop neighbors of each ingress node, and allocate them to serve different types of traffic. The objective value is obtained by solving the reformulated problem with new configurations of the decision variables. The change of the objective value may hence exhibit one of the three patterns (I, II and III) illustrated in Figure 3.

The objective value increases monotonically in path I. In path II, it first decreases to point C and then increases to point E; finally, path III shows a more complex pattern, with one local maximum point B and one minimum point D. In case I, the network system has just enough computation power to serve the traffic. Hence, adding more computation capacity to the system is not guaranteed to decrease the delay, while, on the other hand, it will increase computation costs. In case II, a few ingress nodes in the system may support a relatively high traffic load. Equipping some of their neighbors with more computation capabilities (with a total number of nodes less than x_C) can still decrease the total system costs.
After point C, the objective value shows a similar trend to case I. In case III, several ingress nodes may serve a high traffic load. At the beginning, adding some computing nodes (with total amount less than x_B) may not be enough to decrease the delay costs to a certain degree, and this will also increase the total installation costs. After point B, the objective value varies like in case II and has a minimum at point D. To summarize, our heuristic aims at reaching the minimum points A (I), C (II) and D (III) in Figure 3, and its flowchart is shown in Figure 4. The main idea behind Algorithm 1 is to check whether the ingress nodes can host all the traffic without activating additional MEC units,
thus saving some computation cost. Algorithm 2 aims at searching the h-hop neighbors of each ingress node for making them process part of the traffic (the outsourced traffic), while Algorithm 3 aims at setting up the allocation plan for the outsourced traffic and tries to solve P to obtain the best solution. The three proposed algorithms are described in detail in the following subsections. The definition of the new notation introduced in these algorithms is summarized for clarity in Table 2.

Fig. 4: Flowchart of our NESF heuristic.

In Algorithm 1, the main idea is to check whether ingress nodes can host all the traffic without using other MEC units, in order to save both computation cost and latency. To this end, we first individuate the subset of ingress nodes (denoted as K_u) that cannot host all the traffic that enters the network through them. This is done by checking whether S_ek (= D_m − Σ_{n∈N} λ_kn) ≤ 0 (lines 1-2), that is, whether some computing capacity is still available at the ingress nodes (recall that D_m is the maximum computation capacity that can be made available). Then, if K_u ≠ ∅, for each k ∈ K_u we try to find the set of its neighbor ingress nodes k′ ∈ [(K − K_u) ∩ (∪_{h=1}^{H} G_hk)] that can cover S_ek (i.e., S_ek′ + S_ek > 0), where G_hk ⊂ E is the set of node k's h-hop neighbor nodes (h = 1, …, H). If found, they are stored as candidates in a list, Q_k, ordered by increasing distance (hop count) from k (lines 3-7). If K_u = ∅, or sufficient nodes in Q_k have been found to process the extra traffic from K_u (line 9), then for each k ∈ K_u the corresponding traffic is allocated to nodes in Q_k starting
TABLE 2: Notations used in the algorithms.
Notation   Definition
S_ek       Estimated available computation of ingress node k ∈ K
K_u        Ingress nodes that cannot host all traffic (S_ek ≤ 0)
H          Maximum searching depth of our heuristic
G_hk       h-hop neighbors (h ≤ H) of ingress node k ∈ K
Q_k        Candidates for computing traffic from ingress node k ∈ K
S_ok       Overall computation of ingress node k ∈ K
S_li       Maximum left computation of node i ∈ E
K_bi       Ingress nodes that booked computation from node i ∈ E
d_ik       Count of hops from node i to ingress node k ∈ K
O_P        Objective function value of problem P

from the top (choosing the closest ones) and repeatedly (covering all the traffic types), beginning with the less latency-tolerant traffic and moving to the more latency-tolerant one. This is implemented by setting the corresponding variables b_kni, δ_ai and γ_kn,il in P, to save the costs and also accelerate the algorithm. Finally, P with the fixed variables is solved by using the Branch and Bound method to obtain the solution (lines 10-11). If P is feasible with these settings, the objective value O_P is stored to be used in the next searching and resource allocation phases of Algorithm 3.

Algorithm 1
Attempt of serving traffic with ingress nodes only
1:  S_ek = D_m − Σ_{n∈N} λ_kn, ∀k ∈ K;
2:  K_u = {k ∈ K | S_ek ≤ 0};
3:  Compute k's h-hop neighbors G_hk, h ≤ H, ∀k ∈ K;
4:  Q_k = {k}, ∀k ∈ K; O_t = −1;
5:  for k ∈ K_u do
6:    X = {k′ ∈ [(K − K_u) ∩ (∪_{h=1}^{H} G_hk)] | S_ek′ + S_ek > 0};
7:    Q_k = Q_k ∪ X; rank Q_k by increasing hop count to k;
8:  Rank N as N_k by descending (λ_kn, τ_n), ∀k ∈ K;
9:  if K_u = ∅ or ∧_{k∈K_u} (|Q_k| > 1) then
10:   Allocate Q_k to N_k in order and repeatedly, ∀k ∈ K;
11:   Solve P by B&B to obtain obj. fct. value O_P; if O_P > 0 then O_t = O_P;

This section describes Algorithm 2, upon which Algorithm 3 is based to provide the final solution. Algorithm 2 proceeds as follows. We first assign a rank (or a priority value) to each ingress node, taking into account the amount of incoming traffic and the computation capacity. Then, we handle the outsourced traffic offloading task (i.e., we choose the best subset of computation nodes), starting from the ingress node with the highest priority. In more detail, set K_s is set K sorted by the ascending value of the tuple (S_ek, −λ_kn), i.e., the ingress node with the lowest estimated available (left) computation S_ek and the higher amount of traffic of type n has the highest rank/priority in our Algorithm 2, where n represents the traffic type having the maximum tolerable latency (lines 1-2).
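The ranking step just described can be sketched in a few lines; the fragment below is our own illustrative Python (not the authors' code), where `S_e`, `lam` and `tau` are assumed dictionaries holding S_ek, λ_kn and τ_n, respectively:

```python
def rank_ingress(S_e, lam, tau):
    """Sort ingress nodes by ascending (S_ek, -lambda_kn), where n is the
    traffic type with the largest tolerable latency tau_n: nodes with the
    least spare computation and the most traffic come first."""
    n = max(tau, key=tau.get)  # most latency-tolerant traffic type
    return sorted(S_e, key=lambda k: (S_e[k], -lam[k][n]))

# Two overloaded ingress nodes (negative S_ek) and one with spare capacity:
# 'b' carries more type-1 traffic than 'a', so it gets the highest priority.
order = rank_ingress({'a': -3, 'b': -3, 'c': 1},
                     {'a': {1: 2}, 'b': {1: 5}, 'c': {1: 1}},
                     {1: 3.5})
```

Here `order` plays the role of K_s: the heuristic then scans this list front to back when deciding whose outsourced traffic to place first.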
The process of determining the best subset of computation nodes for processing the outsourced traffic of each ingress node is executed hop-by-hop, starting with ingress node k̂ = K_s(0), until any one of the following three conditions is satisfied: (1) the number of computation nodes opened for processing traffic exceeds the maximum budget ⌊P/min(D_a)⌋, or (2) all ingress nodes have been completely scanned (line 3), or (3) the algorithm could not further improve the solution (Algorithm 3, lines 8, 10). In the searching phase, we first try to identify the set of temporary candidate computation nodes B for ingress k̂ (B ⊆ (G_hk̂ − [K ∪ Q_k̂])), by checking whether the maximum available computation capacity S_li of i ∈ B could help k̂ to cover S_ek̂ (lines 4-7). S_li is computed as the difference between i's maximum installable computation capacity D_m and the total computation booked from i by the ingress nodes in K_bi ⊆ K, i.e., Σ_{k∈K_bi} S_ek, where K_bi is the set of ingress nodes that booked computation from node i. If B = ∅, we increase the number of hops h_k̂ for ingress k̂; when this exceeds H (we are done with k̂), we move to the next ingress node in the set K_s (lines 8-9). At this point, we rank B by descending values of the tuple (S_li, −d_ik : k ∈ K_s), where d_ik is the count of hops from node i to ingress node k ∈ K_s. The first computation node ı̂ is selected as the one to compute the traffic of k̂, and k̂ is added to the corresponding set K_bı̂. To make full use of computation node ı̂, we further let it help the other ingress nodes K_s \ {k̂}, if ı̂ is their neighbor within H hops and has sufficient computation budget (lines 10-13). Then, given such computation node ı̂ and for each ingress node k, we update the value of the overall computation S_ok, due to the full use of computation node ı̂ (line 14).
Hence, the ingress node k with the minimum support S_ok will be chosen as the next searching target, and Algorithm 2 continues as follows. The next searching target k̂ is set to the k ∈ K_s with the minimum S_ok value (lines 15-16). If S_ok̂ ≤ 0, the current computation configuration cannot host all the traffic; hence, the algorithm goes back to the while loop and continues with the next search. Otherwise, we set a flag skip := (S_ok̂ ≤ r·D_m), where r is a small constant. If skip is true, it indicates that k̂ has a high traffic load, and this may cause the processing latency to increase. This flag is used in Algorithm 3. In fact, this step implements the strategy of skipping point B to avoid remaining stuck at the local minimum (point A) in path III shown in Figure 3. Finally, based on Q_k, we run Algorithm 3 to obtain the objective value O_t and the corresponding solution.

Algorithm 2
Priority searching of computation candidates
1:  Rank ingress nodes as K_s by ascending (S_ek, −λ_kn);
2:  k̂ = K_s(0); h_k = 1, S_ok = S_ek (∀k ∈ K); K_bi = ∅ (∀i ∈ E);
3:  while |∪_{k∈K} Q_k| < ⌊P/min(D_a)⌋ and K_s ≠ ∅ do
4:    B = ∅;
5:    for i ∈ (G_hk̂ − [K ∪ Q_k̂]) do
6:      S_li = D_m + Σ_{k∈K_bi} S_ek;
7:      if S_li + S_ek̂ > 0 then B = B ∪ {i};
8:    if B = ∅ then
9:      h_k̂++; update K_s, k̂ when h_k̂ > H; continue;
10:   Rank B by descending (S_li, −d_ik : k ∈ K_s); ı̂ = B(0);
11:   Q_k̂ = Q_k̂ ∪ {ı̂}; K_bı̂ = K_bı̂ ∪ {k̂}; S_b = D_m;
12:   for k ∈ K_s \ {k̂}, if (ı̂ ∈ ∪_{h=1}^{H} G_hk) and (S_b > λ_k) do
13:     Q_k = Q_k ∪ {ı̂}; K_bı̂ = K_bı̂ ∪ {k}; S_b = S_b − λ_k;
14:   S_ok = S_ok + (D_m + Σ_{k′∈K_bı̂∩K_u−{k}} S_ek′), ∀k ∈ K_bı̂;
15:   k̂ = argmin_{k∈K_s} S_ok;
16:   if S_ok̂ ≤ 0 then continue;
17:   else skip := (S_ok̂ ≤ r·D_m);
18:   Run Algorithm 3 to obtain O_t;
19: Return O_t;

In Algorithm 3, we first relax problem P to P̃, replacing the binary variables b_kni, δ_ai and γ_kn,il with continuous ones. Given the set Q_k (built by Algorithm 2) of candidate computation nodes for processing the outsourced traffic of ingress node k, the goal is to allocate node k's different traffic types to the computation nodes in Q_k, starting with the traffic with the most stringent latency constraint. Unused computation nodes are turned off. These two steps (lines 1-2) provide partial guiding information and also accelerate the solution of the relaxed problem, thus obtaining quite fast the relaxed optimal values b̃_kni. If P̃ is infeasible (O_P̃ < 0), we check whether both the previous best solution exists (O_t > 0) and the algorithm does not skip. If yes, the search process breaks and returns O_t (line 10). Otherwise, the algorithm continues searching to avoid getting stuck in a local optimum point in path III (see Figure 3), as follows. If P̃ is feasible (line 3), the obtained b̃_kni value can be regarded as the probability of processing traffic kn at node i. Based on this, for each ingress k, we rank the candidates in descending order of the probabilities Σ_{n∈N} b̃_kni. Then we revert to the original problem P, set the upper bound for P if possible, allocate the candidates to host all types of traffic in order and repeatedly for each ingress node, and also turn off the unused nodes (lines 5-7). By solving P, we obtain the current solution and compare it with the previous best one (O_t). If the solution gets worse, the whole search process breaks and returns the recorded best result (line 8). Otherwise (if the solution is improving), the current solution is updated as the best one and the search continues.

Algorithm 3
Allocating resources and obtaining the solution
1:  Relax b_kni, δ_ai, γ_kn,il to continuous ones (P → P̃);
2:  Allocate Q_k to N_k partially and solve P̃ to obtain b̃_kni;
3:  if O_P̃ > 0 then
4:    Rank candidates as Q_sk by descending Σ_{n∈N} b̃_kni;
5:    Revert to the original problem P;
6:    if O_t > 0 then set O_t as P's upper bound;
7:    Allocate Q_sk to N_k and solve P;
8:    if 0 < O_t and (O_t < O_P or O_P < 0) and not skip then break;
9:    if 0 < O_P and (O_P < O_t or O_t < 0) then O_t = O_P;
10: else if O_t > 0 and not skip then break;

Essentially, the proposed heuristic described in the above subsections exploits the P formulation while limiting the search space only to the nodes that are within a limited number of hops h ≤ H from the ingress nodes. We believe this is a realistic assumption, based on the consideration that the main purpose of edge networks is to keep the traffic as close as possible to the ingress nodes and, therefore, to the users. Thanks to this approach, we are able to make problem P more tractable and solvable in a short time even in the case of complex edge networks (see Section 6). We can further improve the solution time by eliminating from the problem formulation all unneeded variables. In particular, we modify P by adding a scope k (where k is the ingress node) to E and L. E_k ⊆ E represents the set of h-hop neighbor nodes (h ≤ H) of k, and L_k ⊆ L the set of links inside this neighborhood. This way, the solver is able to skip all variables outside the considered k scope, thus reducing the time needed to load, store, analyze and prune the problem. Such a modification does not change the result produced by the heuristic, but it yields a consistent improvement (up to one order of magnitude) in the computing time needed to obtain the solution in our numerical analysis.

6 NUMERICAL RESULTS
The goal of this evaluation is to show that: i) our P model offers an appropriate solution to the edge network optimization problem we have discussed in this paper, ii) our NESF heuristic computes a solution which is aligned with the optimal one, and iii) when compared with two benchmark heuristics, Greedy and Greedy-Fair, NESF offers better results within similar ranges of computing time. Consistently, the rest of this section is organized as follows: Section 6.1 describes the heuristics we compare with; Section 6.2 describes the setup of our experiments; Section 6.3 discusses the optimal solution and the results obtained by the heuristics in the small network scenario presented in Section 3; Section 6.4 analyzes the results achieved by the heuristics when the network parameters vary; finally, Section 6.5 discusses the computing time needed to find a solution.
We propose two benchmark heuristics, based on a greedyapproach, which can be naturally devised in our context:
Greedy: With this approach, each ingress node uses its neighbor nodes' computation facilities to guarantee a low overall latency for its incoming traffic. Hence, each ingress node first tries to locally process all incoming traffic. If its computation capacity is sufficient, a feasible solution is obtained; otherwise, the extra traffic is split and outsourced to its 1-hop neighbors, and so on, until it is completely processed (if possible).
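As a minimal sketch (our own illustration, not the authors' implementation), this hop-by-hop outsourcing can be written as a breadth-first search over each ingress node's neighborhood; `adj` (adjacency lists), `cap` (per-node computation capacity) and `demand` (per-ingress traffic) are hypothetical names:

```python
from collections import deque

def greedy_offload(adj, cap, demand, ingress):
    """Each ingress node first processes its traffic locally, then pushes
    the excess to 1-hop neighbors, then 2-hop neighbors, and so on (BFS
    order), until the traffic is fully placed. Returns None if it cannot
    be placed, mimicking infeasibility."""
    residual = dict(cap)
    placed = {k: {} for k in ingress}
    for k in ingress:
        left = demand[k]
        seen, queue = {k}, deque([k])
        while left > 1e-9 and queue:
            i = queue.popleft()
            served = min(left, residual[i])
            if served > 0:
                placed[k][i] = served
                residual[i] -= served
                left -= served
            for j in adj[i]:          # enqueue next hop ring
                if j not in seen:
                    seen.add(j)
                    queue.append(j)
        if left > 1e-9:
            return None               # extra traffic could not be placed
    return placed
```

For instance, on a 3-node path with capacity 5 per node and 12 units of traffic entering at node 0, the split is 5 locally, 5 at the 1-hop neighbor and 2 at the 2-hop neighbor.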
Greedy-Fair: It is a variant of Greedy which performs a sort of "fair" traffic offloading on neighbor nodes. More specifically, it proceeds as follows: 1) compute the maximum number of available computing nodes, based on the power budget and the average computation capacity of a node; 2) divide such maximum number (budget) into |K| parts according to the ratio of the total traffic rates among ingress nodes, and choose for each ingress node the corresponding number of computing nodes from its nearest h-hop neighbors. Each ingress node spreads its load on its neighbors inversely proportionally to the corresponding distance (hop + 1); for example, if the load is outsourced to two 1-hop neighbors, the ratio is (1 : 0.5 : 0.5) = (0.5 : 0.25 : 0.25).

We implement our model and heuristics using SCIP (Solving Constraint Integer Programs), an open-source framework that solves constraint integer programming problems (http://scip.zib.de). All numerical results presented in this section have been obtained on a server equipped with an Intel(R) Xeon(R) E5-2640 v4 CPU @ 2.40GHz and 126 Gbytes of RAM. The parameters of SCIP in our experiments are set to their default values. The illustrated results are obtained by averaging over 50 instances with random traffic rates λ_kn following a Gaussian distribution N(µ, σ), where µ is the value of λ_kn shown in Table 3 and σ is set to a small value (we recall that the optimization problem is solved under the assumption that the traffic shows only little random variation during the time slot under observation; for this reason, the choice of a Gaussian distribution is appropriate). We computed narrow 95% confidence intervals, shown in the following figures. The network topologies used in the following experiments are generated as Erdős–Rényi random graphs [7] by specifying the numbers of nodes and edges. Note that the original Erdős–Rényi algorithm may produce disconnected random graphs with isolated nodes and components. To generate a connected network graph, we patch it with a simple strategy that connects isolated nodes to randomly sampled nodes (up to 10 nodes) in the graph. We generated several kinds of topologies with different numbers of nodes and edges, shown in Figure 5, that span from a quasi-tree topology (Figure 5(c)) to a more general, highly connected one with 100 nodes and 150 edges (Figure 5(f)). These topologies can be considered representative of various edge network configurations where multiple edge nodes are distributed in various ways over the territory.
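The patching strategy described above can be sketched as follows; this is our own stdlib-only variant (the authors' generator is not published here): draw a G(n, m) Erdős–Rényi graph, then attach every component other than the first to a randomly chosen node of the first one:

```python
import random

def connected_gnm(n, m, seed=None):
    """Sample an Erdos-Renyi G(n, m) graph as an adjacency-set dict, then
    add one extra edge per isolated component so the result is connected
    (a simple variant of the patching strategy described in the text)."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    edges = set()
    while len(edges) < m:                  # m distinct random edges
        u, v = rng.sample(range(n), 2)
        edges.add((min(u, v), max(u, v)))
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    comps, seen = [], set()                # connected components (DFS)
    for s in range(n):
        if s in seen:
            continue
        comp, stack = [], [s]
        seen.add(s)
        while stack:
            u = stack.pop()
            comp.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    for comp in comps[1:]:                 # patch: attach extras to comps[0]
        u, v = rng.choice(comp), rng.choice(comps[0])
        adj[u].add(v)
        adj[v].add(u)
    return adj
```

Note that the patching may add a few edges beyond m, so the resulting average degree is only approximately 2m/n.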
Due to space constraints, in the following we present and discuss the results obtained for a representative topology, i.e., the one in Figure 5(e), as well as those for the small topology of Figure 2, used to compare our proposed heuristics to the optimal solution. The full set of results is available online (http://xiang.faculty.polimi.it/files/supplementary-results.pdf).

TABLE 3: Parameters setting - Initial (reference) values (for the case of high incoming traffic load with low tolerable latency)

Parameter                               Initial value
No. of nodes |E|                        20 ∼ 100 (topologies in Fig. 5)
No. of ingress nodes |K|                3
No. of traffic types |N|                5
Link bandwidth B_l (Gb/s), l ∈ L        100
Network capacity C_k (Gb/s), k ∈ K      40, 50, 60
Computation level D_a (Gb/s), a ∈ A     30, 40, 50
Computation budget P (Gb/s)             300
Traffic rate λ_kn (Gb/s)                per-type values (K × N matrix)
Tolerable latency τ_n (ms), n ∈ N       1 ∼ 3.5
Weights κ_i (i ∈ E), w                  equal values

In Table 3 we provide a summary of the reference values we define for each parameter. Such values are representative of a scenario with a high traffic load and low tolerable latency relative to the limited communication and computation resources. Referring to the computation capacity levels and budget in Table 3, it is worth noticing that the unit "cycles/s" is often used for these metrics; for simplicity, we transform it into "Gb/s" by using the factor "8 bit / 1900 cycles", which assumes that processing 1 byte of data needs 1900 CPU cycles in a BBU pool [8].

(a) 20 nodes, 30 edges (avg. degree: 3.0)
(b) 40 nodes, 60 edges (avg. degree: 3.0)
(c) 50 nodes, 50 edges (avg. degree: 2.0)
(d) 60 nodes, 90 edges (avg. degree: 3.0)
(e) 80 nodes, 120 edges (avg. degree: 3.0)
(f) 100 nodes, 150 edges (avg. degree: 3.0)
Fig. 5: Network topologies. Ingress nodes for each graph are colored in red.

The number of traffic types is set to five. Each traffic type can be dedicated to a specific application case (e.g., video transmission for entertainment, real-time signaling, virtual reality games, audio). Our traffic rates result from the aggregation of the traffic generated by multiple users connected to a certain ingress node. We select rate values that can be typical in a 5G usage scenario and that almost saturate the wireless network capacity at the ingress nodes, which we assume to vary from 40 to 60 Gb/s. The tolerable latency for each traffic type aims at challenging the approach with quite demanding requirements, ranging from 1 to 3.5 ms. More specifically, the values of traffic rate λ_kn and tolerable latency τ_n are designed to cover several different scenarios, i.e., mice, normal and elephant traffic loads under strict, normal and loose latency requirements. For simplicity, in this paper we fix the number of ingress nodes to three. An in-depth analysis of the impact of the number of ingress nodes on the performance of the optimization algorithm is the subject of our future research. To make the problem solution manageable, we assume to adopt links of the same bandwidth (100 Gb/s), which is representative of current fiber connections.
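The cycles-to-throughput conversion mentioned above is straightforward; the sketch below is our own (the cycle budget in the example is illustrative, chosen to match one of the capacity levels we use) and simply applies the 8 bit / 1900 cycles factor from [8]:

```python
def cycles_to_gbps(cycles_per_sec):
    """Convert a CPU budget (cycles/s) into an equivalent processing
    throughput (Gb/s), assuming 1900 CPU cycles per processed byte [8]."""
    bits_per_sec = cycles_per_sec * 8 / 1900   # 8 bits per byte
    return bits_per_sec / 1e9

# Example: a pool offering 1.1875e13 cycles/s corresponds to the
# 50 Gb/s computation level.
print(cycles_to_gbps(1.1875e13))  # -> 50.0
```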
As in the example of Section 3, we assume three possible levels for the computation capacity (30, 40 and 50 Gb/s), under the assumption that, as happens in typical cloud IaaS, users see a predefined computation service offer. The maximum computation budget is set to 300 Gb/s, which is a relatively low value considering the traffic rates we use in the experiments and the number of available nodes in the considered topologies. Finally, by assigning the same values to weights κ_i and w, we make sure that the two components of the optimization problem, the total latency and the operation cost, have the same importance in the identification of the solution.

Fig. 6: Comparison with the optimum varying two selected parameters, (a) the network capacity C_k and (b) the trade-off weight w, in the example network scenario 10N20E of Figure 2.

We first compare the results obtained by our proposed heuristic,
NESF, against the optimum obtained by solving model P in the simple topology illustrated in Figure 2, Section 3. Note that the original model could be solved only in such a small network scenario, due to its very high computing time. In Figure 6 we show the variation of the objective function (the sum of total latency and operation cost) with respect to two parameters: the network capacity C_k and the weight w in the objective function. In these cases, it can be observed that NESF obtains near-optimal solutions, practically overlapping with the optimum curve, for the whole range of the parameters, while both Greedy and Greedy-Fair perform worse. The results achieved when the other parameters vary show the same trend. For the sake of space, we do not show them, but they are reported in the supplementary results available online. Figure 7 shows the configuration of nodes and routing
Fig. 7: Comparison of the solutions achieved by the heuristics and the optimum for the 10N20E topology.

paths for the network (10N20E) with the parameter values defined in Section 3. Each sub-figure refers to one of the four considered solutions. Here we highlight the ingress nodes and the other nodes which offer computation capacity or support traffic routing. The remaining nodes are not shown for the sake of clarity. The black arrows represent the enabled routing paths. The traffic flow allocation of each solution is marked in red for traffic of type 1 and in blue for type 2, respectively. The values of all relevant decision variables (see Section 4) are shown as well. Comparing Figures 7(a) and 7(c), we notice that both Optimal and
NESF enable the computation capacity on the ingress nodes and on an intermediate node, with one type of traffic kept in the ingress nodes and the other offloaded to the intermediate node. The main differences between Optimal and NESF include: i) the planning of the computation capacity on the ingress nodes, and ii) the intermediate node selected and the consequent routing paths. However, the objective function values (trade-off between the total latency and operation cost) obtained by Optimal and NESF are very close to each other. To further investigate the reasons behind this, we examined the latencies for the traffic of types 1 and 2: since in this case NESF achieves a lower total latency at the expense of a slightly higher computation cost, compared with Optimal, their corresponding objective function values are close. Note that the computing time needed to obtain the optimal solution is around 10 hours, while NESF is able to compute its approximate solution in only about one second. The Greedy and Greedy-Fair approaches tend to enable computation capacity on more nodes; Greedy-Fair also splits each type of traffic over multiple paths. Both aspects result in a higher objective function value. When increasing the network capacity C_k, the resulting solutions remain almost the same, except for the allocation of the wireless network capacity and computation capacity.

We investigate the effect of several parameters on the objective function value, namely the link bandwidth B_l, network capacity C_k, computation capacity D_a and corresponding total budget P, traffic rate λ_kn, tolerable latency τ_n and trade-off weight w. We conduct our simulations by scaling one parameter value at a time, starting from the initial values in Table 3. Since the goal is to minimize the weighted sum of total latency and operation cost, lower values of the objective function are preferable. In Figure 8 we report all results referring to the topology with 80 nodes and 120 links (Figure 5(e)). All results obtained considering the other topologies in Figure 5 are available online and show similar trends.

Link bandwidth B_l. Figure 8(a) illustrates the variation of the objective function value (costs w.r.t. latency and computation) versus the link bandwidth B_l, ∀l ∈ L, whose values are scaled with respect to the initial ones in Table 3. In all cases, the problem instance is infeasible below a certain threshold bandwidth value. As B_l increases above the threshold, the cost value achieved by each approach decreases and converges to a smaller value. In all cases, NESF performs the best among all the approaches, with clear gains over both Greedy and Greedy-Fair. Greedy and
Greedy-Fair show little flexibility to the variation of the link bandwidth.

Network capacity C_k. Figure 8(b) shows the variation of the objective function value with respect to the wireless network capacity C_k, ∀k ∈ K, scaled with respect to the initial values reported in Table 3, up to the case in which the wireless network shows a capacity comparable to that of the internal network links. When C_k increases, the objective function value obtained by each approach decreases quite fast and converges to a specific value. Greedy and Greedy-Fair exhibit close performance, while NESF still performs the best among all the approaches, with consistent gaps over both. This trend reflects the strong effect of an increase of the wireless network capacity on the minimization of the overall system cost and performance.

Fig. 8: Numerical results for the large-scale network topology 5(e), 80N120E (averaged over 50 instances): (a) bandwidth B_l; (b) network capacity C_k; (c) computation capacity budget P; (d)-(f) computation capacity levels D_a; (g) traffic rate λ_kn; (h) tolerable latency τ_n; (i) trade-off weight w.

Computation budget P. Figure 8(c) shows the trend of the objective function value as the computation capacity budget P varies, scaled with respect to the initial value in Table 3. Clearly, a low power budget challenges the optimization approach, which must ensure that the available computation capacity always stays within this budget. The figure shows that each heuristic has a limit budget value below which it is unable to find a feasible solution, and this threshold is the lowest for NESF. Thus,
NESF is the most resilient in this case. As P increases, the cost values obtained by NESF and Greedy monotonically decrease in a staircase fashion and finally converge quickly to specific values. The staircase pattern is due to the fact that the optimal solution remains constant when P varies within a small range, and the decreasing trend is also consistent with the real-world case. The cost value for Greedy-Fair, however, exhibits an opposite trend. This is due to its strategy, which tries to use the maximum number of nodes that the budget P can cover and to distribute the traffic load over all of them. This scheme thus results in a waste of computation capacity and a cost increase in some situations. Finally, NESF still achieves the best performance, with clear gaps with respect to both Greedy and Greedy-Fair.

Computation capacity D_a. Figures 8(d), 8(e), and 8(f) illustrate the variation of the objective function value with respect to the three levels of computation capacity D_a, which are scaled with respect to the initial values in Table 3, keeping the relation D_1 < D_2 < D_3. In Figures 8(d) and 8(e), the objective function values obtained by the three approaches show very small variation when the computation capacity is scaled. In Figure 8(f), there is a clear decreasing trend for the objective function values achieved by both Greedy and
Greedy-Fair. The reason is that many edge nodes are enabled with the D_3 computation level, and the increased D_3 capacity reduces much of the total latency while not adding much operation cost. The objective function value achieved by NESF, on the other hand, remains almost unchanged.

Fig. 9: Computing time.

To summarize,
NESF provides better and more stable solutions, compared with the other approaches.

Traffic rate λ_kn. Figure 8(g) shows the variation of the objective function value versus the traffic rate. The values λ_kn, kn ∈ K × N, are scaled with respect to the initial values in Table 3. As the traffic λ_kn increases, the objective function values of all the approaches increase. We observe that NESF is characterized by a smooth curve, which indicates stability of the solving process, while both Greedy and Greedy-Fair exhibit larger fluctuations. When the traffic rate is relatively low, the cost values of all the approaches are the same, since the best configuration, i.e., local computation of the traffic, is easily identified by all of them. After that point, NESF exhibits better performance, with a clear gap with respect to the other approaches.

Tolerable latency τ_n. Figure 8(h) illustrates the objective function value with respect to the tolerable latency τ_n, n ∈ N, scaled with respect to the initial values in Table 3. When τ_n increases, the objective function values obtained by all the approaches decrease and converge to specific values. Parameter τ_n serves in our model as an upper bound (see constraint (18)) and limits the solution space. In fact, with a low τ_n value the feasible solution set is smaller and the total cost increases, and vice versa. Again, NESF performs the best, with clear gaps with respect to both Greedy and Greedy-Fair.

Trade-off weight w. This parameter permits to express, in the objective function computation, the relevance of the overall operation cost with respect to the total latency experienced by users. Lower values of w correspond to a lower relevance of the operation cost w.r.t. latency. In Figure 8(i), w is scaled with respect to the initial value in Table 3. When the scaling factor is 0, the optimization focuses almost exclusively on the total latency. As w increases, the objective function values increase almost linearly for all the approaches. The NESF algorithm still achieves the best performance, with clear gaps with respect to both Greedy and
Greedy-Fair . Figure 9 compares the average computing time of the pro-posed approaches under all considered network topologies.The computing time for P is shown only for the smallesttopology and it is already significantly larger than the oth-ers. For the tree-shaped network topology (Figure 5(c)), allapproaches are able to obtain the solution very fast, in lessthan s . This is due to the fact that routing optimizationis indeed trivial in such topology. The computing time isordered as: Greedy < NESF < Greedy-Fair . When consideringstandard deviation, the order is:
NESF < Greedy < Greedy-Fair, which shows the stability of our proposed approach during the solving process. As for the general large-scale network topology (with hundreds of nodes and edges),
NESF is able to obtain a good solution in a short computing time, and is even faster in the other considered cases. This gives us an indication that the network management component can periodically run NESF in response to changes in the network or in the incoming traffic, and optimize node computation capacities and routing paths accordingly. This is a key feature for providing the necessary QoS levels in next-generation mobile network architectures and for updating them dynamically.
RELATED WORK
Several works have been recently published on the resource management problem in a MEC environment; most of them consider a single mobile edge cloud at the ingress node and do not account for its connection to a larger edge cloud network [9–11]. The remainder of this section provides a short overview of the various areas that are relevant to the problem we consider. As discussed in Section 7.6, ours is the first approach that considers at the same time multiple aspects related to the configuration of an edge cloud network.
The network planning problem in a MEC/Fog/Cloud context tackles the problems of node placement, traffic routing and computation capacity configuration. The authors in [12] propose a mixed integer linear programming (MILP) model to study the cloudlet placement, assignment of access points (APs) to cloudlets, and traffic routing problems, minimizing the installation costs of network facilities. The work in [6] proposes a MILP model for the problem of fog node placement under capacity and latency constraints. [13] presents a model to configure the computation capacity of edge hosts and adjust the cloud tenancy strategy for dynamic requests in cloud-assisted MEC, so as to minimize the overall system cost.
The service and content placement problems are considered in several contexts including, among others, micro-clouds and multi-cell MEC. The work in [14] studies the dynamic service placement problem in mobile micro-clouds to minimize the average cost over time. The authors first propose an offline algorithm to place services using predicted costs within a specific look-ahead time window, and then improve it to an online approximation algorithm with polynomial time complexity. An integer linear programming (ILP) model is formulated in [15] for serving the maximum number of user requests in edge clouds by jointly considering service placement and request scheduling. The edge clouds are modeled as a pool of servers without any topology, with shareable (storage) and non-shareable (communication, computation) resources; each user is also limited to using one edge server. In [16], the authors extend the work in [15] by separating the time scales of the two decisions, service placement (per frame) and request scheduling (per slot), to reduce the operation cost and system instability. In [17], the authors study the joint service placement and request routing problem in multi-cell MEC networks to minimize the load of the centralized cloud; no topology is considered for the MEC networks. A randomized rounding (RR) based approach is proposed to solve the problem with a provable approximation guarantee, i.e., the solution returned by RR is at most a factor (more than 3) worse than the optimum with high probability. However, although it offers an important theoretical result, the guarantee provided by the RR approach is specific to the formulated optimization problem. [18] studies the problem of service entity placement for social virtual reality (VR) applications in the edge computing environment.
[19] analyzes mixed-cast packet processing and routing policies for service chains in distributed computing networks to maximize network throughput. The work in [20] studies the edge caching problem in a Cloud RAN (C-RAN) scenario, by jointly considering the resource allocation, content placement and request routing problems, aiming at minimizing the system costs over time. [21] formulates a joint caching, computing and bandwidth resource allocation model to minimize the energy consumption and network usage cost. The authors consider three different network topologies (ring, grid and a hypothetical US backbone network, US64), and abstract fixed routing paths from them using the OSPF routing algorithm.

The cloud activation and selection problems are studied as a way to handle the configuration of computation capacity in a MEC environment. The authors in [22] design an online optimization model for task offloading with a sleep control scheme to minimize the long-term energy consumption of mobile edge networks. They use a Lyapunov-based approach to convert the long-term optimization problem into a per-slot one; no topology is considered for the MEC networks. [23] proposes a model to dynamically switch on/off edge servers, cooperatively cache services and associate users in mobile edge networks to minimize energy consumption. [24] jointly optimizes the active base station set, uplink and downlink beamforming vector selection, and computation capacity allocation to minimize power consumption in mobile edge networks. [25] proposes a model to minimize a weighted sum of energy consumption and average response time in MEC networks, which jointly considers the cloud selection and routing problems; a population game-based approach is designed to solve the optimization problem.
The authors in [26] study the resource allocation problem in network slicing, where multiple resources have to be shared and allocated to verticals (5G end-to-end services). [27] formulates a resource allocation problem for network slicing in a cloud-native network architecture, based on a utility function under network bandwidth and cloud power capacity constraints. For the slice model, the authors consider a simplified scenario where each slice serves network traffic from a single source to a single destination; for the network topology, they consider a 6x6 square grid and a 39-node fat-tree.
Inter-connected datacenters also share some common research problems with the multi-MEC system. The work in [28] studies joint resource provisioning for Internet datacenters to minimize the total cost, which includes server provisioning, load dispatching for delay-sensitive jobs, load shifting for delay-tolerant jobs, and capacity allocation. [29] presents a bandwidth allocation model for inter-datacenter traffic to enforce bandwidth guarantees, minimize the network cost, and avoid potential traffic overload on low-cost links. The work in [30] studies the problem of task offloading from a single device to multiple edge servers to minimize the total execution latency and energy consumption by jointly optimizing task allocation and computational frequency scaling. In [31], the authors study task offloading and wireless resource allocation in an environment with multiple MEC servers. [32] formulates an optimization model to maximize the profit of a mobile service provider by jointly scheduling network resources in C-RAN and computation resources in MEC.
To the best of our knowledge, our paper is the first to propose a complete approach that encompasses both the problem of planning cost-efficient edge networks and that of allocating resources, performing optimal routing and minimizing the total latency of transmitting, outsourcing and processing user traffic, under a tolerable-latency constraint for each class of traffic. We accurately model both link and processing latency using nonlinear functions, and propose both exact models and heuristics that obtain near-optimal solutions also in large-scale network scenarios, which include hundreds of nodes and edges, as well as several traffic flows and classes.
CONCLUSION
In this paper, we studied the problem of jointly planning and optimizing the resource management of a mobile edge network infrastructure. We formulated an exact optimization model, which takes into accurate account all the elements that contribute to the overall latency experienced by users, a key performance indicator for these networks, and further provided an effective heuristic that computes near-optimal solutions in a short computing time, as we
demonstrated in the detailed numerical evaluation we conducted on a set of representative, large-scale topologies, including both mesh and tree-like networks, spanning wide and meaningful variations of the parameter set.

We measured and quantified how each parameter has a distinct impact on network performance (which we express as a weighted sum of the experienced latency and the total network cost), both in terms of strength and form. Traffic rate and network capacity have the strongest effects, which is consistent with real network cases. Tolerable latency shows an interesting effect: the lower the requirements on latency the system sets (or, equivalently, the higher the tolerable latency), the lower the latency and costs the system will have. This information can be useful for network operators when designing the network indicators of services. The computation capacity has a relatively smaller effect on network performance, compared with the other parameters. Another key observation drawn from our numerical analysis is that, as the system capacities (including link bandwidth, network capacity and computation capacity budget) increase, the system performance converges to a plateau: increasing the system capacity beyond a certain level (which we quantify for each network scenario) brings little benefit and, on the contrary, increases the total system cost.

ACKNOWLEDGMENT
This research was supported by the H2020-MSCA-ITN-2016 SPOTLIGHT project under grant agreement number 722788.

REFERENCES

[1] W. Xiang, K. Zheng, and X. S. Shen, 5G Mobile Communications. Springer, 2017.
[2] H. Zhang, N. Liu, X. Chu, K. Long, A.-H. Aghvami, and V. C. Leung, "Network slicing based 5G and future mobile networks: mobility, resource management, and challenges," IEEE Communications Magazine, vol. 55, no. 8, pp. 138–145, 2017.
[3] Y. C. Hu, M. Patel, D. Sabella, N. Sprecher, and V. Young, "Mobile edge computing–A key technology towards 5G," ETSI White Paper, vol. 11, no. 11, pp. 1–16, 2015.
[4] R. Kannan and C. L. Monma, "On the computational complexity of integer programming problems," in Optimization and Operations Research, Springer, 1978, pp. 161–172.
[5] B. Xiang, J. Elias, F. Martignon, and E. Di Nitto, "Joint Network Slicing and Mobile Edge Computing in 5G Networks," in IEEE International Conference on Communications (ICC), IEEE, 2019, pp. 1–7.
[6] A. Santoyo-González and C. Cervelló-Pastor, "Latency-aware cost optimization of the service infrastructure placement in 5G networks," Journal of Network and Computer Applications, vol. 114, pp. 29–37, 2018.
[7] P. Erdős and A. Rényi, "On Random Graphs I," Publicationes Mathematicae Debrecen, vol. 6, pp. 290–297, 1959.
[8] J. Tang, W. P. Tay, T. Q. Quek, and B. Liang, "System cost minimization in cloud RAN with limited fronthaul capacity," IEEE Transactions on Wireless Communications, vol. 16, no. 5, pp. 3371–3384, 2017.
[9] C. Wang, C. Liang, F. R. Yu, Q. Chen, and L. Tang, "Computation offloading and resource allocation in wireless cellular networks with mobile edge computing," IEEE Transactions on Wireless Communications, vol. 16, no. 8, pp. 4924–4938, 2017.
[10] Y. Mao, J. Zhang, S. Song, and K. B. Letaief, "Stochastic joint radio and computational resource management for multi-user mobile-edge computing systems," IEEE Transactions on Wireless Communications, vol. 16, no. 9, pp. 5994–6009, 2017.
[11] X. Ma, S. Zhang, W. Li, P. Zhang, C. Lin, and X. Shen, "Cost-efficient workload scheduling in cloud assisted mobile edge computing," in Quality of Service (IWQoS), IEEE/ACM 25th International Symposium on, IEEE, 2017, pp. 1–10.
[12] A. Ceselli, M. Premoli, and S. Secci, "Mobile edge cloud network design optimization," IEEE/ACM Transactions on Networking (TON), vol. 25, no. 3, pp. 1818–1831, 2017.
[13] X. Ma, S. Wang, S. Zhang, P. Yang, C. Lin, and X. S. Shen, "Cost-efficient resource provisioning for dynamic requests in cloud assisted mobile edge computing," IEEE Transactions on Cloud Computing, 2019.
[14] S. Wang, R. Urgaonkar, T. He, K. Chan, M. Zafer, and K. K. Leung, "Dynamic service placement for mobile micro-clouds with predicted future costs," IEEE Transactions on Parallel and Distributed Systems, vol. 28, no. 4, pp. 1002–1016, 2016.
[15] T. He, H. Khamfroush, S. Wang, T. La Porta, and S. Stein, "It's hard to share: Joint service placement and request scheduling in edge clouds with sharable and non-sharable resources," IEEE, 2018, pp. 365–375.
[16] V. Farhadi, F. Mehmeti, T. He, T. La Porta, H. Khamfroush, S. Wang, and K. S. Chan, "Service placement and request scheduling for data-intensive applications in edge clouds," in IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, IEEE, 2019, pp. 1279–1287.
[17] K. Poularakis, J. Llorca, A. M. Tulino, I. Taylor, and L. Tassiulas, "Joint service placement and request routing in multi-cell mobile edge computing networks," in IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, IEEE, 2019, pp. 10–18.
[18] L. Wang, L. Jiao, T. He, J. Li, and M. Mühlhäuser, "Service entity placement for social virtual reality applications in edge computing," in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications, IEEE, 2018, pp. 468–476.
[19] J. Zhang, A. Sinha, J. Llorca, A. Tulino, and E. Modiano, "Optimal control of distributed computing networks with mixed-cast traffic flows," in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications, IEEE, 2018, pp. 1880–1888.
[20] L. Pu, L. Jiao, X. Chen, L. Wang, Q. Xie, and J. Xu, "Online resource allocation, content placement and request routing for cost-efficient edge caching in cloud radio access networks," IEEE Journal on Selected Areas in Communications, vol. 36, no. 8, pp. 1751–1767, 2018.
[21] Q. Chen, F. R. Yu, T. Huang, R. Xie, J. Liu, and Y. Liu, "Joint resource allocation for software-defined networking, caching, and computing," IEEE/ACM Transactions on Networking, vol. 26, no. 1, pp. 274–287, 2018.
[22] S. Wang, X. Zhang, Z. Yan, and W. Wang, "Cooperative edge computing with sleep control under non-uniform traffic in mobile edge networks," IEEE Internet of Things Journal, 2018.
[23] Q. Wang, Q. Xie, N. Yu, H. Huang, and X. Jia, "Dynamic Server Switching for Energy Efficient Mobile Edge Networks," in IEEE International Conference on Communications (ICC), IEEE, 2019, pp. 1–6.
[24] J. Opadere, Q. Liu, N. Zhang, and T. Han, "Joint Computation and Communication Resource Allocation for Energy-Efficient Mobile Edge Networks," in IEEE International Conference on Communications (ICC), IEEE, 2019, pp. 1–6.
[25] B. Wu, J. Zeng, L. Ge, Y. Tang, and X. Su, "A game-theoretical approach for energy-efficient resource allocation in MEC network," in IEEE International Conference on Communications (ICC), IEEE, 2019, pp. 1–6.
[26] F. Fossati, S. Moretti, P. Perny, and S. Secci, "Multi-resource allocation for network slicing," 2019.
[27] M. Leconte, G. S. Paschos, P. Mertikopoulos, and U. C. Kozat, "A resource allocation framework for network slicing," in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications, IEEE, 2018, pp. 2177–2185.
[28] D. Xu, X. Liu, and Z. Niu, "Joint resource provisioning for internet datacenters with diverse and dynamic traffic," IEEE Transactions on Cloud Computing, vol. 5, no. 1, pp. 71–84, 2017.
[29] W. Li, K. Li, D. Guo, G. Min, H. Qi, and J. Zhang, "Cost-minimizing bandwidth guarantee for inter-datacenter traffic," IEEE Transactions on Cloud Computing, 2016.
[30] T. Q. Dinh, J. Tang, Q. D. La, and T. Q. Quek, "Offloading in mobile edge computing: Task allocation and computational frequency scaling," IEEE Transactions on Communications, vol. 65, no. 8, pp. 3571–3584, 2017.
[31] K. Cheng, Y. Teng, W. Sun, A. Liu, and X. Wang, "Energy-efficient joint offloading and wireless resource allocation strategy in multi-mec server systems," in IEEE International Conference on Communications (ICC), IEEE, 2018, pp. 1–6.
[32] X. Wang, K. Wang, S. Wu, S. Di, H. Jin, K. Yang, and S. Ou, "Dynamic resource scheduling in mobile edge cloud with cloud radio access network," IEEE Transactions on Parallel and Distributed Systems, vol. 29, no. 11, pp. 2429–2445, 2018.

APPENDIX A: PROBLEM REFORMULATION
Problem P formulated in Section 4 cannot be solved directly and efficiently, for the reasons detailed in Section 4.4. To deal with these issues, we propose in this Appendix an equivalent reformulation of P, which can be solved very efficiently with the Branch and Bound method. Moreover, the reformulated problem can be further relaxed and, based on that, we propose a heuristic algorithm that obtains near-optimal solutions in a short computing time. To this aim, we first reformulate the processing latency and link latency constraints (viz., constraints (12) and (16)), dealing at the same time with the computation planning problem. Then, we handle the difficulties related to the variables $R^{kn}_i$ and the corresponding routing constraints.

A.1 Processing Latency
In equation (12), the variable $\beta^{kn}_i$ and the function $S_i$ tie the computation capacity allocation and planning problems together, and the processing latency $t^{kn,i}_P$ therefore has a highly nonlinear expression. To handle this problem, we first introduce an auxiliary variable $p^{kn,a}_i = \beta^{kn}_i \delta^a_i$. Then, $\beta^{kn}_i S_i$ is replaced by the linearized form $\beta^{kn}_i S_i = \sum_{a \in \mathcal{A}} p^{kn,a}_i D_a$. Furthermore, we linearize $p^{kn,a}_i = \beta^{kn}_i \delta^a_i$, which is the product of a binary and a continuous variable, as follows:
$$0 \leq p^{kn,a}_i \leq \delta^a_i, \qquad 0 \leq \beta^{kn}_i - p^{kn,a}_i \leq 1 - \delta^a_i, \qquad \forall k, \forall n, \forall a, \forall i. \qquad (20)$$
According to the definitions of $\alpha^{kn}_i$ and $b^{kn}_i$, we have the following constraint:
$$\alpha^{kn}_i \leq b^{kn}_i \leq M \alpha^{kn}_i, \qquad \forall k, \forall n, \forall i, \qquad (21)$$
where $M$ is a sufficiently large positive constant; this constraint implies that if $\alpha^{kn}_i = 0$, the traffic $kn$ is not processed on node $i$, i.e., $b^{kn}_i = 0$. Based on the above, we can rewrite constraint (13) as:
$$\alpha^{kn}_i \lambda^{kn} - (1 - b^{kn}_i) < \sum_{a \in \mathcal{A}} p^{kn,a}_i D_a, \qquad \beta^{kn}_i \leq b^{kn}_i, \qquad \forall k, \forall n, \forall i. \qquad (22)$$
Note that the term $(1 - b^{kn}_i)$ permits to implement the condition $\alpha^{kn}_i > 0$ in Eq. (13).

In equation (12), we observe that if $b^{kn}_i = 1$, we have $\beta^{kn}_i S_i - \alpha^{kn}_i \lambda^{kn} > 0$; otherwise $\beta^{kn}_i S_i - \alpha^{kn}_i \lambda^{kn} = 0$, resulting in $t^{kn,i}_P \to \infty$. To handle this case, we first define a new variable ${t^{kn,i}_P}'$ as follows:
$${t^{kn,i}_P}' = \frac{1}{\sum_{a \in \mathcal{A}} p^{kn,a}_i D_a - \alpha^{kn}_i \lambda^{kn} + (1 - b^{kn}_i) D_m}, \qquad (23)$$
where $D_m$ is the maximum computation capacity that can be installed on a node ($D_m = \max_{a \in \mathcal{A}} D_a$). From this equation, we have $b^{kn}_i = 1 \Rightarrow {t^{kn,i}_P}' = t^{kn,i}_P > 1/D_m$ and $b^{kn}_i = 0 \Rightarrow {t^{kn,i}_P}' = 1/D_m,\ t^{kn,i}_P = 0$. Hereafter, we prove that this reformulation has no influence on the solution of our optimization problem. The outsourcing latency is defined as the maximum of the processing latency $t^{kn,i}_P$ and link latency $t^{kn,i}_L$ among all nodes.
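The linearization in (20) is the standard device for the product of a binary and a bounded continuous variable, and the same trick is reused later for $g^{kn,i}_l$ and $h^{kn,i}_l$. As a quick illustration (our own sketch, not the authors' code), a brute-force search over a grid of candidate values confirms that the only $p$ satisfying the two pairs of inequalities is $p = \beta\,\delta$:

```python
# Illustrative check of the linearization in constraint (20):
#   0 <= p <= delta,   0 <= beta - p <= 1 - delta
# with delta binary and beta in [0, 1] forces p = beta * delta.

def feasible_p(beta, delta, step=0.01):
    """Return all grid values of p in [0, 1] satisfying constraint (20)."""
    eps = 1e-9  # tolerance for floating-point comparisons
    sols = []
    for i in range(int(1 / step) + 1):
        p = i * step
        if -eps <= p <= delta + eps and -eps <= beta - p <= 1 - delta + eps:
            sols.append(round(p, 2))
    return sols

for delta in (0, 1):
    for beta in (0.0, 0.3, 1.0):
        # the unique feasible value coincides with the product beta * delta
        assert feasible_p(beta, delta) == [round(beta * delta, 2)]
print("constraint (20) enforces p = beta * delta")
```

Intuitively, $\delta = 0$ pins $p$ to $0$ through the first pair, while $\delta = 1$ pins $p$ to $\beta$ through the second pair, so no bilinear term is needed in the model.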
Equation (17) can be transformed as $t^{kn}_{PL} \geq t^{kn,i}_P + t^{kn,i}_L,\ \forall k, \forall n, \forall i$. When $b^{kn}_i = 0$, we have $t^{kn,i}_P = t^{kn,i}_L = 0$. Thus, based on the above, the inequality is equivalent to $t^{kn}_{PL} \geq {t^{kn,i}_P}' + t^{kn,i}_L,\ \forall k, \forall n, \forall i$.

A.2 Link Latency
As stated before, to compute the link latency we need to determine the routing path $R^{kn}_i$; this problem is specifically handled in the next subsection. Assuming $R^{kn}_i$ has been determined, we first introduce a binary variable $\gamma^{kn,i}_l$ defined as follows:
$$\gamma^{kn,i}_l = \begin{cases} 1, & \text{if } l \in R^{kn}_i, \\ 0, & \text{otherwise}, \end{cases} \qquad \forall k, \forall n, \forall i, \forall l,$$
which indicates whether $l$ is used in the routing path $R^{kn}_i$ or not. Note that the corresponding routing path is defined only if traffic $kn$ is processed on node $i$ (i.e., $b^{kn}_i = 1$) and $i \neq k$. Then we have:
$$\gamma^{kn,k}_l = 0,\ \forall k, \forall n, \forall l; \qquad \gamma^{kn,i}_l \leq b^{kn}_i,\ \forall k, \forall n, \forall i, \forall l. \qquad (24)$$
We now introduce variable $v_l$, defined as follows:
$$v_l = \frac{1}{B_l - \sum_{k' \in \mathcal{K}} \sum_{n' \in \mathcal{N}} f^{k'n'}_l \lambda^{k'n'}}, \qquad \forall l. \qquad (25)$$
This permits to transform equation (16) as $t^{kn,i}_L = \sum_{l \in \mathcal{L}} \gamma^{kn,i}_l v_l$. We then need to linearize the product of the binary variable $\gamma^{kn,i}_l$ and the continuous variable $v_l$; to this aim we introduce an auxiliary variable $g^{kn,i}_l = \gamma^{kn,i}_l v_l$, thus also eliminating $t^{kn,i}_L$. Specifically, we first compute the value range of $v_l$ as follows:
$$B_l^{-1} \leq v_l \leq V_l = \frac{1}{\max\{B_l - \sum_{k \in \mathcal{K}} \sum_{n \in \mathcal{N}} \lambda^{kn},\ \epsilon\}},$$
where $\epsilon > 0$ is a small value. Based on the above, the linearization is performed by the following constraints:
$$\gamma^{kn,i}_l B_l^{-1} \leq g^{kn,i}_l \leq \gamma^{kn,i}_l V_l, \qquad (1 - \gamma^{kn,i}_l) B_l^{-1} \leq v_l - g^{kn,i}_l \leq (1 - \gamma^{kn,i}_l) V_l. \qquad (26)$$
At the same time, the link latency is rewritten as $\sum_{l \in \mathcal{L}} g^{kn,i}_l$.

A.3 Routing Path
Based on the definitions introduced in the previous subsection, the traffic flow $f^{kn}_l$ can be transformed as:
$$f^{kn}_l = \sum_{i \in \mathcal{E}} \gamma^{kn,i}_l \alpha^{kn}_i. \qquad (27)$$
Due to the product of binary and continuous variables, $h^{kn,i}_l = \gamma^{kn,i}_l \alpha^{kn}_i$ is introduced for linearization, as follows:
$$0 \leq h^{kn,i}_l \leq \gamma^{kn,i}_l, \qquad 0 \leq \alpha^{kn}_i - h^{kn,i}_l \leq 1 - \gamma^{kn,i}_l. \qquad (28)$$
Now we need to simplify the traffic flow conservation constraint (see Eq. (8)). To this aim, and to simplify notation, we first introduce in the network topology a "dummy" entry node 0 which connects to all ingress nodes $k \in \mathcal{K}$. All traffic comes through this dummy node and goes to each ingress node with volume $\lambda^{kn}$, i.e., $f^{kn}_l = 1,\ \forall k, \forall n, \forall l \in \mathcal{F}$, where $\mathcal{F}$ is the dummy link set defined as $\mathcal{F} = \{(0, k) \mid k \in \mathcal{K}\}$. Then, we extend the definition of $\mathcal{I}_i$ to $\mathcal{I}_i = \{j \in \mathcal{E} \mid (j, i) \in \mathcal{L} \cup \mathcal{F}\}$. Equation (8) is hence transformed as:
$$\sum_{j \in \mathcal{I}_i} f^{kn}_{ji} - \sum_{j \in \mathcal{O}_i} f^{kn}_{ij} = \alpha^{kn}_i, \qquad \forall k, \forall n, \forall i. \qquad (29)$$
Correspondingly, we add the following constraints on the set $\mathcal{F}$ of dummy links:
$$\gamma^{kn,i}_{0k} = b^{kn}_i,\ \forall k, \forall n, \forall i; \qquad \gamma^{kn,i}_{0k'} = 0,\ \forall k, \forall n, \forall i, \forall k' \neq k. \qquad (30)$$
The final stage of our procedure is the definition of the constraints that guarantee all desirable properties that a routing path must respect: the fact that a single path is used (traffic is unsplittable), the flow conservation constraints that provide continuity to the chosen path, and finally the absence of cycles in the routing path $R^{kn}_i$.
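To make constraint (29) concrete, the following small sketch (our own illustration on a hypothetical 4-node topology, not taken from the paper) verifies the flow balance for one traffic flow $kn$: the dummy node 0 injects the whole flow at the ingress node, and the net inflow at every node equals the fraction $\alpha^{kn}_i$ processed there.

```python
# Toy check of flow conservation (29) with the dummy entry node 0.
# Node 1 plays the role of the ingress k; fractions of the flow are
# processed at nodes 2 and 3 (alpha values are made up for illustration).

# f[(j, i)] = fraction of flow kn carried on directed link (j, i)
f = {
    (0, 1): 1.0,   # dummy link (0, k): the whole flow enters at the ingress
    (1, 2): 1.0,   # everything leaves the ingress (alpha at node 1 is 0)
    (2, 3): 0.4,   # 0.6 is processed at node 2, the remainder travels on
}
alpha = {1: 0.0, 2: 0.6, 3: 0.4}   # processing fractions, summing to 1

for i in alpha:
    inflow = sum(v for (j, h), v in f.items() if h == i)
    outflow = sum(v for (j, h), v in f.items() if j == i)
    # constraint (29): net inflow at node i equals alpha_i
    assert abs(inflow - outflow - alpha[i]) < 1e-9
print("flow conservation (29) holds at every node")
```

Note that the aggregate link loads are consistent with (27): link (1, 2) carries both the fraction destined to node 2 and the one destined to node 3, while link (2, 3) carries only the latter.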
We would like to highlight that the traffic $kn$ can be split only at the ingress node $k$, and each proportion of such traffic is destined to an edge node $i$; this is why we have multiple routing paths $R^{kn}_i$, one for each edge node $i$. To this aim, we introduce the following conditions, and prove that satisfying them along with the constraints illustrated before guarantees that such properties are respected:

• For an arbitrary node $i$, the number of ingress links used by a path $R^{kn}_{i'}$ is at most one, and thus the variables $\gamma^{kn,i'}_{ji}$ should satisfy the following condition:
$$\sum_{j \in \mathcal{I}_i} \gamma^{kn,i'}_{ji} \leq 1, \qquad \forall k, \forall n, \forall i, \forall i'. \qquad (31)$$
• The flow conservation constraint (see Eq. (29)) implements the continuity of a traffic flow.
• Every routing path should have an end (a destination) to avoid loops. This can be ensured by the following equation:
$$\gamma^{kn,i}_{ij} = 0, \qquad \forall k, \forall n, \forall (i, j) \in \mathcal{L}. \qquad (32)$$
The proof is as follows:

a) Substitute Eq. (27) into (29) and make the transformation:
$$\sum_{j \in \mathcal{I}_i} \sum_{i' \in \mathcal{E}} \gamma^{kn,i'}_{ji} \alpha^{kn}_{i'} - \sum_{j \in \mathcal{O}_i} \sum_{i' \in \mathcal{E}} \gamma^{kn,i'}_{ij} \alpha^{kn}_{i'} = \sum_{i' \in \mathcal{E}} \alpha^{kn}_{i'} \Big( \sum_{j \in \mathcal{I}_i} \gamma^{kn,i'}_{ji} - \sum_{j \in \mathcal{O}_i} \gamma^{kn,i'}_{ij} \Big) = \alpha^{kn}_i.$$
b) Based on constraints (24) and (30), if $\alpha^{kn}_{i'} = 0$, then $\sum_{j \in \mathcal{I}_i} \gamma^{kn,i'}_{ji} - \sum_{j \in \mathcal{O}_i} \gamma^{kn,i'}_{ij} = 0$.
c) From a) and b), we have:
$$\sum_{j \in \mathcal{I}_i} \gamma^{kn,i}_{ji} - \sum_{j \in \mathcal{O}_i} \gamma^{kn,i}_{ij} = 1, \qquad \forall k, \forall n, \forall i \mid \alpha^{kn}_i > 0,$$
$$\sum_{j \in \mathcal{I}_i} \gamma^{kn,i'}_{ji} - \sum_{j \in \mathcal{O}_i} \gamma^{kn,i'}_{ij} = 0, \qquad \forall k, \forall n, \forall i, \forall i' \neq i.$$
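As an illustration of the identities in c) (our own sketch on a hypothetical 4-node topology, not the authors' code), the following snippet traces a routing path $R^{kn}_{i'}$ link by link from the $\gamma$ variables: at every visited node there is exactly one used outgoing link, and the walk stops when the destination $i'$ is reached without revisiting any node.

```python
# Trace a routing path from gamma values for one flow and destination,
# following the unique used outgoing link at each node (no cycles allowed).

def extract_path(gamma, k, dest):
    """gamma: dict mapping directed links (u, v) -> 0/1 for one flow/dest."""
    path, node, seen = [k], k, {k}
    while node != dest:
        nxt = [v for (u, v), used in gamma.items() if u == node and used]
        assert len(nxt) == 1, "exactly one used outgoing link per node"
        node = nxt[0]
        assert node not in seen, "cycle detected: path would be invalid"
        seen.add(node)
        path.append(node)
    return path

# Hypothetical line topology: ingress k=1, destination edge node 3
gamma = {(1, 2): 1, (2, 3): 1, (2, 4): 0, (3, 4): 0}
print(extract_path(gamma, k=1, dest=3))  # prints [1, 2, 3]
```

The two assertions mirror the uniqueness of the used outgoing link implied by c) and the absence of cycles enforced by (32): no link leaves the destination, so the walk must terminate there.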
d) Based on c) and constraint (30), conditions (31) and (32) can be written as:
$$\sum_{j \in \mathcal{I}_k} \gamma^{kn,i}_{jk} = 1, \qquad \forall k, \forall n, \forall i \mid \alpha^{kn}_i > 0, \qquad (33)$$
$$\sum_{j \in \mathcal{I}_i} \gamma^{kn,i}_{ji} = 1, \qquad \forall k, \forall n, \forall i \mid \alpha^{kn}_i > 0, \qquad (34)$$
$$\sum_{j \in \mathcal{I}_i} \gamma^{kn,i'}_{ji} = \sum_{j \in \mathcal{O}_i} \gamma^{kn,i'}_{ij} \leq 1, \qquad \forall k, \forall n, \forall i, \forall i' \neq i. \qquad (35)$$
Their practical meaning is explained as follows:

• (33) ensures $(0, k)$ to be the first link in any routing path $R^{kn}_i$ if $\alpha^{kn}_i > 0$;
• (34) ensures $i$ to be the end node of the last link in any routing path $R^{kn}_i$ if $\alpha^{kn}_i > 0$;
• (35) ensures that if $i \in \mathcal{E} \setminus \{i'\}$ is an intermediate node in a routing path $R^{kn}_{i'}$, then $i$ has exactly one input link and one output link; it also indicates the continuity of a traffic flow.

e) Given a non-empty routing path $R^{kn}_{i'}$ ($\alpha^{kn}_{i'} > 0$), check its validity by using the following conditions:

• Let $i = k$ in (35); then, based on (33), $\sum_{j \in \mathcal{O}_k} \gamma^{kn,i'}_{kj} = 1$;
• Assume $(k, j')$ is a link of $R^{kn}_{i'}$; then $\gamma^{kn,i'}_{kj'} = 1$;
• If $j' = i'$, the path is found; otherwise, continue with the following steps:
• Let $i = j'$ in (35); since $\gamma^{kn,i'}_{kj'} = 1$, $\sum_{j \in \mathcal{O}_{j'}} \gamma^{kn,i'}_{j'j} = 1$;
• Assume $(j', j'')$ is a link of $R^{kn}_{i'}$; then $\gamma^{kn,i'}_{j'j''} = 1$;
• Check $j'' = i'$ in the same way as in the above steps; the whole path $k \to i'$ must eventually be found.

Thus, if all the conditions are satisfied, $R^{kn}_{i'}$ must be a valid routing path having the three properties (unsplittability, traffic continuity, absence of cycles).

A.4 Final Reformulated Problem
Based on the reformulation of routing and the demonstrations in the above subsections, the flow conservation constraints can be further improved and the flow variable $f^{kn}_{ij}$ can be eliminated as follows:
$$\sum_{j \in \mathcal{I}_i} \gamma^{kn,i}_{ji} = b^{kn}_i, \qquad \forall k, \forall n, \forall i, \qquad (36)$$
$$\sum_{j \in \mathcal{I}_i} \gamma^{kn,i'}_{ji} = \sum_{j \in \mathcal{O}_i} \gamma^{kn,i'}_{ij}, \qquad \forall k, \forall n, \forall i, \forall i' \neq i. \qquad (37)$$
Equation (19) contains a maximization; to remove it, we use a standard technique, introducing the variable $T^n = \max_{k \in \mathcal{K}} \{t^{kn}_W + t^{kn}_{PL}\}$ and linearizing it as $T^n \geq t^{kn}_W + t^{kn}_{PL},\ \forall k, \forall n$ (in Section A.1, a similar transformation has been performed on $t^{kn}_{PL}$, see Eq. (17)). Since the arguments of the two maximizations are independent, based on the reformulation of the processing latency, equation (18) can be transformed as:
$$t^{kn}_W + {t^{kn,i}_P}' + \sum_{l \in \mathcal{L}} g^{kn,i}_l \leq T^n \leq \tau^n, \qquad \forall k, \forall n, \forall i. \qquad (38)$$
Finally, the equivalent reformulation of P can be written as:
$$\textbf{P1:} \quad \min_{c^{kn},\, b^{kn}_i,\, \alpha^{kn}_i,\, \beta^{kn}_i,\, \delta^a_i,\, \gamma^{kn,i}_l} \ \sum_{n \in \mathcal{N}} T^n + w \sum_{i \in \mathcal{E}} \kappa_i S_i,$$
$$\text{s.t.} \quad (1), (2), (3), (4), (9), (10), (11), (20), (21), (22), (23), (24), (25), (26), (28), (30), (31), (32), (36), (37), (38).$$
In problem P1, $c^{kn}$, $b^{kn}_i$, $\alpha^{kn}_i$, $\beta^{kn}_i$, $\delta^a_i$ and $\gamma^{kn,i}_l$ are the main decision variables, while other auxiliary variables like $T^n$, $S_i$, $h^{kn,i}_l$, $v_l$, etc. are not shown here for simplicity. All the variables are bounded. Since constraints (9), (23) and (25) are quadratic while the others are linear, P1