Joint Planning of Network Slicing and Mobile Edge Computing in 5G Networks
Bin Xiang, Jocelyne Elias, Fabio Martignon, Elisabetta Di Nitto
Abstract—Multi-access Edge Computing (MEC) facilitates the deployment of critical applications with stringent QoS requirements, latency in particular. Our paper considers the problem of jointly planning the availability of computational resources at the edge, the slicing of mobile network and edge computation resources, and the routing of heterogeneous traffic types to the various slices. These aspects are intertwined and must be addressed together to provide the desired QoS to all mobile users and traffic types while still keeping costs under control. We formulate our problem as a mixed-integer nonlinear program (MINLP) and we define a heuristic, named Neighbor Exploration and Sequential Fixing (NESF), to facilitate the solution of the problem. The approach allows network operators to fine-tune the network operation cost and the total latency experienced by users. We evaluate the performance of the proposed model and heuristic against two natural greedy approaches. We show the impact of the variation of all the considered parameters (viz., different types of traffic, tolerable latency, network topology and bandwidth, computation and link capacity) on the defined model. Numerical results demonstrate that NESF is very effective, achieving near-optimal planning and resource allocation solutions in a very short computing time, even for large-scale network scenarios.
Index Terms—Edge computing, network planning, node placement, network slicing, joint allocation.
1 INTRODUCTION

The fifth-generation (5G) networks aim to meet different users' Quality of Service (QoS) requirements in several demanding application scenarios and use cases. Among them, controlling latency is certainly one of the key QoS requirements that mobile operators have to deal with. In fact, the classification devised by the International Telecommunication Union Radiocommunication Sector (ITU-R) shows that mission-critical services depend on strong latency constraints. For example, in some use cases (e.g., autonomous driving), the tolerable latency is expected to reach less than 1 ms [1].

To address such constraints, various ingredients are emerging. First of all, through Network Slicing, the physical network infrastructure can be split into several isolated logical networks, each dedicated to applications with specific latency requirements, thus enabling an efficient and dynamic use of network resources [2]. Second,
Multi-access Edge Computing (MEC) provides an IT service environment and cloud-computing capabilities at the edge of the mobile network, within the Radio Access Network and in close proximity to mobile subscribers [3]. Through this approach, the latency experienced by mobile users can be considerably reduced. However, the computation power that can be offered by an edge cloud is quite limited in comparison with a remote cloud. Considering that 5G networks will likely be built in an ultra-dense manner, the edge clouds attached to 5G base stations will also be massively deployed and connected to each other in a specific topology. In this way, cooperation among multiple edge clouds provides a solution for the problem of limited computation resources on a single MEC unit.

In this line, we study the case of a complex network organized in multiple edge clouds, each of which may be connected to the Radio Access Network of a certain location. All such edge clouds are connected through an arbitrary topology. This way, each edge cloud can serve end-user traffic by relying not only on its own resources, but also by offloading some traffic to its neighbors when needed. We specifically consider multiple classes of traffic and corresponding requirements, including voice, video and web, among others.

• B. Xiang and E. Di Nitto are with the Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy, 20133. E-mail: {bin.xiang, elisabetta.dinitto}@polimi.it.
• J. Elias is with the Department of Computer Science and Engineering (DISI), University of Bologna, Bologna, Italy, 40126. E-mail: [email protected].
• F. Martignon is with the Department of Management, Information and Production Engineering, University of Bergamo, Bergamo, Italy, 24044. E-mail: [email protected].
For every class of traffic incoming from the corresponding Radio Access Network, the edge cloud decides whether to serve it or offload it to some other edge cloud. This decision depends on the QoS requirements associated with the specific class of traffic and on the current status of the edge cloud.

Our main objective is to ensure that the infrastructure is able to serve all possible types of traffic within the boundaries of their QoS requirements and of the available resources. In this work we therefore propose a complete approach, named
Joint Planning and Slicing of mobile Network and edge Computation resources (JPSNC), which solves the problem of operating cost-efficient edge networks. The approach jointly takes into account the overall budget that the operator uses in order to allocate and operate computing capabilities in its edge network, and allocates resources, aiming at minimizing the network operation cost and the total traffic latency of transmitting, outsourcing and processing user traffic, under constraints of user-tolerable latency for each class of traffic. This turns out to be a mixed-integer nonlinear programming (MINLP) optimization problem, which is NP-hard [4]. To tackle this challenge, we transform it into an equivalent mixed-integer quadratically constrained programming (MIQCP) problem, which can be solved more efficiently through the Branch and Bound method. Based on this reformulation, we further propose an effective heuristic, named
Neighbor Exploration and Sequential Fixing (NESF), that permits obtaining near-optimal solutions in a very short computing time, even for the large-scale scenarios we considered in our numerical analysis. Furthermore, we propose two simple heuristics, based on a greedy approach, that provide benchmarks for our algorithms; they obtain (slightly) sub-optimal solutions with respect to NESF, while still being very fast. Finally, we systematically analyze and discuss, with a thorough numerical evaluation, the impact of all considered parameters (viz., the overall planning budget of the operator, different types of traffic, tolerable latency, network topology and bandwidth, computation and link capacity) on the optimal and approximate solutions obtained from our proposed model and heuristics. Numerical results demonstrate that our proposed model and heuristics can provide very efficient resource allocation and network planning solutions for multiple edge networks. This work takes root from a previous paper [5], where we focused exclusively on minimizing the latency of traffic in a hierarchical network, keeping the network and computation capacity fixed. In this paper, we have completely revised our optimization model to cope with a joint network planning, slicing and edge computing problem, aimed at minimizing both the total latency and the operation cost for arbitrary network topologies.

The remainder of this paper is organized as follows. Section 2 introduces the network system architecture we consider. Section 3 provides an intuitive overview of the proposed approach by using a simple example. Section 4 illustrates the proposed mathematical model and Section 5 the heuristics. Section 6 discusses numerical results in a set of typical network topologies and scenarios. Section 7 discusses related work. Finally, Section 8 concludes the paper.
2 SYSTEM ARCHITECTURE
Figure 1 illustrates our reference network architecture. Weconsider an edge network composed of
Edge Nodes. Each of such nodes can be equipped with any of the following three capabilities:
• the ability of acquiring traffic from mobile devices through the Remote Radio Head (RRH); such nodes are those we call Ingress Nodes;
• the ability of executing network- or application-level services requiring computational power; this is done thanks to the availability of an Edge Cloud on the node;
• the ability to route traffic to other nodes.
Not all nodes must have all three capabilities, so, in this respect, the edge network can be constituted of heterogeneous nodes.

Each link (i, j) between any two edge nodes, i and j, has a fixed bandwidth, denoted by B_ij. Each Ingress Node k has a specific ingress network capacity C_k, which is a measure of its ability to accept traffic incoming from mobile devices. Nodes able to perform some computation have a computation capacity S_i. One of the objectives of the planning model presented in this paper is to determine the optimal value of the computation capacity that must be made available at each node.

We assume that users' incoming data in each Ingress Node is aggregated according to the corresponding traffic type n ∈ N. Examples of traffic types can be video, game, data from sensors, and the like. In Figure 1, traffic of different types is shown as arrows of different colors. From each Ingress Node, traffic can be split and processed on all edge clouds in the network; the dashed arrows shown in the figure represent possible outsourcing paths of the traffic pieces from different Ingress Nodes. Different slices of the ingress network capacity C_k and the edge cloud computation capacity S_i can be allocated to serve the different types of traffic based on the corresponding Service Level Agreements (SLAs), which, in this paper, are focused on keeping latency under control.
Thus, another objective of our model is to find the allocation of traffic to edge clouds that allows us to minimize the total latency, which is expressed in terms of the latency at the ingress node, due to the limitations of the wireless network, plus the latency due to the traffic processing computation, plus the latency occurring in the communication links internal to the network system architecture.
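To make the model elements introduced above concrete (links with bandwidth B_ij, ingress capacities C_k, computation capacities S_i), the edge network can be represented in code. The following minimal Python sketch is ours, not an implementation from the paper; all names (EdgeNetwork, add_link) and the sample values are illustrative.

```python
# Illustrative sketch (ours, not the authors' code) of the edge network model
# of Section 2: directed links with bandwidth B_ij, ingress capacities C_k,
# and per-node computation capacities S_i (0 when no edge cloud is planned).
from collections import defaultdict

class EdgeNetwork:
    def __init__(self):
        self.bandwidth = {}                         # B_ij (Gb/s), keyed by directed link (i, j)
        self.ingress_capacity = {}                  # C_k (Gb/s), only for ingress nodes
        self.compute_capacity = defaultdict(float)  # S_i (Gb/s), defaults to 0
        self.neighbors = defaultdict(set)

    def add_link(self, i, j, b_gbps):
        """Add a directed link (i, j) with fixed bandwidth B_ij."""
        self.bandwidth[(i, j)] = b_gbps
        self.neighbors[i].add(j)

net = EdgeNetwork()
net.add_link("n1", "n3", 100.0)
net.add_link("n3", "n1", 100.0)
net.ingress_capacity["n1"] = 50.0   # C_n1, as in the Section 3 toy example
net.compute_capacity["n3"] = 40.0   # an edge cloud planned at an intermediate level

print(net.bandwidth[("n1", "n3")])  # → 100.0
```

Nodes absent from `compute_capacity` implicitly have S_i = 0, mirroring the fact that not every edge node needs to host an Edge Cloud.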
Fig. 1: Network system architecture.

We assume that the edge network is controlled by a management component which is in charge of achieving the optimal utilization of its resources, in terms of network and computation, while still guaranteeing the SLA associated with each traffic type accepted by the network. This component monitors the network by periodically computing the network capacity of each ingress node (through broadcast messages exchanged in the network) and the bandwidth of each link in the network topology. Moreover, it knows the maximum available computation capacity of all computation nodes. With these pieces of information as input, and knowing the SLA associated with each traffic type, the management component periodically solves an optimization problem that provides as output the identification of a proper network configuration and traffic allocation. In particular, it will identify: i) the amount of computational capacity to be assigned to each node so that, with the foreseen traffic, the
node usage remains below a certain level of its capacity; ii) which node is taking care of which traffic type; and iii) the nodes through which each traffic type must be routed toward its destination.

(a) Minimizing both latency and computation costs. (b) With the same settings, but with λ_n2,t2 increased to 40 Gb/s.

Fig. 2: Toy example for a network with 10 nodes and 20 edges (average degree: 4.0).

For simplicity, the optimization problem is based on the assumption that the system is time-slotted, i.e., time is divided into equal-length short slots (short periods where network parameters can be considered as fixed and traffic shows only small variations). We observe that our proposed heuristic (NESF) exhibits a short computing time, so that it is feasible to run the problem periodically and to adjust the configuration of the network system based on the actual evolution of the traffic.

In the next section, we give an intuition of the solution applied by the management component in the case of a simple network, while in Section 4 we formalize the optimization problem and in Section 5 we present some heuristics that make the problem tractable in realistic cases.
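The time-slotted operation of the management component can be sketched as a simple control loop. The sketch below is ours: `monitor`, `solve_planning_problem` and `apply_configuration` are hypothetical placeholders for the monitoring, optimization and reconfiguration mechanisms described in the text, not a real API.

```python
# Hedged sketch (ours) of the management component's time-slotted loop:
# once per slot, observe the network state, re-solve the joint planning and
# allocation problem, and push the resulting configuration.

def run_management_loop(slots, monitor, solve_planning_problem, apply_configuration):
    """Run one monitor/solve/apply cycle per time slot."""
    for t in range(slots):
        state = monitor(t)                      # C_k, B_ij, max S_i, traffic estimates
        config = solve_planning_problem(state)  # e.g., via the NESF heuristic (Section 5)
        apply_configuration(config)

# Minimal dry run with stub callbacks:
log = []
run_management_loop(
    slots=3,
    monitor=lambda t: {"slot": t},
    solve_planning_problem=lambda s: {"slot": s["slot"], "plan": "unchanged"},
    apply_configuration=log.append,
)
print(len(log))  # → 3
```

The feasibility of such a loop rests on the solver finishing well within one slot, which is precisely what motivates the short computing time of NESF.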
3 OVERVIEW OF PLANNING AND ALLOCATION
In this section we refer to a simple but still meaningful edge network and we show how the management component behaves in the presence of two types of traffic. The example we consider is shown in Figure 2 and consists of 10 nodes connected together with an average degree of 4, two of which are ingress nodes (labeled as n1 and n2 in the figure and colored in orange). For simplicity, we assume that the bandwidth of all links is B_l = 100 Gb/s, and that the wireless network capacity of the two ingress nodes is, respectively, C_n1 = 50 Gb/s and C_n2 = 60 Gb/s. Every node in the network has a computation capacity that can take one of the following values: 0 Gb/s (i.e., no computation capacity is made available at the current time), D1 = 30 Gb/s, D2 = 40 Gb/s, and D3 = 50 Gb/s. Given the above edge network, let us assume the management component estimates that node n1 will receive traffic of type t1 at rate λ_n1,t1 = 25 Gb/s and type t2 at rate λ_n1,t2 = 20 Gb/s, while node n2 will receive the two types of traffic at rates λ_n2,t1 = 15 Gb/s and λ_n2,t2 = 35 Gb/s, respectively. Finally, let us assume that the network operator has set an upper bound on the power budget to be used (i.e., the total amount of computational power) P = 300 Gb/s and has defined in its SLA a tolerable latency for the two types of traffic, set respectively to τ_t1 = 1 ms and τ_t2 = 2 ms. (Note that computation capacity is often expressed in cycles/s. As discussed in Section 6, for homogeneity with the other values, we have transformed it into Gb/s.)

In this case, the computed optimal configuration is shown in Figure 2(a). The management component will assign at ingress node n1 a wireless network capacity slice to t1 and one to t2, and will do the same at ingress node n2. Moreover, it will assign a computation capacity level to three of the nodes, while it will switch off the computation capacity of the other nodes. This leads to a total computation capacity which is well below the available computation capacity budget P. Given that t1 is the traffic type with the most demanding constraint in terms of latency, the management component decides to use the full capacity of n1 to process traffic t1 from n1. Applying the same strategy within node n2 would result in a waste of resources, because the t1 traffic of n2 would take only a small share of the available computation capacity, and the remaining one would not be sufficient to handle the expected total amount of t2 traffic. Since moving the t2 traffic by one hop would still allow the system to fulfill the SLA, the decision is then to configure the network to route such traffic to an intermediate node. The reason for choosing that node is mainly that it is one of the nearest neighbors of both n1 and n2 (with 2 hops to n1 and 1 hop to n2) and that, with its capacity, it can handle the t2 traffic offloaded by both ingress nodes; its computation capacity is sliced accordingly between the two traffic pieces. The t1 traffic from n2 is, instead, processed locally at n2 itself.

Let us now assume that the management component observes a change in the λ_n2,t2 traffic rate, which increases to λ_n2,t2 = 40 Gb/s. Based on this, the management component runs again the optimization algorithm, which will output the configuration illustrated in Figure 2(b).
The slicing of the wireless network capacity for ingress node n1 does not vary, while for ingress node n2 a slice is assigned to t2 and, as a consequence, a slice smaller than before is assigned to t1. Moreover, computation capacity is allocated to n1, which processes t1 locally, and to the neighbor node that handles the t2 traffic from n1; likewise, capacity is allocated to n2, to process t1 locally, and to the neighbor node that processes the t2 traffic incoming from n2. Both ingress nodes select the nearest 1-hop neighbor to offload the traffic. Notice that, by manually analyzing the initial configuration of Figure 2(a), we might think that a better solution would be to simply increase the computation capacity of the node shared in Figure 2(a), as in this way the network remains almost the same as before and the total computation capacity is smaller than that of Figure 2(b). However, a more in-depth analysis shows that, even if this solution is certainly feasible, it performs worse than the one of Figure 2(b) in terms of total latency. The main reason is that traffic t2 from node n2 suffers a larger latency in the wireless ingress network due to a smaller allocated slice and, in the scenario where both n1 and n2 rely on the same node for offloading some traffic, it also suffers a relatively high latency due to the traffic computation on that node. This second component of the latency is reduced in the case of Figure 2(b), where traffic t2 from node n2 has the computation capacity of its serving node entirely dedicated to it. Thus, the total latency for t2 is smaller in the case of Figure 2(b) than in the other case. In Section 4 we show how such values are computed and, in general, the optimization model that computes the optimal allocation of computational and network resources as well as the optimal routing paths.

4 PROBLEM FORMULATION
In this section we provide the mathematical formulation of our
Joint Planning and Slicing of mobile Network and edge Computation resources (JPSNC) model. Table 1 summarizes the notation used throughout this section. For brevity, we simplify the expression ∀n ∈ N as ∀n, and apply the same rule to other set symbols like E, K, L, etc. throughout the rest of this paper, unless otherwise specified.

TABLE 1: Summary of used notations.

Parameters | Definition
N      | Set of traffic types
E      | Set of edge nodes in the edge networks
K      | Set of ingress nodes, where K ⊆ E
L      | Set of directed links in the networks
B_ij   | Bandwidth of the link from node i to j, where (i, j) ∈ L
C_k    | Network capacity of ingress edge node k ∈ K
D_a    | Levels of computation capacities (a ∈ A = {1, 2, ...})
P      | Planning budget of computation capacity
λ_kn   | User traffic rate of type n in ingress node k
τ_n    | Tolerable delay for serving the total traffic of type n
κ_i    | Cost of using one unit of computation capacity on node i
w      | Weight to balance between total latency and operation cost

Variables | Definition
c_kn   | Slice of the network capacity for traffic kn
b_kni  | Whether traffic kn is processed on node i or not
α_kni  | Percentage of traffic kn processed on node i
β_kni  | Percentage of i's computation capacity sliced to traffic kn
δ_ai   | Decision for planning computation capacity on node i
R_kni  | Set of links for routing the traffic piece α_kni from k to i

The goal of our formulation is to minimize a weighted sum of the total latency and network operation cost for serving several types of user traffic, under the constraints of users' maximum tolerable latency and network planning budget. This allows the network operator to fine-tune its needs in terms of quality of service provided to its users and cost of the planned network. Different types of traffic, with heterogeneous requirements, need to be accommodated, and may enter the network from different ingress nodes. In the following, we first focus on the network planning issue and its related cost, as well as on the traffic routing issue; we then detail all components that contribute to the overall latency experienced by users, which we capture in our model.
Network Planning: We assume that, in each edge node, some processing capacity can be made available, thus enabling MEC capabilities. This results in an operation cost that increases with the amount of processing capacity. To model real network scenarios more closely, we assume that only a discrete set of capacity values can be chosen by the network operator and made available. Therefore, we adopt a piecewise-constant function S_i for the processing capacity of an edge node, in line with [6]. This is defined as:

S_i = Σ_{a∈A} δ_ai D_a,   ∀i,   (1)

where D_a is a capacity level (a ∈ A) and δ_ai ∈ {0, 1} is a binary decision variable for capacity planning, satisfying the following constraint (only one level of capacity can be made available on a node, including zero, i.e., no processing capability):

Σ_{a∈A} δ_ai = 1 − δ_0i,   ∀i,   (2)

where δ_0i is a binary variable that indicates whether node i is left without computation power. This constraint implies that S_i can be set either to 0 (no computation power) or to exactly one capacity level D_a.

To save on operation costs, in case an edge node is not supposed to be exploited to process some traffic, no processing capacity is made available on it. We introduce the binary variable b_kni to indicate whether traffic kn is processed on node i (for brevity, we will use the expression "traffic kn" in the following to indicate the user traffic of type n from ingress point k). Then the following constraint should be satisfied:

b_kni ⩽ 1 − δ_0i ⩽ Σ_{k′∈K} Σ_{n′∈N} b_k′n′i,   ∀k, ∀n, ∀i.   (3)

We also consider a total planning budget, P, for the available computation capacity, introducing the following constraint:

Σ_{i∈E} S_i ⩽ P.   (4)
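Constraints (1)-(4) can be illustrated with a small numeric check. The sketch below is ours (all function names are hypothetical); it uses the capacity levels of the Section 3 toy example and encodes "exactly one level or none" by mapping each node to a chosen level or to None.

```python
# Illustrative check (ours, not the authors' code) of constraints (1)-(4):
# each node either receives exactly one capacity level D_a or none at all,
# and the total planned capacity must stay within the budget P.

D = {1: 30.0, 2: 40.0, 3: 50.0}   # capacity levels D_a (Gb/s), as in Section 3
P = 300.0                          # planning budget (Gb/s)

def planned_capacity(level_choice):
    """level_choice maps node -> chosen level a, or None (no computation).
    Returns S_i per node, i.e. S_i = sum_a delta_ai * D_a under the
    one-level-per-node rule of constraints (1)-(2)."""
    return {i: (D[a] if a is not None else 0.0) for i, a in level_choice.items()}

def within_budget(level_choice):
    """Constraint (4): sum_i S_i <= P."""
    return sum(planned_capacity(level_choice).values()) <= P

choice = {"n1": 3, "n2": 2, "n3": 3, "n4": None}
S = planned_capacity(choice)
print(S["n1"], within_budget(choice))  # → 50.0 True
```

Representing the choice as "level or None" sidesteps the binary δ variables while preserving their meaning: a node with None corresponds to δ_0i = 1.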
Then, the total operation cost can be expressed as:

J = Σ_{i∈E} κ_i S_i,   (5)

where κ_i is the cost of using one unit of computation capacity (in the example of Section 3, this is expressed in Gb/s) on node i.

Network Routing: We assume that each type of traffic can be split into multiple pieces only at its ingress node. Each piece can then be offloaded to another edge computing node independently of the other pieces, but it cannot be further split (we say that each piece is unsplittable). Each link l ∈ L may carry different traffic pieces α_kni (we denote by α_kni the percentage of traffic kn processed at node i, and by β_kni the percentage of computation capacity S_i sliced for traffic kn). Then, the flow of traffic kn on l, f_knl, can be expressed as the sum of all pieces of traffic that pass through such link:

f_knl = Σ_{i∈E: l∈R_kni} α_kni,   ∀k, ∀n, ∀l,   (6)

where R_kni ⊂ L denotes a routing path (set of traversed links) for the traffic piece α_kni λ_kn from ingress k to node i. The following constraint ensures that the total traffic on each link does not exceed its capacity:

B_ij > Σ_{k∈K} Σ_{n∈N} f_knij λ_kn,   ∀(i, j) ∈ L.   (7)

The traffic flow conservation constraint is enforced as follows:

Σ_{j∈I_i} f_knji − Σ_{j∈O_i} f_knij = α_kni − 1 if i = k, and α_kni otherwise,   ∀k, ∀n, ∀i,   (8)

where I_i = {j ∈ E | (j, i) ∈ L} and O_i = {j ∈ E | (i, j) ∈ L} are the sets of nodes connected by the incoming and outgoing links of node i, respectively. The fulfillment of this constraint guarantees the continuity of the routing path. Moreover, the routing path R_kni should be acyclic.

The latency in each ingress edge node is modeled as the sum of the wireless network latency and the outsourcing latency which, in turn, is composed of the processing latency in some edge cloud and the link latency between edge clouds.
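The flow conservation constraint (8) can be verified numerically on a toy routing. The sketch below is ours, with hypothetical names; flows are expressed, as in the model, as fractions of the (normalized) traffic kn.

```python
# Sanity check (our sketch) of flow conservation, eq. (8): at the ingress k
# the net inflow equals alpha_kni - 1, at any other node it equals alpha_kni,
# where alpha_kni is the fraction of traffic kn processed at node i.

def conservation_residual(node, k, alpha, flow_in, flow_out):
    """Left-hand side minus right-hand side of (8) for one node; ~0 if satisfied."""
    rhs = alpha.get(node, 0.0) - (1.0 if node == k else 0.0)
    return (flow_in.get(node, 0.0) - flow_out.get(node, 0.0)) - rhs

# Toy instance: ingress n1 keeps 40% of traffic kn and offloads 60% to n3.
alpha    = {"n1": 0.4, "n3": 0.6}
flow_in  = {"n3": 0.6}   # flow f on link (n1, n3)
flow_out = {"n1": 0.6}

ok = all(abs(conservation_residual(n, "n1", alpha, flow_in, flow_out)) < 1e-9
         for n in ("n1", "n3"))
print(ok)  # → True
```

A non-zero residual at any node would reveal a routing that either loses or duplicates part of a traffic piece, which constraint (8) forbids.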
Wireless Network Latency: We model the transmission of traffic at each user ingress point as an M|M|1 processing queue. The wireless network latency for transmitting the user traffic of type n from ingress point k, denoted by t_W^kn, can therefore be expressed as:

t_W^kn = 1 / (c_kn − λ_kn),   ∀k, ∀n,   (9)

where c_kn is the capacity of the network slice allocated for traffic kn in the ingress edge network (a decision variable in our model) and λ_kn is the traffic rate. The following constraints ensure that the capacity of all slices does not exceed the total capacity C_k of each ingress edge node, and that c_kn is higher than the corresponding λ_kn value:

Σ_{n∈N} c_kn ⩽ C_k,   ∀k,   (10)
λ_kn < c_kn,   ∀k, ∀n.   (11)

Processing Latency: We assume that each type of traffic can be segmented and processed on different edge clouds, and each edge cloud can slice its computation capacity to serve different types of traffic from different ingress nodes. As introduced before, we indicate with α_kni the percentage of traffic kn processed at node i, and with β_kni the percentage of computation capacity S_i sliced for traffic kn. The processing of user traffic is described by an M|M|1 model. Let t_P^kn,i denote the processing latency of edge cloud i for traffic kn. Then, based on the computational capacity β_kni S_i sliced for traffic kn, with an amount α_kni λ_kn to be served, ∀k, ∀n, ∀i, t_P^kn,i is expressed as:

t_P^kn,i = 1 / (β_kni S_i − α_kni λ_kn) if α_kni > 0, and 0 otherwise.   (12)

In the above equation, when traffic kn is not processed on edge cloud i, the corresponding value is 0; at the same time, no computation resource of i should be sliced to traffic kn (i.e., β_kni = 0). The corresponding constraint is written as:

α_kni λ_kn < β_kni S_i if α_kni > 0; α_kni = β_kni = 0 otherwise.   (13)

α_kni and β_kni also have to fulfill the following consistency constraints:

Σ_{i∈E} α_kni = 1,   ∀k, ∀n,   (14)
Σ_{k∈K} Σ_{n∈N} β_kni ⩽ 1,   ∀i.   (15)
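Equations (9) and (12) are standard M|M|1 waiting-time expressions. The following small Python sketch (ours; function names are hypothetical) illustrates how the latency reacts to the slice sizes, in the model's time unit.

```python
# Numerical sketch (ours) of the M|M|1 latency terms (9) and (12): a slice
# must be strictly larger than the load it serves (constraints (11)/(13)),
# and the latency grows sharply as the slice fills up.

def wireless_latency(c_kn, lam_kn):
    """t_W^kn = 1 / (c_kn - lambda_kn), valid only when lambda_kn < c_kn."""
    if lam_kn >= c_kn:
        raise ValueError("slice too small: constraint (11) violated")
    return 1.0 / (c_kn - lam_kn)

def processing_latency(beta, S_i, alpha, lam_kn):
    """t_P^kn,i per (12): 0 if no piece of traffic kn is processed at i."""
    if alpha <= 0.0:
        return 0.0
    return 1.0 / (beta * S_i - alpha * lam_kn)

# A slice of 30 Gb/s serving 25 Gb/s gives t_W = 1/5 (model time units);
# widening the slice to 35 Gb/s halves the wireless latency.
print(wireless_latency(30.0, 25.0), wireless_latency(35.0, 25.0))  # → 0.2 0.1
```

The same headroom effect drives the toy example of Section 3: shrinking the slice for t1 at n2 when t2 grows is exactly what inflates the first latency component there.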
Link Latency: Let t_L^kn,i denote the link latency for routing traffic kn to node i. In each ingress node, the incoming traffic is routed in a multi-path way, i.e., different types or pieces of the traffic may be dispatched to different nodes via different paths. ∀k, ∀n, ∀i, t_L^kn,i is defined as:

t_L^kn,i = Σ_{l∈R_kni} 1 / (B_l − Σ_{k′∈K} Σ_{n′∈N} f_k′n′l λ_k′n′) if α_kni > 0 and i ≠ k, and 0 otherwise.   (16)

Recall that R_kni is a routing path for the traffic piece α_kni λ_kn from ingress k to node i. The link latency is accounted for only if a certain traffic piece is processed on node i (i.e., α_kni > 0) and i ≠ k.

Total Latency: We can now define the outsourcing latency for traffic kn, which depends on the longest serving time among edge clouds:

t_PL^kn = max_{i∈E} { t_P^kn,i + t_L^kn,i },   ∀k, ∀n.   (17)

The latency experienced by each type of traffic coming from the ingress nodes can therefore be defined as t_W^kn + t_PL^kn, and it should also respect the tolerable latency requirement:

t_W^kn + t_PL^kn ⩽ τ_n,   ∀k, ∀n.   (18)

For each traffic type n, we consider the maximum value among different ingress nodes with respect to the wireless network latency and outsourcing latency, i.e., max_{k∈K} { t_W^kn + t_PL^kn }. Then, we define the total latency as follows:

T = Σ_{n∈N} max_{k∈K} { t_W^kn + t_PL^kn }.   (19)

Our goal in the
Joint Planning and Slicing of mobile Network and edge Computation resources (JPSNC) problem is to minimize the total latency and the operation cost, under the constraints of the maximum tolerable delay for each traffic type coming from ingress nodes and of the total planning budget for making available processing-capable nodes:

P:   min_{c_kn, b_kni, α_kni, β_kni, δ_ai, R_kni}  T + wJ,   s.t. (1)–(19),

where w ⩾ 0 is a weight that permits setting the desired balance between the total latency and the operation cost. Problem P contains both nonlinear and indicator constraints; therefore, it is a mixed-integer nonlinear programming (MINLP) problem, which is hard to solve directly [4], as discussed in Section 4.4.

Problem P, formulated in Section 4, cannot be solved directly and efficiently for the following reasons:
• We aim at identifying the optimal routing (the routing path R_kni is a variable in our model, since many paths may exist from each ingress node k to a generic node i in the network); furthermore, we must ensure that such routing is acyclic and guarantees continuity and unsplittability of traffic pieces.
• Variables R_kni and α_kni are reciprocally dependent: to find the optimal routing, the percentage of traffic processed at each node i should be known; at the same time, to solve the optimal traffic allocation, the routing path should be known.
• The processing latency, defined in the previous sections, depends on three decision variables in our model, and the corresponding formula (12) is (highly) nonlinear.
• P contains indicator functions and constraints, e.g., (12), (13) and (16), which cannot be directly and easily processed by most solvers.

To deal with the above issues, we propose an equivalent reformulation of Problem P, which can be solved very efficiently with the Branch and Bound method. Moreover, the reformulated problem can be further relaxed; based on that, we propose in the next section a heuristic algorithm which can obtain near-optimal solutions in a shorter computing time. More specifically, in the reformulated problem we first recast the processing latency and link latency constraints (viz., constraints (12) and (16)), dealing at the same time with the computation planning problem. Then, we handle the difficulties related to the variables R_kni and the corresponding routing constraints. Appendix A contains all details about the problem reformulation. Since some constraints are quadratic while the others are linear, the reformulated problem is a mixed-integer quadratically constrained programming (MIQCP) problem, for which commercial and freely available solvers can be used, as we will illustrate in the numerical evaluation section.

5 HEURISTICS
Hereafter, we illustrate our proposed heuristic, named
Neighbor Exploration and Sequential Fixing (NESF), which proceeds by exploring and utilizing the neighbors of each ingress node for hosting (a part of) the traffic along an objective descent direction, that is, by trying to minimize the objective function (which, we recall, is a weighted sum of the total latency and the operation cost). During each step where we explore potential candidates for computation offloading, we partially fix the main binary decision variables in the reformulated problem and then solve the so-reduced problem using the Branch and Bound method. Our exploration strategy provides excellent results in practice, achieving near-optimal solutions in many network scenarios, as we will illustrate in the Numerical Results section.

The detailed exploration strategy is illustrated in Figure 3, which shows three typical variation paths of the objective function value versus the number of computing nodes made available in the network (note that these three trends are independent from each other, in the sense that any of them, or a combination of them, can be experienced in
a given network instance).

Fig. 3: Three typical variations of the objective function value versus the number of computing nodes made available.

Point A represents the stage where a minimum required number of computing nodes (x_A) is opened to ensure the feasibility of the problem. For instance, if the ingress nodes can host all the traffic under all the constraints, x_A = |K|. Point E indicates the maximum number of computing nodes that can be made available in the network; any point above x_E would violate the computation budget or the tolerable latency constraints.

During the search phase of our heuristic, which is executed in Algorithms 1 and 3, detailed hereafter, we first try to obtain (or get as close as possible to) point A and the corresponding objective value y_A. If A cannot be found within the computation budget, the problem is infeasible. Otherwise, we continue to explore computation candidates among the h-hop neighbors of each ingress node, and allocate them to serve different types of traffic. The objective value is obtained by solving the reformulated problem with new configurations of the decision variables. The change of the objective value may hence exhibit one of the three patterns (I, II and III) illustrated in Figure 3.

The objective value increases monotonically in path I. In path II, it first decreases to point C and then increases to point E; finally, path III shows a more complex pattern, with one local maximum point B and one minimum point D. In case I, the network system has just enough computation power to serve the traffic. Hence, adding more computation capacity to the system is not guaranteed to decrease the delay, while, on the other hand, it will increase computation costs. In case II, a few ingress nodes in the system may support a relatively high traffic load. Equipping some of their neighbors with more computation capabilities (with a total number of nodes less than x_C) can still decrease the total system costs.
After point C, the objective value shows a similar trend to case I. In case III, several ingress nodes may serve a high traffic load. At the beginning, adding some computing nodes (with total amount less than x_B) may not be enough to decrease the delay costs to a certain degree, and this will also increase the total installation costs. After point B, the objective value varies like in case II and has a minimum at point D. To summarize, our heuristic aims at reaching the minimum points A (I), C (II) and D (III) in Figure 3, and its flowchart is shown in Figure 4. The main idea behind Algorithm 1 is to check whether the ingress nodes can host all the traffic without activating additional MEC units,
thus saving some computation cost. Algorithm 2 aims at searching the h-hop neighbors of each ingress node for making them process part of the traffic (the outsourced traffic), while Algorithm 3 aims at setting up the allocation plan for the outsourced traffic and tries to solve P to obtain the best solution. The three proposed algorithms are described in detail in the following subsections. The definition of the new notation introduced in these algorithms is summarized for clarity in Table 2.

Fig. 4: Flowchart of our NESF heuristic.

In Algorithm 1, the main idea is to check whether ingress nodes can host all the traffic without using other MEC units, in order to save both computation cost and latency. To this end, we first individuate the subset of ingress nodes (denoted as K_u) that cannot host all the traffic that enters the network through them. This is done by checking whether S_ek (= D_m − Σ_{n∈N} λ_kn) ≤ 0 (lines 1-2), that is, whether some computing capacity is still available at the ingress nodes (recall that D_m is the maximum computation capacity that can be made available). Then, if K_u ≠ ∅, for each k ∈ K_u we try to find the set of its neighbor ingress nodes k′ ∈ [(K − K_u) ∩ (∪_{h=1}^{H} G_hk)] that can cover S_ek (i.e., S_ek′ + S_ek > 0), where G_hk ⊂ E is the set of node k's h-hop neighbor nodes (h = 1, …, H). If found, they are stored as candidates in a list, Q_k, ordered by increasing distance (hop count) from k (lines 3-7). If K_u = ∅, or sufficient nodes in Q_k have been found to process the extra traffic from K_u (line 9), then for each k ∈ K_u the corresponding traffic is allocated to nodes in Q_k starting
TABLE 2: Notations used in the algorithms.
Notation   Definition
S_ek       Estimated available computation of ingress node k ∈ K
K_u        Ingress nodes that cannot host all traffic (S_ek ≤ 0)
H          Maximum searching depth of our heuristic
G_hk       h-hop neighbors (h ≤ H) of ingress node k ∈ K
Q_k        Candidates for computing traffic from ingress node k ∈ K
S_ok       Overall computation of ingress node k ∈ K
S_li       Maximum left computation of node i ∈ E
K_bi       Ingress nodes that booked computation from node i ∈ E
d_ik       Count of hops from node i to ingress node k ∈ K
O_P        Objective function value of problem P

from the top (choosing the closest ones) and repeatedly (covering all the traffic types), beginning with the less latency-tolerant traffic and moving to the more latency-tolerant one. This is implemented by setting the corresponding variables b_kni, δ_ai and γ_kn,il in P, to save the costs and also accelerate the algorithm. Finally, P with the fixed variables is solved by using the Branch and Bound method to obtain the solution (lines 10-11). If P is feasible with these settings, the objective value O_P is stored to be used in the next searching and resource allocation phases of Algorithm 3.

Algorithm 1
Attempt of serving traffic with ingress nodes only
1:  S_ek = D_m − Σ_{n∈N} λ_kn, ∀k ∈ K;
2:  K_u = {k ∈ K | S_ek ≤ 0};
3:  Compute k's h-hop neighbors G_hk, h ≤ H, ∀k ∈ K;
4:  Q_k = {k}, ∀k ∈ K; O_t = −1;
5:  for k ∈ K_u do
6:    X = {k′ ∈ [(K − K_u) ∩ (∪_{h=1}^{H} G_hk)] | S_ek′ + S_ek > 0};
7:    Q_k = Q_k ∪ X; rank Q_k by increasing hop count to k;
8:  Rank N as N_k by descending (λ_kn, τ_n), ∀k ∈ K;
9:  if K_u = ∅ or ∧_{k∈K_u} (|Q_k| > 1) then
10:   Allocate Q_k to N_k in order and repeatedly, ∀k ∈ K;
11:   Solve P by B&B to obtain obj. fct. value O_P; if O_P > 0 then O_t = O_P;

This section describes Algorithm 2, upon which Algorithm 3 is based to provide the final solution. Algorithm 2 proceeds as follows. We first assign a rank (or a priority value) to each ingress node, taking into account the amount of incoming traffic and the computation capacity. Then, we handle the outsourced traffic offloading task (i.e., we choose the best subset of computation nodes), starting from the ingress node with the highest priority. In more detail, set K_s is set K sorted by the ascending value of the tuple (S_ek, −λ_kn), i.e., the ingress node with the lowest estimated available (left) computation S_ek and the higher amount of traffic of type n has the highest rank/priority in our Algorithm 2, where n represents the traffic type having the maximum tolerable latency (lines 1-2).
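The ranking step just described can be sketched in a few lines; the fragment below is our own illustrative Python (not the authors' code), where `S_e`, `lam` and `tau` are assumed dictionaries holding S_ek, λ_kn and τ_n, respectively:

```python
def rank_ingress(S_e, lam, tau):
    """Sort ingress nodes by ascending (S_ek, -lambda_kn), where n is the
    traffic type with the largest tolerable latency tau_n: nodes with the
    least spare computation and the most traffic come first."""
    n = max(tau, key=tau.get)  # most latency-tolerant traffic type
    return sorted(S_e, key=lambda k: (S_e[k], -lam[k][n]))

# Two overloaded ingress nodes (negative S_ek) and one with spare capacity:
# 'b' carries more type-1 traffic than 'a', so it gets the highest priority.
order = rank_ingress({'a': -3, 'b': -3, 'c': 1},
                     {'a': {1: 2}, 'b': {1: 5}, 'c': {1: 1}},
                     {1: 3.5})
```

Here `order` plays the role of K_s: the heuristic then scans this list front to back when deciding whose outsourced traffic to place first.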
The process of determining the best subset of computation nodes for processing the outsourced traffic of each ingress node is executed hop-by-hop, starting with ingress node k̂ = K_s(0), until any one of the following three conditions is satisfied: (1) the number of computation nodes opened for processing traffic exceeds the maximum budget ⌊P/min(D_a)⌋, or (2) all ingress nodes have been completely scanned (line 3), or (3) the algorithm could not further improve the solution (Algorithm 3, lines 8, 10). In the searching phase, we first try to identify the set of temporary candidate computation nodes B for ingress k̂ (B ⊆ (G_hk̂ − [K ∪ Q_k̂])), by checking whether the maximum available computation capacity S_li of i ∈ B could help k̂ to cover S_ek̂ (lines 4-7). S_li is computed as the difference between i's maximum installable computation capacity D_m and the total computation booked from i by the ingress nodes in K_bi ⊆ K, i.e., Σ_{k∈K_bi} S_ek, where K_bi is the set of ingress nodes that booked computation from node i. If B = ∅, we increase the number of hops h_k̂ for ingress k̂; when this exceeds H (we are done with k̂), we move to the next ingress node in the set K_s (lines 8-9). At this point, we rank B by descending values of the tuple (S_li, −d_ik : k ∈ K_s), where d_ik is the count of hops from node i to ingress node k ∈ K_s. The first computation node ı̂ is selected as the one to compute the traffic of k̂, and k̂ is added to the corresponding set K_bı̂. To make full use of computation node ı̂, we further let it help the other ingress nodes K_s \ {k̂}, if ı̂ is their neighbor within H hops and has sufficient computation budget (lines 10-13). Then, given such computation node ı̂ and for each ingress node k, we update the value of the overall computation S_ok, due to the full use of computation node ı̂ (line 14).
Hence, the ingress node k with the minimum support S_ok will be chosen as the next searching target, and Algorithm 2 continues as follows. The next searching target k̂ is set to the k ∈ K_s with the minimum S_ok value (lines 15-16). If S_ok̂ ≤ 0, the current computation configuration cannot host all the traffic; hence, the algorithm goes back to the while loop and continues with the next search. Otherwise, we set a flag skip := (S_ok̂ ≤ r·D_m), where r is a small constant. If skip is true, it indicates that k̂ has a high traffic load, and this may cause the processing latency to increase. This flag is used in Algorithm 3. In fact, this step implements the strategy of skipping point B to avoid remaining stuck at the local minimum (point A) in path III shown in Figure 3. Finally, based on Q_k, we run Algorithm 3 to obtain the objective value O_t and the corresponding solution.

Algorithm 2
Priority searching of computation candidates
1:  Rank ingress nodes as K_s by ascending (S_ek, −λ_kn);
2:  k̂ = K_s(0); h_k = 1, S_ok = S_ek (∀k ∈ K); K_bi = ∅ (∀i ∈ E);
3:  while |∪_{k∈K} Q_k| < ⌊P/min(D_a)⌋ and K_s ≠ ∅ do
4:    B = ∅;
5:    for i ∈ (G_hk̂ − [K ∪ Q_k̂]) do
6:      S_li = D_m + Σ_{k∈K_bi} S_ek;
7:      if S_li + S_ek̂ > 0 then B = B ∪ {i};
8:    if B = ∅ then
9:      h_k̂++; update K_s, k̂ when h_k̂ > H; continue;
10:   Rank B by descending (S_li, −d_ik : k ∈ K_s); ı̂ = B(0);
11:   Q_k̂ = Q_k̂ ∪ {ı̂}; K_bı̂ = K_bı̂ ∪ {k̂}; S_b = D_m;
12:   for k ∈ K_s \ {k̂}, if (ı̂ ∈ ∪_{h=1}^{H} G_hk) and (S_b > λ_k) do
13:     Q_k = Q_k ∪ {ı̂}; K_bı̂ = K_bı̂ ∪ {k}; S_b = S_b − λ_k;
14:   S_ok = S_ok + (D_m + Σ_{k′∈K_bı̂∩K_u−{k}} S_ek′), ∀k ∈ K_bı̂;
15:   k̂ = argmin_{k∈K_s} S_ok;
16:   if S_ok̂ ≤ 0 then continue;
17:   else skip := (S_ok̂ ≤ r·D_m);
18:   Run Algorithm 3 to obtain O_t;
19: Return O_t;

In Algorithm 3, we first relax problem P to P̃, replacing the binary variables b_kni, δ_ai and γ_kn,il with continuous ones. Given the set Q_k (built by Algorithm 2) of candidate computation nodes for processing the outsourced traffic of ingress node k, the goal is to allocate node k's different traffic types to the computation nodes in Q_k, starting with the traffic with the most stringent latency constraint. Unused computation nodes are turned off. These two steps (lines 1-2) provide partial guiding information and also accelerate the solution of the relaxed problem, thus obtaining quite fast the relaxed optimal values b̃_kni. If P̃ is infeasible (O_P̃ < 0), we check whether both the previous best solution exists (O_t > 0) and the algorithm does not skip. If yes, the search process breaks and returns O_t (line 10). Otherwise, the algorithm continues searching to avoid getting stuck in a local optimum point in path III (see Figure 3), as follows. If P̃ is feasible (line 3), the obtained b̃_kni value can be regarded as the probability of processing traffic kn at node i. Based on this, for each ingress k, we rank the candidates in descending order of the probabilities Σ_{n∈N} b̃_kni. Then we revert to the original problem P, set the upper bound for P if possible, allocate the candidates to host all types of traffic in order and repeatedly for each ingress node, and also turn off the unused nodes (lines 5-7). By solving P, we obtain the current solution and compare it with the previous best one (O_t). If the solution gets worse, the whole search process breaks and returns the recorded best result (line 8). Otherwise (if the solution is improving), the current solution is updated as the best one and the search continues.

Algorithm 3
Allocating resources and obtaining the solution
1:  Relax b_kni, δ_ai, γ_kn,il to continuous ones (P → P̃);
2:  Allocate Q_k to N_k partially and solve P̃ to obtain b̃_kni;
3:  if O_P̃ > 0 then
4:    Rank candidates as Q_sk by descending Σ_{n∈N} b̃_kni;
5:    Revert to the original problem P;
6:    if O_t > 0 then set O_t as P's upper bound;
7:    Allocate Q_sk to N_k and solve P;
8:    if 0 < O_t and (O_t < O_P or O_P < 0) and not skip then break;
9:    if 0 < O_P and (O_P < O_t or O_t < 0) then O_t = O_P;
10: else if O_t > 0 and not skip then break;

Essentially, the proposed heuristic described in the above subsections exploits the P formulation while limiting the search space only to the nodes that are within a limited number of hops h ≤ H from the ingress nodes. We believe this is a realistic assumption, based on the consideration that the main purpose of edge networks is to keep the traffic as close as possible to the ingress nodes and, therefore, to the users. Thanks to this approach, we are able to make problem P more tractable and solvable in a short time even in the case of complex edge networks (see Section 6). We can further improve the solution time by eliminating from the problem formulation all unneeded variables. In particular, we modify P by adding a scope k (where k is the ingress node) to E and L. E_k ⊆ E represents the set of h-hop neighbor nodes (h ≤ H) of k, and L_k ⊆ L the set of links inside this neighborhood. This way, the solver is able to skip all variables outside the considered k scope, thus reducing the time needed to load, store, analyze and prune the problem. Such a modification does not change the result produced by the heuristic, but it yields a consistent improvement (up to one order of magnitude) in the computing time needed to obtain the solution in our numerical analysis.

6 NUMERICAL RESULTS
The goal of this evaluation is to show that: i) our P model offers an appropriate solution to the edge network optimization problem we have discussed in this paper, ii) our NESF heuristic computes a solution which is aligned with the optimal one, and iii) when compared with two benchmark heuristics, Greedy and Greedy-Fair, NESF offers better results within similar ranges of computing time. Consistently, the rest of this section is organized as follows: Section 6.1 describes the heuristics we compare with; Section 6.2 describes the setup of our experiments; Section 6.3 discusses the optimal solution and the results obtained by the heuristics in the small network scenario presented in Section 3; Section 6.4 analyzes the results achieved by the heuristics when the network parameters vary; finally, Section 6.5 discusses the computing time needed to find a solution.
We propose two benchmark heuristics, based on a greedyapproach, which can be naturally devised in our context:
Greedy: With this approach, each ingress node uses its neighbor nodes' computation facilities to guarantee a low overall latency for its incoming traffic. Hence, each ingress node first tries to locally process all incoming traffic. If its computation capacity is sufficient, a feasible solution is obtained; otherwise, the extra traffic is split and outsourced to its 1-hop neighbors, and so on, until it is completely processed (if possible).
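As a minimal sketch (our own illustration, not the authors' implementation), this hop-by-hop outsourcing can be written as a breadth-first search over each ingress node's neighborhood; `adj` (adjacency lists), `cap` (per-node computation capacity) and `demand` (per-ingress traffic) are hypothetical names:

```python
from collections import deque

def greedy_offload(adj, cap, demand, ingress):
    """Each ingress node first processes its traffic locally, then pushes
    the excess to 1-hop neighbors, then 2-hop neighbors, and so on (BFS
    order), until the traffic is fully placed. Returns None if it cannot
    be placed, mimicking infeasibility."""
    residual = dict(cap)
    placed = {k: {} for k in ingress}
    for k in ingress:
        left = demand[k]
        seen, queue = {k}, deque([k])
        while left > 1e-9 and queue:
            i = queue.popleft()
            served = min(left, residual[i])
            if served > 0:
                placed[k][i] = served
                residual[i] -= served
                left -= served
            for j in adj[i]:          # enqueue next hop ring
                if j not in seen:
                    seen.add(j)
                    queue.append(j)
        if left > 1e-9:
            return None               # extra traffic could not be placed
    return placed
```

For instance, on a 3-node path with capacity 5 per node and 12 units of traffic entering at node 0, the split is 5 locally, 5 at the 1-hop neighbor and 2 at the 2-hop neighbor.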
Greedy-Fair: It is a variant of Greedy which performs a sort of "fair" traffic offloading on neighbor nodes. More specifically, it proceeds as follows: 1) compute the maximum number of available computing nodes, based on the power budget and the average computation capacity of a node; 2) divide such maximum number (budget) into |K| parts according to the ratio of the total traffic rates among ingress nodes, and choose for each ingress node the corresponding number of computing nodes from its nearest h-hop neighbors. Each ingress node spreads its load on its neighbors inversely proportionally to the corresponding distance (hop + 1); for example, if the load is outsourced to two 1-hop neighbors, the ratio is (1 : 0.5 : 0.5) = (0.5 : 0.25 : 0.25).

We implement our model and heuristics using SCIP (Solving Constraint Integer Programs), an open-source framework that solves constraint integer programming problems (http://scip.zib.de). All numerical results presented in this section have been obtained on a server equipped with an Intel(R) Xeon(R) E5-2640 v4 CPU @ 2.40GHz and 126 Gbytes of RAM. The parameters of SCIP in our experiments are set to their default values. The illustrated results are obtained by averaging over 50 instances with random traffic rates λ_kn following a Gaussian distribution N(µ, σ), where µ is the value of λ_kn shown in Table 3 and σ is set to a small value (we recall that the optimization problem is solved under the assumption that the traffic shows only little random variation during the time slot under observation; for this reason, the choice of a Gaussian distribution is appropriate). We computed narrow 95% confidence intervals, shown in the following figures. The network topologies used in the following experiments are generated as Erdős–Rényi random graphs [7] by specifying the numbers of nodes and edges. Note that the original Erdős–Rényi algorithm may produce disconnected random graphs with isolated nodes and components. To generate a connected network graph, we patch it with a simple strategy that connects isolated nodes to randomly sampled nodes (up to 10 nodes) in the graph. We generated several kinds of topologies with different numbers of nodes and edges, shown in Figure 5, that span from a quasi-tree topology (Figure 5(c)) to a more general, highly connected one with 100 nodes and 150 edges (Figure 5(f)). These topologies can be considered representative of various edge network configurations where multiple edge nodes are distributed in various ways over the territory.
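The patching strategy described above can be sketched as follows; this is our own stdlib-only variant (the authors' generator is not published here): draw a G(n, m) Erdős–Rényi graph, then attach every component other than the first to a randomly chosen node of the first one:

```python
import random

def connected_gnm(n, m, seed=None):
    """Sample an Erdos-Renyi G(n, m) graph as an adjacency-set dict, then
    add one extra edge per isolated component so the result is connected
    (a simple variant of the patching strategy described in the text)."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    edges = set()
    while len(edges) < m:                  # m distinct random edges
        u, v = rng.sample(range(n), 2)
        edges.add((min(u, v), max(u, v)))
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    comps, seen = [], set()                # connected components (DFS)
    for s in range(n):
        if s in seen:
            continue
        comp, stack = [], [s]
        seen.add(s)
        while stack:
            u = stack.pop()
            comp.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    for comp in comps[1:]:                 # patch: attach extras to comps[0]
        u, v = rng.choice(comp), rng.choice(comps[0])
        adj[u].add(v)
        adj[v].add(u)
    return adj
```

Note that the patching may add a few edges beyond m, so the resulting average degree is only approximately 2m/n.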
Due to space constraints, in the following we present and discuss the results obtained for a representative topology, i.e., the one in Figure 5(e), as well as those for the small topology of Figure 2, used to compare our proposed heuristics to the optimal solution. The full set of results is available online (http://xiang.faculty.polimi.it/files/supplementary-results.pdf).

TABLE 3: Parameters setting - Initial (reference) values (for the case of high incoming traffic load with low tolerable latency)

Parameter                               Initial value
No. of nodes |E|                        20 ∼ 100 (topologies in Fig. 5)
No. of ingress nodes |K|                3
No. of traffic types |N|                5
Link bandwidth B_l (Gb/s), l ∈ L        100
Network capacity C_k (Gb/s), k ∈ K      40, 50, 60
Computation level D_a (Gb/s), a ∈ A     30, 40, 50
Computation budget P (Gb/s)             300
Traffic rate λ_kn (Gb/s)                per-type values (K × N matrix)
Tolerable latency τ_n (ms), n ∈ N       1 ∼ 3.5
Weights κ_i (i ∈ E), w                  equal values

In Table 3 we provide a summary of the reference values we define for each parameter. Such values are representative of a scenario with a high traffic load and low tolerable latency relative to the limited communication and computation resources. Referring to the computation capacity levels and budget in Table 3, it is worth noticing that the unit "cycles/s" is often used for these metrics; for simplicity, we transform it into "Gb/s" by using the factor "8 bit / 1900 cycles", which assumes that processing 1 byte of data needs 1900 CPU cycles in a BBU pool [8].

(a) 20 nodes, 30 edges (avg. degree: 3.0)
(b) 40 nodes, 60 edges (avg. degree: 3.0)
(c) 50 nodes, 50 edges (avg. degree: 2.0)
(d) 60 nodes, 90 edges (avg. degree: 3.0)
(e) 80 nodes, 120 edges (avg. degree: 3.0)
(f) 100 nodes, 150 edges (avg. degree: 3.0)
Fig. 5: Network topologies. Ingress nodes for each graph are colored in red.

The number of traffic types is set to five. Each traffic type can be dedicated to a specific application case (e.g., video transmission for entertainment, real-time signaling, virtual reality games, audio). Our traffic rates result from the aggregation of the traffic generated by multiple users connected to a certain ingress node. We select rate values that can be typical in a 5G usage scenario and that almost saturate the wireless network capacity at the ingress nodes, which we assume to vary from 40 to 60 Gb/s. The tolerable latency for each traffic type aims at challenging the approach with quite demanding requirements, ranging from 1 to 3.5 ms. More specifically, the values of traffic rate λ_kn and tolerable latency τ_n are designed to cover several different scenarios, i.e., mice, normal and elephant traffic loads under strict, normal and loose latency requirements. For simplicity, in this paper we fix the number of ingress nodes to three. An in-depth analysis of the impact of the number of ingress nodes on the performance of the optimization algorithm is the subject of our future research. To make the problem solution manageable, we assume to adopt links of the same bandwidth (100 Gb/s), which is representative of current fiber connections.
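The cycles-to-throughput conversion mentioned above is straightforward; the sketch below is our own (the cycle budget in the example is illustrative, chosen to match one of the capacity levels we use) and simply applies the 8 bit / 1900 cycles factor from [8]:

```python
def cycles_to_gbps(cycles_per_sec):
    """Convert a CPU budget (cycles/s) into an equivalent processing
    throughput (Gb/s), assuming 1900 CPU cycles per processed byte [8]."""
    bits_per_sec = cycles_per_sec * 8 / 1900   # 8 bits per byte
    return bits_per_sec / 1e9

# Example: a pool offering 1.1875e13 cycles/s corresponds to the
# 50 Gb/s computation level.
print(cycles_to_gbps(1.1875e13))  # -> 50.0
```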
As in the example of Section 3, we assume three possible levels for the computation capacity (30, 40 and 50 Gb/s), under the assumption that, as happens in typical cloud IaaS, users see a predefined computation service offer. The maximum computation budget is set to 300 Gb/s, which is a relatively low value considering the traffic rates we use in the experiments and the number of available nodes in the considered topologies. Finally, by assigning the same values to weights κ_i and w, we make sure that the two components of the optimization problem, the total latency and the operation cost, have the same importance in the identification of the solution.

Fig. 6: Comparison with the optimum varying two selected parameters, (a) the network capacity C_k and (b) the trade-off weight w, in the example network scenario 10N20E of Figure 2.

We first compare the results obtained by our proposed heuristic,
NESF, against the optimum obtained by solving model P in the simple topology illustrated in Figure 2, Section 3. Note that the original model could be solved only in such a small network scenario, due to its very high computing time. In Figure 6 we show the variation of the objective function (the sum of total latency and operation cost) with respect to two parameters: the network capacity C_k and the weight w in the objective function. In these cases, it can be observed that NESF obtains near-optimal solutions, practically overlapping with the optimum curve, for the whole range of the parameters, while both Greedy and Greedy-Fair perform worse. The results achieved when the other parameters vary show the same trend. For the sake of space, we do not show them, but they are reported in the supplementary results available online. Figure 7 shows the configuration of nodes and routing
Fig. 7: Comparison of the solutions achieved by the heuristics and the optimum for the 10N20E topology.

paths for the network (10N20E) with the parameter values defined in Section 3. Each sub-figure refers to one of the four considered solutions. Here we highlight the ingress nodes and the other nodes which offer computation capacity or support traffic routing. The remaining nodes are not shown for the sake of clarity. The black arrows represent the enabled routing paths. The traffic flow allocation of each solution is marked in red for traffic of type 1 and in blue for type 2, respectively. The values of all relevant decision variables (see Section 4) are shown as well. Comparing Figures 7(a) and 7(c), we notice that both Optimal and
NESF enable the computation capacity on the ingress nodes and on an intermediate node, with one type of traffic kept in the ingress nodes and the other offloaded to the intermediate node. The main differences between Optimal and NESF include: i) the planning of the computation capacity on the ingress nodes, and ii) the intermediate node selected and the consequent routing paths. However, the objective function values (trade-off between the total latency and operation cost) obtained by Optimal and NESF are very close to each other. To further investigate the reasons behind this, we examined the latencies for the traffic of types 1 and 2: since in this case NESF achieves a lower total latency at the expense of a slightly higher computation cost, compared with Optimal, their corresponding objective function values are close. Note that the computing time needed to obtain the optimal solution is around 10 hours, while NESF is able to compute its approximate solution in only about one second. The Greedy and Greedy-Fair approaches tend to enable computation capacity on more nodes; Greedy-Fair also splits each type of traffic over multiple paths. Both aspects result in a higher objective function value. When increasing the network capacity C_k, the resulting solutions remain almost the same, except for the allocation of the wireless network capacity and computation capacity.

We investigate the effect of several parameters on the objective function value, namely the link bandwidth B_l, network capacity C_k, computation capacity D_a and corresponding total budget P, traffic rate λ_kn, tolerable latency τ_n and trade-off weight w. We conduct our simulations by scaling one parameter value at a time, starting from the initial values in Table 3. Since the goal is to minimize the weighted sum of total latency and operation cost, lower values of the objective function are preferable. In Figure 8 we report all results referring to the topology with 80 nodes and 120 links (Figure 5(e)). All results obtained considering the other topologies in Figure 5 are available online and show similar trends.

Link bandwidth B_l. Figure 8(a) illustrates the variation of the objective function value (costs w.r.t. latency and computation) versus the link bandwidth B_l, ∀l ∈ L, whose values are scaled with respect to the initial ones in Table 3. In all cases, the problem instance is infeasible below a certain threshold bandwidth value. As B_l increases above the threshold, the cost value achieved by each approach decreases and converges to a smaller value. In all cases, NESF performs the best among all the approaches, with clear gains over both Greedy and Greedy-Fair. Greedy and
Greedy-Fair show little flexibility to the variation of the link bandwidth.

Network capacity C_k. Figure 8(b) shows the variation of the objective function value with respect to the wireless network capacity C_k, ∀k ∈ K, scaled with respect to the initial values reported in Table 3, up to the case in which the wireless network shows a capacity comparable to that of the internal network links. When C_k increases, the objective function value obtained by each approach decreases quite fast and converges to a specific value. Greedy and Greedy-Fair exhibit close performance, while NESF still performs the best among all the approaches, with consistent gaps over both. This trend reflects the strong effect of an increase of the wireless network capacity on the minimization of the overall system cost and performance.

Fig. 8: Numerical results for the large-scale network topology 5(e), 80N120E (averaged over 50 instances): (a) bandwidth B_l; (b) network capacity C_k; (c) computation capacity budget P; (d)-(f) computation capacity levels D_a; (g) traffic rate λ_kn; (h) tolerable latency τ_n; (i) trade-off weight w.

Computation budget P. Figure 8(c) shows the trend of the objective function value as the computation capacity budget P varies, scaled with respect to the initial value in Table 3. Clearly, a low power budget challenges the optimization approach, which must ensure that the available computation capacity always stays within this budget. The figure shows that each heuristic has a limit budget value below which it is unable to find a feasible solution, and this threshold is the lowest for NESF. Thus,
NESF is the most resilient in this case. As P increases, the cost values obtained by NESF and Greedy monotonically decrease in a staircase fashion and finally converge quickly to specific values. The staircase pattern is due to the fact that the optimal solution remains constant when P varies within a small range, and the decreasing trend is also consistent with the real-world case. The cost value for Greedy-Fair, however, exhibits an opposite trend. This is due to its strategy, which tries to use the maximum number of nodes that the budget P can cover and to distribute the traffic load over all of them. This scheme thus results in a waste of computation capacity and a cost increase in some situations. Finally, NESF still achieves the best performance, with clear gaps with respect to both Greedy and Greedy-Fair.

Computation capacity D_a. Figures 8(d), 8(e), and 8(f) illustrate the variation of the objective function value with respect to the three levels of computation capacity D_a, which are scaled with respect to the initial values in Table 3, keeping the relation D_1 < D_2 < D_3. In Figures 8(d) and 8(e), the objective function values obtained by the three approaches show very small variation when the computation capacity is scaled. In Figure 8(f), there is a clear decreasing trend for the objective function values achieved by both Greedy and
Greedy-Fair. The reason is that many edge nodes are enabled with the D_3 computation level, and the increased D_3 capacity reduces much of the total latency while not adding much operation cost. The objective function value achieved by NESF, on the other hand, remains almost unchanged.

Fig. 9: Computing time.

To summarize,
NESF provides better and more stable solutions, compared with the other approaches.

Traffic rate λ_kn. Figure 8(g) shows the variation of the objective function value versus the traffic rate. The values λ_kn, kn ∈ K × N, are scaled with respect to the initial values in Table 3. As the traffic λ_kn increases, the objective function values of all the approaches increase. We observe that NESF is characterized by a smooth curve, which indicates stability of the solving process, while both Greedy and Greedy-Fair exhibit larger fluctuations. When the traffic rate is relatively low, the cost values of all the approaches are the same, since the best configuration, i.e., local computation of the traffic, is easily identified by all of them. After that point, NESF exhibits better performance, with a clear gap with respect to the other approaches.

Tolerable latency τ_n. Figure 8(h) illustrates the objective function value with respect to the tolerable latency τ_n, n ∈ N, scaled with respect to the initial values in Table 3. When τ_n increases, the objective function values obtained by all the approaches decrease and converge to specific values. Parameter τ_n serves in our model as an upper bound (see constraint (18)) and limits the solution space. In fact, with a low τ_n value the feasible solution set is smaller and the total cost increases, and vice versa. Again, NESF performs the best, with clear gaps with respect to both Greedy and Greedy-Fair.

Trade-off weight w. This parameter permits to express, in the objective function computation, the relevance of the overall operation cost with respect to the total latency experienced by users. Lower values of w correspond to a lower relevance of the operation cost w.r.t. latency. In Figure 8(i), w is scaled with respect to the initial value in Table 3. When the scaling factor is 0, the optimization focuses almost exclusively on the total latency. As w increases, the objective function values increase almost linearly for all the approaches. The NESF algorithm still achieves the best performance, with clear gaps with respect to both Greedy and
Greedy-Fair . Figure 9 compares the average computing time of the pro-posed approaches under all considered network topologies.The computing time for P is shown only for the smallesttopology and it is already significantly larger than the oth-ers. For the tree-shaped network topology (Figure 5(c)), allapproaches are able to obtain the solution very fast, in lessthan s . This is due to the fact that routing optimizationis indeed trivial in such topology. The computing time isordered as: Greedy < NESF < Greedy-Fair . When consideringstandard deviation, the order is:
NESF < Greedy < Greedy-Fair, which shows the stability of our proposed approach during the solving process. As for the general large-scale network topology (with hundreds of nodes and edges),
NESF is able to obtain a good solution in a short computing time, and is even faster in the other considered cases. This gives us an indication that the network management component can periodically run NESF in response to changes in the network or in the incoming traffic, and optimize node computation capacities and routing paths accordingly. This is a key feature for providing the necessary QoS levels in next-generation mobile network architectures and for updating them dynamically.
RELATED WORK
Several works have been recently published on the resource management problem in a MEC environment; most of them consider a single mobile edge cloud at the ingress node and do not account for its connection to a larger edge cloud network [9–11]. The remainder of this section provides a short overview of the various areas that are relevant to the problem we consider. As discussed in Section 7.6, ours is the first approach that considers at the same time multiple aspects related to the configuration of an edge cloud network.
The network planning problem in a MEC/Fog/Cloud context tackles the problems of node placement, traffic routing and computation capacity configuration. The authors in [12] propose a mixed integer linear programming (MILP) model to study the cloudlet placement, assignment of access points (APs) to cloudlets, and traffic routing problems, minimizing the installation costs of network facilities. The work in [6] proposes a MILP model for the problem of fog node placement under capacity and latency constraints. [13] presents a model to configure the computation capacity of edge hosts and adjust the cloud tenancy strategy for dynamic requests in cloud-assisted MEC, so as to minimize the overall system cost.
The service and content placement problems are considered in several contexts including, among others, micro-clouds and multi-cell MEC. The work in [14] studies the dynamic service placement problem in mobile micro-clouds to minimize the average cost over time. The authors first propose an offline algorithm to place services using predicted costs within a specific look-ahead time window, and then improve it to an online approximation algorithm with polynomial time complexity. An integer linear programming (ILP) model is formulated in [15] for serving the maximum number of user requests in edge clouds by jointly considering service placement and request scheduling. The edge clouds are modeled as a pool of servers without any topology, with shareable (storage) and non-shareable (communication, computation) resources; each user is also limited to using one edge server. In [16], the authors extend the work in [15] by separating the time scales of the two decisions, service placement (per frame) and request scheduling (per slot), to reduce the operation cost and system instability. In [17], the authors study the joint service placement and request routing problem in multi-cell MEC networks to minimize the load of the centralized cloud; no topology is considered for the MEC networks. A randomized rounding (RR) based approach is proposed to solve the problem with a provable approximation guarantee, i.e., the solution returned by RR is at most a factor (more than 3) worse than the optimum with high probability. However, although it offers an important theoretical result, the guarantee provided by the RR approach is specific to the formulated optimization problem. [18] studies the problem of service entity placement for social virtual reality (VR) applications in the edge computing environment.
[19] analyzes mixed-cast packet processing and routing policies for service chains in distributed computing networks to maximize network throughput. The work in [20] studies the edge caching problem in a Cloud RAN (C-RAN) scenario, by jointly considering the resource allocation, content placement and request routing problems, aiming at minimizing the system costs over time. [21] formulates a joint caching, computing and bandwidth resource allocation model to minimize the energy consumption and network usage cost. The authors consider three different network topologies (ring, grid and a hypothetical US backbone network, US64), and abstract fixed routing paths from them using the OSPF routing algorithm.

The cloud activation and selection problems are studied as a way to handle the configuration of computation capacity in a MEC environment. The authors in [22] design an online optimization model for task offloading with a sleep control scheme to minimize the long-term energy consumption of mobile edge networks. They use a Lyapunov-based approach to convert the long-term optimization problem into a per-slot one; no topology is considered for the MEC networks. [23] proposes a model to dynamically switch on/off edge servers, cooperatively cache services and associate users in mobile edge networks to minimize energy consumption. [24] jointly optimizes the active base station set, uplink and downlink beamforming vector selection, and computation capacity allocation to minimize power consumption in mobile edge networks. [25] proposes a model to minimize a weighted sum of energy consumption and average response time in MEC networks, which jointly considers the cloud selection and routing problems; a population game-based approach is designed to solve the optimization problem.
The authors in [26] study the resource allocation problem in network slicing, where multiple resources have to be shared and allocated to verticals (5G end-to-end services). [27] formulates a resource allocation problem for network slicing in a cloud-native network architecture, based on a utility function under network bandwidth and cloud power capacity constraints. For the slice model, the authors consider a simplified scenario where each slice serves network traffic from a single source to a single destination; for the network topology, they consider a 6x6 square grid and a 39-node fat-tree.
Inter-connected datacenters also share some common research problems with the multi-MEC system. The work in [28] studies joint resource provisioning for Internet datacenters to minimize the total cost, which includes server provisioning, load dispatching for delay-sensitive jobs, load shifting for delay-tolerant jobs, and capacity allocation. [29] presents a bandwidth allocation model for inter-datacenter traffic to enforce bandwidth guarantees, minimize the network cost, and avoid potential traffic overload on low-cost links. The work in [30] studies the problem of task offloading from a single device to multiple edge servers to minimize the total execution latency and energy consumption by jointly optimizing task allocation and computational frequency scaling. In [31], the authors study task offloading and wireless resource allocation in an environment with multiple MEC servers. [32] formulates an optimization model to maximize the profit of a mobile service provider by jointly scheduling network resources in C-RAN and computation resources in MEC.
To the best of our knowledge, our paper is the first to propose a complete approach that encompasses both the problem of planning cost-efficient edge networks and that of allocating resources, performing optimal routing and minimizing the total latency of transmitting, outsourcing and processing user traffic, under a tolerable-latency constraint for each class of traffic. We accurately model both link and processing latency using nonlinear functions, and propose both exact models and heuristics that obtain near-optimal solutions also in large-scale network scenarios, which include hundreds of nodes and edges, as well as several traffic flows and classes.
CONCLUSION
In this paper, we studied the problem of jointly planning and optimizing the resource management of a mobile edge network infrastructure. We formulated an exact optimization model, which takes into accurate account all the elements that contribute to the overall latency experienced by users, a key performance indicator for these networks, and further provided an effective heuristic that computes near-optimal solutions in a short computing time, as we
demonstrated in the detailed numerical evaluation we conducted on a set of representative, large-scale topologies, including both mesh and tree-like networks, spanning wide and meaningful variations of the parameter set.

We measured and quantified how each parameter has a distinct impact on network performance (which we express as a weighted sum of the experienced latency and the total network cost), both in terms of strength and form. Traffic rate and network capacity have the strongest effects, which is consistent with real network cases. Tolerable latency shows an interesting effect: the lower the requirements on latency the system sets (or, equivalently, the higher the tolerable latency), the lower the latency and costs the system will have. This information can be useful for network operators when designing the network indicators of services. The computation capacity has a relatively smaller effect on network performance, compared with the other parameters. Another key observation drawn from our numerical analysis is that, as the system capacities (including link bandwidth, network capacity and computation capacity budget) increase, the system performance converges to a plateau: increasing the system capacity beyond a certain level (which we quantify for each network scenario) brings little benefit and, on the contrary, increases the total system cost.

ACKNOWLEDGMENT
This research was supported by the H2020-MSCA-ITN-2016 SPOTLIGHT project under grant agreement number 722788.

REFERENCES

[1] W. Xiang, K. Zheng, and X. S. Shen, 5G Mobile Communications. Springer, 2017.
[2] H. Zhang, N. Liu, X. Chu, K. Long, A.-H. Aghvami, and V. C. Leung, "Network slicing based 5G and future mobile networks: mobility, resource management, and challenges," IEEE Communications Magazine, vol. 55, no. 8, pp. 138–145, 2017.
[3] Y. C. Hu, M. Patel, D. Sabella, N. Sprecher, and V. Young, "Mobile edge computing–A key technology towards 5G," ETSI White Paper, vol. 11, no. 11, pp. 1–16, 2015.
[4] R. Kannan and C. L. Monma, "On the computational complexity of integer programming problems," in Optimization and Operations Research, Springer, 1978, pp. 161–172.
[5] B. Xiang, J. Elias, F. Martignon, and E. Di Nitto, "Joint Network Slicing and Mobile Edge Computing in 5G Networks," in IEEE International Conference on Communications (ICC), IEEE, 2019, pp. 1–7.
[6] A. Santoyo-González and C. Cervelló-Pastor, "Latency-aware cost optimization of the service infrastructure placement in 5G networks," Journal of Network and Computer Applications, vol. 114, pp. 29–37, 2018.
[7] P. Erdős and A. Rényi, "On Random Graphs I," Publicationes Mathematicae Debrecen, vol. 6, pp. 290–297, 1959.
[8] J. Tang, W. P. Tay, T. Q. Quek, and B. Liang, "System cost minimization in cloud RAN with limited fronthaul capacity," IEEE Transactions on Wireless Communications, vol. 16, no. 5, pp. 3371–3384, 2017.
[9] C. Wang, C. Liang, F. R. Yu, Q. Chen, and L. Tang, "Computation offloading and resource allocation in wireless cellular networks with mobile edge computing," IEEE Transactions on Wireless Communications, vol. 16, no. 8, pp. 4924–4938, 2017.
[10] Y. Mao, J. Zhang, S. Song, and K. B. Letaief, "Stochastic joint radio and computational resource management for multi-user mobile-edge computing systems," IEEE Transactions on Wireless Communications, vol. 16, no. 9, pp. 5994–6009, 2017.
[11] X. Ma, S. Zhang, W. Li, P. Zhang, C. Lin, and X. Shen, "Cost-efficient workload scheduling in cloud assisted mobile edge computing," in Quality of Service (IWQoS), IEEE/ACM 25th International Symposium on, IEEE, 2017, pp. 1–10.
[12] A. Ceselli, M. Premoli, and S. Secci, "Mobile edge cloud network design optimization," IEEE/ACM Transactions on Networking (TON), vol. 25, no. 3, pp. 1818–1831, 2017.
[13] X. Ma, S. Wang, S. Zhang, P. Yang, C. Lin, and X. S. Shen, "Cost-efficient resource provisioning for dynamic requests in cloud assisted mobile edge computing," IEEE Transactions on Cloud Computing, 2019.
[14] S. Wang, R. Urgaonkar, T. He, K. Chan, M. Zafer, and K. K. Leung, "Dynamic service placement for mobile micro-clouds with predicted future costs," IEEE Transactions on Parallel and Distributed Systems, vol. 28, no. 4, pp. 1002–1016, 2016.
[15] T. He, H. Khamfroush, S. Wang, T. La Porta, and S. Stein, "It's hard to share: Joint service placement and request scheduling in edge clouds with sharable and non-sharable resources," IEEE, 2018, pp. 365–375.
[16] V. Farhadi, F. Mehmeti, T. He, T. La Porta, H. Khamfroush, S. Wang, and K. S. Chan, "Service placement and request scheduling for data-intensive applications in edge clouds," in IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, IEEE, 2019, pp. 1279–1287.
[17] K. Poularakis, J. Llorca, A. M. Tulino, I. Taylor, and L. Tassiulas, "Joint service placement and request routing in multi-cell mobile edge computing networks," in IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, IEEE, 2019, pp. 10–18.
[18] L. Wang, L. Jiao, T. He, J. Li, and M. Mühlhäuser, "Service entity placement for social virtual reality applications in edge computing," in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications, IEEE, 2018, pp. 468–476.
[19] J. Zhang, A. Sinha, J. Llorca, A. Tulino, and E. Modiano, "Optimal control of distributed computing networks with mixed-cast traffic flows," in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications, IEEE, 2018, pp. 1880–1888.
[20] L. Pu, L. Jiao, X. Chen, L. Wang, Q. Xie, and J. Xu, "Online resource allocation, content placement and request routing for cost-efficient edge caching in cloud radio access networks," IEEE Journal on Selected Areas in Communications, vol. 36, no. 8, pp. 1751–1767, 2018.
[21] Q. Chen, F. R. Yu, T. Huang, R. Xie, J. Liu, and Y. Liu, "Joint resource allocation for software-defined networking, caching, and computing," IEEE/ACM Transactions on Networking, vol. 26, no. 1, pp. 274–287, 2018.
[22] S. Wang, X. Zhang, Z. Yan, and W. Wang, "Cooperative edge computing with sleep control under non-uniform traffic in mobile edge networks," IEEE Internet of Things Journal, 2018.
[23] Q. Wang, Q. Xie, N. Yu, H. Huang, and X. Jia, "Dynamic Server Switching for Energy Efficient Mobile Edge Networks," in IEEE International Conference on Communications (ICC), IEEE, 2019, pp. 1–6.
[24] J. Opadere, Q. Liu, N. Zhang, and T. Han, "Joint Computation and Communication Resource Allocation for Energy-Efficient Mobile Edge Networks," in IEEE International Conference on Communications (ICC), IEEE, 2019, pp. 1–6.
[25] B. Wu, J. Zeng, L. Ge, Y. Tang, and X. Su, "A game-theoretical approach for energy-efficient resource allocation in MEC network," in IEEE International Conference on Communications (ICC), IEEE, 2019, pp. 1–6.
[26] F. Fossati, S. Moretti, P. Perny, and S. Secci, "Multi-resource allocation for network slicing," 2019.
[27] M. Leconte, G. S. Paschos, P. Mertikopoulos, and U. C. Kozat, "A resource allocation framework for network slicing," in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications, IEEE, 2018, pp. 2177–2185.
[28] D. Xu, X. Liu, and Z. Niu, "Joint resource provisioning for internet datacenters with diverse and dynamic traffic," IEEE Transactions on Cloud Computing, vol. 5, no. 1, pp. 71–84, 2017.
[29] W. Li, K. Li, D. Guo, G. Min, H. Qi, and J. Zhang, "Cost-minimizing bandwidth guarantee for inter-datacenter traffic," IEEE Transactions on Cloud Computing, 2016.
[30] T. Q. Dinh, J. Tang, Q. D. La, and T. Q. Quek, "Offloading in mobile edge computing: Task allocation and computational frequency scaling," IEEE Transactions on Communications, vol. 65, no. 8, pp. 3571–3584, 2017.
[31] K. Cheng, Y. Teng, W. Sun, A. Liu, and X. Wang, "Energy-efficient joint offloading and wireless resource allocation strategy in multi-mec server systems," in IEEE International Conference on Communications (ICC), IEEE, 2018, pp. 1–6.
[32] X. Wang, K. Wang, S. Wu, S. Di, H. Jin, K. Yang, and S. Ou, "Dynamic resource scheduling in mobile edge cloud with cloud radio access network," IEEE Transactions on Parallel and Distributed Systems, vol. 29, no. 11, pp. 2429–2445, 2018.

APPENDIX A: PROBLEM REFORMULATION
Problem P formulated in Section 4 cannot be solved directly and efficiently, for the reasons detailed in Section 4.4. To deal with these issues, we propose in this Appendix an equivalent reformulation of P, which can be solved very efficiently with the Branch and Bound method. Moreover, the reformulated problem can be further relaxed and, based on that, we propose a heuristic algorithm that obtains near-optimal solutions in a short computing time. To this aim, we first reformulate the processing latency and link latency constraints (viz., constraints (12) and (16)), dealing at the same time with the computation planning problem. Then, we handle the difficulties related to the variables $R^{kn}_i$ and the corresponding routing constraints.

A.1 Processing Latency
In equation (12), the variable $\beta^{kn}_i$ and the function $S_i$ tie the computation capacity allocation and planning problems together, and the processing latency $t^{kn,i}_P$ therefore has a highly nonlinear expression. To handle this problem, we first introduce an auxiliary variable $p^{kn,a}_i = \beta^{kn}_i \delta^a_i$. Then, $\beta^{kn}_i S_i$ is replaced by the linearized form $\beta^{kn}_i S_i = \sum_{a \in \mathcal{A}} p^{kn,a}_i D_a$. Furthermore, we linearize $p^{kn,a}_i = \beta^{kn}_i \delta^a_i$, which is the product of a binary and a continuous variable, as follows:
$$0 \leq p^{kn,a}_i \leq \delta^a_i, \qquad 0 \leq \beta^{kn}_i - p^{kn,a}_i \leq 1 - \delta^a_i, \qquad \forall k, \forall n, \forall a, \forall i. \qquad (20)$$
According to the definitions of $\alpha^{kn}_i$ and $b^{kn}_i$, we have the following constraint:
$$\alpha^{kn}_i \leq b^{kn}_i \leq M \alpha^{kn}_i, \qquad \forall k, \forall n, \forall i, \qquad (21)$$
where $M$ is a sufficiently large positive constant; this constraint implies that if $\alpha^{kn}_i = 0$, the traffic $kn$ is not processed on node $i$, i.e., $b^{kn}_i = 0$. Based on the above, we can rewrite constraint (13) as:
$$\alpha^{kn}_i \lambda^{kn} - (1 - b^{kn}_i) < \sum_{a \in \mathcal{A}} p^{kn,a}_i D_a, \qquad \beta^{kn}_i \leq b^{kn}_i, \qquad \forall k, \forall n, \forall i. \qquad (22)$$
Note that the term $(1 - b^{kn}_i)$ permits to implement the condition $\alpha^{kn}_i > 0$ in Eq. (13).

In equation (12), we observe that if $b^{kn}_i = 1$, we have $\beta^{kn}_i S_i - \alpha^{kn}_i \lambda^{kn} > 0$; otherwise $\beta^{kn}_i S_i - \alpha^{kn}_i \lambda^{kn} = 0$, resulting in $t^{kn,i}_P \to \infty$. To handle this case, we first define a new variable ${t^{kn,i}_P}'$ as follows:
$${t^{kn,i}_P}' = \frac{1}{\sum_{a \in \mathcal{A}} p^{kn,a}_i D_a - \alpha^{kn}_i \lambda^{kn} + (1 - b^{kn}_i) D_m}, \qquad (23)$$
where $D_m$ is the maximum computation capacity that can be installed on a node ($D_m = \max_{a \in \mathcal{A}} D_a$). From this equation, we have $b^{kn}_i = 1 \Rightarrow {t^{kn,i}_P}' = t^{kn,i}_P > 1/D_m$ and $b^{kn}_i = 0 \Rightarrow {t^{kn,i}_P}' = 1/D_m,\ t^{kn,i}_P = 0$. Hereafter, we prove that this reformulation has no influence on the solution of our optimization problem. The outsourcing latency is defined as the maximum of the processing latency $t^{kn,i}_P$ and link latency $t^{kn,i}_L$ among all nodes.
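The linearization in (20) is the standard device for the product of a binary and a bounded continuous variable, and the same trick is reused later for $g^{kn,i}_l$ and $h^{kn,i}_l$. As a quick illustration (our own sketch, not the authors' code), a brute-force search over a grid of candidate values confirms that the only $p$ satisfying the two pairs of inequalities is $p = \beta\,\delta$:

```python
# Illustrative check of the linearization in constraint (20):
#   0 <= p <= delta,   0 <= beta - p <= 1 - delta
# with delta binary and beta in [0, 1] forces p = beta * delta.

def feasible_p(beta, delta, step=0.01):
    """Return all grid values of p in [0, 1] satisfying constraint (20)."""
    eps = 1e-9  # tolerance for floating-point comparisons
    sols = []
    for i in range(int(1 / step) + 1):
        p = i * step
        if -eps <= p <= delta + eps and -eps <= beta - p <= 1 - delta + eps:
            sols.append(round(p, 2))
    return sols

for delta in (0, 1):
    for beta in (0.0, 0.3, 1.0):
        # the unique feasible value coincides with the product beta * delta
        assert feasible_p(beta, delta) == [round(beta * delta, 2)]
print("constraint (20) enforces p = beta * delta")
```

Intuitively, $\delta = 0$ pins $p$ to $0$ through the first pair, while $\delta = 1$ pins $p$ to $\beta$ through the second pair, so no bilinear term is needed in the model.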
Equation (17) can be transformed as $t^{kn}_{PL} \geq t^{kn,i}_P + t^{kn,i}_L,\ \forall k, \forall n, \forall i$. When $b^{kn}_i = 0$, we have $t^{kn,i}_P = t^{kn,i}_L = 0$. Thus, based on the above, the inequality is equivalent to $t^{kn}_{PL} \geq {t^{kn,i}_P}' + t^{kn,i}_L,\ \forall k, \forall n, \forall i$.

A.2 Link Latency
As stated before, to compute the link latency we need to determine the routing path $R^{kn}_i$; this problem is specifically handled in the next subsection. Assuming $R^{kn}_i$ has been determined, we first introduce a binary variable $\gamma^{kn,i}_l$ defined as follows:
$$\gamma^{kn,i}_l = \begin{cases} 1, & \text{if } l \in R^{kn}_i, \\ 0, & \text{otherwise}, \end{cases} \qquad \forall k, \forall n, \forall i, \forall l,$$
which indicates whether $l$ is used in the routing path $R^{kn}_i$ or not. Note that the corresponding routing path is defined only if traffic $kn$ is processed on node $i$ (i.e., $b^{kn}_i = 1$) and $i \neq k$. Then we have:
$$\gamma^{kn,k}_l = 0,\ \forall k, \forall n, \forall l; \qquad \gamma^{kn,i}_l \leq b^{kn}_i,\ \forall k, \forall n, \forall i, \forall l. \qquad (24)$$
We now introduce variable $v_l$, defined as follows:
$$v_l = \frac{1}{B_l - \sum_{k' \in \mathcal{K}} \sum_{n' \in \mathcal{N}} f^{k'n'}_l \lambda^{k'n'}}, \qquad \forall l. \qquad (25)$$
This permits to transform equation (16) as $t^{kn,i}_L = \sum_{l \in \mathcal{L}} \gamma^{kn,i}_l v_l$. We then need to linearize the product of the binary variable $\gamma^{kn,i}_l$ and the continuous variable $v_l$; to this aim we introduce an auxiliary variable $g^{kn,i}_l = \gamma^{kn,i}_l v_l$, thus also eliminating $t^{kn,i}_L$. Specifically, we first compute the value range of $v_l$ as follows:
$$B_l^{-1} \leq v_l \leq V_l = \frac{1}{\max\{B_l - \sum_{k \in \mathcal{K}} \sum_{n \in \mathcal{N}} \lambda^{kn},\ \epsilon\}},$$
where $\epsilon > 0$ is a small value. Based on the above, the linearization is performed by the following constraints:
$$\gamma^{kn,i}_l B_l^{-1} \leq g^{kn,i}_l \leq \gamma^{kn,i}_l V_l, \qquad (1 - \gamma^{kn,i}_l) B_l^{-1} \leq v_l - g^{kn,i}_l \leq (1 - \gamma^{kn,i}_l) V_l. \qquad (26)$$
At the same time, the link latency is rewritten as $\sum_{l \in \mathcal{L}} g^{kn,i}_l$.

A.3 Routing Path
Based on the definitions introduced in the previous subsection, the traffic flow $f^{kn}_l$ can be transformed as:
$$f^{kn}_l = \sum_{i \in \mathcal{E}} \gamma^{kn,i}_l \alpha^{kn}_i. \qquad (27)$$
Due to the product of binary and continuous variables, $h^{kn,i}_l = \gamma^{kn,i}_l \alpha^{kn}_i$ is introduced for linearization, as follows:
$$0 \leq h^{kn,i}_l \leq \gamma^{kn,i}_l, \qquad 0 \leq \alpha^{kn}_i - h^{kn,i}_l \leq 1 - \gamma^{kn,i}_l. \qquad (28)$$
Now we need to simplify the traffic flow conservation constraint (see Eq. (8)). To this aim, and to simplify notation, we first introduce in the network topology a "dummy" entry node 0 which connects to all ingress nodes $k \in \mathcal{K}$. All traffic comes through this dummy node and goes to each ingress node with volume $\lambda^{kn}$, i.e., $f^{kn}_l = 1,\ \forall k, \forall n, \forall l \in \mathcal{F}$, where $\mathcal{F}$ is the dummy link set defined as $\mathcal{F} = \{(0, k) \mid k \in \mathcal{K}\}$. Then, we extend the definition of $\mathcal{I}_i$ to $\mathcal{I}_i = \{j \in \mathcal{E} \mid (j, i) \in \mathcal{L} \cup \mathcal{F}\}$. Equation (8) is hence transformed as:
$$\sum_{j \in \mathcal{I}_i} f^{kn}_{ji} - \sum_{j \in \mathcal{O}_i} f^{kn}_{ij} = \alpha^{kn}_i, \qquad \forall k, \forall n, \forall i. \qquad (29)$$
Correspondingly, we add the following constraints on the set $\mathcal{F}$ of dummy links:
$$\gamma^{kn,i}_{0k} = b^{kn}_i,\ \forall k, \forall n, \forall i; \qquad \gamma^{kn,i}_{0k'} = 0,\ \forall k, \forall n, \forall i, \forall k' \neq k. \qquad (30)$$
The final stage of our procedure is the definition of the constraints that guarantee all desirable properties that a routing path must respect: the fact that a single path is used (traffic is unsplittable), the flow conservation constraints that provide continuity to the chosen path, and finally the absence of cycles in the routing path $R^{kn}_i$.
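To make constraint (29) concrete, the following small sketch (our own illustration on a hypothetical 4-node topology, not taken from the paper) verifies the flow balance for one traffic flow $kn$: the dummy node 0 injects the whole flow at the ingress node, and the net inflow at every node equals the fraction $\alpha^{kn}_i$ processed there.

```python
# Toy check of flow conservation (29) with the dummy entry node 0.
# Node 1 plays the role of the ingress k; fractions of the flow are
# processed at nodes 2 and 3 (alpha values are made up for illustration).

# f[(j, i)] = fraction of flow kn carried on directed link (j, i)
f = {
    (0, 1): 1.0,   # dummy link (0, k): the whole flow enters at the ingress
    (1, 2): 1.0,   # everything leaves the ingress (alpha at node 1 is 0)
    (2, 3): 0.4,   # 0.6 is processed at node 2, the remainder travels on
}
alpha = {1: 0.0, 2: 0.6, 3: 0.4}   # processing fractions, summing to 1

for i in alpha:
    inflow = sum(v for (j, h), v in f.items() if h == i)
    outflow = sum(v for (j, h), v in f.items() if j == i)
    # constraint (29): net inflow at node i equals alpha_i
    assert abs(inflow - outflow - alpha[i]) < 1e-9
print("flow conservation (29) holds at every node")
```

Note that the aggregate link loads are consistent with (27): link (1, 2) carries both the fraction destined to node 2 and the one destined to node 3, while link (2, 3) carries only the latter.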
We would like to highlight that the traffic $kn$ can be split only at the ingress node $k$, and each proportion of such traffic is destined to an edge node $i$; this is why we have multiple routing paths $R^{kn}_i$, one for each edge node $i$. To this aim, we introduce the following conditions, and prove that satisfying them along with the constraints illustrated before guarantees that such properties are respected:

• For an arbitrary node $i$, the number of ingress links used by a path $R^{kn}_{i'}$ is at most one, and thus the variables $\gamma^{kn,i'}_{ji}$ should satisfy the following condition:
$$\sum_{j \in \mathcal{I}_i} \gamma^{kn,i'}_{ji} \leq 1, \qquad \forall k, \forall n, \forall i, \forall i'. \qquad (31)$$
• The flow conservation constraint (see Eq. (29)) implements the continuity of a traffic flow.
• Every routing path should have an end (a destination) to avoid loops. This can be ensured by the following equation:
$$\gamma^{kn,i}_{ij} = 0, \qquad \forall k, \forall n, \forall (i, j) \in \mathcal{L}. \qquad (32)$$
The proof is as follows:

a) Substitute Eq. (27) into (29) and make the transformation:
$$\sum_{j \in \mathcal{I}_i} \sum_{i' \in \mathcal{E}} \gamma^{kn,i'}_{ji} \alpha^{kn}_{i'} - \sum_{j \in \mathcal{O}_i} \sum_{i' \in \mathcal{E}} \gamma^{kn,i'}_{ij} \alpha^{kn}_{i'} = \sum_{i' \in \mathcal{E}} \alpha^{kn}_{i'} \Big( \sum_{j \in \mathcal{I}_i} \gamma^{kn,i'}_{ji} - \sum_{j \in \mathcal{O}_i} \gamma^{kn,i'}_{ij} \Big) = \alpha^{kn}_i.$$
b) Based on constraints (24) and (30), if $\alpha^{kn}_{i'} = 0$, then $\sum_{j \in \mathcal{I}_i} \gamma^{kn,i'}_{ji} - \sum_{j \in \mathcal{O}_i} \gamma^{kn,i'}_{ij} = 0$.
c) From a) and b), we have:
$$\sum_{j \in \mathcal{I}_i} \gamma^{kn,i}_{ji} - \sum_{j \in \mathcal{O}_i} \gamma^{kn,i}_{ij} = 1, \qquad \forall k, \forall n, \forall i \mid \alpha^{kn}_i > 0,$$
$$\sum_{j \in \mathcal{I}_i} \gamma^{kn,i'}_{ji} - \sum_{j \in \mathcal{O}_i} \gamma^{kn,i'}_{ij} = 0, \qquad \forall k, \forall n, \forall i, \forall i' \neq i.$$
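As an illustration of the identities in c) (our own sketch on a hypothetical 4-node topology, not the authors' code), the following snippet traces a routing path $R^{kn}_{i'}$ link by link from the $\gamma$ variables: at every visited node there is exactly one used outgoing link, and the walk stops when the destination $i'$ is reached without revisiting any node.

```python
# Trace a routing path from gamma values for one flow and destination,
# following the unique used outgoing link at each node (no cycles allowed).

def extract_path(gamma, k, dest):
    """gamma: dict mapping directed links (u, v) -> 0/1 for one flow/dest."""
    path, node, seen = [k], k, {k}
    while node != dest:
        nxt = [v for (u, v), used in gamma.items() if u == node and used]
        assert len(nxt) == 1, "exactly one used outgoing link per node"
        node = nxt[0]
        assert node not in seen, "cycle detected: path would be invalid"
        seen.add(node)
        path.append(node)
    return path

# Hypothetical line topology: ingress k=1, destination edge node 3
gamma = {(1, 2): 1, (2, 3): 1, (2, 4): 0, (3, 4): 0}
print(extract_path(gamma, k=1, dest=3))  # prints [1, 2, 3]
```

The two assertions mirror the uniqueness of the used outgoing link implied by c) and the absence of cycles enforced by (32): no link leaves the destination, so the walk must terminate there.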
d) Based on c) and constraint (30), conditions (31) and (32) can be written as:
$$\sum_{j \in \mathcal{I}_k} \gamma^{kn,i}_{jk} = 1, \qquad \forall k, \forall n, \forall i \mid \alpha^{kn}_i > 0, \qquad (33)$$
$$\sum_{j \in \mathcal{I}_i} \gamma^{kn,i}_{ji} = 1, \qquad \forall k, \forall n, \forall i \mid \alpha^{kn}_i > 0, \qquad (34)$$
$$\sum_{j \in \mathcal{I}_i} \gamma^{kn,i'}_{ji} = \sum_{j \in \mathcal{O}_i} \gamma^{kn,i'}_{ij} \leq 1, \qquad \forall k, \forall n, \forall i, \forall i' \neq i. \qquad (35)$$
Their practical meaning is explained as follows:

• (33) ensures $(0, k)$ to be the first link in any routing path $R^{kn}_i$ if $\alpha^{kn}_i > 0$;
• (34) ensures $i$ to be the end node of the last link in any routing path $R^{kn}_i$ if $\alpha^{kn}_i > 0$;
• (35) ensures that if $i \in \mathcal{E} \setminus \{i'\}$ is an intermediate node in a routing path $R^{kn}_{i'}$, then $i$ has exactly one input link and one output link; it also indicates the continuity of a traffic flow.

e) Given a non-empty routing path $R^{kn}_{i'}$ ($\alpha^{kn}_{i'} > 0$), check its validity by using the following conditions:

• Let $i = k$ in (35); then, based on (33), $\sum_{j \in \mathcal{O}_k} \gamma^{kn,i'}_{kj} = 1$;
• Assume $(k, j')$ is a link of $R^{kn}_{i'}$; then $\gamma^{kn,i'}_{kj'} = 1$;
• If $j' = i'$, the path is found; otherwise, continue with the following steps:
• Let $i = j'$ in (35); since $\gamma^{kn,i'}_{kj'} = 1$, $\sum_{j \in \mathcal{O}_{j'}} \gamma^{kn,i'}_{j'j} = 1$;
• Assume $(j', j'')$ is a link of $R^{kn}_{i'}$; then $\gamma^{kn,i'}_{j'j''} = 1$;
• Check $j'' = i'$ in the same way as in the above steps; the whole path $k \to i'$ must eventually be found.

Thus, if all the conditions are satisfied, $R^{kn}_{i'}$ must be a valid routing path having the three properties (unsplittability, traffic continuity, absence of cycles).

A.4 Final Reformulated Problem
Based on the reformulation of routing and the demonstrations in the above subsections, the flow conservation constraints can be further improved and the flow variable $f^{kn}_{ij}$ can be eliminated as follows:
$$\sum_{j \in \mathcal{I}_i} \gamma^{kn,i}_{ji} = b^{kn}_i, \qquad \forall k, \forall n, \forall i, \qquad (36)$$
$$\sum_{j \in \mathcal{I}_i} \gamma^{kn,i'}_{ji} = \sum_{j \in \mathcal{O}_i} \gamma^{kn,i'}_{ij}, \qquad \forall k, \forall n, \forall i, \forall i' \neq i. \qquad (37)$$
Equation (19) contains a maximization; to remove it, we use a standard technique, introducing the variable $T^n = \max_{k \in \mathcal{K}} \{t^{kn}_W + t^{kn}_{PL}\}$ and linearizing it as $T^n \geq t^{kn}_W + t^{kn}_{PL},\ \forall k, \forall n$ (in Section A.1, a similar transformation has been performed on $t^{kn}_{PL}$, see Eq. (17)). Since the arguments of the two maximizations are independent, based on the reformulation of the processing latency, equation (18) can be transformed as:
$$t^{kn}_W + {t^{kn,i}_P}' + \sum_{l \in \mathcal{L}} g^{kn,i}_l \leq T^n \leq \tau^n, \qquad \forall k, \forall n, \forall i. \qquad (38)$$
Finally, the equivalent reformulation of P can be written as:
$$\textbf{P1:} \quad \min_{c^{kn},\, b^{kn}_i,\, \alpha^{kn}_i,\, \beta^{kn}_i,\, \delta^a_i,\, \gamma^{kn,i}_l} \ \sum_{n \in \mathcal{N}} T^n + w \sum_{i \in \mathcal{E}} \kappa_i S_i,$$
$$\text{s.t.} \quad (1), (2), (3), (4), (9), (10), (11), (20), (21), (22), (23), (24), (25), (26), (28), (30), (31), (32), (36), (37), (38).$$
In problem P1, $c^{kn}$, $b^{kn}_i$, $\alpha^{kn}_i$, $\beta^{kn}_i$, $\delta^a_i$ and $\gamma^{kn,i}_l$ are the main decision variables, while other auxiliary variables like $T^n$, $S_i$, $h^{kn,i}_l$, $v_l$, etc. are not shown here for simplicity. All the variables are bounded. Since constraints (9), (23) and (25) are quadratic while the others are linear, P1