Customized Slicing for 6G: Enforcing Artificial Intelligence on Resource Management
arXiv preprint [cs.NI], Feb.
Wanqing Guan, Haijun Zhang, Senior Member, IEEE, and Victor C. M. Leung, Fellow, IEEE
Abstract
Next generation wireless networks are expected to support diverse vertical industries and offer countless emerging use cases. To satisfy stringent requirements of diversified services, network slicing is developed, which enables service-oriented resource allocation by tailoring the infrastructure network into multiple logical networks. However, there are still some challenges in cross-domain multi-dimensional resource management for end-to-end (E2E) slices under the dynamic and uncertain environment. Trading off the revenue and cost of resource allocation while guaranteeing service quality is significant to tenants. Therefore, this article introduces a hierarchical resource management framework, utilizing deep reinforcement learning in admission control of resource requests from different tenants and resource adjustment within admitted slices for each tenant. Particularly, we first discuss the challenges in customized resource management of 6G. Second, the motivation and background are presented to explain why artificial intelligence (AI) is applied in resource customization of multi-tenant slicing. Third, E2E resource management is decomposed into two problems, multi-dimensional resource allocation decision based on slice-level feedback and real-time slice adaption aimed at avoiding service quality degradation. Simulation results demonstrate the effectiveness of AI-based customized slicing. Finally, several significant challenges that need to be addressed in practical implementation are investigated.
Wanqing Guan and Haijun Zhang are with Beijing Engineering and Technology Research Center for Convergence Networks and Ubiquitous Services, Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing, China. Wanqing Guan is also with State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing, China. Victor C. M. Leung is with the Department of Electrical and Computer Engineering, The University of British Columbia, Vancouver, BC V6T 1Z4 Canada.
February 23, 2021 DRAFT
I. INTRODUCTION
In the coming era of 6G, the proliferation of new smart terminals and the explosion of new applications in vertical industry markets, such as augmented or virtual reality (AR/VR), unmanned aerial vehicles (UAV), fully autonomous driving and satellite-ground communications, are forcing mobile network operators (MNOs) to handle complex scenarios and deliver diverse services [1]. Meanwhile, user demands are evolving continuously, making it more difficult to provide customized services and make personalized decisions in real time [2]. 6G is expected to satisfy the dynamic and differentiated demands of users through real-time micro-management of multiple resources including communication, computing and storage resources.

The concept of network slicing was proposed in 5G to create end-to-end (E2E) slice instances according to the different requirements of various services. As a key innovation expected to be inherited in 6G, network slicing is able to reduce the capital expenditure and operating expense (CAPEX/OPEX) by sharing the network resources among multiple tenants. Tenants, such as mobile virtual network operators (MVNOs), over-the-top (OTT) providers and vertical industries with limited capacity or coverage, rent the physical resources of MNOs or infrastructure network providers (InPs) to provide diversified services. To further reduce CAPEX/OPEX and increase revenue opportunities, tenants are motivated to unite the available resources provided by different InPs to enhance their attractiveness and acquire more subscribers [3].

Therefore, there is a tremendous need to efficiently manage the multi-dimensional resources of multiple InPs while meeting the strict and diversified service requirements of multiple tenants in a dynamic environment. Creating customized slices for multiple tenants according to their preferences enables flexible and adaptive resource management [4]. Moreover, allowing tenants to customize the resource allocation for each slice makes it possible to adapt dynamically to the changes in the network environment caused by user mobility, time-varying channel conditions and so on. However, supporting more innovative 6G services and satisfying increasingly diverse user demands impose significant challenges for customized slicing, particularly in terms of E2E slice management and multi-dimensional resource orchestration.

The first challenge is how to achieve real-time status observation of slices by depicting dynamic slice deployment and scalable resource utilization. Slice status information should be accurately obtained and quickly incorporated in decision-making of resource allocation. Then, efficient resource planning is conducted based on the current status information of slices, including reserving resources for slices and determining the placement of virtual network functions (VNFs) for differentiated slices.

Considering that an E2E slice consists of a number of interconnected VNFs from the radio access network (RAN), core network and transport network, combinatorial optimization of numerous resources is the second challenge. The differences in the profit of providing multiple resources to different tenants need to be accounted for when maximizing the long-term revenue of InPs as network slice providers (NSPs). Striking a balance between the resource utilization of infrastructures and the profits of differentiated service provisioning is crucial for NSPs.

Last but not least, quickly satisfying the dynamic demands of differentiated services is another challenge. Since the scale and rates of network flows keep changing, the resources allocated to slices need to be adjusted in time to cope with the dynamic user demands. Additionally, the growing number and types of slices result in high complexity of slice adaption. Trading off the cost of reconfiguring slices and the satisfaction of stringent service quality becomes harder for tenants as network slice customers (NSCs).

Artificial intelligence (AI) has developed rapidly during the past ten years and solved many pain points in different industries, such as healthcare, autonomous driving and smart manufacturing. As one of the most promising AI tools, machine learning (ML) techniques have been widely applied in wireless communications [5]. By iteratively learning from the reward feedback of the environment, an optimal decision can be achieved more quickly with ML methods than with conventional model-based optimization methods. Many studies adopt reinforcement learning (RL) based approaches to manage resources involving both the radio access part and the core network part [6]. RL incorporates farsighted system evolution into its decision-making and updates decision strategies to reach optimal performance through feedback of the previous decisions.

However, the existing RL based resource allocation methods [6, 7] and the AI-assisted network architecture for network slicing [8] are inadequate to balance the capability of customizing resources for multiple tenants and maximizing revenues for multiple InPs. It is still very challenging to allocate resources across multiple domains and customize resources for each tenant simultaneously. In this article, we provide an AI-based hierarchical resource management framework, leveraging AI algorithms in both the global management of multi-domain resources and the local slice adaption for multiple tenants. In addition, we introduce a customized slicing procedure for the proposed framework, realizing real-time on-demand resource provisioning and long-term revenue maximization.

The remainder of this article is organized as follows. We first discuss the characteristics of customized slicing in the scenario of multi-InPs and multi-tenant, and explain why AI-based approaches are adopted. Then, a hierarchical framework is proposed for intelligent resource management of E2E slices, supporting slice customization for each tenant based on RL methods. The procedure of observing E2E slices' status and incorporating it into decision making as slice-level feedback is introduced. We illustrate the effectiveness of the proposed intelligent management scheme in achieving high revenue and maintaining service quality. Finally, the challenges in practical implementation of 6G intelligent resource management are highlighted.

II. MOTIVATIONS AND BACKGROUND
A. Network Slicing Across Multiple Infrastructures
As a fundamental attribute of 5G and beyond, network slicing is realized with the maturity of software defined networking (SDN) and network function virtualization (NFV). As the enablers of network slicing, NFV decouples software and hardware by virtualizing network functions and running them on virtual machines (VMs), while the SDN architecture provides a centralized control plane for the configuration of network resources. These techniques prompt a service-based E2E wireless network architecture where VNFs of RANs and the core network are placed as VMs deployed in data centers (DCs) of cloud InPs. The diverse demands of tenants can be satisfied through flexibly managing resources and efficiently orchestrating VNFs of slices. By involving tenants in virtual network embedding (VNE) calculation, virtual networks could be provided in a tenant-driven manner with a trade-off between cost-effectiveness and time-efficiency. Since E2E slices require multiple resources, multiple domains administrated by different cloud InPs form a federated environment to jointly provide tenants with resources.

E2E network slicing across multiple infrastructures has been discussed in the literature, and the management and orchestration (MANO) operations of slices in multiple administrative domains, as well as the life-cycle management operations, are also a concern [9]. Through flexible slicing, heterogeneous resources of these cloud infrastructures can be utilized in a customized manner and the additional costs of the coalition can be reduced [10]. Besides, analyzing the profit of resource provisioning and monitoring the status of resource utilization are essential in dynamic real-time E2E slicing. Specifically, performing admission control of resource requests needs to consider the revenue of NSPs as well as the service requirements, and reallocating resources across multiple domains requires a global view of slice deployment status. To handle massive requests of configuring and modifying E2E slices dynamically, RL methods are used in our architecture, improving the speed and accuracy of decision-making.
B. Multi-tenant Slicing
Following the upcoming trends of applications, such as smart driving and AR/VR cloud gaming, the typical scenarios supported by 6G include further enhanced mobile broadband (FeMBB), ultra-massive machine-type communications (umMTC), extremely reliable and low-latency communications (eURLLC), long-distance and high-mobility communications (LDHMC), and extremely low-power communications (ELPC) [11]. For services in these scenarios, guaranteeing extreme quality of experience (QoE) continuously requires rapid adjustment of network parameters based on real-time monitoring of network status. In order to guarantee QoE and boost revenue, efficient sharing of the underlaid network infrastructure has stimulated the interest of the research community [12]. By establishing efficient network sharing schemes, multiple tenants, which may have conflicting resource requirements, obtain access to different parts of the limited resources.

As service providers, tenants rent resources to offer slice instances according to heterogeneous service requirements, which enhances the existing resource sharing flexibility. Network slicing allows various tenants to provide better-performing and cost-efficient services by supporting customized slices [6]. Due to the uncertainty of service requirements, many model-free AI-based solutions are applied to jointly allocate multi-dimensional resources to slices [13]. Moreover, RL has become an effective method to solve the decision-making problem of network slicing in an uncertain and probabilistic environment [14]. For each tenant, when the arrival or departure of traffic flows degrades the slice's service quality, individually deciding how to reallocate available resources is necessary. Because the traffic variation cannot be predicted without error, RL methods are also applied in slice adaption decisions of our architecture.

Fig. 1: An illustration of deep Q-learning. [Diagram labels: DNN with input and output, Q-value, Q-learning, action selection strategy, environment feedback, experience replay memory storing (S_t, A_t, R_t, S_{t+1}), training; state S_t, action A_t, reward R_t; observe new state S_{t+1} and reward R_{t+1}.]
C. Deep Reinforcement Learning
Because resource allocation in wireless networks affects the QoE of services, various resource allocation methods have been studied over the past decades, including optimization, heuristic and game-theoretic methods. As wireless networks become more complex, static model-based algorithms will be inapplicable in real dynamic networks because of the long decision-making time and high computing burden. Owing to its capability of learning an optimal policy quickly, RL has been preferred for decision-making in time-varying network environments and widely applied in solving many resource management problems, for instance, power control, spectrum management and computation resource management [5]. As one of the most commonly adopted conventional RL algorithms, Q-learning suffers from slow convergence when the state space and action space are large. The deep reinforcement learning (DRL) algorithm, which integrates a deep neural network (DNN) with RL, has been proposed by Google DeepMind, and the application of many advanced DRL algorithms has attracted tremendous research attention [15].

Based on the deep Q-network, DRL as shown in Fig. 1 outperforms conventional RL because experience replay is used to increase the efficiency of learning and enhance the stability of the DNN. After performing action selection, reward calculation and new state observation, mini-batches of experience are sampled uniformly at random to feed into the neural network during the learning process. The DNN, which is used to approximate the Q-value function, takes the current state as the input and outputs the Q-values of all actions available in that state. Instead of using a Q-table to store Q-values as in the Q-learning algorithm, a deep network approximates the Q-values, and the instability caused by correlated samples is addressed by experience replay: the replay memory randomizes over the data, thereby allowing for greater efficiency and breaking the strong correlations between the samples. Hence, the DNN improves the convergence of Q-learning and enables the deep Q-learning (DQL) algorithm to solve problems which have a high-dimensional state-action space.

Fig. 2: The AI-based hierarchical resource management framework. [Diagram labels: Tenant A and Tenant B with subscriptions; MNO/InP 1 and MNO/InP 2; radio access network with RRHs, transport network and core network; GRM with AI-based admission control, resource requests and revenue management; LRM1 and LRM2, each with AI-based slice adaption of VNFs driven by demand changes; status acquisition; hierarchical management.]
Abbreviations in Fig. 2: PGW, Packet Gateway; SGW, Serving Gateway; PHY, Physical; MAC, Medium Access Control; RRM, Radio Resource Management; PDCP, Packet Data Convergence Protocol; RLC, Radio Link Control; MME, Mobility Management Entity; HSS, Home Subscriber Server; AMF, Access and Mobility Management Function; SMF, Session Management Function; UPF, User Plane Function; AUSF, Authentication Server Function; UDM, Unified Data Management; PCRF, Policy and Charging Rules Function.

III. AI-BASED HIERARCHICAL RESOURCE MANAGEMENT FRAMEWORK
A. Hierarchical Resource Management
Planning deployment location from a global view can effectively avoid resource competition caused by the increasing number of co-located VMs on the same server. Scheduling multi-dimensional resources in a comprehensive and balanced way can potentially increase resource efficiency and avoid resource waste and shortage. In order to improve the revenue of resource providers, managing multi-domain resources centrally and allocating the optimal amount of resources to E2E slice instances are required. Given that traffic load variation of the slice might degrade QoE, the centralized management approach faces performance issues and limits the autonomy of the tenants. Hence, based on the MANO architecture for multi-domain slices in 5G [9], an AI-based hierarchical resource management framework shown in Fig. 2 is proposed to integrate intelligence in customized slicing for 6G use cases in the scenario of multi-InPs and multi-tenant.

To meet dynamically evolving service quality requirements and support fine-grained network decision optimization, the proposed framework introduces a global resource manager (GRM) to handle incoming differentiated resource requests from tenants, and multiple local resource managers (LRMs) to deal with the changes in resource requirements of individual tenants. The deployment of the GRM and LRMs enables two-layer customization of slices, which means that resources are first allocated to each tenant according to the heterogeneous slice performance requirements, and then the resource allocation to each slice is optimized and adjusted according to the real-time observation of demand changes. It is worth noting that the AI-based algorithms used in global resource allocation and local slice adaption can be different.

The hierarchical approach enables flexibility and scalability by distributing resource management to individual tenants. The GRM maintains overall control over the LRMs and delegates the concrete operations to each LRM. The GRM is responsible for charging slice owners and monitoring the LRMs while allocating federated resources across multiple domains. Each LRM performs slice adaption by adjusting the assigned resources to maintain service quality. Moreover, the LRMs not only provide each tenant with the ability of resource customization, but also have the distinguishing feature of transmitting the status of slices to the GRM.

Fig. 3: The procedure of customized slicing with the proposed framework. [Diagram labels: tenants A and B; OSS/BSS; Service Broker; Service Conductor; Domain 1: GRM; LRM1 and LRM2; slice life-cycle management, sub-domain control, sub-domain NFV MANO; virtualization layer over physical resources (RAN, transport, core) and virtual resources (network, storage, computing, VNFs); AI-based slice customization loop: collect status of each slice, measure the service quality, compare with targeted values, machine learning model, AI-driven optimizations, allocate resources.]

To handle the traffic dynamics quickly, the status of slices, which includes the deployment location of VNFs and the condition of traffic flows passing through these VNFs, is observed periodically. Monitoring slice deployment and resource utilization helps to maintain service quality and enhance resource efficiency. Observing the real-time status of E2E slices provides a reference for determining whether or not to perform resource adaption. To realize real-time resource monitoring and slice topology information updating in the scenario of multi-InPs and multi-tenant, both the differentiated slices provided by multiple tenants and the joint infrastructure network which consists of multiple infrastructures are depicted. Specifically, the cooperation between these infrastructure networks and the mapping relationships between the physical servers and the VNFs deployed in these servers are precisely delineated.
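As a rough illustration of the slice status described in this subsection, the sketch below records which physical server hosts each VNF of a slice and derives the per-server utilization that an LRM could report to the GRM. The names `SliceStatus`, `vnf_to_server`, `flow_load` and `server_utilization` are hypothetical and are not taken from the framework itself; the numbers are illustrative only.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SliceStatus:
    """Hypothetical per-slice status record kept by an LRM."""
    tenant: str
    vnf_to_server: dict = field(default_factory=dict)  # VNF name -> server id
    flow_load: dict = field(default_factory=dict)      # VNF name -> processing units in use


def server_utilization(slices, capacity):
    """Sum the load of all slices per server and express it as a fraction of capacity."""
    used = defaultdict(float)
    for s in slices:
        for vnf, server in s.vnf_to_server.items():
            used[server] += s.flow_load.get(vnf, 0.0)
    return {server: load / capacity for server, load in used.items()}


# Two slices sharing server "dc1"; every server is assumed to offer 300 units.
slice_a = SliceStatus("A", {"vnf1": "dc1", "vnf2": "dc2"}, {"vnf1": 120, "vnf2": 60})
slice_b = SliceStatus("B", {"vnf1": "dc1"}, {"vnf1": 90})
report = server_utilization([slice_a, slice_b], capacity=300)
```

A report of this kind gives the GRM the global view of deployment and utilization that the text calls for, while the mapping itself stays under the control of each tenant's LRM.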
B. Customized Slicing Procedure
Figure 3 shows the procedure of customized slicing with the proposed AI-based management framework. After receiving the real-time slice requests from multiple tenants, the GRM deployed in the functional plane named the Service Broker performs admission control of these requests based on an ML model. The NSPs make a trade-off between the resource requirements associated with these requests and the revenue achieved by providing the required resources. Multi-dimensional resources are allocated to tenants with the objective of maximizing the long-term revenue of NSPs. For the DRL-based resource allocation performed in the GRM, states are defined as the numbers of accepted requests belonging to different tenants, the actions taken by each agent are accepting or rejecting the arriving slice requests, and the reward is related to slice utility. With the output of the DRL algorithm in the GRM, slices are deployed and the status of slices is recorded periodically.
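The admission-control formulation above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: a lookup table stands in for the DNN so that the example runs without any ML library, and the names `ReplayMemory`, `select_action` and `replay_update`, as well as the learning rate, discount factor and reward value, are hypothetical. The state is the tuple of accepted requests per tenant and the action is accept (1) or reject (0), as in the text.

```python
import random
from collections import defaultdict, deque

ACTIONS = (0, 1)  # 0 = reject the arriving request, 1 = accept it


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng):
        # Uniform sampling breaks the correlation between consecutive transitions.
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))


def select_action(q, state, epsilon, rng):
    """Epsilon-greedy selection over the two admission actions."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def replay_update(q, batch, alpha=0.1, gamma=0.9):
    """Q-learning update on a sampled mini-batch (a table replaces the DNN here)."""
    for state, action, reward, next_state in batch:
        target = reward + gamma * max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (target - q[(state, action)])


rng = random.Random(0)
q = defaultdict(float)
memory = ReplayMemory(capacity=1000)
state = (0, 0)  # accepted requests of tenant A and tenant B
memory.push(state, 1, 3.0, (0, 1))  # accepting one tenant-B request, reward 3 (illustrative)
replay_update(q, memory.sample(32, rng))
```

In a deep Q-learning setting the table `q` would be replaced by a network that maps a state to one Q-value per action, and `replay_update` by a gradient step on the same temporal-difference target.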
Depending on the perceived status, the current service quality can be measured and compared with its target quality requirements. The current service quality satisfaction reflects the gap with the target value. The target values, regarded as the desired service quality, should be defined as the level of service that the available resources of InPs can and should provide; thus they are preset and fed as input to the optimization problem of slice adaption. As the slice-level feedback, current service quality satisfaction is used to improve the performance of the ML model and update the model with demand changes that could occur over time. When there are changes in slice requests, such as a sudden increase in resource requirements, the ML model is utilized to maintain service quality for admitted slices by micro-managing resources.

The motivation for performing resource adaption arises from the mismatch between available resources and the varying traffic demand in the slice. This mismatch might cause two kinds of issues: one is that available resources are exhausted and the other is that some resources are idle. The former means unfair resource allocation resulting in a low data rate for newly accepted users, and the latter means that the revenue of the tenant is declining. To avoid unbalanced distribution of available resources, i.e., some resources being under-utilized while others are over-utilized, the allocated resources of each tenant should be adjusted to maximize the profits of available resources. After receiving the requests of adjusting resources for multiple slices, the tenant makes decisions by weighing the cost and revenue of adjusting resources for each slice to ensure optimal resource efficiency.

The LRM deployed in the Service Conductor performs the DRL-based slice adaption. States are tied to the current service quality satisfaction and actions denote whether slice adaption is permitted. The reward is defined as the revenue obtained by adjusting resources minus the resource consumption cost and operational cost. The revenue is related to the amount of money paid by the service subscribers for guaranteeing service quality, which depends on the type of slice. The resource consumption cost represents the cost of providing more resources, such as the extra processing units required by the newly arrived traffic flows. The operational cost means the cost of performing reconfiguration, which includes the cost of service interruption caused by reallocating resources and migrating VNFs among physical servers. The DQL used in this article could be replaced by other advanced DQL-based algorithms, which may achieve better performance.
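The reward definition above can be written down directly. The sketch below is only an illustration: the function name `adaption_reward`, its parameters and the linear cost model (a fixed price per extra processing unit and per migrated VNF) are assumptions introduced here, since the article does not specify how the two costs are computed.

```python
def adaption_reward(revenue, extra_units, unit_cost, migrated_vnfs, migration_cost):
    """Reward of one slice adaption: revenue minus resource and operational cost.

    Assumed cost model (hypothetical): resource consumption cost is linear in the
    extra processing units, operational cost is linear in the migrated VNFs.
    """
    resource_cost = extra_units * unit_cost
    operational_cost = migrated_vnfs * migration_cost
    return revenue - resource_cost - operational_cost


# Illustrative numbers only: 20 extra units at 0.25 each and one VNF migration.
reward = adaption_reward(revenue=10.0, extra_units=20, unit_cost=0.25,
                         migrated_vnfs=1, migration_cost=3.0)
```

With such a reward, an adaption that buys little extra revenue but requires many migrations is penalized, which is the cost-versus-revenue weighing that the LRM is meant to learn.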
IV. EVALUATION
Having introduced the key elements of the proposed framework, the next important step is to evaluate the performance of customized slicing and verify the benefits of AI-based resource management. In this section, the effectiveness of the proposed management framework is validated in terms of improving the long-term revenue and guaranteeing service quality.
A. Experimental Setup
The AI-based resource management approach is implemented in Python, where the TensorFlow library is used to build the ML model. For the purpose of comparison, a non-intelligent resource management approach with centralized resource allocation of differentiated slices is also implemented. The DQL algorithm is compared with a greedy algorithm used in the non-intelligent resource management framework. The greedy algorithm permits resource reallocation so long as the remaining resources are enough, which ignores the difference in the value of reconfiguring differentiated slices. The AI-based framework reserves more resources for slices which can bring more revenue, and performs slice adaption in a cost-effective manner. To verify the performance of the proposed framework, there are two tenants which own different types of slices in the simulation, and the flow dynamics of these two types of slices are different. We assume that the slice of type 1 owned by tenant A and the slice of type 2 owned by tenant B have the same total number of VNFs, and that the VNFs are deployed in DCs of the joint infrastructure network provided by two InPs at the beginning of the simulation.

A summary of the simulation parameters is listed in Table I. The topology of each infrastructure network is generated according to the algorithm of Barabási-Albert (BA) scale-free networks because a forthcoming node of the communication network tends to connect itself to the nodes with large degrees. While resources in wireless networks are miscellaneous, here we confine resources to the computational resources of DCs. We assume that processing one unit of data flow requires one unit of computational capacity. There are 5 DCs in the joint infrastructure network with a 10-node topology and each DC has a capacity of 300 processing units. The flows in each slice arrive following a Poisson process, and the service time follows an exponential distribution (the arrival and departure rates are given in Table I).
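The flow model described above can be reproduced with the standard library alone, since a Poisson arrival process is equivalent to exponentially distributed inter-arrival times. The following sketch is an assumption-laden illustration rather than the simulator used in the article: the function names `generate_flows` and `active_flows` and the mean values passed in the example are hypothetical.

```python
import random


def generate_flows(n_flows, mean_interarrival, mean_service, seed=0):
    """Return (arrival_time, service_time) pairs for one slice.

    Poisson arrivals are produced by summing exponential inter-arrival times;
    service times are drawn from an exponential distribution.
    """
    rng = random.Random(seed)
    t = 0.0
    flows = []
    for _ in range(n_flows):
        t += rng.expovariate(1.0 / mean_interarrival)
        flows.append((t, rng.expovariate(1.0 / mean_service)))
    return flows


def active_flows(flows, now):
    """Number of flows being served at time `now`, ignoring capacity limits."""
    return sum(1 for arrival, service in flows if arrival <= now < arrival + service)


# Illustrative values: mean inter-arrival time 2 s, mean service time 5 s.
flows = generate_flows(n_flows=100, mean_interarrival=2.0, mean_service=5.0)
load_at_10s = active_flows(flows, now=10.0)
```

Because one unit of data flow is assumed to need one unit of computational capacity, comparing the number of active flows with the processing units allocated to a slice indicates when flows would start to wait.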
The status of slices is recorded and stored in a database. Service quality satisfaction of slices is measured periodically and the period between two measurements is 1 s. In this section, to calculate the values of current satisfaction, the number of waiting data flows at each moment of measurement is recorded and normalized.

TABLE I: Simulation parameters
Items | Values
Total number of physical nodes | 10
Number of DCs | 5
Resources of each DC | 300 processing units
Number of VNFs for each slice | [value missing]
Number of flows for slice of type 1 | [value missing]
Flow arrival interval for slice of type 1 (sec) | [value missing]
Flow service time for slice of type 1 (sec) | [value missing]
Number of flows for slice of type 2 | [value missing]
Flow arrival interval for slice of type 2 (sec) | [value missing]
Flow service time for slice of type 2 (sec) | [value missing]

B. Long-term Revenue
First, the proposed framework is able to maximize the average reward by reserving resources for resource requests which could bring more revenue. According to the requests of tenants, the GRM achieves optimal decision-making for admission control through the DRL-based algorithm. The arrival rates of resource requests from tenant A and tenant B are set at 10 requests/hour and 12 requests/hour respectively, while the completion rates of requests are set at 6 requests/hour. The immediate reward obtained by accepting a resource request from tenant A is set at 2 and that of tenant B is varied, while each resource request requires 60 processing units of each DC.

In Fig. 4(a), the performance of the DQL algorithm is compared with that of the Q-learning and greedy algorithms in terms of long-term revenue. It is shown that the revenue obtained by all three algorithms increases when the immediate reward of accepting requests from tenant B is varied from 1 to 6. However, the revenue obtained by the RL algorithms, i.e., DQL and Q-learning, is significantly higher than that of the greedy algorithm. The reason is that the RL algorithms reserve resources for the requests which may bring high reward while the greedy algorithm accepts the requests according to the available resources without considering reward. Besides, due to the slow convergence rate of Q-learning, the DQL algorithm achieves higher revenue than Q-learning.

Fig. 4: The performance of the proposed framework when the immediate reward is varied. (a) Long-term revenue (x-axis: immediate reward for tenant B; y-axis: long-term revenue; curves: Greedy, Q-learning, DQL); (b) Proportion of accepted requests.

To further analyze the performance of the proposed framework, the proportions of accepted requests from the two tenants are calculated and shown in Fig. 4(b). It can be observed that the RL algorithms are likely to accept the resource requests which have higher immediate reward. For example, when the immediate reward of accepting requests from tenant B is larger than that from tenant A, there are more accepted requests from tenant B. In contrast, the greedy algorithm accepts a request whenever the available resources satisfy the demand. Hence, the composition ratio of accepted requests is not affected by the change of the immediate reward.
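The behaviour of the greedy baseline is easy to see in code. The sketch below is a simplified illustration under stated assumptions: the function name `run_greedy` is hypothetical, request departures are ignored, and the defaults follow the setup in the text (5 DCs with 300 processing units each, 60 units per request on every DC).

```python
def run_greedy(requests, n_dcs=5, dc_capacity=300, demand=60):
    """Accept each request as long as every DC still has `demand` free units.

    `requests` is a sequence of tenant labels in arrival order. The reward of a
    request never enters the decision, which is the defining trait of the baseline.
    """
    free = [dc_capacity] * n_dcs
    accepted = []
    for tenant in requests:
        if all(units >= demand for units in free):
            free = [units - demand for units in free]
            accepted.append(tenant)
    return accepted


# Seven arrivals; capacity allows only five concurrent requests.
accepted = run_greedy(["A", "B", "A", "B", "B", "A", "B"])
```

Since the decision depends only on the free capacity, changing the immediate reward of tenant B leaves the accepted set unchanged, which matches the constant composition ratio reported for the greedy algorithm; an RL agent may instead reject a low-reward request to keep room for a later high-reward one.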
C. Service Quality Satisfaction
Second, to substantiate that the proposed framework is able to maintain service quality, the service quality satisfaction of the intelligent framework and that of the non-intelligent framework are compared. The satisfaction related to E2E delay is influenced by the remaining processing units allocated to this slice, which determine whether the incoming data flows can be delivered in time. Once the remaining resources are insufficient to meet the service quality requirements, the number of waiting data flows starts to accumulate. When the allocated resources cannot satisfy the dynamic demand, the slice adaption algorithm in each LRM of the proposed framework will be triggered to make decisions of reallocating processing units. With resource customization, the service quality of a slice can be maintained at a higher level at the expense of signaling cost and communication overhead.

Fig. 5: Comparison between the intelligent and non-intelligent frameworks. (a) Service quality satisfaction vs. time; (b) Decision delay vs. the number of VNFs for each slice.

In Fig. 5, we plot the service quality satisfaction vs. time and the decision delay vs. the number of VNFs for each slice for the intelligent resource management framework and the non-intelligent framework. As shown in Fig. 5(a), the average satisfaction is increased by nearly 20% with the proposed method, compared with the non-intelligent framework. In Fig. 5(b), it can be seen that the non-intelligent framework requires the minimum decision delay while the proposed framework with the Q-learning algorithm requires the most computation time. In particular, more decision time is required as the number of VNFs increases, because reconfiguring slices becomes more complex. However, both DQL and double DQL, which can further improve stability and avoid overestimation, have decision delays of less than 1 second.

V. OPEN ISSUES AND FUTURE CHALLENGES
With the deployment of the GRM and LRMs, the proposed framework achieves multi-tenant oriented intelligent resource management with the aim of optimizing the long-term revenue of the NSPs, and realizes fine-grained resource customization with the aim of maintaining the service quality of slices from different tenants. In addition, AI-enabled techniques or, more precisely, the use of ML algorithms allows the resource management framework to adapt to the changes in resource requirements and learn an optimal policy from the dynamic environment. Nevertheless, there are plenty of open issues that need further study and several challenges in practical implementation. Some of these issues and challenges are introduced below.
Real-time prediction of evolving user demands: User demands are highly dynamic and uncertain. Therefore, there are efforts in the literature to predict users’ behavior. However, correlating the evolving trend of demands with resource allocation in network slicing remains a challenging but interesting line of research.
Fast implementation of E2E slicing: Responding to user demands in real time is essential to providing better service quality. For this reason, E2E slices need to be instantiated rapidly and completely. Therefore, more research should be conducted to develop practical solutions that support easier experimentation in multi-InP and multi-tenant scenarios.
Adaptive adjustment of AI-based solutions: In the learning stage of an AI-based resource management framework, the relatively long convergence time of ML methods undermines their usefulness. Besides convergence, the stochastic nature of the wireless network may require ongoing parameter updates and continuous adaption of ML methods. Therefore, feasible and scalable ML algorithms need more study and analysis.
Coordinated collaboration of multiple InPs: As mentioned earlier, the cooperation between different infrastructure networks offers an attractive means of providing multiple resources at low cost. However, the process of building effective collaboration is extremely challenging, which necessitates more investigation of management interfaces and resource isolation across multiple administrative domains.
VI. CONCLUSION
This article has proposed an intelligent resource management framework to enable customized slicing in multi-InP and multi-tenant scenarios based on slice-level feedback and real-time service quality. With the aid of artificial intelligence, the framework is organized hierarchically: the global resource manager uses a machine-learning-based method coupled with service quality evaluation to make optimal resource allocation decisions, and local resource managers collect status information about slice deployment and resource utilization to support fine-grained resource adaption. The simulation results show the effectiveness of the proposed reinforcement-learning-based resource management approach in terms of achieving high revenue for network slice providers and maintaining high quality of end-to-end services.
REFERENCES
[1] K. David and H. Berndt, “6G vision and requirements: Is there any need for beyond 5G?” IEEE Veh. Technol. Mag., vol. 13, no. 3, pp. 72–80, 2018.
[2] R. Alkurd, I. Abualhaol, and H. Yanikomeroglu, “Big-data-driven and AI-based framework to enable personalization in wireless networks,” IEEE Commun. Mag., vol. 58, no. 3, pp. 18–24, 2020.
[3] K. Samdanis, X. Costa-Perez, and V. Sciancalepore, “From network sharing to multi-tenancy: The 5G network slice broker,” IEEE Commun. Mag., vol. 54, no. 7, pp. 32–39, 2016.
[4] H. Zhang, N. Liu, X. Chu, K. Long, A. Aghvami, and V. C. M. Leung, “Network slicing based 5G and future mobile networks: Mobility, resource management, and challenges,” IEEE Commun. Mag., vol. 55, no. 8, pp. 138–145, 2017.
[5] Y. Sun, M. Peng, Y. Zhou, Y. Huang, and S. Mao, “Application of machine learning in wireless networks: Key techniques and open issues,” IEEE Commun. Surv. Tutorials, vol. 21, no. 4, pp. 3072–3108, 2019.
[6] R. Li, Z. Zhao, Q. Sun, C. I, C. Yang, X. Chen, M. Zhao, and H. Zhang, “Deep reinforcement learning for resource management in network slicing,” IEEE Access, vol. 6, pp. 74429–74441, 2018.
[7] N. Van Huynh, D. Thai Hoang, D. N. Nguyen, and E. Dutkiewicz, “Optimal and fast real-time resource slicing with deep dueling neural networks,” IEEE J. Sel. Areas Commun., vol. 37, no. 6, pp. 1455–1470, 2019.
[8] X. Shen, J. Gao, W. Wu, K. Lyu, M. Li, W. Zhuang, X. Li, and J. Rao, “AI-assisted network-slicing based next-generation wireless networks,” IEEE Open J. Veh. Technol., vol. 1, pp. 45–66, 2020.
[9] T. Taleb, I. Afolabi, K. Samdanis, and F. Z. Yousaf, “On multi-domain network slicing orchestration architecture and federated resource control,” IEEE Network, vol. 33, no. 5, pp. 242–252, 2019.
[10] M. Vincenzi, A. Antonopoulos, E. Kartsakli, J. Vardakas, L. Alonso, and C. Verikoukis, “Multi-tenant slicing for spectrum management on the road to 5G,” IEEE Wireless Commun., vol. 24, no. 5, pp. 118–125, 2017.
[11] Z. Zhang, Y. Xiao, Z. Ma, M. Xiao, Z. Ding, X. Lei, G. K. Karagiannidis, and P. Fan, “6G wireless networks: Vision, requirements, architecture, and key technologies,” IEEE Veh. Technol. Mag., vol. 14, no. 3, pp. 28–41, 2019.
[12] A. Antonopoulos, “Bankruptcy problem in network sharing: Fundamentals, applications and challenges,” IEEE Wireless Commun., vol. 27, no. 4, pp. 81–87, 2020.
[13] X. Chen, Z. Zhao, C. Wu, M. Bennis, H. Liu, Y. Ji, and H. Zhang, “Multi-tenant cross-slice resource orchestration: A deep reinforcement learning approach,” IEEE J. Sel. Areas Commun., vol. 37, no. 10, pp. 2377–2392, 2019.
[14] Y. Abiko, T. Saito, D. Ikeda, K. Ohta, T. Mizuno, and H. Mineno, “Flexible resource block allocation to multiple slices for radio access network slicing using deep reinforcement learning,” IEEE Access, vol. 8, pp. 68183–68198, 2020.
[15] Y. Hua, R. Li, Z. Zhao, X. Chen, and H. Zhang, “GAN-powered deep distributional reinforcement learning for resource management in network slicing,” IEEE J. Sel. Areas Commun., vol. 38, no. 2, pp. 334–349, 2020.