NANCY: Neural Adaptive Network Coding methodologY for video distribution over wireless networks

Abstract

This paper presents NANCY, a system that generates adaptive bit rates (ABR) for video and adaptive network coding rates (ANCR) using reinforcement learning (RL) for video distribution over wireless networks. NANCY trains a neural network model with rewards formulated as quality of experience (QoE) metrics. It performs joint optimization in order to select: (i) adaptive bit rates for future video chunks to counter variations in available bandwidth and (ii) adaptive network coding rates to encode the video chunk slices to counter packet losses in wireless networks. We present the design and implementation of NANCY, and evaluate its performance compared to state-of-the-art video rate adaptation algorithms including Pensieve and robustMPC. Our results show that NANCY provides 29.91% and 60.34% higher average QoE than Pensieve and robustMPC, respectively.



Paresh Saxena, Mandan Naresh, Manik Gupta, Anirudh Achanta
Dept. of CSIS, BITS Pilani, Hyderabad, India
{psaxena, p20180420, manik, f21060022}@hyderabad.bits-pilani.ac.in

Sastri Kota
University of Oulu, Oulu

Smrati Gupta
Microsoft Corporation, Redmond, Washington


Index Terms: video streaming, network coding, reinforcement learning

I. INTRODUCTION

Internet video services have experienced tremendous growth in the last few decades and have become a part of everyday interactions. Based on the technical report from Cisco [1], the total Internet video is expected to be [...] of all Internet traffic by the end of 2020, with content delivery networks (CDNs) alone delivering more than 73% of all Internet video traffic. Moreover, most video streaming happens through user devices with wireless connectivity, i.e., 3G/4G cellular or WiFi services. The video demand is expected to be even higher in future 5G networks for more advanced and sophisticated real-time video applications such as remote medical surgery, augmented reality and mobile broadcasting [2]. Owing to the huge demand for video delivery services, content providers often struggle to provide high-quality video to end users. Further, the underlying wireless network characteristics, including bandwidth variations, latency and packet losses, can strongly influence the video quality. Both factors thus call for improving the quality of video delivery services over wireless networks.

In order to cope with varying wireless network conditions, one of the traditional approaches to improving video quality is to make use of adaptive bit rate algorithms [3], [4]. The client requests the video from the server at a specific video quality based on the estimated network conditions and past decisions on bit rates. However, these algorithms are usually optimized for specific scenarios where pre-programmed models are used to generate adaptive bit rates to optimize Quality of Service (QoS) or Quality of Experience (QoE) metrics. Recently, the integration of machine learning (ML) techniques for video streaming has been proposed in [5], [6], [7], [8]. However, the prime focus of these earlier techniques has been on applying ML to generate adaptive video quality levels for bandwidth fluctuations that may arise due to congestion in the network. They do not consider the packet losses due to poor reception and underlying wireless fading conditions, such as in remote locations with limited connectivity. Error concealment techniques of video codecs and physical layer adaptive modulation schemes provide protection only against short temporal losses and fail to provide protection against severe losses [9].

Network coding (NC) has proven to be a powerful tool to combat packet losses. With respect to packet loss recovery, NC can be seen as a generalization of forward error correction (FEC) codes with the inherent advantage of in-network coding [10], [11] for topologies beyond the end-to-end topology. NC allows mixing of packets [12] so that a fixed number of additional packets are sent along with the original packets. In the event of packet losses, these additional packets are used to recover the original packets. The network coding rate is selected based on feedback on the packet loss ratio such that the desired residual packet loss ratio is achieved. Usually, the selection of network coding rates depends on a pre-programmed model built on specific assumptions about the underlying system.

In this paper, we propose a system called Neural Adaptive Network Coding methodologY (NANCY) that learns and generates adaptive bit rates (ABR) for video and adaptive network coding rates (ANCR) for network coding without relying on any pre-programmed model or assumptions about the underlying system. The main contribution of the current work is to integrate adaptive network coding rates with the existing adaptive bit rates in state-of-the-art systems such as Pensieve [5]. NANCY provides a comprehensive system to counter both congestion and packet losses arising from varying network conditions. NANCY uses reinforcement learning (RL) [13] and is trained with rewards that are formulated with QoE metrics. The results show that by incorporating network coding to counter packet losses and reduce re-transmissions, NANCY achieves a higher and more stable video bit rate. Specifically, NANCY provides an overall 29.91% and 60.34% higher average QoE than Pensieve and robustMPC [14], respectively.

The rest of the paper is structured as follows. Section II highlights the related work on the use of NC for video delivery systems as well as the use of ML approaches for NC. Section III presents details of the NANCY design and system architecture along with background on the NC and RL approaches applied in the current work. Section IV explains in detail the measurement setup along with the performance metrics used for comparison with existing algorithms. Section V presents the comparison results and Section VI draws conclusions along with possible future research directions.

II. RELATED WORK

There are several proposals leveraging the benefits of NC together with video delivery, including [15], [16], [17] and [18]. The Hierarchical Network Coding (HNC) technique is proposed in [15] for using NC with scalable video bit streams in CDNs. Further, [16] and [17] investigate the use of NC for unequal error protection (UEP) in layered video delivery. Recently, [18] proposed the integration of network coding with 5G for mobile video delivery. However, all of these works focus on pre-programmed network coding models. They do not explore ways of incorporating learning techniques to adapt network coding parameters.

Some of the recent proposals on using machine learning along with network coding include [19], [20], [21] and [22]. The application of machine learning to physical layer network coding on signals is proposed in [19] with the aim of achieving higher throughput, while [20] investigates the use of ML in designing the coordinator server that synchronizes all nodes and assigns their roles during an NC operation. Further, [21] proposed the use of ML for the construction of network codes and [22] investigates the integration of ML with NC for wireless broadcast.

The proposed work differs from the above as it focuses on the integration of reinforcement learning with application-layer NC for recovering from packet losses by dynamically adjusting NC rates. The proposed approach uses ML techniques to optimize the overall video delivery system by integrating the video and network coding components. Using this approach, we jointly predict the parameters for both video and NC by considering different input parameters, including feedback on throughput prediction and packet losses. Additionally, the results are based on the comparison of QoE metrics, thus providing an analysis of a complete end-to-end system.

III. OVERALL NANCY DESIGN AND SYSTEM ARCHITECTURE

NANCY is designed predominantly for video content delivery, and the system architecture is based on an adaptive video streaming design such as Dynamic Adaptive Streaming over HTTP (DASH) [23]. The proposed system design is shown in Figure 1, where videos are stored at the server as multiple chunks encoded at different video quality levels. The client requests a specific chunk with the desired video quality from the server. The client's decision on the chunk request is based on the predicted throughput and the playback buffer size. NANCY incorporates NC functional blocks at both the server and client sides for improved system performance. The server has a Content Delivery Network (CDN) Chunk Manager that passes the source chunk to the NC Encoder, which generates the coded chunk. The coded chunk is then sent to the client, where it is decoded by the NC Decoder, and the source chunk is thus delivered to the video player. The estimation of the video parameters and network coding parameters is done by the RL agent, referred to as the Adaptive Bit Rate and NC Rate Controller in Figure 1. The request based on these parameters is sent to the server. Specifically, the request consists of the chunk index $n$, the generation size $K$, the network coding rate $\rho$ and the video quality $\omega$. After receiving the request, the server sends the network coded chunk $n$ with the desired video quality $\omega$ to the client. The process of network coding and the role of the network coding rate $\rho$ are explained in the following subsection.

A. Overview of Network Coding

NANCY's network coding functions are described as follows. Let us assume that the video file is segmented into $n$ chunks and denote the $j$-th chunk as $C_j$, with its size as $n_c^j$ (in bytes), where $j = 1, 2, \ldots, n$. Each chunk is further encoded into slices. The $k$-th slice is denoted as $S_k$ and its size as $n_s$ (in bytes), where $k = 1, 2, \ldots, m$ with $m = \lceil n_c^j / n_s \rceil$ as the number of slices per chunk. Accordingly, $C_j = [S_1 \; S_2 \; \ldots \; S_m]$. A set of $K$ slices is called a 'slice NC generation' and is denoted as $X \in \mathbb{F}_q^{n_s \times K}$. Here, block coding with block length $N$ is assumed for each slice NC generation. Let us denote $\rho = K/N$ as the slice-level coding rate. The encoding process is linear, such that the coded packets are given by $Y = XG$, where $Y \in \mathbb{F}_q^{n_s \times N}$ is a set of $N$ coded slices and $G \in \mathbb{F}_q^{K \times N}$ is the generator matrix for network coding. Specifically, a systematic coding approach is followed where the $N$ coded slices include the $K$ original slices. Hence, the generator matrix takes the form $G = [I_K \; C]$, where $I_K$ is the identity matrix of size $K$ and $C \in \mathbb{F}_q^{K \times (N-K)}$ is the matrix with $K(N-K)$ coefficients. There are several deterministic and random ways of selecting these coefficients [12], [10], [11]. In the current work, the coefficients can be randomly selected from the finite field $\mathbb{F}_q$. However, since the network coding functions are separated from video streaming, the design provides the flexibility of choosing deterministic coefficients as well.

B. Learning Algorithms for ABR and ANCR
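Before the learning algorithm is described, the systematic encoding $Y = XG$ with $G = [I_K \; C]$ from the previous subsection can be illustrated with a minimal Python sketch. It is not the NANCY implementation. For readability it assumes a small prime field GF(257) (practical codecs commonly work over GF($2^8$) so that every symbol fits in a byte), it draws the coefficients of $C$ from the non-zero field elements so that any single lost slice is recoverable, and the function names `encode`, `invert` and `decode` are hypothetical.

```python
import random

P = 257  # assumed prime field size q; chosen only for illustration


def encode(X, K, N, rng):
    """Systematic encoding Y = X G with G = [I_K | C] over GF(P).

    X has n_s rows of K symbols (one column per slice).
    Returns (Y, G), where Y has n_s rows of N symbols.
    """
    # C: K x (N - K) random non-zero coefficients (assumption, see lead-in).
    C = [[rng.randrange(1, P) for _ in range(N - K)] for _ in range(K)]
    G = [[int(i == j) for j in range(K)] + C[i] for i in range(K)]
    Y = [[sum(row[i] * G[i][j] for i in range(K)) % P for j in range(N)]
         for row in X]
    return Y, G


def invert(M):
    """Invert a square matrix over GF(P) by Gauss-Jordan elimination."""
    n = len(M)
    A = [list(M[i]) + [int(i == j) for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] % P), None)
        if pivot is None:
            raise ValueError("received slices are not decodable")
        A[col], A[pivot] = A[pivot], A[col]
        inv = pow(A[col][col], P - 2, P)  # Fermat inverse, valid for prime P
        A[col] = [(v * inv) % P for v in A[col]]
        for r in range(n):
            if r != col and A[r][col]:
                f = A[r][col]
                A[r] = [(a - f * b) % P for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]


def decode(Y, G, received, K):
    """Recover X from K received coded slices, given their column indices."""
    cols = received[:K]
    G_R = [[G[i][j] for j in cols] for i in range(K)]  # K x K sub-generator
    G_inv = invert(G_R)
    return [[sum(row[j] * G_inv[t][k] for t, j in enumerate(cols)) % P
             for k in range(K)] for row in Y]


# Usage: K = 4 source slices, N = 6 coded slices, i.e. rho = K/N = 2/3.
rng = random.Random(0)
K, N = 4, 6
X = [[rng.randrange(P) for _ in range(K)] for _ in range(3)]  # n_s = 3
Y, G = encode(X, K, N, rng)
# Slice 1 is lost; the first redundant slice (index 4) replaces it.
recovered = decode(Y, G, [0, 2, 3, 4], K)
print(recovered == X)
```

Because the code is systematic, a receiver that gets all $K$ original slices needs no decoding at all. The matrix inversion is only exercised when a loss has to be repaired with redundant slices.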

In the current work, RL is used to train the agent to select the video bit rate and the network coding rate. RL is modeled as a Markov decision process with agent states and actions. Let us consider a discrete system where time is indexed by $t \in \{0, 1, 2, \ldots\}$. At each time step $t$, the agent observes some state $s_t$ and chooses an action $a_t$. The agent then moves to state $s_{t+1}$ and receives reward $r_t$. The agent selects actions based on a policy $\pi : \pi_\theta(s_t, a_t) \to [0, 1]$, where $\pi_\theta(s_t, a_t)$ is the probability that action $a_t$ is taken in state $s_t$ and $\theta$ are the policy parameters upon which the actions are based. The goal of the RL agent is to collect as much reward as possible and to find the policy $\pi^*$ that maximizes the reward. The optimal policy is given by

$$\pi^* = \arg\max_{\pi} \mathbb{E}\left[ \sum_{t=0}^{\infty} \gamma^t r_t \,\middle|\, s_0,\; a_t \sim \pi(\cdot \mid s_t) \right] \qquad (1)$$

The overall reward is defined by $\mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^t r_t\right]$, where $\gamma \in (0, 1]$ is a factor discounting future rewards.

Fig. 1. Overall NANCY System Architecture

The proposed RL-based ABR and ANCR scheme is shown in Figure 2. By training a neural network, the agent takes an action $a_t \in A$ at every time step $t$ in order to maximize the overall reward. The action set $A$ consists of the different values of the video quality $\omega$, the generation size $K$ and the code rate $\rho$, as shown in Figure 2. The agent observes inputs including the predicted throughput, past bit rate decisions, buffer occupancy, past decisions on NC generation size and NC code rate, and the packet loss ratio. The agent uses the reward information to train and further improve the model. In the model, the reward is the QoE metric that describes the user's overall satisfaction with the video delivery. The next section defines the QoE metrics and describes how the performance analysis and results are based on them.

Following the training model in [5], NANCY is trained using the actor-critic method A3C [24].
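As a rough illustration of the quantities introduced so far, the sketch below enumerates a joint action set over $(\omega, K, \rho)$ and evaluates the discounted return $\sum_t \gamma^t r_t$ for a finite reward sequence. The concrete bit rate levels, generation sizes and code rates are hypothetical placeholders, since the values used by NANCY are not listed here, and the names `ACTIONS` and `discounted_return` are likewise illustrative.

```python
from itertools import product

# Hypothetical candidate values; the actual sets used by NANCY are not given.
BITRATES_KBPS = [300, 750, 1200, 1850, 2850, 4300]  # video quality omega
GENERATION_SIZES = [8, 16, 32]                      # generation size K
CODE_RATES = [1.0, 0.9, 0.8]                        # code rate rho = K / N

# Joint action set A: one action per (omega, K, rho) combination.
ACTIONS = list(product(BITRATES_KBPS, GENERATION_SIZES, CODE_RATES))


def discounted_return(rewards, gamma=0.99):
    """Sum of gamma**t * r_t over a finite episode, accumulated backwards."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


print(len(ACTIONS))
print(discounted_return([1, 1, 1], gamma=0.5))
```

Treating each $(\omega, K, \rho)$ triple as a single discrete action is one simple way to let a single policy output choose the video and coding parameters jointly. Other factorizations of the action space are equally conceivable.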
A3C, a policy gradient method, takes advantage of both value-based and policy-based RL methods, where the actor computes an action based on a state and the critic estimates the expected total reward. The critic network helps the actor network make ABR and ANCR decisions. Specifically, the update of the policy parameters $\theta$ follows the policy gradient with entropy regularization [24]:

$$\theta \leftarrow \theta + \alpha \sum_t \nabla_\theta \log \pi_\theta(s_t, a_t)\, A(s_t, a_t) + \beta\, \nabla_\theta H(\pi_\theta(\cdot \mid s_t)) \qquad (2)$$

where
• $\alpha$ is the learning rate.
• $A(s_t, a_t)$ is the advantage function. It measures how good or bad a certain action is in a given state.
• $\beta$ is the entropy weight used to enhance exploration. It is set to a large value initially and decreases as the rewards improve.
• $\nabla_\theta \log \pi_\theta(s_t, a_t)$ specifies how to change the policy parameters in order to increase $\pi_\theta(s_t, a_t)$.
• $H(\cdot)$ is the entropy of the policy, used to push $\theta$ in the direction of higher entropy. The higher the entropy, the more random the actions the agent takes.

The derivation and further details of Equation (2) can be found in [24].

IV. EXPERIMENTAL SETUP AND PERFORMANCE METRICS

In order to better understand the impact of different network configurations on NANCY, we now describe the experimental setup used for the measurements together with the performance metrics employed.

A. Experimental Setup

The experimental setup proposed in [5] has been adapted, with the ABR server implemented in Python. The client queries the server to get the bit rate for the next chunk of the video. The request consists of observations on the throughput, playback buffer occupancy, packet losses and other video properties. Based on these observations, the trained model outputs the bit rate for the next video chunk. The MahiMahi [25] framework, which records and replays web traffic under emulated network conditions, has been used to emulate the network. Two different data sets have been used for measurement purposes: the broadband dataset provided by the FCC and a mobile dataset collected in Norway [26]. These traces are reformatted to be compatible with the MahiMahi framework. Further, different packet loss conditions have been emulated using the LossShell component [25] of the MahiMahi emulator to compare the different rate-adaptation algorithms.

Fig. 2. Reinforcement learning to generate adaptive bit rates and adaptive network coding rates

B. Performance Metrics and Comparison with Existing Algorithms

In this paper, NANCY has been compared with other rate-adaptation algorithms using QoE metrics. Specifically, three variants of QoE have been selected, which have also been used in previous works [27], [14] to compare different rate-adaptation algorithms. The QoE variants are based on the general QoE metric, which is defined as

$$QoE = \sum_{n=1}^{N} q(R_n) - \mu \sum_{n=1}^{N} T_n - \sum_{n=1}^{N-1} \left| q(R_{n+1}) - q(R_n) \right| \qquad (3)$$

The three components of the QoE metric are explained as follows. In the first term, $R_n$ represents the bit rate for chunk $n$ and $q(R_n)$ represents the quality corresponding to bit rate $R_n$. A higher video quality means a higher overall QoE. The second term represents the penalty due to the rebuffering time $T_n$, and the final term represents the penalty due to fluctuations in video quality that hinder the overall smoothness. The three variants that depend on the above general QoE metric are specified as follows.
• $QoE_{lin}$: linear QoE where $q(R_n) = R_n$ with $\mu = 4.[\ldots]$
• $QoE_{log}$: log-based QoE where $q(R_n) = \log(R/R_{min})$ with $\mu = 2.[\ldots]$
• $QoE_{hd}$: QoE assigning high quality scores to HD bit rates with $\mu = 8$.

NANCY has been compared with different state-of-the-art rate-adaptation algorithms for all three QoE variants described above. Specifically, five different rate-adaptation algorithms have been considered for the comparison.
• Pensieve [5] is the base case, where RL is also used to train the agent for delivering adaptive bit rates. Pensieve is closest to NANCY with respect to design. However, Pensieve does not include a packet loss recovery strategy and is hence expected to perform worse in the presence of packet losses.
• robustMPC [14] optimally combines throughput and buffer occupancy information to produce adaptive bit rates.
• BOLA [27] uses Lyapunov optimization techniques to minimize rebuffering for improving video quality.
• Rate-Based (RB) and Buffer-Based (BB) algorithms [28], where RB adapts by taking into account only throughput predictions and BB adapts by taking into account only buffer occupancy observations.

V. RESULTS
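Since every result in this section is reported in terms of the general QoE metric of Equation (3), a minimal Python sketch of that metric may be a useful reference. It is an illustration rather than the evaluation code used for the experiments. The function names `qoe` and `log_quality`, the sample session and the weight `mu=4.0` are placeholders chosen for the example.

```python
import math


def qoe(bitrates, rebuffer_times, mu, quality=lambda r: r):
    """General QoE of Equation (3) for one video session.

    bitrates[n] is R_n, rebuffer_times[n] is T_n, mu weights rebuffering,
    and quality is the mapping q(.) (identity for the linear variant).
    """
    q = [quality(r) for r in bitrates]
    total_quality = sum(q)
    rebuffer_penalty = mu * sum(rebuffer_times)
    # Smoothness penalty: absolute quality change between consecutive chunks.
    smoothness_penalty = sum(abs(b - a) for a, b in zip(q, q[1:]))
    return total_quality - rebuffer_penalty - smoothness_penalty


def log_quality(r_min):
    """Build q(R) = log(R / R_min) for the log-based variant."""
    return lambda r: math.log(r / r_min)


# Hypothetical three-chunk session: bit rates in Mbps, stalls in seconds.
rates = [1.0, 2.0, 2.0]
stalls = [0.0, 0.5, 0.0]
print(qoe(rates, stalls, mu=4.0))  # mu=4.0 is a placeholder weight
print(qoe(rates, stalls, mu=4.0, quality=log_quality(min(rates))))
```

For the sample session, the linear variant sums a quality of 5, subtracts a rebuffering penalty of $4.0 \times 0.5 = 2$ and a smoothness penalty of 1 for the single quality switch.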

In this section, we present the comparison of NANCY with the other state-of-the-art algorithms under packet losses. Both FCC and Norway traces have been used for comparison purposes. In order to take into account variations in the packet loss ratio, different scenarios have been considered where the packet losses may vary for each trace. Using MahiMahi, a random packet loss ratio has been selected for each trace. The packet loss ratio is drawn from the set {0%, 0.1%, 0.2%, ..., 1.8%, 1.9%, 2%}, with both 0% and 2% included.

The results show the comparison of linear QoE ($QoE_{lin}$) in Figure 3 for FCC traces and in Figure 4 for Norway traces. The legends also show the average QoE achieved by all the rate-adaptation algorithms. The results show that NANCY achieves a higher average QoE for both FCC and Norway traces. Specifically, for FCC traces, NANCY provides up to 29.91% higher QoE than Pensieve and almost 60.34% higher QoE than robustMPC. Similar trends are observed for Norway traces as well.

Fig. 3. Comparison of NANCY with existing rate-adaptation algorithms for FCC traces with random packet losses within the range of 0% and 2%. The legend shows average value of QoE for each algorithm.

Fig. 4. Comparison of NANCY with existing rate-adaptation algorithms for Norway traces with random packet losses within the range of 0% and 2%. The legend shows average value of QoE for each algorithm.

In order to better understand the impact of packet losses on QoE, we study the different components of the QoE, specifically the bit rate selection and the buffer size. Note that both the bit rate selection and the buffer size impact the QoE value, as shown in Equation (3). Although a higher bit rate selection increases QoE, frequent fluctuations and a higher buffer size hinder the smoothness of the video, impacting the overall QoE. The comparison of bit rate selection and buffer size is shown in Figure 5 and Figure 6, respectively. Figure 5 shows that the maximum bit rate achieved by NANCY over time is more than 4 Mbps, whereas Pensieve and robustMPC achieve only around 2 Mbps. The results also show that the bit rate selection in both Pensieve and robustMPC is mostly stable over time. However, they fail to achieve as high a bit rate as NANCY. Additionally, the bit rate allocation with NANCY is also quite stable for most of the time. The other algorithms, including BOLA, RB and BB, achieve a smaller bit rate with higher fluctuations as compared to NANCY. Similarly, Figure 6 shows a high buffer size for algorithms including RB and BOLA, which results in a higher rebuffering penalty and a smaller QoE. NANCY, Pensieve and robustMPC show similar behavior, where NANCY maintains a constant buffer size of a higher duration and hence suffers from a smaller rebuffering penalty. Hence, it can be inferred from the experiments that overall NANCY achieves a higher bit rate which is stable over a longer period of time, and it also encounters a smaller rebuffering penalty. Therefore, it results in an overall higher average QoE as compared to the other rate-adaptation algorithms.

Fig. 5. Comparison of bit-rate selection for FCC traces with random packet losses within the range of 0% and 2%. The legend shows average value of QoE for each algorithm.

Fig. 6. Comparison of buffer size for FCC traces with random packet losses within the range of 0% and 2%. The legend shows average value of QoE for each algorithm.

Further, NANCY outperforms existing rate-adaptation algorithms for $QoE_{log}$ (Table I) and $QoE_{hd}$ (Table II) as well. Our results show that NANCY provides 29.71% and 36.53% higher $QoE_{log}$ than Pensieve for FCC and Norway traces, respectively. The gain is even higher for the $QoE_{hd}$ metric, since more weight is given to HD video rates and NANCY is able to achieve higher bit rates. Our results show that NANCY provides up to 52.39% and 65.29% higher $QoE_{hd}$ than Pensieve for FCC and Norway traces, respectively.

TABLE I. COMPARISON OF NANCY WITH EXISTING RATE-ADAPTATION ALGORITHMS FOR FCC AND NORWAY TRACES USING THE $QoE_{log}$ METRIC

| ABR algorithm | FCC traces | Norway traces |
|---|---|---|
| NANCY | 45.62 | 41.93 |
| Pensieve | 35.17 | 30.71 |
| robustMPC | 18.70 | 27.24 |
| BOLA | 24.50 | 24.66 |
| RB | 26.38 | 23.02 |
| BB | 1.22 | 1.10 |

TABLE II. COMPARISON OF NANCY WITH EXISTING RATE-ADAPTATION ALGORITHMS FOR FCC AND NORWAY TRACES USING THE $QoE_{hd}$ METRIC

| ABR algorithm | FCC traces | Norway traces |
|---|---|---|
| NANCY | 188.69 | 210.21 |
| Pensieve | 123.82 | 127.17 |
| robustMPC | 139.15 | 142.49 |
| BOLA | 80.26 | 80.64 |
| RB | 84.76 | 77.10 |
| BB | 31.40 | 12.09 |

VI. CONCLUSION

In this paper, the design, development and evaluation of NANCY, a system for generating adaptive video bit rates as well as network coding rates using reinforcement learning, have been presented. Working with real traces and emulating packet losses with a network emulator, it has been shown that NANCY performs better than the current state-of-the-art video bit rate algorithms. The results show the benefits of using NANCY for different QoE metrics. Future work includes the performance evaluation of NANCY beyond end-to-end topologies, for video multicast and broadcast scenarios. Furthermore, the study will be extended to evaluate NANCY under controlled network conditions to understand the impact of different network parameters, including bandwidth, delay, and random and bursty packet losses.

ACKNOWLEDGMENT

This work has been supported by TCS foundation under the TCS research scholar program and SERB, DST, Government of India's start-up research grant agreement SRG/2019/002027 (MUT-DROCO).

REFERENCES

[1] Cisco, “Global - 2020 forecast highlights,” Tech. Rep., 2020.
[2] J. Nightingale, P. Salva-Garcia, J. M. A. Calero, and Q. Wang, “5g-qoe: Qoe modelling for ultra-hd video streaming in 5g networks,” IEEE Transactions on Broadcasting, vol. 64, no. 2, pp. 621–634, 2018.
[3] T.-Y. Huang, N. Handigol, B. Heller, N. McKeown, and R. Johari, “Confused, timid, and unstable: picking a video streaming rate is hard,” in Internet measurement conference, 2012, pp. 225–238.
[4] Y. Sun, X. Yin, J. Jiang, V. Sekar, F. Lin, N. Wang, T. Liu, and B. Sinopoli, “Cs2p: Improving video bitrate selection and adaptation with data-driven throughput prediction,” in Proceedings of the 2016 ACM SIGCOMM Conference, 2016, pp. 272–285.
[5] H. Mao, R. Netravali, and M. Alizadeh, “Neural adaptive video streaming with pensieve,” in SIGCOMM. ACM, 2017, pp. 197–210.
[6] Z. Akhtar, Y. S. Nam, R. Govindan, S. Rao, J. Chen, E. Katz-Bassett, B. Ribeiro, J. Zhan, and H. Zhang, “Oboe: Auto-tuning video abr algorithms to network conditions,” in Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication, ser. SIGCOMM '18. New York, NY, USA: Association for Computing Machinery, 2018, pp. 44–58.
[7] R. Bhattacharyya, A. Bura, D. Rengarajan, M. Rumuly, S. Shakkottai, D. Kalathil, R. K. Mok, and A. Dhamdhere, “Qflow: A reinforcement learning approach to high qoe video streaming over wireless networks,” in Proceedings of the Twentieth ACM International Symposium on Mobile Ad Hoc Networking and Computing, 2019, pp. 251–260.
[8] M. Saleem, Y. Saleem, H. Asif, and M. Saleem Mian, “Quality enhanced multimedia content delivery for mobile cloud with deep reinforcement learning,” Wireless Communications and Mobile Computing, 2019.
[9] M. Usman, X. He, M. Xu, and K. M. Lam, “Survey of error concealment techniques: Research directions and open issues,” 2015, pp. 233–238.
[10] S. Yang, R. W. Yeung, and R. Srikant, BATS Codes: Theory and Practice, 2017.
[11] P. Saxena and M. A. Vázquez-Castro, “Dare: Dof-aided random encoding for network coding over lossy line networks,” IEEE Communications Letters, vol. 19, no. 8, pp. 1374–1377, 2015.
[12] D. S. Lun, M. Médard, R. Koetter, and M. Effros, “On coding for reliable communication over packet networks,” Physical Communication, vol. 1, no. 1, pp. 3–20, 2008.
[13] R. S. Sutton and A. G. Barto, “Reinforcement learning: An introduction,” 2011.
[14] X. Yin, A. Jindal, V. Sekar, and B. Sinopoli, “A control-theoretic approach for dynamic adaptive video streaming over http,” in SIGCOMM, S. Uhlig, O. Maennel, B. Karp, and J. Padhye, Eds. ACM, 2015, pp. 325–338.
[15] K. Nguyen, T. Nguyen, and S.-C. Cheung, “Video streaming with network coding,” Journal of Signal Processing Systems, vol. 59, no. 3, pp. 319–333, 2010.
[16] M. Esmaeilzadeh, P. Sadeghi, and N. Aboutorab, “Random linear network coding for wireless layered video broadcast: General design methods for adaptive feedback-free transmission,” IEEE Transactions on Communications, vol. 65, no. 2, pp. 790–805, 2016.
[17] M. A. Pimentel-Niño, P. Saxena, and M. A. Vazquez Castro, “Qoe driven adaptive video with overlapping network coding for best effort erasure satellite links,” 2013, p. 5668.
[18] D. Vukobratovic, A. Tassi, S. Delic, and C. Khirallah, “Random linear network coding for 5g mobile video delivery,” Information, vol. 9, no. 4, p. 72, 2018.
[19] T. Matsumine, T. Koike-Akino, and Y. Wang, “Deep learning-based constellation optimization for physical network coding in two-way relay networks,” in ICC 2019 - 2019 IEEE International Conference on Communications (ICC), 2019, pp. 1–6.
[20] J. Mendoza-Almanza and F. de Asís López-Fuentes, “Optimal network coding based on machine learning methods for collaborative networks,” 2019, pp. 1598–1603.
[21] M. Jabbarihagh and F. Lahouti, “A decentralized approach to network coding based on learning,” 2007, pp. 1–5.
[22] D. Nguyen, C. Nguyen, T. Duong-Ba, H. Nguyen, A. Nguyen, and T. Tran, “Joint network coding and machine learning for error-prone wireless broadcast,” 2017, pp. 1–7.
[23] T. Stockhammer, “Dynamic adaptive streaming over http –: Standards and design principles,” in Proceedings of the Second Annual ACM Conference on Multimedia Systems, ser. MMSys '11. New York, NY, USA: Association for Computing Machinery, 2011, pp. 133–144.
[24] V. Mnih, A. P. Badia, M. Mirza, A. Graves, T. Harley, T. P. Lillicrap, D. Silver, and K. Kavukcuoglu, “Asynchronous methods for deep reinforcement learning,” in Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48, ser. ICML'16. JMLR.org, 2016, pp. 1928–1937.
[25] R. Netravali, A. Sivaraman, S. Das, A. Goyal, K. Winstein, J. Mickens, and H. Balakrishnan, “Mahimahi: Accurate Record-and-Replay for HTTP,” in USENIX Annual Technical Conference 2015, Santa Clara, CA, July 2015.
[26] H. Riiser, P. Vigmostad, C. Griwodz, and P. Halvorsen, “Commute path bandwidth traces from 3g networks: Analysis and applications,” in Proceedings of the 4th ACM Multimedia Systems Conference, ser. MMSys '13. New York, NY, USA: Association for Computing Machinery, 2013, pp. 114–118.
[27] K. Spiteri, R. Urgaonkar, and R. K. Sitaraman, “Bola: Near-optimal bitrate adaptation for online videos,” in The 35th Annual IEEE INFOCOM, 2016, pp. 1–9.
[28] A. Bentaleb, B. Taani, A. C. Begen, C. Timmerer, and R. Zimmermann, “A survey on bitrate adaptation schemes for streaming media over http,”