A Hybrid Control Scheme for Adaptive Live Streaming
Huan Peng
Communication University of [email protected]
Yuan Zhang
Communication University of [email protected]
Yongbei Yang
Communication University of [email protected]
Jinyao Yan
Communication University of [email protected]
ABSTRACT
Live streaming is more challenging than on-demand streaming because low latency is a strong requirement in addition to the trade-off between video quality and playback jitter. To balance several inherently conflicting performance metrics and improve the overall quality of experience (QoE), many adaptation schemes have been proposed. Bitrate adaptation is one of the major solutions for video streaming under time-varying network conditions, and it works even better when combined with latency control methods such as adaptive playback rate control and frame dropping. However, designing an algorithm that combines these adaptation schemes remains a challenging problem. To tackle this problem, we propose a hybrid control scheme for adaptive live streaming, namely HYSA, based on heuristic playback rate control, latency-constrained bitrate control and QoE-oriented adaptive frame dropping. The proposed scheme utilizes Kaufman's Adaptive Moving Average (KAMA) to predict segment bitrates for better rate decisions. Extensive simulations demonstrate that HYSA outperforms most of the existing adaptation schemes on overall QoE.
CCS CONCEPTS
• Information systems → Multimedia streaming; Information systems applications; Multimedia information systems.
KEYWORDS
live streaming; bitrate adaptation; playback rate control; frame dropping
ACM Reference Format:
Huan Peng, Yuan Zhang, Yongbei Yang, and Jinyao Yan. 2019. A Hybrid Control Scheme for Adaptive Live Streaming. In Proceedings of the 27th ACM International Conference on Multimedia (MM ’19), October 21–25, 2019, Nice, France. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3343031.3356049
MM ’19, October 21–25, 2019, Nice, France. © 2019 Association for Computing Machinery. ACM ISBN 978-1-4503-6889-6/19/10...$15.00. https://doi.org/10.1145/3343031.3356049
Recent years have seen tremendous growth of live streaming applications. Unlike on-demand streaming, live streaming has tight latency constraints, and it is very challenging to reduce latency while maintaining high video quality and smooth playback. Bitrate adaptation is the most common solution for improving the QoE of video streaming under time-varying network conditions. However, existing adaptive bitrate algorithms, such as BOLA [6], RobustMPC [8], Pensieve [3] and Oboe [1], do not take latency into account. In addition to adapting the video bitrate, playback rate control and frame dropping are often used to reduce the latency in live streaming. Li et al. [2] employed playback rate adaptation, while Miller et al. [4] and Shen et al. [5] adopted frame dropping to reduce the latency. However, these methods cannot balance latency against video quality, because they do not consider all aspects of QoE.

In this paper, we propose HYSA, an effective hybrid control scheme that combines playback rate adaptation, bitrate adaptation and frame dropping adaptation. First, the playback rate is adaptively adjusted with a buffer-based heuristic method. Then, using the playback rate decisions and KAMA-based predicted segment bitrates, the latency-constrained bitrate adaptation scheme makes the bitrate decision, which in turn feeds the QoE-oriented frame dropping adaptation scheme in the next step. Our extensive simulation results demonstrate that the proposed HYSA outperforms existing adaptation schemes on the overall QoE.

The rest of the paper is organized as follows. Section 2 introduces the system framework in the live streaming scenario and the simulation platform. Section 3 describes the details of the proposed hybrid control scheme. In Section 4, the performance of our method is evaluated comprehensively. Section 5 concludes this paper.
Figure 1 depicts a typical live streaming scenario. Video frames generated in real time are uploaded to a transcoding server, which re-encodes the video into multiple representations, each at a different bitrate. These representations are then transmitted to Content Delivery Network (CDN) nodes, which act as edge servers. The client decides which representation to download from one of the CDN nodes based on state information such as buffer occupancy and throughput. In addition, the client can adaptively adjust its playback rate and skip some frames to reduce the latency.

The simulator emulates the downloading of video frames under various network conditions and the adaptive playback of a player. It takes a video trace, a network trace and the decisions from the hybrid control scheme as inputs. The video trace records the sizes of video frames and their arrival times at the CDN, while the network trace gives the throughput of the download link. The simulator collects observations after downloading every frame, and the control scheme makes its decisions after a complete group of pictures (GOP) has been downloaded, using these observations. Hereinafter we use "segment" to refer to a GOP.

Figure 1: Architecture of the live streaming system and the simulator

In this section, we describe the details of the proposed hybrid control scheme, HYSA, which consists of a segment bitrate prediction module, a playback rate control module, a bitrate control module and a frame dropping control module, as shown in Figure 2.
Many studies have highlighted the critical role that QoE plays in the design of adaptation schemes. Here we refer to the QoE model specified by the grand challenge in ACM MM 2019, which focuses on five performance metrics: video quality, rebuffering, latency, frame skipping and quality switching. Their impacts on QoE are denoted by $QoE_{quality}$, $QoE_{rebuf}$, $QoE_{latency}$, $QoE_{skip}$ and $QoE_{switch}$, respectively. The overall QoE is calculated as follows:

$$QoE = QoE_{quality} + QoE_{rebuf} + QoE_{latency} + QoE_{skip} + QoE_{switch} = \sum_{k=1}^{K} \left( p_q V_k d_f - p_r t^r_k - p_l l_k - p_s t^s_k - p_w |V_k - V_{k-1}| \right) \tag{1}$$

where $K$ is the total number of frames. The coding bitrate of frame $k$ is denoted by $V_k$, and $d_f$ is its length. $t^r_k$ and $t^s_k$ denote the rebuffering duration and the video length skipped when downloading frame $k$, respectively, and $l_k$ is the latency. $p_q$, $p_r$, $p_l$, $p_s$ and $p_w$ are weight factors describing the importance of the corresponding QoE metrics. $|V_k - V_{k-1}|$ describes the quality variation between two adjacent video frames.

Figure 2: Overview of the proposed hybrid control scheme

The next download duration can be estimated if the next segment's actual bitrate is known. However, in live streaming it is not realistic to obtain information about video that has not yet been generated. Therefore, most existing algorithms estimate the next download duration using the coding bitrate of the upcoming segment instead of its actual bitrate, ignoring the fact that a segment's actual bitrate varies significantly for a given coding bitrate, as indicated by Figure 3. We also observe that, for segments of two videos with the same content but different qualities, the ratios of their actual bitrates follow a trend similar to the ratios of their coding bitrates.

Assume that each segment is encoded at $M$ different bitrates, and let $V_{n,m}$ and $R_{n,m}$ be the coding bitrate and the actual bitrate of segment $n$ at quality level $m$, satisfying $V_{n,m_1} < V_{n,m_2}$ and $R_{n,m_1} < R_{n,m_2}$, $\forall m_1 < m_2$. For the $n$-th segment, already downloaded at quality $q_n$, we can estimate its actual bitrates at the other quality levels based on the above observation:

$$R_{n,m} \approx \frac{V_{n,m}}{V_{n,q_n}} R_{n,q_n}, \quad m \neq q_n \text{ and } m \in [1, M] \tag{2}$$

Here we employ Kaufman's Adaptive Moving Average (KAMA) to predict the actual bitrates of the upcoming segment.
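The prediction step can be illustrated with a minimal sketch that combines the cross-quality scaling of Eq. (2) with a standard KAMA update, which is formalized in Eqs. (3)-(7). The function names and the default window parameters (`n_window`, `l_min`, `l_max`) are assumptions chosen for illustration; the paper does not state the values it uses.

```python
from typing import Sequence


def scale_bitrate(actual_at_q: float, coding_q: float, coding_m: float) -> float:
    """Eq. (2): estimate the actual bitrate at level m from the downloaded level q."""
    return coding_m / coding_q * actual_at_q


def kama_next(prev_estimate: float, history: Sequence[float],
              n_window: int = 10, l_min: int = 2, l_max: int = 30) -> float:
    """One KAMA step: blend the previous estimate with the newest sample.

    history holds the observed (or Eq. (2)-scaled) bitrates of one quality
    level, oldest first. The default parameters are illustrative assumptions.
    """
    if len(history) < n_window + 1:
        raise ValueError("need at least n_window + 1 samples")
    sc_fastest = 2.0 / (l_min + 1)   # EMA smoothing constant, fastest bound
    sc_slowest = 2.0 / (l_max + 1)   # EMA smoothing constant, slowest bound
    recent = history[-(n_window + 1):]
    # Efficiency ratio: net change divided by the sum of step-by-step changes.
    change = abs(recent[-1] - recent[0])
    volatility = sum(abs(recent[i + 1] - recent[i]) for i in range(n_window))
    er = change / volatility if volatility > 0 else 0.0
    # Standard KAMA squares the interpolated smoothing constant.
    sc = (er * (sc_fastest - sc_slowest) + sc_slowest) ** 2
    return (1.0 - sc) * prev_estimate + sc * recent[-1]
```

A steadily trending history yields an efficiency ratio near 1, so the estimate tracks the newest sample quickly; a noisy history yields a ratio near 0, so the estimate barely moves.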
The bitrate of the next segment $n+1$ at quality level $m$, denoted as $\hat{R}_{n+1,m}$, can be predicted as follows:

$$\hat{R}_{n+1,m} = (1 - SC_n)\,\hat{R}_{n,m} + SC_n R_{n,m} \tag{3}$$

The smoothing factor $SC_n$ is dynamically calculated for every sample, i.e., every segment bitrate. To obtain it, we first set two boundaries, based on the way the smoothing factor is calculated in the Exponential Moving Average (EMA):

$$SC_{slowest} = \frac{2}{l_{max} + 1} \tag{4}$$

$$SC_{fastest} = \frac{2}{l_{min} + 1} \tag{5}$$

where $l_{max}$ and $l_{min}$ are the numbers of samples for the slowest and the fastest EMA, respectively. Then we calculate the efficiency ratio $ER_n$, which measures the efficiency of the sample fluctuations, i.e., the net change relative to the total variation:

$$ER_n = \frac{|R_{n,m} - R_{n-N,m}|}{\sum_{i=0}^{N-1} |R_{n-i,m} - R_{n-i-1,m}|} \tag{6}$$

where $N$ specifies the number of samples used for calculating $ER_n$, and $ER_n$ is always between 0 and 1. Using $ER_n$ and the two boundaries, $SC_n$ can be derived as:

$$SC_n = \left[ ER_n (SC_{fastest} - SC_{slowest}) + SC_{slowest} \right]^2 \tag{7}$$

Figure 3: Actual bitrates of segments at variable coding bitrates

In the playback rate control module, the target buffer is introduced to help control the playback rate $\gamma$. The target buffer is a triple $[B_{min}, B_{target}, B_{max}]$, where $B_{min}$ and $B_{max}$ bound the buffer interval in which the player plays buffered video at the normal playback rate, and $B_{target}$ is the buffer occupancy required to resume playback after an interruption. When the buffer occupancy is below $B_{min}$, the player slows the playback rate down to 0.95; when the buffer occupancy is above $B_{max}$, it speeds the playback rate up to 1.05. In the grand challenge, the target buffer can only be set to 0 or 1, i.e., $[B^0_{min}, B^0_{target}, B^0_{max}]$ or $[B^1_{min}, B^1_{target}, B^1_{max}]$, satisfying $B^0_{min} < B^1_{min} < B^0_{target} < B^1_{target} < B^0_{max} < B^1_{max}$.

The heuristic playback rate control module decides which target buffer to choose depending solely on the current buffer occupancy $B_n$. Five cases are considered:

Case 1: $B_n < B^0_{min}$, which means there is a substantial risk of interruptions, and $\gamma = 0.95$ whichever target buffer is chosen. To restart playback as soon as possible, the target buffer is set to 0 because of its smaller $B_{target}$.
Case 2: $B_n \in [B^0_{min}, B^1_{min})$, which means stalls may still be encountered, although they are less likely than in Case 1. Therefore, the target buffer is set to 1 to make the player slow the playback rate down to 0.95.
Case 3: $B_n \in [B^1_{min}, B^0_{max})$, which means the buffer occupancy remains in a reasonable interval, and $\gamma = 1.0$.
Case 4: $B_n \in [B^0_{max}, B^1_{max})$, which means the latency is relatively large due to buffered video. Therefore, the target buffer is set to 0 to speed the playback rate up to 1.05.
Case 5: $B_n \geq B^1_{max}$, which means the latency caused by buffered video is large, and $\gamma = 1.05$ whichever target buffer is chosen.

Based on the discussion above, the playback rate can be adjusted through the target buffer as follows:

$$target\_buffer_{n+1} = \begin{cases} 1, & \text{if } B_n \in [B^0_{min}, B^0_{max}) \\ 0, & \text{otherwise} \end{cases} \tag{8}$$

$$\gamma_{n+1} = \begin{cases} 0.95, & \text{if } B_n \in [0, B^1_{min}) \\ 1.0, & \text{if } B_n \in [B^1_{min}, B^0_{max}) \\ 1.05, & \text{otherwise} \end{cases} \tag{9}$$

The latency-constrained bitrate control module makes bitrate decisions based on state information such as the buffer occupancy, the predicted segment bitrate and the playback rate derived from the playback rate control module. The optimal bitrate is selected to minimize $D_{n+1}$, the estimated latency after downloading the next segment.

Denoting the segment length by $d$, the duration of downloading the upcoming segment $n+1$ at quality $m$, denoted by $T_{n+1}$, can be calculated from the predicted segment bitrate $\hat{R}_{n+1,m}$ and the estimated network throughput $\hat{C}_{n+1}$, which is derived by a Weighted Moving Average:

$$T_{n+1} = \frac{\hat{R}_{n+1,m}\, d}{\hat{C}_{n+1}} \tag{10}$$

During the downloading process, the player consumes buffered video at playback rate $\gamma_{n+1}$ as long as the buffer is not drained.
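As a rough illustration, the playback rate rule of Eqs. (8)-(9) and the latency-constrained search formalized in Eqs. (10)-(14) can be sketched as below. All names are hypothetical; the linear weights of the moving average and the `None` fallback when no quality satisfies the buffer constraint are assumptions, since the paper specifies neither. Here `cdn_backlog` stands for $(N\_nst_n - N\_dld_n) d_f$ and `accum_speed` for $\hat{v}_{n+1}$ of Eq. (12).

```python
from typing import Optional, Sequence, Tuple


def playback_control(b_n: float, b_min0: float, b_min1: float,
                     b_max0: float) -> Tuple[int, float]:
    """Eqs. (8)-(9): return (target_buffer, playback_rate) for occupancy b_n."""
    target_buffer = 1 if b_min0 <= b_n < b_max0 else 0
    if b_n < b_min1:
        gamma = 0.95
    elif b_n < b_max0:
        gamma = 1.0
    else:
        gamma = 1.05
    return target_buffer, gamma


def weighted_moving_average(samples: Sequence[float]) -> float:
    """Linearly weighted mean; the newest sample weighs most (assumed weights)."""
    weights = range(1, len(samples) + 1)
    return sum(w * s for w, s in zip(weights, samples)) / sum(weights)


def choose_quality(pred_bitrates: Sequence[float],
                   throughput_hist: Sequence[float],
                   b_n: float, gamma: float, seg_len: float,
                   cdn_backlog: float, accum_speed: float,
                   b_th: float) -> Optional[int]:
    """Exhaustive search for the lowest-latency quality with B_{n+1} > b_th."""
    c_hat = weighted_moving_average(throughput_hist)
    best, best_latency = None, float("inf")
    for m, r_hat in enumerate(pred_bitrates):
        t_next = r_hat * seg_len / c_hat                                 # Eq. (10)
        b_next = max(b_n + seg_len - gamma * t_next, 0.0)                # Eq. (11)
        d_cdn = max(cdn_backlog + accum_speed * t_next - seg_len, 0.0)   # Eq. (13)
        latency = b_next + d_cdn                                         # Eq. (14)
        if b_next > b_th and latency < best_latency:
            best, best_latency = m, latency
    return best  # None means no quality met the constraint (assumed fallback)
```

Because the set of bitrates is small and finite, the exhaustive loop is cheap, which is the design choice the optimization in Eq. (14) relies on.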
The buffer occupancy after downloading, namely $B_{n+1}$, can be calculated as:

$$B_{n+1} = \max\left[ B_n + d - \gamma_{n+1} T_{n+1},\ 0 \right] \tag{11}$$

To estimate the latency caused by video accumulated at the CDN after downloading the next segment, we start by estimating the video accumulation speed at the CDN, denoted by $\hat{v}_{n+1}$:

$$\hat{v}_{n+1} = \beta v_n = \beta\, \frac{N\_nst_n - N\_nst_{n-1}}{T_n}\, d_f \tag{12}$$

where $\beta$ is a predictive factor, and $N\_nst_n$ is the index of the latest frame at the CDN after downloading the $n$-th segment. Then, the latency caused by accumulated video at the CDN after the next downloading interval, namely $D\_cdn_{n+1}$, is calculated by:

$$D\_cdn_{n+1} = \max\left[ (N\_nst_n - N\_dld_n)\, d_f + \hat{v}_{n+1} T_{n+1} - d,\ 0 \right] \tag{13}$$

where $N\_dld_n$ represents the index of the most recently downloaded frame. The objective of the bitrate adaptation algorithm is to find the quality that results in the lowest latency without interruptions, which can be described as the following optimization problem:

$$\text{Minimize } D_{n+1} = B_{n+1} + D\_cdn_{n+1} \quad \text{subject to } B_{n+1} > B_{th} \tag{14}$$

where $B_{th}$ is a warning threshold indicating an upcoming stall event. Since there are finitely many available bitrates, we can obtain the optimal quality level $quality_{n+1}$ by going through all choices.

The client can reduce the latency by dropping some frames when the current latency is above a specific threshold. We propose a QoE-oriented frame dropping method that adaptively adjusts the latency threshold that triggers frame skipping. Assuming that the client skips $N$ frames when the latency is above $l_{n+1}$ during the next segment's downloading, the positive and negative impacts of frame skipping on QoE compared to not skipping, denoted by $QoE_p$ and $QoE_n$, can be estimated based on the given QoE model:

$$QoE_n = p_q V_{n+1,\,quality_{n+1}}\, d_f N + p_s d_f N \tag{15}$$

$$QoE_p = p_d\, \lambda\, l_{n+1} N \tag{16}$$

Here, $p_d$ weights the latency term (it plays the role of $p_l$ in Eq. (1)), and $\lambda l_{n+1}$ estimates the average latency if frame skipping is not performed when the latency is above $l_{n+1}$.
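The trade-off between Eqs. (15) and (16) can be checked numerically with a small sketch. Setting $QoE_p = QoE_n$ and cancelling $N$ gives a break-even latency above which skipping pays off. The function names and any weight values passed to them are hypothetical; the actual weights come from the grand challenge QoE model and are not listed here.

```python
from typing import Tuple


def skip_impacts(p_q: float, p_s: float, p_d: float, v_next: float,
                 d_f: float, n_skip: int, lam: float,
                 latency: float) -> Tuple[float, float]:
    """Return (QoE_n, QoE_p): the loss and the gain of skipping n_skip frames."""
    qoe_n = p_q * v_next * d_f * n_skip + p_s * d_f * n_skip   # Eq. (15)
    qoe_p = p_d * lam * latency * n_skip                       # Eq. (16)
    return qoe_n, qoe_p


def break_even_latency(p_q: float, p_s: float, p_d: float,
                       v_next: float, d_f: float, lam: float) -> float:
    """Latency at which QoE_p equals QoE_n; n_skip cancels out of both sides."""
    return (p_q * v_next + p_s) * d_f / (p_d * lam)
```

Since the number of skipped frames cancels, the threshold depends only on the chosen bitrate, the frame length and the weights, so it can be recomputed once per segment after the bitrate decision.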
Therefore, when $QoE_p$ is larger than $QoE_n$, frame skipping is a good choice:

$$p_d\, \lambda\, l_{n+1} N > p_q V_{n+1,\,quality_{n+1}}\, d_f N + p_s d_f N \tag{17}$$

$$l_{n+1} > \frac{(p_q V_{n+1,\,quality_{n+1}} + p_s)\, d_f}{p_d\, \lambda} \tag{18}$$

Thus, the latency threshold $latency\_limit_{n+1}$ that triggers frame skipping when downloading the next segment can be set as:

$$latency\_limit_{n+1} = \frac{(p_q V_{n+1,\,quality_{n+1}} + p_s)\, d_f}{p_d\, \lambda} \tag{19}$$

In this section, we present extensive evaluations that answer the following questions: (1) Is the segment bitrate prediction helpful? (2) How does HYSA compare to existing adaptation schemes?

To evaluate the performance of HYSA for streaming different types of videos, video traces from three scenes (room, game and sports) are used. Each video is encoded at four bitrates. In addition, our evaluations use 140 network traces sampled from real network scenarios. The average bandwidth of these network traces ranges from 0.8 Mbps to 2.5 Mbps, while the variance ranges from 0.1 Mbps to 2.0 Mbps.

First, experiments are conducted to evaluate the accuracy of the KAMA-based segment bitrate prediction by calculating the prediction error $\frac{|PredictedBitrate - ActualBitrate|}{ActualBitrate}$. The results show that KAMA reduces the prediction error to 0.22, compared with a prediction error of 0.258 when the segment's coding bitrate is used directly as the prediction.

Table 1: Performance comparison of the adaptation schemes

| Method | $QoE_{overall}$ | $QoE_{quality}$ | $QoE_{rebuf}$ | $QoE_{latency}$ | $QoE_{skip}$ | $QoE_{switch}$ |
|---|---|---|---|---|---|---|
| HYSA | 2424.04 | 3548.00 | -418.92 | -608.32 | -86.28 | -10.44 |
| HYSA-N | 2336.33 | 3398.36 | -398.11 | -571.27 | -82.28 | -10.37 |
| MPC | 2000.44 | 3056.23 | -255.98 | -759.17 | -30.23 | -10.41 |
| DTTB | 2038.34 | 3472.63 | -410.20 | -952.57 | -63.72 | -7.80 |

Figure 4: Overall QoE when streaming three types of videos: (a) Sports Video, (b) Room Video, (c) Game Video

Then, we compare HYSA to the following adaptation schemes using the simulator described previously: (1) HYSA-N: our baseline scheme, in which segment bitrate prediction is not included. (2) MPC [8]: uses buffer occupancy and throughput predictions to select the bitrate that maximizes a given QoE metric over a horizon of five future chunks. (3) DTTB [7]: selects the video bitrate based on a dynamic buffer threshold adapted according to the estimated throughput. Table 1 provides the average values of the overall QoE and the other QoE metrics that each scheme achieves over all network traces and video traces.
Figure 4 gives the Cumulative Distribution Function (CDF) of the overall QoE across the network traces when streaming the three types of videos. From these evaluations, we draw two conclusions. First, the average overall QoE is higher with the KAMA-based segment bitrate predictions than with the coding bitrates (HYSA versus HYSA-N in Table 1), which demonstrates that the KAMA-based segment bitrate prediction contributes to better decisions. Second, the proposed hybrid control scheme outperforms the other schemes with respect to overall QoE when streaming video in different scenes under various networks, because it tends to choose higher quality to improve bandwidth utilization, as illustrated by the $QoE_{quality}$ column of Table 1.
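As a quick consistency check on Table 1, Eq. (1) implies that each overall score should equal the sum of its five components. The sketch below uses only the values reported in Table 1; the helper names are illustrative.

```python
# Rows of Table 1: (overall, quality, rebuf, latency, skip, switch).
TABLE_1 = {
    "HYSA":   (2424.04, 3548.00, -418.92, -608.32, -86.28, -10.44),
    "HYSA-N": (2336.33, 3398.36, -398.11, -571.27, -82.28, -10.37),
    "MPC":    (2000.44, 3056.23, -255.98, -759.17, -30.23, -10.41),
    "DTTB":   (2038.34, 3472.63, -410.20, -952.57, -63.72, -7.80),
}


def component_gap(row):
    """Absolute difference between the overall QoE and the sum of its parts."""
    return abs(row[0] - sum(row[1:]))


def relative_gain(qoe_a, qoe_b):
    """Relative improvement of overall QoE a over overall QoE b."""
    return qoe_a / qoe_b - 1.0


if __name__ == "__main__":
    for name, row in TABLE_1.items():
        print(f"{name}: gap = {component_gap(row):.4f}")
    for other in ("HYSA-N", "MPC", "DTTB"):
        gain = relative_gain(TABLE_1["HYSA"][0], TABLE_1[other][0])
        print(f"HYSA vs {other}: {gain:.1%}")
```

The gaps stay within rounding error for all four schemes, so the reported components are consistent with the additive QoE model.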
In this paper, we have presented HYSA, an effective hybrid control scheme consisting of heuristic playback rate control, latency-constrained bitrate control and QoE-oriented adaptive frame dropping. Our algorithm adopts Kaufman's Adaptive Moving Average to predict the segment bitrates, which allows the bitrate decisions to be made more accurately. Extensive simulation results have demonstrated that the segment bitrate prediction leads to better decisions, and that HYSA achieves a higher overall QoE than prior state-of-the-art schemes.
ACKNOWLEDGMENTS
This work was supported in part by the National Natural Science Foundation of China (Grant No. 61971382).
REFERENCES
[1] Zahaib Akhtar, Yun Seong Nam, Ramesh Govindan, Sanjay Rao, Jessica Chen, Ethan Katz-Bassett, Bruno Ribeiro, Jibin Zhan, and Hui Zhang. 2018. Oboe: auto-tuning video ABR algorithms to network conditions. In Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication. ACM, 44–58.
[2] Mingfu Li, Chien-Lin Yeh, and Shao-Yu Lu. 2018. Real-time QoE monitoring system for video streaming services with adaptive media playout. International Journal of Digital Multimedia Broadcasting
[3] […] Proceedings of the Conference of the ACM Special Interest Group on Data Communication. ACM, 197–210.
[4] Konstantin Miller, Abdel-Karim Al-Tamimi, and Adam Wolisz. 2017. QoE-based low-delay live streaming using throughput predictions. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 13, 1 (2017), 4.
[5] Yueshi Shen, Ivan Marcin, Josh Tabak, Abhinav Kapoor, Jorge Arturo Villatoro, and Jeff Li. 2018. Buffer reduction using frame dropping. US Patent App. 10/015,224.
[6] Kevin Spiteri, Rahul Urgaonkar, and Ramesh K Sitaraman. 2016. BOLA: Near-optimal bitrate adaptation for online videos. In IEEE INFOCOM 2016-The 35th Annual IEEE International Conference on Computer Communications. IEEE, 1–9.
[7] Lan Xie, Chao Zhou, Xinggong Zhang, and Zongming Guo. 2017. Dynamic threshold based rate adaptation for HTTP live streaming. In . IEEE, 1–4.
[8] Xiaoqi Yin, Abhishek Jindal, Vyas Sekar, and Bruno Sinopoli. 2015. A control-theoretic approach for dynamic adaptive video streaming over HTTP. In