Duration-Squeezing-Aware Communication and Computing for Proactive VR
Xing Wei, Chenyang Yang, and Shengqian Han
School of Electronics and Information Engineering, Beihang University, Beijing 100191, China
Email: {weixing, cyyang, sqhan}@buaa.edu.cn

Abstract—Proactive tile-based virtual reality video streaming computes and delivers the predicted tiles to be requested before playback. All existing works overlook the important fact that the computing and communication (CC) tasks for a segment may squeeze the time for the tasks for the next segment, which will cause less and less available time for the latter segments. In this paper, we jointly optimize the durations for the CC tasks to maximize the completion rate of the CC tasks under the task duration-squeezing-aware constraint. To ensure that the latter segments retain enough time for their tasks, the CC tasks for a segment are not allowed to squeeze the time for computing and delivering the subsequent segment. We find the closed-form optimal solution, from which we identify a minimum-resource-limited, an unconditional, and a conditional resource-tradeoff region, which are determined by the total time for the proactive CC tasks and the playback duration of a segment. Owing to the duration-squeezing-prohibited constraints, increasing the configured resources is not always useful for improving the completion rate of the CC tasks. Numerical results validate the impact of the duration-squeezing-prohibited constraints and illustrate the three regions.
Index Terms—Proactive VR video streaming, computing-communication tradeoff, resource configuration, duration-squeezing-aware constraint
I. INTRODUCTION
Virtual reality (VR) video requires a 360° panoramic view with ultra-high resolution. Delivering such videos is cost-prohibitive for wireless networks. This inspires proactive tile-based streaming [1], [2], which divides a full panoramic-view segment into small tiles in the spatial domain, predicts the future field of view (FoV) of a user, and then renders and transmits the tiles overlapping with the predicted FoVs.

Proactive tile-based VR video streaming contains three tasks: prediction, communication, and computing. Given the predictor and the prediction accuracy required for satisfying the quality of experience (QoE), the total time for rendering and transmitting a segment can be determined [3]–[5]. With such a total time budget, it has been shown in the literature that the communication and computing (CC) resources can be flexibly traded off [6], [7]. For example, when the communication bandwidth is insufficient, one can assign more computing resource for rendering in order to provide a longer time for delivering.

However, all existing works [3], [8]–[11] for proactive tile-based VR video streaming overlook an important fact: the communication and computing tasks for successive segments are coupled in the timeline. Specifically, the communication tasks for the multiple segments in a video form one queue, and the computing tasks form another. Transmitting and computing a segment may squeeze the time for the tasks for the next segment, such that the QoE may degrade owing to the insufficient time left for accomplishing the tasks for latter segments.

In this paper, we investigate how to maximize the performance of proactive tile-based VR video streaming considering the coupled timeline for computing and delivering successive segments. To this end, we jointly optimize the durations for these two tasks to maximize the completion rate of the CC tasks under the duration-squeezing-aware constraint.
When a VR video is long, to ensure that the latter segments retain enough time for their tasks, the CC tasks for a segment are not allowed to squeeze the time for computing and delivering the subsequent segment. We obtain the global optimal solution via the Karush-Kuhn-Tucker (KKT) conditions. To the best of the authors' knowledge, this is the first work that considers the time squeezing of these two tasks in proactive VR streaming.

From the closed-form solution of the optimal durations, we find a minimum-resource-limited, an unconditional, and a conditional resource-tradeoff region. The boundaries of the three regions depend on the relative values of the total time budget for communication and computing and the playback duration of a segment. In practice, these two durations can be very different.

II. SYSTEM MODEL
Consider a proactive tile-based VR video streaming system with a mobile edge computing (MEC) server co-located with a base station (BS). Each VR video consists of L segments in the temporal domain, and each segment consists of M tiles in the spatial domain. The playback duration of each tile equals the playback duration of a segment, denoted by T_seg [1], [2]. Each user is equipped with a head-mounted display (HMD), which can measure the head movement data, send the data to the MEC server, and pre-buffer segments. The MEC server renders a video segment before delivering it to the HMD.
Fig. 1: Proactively streaming the l-th and (l+1)-th segments. (a) Rendering and transmitting pipeline squeeze, Δp > 0, Δm > 0. (b) Transmitting pipeline squeeze, Δp < 0, Δm > 0.

When a user requests a VR video, the MEC server first streams the first l−1 segments in a reactive or a passive mode [15]. When the MEC server collects the information of the user (e.g., the head movement data) in an observation window, proactive streaming for the l-th segment begins; subsequent segments are then predicted, rendered, and transmitted one after another, as shown in Fig. 1a. Specifically, at the end of the observation window for the l-th segment, i.e., B_l, the tiles in the l-th segment to be requested are first predicted, then the predicted tiles are rendered with duration t_cpt, and finally the rendered tiles are transmitted with duration t_com, which should be finished before the start time of playback for the segment, i.e., E_l. Therefore, the total computing and transmission time for the segment is T_cc = E_l − B_l.

To train a predictor for the whole video, T_cc needs to be identical for every segment. A predictor can be more accurate with a smaller value of T_cc. This is because the tiles to be predicted are closer to, and hence more correlated with, the head movement sequence in the observation window [3]. Given a predictor and the required viewport prediction accuracy, the value of T_cc can be determined [3]–[5].

A. Duration-Squeezing-Aware Constraint
With an identical value of T_cc for every segment, we can observe that E_{l+1} − E_l = B_{l+1} − B_l. Without playback stalling, E_{l+1} − E_l = T_seg holds, and thus B_{l+1} − B_l = T_seg. If the rendering for the l-th segment finishes after B_{l+1}, then the computing task will squeeze the time for rendering the (l+1)-th segment. Denote the squeezed computing time as Δp = t_cpt − (B_{l+1} − B_l) = t_cpt − T_seg. If the rendering for the l-th segment can be finished within T_seg, then Δp ≤ 0, and there is no squeeze in rendering, as shown in Fig. 1b.

Similarly, the communication task may also squeeze the time for delivering the (l+1)-th segment. Denote the squeezed communication time as Δm. When Δp > 0, Δm = t_com − t_cpt, as shown in Fig. 1a. When Δp < 0, Δm = t_com − t_cpt − (−Δp), as shown in Fig. 1b. By summarizing the two cases, we obtain Δm = t_com − t_cpt − (−Δp)^+, where (x)^+ ≜ max{x, 0}. When Δm ≤ 0, the transmission can be finished on time and there is no squeeze in the pipeline.

For the l-th segment, which is the first segment with proactive streaming, the transmission and rendering tasks should be finished within T_cc, i.e., t_cpt + t_com ≤ T_cc. For the (l+1)-th segment, the remaining duration for the CC tasks satisfies t_cpt + t_com ≤ T_cc − ((Δp)^+ + (Δm)^+). For the L-th segment, the remaining duration for the CC tasks satisfies

t_cpt + t_com ≤ T_cc − (L − l)((Δp)^+ + (Δm)^+).   (1)

B. Computing and Transmission Model
The computing resource of the MEC server for rendering a VR video can be assigned by allocating graphics processing unit (GPU) and compute unified device architecture (CUDA) cores [3], [16]. To gain useful insight, we assume that the computing resource, denoted as C_total (in floating-point operations per second, FLOPS), is equally allocated among K users. Then, the number of bits that can be rendered per second, referred to as the computing rate, for the k-th user is

C_cpt,k ≜ C_total / (K · μ_r)  (in bit/s),

where μ_r is the required number of floating-point operations (FLOPs) for rendering one bit of FoV, in FLOP/bit [3].

The BS serves K single-antenna users using zero-forcing beamforming over bandwidth B with N_t antennas. The instantaneous data rate at the i-th time slot for the k-th user is

C^i_com,k = B log_2(1 + p_k d_k^{−α} |h̃^i_k|² / σ²),

where h̃^i_k ≜ (h^i_k)^H w^i_k is the equivalent channel gain, p_k and w^i_k are respectively the transmit power and the beamforming vector for the k-th user, d_k and h^i_k ∈ C^{N_t} are respectively the distance and the small-scale channel vector from the BS to the k-th user, α is the path-loss exponent, σ² is the noise power, and (·)^H denotes conjugate transpose.

We consider indoor users as in the literature, where the distances of the users, d_k, usually change only slightly [2], [17], [18] and hence are assumed fixed. Due to the head movement and the variation of the environment, the small-scale channels are time-varying; they are assumed to remain constant within each time slot of duration ΔT and to change independently with identical distribution among time slots. With proactive transmission, the predicted FoVs in a segment should be transmitted within duration t_com. The number of bits transmitted within t_com can be expressed as C̄_com,k · t_com, where C̄_com,k ≜ (1/N_s) Σ_{i=1}^{N_s} C^i_com,k is the time-average transmission rate, and N_s is the number of time slots in t_com.
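To make the rate model above concrete, the following Python sketch (our illustration, not part of the paper; all numerical values below are placeholders) computes the per-user computing rate C_cpt,k, the per-slot transmission rate C^i_com,k, and the time-average rate C̄_com,k:

```python
import math

def computing_rate(C_total, K, mu_r):
    """Computing rate C_cpt,k = C_total / (K * mu_r) in bit/s:
    C_total FLOPS shared equally by K users, mu_r FLOPs per rendered bit."""
    return C_total / (K * mu_r)

def slot_rate(B, p_k, d_k, alpha, h_gain_sq, sigma_sq):
    """Per-slot rate C^i_com,k = B * log2(1 + p_k * d_k^-alpha * |h~|^2 / sigma^2)."""
    snr = p_k * d_k ** (-alpha) * h_gain_sq / sigma_sq
    return B * math.log2(1.0 + snr)

def time_average_rate(slot_rates):
    """Time-average rate over N_s slots: (1/N_s) * sum_i C^i_com,k."""
    return sum(slot_rates) / len(slot_rates)
```

Here `h_gain_sq` stands for the effective channel gain |h̃^i_k|² after zero-forcing; when the durations are optimized, the paper replaces the time average by the ensemble-average rate, since future channels are unknown.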
Since future channels are unknown when optimizing the durations, we use the ensemble-average rate E_h{C^i_com,k} to approximate the time-average rate C̄_com,k, which is very accurate when N_s or N_t/K is large [3]. To ensure fairness among users in terms of QoE, the transmit power is used to compensate for the path loss, i.e., p_k = β d_k^α, where β can be obtained from β Σ_{k=1}^{K} d_k^α = P and P is the maximal transmit power of the BS. Then, the ensemble-average transmission rate is equal for every user.

Without loss of generality, we consider an arbitrary user for analysis in the sequel. For notational simplicity, we use C_com to represent E_h{C^i_com,k} and C_cpt to represent C_cpt,k.

III. DURATION OPTIMIZATION FOR COMPUTING AND COMMUNICATION
To reflect the system performance for rendering and delivering all the predicted FoVs in a segment, define the completion rate of the communication and computing (CC) tasks as

S_cc ≜ min{C̄_com t_com / S_com, C_cpt t_cpt / S_cpt},   (2)

where S_com = s_fov · r_f · T_seg / γ_c and S_cpt = s_fov · r_f · T_seg are respectively the number of bits of all the predicted FoVs in a segment for transmission [19] and for rendering, γ_c is the video compression ratio, r_f (in frames per second) is the frame rate, s_fov ≜ γ_fov R_w R_h b is the number of bits in a FoV, γ_fov is the ratio of the FoV to a frame, R_w and R_h are respectively the width and height of a frame in pixels, and b is the number of bits per pixel, relevant to color depth [19]. By substituting S_com and S_cpt into (2), we obtain

S_cc = min{C̃_com t_com, C_cpt t_cpt} / (s_fov · r_f · T_seg),   (3)

where C̃_com ≜ C_com γ_c is the equivalent transmission rate. If S_cc = 0, the HMD cannot receive any rendered FoV on time, which will cause playout stalls.

The durations for computing and delivering are optimized to maximize the completion rate of the CC tasks, i.e.,

P0: max_{t_cpt, t_com} S_cc   (4a)
s.t. Δp = t_cpt − T_seg,   (4b)
Δm = t_com − t_cpt − (−Δp)^+,   (4c)
t_cpt + t_com ≤ T_cc − (L − l)((Δp)^+ + (Δm)^+).   (4d)

Problem P0 contains four cases, depending on whether or not Δp and Δm exceed zero. When the VR video is long (i.e., L is large), to ensure that every latter segment has time to be rendered and delivered, i.e., that the right-hand side of (4d) is larger than zero, the values of Δp and Δm should be non-positive. That is to say, squeezing either the transmission or the rendering time of the subsequent segment is strictly prohibited. When Δp ≤ 0, we obtain t_cpt ≤ T_seg from (4b). When Δm ≤ 0, by substituting (4b) into (4c), we obtain t_com ≤ T_seg. Then, problem P0 degenerates into

P1: max_{t_cpt, t_com} S_cc   (5a)
s.t.
t_cpt + t_com ≤ T_cc,   (5b)
t_cpt ≤ T_seg,   (5c)
t_com ≤ T_seg.   (5d)

Problem P1 can be transformed into a convex problem. From the KKT conditions, its optimal solution and the maximal value of the objective function of P1 can be obtained as

t*_cpt ∈ [C̃_com T_seg / C_cpt, T_min],   if C̃_com < C_cpt and T_c^max > T_seg,
t*_cpt = T_seg,   if C̃_com ≥ C_cpt and T_c^max > T_seg,   (6a)
t*_cpt = C̃_com T_cc / (C̃_com + C_cpt),   if T_c^max ≤ T_seg,

t*_com = T_seg,   if C̃_com ≤ C_cpt and T_c^max > T_seg,
t*_com ∈ [C_cpt T_seg / C̃_com, T_min],   if C̃_com > C_cpt and T_c^max > T_seg,   (6b)
t*_com = C_cpt T_cc / (C̃_com + C_cpt),   if T_c^max ≤ T_seg,

S*_cc = min{C̃_com, C_cpt} / (s_fov · r_f),   if T_c^max > T_seg,
S*_cc = C̃_com C_cpt T_cc / (s_fov · r_f · T_seg (C̃_com + C_cpt)),   if T_c^max ≤ T_seg,   (6c)

where T_min ≜ min{T_cc − T_seg, T_seg} and

T_c^max ≜ max{C̃_com, C_cpt} T_cc / (C̃_com + C_cpt) = max{t°_cpt, t°_com}.   (7)

Here t°_cpt and t°_com are the optimal durations for computing and communication without the constraints (5c) and (5d), as considered in [3].

IV. MINIMUM-RESOURCE-LIMITED, UNCONDITIONAL AND CONDITIONAL RESOURCE-TRADEOFF REGIONS
In this section, we show that the system may operate in a minimum-resource-limited, an unconditional resource-tradeoff, or a conditional resource-tradeoff region. First, we discuss the two cases in (6c).
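As a sanity check, the closed-form solution (6) and the case boundary (7) can be sketched in Python (our own illustration, not from the paper; rates are in Gbit/s, durations in seconds, and S_cc is normalized so that s_fov · r_f · T_seg = 1; when t*_cpt or t*_com is an interval, the lower end is returned):

```python
def optimal_durations(C_com_eq, C_cpt, T_cc, T_seg):
    """Closed-form solution of P1 in (6); C_com_eq is the equivalent
    transmission rate ~C_com. Returns (t_cpt*, t_com*)."""
    T_c_max = max(C_com_eq, C_cpt) * T_cc / (C_com_eq + C_cpt)  # eq. (7)
    if T_c_max <= T_seg:
        # Case 2: the unconstrained optimum already satisfies (5c), (5d)
        return (C_com_eq * T_cc / (C_com_eq + C_cpt),
                C_cpt * T_cc / (C_com_eq + C_cpt))
    if C_com_eq < C_cpt:
        # Case 1, limited by the transmission rate: t_com hits T_seg
        return (C_com_eq * T_seg / C_cpt, T_seg)
    # Case 1, limited by the computing rate: t_cpt hits T_seg
    return (T_seg, C_cpt * T_seg / C_com_eq)

def completion_rate(t_cpt, t_com, C_com_eq, C_cpt):
    """Normalized S_cc in (3): min{~C_com * t_com, C_cpt * t_cpt}."""
    return min(C_com_eq * t_com, C_cpt * t_cpt)
```

For example, with C̃_com = 0.9, C_cpt = 0.4, T_cc = 1.5, and T_seg = 1, eq. (7) gives T_c^max ≈ 1.04 > T_seg, so the sketch returns the computing-limited solution t*_cpt = T_seg and t*_com = C_cpt T_seg / C̃_com, matching the second row of (6a) and (6b).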
Case 1 (T_c^max > T_seg): If C̃_com > C_cpt, then T_c^max = t°_cpt from (7). Since the allowed maximal duration for rendering is T_seg, as shown in (5c), T_c^max > T_seg indicates that t°_cpt exceeds the allowed rendering duration. This suggests that the completion rate of the CC tasks is limited by the computing rate, where increasing the other type of resource, C̃_com, is useless for improving the system performance. Similarly, if C̃_com < C_cpt, then T_c^max = t°_com and the system performance is limited by the transmission rate. We refer to this case as the "minimum-resource-limited case", where the efficient resource configuration should satisfy C̃_com = C_cpt. We refer to a resource configuration as "efficient" when the decrease of any one type of resource in the configuration would reduce the value of S*_cc.

Case 2 (T_c^max ≤ T_seg): Both t°_cpt and t°_com satisfy the duration-squeezing-prohibited constraints in (5c) and (5d). In this case, increasing either the computing rate or the transmission rate can improve the completion rate of the CC tasks. This indicates a tradeoff between the computing rate and the transmission rate [3]. We refer to this case as the "resource-tradeoff case", where the resource configuration is flexible.

However, the boundary between the two cases depends on T_c^max, which further depends on C̃_com and C_cpt, as shown in (7). To provide useful insight into the resource configuration, we identify three regions in the following, which are independent of the configured resources. According to (7), we have

T_cc > T_c^max ≥ T_cc / 2.   (8)

Minimum-resource-limited region: If T_cc > 2T_seg, then with T_c^max ≥ T_cc/2 we have T_c^max > T_seg, i.e., Case 1 holds.
Unconditional resource-tradeoff region: If T_cc ≤ T_seg, then with T_c^max < T_cc we have T_c^max < T_seg, which is a sufficient condition for Case 2.
Conditional resource-tradeoff region: If T_cc ∈ (T_seg, 2T_seg], then considering that max{C̃_com, C_cpt}/(C̃_com + C_cpt) ∈ [1/2, 1), we obtain T_c^max ∈ (T_seg/2, 2T_seg). The system may operate in Case 1 or Case 2. If T_c^max ≤ T_seg, then the system lies in Case 2. If T_c^max > T_seg, then the system lies in Case 1, where the efficient resource configuration is C̃_com = C_cpt, and from (7) we then have T_c^max = T_cc/2. Further considering one boundary of the region, T_cc ≤ 2T_seg, we obtain T_c^max ≤ T_seg, which is the condition of Case 2 and can also be rewritten as the condition for the efficient resource configuration, max{C̃_com, C_cpt}/(C̃_com + C_cpt) ≤ T_seg/T_cc. That is to say, in this region, even if the system starts in Case 1, the efficient resource configuration can transform it into Case 2, i.e., the resource-tradeoff case.

V. NUMERICAL RESULTS
In this section, we validate the obtained analytical results and evaluate the performance of the optimized durations. We consider a VR video with 4K resolution (3840 × 2160) and b = 12 bits per pixel [19]. The ratio of a FoV to a frame is γ_fov = 0.2 [18]; then the number of bits in a FoV is s_fov = 3840 × 2160 × b × γ_fov = 19.9 Mbits. The frame rate of the VR video is r_f = 30 frames per second [21]. The compression ratio is γ_c = 2. [22]. The playback duration of a segment is T_seg = 1 s [21]. Depending on the configured communication and computing resources as well as the number of users, the computing and transmission rates for a user can be very different. For example, when K = 4, N_t = 8, P = 24 dBm, B = 40 MHz, and d_k = 5 m, the ensemble-average transmission rate for a user is C_com = 0. Gbps [3], and the equivalent transmission rate is C̃_com = C_com γ_c = 1. Gbps. When an Nvidia P40 GPU is used for rendering VR videos for four users, the computing rate for a user is C_cpt = 1. Gbps [3]. To reflect the variation of the configured resources, we vary C̃_com and C_cpt from 0 Gbps upward, unless otherwise specified.

Fig. 2: (a) Minimum-resource-limited region (T_cc > 2T_seg); (b) unconditional resource-tradeoff region (T_cc < T_seg); (c) conditional resource-tradeoff region (T_cc ∈ (T_seg, 2T_seg)).

In Fig. 2, we illustrate the three regions. As shown in Fig. 2a, if C̃_com ≠ C_cpt, then the system performance is
We can observe that if the system isresource-limited, say P in the figure, no matter if we increasethe computing rate or reduce the transmission rate in order tosatisfy the condition for efficient resource configuration (i.e., max { C com ,C cpt } C com + C cpt ≤ T seg T cc ), the system will finally fall into theresource-tradeoff case.In Fig. 3, we verify the necessity of imposing the duration-squeezing-prohibited constraints by taking the value of S cc over the first four proactively streamed segments as an example(the results for other values of ˜ C com and C cpt are similarwhenever the difference between the two values are morethan 500). We compare the optimal durations in (6) with twobaseline schemes without considering the duration-squeezing-prohibited (SP) constraints. One is the optimal solution ofproblem P1 without the SP constraints in (5c) and (5d), where t com = t o com and t cpt = t o cpt , with legend “opt duration w/oSP”. The other scheme fixes the durations as t com = T cc , withlegend “1:1 duration”. As expected, the optimal durations yieldthe best performance from the ( l +1)th segment.When T cc < T seg as shown in Fig. 3a, the optimal durationsachieve the same performance as the baseline “opt durationw/o SP”, because T cc ≤ T seg is the sufficient condition of Case l+ l+ l+ (a) T cc < T seg ( T cc = 0 . s) l l+ l+ l+ (b) T cc ∈ ( T seg , T seg )( T cc = 1 . s)(c) T cc ∈ ( T seg , T seg )( T cc = 1 . s) l l+ l+ l+ (d) T cc > T seg ( T cc = 2 . s) Fig. 3: S cc and MTP latency v.s. segment index, ˜ C com = 900 Mbps and C cpt = 400 Mbps. . When Case 2 holds, t o com , t o cpt ≤ T seg , i.e., the transmittingand computing with “opt duration w/o SP” will not causethe squeeze. These two schemes outperform the scheme “1:1duration”, which shows the gain of matching the imbalancedcomputing rate and transmission rate.When T seg < T cc < T seg as shown in Fig. 
3b, although "opt duration w/o SP" slightly outperforms the optimal durations for the l-th segment, the completion rate of the CC tasks of this baseline degrades to zero and stalling happens for the (l+1)-th segment. This is because T_c^max = max{C̃_com, C_cpt} T_cc/(C̃_com + C_cpt) > T_seg, i.e., Case 1 holds, where either the transmitting or the computing of this baseline for the l-th segment squeezes the duration for the (l+1)-th segment, which causes the playback stalling, as visualized in Fig. 3c. For the three schemes, the motion-to-photon (MTP) latency of the (l+n)-th segment can be expressed as T_MTP = [t_com + t_cpt − (n − 1)((Δp)^+ + (Δm)^+)]^+.

When T_cc > 2T_seg, as shown in Fig. 3d, the squeeze is unavoidable for the two baselines. This shows the necessity of imposing the duration-squeezing-prohibited constraints.

VI. CONCLUSION
In this paper, we investigated maximizing the completion rate of the CC tasks under the task duration-squeezing-aware constraint in proactive VR streaming. From the obtained closed-form solution, we found the minimum-resource-limited, unconditional, and conditional resource-tradeoff regions. The boundaries of the three regions depend on the relation between the total time budget for proactive communication and computing and the playback duration of a segment. In the minimum-resource-limited region, the communication and computing resources cannot be traded off. In the unconditional resource-tradeoff region, the resources can be flexibly configured, while in the conditional resource-tradeoff region, the efficient configuration should satisfy a condition. Numerical results validated the necessity of imposing the duration-squeezing-prohibited constraints and illustrated these regions.

REFERENCES

[1] F. Qian, L. Ji, B. Han, and V. Gopalakrishnan, "Optimizing 360 video delivery over cellular networks," ACM SIGCOMM Workshop, 2015.
[2] C.-L. Fan, W.-C. Lo, Y.-T. Pai, and C.-H. Hsu, "A survey on 360° video streaming: Acquisition, transmission, and display," ACM Comput. Surv., vol. 52, no. 4, Aug. 2019.
[3] X. Wei, C. Yang, and S. Han, "Prediction, communication, and computing duration optimization for VR video streaming," IEEE Trans. Commun., early access, 2020.
[4] C. Li, W. Zhang, Y. Liu, and Y. Wang, "Very long term field of view prediction for 360-degree video streaming," IEEE MIPR, 2019.
[5] C. Fan, S. Yen, C. Huang, and C. Hsu, "Optimizing fixation prediction using recurrent neural networks for 360° video streaming in head-mounted virtual reality," IEEE Trans. Multimedia, vol. 22, no. 3, pp. 744–759, March 2020.
[6] S. Mangiante, G. Klas, A. Navon, Z. GuanHua, J. Ran, and M. D. Silva, "VR is on the edge: How to deliver 360° videos in mobile networks," ACM SIGCOMM, 2017.
[7] S. Gupta, J. Chakareski, and P. Popovski, "Millimeter wave meets edge computing for mobile VR with high-fidelity 8K scalable 360° video," IEEE MMSP, 2019.
[8] F. Guo, F. R. Yu, H. Zhang, H. Ji, V. C. M. Leung, and X. Li, "An adaptive wireless virtual reality framework in future wireless networks: A distributed learning approach," IEEE Trans. Veh. Technol., vol. 69, no. 8, pp. 8514–8528, 2020.
[9] J. Du, F. R. Yu, G. Lu, J. Wang, J. Jiang, and X. Chu, "MEC-assisted immersive VR video streaming over Terahertz wireless networks: A deep reinforcement learning approach," IEEE Internet Things J., vol. 7, no. 10, pp. 9517–9529, 2020.
[10] C. Zheng, S. Liu, Y. Huang, and L. Yang, "MEC-enabled wireless VR video service: A learning-based mixed strategy for energy-latency tradeoff," IEEE WCNC, 2020.
[11] J. Chakareski and S. Gupta, "Multi-connectivity and edge computing for ultra-low-latency lifelike virtual reality," IEEE ICME, 2020.
[12] X. Hou, S. Dey, J. Zhang, and M. Budagavi, "Predictive adaptive streaming to enable mobile 360-degree and VR experiences," IEEE Trans. Multimedia, early access, 2020.
[13] W. Xing and C. Yang, "Tile-based proactive virtual reality streaming via online hierarchial learning," APCC, 2019.
[14] W. Lo, C. Huang, and C. Hsu, "Edge-assisted rendering of 360° videos streamed to head-mounted virtual reality," IEEE ISM, 2018.
[15] 3GPP, "Extended reality (XR) in 5G," 3GPP TR 26.928, version 16.0.0, release 16, 2020.
[16] NVIDIA, "NVIDIA CloudXR cuts the cord for VR, raises the bar for AR," https://blogs.nvidia.com/blog/2020/05/14/cloudxr-sdk.
[17] C. Perfecto, M. S. Elbamby, J. Del Ser, and M. Bennis, "Taming the latency in multi-user VR 360°: A QoE-aware deep learning-aided multicast framework," IEEE Trans. Commun., vol. 68, no. 4, pp. 2491–2508, 2020.
[18] W.-C. Lo, C.-L. Fan, J. Lee, C.-Y. Huang, K.-T. Chen, and C.-H. Hsu, "360° video viewing dataset in head-mounted virtual reality," ACM MMSys.
IEEE J. Sel. Topics Signal Process., vol. 14, no. 1, pp. 161–176, 2020.
[21] A. Mahzari, A. T. Nasrabadi, A. Samiei, and R. Prakash, "FoV-aware edge caching for adaptive 360° video streaming," ACM MM, 2018.
[22] M. Zhou, W. Gao, M. Jiang, and H. Yu, "HEVC lossless coding and improvements,"