A Study on Impacts of Multiple Factors on Video Quality of Experience
Huyen T. T. Tran, Nam Pham Ngoc, and Truong Cong Thang
The University of Aizu, Aizuwakamatsu, Japan; VinUniversity, Vietnam
Abstract
HTTP Adaptive Streaming (HAS) has become a cost-effective means for multimedia delivery. However, how the quality of experience (QoE) is jointly affected by 1) varying perceptual quality and 2) interruptions is not yet well understood. In this paper, we present the first attempt to quantify the relative impacts of these factors on the QoE of streaming sessions. To this end, we first model the impacts of the factors using histograms, which represent the frequency distributions of the individual factors in a session. Using a large dataset, we then provide various insights into the relative impacts of these factors, which serve as suggestions for improving the QoE of streaming sessions.
Introduction
HTTP Adaptive Streaming (HAS) has become a popular solution for multimedia delivery. In HAS, a video is encoded into multiple versions with different bitrates (and thus different quality values). Each version is further divided into short segments [1]. Based on the network status, suitable versions of individual segments are selected and delivered to clients so that the highest possible quality of experience (QoE) can be provided to users. For effective version selection, a main challenge is to quantify the impacts of factors on the QoE of streaming sessions.

Previous studies have investigated, both qualitatively and quantitatively, different factors affecting the QoE of HAS sessions [2–4]. In general, there are three key factors, namely initial delay, varying perceptual quality, and interruptions, as shown in Fig. 1. The initial delay refers to the waiting time before watching a video [5]. Varying perceptual quality refers to quality changes of segments in a session as a consequence of network bandwidth fluctuations. This factor can be further divided into two sub-factors. The first, called quality levels, refers to the contributions of high and low segment quality levels to the QoE. The second, called quality variations, refers to the impacts of segment quality switches. Interruptions refer to instances of rebuffering while watching a video [5].

In the literature, the impact of varying perceptual quality has been modeled using statistics such as the number of switches [6, 7], the average [4, 6, 8, 9], the median [10], the minimum [10], and the standard deviation of segment quality values [6]. The impact of interruptions has been modeled using statistics such as the number of interruptions [11, 12] and the average [11], the maximum [11], and the sum [12, 13] of interruption durations.

Unlike these two factors, the impact of the initial delay has mostly been found to be small [14–16].
In addition, as the initial delay appears only once at the beginning of a session, it is simple to model the impact of this factor individually as a function of the initial delay duration. This could be a linear function [12], an exponential function [13], or a logarithmic function [5, 17]. Thus, in this study, we mainly focus on the two more important factors: varying perceptual quality and interruptions.

Figure 1: A taxonomy of key factors

Though previous studies [5, 11–13, 18] have revealed some general behaviors of each individual factor, insights into the relative impacts of different events (i.e., switching and interruptions) during a session are very limited. Understanding such relative impacts can help make more effective adaptation decisions. For example, instead of abruptly switching down to a very low quality value, remaining at a moderate quality value with a very short interruption could bring a higher QoE. However, to the best of our knowledge, only two studies [3, 19] provide findings on this issue. In particular, the authors of [19] show that a down-switch can have an impact comparable to that of an interruption. Meanwhile, an extensive study with YouTube users in [3] indicates that an interruption has three times the impact of a quality switch. Neither study makes clear how different degrees of switches compare to different interruption durations, which causes confusion between their conclusions.

In this paper, our aim is to quantitatively investigate the relative impacts of different events of quality switching and interruption. In particular, we focus on answering the four questions below.
• How different are the impacts of different segment quality levels?
• What are the impacts of quality switching types? Do switches with higher switching amplitudes always cause more negative impacts? Does the starting quality of switches have any influence on the QoE?
• How different are the effects of different interruption durations?
• What are the relative impacts between quality switching types and interruption types? Do interruptions always result in more negative effects than quality switches?

For this purpose, we first extend our histogram-based QoE model presented in [17]. In particular, switches are classified into different types based not only on their switching amplitudes (i.e., differences in segment quality before and after switching) but also on their starting quality values (i.e., segment quality values before switching). Then, the reliability of the model is confirmed using two more datasets. Finally, based on an analysis of the model parameters, the impact of each switching or interruption type is quantified. Various insights into their relative impacts are also provided, which help answer the above questions.

The remainder of the paper is organized as follows. Section 2 presents the related work. The quality model, which aims at quantifying the relative impacts of different switching and interruption events, is described in Section 3. The performance of the proposed model is analyzed in Section 4. Section 5 draws a set of insights into the impacts of the factors on the QoE. Finally, Section 6 concludes the paper.

Related work
In this section, we first present related work on the impact of each single factor. Then, the relative impacts of factors found in previous studies are described.
As mentioned above, the factor of varying perceptual quality can be divided into two sub-factors: quality levels and quality variations. The contributions of quality levels have been investigated in many existing studies [4, 6, 8, 9]. It was found that the contribution of a given quality level depends on its total presence time during a session [12, 20]. To model these contributions, statistics of segment quality values such as the average [4, 6, 8, 9], the median [10], and the weighted sum [12] can be used.

Findings on the impacts of quality variations are presented in [3, 20–22]. The experimental results in [3, 22] show that users expect the number of quality switches (or the quality switching frequency) to be as low as possible. Likewise, the finding in [3] is that users prefer constant quality to varying quality; the impact of frequent quality switching may even be more than four times higher than that with no quality change. Meanwhile, the impact of the number of quality switches is found to be negligible in [20, 21]. In particular, no significant difference was observed between high and low numbers of quality switches. The disagreement between these conclusions may stem from the fact that the number of quality switches treats all quality switch types equally, although they may cause significantly different impacts. In other words, the conclusions are specific to the experimental results in the original papers and do not hold in general.

Some in-depth investigations of the impact of quality variations classify quality switches into types [12, 19–21]. The impacts of up-switches are found to be much smaller than those of down-switches [12]. In addition, abrupt up-switches may not be worse than smooth up-switches [19, 21]. Meanwhile, down-switches with larger switching amplitudes cause more negative impacts [20, 21]. Similar conclusions are also given in [23], where the authors quantify the impacts of switching types with different switching amplitudes. However, these findings are limited because they cannot differentiate switches having different starting quality values (e.g., a switch from 5 MOS to 3 MOS is in fact not the same as a switch from 3 MOS to 1 MOS). So far, there has been no study on the impact of the starting quality values of switches.

In most existing studies, the impact of quality variations is modeled using statistics such as the number of switches [6, 7], the minimum [10], and the standard deviation of segment quality values [6]. Our previous study [17] found that histograms of switching amplitudes are very effective for modeling the impact of this sub-factor.

With regard to interruptions, the authors of [3] showed that users prefer a single interruption to multiple interruptions. The impact of a single interruption is modeled as an exponential function of its duration in [5]. To model the impacts of multiple interruptions, several previous studies used the number of interruptions [11, 12] and the average [11], the maximum [11], and the sum [12, 13] of interruption durations.
Recently, many QoE models have been proposed for HAS. However, only a few are multi-factor models [12, 13]. The studies in [2, 11, 24, 25] modeled the contributions of quality levels and interruptions, but did not consider the impact of quality variations. In [13], the authors proposed a model that takes into account the impacts of quality variations and interruptions. Yet, this model does not include the impacts of quality levels.

To the best of our knowledge, the authors of [12] proposed the first QoE model taking into account all three (sub-)factors of quality levels, quality variations, and interruptions. This model is built in two steps: 1) modeling the impacts of the factors separately and then 2) combining them. In particular, the impact of a quality level is modeled by the total presence time of that quality level. For the impacts of quality variations, the authors used the mean square of down-switching amplitudes. The impact of interruptions is modeled using the number of interruptions and the sum of interruption durations. In the latest stage of the ITU-T P.1203 standardization [26], a multi-factor model is recommended. In this model, the impacts of quality levels, quality variations, and interruptions are modeled using various statistics such as the difference between the maximum and minimum segment quality values, the weighted sum of segment quality values, and the number of interruptions.

Although the factors were modeled in the above studies, the relative impacts of these factors have not been quantified. In the literature, very few studies investigate this issue. In [19], the finding is that a quality switch can cause an impact comparable to that of an interruption. However, the switching and interruption types considered in that study are limited. In particular, only two specific pairs consisting of two switching types and two interruption types were compared. A study in [3] revealed that an interruption has three times the impact of a switch. However, the switching and interruption types are not clearly defined.

In this study, we attempt, for the first time, to fully quantify the impacts of different switching and interruption types. Based on the obtained results, a set of insights into the relative impacts of the factors is provided.
Proposed QoE model
In this section, we first present the two main components of the proposed model. The first, denoted $Q_{PQ}$, represents the impact of varying perceptual quality. The second, denoted $D_{IR}$, represents the impact of interruptions. Then, a combination of these components to predict the QoE is given.
To model the impact of varying perceptual quality, we utilize two histograms for the two sub-factors, i.e., quality levels and quality variations.

In particular, each segment is represented by a quality value (i.e., MOS), which can be obtained by subjective tests [6, 17, 20] or estimated from encoding parameters [10, 18]. The range of segment quality values is split into $N$ intervals $\{I^Q_n \mid 1 \le n \le N\}$, which are given by

$$I^Q_n = [\,n - \vartheta^L_n,\; n + \vartheta^U_n\,), \qquad (1)$$

where $\vartheta^U_n$ and $\vartheta^L_n$ are parameters that define the intervals' widths. Each interval $I^Q_n$ corresponds to a segment quality bin $B^Q_n$. If a segment quality value is in interval $I^Q_n$, it belongs to bin $B^Q_n$.

In the proposed model, each quality switch is represented by a starting quality value and a switching amplitude. To define switching amplitudes, we use the concept of "quality gradient", which is given by

$$\nabla Q = \partial Q / \partial t, \qquad (2)$$

where $\partial Q$ is the change of segment quality values in time interval $\partial t$.

Currently, we use the quality value of the segment just before switching to represent the starting quality value, and the quality change between the two segments right before and after switching to represent the instantaneous gradient value at each switch. A positive (negative) gradient value indicates an up-switch (down-switch). The range of gradient values is split into $(2 \times M + 1)$ intervals $\{I^V_j \mid -M \le j \le M\}$, which are defined by

$$I^V_j = [\,j - \theta^L_j,\; j + \theta^U_j\,), \qquad (3)$$

where the parameters $\theta^U_j$ and $\theta^L_j$ define the intervals' widths. Each quality switching bin $B^V_{i,j}$ ($1 \le i \le N$, $-M \le j \le M$) is defined by two intervals $I^Q_i$ and $I^V_j$. A switch belongs to bin $B^V_{i,j}$ if its starting quality value is in interval $I^Q_i$ and its switching amplitude is in interval $I^V_j$.

Let $F^Q_n$ denote the normalized frequency of segment quality values in bin $B^Q_n$ ($1 \le n \le N$), and let $F^V_{i,j}$ denote the normalized frequency of switches in bin $B^V_{i,j}$ ($1 \le i \le N$, $-M \le j \le M$). Note that $F^Q_n$ is normalized by the number of segments, whereas $F^V_{i,j}$ is normalized by the total number of switches and interruptions. The perceptual quality $Q_{PQ}$ is modeled by

$$Q_{PQ} = \sum_{n=1}^{N} \alpha_n F^Q_n - \sum_{i=1}^{N} \sum_{j=-M}^{M} \beta_{i,j} F^V_{i,j}, \qquad (4)$$

where $\alpha_n$ and $\beta_{i,j}$ are respectively the weights of bin $B^Q_n$ and bin $B^V_{i,j}$.

In this study, we use the Absolute Category Rating method with a 5-grade scale, which is widely used for quality assessments of streaming sessions in HAS [19, 21, 26, 27]. So we currently split the range of segment quality values into $N = 5$ intervals and the range of gradient values into $2M + 1 = 9$ intervals ($M = 4$), with $\vartheta^U_n = \vartheta^L_n = \theta^L_j = \theta^U_j = 0.5$. The intervals $\{I^V_j \mid j > 0\}$ contain up-switches, interval $I^V_0$ represents quality maintaining, and the intervals $\{I^V_j \mid j < 0\}$ include down-switches. As noted in the previous study [23], the impacts of down-switches are significant while the impacts of non-negative switches (including up-switches and quality maintaining) are negligible. So, we simplify the proposed model by grouping all the bins of non-negative switches into one bin (denoted by $B_{um}$). The normalized frequency $F_{um}$ of this bin is given by

$$F_{um} = \sum_{i=1}^{N} \sum_{j=0}^{M} F^V_{i,j}. \qquad (5)$$

Then, the simplified perceptual quality model is given by

$$Q_{PQ} = \sum_{n=1}^{N} \alpha_n F^Q_n - \sum_{i=1}^{N} \sum_{j=-M}^{-1} \beta_{i,j} F^V_{i,j} - \beta_{um} F_{um}, \qquad (6)$$

where $\beta_{um}$ is the weight of bin $B_{um}$. With the simplified model, the number of model parameters (i.e., weights) can be reduced by approximately a half.

Figure 2: Distribution of interruptions: (a) the number of interruptions per session; (b) interruption durations.

Table 1: Intervals of Interruption Durations.
| Interval | $I^I_1$ | $I^I_2$ | $I^I_3$ | $I^I_4$ | $I^I_5$ | $I^I_6$ |
|---|---|---|---|---|---|---|
| Range (s) | ( . , 0.25] | ( . , . ] | ( . , . ] | ( . , . ] | ( . , 3] | (3, +∞) |

Table 2: Features of Source Videos

| Video | Frame rate (fps) | Type | Content | Motion activity | Spatial complexity |
|---|---|---|---|---|---|
| Video | 24 | Animation | Slow movements of characters | Low | Complex |
| Video | 24 | Animation | A fight between two characters | High | Simple |
| Video | 24 | News | A reporter in a weather forecast | Medium | Complex |
| Video | 24 | Sport | A soccer match | High | Complex |
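The perceptual quality component of Eqs. (1)–(6) can be illustrated with a short Python sketch. It is not the authors' implementation: it assumes five quality bins and amplitude bins from −4 to 4 with half-widths of 0.5, it treats every transition between consecutive segments as one switching event (amplitude 0 meaning that quality is maintained), and the function names and all weights used with it are hypothetical.

```python
import math
from collections import Counter

N = 5  # assumed number of segment quality bins (5-grade scale)
M = 4  # assumed largest switching amplitude (in MOS) with its own bin

# Down-switching bins (i, j) that can occur when quality stays within 1..N.
DOWN_BINS = [(i, j) for i in range(1, N + 1) for j in range(-M, 0) if i + j >= 1]


def quality_bin(q):
    """Bin index n with q in [n - 0.5, n + 0.5), clamped to 1..N."""
    return min(max(math.floor(q + 0.5), 1), N)


def gradient_bin(g):
    """Bin index j with g in [j - 0.5, j + 0.5), clamped to -M..M."""
    return min(max(math.floor(g + 0.5), -M), M)


def histograms(segments, num_interruptions=0):
    """Return (F_Q, F_V for down-switch bins, F_um) for one session."""
    level_counts = Counter(quality_bin(q) for q in segments)
    f_q = {n: level_counts[n] / len(segments) for n in range(1, N + 1)}

    # Assumption: each pair of consecutive segments is one switching event.
    transitions = list(zip(segments, segments[1:]))
    events = len(transitions) + num_interruptions
    if events == 0:
        return f_q, {}, 0.0

    down_counts = Counter()
    non_negative = 0
    for before, after in transitions:
        j = gradient_bin(after - before)
        if j < 0:
            down_counts[(quality_bin(before), j)] += 1
        else:
            non_negative += 1  # up-switch or maintained quality (bin B_um)
    f_v = {b: c / events for b, c in down_counts.items()}
    return f_q, f_v, non_negative / events


def perceptual_quality(segments, num_interruptions, alpha, beta, beta_um):
    """Simplified Q_PQ of Eq. (6); alpha, beta, beta_um are the bin weights."""
    f_q, f_v, f_um = histograms(segments, num_interruptions)
    level_term = sum(alpha[n] * f_q[n] for n in f_q)
    switch_term = sum(beta.get(b, 0.0) * f for b, f in f_v.items())
    return level_term - switch_term - beta_um * f_um
```

Under these assumptions only ten down-switching bins can occur, so the switching component needs ten weights plus the single weight of the merged non-negative bin. For a session with segment qualities 5, 5, 3, 3, 4, 1 and no interruptions, three of the five transitions fall into the non-negative bin and the two down-switches land in bins (5, −2) and (4, −3).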
To investigate the impact of interruptions, a histogram of this factor is defined as follows. Each interruption is represented by its duration. The range of interruption durations is divided into $L$ intervals $\{I^I_l \mid 1 \le l \le L\}$ corresponding to $L$ interruption bins $\{B^I_l \mid 1 \le l \le L\}$. Let $F^I_l$ be the normalized frequency of interruptions in bin $B^I_l$. Note that $F^I_l$ is normalized by the total number of switches and interruptions. The impact of interruptions $D_{IR}$ is modeled by

$$D_{IR} = \sum_{l=1}^{L} \gamma_l F^I_l, \qquad (7)$$

where $\gamma_l$ is the weight of bin $B^I_l$.

To define the representative intervals $I^I_l$, a series of video streaming experiments was performed on a real testbed. In these experiments, we recorded a total of 120 streaming sessions with at least one interruption per session. The distributions of the number of interruptions and of interruption durations are shown in Fig. 2. The number of interruptions per session typically ranges from 1 to 6. About 40% of the sessions contain only one interruption. Besides, about 13% of the observed interruptions have durations shorter than 0.25 seconds, and about 10% are longer than 3 seconds. Based on these observations, we split the range of interruption durations into $L = 6$ intervals, listed in Table 1; the last interval $I^I_6$ includes interruption durations longer than 3 seconds.

Finally, the proposed model, which integrates the above components to predict the QoE of streaming sessions, is given by
$$QoE_{pred} = \max(Q_{PQ} - D_{IR},\ ). \qquad (8)$$

This model has a total of 22 parameters, which enable the comparisons of impacts between switching and interruption types presented in Section 5.

To evaluate the prediction performance of the proposed model, we combine our two databases presented in [17, 28]. The combined database consists of 288 sessions, of which 168 contain only a single factor (i.e., either varying perceptual quality or interruptions) and 120 include both factors. From this database, we generate 50 pairs of training and test sets. For each pair, the test set consists of 90 sessions randomly selected from the multi-factor sessions. The corresponding training set consists of the 198 remaining sessions (i.e., the 168 single-factor sessions and the 30 remaining multi-factor sessions). The training set is used to obtain the model parameters by least squares fitting. The test set is used to evaluate the prediction performance. The results presented in the following are averages over the 50 test sets.

Table 3 shows the performance of the proposed model in terms of Pearson Correlation Coefficient (PCC) and Root Mean Square Error (RMSE).

Table 3: Performance of the Proposed QoE Model

| Model | Training set PCC | Training set RMSE | Test set PCC | Test set RMSE |
|---|---|---|---|---|
| Proposed | | | | |
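The interruption component of Eq. (7) and the combination in Eq. (8) can be sketched in the same way. This is an illustration under stated assumptions rather than the authors' code: of the interval bounds below, only 0.25 s and 3 s are stated in the text, the remaining bounds are placeholders, the lower bound `floor` in the QoE prediction is an assumed value, and the weights passed in are hypothetical.

```python
import bisect

# Upper bounds (seconds) of the first five duration intervals; the sixth is
# open-ended. Only 0.25 and 3.0 come from the text, the rest are placeholders.
UPPER_BOUNDS = [0.25, 0.5, 1.0, 2.0, 3.0]


def interruption_bin(duration):
    """1-based index l of the interval (lower, upper] that contains duration."""
    return bisect.bisect_left(UPPER_BOUNDS, duration) + 1


def interruption_impact(durations, num_switches, gamma):
    """D_IR of Eq. (7): weighted normalized histogram of interruption durations.

    The histogram is normalized by the total number of switches and
    interruptions, as in the model description.
    """
    events = num_switches + len(durations)
    if events == 0:
        return 0.0
    counts = [0] * (len(UPPER_BOUNDS) + 1)
    for d in durations:
        counts[interruption_bin(d) - 1] += 1
    return sum(g * c / events for g, c in zip(gamma, counts))


def predict_qoe(q_pq, d_ir, floor=1.0):
    """Eq. (8) with an assumed lower bound `floor` on the predicted score."""
    return max(q_pq - d_ir, floor)
```

With the hypothetical weights `[0.0, 1.0, 2.0, 3.0, 4.0, 5.0]`, a session with 8 switching events and interruptions of 0.2 s and 2.5 s gives an interruption impact of 0.4, because only the 2.5 s interruption falls into a bin with a non-zero weight.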
Similar to [12], Tables 6, 7, and 8 show the values of the model parameters $\{\alpha_n, \beta_{i,j}, \gamma_l\}$ corresponding to the test set which achieves the highest PCC value (i.e., 0.96) among the 50 test sets. It is interesting to see that the weights $\alpha_n$ and $\gamma_l$ increase with their indexes $n$ and $l$, respectively. The weight $\beta_{um}$ of the non-negative switching bin $B_{um}$ is equal to zero. For the down-switching bins $\{B^V_{i,j} \mid j < 0\}$, the weights $\{\beta_{i,j}\}$ increase when the switching amplitudes get larger or the starting quality values become lower. More detailed discussions about these weights are given in Section 5.

In this part, the prediction performance of the proposed model is compared with that of four existing models over two databases. The first is our database, where the prediction performances are the averages over the 50 test sets described above. The second is an open database (called VL04) of the ITU-T P.1203 standardization procedure (P.NATS) [29, 30]. This database consists of sixty 1-minute-long sessions generated from three different videos. The performances of the models are calculated over all sixty sessions.

A description of the proposed model and the four existing models, denoted by Guo's [10], Vriendt's [6], Liu's [12], and P.1203 [26, 30–32], is presented in Table 4. It can be seen that these models use various statistics to model the impacts of the factors. The models Guo's and Vriendt's only take into account the impacts of varying perceptual quality. Meanwhile, the remaining models include the impacts of both varying perceptual quality and interruptions.

According to Recommendations ITU-T P.1401 [33] and ITU-T P.1203 [26], we compensated for test condition differences between models. In particular, each reference model is re-implemented using the parameters stated in the original study. Then, a first-order linear regression is performed to adjust the predicted QoE values. Finally, after the adjustment, the prediction performance is calculated. The regression coefficients (i.e., slopes and intercepts) and the performance for each database are presented in Table 5.

On our database, the proposed model achieves the best prediction performance (i.e., the highest PCC value and the lowest RMSE value). Specifically, the PCC and RMSE values of the proposed model are 0.95 and 0.30 MOS, respectively. The models Guo's and Vriendt's fail to predict the QoE values of multi-factor sessions since they do not include the impacts of interruptions. Thus, their PCC values are low and their RMSE values are high (≥ 0.58 MOS). The performances of the models Liu's and P.1203 are also significantly lower than that of the proposed model. In particular, the PCC and RMSE values are respectively 0.78 and 0.56 MOS for the model Liu's, and 0.85 and 0.42 MOS for the model P.1203. A possible explanation for this result is that these models use statistics such as the number of switches and the number of interruptions to model the impacts of varying perceptual quality and interruptions. However, these statistics cannot fully reflect the switches and interruptions occurring in a session, as they cannot distinguish different switching types and interruption types. Meanwhile, thanks to the use of histograms, the proposed model can differentiate switching and interruption types, and so can model the impacts of the factors more effectively.
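The adjustment procedure described above (first-order linear regression of the predicted scores onto the subjective scores, followed by PCC and RMSE) can be sketched as follows. This is a minimal illustration with hypothetical function names; it uses a plain RMSE that divides by the number of sessions, and the exact normalization prescribed by the Recommendations may differ.

```python
import math


def fit_line(x, y):
    """Least-squares slope and intercept of y ~ slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pcc(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(pred, target):
    """Root mean square error, dividing by the number of samples."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def adjusted_performance(raw_pred, mos):
    """Map raw predictions onto the MOS scale, then score the mapped values."""
    slope, intercept = fit_line(raw_pred, mos)
    adjusted = [slope * p + intercept for p in raw_pred]
    return slope, intercept, pcc(adjusted, mos), rmse(adjusted, mos)
```

A positive-slope linear mapping leaves the PCC unchanged, so the adjustment mainly affects the RMSE: a reference model whose raw outputs are offset or scaled relative to the subjective scores of a database is not penalized for that offset.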
In this subsection, we quantitatively analyze the impacts of the factors by discussing the weights of the bins.

Table 4: Description of the proposed and existing models

| Model | Statistics for varying perceptual quality | Statistics for interruptions |
|---|---|---|
| Guo's [10] | Median segment quality value; minimum segment quality value | - |
| Vriendt's [6] | Number of switches; average and standard deviation of segment quality values | - |
| Liu's [12] | Weighted sum of segment quality values; average of the squares of down-switching amplitudes | Sum of interruption durations; number of interruptions |
| P.1203 [26] | Number of switches; number of quality direction changes; longest switching duration; first and fifth percentile of segment quality values; average of segment quality values in each interval; weighted sum of segment quality values; difference between the maximum and minimum of segment quality values | Number of interruptions; weighted sum of interruption durations; average time distance between interruptions; sum of interruption durations; time distance between the last interruption and the end of session |
| Proposed | Histogram of segment quality values; histogram of switching amplitudes | Histogram of interruptions |
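Table 5: Adjustment Coefficients and Performances of the Proposed Model and Existing Models

| Model | Our database: Slope | Our database: Intercept | Our database: PCC | Our database: RMSE | VL04: Slope | VL04: Intercept | VL04: PCC | VL04: RMSE |
|---|---|---|---|---|---|---|---|---|
| Guo's | | | | | | | | |
| Vriendt's | | | | | | | | |
| Liu's | | | | | | | | |
| P.1203 | | | | | | | | |
| Proposed | - | - | 0.95 | 0.30 | 0.79 | 0.82 | 0.90 | 0.39 |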
Table 6: Parameters of the Segment Quality Component

| $\alpha_1$ | $\alpha_2$ | $\alpha_3$ | $\alpha_4$ | $\alpha_5$ |
|---|---|---|---|---|
| | | | | |

The weights $\alpha_n$ of the segment quality bins $B^Q_n$ are shown in Table 6. It can be seen that these weights increase with their indexes $n$. This implies that a higher quality level makes a bigger contribution to the QoE of sessions. In other words, the contribution of the highest quality level is the biggest, which is similar to the finding in [20]. Note that only two quality levels are considered in [20]. It is interesting to see that most of these weights are close to the midpoints of the corresponding intervals $I^Q_n$, except for the weight $\alpha_5$. The reason may be that it is, in fact, difficult to achieve 5 MOS even at perfect quality [5, 34].

Table 7: Parameters of the Quality Switching Component $\beta_{i,j}$

| Starting quality value ($i$) | Non-neg. ($\beta_{um}$) | $j = -1$ | $j = -2$ | $j = -3$ | $j = -4$ |
|---|---|---|---|---|---|
| | | | | | |

Table 8: Parameters of the Interruption Component

| $\gamma_1$ | $\gamma_2$ | $\gamma_3$ | $\gamma_4$ | $\gamma_5$ | $\gamma_6$ |
|---|---|---|---|---|---|
| | | | | | |

Next, we quantitatively investigate the impacts of quality variations. Table 7 reports the weights $\beta_{i,j}$ of the quality switching bins $B^V_{i,j}$. The impacts of switches depend not only on switching amplitudes but also on starting quality values. For the same starting quality value, the larger the switching amplitude is, the more serious the impact becomes. Also, for a given switching amplitude, the lower the starting quality value is, the more negative the impact is. This means that, for the same switching amplitude, a down-switch in low quality ranges is more severe than one in high quality ranges. For example, as $\beta_{3,-2} > \beta_{4,-2}$ and $\beta_{4,-2} > \beta_{5,-2}$, a switch from 3 MOS to 1 MOS has a more negative impact than one from 4 MOS to 2 MOS, which is in turn worse than one from 5 MOS to 3 MOS.

Table 7 also contains a weight $\beta_{i,j}$ that is higher than two other weights although its switching amplitude is smaller. This reveals that switches having larger switching amplitudes do not necessarily cause more negative impacts. From Table 7, it is also observed that four of the down-switching weights are very large (up to 24.76). So it is recommended to avoid switching to very low quality levels (i.e., around 1 ∼ …).

The weight $\beta_{um}$ turns out to be zero. This reconfirms the finding in [23] that the impacts of up-switches and quality maintaining are negligible. Therefore, it is unnecessary to classify up-switches (as is done for down-switches). This finding is in line with those obtained in [19, 21] that abrupt up-switches may not be worse than smooth up-switches. Users even prefer switching up to and then maintaining a high quality level over gradual switching.
In addition, this result implies that switching to higher quality levels is preferred to remaining at low quality levels, which is in agreement with [21].

In contrast, a study in [3] shows that frequent quality increases can lead to significantly larger impacts compared to no quality change. This observation can be obtained when comparing a session having frequent up-switches in low quality ranges with a session having quality fixed at higher levels. However, this conclusion may not be correct for up-switching in high quality ranges and quality remaining at low levels. Therefore, to avoid confusion, findings should be specific to switching types.

Figure 3: An example of four sessions with different quality variations.

To better understand some statistics of segment quality values, Fig. 3 shows some cases of sessions with different quality variations. All the cases have the same statistics, including the average, the median, the maximum, the minimum, and the standard deviation of segment quality values. However, it can be seen that the quality variations are very different in these cases. So these statistics are not able to fully reflect the quality variations occurring in a session. Although the number of switches in the case …

The weights $\gamma_l$ of the interruption bins $B^I_l$ are shown in Table 8. The larger the index $l$ is, the higher the weight $\gamma_l$ becomes. This means that the impact of interruptions increases with their durations. In particular, the weight $\gamma_1$ is zero, implying that users are generally tolerant of interruptions with durations less than or equal to 0.25 seconds. For interruptions with longer durations, the impacts are much more negative.

Comparing the weights $\beta_{i,j}$ and $\gamma_l$, it can also be observed that the weights $\gamma_5$ and $\gamma_6$ are very large (45.58 and 50.65), even larger than the weights $\beta_{i,j}$ of any quality switching bin. This indicates that an interruption with a duration longer than 2 seconds is more annoying to users than any down-switch.
Therefore, avoiding such interruptions should be of the highest priority, possibly at the cost of abrupt switches.

Meanwhile, the weight $\gamma_l$ of one interruption bin (8.42) is close to the weight $\beta_{i,j}$ of one down-switching bin (7.89). This result is in line with [19] that a down-switch may cause an impact comparable to that of an interruption. Moreover, an interruption weight is considerably lower than three of the down-switching weights. This indicates that a down-switch can even result in more serious impacts than an interruption. In other words, an interruption does not necessarily result in more negative impacts than a down-switch.

In [3], the finding is that an interruption has three times the impact of a switch. However, the types of interruptions and switches are not clearly mentioned. In our model, one interruption weight (24.16) is approximately three times the down-switching weight of 7.89. However, for different weight pairs, the ratio changes considerably. This again shows that findings should be specific to switching and interruption types.

Similar to switching types, different interruption types have different weights. Therefore, the number of interruptions and the average, the maximum, and the sum of interruption durations are also not able to fully reflect the interruptions occurring in a session.

Finally, the above findings are enabled by the use of histograms in the proposed model, which makes it possible to provide a set of insights into the relative impacts of the different factors. In other words, histograms are more flexible and comprehensive than statistics such as the average, the median, the minimum, and the standard deviation.

Based on the above results and discussions, some remarks on the impacts of the factors can be summarized as follows.
• First, a higher quality level makes a bigger contribution to the QoE of sessions.
• Second, the effects of switches depend on not onlyswitching amplitudes but also starting quality values.So switches having larger amplitudes do not neces-sarily cause more negative impacts.• Third, it is suggested that switching to a very lowquality value (i.e., around 1 ∼ In this paper, we have first proposed a QoE model tak-ing into account the impacts of varying perceptual qualityand interruptions. Then, by using the two databases, theexperiment results have showed that the proposed modelhas high prediction performance and outperforms the fourexisting models. Finally, by discussing the model param-eters, a set of the insights into the impacts of the factorshave been provided in detail to each switching and inter-ruption type. We hope that the findings in this paper willbe useful for researchers in better understanding the factorsaffecting the QoE of streaming sessions, and further pro-viding some suggestions to improve adaptation strategiesin HTTP Adaptive Streaming.For future work, we will extend the propose model totake into account the impact of the initial delay. Besides,we will seek to apply the proposed model in evaluationsand developments of adaptation strategies, such that theycan utilize network resources to provide the best qualityof experience.9 eferences [1] T. C. Thang, Q. D. Ho, J. W. Kang, and A. T. Pham,“Adaptive Streaming of Audiovisual Content usingMPEG DASH,”
IEEE Transactions on Consumer Electronics, vol. 58, no. 1, pp. 78–85, Feb. 2012.
[2] J. Xue, D.-Q. Zhang, H. Yu, and C. W. Chen, “Assessing quality of experience for adaptive HTTP video streaming,” in , Chengdu, China, Jul. 2014, pp. 1–6.
[3] H. Nam, K.-H. Kim, and H. Schulzrinne, “QoE matters more than QoS: Why people stop watching cat videos,” in , San Francisco, CA, USA, Apr. 2016, pp. 1–9.
[4] D. Z. Rodríguez, Z. Wang, R. L. Rosa, and G. Bressan, “The impact of video-quality-level switching on user quality of experience in dynamic adaptive streaming over HTTP,” EURASIP Journal on Wireless Communications and Networking, vol. 2014, no. 1, p. 216, 2014.
[5] T. Hoßfeld, S. Egger, R. Schatz, M. Fiedler, K. Masuch, and C. Lorentzen, “Initial delay vs. interruptions: Between the devil and the deep blue sea,” in , Yarra Valley, Australia, Jul. 2012, pp. 1–6.
[6] J. D. Vriendt, D. D. Vleeschauwer, and D. Robinson, “Model for estimating QoE of video delivered using HTTP adaptive streaming,” in , Ghent, Belgium, May 2013, pp. 1288–1293.
[7] Y. Shen, Y. Liu, H. Yang, and D. Yang, “Quality of Experience Study on Dynamic Adaptive Streaming based on HTTP,” IEICE Transactions on Communications, vol. 98, no. 1, pp. 62–70, 2015.
[8] X. Yin, A. Jindal, V. Sekar, and B. Sinopoli, “A control-theoretic approach for dynamic adaptive video streaming over HTTP,” ACM SIGCOMM Computer Communications Review, vol. 45, no. 4, pp. 325–338, 2015.
[9] A. Bentaleb, A. C. Begen, and R. Zimmermann, “SDNDASH: Improving QoE of HTTP adaptive streaming using software defined networking,” in
Proceedings 2016 ACM Multimedia Conference, Amsterdam, The Netherlands, Oct. 2016, pp. 1296–1305.
[10] Z. Guo, Y. Wang, and X. Zhu, “Assessing the visual effect of non-periodic temporal variation of quantization stepsize in compressed video,” in , Quebec City, Canada, Sept. 2015, pp. 3121–3125.
[11] K. D. Singh, Y. Hadjadj-Aoul, and G. Rubino, “Quality of experience estimation for adaptive HTTP/TCP video streaming using H.264/AVC,” in , Las Vegas, USA, Jan. 2012, pp. 127–131.
[12] Y. Liu, S. Dey, F. Ulupinar, M. Luby, and Y. Mao, “Deriving and validating user experience model for DASH video streaming,” IEEE Transactions on Broadcasting, vol. 61, no. 4, pp. 651–665, 2015.
[13] D. Z. Rodríguez, R. L. Rosa, E. C. Alfaia, J. I. Abrahão, and G. Bressan, “Video quality metric for streaming service using DASH standard,” IEEE Transactions on Broadcasting, vol. 62, no. 3, pp. 628–639, 2016.
[14] T. Hoßfeld, R. Schatz, E. Biersack, and L. Plissonneau, “Internet Video Delivery in YouTube: From Traffic Measurements to Quality of Experience,” Data Traffic Monitoring and Analysis, vol. 7754, pp. 264–301, 2013.
[15] M. Seufert, S. Egger, M. Slanina, T. Zinner, T. Hoßfeld, and P. Tran-Gia, “A survey on quality of experience of HTTP adaptive streaming,” IEEE Communications Surveys & Tutorials, vol. 17, no. 1, pp. 469–492, 2015.
[16] N. Barman and M. G. Martini, “QoE modeling for HTTP adaptive video streaming–a survey and open challenges,”
IEEE Access, vol. 7, pp. 30 831–30 859, 2019.
[17] H. T. T. Tran, N. P. Ngoc, A. T. Pham, and T. C. Thang, “A Multi-Factor QoE Model for Adaptive Streaming over Mobile Networks,” in , Washington DC, USA, Dec. 2016, pp. 1–6.
[18] H. T. Tran, T. Vu, N. P. Ngoc, and T. C. Thang, “A novel quality model for HTTP Adaptive Streaming,” in , Ha Long, Vietnam, Jul. 2016, pp. 423–428.
[19] S. Egger, B. Gardlo, M. Seufert, and R. Schatz, “The impact of adaptation strategies on perceived quality of HTTP adaptive streaming,” in Proceedings of the 2014 Workshop on Design, Quality and Deployment of Adaptive Video Streaming, Sydney, Australia, Dec. 2014, pp. 31–36.
[20] T. Hoßfeld, M. Seufert, C. Sieber, and T. Zinner, “Assessing effect sizes of influence factors towards a QoE model for HTTP adaptive streaming,” in , Singapore, Sept. 2014, pp. 111–116.
[21] S. Tavakoli, S. Egger, M. Seufert, R. Schatz, K. Brunnström, and N. García, “Perceptual quality of HTTP adaptive streaming strategies: Cross-experimental analysis of multi-laboratory and crowdsourced subjective studies,” IEEE Journal on Selected Areas in Communications, vol. 34, no. 8, pp. 2141–2153, 2016.
[22] P. Ni, R. Eg, A. Eichhorn, C. Griwodz, and P. Halvorsen, “Flicker effects in adaptive video streaming to handheld devices,” in Proceedings of the 19th ACM international conference on Multimedia, Scottsdale, Arizona, USA, Nov. 2011, pp. 463–472.
[23] H. T. Tran, N. P. Ngoc, Y. J. Jung, A. T. Pham, and T. C. Thang, “A Histogram-Based Quality Model for HTTP Adaptive Streaming,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. 100, no. 2, pp. 555–564, 2017.
[24] Z. Duanmu, K. Zeng, K. Ma, A. Rehman, and Z. Wang, “A Quality-of-Experience Index for Streaming Video,” IEEE Journal of Selected Topics in Signal Processing, vol. 11, no. 1, pp. 154–166, Feb. 2017.
[25] X. Liu, F. Dobrian, H. Milner, J. Jiang, V. Sekar, I. Stoica, and H. Zhang, “A Case for a Coordinated Internet Video Control Plane,” in
Proceedings of the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, ser. SIGCOMM ’12, New York, NY, USA, 2012, pp. 359–370.
[26] Recommendation ITU-T P.1203.3, “Parametric bitstream-based quality assessment of progressive download and adaptive audiovisual streaming services over reliable transport - Quality integration module,” International Telecommunication Union, 2017.
[27] W. Robitza, M.-N. Garcia, and A. Raake, “A modular HTTP adaptive streaming QoE model - Candidate for ITU-T P.1203 (“P.NATS”),” in , Erfurt, Germany, Jul. 2017, pp. 1–6.
[28] H. T. T. Tran, D. V. Nguyen, D. D. Nguyen, N. P. Ngoc, and T. C. Thang, “An LSTM-based Approach for Overall Quality Prediction in HTTP Adaptive Streaming,” in IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Paris, Apr. 2019.
[29] “P.1203 Open Dataset,” https://github.com/itu-p1203/open-dataset, accessed 2018-07-01.
[30] W. Robitza, S. Göring, A. Raake, D. Lindegren, G. Heikkilä, J. Gustafsson, P. List, B. Feiten, U. Wüstenhagen, M.-N. Garcia, K. Yamagishi, and S. Broom, “HTTP Adaptive Streaming QoE Estimation with ITU-T Rec. P.1203 - Open Databases and Software,” in Proceedings of the 9th ACM Multimedia Systems Conference, Amsterdam, Netherlands, Jun. 2018, pp. 466–471.
[31] A. Raake, M.-N. Garcia, W. Robitza, P. List, S. Göring, and B. Feiten, “A bitstream-based, scalable video-quality model for HTTP adaptive streaming: ITU-T P.1203.1,” in Ninth International Conference on Quality of Multimedia Experience (QoMEX), Erfurt, Germany, May 2017, pp. 1–6.
[32] “ITU-T Rec. P.1203 Standalone Implementation,” https://github.com/itu-p1203/itu-p1203/, accessed 2018-07-01.
[33] Recommendation ITU-T P.1401, “Methods, metrics and procedures for statistical evaluation, qualification and comparison of objective quality prediction models,” International Telecommunication Union, 2012.
[34] T. Tominaga, T. Hayashi, J. Okamoto, and A. Takahashi, “Performance comparisons of subjective quality assessment methods for mobile video,” in 2010 Second international workshop on Quality of multimedia experience (QoMEX)