A Survey of Multimedia Streaming in LTE Cellular Networks
arXiv [cs.NI]
Ahmed Ahmedin∗, Amitabha Ghosh†, and Dipak Ghosal∗
∗University of California, Davis, CA 95616, {ahmedin, dghosal}@ucdavis.edu
†UtopiaCompression Corporation, Los Angeles, CA, [email protected]
Abstract—With the growth of Long Term Evolution (LTE) cellular networks and the increase in demand for video services, it is vital to consider the challenges of streaming services from a different perspective: one that examines streaming services in light of cellular network challenges, both on a per-layer basis and across multiple layers. In this tutorial, we highlight the main challenges facing the video streaming industry in the context of cellular networks, with a focus on LTE. We also discuss proposed solutions for these challenges while highlighting their limitations and the conditions/assumptions required for these solutions to deliver high performance. In addition, we present work on cross-layer optimization for video streaming and how it leads toward more optimized end-to-end LTE networking for video streaming. Finally, we suggest several open research areas in the domain of video delivery over LTE networks that can significantly enhance the quality of the streaming experience for the end user.
I. INTRODUCTION
Video streaming is currently one of the fastest-expanding services due to the emergence of multimedia-based applications. The nature of these applications varies from business video conferences and telesurgeries to home entertainment, including but not limited to security surveillance and tracking operations. The evolution of wireless networks adds mobility as another dimension to streaming services. The high bit rates demanded by users challenge network operators and vendors to continuously develop and enhance cellular network capabilities, which leads to new services in addition to enhanced quality for existing ones. Furthermore, today's mobile devices (e.g., iPhone, iPad, Android tablets) not only have advanced capabilities in terms of more processing power, longer battery life, higher-resolution displays, and a variety of form factors, but also support seamless execution of numerous applications (apps) developed by third parties. This expansion of the smart mobile device industry is driving service providers and operators to introduce more effective techniques to bring high-quality services to end users. Video streaming is one of the services most affected by the evolution of cellular networks and the industry. Content providers such as YouTube and Netflix direct some of their efforts toward perfecting mobile apps and enhancing streaming quality on mobile devices. As shown in Figure 1, Reelseo, a media market guide, reports that video streaming applications represent 35% of the cellular data traffic, and the share of cellular data traffic from video is expected to grow further by 2016 [1].

Fig. 1: A study by Reelseo shows that 35% of the cellular data traffic is for video streaming.

However, there are key challenges that affect the quality of video streaming over a cellular network, such as mobility, changing wireless environments, diversity in device capabilities, power management, and the strict delay requirements of video traffic, all of which must be considered for successful video transmission. Indeed, increasing the bandwidth and the bitrate can solve some of these problems; however, smart and advanced protocols are needed to manage this bandwidth, distribute it fairly among multiple users, and handle different requests. Long Term Evolution (LTE), often referred to as 4G, is one of the promising cellular technologies that has evolved rapidly and been developed and deployed over the past few years. The fact that LTE-Advanced has a peak download data rate of 1 Gbps and an upload rate of 0.5 Gbps [2] makes it a good candidate for multimedia streaming. Hence, the focus of this survey is to shed some light on the mechanisms of video delivery over wireless cellular networks in general and LTE in particular.

We approach the challenges of video streaming on a per-layer basis, with a specific focus on LTE networks. We discuss the main challenges for each layer and some proposed solutions for each problem, in addition to pointing out the limitations of these solutions and the conditions needed for them to perform well. The survey also proposes some open research areas for each layer. In addition, we survey some cross-layer optimization solutions, where interaction/coupling happens between two or more layers.

Fig. 2: End-to-end full system architecture, including the LTE radio network, the LTE EPC network, and the content distribution network.
Unfortunately, quantifying the effect of these solutions on video quality is very difficult due to the existence of many performance metrics, such as frame delay and peak signal-to-noise ratio, among others, that are hard to connect together [3], each focusing on a different vital aspect of video performance. Hence, we show the impact of each solution on the different aspects of streaming performance.

There have been many related surveys addressing multimedia streaming problems over wireless networks in general, such as Zhu and Girod [4], which provides an overview of the technical challenges of video streaming over different types of wireless networks. Mantzouratos et al. [5] point out the suitability of cross-layer design for optimizing video streaming over mobile ad hoc networks (MANETs). A survey of issues in supporting QoS in MANETs was presented by Mohapatra in [6]. However, none of them compares the pros and cons of the different existing approaches, and none focuses on cellular technology, especially 3G and 4G. Our tutorial is a supplement to the existing surveys, as we address the video streaming solutions proposed to overcome the problems of each layer, from the application layer down to the medium access control (MAC) layer. We also focus more on LTE technology and identify the advantages, disadvantages, and limitations of each of the discussed solutions. We discuss the advantages of cross-layer design and show some successful approaches in the cross-layer design context, while providing directions for open problems and future research.

The rest of this survey is organized as follows: In Section II, we introduce an overview of LTE networks. Section III gives an overview of the structure of the video coding types used for streaming and discusses some well-known parameters to quantify video quality. Sections IV, V, and VI discuss, respectively, the application, transport, and MAC layer problems and possible solutions.
Cross-layer approaches are discussed in Section VII. Finally, we conclude our tutorial in Section VIII.

II. LTE OVERVIEW
The LTE project was started in 2004 by the Third Generation Partnership Project (3GPP) telecommunication body to enhance cellular communication. It started as an evolution of the Universal Mobile Telecommunication System (UMTS). LTE-Advanced is expected to achieve peak downlink rates of up to 1 Gbps [2]. Besides the high data rate and the wide coverage range that LTE provides, it is backward compatible with previous generations of cellular networks. There have been some competitors to LTE, such as WiMAX, defined by the IEEE 802.16e standard, which provides high data rates and mobility advantages similar to LTE. However, the fact that LTE supports seamless connection to existing cellular networks and has a simple, compatible architecture, which reduces operating expenditure (OPEX), promotes LTE as the prime candidate for the next generation of cellular networks [7].
A. LTE Architecture
LTE has been designed to support packet-switched services. Hence, the system architecture has evolved to contain the Evolved Packet System (EPS) in addition to the regular radio core network connectivity functions. In this section, we present some of the EPS functionality related to video streaming and give an overview of the core network. Figure 2 shows the network architecture, including the LTE radio network, the LTE core network, and the content distribution network (CDN). These three parts provide end-to-end connectivity between the user equipments (UEs) and the content provider. The eNodeB is the radio access part of the LTE network, which provides radio connectivity to the UE. The figure also highlights the different control loops that ensure smooth end-to-end video delivery; these will be discussed in depth throughout the survey.

The physical layer of an LTE downlink uses orthogonal frequency-division multiple access (OFDMA) and allocates radio resources in both the time and frequency domains, as shown in Figure 3. The time domain is divided into LTE downlink frames, which are split into Transmission Time Intervals (TTIs) of duration 1 millisecond (ms) each. The LTE downlink frame has a total duration of 10 ms, corresponding to ten TTIs. Each TTI is further subdivided into two time slots, each of duration 0.5 ms. Each of these slots corresponds to 7 OFDM symbols. In the frequency domain, the available bandwidth is divided into subchannels of bandwidth 180 kHz each, and each subchannel comprises 12 adjacent OFDM subcarriers. As the basic time-frequency unit in the scheduler, a single Physical Resource Block (PRB) consists of one 0.5 ms time slot and one subchannel. The minimum unit of assignment to a UE is one PRB, and each PRB can be assigned to only a single UE. Additionally, the LTE downlink makes use of adaptive modulation and coding (AMC) to match the transmission parameters to changing wireless channel conditions.
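As a quick sanity check on these numbers, the grid arithmetic can be written out directly. The 10 MHz carrier below is an assumed example; 50 PRBs per slot is the standard figure for that bandwidth, of which 9 MHz carries subcarriers.

```python
# Sanity-check arithmetic for the LTE downlink grid described above.
# The 10 MHz carrier example is an assumption, not part of the survey.
TTI_MS = 1.0                 # one Transmission Time Interval
TTIS_PER_FRAME = 10
SUBCHANNEL_KHZ = 180         # bandwidth of one PRB
SUBCARRIERS_PER_PRB = 12

frame_ms = TTIS_PER_FRAME * TTI_MS
spacing_khz = SUBCHANNEL_KHZ / SUBCARRIERS_PER_PRB
usable_khz = 9000            # occupied bandwidth of a 10 MHz carrier
prbs = usable_khz // SUBCHANNEL_KHZ

print(frame_ms)      # 10.0 ms per frame
print(spacing_khz)   # 15.0 kHz subcarrier spacing
print(prbs)          # 50 PRBs per slot
```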
In AMC, the modulation and coding change based on the wireless environment to provide more robustness on weak channels and higher data rates over strong PRBs.

Fig. 3: LTE downlink frame structure showing a physical resource block (PRB), a resource element, and the durations of a time slot, a subframe, and a frame. A PRB consists of 12 consecutive subcarriers and 7 or 6 OFDM symbols, depending on whether a normal (short) or extended (long) cyclic prefix is used, respectively. The basic time-frequency unit in the scheduler is a single PRB.

III. VIDEO STREAMING OVERVIEW
A video is a time sequence of still images. As wireless communication technology becomes more advanced, video streaming services are becoming more sophisticated and cover a wide range of applications. These applications vary from entertainment purposes, such as video on demand, video chatting, and interactive gaming, to business purposes, such as tele-surveillance, video conferencing, and remote learning. Every one of these applications has its own requirements and performance metrics to emphasize. For example, a lag of a few seconds is not severe for video-on-demand services, but it is very critical for video conferencing.

Al-Mualla and Canagarajah [8] introduce the main challenges for video streaming over cellular networks. One of the main challenges in cellular networks is the limited bandwidth; hence, efficient video compression techniques are needed to overcome the bandwidth problem. Compression techniques with higher compression ratios also enhance the video quality by fitting more frames and details into the same container, and thus the same radio spectrum. Another challenge is the varying capabilities of the UEs. Mobile devices range from battery- and hardware-constrained cell phones to more powerful tablets with sophisticated transcoding features. Hence, video codec implementations on cell phones need to take into consideration the computational complexity and the limited battery life of cellular devices. Until 2000, implementations of video codecs [9], [10] indicated that digital signal processors (DSPs) could not achieve real-time video encoding. More recently, there has been a focus on low-complexity codec implementations on embedded processors, such as [11]. In addition to coding challenges, we consider the severity of the mobile channel the most difficult problem. The link quality depends on the UE's distance from the eNodeB as well as on shadowing and fading.
These effects reduce the reliability of video delivery over the wireless spectrum. A possible solution is forward error correction: such codes reduce the effect of wireless channel degradation; however, they introduce extra redundancy into the video packets [12]. This redundancy comes at the expense of the first challenge we explained, the limited radio spectrum. To choose the correct coding, feedback signaling should be used by having the UE report the channel quality. Unfortunately, it is hard to define a unified metric to quantify video quality. Hence, a quantitative comparison between different schemes is difficult, as each scheme focuses on fixing a certain problem and demonstrates its advantage by showing the effect on the related performance metrics. Furthermore, the objective video quality metrics are built on mathematical models that quantify subjective quality assessments, which usually need a trained eye to judge. For the rest of this section, we discuss some popular video performance metrics.
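To make the redundancy trade-off concrete, consider a systematic (n, k) block code in which any k of the n transmitted packets recover the block (true for an MDS code such as Reed-Solomon); the packet counts below are illustrative, not from any cited scheme.

```python
# Overhead of an (n, k) FEC block code: k source packets become n,
# tolerating n - k losses at the cost of (n - k)/k extra spectrum.
# The (n, k) pairs are illustrative assumptions.
def fec_overhead(n: int, k: int) -> float:
    return (n - k) / k

for n, k in [(12, 10), (15, 10), (20, 10)]:
    print(f"({n},{k}): tolerates {n - k} losses, "
          f"{100 * fec_overhead(n, k):.0f}% redundancy")
```

The stronger the protection, the more of the scarce radio spectrum is spent on redundancy rather than video data.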
A. Peak Signal to Noise Ratio (PSNR)
PSNR is a full-reference video quality metric, i.e., it uses the distortion-free version of the video as the reference. Assume we have a video I; PSNR is then measured with reference to a video R, typically a high-quality or distortion-free version of the video I. If the frame size is u × v (in pixels), the PSNR of the i-th frame, PSNR(i), can be calculated using the mean square error (MSE) between the i-th frame of the video I, I_i, and its correspondent in the video R, R_i, as follows [13]:

MSE(i) = (1/(uv)) Σ_{k=0}^{u-1} Σ_{l=0}^{v-1} [I_i(k,l) - R_i(k,l)]²,   (1)

PSNR(i) = 10 log_10 ( MAX² / MSE(i) ),   (2)

where MAX is the maximum possible pixel value (typically 255). Hence, the average video PSNR is given as

PSNR = (1/z) Σ_{i=1}^{z} PSNR(i),   (3)

where z is the number of frames in I and R. PSNR is the most widely used objective video quality metric because of its simplicity. However, PSNR values do not perfectly correlate with the perceived visual quality due to the non-linear behavior of the human visual system [14].

B. Structural Similarity Index (SSIM)
SSIM is another full-reference metric to characterize video quality. Since it takes into consideration the inter-dependency between pixels, it is more consistent with the human eye [15]. SSIM(i) is calculated on corresponding windows x and y of the frames I_i and R_i, respectively:

SSIM_{x,y}(i) = [(2 μ_x μ_y + c_1)(2 σ_{xy} + c_2)] / [(μ_x² + μ_y² + c_1)(σ_x² + σ_y² + c_2)],   (4)

where μ_x and σ_x² are the mean and variance of window x, respectively; likewise, μ_y and σ_y² are the mean and variance of window y. The covariance of x and y is σ_{xy}. The two variables c_1 and c_2 stabilize the division. Hence, the video SSIM is given as

SSIM = (1/z) Σ_{i=1}^{z} SSIM(i).   (5)
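The full-reference computation of Eqs. (1)-(3) translates directly into code; a minimal PSNR sketch for grayscale frames stored as nested lists (the toy 2 x 2 frames are purely illustrative):

```python
import math

MAX_PIXEL = 255.0  # maximum pixel value for 8-bit frames

def mse(frame_i, frame_r):
    """Mean squared error between two u x v frames (Eq. 1)."""
    u, v = len(frame_i), len(frame_i[0])
    total = sum((frame_i[k][l] - frame_r[k][l]) ** 2
                for k in range(u) for l in range(v))
    return total / (u * v)

def psnr_frame(frame_i, frame_r):
    """Per-frame PSNR in dB (Eq. 2); infinite for identical frames."""
    e = mse(frame_i, frame_r)
    if e == 0:
        return float("inf")
    return 10.0 * math.log10(MAX_PIXEL ** 2 / e)

def psnr_video(video_i, video_r):
    """Average PSNR over the z frames of the sequence (Eq. 3)."""
    z = len(video_i)
    return sum(psnr_frame(fi, fr) for fi, fr in zip(video_i, video_r)) / z

# Toy one-frame 2x2 "video" with a single distorted pixel
ref = [[[100, 100], [100, 100]]]
dist = [[[100, 100], [100, 110]]]   # one pixel off by 10 -> MSE = 25
print(round(psnr_video(dist, ref), 2))  # 10*log10(255^2/25) = 34.15 dB
```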
C. Video Quality Metric (VQM)
VQM is another objective full-reference quality metric, developed by the Institute for Telecommunication Sciences (ITS). It shows a high correlation with subjective quality tests. VQM calibrates the video, corrects the temporal and spatial shifts, and then extracts the different quality features. It combines these quality measurements using a linear combination of 7 parameters, based on one of the various models defined for the VQM tool, such as television, conference, general, and PSNR [16].
D. Kullback-Leibler Divergence (KLD)
This is a reduced-reference metric designed to predict the quality of distorted images without full information about the video. It is helpful in real-time streaming schemes, such as video conferencing, where quick feedback is required from the receiver to the transmitter without much information about the original video. The overall distortion, D, between the distorted and reference images can be calculated, according to [17], as

D = log ( 1 + (1/D_0) Σ_{k=1}^{K} | d̂_k(p_k ‖ q_k) | ),   (6)

where K is the number of sub-bands, p_k and q_k are the probability density functions of the k-th sub-bands in the reference and distorted images, respectively, d̂_k is the KLD between p_k and q_k, and D_0 is a constant used to control the scale of the distortion measure.

E. Blind Quality Assessment
These are no-reference metrics, where no information is needed about the original video. Blind techniques are very helpful over cellular networks, where bandwidth utilization matters and the network protocols try to avoid extra signaling overhead. Chen and Song [18] analyzed the common mobile video impairments and used them as metrics for video quality. The first is blockiness, where the image contains small blocks of a single color. The second is blurriness, where edges in the image are not sharp relative to the neighboring pixels. Finally, there is noise in the image due to random variation in the colors. The experimental results show that this blind estimation technique gives results close to the SSIM metric.

IV. APPLICATION LAYER MECHANISMS FOR VIDEO STREAMING
The application layer is responsible for deciding the appropriate encoding techniques for the video frames based on the application type and requirements. For example, video-on-demand services require transmitting high-quality videos but can tolerate a reasonable amount of delay, and hence fewer transmission errors. On the other hand, real-time streaming services are stricter and require low delay and jitter. Video compression/encoding is very important for utilizing the bandwidth, which, as we discussed earlier in Section III, is a scarce resource that the UEs compete over. According to [8], raw video data for HDTV would need at least 1.09 Gbits/s to be appropriately received, while a typical encoded HDTV video application over a 6 MHz channel needs only 20 Mbits/s. The idea of compression is to remove the redundancy in the frames. Redundancy can be spatial, such as when part of the picture has the same color, like a painted wall. The other type is temporal redundancy, such as consecutive frames having the same background. Shannon's lossless coding theorem [19] states that it is impossible for a lossless compression code to have a rate less than the source entropy.

The year 1984 marks the birth of the first international video coding standard, H.120 [20], by the ITU-T, formerly known as the International Telegraph and Telephone Consultative Committee (CCITT, from the French Comité Consultatif International Téléphonique et Télégraphique). The performance of H.120 was remarkable in spatial resolution, but it had very poor temporal resolution. After that, different standards evolved, especially the H.26x family and the MPEG family. In the rest of this section, we discuss some of the coding schemes most used in cellular networks and the recent work related to these schemes.
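The compression gap is easy to appreciate with back-of-the-envelope arithmetic. The sampling parameters below are assumed for illustration and do not reproduce the exact 1.09 Gbits/s figure from [8]:

```python
# Illustrative raw-bitrate arithmetic (assumed parameters, not the
# exact sampling format behind the survey's 1.09 Gbit/s figure).
WIDTH, HEIGHT = 1920, 1080        # full-HD frame
BITS_PER_PIXEL = 12               # 8-bit YUV 4:2:0 -> 1.5 bytes/pixel
FPS = 30

raw_bps = WIDTH * HEIGHT * BITS_PER_PIXEL * FPS
encoded_bps = 20e6                # typical encoded HDTV rate from [8]

print(f"raw: {raw_bps / 1e6:.1f} Mbit/s")                 # raw: 746.5 Mbit/s
print(f"compression ratio: {raw_bps / encoded_bps:.1f}x") # about 37.3x
```

Even with these modest assumptions, the codec must shave more than an order of magnitude off the raw rate before the stream fits a broadcast channel.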
A. MPEG-4
The first version of the MPEG-4 standard was finalized in 1998 [21]. MPEG-4 is designed to work across a variety of bit rates, from a few kilobits per second to tens of megabits per second, which makes it very suitable for the unstable wireless environment. The concept of profiles was introduced in MPEG-4; hence, it is suitable for different video sources, communication techniques, and applications. The most important development in MPEG-4 is the adoption of an object-based representation technique, where the scene is coded based on individual objects rather than pixels. As shown in Figure 4, the frame is divided into different video object planes (VOPs) that can be decoded independently and manipulated.

Fig. 4: An example of the object-based representation in MPEG-4.

All these advantages encouraged service providers and cellular network operators to use MPEG-4 in different video streaming applications, as it satisfies different ranges of requirements. A performance evaluation of MPEG-4 over UMTS is shown in [22]. The results show that MPEG-4 over UMTS in the unacknowledged mode provides timely delivery but no error recovery. On the other hand, the acknowledged mode enhances the radio link control (RLC) block error rate with an acceptable video quality, due to using the Hybrid Automatic Repeat Request (HARQ), as we will see later in Section VI. Increasing the RLC pre-decoder buffer can help increase the video quality and ensure that the packets are received in a timely fashion. Our research group in [23] explores the use of MPEG-4 characteristics in novel scheduling techniques to maximize the average quality in a multi-user Wideband Code Division Multiple Access (WCDMA) cellular network. In fact, the network parameters and settings directly affect the performance of MPEG-4 video transmission.
These network parameters can be adjusted based on the application, as shown in [24], [25]. MPEG-4 performance over LTE is investigated in [26], which discusses the LTE downlink air interface capacity using realistic MPEG-4 traffic models. The results show the tradeoff between user outage and video frame loss for different numbers of users, as shown in Figure 5.

Fig. 5: Tradeoff between user outage and frame loss in MPEG-4 over LTE [26].
B. H.264
H.264, also known as Advanced Video Coding (AVC), is another ITU standard, started in 2003 and completed in 2004. The standard adds many extensions over MPEG-4 and the H.26x series in general [27]. These extensions accommodate new application requirements, increase the compression ratio, and enhance the playback quality. The standard adds five new profiles for professional streaming services, especially real-time video and surveillance applications. A major addition in the H.264 standard is Scalable Video Coding (SVC). The H.264 SVC standard is well suited for wireless environments in general, and cellular networks in particular, which exhibit variable link quality due to shadowing, multipath fading, and limited bandwidth [28]. These factors can cause link quality degradation, leading to a reduction in the video delivery rate as well as an increase in the pixel error rate. SVC offers three different scalability options. The first is quality scalability, in which the data and decoded samples of lower-quality layers are used to predict higher-quality layers, reducing the rate required for the higher-quality layers. The second option is temporal scalability, where complete frames are dropped from a video using motion dependency. Finally, there is spatial scalability, where videos are coded at multiple resolutions. A streaming device can use any of these scalability options or combine them depending on the type of video, the application, and the user's requirements. Consequently, H.264 SVC has many levels and profiles that differ in the level of compression, bit rate, and size. Another addition in H.264 is multi-view coding (MVC). This comes in handy for efficiently coding the same scene from different viewpoints; hence, a frame from a certain camera can be temporally predicted from other cameras' frames in addition to the same camera [29]. All these additions and more have made H.264 one of the most popular video coding schemes for LTE.
A statistical and simulation analysis is conducted in [30] to evaluate H.264 performance over LTE. The authors use some of the metrics discussed in Section III, such as PSNR, SSIM, blocking, and blurring, to compare the different scalability options. However, the results show that scalability by itself is not enough to avoid video quality degradation. Hence, smart scheduling and efficient routing techniques are needed, which will be discussed in the following sections. The scalability of SVC, the high LTE data rates, and developments in cloud computing add new dimensions to streaming services, especially with social networks. As proposed in [31], cloud-based agents are created for each active UE. These agents are responsible for adjusting the video quality using SVC, based on the feedback information received from the UE, to prefetch videos from the social networks through the cellular network. This can help reduce network congestion by fetching videos to the users in advance when the network is not crowded, such as from midnight to 6 am.
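As an illustration of quality scalability, a receiver-side layer-selection rule might look as follows; the layer bitrates and the greedy policy are hypothetical, not the scheme of [30] or [31]:

```python
# Sketch of SVC quality-layer selection (hypothetical layer bitrates):
# send the base layer plus as many enhancement layers as the estimated
# link rate allows.
LAYER_KBPS = [400, 300, 300, 500]   # base + 3 enhancement layers (assumed)

def select_layers(link_kbps):
    """Return how many SVC layers fit within the estimated link rate.
    The base layer is always kept, even on a very poor link."""
    total, count = 0, 0
    for rate in LAYER_KBPS:
        if total + rate > link_kbps and count >= 1:
            break
        total += rate
        count += 1
    return count

print(select_layers(1100))  # base + 2 enhancement layers -> 3
print(select_layers(200))   # degraded link: base layer only -> 1
```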
C. High Efficiency Video Coding (HEVC)
HEVC is the successor of H.264 in video coding. The standard went through a long development from 2004 until it was finalized in 2013. It is also known by the names of some of its development stages, such as H.265 and MPEG-H Part 2 [32]. HEVC aims to double the compression ratio while halving the complexity level. HEVC has new features to increase the compression ratio and the video quality, such as the coding tree unit (CTU). HEVC replaces the concept of macroblocks with the CTU, a large block of pixels of variable size that encodes the frames more efficiently. Another extension is the parallel processing ability, where the frame is divided into tiles, each of which can be encoded and decoded independently. Also, HEVC has at least four times more prediction modes than H.264, which makes the prediction more sophisticated and results in better playback quality. Moreover, HEVC adds new profiles to support displays up to 8192 × 4320 pixels, compared to 4096 × 2304 pixels in H.264. An evaluation of the use of HEVC over mobile networks is presented in [33]. In addition, HEVC is considered to be a highly effective encoding standard for some recent applications over LTE, such as telemedicine, as suggested in [34].

D. Summary and Discussion
In this section, we discussed different well-known video coding schemes suitable for video streaming over LTE networks. It is clear that the video coding industry is continuously developing and is pushed to evolve by different entities. Investment from industry has a huge impact on the direction of the enhancements. Content providers such as Google, Netflix, and Hulu push for lower coding complexities without quality degradation. On the other hand, device manufacturers, such as Sony and Samsung, want video encoding to deliver a better quality that fully utilizes their hardware platforms to satisfy the end user. As we mentioned earlier, quantitative comparison between different schemes is hard; moreover, the tradeoff between the provider and the manufacturer is difficult to characterize. However,
there have been some trials to show the compression ratio difference between different video schemes on the same platform, as in [35]. The comparison results are shown in Table I.

Standard    vs. H.264    vs. MPEG-4    vs. H.263
HEVC        35.4%        63.7%         65.1%
H.264       N.A.         44.5%         46.8%
MPEG-4      N.A.         N.A.          16.2%

TABLE I: Comparison between different coding schemes based on average bitrate reduction [35].

Apparently, HEVC can be the future of video coding schemes over wireless networks; however, HEVC adoption is still in progress. While HEVC can help content producers and distributors deliver better-quality content at the current bitrate, it still needs a lot of work and can form interesting research topics. Also, upgrades to current hardware platforms are required, hence compatibility issues that can generate many interesting research problems. Industry investment plays an important role in the adoption of HEVC. The current revolutionary movement toward adaptive streaming makes HEVC a good candidate for video delivery over wireless networks. In general, adaptive video streaming introduces better end-to-end quality. The content provider adapts the transmitted video quality according to some network measurements, such as congestion, rate of ACKs, and buffer underflow. This adds additional complexity and signaling overhead to the network. This loop of signaling feedback and quality decisions forms the adaptive bitrate (ABR) control loop shown in Figure 2. Dynamic Adaptive Streaming over HTTP (DASH) is one solid example that uses the ABR control loop concept to enhance the user's experience. DASH is widely used by Netflix, YouTube, and Hulu. Using the HTTP protocol, the client downloads chunks from the server and can then seamlessly reconstruct the original media stream.
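This chunked download loop leaves the bitrate decision to the client; one plausible decision rule can be sketched as follows (the bitrate ladder, safety factor, and buffer threshold are hypothetical, not the algorithm of any cited deployment):

```python
# Minimal ABR decision sketch (hypothetical bitrate ladder): pick the
# highest representation the recent throughput estimate can sustain,
# falling back to the lowest when the playout buffer runs low.
LADDER_KBPS = [250, 500, 1000, 2500, 5000]   # assumed DASH representations
SAFETY = 0.8            # request below the raw estimate to absorb variance
LOW_BUFFER_S = 5.0      # below this, prioritize avoiding a video freeze

def next_bitrate(throughput_kbps, buffer_s):
    if buffer_s < LOW_BUFFER_S:
        return LADDER_KBPS[0]           # protect against re-buffering
    budget = SAFETY * throughput_kbps
    chosen = LADDER_KBPS[0]
    for rate in LADDER_KBPS:
        if rate <= budget:
            chosen = rate
    return chosen

print(next_bitrate(3000, 20.0))  # healthy buffer, budget 2400 -> 1000
print(next_bitrate(3000, 2.0))   # near-empty buffer -> 250
```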
During download, the client dynamically requests fragments at the right encoding bitrate to maximize the quality of the streaming application, typically determined by factors such as startup delay, video freezes due to re-buffering, and the playback video bitrate [36], [37], and to reduce network congestion. There are a number of deployed DASH solutions. Adobe Dynamic Streaming for Flash [38] is available in the latest versions of Flash Player and Flash Media Server, which support adaptive bit-rate streaming over the traditional Real-Time Messaging Protocol (RTMP) as well as HTTP. Apple HTTP Live Streaming (HLS) is an HTTP-based media streaming communications protocol implemented by Apple. Microsoft Smooth Streaming [39] enables adaptive streaming of media to clients over HTTP.

An advanced prototype for streaming using DASH over LTE is deployed in [40]. The demo results suggest the possibility of adaptively streaming next-generation video content over LTE networks with a very reliable quality of experience (QoE). In [41], Thomson Video Networks (TVN) shows the possibility of high-quality live broadcasting services over LTE, such as HD-TV multi-broadcasting for games and events, by combining three technology enablers.
The first is HEVC, for its high compression ratio to save bandwidth. The second is dynamic adaptive streaming over HTTP (DASH), to adapt the transmission to the user's channel quality. Finally, the evolved Multimedia Broadcast Multicast Service (eMBMS) provides efficient broadcast delivery to the UEs over LTE.

The problem with such end-to-end adaptive control loops is that they take a long time to respond to network variations, which sometimes negatively affects the rest of the network elements. In LTE networks, there are fast variations in channel quality and demand. A content provider may decide to assign a certain user a high-quality video, according to its measurements, which forces the radio network either to assign more resources to this UE at the expense of other users and other traffic types, or the UE may suffer buffer underflow if the network decides not to honor the content provider's required rate. Moreover, reporting the LTE network's fast-paced measurements to the content provider introduces complexity and signaling overhead. Hence, this calls for optimizing all network elements simultaneously. An alternative to the application-layer ABR is a cross-layer design in which the ABR control is tightly integrated with the MAC scheduler. This can open doors to more enhancements and innovative designs beyond the existing schemes. We will discuss this in more detail later in Section VII.

V. TRANSPORT LAYER MECHANISMS FOR VIDEO STREAMING
The transport layer, often referred to as layer 4, is responsible for providing an end-to-end service for the different applications via the transport-layer control loop shown earlier in Figure 2. The transport layer can also provide a reliability option for the received packets by using error detection means such as checksums, and then notifying the sender, using ACK/NAK, to retransmit the corrupted or lost packets. In addition, the transport layer applies flow control mechanisms to prevent buffer overflow when the sender is faster than the receiver, as well as congestion control mechanisms to mitigate the effect of low-quality links and network congestion by slowing down the sender [42]. In multimedia and streaming applications, the transport layer has to ensure end-to-end quality and handle many challenges, such as jitter, data priorities, packet reordering, delay, bandwidth availability, and session establishment and maintenance [43]. Unlike in wired networks, assumptions of no interference cannot be applied in wireless networks, as packet losses result from the noisy, time-varying channel as well as the usual congestion [44], [45]. In the following subsections, we show some of the common layer-four protocols for video streaming.
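Retransmission and congestion control both rest on round-trip-time estimation; a minimal sketch of the standard smoothed-RTT/RTO estimator (EWMA constants alpha = 1/8 and beta = 1/4 as in RFC 6298; the RTT samples are made up):

```python
# Sketch of TCP's smoothed RTT and retransmission-timeout estimator
# (EWMA constants alpha = 1/8, beta = 1/4, as in RFC 6298).
ALPHA, BETA = 1 / 8, 1 / 4

def update(srtt, rttvar, sample):
    """Fold one RTT sample (seconds) into the running estimates."""
    rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - sample)
    srtt = (1 - ALPHA) * srtt + ALPHA * sample
    rto = srtt + 4 * rttvar
    return srtt, rttvar, rto

srtt, rttvar = 0.100, 0.050          # initial estimates (assumed)
for sample in [0.110, 0.105, 0.300]: # a latency spike on the last ACK
    srtt, rttvar, rto = update(srtt, rttvar, sample)
print(round(rto, 3))  # 0.418: the spike inflates the timeout
```

On a wireless link, delay spikes from retransmissions or handovers inflate the RTO in exactly this way, which is one reason TCP reacts sluggishly to fast-varying radio conditions.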
A. Transmission Control Protocol (TCP)
Different transport protocols are used for media streaming. The most well-known core transport protocol is TCP, as it supports a variety of traffic types, including multimedia streaming. TCP was first specified in 1974 [46]; many enhancements and additions have been applied to it over the years while keeping the basic operation the same. TCP was originally optimized for wired networks as a connection-oriented protocol to handle flow and congestion control and to ensure the reliability of the received packetized data. Most TCP versions have congestion control and reliability mechanisms to recover lost data [47]. Reliability is obtained using the cumulative acknowledgment scheme, in which the receiver sends an ACK carrying a sequence number to inform the sender that it has successfully received all packets preceding that acknowledged sequence number; the sender then retransmits the lost packets. Moreover, TCP uses the ACK timestamps to estimate the round-trip time (RTT). Based on the estimated RTT values, the sender's rate is adjusted to avoid congestion and decrease packet losses.
Unlike in wired networks, most packet drops in wireless environments are caused not by congestion but by temporary degradation of the link quality due to fading, interference, or shadowing. When packets are lost due to link-quality degradation, TCP nevertheless enters congestion avoidance mode, which unnecessarily and quickly decreases the sender's rate. Consequently, radio resources are wasted by the sender's rate back-off. There have been attempts to adapt TCP to the wireless environment and to utilize resources better [44], [48]. These modifications contributed to the use of TCP in LTE to deliver different data traffic types in general, and media streaming in particular. A study is conducted in [49] to evaluate the performance of TCP running in an LTE network under a severe vehicular environment.
The study shows that the aggregated TCP throughput obtained varies among users based on the radio scheduling algorithm (discussed in more detail in Section VI) and the severity of the channel conditions. Although some studies suggest that TCP may not be suitable for media streaming due to its back-off and retransmission mechanisms and the resulting delays [50], it is still commonly used for commercial streaming traffic because of its reliability [51]. An analytical study of the performance of TCP for live and stored media streaming under various conditions is conducted in [52]. The study shows that TCP provides good performance when the achievable TCP throughput is roughly twice the media bit rate, with a few seconds of start-up delay.
As a result, media streaming over LTE using TCP is possible with slight modifications to optimize TCP performance over LTE networks for video streaming applications. In [53], the authors present a novel adaptive TCP rate control scheme to stream SVC-encoded videos while accommodating varying channel conditions. The TCP rate adaptation scheme yields significant improvements in losses, playback interruption, delay, and buffer size. Figure 6 shows the improvements in three different measurements when using TCP rate adaptation along with SVC. It is worth mentioning that the algorithm adopted in [53] is not optimized for LTE; hence, better performance can be obtained by using TCP versions more optimized for LTE.
Another study is conducted in [54] to determine the optimal UE buffer size for smooth playback and to mitigate the TCP sawtooth throughput behavior. Increasing the receiver buffer leads to smoother playback. However, increasing the receiver buffer for video streaming applications comes at the expense of the other running applications, especially given the limited memory of the UE.
Fig. 6: TCP adaptation performance over LTE for video streaming. (a) Scenario without adaptation. (b) Scenario with adaptation.
The study states that, given a network model characterized by the packet loss rate $p$, RTT $R$, and retransmission timeout $T$, the receiver buffer size $q$ that achieves a desired buffer under-run probability $P_u$ is given by
$$ q \geq \frac{3}{2\,p\,P_u}\left[1 + \frac{9T}{bR}\min\!\left(1, \sqrt{\tfrac{3bp}{8}}\right) p\,(1 + 32p^2)\right], \qquad (7) $$
where $b$ is the number of packets acknowledged per ACK. Another enhancement is suggested in [55] to improve TCP performance for video streaming over LTE by using forward admission control to reduce the impact of handovers between small cells on the TCP throughput. However, this scheme can be bandwidth consuming due to the extra signaling.
In general, TCP can be used for video streaming over the LTE cellular network due to its highly desirable reliability. However, TCP falls short of satisfying the delay requirements because of its retransmission mechanism, and it cannot guarantee real-time delivery for real-time streaming services such as video conferencing. Another disadvantage of TCP is that it stalls when packet losses happen.
B. Real-time Streaming Protocol (RTSP)
RTSP is a protocol designed for entertainment and communication to control streaming media servers and facilitate media streaming over the network. RTSP provides several commands for streaming, such as pause, play, record, and more. RTSP can be used along with other protocols such as TCP and UDP to add timing urgency to the streaming data. RTSP can stream concurrent sessions and keeps track of the state of each session. Furthermore, RTSP favors speed over reliability by using asynchronous QoS metrics such as packet-loss counts, jitter, and round-trip delay times [56]. RTSP can be used for real-time traffic such as multi-player gaming and video calling applications such as Skype. RTSP is also one of the main multimedia streaming protocols for 3G mobile technology [57]; hence, LTE can use a similar standard, as illustrated in [57], for live video streaming. In [58], an analysis of RTSP performance over LTE and WiMAX is conducted. The simulations in this study are done using OPNET [59] to measure various network statistics such as end-to-end delay, traffic throughput, jitter, and packet loss. RTSP ensures that packets are delivered on time, at the price of a higher probability of packet losses or errors than TCP. The main disadvantage of RTSP is that it uses multicast, which is not supported by many routers and is occasionally blocked by firewalls.
C. Stream Control Transmission Protocol (SCTP)
This protocol was defined in 2000 by the Internet Engineering Task Force (IETF) [60]. SCTP is a message-oriented transport protocol, unlike TCP, which transports a continuous data stream. One of the main advantages of SCTP is its multi-streaming capability. This can help multi-interface devices (such as phones with WiFi and LTE) receive video packets over different Internet paths, leading to better video quality; this is beyond the scope of this survey, but interested readers can find more information in [61]. Another advantage of SCTP is the independent ordering of packets within each stream, which allows the application to choose whether to process the received messages in the order they are received or in the order they were sent [62]. These two advantages make SCTP a very desirable protocol for LTE traffic in general and multimedia streaming in particular. As we mentioned before, most LTE packet losses result from degradation of the wireless environment. Thanks to its multi-streaming capability, SCTP overcomes the problems of traffic stalling and resource waste caused by LTE channel variations: when an error happens in one stream, it does not affect any of the other streams, so packet delivery is not suspended. SCTP also has a more advanced congestion control mechanism than TCP, consisting of three phases: slow start, congestion avoidance, and fast retransmit. As a result, many researchers, including the 3GPP group itself, have directed their efforts toward investigating the possibility and suitability of using SCTP in LTE. A comparison between TCP and SCTP in LTE is introduced in [63]; it concludes that SCTP is more suitable for LTE than TCP due to its multi-streaming nature.
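The head-of-line blocking advantage described above can be illustrated with a toy model (not a real SCTP implementation, and the helper name `deliverable` is ours): when all packets share one ordered stream, a single loss stalls everything sent after it until the retransmission arrives, whereas with independently ordered streams only the stream that suffered the loss stalls.

```python
# Toy model: count packets that can be delivered in order, without waiting
# for a retransmission, given one lost packet. Compares a single ordered
# stream (TCP-like) with independently ordered streams (SCTP-like).

def deliverable(packets, lost_seq, streams):
    """packets  -- list of (seq, stream_id) in send order
    lost_seq -- sequence number of the single lost packet
    streams  -- True: order enforced per stream (SCTP multi-stream);
                False: order enforced over the whole connection (TCP)."""
    delivered = 0
    blocked = set()  # streams (or the whole connection) stalled by the loss
    for seq, sid in packets:
        key = sid if streams else "connection"
        if seq == lost_seq:
            blocked.add(key)  # everything after the loss on this key must wait
            continue
        if key not in blocked:
            delivered += 1
    return delivered

# Four packets round-robined over two streams; packet 1 (on stream 1) is lost.
pkts = [(0, 0), (1, 1), (2, 0), (3, 1)]
tcp_like = deliverable(pkts, lost_seq=1, streams=False)   # loss stalls all later packets
sctp_like = deliverable(pkts, lost_seq=1, streams=True)   # only stream 1 stalls
```

With the TCP-like ordering only the packet sent before the loss is immediately deliverable, while the SCTP-like ordering also delivers the later packet on the unaffected stream.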
D. Summary and Discussion
In this section, we discussed different well-known transport protocols. Based on our evaluation, we think that SCTP is by far the most suitable protocol among those discussed for stored video streaming over LTE, and that RTSP is more suitable for live multimedia streaming. TCP is very popular, widely deployed, and well researched; hence, it is still used for some video traffic. However, there is plenty of room for research and enhancements to optimize the performance of these protocols, or to introduce new ones that can achieve a breakthrough for multimedia streaming over LTE. The new solutions must be able to reduce jitter at high data rates and provide accurate data ordering and segmentation. Moreover, future solutions should introduce new and fast ways to handle congestion and packet losses, taking the wireless channel variability into consideration. Our research group believes that the cross-layer design approach can introduce a new dimension and provide new options to enhance performance, as will be discussed in Section VII. Also, we think that using multiple transport protocols in parallel (such as one for delay and another for delivery) is not the optimal solution, as it introduces more delay and complexity into the network. Another approach is using QoS-aware or context-aware protocols that can optimize performance not only according to the traffic but also according to the user's experience and the wireless environment, such as the over-the-top approaches followed in [64].
VI. MAC LAYER MECHANISMS FOR VIDEO STREAMING
The MAC layer is responsible for addressing, channel access mechanisms, and organizing the sharing of the medium among different users. In the wireless environment, the MAC layer provides power control, bandwidth assignment, interference reduction, and collision avoidance [65]. In high-speed cellular networks, different types of services with different priorities are requested by the UEs, such as web browsing, voice over IP, video calling, file downloads, and many more. Hence, another mission of the MAC layer is to provide packet delay assurance and handle the different traffic priorities. For example, downloading a file has overall rate and reliability requirements, while video streaming or video conferencing has stricter per-packet delay requirements. That is to say, if a video frame is received late, it is dropped because it is no longer needed, which degrades the video quality. The MAC-layer controls are represented by the RLC loop and the scheduler loop shown in Figure 2. In the following subsections, we discuss the different MAC layer aspects and their importance to video delivery.
A. Multiple Access Mechanisms
There are several common access protocols for packet wireless networks, such as carrier sense multiple access with collision avoidance (CSMA/CA), which is used in WiFi 802.11 [66]. Another common protocol is code division multiple access (CDMA), in which several UEs can simultaneously transmit over a single communication channel [67]; CDMA is commonly used in UMTS. LTE uses OFDMA as the multiple access technique in the downlink and single-carrier FDMA (SC-FDMA) in the uplink. SC-FDMA has a lower peak-to-average power ratio (PAPR) than OFDMA, which makes it favorable in the uplink to increase the UE's power efficiency and battery life [68].
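To see why PAPR matters, the toy computation below contrasts the two extremes: a plain single-carrier QPSK signal has a constant envelope (PAPR of exactly 1), while an OFDMA-style multicarrier signal exhibits large peaks whenever many subcarriers add coherently. A real SC-FDMA waveform falls between these extremes; this sketch only illustrates the multicarrier penalty.

```python
import cmath
import random

def papr(signal):
    """Peak-to-average power ratio of a complex baseband signal."""
    powers = [abs(s) ** 2 for s in signal]
    return max(powers) / (sum(powers) / len(powers))

def idft(symbols):
    """Naive inverse DFT: maps per-subcarrier symbols to one OFDM time symbol."""
    n = len(symbols)
    return [sum(symbols[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

random.seed(7)  # deterministic illustration
qpsk = [complex(random.choice((-1, 1)), random.choice((-1, 1)))
        for _ in range(64)]

papr_sc = papr(qpsk)         # single-carrier QPSK: constant envelope, PAPR = 1
papr_ofdm = papr(idft(qpsk)) # OFDM: subcarriers add coherently at some instants
```

For random data the multicarrier PAPR is well above 1, which is exactly the extra headroom the UE's power amplifier would have to provide; hence the uplink choice of SC-FDMA.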
B. Modulation and Coding Schemes
LTE uses the concept of adaptive modulation and coding (AMC) to change the packet coding and modulation based on the UE channel quality and the radio bearer requirements. These requirements depend mainly on the traffic type and are specified by the QoS Class Identifier (QCI) table given in the 3GPP standard [69], shown in Figure 7. The QCI attribute determines the radio bearer traffic type, the maximum allowed packet error rate, and the maximum packet delay. AMC can be used to meet these QCI and rate requirements, especially for video traffic. For example, a user with a strong channel can tolerate more errors, so the eNB can increase its overall rate by raising the coding rate and the constellation order (bits/symbol), which eventually increases the video quality as more enhancement layers are received successfully. On the other hand, a user with a weak channel cannot tolerate many errors, so the eNB uses lower coding rates with more redundancy, which eventually helps decrease the packet loss. This is essential in error-sensitive applications such as interactive gaming or multi-resolution broadcast, as proposed in [70] and [71]. Accordingly, the eNB includes information in each packet that specifies the modulation and coding scheme (MCS) for the next packet.
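To make the AMC decision concrete, the sketch below picks the highest-throughput modulation and coding scheme whose SNR requirement is met. The table entries and SNR thresholds are hypothetical round numbers chosen for illustration, not values from the 3GPP tables; a real eNB derives its choice from the UE's CQI report and a target block error rate.

```python
# Illustrative AMC selection (hypothetical MCS table, not from 3GPP specs).
MCS_TABLE = [
    # (name, bits/symbol, code_rate, min_snr_db) -- thresholds are made up
    ("QPSK 1/3",  2, 1 / 3,  0.0),
    ("QPSK 2/3",  2, 2 / 3,  5.0),
    ("16QAM 1/2", 4, 1 / 2, 10.0),
    ("16QAM 3/4", 4, 3 / 4, 14.0),
    ("64QAM 3/4", 6, 3 / 4, 18.0),
]

def select_mcs(snr_db):
    """Pick the highest-throughput MCS whose SNR requirement is met."""
    feasible = [m for m in MCS_TABLE if snr_db >= m[3]]
    if not feasible:
        return MCS_TABLE[0]  # fall back to the most robust scheme
    # spectral efficiency = bits/symbol x code rate
    return max(feasible, key=lambda m: m[1] * m[2])
```

A strong channel (say 20 dB) maps to 64QAM with a high code rate, while a weak channel (say 6 dB) stays on QPSK with more redundancy, mirroring the trade-off described above.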
C. PRB Scheduling
The MAC layer in LTE is also responsible for managing the allocation functions, prioritizing the logical channels and their mapping to transport channels, scheduling information reporting, and managing HARQ, which is a transport-block-level automatic retry. The MAC layer also selects the transport format and provides measurement information about the network, while the radio link control (RLC) layer is responsible for packet segmentation and reassembly.
The LTE TTI scheduler is one of the vital MAC layer functions, with great influence on the video quality over LTE. The time granularity of the TTI scheduler is the PRB unit, i.e., 1 ms. The type of scheduler used by the eNB determines the resource distribution among the different users, and hence the quality. The LTE scheduler has to ensure the time as well as the rate constraints for each user and radio bearer. One user can have multiple radio bearers, each carrying a different traffic type with different constraints. Hence, researchers have developed different types of LTE schedulers to cover many performance aspects. Some schedulers focus on enhancing resource utilization and overall rate at the expense of fairness among users, while others consider fairness their first priority [72]. In the following, we introduce some of the common TTI schedulers and discuss the trade-off between fairness and spectral efficiency. We also show how this trade-off affects the video traffic and quality. Scheduling algorithms in general can be categorized, based on channel knowledge, into two types: channel non-aware and channel-aware schedulers.
Fig. 7: LTE standardized QCI values [69].
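As a small illustration of how a scheduler can use the QCI attributes when serving mixed traffic, the sketch below orders radio bearers by QCI priority. The table rows are a subset quoted from memory of the standardized QCI values (resource type, priority, packet delay budget, packet error loss rate) and should be checked against the 3GPP table in Figure 7; the helper name `schedule_order` is ours.

```python
# A few QCI rows, quoted from memory of the 3GPP standardized table --
# verify against TS 23.203 / Figure 7 before relying on the numbers.
QCI = {
    1: ("GBR", 2, 100, 1e-2),      # conversational voice
    2: ("GBR", 4, 150, 1e-3),      # conversational (live) video
    4: ("GBR", 5, 300, 1e-6),      # buffered streaming video
    9: ("non-GBR", 9, 300, 1e-6),  # best-effort traffic, e.g. file download
}

def schedule_order(bearers):
    """Order radio bearers by QCI priority (lower number = served first)."""
    return sorted(bearers, key=lambda b: QCI[b["qci"]][1])

bearers = [{"ue": "A", "qci": 9},   # background download
           {"ue": "B", "qci": 2},   # live video
           {"ue": "C", "qci": 1}]   # voice call
order = [b["ue"] for b in schedule_order(bearers)]  # voice, video, download
```

The delay budget and loss-rate columns are what an AMC- and deadline-aware scheduler would consult next: a 100 ms voice budget leaves far less slack than a 300 ms buffered-video budget.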
D. Channel Non-Aware Schedulers
First-in first-out (FIFO) is the simplest channel-unaware scheduler; however, it is neither efficient nor fair, especially for video traffic, because the frame delay limits vary across different videos, services, and encodings. Another simple scheduler is round robin (RR), which guarantees fairness in resource occupation time but not in throughput, which is more important for video traffic [73]. Weighted fair queuing (WFQ) assigns weights to each user or class of users based on their traffic type. This idea can help prioritize traffic [74]; however, it needs to be used together with another scheduler, such as round robin, to avoid starvation. Our research group thinks that the weights in WFQ could be optimized as a function of the QCI values supported in the LTE MAC layer. A similar idea has been introduced using the packet delay instead of the queue length as the metric, giving the highest priority to the packets with the closest deadline. This scheduler can be used in services such as video conferencing, where the quality is highly affected by frame delay. A performance evaluation of the different channel non-aware schedulers is conducted using OPNET in [75].
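The deadline-first idea above can be sketched as follows: among the queued video packets, serve the one whose playout deadline is closest, and drop packets whose deadline has already passed, since a late frame is useless to the decoder. The helper name `edf_schedule` and the packet format are ours.

```python
import heapq

def edf_schedule(packets, now, slots):
    """Earliest-deadline-first sketch.

    packets -- list of (deadline_ms, packet_id)
    now     -- current time in ms; packets past their deadline are dropped
    slots   -- number of transmission opportunities this TTI
    """
    heap = [(d, pid) for d, pid in packets if d > now]  # drop late packets
    heapq.heapify(heap)
    served = []
    for _ in range(min(slots, len(heap))):
        served.append(heapq.heappop(heap)[1])  # closest deadline first
    return served

pkts = [(120, "p1"), (40, "p2"), (10, "p3"), (80, "p4")]
# At t = 20 ms, p3's deadline has passed; the two slots go to p2 then p4.
served = edf_schedule(pkts, now=20, slots=2)
```

Note that dropping `p3` outright, rather than transmitting it late, is exactly the behavior the MAC layer needs for conversational video.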
E. Channel Aware Schedulers
The other type of LTE schedulers is the channel-aware schedulers. There has been plenty of research on estimating the LTE channel, such as [76]–[78]. Channel-aware schedulers use the channel quality indicator (CQI), sent from the UEs to the eNB, to estimate the channel quality between the eNB and the UE. The simplest channel-aware scheduler is maximum throughput, which assigns the PRB to the UE with the maximum achievable throughput. Although this strategy can increase a UE's video quality, it does not take fairness into consideration [79]. Hence, some users may starve; moreover, applications such as interactive gaming or video conferencing can be greatly degraded by starvation. The proportional fair (PF) scheduler addresses the trade-off between achievable throughput and fairness. The main idea of the PF scheduler is that it uses the average past received throughput as a metric, which ensures that users with bad channel conditions do not starve [80], [81]. Another channel-aware scheduler is throughput to average (TTA). TTA can be considered an intermediate scheduler between maximum throughput and PF. The metric for this scheduler is the maximum throughput for a certain PRB normalized by the overall average throughput: the more average throughput a user obtains, the lower the metric. Hence, it gives other users an opportunity to utilize the resources and decreases the probability of starvation. However, this metric does not take time constraints into consideration, which makes it unsuitable for delay-sensitive videos. A performance evaluation comparing the different scheduler implementations in the LTE environment is carried out via NS3 [82] in [83].
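The PF mechanism described above can be sketched in a few lines: each TTI, the PRB goes to the user with the largest ratio of instantaneous achievable rate to average delivered throughput, and the averages are updated with an exponential moving average. The smoothing constant `alpha` is a tunable parameter of our sketch, not a standardized value.

```python
# Proportional fair (PF) metric sketch: serve the user maximizing
# instantaneous_rate / average_throughput, then update the averages.

def pf_select(inst_rates, avg_thr):
    """Index of the user with the largest PF metric this TTI."""
    return max(range(len(inst_rates)),
               key=lambda u: inst_rates[u] / max(avg_thr[u], 1e-9))

def pf_update(avg_thr, winner, inst_rates, alpha=0.05):
    """EMA update: the winner's average rises, everyone else's decays."""
    for u in range(len(avg_thr)):
        served = inst_rates[u] if u == winner else 0.0
        avg_thr[u] = (1 - alpha) * avg_thr[u] + alpha * served

# Two users: user 0 always has twice the rate of user 1, yet user 1 is not
# starved, because its decaying average eventually makes its metric dominate.
avg = [1.0, 1.0]
wins = [0, 0]
for _ in range(200):
    rates = [2.0, 1.0]
    w = pf_select(rates, avg)
    wins[w] += 1
    pf_update(avg, w, rates)
```

Running the loop shows both counters grow: the weaker user's shrinking average inflates its PF metric until it is served, which is precisely the anti-starvation property that maximum throughput lacks.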
F. Summary and Discussion
The LTE MAC layer has many enhancements and advantages that can accommodate the streaming of high-quality and high-definition videos over cellular networks and adapt to the changing channel environment. However, we believe there is still much room to add further enhancements and introduce new algorithms that are specifically designed for video streaming. These new algorithms should integrate the advantages of the MAC layer, such as AMC and the newly proposed LTE schedulers, with the video quality-of-service metrics and encoding information used by the service provider, creating hybrid cross-layer design schemes for multimedia applications, as will be discussed in Section VII. Open research questions in this area include which metrics should be used and what the effects are of partial and/or full use of all the available metrics and feedback information. Furthermore, the complexity and processing overhead that these hybrid algorithms add to the MAC layer need to be studied to evaluate the quality gain versus the complexity.
VII. CROSS LAYER TECHNIQUES FOR MULTIMEDIA STREAMING APPLICATIONS
Based on the earlier discussion of the LTE cellular network and the work done to accommodate multimedia streaming applications, it clearly appears that the performance of LTE cellular networks is not yet optimized for end-to-end multimedia streaming delivery. This is natural, as LTE is not designed to carry video traffic alone, but also other types of services such as web surfing and file transfer, in addition to voice calls using VoLTE. The previously discussed delivery algorithms and mechanisms are limited by the LTE standard and optimize only within one of the seven layers of the Open Systems Interconnection (OSI) model introduced in [84].
The OSI layered model enforces a hierarchy among the layers and does not allow communication between them; however, this defies the dynamic nature of the new cellular networks, including LTE. In other words, the new cellular environments, with their varied equipment capabilities and dynamic traffic, need to be self-adapting based on the dynamic factors in the network. Such a concept of self-configurable heterogeneous networks is hard to achieve without exchanging information between the network layers, i.e., cross-layer design [85]. Moreover, to optimize the end-to-end delivery performance for any application in general, and video streaming over the LTE network in particular, application and network information need to be exchanged between the different layers. Hence, in cross-layer design we can exploit the dependencies between layers instead of treating each layer as an independent entity [5]. That is to say, the control loops illustrated in Figure 2 either exchange information or merge together. This helps optimize the performance of all the network elements at once, while exploiting the ability of the LTE radio network to detect variations early. It is important to note that the LTE network time granularity is much faster (on the order of milliseconds) than the end-to-end time frame.
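A minimal sketch of what such a merged control loop might look like: the MAC scheduler sees both the radio-layer CQI and application-layer video metadata, and greedily gives each PRB to the user with the largest expected quality gain rather than the largest raw throughput. All names, the per-layer utility numbers, and the crude CQI-to-success mapping below are hypothetical illustrations, not any published scheme.

```python
# Hypothetical cross-layer allocation: the scheduler is "content aware"
# (it knows the marginal quality value of each scalable video layer) and
# "channel aware" (it discounts that value by a CQI-based success proxy).

def expected_quality_gain(user):
    """Quality gain if this user gets one more PRB: the next video layer's
    importance, discounted by how likely the channel can carry it."""
    next_layer_value = user["layer_gains"][user["layers_sent"]]
    delivery_prob = min(1.0, user["cqi"] / 15.0)  # crude CQI->success proxy
    return next_layer_value * delivery_prob

def allocate_prbs(users, n_prbs):
    """Greedily hand out PRBs, one at a time, to the best marginal gain."""
    for _ in range(n_prbs):
        best = max(users, key=expected_quality_gain)
        best["layers_sent"] = min(best["layers_sent"] + 1,
                                  len(best["layer_gains"]) - 1)
    return {u["name"]: u["layers_sent"] for u in users}

users = [
    # Base layers matter far more than top enhancement layers, and a
    # high-motion clip gains more per layer than a static one.
    {"name": "action_clip", "cqi": 12, "layers_sent": 0,
     "layer_gains": [10.0, 4.0, 2.0, 1.0]},
    {"name": "slideshow",   "cqi": 15, "layers_sent": 0,
     "layer_gains": [6.0, 1.0, 0.5, 0.2]},
]
alloc = allocate_prbs(users, n_prbs=4)
```

Even though the slideshow user has the better channel, the high-motion clip ends up with more PRBs, because the scheduler optimizes delivered quality rather than delivered bits; this is the essence of the content-aware schedulers discussed next.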
For the rest of this section, we discuss some of the recently proposed solutions that involve cross-layer design to optimize multimedia delivery over wireless cellular networks in general and over LTE in particular. Different ideas have been proposed to optimize video transmission over LTE. Most of these ideas depend mainly on changing the video and network parameters based on the environment, the channel quality, and the video packets' information and priority. For example, a game-theoretic spectrum agility approach is used in [86] to ensure the delivery of delay-sensitive applications over wireless networks. The goal of this work is to maximize the number of satisfied users while ensuring a fast reaction for secondary users in a cognitive radio network, also known as Opportunistic Spectrum Agile Radios (OSAR). This is achieved by sharing the desired video QoS information among the different users so it can be considered in the scheduling phase. Multiple description coding is used in [87] to make the 802.11 MAC layer adaptive to the wireless channel, so that the receiver can improve its received quality by receiving more descriptions of the video frames when it has a good link quality. Similar ideas have been proposed for general wireless networks, and the cross-layer framework for wireless networks described in [88] has inspired cross-layer designs for video delivery over cellular networks in general and LTE in particular.
Our research group has done some work on cross-layer design over LTE, in which the information shared about the video layers serves as the cross-layer context. In [23], we propose a content-aware scheduler for MPEG-4 video frames over WCDMA cellular networks.
In this work, we jointly design the application layer and the MAC layer: the transmitter decides which enhancement frames to add within a group-of-pictures structure based on the channel quality between the eNB and the UE. In addition to the WCDMA work, our group proposed a novel LTE scheduler in [89] to optimize the delivery of H.264 videos over LTE. In this work, we also jointly design the MAC and application layers to simultaneously choose the number of enhancement levels of an H.264-encoded video that the content provider should transmit for each user, so as to maximize the quality over the network. The eNB is also content aware in this scenario: it assigns more PRBs to users who have higher-priority videos, videos that need high quality (such as action scenes), or low channel quality, ensuring no starvation while maximizing the QoS across all users. Similar work is done by our collaborators in [90] to optimize the delivery of DASH videos over the LTE cellular network. We extended our previous work to a cross-layer design spanning the MAC, TCP, and application layers in [91] by making the LTE scheduler aware of the UE and eNB buffer status, in addition to being content aware. Hence, the scheduler can effectively assign resources to the users in need, without video freezes or buffer overload.
A merge between the MAC layer and RTP is proposed in [92], where the eNB MAC layer sends the average CQI information per user to the video RTP server. The video RTP server decides which temporal enhancement layers to drop based on a predefined look-up table. This scheme improves the quality of low-CQI users, as shown in Figure 8. Similar work is also proposed in [93]. In [94], a MOS-based QoE prediction function is derived to maximize the users' quality while guaranteeing fairness among them.
This research also establishes a mapping between PSNR and the user's opinion based on a tangent function curve, to outline analytically the relation between PSNR and a model of human visual perception. Finally, it adopts Particle Swarm Optimization (PSO) [95] to find the optimal resource allocation based on the quality of experience of each user.
Fig. 8: Video quality of a low-CQI user (CQI 3 and 4) with and without adaptation [92].
A. Summary and Discussion
As we have seen in the previously discussed schemes, cross-layer design can enhance performance because it dynamically adapts the parameters of different network layers simultaneously, based on the video information and the quality of the link between the UE and the eNB. However, this comes with trade-offs and limitations. In this subsection, we discuss some of the main limitations of cross-layer schemes and propose some open research problems in this area. The first limitation is that most of the proposed algorithms in this area are centralized. Centralized algorithms usually come with high computational complexity, intensive signaling, and violations of the standard's rules in exchange for the performance gain. It is rare to find a centralized approach, such as our work in [89], [91], that can be plug-and-play without significant modifications to the network or signaling overhead on the radio core. A complex centralized scheme is unlikely to be commercially implemented; its main value is demonstrating that cross-layer design can significantly enhance video quality. Hence, we think that developing distributed cross-layer schemes is an open research problem, and also critical for real-world deployment. The work in [96], [97] can be considered a paradigm for distributed cross-layer optimization schemes. The proposed scheme in [96] suggests that each user runs an optimization problem to determine the number of resources needed to satisfy certain video delay requirements, given the number of users in each network in a system of multiple heterogeneous networks. Similar schemes need to be studied over LTE networks while also complying with the standard. Furthermore, whether the scheme is centralized or distributed, there is extra overhead introduced by the signaling between the layers and the different users to collect their channel and video information. Hence, studying the signaling and performance overhead is another open problem.
Signaling can also be a serious problem, especially during UE handoff. Handover in LTE happens frequently because high-speed mobility is supported. When a handover happens, the UE joins a new set of users and has to exchange its information again, as the previous information is useless; hence, the signaling spent during the handoff, as well as the old information, is wasted.
Another open research question is determining which attributes of the video application are helpful to include in the optimization problem along with the network information. These attributes can be high level, such as the genre (for example, action movies in general require a higher bit rate than music video clips to achieve the same quality) or the application type; they can also be at a microscopic level, such as frame delay and coding settings. Hence, it is vital to analytically quantify and experimentally test the contribution of different subsets of these attributes to the user's experience, while simultaneously reducing the signaling and complexity overhead. Moreover, the lack of a unified simulation model, as we mentioned earlier, makes comparison between schemes very hard, as some cross-layer schemes sacrifice the realism of the model to show significant quality improvements. It becomes hard to verify this and to compare the multiple existing schemes without an incorporated testing model or a quantitative analysis of the proposed schemes.
There exist scenarios where information exchange and signaling for cross-layer optimization are desirable and helpful rather than pure overhead. According to [98], resources are shared in the eMBMS mode, and the MBMS bearer service uses IP multicasting to deliver its traffic. The core network can decide to assign some users additional unicast resources to individually enhance their quality when possible.
To the best of the authors' knowledge, there has not been solid work exploiting the advantages of cross-layer optimization in eMBMS over LTE networks.
Finally, we think that finding implementable cross-layer techniques that take into consideration other traffic types besides video traffic is an important research topic. As we explained before, LTE can carry different types of traffic and support many different applications thanks to its high data rate and mobility support. Hence, it is important to ensure that the video traffic does not compromise the performance of other applications; an implementable, proper cross-layer design must consider other traffic types and their priorities.
VIII. CONCLUSION
In this survey, we discussed different optimization aspects over cellular networks, LTE in particular, aimed at enhancing the delivery of video streaming services to the end user. We discussed the different metrics that can be used to characterize the performance of proposed solutions. In addition, we highlighted the limitations of per-layer solutions and pointed out the environments and assumptions these solutions require in order to perform well. In our opinion, cross-layer techniques lead to better end-to-end optimization and take into consideration many more optimization parameters than per-layer optimization techniques.
REFERENCES
[2] Journal of Communications, vol. 4, no. 3, 2009. [Online]. Available: http://ojs.academypublisher.com/index.php/jcm/article/view/0403146154
[3] J. L. Martínez, P. Cuenca, F. Delicado, and F. Quiles, "Objective video quality metrics: A performance analysis."
[4] X. Zhu and B. Girod, "Video streaming over wireless networks," 2007.
[5] S. Mantzouratos, G. Gardikis, H. Koumaras, and A. Kourtis, "Survey of cross-layer proposals for video streaming over mobile ad hoc networks (MANETs)," in International Conference on Telecommunications and Multimedia (TEMU), 2012. IEEE, 2012, pp. 101–106.
[6] P. Mohapatra, J. Li, and C. Gui, "QoS in mobile ad hoc networks," IEEE Wireless Communications, vol. 10, no. 3, pp. 44–53, 2003.
[7] H. G. Myung, "Technical overview of 3GPP LTE," Polytechnic University of New York, 2008.
[8] M. Al-Mualla, C. N. Canagarajah, and D. R. Bull, Video Coding for Mobile Communications: Efficiency, Complexity and Resilience. Academic Press, 2002.
[9] A. Launiainen, A. Jore, E. Ryytty, T. D. Hämäläinen, and J. Saarinen, "Evaluation of TMS320C62 performance in low bit-rate video encoding," in Third Annual Multimedia and Applications Conference (MTAC). IEEE, 1998, pp. 364–368.
[10] M. Budagavi, W. R. Heinzelman, J. Webb, and R. Talluri, "Wireless MPEG-4 video communication on DSP chips," IEEE Signal Processing Magazine, vol. 17, no. 1, pp. 36–53, 2000.
[11] O.-C. Chen, M.-L. Hsia, and C.-C. Chen, "Low-complexity inverse transforms of video codecs in an embedded programmable platform," IEEE Transactions on Multimedia, vol. 13, no. 5, pp. 905–921, 2011.
[12] A. Nafaa, T. Taleb, and L. Murphy, "Forward error correction strategies for media streaming over wireless networks," IEEE Communications Magazine, vol. 46, no. 1, p. 72, 2008.
[13] N. Thomos, N. V. Boulgouris, and M. G. Strintzis, "Optimized transmission of JPEG2000 streams over wireless channels," IEEE Transactions on Image Processing, vol. 15, no. 1, pp. 54–67, 2006.
[14] J. L. Martínez, P. Cuenca, F. Delicado, and F. Quiles, "Objective video quality metrics: A performance analysis," in Proceedings of the 14th European Signal Processing Conference, 2006.
[15] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
[16] Y. Wang, "Survey of objective video quality measurements," EMC Corporation, Hopkinton, MA, vol. 1748, p. 39, 2006.
[17] Z. Wang and E. P. Simoncelli, "Reduced-reference image quality assessment using a wavelet-domain natural image statistic model," in Electronic Imaging 2005. International Society for Optics and Photonics, 2005, pp. 149–159.
[18] C. Chen, L. Song, X. Wang, and M. Guo, "No-reference video quality assessment on mobile devices," in IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), 2013. IEEE, 2013, pp. 1–6.
[19] C. E. Shannon, "A mathematical theory of communication," ACM SIGMOBILE Mobile Computing and Communications Review, vol. 5, no. 1, pp. 3–55, 2001.
[20] "Codecs for videoconferencing using primary digital group transmission. Recommendation H.120." CCITT (currently ITU-T), 1989.
[21] T. Ebrahimi and C. Horne, "MPEG-4 natural video coding: an overview," Signal Processing: Image Communication, vol. 15, no. 4, pp. 365–385, 2000.
[22] A. Lo, G. Heijenk, and I. Niemegeers, "Evaluation of MPEG-4 video streaming over UMTS/WCDMA dedicated channels," in Proceedings, First International Conference on Wireless Internet, 2005. IEEE, 2005, pp. 182–189.
[23] K. Pandit, A. Ghosh, D. Ghosal, and M. Chiang, "Content aware optimization for video delivery over WCDMA," EURASIP Journal on Wireless Communications and Networking, vol. 2012, no. 1, pp. 1–14, 2012.
[24] C. Kodikara, S. Worrall, S. Fabri, and A. Kondoz, "Performance evaluation of MPEG-4 video telephony over UMTS," in 3G Mobile Communication Technologies, 2003 (3G 2003), 4th International Conference on (Conf. Publ. No. 494), June 2003, pp. 73–77.
[25] M. Ali, B. Pathak, and G. Childs, "MPEG-4 video transmission over UMTS mobile networks," PAMM, vol. 7, no. 1, pp. 1011003–1011004, 2007.
[26] A. Talukdar, M. Cudak, and A. Ghosh, "Streaming video capacities of LTE air-interface," in IEEE International Conference on Communications (ICC), 2010. IEEE, 2010, pp. 1–5.
[27] H. Schwarz, D. Marpe, and T. Wiegand, "Overview of the scalable video coding extension of the H.264/AVC standard,"
IEEE Transac-tions on Circuits and Systems for Video Technology , vol. 17, no. 9, pp.1103–1120, 2007.[28] T. Stockhammer, M. Hannuksela, and T. Wiegand, “H.264/avc in wire-less environments,”
Circuits and Systems for Video Technology, IEEETransactions on , vol. 13, no. 7, pp. 657–673, July 2003.[29] A. Vetro, T. Wiegand, and G. J. Sullivan, “Overview of the stereo andmultiview video coding extensions of the h. 264/mpeg-4 avc standard,”
Proceedings of the IEEE , vol. 99, no. 4, pp. 626–642, 2011.[30] P. McDonagh, C. Vallati, A. Pande, P. Mohapatra, P. Perry, and E. Min-gozzi, “Investigation of Scalable Video Delivery using H.264 SVC onan LTE Network,” in
International Symposium on Wireless PersonalMultimedia Communications (WPMC) , 2011, pp. 1–5.[31] X. Wang, T. Kwon, Y. Choi, H. Wang, and J. Liu, “Cloud-assistedadaptive video streaming and social-aware video prefetching for mobileusers,”
Wireless Communications, IEEE , vol. 20, no. 3, pp. 72–79, June2013.[32] G. J. Sullivan, J. Ohm, W.-J. Han, and T. Wiegand, “Overview of thehigh efficiency video coding (hevc) standard,”
Circuits and Systems forVideo Technology, IEEE Transactions on , vol. 22, no. 12, pp. 1649–1668,2012. [33] R. Garcia and H. Kalva, “Subjective evaluation of hevc and avc/h. 264in mobile environments,”
Consumer Electronics, IEEE Transactions on ,vol. 60, no. 1, pp. 116–123, 2014.[34] S. M. Majeed, S. K. Askar, and M. Fleury, “H. 265 codec over4g networks for telemedicine system application,” in
Proceedings ofthe 2014 UKSim-AMSS 16th International Conference on ComputerModelling and Simulation . IEEE Computer Society, 2014, pp. 292–297.[35] J. Ohm, G. J. Sullivan, H. Schwarz, T. K. Tan, and T. Wiegand, “Com-parison of the coding efficiency of video coding standards includinghigh efficiency video coding (hevc),”
IEEE Transactions on Circuits andSystems for Video Technology , vol. 22, no. 12, pp. 1669–1684, 2012.[36] T. Stockhammer, “Dynamic adaptive streaming over http –:standardsand design principles,” in
Proceedings of the Second AnnualACM Conference on Multimedia Systems , ser. MMSys ’11. NewYork, NY, USA: ACM, 2011, pp. 133–144. [Online]. Available:http://doi.acm.org/10.1145/1943552.1943572[37] S. Lederer, C. Müller, and C. Timmerer, “Dynamic adaptive streamingover http dataset,” in
Proceedings of the 3rd Multimedia SystemsConference , ser. MMSys ’12. New York, NY, USA: ACM, 2012,pp. 89–94. [Online]. Available: http://doi.acm.org/10.1145/2155555.2155570[38] M. Levkov, “Video encoding and transcoding recommendations for httpdynamic streaming on the adobe R (cid:13) flash R (cid:13) platform,” White Paper,Adobe Systems Inc , 2010.[39] A. Zambelli, “Iis smooth streaming technical overview,”
MicrosoftCorporation , vol. 3, 2009.[40] X. Wang, Y. Xu, C. Ai, P. Di, X. Liu, S. Zhang, L. Zhou, andJ. Zhou, “Dash(hevc)/lte: Qoe-based dynamic adaptive streaming of hevccontent over wireless networks,” in
Visual Communications and ImageProcessing (VCIP), 2012 IEEE , Nov 2012, pp. 1–1.[41] H. DURAND, “Hevc, mpeg-dash and embms: three enablers for en-riched video contents delivery to handheld devices over 4g lte network,”in
White paper . THOMSON VIDEO NETWORKS, 2013.[42] C. M. Kozierok,
The TCP/IP guide: a comprehensive, illustrated Internetprotocols reference . No Starch Press, 2005.[43] D. Wu, Y. T. Hou, W. Zhu, Y.-Q. Zhang, and J. M. Peha, “Streamingvideo over the internet: approaches and directions,”
Circuits and Systemsfor Video Technology, IEEE Transactions on , vol. 11, no. 3, pp. 282–300,2001.[44] H. Balakrishnan, S. Seshan, E. Amir, and R. H. Katz, “Improving tcp/ipperformance over wireless networks,” in
MobiCom , vol. 95. Citeseer,1995, pp. 2–11.[45] A. Boukerche,
Algorithms and protocols for wireless, mobile Ad Hocnetworks . John Wiley & Sons, 2008, vol. 77.[46] V. Cerf, Y. Dalal, and C. Sunshine, “Rfc 675: Specification of internettransmission control program, december 1, 1974,”
URL ftp://ftp. internic.net/rfc/rfc675. txt, ftp://ftp. math. utah. edu/pub/rfc/rfc675. txt. Status:UNKNOWN. Not online. RFC0676 .[47] B. Qureshi, M. Othman, and N. Hamid, “Progress in various tcpvariants,” in , 2009, p. 1.[48] M. Adeel and A. A. Iqbal, “Tcp congestion window optimization forcdma2000 packet data networks,” in
Information Technology, 2007.ITNG’07. Fourth International Conference on . IEEE, 2007, pp. 31–35.[49] D. Zhou, W. Song, N. Baldo, and M. Miozzo, “Evaluation of tcp per-formance with lte downlink schedulers in a vehicular environment,” in
Wireless Communications and Mobile Computing Conference (IWCMC),2013 9th International . IEEE, 2013, pp. 1064–1069.[50] T. Nguyen and A. Zakhor, “Distributed video streaming with forwarderror correction,” in
Packet Video Workshop , vol. 2002, 2002.[51] K. Sripanidkulchai, B. Maggs, and H. Zhang, “An analysis of livestreaming workloads on the internet,” in
Proceedings of the 4th ACMSIGCOMM conference on Internet measurement . ACM, 2004, pp. 41–54.[52] B. Wang, J. Kurose, P. Shenoy, and D. Towsley, “Multimedia streamingvia tcp: an analytic performance study,” in
Proceedings of the 12thannual ACM international conference on Multimedia . ACM, 2004,pp. 908–915.[53] K. Tappayuthpijarn, G. Liebl, T. Stockhammer, and E. Steinbach,“Adaptive video streaming over a mobile network with tcp-friendlyrate control,” in
Proceedings of the 2009 International Conference onWireless Communications and Mobile Computing: Connecting the WorldWirelessly . ACM, 2009, pp. 1325–1329.[54] T. Kim and M. H. Ammar, “Receiver buffer requirement for videostreaming over tcp,” in
Electronic Imaging 2006 . International Societyfor Optics and Photonics, 2006, pp. 607 718–607 718. [55] R. Cohen and A. Levin, “Handovers with forward admission controlfor adaptive tcp streaming in lte-advanced with small cells,” in Com-puter Communications and Networks (ICCCN), 2012 21st InternationalConference on , July 2012, pp. 1–7.[56] A. Rao, R. Lanphier, M. Stiemerling, H. Schulzrinne, and M. Wester-lund, “Real time streaming protocol 2.0 (rtsp),” 2013.[57] I. Elsen, F. Hartung, U. Horn, M. Kampmann, and L. Peters, “Streamingtechnology in 3 g mobile communication systems,”
IEEE Computer ,vol. 34, no. 9, pp. 46–52, 2001.[58] F. Zivkovic, J. Priest, and H. Haghshenas, “Quantitative analysis ofstreaming multimedia over wimax and lte networks using opnet v. 16.0,”
Group , 2013.[59] OPNET Modeler, “Opnet technologies inc,” 2009.[60] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor,I. Rytina, M. Kalla, L. Zhang, and V. Paxson, “Stream control transmis-sion protocol (2007),”
RFC2960 .[61] K. Habak, K. A. Harras, and M. Youssef, “Bandwidth aggregationtechniques in heterogeneous multi-homed devices,”
Comput. Netw. ,vol. 92, no. P1, pp. 168–188, Dec. 2015.[62] L. Ong, “An introduction to the stream control transmission protocol(sctp),” 2002.[63] G. A. Abed, M. Ismail, and K. Jumari, “Comparative performanceinvestigation of tcp and sctp protocols over lte/lte-advanced systems,”
International Journal of Advanced Research in Computer and Commu-nication Engineering , 2012.[64] H. Nam, K. H. Kim, B. H. Kim, D. Calin, and H. Schulzrinne, “Towardsa dynamic qos-aware over-the-top video streaming in lte,” 2013.[65] G. R. Hiertz, S. Max, Y. Zang, T. Junge, and D. Denteneer, “Ieee 802.11s mac fundamentals,” in
IEEE Internatonal Conference on Mobile Adhocand Sensor Systems, 2007. MASS 2007.
IEEE, 2007, pp. 1–8.[66] K. Xu, M. Gerla, and S. Bae, “How effective is the ieee 802.11rts/cts handshake in ad hoc networks,” in
Global TelecommunicationsConference, 2002. GLOBECOM’02. IEEE , vol. 1. IEEE, 2002, pp.72–76.[67] V. A. Dubendorf, “Multiple access wireless communications,”
WirelessData Technologies , pp. 17–30.[68] N. Benvenuto and S. Tomasin, “On the comparison between ofdmand single carrier modulation with a dfe using a frequency-domainfeedforward filter,”
IEEE Transactions on Communications, , vol. 50,no. 6, pp. 947–955, 2002.[69] “3GPP TS 23.203. Policy and charging control architecture (release 9),v9.2.0,” Dec. 2009.[70] A. M. Correia, J. C. Silva, N. M. Souto, L. A. Silva, A. B. Boal, andA. B. Soares, “Multi-resolution broadcast/multicast systems for mbms,”
Broadcasting, IEEE Transactions on , vol. 53, no. 1, pp. 224–234, 2007.[71] H. Luo, S. Ci, D. Wu, and H. Tang, “End-to-end optimized tcp-friendlyrate control for real-time video streaming over wireless multi-hopnetworks,”
Journal of Visual Communication and Image Representation ,vol. 21, no. 2, pp. 98–106, 2010.[72] F. Capozzi, G. Piro, L. A. Grieco, G. Boggia, and P. Camarda, “Down-link packet scheduling in lte cellular networks: Key design issues and asurvey,”
IEEE Communications Surveys & Tutorials , vol. 15, no. 2, pp.678–700, 2013.[73] M. T. Kawser, H. M. Farid, A. R. Hasin, A. M. Sadik, and I. K. Razu,“Performance comparison between round robin and proportional fairscheduling methods for lte,”
International Journal of Information andElectronics Engineering , vol. 2, no. 5, pp. 678–681, 2012.[74] D. Stiliadis and A. Varma, “Latency-rate servers: a general model foranalysis of traffic scheduling algorithms,”
IEEE/ACM Transactions onNetworking (ToN) , vol. 6, no. 5, pp. 611–624, 1998.[75] N. Sheta, F. W. Zaki, and S. Keshk, “Packet scheduling in lte mobilenetwork,”
International Journal of Scientific and Engineering Research, ,vol. 4, no. 6, 2013.[76] A. Mehmood and W. A. Cheema, “Channel estimation for lte downlink,”
Blekinge Institute of Technology , 2009.[77] A. Ancora, C. Bona, and D. T. Slock, “Down-sampled impulse responseleast-squares channel estimation for lte ofdma,” in
IEEE InternationalConference on Acoustics, Speech and Signal Processing, 2007. ICASSP2007. , vol. 3. IEEE, 2007, pp. III–293.[78] L. Ruiz de Temino, C. Navarro i Manchon, C. Rom, T. Sorensen, andP. Mogensen, “Iterative channel estimation with robust wiener filteringin lte downlink,” in
Vehicular Technology Conference, 2008. VTC 2008-Fall. IEEE 68th . IEEE, 2008, pp. 1–5.[79] S. Schwarz, C. Mehlfuhrer, and M. Rupp, “Low complexity approximatemaximum throughput scheduling for lte,” in . IEEE, 2010, pp. 1563–1569. [80] C. Wengerter, J. Ohlhorst, and A. G. E. von Elbwart, “Fairness andthroughput analysis for generalized proportional fair frequency schedul-ing in ofdma,” in
Vehicular Technology Conference, 2005. VTC 2005-Spring. 2005 IEEE 61st , vol. 3. IEEE, 2005, pp. 1903–1907.[81] F. D. Calabrese, C. Rosa, K. I. Pedersen, and P. E. Mogensen, “Per-formance of proportional fair frequency and time domain scheduling inlte uplink,” in
Wireless Conference, 2009. EW 2009. European . IEEE,2009, pp. 271–275.[82] G. J. Carneiro, “Ns-3: Network simulator 3,” in
UTM Lab Meeting April ,vol. 20, 2010.[83] D. Zhou, N. Baldo, and M. Miozzo, “Implementation and validation oflte downlink schedulers for ns-3,” in
Proceedings of the 6th InternationalICST Conference on Simulation Tools and Techniques . ICST (Institutefor Computer Sciences, Social-Informatics and TelecommunicationsEngineering), 2013, pp. 211–218.[84] D. P. Bertsekas, R. G. Gallager, and P. Humblet,
Data networks .Prentice-Hall International, 1992, vol. 2.[85] V. T. Raisinghani and S. Iyer, “Cross-layer design optimizations inwireless protocol stacks,”
Computer Communications , vol. 27, no. 8,pp. 720–724, 2004.[86] A. Larcher, H. Sun, M. van der Shaar, Z. Ding et al. , “Decentralizedtransmission strategy for delay-sensitive applications over spectrum agilenetwork,” in
Proceedings of International Packet Video Workshop , 2004.[87] C. Greco and M. Cagnazzo, “A cross-layer protocol for cooperativecontent delivery over mobile ad-hoc networks,”
International Journalof Communication Networks and Distributed Systems , vol. 7, no. 1, pp.49–63, 2011.[88] M. Van Der Schaar et al. , “Cross-layer wireless multimedia transmission:challenges, principles, and new paradigms,”
Wireless Communications,IEEE , vol. 12, no. 4, pp. 50–58, 2005.[89] A. Ahmedin, K. Pandit., D. Ghosal, and A. Ghosh, “Exploiting ScalableVideo Coding for Content Aware Downlink Video Delivery over LTE,”in
International Conference on Distributed Computing and Networking(ICDCN), 2014 (accepted) .[90] J. Chen, R. Mahindra, M. A. Khojastepour, S. Rangarajan, andM. Chiang, “A scheduling framework for adaptive video delivery overcellular networks,” in
Proceedings of the 19th Annual InternationalConference on Mobile Computing Networking , ser. MobiCom ’13.New York, NY, USA: ACM, 2013, pp. 389–400. [Online]. Available:http://doi.acm.org/10.1145/2500423.2500433[91] A. Ahmedin, K. Pandit, D. Ghosal, and A. Ghosh, “Content and BufferAware Scheduling for Video Delivery over LTE,” in
Conference onemerging Networking EXperiments and Technologies (CoNEXT) StudentWorkshop, 2013 (accepted) .[92] R. Radhakrishnan and A. Nayak, “Cross layer design for efficient videostreaming over lte using scalable video coding,” in
IEEE InternationalConference on Communications (ICC), 2012 , June 2012, pp. 6509–6513.[93] S. Karachontzitis, T. Dagiuklas, and L. Dounis, “Novel cross-layerscheme for video transmission over lte-based wireless systems,” in
IEEEInternational Conference on Multimedia and Expo (ICME), 2011 , July2011, pp. 1–6.[94] Y. Ju, Z. Lu, D. Ling, X. Wen, W. Zheng, and W. Ma, “Qoe-basedcross-layer design for video applications over lte,”
Multimedia Toolsand Applications , pp. 1–21, 2013.[95] Y. Shi and R. C. Eberhart, “Empirical study of particle swarm optimiza-tion,” in
CEC 99. Proceedings of the 1999 Congress on EvolutionaryComputation, 1999. , vol. 3. IEEE, 1999.[96] L. Zhou, B. Geller, X. Wang, A. Wei, B. Zheng, and H.-C. Chao, “Multi-user video streaming over multiple heterogeneous wireless networks: adistributed, cross-layer design paradigm,”
Journal of Internet Technol-ogy , vol. 10, no. 1, pp. 1–12, 2009.[97] S. Khan, Y. Peng, E. Steinbach, M. Sgroi, and W. Kellerer, “Application-driven cross-layer optimization for video streaming over wireless net-works,”
IEEE Communications Magazine , vol. 44, no. 1, pp. 122–130,2006.[98] O. Oyman, J. Foerster, Y.-j. Tcha, and S.-C. Lee, “Toward enhancedmobile video services over wimax and lte [wimax/lte update],”