Cost-Efficient Storage for On-Demand Video Streaming on Cloud
Mahmoud Darwich, Yasser Ismail, Talal Darwich, Magdy Bayoumi
Department of Mathematical and Digital Sciences, Bloomsburg University of Pennsylvania, PA 17815
Electrical Engineering Department, Southern University and A&M College, Baton Rouge, LA 70807
Microchip Technology Inc., San Jose, CA 95134
Department of Electrical and Computer Engineering, University of Louisiana at Lafayette, LA 70504
Abstract: A video stream is converted into several formats to support users' devices. This conversion process, called video transcoding, requires large storage and powerful computing resources. With the emergence of cloud technology, video streaming companies have adopted the cloud to process videos. Generally, many formats of the same video are created in advance (pre-transcoded) and the appropriate one is streamed to the user's device. However, pre-transcoding demands huge storage space and incurs a high cost for video streaming companies. More importantly, pre-transcoded video streams can be stored hierarchically across the different storage types of the cloud. To minimize the storage cost, in this paper we propose a method to store video streams in the hierarchical storage of the cloud. In particular, we develop a method that decides which video streams should be pre-transcoded and in which cloud storage type each should be kept to minimize the overall cost. Simulation results show the effectiveness of our approach: when the percentage of frequently accessed videos in the repository is high, the proposed approach reduces the overall cost by up to 40%.
Index Terms: cloud, storage, video stream, pre-transcoding, clustering, transcoding
I. INTRODUCTION
Video streaming has become widely used in applications built on electronic display devices, owing to the huge number of videos streamed to a variety of devices such as large-screen TVs, desktops, tablets, and smartphones. Video streaming is the main source of Internet traffic in the United States, where it consumes up to 77% of the Internet bandwidth [1]. Additionally, video streaming is expected to account for up to 85% of Internet traffic by 2021 [2]. Video content, whether Video On-Demand (VOD) such as YouTube or Netflix or live streaming such as Livestream, has to be transcoded to match the characteristics of the end device, i.e., the supported bit rate, resolution, and network bandwidth [3]. Video transcoding is a time-consuming and computationally intensive process. Video Stream Providers (VSPs) use cloud computing services to greatly reduce the computational burden of the transcoding process [4]. VSPs perform the transcoding of VOD content offline to guarantee fast video streaming. During offline transcoding, multiple formats of the video stream are stored in the cloud, and the appropriate stored format is selected according to the specifications of the viewer's end device. The process of creating and storing multiple formats of a video stream in advance is called pre-transcoding. As a practical example, Netflix pre-transcodes and stores approximately 70 different formats of each video in its cloud [5]. As a result, VSPs incur an extra overhead cost [6], [9].
The access pattern of pre-transcoded videos follows a long-tail distribution [8]. That is, a small number of pre-transcoded videos are accessed frequently while many of them are rarely accessed.
This encourages researchers to reduce the overall pre-transcoding cost by transcoding rarely accessed videos on demand [4], [9]. In this way, only one or a few formats of a video are stored, and transcoding is executed when a requested format has not been pre-transcoded. We refer to the on-demand (lazy) transcoding of rarely accessed videos as re-transcoding and to the storing of video formats in advance as pre-transcoding.
The cost of cloud VMs is higher than the cost of cloud storage [10], because computation in the cloud is charged per hour. Re-transcoding is therefore cost-efficient for VSPs only when it is applied to rarely accessed videos. By contrast, if re-transcoding is executed on frequently accessed videos (FAVs), it results in a very high cost because VSPs are charged each time the video is transcoded. This is why pre-transcoding is applied to frequently accessed videos instead.
The research problem addressed in this work is how to decide where each video stream should be stored in the cloud storage. To tackle this issue, we propose a method that performs clustering on the frequently accessed video streams.
The main contributions of this paper can be summarized as follows: (1) proposing a method to reduce the incurred cost of using cloud services through clustering the frequently accessed video streams in the repository; and (2) analyzing the effectiveness of the proposed method when changing the number of frequently accessed video streams in a repository. This work differs from the previous work [7] in that we design an approach that stores video streams efficiently and thus further decreases the cost of cloud services.
The paper is organized as follows: Section II reviews the related work. The clustering method is explained in Section III. The experiment setup and results are presented in Sections IV and V, respectively.
Conclusion and future work are presented in Section VI.
II. RELATED WORK
Kim et al. [14] proposed a scheme to transcode multimedia video stream resources using an intra-cloud and a parallel computing framework. Their scheme provided improved task assignment and high-speed video transcoding.
Darwich et al. [15] proposed algorithms to reduce the cost incurred by VSPs when using cloud services. In particular, they developed a method that measures how frequently a video stream is accessed and, accordingly, decides whether the video stream should be stored in the repository or transcoded upon request.
Gao et al. [16] proposed an approach to partially transcode videos using the cloud. Their method stores the frequently accessed segments of a video, which are located at its beginning, while it drops the remaining segments and transcodes them upon request. Their method reduced the cost by 30% compared to storing all segments of the video.
Zhao et al. [12] developed a method that reduces the operational cost of video streaming on the cloud. In particular, their approach trades off between transcoding and storing a video, and it is implemented using a weighted graph of the video transcoding. They used the transcoding relationships between video versions and their popularity to decide which versions should be kept in storage and which should be dropped and re-transcoded upon request.
Jokhio et al. [11] proposed an approach that estimates the costs of storing and transcoding a video using cloud resources. In addition, their approach uses the popularity of each transcoded video to decide how long it should be stored before it is re-transcoded instead. Their results show that the method reduces the cost significantly.
III. PROPOSED CLUSTERING METHOD
A. Structure of Video Stream
A video is composed of many sequences, as illustrated in Fig. 1. Each sequence in the video stream is formed by Groups Of Pictures (GOPs). A sequence and a GOP start with a sequence header and a GOP header, respectively. The headers include meta-data about the sequence and the GOP. A GOP contains different types of frames, i.e., I (intra), P (predicted), and B (bi-directional) frames. Further, each frame is composed of small units called macroblocks (MBs) [17].
Video transcoding is carried out at the GOP level because GOPs can be processed independently [17]. Thus, in this research, we consider the transcoding process at the GOP level.
Fig. 1: Structure of a video stream
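The hierarchy described above can be pictured as nested containers. The following Python sketch is only illustrative: the class and field names (`Frame`, `GOP`, `Sequence`, `Video`, `iter_gops`, `make_gop`) and the frame pattern are assumptions made for this example and do not come from any codec library. It shows why the GOP is a convenient unit of work: flattening a video yields GOPs that can each be handed to a transcoder independently.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Frame:
    kind: str              # "I" (intra), "P" (predicted), or "B" (bi-directional)
    macroblocks: int = 0   # number of macroblocks in the frame (hypothetical field)


@dataclass
class GOP:
    header: Dict[str, str]                          # meta-data about the GOP
    frames: List[Frame] = field(default_factory=list)


@dataclass
class Sequence:
    header: Dict[str, str]                          # meta-data about the sequence
    gops: List[GOP] = field(default_factory=list)


@dataclass
class Video:
    sequences: List[Sequence] = field(default_factory=list)

    def iter_gops(self) -> Iterator[GOP]:
        """Yield every GOP in order; each one is an independent transcoding unit."""
        for sequence in self.sequences:
            yield from sequence.gops


def make_gop(index: int, pattern: str = "IBBPBBP") -> GOP:
    """Build a toy GOP whose frame types follow `pattern` (an assumed layout)."""
    return GOP(header={"index": str(index)}, frames=[Frame(kind=k) for k in pattern])


video = Video(sequences=[
    Sequence(header={"id": "seq-0"}, gops=[make_gop(0), make_gop(1)]),
    Sequence(header={"id": "seq-1"}, gops=[make_gop(2)]),
])
gop_count = sum(1 for _ in video.iter_gops())
```

Because `iter_gops` hides the sequence boundaries, a per-GOP decision such as where to store a GOP can be written as a simple loop over the video.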
B. Video Streaming Using Cloud
Cloud services are available on demand, which means that users are charged in a pay-as-you-go manner. Video streaming on the cloud requires the following services:
• Computational Services: The transcoding operations of videos are performed using Virtual Machines (VMs), which are charged on an hourly basis.
• Storage Services: Cloud providers offer different storage types to users, which are charged on a monthly basis.
Amazon Web Services (AWS) is a well-known cloud provider that offers cloud services at an affordable price and with high reliability. Although we consider AWS services and pricing models in our study, this research could be applied to any cloud service.
Amazon offers different types of storage: S3 Standard Storage, S3 Standard-Infrequent Access (S3 Standard-IA) Storage, S3 One Zone-Infrequent Access (S3 One Zone-IA) Storage, and S3 Glacier Storage. The Amazon storage services are priced per gigabyte of stored data per month. These storage types offer different access bandwidths at different rates.
C. Algorithm
The proposed algorithm is an improved version of the previous work [7]. Its purpose is to reduce the cost of video streaming on the cloud by applying clustering to the frequently accessed videos/GOPs and then storing them in the hierarchical storage of the cloud. For that purpose, the algorithm is executed periodically at the GOP level of the video stream repository. Algorithm 1 presents its pseudo-code. The GOPs of a video stream, the GOP transcoding time, the GOP size, the cloud storage prices, and the number of accesses to the video in the last period are received as inputs to the algorithm. The algorithm clusters the frequently accessed GOPs/videos, stores them in the cloud storage types, and outputs the resulting storage cost.
The access to the GOPs of a video follows a long-tail distribution [13], as shown in Fig. 2. The GOPs before the boundary point ($GOP_{th}$) are pre-transcoded. The algorithm applies K-means clustering to the pre-transcoded GOPs, using their number of views as the parameter that decides where each GOP is stored. The number of clusters of GOPs is 4. In steps 2-5, the algorithm calculates, for each cluster, the sum of the storage costs of its GOPs, which have a similar number of views and are stored together in the storage type of that cluster. In step 6, it sums the storage costs of all the pre-transcoded GOPs.
Algorithm 1: Clustering Pre-transcoding Method
Input:
Pre-transcoded $GOP_1$ to $GOP_{th}$
Size of $GOP_1$ to $GOP_{th}$: $S_{GOP_j}$
Cloud storage prices: $P_{S_1}, P_{S_2}, P_{S_3}, P_{S_4}$
Number of views of $GOP_1$ to $GOP_{th}$
Output:
Storage cost of pre-transcoded $GOP_1$ to $GOP_{th}$
1: Apply K-means clustering on $GOP_1$ to $GOP_{th}$ with $K = 4$
2: Cluster 1 pre-transcoding cost: $C^{i}_{S_1} \leftarrow \sum_{j} S_{GOP_j} \cdot P_{S_1}$
3: Cluster 2 pre-transcoding cost: $C^{i}_{S_2} \leftarrow \sum_{j} S_{GOP_j} \cdot P_{S_2}$
4: Cluster 3 pre-transcoding cost: $C^{i}_{S_3} \leftarrow \sum_{j} S_{GOP_j} \cdot P_{S_3}$
5: Cluster 4 pre-transcoding cost: $C^{i}_{S_4} \leftarrow \sum_{j} S_{GOP_j} \cdot P_{S_4}$
6: Total cost of pre-transcoding $GOP_1$ to $GOP_{th}$: $C_{S_{GOP_1 - GOP_{th}}} \leftarrow C^{i}_{S_1} + C^{i}_{S_2} + C^{i}_{S_3} + C^{i}_{S_4}$
Fig. 2: Pre-transcoding and clustering of frequently accessed GOPs in the long-tail distribution
IV. EXPERIMENT SETUP
A. Videos Synthesis
Video streaming companies use huge repositories to store their videos, and we do not have permission to access these repositories. Moreover, downloading a large number of videos and then transcoding them is a long and costly process.
Synthesizing videos requires knowing their characteristics, in particular the GOP size, the GOP transcoding time, and the number of GOPs in each video. Therefore, we build our repository by following the method in [4], [7].
Based on the obtained characteristics of videos, i.e., the number of GOPs, the GOP size, and the linear equation for GOP transcoding times, we synthesized our repository by generating 50,000 videos.
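As a rough illustration of what such a synthesis step could look like, the Python sketch below generates a small repository from per-GOP characteristics. Every numeric parameter in it (the GOP-count range, the GOP-size range, and the slope and intercept of the linear transcoding-time model) is a placeholder chosen for the example, not a value taken from [4] or [7], and the names `SyntheticGOP` and `synthesize_repository` are likewise hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SyntheticGOP:
    size_mb: float            # GOP size in megabytes
    transcode_time_s: float   # transcoding time predicted by the linear model


def synthesize_repository(
    num_videos: int,
    seed: int = 0,
    gops_per_video: Tuple[int, int] = (10, 100),   # placeholder range
    gop_size_mb: Tuple[float, float] = (0.5, 5.0), # placeholder range
    slope: float = 0.8,                            # placeholder: seconds per MB
    intercept: float = 0.2,                        # placeholder: fixed overhead in seconds
) -> List[List[SyntheticGOP]]:
    """Return a list of videos, each a list of synthetic GOPs."""
    rng = random.Random(seed)  # seeded so the repository is reproducible
    repository = []
    for _ in range(num_videos):
        n_gops = rng.randint(*gops_per_video)
        video = []
        for _ in range(n_gops):
            size = rng.uniform(*gop_size_mb)
            # Linear relation between GOP size and transcoding time (assumed form).
            video.append(SyntheticGOP(size, slope * size + intercept))
        repository.append(video)
    return repository


# A small repository for illustration; the paper generates 50,000 videos.
repository = synthesize_repository(num_videos=1000)
```

Seeding the generator keeps repeated experiments comparable, which matters when several repositories with different FAV percentages are evaluated against each other.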
B. Amazon Storage Rates
Amazon offers four types of cloud storage with different price rates and access bandwidths, as listed in Table I.
TABLE I: Amazon storage types and their rates in USD
Storage            Price (USD per GB per month)
S3 Standard        0.023
S3 Standard-IA     0.0125
S3 One Zone-IA     0.01
S3 Glacier         0.001
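As a worked example using the rates in Table I, consider keeping the same data in each storage type for one month. The 1,000 GB volume is an assumption chosen only to make the arithmetic easy to follow.

```latex
\begin{align*}
C_{\text{S3 Standard}}    &= 1000\ \text{GB} \times 0.023\ \text{USD/GB/month}  = 23.00\ \text{USD/month} \\
C_{\text{S3 Standard-IA}} &= 1000\ \text{GB} \times 0.0125\ \text{USD/GB/month} = 12.50\ \text{USD/month} \\
C_{\text{S3 One Zone-IA}} &= 1000\ \text{GB} \times 0.01\ \text{USD/GB/month}   = 10.00\ \text{USD/month} \\
C_{\text{S3 Glacier}}     &= 1000\ \text{GB} \times 0.001\ \text{USD/GB/month}  = 1.00\ \text{USD/month}
\end{align*}
```

Under these rates, the same data costs 23 times less in S3 Glacier than in S3 Standard, which is the price gap that placing less frequently viewed GOPs in cheaper storage can exploit.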
C. Methods for Comparison
To assess our proposed method, we compare it against three other methods:
• Fully pre-transcoding method: it stores the whole video streams.
• Fully re-transcoding method: it re-transcodes all video streams upon request.
• Partial pre-transcoding method in [7]: it partially stores the video stream in the cloud standard storage.
V. SIMULATION RESULTS
A. Clustering of FAVs
We applied the K-means clustering method to the FAVs in the repository, as illustrated in Fig. 3. The clustering method groups the frequently accessed videos into four clusters. Cluster 1 contains the GOPs with the highest, mutually similar view counts and uses S3 Standard to store them; clusters 2, 3, and 4 contain GOPs grouped by the similarity of their view counts and use S3 Standard-IA, S3 One Zone-IA, and S3 Glacier, respectively, to store these GOPs. The storage types are assigned according to their access bandwidth and price rate. That is, the FAVs with the highest number of views are stored in S3 Standard (cluster 1), which has the highest access bandwidth and the highest price. S3 Standard-IA and S3 One Zone-IA follow in turn, and S3 Glacier (cluster 4) holds the FAVs with the lowest number of views; it provides the lowest access bandwidth and has the lowest price rate.
In this experiment, the x-axis and y-axis represent the video size and the number of views, respectively. The simulation results showed that the GOP views of FAVs range from to . However, the clustering method is applicable to any range of FAV views. Furthermore, we selected the clustering parameter k = 4 because AWS offers 4 types of storage in the cloud.
In our previous work [7], we assumed that all the pre-transcoded GOPs are stored in the S3 Standard storage, which has the highest price. Storing all videos in the same cloud storage incurred a high cost, whereas the proposed clustering method distributes the videos across different cloud storage types, which costs less.
Fig. 3: Clustering of frequently accessed video streams based on the number of views
Fig. 4: Cost comparison of the four methods (fully pre-transcoding, fully re-transcoding, partial pre-transcoding, and the proposed clustering method) as the number of frequently accessed videos varies
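The clustering and cost steps of this section can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than the implementation used in the experiments: it uses a plain one-dimensional K-means (Lloyd's algorithm) on view counts with a deterministic, evenly spaced initialization, because the source does not state which K-means implementation or initialization was used. The function names `kmeans_1d` and `clustered_storage_cost` and the sample GOP sizes and view counts are made up for illustration; only the prices come from Table I.

```python
from typing import Dict, List, Tuple

# Storage types ordered from most to least expensive (USD per GB per month, Table I).
TIERS: List[Tuple[str, float]] = [
    ("S3 Standard", 0.023),
    ("S3 Standard-IA", 0.0125),
    ("S3 One Zone-IA", 0.01),
    ("S3 Glacier", 0.001),
]


def kmeans_1d(values: List[float], k: int, max_iter: int = 100):
    """Lloyd's K-means on scalars; returns (centroids, labels)."""
    lo, hi = min(values), max(values)
    # Assumed initialization: centroids spread evenly over the value range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]

    def assign(cs: List[float]) -> List[int]:
        return [min(range(k), key=lambda c: abs(v - cs[c])) for v in values]

    for _ in range(max_iter):
        labels = assign(centroids)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(centroids)


def clustered_storage_cost(gops: List[Tuple[float, int]]):
    """gops: (size in GB, views) pairs. Returns (total monthly cost, cost per tier)."""
    views = [float(v) for _, v in gops]
    centroids, labels = kmeans_1d(views, k=len(TIERS))
    # The cluster with the highest mean views gets the most expensive tier, and so on.
    ranked = sorted(range(len(centroids)), key=lambda c: -centroids[c])
    tier_of_cluster = {cluster: rank for rank, cluster in enumerate(ranked)}
    per_tier: Dict[str, float] = {name: 0.0 for name, _ in TIERS}
    for (size_gb, _), label in zip(gops, labels):
        name, price = TIERS[tier_of_cluster[label]]
        per_tier[name] += size_gb * price
    return sum(per_tier.values()), per_tier


# Made-up sample: eight pre-transcoded GOPs of 0.01 GB each with varying views.
sample = [(0.01, 1000), (0.01, 980), (0.01, 600), (0.01, 620),
          (0.01, 300), (0.01, 310), (0.01, 50), (0.01, 40)]
total_cost, cost_per_tier = clustered_storage_cost(sample)
baseline_cost = sum(size for size, _ in sample) * TIERS[0][1]  # everything in S3 Standard
```

On this sample the clustered placement costs less than keeping every GOP in S3 Standard, which mirrors the comparison with the partial pre-transcoding method of [7]. A library implementation of K-means could replace `kmeans_1d` without changing the cost calculation.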
B. Impact of Changing FAVs number in repository
To evaluate the proposed method effectively, we need a huge repository of videos. Therefore, we synthesize several repositories, each containing 50,000 videos, in which the percentage of FAVs is varied from 5% to 30%.
The simulated total cost of the fully pre-transcoding, fully re-transcoding, partial pre-transcoding, and clustering pre-transcoding methods is shown in Fig. 4. The cost of the fully pre-transcoding (full storage) method remains constant even when the percentage of FAVs changes, because the full storage cost does not depend on the number of views of the videos.
The experimental results in Fig. 4 show that our proposed clustering method outperforms the other methods. When FAVs make up 30% of the repository, it reduces the incurred cost by up to 90% compared to the fully re-transcoding method. It also reduces the incurred cost by up to 75% compared to the fully pre-transcoding method and by up to 40% compared to the partial pre-transcoding method [7]. The cost reduction achieved by the proposed method becomes more significant as the percentage of FAVs in the repository increases.
VI. CONCLUSION AND FUTURE WORK
In this paper, we propose an improved algorithm to further minimize the cost of cloud resources. In particular, we cluster the frequently accessed GOPs/videos and then store them in the four types of cloud storage. We analyze the performance of the proposed method when changing the number of FAVs in the repository. Experimental results show the efficiency of our proposed method: when the number of views of FAVs increases, the incurred cost is reduced by up to 40%.
Future work will focus on developing video summarization for viewers to improve the quality of service of video streaming.
REFERENCES
[3] “Video transcoding: an overview of various techniques and research issues.” IEEE Transactions on Multimedia 7.5, 2005.
[4] Li, Xiangbo, Mohsen Amini Salehi, and Magdy Bayoumi. “Cloud-based video streaming for energy- and compute-limited thin clients.” The Stream 2015 Workshop at Indiana University, 2015.
[5] http://techblog.netflix.com/2012/12/videos-of-netflix-talks-at-aws-reinvent.html
[6] Li, Xiangbo, Mohsen Amini Salehi, and Magdy Bayoumi. “VLSC: Video Live Streaming Using Cloud Services.”
[7] “Cost-efficient repository management for cloud-based on-demand video streaming.” In 2017 5th IEEE International Conference on Mobile Cloud Computing, Services, and Engineering (MobileCloud), pp. 39-44, 2017.
[8] N. Sharma, D. K. Krishnappa, D. Irwin, M. Zink, and P. Shenoy, “GreenCache: augmenting off-the-grid cellular towers with multimedia caches.” Proceedings of the 4th ACM Multimedia Systems Conference, 2013.
[9] X. Li, M. A. Salehi, M. Bayoumi, and R. Buyya, “CVSS: A Cost-Efficient and QoS-Aware Video Streaming Using Cloud Services,” in Proceedings of the 16th ACM/IEEE International Conference on Cluster Cloud and Grid Computing, CCGrid 16, May 2016.
[10] https://aws.amazon.com/ec2/pricing/on-demand/. Accessed October 2019.
[11] F. Jokhio, A. Ashraf, S. Lafond, and J. Lilius, “A Computation and Storage Trade-off Strategy for Cost-Efficient Video Transcoding in the Cloud.”
[12] Zhao et al. “A version-aware computation and storage trade-off strategy for multi-version VOD systems in the cloud.”
[13] “Characterizing video access patterns in mainstream media portals.” Proceedings of the 22nd International Conference on World Wide Web, 2013.
[14] Kim, Hyun-Woo, He Mu, Jong Hyuk Park, Arun Kumar Sangaiah, and Young-Sik Jeong. “Video transcoding scheme of multimedia data-hiding for multiform resources based on intra-cloud.” Journal of Ambient Intelligence and Humanized Computing, pp. 1-11, 2019.
[15] Mahmoud Darwich, Mohsen Amini Salehi, Ege Beyazit, and Magdy Bayoumi, “Cost-Efficient Cloud-Based Video Streaming Through Measuring Hotness,” The Computer Journal, Volume 62, Issue 5, pages 641-656, May 2019.
[16] G. Gao, W. Zhang, Y. Wen, Z. Wang, and W. Zhu, “Towards Cost-Efficient Video Transcoding in Media Cloud: Insights Learned From User Viewings,” in IEEE Transactions on Multimedia, vol. 17, no. 8, pp. 1286-1296, Aug. 2015.
[17] Jokhio, Fareed, et al. “Analysis of video segmentation for spatial resolution reduction video transcoding.”