Issues and challenges in Cloud Storage Architecture: A Survey
Anwar Ghani, Afzal Badshah, Saeedullah Jan, Abdulrahman A. Alshdadi, Ali Daud
Journal Name [Online ISSN], Volume X, Issue X, Pages, Month 2020
Digital Object Identifier 10.1109/RPIJOURNAL.2020.DOI Number
Cloud Storage Architecture: Research Challenges and Opportunities
Anwar Ghani, Afzal Badshah, Saeedullah Jan, Abdulrahman A. Alshdadi, and Ali Daud
Department of Computer Science & Software Engineering, International Islamic University Islamabad, 44000, Pakistan (e-mail: [email protected], [email protected])
Department of Computer Science & IT, University of Malakand, Chakdara, 18800, Pakistan (e-mail: [email protected])
Department of Information Systems and Technology, College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia (e-mail: [email protected])
Department of Information Systems and Technology, College of Computer Science and Engineering, University of Jeddah, Saudi Arabia (e-mail: [email protected])
Corresponding author: Anwar Ghani (e-mail: [email protected]).
ABSTRACT
From home appliances to industrial enterprises, the Information and Communication Technology (ICT) industry is revolutionizing the world. We are witnessing the emergence of new technologies (e.g., cloud computing, fog computing, the Internet of Things (IoT), Artificial Intelligence (AI), and blockchain), which reflects the growing use of ICT (e.g., in business, education, health, and home appliances) and results in massive data generation. It is expected that more than 175 ZB of data will be processed annually by 75 billion devices by 2025. 5G mobile communication technology dramatically increases network speed, enabling users to upload ultra-high-definition videos in real time, which will generate a massive stream of big data. Furthermore, smart devices equipped with artificial intelligence will act like human beings on the network (e.g., self-driving vehicles) and will also generate big data. This sudden shift and massive data generation have created serious challenges in storing and managing heterogeneous data at such a large scale. This article presents a state-of-the-art review of the issues and challenges involved in storing heterogeneous big data, their countermeasures (from security and management perspectives), and future opportunities for cloud storage. These challenges are reviewed in detail, and new directions for researchers in the field of cloud storage are identified.
INDEX TERMS
Internet of Things, Cloud Computing, Storage Architecture, Cloud Security, Cloud Data Management
I. INTRODUCTION
The recent advances in smart technology are attracting growing attention and are resulting in massive data generation. The forecast of 75 billion devices is a big number, roughly ten times the world population [1]. These devices will increase the annual size of the global datasphere to 175 ZB [2], [3]. Another report states, as shown in Fig. 1, that more than 331 billion dollars will be invested in the cloud by 2023 [2]. This development requires not only special infrastructural improvements but also special techniques to process and store the incoming data [4]. Furthermore, integrating Artificial Intelligence (AI) into smart devices makes the data storage process more complicated. Fig. 1 shows the devices and revenue forecast of cloud computing.
FIGURE 1.
The cloud devices and revenue forecast.
VOLUME 1, 2020. arXiv [cs.NI], Apr. Ghani et al.: Preparation of Papers for RPiOAJ (March 2020)
[Figure 2 labels: Storage; Servers; Underlying physical storage]
Cloud storage provides virtual storage resources without requiring users to buy the physical resources. Remote accessibility (i.e., access from anywhere at any time) is the core of cloud storage. Leading cloud providers (e.g., Apple iCloud, Microsoft OneDrive, and Google Drive) provide remote services to their users.
FIGURE 2.
Structure of cloud storage.
Today, more embedded devices are joining the Internet to monitor and connect everything (e.g., traffic facilities, buildings, the environment, and lakes), enlarging the volume of data generated [5], [6], [7], [8], [9]. As the data on the Internet increases day by day, analyzing and storing it through traditional data management methods is a great challenge [10], [11]. Researchers are therefore working to design new kinds of NoSQL-based databases for handling unstructured data at such a large scale [12], [13], [14]. There are many proposals in the literature for a universal storage architecture that supports multiple data models at the same time and can store big heterogeneous data in the cloud environment [15], [16], [17], [18]. Fig. 2 shows the structure of cloud storage.
With the advance of technology, the computing requirements of organizations have grown exponentially, prompting them to incorporate more computing and storage resources [19]. Setting up systems at such a large scale requires more effort and heavy investment, prompting enterprise customers to outsource their computing and storage resources [20], [21], [22], [23]. As a consequence, users do not have full control over the computing resources made available through the cloud over the Internet [24], [25].
Storage in the cloud is becoming a hot research area because new applications are data intensive, doubling storage capacity requirements as well as data usage every year. This has prompted some commercial organizations to work on another cloud service called “on-demand storage”. Currently, storage providers are focused on other aspects of cloud storage such as cost, performance, and the incorporation of multiple storage systems [26]. Fig. 3 shows the master and data nodes in the cloud storage architecture.
The data center models in cloud computing are based on the “design-for-failure” principle.
To provision global storage services, cloud storage must use scalable, cheaper, and purpose-built solutions. Such solutions may include different hardware such as servers, networking equipment, and storage systems, and should use standard delivery models built on massive economies of scale. “Off-the-shelf” products designed for the traditional IT market may not be suitable for cloud data centers, since they are expensive and do not meet the specific requirements of the cloud data center environment.
This study explores the cloud storage architecture, its challenges, and possible solutions, as well as the future and opportunities of cloud storage. Cloud storage issues include, but are not limited to, security, confidentiality, data dynamics, integrity, data access, data segregation, authentication and authorization, data breaches, backup problems, and vulnerabilities in virtualization.
The rest of this article is structured as follows: Section II provides an insight into the issues related to cloud storage and their countermeasures. Section III discusses the future opportunities of cloud storage. Finally, Section IV concludes the article with the key findings and future directions.
[Figure 3 labels: Consumer-1, Consumer-2, Consumer-3, ..., Consumer-n; Internet; Storage Service Provider; Master node; Data node-1, Data node-2, Data node-3, Data node-4, Data node-5, ..., Data node-n]
FIGURE 3.
Master and data node in cloud storage architecture.
II. CLOUD STORAGE CHALLENGES AND POSSIBLE SOLUTIONS
Storage in a cloud is a crucial part of Infrastructure as a Service (IaaS). The lack of proper storage management in a cloud environment may lead to severe consequences [27]. Cloud storage issues have been categorized as data security issues and data management issues [28], [29]. This paper focuses on issues in these two categories and reviews possible solutions. Some points may overlap both categories; however, the distinction helps in understanding the challenges faced by cloud storage providers and tenants. The following subsections elaborate on these issues and their countermeasures.
A. DATA SECURITY ISSUES
Data security is an important requirement that tenants regard as a right. Secure services attract users to store their data in a cloud. Companies providing cloud storage services are searching for techniques that can control access to cloud data and improve security. As the size of the data increases, so do data attacks and interceptions. Cloud computing provides storage services in a virtualized environment where a user has no control over the data [30]. In such a situation, a user may ask questions like “where exactly is my data located?”, “what happens if I delete my data?” and “is the deleted data really deleted?”.
[Figure 4 labels: Cloud Storage Issues; Data Security Issues: Data Confidentiality, Data Integrity, Data Access, Authentication & Authorization, Data Breaches; Data Management Issues: Data Dynamics, Data Segregation, Virtualization Vulnerabilities, Backup Issues, Availability, Data Locality]
FIGURE 4.
Overall flow of this study.
Many solutions to data security in the cloud can be found in the literature. The authors in [30] divided the security solutions into four layers (i.e., availability, authentication, confidentiality, and integrity). They argued that if confidentiality is achieved, it automatically ensures integrity. However, this subsection is dedicated to a more elaborate study of the issues related to data security. A recent study exploring data security and privacy in cloud storage [27] pointed out three main causes rooted in features of cloud computing that are independent of the technology used on the server; these include outsourcing and multitenancy.
Time Stamp Authority (TSA) and Public Key Infrastructure (PKI) technologies have been introduced into the cloud storage system to provide authentication and security at minimum cost and with low system overhead. Trusted time stamps help in auditing and recording [31], [19], [32], [33]. The three points considered are user identification, time stamping, and user verification through the cloud storage system. The use of PKI improves security, whereas authentication is done through directory services. The time stamp provides security services such as audit and evidence with very little overhead. The TSA also performs data management and optimization in the cloud storage system. The workload is increased by the communication between the TSA and the client and by the verification of users' operations. As no certificate is used during the communication process, no extra overhead is involved. The operation commands are converted into a time stamp request and sent to the TSA server, which communicates with the directory server to verify the certificate. Once the certificate is validated, a time stamp is issued. The corresponding time stamp is then sent to the cloud, and further operations may be performed. The cloud system stores the time stamps and the operations record. The operations may be queries, downloads, uploads, and so on. The basic approaches used in designing data security techniques are shown in Table 1. Furthermore, AI, 5G, IoT, and blockchain are improving privacy and security [34].
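The time stamping flow described above (validate the user's certificate through the directory, issue a time stamp over the operation, store the stamp with the operation record) can be illustrated with a minimal sketch. Everything named here is hypothetical: the `DIRECTORY` mapping, the `TSA_KEY`, and both functions are assumptions made for illustration, and an HMAC stands in for the PKI signature that a real time stamp authority would produce.

```python
import hashlib
import hmac
import time

# Hypothetical stand-ins: a real deployment would use X.509 certificates
# and a PKI-signed time stamp token rather than a shared HMAC key.
DIRECTORY = {"alice": "cert-alice-v1"}   # directory server: user -> valid certificate id
TSA_KEY = b"demo-tsa-secret"             # secret held by the TSA in this sketch


def issue_timestamp(user, cert_id, operation, now=None):
    """TSA side: validate the certificate via the directory, then stamp the operation."""
    if DIRECTORY.get(user) != cert_id:
        raise PermissionError("certificate not valid for user")
    issued_at = int(time.time() if now is None else now)
    digest = hashlib.sha256(operation.encode()).hexdigest()
    payload = f"{user}|{digest}|{issued_at}".encode()
    tag = hmac.new(TSA_KEY, payload, hashlib.sha256).hexdigest()
    return {"user": user, "digest": digest, "issued_at": issued_at, "tag": tag}


def verify_timestamp(token, operation):
    """Audit side: recompute the tag and check that it covers this operation.

    For simplicity the verifier shares the TSA key in this sketch.
    """
    payload = f"{token['user']}|{token['digest']}|{token['issued_at']}".encode()
    expected = hmac.new(TSA_KEY, payload, hashlib.sha256).hexdigest()
    same_op = hashlib.sha256(operation.encode()).hexdigest() == token["digest"]
    return same_op and hmac.compare_digest(expected, token["tag"])


audit_log = []   # cloud side: time stamps stored together with the operation record
token = issue_timestamp("alice", "cert-alice-v1", "UPLOAD report.pdf", now=1_600_000_000)
audit_log.append(("UPLOAD report.pdf", token))
```

Because the stamp binds the user, the operation digest, and the issue time, a later audit can detect a record whose operation was altered after stamping.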
1) Confidentiality Issues
Cloud storage is a collection of storage servers on which multiple customers' data is stored, which makes privacy a major concern. The fundamental requirement for confidentiality of the information stored or processed in the cloud is the guaranteed protection of confidential or sensitive information. Depending on the requirements of a specific scenario, this may relate to all or part of the externally stored data, the identity of the users who have access to the data, or the actions that the users take on the data (cite: t05). Encryption techniques are used to achieve confidentiality in such systems. Cloud computing is a technology that uses the Internet and servers to maintain and manage data and applications. Cloud computing has improved computing capabilities without large investments.
In the existing situation, to avoid confidentiality issues, a system may implement encryption and decryption techniques [35], which limit the operations the system can perform and require the user to know the encryption and decryption keys. Some systems may implement both encryption and obfuscation depending on the type of data to be stored [30].
A system based on proxy re-encryption, which supports various functions of a distributed storage system, is proposed in (cite: p38, p55). It consists of four stages: 1) system configuration, 2) data storage, 3) data transfer, and 4) data recovery. An RSA-based algorithm is used to generate keys. In the conventional approach, when a sender “A” wants to send a message to a recipient “B”, “A” signs the message with its secret key, encrypts it with the public key of “B”, and uploads the ciphertext. After retrieving the message, “B” decrypts it with its private key and then verifies the signature using the public key of “A”. The whole process involves two communication stages: an upload by the sender “A” and a download by the recipient “B”. This is why the proxy re-encryption scheme is used to reduce the overhead of the data transfer function in the secure storage system.
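The conventional sign-then-encrypt exchange between “A” and “B” can be made concrete with a toy example. The sketch below uses textbook RSA with tiny primes purely for illustration (no padding, not secure); the primes, exponents, helper names, and the integer message are all assumptions. It shows only the baseline exchange, in which B decrypts with its own private key and verifies with A's public key, not the proxy re-encryption scheme itself.

```python
# Toy textbook RSA with tiny primes: for illustration only, never for real use.
def make_keys(p, q, e):
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)           # modular inverse (Python 3.8+)
    return (e, n), (d, n)         # (public key, private key)


def apply_key(value, key):
    exponent, modulus = key
    return pow(value, exponent, modulus)


a_pub, a_priv = make_keys(61, 53, 17)    # sender A, modulus 3233
b_pub, b_priv = make_keys(89, 97, 5)     # recipient B, modulus 8633

message = 1234                               # message encoded as an integer below both moduli
signature = apply_key(message, a_priv)       # A signs with its private key
ciphertext = (apply_key(message, b_pub),     # A encrypts message and signature
              apply_key(signature, b_pub))   # with B's public key, then uploads

# B downloads, decrypts with its private key, then verifies with A's public key.
plain = apply_key(ciphertext[0], b_priv)
recovered_sig = apply_key(ciphertext[1], b_priv)
signature_ok = apply_key(recovered_sig, a_pub) == plain
```

B's modulus is chosen larger than A's so that A's signature value can itself be encrypted under B's key without wrapping around.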
Here are some crucial points regarding data privacy in a cloud storage environment.
1) In the cloud computing paradigm, the confidentiality of governmental and business information and the privacy of personal information have the most significant implications.
2) The level of confidentiality and privacy of a user depends on the privacy policies and terms of service provided by a cloud provider.
3) For some types of information and some categories of users, the rights, obligations, and status of privacy and confidentiality may change when a user discloses information to a cloud provider.
4) The legal status of, and protections for, personal and business information may be adversely affected.
5) The location of information may strongly affect the protection of confidentiality and privacy, as well as the privacy obligations of those processing and storing this information in a cloud environment.
6) A cloud may store information at different locations with different legal implications, leading to different legal consequences at the same time.
7) Laws against criminal activities and other matters can oblige a provider to disclose or examine user records for the sake of evidence.
8) Despite the legal protection for a user's privacy and confidentiality, various legal uncertainties make it difficult to assess the status of information in a cloud.
2) Integrity Issues
TABLE 1.
Basic approaches used in designing data security techniques
Data Security | Public Key Infrastructure | Low cost/system overhead
Data Security | Trusted Timestamps | Auditing, recording, data management
Data Security | Directory Services | Authentication, verification
Data integrity is one of the most crucial elements of any system. Integrity requires that the parties communicating in the cloud (i.e., users and vendors) be authentic and that both the data stored with third-party vendors and the responses computed for requests be guaranteed correct (cite: t05). In a standalone system, data integrity may be achieved with a single database using constraints and transactions. To ensure integrity, transactions must adhere to the ACID (atomicity, consistency, isolation, and durability) properties widely used in databases. Distributed systems, however, are far more complex, since multiple databases and multiple applications are the norm. In a distributed environment, data may be maintained at different sites. Therefore, any transaction involving data shared by multiple sites must be handled carefully to avoid transaction failure and to allow various distributed applications to take part in the global transaction through a resource manager.
With the advent of Service Oriented Architecture (SOA) and cloud computing, data integrity issues multiply because a mixture of local and SaaS (Software as a Service) applications is exposed as services. The SaaS model supports multi-tenancy in applications, which are usually hosted by a third party and whose functionality is exposed through XML-based APIs (Application Programming Interfaces). Similarly, in other environments such as SOA, various applications use web services, for example SOAP and REST, to expose their functionality. However, managing transactions using web services is a serious challenge: since guaranteed delivery and transactions are not supported at the HTTP protocol level, the only way out is to implement them at the API level.
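The standalone case mentioned in this subsection, where a single database enforces integrity through constraints and transactions, can be illustrated with Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper, and the balances are hypothetical; the sketch only demonstrates atomicity and a consistency constraint on one database, not a distributed global transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move funds atomically: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dict of name -> balance."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

When the debit would violate the `CHECK` constraint, the credit that was already applied inside the same transaction is rolled back as well, so the database never exposes a half-completed transfer.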
3) Data Access Issues
Issues in access to data in cloud storage are mostly due to security policies. For example, a small business may use the services of a cloud provider to execute its business processes [36], [37]. Such organizations allow their employees to access specific organizational data according to their own security policies. These policies may allow an employee to access certain data while preventing access to other data. To stop intruders from gaining unauthorized access to cloud resources, a cloud must adhere to these security policies [38]. The SaaS model must allow organizations to integrate their security policies and must keep each organization's data within its own boundary when multiple organizations use the same cloud environment. Availability additionally requires a mechanism for verifying Service Level Agreements (SLAs) between a user and providers, confirming that the user's requirements are fulfilled [39].
Many countermeasures have been proposed in the literature to mitigate data access problems in cloud storage. Three categories of secure access control can be found (i.e., Role Based Access Control (RBAC), User Based Access Control (UBAC), and Attribute Based Access Control (ABAC)) [40], [41]. Because UBAC attaches an access control list (ACL) to user data, it is usually not considered a suitable candidate for cloud storage; moreover, when Big Data is involved, the computational and communication overhead required for handling ACLs is high [42]. RBAC classifies users by role to control access to data: a user matching a specific role is granted access. Such approaches are considered suitable for enterprise-level organizations, for example hospitals [43].
The third and most often used category in cloud storage is Attribute Based Access Control (ABAC), where a data owner assigns attributes to users and access policies to data [44]. In this case, access to the data is granted to users whose attributes satisfy a specific access policy. For confidential, fine-grained access to data in the cloud, this category is further divided into two approaches, i.e., KP-ABE [45], [31] and CP-ABE [46], [13]. In KP-ABE, the key of a user is linked with an access policy, whereas the ciphertext is linked with attributes. In contrast, in CP-ABE the key of a user is linked with attributes, whereas the ciphertext is linked with an access policy.
However, the complexity of attribute-based access control techniques grows linearly with the number of attributes used in decryption, incurring considerable computational overhead, especially for resource-limited devices such as mobile devices [37].
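The difference between role-based and attribute-based decisions can be shown with a small plaintext sketch. The roles, attributes, policy format, and function names below are hypothetical, and the checks are ordinary comparisons: this is not an implementation of KP-ABE or CP-ABE, where the policy is enforced cryptographically during decryption.

```python
# Hypothetical roles, attributes, and policies, chosen only for illustration.
ROLE_PERMISSIONS = {
    "doctor": {"read_record", "write_record"},
    "receptionist": {"read_schedule"},
}


def rbac_allows(role, permission):
    """RBAC: access follows from the role a user holds."""
    return permission in ROLE_PERMISSIONS.get(role, set())


def abac_allows(user_attrs, policy):
    """ABAC: access is granted when the user's attributes satisfy the policy.

    The policy is a list of (attribute, required_value) pairs that must all match.
    """
    return all(user_attrs.get(attr) == value for attr, value in policy)


record_policy = [("department", "cardiology"), ("clearance", "high")]
alice = {"department": "cardiology", "clearance": "high"}
bob = {"department": "cardiology", "clearance": "low"}
```

Even in this simplified form, the cost of `abac_allows` grows with the number of attributes in the policy, which mirrors the linear growth noted above for attribute-based schemes.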
4) Authentication and Authorization Issues
Authentication plays a crucial role in any system that needs strong security, acting like an entrance door that admits only trusted individuals to the premises of a cloud. Access to important information depends on authentication; therefore, due to its sensitive nature, the authentication process must be robust to ensure availability to authentic users. In combination with cryptography, granting access only to authenticated individuals can ensure not only data confidentiality but also integrity. Most security concerns can be mitigated through a sophisticated authentication mechanism [47], [48], [35].
Various companies use a Lightweight Directory Access Protocol (LDAP) server to store information about their employees [49]. Small and medium-sized businesses, the segment where adoption of the SaaS model is high, mostly manage users through Active Directory (Microsoft White Paper, 2000). The SaaS model allows software to be hosted outside the organizational firewall. Many organizations keep the user credential database separate from their IT infrastructure; therefore, a customer must keep track of all employees joining or leaving the organization and must enable or remove their accounts in the system accordingly. This may result in extra management overhead for the customer organization if it uses multiple SaaS products. In such cases, the provider can delegate certain powers, such as authentication, to the customer, for example by enabling the customer organization's internal LDAP/AD server to control user management.
5) Data Breaches
A cloud environment is usually shared among many customers to store their data. Therefore, a compromise of the cloud environment is a potential threat to the data of all users, making the cloud an attractive target for attackers [36]. R. Cooper, in his report [50], rated external criminals as the most likely threat, contributing 73% of breaches but with the least impact, compromising 30,000 records and producing a Pseudo Risk Score (PRS) of 21,900. Insider threats received the lowest rating (18%) but had the greatest impact, compromising 375,000 records with a PRS of 67,500. Partners received the middle rating of 39%, compromising 187,500 records with a PRS of 73,125. In each case the PRS is the product of the likelihood and the number of records compromised. The security provided by SaaS is argued to be better than that of conventional means; however, although insiders may not have direct database access, they still pose a risk with a huge impact on data security. Employees of SaaS providers can expose customers' private information since they have access to a lot of information. To avoid such complications, SaaS providers must follow standards such as PCI-DSS (Payment Card Industry Data Security Standard).
B. DATA MANAGEMENT ISSUES
The management issues related to data are explored in this subsection. They are categorized and briefly explained as follows.
1) Data Dynamics Issues
Data management in the cloud is considered untrustworthy because it shifts databases as well as application software to large centralized data centers. This new paradigm introduces various security issues that are yet to be understood. Supporting data dynamics in the cloud through operations such as insertion, block modification, and deletion is a huge step toward practicality, as cloud services are not restricted to backup and archiving. The following points bear on the assurance of data dynamics in cloud storage [51], [52].
• Data centers are being transformed on a large scale into computing pools by the “Software as a Service” (SaaS) computing architecture. In addition, the fast growth in network resources such as bandwidth and reliability enables customers to subscribe to high-quality services from remote data and software applications in data centers.
• A cloud service provider may, for its own benefit, conceal errors in data or software used by its clients. For example, a provider may deliberately delete rarely accessed data of an ordinary client without the client's knowledge in order to save money and storage space [51].
• For data dynamics, various schemes have been designed that attempt to combine efficiency, an unlimited number of queries, and retrievability of information.
One possible solution is to promote a public auditing system for data storage security in cloud computing [33]. In addition, fully dynamic protocols for data operations, especially block insertion, must be designed, as this feature is lacking in most existing approaches. To support efficient and scalable public auditing, the existing schemes must be extended. Such an extension should achieve batch auditing, enabling a third party auditor (TPA) to perform auditing tasks delegated by multiple users simultaneously.
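The idea of delegated and batched auditing can be sketched in a deliberately simplified form. In the sketch below, each owner hands per-block SHA-256 digests to the auditor, who challenges a sample of block indexes and compares digests. This is an assumption made for illustration and is a swapped-in technique: the public auditing schemes surveyed here typically rely on homomorphic authenticators so that the auditor never has to retrieve whole blocks. All names (`make_tags`, `audit`, `batch_audit`, `store`) are hypothetical.

```python
import hashlib
import random


def make_tags(blocks):
    """Owner side: one SHA-256 digest per block, handed to the auditor."""
    return [hashlib.sha256(b).hexdigest() for b in blocks]


def audit(tags, fetch_block, sample_size, rng):
    """TPA side: challenge a random sample of block indexes and compare digests."""
    indexes = rng.sample(range(len(tags)), min(sample_size, len(tags)))
    return all(hashlib.sha256(fetch_block(i)).hexdigest() == tags[i] for i in indexes)


def batch_audit(tasks, sample_size, rng):
    """Run audits delegated by several users in one pass; returns user -> result."""
    return {user: audit(tags, fetch, sample_size, rng) for user, (tags, fetch) in tasks.items()}


# Hypothetical provider storage for two users.
store = {
    "u1": [b"block-0", b"block-1", b"block-2"],
    "u2": [b"alpha", b"beta"],
}
tasks = {u: (make_tags(blocks), (lambda i, u=u: store[u][i])) for u, blocks in store.items()}
rng = random.Random(0)
results_before = batch_audit(tasks, sample_size=3, rng=rng)
store["u2"][1] = b""   # the provider silently drops the contents of one block
results_after = batch_audit(tasks, sample_size=3, rng=rng)
```

With a sample size at least as large as the number of blocks, every block is checked; smaller samples trade detection probability for lower audit cost.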
2) Data Segregation Issues
Cloud computing architecture became popular because of its multi-tenant nature [49], [53]. Multi-tenancy in the cloud through SaaS applications allows data from multiple users to be stored simultaneously. Since the data of different users resides at a single location, one user may be able to intrude into another user's data. Such an intrusion may exploit loopholes in the application or inject malicious client code into the SaaS system. If an application executes injected code without verification, there is a high possibility of intrusion into other users' data. Therefore, a SaaS model must ensure that the data of each user is bounded at both the physical and the application level. The SaaS service must intelligently segregate data from different users [54].
Attackers may bypass security checks through handcrafted parameters that exploit vulnerabilities in the application. This may lead to the exposure of other tenants' sensitive data. Therefore, different assessment tests must be performed to ensure that the data of different users in a multi-tenant environment is fully segregated. These tests cover: i) data validation, ii) SQL injection flaws, and iii) insecure storage. Any flaws detected by these tests could be used to illegally access sensitive data of the enterprise or of other tenants.
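How a handcrafted parameter can cross tenant boundaries, and how a parameterized query prevents it, can be shown with the built-in `sqlite3` module. The `docs` table, the tenant names, and both lookup functions are hypothetical examples, not part of any surveyed system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (tenant TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [("acme", "acme-plan"), ("globex", "globex-secret")],
)


def find_docs_unsafe(tenant, title):
    """Vulnerable: user input is pasted directly into the SQL text."""
    sql = f"SELECT title FROM docs WHERE tenant = '{tenant}' AND title = '{title}'"
    return [row[0] for row in conn.execute(sql)]


def find_docs_safe(tenant, title):
    """Safer: placeholders keep input as data, and every query is tenant-scoped."""
    sql = "SELECT title FROM docs WHERE tenant = ? AND title = ?"
    return [row[0] for row in conn.execute(sql, (tenant, title))]


crafted = "x' OR '1'='1"   # handcrafted parameter supplied by an attacker
```

Passing `crafted` as the title makes the unsafe query return rows belonging to every tenant, while the parameterized version treats the same input as an ordinary (non-matching) title.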
3) Virtualization Issues and Vulnerability
Virtualization is one of the major components of a cloud environment; it ensures that the various instances running on a single machine are isolated from each other. It is the source of major security challenges in a cloud environment that have not been fully investigated to date [55], [56]. A second issue is administrative control over the operating systems running as guest and host systems, together with their imperfect provisioning of isolation [57] and scalability issues [58]. Many current Virtual Machine Monitors (VMMs) suffer from bugs that allow escape from a VM; “root security” is therefore mandatory in such cases to prevent the host operating system from being interfered with by any virtualized guest system. Some virtualization software has been reported to have vulnerabilities that could allow a local user or an attacker to bypass certain security checks and gain illegitimate access [57], [59]. One example is the Microsoft Virtual Server and Virtual PC vulnerability, where a user of a guest operating system could execute code on another guest operating system or even on the host operating system itself. This could allow an escalation of privileges leading to unauthorized access to sensitive information. Similarly, a validation error in “tools/pygrub/src/GrubConf.py” of Xen could allow a user with “root” access in a guest operating system to use specially crafted contents in grub.conf to run commands in domain 0 when the guest operating system boots. Fully functional interposition, inspection, and complete isolation are not yet achieved in VMMs and need further investigation.
TABLE 2.
Security solution for cloud storage architecture (Part-I)
Security Properties | Approaches | Description
Confidentiality | Cryptography | Cryptography secures and protects data during communication. It is useful for blocking unauthorized users from accessing private data.
Confidentiality | Digital signatures | A digital signature is a scheme that can verify the authenticity of messages or digital documents. A valid digital signature is required before access to the data is granted.
Confidentiality | Proxy Re-encryption | Proxy re-encryption is commonly used when a party wants to reveal to a third party the content of messages sent to it that are encrypted with its public key.
Confidentiality | Obfuscation | Obfuscation is the intentional creation of source code or machine code that is difficult for a human to understand. Like obfuscation in natural language, it may use needlessly roundabout expressions to compose statements.
Confidentiality | Blockchain | Blockchain is a design that allows digital information to be shared but not copied. Blockchain technology has created the backbone of a new type of internet.
Integrity | Atomicity | Atomicity guarantees that each transaction is treated as a single unit: either all of its operations take effect or none do.
Integrity | Consistency | Consistency in database systems means that a transaction can change the data only in allowed ways, moving the database from one valid state to another.
Integrity | Isolation | Isolation in database systems determines how and when the changes made by one operation become visible to other users and systems.
Integrity | Durability | In database systems, durability is the ACID property that ensures that committed transactions persist.
Data Access | Role Based Access Control | RBAC is an approach that restricts system access to authorized users. It is used by most companies with more than 500 employees and can implement mandatory access control (MAC) or discretionary access control (DAC).
Data Access | User Based Access Control | User-based access grants permissions to individual users; role-based access (or role-based permissions) adds a further categorization layer on top of it.
Data Access | Attribute Based Access Control | Attribute-based access control (ABAC), also called policy-based access control, grants access rights to users through policies that combine attributes.
Data Breaches | Directory Services | A Lightweight Directory Access Protocol (LDAP) server is used to provide authentication and authorization services.
4) Backup Issues
SaaS providers must back up the sensitive data of the business enterprises they serve so that it can be recovered quickly in case of disaster. To protect against security threats such as accidental data leakage, encryption schemes should be used to protect the backup data, and these schemes must be strong enough to resist modern attacks.
At the time of writing, Amazon, as a cloud vendor, does not encrypt data at rest in S3 by default. This control is left to users, who must secure their backup data separately to protect against unauthorized access or tampering. Various tests can be performed to validate that the backup data provided by a SaaS model is secure. These tests cover: i) insecure storage and ii) insecure configuration. Any flaws identified by these tests are potential threats that could let unauthorized users access sensitive information stored in cloud backups belonging to different enterprises.
5) Availability
SaaS applications guarantee around-the-clock service to clients. This involves architectural changes in SaaS infrastructure and applications to attain availability and scalability. A multitier cloud architecture needs to be adopted, and it must support load balancing of application instances running on different servers. Cloud storage must be resilient to software and hardware failures; furthermore, it must be protected from both denial of service (DoS) and distributed denial of service (DDoS) attacks [60], [61], [62], [63].
An appropriate disaster recovery and operational continuity plan should be in place for any unforeseen disaster. This is important for ensuring the security of organizational data and minimal downtime. For example, at Amazon, the AWS API endpoints are hosted on the same world-class Internet infrastructure that supports Amazon's own services and use connection throttling. To further reduce the potential impact of a DDoS attack, Amazon maintains internal bandwidth that exceeds its provider-supplied Internet bandwidth. Tests should be performed to validate the SaaS vendor's availability. Many applications automatically lock user accounts after successive incorrect credentials; however, improper implementation and configuration of such features can be exploited by malicious users to mount DoS attacks.
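Connection throttling of the kind mentioned above is often implemented with a token bucket. The following is a minimal sketch under that assumption; the class name, the rate and capacity values, and the explicit `now` argument (used instead of a real clock so the behaviour is reproducible) are all illustrative choices and do not describe how any particular provider implements throttling.

```python
class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = now

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate=1.0, capacity=3)
burst = [bucket.allow(now=0.0) for _ in range(5)]   # five requests at the same instant
later = bucket.allow(now=2.0)                        # one request two seconds later
```

Unlike a hard account lock after repeated failures, a throttle of this kind slows a flood of requests without letting an attacker permanently deny service to a legitimate user.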
TABLE 3. Security solution for cloud storage architecture (Part-II)
Security Issue | Solution | Description
Data Dynamics | Public Auditing | An efficient and scalable public auditing system should be introduced to extend the existing schemes.
Data Validity | Data Segregation | Security layers give the flexibility to consolidate vast amounts of data while controlling who can see what, through a sophisticated system of work groups, organizational rollups, and access levels, combined with field and function level security.
Virtualization Vulnerabilities | Root Security | Root protection enables users of devices (e.g., smartphones, tablets) with the Android mobile operating system to get privileged control (known as root access) over different Android subsystems.
Backup Issues | Encryption Schemes | Different encryption schemes encode the data before it is stored. This secures the backup data from unauthorized users.
Availability | Multitier Architecture | Multitier architecture (often referred to as multi-level or multi-layer architecture) is a client-server structure in which the functions of presentation, application processing and data management are separated.
Availability | Load Balancing | Application instances run on different servers, are resilient to software and hardware failure, and are protected against DoS and DDoS attacks.
Data Locality | Regional Backup Servers | Because cyber laws differ by region, data should be kept on servers in the same region to avoid data locality, cultural and legal issues.
6) Data Locality
In the SaaS cloud model, a client uses the application provided by the SaaS vendor together with their own business data, but the client is unaware of where that data is stored in the cloud [38], [64]. This may lead to several issues. For example, because data privacy laws differ between countries, data locality is of utmost importance in enterprise business architecture. For instance, in many South American countries and several countries in the European Union, certain types of data may not be allowed to leave the country because of the sensitivity of the information. Similarly, questions about which local government's laws and jurisdiction apply may arise in case of any type of investigation [65]. A secure SaaS model should be able to assure its clients of the location of their data.
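A data locality rule of this kind can be expressed as a small placement check. The sketch below is hypothetical: the data classes, region names and the `RESIDENCY_POLICY` table are invented for illustration and do not reflect any particular law or provider.

```python
# Hypothetical policy table: data class -> regions where it may be stored.
RESIDENCY_POLICY = {
    "health_records": {"eu-west"},
    "payment_data": {"eu-west", "sa-east"},
    "public_content": {"eu-west", "sa-east", "us-east"},
}


def allowed_regions(data_class: str) -> set:
    """Regions permitted for a data class; unknown classes get no region."""
    return RESIDENCY_POLICY.get(data_class, set())


def choose_region(data_class: str, candidates: list) -> str:
    """Return the first candidate region that the policy permits."""
    permitted = allowed_regions(data_class)
    for region in candidates:
        if region in permitted:
            return region
    raise ValueError(f"no permitted region for {data_class!r}")


# The provider prefers us-east, but the policy forces the data to stay in eu-west.
placement = choose_region("health_records", ["us-east", "eu-west"])
```

Denying placement by default for unknown data classes is a deliberate choice in this sketch: data that has not been classified is never stored in a region that might violate a locality rule.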
III. CLOUD STORAGE FUTURE AND OPPORTUNITIES
The future of the cloud is nothing less than a dream. AI-enabled objects (e.g., self-driving vehicles), the web of IoT devices, and 5G connectivity (i.e., mobile communication technology) are changing the way of living [3]. The IT industry is rapidly changing everything. The cloud's simple and easy user interface, the absence of cost and capacity constraints, and a number of other features are attracting both individuals and the market [66].
Recent advances in smart technology generate massive data traffic. The forecast of 51 billion devices is a big number, about seven times greater than the whole world population [1]. These devices will increase the annual size of the global data-sphere up to 175 ZB [2], [3]. Another report states, as shown in Figure 1, that more than 331 billion dollars will be invested in the cloud up to 2023 [2]. Special techniques and infrastructure are needed to process the incoming data [4]. Furthermore, the integration of Artificial Intelligence (AI) in smart devices increases data production dramatically. With this rapid development in smart technology, cloud storage is getting more and more attention. Along with AI, block-chain is adding safety and security to cloud storage. This technology in storage is maturing and will increase customer trust in the cloud.
FIGURE 5. Opportunities of cloud storage: invisibility, cost savings, automation and synchronization, privacy and security, remote accessibility, sharing and collaboration, disaster recovery, and usability.
This section presents a quick review of the cloud storage future and its opportunities [67], [38], [68], [69], [70], [71], [72], [73], [74], [75].
A. REMOTE ACCESSIBILITY
Remote accessibility (i.e., access from anywhere at any time) is the core of cloud storage. Fast network speeds and AI are making it smarter and faster. Leading cloud providers (e.g., Apple iCloud [76], Microsoft OneDrive [77], and Google Drive [78]) provide fast and reliable remote services to their users. Remote access allows users to store and retrieve items from cloud storage without needing to create a physical connection. Interest in the accessibility of storage has grown with the introduction of high-capacity storage devices and high-bandwidth networks. Remote access increases the usage of cloud storage and the associated business. Where internet services are available, cloud storage can provide seamless access to data files [68]. The coming 5G internet service will make access easy and smart, approaching real time [79].
B. 5G CONNECTIVITY
With this high-speed technology, humans will be able to virtually operate any machine at a distance of thousands of kilometers [80]. It will reduce latency to nearly zero (on the order of 1 ms). Such high speed will minimize the need for local hard drives. This technology will be able to store and process data on the cloud without noticeable jitter or delay, making real-time use possible. 5G opens a new era of cloud storage [81], [82].
C. INTERNET OF THINGS (IOT)
With the introduction of the Internet of Things (IoT), the number of devices connected to the internet has increased enormously. By 2025, 75 billion devices are expected to be connected to the Internet, processing 175 ZB of data annually. This is a large number, and advanced technology will be needed to process and store this data. These figures clearly show that cloud storage will be of great value in the coming years. Furthermore, the use of smart devices is also increasing dramatically. These devices are small and do not have enough space to store or process big data; therefore, they depend on the cloud [83].
D. ARTIFICIAL INTELLIGENCE (AI)
From recognizing facial expressions to self-driving vehicles, AI is progressing very rapidly. AI is making smart decisions in complex situations. Today's AI is called weak AI, which performs limited tasks such as recognizing facial expressions and driving a car; however, the future will bring general AI, which will perform tasks just like human beings. AI is making cloud storage smarter and more attractive [84]. Furthermore, the use of block-chain in storage is making it more secure [85].
E. USABILITY
The provider's business depends directly on resource utilization. Today's technologies massively increase cloud usage because they provide a very easy and reliable user interface. Usually, cloud storage appears as a local folder on PCs and mobile devices, which allows users to move files back and forth between the cloud and the local system using drag and drop [38], [68], [72]. The integration of smart technologies (i.e., IoT, AI, fog and 5G) is making cloud storage very easy to use. 5G will provide high bandwidth and near real-time access. The cost of cloud storage is very low compared with buying storage devices, which is very appealing [79].
F. DISASTER RECOVERY
In today's modern world, data is the most valuable asset. Losing it causes irreversible damage to the business (including loss of productivity, income, reputation and even customers). Business enterprises use cloud storage as a backup for their important files. In cloud storage, data is stored in three different locations, and in case of any disaster the data can easily be recovered. Furthermore, cloud storage provides remote access to files; therefore, these files can be used to recover the system in case of any emergency or disaster [38]. 5G technology has made the recovery process very easy and fast. Compared with traditional disaster recovery, cloud storage recovery is easier, cheaper and faster, whereas a local disaster recovery site requires high investment, staff and maintenance.
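The idea of keeping copies in three locations and recovering from whichever survives can be sketched with a toy in-memory store. The class `ReplicatedStore`, the site names and the sample data are assumptions made for illustration; real services replicate across data centers and handle consistency, which this sketch ignores.

```python
class ReplicatedStore:
    """Toy store that writes every object to several in-memory 'sites'."""

    def __init__(self, site_names):
        self.sites = {name: {} for name in site_names}

    def put(self, key: str, value: bytes) -> None:
        """Write the object to every site that is still alive."""
        for objects in self.sites.values():
            if objects is not None:
                objects[key] = value

    def fail_site(self, name: str) -> None:
        """Simulate a disaster that wipes one site."""
        self.sites[name] = None

    def get(self, key: str) -> bytes:
        """Read from the first surviving site that holds the object."""
        for objects in self.sites.values():
            if objects is not None and key in objects:
                return objects[key]
        raise KeyError(key)


store = ReplicatedStore(["site-a", "site-b", "site-c"])
store.put("ledger.db", b"quarterly figures")
store.fail_site("site-a")
store.fail_site("site-b")
recovered = store.get("ledger.db")  # still served by site-c
```

With three copies, the data in this sketch survives the loss of any two sites; only when all three fail does a read raise an error.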
G. COST SAVINGS
When we talk about the cloud, it means that we are getting the resources of a supercomputer at home without buying it. We actually hire these resources at very cheap rates, which saves the consumer's capital investment. Cloud storage is used by various types of business enterprises to reduce their annual database operation expenses. In particular, medium-sized corporations, which cannot invest much in storage infrastructure, hire cloud storage. This saves them a major investment. Storing one gigabyte of data using cloud storage services costs about three, whereas a user can achieve further savings in terms of power consumption, as remote cloud storage does not need internal power [73], [74]. Cloud storage saves operational and maintenance costs, and users pay just as per their usage.
H. INVISIBILITY
The word storage evokes the image of a big physical device for storing big data, one that needs operation and maintenance. However, cloud storage does not need physical space on the user side, and users access it remotely [86]. Cloud storage services use virtualization techniques to provide resources to customers. Customers do not need to know the complexities and workings of the back end. Cloud storage is invisible and provides storage transparency, with no physical presence on the user side. It does not take up valuable space in the office or at home, and there is no need to set aside a large space for racks and storage. Customers only hire the services and use them on the go.
I. PRIVACY AND SECURITY
The security of cloud storage for sensitive and confidential information is usually higher than that of locally stored data, especially for enterprises. Cloud storage uses advanced security measures (e.g., advanced firewalls, event logging, internal firewalls, intrusion detection, data replication, encryption, and physical security) to protect data from outside attacks. Different types of security layers are used to protect the data centers. Compared with individual storage, enterprises invest more in security. Cloud storage services encrypt data both in transit and at rest, ensuring no unauthorized access to data files. AI, 5G, IoT and block-chain are improving privacy and security [34].
J. AUTOMATION AND SYNCHRONIZATION
Cloud automation is a term for the processes and tools that are used to reduce the manual effort involved in provisioning and managing cloud workloads. Cloud storage is self-managed and does not need human effort [87]. The main issue most businesses and customers face is following a proper data backup routine. Cloud storage provides an automated data backup service to ease this tedious process. A user simply needs to tell the system what to back up and when, and the cloud service takes care of the rest by itself.
Another attraction of cloud storage is automatic synchronization. The synchronization process ensures that user data files are automatically updated across all of the user's devices. In this way, the latest versions of the user's data files are saved on the local device and are available on all of the user's other devices, such as a smartphone. 5G has made syncing easier, and devices now work in real time [87].
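The synchronization step described above can be sketched as a comparison of two file manifests. The function `plan_sync` and the manifest layout (a mapping from path to SHA-256 digest) are assumptions for illustration; commercial sync clients use their own, more elaborate protocols.

```python
import hashlib


def digest(content: bytes) -> str:
    """Content fingerprint used to detect changed files."""
    return hashlib.sha256(content).hexdigest()


def plan_sync(local: dict, remote: dict) -> dict:
    """Compare {path: sha256} manifests and list what each side is missing."""
    upload = sorted(p for p in local if p not in remote)
    download = sorted(p for p in remote if p not in local)
    conflict = sorted(p for p in local if p in remote and local[p] != remote[p])
    return {"upload": upload, "download": download, "conflict": conflict}


# Hypothetical device and cloud manifests.
local = {"notes.txt": digest(b"v2"), "photo.jpg": digest(b"img")}
remote = {"notes.txt": digest(b"v1"), "report.pdf": digest(b"pdf")}
plan = plan_sync(local, remote)
```

The sketch only reports files that differ on both sides as conflicts. Deciding which copy wins would need extra information, such as modification times or version counters, which is left out here.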
K. SHARING AND COLLABORATION
Cloud storage makes sharing easy. Whether it is a photo, a file or even a folder containing hundreds of files, a cloud storage service makes it convenient for a user to share it with a few clicks. Furthermore, it makes files available everywhere and at any time [88]. Online cloud storage services are also ideal for collaboration. They allow multiple users to collaborate on and edit a single document or data file. Users do not have to worry about tracking the up-to-date version or who has made which changes [68].
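One simple way a service could track the current version and who changed what is a revision log with a stale-edit check. The class `SharedDocument` below is a hypothetical sketch of that idea and is not taken from any cited system.

```python
class SharedDocument:
    """Toy shared document that records who changed what, in order."""

    def __init__(self, text: str = ""):
        self.text = text
        self.revisions = []  # list of (revision number, author, new text)

    def edit(self, author: str, new_text: str, base_revision: int) -> int:
        """Apply an edit only if it was made against the latest revision."""
        if base_revision != len(self.revisions):
            raise ValueError("stale edit: reload the latest revision first")
        self.text = new_text
        self.revisions.append((len(self.revisions) + 1, author, new_text))
        return len(self.revisions)


doc = SharedDocument()
r1 = doc.edit("alice", "Draft", base_revision=0)
r2 = doc.edit("bob", "Draft, reviewed", base_revision=r1)
authors = [author for _, author, _ in doc.revisions]
```

Rejecting edits made against an old revision keeps one collaborator from silently overwriting another's work, while the log answers the question of who made which change.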
L. MASSIVE DEVICES AND DATA
As mentioned earlier, by 2025 approximately 75 billion devices will be connected to the internet, and they will process more than 175 ZB of data per year. This is a very large figure and requires a lot of cloud storage. These predictions will drastically change the need for cloud storage and show that cloud storage has a very bright future ahead.
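To give a sense of scale, the two figures quoted above can be combined into an average per device. This is back-of-the-envelope arithmetic only, assuming decimal (SI) units and an even spread of data across devices, which the source does not claim.

```python
ZETTABYTE = 10**21  # bytes, decimal (SI) definition
TERABYTE = 10**12   # bytes

annual_data_bytes = 175 * ZETTABYTE  # figure quoted in the text
connected_devices = 75 * 10**9       # figure quoted in the text

# Average yearly data volume per connected device, in terabytes.
per_device_tb = annual_data_bytes / connected_devices / TERABYTE
print(f"{per_device_tb:.2f} TB per device per year")
```

Under these assumptions the average comes to roughly 2.3 TB per device per year, far more than the small devices described in the IoT subsection can hold locally.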
IV. CONCLUSION AND FUTURE DIRECTIONS
The recent advances in the IT industry (e.g., Cloud computing, Internet of Things (IoT), Fog computing, Artificial Intelligence (AI) and Block-chain) are rapidly revolutionizing cloud storage. In particular, 5G facilitation (i.e., minimum access delay and ultra-high speed) boosts the use of cloud storage dramatically. This article presented different challenges, their countermeasures, opportunities and the future of cloud storage. It seems that cloud storage is designed to be a highly scalable and conveniently manageable storage system rather than an efficient file system.
Further, it is revealed that despite the ease of use and economic benefits, cloud storage technology still suffers from numerous problems. The cloud storage architecture is mostly clouded by security issues (e.g., confidentiality, integrity, access, authentication, authorization and data breaches) and data management issues (e.g., dynamics, data segregation, backup, and virtualization). To counter these threats, various measures are proposed in the literature. For example, for the security of data, digital certificates are used along with a trusted timestamp approach. Similarly, confidentiality is ensured through cryptographic solutions, while for data access issues, attribute-based encryption is mostly used. Access control is achieved through authentication and authorization. To maintain the integrity of the data, a global transaction manager is used to ensure fail-safe management of transactions across multiple databases.
Finally, it can be concluded that cloud computing (along with the integrated technologies) is a fast-growing technology that is rapidly changing traditional computing. However, considerable research effort is still needed to attract customers, especially business and enterprise customers, to store their sensitive data using cloud storage.
REFERENCES
[17] V. Chang and G. Wills, “A model to compare cloud and non-cloud storage of big data,” Future Generation Computer Systems, vol. 57, pp. 56–76, 2016.
[18] R. Kumar and A. K. Bose, “Internet of things and opc ua,” ICNS 2015, p. 52, 2015.
[19] S. Kamara and K. Lauter, Cryptographic Cloud Storage. Canary Islands, Spain: Springer Berlin Heidelberg, January 25-28 2010, pp. 136–149. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-14992-4_13
[20] P. Paola, C. Roberto, B. Alberto, and P. Lorenzo, “Amazon, google and microsoft solutions for iot: Architectures and a performance comparison,” IEEE Access, vol. 8, pp. 5455–5470, 2020.
[21] T. Ye, X. Peng, and J. Hai, “Secure data sharing and search for cloud-edge-collaborative storage,” IEEE Access, vol. 7, pp. 15963–15972, 2019.
[22] Q. Shuaiqing, Z. Qisheng, Z. Qimao, G. Feng, and L. Wenhao, “Hybrid seismic-electrical data acquisition station based on cloud technology and green iot,” IEEE Access, vol. 8, pp. 31026–31033, 2020.
[23] K. P. Chan and J. B. Seung, “Blockchain of finite-lifetime blocks with applications to edge-based iot,” IEEE Internet of Things Journal, vol. 7, no. 9, doi: 10.1109/JIOT.2019.2959599, 2020.
[24] G. Murugaboopathi, C. Chandravathy, and P. Vinoth Kumar, “Study on cloud computing and security approaches,” International Journal of Soft Computing and Engineering (IJSCE), vol. 3, pp. 212–215, 2013.
[25] A. P. Rajan et al., “Evolution of cloud storage as cloud computing infrastructure service,” arXiv preprint arXiv:1308.1303, no. 1, 2013.
[26] V. Chang, Y.-H. Kuo, and M. Ramachandran, “Cloud computing adoption framework: A security framework for business clouds,” Future Generation Computer Systems, vol. 57, pp. 24–41, 2016.
[27] N. Kaaniche and M. Laurent, “Data security and privacy preservation in cloud storage environments based on cryptographic mechanisms,” Computer Communications, vol. 111, pp. 120–141, 2017.
[28] X. Sun, S. Qu, X. Zhu, M. Zhang, Z. Ren, and C. Yang, “Cloud storage architecture achieving privacy protection and sharing,” Appl. Math, vol. 9, no. 3, pp. 1639–1644, 2015.
[29] L. Geng, “The research of digital library mass information storage system architecture,” in International Symposium on Computers & Informatics. Atlantis Press, 2015.
[30] L. Arockiam and S. Monikandan, “Efficient cloud storage confidentiality to ensure data security,” in International Conference on Computer Communication and Informatics. IEEE, Jan 2014, pp. 1–5.
[31] J. Han, W. Susilo, Y. Mu, and J. Yan, “Privacy-preserving decentralized key-policy attribute-based encryption,” IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 11, pp. 2150–2162, 2012.
[32] N. H. Hussein, A. Khalid, and K. Khanfar, “A survey of cryptography cloud storage techniques,” International Journal of Computer Science and Mobile Computing, no. 5, 2016.
[33] C. Wang, K. Ren, W. Lou, and J. Li, “Toward publicly auditable secure cloud data storage services,” IEEE Network, vol. 24, no. 4, pp. 19–24, 2010.
[34] G. Hittu and D. Mayank, “Securing iot devices and securely connecting the dots using rest api and middleware,” in 4th International Conference on Internet of Things: Smart Innovation and Usages (IoT-SIU). IEEE, 2019, pp. 1–6.
[35] Z. Kartit, A. Azougaghe, H. K. Idrissi, M. El Marraki, M. Hedabou, M. Belkasmi, and A. Kartit, “Applying encryption algorithm for data security in cloud storage,” in Advances in Ubiquitous Networking. Springer, 2016, pp. 141–154.
[36] V. Chang and M. Ramachandran, “Towards achieving data security with the cloud computing adoption framework,” IEEE Transactions on Services Computing, vol. 9, no. 1, pp. 138–151, Jan 2016.
[37] Q. Li, J. Ma, R. Li, X. Liu, J. Xiong, and D. Chen, “Secure, efficient and revocable multi-authority access control system in cloud storage,” Computers & Security, vol. 59, pp. 45–59, 2016.
[38] S. Subashini and V. Kavitha, “A survey on security issues in service delivery models of cloud computing,” Journal of Network and Computer Applications, vol. 34, no. 1, pp. 1–11, 2011.
[39] P. Samarati and S. De Capitani di Vimercati, “Cloud security: Issues and concerns,” Encyclopedia on Cloud Computing. Wiley, New York, 2016.
[40] A. Sahai and B. Waters, Fuzzy Identity-Based Encryption. Aarhus, Denmark: Springer Berlin Heidelberg, May 2005, pp. 457–473.
[41] E.-J. Goh, H. Shacham, N. Modadugu, and D. Boneh, “Sirius: Securing remote untrusted storage,” in NDSS, vol. 3, 2003, pp. 131–145.
[42] J. Shen, D. Liu, Q. Liu, B. Wang, and Z. Fu, “An authorized identity authentication-based data access control scheme in cloud,” in 18th International Conference on Advanced Communication Technology (ICACT). IEEE, 2016, pp. 56–60.
[43] L. Zhou, V. Varadharajan, and M. Hitchens, “Achieving secure role-based access control on encrypted data in cloud storage,” IEEE Transactions on Information Forensics and Security, vol. 8, no. 12, pp. 1947–1960, 2013.
[44] S. Yu, C. Wang, K. Ren, and W. Lou, “Achieving secure, scalable, and fine-grained data access control in cloud computing,” in Proceedings of IEEE Infocom. IEEE, 2010, pp. 1–9.
[45] N. Attrapadung, J. Herranz, F. Laguillaumie, B. Libert, E. De Panafieu, and C. Ràfols, “Attribute-based encryption schemes with constant-size ciphertexts,” Theoretical Computer Science, vol. 422, pp. 15–38, 2012.
[46] C. Hong, Z. Lv, M. Zhang, and D. Feng, A Secure and Efficient Role-Based Access Policy towards Cryptographic Cloud Storage. Wuhan, China: Springer Berlin Heidelberg, September 2011, pp. 264–276.
[47] D. Pritam and M. Chatterjee, “Enforcing role-based access control for secure data storage in cloud using authentication and encryption techniques,” Journal of Network Communications and Emerging Technologies (JNCET), vol. 6, no. 4, 2016.
[48] L. Zhou, V. Varadharajan, and M. Hitchens, “Enforcing role-based access control for secure data storage in the cloud,” The Computer Journal, p. bxr080, 2011.
[49] A. Gholami and E. Laure, “Security and privacy of sensitive data in cloud computing: A survey of recent developments,” arXiv preprint arXiv:1601.01498, 2016.
[50] R. Cooper, “Verizon business data breach security blog,” 2008.
[51] Q. Wang, C. Wang, K. Ren, W. Lou, and J. Li, “Enabling public auditability and data dynamics for storage security in cloud computing,” IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 5, pp. 847–859, 2011.
[52] Q. Wang, C. Wang, J. Li, K. Ren, and W. Lou, “Enabling public verifiability and data dynamics for storage security in cloud computing,” in European Symposium on Research in Computer Security. Springer, 2009, pp. 355–370.
[53] N. Ahmed, V. K. Ojha, and A. Abraham, “An ensemble of neuro-fuzzy model for assessing risk in cloud computing environment,” in Advances in Nature and Biologically Inspired Computing. Springer, 2016, pp. 27–36.
[54] N. Khan and A. Al-Yasiri, “Framework for cloud computing adoption: A road map for smes to cloud migration,” arXiv preprint arXiv:1601.01608, 2016.
[55] A. V. Nimkar and S. K. Ghosh, “Router framework for secured network virtualization in data center of iaas cloud,” in Proceedings of 3rd International Conference on Advanced Computing, Networking and Informatics. Springer, 2016, pp. 475–483.
[56] S. Mercyshalinie, G. Madhupriya, S. Vairamani, and S. Velayutham, “Defense against dos attack: Pso approach in virtualization,” in 6th International Conference on Advanced Computing (ICoAC), Dec 2014, pp. 199–204.
[57] K. Benzidane, S. Khoudali, and A. Sekkaki, “Secured architecture for inter-vm traffic in a cloud environment,” in 2nd IEEE Latin American Conference on Cloud Computing and Communications (LatinCloud), Dec 2013, pp. 23–28.
[58] Y. Dong, X. Zhang, J. Dai, and H. Guan, “Hyvi: A hybrid virtualization solution balancing performance and manageability,” IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 9, pp. 2332–2341, Sept 2014.
[59] C. Li, A. Raghunathan, and N. K. Jha, “A trusted virtual machine in an untrusted management environment,” IEEE Transactions on Services Computing, vol. 5, no. 4, pp. 472–483, April 2012.
[60] R. K. Banyal, V. K. Jain, and P. Jain, “Data management system to improve security and availability in cloud storage,” in International Conference on Computational Intelligence and Networks (CINE), Jan 2015, pp. 124–129.
[61] C. W. Chang, P. Liu, and J. J. Wu, “Probability-based cloud storage providers selection algorithms with maximum availability,” in 41st International Conference on Parallel Processing, Sept 2012, pp. 199–208.
[62] B. Mao, S. Wu, and H. Jiang, “Exploiting workload characteristics and service diversity to improve the availability of cloud storage systems,” IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 7, pp. 2010–2021, July 2016.
[63] M. H. Chen, Y. C. Tung, S. H. Hung, K. C. J. Lin, and C. F. Chou, “Availability is not enough: Minimizing joint response time in peer-assisted cloud storage systems,” IEEE Systems Journal, vol. PP, no. 99, pp. 1–11, 2016.
[64] Y. Hua, B. Xiao, X. Liu, and D. Feng, “The design and implementations of locality-aware approximate queries in hybrid storage systems,” IEEE