soCloud: A service-oriented component-based PaaS for managing portability, provisioning, elasticity, and high availability across multiple clouds
Fawaz Paraiso · Philippe Merle · Lionel Seinturier
Received: 12th July, 2013 / Accepted: date
Abstract
Multi-cloud computing is a promising paradigm to support very large scale, worldwide distributed applications. Multi-cloud computing is the usage of multiple, independent cloud environments, which assumes no a priori agreement between cloud providers or third parties. However, multi-cloud computing has to face several key challenges such as portability, provisioning, elasticity, and high availability. Developers will not only have to deploy applications to a specific cloud, but will also have to consider application portability from one cloud to another, and to deploy distributed applications spanning multiple clouds. This article presents soCloud, a service-oriented component-based Platform as a Service (PaaS) for managing portability, elasticity, provisioning, and high availability across multiple clouds. soCloud is based on the OASIS Service Component Architecture (SCA) standard in order to address portability. soCloud provides services for managing provisioning, elasticity, and high availability across multiple clouds. soCloud has been deployed and evaluated on top of ten existing cloud providers: Windows Azure, DELL KACE, Amazon EC2, CloudBees, OpenShift, dotCloud, Jelastic, Heroku, Appfog, and a Eucalyptus private cloud.

Keywords
Multi-cloud computing · Platform as a Service · Portability · Provisioning · Elasticity · High availability · Service Component Architecture
Fawaz Paraiso
Inria Lille - Nord Europe & University Lille 1
LIFL UMR CNRS 8022, France
E-mail: [email protected]

Philippe Merle
Inria Lille - Nord Europe & University Lille 1
LIFL UMR CNRS 8022, France
E-mail: [email protected]

Lionel Seinturier
Inria Lille - Nord Europe & University Lille 1
LIFL UMR CNRS 8022, France
IUF - Institut Universitaire de France
E-mail: [email protected]
Cloud computing builds on established trends for driving the cost out of the delivery of services while increasing the speed and agility with which services are deployed. Virtualization, on-demand deployment, and Internet delivery of services are parts of cloud computing. Cloud computing differentiates itself by changing how we invent, develop, deploy, scale, update, maintain, and pay for applications and the infrastructure on which they run.

Different cloud service providers, based on different technologies, support a large number of cloud services such as Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). Cloud service consumers select what fits their requirements from these cloud services. For instance, requirements can be price, quality of service (QoS), programming language, database, middleware, etc. It is difficult for cloud service consumers to meet all these requirements with a single cloud provider. Multi-cloud computing, the usage of multiple, independent cloud environments which assumes no a priori agreement between cloud providers or third parties, is a promising paradigm to support very large scale, worldwide distributed applications.

However, multi-cloud computing has to face several key challenges: portability, provisioning, elasticity, and high availability. Multi-cloud portability means writing applications once and running them on any cloud. Most existing cloud offerings are exposed through proprietary APIs and limited to a single infrastructure provider. In such situations, vendor lock-in is a primary concern when moving towards a cloud provider. Multi-cloud provisioning refers to the capability to deploy a distributed application spanning multiple cloud providers. Deploying a distributed application in a multi-cloud context is not an easy task. Multi-cloud elasticity refers to the capability to scale applications across multiple clouds.
Currently, there is no convenient way to express specific application elasticity rules for each part of a distributed application as needed. Multi-cloud high availability refers to the degree to which an application is operable across multiple clouds. Cloud provider services can become unavailable due to outages or denials of service. High availability needs to be analysed and set across multiple clouds in order to reduce the probability of outages that could affect services deployed in a single cloud system.

In this article we discuss the design and implementation of soCloud. soCloud is a multi-cloud PaaS that addresses the four key challenges presented previously. soCloud is a distributed PaaS that provides a model for building distributed applications. This model is an extension of the OASIS SCA standard. Our approach to addressing portability and provisioning in a multi-cloud context is the use of the SCA standard. Our elasticity management approach is based on autonomic computing, with the overall aim of creating self-managed elastic multi-cloud applications. High availability is achieved in two ways. Firstly, soCloud provides a multi-cloud load balancer service that fronts traffic for applications deployed across multiple clouds and decides where to route the traffic when cloud nodes fail. Secondly, the soCloud architecture uses redundancy at all levels to ensure that no single component failure in a cloud provider impacts the overall system availability. We describe a way to annotate SCA artifacts with the deployment information needed to optimize the use of services in multiple cloud environments. These annotations also make it possible to express elasticity rules that ensure the appropriate adjustment decisions are made in a timely manner to meet service needs in the presence of cloud service failures.
The soCloud architecture is composed of the following SCA components: service deployer, constraints validator, PaaS deployment, SaaS deployment, load balancer, node provisioning, monitoring, workload manager and controller components. soCloud is deployed and evaluated on ten existing cloud providers: Windows Azure, DELL KACE, Amazon EC2, CloudBees, OpenShift, dotCloud, Jelastic, Heroku, Appfog, and a Eucalyptus private cloud.

The remainder of this article is organized as follows. In Section 2, we discuss the four challenges we addressed for multi-clouds. Next, Section 3 presents the design and implementation of the soCloud platform, and its integration with existing cloud providers. The evaluation of soCloud is discussed in Section 4. Section 5 compares soCloud with the state of the art. Section 6 discusses the limitations of this work, while Section 7 concludes this article and presents future work we intend to address.

IT companies are starting to realize and recognize the benefits and advantages of cloud computing. However, cloud technology maturity is still a concern. This section describes four key challenges for multi-cloud computing: portability, provisioning, elasticity, and high availability.

2.1 Multi-cloud portability

In the cloud computing area, the portability issue should take into account both application and data. Although data portability is an important feature, this article focuses only on application portability. In an emerging and rapidly changing market such as cloud computing, it is easy to create applications that are locked into one vendor's cloud because of the use of proprietary APIs and formats. To avoid this vendor lock-in syndrome, SaaS must be portable on top of various cloud PaaS and IaaS providers. This multi-cloud portability then allows migration from one cloud provider to another in order to take advantage of cheaper prices or better QoS.
However, SaaS portability requires that the runtime support provides a common model to hide the diversity of the underlying PaaS and IaaS. Furthermore, the dominant programming models today have grown increasingly complex. SCA, in contrast, provides a simplified programming model and a unified way to build applications that communicate using a variety of network protocols [28].

To address the challenge of multi-cloud portability, soCloud promotes SCA as the model to design and develop both multi-cloud SaaS applications and the underlying soCloud PaaS.

2.2 Multi-cloud provisioning

Application Provisioning:
Application provisioning includes building and deployment on multiple cloud environments. It requires a consistent methodology and process for modelling how applications are built and provisioned, giving developers the flexibility and choice to use any cloud provider. Application provisioning should deliver business agility and operational efficiency through high-level abstraction and automated provisioning of applications across multiple cloud providers.
Geo-diversity:
The authors of [43] advocate that small data centers, which consume less power, may be more advantageous than large ones, and that geo-diversity tends to better match user demands. Geo-diversity lowers latency to users and increases reliability in the presence of an outage that takes out an entire site. In a legal context, data protection law and confidentiality can lead users to place their data in a specific area. In fact, the location of data can be facilitated or restricted in particular jurisdictions.

Overall, to address the challenge of multi-cloud provisioning, soCloud offers a service to provision applications across multiple cloud providers.

2.3 Multi-cloud elasticity

The management of elasticity can be split into two approaches: fine-grained or coarse-grained. The first one scales resources either by horizontal scaling (adding more virtual machines (VMs) or devices to the computing platform to handle an increased application load) or by vertical scaling (adding more CPU, memory, disk, or bandwidth to handle an increased application load), depending on the application's memory, storage, network bandwidth and CPU requirements. The second one manages resource scalability by changing cloud providers. Indeed, when outages occur with one cloud provider, coarse-grained elasticity switches to another cloud provider. Fine-grained elasticity, in contrast, can be made up of many fine-grained resources. In managing elasticity across multiple clouds, automation is a mandatory requirement, and it is thus a foundational design principle [23]. The function of any autonomic capability is a control loop that collects details from the system and acts accordingly. However, developers should have the possibility to define specific elasticity rules on their services.
For example, developers can specify constraints on the response time depending on the number of users currently accessing the provided service.

To address the challenge of multi-cloud elasticity, soCloud offers an autonomic service which provides a global mechanism to manage elasticity across multiple clouds and also offers the possibility to define application-specific elasticity rules.

2.4 Multi-cloud high availability

A series of news reports [22,42] and papers [3,36] have pointed out several cloud provider outages. According to a recent report by the International Working Group on Cloud Computing Resiliency (http://iwgcr.org), a total of 568 hours of downtime at thirteen well-known cloud services since 2007 caused financial damage of more than US$71.7 million. The average unavailability of cloud services is 7.5 hours per year, amounting to an availability rate of 99.9%, according to the group's preliminary results. These results are far from the expected reliability of mission-critical systems, which is 99.999%. As a comparison, the average unavailability of electricity in a modern capital city is less than 15 minutes per year [30]. Besides this economic impact, the downtime also affects millions of end-users. Of course, downtime costs money and causes damage; unfortunately, protecting systems against downtime with 99.999% availability is not free.

To address the challenge of multi-cloud high availability despite outages, soCloud provides high availability in two ways. Firstly, for applications deployed with the soCloud platform, high availability is ensured by a load balancer service which distributes requests among instances of the application deployed on multiple cloud providers. Secondly, the soCloud architecture uses redundancy at all levels to ensure that no single component failure in a cloud provider impacts the overall system availability.
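The availability figures quoted in Section 2.4 follow from simple arithmetic on yearly downtime. The short sketch below makes that conversion explicit; it assumes a year of 8760 hours (leap years ignored), and the function names are ours, introduced only for illustration. Under that assumption, 7.5 hours of downtime per year corresponds to about 99.91% availability, while a 99.999% target leaves roughly 5.3 minutes of downtime per year.

```python
# Assumption: a non-leap year of 365 days.
HOURS_PER_YEAR = 365 * 24  # 8760


def availability(downtime_hours_per_year: float) -> float:
    """Fraction of the year a service is up, given its yearly downtime."""
    return 1.0 - downtime_hours_per_year / HOURS_PER_YEAR


def allowed_downtime_minutes(availability_rate: float) -> float:
    """Yearly downtime budget (in minutes) implied by an availability rate."""
    return (1.0 - availability_rate) * HOURS_PER_YEAR * 60


# Figures taken from the text: 7.5 h/year for cloud services,
# 15 min/year for electricity, 99.999% for mission-critical systems.
print(f"cloud services: {availability(7.5):.4%}")
print(f"electricity:    {availability(15 / 60):.4%}")
print(f"five nines:     {allowed_downtime_minutes(0.99999):.2f} min/year")
```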
In this section we present the design and implementation of the soCloud platform. We first discuss background elements of SCA and FraSCAti. Next, we describe some components of the soCloud architecture and its implementation. Finally, we describe how the soCloud platform is deployed on existing IaaS/PaaS providers.

3.1 SCA

soCloud is based on the SCA standard. SCA is a set of OASIS specifications for building distributed applications and systems using Service-Oriented Architecture (SOA) principles [15]. SCA promotes a vision of Service-Oriented Computing (SOC) where services are independent of implementation languages (Java, Spring, BPEL, C++, COBOL, C, etc.), networked service access technologies (Web Services, JMS, etc.), interface definition languages (WSDL, Java, etc.) and non-functional properties. Component-Based Design [11] and SOA are two major software engineering approaches widely used for structuring systems. SCA targets the composition of services in SOA systems and is thus suitable for building enterprise and cross-enterprise applications from already-developed components and services.

3.2 FraSCAti

Several open source implementations of the SCA specifications exist. Three of the most well known are Apache Tuscany, Fabric3 and FraSCAti. Compared to Tuscany and Fabric3, FraSCAti introduces reflective capabilities to the SCA programming model, and allows dynamic introspection and reconfiguration via a specialization of the Fractal component model [6]. FraSCAti provides a component-based approach to support the heterogeneous composition of various interface definition languages (WSDL, Java), implementation technologies (Spring, EJB, BPEL, OSGi, Jython, JRuby, XQuery, Groovy, Velocity, FScript, BeanShell), and binding technologies (Web Services, JMS, RPC, REST, RMI, UPnP).

soCloud is built on top of FraSCAti. FraSCAti is the execution environment of both the soCloud PaaS and the soCloud applications deployed on top of this multi-cloud PaaS.

3.3 soCloud SaaS applications
Application specification

soCloud applications are built using the SCA model. As illustrated in Fig. 1, the basic SCA building blocks are software components, which provide services, require references and expose properties. References and services are connected by wires. For SCA references, a binding describes the access mechanism used to invoke a remote service. In the case of services, a binding describes the access mechanism that clients use to invoke the service. We describe how SCA can be used to package SaaS applications. The first requirement is that the package must describe and contain all artifacts needed for the application. The second requirement is that provisioning constraints and elasticity rules must be described in the package. The SCA assembly model specification describes how SCA and non-SCA artifacts (such as code files) are packaged. The central unit of deployment in SCA is a contribution. A contribution is a package that contains implementations, interfaces and other artifacts necessary to run components. The SCA packaging format is based on ZIP files; however, other packaging formats are explicitly allowed. Fig. 1 shows how a three-tier application is packaged as a ZIP file (SCA contribution) and describes its architecture.
Fig. 1 An annotated soCloud application.
Annotations

Some cloud-based applications require a more detailed description of their deployment (cf. Fig. 1). The deployment and monitoring of soCloud applications are bound by a description of the overall software system architecture and the requirements of the underlying components, which we refer to as the application manifest. Basically, the application manifest describes which components the application is composed of, together with the functional and non-functional requirements for deployment. In fact, the application can be composed of multiple components (cf. Fig. 1). The application manifest defines elasticity rules for the service components (e.g., increase/decrease the number of instances of a component). Commonly, scaling up or down is translated into a condition-action statement that reasons on performance indicators of the deployed component. In order to fulfill the requirements for the soCloud application descriptor, we propose to annotate the SCA components with the four following annotations:

1. placement constraint (@location) maps components of a soCloud application to available physical hosts within a geographical datacenter in multi-cloud environments.
2. computing constraint (@vm) provides the necessary computing resources defined for components of a soCloud application in multi-cloud environments.
3. replication (@replication) specifies the number of instances of the component that must be deployed in multi-cloud environments.
4. elasticity rule (@elasticity) defines a specific elasticity rule that should be applied to the component deployed on multi-cloud environments.

For example, let us consider the three-tier web application described in Fig. 1. The annotation (@location=France) of the frontend component indicates to deploy this component on a cloud provider located in France.
Next, the annotation (@vm=medium) on the computing component specifies the kind of computing resources required by this component, which can be deployed on any cloud provider. Through the @vm annotation, the developer can specify the computing resources (micro, small, medium, large) she needs. Finally, the annotations (@location=Norway and @replication=2) on the storage component indicate to deploy this component on two different cloud providers located in Norway. soCloud automates the deployment of this three-tier application in a multiple cloud environment by respecting the given annotations.

3.4 Constraint analysis and formulation

To express constraints (placement, computation, etc.) and define specific elasticity rules, we analyse each step of the formulation of these constraints. To express a constraint we use the formula P = {n, v}, where n indicates the name of the constraint and v the value of the constraint. Regarding the elasticity rule, we use R = {c, a}, where c indicates the condition and a the resulting action.

The placement can be a location or a provider name. For example, specifying @placement="Amazon Ireland" or @placement="Ireland" on a component has the same interpretation (i.e., the component should be placed in Ireland) from the point of view of the placement constraint.

To express different computation capacities, we use the instance type taxonomy defined by Amazon EC2. An example of a computation constraint request is P = {vm, medium}. This request has name "vm" and value "medium", which represents the computing capacity.

However, in a heterogeneous multi-cloud environment, existing types of VM provided by different clouds can have small differences. In order to use these clouds and hide these differences, the soCloud platform defines a high-level abstraction where similar VMs are classified in the same type.
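The classification just described, combined with a computation constraint of the form P = {vm, medium}, can be sketched as follows. This is only an illustration under stated assumptions: the classifier, its memory thresholds, the provider names (ProviderA, ProviderB) and the catalogue values are all hypothetical and are not taken from soCloud or from any real provider.

```python
from dataclasses import dataclass
from typing import List, Optional

# The four soCloud VM types named in the text.
VM_TYPES = ("micro", "small", "medium", "large")


@dataclass(frozen=True)
class VmOffer:
    """One provider-specific VM offering (all values below are made up)."""
    provider: str
    name: str
    memory_gb: float


def classify(offer: VmOffer) -> str:
    """Hypothetical classifier: bucket an offering into a soCloud type.

    The memory thresholds are illustrative assumptions, not soCloud's rules.
    """
    if offer.memory_gb < 1.0:
        return "micro"
    if offer.memory_gb < 3.0:
        return "small"
    if offer.memory_gb < 6.0:
        return "medium"
    return "large"


def matching_offers(offers: List[VmOffer], vm_type: str,
                    provider: Optional[str] = None) -> List[VmOffer]:
    """Offers satisfying P = {vm, vm_type}, optionally pinned to a provider."""
    if vm_type not in VM_TYPES:
        raise ValueError(f"unknown VM type: {vm_type}")
    return [o for o in offers
            if classify(o) == vm_type
            and (provider is None or o.provider == provider)]


# Hypothetical catalogue: similar VMs from two providers share one type.
CATALOGUE = [
    VmOffer("ProviderA", "a.medium", 4.0),
    VmOffer("ProviderB", "std-2", 3.75),
    VmOffer("ProviderB", "std-4", 8.0),
]

medium_anywhere = matching_offers(CATALOGUE, "medium")
medium_on_b = matching_offers(CATALOGUE, "medium", provider="ProviderB")
```

Pinning the request to a provider name, as in the last line, mirrors the combination of a computing constraint with a placement constraint; without it, any provider offering a VM of the requested type is a candidate.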
Let T be the set of VM types defined in the soCloud platform (see Equation 1) and C the characteristics of the VMs provided by different clouds (see Equation 2). Equation 3 defines how the soCloud platform hides the differences between the VMs provided by different clouds.

$$T = \{micro, small, medium, large\} \tag{1}$$

$$C = \{C_i,\; i \in Provider\} \tag{2}$$

$$VM_{type} = \{C : f(C_i),\; f(C_i) \in T,\; i \in Provider\} \tag{3}$$

Amazon EC2 instance types: http://aws.amazon.com/ec2/instance-types/

The soCloud platform offers developers the opportunity to choose a specific VM (with the best performance) by indicating, as additional information, the provider name. Overall, choosing a specific VM refers to the combination of the @vm and @placement annotations, where @placement corresponds to the provider name.

3.5 A soCloud application descriptor

Let us consider the three-tier application architecture described in Fig. 1, where we need to deploy a distributed application. This distributed application is packaged as a contribution that contains three contributions and one file (application descriptor) describing the architecture of the distributed application. Each contained contribution corresponds to one tier of the distributed application. If the deployed application is not distributed, the contribution contains a single contribution with a composite file. This distributed application has placement, elasticity, and computation requirements. In order to fulfill these requirements, we use SCA properties to express them (see Listing 1). Lines 3, 9, and 18 correspond to our SCA extension defined to represent respectively the frontend, computing and storage contributions. The placement constraints for the frontend and storage components are expressed at Lines 6 and 20 respectively. The computing constraint is expressed at Line 12 for the computing component. The number of replicas of the storage component is expressed at Line 21. Lines 13-15 express the elasticity rule for the computing component.
We adopt an event-condition-action approach for rule specification. The event-condition syntax is an Event Processing Language statement [25]. Basically, the elasticity rule and action defined at Lines 13-15 mean: when the average response time of the component exceeds 4 seconds, add a new virtual machine running this component.

<composite name="DistributedApplication">
  <component name="frontend">

Listing 1 A soCloud application descriptor.

The soCloud architecture consists of two parts: the soCloud master and the soCloud agent. This partitioning provides flexibility for deploying the soCloud PaaS across a highly distributed multi-cloud environment. Firstly, the soCloud master consists of a set of eight components. This part of the architecture focuses on the intelligence processing of soCloud. Secondly, the soCloud agent is used to host, execute and monitor soCloud applications. This part provides the necessary services for managing a set of applications and resources. soCloud agents work with the soCloud master and run in different cloud infrastructures. All communication between a soCloud master and the deployed applications is mediated by the soCloud agent.

Fig. 2 Overview of the soCloud Architecture.
The monitoring component provides a unified, platform-independent mechanism that collects, aggregates and reports details (such as health and performance metrics) about applications deployed on multiple cloud environments. It provides information about currently executing processes as well as the system on which the monitoring service is running. The monitoring component captures any change in the state of the application. For each application deployed on the soCloud platform, the monitoring component maintains a temporary table (ResponsivenessEvent) that collects information such as the application responseTime, number of requests, etc. The metrics collected in a time interval are sent to the Workload Manager component for analysis. The monitoring component acts at three levels: Operating System (OS), Java Virtual Machine (JVM), and Execution Environment (FraSCAti). The monitoring component exposes services via REST and JMS to monitor a distributed environment. The consumer of these services is the workload manager component, but can also be any external application running outside soCloud. Each application deployed with the soCloud PaaS is automatically monitored. However, with some cloud providers such as Salesforce.com or Google App Engine, our monitoring component could not work. For example, Google App Engine forbids the use of JMX.
The Workload Manager (WM) component provides event processing functionality [16]. All events are processed to extract drift indicators (DI). An example of a DI is a CPU consumption greater than 90% for a period of 2 minutes. The WM is centered on DI tracking and performs filtering, transformation, and most importantly aggregation of events. All the metrics (event data) sent by monitoring components are continuously analyzed in terms of drift indicators that are expressed by event rules, and the WM acts upon opportunities and threats in real time, potentially by creating derived events. One of the WM's major goals is to find a symptom and analyse it to find its root cause. The WM uses a technique called event correlation to examine symptoms and identify groups of symptoms that have a common root cause. As an example of event correlation, the WM takes multiple occurrences of the same event, examines them for duplicate information, removes redundancies and reports them as a single event. When a drift occurs, the WM reports it to the controller component. The ability to derive instant insights into the operations of the resource provisioning is essential. Thus, the capability to dynamically allocate and dispose of resources is an important ingredient for building a platform for elastic applications.

Given the events received from an inbound monitoring component, how can events be woven together to pull out the right information? This is accomplished through Complex Event Processing (CEP). To achieve this, we use DiCEPE, a Distributed Complex Event Processing Engine we have presented in [35]. The particularity of DiCEPE is the integration of CEP engines in distributed systems, and the fact that they can be exposed via various communication protocols. DiCEPE integrates the Esper engine for further processing. We use Esper because of its performance and because the metric-value pairs are delivered as events each time their values change between measurements.
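To make the notion of a drift indicator concrete, the sketch below checks the example given above (CPU consumption greater than 90% for a period of 2 minutes) over a stream of timestamped samples. It is a plain-Python stand-in written for illustration, not the Esper/DiCEPE rule used by soCloud; the function name, the sample format (timestamp in seconds, CPU percentage) and the assumption that samples arrive in time order are ours.

```python
from typing import Iterable, Optional, Tuple


def detect_cpu_drift(samples: Iterable[Tuple[float, float]],
                     threshold: float = 90.0,
                     duration_s: float = 120.0) -> Optional[float]:
    """Return the timestamp at which CPU usage has stayed above
    `threshold` for at least `duration_s` seconds, or None.

    `samples` are (timestamp_seconds, cpu_percent) pairs in time order.
    """
    streak_start = None  # timestamp of the first sample of the current streak
    for timestamp, cpu in samples:
        if cpu > threshold:
            if streak_start is None:
                streak_start = timestamp
            if timestamp - streak_start >= duration_s:
                return timestamp  # drift detected: report to the controller
        else:
            streak_start = None  # streak broken, start over
    return None


# Hypothetical measurements taken every 30 seconds.
measurements = [(0, 95), (30, 96), (60, 50), (90, 95),
                (120, 97), (150, 93), (180, 99), (210, 92)]
drift_at = detect_cpu_drift(measurements)
```

In this run the dip at 60 seconds resets the streak, so the drift is only reported once the load has stayed above the threshold from 90 seconds to 210 seconds.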
http://tinyurl.com/qdrcpm3
CEP: Computing that performs operations on complex events, including reading, creating, transforming, or abstracting them [25].

The controller component provides the mechanisms that construct the actions needed to achieve goals and objectives. For example, it multiplexes workloads onto an existing infrastructure, and allows for on-demand allocation of resources to workloads. The system state is managed by the controller component. By state, we mean the information retained in one component that is meaningful for this component (for example, a table on each instance of the LB that associates network addresses with the symbolic names of available hosts). The system state offers the potential for improving the consistency and reliability of the system. For components to work together effectively, they must agree on common goals and coordinate their actions. This requires each part to know something about the others. For example, the node provisioning component stores a table of available resources: if the developer wants to deploy an application on a resource, the controller can notify the node provisioning component to allocate new resources for the application when the available resources are not sufficient. The second potential advantage of the system state is reliability. If information is replicated at several cloud providers and one of the copies is lost due to a failure, then it may be possible to use one of the other copies to recover the lost information. In contrast to the workload manager and node provisioning components, the controller takes the decisions in the system. The controller component is self-adaptive in order to respond in a coherent and timely manner to changes in the environment and to failures of components.

All requests handled by a controller component are processed as transactions. The transaction engine is implemented for the specific needs of the soCloud architecture. Each transaction is created and managed by a coordinator.
Two well-known problems of concurrent transactions can be mentioned: i) lost updates and ii) inconsistent retrievals. To avoid these problems we use a serially equivalent execution of transactions [12]. The use of serial equivalence as a criterion for correct concurrent execution prevents the occurrence of lost updates and inconsistent retrievals. The controller component is the core of the elasticity management; it is made to tolerate failures by the use of redundant components.

The process illustrated by the sequence diagram in Fig. 3 describes how the tasks vary with a service deployment scenario. The Service Deployer (SD) component is responsible for handling the additional information needed to coordinate and manage the service across multiple clouds (i.e., placement, binding, service management). The SD component decomposes and captures the constraints (specified by the developer) of the service. For example, a constraint can be the placement of an application, the resource capacities needed by an application, or defined elasticity rules. In the case where the constraints expressed on the components are fulfilled by multiple cloud providers, soCloud randomly chooses one of the providers offering the lowest price. To perform the deployment, the SD component captures the constraints defined and validates them with the Constraint Validator component. Once the validation is done, the SD component deploys the whole application; this corresponds to sequences 4 to 6 in Fig. 3, with the support of the SaaS Deployment, PaaS Deployment, and Node Provisioning components.

Fig. 3 Application deployment sequence diagram.

This component deploys the contribution package in three steps:
1. Validates the contribution package by checking that it contains at least one ZIP file and one composite file.
2. Uses the constraint validator to validate the SCA properties defined in the composite file.
3. Matches each constraint or elasticity rule defined in the composite file and invokes the corresponding execution operation:
Node provisioning, PaaS deployment, SaaS deployment.

3.7 Elasticity specification

In this section, we describe how the soCloud architecture automates the specific elasticity rules associated with soCloud applications.

soCloud manages elasticity at the IaaS and PaaS levels in the same manner. Elasticity management in soCloud does not target resources specific to one cloud layer (IaaS or PaaS); instead, it refers to resources through the abstractions provided by the NP component, which offers a uniform way to manage resources from both IaaS and PaaS. soCloud can scale the resources allocated to an application as needed. For example, soCloud can add more nodes if it detects a degradation of application performance. Conversely, if resources are underused, they must be resized. This feature is managed as a feedback control loop by the soCloud platform. For specific cases, however, the developer should be able to define automatic elasticity rules associated with the application. These rules are defined inside the application architecture and supervised by the soCloud platform. Each rule is composed of a condition, or a set of conditions, to be monitored. These specific elasticity rules are also managed by the soCloud feedback control loop.

To achieve elasticity, we need to keep track of the frequency of requests to the hosted resources and to the applications deployed on them. We therefore use a proactive scheme that relies on the current workload arrival rate to detect overload conditions. We measure the incoming workload rate by monitoring the number of user connections opened in the load balancer component. To maintain hit statistics for frequently accessed applications, we dynamically compute an exponentially weighted moving average of request inter-arrival times, along the same lines as TCP computes its estimated round-trip time [24]. Specifically, we compute the average inter-arrival time using the following formula:

$$ f(t) = (1 - \alpha) \, f(t-1) + \alpha \, \bigl( \delta t(t) - \delta t(t-1) \bigr) \qquad (4) $$

The arrival time of every hit is represented by δt(t). The constant α is a smoothing factor that puts more weight on recent samples than on old samples. We have used a value of α = 0.125, which is recommended for TCP (http://tools.ietf.org/html/rfc2988).

To detect overloads and underloads in the soCloud platform, we use a threshold-based scheme to trigger dynamic allocation. Note that the thresholds derived from Equation 4 vary from one deployed application to another.
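The moving average of Equation 4 and the threshold test can be sketched as follows. This is a minimal illustration, not the soCloud implementation: the names `InterArrivalEstimator` and `classify`, the threshold parameters, and the choice to seed the estimate with the first observed sample are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

ALPHA = 0.125  # smoothing factor quoted in the text (value recommended for TCP)


@dataclass
class InterArrivalEstimator:
    """Exponentially weighted moving average of request inter-arrival times."""
    alpha: float = ALPHA
    estimate: Optional[float] = None      # f(t), in seconds
    last_arrival: Optional[float] = None  # arrival time of the previous hit

    def observe(self, arrival: float) -> Optional[float]:
        """Record the arrival time of one hit and return the updated f(t)."""
        if self.last_arrival is not None:
            sample = arrival - self.last_arrival  # delta_t(t) - delta_t(t-1)
            if self.estimate is None:
                # Assumption: the first sample seeds the average.
                self.estimate = sample
            else:
                self.estimate = (1 - self.alpha) * self.estimate + self.alpha * sample
        self.last_arrival = arrival
        return self.estimate


def classify(estimate: Optional[float], overload_below: float, underload_above: float) -> str:
    """Hypothetical threshold rule: short inter-arrival times mean a high request rate."""
    if estimate is None:
        return "unknown"
    if estimate < overload_below:
        return "overload"
    if estimate > underload_above:
        return "underload"
    return "normal"
```

Because the thresholds are application-specific, they are passed as parameters rather than fixed in the sketch.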
Fig. 4 Conceptual view of a soCloud deployment.

In the first step, a soCloud master is deployed in dotCloud. In the second step, the soCloud master (deployed in dotCloud) dynamically deploys another soCloud master in CloudBees. Automatically, the first soCloud master becomes the leader and the second one the follower. The soCloud master leader is active, while the soCloud master follower is passive. By active, we mean that the soCloud master processes the operations in the system. By passive, we refer to the standby soCloud master used as a replica. At this stage, only the soCloud master and its replica are deployed. Finally, the soCloud master leader provisions a new cloud node on which it deploys both the execution environment (FraSCAti) and a soCloud agent. When the soCloud agent is deployed, it uses a service discovery mechanism to find the Workload Manager component to which the collected information should be sent. Periodically, the service discovery checks whether the Workload Manager component is reachable, in order to update the services table when a failure occurs on the target soCloud master. The soCloud PaaS service discovery mechanism is implemented using Google Fusion Table [19] and the FraSCAti dynamic multiple-reference binding. We use Google Fusion Table to persist the state of the active master, and with the dynamic multiple reference provided by FraSCAti we add a reference to the new component on the fly. By state, we mean the operational state of the soCloud master. soCloud provides reliability by using sources of state that are external to soCloud itself. Typically, this is done with Google Fusion Table. The soCloud platform provides a mechanism called "health checking" by which a component reports its health. This mechanism is implemented as an XML push mechanism which tests whether a component is reachable. Both the soCloud master and agent need the execution environment (FraSCAti) to be running. When the system grows (the number of applications or the load increases), the third step is repeated.

3.9 Fail-overs

In this section we describe how the soCloud PaaS ensures high availability at two levels: the soCloud level and the application level.

soCloud level
The active soCloud master is called the leader and the passive soCloud master is called the follower. Electing a leader allows the system to indicate which soCloud master is in charge of execution decisions. The soCloud master leader and follower are synchronized such that when the leader fails, a leader election is automatically organized to elect a new leader. Specifically, we use
wait-free synchronization, which is appropriate for fault-tolerant and real-time applications [18]. If the system administrator has defined only one replica of the soCloud master, the soCloud master follower is automatically elected. Otherwise, the election is organized among the soCloud master followers. The leader election is organized and supervised by the controller component. We assume that each component has a reachable latency. The reachable latency is obtained by pinging one component from another. Ping refers to the ability to have a live component connection. Our leader election algorithm is simple: it ensures that the component with the minimum reachable latency is elected as the leader. However, the soCloud platform is not restricted to this algorithm; the system administrator can define another one (e.g., the Chang-Roberts algorithm [9] or the Malpani algorithm [26]), according to her requirement. Elections are held between two entities that have the same function (e.g., two monitoring components, two workload manager components, etc.). The controller component then organizes an election in order to compare the reachable latencies. With this strategy, all the components of the soCloud master leader are in the same cloud and the follower components are in another. When the soCloud master follower fails, the soCloud master leader automatically deploys a new soCloud master follower.
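The minimum-latency election rule described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the soCloud controller code: the function name `elect_leader`, the use of `None` to mark a candidate whose ping failed, and the alphabetical tie-break are hypothetical choices.

```python
from typing import Mapping, Optional


def elect_leader(latencies: Mapping[str, Optional[float]]) -> Optional[str]:
    """Return the candidate with the minimum reachable latency.

    `latencies` maps a candidate name to its measured ping latency in
    milliseconds, or to None when the candidate did not answer the ping.
    """
    reachable = {name: ms for name, ms in latencies.items() if ms is not None}
    if not reachable:
        return None  # no live candidate: no leader can be elected
    # Iterating over sorted names makes ties deterministic (assumption):
    # min() keeps the first candidate among those with equal latency.
    return min(sorted(reachable), key=lambda name: reachable[name])
```

For example, `elect_leader({"cloudbees": 42.0, "openshift": 17.5, "heroku": None})` selects `"openshift"`, since it is the reachable candidate with the lowest latency.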
Application level
As with soCloud replication, the developer can define the number of instances to be deployed for an application. Each application deployed with soCloud is replicated in different clouds. The fail-over mechanism is carried out by the LB component. When one instance of the application fails, the Controller component decides to instantiate a new one.

Overall, fail-over automation in the soCloud platform enables our system to recover quickly from most outages. In addition, we monitor our system for a variety of error conditions. With these two levels of availability, the soCloud PaaS addresses the high availability challenge presented in Section 2.4.

3.10 Recovery

In this section we describe the method used by the soCloud PaaS for fault tolerance, i.e., check-pointing. A checkpoint is a snapshot of the state of a process, saved on nonvolatile storage to survive process failures [41]. A checkpoint can be local to a process or global to the system. The soCloud PaaS uses a global checkpoint. We use Google Fusion Table [19] to record a global state of the system so that, in the event of failure, the entire system can be rolled back to the global checkpoint and restarted. To record the global state, soCloud uses the coordinated checkpoint method [5]. Uncoordinated checkpointing has some disadvantages compared with coordinated checkpointing schemes [18]. First, for coordinated checkpointing it is sufficient to keep just the most recent global state snapshot in stable storage, whereas uncoordinated checkpointing requires a more complex processing scheme. Moreover, in the case of a failure, the recovery method for coordinated checkpointing is simpler.

Fig. 5 soCloud deployment with ten cloud providers.

The deployment is done with IaaS/PaaS providers, as illustrated in Fig. 5.
With IaaS, resources are provisioned from Windows Azure, DELL KACE, Amazon EC2, and our Eucalyptus private cloud, on which we installed a PaaS stack composed of a Linux distribution, a Java Virtual Machine, a web container and FraSCAti. soCloud is also deployed as a WAR file on PaaS such as CloudBees, OpenShift, dotCloud, Jelastic, Heroku, and Appfog (http://socloud.soceda.cloudbees.net).

In this section, we evaluate three key aspects of the soCloud platform: elasticity, high availability and the overhead introduced by soCloud. First, Section 4.1 describes a use case scenario. Then, Section 4.2 evaluates the reaction of soCloud when faced with flash crowd effects (i.e., the elasticity of soCloud). The flash crowd effect, also called the Slashdot effect, results from a sudden increase in request traffic. Section 4.3 evaluates the soCloud behavior against failures (i.e., the high availability of soCloud). Finally, Section 4.4 evaluates the overhead introduced by soCloud.

4.1 Use case

We describe a scenario that can be used in a multi-cloud environment, and briefly explain its requirements.

Let us consider a motivating scenario in which a company has built a device called "Fuel optimiser" in charge of reducing the fuel consumed by vehicles (car, boat, tractor, lorry, etc.). To improve the quality of its products, the company analyses metrics (fuel consumption per km) collected from vehicles. At the end of each trip, the vehicle sensors send metrics to a company application via REST messages. The application must meet the following requirements:
 – The application must be close to vehicles (geo-diversity).
 – Unpredictable and unlimited growth of the number of vehicles.
 – Peaks and unpredictable workloads.

To address these challenges, the architecture of this application and the infrastructure need to be flexible, highly available, well performing, reliable and scalable. The application uses a three-tier model: the vehicle sensors are directly connected to the frontend tier, the middle tier analyses the collected metrics, and the storage tier stores the metrics in a database. The application described in this scenario is used for the evaluation of soCloud elasticity and high availability.

Fig. 6 The series of two flash crowd effects. (a) Effective number of requests during the evolution of the scenario. (b) Response time experienced by clients during the flash crowd effect without soCloud elasticity. (c) Number of requests failed during the two phases of the flash crowd effect. (d) Response time experienced by clients during the flash crowd effect with soCloud elasticity.

We conducted an analysis of two cases: (a) the application is deployed without the soCloud elasticity mechanism, and (b) the application is deployed with the soCloud elasticity mechanism.

In the first case, we observed the behavior of this application without elasticity capability under high request load. Each request triggers an operation that consists of analysing the metrics collected by a vehicle and storing the results in a database. To that end, we configured httperf [32] to create 50,000 connections, with 10 requests per connection and a number of new connections created per second varying between 10 and 150; this corresponds to a total of 3,020,000 requests. Fig. 6 (a) shows the number of requests achieved by the application with two phases of a flash crowd effect, and Fig. 6 (b) shows the corresponding response time (computed as the number of operations performed). During the two phases of the flash crowd effect, the average response time is 65.90 seconds. Fig. 6 (a) and 6 (b) show a sudden load surge caused by the flash crowd effect. We note that the response time rises together with the number of requests. Fig. 6 (c) shows the corresponding number of failed requests. During the flash crowd effect, 1.13% of requests failed, precisely 34,039 requests. These request errors are due to the processing timeout, which we set to 5 seconds for each request. This timeout means that the lack of any server activity on the TCP connection for this duration is considered to be an error.

Overall, when the application becomes saturated, it suffers from performance failures and causes long response delays. We observe that the application can sustain the request rate only up to a certain limit, which directly depends on the number of requests in a time interval.
In the second case, we studied the evolution of the response time during the two phases of the flash crowd effect when soCloud elasticity is activated.

We assume that a resource (VM) is preallocated and that a soCloud agent is deployed inside it. Fig. 6 (d) shows the results of the same experiment when using the soCloud elasticity mechanism. We initially observe some contention at the source of the application as the response time decreases. During the first phase of the flash crowd effect, the average response time is 37.30 seconds. The soCloud platform detected the rising peak within 300 ms. After 4 seconds, the soCloud platform replicates the application into another soCloud agent and updates the load balancer table to balance the load across the different instances of the application. This reaction appears clearly in Fig. 6 (d), where the application replication is performed. The soCloud load balancer dispatches the requests between the two instances of the application, and the response time remains small despite the high traffic. During the second phase of the flash crowd effect, the application was already deployed, and the soCloud platform again detected the rising peak within 300 ms. As shown in Fig. 6 (d), we do not observe a rising peak during the second phase of the flash crowd effect, and the average response time is 23.38 seconds. The relatively small response time during the second phase of the flash crowd is due to the fact that the soCloud platform has already replicated the application.

Overall, at the peak of the flash crowd, all the requests are served with zero failures and a relatively acceptable response time; the soCloud platform allows the application to scale further with a better quality of service. These results demonstrate that the soCloud platform deals well with elasticity across multiple cloud providers.

4.3 soCloud behavior against failures

We perform all our evaluations with the application described in the previous sections. To show the behavior of soCloud over time as failures are injected, we deploy soCloud as described in Fig. 4. soCloud is deployed on ten clouds. The soCloud master is replicated to tolerate more faults. The leader and follower of the soCloud master are deployed on dotCloud and CloudBees, respectively. soCloud agents are deployed on Amazon EC2, Windows Azure, DELL KACE, OpenShift, Jelastic, Heroku, Appfog, and our Eucalyptus private cloud.
As described in Section 3, the deployment of soCloud consists of the deployment of both a soCloud master and several agents.

soCloud master
The deployment of a soCloud master is done by deploying both leader and follower instances on two different clouds to ensure high availability. The deployment on each cloud consists of deploying the execution environment (FraSCAti) with the soCloud master. The deployment of a soCloud master takes about […].

soCloud agent
We measure the time for the deployment of one soCloud agent. The deployment consists of deploying the execution environment (FraSCAti) with the soCloud agent. The deployment of a soCloud agent takes about […].

Overall, the average time taken to deploy soCloud with two masters (leader and follower) and one agent is about […].

We assume that soCloud is running and that our scenario application is deployed. To simulate a failure, we stop the soCloud master leader in dotCloud. In our observations, soCloud takes about […] on average to recover and become operational. soCloud takes less than 200 ms to elect a new leader. The recovery process is performed as follows. First, the soCloud master follower becomes the leader after the election and rolls back the system. Then, a new soCloud master follower is deployed on another cloud. Finally, the soCloud agent automatically discovers the new soCloud master leader. According to [30,1], the average Mean Time To Recovery (MTTR) for public clouds is 7.5 hours. By comparison, the recovery time of soCloud is only 0.06 hour, as shown in Table 1.
Table 1 MTTR results

|               | MTTR (hours) |
|---------------|--------------|
| soCloud       | 0.06         |
| Public clouds | 7.5          |
Failure and recovery of a soCloud master follower
In this case, we simulate the failure of a soCloud master follower in CloudBees; the soCloud master leader detects the failure automatically. The soCloud master leader takes about […] to elect and start a new master follower.

Downtime of an application deployed on soCloud
The failure of an application deployed with soCloud does not affect its availability. When a failure occurs, the load balancer takes about […] ms to detect it and switch automatically to another deployed instance of the application.
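The switch performed by the load balancer can be illustrated with a small round-robin dispatcher that skips unhealthy instances. This is only a sketch under stated assumptions: the class `FailoverBalancer`, the `is_healthy` callback, and the instance names are hypothetical and do not reflect the actual soCloud LB component.

```python
from typing import Callable, Optional, Sequence


class FailoverBalancer:
    """Round-robin dispatcher that skips instances reported as unhealthy."""

    def __init__(self, instances: Sequence[str], is_healthy: Callable[[str], bool]) -> None:
        self._instances = list(instances)
        self._is_healthy = is_healthy  # hypothetical health-check callback
        self._next = 0                 # index of the next instance to try

    def pick(self) -> Optional[str]:
        """Return the next healthy instance, or None if all instances are down."""
        count = len(self._instances)
        for offset in range(count):
            index = (self._next + offset) % count
            candidate = self._instances[index]
            if self._is_healthy(candidate):
                self._next = (index + 1) % count
                return candidate
        return None
```

A short usage example, with one instance marked as failed:

```python
down = {"app@ec2"}
balancer = FailoverBalancer(["app@ec2", "app@azure"], lambda name: name not in down)
print(balancer.pick())  # app@azure, because app@ec2 is skipped
```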
Downtime of a soCloud agent
The failure of a single soCloud agent does not affect the availability of the application deployed on soCloud. The soCloud load balancer redirects the requests to another instance of the application. Deploying and starting the soCloud agent still takes about […]. However, the deployment time of the applications that were on the platform depends on the size of these applications.
Let us consider the availability equation below [27,39]:

$$ \text{Availability} = \frac{MTBF}{MTBF + MTTR} \qquad (5) $$

As Equation 5 shows, the longer the MTTR is, the worse off a system is. The formula illustrates how both the Mean Time Between Failures (MTBF) and the MTTR impact the overall availability of a system. As the MTTR goes up, availability goes down. To compare the availability of soCloud and public clouds, we must assume the same MTBF for both. We therefore assume an MTBF of 8760 hours, i.e., one year. The availability is calculated in Table 2.
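As a quick check of Equation 5, the following sketch recomputes the availability from the assumed MTBF of 8760 hours and the MTTR values reported in Table 1 (0.06 hour for soCloud, 7.5 hours for public clouds). The function name `availability` is chosen for the example only.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Equation 5: fraction of time during which the system is available."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


MTBF_HOURS = 8760.0  # one year, as assumed in the text

for name, mttr_hours in (("soCloud", 0.06), ("Public clouds", 7.5)):
    percent = availability(MTBF_HOURS, mttr_hours) * 100
    print(f"{name}: {percent:.3f}%")
# soCloud: 99.999%
# Public clouds: 99.914%
```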
Table 2 Availability comparison

|               | Availability                    |
|---------------|---------------------------------|
| soCloud       | 8760 / (8760 + 0.06) = 99.999%  |
| Public clouds | 8760 / (8760 + 7.5) = 99.914%   |

Overall, as shown in Table 2, the availability of public clouds is 99.914%. By comparison, the soCloud availability is 99.999%. This result is close to the expected reliability of mission-critical systems (cf. Section 2.4). The soCloud platform thus increases availability. This result demonstrates that soCloud ensures high availability across multiple clouds.

4.4 Overhead introduced by soCloud

In order to analyse the overhead introduced by the soCloud platform, we deployed our use case application both directly on CloudBees and through the soCloud platform. We packaged two different archive files. The first archive file is a WAR file of 50.7 Mb. This file contains the application and the FraSCAti execution environment. The second archive file is a Zip file (an SCA contribution) of 2.1 Mb. The second file contains only the application. The WAR and Zip files are deployed on CloudBees and soCloud, respectively. The deployment of the WAR and Zip files is performed ten times. Table 3 reports the average deployment time of each file.
Table 3 Deployment time of the Zip and WAR files

| Implementation                      | File size | Avg. deploy. time |
|-------------------------------------|-----------|-------------------|
| Zip file (Application)              | 2.1 Mb    | […] ms            |
| WAR file (Application + FraSCAti)   | 50.7 Mb   | […] ms            |

As can be seen, the deployment time of the application directly on CloudBees is greater than the deployment time on soCloud. This is explained by the size of the WAR file, which is larger than the Zip file: uploading a small file over the network is faster than uploading a large one. When the Zip file is deployed on the soCloud platform, the execution environment is already deployed and started. This is not the case for the
WAR file, which contains the FraSCAti execution environment that must be installed and instantiated on the CloudBees PaaS before the application is deployed on it.

To evaluate the overhead introduced by the soCloud platform, […] requests were generated and sent with the httperf tool. We evaluate two implementations of this scenario: i) the application without soCloud, and ii) the application with soCloud. The scenario was executed ten times on each of the two implementations. Table 4 presents the average execution time for each implementation, as well as the mean overhead introduced by the soCloud platform.

Table 4 Execution time and overhead

| Implementation                     | Avg. exec. time | soCloud overhead |
|------------------------------------|-----------------|------------------|
| Application + FraSCAti             | […] sec         | -                |
| Application + FraSCAti + soCloud   | […] sec         | […]              |

From the results presented in Table 4, we can see that the overhead introduced by the soCloud platform is […]. This overhead is generated by the soCloud monitoring and load balancing components. The overhead of the monitoring component is due to the information collected for elasticity.

Overall, the abstraction provided by the soCloud platform is not free, because it introduces an overhead of […]. However, the benefits provided by the soCloud platform in a multi-cloud environment outweigh the difference in execution time.

With respect to the Inter-Cloud architectural taxonomy presented in [20], soCloud can be classified into the Multi-Cloud service category. This section presents some of the related work on the multi-cloud computing challenges discussed in Section 2: portability, provisioning, elasticity, and high availability across multiple clouds.

Multi-cloud portability
Portability approaches can be classified into three categories [33]: functional portability, data portability and service enhancement. The authors of mOSAIC [37] deal with service enhancement portability at the IaaS and PaaS levels. mOSAIC provides a component-based programming model with asynchronous communication. However, mOSAIC APIs are not standardized and are complex to put to work in practice. Our soCloud solution deals with service enhancement portability with an API that runs on existing PaaS and IaaS. soCloud supports both the synchronous and asynchronous communications offered by the SCA standard. Moreover, SCA defines an easy way to use a portable API. The Cloud4SOA [14] project deals with portability between PaaS using a semantic approach. soCloud intends to provide portability using an API based on the SCA standard.

Multi-cloud provisioning
There is a great deal of research on dynamic resource allocation for physical and virtual machines and for clusters of virtual machines [2]. Work on dynamic resource provisioning in cloud computing may be classified into two categories. The authors in [31] have addressed the problem of provisioning resources at the granularity of VMs. Other authors [10] have considered the provisioning of resources at a finer granularity. In our work, we consider provisioning at both the VM level and a finer granularity of resources.

The authors in [17] have addressed the problem of deploying a cluster of virtual machines with given resource configurations across a set of physical machines, while [13] defines a Java API permitting developers to monitor and manage a cluster of Java VMs and to define resource allocation policies for such clusters. Unlike [17,13], soCloud uses both application-centric and virtual machine approaches. Using knowledge of application workload and performance goals combined with server usage, soCloud utilizes a more versatile set of automation mechanisms.
Multi-cloud elasticity
Managing elasticity across multiple cloud providers is a challenging issue. Although managing elasticity across multiple clouds would be beneficial when outages occur, few solutions support it. For instance, in [7], the authors present a federated cloud infrastructure approach to provide elasticity for applications; however, they do not take into account elasticity management when outages occur. Another approach was proposed in [40], which manages elasticity with both a controller and a load balancer. However, this solution does not address the management of elasticity across multiple cloud providers. The authors in [29] propose a resource manager to manage application elasticity. However, their approach is specific to a single cloud provider.
Multi-cloud high availability
Cloud providers such as Amazon EC2, Windows Azure, and Jelastic already provide a load balancer service within a single cloud to distribute load among virtual machines. However, they do not provide load balancing across multiple cloud providers. Different approaches to dynamic load balancing have been proposed in the literature [8,21]; however, they do not provide a mechanism to scale the load balancers themselves. The authors in [38] have explored agile ways to quickly reassign resources. However, their approach does not take into account a multi-cloud environment. Most existing membership protocols [4] employ a consensus algorithm to achieve agreement on the membership. Achieving consensus in an asynchronous distributed system is impossible without the use of timeouts to bound the time within which an action must take place. Even with the use of timeouts, achieving consensus can be relatively costly in the number of messages transmitted and in the delays incurred. To avoid such costs, soCloud uses a novel Leader Determined Membership Protocol that does not involve the use of a consensus algorithm.

soCloud is a PaaS that aggregates multiple clouds. Throughout the article, we have essentially discussed the advantages of soCloud. On the one hand, soCloud may miss some features that are provided by the underlying clouds. In other words, soCloud may not exploit the provider-specific features (i.e., elasticity rules, provisioning properties, replication triggers) that it does not itself provide. On the other hand, some developers or companies may not want to use an SCA-based approach, so the adoption of soCloud can become an issue. One approach for soCloud to address these concerns is to use a wrapper that enables transparent access to cloud provider features. As an SCA-based approach, soCloud offers a solution to deploy and execute service-oriented applications. It would be useful in future work for soCloud to overcome the constraint of supporting only SCA-based applications. When the soCloud platform is deployed on another PaaS, the scaling mechanism of that platform is not used by our platform, in order to avoid duplicated mechanisms.

The cloud platforms and the features they provide, especially at the PaaS level, are evolving dynamically. In general, maintaining the mappings to the various cloud providers and managing this evolution to keep up with recent features of our supported clouds is a concern. A common way to address these issues is to wrap them as soCloud features. However, relying on a future standard for cloud computing remains the best approach.

soCloud provides an abstraction to hide the heterogeneity and the complexity of the underlying clouds. The solution provided by soCloud can introduce an additional cost (i.e., in terms of performance and footprint) to existing IaaS/PaaS environments. However, soCloud provides a uniform way to deploy, execute and manage applications in multi-cloud environments. As benefits, the developer focuses on the cloud rather than on troubleshooting implementations, exploits multi-cloud portability, and manages her applications efficiently across multiple clouds. Compared with the heterogeneous approaches offered by the various IaaS/PaaS solutions, soCloud provides many benefits.
In this article, we have proposed soCloud, a service-oriented component-based PaaS for managing portability, provisioning, elasticity, and high availability across multiple clouds. soCloud is a distributed PaaS that provides a model for building multi-cloud SaaS applications. This model is based on an extension of the OASIS SCA standard. We surveyed the concepts related to expressing specific elasticity rules and ensuring high availability across multiple clouds, and pointed out the open problems. To address these problems, this article proposes an architecture and describes the interactions between the components of this architecture. We explain how the components in a soCloud application descriptor can be annotated with elasticity rules, placement constraints, and computation constraints. Based on these annotations, deployable contributions can be loaded and deployed in a suitable manner. The article described the approach used by the soCloud platform to ensure high availability. In particular, soCloud takes a wait-free approach to the problem of coordinating components in different clouds and uses a load balancer to switch from one application instance to another in case of failures. By comparing the availability of soCloud with that of public clouds [30], we demonstrate that soCloud restores availability in minutes instead of hours. We analyse the flash crowd phenomenon on a use case, and demonstrate how the soCloud platform increases the elasticity of the application. This approach is proactive in that content replication is performed when a traffic surge is detected, in anticipation of a flash crowd.

As future work, we plan to continue our research in the following directions. First, soCloud currently manages application components as contribution files in terms of packaging and deployment. The archive that is referred to by implementation.contribution may be an artifact within a larger contribution (i.e., an EAR or WAR file inside a larger ZIP file), or the archive may itself be a contribution. soCloud will therefore manage and deploy all Java EE archives (WAR, EAR). Second, we will investigate how the concept of aggregated multiple clouds can be used to reduce the resource provisioning cost while maintaining the Quality of Service (QoS) for customers who use the resources. Third, as many organizations need to move data from one cloud to another, we will work on data portability in a multi-cloud environment.
This work is partially funded by the ANR (French National Research Agency) ARPEGE SocEDA project and the EU FP7 PaaSage project.
References
1. Lessons Learned from Recent Cloud Outages (2013). http://tinyurl.com/qz5maey
2. Anedda, P., Leo, S., Manca, S., Gaggero, M., Zanetti, G.: Suspending, Migrating and Resuming HPC virtual clusters. Future Generation Computer Systems (8), 1063–1072 (2010)
3. Armbrust, M., Fox, A., Griffith, R., Joseph, A.D., Katz, R., Konwinski, A., Lee, G., Patterson, D., Rabkin, A., Stoica, I., et al.: A view of cloud computing. Communications of the ACM (4), 50–58 (2010)
4. Birman, K.P., Van Renesse, R., et al.: Reliable distributed computing with the Isis toolkit, vol. 85. IEEE Computer Society Press, Los Alamitos (1994)
5. Bouteiller, A., Lemarinier, P., Krawezik, K., Capello, F.: Coordinated checkpoint versus message log for fault tolerant MPI. In: Cluster Computing, 2003. Proceedings. 2003 IEEE International Conference on, pp. 242–250. IEEE (2003)
6. Bruneton, E., Coupaye, T., Leclercq, M., Quéma, V., Stefani, J.B.: The Fractal component model and its support in Java: Experiences with Auto-adaptive and Reconfigurable Systems. Softw. Pract. Exper. (11-12), 1257–1284 (2006)
7. Buyya, R., Ranjan, R., Calheiros, R.: Intercloud: Utility-oriented federation of cloud computing environments for scaling of application services. Algorithms and architectures for parallel processing, pp. 13–31 (2010)
8. Cardellini, V., Colajanni, M., Yu, P.: Dynamic load balancing on web-server systems. Internet Computing, IEEE (3), 28–39 (1999)
9. Chang, E., Roberts, R.: An improved algorithm for decentralized extrema-finding in circular configurations of processes. Communications of the ACM (5), 281–283 (1979)
10. Chase, J.S., Anderson, D.C., Thakar, P.N., Vahdat, A.M., Doyle, R.P.: Managing energy and server resources in hosting centers. In: ACM SIGOPS Operating Systems Review, vol. 35, pp. 103–116. ACM (2001)
11. Chen, Z., Liu, Z., Stolz, V., Yang, L., Ravn, A.P.: A refinement driven component-based design. In: Engineering Complex Computer Systems, 2007. 12th IEEE International Conference on, pp. 277–289. IEEE (2007)
12. Coulouris, G., Dollimore, J., Kindberg, T.: Distributed systems: concepts and design. Addison-Wesley Longman (2005)
13. Czajkowski, G., Wegiel, M., Daynes, L., Palacz, K., Jordan, M., Skinner, G., Bryce, C.: Resource management for clusters of virtual machines. In: Cluster Computing and the Grid, 2005. CCGrid 2005. IEEE International Symposium on, vol. 1, pp. 382–389. IEEE (2005)
14. Dandria, F., Bocconi, S., Cruz, J.G., Ahtes, J., Zeginis, D.: Cloud4SOA: Multi-Cloud Application Management Across PaaS Offerings. In: Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), 2012 14th International Symposium on, pp. 407–414. IEEE (2012)
15. Erl, T.: SOA: principles of service design, vol. 1. Prentice Hall, Upper Saddle River (2008)
16. Etzion, O., Niblett, P.: Event Processing in Action. Manning Publications Co. (2010)
17. Foster, I., Freeman, T., Keahy, K., Scheftner, D., Sotomayer, B., Zhang, X.: Virtual clusters for grid communities. In: Cluster Computing and the Grid, 2006. CCGRID 06. Sixth IEEE International Symposium on, vol. 1, pp. 513–520. IEEE (2006)
18. Garg, V.K.: Concurrent and distributed computing in Java. Wiley-IEEE Press (2005)
19. Gonzalez, H., Halevy, A.Y., Jensen, C.S., Langen, A., Madhavan, J., Shapley, R., Shen, W., Goldberg-Kidon, J.: Google fusion tables: web-centered data management and collaboration. In: Proceedings of the 2010 international conference on Management of data, pp. 1061–1066. ACM (2010)
20. Grozev, N., Buyya, R.: Inter-Cloud Architectures and Application Brokering: Taxonomy and Survey. Software: Practice and Experience (2012). DOI 10.1002/spe.2168. http://dx.doi.org/10.1002/spe.2168
21. Harchol-Balter, M., Downey, A.: Exploiting process lifetime distributions for dynamic load balancing. ACM Transactions on Computer Systems (TOCS) (3), 253–285 (1997)
22. InfoWorld: The 10 worst cloud outages (and what we can learn from them). http://tinyurl.com/br9ck4a
23. Isard, M.: Autopilot: automatic data center management. ACM SIGOPS Operating Systems Review (2), 60–67 (2007)
24. Karn, P., Partridge, C.: Improving round-trip time estimates in reliable transport protocols. ACM SIGCOMM Computer Communication Review (5), 2–7 (1987)
25. Luckham, D., Schulte, R.: Event Processing Glossary - Version 1.1. Processing (July), 1–19 (2008). http://complexevents.com/wp-content/uploads/2008/08/epts-glossary-v11.pdf
26. Malpani, N., Welch, J.L., Vaidya, N.: Leader election algorithms for mobile ad hoc networks. In: Proceedings of the 4th international workshop on Discrete algorithms and methods for mobile computing and communications, pp. 96–103. ACM (2000)
27. Marcus, E., Stern, H.: Blueprints for high availability. Wiley (2003)
28. Marino, J., Rowley, M.: Understanding SCA (Service Component Architecture). Addison-Wesley Professional (2010)
29. Marshall, P., Keahey, K., Freeman, T.: Elastic site: Using clouds to elastically extend site resources. In: Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, pp. 43–52. IEEE Computer Society (2010)
30. Gagnaire, M., Diaz, F., Coti, C., Cerin, C., Shiozaki, K., Xu, Y., Delort, P., Smets, J.P., Le Lous, J., Lubiarz, S., Leclerc, P.: Downtime statistics of current cloud solutions (2012)
31. Mietzner, R., Leymann, F.: Towards provisioning the cloud: On the usage of multi-granularity flows and services to realize a unified provisioning infrastructure for SaaS applications. In: Services-Part I, 2008. IEEE Congress on, pp. 3–10. IEEE (2008)
32. Mosberger, D., Jin, T.: httperf - a tool for measuring web server performance. SIGMETRICS Perform. Eval. Rev. (3), 31–37 (1998). DOI 10.1145/306225.306235. http://doi.acm.org/10.1145/306225.306235
33. Oberle, K., Fisher, M.: ETSI CLOUD - initial standardization requirements for cloud services. In: Economics of Grids, Clouds, Systems, and Services, pp. 105–115. Springer (2010)
34. Paraiso, F., Haderer, N., Merle, P., Rouvoy, R., Seinturier, L.: A Federated Multi-Cloud PaaS Infrastructure. In: 5th IEEE International Conference on Cloud Computing, pp. 392–399. Hawaii, United States (2012). DOI 10.1109/CLOUD.2012.79. http://hal.inria.fr/hal-00694700
35. Paraiso, F., Hermosillo, G., Rouvoy, R., Merle, P., Seinturier, L.: A Middleware Platform to Federate Complex Event Processing. In: Sixteenth IEEE International EDOC Conference, pp. 113–122. Springer, Beijing, China (2012). http://hal.inria.fr/hal-00700883
36. Paraiso, F., Merle, P., Seinturier, L.: Managing Elasticity Across Multiple Cloud Providers. In: 1st International workshop on multi-cloud applications and federated clouds. Prague, Czech Republic (2013). http://hal.inria.fr/hal-00790455
37. Petcu, D., Macariu, G., Panica, S., Crăciun, C.: Portable Cloud applications - From theory to practice. Future Generation Computer Systems (2012)
38. Qian, H., Miller, E., Zhang, W., Rabinovich, M., Wills, C.E.: Agility in virtualized utility computing. In: Virtualization Technology in Distributed Computing (VTDC), 2007 Second International Workshop on, pp. 1–8. IEEE (2007)
39. Torell, W., Avelar, V.: Mean time between failure: Explanation and standards. White Paper (2004)
40. Vaquero, L., Rodero-Merino, L., Buyya, R.: Dynamically scaling applications in the cloud. ACM SIGCOMM Computer Communication Review (1), 45–52 (2011)
41. Wang, Y.M.: Consistent global checkpoints that contain a given set of local checkpoints. Computers, IEEE Transactions on (4), 456–468 (1997)
42. Zdnet: Amazon cloud down; Reddit, Github, other major sites affected (2012). http://tinyurl.com/95kmk8y
43. Zhang, Q., Cheng, L., Boutaba, R.: Cloud computing: state-of-the-art and research challenges. Journal of Internet Services and Applications 1