
Publication


Featured research published by Ricardo Jiménez-Peris.


IEEE Transactions on Parallel and Distributed Systems | 2012

StreamCloud: An Elastic and Scalable Data Streaming System

Vincenzo Gulisano; Ricardo Jiménez-Peris; Marta Patiño-Martínez; Claudio Soriente; Patrick Valduriez

Many applications in domains such as telecommunications, network security, and large-scale sensor networks require online processing of continuous data flows. These applications produce very high loads that require aggregating the processing capacity of many nodes. Current stream processing engines do not scale with the input load due to single-node bottlenecks. Additionally, they are based on static configurations that lead to either under- or over-provisioning. In this paper, we present StreamCloud, a scalable and elastic stream processing engine for processing large data stream volumes. StreamCloud uses a novel parallelization technique that splits queries into subqueries that are allocated to independent sets of nodes in a way that minimizes the distribution overhead. Its elastic protocols exhibit low intrusiveness, enabling effective adjustment of resources to the incoming load. Elasticity is combined with dynamic load balancing to minimize the computational resources used. The paper presents the system design, implementation, and a thorough evaluation of the scalability and elasticity of the fully implemented system.
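To make the elasticity idea concrete, here is a minimal Python sketch of threshold-based provisioning for a hash-partitioned subquery. It is only an illustration under assumed names (Subcluster, elastic_step, the load thresholds), not StreamCloud's actual protocol, which also migrates operator state when the node set changes.

```python
# Toy model of two ideas from the abstract: key-based routing of a subquery
# across a subcluster, and a threshold-based elasticity rule. Not StreamCloud.
from collections import defaultdict

class Subcluster:
    """A set of nodes running one subquery; tuples are routed by key hash."""
    def __init__(self, nodes):
        self.nodes = list(nodes)        # hypothetical node identifiers
        self.load = defaultdict(int)    # tuples routed to each node this window

    def route(self, key):
        node = self.nodes[hash(key) % len(self.nodes)]
        self.load[node] += 1
        return node

    def average_load(self):
        return sum(self.load.values()) / len(self.nodes)

def elastic_step(cluster, spare_nodes, upper=400, lower=100):
    """Toy elasticity rule: provision a node when the average per-node load is
    high, decommission one when it is low. A real protocol would also migrate
    the state of the keys that change owner."""
    avg = cluster.average_load()
    if avg > upper and spare_nodes:
        cluster.nodes.append(spare_nodes.pop())
    elif avg < lower and len(cluster.nodes) > 1:
        spare_nodes.append(cluster.nodes.pop())
    cluster.load.clear()                # start a new measurement window

if __name__ == "__main__":
    cluster = Subcluster(["n1", "n2"])
    spares = ["n3", "n4"]
    for i in range(5000):               # synthetic input stream
        cluster.route(key=i % 97)
        if i and i % 1000 == 0:
            elastic_step(cluster, spares)
    print("nodes in use:", cluster.nodes)
```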


ACM Transactions on Computer Systems | 2005

MIDDLE-R: Consistent database replication at the middleware level

Marta Patiño-Martínez; Ricardo Jiménez-Peris; Bettina Kemme; Gustavo Alonso

The widespread use of clusters and Web farms has increased the importance of data replication. In this article, we show how to implement consistent and scalable data replication at the middleware level. We do this by combining transactional concurrency control with group communication primitives. The article presents different replication protocols, argues their correctness, describes their implementation as part of a generic middleware, Middle-R, and proves their feasibility with an extensive performance evaluation. The solution proposed is well suited for a variety of applications including Web farms and distributed object platforms.
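As a concrete illustration of combining transactional concurrency control with group communication, here is a minimal, generic Python sketch of certification-based replication over a total-order broadcast. The class and conflict rule are hypothetical; this is the general pattern, not Middle-R's specific protocol.

```python
# Every replica delivers write sets in the same order and runs the same
# deterministic conflict check, so all replicas commit or abort identically.
# Illustrative only; not Middle-R's actual protocol.

class Replica:
    def __init__(self):
        self.versions = {}       # item -> version of the last committed write
        self.committed = []

    def certify_and_apply(self, txn_id, read_versions, write_set):
        # Deterministic certification: abort if an item the transaction read
        # was overwritten by a transaction delivered earlier in total order.
        for item, version in read_versions.items():
            if self.versions.get(item, 0) != version:
                return "abort"
        for item, value in write_set.items():
            self.versions[item] = self.versions.get(item, 0) + 1
        self.committed.append(txn_id)
        return "commit"

def total_order_broadcast(replicas, txn):
    """Stand-in for a group communication primitive: deliver the same
    message to every replica in the same order."""
    return [r.certify_and_apply(*txn) for r in replicas]

if __name__ == "__main__":
    replicas = [Replica(), Replica(), Replica()]
    t1 = ("T1", {"x": 0}, {"x": "a"})    # read x at version 0, write x
    t2 = ("T2", {"x": 0}, {"x": "b"})    # conflicts with T1 -> aborts everywhere
    print(total_order_broadcast(replicas, t1))   # ['commit', 'commit', 'commit']
    print(total_order_broadcast(replicas, t2))   # ['abort', 'abort', 'abort']
```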


International World Wide Web Conference | 2006

WS-replication: a framework for highly available web services

Jorge Salas; Francisco Perez-Sorrosal; Marta Patiño-Martínez; Ricardo Jiménez-Peris

Due to the rapid acceptance and fast spread of web services, a number of mission-critical systems will be deployed as web services in the coming years. The availability of those systems must be guaranteed in the presence of failures and network disconnections. An example of web services for which availability will be a crucial issue are those belonging to the web service coordination infrastructure, such as web services for transactional coordination (e.g., WS-CAF and WS-Transaction). These services should remain available despite site and connectivity failures to enable business interactions on a 24x7 basis. One common technique for attaining availability is clustering. However, in an Internet setting a domain can get partitioned from the network due to a link overload or some other connectivity problem. The unavailability of a coordination service impacts the availability of all the partners in the business process. That is, coordination services are an example of critical components that need higher provisions for availability. In this paper, we address this problem by providing an infrastructure, WS-Replication, for WAN replication of web services. The infrastructure is based on a group communication web service, WS-Multicast, that respects web service autonomy. The transport of WS-Multicast is based on SOAP and relies exclusively on web service technology for interaction across organizations. We have replicated WS-CAF using our WS-Replication framework and evaluated its performance.
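The sketch below illustrates the general pattern of actively replicating a service invocation: multicast the request to all replicas and return the first reply. It is only a plain-Python stand-in for the idea; WS-Multicast itself is SOAP-based and provides ordering and membership guarantees this toy omits. All names here (invoke_replicated, the fast/slow replicas) are hypothetical.

```python
# Client-side proxy for active replication: send the request to every replica
# and return the first answer. Illustrative sketch, not WS-Replication's code.
import concurrent.futures
import time

def invoke_replicated(replicas, operation, payload, timeout=2.0):
    """replicas: plain callables standing in for web service endpoints."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(r, operation, payload) for r in replicas]
    done, _ = concurrent.futures.wait(
        futures, timeout=timeout,
        return_when=concurrent.futures.FIRST_COMPLETED)
    pool.shutdown(wait=False)            # do not wait for slow replicas
    if not done:
        raise TimeoutError("no replica answered in time")
    return next(iter(done)).result()

if __name__ == "__main__":
    # Hypothetical replicas: one is slow, the other answers promptly.
    def fast(op, p): return f"{op}({p}) handled by fast replica"
    def slow(op, p): time.sleep(1.0); return "late reply"
    print(invoke_replicated([slow, fast], "register", {"tx": 42}))
```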


ACM Transactions on Database Systems | 2003

Are quorums an alternative for data replication?

Ricardo Jiménez-Peris; Marta Patiño-Martínez; Gustavo Alonso; Bettina Kemme

Data replication is playing an increasingly important role in the design of parallel information systems. In particular, the widespread use of cluster architectures often requires replicating data for performance and availability reasons. However, maintaining the consistency of the different replicas is known to cause severe scalability problems. To address this limitation, quorums are often suggested as a way to reduce the overall overhead of replication. In this article, we analyze several quorum types in order to better understand their behavior in practice. The results obtained challenge many of the assumptions behind quorum-based replication. Our evaluation indicates that the conventional read-one/write-all-available approach is the best choice for a large range of applications requiring data replication. We believe this is an important result for anybody developing code for computing clusters, as the read-one/write-all-available strategy is much simpler to implement and more flexible than quorum-based approaches. In this article, we show that, in addition, it is also the best choice using a number of other selection criteria.
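The following back-of-the-envelope Python comparison (my illustration, not the article's model) shows the intuition behind the result: for read-dominated workloads, read-one/write-all-available touches far fewer replicas per operation than a majority quorum.

```python
# Average number of replicas accessed per operation under ROWAA versus
# majority quorums, for a workload with a given read fraction.

def accesses_per_op(n_replicas, read_fraction, scheme):
    """Average replicas touched per operation under the given scheme."""
    if scheme == "ROWAA":
        read_cost, write_cost = 1, n_replicas     # reads are served locally
    elif scheme == "majority":
        quorum = n_replicas // 2 + 1               # read and write quorums overlap
        read_cost, write_cost = quorum, quorum
    else:
        raise ValueError(scheme)
    return read_fraction * read_cost + (1 - read_fraction) * write_cost

if __name__ == "__main__":
    for n in (3, 5, 9):
        for scheme in ("ROWAA", "majority"):
            # 80% reads: ROWAA stays below 3 accesses even with 9 replicas.
            print(n, scheme, round(accesses_per_op(n, 0.8, scheme), 2))
```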


International Symposium on Distributed Computing | 2000

Scalable Replication in Database Clusters

Marta Patiño-Martínez; Ricardo Jiménez-Peris; Bettina Kemme; Gustavo Alonso

In this paper, we explore data replication protocols that provide both fault tolerance and good performance without compromising consistency. We do this by combining transactional concurrency control with group communication primitives. In our approach, transactions are executed at only one site so that not all nodes incur the overhead of producing results. To further reduce latency, we use an optimistic multicast technique that overlaps transaction execution with total order message delivery. The protocols we present in the paper provide correct executions while minimizing overhead and providing higher scalability.
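A minimal Python sketch of the optimistic-delivery idea follows: execution starts on tentative delivery and the definitive total order decides whether to commit or roll back and re-execute. The class and method names are hypothetical; this is the general technique, not the paper's protocol.

```python
# Overlap transaction execution with total-order message delivery: execute on
# tentative delivery, commit only if the definitive order confirms it.

class OptimisticReplica:
    def __init__(self):
        self.tentative = []      # transactions started in optimistic order
        self.committed = []

    def opt_deliver(self, txn):
        # Optimistic (tentative) delivery: start executing right away.
        self.tentative.append(txn)
        print(f"executing {txn} optimistically")

    def final_deliver(self, txn):
        # Definitive total-order delivery decides the outcome.
        if self.tentative and self.tentative[0] == txn:
            self.tentative.pop(0)
            self.committed.append(txn)       # optimistic order was correct
            print(f"commit {txn}")
        else:
            if txn in self.tentative:
                self.tentative.remove(txn)   # executed out of order
            print(f"rollback and re-execute {txn} in total order")
            self.committed.append(txn)

if __name__ == "__main__":
    r = OptimisticReplica()
    r.opt_deliver("T1"); r.opt_deliver("T2")
    r.final_deliver("T2")    # total order disagrees -> rollback path
    r.final_deliver("T1")    # matches the remaining tentative order -> commit
```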


International Conference on Distributed Computing Systems | 2010

StreamCloud: A Large Scale Data Streaming System

Vincenzo Gulisano; Ricardo Jiménez-Peris; Marta Patiño-Martínez; Patrick Valduriez

Data streaming has become an important paradigm for the real-time processing of continuous data flows in domains such as finance, telecommunications, and networking. Some applications in these domains require processing massive data flows that current technology is unable to manage, that is, streams that, even for a single query operator, require the capacity of potentially many machines. Research efforts on data streaming have mainly focused on scaling in the number of queries or query operators, but have overlooked scalability with respect to the stream volume. In this paper, we present StreamCloud, a large-scale data streaming system for processing large data stream volumes. We focus on how to parallelize continuous queries to obtain a highly scalable data streaming infrastructure. StreamCloud goes beyond the state of the art by using a novel parallelization technique that splits queries into subqueries that are allocated to independent sets of nodes in a way that minimizes the distribution overhead. StreamCloud is implemented as middleware and is highly independent of the underlying data streaming engine. We explore and evaluate different strategies to parallelize data streaming and tackle the main bottlenecks and overheads to achieve scalability. The paper presents the system design, implementation, and a thorough evaluation of the scalability of the fully implemented system.
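To illustrate why query splitting must be semantics-aware, the sketch below hash-partitions a stream on the group-by key before a stateful aggregate, so each node holds the complete state for its keys and no cross-node merge is required. The names (splitter, windowed_count, NUM_NODES) are hypothetical; this shows the general technique, not StreamCloud's implementation.

```python
# Key-aware splitting before a stateful subquery, so each node's partial
# aggregate is already complete for the keys it owns.
from collections import defaultdict

NUM_NODES = 4

def splitter(tuples, key_field):
    """Route each tuple to the node that owns its group-by key."""
    per_node = defaultdict(list)
    for t in tuples:
        per_node[hash(t[key_field]) % NUM_NODES].append(t)
    return per_node

def windowed_count(partition, key_field):
    """Stateful subquery run independently on each node."""
    counts = defaultdict(int)
    for t in partition:
        counts[t[key_field]] += 1
    return dict(counts)

if __name__ == "__main__":
    stream = [{"caller": f"c{i % 10}", "len": i} for i in range(100)]
    per_node = splitter(stream, "caller")
    results = {node: windowed_count(part, "caller")
               for node, part in per_node.items()}
    # Each caller's count lives entirely on one node; no cross-node merge needed.
    print(results)
```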


International Conference on Distributed Computing Systems | 2002

Improving the scalability of fault-tolerant database clusters

Ricardo Jiménez-Peris; Marta Patiño-Martínez; Bettina Kemme; Gustavo Alonso

Replication has become a central element in modern information systems, playing a dual role: increasing availability and enhancing scalability. Unfortunately, most existing protocols increase availability at the cost of scalability. This paper presents the architecture, implementation, and performance of a middleware-based replication tool that provides both availability and better scalability than existing systems. Its main characteristics are the use of specialized broadcast primitives and efficient data propagation.


Pacific Rim International Symposium on Dependable Computing | 2007

Boosting Database Replication Scalability through Partial Replication and 1-Copy-Snapshot-Isolation

Damián Serrano; Marta Patiño-Martínez; Ricardo Jiménez-Peris; Bettina Kemme

Databases have become a crucial component in modern information systems. At the same time, they have become the main bottleneck in most systems. Database replication protocols have been proposed to solve the scalability problem by scaling out in a cluster of sites. Current techniques have attained some degree of scalability; however, there are two main limitations to existing approaches. Firstly, most solutions adopt a full replication model where all sites store a full copy of the database. The coordination overhead imposed by keeping all replicas consistent allows such approaches to achieve only medium scalability. Secondly, most replication protocols rely on the traditional consistency criterion, 1-copy-serializability, which limits concurrency, and thus the scalability, of the system. In this paper, we first analyze analytically the performance gains that can be achieved by various partial replication configurations, i.e., configurations where not all sites store all data. From there, we derive a partial replication protocol that provides 1-copy-snapshot-isolation as its correctness criterion. We have evaluated the protocol with TPC-W and the results show better scalability than full replication.
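A simple back-of-the-envelope model (my illustration, not the article's analytical study) captures the intuition for the full- versus partial-replication gap: every update must be re-applied at each copy, so storing fewer copies per item reduces the redundant work and raises the achievable scale-out.

```python
# Relative throughput of a replicated cluster under an assumed cost model:
# reads execute at one site, each write is re-applied at `copies` sites.

def scale_out(n_sites, write_fraction, copies):
    """Throughput relative to a single site under the assumptions above."""
    work_per_txn = (1 - write_fraction) + write_fraction * copies
    return n_sites / work_per_txn

if __name__ == "__main__":
    for n in (5, 10, 20):
        full = scale_out(n, write_fraction=0.2, copies=n)      # full replication
        partial = scale_out(n, write_fraction=0.2, copies=2)   # 2 copies per item
        print(f"{n} sites: full={full:.1f}x  partial={partial:.1f}x")
```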


Symposium on Reliable Distributed Systems | 2002

Non-intrusive, parallel recovery of replicated data

Ricardo Jiménez-Peris; Marta Patiño-Martínez; Gustavo Alonso

The increasingly widespread use of cluster architectures has resulted in many new application scenarios for data replication. While data replication is, in principle, a well-understood problem, recovery of replicated systems has not yet received enough attention. In the case of clusters, recovery procedures are particularly important since they have to maintain a high level of availability even during recovery. In fact, recovery is part of the normal operation of any cluster, as the cluster is expected to continue working while sites leave or join the system. However, traditional recovery techniques usually require stopping processing. Once a quiescent state has been reached, the system proceeds to synchronize the state of failed or new replicas. In this paper, we concentrate on how to perform recovery in a replication middleware without having to stop processing. The proposed protocol focuses on how to minimize the redundancies that take place during concurrent recovery of several sites.
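The sketch below illustrates, in hypothetical Python form, the two ingredients of non-intrusive parallel recovery: disjoint partitions are recovered from different donors in parallel, and updates arriving during recovery are buffered per partition instead of stopping processing. It is a simplified illustration, not the paper's protocol.

```python
# Parallel, non-intrusive recovery sketch: split the state into partitions,
# assign them to different donor replicas, and buffer live updates for each
# partition until its snapshot has been transferred.
from collections import defaultdict

def plan_parallel_recovery(partitions, donors):
    """Assign disjoint partitions round-robin across donors so no state is
    transferred twice and donors share the recovery work."""
    plan = defaultdict(list)
    for i, partition in enumerate(partitions):
        plan[donors[i % len(donors)]].append(partition)
    return dict(plan)

class JoiningReplica:
    def __init__(self, partitions):
        self.pending = {p: [] for p in partitions}   # updates buffered during transfer
        self.state = {}

    def on_update(self, partition, update):
        if partition in self.pending:
            self.pending[partition].append(update)   # not yet recovered: buffer
        else:
            self.state[partition] = self.state.get(partition, []) + [update]

    def on_partition_transferred(self, partition, snapshot):
        # Install the snapshot, then apply the updates that arrived meanwhile.
        self.state[partition] = snapshot + self.pending.pop(partition)

if __name__ == "__main__":
    print(plan_parallel_recovery(["P1", "P2", "P3", "P4"], ["siteA", "siteB"]))
    r = JoiningReplica(["P1", "P2"])
    r.on_update("P1", "u1")                       # buffered
    r.on_partition_transferred("P1", ["snap"])    # snapshot + buffered update
    r.on_update("P1", "u2")                       # applied directly now
    print(r.state)
```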


ACM Transactions on Database Systems | 2009

Snapshot isolation and integrity constraints in replicated databases

Yi Lin; Bettina Kemme; Ricardo Jiménez-Peris; Marta Patiño-Martínez; José Enrique Armendáriz-Iñigo

Database replication is widely used for fault tolerance and performance. However, it requires replica control to keep data copies consistent despite updates. The traditional correctness criterion for the concurrent execution of transactions in a replicated database is 1-copy-serializability. It is based on serializability, the strongest isolation level in a nonreplicated system. In recent years, however, Snapshot Isolation (SI), a slightly weaker isolation level, has become popular in commercial database systems. There exist already several replica control protocols that provide SI in a replicated system. However, most of the correctness reasoning for these protocols has been rather informal. Additionally, most of the work so far ignores the issue of integrity constraints. In this article, we provide a formal definition of 1-copy-SI using and extending a well-established definition of SI in a nonreplicated system. Our definition considers integrity constraints in a way that conforms to the way integrity constraints are handled in commercial systems. We discuss a set of necessary and sufficient conditions for a replicated history to be producible under 1-copy-SI. This makes our formalism a convenient tool to prove the correctness of replica control algorithms.
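For readers unfamiliar with SI, the following minimal Python sketch shows the single-copy rule the article lifts to the replicated setting: a transaction reads from its start snapshot and commits only if no concurrent, already-committed transaction wrote an item it also writes (first-committer-wins). The certifier class is hypothetical and omits integrity constraints, which the article treats explicitly.

```python
# Plain snapshot isolation certification (write-write conflict check).
# Illustration only; not the article's 1-copy-SI formalism.

class SICertifier:
    def __init__(self):
        self.commit_counter = 0
        self.last_writer = {}      # item -> commit timestamp of its latest write

    def begin(self):
        return self.commit_counter                 # snapshot timestamp

    def try_commit(self, start_ts, write_set):
        # Abort on a write-write conflict with a transaction committed after start_ts.
        for item in write_set:
            if self.last_writer.get(item, 0) > start_ts:
                return "abort"
        self.commit_counter += 1
        for item in write_set:
            self.last_writer[item] = self.commit_counter
        return "commit"

if __name__ == "__main__":
    cert = SICertifier()
    t1, t2 = cert.begin(), cert.begin()            # concurrent transactions
    print(cert.try_commit(t1, {"x"}))              # commit
    print(cert.try_commit(t2, {"x"}))              # abort: concurrent write on x
    print(cert.try_commit(cert.begin(), {"x"}))    # commit: not concurrent
```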

Collaboration


Dive into Ricardo Jiménez-Peris's collaborations.

Top Co-Authors

Marta Patiño-Martínez | Technical University of Madrid
Ivan Brondino | Technical University of Madrid
Sergio Arévalo | King Juan Carlos University
Vincenzo Gulisano | Chalmers University of Technology
Valerio Vianello | Technical University of Madrid