Florin Isaila
Instituto de Salud Carlos III
Publication
Featured research published by Florin Isaila.
international conference on supercomputing | 2004
Florin Isaila; Guido Malpohl; Vlad Olaru; Gábor Szeder; Walter F. Tichy
This paper presents the integration of two collective I/O techniques into the Clusterfile parallel file system: disk-directed I/O and two-phase I/O. We show that global cooperative cache management improves the collective I/O performance. The solution focuses on integrating disk parallelism with other types of parallelism: memory (by buffering and caching on several nodes), network (by parallel I/O scheduling strategies) and processors (by redistributing the I/O-related computation over several nodes). The performance results show considerable throughput increases over ROMIO's extended two-phase I/O.
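As a rough illustration of the two-phase idea named in this abstract (not Clusterfile's or ROMIO's actual code), the Python sketch below simulates processes that each hold an interleaved set of file bytes. In the first phase the data is shuffled so that each aggregator holds one contiguous file domain; in the second phase each aggregator issues a single contiguous write. The function name `two_phase_write` and the in-memory `bytearray` standing in for the file are assumptions made for this example.

```python
def two_phase_write(requests, file_size, num_aggregators):
    """Simulate a two-phase collective write (illustrative only).

    requests: one dict per process, mapping file offset -> byte value (0..255).
    Returns (file_contents, number_of_contiguous_writes).
    """
    # Split the file into equal contiguous domains, one per aggregator.
    domain = -(-file_size // num_aggregators)  # ceiling division
    buffers = [dict() for _ in range(num_aggregators)]

    # Phase 1 (shuffle): every byte goes to the aggregator owning its offset.
    for proc_requests in requests:
        for offset, value in proc_requests.items():
            buffers[offset // domain][offset] = value

    # Phase 2 (I/O): each aggregator writes its domain in one contiguous access.
    file_contents = bytearray(file_size)
    writes = 0
    for buf in buffers:
        if not buf:
            continue
        start, end = min(buf), max(buf) + 1
        chunk = bytearray(file_contents[start:end])
        for offset, value in buf.items():
            chunk[offset - start] = value
        file_contents[start:end] = chunk
        writes += 1
    return bytes(file_contents), writes


# Four processes, each owning every fourth byte of a 16-byte file.
requests = [{p + 4 * i: p + 1 for i in range(4)} for p in range(4)]
data, writes = two_phase_write(requests, file_size=16, num_aggregators=2)
```

In this toy setup the 16 single-byte requests collapse into two contiguous writes, which is the effect collective I/O is after.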
ieee/acm international symposium cluster, cloud and grid computing | 2011
Juan M. Tirado; Daniel Higuero; Florin Isaila; Jesús Carretero
Workload variations on Internet platforms such as YouTube, Flickr, and LastFM require novel approaches to dynamic resource provisioning in order to meet QoS requirements while reducing the Total Cost of Ownership (TCO) of the infrastructures. The economy-of-scale promise of cloud computing is a great opportunity to approach this problem by developing elastic large-scale server infrastructures. However, a proactive approach to dynamic resource provisioning requires prediction models that forecast future load patterns. On the other hand, unexpected volume and data spikes require reactive provisioning to serve sudden surges in workload. When the workload cannot be predicted, adequate data grouping and placement algorithms may facilitate agile scaling up and down of an infrastructure. In this paper, we analyze a dynamic workload of an on-line music portal and present an elastic Web infrastructure that adapts to workload variations by dynamically scaling servers up and down. The workload is predicted by an autoregressive model capturing trends and seasonal patterns. Further, to enhance data locality, we propose a predictive data grouping based on the history of content access of a user community. Finally, to facilitate agile elasticity, we present a data placement based on workload and access pattern prediction. The experimental results demonstrate that our forecasting model predicts workload with high precision. Further, the predictive data grouping and placement methods provide high locality, load balance and high utilization of resources, allowing a server infrastructure to scale up and down depending on workload.
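To make the notion of autoregressive workload prediction concrete, here is a minimal Python sketch of a first-order autoregressive (AR(1)) forecast feeding a server-count decision. It is an assumption-laden simplification: the paper's model also captures trends and seasonal patterns, which this sketch does not, and the names `fit_ar1`, `forecast`, and `servers_needed` are hypothetical.

```python
import math


def fit_ar1(series):
    """Least-squares fit of (x_t - mean) = phi * (x_{t-1} - mean) + noise."""
    mean = sum(series) / len(series)
    centered = [x - mean for x in series]
    num = sum(a * b for a, b in zip(centered[1:], centered[:-1]))
    den = sum(a * a for a in centered[:-1])
    phi = num / den if den else 0.0  # constant series: no dependence to learn
    return mean, phi


def forecast(series, steps):
    """Predict the next `steps` values by iterating the fitted AR(1) recurrence."""
    mean, phi = fit_ar1(series)
    last = series[-1] - mean
    preds = []
    for _ in range(steps):
        last = phi * last
        preds.append(mean + last)
    return preds


def servers_needed(load, capacity_per_server):
    """Proactive provisioning: smallest server count covering the predicted load."""
    return max(1, math.ceil(load / capacity_per_server))


# Hypothetical hourly request counts; capacity of 100 requests per server.
history = [120, 180, 240, 210, 260, 300]
plan = [servers_needed(load, 100) for load in forecast(history, steps=3)]
```

A production predictor would add higher-order and seasonal lags and validate the forecast error before trusting it for provisioning.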
foundations of computer science | 2001
Florin Isaila; Walter F. Tichy
This paper presents Clusterfile, a parallel file system that provides parallel file access on a cluster of computers. Existing parallel file systems offer little control over matching the I/O access patterns and file data layout. Without this matching, applications may face the following problems: contention at I/O nodes, fragmentation of file data, false sharing, small network messages, and high overhead of scattering/gathering the data. Clusterfile addresses some of these inefficiencies. Parallel applications can physically partition a file in arbitrary patterns. They can also set arbitrary views on a file. Views hide the parallel structure of the file and ease the programmer's burden of computing complex access indices. The intersections between views and layouts are computed by a memory redistribution algorithm. Read and write operations are optimized by pre-computing the direct mapping between access patterns and disks. Clusterfile uses the same data representation for file layouts, access patterns, and the mappings between them.
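The idea of pre-computing a direct mapping between an access pattern and the disks can be sketched as follows. This is only an illustration under stated assumptions: the physical layout is taken to be plain round-robin striping and the view a simple strided pattern, whereas Clusterfile supports arbitrary patterns with its own data representation, which is not reproduced here. All function and parameter names are hypothetical.

```python
def view_to_file_offset(view_offset, start, block, stride):
    """Map an offset inside a strided view to a linear file offset.

    The view exposes `block` contiguous bytes every `stride` bytes,
    beginning at file offset `start`.
    """
    return start + (view_offset // block) * stride + view_offset % block


def file_offset_to_disk(file_offset, stripe_unit, num_disks):
    """Map a linear file offset to (disk, offset on disk) under round-robin striping."""
    stripe = file_offset // stripe_unit
    disk = stripe % num_disks
    disk_offset = (stripe // num_disks) * stripe_unit + file_offset % stripe_unit
    return disk, disk_offset


def precompute_mapping(view_length, start, block, stride, stripe_unit, num_disks):
    """Build the view offset -> (disk, disk offset) table once, before any I/O."""
    return [
        file_offset_to_disk(
            view_to_file_offset(v, start, block, stride), stripe_unit, num_disks
        )
        for v in range(view_length)
    ]


# A view whose stride equals stripe_unit * num_disks lands on a single disk.
mapping = precompute_mapping(
    view_length=4, start=2, block=2, stride=8, stripe_unit=2, num_disks=4
)
```

With these parameters the view matches the layout, so every byte of the view lives contiguously on disk 1; a mismatched stride would scatter the same view across several disks, which is the contention and fragmentation problem the abstract describes.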
cluster computing and the grid | 2008
J.G. Blas; Florin Isaila; David E. Singh; Jesús Carretero
This paper presents the design and implementation of a new file-system-independent collective I/O optimization based on file views: view-based collective I/O. View-based collective I/O has been implemented and evaluated inside the ROMIO implementation of the MPI-IO standard. The evaluation shows that view-based I/O outperforms the original two-phase collective I/O from ROMIO in most cases for three well-known parallel I/O benchmarks. This is mainly due to a smaller cost of scatter/gather operations, a reduction of the metadata overhead, and a smaller number of collective communication and synchronization primitives used in the implementation.
Computer Networks | 2010
Juan M. Tirado; Daniel Higuero; Florin Isaila; Jesús Carretero; Adriana Iamnitchi
Recent years have brought a dramatic increase in the popularity of collaborative Web 2.0 sites. According to recent evaluations, this phenomenon accounts for a large share of Internet traffic and significantly augments the load on the end-servers of Web 2.0 sites. In this paper, we show how collaborative classifications extracted from Web 2.0-like sites can be leveraged in the design of a self-organizing peer-to-peer network in order to distribute data in a scalable manner while preserving high content locality. We propose Affinity P2P (AP2P), a novel cluster-based locality-aware self-organizing peer-to-peer network. AP2P self-organizes in order to improve content locality using a novel affinity-based metric for estimating the distance between clusters of nodes sharing similar content. Searches in AP2P are directed to the cluster of interest, where a logarithmic-time parallel flooding algorithm provides high recall, low latency, and low communication overhead. The order of clusters is periodically changed using a greedy cluster placement algorithm, which reorganizes clusters based on affinity in order to increase the locality of related content. The experimental and analytical results demonstrate that the locality-aware cluster-based organization of content offers substantial benefits, achieving an average latency improvement of 45% and up to a 12% increase in search recall.
IEEE Transactions on Parallel and Distributed Systems | 2011
Florin Isaila; J Garcia Blas; Jesús Carretero; Robert Latham; Robert B. Ross
Parallel applications currently suffer from a significant imbalance between computational power and available I/O bandwidth. Additionally, the hierarchical organization of current Petascale systems contributes to an increase of the I/O subsystem latency. In these hierarchies, file access involves pipelining data through several networks with incremental latencies and higher probability of congestion. Future Exascale systems are likely to share this trait. This paper presents a scalable parallel I/O software system designed to transparently hide the latency of file system accesses to applications on these platforms. Our solution takes advantage of the hierarchy of networks involved in file accesses, to maximize the degree of overlap between computation, file I/O-related communication, and file system access. We describe and evaluate a two-level hierarchy for Blue Gene systems consisting of client-side and I/O node-side caching. Our file cache management modules coordinate the data staging between application and storage through the Blue Gene networks. The experimental results demonstrate that our architecture achieves significant performance improvements through a high degree of overlap between computation, communication, and file I/O.
ieee international conference on high performance computing, data, and analytics | 2011
Juan M. Tirado; Daniel Higuero; Florin Isaila; Jesús Carretero
Infrastructures serving on-line applications experience dynamic workload variations depending on diverse factors such as popularity, marketing, periodic patterns, fads, trends, and events. Some predictable factors such as trends, periodicity or scheduled events allow for proactive resource provisioning in order to meet fluctuations in workloads. However, proactive resource provisioning requires prediction models that forecast future workload patterns. This paper proposes a multi-model prediction approach, in which data are grouped into bins based on content locality, and an autoregressive prediction model is assigned to each locality-preserving bin. The prediction models are shown to be identified and fitted in a computationally efficient way. We demonstrate experimentally that our multi-model approach improves locality over the uni-model approach, while achieving efficient resource provisioning and preserving high resource utilization and load balance.
Archive | 2011
Giulio Giunta; Raffaele Montella; Giuliano Laccetti; Florin Isaila; Francisco Javier García Blas
Numerical models play a major role in the earth sciences, filling the gap between experimental and theoretical approaches. Nowadays, the computational approach is widely recognized as the complement to scientific analysis. Meanwhile, the huge amount of observed/modelled data, and the need to store, process, and refine them, often makes high performance parallel computing the only effective way to keep numerical applications usable, as in the field of atmospheric/oceanographic science, where the development of the Earth Simulator supercomputer [65] is only the leading edge. Grid Computing [38] is a key technology in all the computational sciences, allowing the use of inhomogeneous and geographically spread computational resources, shared across a virtual laboratory. Moreover, this technology offers several invaluable tools for ensuring the security, performance, and availability of applications. A large number of simulation models have been successfully developed in the past, but many of them are poorly engineered and were designed following a monolithic programming approach, unsuitable for a distributed computing environment or for acceleration by GPGPUs [53]. The use of grid computing technologies is often limited to computer science specialists, because of the complexity of the grid itself and of its middleware. Another source of complexity resides in the use of coupled models, as, for example, in the case of atmosphere/seawave/ocean dynamics. The grid-enabling approach could be hampered by the complexity of the grid software and hardware infrastructure. In this context, the build-up of a grid-aware virtual laboratory for environmental applications is a topical challenge for computer scientists. The term “e-Science” usually refers to computationally enhanced science. With the rise of cloud computing technology and on-demand resource allocation, the meaning of e-Science could straightforwardly change to elastic-Science.
The aim of our virtual laboratory is to bridge the gap between the technology push of high performance cloud computing and the pull of a wide range of scientific experimental applications. It provides generic functionalities supporting a wide class of specific e-Science application environments and
parallel processing and applied mathematics | 2011
Raffaele Montella; Giuseppe Coviello; Giulio Giunta; Giuliano Laccetti; Florin Isaila; Javier Garcia Blas
This paper describes GVirtuS (Generic Virtualization Service), a framework for developing split drivers for cloud virtualization solutions. The main goal of GVirtuS is to provide tools for developing elastic computing abstractions for high-performance private and public computing clouds. In this paper we focus our attention on GPU virtualization. However, GVirtuS is not limited to accelerator-based architectures: a virtual high performance parallel file system and an MPI channel are ongoing projects based on our split-driver virtualization technology.
international conference on distributed computing systems | 2012
Jesús Carretero; Florin Isaila; Anne-Marie Kermarrec; François Taïani; Juan M. Tirado
Geolocated social networks, combining traditional social networking features with geolocation information, have grown tremendously over the last few years. Yet very few works have looked at implementing geolocated social networks in a fully distributed manner, a promising avenue to handle the growing scalability challenges of these systems. In this paper, we focus on georecommendation, and show that existing decentralized recommendation mechanisms in fact perform poorly on geodata. We propose a set of novel gossip-based mechanisms to address this problem, in a modular similarity framework called GEOLOGY. The resulting platform is lightweight, efficient, and scalable, and we demonstrate its advantages in terms of recommendation quality and communication overhead on a real dataset of 15,694 users from Foursquare, a leading geolocated social network.