Publication


Featured research published by Gabriel Alatorre.


IBM Journal of Research and Development | 2014

Efficient and agile storage management in software defined environments

Alfredo Alba; Gabriel Alatorre; Christian Bolik; Ann Corrao; Thomas Keith Clark; Sandeep Gopisetty; Robert Haas; Ronen I. Kat; Bryan Langston; Nagapramod Mandagere; Dietmar Noll; Sumant Padbidri; Ramani R. Routray; Yang Song; Chung-Hao Tan; Avishay Traeger

The IT industry is experiencing a disruptive trend for which the entire data center infrastructure is becoming software defined and programmable. IT resources are provisioned and optimized continuously according to a declarative and expressive specification of the workload requirements. The software defined environments facilitate agile IT deployment and responsive data center configurations that enable rapid creation and optimization of value-added services for clients. However, this fundamental shift introduces new challenges to existing data center management solutions. In this paper, we focus on the storage aspect of the IT infrastructure and investigate its unique challenges as well as opportunities in the emerging software defined environments. Current state-of-the-art software defined storage (SDS) solutions are discussed, followed by our novel framework to advance the existing SDS solutions. In addition, we study the interactions among SDS, software defined compute (SDC), and software defined networking (SDN) to demonstrate the necessity of a holistic orchestration and to show that joint optimization can significantly improve the effectiveness and efficiency of the overall software defined environments.
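To make the declarative provisioning idea above concrete, here is a minimal sketch of matching a workload specification to a storage pool. All names, fields, and numbers are invented for illustration and are not from the paper; a real SDS controller would consider far more dimensions (availability, data protection, network placement).

```python
# Hypothetical sketch: provision storage for a declaratively specified
# workload by picking the cheapest pool that meets every requirement.
def provision(workload, pools):
    """Return the cheapest pool satisfying the workload spec, or None."""
    candidates = [
        p for p in pools
        if p["iops"] >= workload["min_iops"]
        and p["capacity_gb"] >= workload["capacity_gb"]
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p["cost_per_gb"])

pools = [
    {"name": "ssd-tier", "iops": 50000, "capacity_gb": 2000, "cost_per_gb": 0.50},
    {"name": "hdd-tier", "iops": 2000, "capacity_gb": 20000, "cost_per_gb": 0.05},
]
workload = {"min_iops": 1000, "capacity_gb": 500}
chosen = provision(workload, pools)  # both pools qualify; hdd-tier is cheaper
```

The workload states *what* it needs, not *which* device to use; the controller resolves that continuously as pools and workloads change.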


International Congress on Big Data | 2013

Storage Mining: Where IT Management Meets Big Data Analytics

Yang Song; Gabriel Alatorre; Nagapramod Mandagere; Aameek Singh

The emerging paradigm shift to cloud-based data center infrastructures poses significant challenges for IT management operations, e.g., due to virtualization techniques and more stringent requirements for cost and efficiency. On one hand, the voluminous data generated by daily IT operations, such as logs and performance measurements, contain abundant information and insights which can be leveraged to assist IT management. On the other hand, traditional IT management solutions cannot consume and exploit the rich information contained in the data due to its daunting volume, velocity, and variety, as well as the lack of scalable data mining and machine learning frameworks to extract insights from such raw data. In this paper, we present our on-going research thrust of designing novel IT management solutions by leveraging big data analytics frameworks. As an example, we introduce our project of Storage Mining, which exploits big data analytics techniques to facilitate storage cloud management. The challenges are discussed and our proof-of-concept big data analytics framework is presented.
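A minimal sketch of the storage-mining idea: aggregate raw per-I/O performance measurements into per-volume statistics and flag outliers for the administrator. The record format and threshold are assumptions for illustration; the paper's framework operates at far larger scale on a big data analytics stack.

```python
from collections import defaultdict

def mine_latency(records, threshold_ms=10.0):
    """Aggregate per-volume I/O latency from raw log records and
    flag volumes whose average latency exceeds a threshold."""
    sums, counts = defaultdict(float), defaultdict(int)
    for vol, latency_ms in records:
        sums[vol] += latency_ms
        counts[vol] += 1
    averages = {v: sums[v] / counts[v] for v in sums}
    flagged = sorted(v for v, avg in averages.items() if avg > threshold_ms)
    return averages, flagged

# (volume, observed latency in ms) pairs, as might be mined from logs
records = [("vol1", 2.0), ("vol1", 4.0), ("vol2", 20.0), ("vol2", 30.0)]
averages, flagged = mine_latency(records)
```

The same map-then-reduce shape scales naturally to a distributed analytics framework, which is the point of treating IT operations data as a big data problem.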


International Conference on Cloud Computing | 2014

Improving Hadoop Service Provisioning in a Geographically Distributed Cloud

Qi Zhang; Ling Liu; Kisung Lee; Yang Zhou; Aameek Singh; Nagapramod Mandagere; Sandeep Gopisetty; Gabriel Alatorre

With more data generated and collected in a geographically distributed manner, combined with increased computational requirements for large-scale data-intensive analysis, we have witnessed a growing demand for geographically distributed Cloud datacenters and hybrid Cloud service provisioning, which enable organizations to support instantaneous demand for additional computational resources and to expand in-house resources to meet peak service demands by utilizing cloud resources. A key challenge in running applications in such a geographically distributed computing environment is how to efficiently schedule and perform analysis over data that is geographically distributed across multiple datacenters. In this paper, we first compare multi-datacenter Hadoop deployment with single-datacenter Hadoop deployment to identify the performance issues inherent in a geographically distributed cloud. A generalization of the problem characterization in the context of geographically distributed cloud datacenters is also provided, with discussion of general optimization strategies. Then we describe the design and implementation of a suite of system-level optimizations for improving the performance of Hadoop service provisioning in a geo-distributed cloud, including prediction-based job localization, configurable HDFS data placement, and data prefetching. Our experimental evaluation shows that our prediction-based localization has a very low error ratio, smaller than 5%, and that our optimizations can improve the execution time of the Reduce phase by 48.6%.
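The core of job localization can be sketched very simply: place the job in whichever datacenter already holds the most of its input blocks, so most tasks read locally rather than over the WAN. The block-to-datacenter map below is invented for illustration; the paper's localization is prediction-based and considerably more sophisticated.

```python
# Hypothetical sketch of locality-driven job placement in a
# geo-distributed Hadoop deployment.
def localize_job(block_locations):
    """Given a map of input block -> datacenter currently holding it,
    return the datacenter holding the most input blocks."""
    counts = {}
    for dc in block_locations.values():
        counts[dc] = counts.get(dc, 0) + 1
    return max(counts, key=counts.get)

blocks = {"b1": "us-east", "b2": "us-east", "b3": "eu-west"}
best_dc = localize_job(blocks)  # most of the input lives in us-east
```

Configurable data placement and prefetching then complement this: blocks the chosen datacenter lacks can be staged in ahead of the tasks that read them.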


International Conference on Service-Oriented Computing | 2015

rSLA: Monitoring SLAs in Dynamic Service Environments

Heiko Ludwig; Katerina Stamou; Mohamed Mohamed; Nagapramod Mandagere; Bryan Langston; Gabriel Alatorre; Hiroaki Nakamura; Obinna Anya; Alexander Keller

Today’s application environments combine Cloud and on-premise infrastructure, as well as platforms and services from different providers, to enable quick development and delivery of solutions to their intended users. The ability to use Cloud platforms to stand up applications in a short time frame, the wide availability of Web services, and the application of a continuous deployment model have led to very dynamic application environments. In those application environments, managing quality of service has become more important. The more external service vendors are involved, the less control an application owner has, and the more it must rely on Service Level Agreements (SLAs). However, SLA management is becoming more difficult: services from different vendors expose different instrumentation, and the increasing dynamism of application environments entails that the speed of SLA monitoring setup must match the speed of changes to the application environment.
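A minimal sketch of what SLA monitoring evaluates at its core: checking one service-level objective against a window of measurements. The objective schema below is an invented illustration, not the rSLA language itself, which expresses SLAs declaratively against heterogeneous vendor instrumentation.

```python
def evaluate_slo(measurements, objective):
    """Check one service-level objective over the most recent window of
    measurements. Returns (compliant, observed_value)."""
    window = measurements[-objective["window"]:]
    observed = sum(window) / len(window)
    ops = {"<=": lambda a, b: a <= b, ">=": lambda a, b: a >= b}
    return ops[objective["op"]](observed, objective["target"]), observed

# e.g. "average latency over the last 5 samples must stay at or below 100 ms"
latency_ms = [80, 120, 90, 110, 95]
slo = {"metric": "latency_ms", "op": "<=", "target": 100, "window": 5}
compliant, observed = evaluate_slo(latency_ms, slo)
```

The hard parts that a real framework must add are exactly the ones the abstract names: normalizing instrumentation across vendors and redeploying such checks as fast as the application environment changes.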


Annual SRII Global Conference | 2014

Intelligent Information Lifecycle Management in Virtualized Storage Environments

Gabriel Alatorre; Aameek Singh; Nagapramod Mandagere; Eric K. Butler; Sandeep Gopisetty; Yang Song

Data or information lifecycle management (ILM) is the process of managing data over its lifecycle in a manner that balances cost and performance. The task is made difficult by data's continuously changing business value. If done well, it can lower costs through the increased use of cost-effective storage but also runs the risk of negatively impacting performance if data is inadvertently placed on the wrong device (e.g., low-performance storage or an over-utilized storage device). To address this challenge, we designed and developed the Intelligent Storage Tiering Manager (ISTM), an analytics-driven storage tiering tool that automates the process of load balancing data across and within different storage tiers in virtualized storage environments. Using administrator-generated policies, ISTM finds data with the specified performance profiles and automatically moves them to the appropriate storage tier. Application impact is minimized by limiting overall migration load and keeping data accessible during migration. Automation results in significantly less labor and fewer errors while reducing task completion time from several days (and in some cases weeks) to a few hours. In this paper, we provide an overview of information lifecycle management (ILM), discuss existing solutions, and finally focus on the design and deployment of our ILM solution, ISTM, within a production data center.
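The policy-driven tiering decision can be sketched as follows: promote hot volumes to fast storage, demote cold ones, and cap the number of concurrent moves to limit migration load. Tier names, the IOPS threshold, and the move cap are invented for illustration and are not ISTM's actual policy model.

```python
def plan_migrations(volumes, hot_iops=1000, max_moves=2):
    """Propose tier moves from an administrator-style policy: hot data up
    to SSD, cold data down to HDD, capped to bound migration load."""
    moves = []
    for v in sorted(volumes, key=lambda v: -v["iops"]):  # hottest first
        if v["iops"] >= hot_iops and v["tier"] != "ssd":
            moves.append((v["name"], "ssd"))
        elif v["iops"] < hot_iops and v["tier"] != "hdd":
            moves.append((v["name"], "hdd"))
    return moves[:max_moves]

volumes = [
    {"name": "db-logs", "tier": "hdd", "iops": 5000},   # hot but on HDD
    {"name": "archive", "tier": "ssd", "iops": 10},     # cold but on SSD
    {"name": "web", "tier": "ssd", "iops": 3000},       # hot, already placed
]
plan = plan_migrations(volumes)
```

Capping `max_moves` is the sketch's stand-in for the real system's care to keep data accessible and applications unaffected while migrations run.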


Cluster Computing and the Grid | 2014

MapReduce Analysis for Cloud-Archived Data

Balaji Palanisamy; Aameek Singh; Nagapramod Mandagere; Gabriel Alatorre; Ling Liu

Public storage clouds have become a popular choice for archiving certain classes of enterprise data - for example, application and infrastructure logs. These logs contain sensitive information like IP addresses or user logins, due to which regulatory and security requirements often require data to be encrypted before being moved to the cloud. In order to leverage such data for any business value, analytics systems (e.g. Hadoop/MapReduce) first download data from these public clouds, decrypt it, and then process it at the secure enterprise site. We propose VNcache: an efficient solution for MapReduce analysis of such cloud-archived log data without requiring an a priori data transfer and loading into the local Hadoop cluster. VNcache dynamically integrates cloud-archived data into a virtual namespace at the enterprise Hadoop cluster. Through a seamless data streaming and prefetching model, Hadoop jobs can begin execution as soon as they are launched without requiring any a priori downloading. With VNcache's accurate pre-fetching and caching, jobs often run on a local cached copy of the data block, significantly improving performance. When no longer needed, data is safely evicted from the enterprise cluster, reducing the total storage footprint. Uniquely, VNcache is implemented with NO changes to the Hadoop application stack.
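The prefetch-into-a-local-cache idea can be sketched with a toy sequential prefetcher: a demand miss fetches the requested block and also stages its successor, so subsequent sequential reads hit locally. This is an invented simplification; VNcache's prefetching is prediction-driven and sits beneath an unmodified Hadoop stack.

```python
class VNCacheSketch:
    """Toy virtual-namespace cache over a (simulated) cloud archive:
    reads hit the local cache when possible, otherwise fetch from the
    cloud and prefetch the next sequential block."""

    def __init__(self, cloud):
        self.cloud = cloud   # block id -> data, standing in for the archive
        self.cache = {}      # locally cached blocks
        self.hits = 0        # reads served without a demand fetch

    def _fetch(self, block):
        self.cache[block] = self.cloud[block]

    def read(self, block):
        if block in self.cache:
            self.hits += 1
        else:
            self._fetch(block)           # demand fetch on a miss
        nxt = block + 1                  # prefetch the sequential successor
        if nxt in self.cloud and nxt not in self.cache:
            self._fetch(nxt)
        return self.cache[block]

cloud = {i: f"data-{i}" for i in range(4)}
c = VNCacheSketch(cloud)
c.read(0); c.read(1); c.read(2)  # only the first read is a demand miss
```

Eviction (dropping `self.cache` entries once a job finishes) would complete the lifecycle the abstract describes.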


International Conference on Big Data | 2015

Toward locality-aware scheduling for containerized cloud services

Dongfang Zhao; Nagapramod Mandagere; Gabriel Alatorre; Mohamed Mohamed; Heiko Ludwig

The state-of-the-art scheduler of containerized cloud services considers load-balance as the only criterion and neglects many others, such as application performance. In the era of Big Data, however, applications have evolved to be highly data-intensive and thus perform poorly in existing systems. This particularly holds for Platform-as-a-Service environments that encourage an application model of stateless application instances in containers reading and writing data to services storing state, e.g., key-value stores. To this end, this work strives to improve today's cloud services by incorporating sensitivity to both load-balance and application performance. We built and analyzed theoretical models that respect both dimensions; unlike prior studies, our model abstracts the dilemma between load-balance and application performance into an optimization problem and employs a statistical method to meet the discrepant requirements. Using heuristic algorithms and approaches, we solve the abstracted problems. We implemented the proposed approach in Diego (an open-source cloud service scheduler) and demonstrate that it can significantly boost the performance of containerized applications while preserving a relatively high load-balance.
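The trade-off can be sketched as a weighted scoring function over candidate cells: one term rewards headroom (load-balance), the other rewards data locality, and a weight trades them off. The cell model and weight below are invented for illustration and are not the paper's actual formulation or Diego's auction protocol.

```python
def place(container, cells, alpha=0.5):
    """Pick the cell maximizing a weighted mix of load-balance headroom
    and data locality. alpha=1.0 recovers pure load-balancing."""
    def score(cell):
        balance = 1.0 - cell["load"]  # in [0,1]; higher = more headroom
        local = cell["local_gb"].get(container["app"], 0)
        locality = min(local / container["data_gb"], 1.0)
        return alpha * balance + (1 - alpha) * locality
    return max(cells, key=score)["name"]

cells = [
    {"name": "cell-a", "load": 0.2, "local_gb": {}},            # idle, no data
    {"name": "cell-b", "load": 0.6, "local_gb": {"kv": 100}},   # busier, has data
]
container = {"app": "kv", "data_gb": 100}
```

A pure load-balancer picks the idle cell; weighting in locality sends the data-intensive container to the cell that already holds its state.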


Conference on Network and Service Management | 2015

On selective compression of primary data

Gabriel Alatorre; Nagapramod Mandagere; Yang Song; Heiko Ludwig

With the advent of social media, the Internet of Things (IoT), widespread use of richer media formats such as video, and generally increased use of mobile devices, the volume of online data has seen a rapid increase in recent years. To cope with this data explosion, businesses and cloud providers are scrambling to lower the cost of storing data without sacrificing the quality of their service, using space reduction techniques such as compression and deduplication. Capacity savings, however, are achieved at the cost of performance and additional resource overheads. One drawback of compression techniques is the additional computation required to store and fetch data, which may significantly increase response time, i.e., I/O latency. Worse yet, inefficient compression decisions that fail to compress data satisfactorily suffer from the latency penalty with marginal capacity savings, e.g., deciding to compress data that is encrypted or already compressed. Therefore, from a data center administrator's perspective, we should pick the set of volumes that will yield the most compression space savings with the least latency for a given amount of computation capacity, without exhaustively inspecting the data content of volumes. To fill this void, this paper proposes an approach to manage compression for a very large set of volumes. It maximizes capacity savings and minimizes latency impact without scanning the actual data content (to avoid security concerns). Our pilot deployments show significant capacity savings and performance improvements compared to benchmark compression strategies.
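The selection problem above has a natural greedy sketch: rank volumes by estimated savings per unit of CPU cost and pick down the list until the compute budget is spent. The metadata-based estimates below are invented placeholders; the point of the paper is precisely how to obtain such estimates without scanning volume contents.

```python
def select_volumes(volumes, cpu_budget):
    """Greedy selective compression: choose volumes with the best
    estimated capacity-saving per unit of CPU cost, within a budget.
    Compressibility is estimated from metadata, never by scanning data."""
    ranked = sorted(volumes,
                    key=lambda v: v["est_savings_gb"] / v["cpu_cost"],
                    reverse=True)
    chosen, spent = [], 0
    for v in ranked:
        if v["est_savings_gb"] > 0 and spent + v["cpu_cost"] <= cpu_budget:
            chosen.append(v["name"])
            spent += v["cpu_cost"]
    return chosen

volumes = [
    {"name": "text-logs", "est_savings_gb": 400, "cpu_cost": 2},
    {"name": "db-dump", "est_savings_gb": 150, "cpu_cost": 3},
    {"name": "jpeg-store", "est_savings_gb": 0, "cpu_cost": 4},  # already compressed
]
```

Skipping the zero-savings volume is the sketch's analogue of avoiding the worst case the abstract warns about: paying the latency penalty to compress encrypted or already-compressed data.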


Archive | 2012

System, method and program product to manage transfer of data to resolve overload of a storage system

Gabriel Alatorre; Aameek Singh; Laura Richardson


Archive | 2010

Data lifecycle management within a cloud computing environment

Gabriel Alatorre; Richard J. Ayala; Kavita Chavda; Sandeep Gopisetty; Aameek Singh
