James Cipar
Carnegie Mellon University
Publications
Featured research published by James Cipar.
Symposium on Cloud Computing | 2010
Hrishikesh Amur; James Cipar; Varun Gupta; Gregory R. Ganger; Michael Kozuch; Karsten Schwan
Power-proportional cluster-based storage is an important component of an overall cloud computing infrastructure. With it, substantial subsets of nodes in the storage cluster can be turned off to save power during periods of low utilization. Rabbit is a distributed file system that arranges its data layout to provide ideal power-proportionality down to a very low minimum number of powered-up nodes (enough to store a primary replica of available datasets). Rabbit addresses the node failure rates of large-scale clusters with data layouts that minimize the number of nodes that must be powered up if a primary fails. Rabbit also allows different datasets to use different subsets of nodes as a building block for interference avoidance when the infrastructure is shared by multiple tenants. Experiments with a Rabbit prototype demonstrate its power-proportionality, and simulation experiments demonstrate its properties at scale.
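The core idea above (keep every block's primary replica within a small node subset so the rest of the cluster can be powered off) can be illustrated with a toy layout. This is a minimal sketch of the concept, not Rabbit's actual equal-work layout; all names are illustrative.

```python
import itertools

def layout(num_blocks, num_nodes, primary_set_size, replicas=3):
    """Toy power-proportional placement: every block keeps its primary
    replica inside the first `primary_set_size` nodes, so the cluster
    can power down to that subset without losing any data."""
    placement = {}
    secondary_nodes = itertools.cycle(range(primary_set_size, num_nodes))
    for b in range(num_blocks):
        primary = b % primary_set_size
        extras = [next(secondary_nodes) for _ in range(replicas - 1)]
        placement[b] = [primary] + extras
    return placement

def available(placement, active_nodes):
    """Every block is readable iff some replica sits on an active node."""
    return all(any(n in active_nodes for n in nodes)
               for nodes in placement.values())

p = layout(num_blocks=100, num_nodes=12, primary_set_size=4)
assert available(p, set(range(4)))    # minimum power state: primaries only
assert available(p, set(range(12)))   # full cluster
```

Note that powering down below the primary set would lose data, which is exactly why the minimum powered-up subset must hold a full primary replica.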
Workshop on Automated Control for Datacenters and Clouds | 2009
Michael Kozuch; Michael P. Ryan; Richard Gass; Steven W. Schlosser; David R. O'Hallaron; James Cipar; Elie Krevat; Julio Lopez; Michael Stroucken; Gregory R. Ganger
Big Data applications, those that require large data corpora either for correctness or for fidelity, are becoming increasingly prevalent. Tashi is a cluster management system designed particularly for enabling cloud computing applications to operate on repositories of Big Data. These applications are extremely scalable but also have very high resource demands. A key technique for making such applications perform well is Location-Awareness. This paper demonstrates that location-aware applications can outperform those that are not location aware by factors of 3-11 and describes two general services developed for Tashi to provide location-awareness independently of the storage system.
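Location-awareness, as described above, means scheduling computation onto the nodes that already store its input data. A minimal sketch of such a scheduling decision follows; the function and data-structure names are illustrative, not Tashi's actual API.

```python
def schedule(input_block, block_locations, node_load):
    """Toy location-aware scheduler: prefer the least-loaded node that
    already stores the task's input block, falling back to the globally
    least-loaded node (which implies a remote read)."""
    local = block_locations.get(input_block, [])
    candidates = local if local else list(node_load)
    return min(candidates, key=lambda n: node_load[n])

block_locations = {"b1": ["n1", "n3"], "b2": ["n2"]}
node_load = {"n1": 5, "n2": 0, "n3": 2}
assert schedule("b1", block_locations, node_load) == "n3"  # local copy wins
assert schedule("b9", block_locations, node_load) == "n2"  # no local copy
```

The 3-11x speedups reported in the paper come from avoiding exactly the remote reads that the fallback path above would incur.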
European Conference on Computer Systems | 2012
James Cipar; Gregory R. Ganger; Kimberly Keeton; Charles B. Morrey; Craig A. N. Soules; Alistair Veitch
The LazyBase scalable database system is specialized for the growing class of data analysis applications that extract knowledge from large, rapidly changing data sets. It provides the scalability of popular NoSQL systems without the query-time complexity associated with their eventual consistency models, offering a clear consistency model and explicit per-query control over the trade-off between latency and result freshness. With an architecture designed around batching and pipelining of updates, LazyBase simultaneously ingests atomic batches of updates at a very high throughput and offers quick read queries to a stale-but-consistent version of the data. Although slightly stale results are sufficient for many analysis queries, fully up-to-date results can be obtained when necessary by also scanning updates still in the pipeline. Compared to the Cassandra NoSQL system, LazyBase provides 4X--5X faster update throughput and 4X faster read query throughput for range queries while remaining competitive for point queries. We demonstrate LazyBase's trade-off between query latency and result freshness as well as the benefits of its consistency model. We also demonstrate specific cases where Cassandra's consistency model is weaker than LazyBase's.
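The per-query freshness/latency trade-off described above can be modeled in a few lines. This is a toy sketch of the idea, not LazyBase's implementation; the class and method names are illustrative.

```python
class LazyStore:
    """Toy model of per-query freshness control: updates arrive as
    atomic batches in a pipeline, reads normally hit the last applied
    (consistent but possibly stale) snapshot, and a caller can pay
    extra latency to also scan the in-flight batches."""
    def __init__(self):
        self.snapshot = {}   # applied updates: consistent, possibly stale
        self.pipeline = []   # atomic batches not yet applied

    def ingest(self, batch):
        self.pipeline.append(dict(batch))

    def apply_one(self):
        if self.pipeline:
            self.snapshot.update(self.pipeline.pop(0))

    def read(self, key, freshness="stale"):
        if freshness == "fresh":
            for batch in reversed(self.pipeline):  # newest batch wins
                if key in batch:
                    return batch[key]
        return self.snapshot.get(key)

db = LazyStore()
db.ingest({"x": 1}); db.apply_one()
db.ingest({"x": 2})                 # still in the pipeline
assert db.read("x") == 1            # fast, stale-but-consistent
assert db.read("x", "fresh") == 2   # slower, scans pending updates
```

The key property is that the stale read is still atomic with respect to update batches, which is what distinguishes this model from eventual consistency.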
ACM Transactions on Storage | 2007
James Cipar; Mark D. Corner; Emery D. Berger
Contributory applications allow users to donate unused resources on their personal computers to a shared pool. Applications such as SETI@home, Folding@home, and Freenet are now in wide use and provide a variety of services, including data processing and content distribution. However, while several research projects have proposed contributory applications that support peer-to-peer storage systems, their adoption has been comparatively limited. We believe that a key barrier to the adoption of contributory storage systems is that contributing a large quantity of local storage interferes with the principal user of the machine. To overcome this barrier, we introduce the Transparent File System (TFS). TFS provides background tasks with large amounts of unreliable storage—all of the currently available space—without impacting the performance of ordinary file access operations. We show that TFS allows a peer-to-peer contributory storage system to provide 40% more storage at twice the performance when compared to a user-space storage mechanism. We analyze the impact of TFS on replication in peer-to-peer storage systems and show that TFS does not appreciably increase the resources needed for file replication.
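The "transparent" property above means contributory data lives only in blocks the file system still considers free, so the principal user's writes can reclaim that space at any time. A toy model of this invariant, with illustrative names (not TFS's actual interface):

```python
class TransparentStore:
    """Toy model of transparent storage: contributory data occupies
    only free blocks, and a write by the machine's principal user
    silently reclaims (and destroys) any contributory data there."""
    def __init__(self, num_blocks):
        self.owner = ["free"] * num_blocks  # "free", "user", or "contrib"

    def contrib_write(self, block):
        if self.owner[block] == "free":     # contributory data may only
            self.owner[block] = "contrib"   # ever use free space
            return True
        return False

    def user_write(self, block):
        # Foreground writes treat contributory blocks as free space.
        self.owner[block] = "user"

    def contrib_readable(self, block):
        return self.owner[block] == "contrib"

fs = TransparentStore(8)
assert fs.contrib_write(3)
assert fs.contrib_readable(3)
fs.user_write(3)                    # principal user reclaims the block
assert not fs.contrib_readable(3)   # contributory copy is simply lost
```

Because contributory data can vanish this way, the storage is unreliable by design, which is why the abstract's replication analysis matters: peers must re-replicate lost data.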
ACM Transactions on Storage | 2014
Lianghong Xu; James Cipar; Elie Krevat; Alexey Tumanov; Nitin Gupta; Michael Kozuch; Gregory R. Ganger
Elastic storage systems can be expanded or contracted to meet current demand, allowing servers to be turned off or used for other tasks. However, the usefulness of an elastic distributed storage system is limited by its agility: how quickly it can increase or decrease its number of servers. Due to the large amount of data they must migrate during elastic resizing, state-of-the-art designs usually have to make painful trade-offs among performance, elasticity, and agility. This article describes the state of the art in elastic storage and a new system, called SpringFS, that can quickly change its number of active servers while retaining elasticity and performance goals. SpringFS uses a novel technique, termed bounded write offloading, that restricts the set of servers to which writes to overloaded servers are redirected. This technique, combined with the read offloading and passive migration policies used in SpringFS, minimizes the work needed before deactivation or activation of servers. Analysis of real-world traces from Hadoop deployments at Facebook and various Cloudera customers and experiments with the SpringFS prototype confirm SpringFS's agility, show that it reduces the amount of data migrated for elastic resizing by up to two orders of magnitude, and show that it cuts the percentage of active servers required by 67--82%, outdoing state-of-the-art designs by 6--120%.
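Bounded write offloading, as described above, redirects writes away from overloaded servers but only into a small, fixed set of servers, so that the cleanup work before resizing is confined to that set. A minimal sketch of the redirection decision (illustrative names, not SpringFS's code):

```python
def offload_target(primary, overloaded, offload_set):
    """Toy sketch of bounded write offloading: a write whose primary
    server is overloaded is redirected, but only to a server inside a
    small bounded offload set, limiting pre-resize cleanup to that set."""
    if primary not in overloaded:
        return primary
    # stable pick inside the bounded set (any deterministic rule works)
    return offload_set[sum(map(ord, primary)) % len(offload_set)]

offload_set = ["s0", "s1"]   # only these servers absorb offloaded writes
overloaded = {"s5", "s7"}
assert offload_target("s3", overloaded, offload_set) == "s3"  # normal path
assert offload_target("s5", overloaded, offload_set) in offload_set
```

Keeping the offload set small is the point: before deactivating servers, only data that landed in this bounded set (rather than on arbitrary servers) has to be migrated back.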
ACM Transactions on Storage | 2012
Michael Abd-El-Malek; Matthew Wachs; James Cipar; Karan Sanghi; Gregory R. Ganger; Garth A. Gibson; Michael K. Reiter
File system virtual appliances (FSVAs) address the portability headaches that plague file system (FS) developers. By packaging their FS implementation in a virtual machine (VM), separate from the VM that runs user applications, they can avoid the need to port the file system to each operating system (OS) and OS version. A small FS-agnostic proxy, maintained by the core OS developers, connects the FSVA to whatever OS the user chooses. This article describes an FSVA design that maintains FS semantics for unmodified FS implementations and provides desired OS and virtualization features, such as a unified buffer cache and VM migration. Evaluation of prototype FSVA implementations in Linux and NetBSD, using Xen as the virtual machine manager (VMM), demonstrates that the FSVA architecture is efficient, FS-agnostic, and able to insulate file system implementations from OS differences that would otherwise require explicit porting.
Neural Information Processing Systems | 2013
Qirong Ho; James Cipar; Henggang Cui; Seunghak Lee; Jin Kyu Kim; Phillip B. Gibbons; Garth A. Gibson; Gregory R. Ganger; Eric P. Xing
USENIX Annual Technical Conference | 2014
Henggang Cui; James Cipar; Qirong Ho; Jin Kyu Kim; Seunghak Lee; Abhimanu Kumar; Jinliang Wei; Wei Dai; Gregory R. Ganger; Phillip B. Gibbons; Garth A. Gibson; Eric P. Xing
Hot Topics in Operating Systems | 2013
James Cipar; Qirong Ho; Jin Kyu Kim; Seunghak Lee; Gregory R. Ganger; Garth A. Gibson; Kimberly Keeton; Eric P. Xing
Symposium on Cloud Computing | 2012
Alexey Tumanov; James Cipar; Gregory R. Ganger; Michael Kozuch