Publication


Featured research published by Christopher Stewart.


European Conference on Computer Systems | 2007

Exploiting nonstationarity for performance prediction

Christopher Stewart; Terence Kelly; Alex Zhang

Real production applications ranging from enterprise applications to large e-commerce sites share a crucial but seldom-noted characteristic: The relative frequencies of transaction types in their workloads are nonstationary, i.e., the transaction mix changes over time. Accurately predicting application-level performance in business-critical production applications is an increasingly important problem. However, transaction mix nonstationarity casts doubt on the practical usefulness of prediction methods that ignore this phenomenon. This paper demonstrates that transaction mix nonstationarity enables a new approach to predicting application-level performance as a function of transaction mix. We exploit nonstationarity to circumvent the need for invasive instrumentation and controlled benchmarking during model calibration; our approach relies solely on lightweight passive measurements that are routinely collected in today's production environments. We evaluate predictive accuracy on two real business-critical production applications. The accuracy of our response time predictions ranges from 10% to 16% on these applications, and our models generalize well to workloads very different from those used for calibration. We apply our technique to the challenging problem of predicting the impact of application consolidation on transaction response times. We calibrate models of two testbed applications running on dedicated machines, then use the models to predict their performance when they run together on a shared machine and serve very different workloads. Our predictions are accurate to within 4% to 14%. Existing approaches to consolidation decision support predict post-consolidation resource utilizations. Our method allows application-level performance to guide consolidation decisions.
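The calibration idea lends itself to a short sketch. Below is a minimal Python illustration, assuming a linear-in-mix response time model fit by least squares from passively observed transaction counts; the model form, workload, and numbers are invented for the example and are not the paper's actual estimator.

```python
import numpy as np

# Passive measurements: per-interval counts of each transaction type (the
# "transaction mix") and the aggregate mean response time observed in that
# interval. Because the mix is nonstationary, it varies across intervals,
# which is what makes the per-type coefficients identifiable by regression.
rng = np.random.default_rng(0)
n_intervals, n_types = 200, 3
mix = rng.poisson(lam=[40, 25, 10], size=(n_intervals, n_types))

true_cost = np.array([2.0, 5.0, 12.0])  # hidden per-type costs (ms), demo only
resp_time = mix @ true_cost / mix.sum(axis=1) + rng.normal(0, 1, n_intervals)

# Calibrate: least-squares fit of response time against the mix fractions,
# with no invasive instrumentation or controlled benchmarking.
fractions = mix / mix.sum(axis=1, keepdims=True)
coef, *_ = np.linalg.lstsq(fractions, resp_time, rcond=None)

# Predict response time for a workload mix unlike those seen in calibration.
new_mix = np.array([[5, 5, 50]])
pred = (new_mix / new_mix.sum()) @ coef
print(f"estimated per-type costs: {coef.round(1)}; predicted rt: {pred[0]:.1f} ms")
```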


International Conference on Autonomic Computing | 2012

Adaptive green hosting

Nan Deng; Christopher Stewart; Daniel Gmach; Martin F. Arlitt; Jaimie Kelley

The growing carbon footprint of Web hosting centers contributes to climate change and could harm the public's perception of Web hosts and Internet services. A pioneering cadre of Web hosts, called green hosts, lower their footprints by cutting into their profit margins to buy carbon offsets. This paper argues that an adaptive approach to buying carbon offsets can increase a green host's total profit by exploiting daily, bursty patterns in Internet service workloads. We make the case in three steps. First, we present a realistic, geographically distributed service that meets strict SLAs while using green hosts to lower its carbon footprint. We show that the service routes requests between competing hosts differently depending on its request arrival rate and on how many carbon offsets each host provides. Second, we use empirical traces of request arrivals to compute how many carbon offsets a host should provide to maximize its profit. We find that diurnal fluctuations and bursty surges interrupted long contiguous periods where the best carbon offset policy held steady, leading us to propose a reactive approach. For certain hosts, our approach can triple the profit compared to a fixed approach used in practice. Third, we simulate 9 services with diverse carbon footprint goals that distribute their workloads across 11 Web hosts worldwide. We use real data on the location of Web hosts and their provided carbon offset policies to show that adaptive green hosting can increase profit by 152% for one of today's larger green hosts.
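The reactive policy can be pictured with a toy model. The sketch below re-picks an offset purchase each epoch to maximize profit at the observed arrival rate; the routing response, prices, and candidate levels are all assumptions for illustration, not values from the paper.

```python
# Toy model of adaptive offset purchasing. We assume the share of a
# service's requests routed to this host grows with the offsets it buys,
# and the host re-picks its offset level each epoch from the observed
# arrival rate (reacting to diurnal and bursty shifts).

PRICE_PER_REQ = 0.002      # revenue per served request ($)
OFFSET_COST = 0.04         # cost per offset unit ($)
LEVELS = [0, 10, 20, 40]   # candidate offset purchases per epoch

def share_won(offsets: int) -> float:
    """Assumed routing response: more offsets win a larger request share."""
    return min(1.0, 0.25 + 0.015 * offsets)

def best_offset(arrival_rate: float) -> int:
    """Pick the offset level maximizing profit at this arrival rate."""
    def profit(o: int) -> float:
        return arrival_rate * share_won(o) * PRICE_PER_REQ - o * OFFSET_COST
    return max(LEVELS, key=profit)

# Re-decide every epoch as the workload fluctuates.
for rate in [800, 15000, 3000]:   # requests per epoch: quiet, surge, normal
    print(rate, "->", best_offset(rate), "offset units")
```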


IEEE International Symposium on Sustainable Systems and Technology | 2011

Concentrating renewable energy in grid-tied datacenters

Nan Deng; Christopher Stewart; Jing Li

Datacenters, the large server farms that host widely used Internet services, account for a larger fraction of worldwide carbon emissions each year. Increasingly, datacenters are reducing their emissions by using clean, renewable energy from rooftop solar panels to partially power their servers. While some customers value renewable-powered servers, many others are indifferent. We argue that renewable energy produced on site should be concentrated as much as possible on the servers used by green customers. This paper introduces a new metric, the renewable-powered instance, that measures the concentration of renewable energy in datacenters. We conducted a simulation-based study of renewable-energy datacenters, focusing on the grid tie — the device most commonly used to integrate renewable energy. We found that grid-tie placement has first-order effects on renewable-energy concentration.
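A small worked example shows why tie placement matters. The sketch below assumes one plausible reading of the metric (an instance counts as renewable-powered only if its draw is fully covered by on-site supply); all server counts and wattages are made up.

```python
# Illustrative comparison of grid-tie placement under an assumed reading
# of the renewable-powered instance metric. Numbers are invented.

SERVER_DRAW_W = 200
N_SERVERS = 100       # total servers in the datacenter
N_GREEN = 20          # servers leased by customers who value renewables
SOLAR_W = 5_000       # current rooftop solar output

# Tie at the facility feed: renewable watts dilute across every server,
# so unless supply covers the whole facility, no instance is fully green.
facility_rpi = N_SERVERS if SOLAR_W >= N_SERVERS * SERVER_DRAW_W else 0

# Tie at a sub-panel feeding only the green customers' servers: the same
# watts are concentrated on the servers whose customers value them.
subpanel_rpi = min(N_GREEN, SOLAR_W // SERVER_DRAW_W)

print(f"facility-level tie: {facility_rpi} renewable-powered instances")
print(f"sub-panel tie:      {subpanel_rpi} renewable-powered instances")
```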


Measurement and Modeling of Computer Systems | 2009

Reference-driven performance anomaly identification

Kai Shen; Christopher Stewart; Chuanpeng Li; Xin Li

Complex system software runs under a wide variety of execution conditions, spanning system configurations and workload properties. This paper explores a principled use of reference executions (those with execution conditions similar to the target's) to help identify the symptoms and causes of performance anomalies. First, to identify anomaly symptoms, we construct change profiles that probabilistically characterize expected performance deviations between target and reference executions. By synthesizing several single-parameter change profiles, we can scalably identify anomalous reference-to-target changes in a complex system with multiple execution parameters. Second, to narrow the scope of anomaly root cause analysis, we filter anomaly-related low-level system metrics as those that manifest very differently between target and reference executions. Our anomaly identification approach requires little expert knowledge or detailed models of system internals, so it can be easily deployed. Using empirical case studies on the Linux I/O subsystem and a J2EE-based distributed online service, we demonstrate our approach's effectiveness in identifying performance anomalies over a wide range of execution conditions as well as multiple system software versions. In particular, we discovered five previously unknown performance anomaly causes in the Linux 2.6.23 kernel. Additionally, our preliminary results suggest that online anomaly detection and system reconfiguration may help evade performance anomalies in complex online systems.
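The change-profile test can be sketched in a few lines. The Python below builds a single-parameter change profile as an empirical distribution of reference-to-target performance ratios and flags a target whose change falls outside the probable band; the data and the 95% band are illustrative assumptions, not the paper's calibration.

```python
import numpy as np

# A single-parameter "change profile": the empirical distribution of
# reference-to-target performance change expected when one execution
# parameter differs. The paper builds these from measured executions;
# here the profile is synthetic for illustration.
rng = np.random.default_rng(1)

# Historical throughput ratios (target/reference) when, e.g., the I/O
# request size doubles: the expected, non-anomalous change.
profile = rng.normal(loc=0.80, scale=0.05, size=500)

lo, hi = np.percentile(profile, [2.5, 97.5])   # probable-change band

def is_anomalous(ref_perf: float, target_perf: float) -> bool:
    """Flag the target if its change vs. the reference falls outside
    the band the change profile deems probable."""
    ratio = target_perf / ref_perf
    return not (lo <= ratio <= hi)

print(is_anomalous(1000.0, 810.0))   # ~0.81x: expected change -> False
print(is_anomalous(1000.0, 450.0))   # 0.45x: improbable change -> True
```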


Architectural Support for Programming Languages and Operating Systems | 2008

Hardware counter driven on-the-fly request signatures

Kai Shen; Ming Zhong; Sandhya Dwarkadas; Chuanpeng Li; Christopher Stewart; Xiao Zhang

Today's processors provide a rich source of statistical information on application execution through hardware counters. In this paper, we explore the utilization of these statistics as request signatures in server applications for identifying requests and inferring high-level request properties (e.g., CPU and I/O resource needs). Our key finding is that effective request signatures may be constructed using a small amount of hardware statistics while the request is still in an early stage of its execution. Such on-the-fly request identification and property inference allow guided operating system adaptation at request granularity (e.g., resource-aware request scheduling and on-the-fly request classification). We address the challenges of selecting hardware counter metrics for signature construction and providing necessary operating system support for per-request statistics management. Our implementation in the Linux 2.6.10 kernel suggests that our approach requires low overhead suitable for runtime deployment. Our on-the-fly request resource consumption inference (averaging 7%, 3%, 20%, and 41% prediction errors for four server workloads: TPC-C, TPC-H, J2EE-based RUBiS, and a trace-driven index search, respectively) is much more accurate than the online running-average based prediction (73-82% errors). Its use for resource-aware request scheduling results in a 15-70% response time reduction for three CPU-bound applications. Its use for on-the-fly request classification and anomaly detection exhibits high accuracy for the TPC-H workload with synthetically generated anomalous requests following a typical SQL-injection attack pattern.
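To make the signature idea concrete, here is a hypothetical Python sketch: a request's early-execution counter readings are matched against known signatures by nearest neighbor. The counter choices, values, and classes are invented; the paper selects metrics empirically and maintains per-request statistics inside the kernel.

```python
import numpy as np

# Signatures: [instructions_retired, llc_misses, branch_misses] sampled
# early in each request's execution, labeled with the request's class.
# Values are illustrative, not measured.
train = {
    "cpu_heavy": np.array([9.0e6, 2.0e3, 4.0e4]),
    "io_heavy":  np.array([1.5e6, 8.0e4, 9.0e3]),
}

def classify(sig: np.ndarray) -> str:
    """Nearest known signature in log-scaled counter space."""
    logs = {k: np.log1p(v) for k, v in train.items()}
    s = np.log1p(sig)
    return min(logs, key=lambda k: np.linalg.norm(logs[k] - s))

# A new request, early in its execution: its counters look I/O-bound, so
# a scheduler could route it to an I/O-friendly queue before it finishes.
print(classify(np.array([1.8e6, 6.5e4, 1.1e4])))   # -> io_heavy
```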


IEEE Distributed Systems Online | 2004

Profile-Driven Component Placement for Cluster-Based Online Services

Christopher Stewart; Kai Shen; Sandhya Dwarkadas; Michael L. Scott; Jian Yin

The growth of the Internet and of various intranets has spawned a wealth of online services, most of which are implemented on local-area clusters using remote invocation (for example, remote procedure call/remote method invocation) among manually placed application components. Component placement can be a significant challenge for large-scale services, particularly when application resource needs are workload dependent. Automatic component placement has the potential to maximize overall system throughput. The key idea is to construct (offline) a mapping between input workload and individual-component resource consumption. Such mappings, called component profiles, then support high-performance placement. Preliminary results on an online auction benchmark based on J2EE (Java 2 Platform, Enterprise Edition) suggest that profile-driven tools can identify placements that achieve near-optimal overall throughput.
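The profile-driven search can be sketched directly. The Python below assumes linear component profiles (resource demand per request, measured offline) and brute-forces the component-to-node assignment that maximizes bottleneck-limited throughput; the components, capacities, and demands are illustrative.

```python
from itertools import product

# CPU demand (core-seconds) per request, per component, from offline
# profiling; node capacities in core-seconds available per second.
profiles = {"web": 0.002, "app": 0.006, "db": 0.004}
node_capacity = {"n1": 1.0, "n2": 1.0}

def max_throughput(placement: dict) -> float:
    """Highest request rate before some node saturates."""
    load = {n: 0.0 for n in node_capacity}
    for comp, node in placement.items():
        load[node] += profiles[comp]
    return min(node_capacity[n] / d for n, d in load.items() if d > 0)

# Brute-force every assignment of components to nodes; a real service
# needs a smarter search, but the objective is the same.
best = max(
    (dict(zip(profiles, nodes)) for nodes in product(node_capacity, repeat=3)),
    key=max_throughput,
)
print(best, f"-> {max_throughput(best):.0f} req/s")
```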


Modeling, Analysis, and Simulation of Computer and Telecommunication Systems | 2010

EntomoModel: Understanding and Avoiding Performance Anomaly Manifestations

Christopher Stewart; Kai Shen; Arun Iyengar; Jian Yin

Subtle implementation errors or mis-configurations in complex Internet services may lead to performance degradations without causing failures. These undiscovered performance anomalies afflict many of today’s systems, causing violations of service-level agreements (SLAs), unnecessary resource over-provisioning, or both. In this paper, we re-inserted realistic anomaly causes into a multi-tier Internet service architecture and studied their manifestations. We observed that each cause had certain workload and management parameters that were more likely to trigger manifestations, hinting that such parameters could be effective classifiers. This observation held even when anomaly causes manifested differently in combination than in isolation. Our study motivates EntomoModel, a framework for depicting performance anomaly manifestations. EntomoModel uses decision tree classification and a design-driven performance model to characterize the workload and management policy settings under which manifestations are likely. EntomoModel enables online system management that avoids anomaly manifestations by dynamically adjusting system management parameters. Our trace-driven evaluations show that manifestation avoidance based on EntomoModel, or entomophobic management, can reduce 98th percentile SLA violations by 67% compared to an anomaly-oblivious adaptive approach. In a cloud computing scenario with elastic resource allocation, our approach uses less than half of the resources needed in static over-provisioning.
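The classification half of the framework can be illustrated with an off-the-shelf decision tree. The sketch below (using scikit-learn) learns the workload and management settings under which a toy anomaly manifests, then steers a candidate setting out of that region; the features, data, and adjustment rule are assumptions, not EntomoModel itself.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
X = rng.uniform([0, 64], [500, 1024], size=(400, 2))  # req/s, cache MB

# Ground truth for the toy example: the bug only bites under high load
# with a small cache, exactly the region the tree should carve out.
y = (X[:, 0] > 300) & (X[:, 1] < 256)

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

# Online management: before applying a setting, ask the tree whether an
# anomaly is likely to manifest there, and adjust the setting if so.
candidate = np.array([[350.0, 128.0]])
if tree.predict(candidate)[0]:
    candidate[0, 1] = 512.0    # e.g., grow the cache to leave the region
print("chosen setting:", candidate[0])
```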


Measurement and Modeling of Computer Systems | 2012

Electric grid balancing through low-cost workload migration

David Chiu; Christopher Stewart; Bart McManus

Energy production must continuously match demand on the electric grid. A deficiency can lead to service disruptions, and a surplus can place tremendous stress on grid components, potentially causing major blackouts. To manage this balance, grid operators must increase or lower power generation, with only a few minutes to react. The grid balancing problem has also impeded the pace of integrating bountiful renewable resources (e.g., wind), whose generation is intermittent. An emerging plan to mitigate this problem is demand response, i.e., for grid operators to alter the electricity usage behavior of the masses through real-time price signals. But due to prohibitively high infrastructure costs and the need for societal-scale adoption, tangible demand response mechanisms have so far been elusive. We believe that altering the usage patterns of a multitude of data centers can be a tangible, albeit initial, step towards affecting demand response. Growing in both density and size, today's data center designs are shaped by the increasing awareness of energy costs and carbon footprint. We posit that shifting computational workloads (and thus, demand) across geographic regions to match electricity supply may help balance the grid. In this paper we first present a real grid balancing problem experienced in the Pacific Northwest. We then propose a symbiotic relationship between data centers and grid operators by showing that mutual cost benefits are attainable. Finally, we argue for a low-cost workload migration mechanism, and pose overarching challenges in designing this framework.
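The core migration decision can be sketched as follows, under assumptions not in the paper: each region reports its deferrable (migratable) load, the grid operator signals per-region imbalance, and deferrable load in deficit regions is shifted pro rata toward surplus regions. The regions and numbers are illustrative.

```python
# Per-region imbalance in MW (positive = surplus generation) and the
# deferrable datacenter load available to move. Values are invented.
surplus_mw = {"pacific_nw": +120.0, "southwest": -80.0, "midwest": -40.0}
deferrable_mw = {"pacific_nw": 10.0, "southwest": 35.0, "midwest": 20.0}

def migration_plan(surplus, deferrable):
    """Move deferrable load from deficit regions toward surplus regions."""
    moves = []
    room = sum(s for s in surplus.values() if s > 0)
    for region, s in surplus.items():
        if s < 0:                       # deficit region: shed what we can
            shed = min(deferrable[region], -s)
            for dst, ds in surplus.items():
                if ds > 0:              # fill surplus regions pro rata
                    moves.append((region, dst, shed * ds / room))
    return moves

for src, dst, mw in migration_plan(surplus_mw, deferrable_mw):
    print(f"migrate {mw:5.1f} MW of load: {src} -> {dst}")
```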


IEEE International Conference on Computer Communications | 2016

Blending on-demand and spot instances to lower costs for in-memory storage

Zichen Xu; Christopher Stewart; Nan Deng; Xiaorui Wang

In cloud computing, workloads that lease instances on demand get to execute exclusively for a set time. In contrast, workloads that lease spot instances execute until a competing workload outbids the current lease. Spot instances cost less than on-demand instances, but few workloads can use spot instances because of the variable leasing period. We present BOSS, a framework that uses spot instances to reduce costs for in-memory storage workloads. BOSS uses on-demand instances to create and update objects. It uses spot instances to handle read-only queries. BOSS leases instances from multiple sites and exploits varying prices between the sites. When spot instances stop abruptly at one site, BOSS places newly created objects at other sites, reducing the impact on response time. BOSS proposes a novel online replication approach that (1) avoids placing data at too many sites and (2) provides an O(1.5)-competitive ratio under skewed cost distributions. Within a site, BOSS manages the tradeoff between savings and risks from replicating to spot instances. We implemented BOSS on top of Cassandra and deployed it on up to 78 instances across 8 sites in Amazon and Google clouds. With BOSS hosting TPC-W data, we spent …8 per hour on Amazon. For the same service, we spent …
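The read/write split at the heart of the design admits a short sketch. The Python below routes writes to on-demand capacity and read-only queries to the cheapest live spot market, skipping sites whose spot instances have been outbid; the sites, prices, and chooser are illustrative, and BOSS's actual replication policy and competitive analysis are in the paper.

```python
# Toy site chooser for a BOSS-like read/write split. Prices, sites, and
# the revocation flag are invented for the example.
sites = {
    "us-east": {"spot": 0.03, "on_demand": 0.10, "spot_alive": True},
    "us-west": {"spot": 0.02, "on_demand": 0.11, "spot_alive": False},
    "eu-west": {"spot": 0.04, "on_demand": 0.12, "spot_alive": True},
}

def route(op: str) -> tuple[str, str]:
    """Writes (creates/updates) go to on-demand capacity for stability;
    read-only queries chase the cheapest live spot market."""
    if op == "write":
        site = min(sites, key=lambda s: sites[s]["on_demand"])
        return site, "on_demand"
    live = {s: v for s, v in sites.items() if v["spot_alive"]}
    site = min(live, key=lambda s: live[s]["spot"])
    return site, "spot"

print(route("write"))   # ('us-east', 'on_demand')
print(route("read"))    # ('us-east', 'spot'): us-west spot was outbid
```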


Network Operations and Management Symposium | 2012

Policy and mechanism for carbon-aware cloud applications

Nan Deng; Christopher Stewart; Daniel Gmach; Martin F. Arlitt


Collaboration


Dive into Christopher Stewart's collaborations.

Top Co-Authors

Kai Shen

University of Rochester

Nan Deng

Ohio State University

Zichen Xu

Ohio State University