Karen Appleby | Researchain

Publication

Featured research published by Karen Appleby.

Integrated Network Management | 2001

Oceano - SLA based management of a computing utility

Karen Appleby; Sameh A. Fakhouri; Liana Fong; Germán S. Goldszmidt; Michael H. Kalantar; Srirama Mandyam Krishnakumar; Donald P. Pazel; John Arthur Pershing; Benny Rochwerger

Oceano is a prototype of a highly available, scalable, and manageable infrastructure for an e-business computing utility. It enables multiple customers to be hosted on a collection of sequentially shared resources. The hosting environment is divided into secure domains, each supporting one customer. These domains are dynamic: the resources assigned to them may be augmented when the load increases and reduced when the load dips. This dynamic resource allocation enables flexible service level agreements (SLAs) with customers in an environment where peak loads are an order of magnitude greater than the normal steady state.

IBM Systems Journal | 2004

Using a utility computing framework to develop utility systems

Tamar Eilam; Karen Appleby; Jochen Breh; Gerd Breiter; Harald Daur; Sameh A. Fakhouri; Guerney D. H. Hunt; Tan Lu; Sandra D. Miller; Lily B. Mummert; John Arthur Pershing; Hendrik Wagner

In this paper we describe a utility computing framework, consisting of a component model, a methodology, and a set of tools and common services for building utility computing systems. This framework facilitates the creation of new utility computing systems by providing a set of common functions, as well as a set of standard interfaces for those components that are specialized. It also provides a methodology and tools to assemble and re-use resource provisioning and management functions used to support new services with possibly different requirements. We demonstrate the benefits of the framework by describing two sample systems: a life-science utility computing service designed and implemented using the framework, and an on-line gaming utility computing service designed in compliance with the framework.

Integrated Network Management | 2001

Yemanja - a layered event correlation engine for multi-domain server farms

Karen Appleby; Germán S. Goldszmidt; Malgorzata Steinder

Yemanja is a model-based event correlation engine for multi-layer fault diagnosis. It targets complex propagating fault scenarios, and can smoothly correlate low-level network events with high-level application performance alerts related to quality-of-service violations. Entity-models that represent devices or abstract components encapsulate entity behavior. Distantly associated entities are not explicitly aware of each other, and communicate through event propagation chains. Yemanja's state-based engine supports generic scenario definitions, prioritization of alternate solutions, integrated problem-state and device testing, and simultaneous analysis of overlapping problems. The system of correlation rules was developed based on device, layer, and dependency analysis, and reveals the layered structure of computer networks. The primary objectives of this research include the development of reusable, configuration-independent correlation scenarios; adaptability and extensibility of the engine to match the constantly changing topology of a multi-domain server farm; and the development of a concise specification language that is relatively simple yet powerful.

Distributed Systems Operations and Management | 2004

Problem Determination Using Dependency Graphs and Run-Time Behavior Models

Manoj K. Agarwal; Karen Appleby; Manish Gupta; Gautam Kar; Anindya Neogi; Anca Sailer

Key challenges in managing an I/T environment for e-business lie in the areas of root cause analysis, proactive problem prediction, and automated problem remediation. Our approach, as reported in this paper, uses two important concepts, dependency graphs and the dynamic runtime performance characteristics of the resources that comprise an I/T environment, to design algorithms for rapid root cause identification when problems occur. When a problem is reported, our approach uses the dependency information and the behavior models to narrow down the root cause to a small set of resources that can be individually tested, which facilitates quick remediation and reduces administrative costs.
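The narrowing step described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the `depends_on` mapping, the `is_anomalous` predicate (standing in for a behavior model), and the resource names are all hypothetical. The sketch walks the dependency graph from the reported resource and keeps only anomalous resources whose own dependencies look healthy.

```python
from collections import deque

def candidate_root_causes(depends_on, reported, is_anomalous):
    """Return anomalous resources reachable from `reported` whose
    direct dependencies are all healthy (hypothetical heuristic)."""
    # Breadth-first walk over everything the reported resource depends on.
    seen = {reported}
    queue = deque([reported])
    while queue:
        node = queue.popleft()
        for dep in depends_on.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    # Keep anomalous nodes that cannot blame one of their own dependencies.
    candidates = set()
    for node in seen:
        if is_anomalous(node) and not any(
            is_anomalous(dep) for dep in depends_on.get(node, ())
        ):
            candidates.add(node)
    return candidates

# Hypothetical topology and behavior-model verdicts.
depends_on = {
    "checkout": ["app_server", "auth"],
    "app_server": ["database", "cache"],
    "auth": ["database"],
}
anomalous = {"checkout", "app_server", "database"}
result = candidate_root_causes(depends_on, "checkout", anomalous.__contains__)
print(result)
```

Here the three anomalous resources reduce to a single candidate that can then be tested individually.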

Journal of Network and Systems Management | 2002

Yemanja—A Layered Fault Localization System for Multi-Domain Computing Utilities

Karen Appleby; Germán S. Goldszmidt; Malgorzata Steinder

Yemanja is a model-based event correlation engine for multi-layer fault diagnosis. It targets complex propagating fault scenarios, and can smoothly correlate low-level network events with high-level application performance alerts related to quality-of-service violations. Entity-models that represent devices or abstract components encapsulate their behavior. Distantly associated entity-models are not explicitly aware of each other, and communicate through internal event chains. Yemanja's state-based engine supports generic scenario definitions, prioritization of alternate solutions, integrated problem and device testing, and simultaneous analysis of overlapping problems. The system of correlation rules was developed based on the analysis of device and layer functions, and the dependencies among physical and abstract system components. The primary objectives of this research include the development of reusable, configuration-independent correlation scenarios; adaptability and extensibility of the engine to match the constantly changing topology of a multi-domain server farm; and the development of a concise specification language that is relatively simple yet powerful.

Cluster Computing and the Grid | 2002

Neptune: A Dynamic Resource Allocation and Planning System for a Cluster Computing Utility

Donald P. Pazel; Tamar Eilam; Liana L. Fong; Michael H. Kalantar; Karen Appleby; Germán S. Goldszmidt

We present Neptune, the resource director of Océano, a policy-driven fabric management system that dynamically reconfigures resources in a computing utility cluster. Neptune implements an on-line control mechanism subject to policy-based performance and resource configuration objectives. Neptune reassigns servers and bandwidth among a set of service domains, based on pre-defined policy, in response to workload changes. It builds and executes a reconfiguration plan through a planning framework, breaking reconfiguration objectives into individual tasks delegated to a set of lower-level resource managers. We describe an example decision policy algorithm that we implemented and demonstrated in an 80-server multi-domain computing utility.
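As a rough illustration of turning a policy objective into individual reconfiguration tasks, the sketch below plans server moves between service domains. It is an assumption-laden toy, not Neptune's decision policy: the per-server load target, the `plan_reallocation` function, and the task tuples are all hypothetical.

```python
import math

def plan_reallocation(domains, target_per_server, free_pool=0):
    """domains maps name -> (servers, load). Returns a list of
    (action, domain, count) tasks under a hypothetical policy that
    sizes each domain to ceil(load / target_per_server) servers."""
    needs = {}
    for name, (servers, load) in domains.items():
        required = max(1, math.ceil(load / target_per_server))
        needs[name] = required - servers  # > 0 short, < 0 surplus
    tasks = []
    available = free_pool
    # First release surplus servers so they can be reassigned.
    for name in sorted(needs):
        if needs[name] < 0:
            tasks.append(("release", name, -needs[name]))
            available += -needs[name]
    # Then serve the most under-provisioned domains first.
    for name in sorted(needs, key=lambda n: -needs[n]):
        if needs[name] > 0 and available > 0:
            grant = min(needs[name], available)
            tasks.append(("assign", name, grant))
            available -= grant
    return tasks

# Hypothetical workload: domain A is idle, domain B is overloaded.
tasks = plan_reallocation({"A": (4, 100), "B": (2, 350)}, target_per_server=100)
print(tasks)
```

In a real system each task would be handed to a lower-level resource manager; here the plan is simply returned as data.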

Integrated Network Management | 2005

Threshold management for problem determination in transaction based e-commerce systems

Karen Appleby; J. Faik; Gautam Kar; Anca Sailer; Manoj K. Agarwal; Anindya Neogi

Managing service level objectives (SLOs) in environments that implement e-commerce systems is a challenging task. It typically requires a clear understanding of how user transactions are supported by the I/T resources that comprise the e-commerce system. This paper investigates a subset of this important management problem. Using transaction-to-resource dependencies, the authors show how one can experimentally calculate the extent to which supporting resources for a transaction contribute to the end-to-end SLOs for that transaction. An important aspect of this process is the classification of user transactions, based on the profile of their resource usage, enabling one to set appropriate thresholds for different classes. This approach is then used to aid the detection and remediation of application performance bottlenecks.
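One simple way to picture per-resource thresholds derived from an end-to-end SLO is proportional apportioning. The sketch below assumes, purely for illustration, that each resource's share of the SLO equals its observed share of transaction time; the paper's actual calculation is not reproduced here, and `component_thresholds` and the sample data are hypothetical.

```python
def component_thresholds(samples, slo_seconds):
    """samples: list of dicts mapping resource -> seconds spent for
    one transaction class. Splits the end-to-end SLO across resources
    in proportion to their observed time share (assumed heuristic)."""
    totals = {}
    for sample in samples:
        for resource, seconds in sample.items():
            totals[resource] = totals.get(resource, 0.0) + seconds
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("samples contain no measured time")
    return {r: slo_seconds * t / grand_total for r, t in totals.items()}

# Hypothetical measurements for one transaction class.
samples = [
    {"web": 0.1, "app": 0.3, "db": 0.6},
    {"web": 0.1, "app": 0.5, "db": 0.4},
]
thresholds = component_thresholds(samples, slo_seconds=4.0)
print(thresholds)
```

Running this per transaction class gives each class its own set of thresholds, in the spirit of the classification step the abstract mentions.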

Integrated Network Management | 2005

Using automatically derived load thresholds to manage compute resources on-demand

Karen Appleby; Germán S. Goldszmidt

Dynamic computing environments that support a changing application set and load are now a reality. Within these environments, the decision to add or remove resources must be made in a timely manner while avoiding excessive resource rebalancing. When the overhead incurred by the resource reallocation process is significant, we should be confident that additional resources are necessary before initiating an allocation process. This process is complicated by the instability of application content and frequent changes in average request processing time. Load and response time thresholds need to be dynamically and automatically adjusted if they are to remain effective. We investigate automated methods for selecting and setting thresholds, estimating load, and analyzing response time. Specifically, these methods estimate the load on a set of application servers, and set appropriate resource allocation thresholds based on the relationship between projected site response times and server load. We describe a solution to the problem of determining the maximum load over multiple servers under changing conditions, which include both changing traffic and application characteristics.
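The idea of deriving a load threshold from the relationship between response time and load can be sketched as follows. This is a deliberately simplified illustration, not the method from the paper: it assumes a linear relationship fitted by ordinary least squares (real response-time curves are usually nonlinear near saturation), and `load_threshold` and the sample points are hypothetical.

```python
def load_threshold(samples, target_response_time):
    """samples: list of (load, response_time) observations.
    Fits response_time = intercept + slope * load by least squares and
    returns the load at which the fitted line reaches the target."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    if sxx == 0 or sxy <= 0:
        raise ValueError("response time does not increase with load")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return (target_response_time - intercept) / slope

# Hypothetical recent observations: (requests per second, seconds).
recent = [(10, 0.2), (20, 0.3), (30, 0.4), (40, 0.5)]
threshold = load_threshold(recent, target_response_time=0.8)
print(round(threshold, 2))
```

Refitting on a sliding window of recent observations is one way such a threshold could track changing application characteristics.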

Archive | 2008