Joefon Jann
IBM
Publication
Featured research published by Joefon Jann.
job scheduling strategies for parallel processing | 1997
Joefon Jann; Pratap Pattnaik; Hubertus Franke; Fang Wang; Joseph Skovira; Joseph Riordan
In this paper we have characterized the inter-arrival time and service time distributions for jobs at a large MPP supercomputing center. Our findings show that the distributions are dispersive and complex enough that they require Hyper Erlang distributions to capture the first three moments of the observed workload. We also present the parameters from the characterization so that they can be easily used for both theoretical studies and the simulations of various scheduling algorithms.
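As a rough illustration of the kind of distribution the abstract refers to, the sketch below treats a hyper-Erlang distribution as a probabilistic mixture of Erlang branches and computes its first three raw moments, alongside a simple sampler. The function names, the `(probability, order, rate)` branch layout, and the example parameters are assumptions made for illustration; they are not the parameters reported in the paper.

```python
import random

def hyper_erlang_moment(branches, n):
    """Raw n-th moment of a hyper-Erlang mixture.

    branches: list of (probability, order k, rate) tuples (hypothetical layout).
    An Erlang(k, rate) variable has E[X^n] = k(k+1)...(k+n-1) / rate^n,
    and the mixture moment is the probability-weighted sum over branches.
    """
    total = 0.0
    for p, k, rate in branches:
        rising = 1.0
        for j in range(n):
            rising *= (k + j)
        total += p * rising / rate ** n
    return total

def sample_hyper_erlang(branches, rng):
    """Draw one sample: pick a branch, then sum k exponential stages."""
    u = rng.random()
    acc = 0.0
    for p, k, rate in branches:
        acc += p
        if u <= acc:
            return sum(rng.expovariate(rate) for _ in range(k))
    # Guard against rounding when the probabilities sum to slightly under 1.
    p, k, rate = branches[-1]
    return sum(rng.expovariate(rate) for _ in range(k))

# Made-up example: two branches of order 2 with very different rates,
# which gives the high dispersion typical of heavy-tailed workloads.
example = [(0.3, 2, 1.0), (0.7, 2, 0.1)]
moments = [hyper_erlang_moment(example, n) for n in (1, 2, 3)]
```

In a simulation study, samples drawn this way could stand in for job inter-arrival or service times once the branch parameters have been fitted to the observed moments.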
conference on high performance computing (supercomputing) | 1999
Hubertus Franke; Joefon Jann; José E. Moreira; Pratap Pattnaik; Morris A. Jette
In this paper we analyze the behavior of a gang-scheduling system that we are developing for the ASCI Blue-Pacific machines. Starting with a real workload obtained from job logs of one of the ASCI machines, we generate a statistical model of this workload using Hyper Erlang distributions. We then vary the parameters of those distributions to generate various workloads, representative of different operating points of the machine. Through simulation we obtain performance characteristics for three different scheduling strategies: (i) first-come first-serve, (ii) gang-scheduling, and (iii) backfilling. Our results show that both backfilling and gang-scheduling with moderate multiprogramming levels are much more effective than simple first-come first-serve scheduling. In addition, we show that gang-scheduling can display better performance characteristics than backfilling, particularly for large production jobs.
workshop on software and performance | 2010
Kaoutar El Maghraoui; Gokul B. Kandiraju; Joefon Jann; Pratap Pattnaik
Solid-State Disks (SSDs) made out of Flash devices have gained a lot of prominence in recent years due to their increasing performance and endurance. A number of mechanisms are being proposed to improve the performance and reliability of these devices from technological and operating system perspectives, to integrate them into personal computers and enterprise systems. Most such proposals are being implemented and evaluated directly on top of these SSDs and require a sophisticated framework and infrastructure for thorough performance evaluation. On the other hand, to our knowledge, very little has been done on modeling Flash devices and building efficient Flash simulators that can be used to simulate SSDs. Such models and simulators can give insights to make design decisions, save cumbersome setup and implementation work, reduce hardware costs, and allow researchers to focus on the real methods that are being proposed. This paper presents a linear model for NAND-based Flash devices based on the internal architecture of these devices. Parameters of the model are presented along with micro-benchmarks that can be used to extract these parameters. The model is validated on the STEC Zeus Flash SSD and extracted parameters are used to build a Flash simulator as a kernel extension in the AIX operating system. A key feature of the simulator is that it simulates I/O requests by maintaining minimal state information and is independent of the internal organization of a Flash SSD. The simulator is validated using commercial and raw-IO applications through experimentation on the simulator and real Flash disks.
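The abstract does not spell out the form of the linear model or its parameters, so the following is only a generic sketch of the idea: assume request latency is a fixed overhead plus a per-byte cost, and recover both from micro-benchmark measurements by ordinary least squares. The function names and the measurement values are hypothetical and are not taken from the paper.

```python
def fit_linear_latency(sizes, latencies):
    """Least-squares fit of latency = overhead + per_byte * size.

    sizes: request sizes in bytes; latencies: measured times in microseconds.
    Returns (overhead, per_byte). Both names are illustrative assumptions.
    """
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(latencies) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, latencies))
    per_byte = sxy / sxx
    overhead = mean_y - per_byte * mean_x
    return overhead, per_byte

def predict_latency(overhead, per_byte, size):
    """Predicted latency for one request of the given size."""
    return overhead + per_byte * size

# Synthetic "micro-benchmark" data, made up for this example only.
sizes = [4096, 8192, 16384, 32768]
latencies = [100 + 0.01 * s for s in sizes]
overhead, per_byte = fit_linear_latency(sizes, latencies)
```

A simulator built on such a model would only need the fitted parameters to estimate the service time of each request, which is in the spirit of keeping minimal state.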
ieee international symposium on workload characterization | 2008
Priya Nagpurkar; William P. Horn; U. Gopalakrishnan; Niteesh Dubey; Joefon Jann; Pratap Pattnaik
Web 2.0 represents the evolution of the web from a source of information to a platform. Network advances have permitted users to migrate from desktop applications to so-called Rich Internet Applications (RIAs) characterized by thin clients, which are browser-based and store their state on managed servers. Other Web 2.0 technologies have enabled users to more easily participate, collaborate, and share in web-based communities. With the emergence of wikis, blogs, and social networking, users are no longer only consumers; they become contributors to the collective knowledge accessible on the web. In another Web 2.0 development, content aggregation is moving from portal-based technologies to more sophisticated so-called mashups where aggregation capabilities are greatly expanded. While Web 2.0 has generated a great deal of interest and discussion, there has not been much work on analyzing these emerging workloads. This paper presents a detailed characterization of several applications that exploit Web 2.0 technologies, running on an IBM Power5 system, with the goal of establishing whether the server-side workloads generated by Web 2.0 applications are significantly different from traditional web workloads, and whether they present new challenges to underlying systems. Specifically, we characterize three Web 2.0 workloads and, for comparison, a synthetic benchmark representing commercial workloads that do not exploit Web 2.0.
international parallel and distributed processing symposium | 2012
Justin R. Funston; Kaoutar El Maghraoui; Joefon Jann; Pratap Pattnaik; Alexandra Fedorova
Simultaneous multithreading (SMT) increases CPU utilization and application performance in many circumstances, but it can be detrimental when performance is limited by application scalability or when there is significant contention for CPU resources. This paper describes an SMT-selection metric that predicts the change in application performance when the SMT level and number of application threads are varied. This metric is obtained online through hardware performance counters with little overhead, and allows the application or operating system to dynamically choose the best SMT level. We have validated the SMT-selection metric using a variety of benchmarks that capture various application characteristics on two different processor architectures. Our results show that the SMT-selection metric is capable of predicting the best SMT level for a given workload in 90% of the cases. The paper also shows that such a metric can be used with a scheduler or application optimizer to help guide its optimization decisions.
international symposium on performance analysis of systems and software | 2003
Joefon Jann; Pratap Pattnaik; Niteesh Dubey; Ramanjaneya Sarma Burugula
In recent years, several large UNIX SMP Servers have added support for dynamic resource management through partitioning and dynamic resource reconfiguration. In this paper we study the ability of Dynamic Reconfiguration (DR) to accommodate fluctuating workloads and changes in operational priorities for a commercial web application. We use a WebSphere HTTP server, a WebSphere Application Server, and a DB2 Database for the application. This combination represents a popular platform for commercial computing deployments, and supports a number of common web-based application scenarios. In our study, we treat this application as a black box to provide a realistic measurement of the efficacy of the DR technology in UNIX Servers. We also use nonparametric estimation techniques to obtain an ab initio and unbiased study of the jitters in our experimental data. Our main conclusions are: (1) Resource allocations for the application (even for a complex and function-rich middleware system such as WebSphere) can be efficiently managed by DR, without the need for explicit accommodation of the DR features by the application, and (2) To obtain efficient resource utilization, the resource management system has to empirically monitor the throughput obtained from the application, rather than rely primarily on long time-scale estimations.
computing frontiers | 2016
Kattamuri Ekanadham; William P. Horn; Manoj Kumar; Joefon Jann; José E. Moreira; Pratap Pattnaik; Mauricio J. Serrano; Gabriel Tanase; Hao Yu
Graph processing is becoming a crucial component for analyzing big data arising in many application domains such as social and biological networks, fraud detection, and sentiment analysis. As a result, a number of computational models for graph analytics have been proposed in the literature to help users write efficient large scale graph algorithms. In this paper we present an alternative model for implementing graph algorithms using a linear algebra based specification. We first specify a set of linear algebra primitives that allows users to express graph algorithms by composition of linear algebra operations. We then describe a high performance implementation of these primitives and its integration with the Spark framework to achieve the scalability we need for large shared-memory systems. We provide an overview of our implementation and also compare and contrast the expressiveness and performance of various algorithms implemented with our approach with that of the current Spark GraphX implementation of those algorithms.
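To make the idea of expressing a graph algorithm as a composition of linear algebra operations concrete, here is a minimal sketch of breadth-first search written as repeated matrix-vector products over a Boolean (OR, AND) semiring. It is a generic textbook formulation in plain Python, not the primitives or the implementation described in the paper; the function name and the dense adjacency-matrix layout are assumptions chosen for brevity.

```python
def bfs_levels(adj, source):
    """BFS levels via repeated Boolean matrix-vector products.

    adj: dense adjacency matrix, adj[i][j] is True if there is an edge i -> j.
    Returns a list where entry j is the BFS depth of vertex j, or -1 if
    vertex j is unreachable from source.
    """
    n = len(adj)
    levels = [-1] * n
    levels[source] = 0
    frontier = [False] * n
    frontier[source] = True
    depth = 0
    while any(frontier):
        depth += 1
        nxt = [False] * n
        for j in range(n):
            # Masked product: only unvisited vertices may join the frontier.
            if levels[j] == -1:
                nxt[j] = any(frontier[i] and adj[i][j] for i in range(n))
        for j in range(n):
            if nxt[j]:
                levels[j] = depth
        frontier = nxt
    return levels

# Small example graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3; vertex 4 is isolated.
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
adj = [[False] * 5 for _ in range(5)]
for u, v in edges:
    adj[u][v] = True
levels = bfs_levels(adj, 0)  # [0, 1, 1, 2, -1]
```

A production implementation would use sparse matrices and optimized kernels; the dense loops here only show how the traversal reduces to a product followed by a mask.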
Operating Systems Review | 2008
Joefon Jann; R. Sarma Burugula; Niteesh Dubey; Pratap Pattnaik
This paper investigates the changes in AIX behavior, or the lack of them, and the resulting performance impact from a generational change in servers in a typical large scale eCommerce application environment without extensive tuning of the OS and the application stack for the changing hardware. We have investigated the performance and impediments to performance at the microprocessor level and at the OS level. This paper dissects the performance data as observed from the OS and from hardware performance counters, and suggests areas for further improvements.
Software - Practice and Experience | 2004
Joefon Jann; Niteesh Dubey; Ramanjaneya Sarma Burugula; Pratap Pattnaik
This paper studies the effects that dynamic reconfiguration (DR) has on a WebSphere workload while CPUs are dynamically added to and removed from the underlying AIX instance. DR is a new technology available in AIX 5.2. This study shows that the resource allocations for a complex and function‐rich middleware system such as WebSphere can be efficiently and dynamically managed by the DR technology, without WebSphere having to explicitly accommodate for the DR features of the operating system.
International Journal of Parallel Programming | 2018
William P. Horn; Manoj Kumar; Joefon Jann; José E. Moreira; Pratap Pattnaik; Mauricio J. Serrano; Gabriel Tanase; Hao Yu
Graph processing is becoming a crucial component for analyzing big data arising in many application domains such as social and biological networks, fraud detection, and sentiment analysis. As a result, a number of computational models for graph analytics have been proposed in the literature to help users write efficient large scale graph algorithms. In this paper we present an alternative model for implementing graph algorithms using a linear algebra based specification. We first specify a set of linear algebra primitives that allows users to express graph algorithms by composition of linear algebra operations. We then describe a high performance implementation of these primitives using C