Diana C. Roe | Researchain

Publication

Featured research published by Diana C. Roe.

Bioinformatics | 2005

Predicting protein-protein interactions using signature products

Shawn Martin; Diana C. Roe; Jean-Loup Faulon

MOTIVATION: Proteome-wide prediction of protein-protein interactions is a difficult and important problem in biology. Although there have been recent advances in both experimental and computational methods for predicting protein-protein interactions, we are only beginning to see a confluence of these techniques. In this paper, we describe a very general, high-throughput method for predicting protein-protein interactions. Our method combines a sequence-based description of proteins with experimental information that can be gathered from any type of protein-protein interaction screen. The method uses a novel description of interacting proteins by extending the signature descriptor, which has demonstrated success in predicting peptide/protein binding interactions for individual proteins. This descriptor is extended to protein pairs by taking signature products. The signature product is implemented within a support vector machine classifier as a kernel function.

RESULTS: We have applied our method to publicly available yeast, Helicobacter pylori, human and mouse datasets. We used the yeast and H. pylori datasets to verify the predictive ability of our method, achieving accuracy rates of 70 to 80% using 10-fold cross-validation. We used the human and mouse datasets to demonstrate that our method is capable of cross-species prediction. Finally, we reused the yeast dataset to explore the ability of our algorithm to predict domains.
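
The idea of extending a per-protein descriptor to protein pairs through a product kernel can be illustrated with a minimal sketch. It is not the authors' implementation: overlapping k-mer counts stand in for the signature descriptor, and the symmetrized form of the pair kernel is an assumption made here so that the order of proteins within a pair does not matter. The names `kmer_counts`, `dot`, and `pair_kernel` are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k=3):
    """Count overlapping k-mers (an assumed stand-in for a signature descriptor)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def dot(u, v):
    """Inner product of two sparse count vectors."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the smaller vector
    return sum(count * v[key] for key, count in u.items())


def pair_kernel(pair1, pair2, k=3):
    """Symmetrized product kernel between protein pairs (A, B) and (C, D).

    Computes <A,C><B,D> + <A,D><B,C>, which equals the inner product of the
    symmetrized tensor products of the per-protein descriptors.
    """
    a, b = (kmer_counts(s, k) for s in pair1)
    c, d = (kmer_counts(s, k) for s in pair2)
    return dot(a, c) * dot(b, d) + dot(a, d) * dot(b, c)
```

A Gram matrix built from `pair_kernel` over labeled interacting and non-interacting pairs could then be passed to any support vector machine implementation that accepts a precomputed kernel.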

international parallel and distributed processing symposium | 2009

Resource monitoring and management with OVIS to enable HPC in cloud computing environments

James M. Brandt; Ann C. Gentile; Jackson R. Mayo; Philippe Pierre Pebay; Diana C. Roe; David C. Thompson; Matthew H. Wong

Using the cloud computing paradigm, a host of companies promise to make huge compute resources available to users on a pay-as-you-go basis. These resources can be configured on the fly to provide the hardware and operating system of choice to the customer on a large scale. While the current target market for these resources in the commercial space is web development/hosting, this model has the lure of savings of ownership, operation, and maintenance costs, and thus sounds like an attractive solution for people who currently invest millions to hundreds of millions of dollars annually on High Performance Computing (HPC) platforms in order to support large-scale scientific simulation codes. Given the current interconnect bandwidth and topologies utilized in these commercial offerings, however, the only current viable market in HPC would be small-memory-footprint embarrassingly parallel or loosely coupled applications, which inherently require little to no inter-processor communication. While providing the appropriate resources (bandwidth, latency, memory, etc.) for the HPC community would increase the potential to enable HPC in cloud environments, this would not address the need for scalability and reliability, crucial to HPC applications. Providing for these needs is particularly difficult in commercial cloud offerings where the number of virtual resources can far outstrip the number of physical resources, the resources are shared among many users, and the resources may be heterogeneous. Advanced resource monitoring, analysis, and configuration tools can help address these issues, since they bring the ability to dynamically provide and respond to information about the platform and application state and would enable more appropriate, efficient, and flexible use of the resources key to enabling HPC. Additionally such tools could be of benefit to non-HPC cloud providers, users, and applications by providing more efficient resource utilization in general.

international conference on cluster computing | 2009

Numerically stable, single-pass, parallel statistics algorithms

Janine C. Bennett; Ray W. Grout; Philippe Pierre Pebay; Diana C. Roe; David C. Thompson

Statistical analysis is widely used for countless scientific applications in order to analyze and infer meaning from data. A key challenge of any statistical analysis package aimed at large-scale, distributed data is to address the orthogonal issues of parallel scalability and numerical stability. In this paper we derive a series of formulas that allow for single-pass, yet numerically robust, pairwise parallel and incremental updates of both arbitrary-order centered statistical moments and co-moments. Using these formulas, we have built an open source parallel statistics framework that performs principal component analysis (PCA) in addition to computing descriptive, correlative, and multi-correlative statistics. The results of a scalability study demonstrate numerically stable, near-optimal scalability on up to 128 processes and results are presented in which the statistical framework is used to process large-scale turbulent combustion simulation data with 1500 processes.
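
A minimal sketch of the kind of update the abstract describes, restricted to second order (mean, variance, covariance) rather than the arbitrary-order moments derived in the paper. It uses the well-known incremental and pairwise-merge formulas for centered sums; the class name `Moments` and its fields are hypothetical and do not reflect the framework's API.

```python
from dataclasses import dataclass


@dataclass
class Moments:
    n: int = 0
    mean_x: float = 0.0
    mean_y: float = 0.0
    m2_x: float = 0.0  # sum of squared deviations of x from its mean
    c_xy: float = 0.0  # sum of products of deviations of x and y

    def update(self, x, y):
        """Incremental single-observation update."""
        self.n += 1
        dx = x - self.mean_x          # deviation from the old mean of x
        self.mean_x += dx / self.n
        self.mean_y += (y - self.mean_y) / self.n
        self.m2_x += dx * (x - self.mean_x)   # old deviation times new deviation
        self.c_xy += dx * (y - self.mean_y)   # old x deviation times new y deviation


def merge(a, b):
    """Pairwise combination of two partial results, e.g. from two processes."""
    n = a.n + b.n
    if n == 0:
        return Moments()
    dx = b.mean_x - a.mean_x
    dy = b.mean_y - a.mean_y
    w = a.n * b.n / n
    return Moments(
        n=n,
        mean_x=a.mean_x + dx * b.n / n,
        mean_y=a.mean_y + dy * b.n / n,
        m2_x=a.m2_x + b.m2_x + dx * dx * w,
        c_xy=a.c_xy + b.c_xy + dx * dy * w,
    )
```

Each process makes one pass over its own data with `update`, and the partial results are reduced with `merge`; the variance is `m2_x / (n - 1)` and the covariance is `c_xy / (n - 1)`. Working with deviations from the running mean avoids the cancellation that affects the naive sum-of-squares formula.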

Proceedings of the 2009 workshop on Resiliency in high performance | 2009

Methodologies for advance warning of compute cluster problems via statistical analysis: a case study

Jim M. Brandt; Ann C. Gentile; Jackson R. Mayo; Philippe Pierre Pebay; Diana C. Roe; David C. Thompson; Matthew H. Wong

The ability to predict impending failures (hardware or software) on large-scale high performance compute (HPC) platforms, augmented by checkpoint mechanisms, could drastically increase the scalability of applications and the efficiency of platforms. In this paper we present our findings to date, and the methodologies employed, in our search for reliable, advance indicators of failures on a 288-node, 4608-core, Opteron-based cluster in production use at Sandia National Laboratories. In support of this effort we have deployed OVIS, a Sandia-developed scalable HPC monitoring, analysis, and visualization tool designed for this purpose. We demonstrate that, for a particular error case, statistical analysis using OVIS would enable advance warning of cluster problems on timescales that would allow application and system administrator response ahead of errors, subsequent system error log reporting, and job failures. This is significant because the utility of detecting such indicators depends on how far in advance of failure they can be recognized and how reliable they are.

Bioinformatics | 2005

Computational approaches for identification of conserved/unique binding pockets in the A chain of ricin

Carol L. Ecale Zhou; Adam Zemla; Diana C. Roe; Malin Young; Marisa Lam; Joseph S. Schoeniger; Rod Balhorn

MOTIVATION: Specific and sensitive ligand-based protein detection assays that employ antibodies or small molecules such as peptides or aptamers require that the corresponding surface region of the protein be accessible and that there be minimal cross-reactivity with non-target proteins. To reduce the time and cost of laboratory screening efforts for diagnostic reagents, we developed new methods for evaluating and selecting protein surface regions for ligand targeting.

RESULTS: We devised combined structure- and sequence-based methods for identifying 3D epitopes and binding pockets on the surface of the A chain of ricin that are conserved with respect to a set of ricin A chains and unique with respect to other proteins. We (1) used structure alignment software to detect structural deviations and extracted from this analysis the residue-residue correspondence, (2) devised a method to compare corresponding residues across sets of ricin structures and structures of closely related proteins, (3) devised a sequence-based approach to determine residue infrequency in local sequence context and (4) modified a pocket-finding algorithm to identify surface crevices in close proximity to residues determined to be conserved/unique based on our structure- and sequence-based methods. In applying this combined informatics approach to ricin A, we identified a conserved/unique pocket in close proximity to (but not overlapping) the active site that is suitable for bi-dentate ligand development. These methods are generally applicable to identification of surface epitopes and binding pockets for development of diagnostic reagents, therapeutics and vaccines.

grid computing | 2010

Using Cloud Constructs and Predictive Analysis to Enable Pre-Failure Process Migration in HPC Systems

James M. Brandt; Frank Xiaoxiao Chen; Vincent De Sapio; Ann C. Gentile; Jackson R. Mayo; Philippe Pierre Pebay; Diana C. Roe; David C. Thompson; Matthew H. Wong

Accurate failure prediction in conjunction with efficient process migration facilities including some Cloud constructs can enable failure avoidance in large-scale high performance computing (HPC) platforms. In this work we demonstrate a prototype system that incorporates our probabilistic failure prediction system with virtualization mechanisms and techniques to provide a whole system approach to failure avoidance. This work utilizes a failure scenario based on a real-world HPC case study.

ieee international symposium on parallel distributed processing workshops and phd forum | 2010

Combining virtualization, resource characterization, and resource management to enable efficient high performance compute platforms through intelligent dynamic resource allocation

James M. Brandt; Frank Xiaoxiao Chen; V. De Sapio; Ann C. Gentile; Jackson R. Mayo; Philippe Pierre Pebay; Diana C. Roe; David C. Thompson; Matthew H. Wong

Improved resource utilization and fault tolerance of large-scale HPC systems can be achieved through fine-grained, intelligent, and dynamic resource (re)allocation. We explore components and enabling technologies applicable to creating a system to provide this capability: specifically 1) Scalable fine-grained monitoring and analysis to inform resource allocation decisions, 2) Virtualization to enable dynamic reconfiguration, 3) Resource management for the combined physical and virtual resources and 4) Orchestration of the allocation, evaluation, and balancing of resources in a dynamic environment. We discuss both general and HPC-centric issues that impact the design of such a system. Finally, we present our prototype system, giving both design details and examples of its application in real-world scenarios.

dependable systems and networks | 2010

Quantifying effectiveness of failure prediction and response in HPC systems: Methodology and example

James M. Brandt; Frank Xiaoxiao Chen; Vincent De Sapio; Ann C. Gentile; Jackson R. Mayo; Philippe Pierre Pebay; Diana C. Roe; David C. Thompson; Matthew H. Wong

Effective failure prediction and mitigation strategies in high-performance computing systems could provide huge gains in resilience of tightly coupled large-scale scientific codes. These gains would come from prediction-directed process migration and resource servicing, intelligent resource allocation, and checkpointing driven by failure predictors rather than at regular intervals based on nominal mean time to failure. Given probabilistic associations of outlier behavior in hardware-related metrics with eventual failure in hardware, system software, and/or applications, this paper explores approaches for quantifying the effects of prediction and mitigation strategies and demonstrates these using actual production system data. We describe context-relevant methodologies for determining the accuracy and cost-benefit of predictors.
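
One simple way to score a failure predictor against logged failures is sketched below. It is an illustration only, not the methodology of the paper: the lead-time window, the flat per-action cost model, and the names `evaluate_predictor` and `net_benefit` are all assumptions made here.

```python
def evaluate_predictor(warnings, failures, lead_min, lead_max):
    """Score warning times against failure times (same time units).

    A warning at time w counts as a hit if some failure f satisfies
    lead_min <= f - w <= lead_max, i.e. it arrives early enough to act on
    but not so early that it is unrelated. Returns (precision, recall).
    """
    def matches(w, f):
        return lead_min <= f - w <= lead_max

    hits = sum(1 for w in warnings if any(matches(w, f) for f in failures))
    caught = sum(1 for f in failures if any(matches(w, f) for w in warnings))
    precision = hits / len(warnings) if warnings else 0.0
    recall = caught / len(failures) if failures else 0.0
    return precision, recall


def net_benefit(n_caught, n_false_alarms, saved_per_catch, cost_per_action):
    """Assumed flat cost model: every warning triggers one mitigation action."""
    return n_caught * saved_per_catch - (n_caught + n_false_alarms) * cost_per_action
```

For example, with warnings at times 10, 50 and 90, failures at 20, 100 and 200, and a lead window of 5 to 15 time units, two of three warnings are hits and two of three failures are caught. The `net_benefit` helper makes the trade-off explicit: a predictor with many false alarms can cost more in unnecessary migrations or checkpoints than it saves.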

international conference on parallel processing | 2011

Framework for enabling system understanding

Jim M. Brandt; Frank Xiaoxiao Chen; Ann C. Gentile; Chokchai Leangsuksun; Jackson R. Mayo; Philippe Pierre Pebay; Diana C. Roe; Narate Taerat; David C. Thompson; Matthew H. Wong

Building the effective HPC resilience mechanisms required for viability of next-generation supercomputers will require in-depth understanding of system and component behaviors. Our goal is to build an integrated framework for high-fidelity, long-term information storage, historic and run-time analysis, and algorithmic and visual information exploration, in order to enable system understanding, timely failure detection/prediction, and triggering of appropriate responses to failure situations. Since it is unknown what information is relevant, and since potentially relevant data may be expressed in a variety of forms (e.g., numeric, textual), this framework must provide capabilities to process different forms of data and also support the integration of new data, data sources, and analysis capabilities. Further, in order to ensure ease of use as capabilities and data sources expand, it must also provide interactivity between its elements. This paper describes our integration of the capabilities mentioned above into our OVIS tool.

Archive | 2009

Scalable Multi-Correlative Statistics and Principal Component Analysis with Titan

David C. Thompson; Janine C. Bennett; Diana C. Roe; Philippe Pierre Pebay

This report summarizes existing statistical engines in VTK/Titan and presents the recently parallelized multi-correlative and principal component analysis engines. It is a sequel to [PT08], which studied the parallel descriptive and correlative engines. The ease of use of these parallel engines is illustrated by means of C++ code snippets. Furthermore, this report justifies the design of these engines with parallel scalability in mind; this theoretical property is then verified with test runs that demonstrate optimal parallel speed-up with up to 200 processors.
