Publication


Featured research published by Michael R. Leuze.


Glycobiology | 2010

CAZymes Analysis Toolkit (CAT): Web service for searching and analyzing carbohydrate-active enzymes in a newly sequenced organism using CAZy database

Byung H. Park; Tatiana V. Karpinets; Mustafa H Syed; Michael R. Leuze; Edward C. Uberbacher

The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire nonredundant sequences of the CAZy database. The second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit, and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.
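The second annotation approach rests on association rules that link protein family domains to CAZy families. The following is a minimal sketch of that idea, not the CAT implementation: it assumes single-domain antecedents, and the function name `domain_family_rules`, the thresholds, and the labels `domA`, `famX`, etc. are hypothetical placeholders rather than real database identifiers.

```python
from collections import Counter

def domain_family_rules(records, min_support=2, min_confidence=0.8):
    """Derive 'domain -> family' rules from annotated sequences.

    records: iterable of (set_of_domains, set_of_families), one per sequence.
    Returns {(domain, family): (support_count, confidence)}.
    """
    domain_counts = Counter()   # sequences containing each domain
    pair_counts = Counter()     # sequences containing domain AND family
    for domains, families in records:
        for d in domains:
            domain_counts[d] += 1
            for f in families:
                pair_counts[(d, f)] += 1

    rules = {}
    for (d, f), n in pair_counts.items():
        confidence = n / domain_counts[d]   # estimate of P(family | domain)
        if n >= min_support and confidence >= min_confidence:
            rules[(d, f)] = (n, confidence)
    return rules

# Toy data with hypothetical labels (not real domain or family IDs).
toy_records = [
    ({"domA"}, {"famX"}),
    ({"domA", "domB"}, {"famX", "famY"}),
    ({"domB"}, {"famY"}),
    ({"domA"}, {"famX"}),
]
rules = domain_family_rules(toy_records)
for (domain, family), (support, confidence) in sorted(rules.items()):
    print(f"{domain} -> {family}: support={support}, confidence={confidence:.2f}")
```

A new sequence carrying a domain on the left-hand side of a retained rule could then be tentatively assigned the family on the right-hand side; the thresholds trade sensitivity against specificity.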


International Parallel and Distributed Processing Symposium | 2004

High performance computational tools for Motif discovery

Nicole Baldwin; Rebecca L. Collins; Michael A. Langston; Christopher T. Symons; Michael R. Leuze; Brynn H. Voy

Summary form only given. We highlight a fruitful interplay between biology and computation. The sequencing of complete genomes from multiple organisms has revealed that most differences in organism complexity are due to elements of gene regulation that reside in the non-protein-coding portions of genes. Both within and between species, transcription factor binding sites and the proteins that recognize them govern the activity of cellular pathways that mediate adaptive responses and survival. Experimental identification of these regulatory elements is by nature a slow process. The availability of complete genomic sequences, however, opens the door for computational methods to predict binding sites and expedite our understanding of gene regulation at a genomic level. Just as with traditional experimental approaches, the computational identification of the molecular factors that control a gene's expression level has been problematic. As a case in point, the identification of putative motifs is a challenging combinatorial task. For it, powerful new motif-finding algorithms and high-performance implementations are described. Heavy use is made of graph algorithms, some of which are exceedingly computationally intensive and involve the use of emergent mathematical methods. An approach to fully dynamic load balancing is developed in order to make effective use of highly parallel platforms.


FEMS Microbiology Reviews | 2015

Ebolavirus comparative genomics

Se-Ran Jun; Michael R. Leuze; Intawat Nookaew; Edward C. Uberbacher; Miriam Land; Qian Zhang; Visanu Wanchai; Juanjuan Chai; Morten Nielsen; Thomas Trolle; Ole Lund; Gregory S. Buzard; Thomas Pedersen; Trudy M. Wassenaar; David W. Ussery

The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).


Future Generation Computer Systems | 1999

Mining multi-dimensional data for decision support

June Donato; Jack C. Schryver; Gregory C. Hinkel; Richard L. Schmoyer; Michael R. Leuze; Nancy W. Grandy

Personal bankruptcy is an increasingly common yet little understood phenomenon. Attempts to predict bankruptcy have involved the application of data mining techniques to credit card data. This is a difficult problem, since credit card data is multi-dimensional, consisting of monthly account records and daily transaction records. In this paper, we describe a two-stage approach that combines decision trees and neural networks to predict personal bankruptcy using credit card data.


Concurrency and Computation: Practice and Experience | 1989

MULTIPROGRAMMING A DISTRIBUTED-MEMORY MULTIPROCESSOR

Michael R. Leuze; Lawrence W. Dowdy; Kee Hyun Park

The development of computing systems with large numbers of processors has been motivated primarily by the need to solve large complex problems more quickly than is possible with uniprocessor systems. Traditionally, multiprocessor systems have been uniprogrammed, i.e., dedicated to the execution of a single set of related processes, since this approach provides the fastest response for an individual program once it begins execution. However, if the goal of a multiprocessor system is to minimize average response time or to maximize throughput, then multiprogramming must be considered. In this paper, a model of a simple multiprocessor system with a two-program workload is reviewed; the model is then applied to an Intel iPSC/2 hypercube multiprocessor with a workload consisting of parallel wavefront algorithms for solving triangular systems of linear equations. Throughputs predicted by the model are compared with throughputs obtained experimentally from an actual system. The results provide validation for the model and indicate that significant performance improvements for multiprocessor systems are possible through multiprogramming.


PLOS Computational Biology | 2016

MicroRNAs Form Triplexes with Double Stranded DNA at Sequence-Specific Binding Sites; a Eukaryotic Mechanism via which microRNAs Could Directly Alter Gene Expression.

Steven W. Paugh; David R. Coss; Ju Bao; Lucas T. Laudermilk; Christy Rani R. Grace; Antonio M. Ferreira; M. Brett Waddell; Granger Ridout; Deanna Naeve; Michael R. Leuze; Philip F. LoCascio; John C. Panetta; Mark R. Wilkinson; Ching-Hon Pui; Clayton W. Naeve; Edward C. Uberbacher; Erik Bonten; William E. Evans

MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNA) and typically down-regulating their stability or translation. Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex, and we show physical evidence (i.e., NMR, FRET, SPR) that purine or pyrimidine-rich microRNAs of appropriate length and sequence form triple-helical structures with purine-rich sequences of duplex DNA, and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3-fold, p < 2.2 × 10^−16) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. This work has thus revealed a new mechanism by which microRNAs could interact with gene promoter regions to modify gene transcription.


International Journal of High Speed Computing | 1994

ON MODELING PARTITIONED MULTIPROCESSOR SYSTEMS

Lawrence W. Dowdy; Michael R. Leuze

In recent years, multiprocessor systems have been developed to exploit parallelism within individual programs. However, a multiprocessor system may be used more efficiently if its processors are partitioned among independent parallel programs. The partitioning of a multiprocessor system is addressed in this paper. A simple yet powerful model is proposed for the analysis of various partitioning schemes. The model parameterizes both a multiprocessor system and its parallel workload. Attention is restricted to a multiprocessor system partitioned between two parallel programs. The model is studied to determine (1) when partitioning is worthwhile, (2) the extent of performance improvement under optimal partitioning, and (3) which of three partition scheduling schemes is best.


International Journal of Health Geographics | 2009

Evaluation of sliding baseline methods for spatial estimation for cluster detection in the biosurveillance system.

Jian Xing; Howard Burkom; Linda Moniz; James Edgerton; Michael R. Leuze; Jerome I. Tokars

Background: The Centers for Disease Control and Prevention's (CDC's) BioSense system provides near-real-time situational awareness for public health monitoring through analysis of electronic health data. Determination of anomalous spatial and temporal disease clusters is a crucial part of the daily disease monitoring task. Our study focused on finding useful anomalies at manageable alert rates according to available BioSense data history.
Methods: The study dataset included more than 3 years of daily counts of military outpatient clinic visits for respiratory and rash syndrome groupings. We applied four spatial estimation methods in implementations of space-time scan statistics cross-checked in Matlab and C. We compared the utility of these methods according to the resultant background cluster rate (a false alarm surrogate) and sensitivity to injected cluster signals. The comparison runs used a spatial resolution based on the facility zip code in the patient record and a finer resolution based on the residence zip code.
Results: Simple estimation methods that account for day-of-week (DOW) data patterns yielded a clear advantage both in background cluster rate and in signal sensitivity. A 28-day baseline gave the most robust results for this estimation; the preferred baseline is long enough to remove daily fluctuations but short enough to reflect recent disease trends and data representation. Background cluster rates were lower for the rash syndrome counts than for the respiratory counts, likely because of seasonality and the large scale of the respiratory counts.
Conclusion: The spatial estimation method should be chosen according to characteristics of the selected data streams. In this dataset with strong day-of-week effects, the overall best detection performance was achieved using subregion averages over a 28-day baseline stratified by weekday or weekend/holiday behavior. Changing the estimation method for particular scenarios involving different spatial resolution or other syndromes can yield further improvement.
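The stratified 28-day baseline described in this abstract can be illustrated with a short sketch. It is an assumption-laden simplification, not the BioSense code: the function names are hypothetical, holidays are ignored (only weekday versus weekend is distinguished), and the expected count for one subregion is taken as the plain mean of same-stratum days in the preceding window.

```python
from datetime import date, timedelta

def is_weekend(day):
    """True for Saturday or Sunday (holidays are not handled in this sketch)."""
    return day.weekday() >= 5

def stratified_baseline(counts, target, window=28):
    """Expected count for `target` from the previous `window` days.

    counts: dict mapping date -> daily visit count for one subregion.
    Only days in the same stratum as `target` (weekday or weekend) are averaged.
    Returns None if no matching day has data.
    """
    stratum = is_weekend(target)
    values = []
    for offset in range(1, window + 1):
        day = target - timedelta(days=offset)
        if day in counts and is_weekend(day) == stratum:
            values.append(counts[day])
    return sum(values) / len(values) if values else None

# Toy series: 10 visits on weekdays, 4 on weekend days.
target = date(2009, 6, 1)
toy_counts = {}
for offset in range(1, 29):
    day = target - timedelta(days=offset)
    toy_counts[day] = 4 if is_weekend(day) else 10

expected = stratified_baseline(toy_counts, target)
print(target.isoformat(), "expected count:", expected)
```

A scan statistic would then compare observed counts against such expected values for each candidate cluster; without the stratification, weekend dips would pull the weekday expectation down.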


Robotics and Computer-integrated Manufacturing | 2001

Vector space model for the generalized parts grouping problem

Nagiza Faridovna Samatova; Thomas E. Potok; Michael R. Leuze

The vector perturbation approach is introduced for addressing the generalized parts grouping problem, identifying part families for a general set of suppliers, not just a single supplier. This method is driven by the need for flexible and lean supply chain systems. A vector space model is used to represent a set of operation sequences, as opposed to the traditional matrix and integer programming models in group technology. Using this approach, we find that we are able to generate part groups from 90% of the available parts, in which all the operation sequences are preserved. This contrasts with the traditional methods, with which only 66% of the available parts can be grouped. Furthermore, a vector representation of operation sequences provides an intuitive means for discovering the natural structure of the part data. From these results, we conclude that this technique can dramatically improve the effectiveness of the entire supply chain.


Scientific Reports | 2017

Viral Phylogenomics Using an Alignment-Free Method: A Three-Step Approach to Determine Optimal Length of k-mer

Qian Zhang; Se-Ran Jun; Michael R. Leuze; David W. Ussery; Intawat Nookaew

The development of rapid, economical genome sequencing has shed new light on the classification of viruses. As of October 2016, the National Center for Biotechnology Information (NCBI) database contained >2 million viral genome sequences and a reference set of ~4000 viral genome sequences that cover a wide range of known viral families. Whole-genome sequences can be used to improve viral classification and provide insight into the viral “tree of life”. However, due to the lack of evolutionary conservation amongst diverse viruses, it is not feasible to build a viral tree of life using traditional phylogenetic methods based on conserved proteins. In this study, we used an alignment-free method that uses k-mers as genomic features for a large-scale comparison of complete viral genomes available in RefSeq. To determine the optimal feature length, k (an essential step in constructing a meaningful dendrogram), we designed a comprehensive strategy that combines three approaches: (1) cumulative relative entropy, (2) average number of common features among genomes, and (3) the Shannon diversity index. This strategy was used to determine k for all 3,905 complete viral genomes in RefSeq. The resulting dendrogram shows consistency with the viral taxonomy of the ICTV and the Baltimore classification of viruses.
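One of the three criteria named in this abstract, the Shannon diversity index, can be sketched over k-mer counts as follows. This uses the textbook definition H = -Σ p_i ln p_i on overlapping k-mers of a single sequence; how the paper applies the index across genomes and combines it with the other two criteria is not stated here, so the helper names and the toy sequence are illustrative assumptions only.

```python
import math
from collections import Counter

def kmer_counts(seq, k):
    """Count all overlapping k-mers in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over observed k-mer frequencies."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())

# Toy sequence (made up); a real analysis would read complete viral genomes.
toy_genome = "ACGTACGTTTGACCA"
for k in range(1, 5):
    h = shannon_index(kmer_counts(toy_genome, k))
    print(f"k={k}: H={h:.3f}")
```

Scanning a range of k and watching where such a measure levels off is one plausible way to shortlist feature lengths before building a dendrogram.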

Collaboration


Dive into Michael R. Leuze's collaborations.

Top Co-Authors

Edward C. Uberbacher

Oak Ridge National Laboratory

Intawat Nookaew

University of Arkansas for Medical Sciences

David W. Ussery

University of Arkansas for Medical Sciences

Se-Ran Jun

Oak Ridge National Laboratory

Tatiana V. Karpinets

Oak Ridge National Laboratory

Mustafa H Syed

Oak Ridge National Laboratory

Trudy M. Wassenaar

Technical University of Denmark

Gregory C. Hinkel

Oak Ridge National Laboratory

Jack C. Schryver

Oak Ridge National Laboratory
