Lutz Hamel
University of Rhode Island
Publication
Featured research published by Lutz Hamel.
Computational Intelligence in Bioinformatics and Computational Biology | 2006
Lutz Hamel
The visualization of support vector machines in realistic settings is a difficult problem due to the high dimensionality of the typical datasets involved. However, such visualizations usually aid the understanding of the model and the underlying processes, especially in the biosciences. Here we propose a novel visualization technique for support vector machines based on unsupervised learning, specifically self-organizing maps. Conceptually, self-organizing maps can be thought of as neural networks that search a high-dimensional data space for clusters of data points and then project the clusters onto a two-dimensional map, preserving the topologies of the original clusters as much as possible. This allows for the visualization of high-dimensional datasets together with their support vector models. With this technique we investigate a number of support vector machine visualization scenarios based on real-world biomedical datasets.
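The projection step described in this abstract can be illustrated with a minimal self-organizing map in plain Python. This is a sketch under stated assumptions, not the paper's implementation: the function names (`train_som`, `best_matching_unit`), the grid size, the linear decay schedules, and the toy data are all hypothetical, and overlaying support vectors on the resulting map is not shown.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return the grid coordinate whose reference vector is closest to x."""
    return min(weights,
               key=lambda n: sum((a - b) ** 2 for a, b in zip(weights[n], x)))

def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols map; returns {(row, col): reference vector}.

    Assumptions (not from the abstract): random initialization in [0, 1),
    Gaussian neighborhood, linearly decaying learning rate and radius.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                 # learning rate shrinks to 0
        sigma = sigma0 * (1.0 - frac) + 0.5     # radius shrinks toward 0.5
        for x in rng.sample(data, len(data)):   # visit points in random order
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    # pull the reference vector toward the data point
                    w[i] += lr * h * (x[i] - w[i])
    return weights

# Made-up 4-dimensional toy data forming two clusters, scaled to [0, 1].
toy_data = [
    [0.05, 0.10, 0.00, 0.05], [0.10, 0.05, 0.05, 0.00],
    [0.95, 0.90, 1.00, 0.95], [0.90, 0.95, 0.95, 1.00],
]
som = train_som(toy_data, rows=3, cols=3)
# Each high-dimensional point is projected to a 2-D grid coordinate.
projection = [best_matching_unit(som, x) for x in toy_data]
```

In a visualization along the lines of the abstract, each grid cell could then be annotated with the class labels of the points it hosts and with whether any of them is a support vector.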
Genome Biology | 2004
Olga Zhaxybayeva; Lutz Hamel; Jason Raymond; J. Peter Gogarten
The methods presented here summarize phylogenetic relationships of genomes in visually appealing and informative figures. Dekapentagonal maps depict phylogenetic information for orthologous genes present in five genomes, and provide a pre-screen for putatively horizontally transferred genes. If the majority of individual gene phylogenies are unresolved, bipartition histograms provide a means of uncovering and analyzing the plurality consensus. Analyses of genomes representing five photosynthetic bacterial phyla and of the prokaryotic contributions to the eukaryotic cell illustrate the utility of the methods.
Fertility and Sterility | 2003
James R. Trimarchi; Julie Goodside; Leah Passmore; Tali Silberstein; Lutz Hamel; Liliana Gonzalez
Design: We utilized Quinlan’s C5.0 decision tree data mining algorithm to retrospectively investigate the predictive power of the 100 parameters that we track for each IVF cycle. The parameters investigated include patient demographics, stimulation regime, response properties, oocyte and embryo parameters and embryo transfer variables. To validate our findings from a statistical point of view we also constructed a statistical model based on logistic regression.
BMC Bioinformatics | 2005
Lutz Hamel; Olga Zhaxybayeva; J. Peter Gogarten
Background: Dekapentagonal maps depict the phylogenetic relationships of five genomes in a visually appealing diagram and can be viewed as an alternative to a single evolutionary consensus tree. In particular, the generated maps focus attention on those gene families that significantly deviate from the consensus or plurality phylogeny. PentaPlot is a software tool that computes such dekapentagonal maps given an appropriate probability support matrix. Results: The visualization with dekapentagonal maps critically depends on the optimal layout of unrooted tree topologies representing different evolutionary relationships among five organisms along the vertices of the dekapentagon. This is a difficult optimization problem given the large number of possible layouts. At its core our tool utilizes a genetic algorithm with demes and a local search strategy to search for the optimal layout. The hybrid genetic algorithm performs satisfactorily even in those cases where the chosen genomes are so divergent that little phylogenetic information has survived in the individual gene families. Conclusion: PentaPlot is being made publicly available as an open source project at http://pentaplot.sourceforge.net.
BioMed Research International | 2008
Lutz Hamel; N. Nahar; Maria Poptsova; Olga Zhaxybayeva; Johann Peter Gogarten
The tree representation as a model for organismal evolution has been in use since before Darwin. However, with the recent unprecedented access to biomolecular data, it has been discovered that, especially in the microbial world, individual genes making up the genome of an organism give rise to different and sometimes conflicting evolutionary tree topologies. This discovery calls into question the notion of a single evolutionary tree for an organism and gives rise to the notion of an evolutionary consensus tree based on the evolutionary patterns of the majority of genes in a genome embedded in a network of gene histories. Here, we discuss an approach to the analysis of genomic data of multiple genomes using bipartition spectral analysis and unsupervised learning. An interesting observation is that genes within genomes that have evolutionary tree topologies, which are in substantial conflict with the evolutionary consensus tree of an organism, point to possible horizontal gene transfer events which often delineate significant evolutionary events.
Applied Spectroscopy | 2012
Lutz Hamel; Chris W. Brown
The significance of a spectral feature is defined as the probability that the feature captures the structure of the data set at hand. In particular, the significance is equal to a value proportional to the variance of a feature within a particular data set. The larger the variance, the higher the probability that the feature will capture the underlying structure. This approach is particularly useful when significance is used to select features differentiating clusters of samples and for the construction of self-organizing maps (SOMs) of clusters. A significance spectrum is obtained by plotting significance as a function of wavenumber. After developing the approach for feature significance, the significance framework was applied to the construction of SOMs for clustering infrared spectra of bacteria. The significance framework consistently chooses features that make it possible to construct maps with reduced feature sets that are at least as good as the maps constructed on full feature sets. In addition, significance reliably picks features that are consistent with biological interpretations of the spectra.
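The variance-based notion of significance described in this abstract can be sketched in a few lines of Python. The abstract only says that significance is proportional to a feature's variance; normalizing by the total variance so that the values sum to one is an assumption made here, and the function names and the tiny data set are hypothetical.

```python
from statistics import pvariance

def feature_significance(samples):
    """Return one significance value per feature (column).

    Assumption: significance is the feature's population variance divided
    by the total variance over all features, so the values sum to 1.
    """
    columns = list(zip(*samples))
    variances = [pvariance(col) for col in columns]
    total = sum(variances)
    if total == 0:
        # Every feature is constant: treat them all as equally uninformative.
        return [1.0 / len(columns)] * len(columns)
    return [v / total for v in variances]

def select_features(samples, k):
    """Return the indices of the k most significant features, in column order."""
    sig = feature_significance(samples)
    ranked = sorted(range(len(sig)), key=lambda i: sig[i], reverse=True)
    return sorted(ranked[:k])

# Made-up "spectra": rows are samples, columns are wavenumber channels.
spectra = [
    [0.10, 1.0, 5.0],
    [0.10, 2.0, 1.0],
    [0.10, 3.0, 9.0],
]
significance = feature_significance(spectra)  # one value per channel
top_two = select_features(spectra, 2)         # channels kept for the map
```

Plotting `significance` against the channel index gives the analogue of the significance spectrum mentioned above, and `top_two` is the kind of reduced feature set that would then be passed to SOM construction.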
Parallel Computing | 1992
Lutz Hamel; Philip J. Hatcher; Michael J. Quinn
C* is the synchronous, data parallel, C superset designed by Thinking Machines for the Connection Machine processor array. We discuss an implementation of C* targeted to the NCUBE 3200 hypercube multicomputer, an asynchronous machine. In the multicomputer environment, the major obstacle to high efficiency is the relatively slow interprocessor communication. We describe our C* optimizer, which concentrates on reducing the cost of a C* program's message passing. We present the results of evaluating our system using three benchmark programs.
Database Technologies: Concepts, Methodologies, Tools, and Applications | 2009
Jounghae Bang; Nikhilesh Dholakia; Lutz Hamel; Seung Kyoon Shin
Customer relationships are increasingly central to business success (Kotler, 1997; Reichheld & Sasser, 1990). Acquiring new customers is five to seven times costlier than retaining existing customers (Kotler, 1997). Simply by reducing customer defections by 5%, a company can improve profits by 25% to 85% (Reichheld & Sasser, 1990). Relationship marketing, that is, getting to know customers intimately by understanding their preferences, has emerged as a key business strategy for customer retention (Dyche, 2002). The Internet and related technologies offer amazing possibilities for creating and sustaining ideal customer relationships (Goodhue, Wixom, & Watson, 2002; Ives, 1990; Moorman, Zaltman, & Deshpande, 1992). The Internet is not only an important and convenient new channel for promotion, transactions, and business process coordination; it is also a source of customer data (Shaw, Subramaniam, Tan, & Welge, 2001). Huge customer data warehouses are being created using advanced database technologies (Fayyad, Piatetsky-Shapiro, & Smyth, 1996). Customer data warehouses by themselves offer no competitive advantages: insightful customer knowledge must be extracted from such data (Kim, Kim, & Lee, 2002). Valuable marketing insights about customer characteristics and their purchase patterns, however, are often hidden and untapped (Shaw et al., 2001). Data mining and knowledge discovery in databases (KDD) facilitate extraction of valuable knowledge from rapidly growing volumes of data (Mackinnon, 1999; Fayyad et al., 1996). This article provides a brief review of customer relationship issues. The article focuses on: (1) customer relationship management (CRM) technologies, (2) KDD techniques, and (3) key CRM-KDD linkages in terms of relationship marketing. The article concludes with observations about the state of the art and future directions.
Computational Intelligence in Bioinformatics and Computational Biology | 2005
Lutz Hamel; Gongqin Sun; Jing Zhang
Establishing structure-function relationships on the proteomic scale is a unique challenge faced by bioinformatics and molecular biosciences. Large protein families represent natural libraries of analogues of a given catalytic or protein function, thus making them ideal targets for the investigation of structure-function relationships in proteins. To this end, we have developed a new technique for analyzing large amounts of detailed molecular structure information focusing on the functional centers of homologous proteins. Our approach uses unsupervised machine learning, in particular, self-organizing maps. The information captured by a self-organizing map and stored in its reference models highlights the essential structure of the proteins under investigation and can be effectively used to study detailed structural differences and similarities among homologous proteins. Our preliminary results obtained with a prototype based on these techniques demonstrate that we can classify proteins and identify common and unique structures within a family and, more importantly, identify common and unique structural features of different conformations of the same protein. The approach developed here outperforms many of today’s structure analysis tools. These tools are usually either limited by the number of proteins they can process at the same time or they are limited by the structural resolution they can accommodate, that is, many of the structural analysis tools that can handle multiple proteins at the same time limit themselves to secondary structure analysis and therefore miss fine structural nuances within proteins.
Data Mining | 2010
Scott Ryan; Lutz Hamel
This chapter describes an algorithm that predicts events by mining Internet data. A number of specialized Internet search engine queries were designed to summarize results from relevant web pages. At the core of these queries was a set of algorithms that embody the wisdom of crowds hypothesis. This hypothesis states that under the proper conditions the aggregated opinion of a number of nonexperts is more accurate than the opinion of a set of experts. Natural language processing techniques were used to summarize the opinions expressed on all relevant web pages. The specialized queries predicted event results at a statistically significant level. It was hypothesized that predictions from the entire Internet would outperform the predictions of a smaller number of highly ranked web pages. This hypothesis was not confirmed. These data replicated results from an earlier study and indicated that the Internet can make accurate predictions of future events. Evidence that the Internet can function as a wise crowd as predicted by the wisdom of crowds hypothesis was mixed.