Shawn Martin | Researchain

Publication

Featured research published by Shawn Martin.

Bioinformatics | 2005

Predicting protein-protein interactions using signature products

Shawn Martin; Diana C. Roe; Jean-Loup Faulon

MOTIVATION: Proteome-wide prediction of protein-protein interactions is a difficult and important problem in biology. Although there have been recent advances in both experimental and computational methods for predicting protein-protein interactions, we are only beginning to see a confluence of these techniques. In this paper, we describe a very general, high-throughput method for predicting protein-protein interactions. Our method combines a sequence-based description of proteins with experimental information that can be gathered from any type of protein-protein interaction screen. The method uses a novel description of interacting proteins by extending the signature descriptor, which has demonstrated success in predicting peptide/protein binding interactions for individual proteins. This descriptor is extended to protein pairs by taking signature products. The signature product is implemented within a support vector machine classifier as a kernel function. RESULTS: We have applied our method to publicly available yeast, Helicobacter pylori, human and mouse datasets. We used the yeast and H. pylori datasets to verify the predictive ability of our method, achieving accuracy rates of 70 to 80% using 10-fold cross-validation. We used the human and mouse datasets to demonstrate that our method is capable of cross-species prediction. Finally, we reused the yeast dataset to explore the ability of our algorithm to predict domains. CONTACT: [email protected]
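To make the idea of a pair kernel concrete, here is a minimal sketch of a product kernel over protein pairs. It is an illustration under stated assumptions, not the paper's implementation: overlapping 3-mer counts stand in for the signature descriptor, and the symmetrized form and the names `kmer_counts` and `pair_kernel` are hypothetical. The only fact relied on is that the inner product of two tensor products equals the product of the component inner products.

```python
from collections import Counter


def kmer_counts(seq, k=3):
    """Count overlapping k-mers; a simple stand-in for a signature vector."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def dot(u, v):
    """Sparse dot product of two count vectors."""
    if len(u) > len(v):
        u, v = v, u
    # Counter returns 0 for missing keys, so absent k-mers contribute nothing.
    return sum(count * v[key] for key, count in u.items())


def pair_kernel(pair1, pair2, k=3):
    """Symmetrized product kernel between protein pairs (A, B) and (C, D).

    <s(A) x s(B), s(C) x s(D)> = <s(A), s(C)> * <s(B), s(D)>; the second
    term makes the value independent of the order within each pair.
    """
    a, b = (kmer_counts(s, k) for s in pair1)
    c, d = (kmer_counts(s, k) for s in pair2)
    return dot(a, c) * dot(b, d) + dot(a, d) * dot(b, c)


if __name__ == "__main__":
    print(pair_kernel(("ABCD", "BCDE"), ("ABCD", "BCDE")))
```

A function of this shape could be used to fill a precomputed Gram matrix for any support vector machine implementation that accepts one.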

Bioinformatics | 2007

Boolean dynamics of genetic regulatory networks inferred from microarray time series data

Shawn Martin; Zhaoduo Zhang; Anthony Martino; Jean-Loup Faulon

MOTIVATION: Methods available for the inference of genetic regulatory networks strive to produce a single network, usually by optimizing some quantity to fit the experimental observations. In this article we investigate the possibility that multiple networks can be inferred, all resulting in similar dynamics. This idea is motivated by theoretical work which suggests that biological networks are robust and adaptable to change, and that the overall behavior of a genetic regulatory network might be captured in terms of dynamical basins of attraction. RESULTS: We have developed and implemented a method for inferring genetic regulatory networks for time series microarray data. Our method first clusters and discretizes the gene expression data using k-means and support vector regression. We then enumerate Boolean activation-inhibition networks to match the discretized data. Finally, the dynamics of the Boolean networks are examined. We have tested our method on two immunology microarray datasets: an IL-2-stimulated T cell response dataset and an LPS-stimulated macrophage response dataset. In both cases, we discovered that many networks matched the data, and that most of these networks had similar dynamics. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
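The notion of basins of attraction in a Boolean network can be illustrated with a short sketch. The update rule below (a gene is on when at least one activator is on and no inhibitor is on, with synchronous updates) is an assumption chosen for illustration, as are the function names; the abstract does not specify the rule used in the paper.

```python
from itertools import product


def step(state, activators, inhibitors):
    """Synchronous update of every gene from the current state.

    Assumed rule: gene i is on iff some activator is on and no inhibitor is on.
    """
    return tuple(
        int(any(state[j] for j in activators[i])
            and not any(state[j] for j in inhibitors[i]))
        for i in range(len(state))
    )


def basins(activators, inhibitors):
    """Map each attractor (a frozenset of states) to the size of its basin."""
    n = len(activators)
    sizes = {}
    for start in product((0, 1), repeat=n):
        seen = []
        state = start
        # Follow the trajectory until a state repeats.
        while state not in seen:
            seen.append(state)
            state = step(state, activators, inhibitors)
        # The states from the first repeat onward form the attractor cycle.
        cycle = frozenset(seen[seen.index(state):])
        sizes[cycle] = sizes.get(cycle, 0) + 1
    return sizes


if __name__ == "__main__":
    # Two genes that activate each other, with no inhibitors.
    for attractor, size in basins([[1], [0]], [[], []]).items():
        print(sorted(attractor), size)
```

Two candidate networks could then be compared by their attractors and basin sizes rather than by their wiring.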

Bioinformatics | 2008

Genome scale enzyme–metabolite and drug–target interaction predictions using the signature molecular descriptor

Jean-Loup Faulon; Milind Misra; Shawn Martin; Ken Sale; Rajat Sapra

MOTIVATION: Identifying protein enzymatic or pharmacological activities is an important area of research in biology and chemistry. Biological and chemical databases are increasingly being populated with linkages between protein sequences and chemical structures. There is now sufficient information to apply machine-learning techniques to predict interactions between chemicals and proteins at a genome scale. Current machine-learning techniques use as input either protein sequences and structures or chemical information. We propose here a method to infer protein-chemical interactions using heterogeneous input consisting of both protein sequence and chemical information. RESULTS: Our method relies on expressing proteins and chemicals with a common cheminformatics representation. We demonstrate our approach by predicting whether proteins can catalyze reactions not present in training sets. We also predict whether a given drug can bind a target, in the absence of prior binding information for that drug and target. Such predictions cannot be made with current machine-learning techniques requiring binding information for individual reactions or individual targets.

Visualization and Data Analysis | 2011

OpenOrd: An Open-Source Toolbox for Large Graph Layout

Shawn Martin; W. Michael Brown; Richard Klavans; Kevin W. Boyack

We document an open-source toolbox for drawing large-scale undirected graphs. This toolbox is based on a previously implemented closed-source algorithm known as VxOrd. Our toolbox, which we call OpenOrd, extends the capabilities of VxOrd to large graph layout by incorporating edge-cutting, a multi-level approach, average-link clustering, and a parallel implementation. At each level, vertices are grouped using force-directed layout and average-link clustering. The clustered vertices are then re-drawn and the process is repeated. When a suitable drawing of the coarsened graph is obtained, the algorithm is reversed to obtain a drawing of the original graph. This approach results in layouts of large graphs which incorporate both local and global structure. A detailed description of the algorithm is provided in this paper. Examples using datasets with over 600K nodes are given. Code is available at www.cs.sandia.gov/~smartin.

Journal of Chemical Physics | 2008

Algorithmic dimensionality reduction for molecular structure analysis.

W. Michael Brown; Shawn Martin; Sara N. Pollock; Jean-Paul Watson

Dimensionality reduction approaches have been used to exploit the redundancy in a Cartesian coordinate representation of molecular motion by producing low-dimensional representations of molecular motion. This has been used to help visualize complex energy landscapes, to extend the time scales of simulation, and to improve the efficiency of optimization. Until recently, linear approaches for dimensionality reduction have been employed. Here, we investigate the efficacy of several automated algorithms for nonlinear dimensionality reduction for representation of trans,trans-1,2,4-trifluorocyclo-octane conformation, a molecule whose structure can be described on a 2-manifold in a Cartesian coordinate phase space. We describe an efficient approach for a deterministic enumeration of ring conformations. We demonstrate a drastic improvement in dimensionality reduction with the use of nonlinear methods. We discuss the use of dimensionality reduction algorithms for estimating intrinsic dimensionality and the relationship to the Whitney embedding theorem. Additionally, we investigate the influence of the choice of high-dimensional encoding on the reduction. We show for the case studied that, in terms of reconstruction error root mean square deviation, Cartesian coordinate representations and encodings based on interatom distances provide better performance than encodings based on a dihedral angle representation.

Journal of Chemical Physics | 2010

Topology of cyclo-octane energy landscape

Shawn Martin; Aidan P. Thompson; Jean-Paul Watson

Understanding energy landscapes is a major challenge in chemistry and biology. Although a wide variety of methods have been invented and applied to this problem, very little is understood about the actual mathematical structures underlying such landscapes. Perhaps the most general assumption is the idea that energy landscapes are low-dimensional manifolds embedded in high-dimensional Euclidean space. While this is a very mild assumption, we have discovered an example of an energy landscape which is nonmanifold, demonstrating previously unknown mathematical complexity. The example occurs in the energy landscape of cyclo-octane, which was found to have the structure of a reducible algebraic variety, composed of the union of a sphere and a Klein bottle, intersecting in two rings.

Journal of Chemical Information and Modeling | 2006

Designing novel polymers with targeted properties using the signature molecular descriptor.

W. Michael Brown; Shawn Martin; Mark Daniel Rintoul; Jean-Loup Faulon

A method for solving the inverse quantitative structure-property relationship (QSPR) problem is presented which facilitates the design of novel polymers with targeted properties. Here, we demonstrate the efficacy of the approach using the targeted design of polymers exhibiting a desired glass transition temperature, heat capacity, and density. We present novel QSPRs based on the signature molecular descriptor capable of predicting glass transition temperature, heat capacity, density, molar volume, and cohesive energies of linear homopolymers with cross-validation squared correlation coefficients ranging between 0.81 and 0.95. Using these QSPRs, we show how the inverse problem can be solved to design poly(N-methyl hexamethylene sebacamide) despite the fact that the polymer was not used in the training of this model.

Journal of Computer-aided Molecular Design | 2005

Reverse engineering chemical structures from molecular descriptors: how many solutions?

Jean-Loup Faulon; W. Michael Brown; Shawn Martin

Summary: Physical, chemical and biological properties are the ultimate information of interest for chemical compounds. Molecular descriptors that map structural information to activities and properties are obvious candidates for information sharing. In this paper, we consider the feasibility of using molecular descriptors to safely exchange chemical information in such a way that the original chemical structures cannot be reverse engineered. To investigate the safety of sharing such descriptors, we compute the degeneracy (the number of structures matching a descriptor value) of several 2D descriptors, and use various methods to search for and reverse engineer structures. We examine degeneracy in the entire chemical space, taking descriptor values from the alkane isomer series and the PubChem database. We further use a stochastic search to retrieve structures matching specific topological index values. Finally, we investigate the safety of exchanging fragmental descriptors using deterministic enumeration.
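Degeneracy, as defined in the abstract, can be illustrated on a tiny example. The sketch below assumes the Wiener index (the sum of shortest-path distances over all atom pairs) as the topological index and uses the carbon skeletons of three heptane isomers; the abstract does not say which descriptors were studied, so this choice, the `HEPTANES` table, and the function names are illustrative only.

```python
from collections import Counter, deque


def wiener_index(n, edges):
    """Sum of shortest-path distances over all unordered vertex pairs."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    total = 0
    for src in range(n):
        # Breadth-first search gives distances from src to every vertex.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
    return total // 2  # each pair was counted once from each end


# Carbon skeletons (hydrogens omitted), vertices numbered 0..6.
HEPTANES = {
    "n-heptane": [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)],
    "2,4-dimethylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (3, 6)],
    "3-ethylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5), (5, 6)],
}


def degeneracy(structures, n=7):
    """Count how many structures share each descriptor value."""
    return Counter(wiener_index(n, edges) for edges in structures.values())


if __name__ == "__main__":
    print(degeneracy(HEPTANES))
```

Here two distinct skeletons share one index value, so that value alone cannot identify the structure; a higher degeneracy makes a descriptor harder to reverse engineer.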

Annals of the New York Academy of Sciences | 2007

Sensitivity analysis of a computational model of the IKK–NF‐κB–IκBα–A20 signal transduction network.

Jaewook Joo; Steve Plimpton; Shawn Martin; Laura Painton Swiler; Jean-Loup Faulon

Abstract: The NF‐κB signaling network plays an important role in many different compartments of the immune system during immune activation. Using a computational model of the NF‐κB signaling network involving two negative regulators, IκBα and A20, we performed sensitivity analyses with three different sampling methods and present a ranking of the kinetic rate variables by the strength of their influence on the NF‐κB signaling response. We also present a classification of temporal‐response profiles of nuclear NF‐κB concentration into six clusters, which can be regrouped to three biologically relevant clusters. Last, we constructed a reduced network of the IKK–NF‐κB–IκBα–A20 signal transduction based on the ranking.

International Conference on Data Mining | 2005

Training support vector machines using Gilbert's algorithm

Shawn Martin

Support vector machines are classifiers designed around the computation of an optimal separating hyperplane. This hyperplane is typically obtained by solving a constrained quadratic programming problem, but may also be located by solving a nearest point problem. Gilbert's algorithm can be used to solve this nearest point problem but is unreasonably slow. In this paper we present a modified version of Gilbert's algorithm for the fast computation of the support vector machine hyperplane. We then compare our algorithm with the nearest point algorithm and with sequential minimal optimization.
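For orientation, the nearest point formulation can be sketched with the basic form of Gilbert's algorithm. This is the unmodified textbook iteration, not the faster variant the paper proposes, and the function names are hypothetical. For linearly separable classes, the shortest vector between the two convex hulls is the minimum-norm point of the hull of all pairwise differences, and that vector gives the direction of the hyperplane normal. The differences are enumerated explicitly here, which is only practical for tiny inputs.

```python
def dot(u, v):
    """Dense dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gilbert_min_norm(points, tol=1e-9, max_iter=10000):
    """Approximate the minimum-norm point of the convex hull of `points`."""
    x = list(points[0])
    for _ in range(max_iter):
        # Support vertex: the hull vertex farthest in the direction -x.
        p = min(points, key=lambda q: dot(x, q))
        gap = dot(x, x) - dot(x, p)
        if gap <= tol:
            break  # no vertex improves on x
        d = [pi - xi for pi, xi in zip(p, x)]
        # Minimize |x + t*d|^2 over t in [0, 1] along the segment from x to p.
        t = min(1.0, -dot(x, d) / dot(d, d))
        x = [xi + t * di for xi, di in zip(x, d)]
    return x


def svm_direction(pos, neg, **kwargs):
    """Normal direction of the separating hyperplane for separable classes."""
    diffs = [[a - b for a, b in zip(p, n)] for p in pos for n in neg]
    return gilbert_min_norm(diffs, **kwargs)


if __name__ == "__main__":
    print(gilbert_min_norm([(1.0, 0.0), (0.0, 1.0)]))
    print(svm_direction([(2.0, 0.0), (2.0, 1.0)], [(0.0, 0.0), (0.0, 1.0)]))
```

Each iteration moves only along a segment toward a single vertex, which is one reason the basic iteration can take many steps to converge.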
