Publication


Featured research published by Suzanne J. Matthews.


technical symposium on computer science education | 2002

Nifty assignments

Nick Parlante; Julie Zelenski; Peter-Michael Osera; Marty Stepp; Mark Sherriff; Luther A. Tychonievich; Ryan M. Layer; Suzanne J. Matthews; Allison Obourn; David Raymond; Josh Hug; Stuart Reges

Creating assignments is a difficult and time-consuming part of teaching Computer Science. Nifty Assignments is a forum, operating at a very practical level, to promote the sharing of assignment ideas and assignment materials. Each presenter will introduce their assignment, give a quick demo, and describe its niche in the curriculum and its strengths and weaknesses. The presentations (and the descriptions below) merely introduce each assignment. For more detail, each assignment has its own web page with additional information and assignment materials, such as handouts and data files, to aid adoption of the assignment. Information on participating in Nifty Assignments, as well as all the assignment pages, is available from our central page… http://cse.stanford.edu/nifty/


BMC Bioinformatics | 2010

MrsRF: an efficient MapReduce algorithm for analyzing large collections of evolutionary trees

Suzanne J. Matthews; Tiffani L. Williams

Background: MapReduce is a parallel framework that has been used effectively to design large-scale parallel applications for large computing clusters. In this paper, we evaluate the viability of the MapReduce framework for designing phylogenetic applications. The problem of interest is generating the all-to-all Robinson-Foulds distance matrix, which has many applications for visualizing and clustering large collections of evolutionary trees. We introduce MrsRF (MapReduce Speeds up RF), a multi-core algorithm to generate a t × t Robinson-Foulds distance matrix between t trees using the MapReduce paradigm.

Results: We studied the performance of our MrsRF algorithm on two large biological tree sets consisting of 20,000 trees of 150 taxa each and 33,306 trees of 567 taxa each. Our experiments show that MrsRF is a scalable approach, reaching a speedup of over 18 on 32 total cores. Our results also show that achieving top speedup on a multi-core cluster requires different cluster configurations. Finally, we show how to use an RF matrix to summarize collections of phylogenetic trees visually.

Conclusion: Our results show that MapReduce is a promising paradigm for developing multi-core phylogenetic applications. They also demonstrate that different multi-core configurations must be tested in order to obtain optimum performance. We conclude that RF matrices play a critical role in developing techniques to summarize large collections of trees.
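
The heart of MrsRF is the all-to-all Robinson-Foulds computation: each tree is reduced to its set of bipartitions, a map phase emits a distance for every pair of trees, and a reduce phase assembles the t × t matrix. The Python sketch below illustrates that pattern only; the data representation, the rf_distance helper, and the toy trees are assumptions for illustration, not the authors' implementation.

```python
from itertools import combinations

# Each tree is a frozenset of bipartitions; each bipartition is a
# frozenset of the taxa on one side of an internal edge.
# (Assumption: bipartitions are pre-extracted; MrsRF parses Newick files.)

def rf_distance(t1, t2):
    """Unnormalized Robinson-Foulds distance: the number of
    bipartitions appearing in exactly one of the two trees."""
    return len(t1 ^ t2)  # size of the symmetric difference

def map_phase(trees):
    """Map: emit ((i, j), distance) for every pair of trees."""
    for i, j in combinations(range(len(trees)), 2):
        yield (i, j), rf_distance(trees[i], trees[j])

def reduce_phase(pairs, t):
    """Reduce: assemble the t x t symmetric distance matrix."""
    matrix = [[0] * t for _ in range(t)]
    for (i, j), d in pairs:
        matrix[i][j] = matrix[j][i] = d
    return matrix

# Toy example: three 4-taxon trees encoded by their nontrivial bipartitions.
trees = [
    frozenset({frozenset({"A", "B"})}),  # ((A,B),(C,D))
    frozenset({frozenset({"A", "C"})}),  # ((A,C),(B,D))
    frozenset({frozenset({"A", "B"})}),  # identical to the first tree
]
print(reduce_phase(map_phase(trees), len(trees)))
# [[0, 2, 0], [2, 0, 2], [0, 2, 0]]
```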


international conference on computational science | 2011

Paper Mâché: Creating Dynamic Reproducible Science

Grant R. Brammer; Ralph W. Crosby; Suzanne J. Matthews; Tiffani L. Williams

For centuries, the research paper has been the main vehicle for scientific progress. From the paper, readers in the scientific community are expected to extract all the relevant information necessary to reproduce and validate the results presented by the paper's authors. However, the increased use of computer software in science makes reproducing scientific results increasingly difficult. The research paper in its current state is no longer sufficient to fully reproduce, validate, or review a paper's experimental results and conclusions. This impedes scientific progress. To remedy these concerns, we introduce Paper Mâché, a new system for creating dynamic, executable research papers. The key novelty of Paper Mâché is its use of virtual machines, which lets readers and reviewers easily view and interact with a paper, and reproduce key experimental results. For authors, the Paper Mâché workbench provides an easy-to-use interface to build an executable paper. By transforming the static research paper into a dynamic and interactive entity, Paper Mâché brings the presentation of scientific results into the 21st century. We believe that Paper Mâché will become indispensable to the scientific process, and increase the visibility of key findings among members and non-members of the scientific community.


BMC Bioinformatics | 2009

Using tree diversity to compare phylogenetic heuristics

Seung-Jin Sul; Suzanne J. Matthews; Tiffani L. Williams

Background: Evolutionary trees are family trees that represent the relationships between a group of organisms. Phylogenetic heuristics are used to search stochastically for the best-scoring trees in tree space. Given that better tree scores are believed to be better approximations of the true phylogeny, traditional evaluation techniques have used tree scores to determine the heuristics that find the best scores in the fastest time. We develop new techniques to evaluate phylogenetic heuristics based on both tree scores and topologies, and use them to compare Pauprat and Rec-I-DCM3, two popular maximum parsimony search algorithms.

Results: Our results show that although Pauprat and Rec-I-DCM3 find trees with the same best scores, topologically these trees are quite different. Furthermore, the Rec-I-DCM3 trees cluster distinctly from the Pauprat trees. In addition to heatmap visualizations that use parsimony scores and the Robinson-Foulds distance to compare the best-scoring trees found by the two heuristics, we develop entropy-based methods to show the diversity of the trees found. Overall, Pauprat identifies more diverse trees than Rec-I-DCM3.

Conclusion: Our work shows that there is value in comparing heuristics beyond the parsimony scores that they find. Pauprat is a slower heuristic than Rec-I-DCM3. However, there is tremendous value in using Pauprat to reconstruct trees, especially since it finds identically scoring but topologically distinct trees. Hence, instead of discounting Pauprat, effort should go into improving its implementation. Ultimately, improved performance measures lead to better phylogenetic heuristics and will result in better approximations of the true evolutionary history of the organisms of interest.
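
The entropy-based diversity idea can be read as treating the frequencies of the distinct topologies returned by a heuristic as a probability distribution and computing its Shannon entropy. A minimal sketch under that assumption follows; the paper's exact estimator is not reproduced here.

```python
import math
from collections import Counter

def topology_entropy(topologies):
    """Shannon entropy (in bits) of the distribution of distinct
    topologies. `topologies` is any hashable encoding of tree shape,
    e.g. a canonical Newick string or a frozenset of bipartitions.
    Higher entropy means the heuristic returned more diverse trees."""
    counts = Counter(topologies)
    n = len(topologies)
    # 0.0 - ... avoids returning -0.0 when there is a single topology.
    return 0.0 - sum((c / n) * math.log2(c / n) for c in counts.values())

# Heuristic A returns four copies of one tree: entropy 0 (no diversity).
print(topology_entropy(["t1", "t1", "t1", "t1"]))  # 0.0
# Heuristic B returns four distinct trees: entropy log2(4) = 2 bits.
print(topology_entropy(["t1", "t2", "t3", "t4"]))  # 2.0
```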


Proteins | 2012

GeoFold: Topology-based protein unfolding pathways capture the effects of engineered disulfides on kinetic stability

Vibin Ramakrishnan; Sai Praveen Srinivasan; Saeed Salem; Suzanne J. Matthews; Wilfredo Colón; Mohammed Javeed Zaki; Christopher Bystroff

Protein unfolding is modeled as an ensemble of pathways, where each step in each pathway is the addition of one topologically possible conformational degree of freedom. Starting with a known protein structure, GeoFold hierarchically partitions (cuts) the native structure into substructures using revolute joints and translations. The energy of each cut and its activation barrier are calculated using buried solvent accessible surface area, side chain entropy, hydrogen bonding, buried cavities, and backbone degrees of freedom. A directed acyclic graph is constructed from the cuts, representing a network of simultaneous equilibria. Finite difference simulations on this graph simulate native unfolding pathways. Experimentally observed changes in the unfolding rates for disulfide mutants of barnase, T4 lysozyme, dihydrofolate reductase, and factor for inversion stimulation were qualitatively reproduced in these simulations. Detailed unfolding pathways for each case explain the effects of changes in the chain topology on the folding energy landscape. GeoFold is a useful tool for the inference of the effects of disulfide engineering on the energy landscape of protein unfolding.
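
GeoFold's simulation stage amounts to finite-difference integration of first-order kinetics over the directed acyclic graph of unfolding states. The sketch below shows that idea on a tiny invented graph; the states, rate constants, and time step are all assumptions, whereas GeoFold derives its rates from structural energetics.

```python
# Minimal sketch: explicit (finite-difference) integration of first-order
# kinetics over a tiny DAG of unfolding states. The states and rate
# constants below are invented; GeoFold computes them from buried surface
# area, side chain entropy, hydrogen bonding, and related terms.

states = ["native", "intermediate", "unfolded"]
# (source, target, forward_rate, reverse_rate): one entry per cut
edges = [
    ("native", "intermediate", 0.10, 0.02),
    ("intermediate", "unfolded", 0.05, 0.01),
]

conc = {"native": 1.0, "intermediate": 0.0, "unfolded": 0.0}
dt, steps = 0.1, 2000  # small step keeps the explicit scheme stable

for _ in range(steps):
    flux = {s: 0.0 for s in states}
    for src, dst, kf, kr in edges:
        net = kf * conc[src] - kr * conc[dst]  # net flow along this edge
        flux[src] -= net
        flux[dst] += net
    for s in states:
        conc[s] += dt * flux[s]

# Approaches the equilibrium ratios implied by kf/kr on each edge.
print({s: round(conc[s], 3) for s in states})
```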


technical symposium on computer science education | 2016

The Micro-Cluster Showcase: 7 Inexpensive Beowulf Clusters for Teaching PDC

Joel C. Adams; Jacob Caswell; Suzanne J. Matthews; Charles Peck; Elizabeth Shoop; David Toth; James Wolfer

Just as a micro-computer is a personal, portable computer, a micro-cluster is a personal, portable, Beowulf cluster. In this special session, six cluster designers will bring and demonstrate micro-clusters they have built using inexpensive single-board computers (SBCs). The educators will describe how they have used their clusters to provide their students with hands-on experience using the shared-memory, distributed-memory, and heterogeneous computing paradigms, and thus achieve the parallel and distributed computing (PDC) objectives of CS 2013 [1].
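
A typical first exercise on such a micro-cluster targets the distributed-memory paradigm with MPI. A minimal sketch using mpi4py follows; the library choice and launch command are assumptions, since the session does not prescribe specific software.

```python
# hello_mpi.py: a distributed-memory "hello" across the cluster's boards.
# Launch with something like: mpirun -np 4 -hostfile hosts python3 hello_mpi.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()           # this process's id within the job
size = comm.Get_size()           # total number of processes in the job
node = MPI.Get_processor_name()  # which single-board computer we are on

print(f"Hello from rank {rank} of {size} on {node}")
```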


BMC Bioinformatics | 2011

An efficient and extensible approach for compressing phylogenetic trees

Suzanne J. Matthews; Tiffani L. Williams

Background: Biologists require new algorithms to efficiently compress and store their large collections of phylogenetic trees. Our previous work showed that TreeZip is a promising approach for compressing phylogenetic trees. In this paper, we extend our TreeZip algorithm to handle trees with weighted branches. Furthermore, by using the compressed TreeZip file as input, we have designed an extensible decompressor that can extract subcollections of trees, compute majority and strict consensus trees, and merge tree collections using set operations such as union, intersection, and set difference.

Results: On unweighted phylogenetic trees, TreeZip is able to compress Newick files in excess of 98%. On weighted phylogenetic trees, TreeZip is able to compress a Newick file by at least 73%. TreeZip can be combined with 7zip with little overhead, allowing space savings in excess of 99% (unweighted) and 92% (weighted). Unlike TreeZip, 7zip is not immune to branch rotations, and performs worse as the level of variability in the Newick string representation increases. Finally, since the TreeZip compressed text (TRZ) file contains all the semantic information in a collection of trees, we can easily filter and decompress a subset of trees of interest (such as the set of unique trees), or build the resulting consensus tree in a matter of seconds. We also show the ease with which set operations can be performed on TRZ files, at speeds quicker than those on Newick or 7zip-compressed Newick files, and without loss of space savings.

Conclusions: TreeZip is an efficient approach for compressing large collections of phylogenetic trees. The semantic and compact nature of the TRZ file allows it to be operated upon directly and quickly, without a need to decompress the original Newick file. We believe that TreeZip will be vital for compressing and archiving trees in the biological community.
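
TreeZip's savings come from storing each unique bipartition once, together with the set of trees that contain it; consensus computations then reduce to simple queries on those occurrence sets. The sketch below illustrates that representation only; it is not the TRZ format, and the bipartition encodings are invented for illustration.

```python
from collections import defaultdict

def compress(trees):
    """Map each unique bipartition to the set of tree indices containing
    it. `trees` is a list of sets of bipartition encodings. This mimics
    TreeZip's central idea only; the real TRZ format also stores taxa,
    branch weights, and compact bitstrings."""
    index = defaultdict(set)
    for i, tree in enumerate(trees):
        for bp in tree:
            index[bp].add(i)
    return index, len(trees)

def majority_consensus(index, n):
    """Bipartitions appearing in more than half of the trees."""
    return {bp for bp, occ in index.items() if len(occ) > n / 2}

def strict_consensus(index, n):
    """Bipartitions appearing in every tree."""
    return {bp for bp, occ in index.items() if len(occ) == n}

# Toy collection: "AB" is in all three trees; "ABC" and "ABD" in one each.
trees = [frozenset({"AB", "ABC"}), frozenset({"AB", "ABD"}), frozenset({"AB"})]
index, n = compress(trees)
print(majority_consensus(index, n))  # {'AB'}
print(strict_consensus(index, n))    # {'AB'}
```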


international symposium on bioinformatics research and applications | 2010

A novel approach for compressing phylogenetic trees

Suzanne J. Matthews; Seung-Jin Sul; Tiffani L. Williams

Phylogenetic trees are tree structures that depict relationships between organisms. Popular analysis techniques often produce large collections of candidate trees, which are expensive to store. We introduce TreeZip, a novel algorithm to compress phylogenetic trees based on their shared evolutionary relationships. We evaluate TreeZip's performance on fourteen tree collections ranging from 2,505 trees on 328 taxa to 150,000 trees on 525 taxa, corresponding to 0.6 MB to 434 MB in storage. Our results show that TreeZip is very effective, typically compressing a tree file to less than 2% of its original size. When coupled with standard compression methods such as 7zip, TreeZip can compress a file to less than 1% of its original size. Our results strongly suggest that TreeZip is very effective at compressing phylogenetic trees, which allows for easier exchange of data with colleagues around the world.


bioinformatics and biomedicine | 2008

New Approaches to Compare Phylogenetic Search Heuristics

Seung-Jin Sul; Suzanne J. Matthews; Tiffani L. Williams

We present novel insights into the behavior of two maximum parsimony heuristics for building evolutionary trees of different sizes. First, our results show that the heuristics find different classes of good-scoring trees, where the different classes of trees may have significant evolutionary implications. Second, we develop a new entropy-based measure to quantify the diversity among the evolutionary trees found by the heuristics. Overall, topological distance measures such as the Robinson-Foulds distance identify more diversity among a collection of trees than parsimony scores, which implies that more powerful heuristics could be designed using a combination of parsimony scores and topological distances. Thus, by understanding phylogenetic heuristic behavior, better heuristics can be designed, ultimately leading to more accurate evolutionary trees.


IEEE Transactions on Emerging Topics in Computing | 2017

Leveraging MapReduce and Synchrophasors for Real-Time Anomaly Detection in the Smart Grid

Suzanne J. Matthews; Aaron St. Leger

The rapid detection of anomalous behavior in SCADA systems such as the U.S. power grid is critical for system resiliency and operator response in cases of power fluctuations due to hazardous weather conditions or other events. Phasor measurement units (PMUs) are time-synchronized devices that provide accurate synchrophasor measurements in power grids. The rapid deployment of PMUs enables improved real-time situational awareness for grid operators through wide area measurement systems. However, the quantity and rate of measurements obtained from PMUs are significantly higher than from traditional devices, and continue to grow as more PMUs are deployed. Efficient algorithms for processing large-scale PMU data and notifying operators of anomalies are critical for real-time system monitoring. In this paper, we propose a novel, two-step anomaly detection approach that processes raw PMU data using the MapReduce paradigm. We implement our approach on a multicore system to process a dataset derived from real PMUs containing 4,500 PMUs (…)
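
The two-step structure maps naturally onto a multicore map/reduce pattern: a map step scans each PMU's stream for out-of-band samples, and a reduce step aggregates flags across PMUs so that single-sensor glitches are not reported as grid-level events. The sketch below is a minimal single-machine illustration; the nominal frequency band, the quorum rule, and the data are invented, not the paper's detector or dataset.

```python
from collections import Counter
from multiprocessing import Pool

NOMINAL_HZ, TOLERANCE = 60.0, 0.05  # invented tolerance band

def map_pmu(pmu_stream):
    """Map step: flag (timestamp, pmu_id) for out-of-band frequency samples."""
    pmu_id, samples = pmu_stream
    return [(t, pmu_id) for t, hz in samples if abs(hz - NOMINAL_HZ) > TOLERANCE]

def reduce_flags(flag_lists, quorum=2):
    """Reduce step: timestamps flagged by at least `quorum` PMUs are
    reported as grid-level anomalies rather than sensor noise."""
    counts = Counter(t for flags in flag_lists for t, _ in flags)
    return sorted(t for t, c in counts.items() if c >= quorum)

if __name__ == "__main__":
    # Synthetic data: two PMUs see a dip at t=3; one PMU glitches at t=7.
    streams = [
        ("pmu1", [(1, 60.00), (3, 59.90), (7, 60.00)]),
        ("pmu2", [(1, 60.01), (3, 59.88), (7, 60.01)]),
        ("pmu3", [(1, 59.99), (3, 60.00), (7, 60.20)]),
    ]
    with Pool() as pool:
        flags = pool.map(map_pmu, streams)  # map step runs in parallel
    print(reduce_flags(flags))  # [3]
```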

Collaboration


Dive into Suzanne J. Matthews's collaboration.

Top Co-Authors

Aaron St. Leger
United States Military Academy

James Wolfer
Indiana University South Bend