Debra S. Goldberg | Researchain

Publication

Featured research published by Debra S. Goldberg.

PLOS Computational Biology | 2009

Questioning the Ubiquity of Neofunctionalization

Todd A. Gibson; Debra S. Goldberg

Gene duplication provides much of the raw material from which functional diversity evolves. Two evolutionary mechanisms have been proposed that generate functional diversity: neofunctionalization, the de novo acquisition of function by one duplicate, and subfunctionalization, the partitioning of ancestral functions between gene duplicates. With protein interactions as a surrogate for protein functions, evidence of prodigious neofunctionalization and subfunctionalization has been identified in analyses of empirical protein interactions and evolutionary models of protein interactions. However, we have identified three phenomena that have contributed to neofunctionalization being erroneously identified as a significant factor in protein interaction network evolution. First, self-interacting proteins are underreported in interaction data due to biological artifacts and design limitations in the two most common high-throughput protein interaction assays. Second, evolutionary inferences have been drawn from paralog analysis without consideration for concurrent and subsequent duplication events. Third, the theoretical model of prodigious neofunctionalization is unable to reproduce empirical network clustering and relies on untenable parameter requirements. In light of these findings, we believe that protein interaction evolution is more persuasively characterized by subfunctionalization and self-interactions.

BMC Bioinformatics | 2008

Improving protein function prediction methods with integrated literature data

Aaron P Gabow; Sonia M. Leach; William A. Baumgartner; Lawrence Hunter; Debra S. Goldberg

Background: Determining the function of uncharacterized proteins is a major challenge in the post-genomic era due to the problem's complexity and scale. Identifying a protein's function contributes to an understanding of its role in the involved pathways, its suitability as a drug target, and its potential for protein modifications. Several graph-theoretic approaches predict unidentified functions of proteins by using the functional annotations of better-characterized proteins in protein-protein interaction networks. We systematically consider the use of literature co-occurrence data, introduce a new method for quantifying the reliability of co-occurrence and test how performance differs across species. We also quantify changes in performance as the prediction algorithms annotate with increased specificity. Results: We find that including information on the co-occurrence of proteins within an abstract greatly boosts performance in the Functional Flow graph-theoretic function prediction algorithm in yeast, fly and worm. This increase in performance is not simply due to the presence of additional edges, since supplementing protein-protein interactions with co-occurrence data outperforms supplementing with a comparably-sized genetic interaction dataset. Through the combination of protein-protein interactions and co-occurrence data, the neighborhood around unknown proteins is quickly connected to well-characterized nodes which global prediction algorithms can exploit. Our method for quantifying co-occurrence reliability shows superior performance to the other methods, particularly at threshold values around 10%, which yield the best trade-off between coverage and accuracy. In contrast, the traditional way of asserting co-occurrence when at least one abstract mentions both proteins proves to be the worst method for generating co-occurrence data, introducing too many false positives. Annotating the functions with greater specificity is harder, but co-occurrence data still proves beneficial. Conclusion: Co-occurrence data is a valuable supplemental source for graph-theoretic function prediction algorithms. A rapidly growing literature corpus ensures that co-occurrence data is a readily-available resource for nearly every studied organism, particularly those with small protein interaction databases. Though arguably biased toward known genes, co-occurrence data provides critical additional links to well-studied regions in the interaction network that graph-theoretic function prediction algorithms can exploit.

Bioinformatics | 2011

Improving evolutionary models of protein interaction networks

Todd Gibson; Debra S. Goldberg

Motivation: Theoretical models of biological networks are valuable tools in evolutionary inference. Theoretical models based on gene duplication and divergence provide biologically plausible evolutionary mechanics. Similarities found between empirical networks and their theoretically generated counterparts are considered evidence of the role modeled mechanics play in biological evolution. However, the method by which these models are parameterized can lead to questions about the validity of the inferences. Selecting parameter values in order to produce a particular topological value obfuscates the possibility that the model may produce a similar topology for a large range of parameter values. Alternately, a model may produce a large range of topologies, allowing (incorrect) parameter values to produce a valid topology from an otherwise flawed model. In order to lend biological credence to the modeled evolutionary mechanics, parameter values should be derived from the empirical data. Furthermore, recent work indicates that the timing and fate of gene duplications are critical to proper derivation of these parameters. Results: We present a methodology for deriving evolutionary rates from empirical data that is used to parameterize duplication and divergence models of protein interaction network evolution. Our method avoids shortcomings of previous methods, which failed to consider the effect of subsequent duplications. From our parameter values, we find that concurrent and existing duplication and divergence models are insufficient for modeling protein interaction network evolution. We introduce a model enhancement based on heritable interaction sites on the surface of a protein and find that it more closely reflects the high clustering found in the empirical network.
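
To make the idea of a duplication and divergence model concrete, here is a minimal Python sketch of a generic version of such a model. It is not the model or parameterization described in the abstract: the function names, the single-edge seed graph, and the two parameters `retain_prob` (chance a duplicate keeps each inherited interaction) and `link_prob` (chance the duplicate interacts with its parent) are all illustrative assumptions.

```python
import random


def duplicate_and_diverge(n_nodes, retain_prob, link_prob, seed=0):
    """Grow an undirected graph by repeated node duplication (toy model).

    Hypothetical parameters, chosen for illustration only:
      retain_prob: probability the copy keeps each of the parent's edges
      link_prob:   probability the copy is linked to its parent
    """
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # assumed seed graph: a single interaction
    while len(adj) < n_nodes:
        parent = rng.choice(sorted(adj))
        child = len(adj)
        adj[child] = set()
        # Divergence: each inherited interaction survives independently.
        for neighbor in sorted(adj[parent]):
            if rng.random() < retain_prob:
                adj[child].add(neighbor)
                adj[neighbor].add(child)
        # Optional interaction between the two duplicates.
        if rng.random() < link_prob:
            adj[child].add(parent)
            adj[parent].add(child)
    return adj


def count_triangles(adj):
    """Count each triangle once by enforcing u < v < w."""
    total = 0
    for u in adj:
        for v in adj[u]:
            if v > u:
                total += sum(1 for w in adj[u] & adj[v] if w > v)
    return total


if __name__ == "__main__":
    graph = duplicate_and_diverge(200, retain_prob=0.4, link_prob=0.1, seed=1)
    print(len(graph), "nodes,", count_triangles(graph), "triangles")
```

In this toy version, setting `link_prob` to 0 with the single-edge seed keeps the graph bipartite, so it can never contain a triangle. That is a property of the sketch only, but it shows how a model's mechanics can limit the clustering it is able to produce regardless of the other parameter values.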

PLOS Computational Biology | 2011

A first attempt to bring computational biology into advanced high school biology classrooms.

Suzanne Renick Gallagher; William Coon; Kristin Donley; Abby Scott; Debra S. Goldberg

Computer science has become ubiquitous in many areas of biological research, yet most high school and even college students are unaware of this. As a result, many college biology majors graduate without adequate computational skills for contemporary fields of biology. The absence of a computational element in secondary school biology classrooms is of growing concern to the computational biology community and biology teachers who would like to acquaint their students with updated approaches in the discipline. We present a first attempt to correct this absence by introducing a computational biology element to teach genetic evolution into advanced biology classes in two local high schools. Our primary goal was to show students how computation is used in biology and why a basic understanding of computation is necessary for research in many fields of biology. This curriculum is intended to be taught by a computational biologist who has worked with a high school advanced biology teacher to adapt the unit for his/her classroom, but a motivated high school teacher comfortable with mathematics and computing may be able to teach this alone. In this paper, we present our curriculum, which takes into consideration the constraints of the required curriculum, and discuss our experiences teaching it. We describe the successes and challenges we encountered while bringing this unit to high school students, discuss how we addressed these challenges, and make suggestions for future versions of this curriculum. We believe that our curriculum can be a valuable seed for further development of computational activities aimed at high school biology students. Further, our experiences may be of value to others teaching computational biology at this level. Our curriculum can be obtained at http://ecsite.cs.colorado.edu/?page_id=149#biology or by contacting the authors.

Integrating Technology into Computer Science Education | 2013

Beyond computer science: computational thinking across disciplines

Amber Settle; Debra S. Goldberg; Valerie Barr

In her influential CACM article, Jeannette Wing argues that computational thinking is an emerging basic skill that should become an integral part of every child’s education [14]. The potential impact of any approach for incorporating computational thinking into the curriculum is limited by the low enrollment in computing classes and the homogeneous population choosing these classes. While there are continuing efforts to draw students into computing courses, a complementary approach is to bring computational thinking into courses already taken by a diverse set of students. Because computing is transforming society and impacting many areas of study, providing students with meaningful exposure to computational thinking in other fields can be done without compromising existing learning goals.

Workshop on Algorithms in Bioinformatics | 2015

The Topological Profile of a Model of Protein Network Evolution Can Direct Model Improvement

Todd A. Gibson; Debra S. Goldberg

Biological networks are an attractive construct for studying evolution. One method for inferring evolutionary mechanics is to construct models which generate networks sharing topological characteristics with their empirical counterparts. It remains a challenge to assess, modify, and improve a model based on the topological values it generates. A large range of parameter values may produce a similar topology, and topological properties may vacillate in unexpected ways, frustrating attempts to determine whether the model is flawed or model parameter values are incorrect. We introduce a new method for evaluating the fidelity of an evolutionary network model with respect to topological characteristics by driving topological characteristics towards empirical values concurrently with network generation. From this we compute a topological profile which defines the ability of the network model to produce a desired topology. The topological profile also measures the volatility of characteristics, and the interrelationships among topological characteristics. Our method shows that a top-rated protein interaction network model cannot produce the empirical number of triangles. As triangle count is driven to the empirical value, additional characteristics are propelled towards empirical values. These findings suggest that new model mechanics that increase the number of triangles produced will best enhance the existing model. By providing systematic evaluation of the ability of model mechanics to produce desired topological properties, our framework can help to focus the search for biologically plausible and relevant processes important to network evolution.

Technical Symposium on Computer Science Education | 2014

E pluribus, plurima: the synergy of interdisciplinary class groups

Debra S. Goldberg; Elizabeth K. White

Computer science is increasingly becoming interdisciplinary, with applications not only in scientific disciplines, but also in the arts, humanities, and social sciences. Training computer scientists to work in diverse application disciplines is imperative for modern departments. We have had success using interdisciplinary groups for this purpose in a computational biology class, Algorithms for Molecular Biology. In this class, carefully-balanced interdisciplinary groups learn to take advantage of each other's abilities, and to communicate effectively with students with a much different background. From this diversity, we get much more (e pluribus, plurima) than would be possible if we tried to train all students to have a more homogeneous blend of multiple disciplinary knowledge. Within a single semester, students go from virtually no understanding of one discipline to completing research projects on a relevant problem that they have defined themselves.

International Conference on Bioinformatics | 2014

Large highly connected clusters in protein-protein interaction networks

Suzanne Renick Gallagher; Debra S. Goldberg

The edge (vertex) connectivity of a graph is the minimum number of edges (vertices) that must be removed to disconnect the graph. Connectivity is an important property of a graph but has seldom been used in the study of protein-protein interaction (PPI) and other biological networks. Connectivity differs from edge density in that it is based on the number of paths between all vertices in the graph, while edge density is based solely on the number of edges. Connectivity may be a better indicator of large clusters than edge density: in real-world networks, as the number of vertices in a cluster increases, the edge density tends to decrease rapidly, but the connectivity tends to remain constant. We developed algorithms to search for subgraphs with high connectivities, proved their correctness and complexity, and applied these algorithms to Saccharomyces cerevisiae and human PPI networks. We discovered that PPI networks have large subgraphs (20-130 vertices) with high connectivity that were previously unrecognized. The function of these subgraphs remains unknown, but they are far more highly connected than subgraphs in random networks and are significantly enriched with proteins with shared biological functions, suggesting that these are biologically significant.
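
The contrast between edge density and edge connectivity can be illustrated with a small pure-Python sketch. It is not the authors' subgraph search algorithm; it only computes the two quantities defined above for simple undirected graphs, using a standard unit-capacity maximum-flow argument (the edge connectivity equals the smallest number of edge-disjoint paths from one fixed vertex to any other vertex). All function names are assumptions made for this example.

```python
from collections import deque


def edge_density(n, edges):
    """Fraction of the n*(n-1)/2 possible undirected edges that are present."""
    return 2 * len(edges) / (n * (n - 1))


def max_edge_disjoint_paths(n, edges, s, t):
    """Number of edge-disjoint s-t paths, via BFS augmenting paths."""
    cap = {}
    adj = [[] for _ in range(n)]
    for u, v in edges:  # assumes a simple graph (no repeated edges)
        cap[(u, v)] = 1
        cap[(v, u)] = 1
        adj[u].append(v)
        adj[v].append(u)
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        v = t
        while parent[v] is not None:  # push one unit along the found path
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1


def edge_connectivity(n, edges):
    """Minimum number of edges whose removal disconnects the graph."""
    if n < 2:
        return 0
    return min(max_edge_disjoint_paths(n, edges, 0, t) for t in range(1, n))


def ring(n):
    """Cycle on n vertices, used here only as a toy example."""
    return [(i, (i + 1) % n) for i in range(n)]


if __name__ == "__main__":
    for size in (6, 12, 24):
        print(size, round(edge_density(size, ring(size)), 3),
              edge_connectivity(size, ring(size)))
```

In the ring example the edge density shrinks as vertices are added while the edge connectivity stays at 2, a toy instance of the behavior the abstract attributes to large clusters in real-world networks.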

International Conference on Bioinformatics | 2013

Evaluating theoretical models of protein interaction network evolution without seed graphs

Todd A. Gibson; Debra S. Goldberg

Here we develop an alternate method to evaluate the evolutionary mechanics of theoretical network models which is free of the bias introduced by seed graph selection. We run a model in reverse directly on empirical data, and then run the model forward to generate a network topology to compare against the empirical data. We implement this method on a well-regarded gene duplication and divergence model, and find that it is unable to generate the high clustering found in the empirical data.

F1000Research | 2013

Characterization of known protein complexes using k-connectivity and other topological measures

Suzanne Renick Gallagher; Debra S. Goldberg

Many protein complexes are densely packed, so proteins within complexes often interact with several other proteins in the complex. Steric constraints prevent most proteins from simultaneously binding more than a handful of other proteins, regardless of the number of proteins in the complex. Because of this, as complex size increases, several measures of the complex decrease within protein-protein interaction networks. However, k-connectivity, the number of vertices or edges that need to be removed in order to disconnect a graph, may be consistently high for protein complexes. The property of k-connectivity has been little used previously in the investigation of protein-protein interactions. To understand the discriminative power of k-connectivity and other topological measures for identifying unknown protein complexes, we characterized these properties in known Saccharomyces cerevisiae protein complexes in networks generated both from highly accurate X-ray crystallography experiments which give an accurate model of each complex, and also as the complexes appear in high-throughput yeast 2-hybrid studies in which new complexes may be discovered. We also computed these properties for appropriate random subgraphs. We found that clustering coefficient, mutual clustering coefficient, and k-connectivity are better indicators of known protein complexes than edge density, degree, or betweenness. This suggests new directions for future protein complex-finding algorithms.
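
As a small illustration of one of the topological measures named above, the following Python sketch computes the standard local clustering coefficient (the fraction of a vertex's neighbor pairs that are themselves connected) and its average over a graph. It is a textbook definition rather than the exact procedure used in the study, and the adjacency-dictionary representation and function names are assumptions made for this example.

```python
def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are connected to each other."""
    neighbors = adj[v]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention: undefined cases count as zero
    links = sum(1 for a in neighbors for b in neighbors
                if a < b and b in adj[a])
    return 2 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean local clustering coefficient over all vertices."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


if __name__ == "__main__":
    # Toy graphs: a fully connected 3-protein complex and a hub with 3 spokes.
    triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
    print(average_clustering(triangle), average_clustering(star))
```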