Publication


Featured research published by Andrey Rzhetsky.


Cell | 1999

A Spatial Map of Olfactory Receptor Expression in the Drosophila Antenna

Leslie B. Vosshall; Hubert Amrein; Pavel Morozov; Andrey Rzhetsky; Richard Axel

Insects provide an attractive system for the study of olfactory sensory perception. We have identified a novel family of seven transmembrane domain proteins, encoded by 100 to 200 genes, that is likely to represent the family of Drosophila odorant receptors. Members of this gene family are expressed in topographically defined subpopulations of olfactory sensory neurons in either the antenna or the maxillary palp. Sensory neurons express different complements of receptor genes, such that individual neurons are functionally distinct. The isolation of candidate odorant receptor genes along with a genetic analysis of olfactory-driven behavior in insects may ultimately afford a system to understand the mechanistic link between odor recognition and behavior.


Cell | 2001

A Chemosensory Gene Family Encoding Candidate Gustatory and Olfactory Receptors in Drosophila

Kristin Scott; Roscoe Brady; Anibal Cravchik; Pavel Morozov; Andrey Rzhetsky; Charles S. Zuker; Richard Axel

A novel family of candidate gustatory receptors (GRs) was recently identified in searches of the Drosophila genome. We have performed in situ hybridization and transgene experiments that reveal expression of these genes in both gustatory and olfactory neurons in adult flies and larvae. This gene family is likely to encode both odorant and taste receptors. We have visualized the projections of chemosensory neurons in the larval brain and observe that neurons expressing different GRs project to discrete loci in the antennal lobe and subesophageal ganglion. These data provide insight into the diversity of chemosensory recognition and an initial view of the representation of gustatory information in the fly brain.


Nature Biotechnology | 2010

The BioPAX community standard for pathway data sharing

Emek Demir; Michael P. Cary; Suzanne M. Paley; Ken Fukuda; Christian Lemer; Imre Vastrik; Guanming Wu; Peter D'Eustachio; Carl F. Schaefer; Joanne S. Luciano; Frank Schacherer; Irma Martínez-Flores; Zhenjun Hu; Verónica Jiménez-Jacinto; Geeta Joshi-Tope; Kumaran Kandasamy; Alejandra López-Fuentes; Huaiyu Mi; Elgar Pichler; Igor Rodchenkov; Andrea Splendiani; Sasha Tkachev; Jeremy Zucker; Gopal Gopinath; Harsha Rajasimha; Ranjani Ramakrishnan; Imran Shah; Mustafa Syed; Nadia Anwar; Özgün Babur

Biological Pathway Exchange (BioPAX) is a standard language to represent biological pathways at the molecular and cellular level and to facilitate the exchange of pathway data. The rapid growth of the volume of pathway data has spurred the development of databases and computational tools to aid interpretation; however, use of these data is hampered by the current fragmentation of pathway information across many databases with incompatible formats. BioPAX, which was created through a community process, solves this problem by making pathway data substantially easier to collect, index, interpret and share. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions and gene regulation networks. Using BioPAX, millions of interactions, organized into thousands of pathways, from many organisms are available from a growing number of databases. This large amount of pathway data in a computable form will support visualization, analysis and biological discovery.


Journal of Biomedical Informatics | 2004

GeneWays: a system for extracting, analyzing, visualizing, and integrating molecular pathway data

Andrey Rzhetsky; Ivan Iossifov; Tomohiro Koike; Michael Krauthammer; Pauline Kra; Mitzi Morris; Hong Yu; Pablo Ariel Duboue; Wubin Weng; W. John Wilbur; Vasileios Hatzivassiloglou; Carol Friedman

The immense growth in the volume of research literature and experimental data in the field of molecular biology calls for efficient automatic methods to capture and store information. In recent years, several groups have worked on specific problems in this area, such as automated selection of articles pertinent to molecular biology, or automated extraction of information using natural-language processing, information visualization, and generation of specialized knowledge bases for molecular biology. GeneWays is an integrated system that combines several such subtasks. It analyzes interactions between molecular substances, drawing on multiple sources of information to infer a consensus view of molecular networks. GeneWays is designed as an open platform, allowing researchers to query, review, and critique stored information.


Journal of Molecular Evolution | 1992

Statistical properties of the ordinary least-squares, generalized least-squares, and minimum-evolution methods of phylogenetic inference

Andrey Rzhetsky; Masatoshi Nei

Summary: Statistical properties of the ordinary least-squares (OLS), generalized least-squares (GLS), and minimum-evolution (ME) methods of phylogenetic inference were studied by considering the case of four DNA sequences. Analytical study has shown that all three methods are statistically consistent in the sense that as the number of nucleotides examined (m) increases they tend to choose the true tree as long as the evolutionary distances used are unbiased. When the evolutionary distances (d_ij) are large and the sequences under study are not very long, however, the OLS criterion is often biased and may choose an incorrect tree more often than expected under random choice. It is also shown that the variance-covariance matrix of the d_ij becomes singular as the d_ij approach zero, and thus the GLS may not be applicable when the d_ij are small. The ME method suffers from neither of these problems, and the ME criterion is statistically unbiased. Computer simulation has shown that the ME method is more efficient in obtaining the true tree than the OLS and GLS methods and that the OLS is more efficient than the GLS when the d_ij are small, but otherwise the GLS is more efficient.


Proceedings of the National Academy of Sciences of the United States of America | 2007

Probing genetic overlap among complex human phenotypes

Andrey Rzhetsky; David Wajngurt; Naeun Park; Tian Zheng

Geneticists and epidemiologists often observe that certain hereditary disorders co-occur in individual patients significantly more (or significantly less) frequently than expected, suggesting there is a genetic variation that predisposes its bearer to multiple disorders, or that protects against some disorders while predisposing to others. We suggest that, by using a large number of phenotypic observations about multiple disorders and an appropriate statistical model, we can infer genetic overlaps between phenotypes. Our proof-of-concept analysis of 1.5 million patient records and 161 disorders indicates that disease phenotypes form a highly connected network of strong pairwise correlations. Our modeling approach, under appropriate assumptions, allows us to estimate from these correlations the size of putative genetic overlaps. For example, we suggest that autism, bipolar disorder, and schizophrenia share significant genetic overlaps. Our disease network hypothesis can be immediately exploited in the design of genetic mapping approaches that involve joint linkage or association analyses of multiple seemingly disparate phenotypes.


Proceedings of the National Academy of Sciences of the United States of America | 2008

Network properties of genes harboring inherited disease mutations

Igor Feldman; Andrey Rzhetsky; Dennis Vitkup

By analyzing, in parallel, large literature-derived and high-throughput experimental datasets we investigate genes harboring human inherited disease mutations in the context of molecular interaction networks. Our results demonstrate that network properties influence the likelihood and phenotypic consequences of disease mutations. Genes with intermediate connectivities have the highest probability of harboring germ-line disease mutations, suggesting that disease genes tend to occupy an intermediate niche in terms of their physiological and cellular importance. Our analysis of tissue expression profiles supports this view. We show that disease mutations are less likely to occur in essential genes compared with all human genes. Disease genes display significant functional clustering in the analyzed molecular network. For about one-third of known disorders with two or more associated genes we find physical clusters of genes with the same phenotype. These clusters are likely to represent disorder-specific functional modules and suggest a framework for identifying yet-undiscovered disease genes.


Gene | 2000

Using BLAST for identifying gene and protein names in journal articles

Michael Krauthammer; Andrey Rzhetsky; Pavel Morozov; Carol Friedman

We describe a system which automatically identifies gene and protein names in journal articles, an important and non-trivial first step in knowledge extraction of protein and gene actions. Our system uses a database of gene and protein names and is based on BLAST [Altschul et al., Nucleic Acids Res. 25 (1997) 3389-3402], a popular tool for DNA and protein sequence comparison. We describe a method that consists of mapping sequences of text characters into sequences of nucleotides that can be processed by BLAST. We demonstrate that this approach is feasible: the system matches gene and protein names with a recall of 78.8% and a precision of 71.7%, which includes names that are not part of the system database. An analysis of the results suggests techniques that can be used to improve performance further.
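
The text-to-nucleotide mapping mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not specify the encoding, so the scheme below (each ASCII character becomes a fixed four-letter code over A, C, G, T) and the function names `text_to_nucleotides` and `nucleotides_to_text` are assumptions for illustration only, not the authors' actual method.

```python
NUCLEOTIDES = "ACGT"  # assumed alphabet order; each letter stands for one base-4 digit


def text_to_nucleotides(text):
    """Encode ASCII text as a nucleotide string, four bases per character (assumed scheme)."""
    bases = []
    for byte in text.encode("ascii"):
        # Split the 8-bit value into four 2-bit digits, most significant first.
        for shift in (6, 4, 2, 0):
            bases.append(NUCLEOTIDES[(byte >> shift) & 3])
    return "".join(bases)


def nucleotides_to_text(seq):
    """Invert text_to_nucleotides: read four bases at a time and rebuild each character."""
    chars = []
    for i in range(0, len(seq), 4):
        value = 0
        for base in seq[i:i + 4]:
            value = value * 4 + NUCLEOTIDES.index(base)
        chars.append(chr(value))
    return "".join(chars)


encoded = text_to_nucleotides("p53")
decoded = nucleotides_to_text(encoded)
```

With a fixed-length code like this, a name of n characters always becomes a sequence of 4n bases, so an alignment hit in nucleotide space can be mapped back to character positions in the text.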


BMC Evolutionary Biology | 2002

Birth and death of protein domains: A simple model of evolution explains power law behavior

Georgy P. Karev; Yuri I. Wolf; Andrey Rzhetsky; Faina S. Berezovskaya; Eugene V. Koonin

Background: Power distributions appear in numerous biological, physical and other contexts, which appear to be fundamentally different. In biology, power laws have been claimed to describe the distributions of the connections of enzymes and metabolites in metabolic networks, the number of interaction partners of a given protein, the number of members in paralogous families, and other quantities. In network analysis, power laws imply evolution of the network with preferential attachment, i.e. a greater likelihood of nodes being added to pre-existing hubs. Exploration of different types of evolutionary models in an attempt to determine which of them lead to power law distributions has the potential of revealing non-trivial aspects of genome evolution.

Results: A simple model of evolution of the domain composition of proteomes was developed, with the following elementary processes: i) domain birth (duplication with divergence), ii) death (inactivation and/or deletion), and iii) innovation (emergence from non-coding or non-globular sequences or acquisition via horizontal gene transfer). This formalism can be described as a birth, death and innovation model (BDIM). The formulas for equilibrium frequencies of domain families of different size and the total number of families at equilibrium are derived for a general BDIM. All asymptotics of equilibrium frequencies of domain families possible for the given type of models are found and their appearance depending on model parameters is investigated. It is proved that the power law asymptotics appears if, and only if, the model is balanced, i.e. domain duplication and deletion rates are asymptotically equal up to the second order. It is further proved that any power asymptotic with the degree not equal to -1 can appear only if the hypothesis of independence of the duplication/deletion rates on the size of a domain family is rejected. Specific cases of BDIMs, namely simple, linear, polynomial and rational models, are considered in detail and the distributions of the equilibrium frequencies of domain families of different size are determined for each case. We apply the BDIM formalism to the analysis of the domain family size distributions in prokaryotic and eukaryotic proteomes and show an excellent fit between these empirical data and a particular form of the model, the second-order balanced linear BDIM. Calculation of the parameters of these models suggests surprisingly high innovation rates, comparable to the total domain birth (duplication) and elimination rates, particularly for prokaryotic genomes.

Conclusions: We show that a straightforward model of genome evolution, which does not explicitly include selection, is sufficient to explain the observed distributions of domain family sizes, in which power laws appear as asymptotic. However, for the model to be compatible with the data, there has to be a precise balance between domain birth, death and innovation rates, and this is likely to be maintained by selection. The developed approach is oriented at a mathematical description of evolution of domain composition of proteomes, but a simple reformulation could be applied to models of other evolving networks with preferential attachment.
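
As a rough illustration of how a balanced birth, death and innovation model can yield a power law, the sketch below computes equilibrium family-size frequencies for a generic birth-death chain. The abstract does not give the formulas, so the balance conditions used here (innovation into size class 1 balanced by death from class 1, and birth from class i balanced by death from class i+1) are an assumption based on standard birth-death processes, and `bdim_equilibrium` is a hypothetical name.

```python
def bdim_equilibrium(n_max, birth, death, innovation):
    """Equilibrium frequencies f_1..f_n_max of family size classes (assumed balance conditions).

    birth(i) and death(i) are the total birth and death rates of a family of size i;
    innovation is the rate at which new families of size 1 appear.
    """
    # Class 1: innovation inflow equals death outflow, so f_1 = innovation / death(1).
    freqs = [innovation / death(1)]
    for i in range(1, n_max):
        # Flow from class i up to i+1 equals flow from class i+1 down to i.
        freqs.append(freqs[-1] * birth(i) / death(i + 1))
    return freqs


# Balanced case with per-domain rates independent of family size:
# total birth and death rates of a family of size i are both proportional to i.
freqs = bdim_equilibrium(
    n_max=1000,
    birth=lambda i: 1.0 * i,
    death=lambda i: 1.0 * i,
    innovation=1.0,
)
```

Under these assumptions the frequencies fall off as 1/i, a power law with degree -1, which is consistent with the abstract's statement that other degrees require duplication and deletion rates that depend on family size.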


Bioinformatics | 2003

Learning to predict protein–protein interactions from protein sequences

Shawn M. Gomez; William Stafford Noble; Andrey Rzhetsky

In order to understand the molecular machinery of the cell, we need to know about the multitude of protein-protein interactions that allow the cell to function. High-throughput technologies provide some data about these interactions, but so far that data is fairly noisy. Therefore, computational techniques for predicting protein-protein interactions could be of significant value. One approach to predicting interactions in silico is to produce from first principles a detailed model of a candidate interaction. We take an alternative approach, employing a relatively simple model that learns dynamically from a large collection of data. In this work, we describe an attraction-repulsion model, in which the interaction between a pair of proteins is represented as the sum of attractive and repulsive forces associated with small, domain- or motif-sized features along the length of each protein. The model is discriminative, learning simultaneously from known interactions and from pairs of proteins that are known (or suspected) not to interact. The model is efficient to compute and scales well to very large collections of data. In a cross-validated comparison using known yeast interactions, the attraction-repulsion method performs better than several competing techniques.

Collaboration


Top co-authors of Andrey Rzhetsky.

Ivan Iossifov

Cold Spring Harbor Laboratory

Masatoshi Nei

Pennsylvania State University

Shawn M. Gomez

University of North Carolina at Chapel Hill