Publication


Featured research published by Leon Goldovsky.


Nucleic Acids Research | 2005

Expansion of the BioCyc collection of pathway/genome databases to 160 genomes

Peter D. Karp; Christos A. Ouzounis; Caroline Moore-Kochlacs; Leon Goldovsky; Pallavi Kaipa; Dag Ahrén; Sophia Tsoka; Nikos Darzentas; Victor Kunin; Nuria Lopez-Bigas

The BioCyc database collection is a set of 160 pathway/genome databases (PGDBs) for most eukaryotic and prokaryotic species whose genomes have been completely sequenced to date. Each PGDB in the BioCyc collection describes the genome and predicted metabolic network of a single organism, inferred from the MetaCyc database, which is a reference source on metabolic pathways from multiple organisms. In addition, each bacterial PGDB includes predicted operons for the corresponding species. The BioCyc collection provides a unique resource for computational systems biology, namely global and comparative analyses of genomes and metabolic networks, and a supplement to the BioCyc resource of curated PGDBs. The Omics viewer available through the BioCyc website allows scientists to visualize combinations of gene expression, proteomics and metabolomics data on the metabolic maps of these organisms. This paper discusses the computational methodology by which the BioCyc collection has been expanded, and presents an aggregate analysis of the collection that includes the range of number of pathways present in these organisms, and the most frequently observed pathways. We seek scientists to adopt and curate individual PGDBs within the BioCyc collection. Only by harnessing the expertise of many scientists can we hope to produce biological databases that accurately reflect the depth and breadth of knowledge that the biomedical research community is producing.


PLOS Computational Biology | 2007

Construction, Visualisation, and Clustering of Transcription Networks from Microarray Expression Data

Tom C. Freeman; Leon Goldovsky; Markus Brosch; Stijn van Dongen; Pierre Mazière; Russell Grocock; Shiri Freilich; Janet M. Thornton; Anton J. Enright

Network analysis transcends conventional pairwise approaches to data analysis as the context of components in a network graph can be taken into account. Such approaches are increasingly being applied to genomics data, where functional linkages are used to connect genes or proteins. However, while microarray gene expression datasets are now abundant and of high quality, few approaches have been developed for analysis of such data in a network context. We present a novel approach for 3-D visualisation and analysis of transcriptional networks generated from microarray data. These networks consist of nodes representing transcripts connected by virtue of their expression profile similarity across multiple conditions. Analysing genome-wide gene transcription across 61 mouse tissues, we describe the unusual topography of the large and highly structured networks produced, and demonstrate how they can be used to visualise, cluster, and mine large datasets. This approach is fast, intuitive, and versatile, and allows the identification of biological relationships that may be missed by conventional analysis techniques. This work has been implemented in a freely available open-source application named BioLayout Express 3D.
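The network construction described in this abstract (transcripts as nodes, joined when their expression profiles are similar across conditions) can be sketched as follows. This is a minimal illustration only: the abstract does not name the similarity measure or cut-off, so the Pearson correlation, the 0.9 threshold, the function names and the toy values are all assumptions, not the published implementation.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def build_network(profiles, threshold=0.9):
    """Return edges (a, b, r) for transcript pairs with r >= threshold.

    The threshold is an assumed, illustrative cut-off.
    """
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if r >= threshold:
            edges.append((a, b, r))
    return edges


# Toy expression values across four conditions (invented for illustration).
profiles = {
    "t1": [1.0, 2.0, 3.0, 4.0],
    "t2": [2.0, 4.1, 6.0, 8.2],
    "t3": [4.0, 3.0, 2.0, 1.0],
}
edges = build_network(profiles)
```

Only the co-varying pair "t1" and "t2" is linked here; the anti-correlated "t3" stays unconnected. The all-pairs loop is quadratic in the number of transcripts, so a genome-wide dataset would need a vectorised or otherwise optimised variant.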


BMC Evolutionary Biology | 2009

Emergence, development and diversification of the TGF-β signalling pathway within the animal kingdom

Lukasz Huminiecki; Leon Goldovsky; Shiri Freilich; Aristidis Moustakas; Christos A. Ouzounis; Carl-Henrik Heldin

Background: The question of how genomic processes, such as gene duplication, give rise to co-ordinated organismal properties, such as emergence of new body plans, organs and lifestyles, is of importance in developmental and evolutionary biology. Herein, we focus on the diversification of the transforming growth factor-β (TGF-β) pathway – one of the fundamental and versatile metazoan signal transduction engines.
Results: After an investigation of 33 genomes, we show that the emergence of the TGF-β pathway coincided with appearance of the first known animal species. The primordial pathway repertoire consisted of four Smads and four receptors, similar to those observed in the extant genome of the early diverging tablet animal (Trichoplax adhaerens). We subsequently retrace duplications in ancestral genomes on the lineage leading to humans, as well as lineage-specific duplications, such as those which gave rise to novel Smads and receptors in teleost fishes. We conclude that the diversification of the TGF-β pathway can be parsimoniously explained according to the 2R model, with additional rounds of duplications in teleost fishes. Finally, we investigate duplications followed by accelerated evolution which gave rise to an atypical TGF-β pathway in free-living bacterial feeding nematodes of the genus Rhabditis.
Conclusion: Our results challenge the view of well-conserved developmental pathways. The TGF-β signal transduction engine has expanded through gene duplication, continually adopting new functions, as animals grew in anatomical complexity, colonized new environments, and developed an active immune system.


Applied Bioinformatics | 2005

BioLayout(Java): versatile network visualisation of structural and functional relationships.

Leon Goldovsky; Ildefonso Cases; Anton J. Enright; Christos A. Ouzounis

Visualisation of biological networks is becoming a common task for the analysis of high-throughput data. These networks correspond to a wide variety of biological relationships, such as sequence similarity, metabolic pathways, gene regulatory cascades and protein interactions. We present a general approach for the representation and analysis of networks of variable type, size and complexity. The application is based on the original BioLayout program (C-language implementation of the Fruchterman-Reingold layout algorithm), entirely re-written in Java to guarantee portability across platforms. BioLayout(Java) provides broader functionality, various analysis techniques, extensions for better visualisation and a new user interface. Examples of analysis of biological networks using BioLayout(Java) are presented.
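The force-directed layout named in this abstract can be illustrated with a textbook 2-D sketch. It is written in Python rather than C or Java and is not taken from BioLayout; the function name, frame size, iteration count and linear cooling schedule are assumptions chosen for brevity.

```python
import math
import random


def fruchterman_reingold(nodes, edges, width=1.0, height=1.0,
                         iterations=50, seed=0):
    """Return {node: [x, y]} positions from a basic force-directed layout."""
    rng = random.Random(seed)
    pos = {v: [rng.uniform(0, width), rng.uniform(0, height)] for v in nodes}
    k = math.sqrt(width * height / len(nodes))  # ideal distance between nodes
    temperature = width / 10.0                  # caps movement per iteration
    cooling = temperature / (iterations + 1)    # assumed linear cooling
    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}
        # Repulsion between every pair of nodes, magnitude k^2 / d.
        for i, v in enumerate(nodes):
            for u in nodes[i + 1:]:
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                disp[v][0] += dx / d * f
                disp[v][1] += dy / d * f
                disp[u][0] -= dx / d * f
                disp[u][1] -= dy / d * f
        # Attraction along each edge, magnitude d^2 / k.
        for v, u in edges:
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            disp[v][0] -= dx / d * f
            disp[v][1] -= dy / d * f
            disp[u][0] += dx / d * f
            disp[u][1] += dy / d * f
        # Move each node by at most `temperature`, clamped to the frame.
        for v in nodes:
            d = math.hypot(disp[v][0], disp[v][1]) or 1e-9
            step = min(d, temperature)
            pos[v][0] = min(width, max(0.0, pos[v][0] + disp[v][0] / d * step))
            pos[v][1] = min(height, max(0.0, pos[v][1] + disp[v][1] / d * step))
        temperature -= cooling
    return pos


# A four-node path graph as a toy input.
layout = fruchterman_reingold(["a", "b", "c", "d"],
                              [("a", "b"), ("b", "c"), ("c", "d")])
```

The fixed seed makes the result reproducible. The all-pairs repulsion step is quadratic in the number of nodes, which is why practical tools usually approximate it for large networks.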


Nucleic Acids Research | 2005

Measuring genome conservation across taxa: divided strains and united kingdoms

Victor Kunin; Dag Ahrén; Leon Goldovsky; Paul Janssen; Christos A. Ouzounis

Species evolutionary relationships have traditionally been defined by sequence similarities of phylogenetic marker molecules, recently followed by whole-genome phylogenies based on gene order, average ortholog similarity or gene content. Here, we introduce genome conservation, a novel metric of evolutionary distances between species that simultaneously takes into account both gene content and sequence similarity at the whole-genome level. Genome conservation represents a robust distance measure, as demonstrated by accurate phylogenetic reconstructions. The genome conservation matrix for all presently sequenced organisms exhibits a remarkable ability to define evolutionary relationships across all taxonomic ranges. An assessment of taxonomic ranks with genome conservation shows that certain ranks are inadequately described and raises the possibility for a more precise and quantitative taxonomy in the future. All phylogenetic reconstructions are available at the genome phylogeny server: <>.


EMBO Reports | 2005

Genome coverage, literally speaking

Paul Janssen; Leon Goldovsky; Victor Kunin; Nikos Darzentas; Christos A. Ouzounis

In late 2004, 200 complete genomes had been sequenced and made available to the research community. At the time of writing this viewpoint, that number had further risen to 221 and will have undoubtedly increased again before publication. These genomes, which represent a wide range of species from archaea to human, are a highly valuable knowledge resource for the scientific community. However, the sequencing of a full genome is just the first step in research; it must be followed by the functional characterization of genes and proteins. In this context, it is interesting to see how well represented these sequenced species are in terms of publications. We have thus obtained the number of abstracts published per species and normalized that count by the number of genes in that species to obtain a comparable measure for the number of publications per gene for all completed and published genomes. This simple measure highlights the current knowledge gap between various organisms and could further serve as a guideline for selecting genomes for sequencing projects, high-throughput functional genomics and database annotation efforts. The 200 complete genome sequences published by December 2004 included 118 genera, 166 species and 34 additional strains for 21 species. This rate translates to a doubling time of available genome sequences of less than two years (Janssen et al., 2003a). And it remains steady: in 2003, an average of one complete genome was released per week; 47 genomes were made available in the first 44 weeks of 2004. This trend will accelerate further, as more than 1,000 genome projects are currently underway (Bernal et al., 2001). For the 221 genomes currently available, the total number of predicted proteins is 822,114, according to the COGENT database (Janssen et al., 2003b). One of the great challenges for computational and experimental genomics is …
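The publications-per-gene measure described in this viewpoint is a simple ratio, sketched below. The species names and counts are invented placeholders for illustration, not figures from the article, and the function name is an assumption.

```python
def publications_per_gene(abstract_count, gene_count):
    """Normalize a species' abstract count by its number of genes."""
    if gene_count <= 0:
        raise ValueError("gene_count must be positive")
    return abstract_count / gene_count


# Placeholder species with invented counts: (abstracts, genes).
counts = {
    "species_A": (12000, 4000),
    "species_B": (300, 6000),
    "species_C": (900, 1800),
}

coverage = {
    name: publications_per_gene(abstracts, genes)
    for name, (abstracts, genes) in counts.items()
}

# Least-covered genomes first: these mark the largest knowledge gaps.
ranked = sorted(coverage, key=coverage.get)
```

Sorting by the ratio rather than by raw abstract counts keeps a large, little-studied genome from looking better covered than a small, intensively studied one.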


Genome Biology | 2003

Beyond 100 genomes

Paul Janssen; Benjamin Audit; Ildefonso Cases; Nikos Darzentas; Leon Goldovsky; Victor Kunin; Nuria Lopez-Bigas; José Manuel Peregrin-Alvarez; José B. Pereira-Leal; Sophia Tsoka; Christos A. Ouzounis

By the end of 2002, we witnessed the landmark submission of the 100th complete genome sequence in the databases. An overview of these genomes reveals certain interesting trends and provides valuable insights into possible future developments.


Genome Biology | 2006

Relating tissue specialization to the differentiation of expression of singleton and duplicate mouse proteins

Shiri Freilich; Tim Massingham; Eric Blanc; Leon Goldovsky; Janet M. Thornton

Background: Gene duplications have been hypothesized to be a major factor in enabling the evolution of tissue differentiation. Analyses of the expression profiles of duplicate genes in mammalian tissues have indicated that, with time, the expression patterns of duplicate genes diverge and become more tissue specific. We explored the relationship between duplication events, the time at which they took place, and both the expression breadth of the duplicated genes and the cumulative expression breadth of the gene family to which they belong.
Results: We show that only duplicates that arose through post-multicellularity duplication events show a tendency to become more specifically expressed, whereas such a tendency is not observed for duplicates that arose in a unicellular ancestor. Unlike the narrow expression profile of the duplicated genes, the overall expression of gene families tends to maintain a global expression pattern.
Conclusion: The work presented here supports the view suggested by the subfunctionalization model, namely that expression divergence in different tissues, following gene duplication, promotes the retention of a gene in the genome of multicellular species. The global expression profile of the gene families suggests division of expression between family members, whose expression becomes specialized. Because specialization of expression is coupled with an increased rate of sequence divergence, it can facilitate the evolution of new, tissue-specific functions.


BMC Evolutionary Biology | 2008

Metabolic innovations towards the human lineage

Shiri Freilich; Leon Goldovsky; Christos A. Ouzounis; Janet M. Thornton

Background: We describe a function-driven approach to the analysis of metabolism which takes into account the phylogenetic origin of biochemical reactions to reveal subtle lineage-specific metabolic innovations, undetectable by more traditional methods based on sequence comparison. The origins of reactions and thus entire pathways are inferred using a simple taxonomic classification scheme that describes the evolutionary course of events towards the lineage of interest. We investigate the evolutionary history of the human metabolic network extracted from a metabolic database, construct a network of interconnected pathways and classify this network according to the taxonomic categories representing eukaryotes, metazoa and vertebrates.
Results: It is demonstrated that lineage-specific innovations correspond to reactions and pathways associated with key phenotypic changes during evolution, such as the emergence of cellular organelles in eukaryotes, cell adhesion cascades in metazoa and the biosynthesis of complex cell-specific biomolecules in vertebrates.
Conclusion: This phylogenetic view of metabolic networks puts gene innovations within an evolutionary context, demonstrating how the emergence of a phenotype in a lineage provides a platform for the development of specialized traits.


BMC Bioinformatics | 2007

CORRIE: enzyme sequence annotation with confidence estimates

Benjamin Audit; Emmanuel D. Levy; Walter R. Gilks; Leon Goldovsky; Christos A. Ouzounis

Using a previously developed automated method for enzyme annotation, we report the re-annotation of the ENZYME database and the analysis of local error rates per class. In control experiments, we demonstrate that the method is able to correctly re-annotate 91% of all Enzyme Classification (EC) classes with high coverage (755 out of 827). Only 44 enzyme classes are found to contain false positives, while the remaining 28 enzyme classes are not represented. We also show cases where the re-annotation procedure results in partial overlaps for those few enzyme classes where a certain inconsistency might appear between homologous proteins, mostly due to function specificity. Our results allow the interactive exploration of the EC hierarchy for known enzyme families as well as putative enzyme sequences that may need to be classified within the EC hierarchy. These aspects of our framework have been incorporated into a web-server, called CORRIE, which stands for Correspondence Indicator Estimation and allows the interactive prediction of a functional class for putative enzymes from sequence alone, supported by probabilistic measures in the context of the pre-calculated Correspondence Indicators of known enzymes with the functional classes of the EC hierarchy. The CORRIE server is available at: http://www.genomes.org/services/corrie/.

Collaboration


Top Co-Authors

Anton J. Enright
European Bioinformatics Institute

Paul Janssen
European Bioinformatics Institute

Ildefonso Cases
Spanish National Research Council

Benjamin Audit
European Bioinformatics Institute

Nikos Darzentas
Central European Institute of Technology

Janet M. Thornton
European Bioinformatics Institute