Publication


Featured research published by Zohar Yakhini.


The New England Journal of Medicine | 2001

Gene-expression profiles in hereditary breast cancer.

Ingrid Hedenfalk; David J. Duggan; Yidong Chen; Michael Radmacher; Michael L. Bittner; Richard Simon; Paul S. Meltzer; Barry A. Gusterson; Manel Esteller; Mark Raffeld; Zohar Yakhini; Amir Ben-Dor; Edward R. Dougherty; Juha Kononen; Lukas Bubendorf; Wilfrid Fehrle; Stefania Pittaluga; Sofia Gruvberger; Niklas Loman; Oskar Johannsson; Håkan Olsson; Benjamin S. Wilfond; Guido Sauter; Olli Kallioniemi; Åke Borg; Jeffrey M. Trent

Background: Many cases of hereditary breast cancer are due to mutations in either the BRCA1 or the BRCA2 gene. The histopathological changes in these cancers are often characteristic of the mutant gene. We hypothesized that the genes expressed by these two types of tumors are also distinctive, perhaps allowing us to identify cases of hereditary breast cancer on the basis of gene-expression profiles. Methods: RNA from samples of primary tumor from seven carriers of the BRCA1 mutation, seven carriers of the BRCA2 mutation, and seven patients with sporadic cases of breast cancer was compared with a microarray of 6512 complementary DNA clones of 5361 genes. Statistical analyses were used to identify a set of genes that could distinguish the BRCA1 genotype from the BRCA2 genotype. Results: Permutation analysis of multivariate classification functions established that the gene-expression profiles of tumors with BRCA1 mutations, tumors with BRCA2 mutations, and sporadic tumors differed significantly from each other. An analysis of variance between the levels of gene expression and the genotype of the samples identified 176 genes that were differentially expressed in tumors with BRCA1 mutations and tumors with BRCA2 mutations. Given the known properties of some of the genes in this panel, our findings indicate that there are functional differences between breast tumors with BRCA1 mutations and those with BRCA2 mutations. Conclusions: Significantly different groups of genes are expressed by breast cancers with BRCA1 mutations and breast cancers with BRCA2 mutations. Our results suggest that a heritable mutation influences the gene-expression profile of the cancer.


BMC Bioinformatics | 2009

GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists

Eran Eden; Roy Navon; Israel Steinfeld; Doron Lipson; Zohar Yakhini

Background: Since the inception of the GO annotation project, a variety of tools have been developed that support exploring and searching the GO database. In particular, a variety of tools that perform GO enrichment analysis are currently available. Most of these tools require as input a target set of genes and a background set and seek enrichment in the target set compared to the background set. A few tools also exist that support analyzing ranked lists. The latter typically rely on simulations or on union-bound correction for assigning statistical significance to the results. Results: GOrilla is a web-based application that identifies enriched GO terms in ranked lists of genes, without requiring the user to provide explicit target and background sets. This is particularly useful in many typical cases where genomic data may be naturally represented as a ranked list of genes (e.g. by level of expression or of differential expression). GOrilla employs a flexible threshold statistical approach to discover GO terms that are significantly enriched at the top of a ranked gene list. Building on a complete theoretical characterization of the underlying distribution, called mHG, GOrilla computes an exact p-value for the observed enrichment, taking threshold multiple testing into account without the need for simulations. This enables rigorous statistical analysis of thousands of genes and thousands of GO terms in the order of seconds. The output of the enrichment analysis is visualized as a hierarchical structure, providing a clear view of the relations between enriched GO terms. Conclusion: GOrilla is an efficient GO analysis tool with unique features that make it a useful addition to the existing repertoire of GO enrichment tools. GOrilla's unique features and advantages over other threshold-free enrichment tools include rigorous statistics, fast running time and an effective graphical representation. GOrilla is publicly available at: http://cbl-gorilla.cs.technion.ac.il
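
To illustrate the flexible-threshold idea described in the abstract above, the following minimal Python sketch scores a ranked gene list in the minimum-hypergeometric (mHG) style. It assumes the ranked list has already been reduced to 0/1 labels (1 if the gene carries the GO term of interest) and that the score is the smallest hypergeometric tail probability over all cutoffs. The function names `hypergeom_tail` and `mhg_score` are hypothetical and are not part of GOrilla. The sketch stops at the raw score and does not compute the exact p-value that accounts for threshold multiple testing.

```python
from math import comb


def hypergeom_tail(N, B, n, b):
    """P(X >= b) for X ~ Hypergeometric(population N, B successes, n draws)."""
    total = comb(N, n)
    upper = min(n, B)
    # math.comb returns 0 when k > n, so impossible terms vanish on their own.
    favorable = sum(comb(B, k) * comb(N - B, n - k) for k in range(b, upper + 1))
    return favorable / total


def mhg_score(labels):
    """Return (score, cutoff): the smallest tail probability over all cutoffs.

    labels: list of 0/1 values ordered from the top of the ranked list down.
    """
    N = len(labels)
    B = sum(labels)
    best_score, best_cutoff = 1.0, 0
    b = 0  # number of annotated genes seen among the top n
    for n, label in enumerate(labels, start=1):
        b += label
        score = hypergeom_tail(N, B, n, b)
        if score < best_score:
            best_score, best_cutoff = score, n
    return best_score, best_cutoff


# Three annotated genes sit at the very top of a ten-gene ranked list.
score, cutoff = mhg_score([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
```

Because the cutoff is chosen by the data, the raw score is optimistic when read as a p-value. That is the multiple-testing issue the abstract says is handled exactly rather than by simulation.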


Research in Computational Molecular Biology | 1999

Clustering gene expression patterns

Amir Ben-Dor; Zohar Yakhini

Recent advances in biotechnology allow researchers to measure expression levels for thousands of genes simultaneously, across different conditions and over time. Analysis of data produced by such experiments offers potential insight into gene function and regulatory mechanisms. A key step in the analysis of gene expression data is the detection of groups of genes that manifest similar expression patterns. The corresponding algorithmic problem is to cluster multicondition gene expression patterns. In this paper we describe a novel clustering algorithm that was developed for analysis of gene expression data. We define an appropriate stochastic error model on the input, and prove that under the conditions of the model, the algorithm recovers the cluster structure with high probability. The running time of the algorithm on an n-gene dataset is $O(n^2 [\log(n)]^c)$. We also present a practical heuristic based on the same algorithmic ideas. The heuristic was implemented and its performance is demonstrated on simulated data and on real gene expression data, with very promising results.


Nature Genetics | 2007

Polycomb-mediated methylation on Lys27 of histone H3 pre-marks genes for de novo methylation in cancer

Yeshayahu Schlesinger; Ravid Straussman; Ilana Keshet; Shlomit Farkash; Merav Hecht; Joseph Zimmerman; Eran Eden; Zohar Yakhini; Etti Ben-Shushan; Benjamin E. Reubinoff; Yehudit Bergman; Itamar Simon; Howard Cedar

Many genes associated with CpG islands undergo de novo methylation in cancer. Studies have suggested that the pattern of this modification may be partially determined by an instructive mechanism that recognizes specifically marked regions of the genome. Using chromatin immunoprecipitation analysis, here we show that genes methylated in cancer cells are specifically packaged with nucleosomes containing histone H3 trimethylated on Lys27. This chromatin mark is established on these unmethylated CpG island genes early in development and then maintained in differentiated cell types by the presence of an EZH2-containing Polycomb complex. In cancer cells, as opposed to normal cells, the presence of this complex brings about the recruitment of DNA methyl transferases, leading to de novo methylation. These results suggest that tumor-specific targeting of de novo methylation is pre-programmed by an established epigenetic system that normally has a role in marking embryonic genes for repression.


Research in Computational Molecular Biology | 2000

Tissue classification with gene expression profiles

Amir Ben-Dor; Laurakay Bruhn; Nir Friedman; Iftach Nachman; Michèl Schummer; Zohar Yakhini

Constantly improving gene expression profiling technologies are expected to provide understanding and insight into cancer-related cellular processes. Gene expression data is also expected to significantly aid in the development of efficient cancer diagnosis and classification platforms. In this work we examine two sets of gene expression data measured across sets of tumor and normal clinical samples. One set consists of 2,000 genes, measured in 62 epithelial colon samples [1]. The second consists of ≈ 100,000 clones, measured in 32 ovarian samples (unpublished, extension of data set described in [26]). We examine the use of scoring methods, measuring separation of tumors from normals using individual gene expression levels. These are then coupled with high-dimensional classification methods to assess the classification power of complete expression profiles. We present results of performing leave-one-out cross validation (LOOCV) experiments on the two data sets, employing SVM [8], AdaBoost [13] and a novel clustering-based classification technique. As tumor samples can differ from normal samples in their cell-type composition, we also perform LOOCV experiments using appropriately modified sets of genes, attempting to eliminate the resulting bias. We demonstrate a success rate of at least 90% in tumor vs normal classification, using sets of selected genes, with as well as without cellular-contamination-related members. These results are insensitive to the exact selection mechanism, over a certain range.


Journal of Computational Biology | 1999

Clustering gene expression patterns.

Amir Ben-Dor; Ron Shamir; Zohar Yakhini

Recent advances in biotechnology allow researchers to measure expression levels for thousands of genes simultaneously, across different conditions and over time. Analysis of data produced by such experiments offers potential insight into gene function and regulatory mechanisms. A key step in the analysis of gene expression data is the detection of groups of genes that manifest similar expression patterns. The corresponding algorithmic problem is to cluster multicondition gene expression patterns. In this paper we describe a novel clustering algorithm that was developed for analysis of gene expression data. We define an appropriate stochastic error model on the input, and prove that under the conditions of the model, the algorithm recovers the cluster structure with high probability. The running time of the algorithm on an n-gene dataset is $O(n^2 [\log(n)]^c)$. We also present a practical heuristic based on the same algorithmic ideas. The heuristic was implemented and its performance is demonstrated on simulated data and on real gene expression data, with very promising results.


Journal of Computational Biology | 2003

Discovering local structure in gene expression data: the order-preserving submatrix problem.

Amir Ben-Dor; Benny Chor; Richard M. Karp; Zohar Yakhini

This paper concerns the discovery of patterns in gene expression matrices, in which each element gives the expression level of a given gene in a given experiment. Most existing methods for pattern discovery in such matrices are based on clustering genes by comparing their expression levels in all experiments, or clustering experiments by comparing their expression levels for all genes. Our work goes beyond such global approaches by looking for local patterns that manifest themselves when we focus simultaneously on a subset G of the genes and a subset T of the experiments. Specifically, we look for order-preserving submatrices (OPSMs), in which the expression levels of all genes induce the same linear ordering of the experiments (we show that the OPSM search problem is NP-hard in the worst case). Such a pattern might arise, for example, if the experiments in T represent distinct stages in the progress of a disease or in a cellular process and the expression levels of all genes in G vary across the stages in the same way. We define a probabilistic model in which an OPSM is hidden within an otherwise random matrix. Guided by this model, we develop an efficient algorithm for finding the hidden OPSM in the random matrix. In data generated according to the model, the algorithm recovers the hidden OPSM with a very high success rate. Application of the methods to breast cancer data seems to reveal significant local patterns.
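
The OPSM definition in the abstract above can be made concrete with a short Python sketch. It only verifies whether a given gene subset G and experiment subset T form an order-preserving submatrix. It is not the search algorithm from the paper, and the search problem itself is stated to be NP-hard. The function name `is_order_preserving`, the list-of-rows matrix layout, and the requirement of strictly increasing values (ties are rejected) are assumptions made for illustration.

```python
def is_order_preserving(matrix, genes, experiments):
    """Check whether rows `genes` and columns `experiments` form an OPSM.

    matrix: list of rows, one row per gene, one column per experiment.
    The submatrix qualifies when one ordering of the chosen experiments
    makes every chosen gene's expression values strictly increasing.
    """
    if not genes or not experiments:
        return True
    # The first gene fixes the only candidate ordering of the experiments.
    first = genes[0]
    order = sorted(experiments, key=lambda t: matrix[first][t])
    for g in genes:
        row = [matrix[g][t] for t in order]
        # Any tie or decrease along the candidate ordering breaks the pattern.
        if any(a >= b for a, b in zip(row, row[1:])):
            return False
    return True


expression = [
    [1, 5, 3],     # gene 0
    [10, 50, 30],  # gene 1 follows the same ordering as gene 0
    [3, 2, 1],     # gene 2 follows a different ordering
]
same_order = is_order_preserving(expression, [0, 1], [0, 1, 2])
all_genes = is_order_preserving(expression, [0, 1, 2], [0, 1, 2])
```

Verification is cheap because the first gene determines the only ordering that could work. The hard part, which this sketch leaves out, is choosing G and T.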


Proceedings of the National Academy of Sciences of the United States of America | 2002

Gene expression analysis reveals matrilysin as a key regulator of pulmonary fibrosis in mice and humans

Fengrong Zuo; Naftali Kaminski; Elsie M. Eugui; John Allard; Zohar Yakhini; Amir Ben-Dor; Lance Lollini; David R. Morris; Yong Kim; Barbara Delustro; Dean Sheppard; Annie Pardo; Moisés Selman; Renu A. Heller

Pulmonary fibrosis is a progressive and largely untreatable group of disorders that affects up to 100,000 people on any given day in the United States. To elucidate the molecular mechanisms that lead to end-stage human pulmonary fibrosis we analyzed samples from patients with histologically proven pulmonary fibrosis (usual interstitial pneumonia) by using oligonucleotide microarrays. Gene expression patterns clearly distinguished normal from fibrotic lungs. Many of the genes that were significantly increased in fibrotic lungs encoded proteins associated with extracellular matrix formation and degradation and proteins expressed in smooth muscle. Using a combined set of scoring systems we determined that matrilysin (matrix metalloproteinase 7), a metalloprotease not previously associated with pulmonary fibrosis, was the most informative increased gene in our data set. Immunohistochemistry demonstrated increased expression of matrilysin protein in fibrotic lungs. Furthermore, matrilysin knockout mice were dramatically protected from pulmonary fibrosis in response to intratracheal bleomycin. Our results identify matrilysin as a mediator of pulmonary fibrosis and a potential therapeutic target. They also illustrate the power of global gene expression analysis of human tissue samples to identify molecular pathways involved in clinical disease.


Journal of Computational Biology | 2000

Tissue classification with gene expression profiles.

Amir Ben-Dor; Laurakay Bruhn; Nir Friedman; Iftach Nachman; Michèl Schummer; Zohar Yakhini

Constantly improving gene expression profiling technologies are expected to provide understanding and insight into cancer-related cellular processes. Gene expression data is also expected to significantly aid in the development of efficient cancer diagnosis and classification platforms. In this work we examine three sets of gene expression data measured across sets of tumor and normal clinical samples. The first set consists of 2,000 genes, measured in 62 epithelial colon samples (Alon et al., 1999). The second consists of approximately 100,000 clones, measured in 32 ovarian samples (unpublished extension of data set described in Schummer et al. (1999)). The third set consists of approximately 7,100 genes, measured in 72 bone marrow and peripheral blood samples (Golub et al., 1999). We examine the use of scoring methods, measuring separation of tissue type (e.g., tumors from normals) using individual gene expression levels. These are then coupled with high-dimensional classification methods to assess the classification power of complete expression profiles. We present results of performing leave-one-out cross validation (LOOCV) experiments on the three data sets, employing a nearest neighbor classifier, SVM (Cortes and Vapnik, 1995), AdaBoost (Freund and Schapire, 1997) and a novel clustering-based classification technique. As tumor samples can differ from normal samples in their cell-type composition, we also perform LOOCV experiments using appropriately modified sets of genes, attempting to eliminate the resulting bias. We demonstrate a success rate of at least 90% in tumor versus normal classification, using sets of selected genes, with, as well as without, cellular-contamination-related members. These results are insensitive to the exact selection mechanism, over a certain range.
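
The leave-one-out cross validation (LOOCV) protocol mentioned in the abstract above can be sketched in a few lines of Python. This is a minimal illustration that assumes a 1-nearest-neighbor classifier with Euclidean distance on toy expression vectors. The function names and the data are hypothetical. It does not reproduce the paper's gene selection, its SVM or AdaBoost experiments, or its clustering-based classifier.

```python
import math


def nearest_neighbor_label(train_x, train_y, query):
    """Return the label of the training sample closest to `query`."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


def loocv_accuracy(samples, labels):
    """Hold out each sample in turn, classify it from the rest, report accuracy."""
    correct = 0
    for i in range(len(samples)):
        # Training set is everything except the held-out sample i.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_neighbor_label(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


# Toy data: two well-separated groups of two-gene expression profiles.
profiles = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
tissue = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]
accuracy = loocv_accuracy(profiles, tissue)
```

With only tens of samples, as in the data sets described, holding out one sample at a time uses nearly all of the data for training in every round, which is the usual reason for choosing LOOCV over a single train/test split.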


PLOS Computational Biology | 2007

Discovering Motifs in Ranked Lists of DNA Sequences

Eran Eden; Doron Lipson; Sivan Yogev; Zohar Yakhini

Computational methods for discovery of sequence elements that are enriched in a target set compared with a background set are fundamental in molecular biology research. One example is the discovery of transcription factor binding motifs that are inferred from ChIP–chip (chromatin immuno-precipitation on a microarray) measurements. Several major challenges in sequence motif discovery still require consideration: (i) the need for a principled approach to partitioning the data into target and background sets; (ii) the lack of rigorous models and of an exact p-value for measuring motif enrichment; (iii) the need for an appropriate framework for accounting for motif multiplicity; (iv) the tendency, in many of the existing methods, to report presumably significant motifs even when applied to randomly generated data. In this paper we present a statistical framework for discovering enriched sequence elements in ranked lists that resolves these four issues. We demonstrate the implementation of this framework in a software application, termed DRIM (discovery of rank imbalanced motifs), which identifies sequence motifs in lists of ranked DNA sequences. We applied DRIM to ChIP–chip and CpG methylation data and obtained the following results. (i) Identification of 50 novel putative transcription factor (TF) binding sites in yeast ChIP–chip data. The biological function of some of them was further investigated to gain new insights on transcription regulation networks in yeast. For example, our discoveries enable the elucidation of the network of the TF ARO80. Another finding concerns a systematic TF binding enhancement to sequences containing CA repeats. (ii) Discovery of novel motifs in human cancer CpG methylation data. Remarkably, most of these motifs are similar to DNA sequence elements bound by the Polycomb complex that promotes histone methylation. Our findings thus support a model in which histone methylation and CpG methylation are mechanistically linked. 
Overall, we demonstrate that the statistical framework embodied in the DRIM software tool is highly effective for identifying regulatory sequence elements in a variety of applications ranging from expression and ChIP–chip to CpG methylation data. DRIM is publicly available at http://bioinfo.cs.technion.ac.il/drim.

Collaboration


Dive into Zohar Yakhini's collaborations.

Top Co-Authors

Israel Steinfeld

Technion – Israel Institute of Technology
