Noam Slonim

Hebrew University of Jerusalem

Publication

Featured research published by Noam Slonim.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2000

Document clustering using word clusters via the information bottleneck method

Noam Slonim; Naftali Tishby

We present a novel implementation of the recently introduced information bottleneck method for unsupervised document clustering. Given a joint empirical distribution of words and documents, p(x, y), we first cluster the words, Y, so that the obtained word clusters, Ỹ, maximally preserve the information on the documents. The resulting joint distribution, p(X, Ỹ), contains most of the original information about the documents, I(X; Ỹ) ≈ I(X; Y), but it is much less sparse and noisy. Using the same procedure we then cluster the documents, X, so that the information about the word clusters is preserved. Thus, we first find word clusters that capture most of the mutual information about the set of documents, and then find document clusters that preserve the information about the word clusters. We tested this procedure over several document collections based on subsets taken from the standard 20Newsgroups corpus. The results were assessed by calculating the correlation between the document clusters and the correct labels for these documents. Findings from our experiments show that this double clustering procedure, which uses the information bottleneck method, yields significantly superior performance compared to other common document distributional clustering algorithms. Moreover, the double clustering procedure improves all the distributional clustering methods examined here.
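The quantity this abstract relies on, the mutual information retained after words are merged into clusters, can be illustrated with a small sketch. This is not the authors' clustering algorithm: it only evaluates I(X; Ỹ) for a fixed, hand-picked word grouping. The toy joint distribution, the function names `mutual_information` and `merge_columns`, and the two groupings are all assumptions made for illustration.

```python
import math

def mutual_information(joint):
    """Return I(X;Y) in bits for a joint distribution stored as joint[x][y]."""
    px = [sum(row) for row in joint]          # marginal over documents
    py = [sum(col) for col in zip(*joint)]    # marginal over words
    mi = 0.0
    for x, row in enumerate(joint):
        for y, p in enumerate(row):
            if p > 0:                         # 0 * log(0) is taken as 0
                mi += p * math.log2(p / (px[x] * py[y]))
    return mi

def merge_columns(joint, assignment, n_clusters):
    """Collapse words into clusters: p(x, c) is the sum of p(x, y) over words y in cluster c."""
    merged = [[0.0] * n_clusters for _ in joint]
    for x, row in enumerate(joint):
        for y, p in enumerate(row):
            merged[x][assignment[y]] += p
    return merged

# Hypothetical toy data: 2 documents (rows) by 4 words (columns), summing to 1.
p_xy = [
    [0.20, 0.20, 0.05, 0.05],
    [0.05, 0.05, 0.20, 0.20],
]

# Grouping words that behave alike across documents versus mixing them.
informative = merge_columns(p_xy, [0, 0, 1, 1], 2)
uninformative = merge_columns(p_xy, [0, 1, 0, 1], 2)

i_full = mutual_information(p_xy)
i_informative = mutual_information(informative)
i_uninformative = mutual_information(uninformative)

print(f"I(X;Y)            = {i_full:.4f} bits")
print(f"I(X;Y~) (similar) = {i_informative:.4f} bits")
print(f"I(X;Y~) (mixed)   = {i_uninformative:.4f} bits")
```

In this toy case, merging words with identical conditional distributions over documents keeps all of the original information, while the mixed grouping discards it. Merging can never increase the information, so a word clustering is judged by how little it loses.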

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2002

Unsupervised document classification using sequential information maximization

Noam Slonim; Nir Friedman; Naftali Tishby

We present a novel sequential clustering algorithm which is motivated by the Information Bottleneck (IB) method. In contrast to the agglomerative IB algorithm, the new sequential (sIB) approach is guaranteed to converge to a local maximum of the information, as required by the original IB principle. Moreover, the time and space complexity are significantly improved, and are typically linear in the data size. We apply this algorithm to unsupervised document classification. In our evaluation, on small and medium size corpora, the sIB is found to be consistently superior to all the other clustering methods we examine, typically by a significant margin. Moreover, the sIB results are comparable to those obtained by a supervised Naive Bayes classifier. Finally, we propose a simple procedure for trading cluster recall to gain higher precision, and show how this approach can extract clusters which match the existing topics of the corpus almost perfectly.

Proceedings of the National Academy of Sciences of the United States of America | 2005

Information-based clustering

Noam Slonim; Gurinder Singh Atwal; Gašper Tkačik; William Bialek

In an age of increasingly large data sets, investigators in many different disciplines have turned to clustering as a tool for data analysis and exploration. Existing clustering methods, however, typically depend on several nontrivial assumptions about the structure of data. Here, we reformulate the clustering problem from an information theoretic perspective that avoids many of these assumptions. In particular, our formulation obviates the need for defining a cluster prototype, does not require an a priori similarity metric, is invariant to changes in the representation of the data, and naturally captures nonlinear relations. We apply this approach to different domains and find that it consistently produces clusters that are more coherent than those extracted by existing algorithms. Finally, our approach provides a way of clustering based on collective notions of similarity rather than the traditional pairwise measures.

Molecular Systems Biology | 2009

Glucose regulates transcription in yeast through a network of signaling pathways

Shadia Zaman; Soyeon I. Lippman; Lisa Schneper; Noam Slonim; James R. Broach

Addition of glucose to yeast cells increases their growth rate and results in a massive restructuring of their transcriptional output. We have used microarray analysis in conjunction with conditional mutations to obtain a systems view of the signaling network responsible for glucose‐induced transcriptional changes. We found that several well‐studied signaling pathways, such as Snf1 and Rgt, are responsible for specialized but limited responses to glucose. However, 90% of the glucose‐induced changes can be recapitulated by the activation of protein kinase A (PKA) or by the induction of PKB (Sch9). Blocking signaling through Sch9 does not interfere with the glucose response, whereas blocking signaling through PKA does. We conclude that both Sch9 and PKA regulate a massive, nutrient‐responsive transcriptional program promoting growth, but that they do so in response to different nutritional inputs. Moreover, activating PKA completely recapitulates the transcriptional growth program in the absence of any increase in growth or metabolism, demonstrating that activation of the growth program results solely from the cell's perception of its nutritional status.

Neural Computation | 2006

Multivariate information bottleneck

Noam Slonim; Nir Friedman; Naftali Tishby

The information bottleneck (IB) method is an unsupervised, model-independent data organization technique. Given a joint distribution, p(X, Y), this method constructs a new variable, T, that extracts partitions, or clusters, over the values of X that are informative about Y. Algorithms that are motivated by the IB method have already been applied to text classification, gene expression, neural code, and spectral analysis. Here, we introduce a general principled framework for multivariate extensions of the IB method. This allows us to consider multiple systems of data partitions that are interrelated. Our approach utilizes Bayesian networks for specifying the systems of clusters and which information terms should be maintained. We show that this construction provides insights about bottleneck variations and enables us to characterize the solutions of these variations. We also present four different algorithmic approaches that allow us to construct solutions in practice and apply them to several real-world problems.

Molecular Systems Biology | 2006

Ab initio genotype-phenotype association reveals intrinsic modularity in genetic networks

Noam Slonim; Olivier Elemento; Saeed Tavazoie

Microbial species express an astonishing diversity of phenotypic traits, behaviors, and metabolic capacities. However, our molecular understanding of these phenotypes is based almost entirely on studies in a handful of model organisms that together represent only a small fraction of this phenotypic diversity. Furthermore, many microbial species are not amenable to traditional laboratory analysis because of their exotic lifestyles and/or lack of suitable molecular genetic techniques. As an adjunct to experimental analysis, we have developed a computational information‐theoretic framework that produces high‐confidence gene–phenotype predictions using cross‐species distributions of genes and phenotypes across 202 fully sequenced archaea and eubacteria. In addition to identifying the genetic basis of complex traits, our approach reveals the organization of these genes into generic preferentially co‐inherited modules, many of which correspond directly to known enzymatic pathways, molecular complexes, signaling pathways, and molecular machines.

Monthly Notices of the Royal Astronomical Society | 2001

Objective Classification of Galaxy Spectra using the Information Bottleneck Method

Noam Slonim; Rachel S. Somerville; Naftali Tishby; Ofer Lahav

A new method for classification of galaxy spectra is presented, based on a recently introduced information theoretical principle, the information bottleneck. For any desired number of classes, galaxies are classified such that the information content about the spectra is maximally preserved. The result is classes of galaxies with similar spectra, where the similarity is determined via a measure of information. We apply our method to ~6000 galaxy spectra from the ongoing 2dF redshift survey, and a mock-2dF catalogue produced by a Cold Dark Matter-based semi-analytic model of galaxy formation. We find a good match between the mean spectra of the classes found in the data and in the models. For the mock catalogue, we find that the classes produced by our algorithm form an intuitively sensible sequence in terms of physical properties such as colour, star formation activity, morphology, and internal velocity dispersion. We also show the correlation of the classes with the projections resulting from a Principal Component Analysis.

EURASIP Journal on Advances in Signal Processing | 2003

Discriminative feature selection via multiclass variable memory Markov model

Noam Slonim; Gill Bejerano; Shai Fine; Naftali Tishby

We propose a novel feature selection method based on a variable memory Markov (VMM) model. The VMM was originally proposed as a generative model trying to preserve the original source statistics from training data. We extend this technique to simultaneously handle several sources, and further apply a new criterion to prune nondiscriminative features out of the model. This results in a multiclass discriminative VMM (DVMM), which is highly efficient, scaling linearly with data size. Moreover, we suggest a natural scheme to sort the remaining features based on their discriminative power with respect to the sources at hand. We demonstrate the utility of our method for text and protein classification tasks.

International Workshop on the Web and Databases | 1998

WebSuite: A Tool Suite for Harnessing Web Data

Catriel Beeri; Gershon Elber; Tova Milo; Yehoshua Sagiv; Oded Shmueli; Naftali Tishby; Yakov A. Kogan; David Konopnicki; Pini Mogilevski; Noam Slonim

We present a system for searching, collecting, and integrating Web-resident data. The system consists of five tools, where each tool provides a specific functionality aimed at solving one aspect of the complex task of using and managing Web data. Each tool can be used in a stand-alone mode, in combination with the other tools, or even in conjunction with other systems. Together, the tools offer a wide range of capabilities that overcome many of the limitations in existing systems for harnessing Web data. The paper describes each tool, possible ways of combining the tools, and the architecture of the combined system.

Psychotherapy Research | 2013

Adolescents in psychodynamic psychotherapy: Changes in internal representations of relationships with parents

Dana Atzil Slonim; Gaby Shefler; Noam Slonim; Orya Tishby

This study explored whether and how internal representations of adolescents' relationships with their parents, a fundamental concept in psychodynamic theory, changed in the course of a year of treatment and whether the observed changes were related to changes in symptoms. Seventy-two adolescents (ages 15–18; 30 in treatment and 42 in a non-treatment “community group”) underwent Relationship Anecdote Paradigm (RAP) interviews according to the Core Conflictual Relationship Theme method (CCRT; Luborsky & Crits-Christoph, 1998) and completed outcome measures at two time points. A novel data-driven approach to clustering CCRT categories was used to characterize internal representations. The potential contribution of this approach to the CCRT method is discussed. The results indicate that adolescents' internal representations of their relationships with their parents changed significantly throughout treatment, and were related to changes in symptoms.