Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Amir Globerson is active.

Publication


Featured research published by Amir Globerson.


International Conference on Machine Learning | 2006

Nightmare at test time: robust learning by feature deletion

Amir Globerson; Sam T. Roweis

When constructing a classifier from labeled data, it is important not to assign too much weight to any single input feature, in order to increase the robustness of the classifier. This is particularly important in domains with nonstationary feature distributions or with input sensor failures. A common approach to achieving such robustness is to introduce regularization which spreads the weight more evenly between the features. However, this strategy is very generic, and cannot induce robustness specifically tailored to the classification task at hand. In this work, we introduce a new algorithm for avoiding single feature over-weighting by analyzing robustness using a game theoretic formalization. We develop classifiers which are optimally resilient to deletion of features in a minimax sense, and show how to construct such classifiers using quadratic programming. We illustrate the applicability of our methods on spam filtering and handwritten digit recognition tasks, where feature deletion is indeed a realistic noise model.
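
A minimal sketch of the idea behind the minimax formulation, under the simplifying assumption that the adversary's worst-case move is to delete the K features contributing most positively to each example's margin. The subgradient-descent training loop, function names, and hyperparameters are illustrative; the paper itself solves a quadratic program.

```python
import numpy as np

def robust_hinge_grad(w, X, y, K, lam):
    """Subgradient of  lam/2 * ||w||^2 + mean_i max(0, 1 - m_i(w)),  where m_i(w)
    is the margin of example i after an adversary deletes the K features that
    contribute most positively to y_i * (w . x_i)."""
    n, d = X.shape
    contrib = y[:, None] * X * w[None, :]            # per-feature margin contributions
    deleted = np.zeros_like(contrib, dtype=bool)
    if K > 0:
        top_k = np.argsort(contrib, axis=1)[:, -K:]  # K largest contributions per example
        deleted[np.arange(n)[:, None], top_k] = True
        deleted &= contrib > 0                       # the adversary only deletes helpful features
    margins = np.where(deleted, 0.0, contrib).sum(axis=1)
    active = margins < 1.0                           # examples violating the robust margin
    keep = (~deleted) & active[:, None]              # surviving features of active examples
    return lam * w - (keep * (y[:, None] * X)).sum(axis=0) / n

def train_robust_linear(X, y, K=2, lam=0.1, lr=0.05, epochs=500):
    """Subgradient descent on the robust objective; a stand-in for the minimax QP."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        w -= lr * robust_hinge_grad(w, X, y, K, lam)
    return w
```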


European Conference on Machine Learning | 2011

An alternating direction method for dual MAP LP relaxation

Ofer Meshi; Amir Globerson

Maximum a-posteriori (MAP) estimation is an important task in many applications of probabilistic graphical models. Although finding an exact solution is generally intractable, approximations based on linear programming (LP) relaxation often provide good approximate solutions. In this paper we present an algorithm for solving the LP relaxation optimization problem. In order to overcome the lack of strict convexity, we apply an augmented Lagrangian method to the dual LP. The algorithm, based on the alternating direction method of multipliers (ADMM), is guaranteed to converge to the global optimum of the LP relaxation objective. Our experimental results show that this algorithm is competitive with other state-of-the-art algorithms for approximate MAP estimation.
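
The MAP LP relaxation and its augmented Lagrangian are problem specific, but the alternating-direction structure is easy to illustrate on a toy problem. The sketch below runs scaled-form ADMM on a lasso objective, min 0.5||Ax - b||^2 + lam||z||_1 subject to x = z; it shows only the x-minimization, z-minimization, and multiplier updates, and is not the paper's dual MAP algorithm. All names are illustrative.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_admm(A, b, lam=0.1, rho=1.0, iters=200):
    """Scaled-form ADMM for  min 0.5*||Ax - b||^2 + lam*||z||_1  s.t.  x = z."""
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    AtA, Atb = A.T @ A, A.T @ b
    L = np.linalg.cholesky(AtA + rho * np.eye(n))          # factor once, reuse every iteration
    for _ in range(iters):
        rhs = Atb + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))  # x-minimization (quadratic term)
        z = soft_threshold(x + u, lam / rho)               # z-minimization (l1 term)
        u = u + x - z                                      # scaled multiplier (dual) update
    return z
```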


International Conference on Machine Learning | 2007

Exponentiated gradient algorithms for log-linear structured prediction

Amir Globerson; Terry Y. Koo; Xavier Carreras; Michael Collins

Conditional log-linear models are a commonly used method for structured prediction. Efficient learning of parameters in these models is therefore an important problem. This paper describes an exponentiated gradient (EG) algorithm for training such models. EG is applied to the convex dual of the maximum likelihood objective; this results in both sequential and parallel update algorithms, where in the sequential algorithm parameters are updated in an online fashion. We provide a convergence proof for both algorithms. Our analysis also simplifies previous results on EG for max-margin models, and leads to a tighter bound on convergence rates. Experiments on a large-scale parsing task show that the proposed algorithm converges much faster than conjugate-gradient and L-BFGS approaches both in terms of optimization objective and test error.
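
A hedged illustration of the multiplicative update at the core of EG: the iterate lives on a probability simplex, is rescaled by the exponentiated negative gradient, and is then renormalized. The toy objective and all names below are assumptions; the paper applies these updates to the dual variables of the structured log-linear objective.

```python
import numpy as np

def eg_minimize(grad_f, x0, eta=0.5, iters=500):
    """Exponentiated-gradient descent over the probability simplex:
    x <- x * exp(-eta * grad_f(x)), followed by renormalization."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x * np.exp(-eta * grad_f(x))
        x /= x.sum()
    return x

# Toy usage: Euclidean projection of p onto the simplex, i.e. minimize ||x - p||^2.
p = np.array([0.7, 0.1, -0.3, 0.5])
x_star = eg_minimize(lambda x: 2.0 * (x - p), np.full(4, 0.25))
```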


IEEE Transactions on Signal Processing | 2013

Time Varying Autoregressive Moving Average Models for Covariance Estimation

Ami Wiesel; Ofir Bibi; Amir Globerson

We consider large scale covariance estimation using a small number of samples in applications where there is a natural ordering between the random variables. The two classical approaches to this problem rely on banded covariance and banded inverse covariance structures, corresponding to time varying moving average (MA) and autoregressive (AR) models, respectively. Motivated by this analogy to spectral estimation and the well known modeling power of autoregressive moving average (ARMA) processes, we propose a novel time varying ARMA covariance structure. Similarly to known results in the context of AR and MA, we address the completion of an ARMA covariance matrix from its main band, and its estimation based on random samples. Finally, we examine the advantages of our proposed methods using numerical experiments.
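
A rough sketch of the two classical baselines named in the abstract, banding the sample covariance (the MA-style structure) and banding the inverse covariance (the AR-style structure). The proposed time-varying ARMA estimator itself is not reproduced here, and neither banding step below is guaranteed to preserve positive definiteness; function names are illustrative.

```python
import numpy as np

def banded(M, bandwidth):
    """Zero out all entries more than `bandwidth` positions off the diagonal."""
    offsets = np.abs(np.subtract.outer(np.arange(M.shape[0]), np.arange(M.shape[1])))
    return np.where(offsets <= bandwidth, M, 0.0)

def ma_style_covariance(X, bandwidth):
    """MA-style baseline: band the sample covariance itself (banded covariance)."""
    return banded(np.cov(X, rowvar=False), bandwidth)

def ar_style_covariance(X, bandwidth):
    """AR-style baseline: band the inverse sample covariance (banded precision),
    then invert back.  Banding alone does not guarantee positive definiteness."""
    S = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])   # small ridge for invertibility
    return np.linalg.inv(banded(np.linalg.inv(S), bandwidth))
```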


The Journal of Neuroscience | 2008

Correlations between Groups of Premotor Neurons Carry Information about Prehension

Eran Stark; Amir Globerson; Itay Asher; Moshe Abeles

How distinct parameters are bound together in brain activity is unknown. Combination coding by interneuronal interactions is one possibility, but, to coordinate parameters, interactions between neuronal pairs must carry information about them. To address this issue, we recorded neural activity from multiple sites in the premotor cortices of monkeys that memorized reach direction and grasp type followed by actual prehension. We found that correlations between individual spiking neurons are generally weak and carry little information about prehension. In contrast, correlations and synchronous interactions between small groups of neurons, quantified by multiunit activity (MUA), are an order of magnitude stronger. A substantial fraction of the information carried by pairwise interactions between MUAs is about combinations of reach and grasp. This contrasts with the information carried by individual neurons and individual MUAs, which is mainly about reach and/or grasp but much less about their combinations. The main contribution of pairwise interactions to the coding of reach–grasp combinations is when animals memorize prehension parameters, consistent with an internal composite representation. The informative interactions between neuronal groups may facilitate the coordination of reach and grasp into coherent prehension.


Proceedings of the National Academy of Sciences of the United States of America | 2009

The minimum information principle and its application to neural code analysis

Amir Globerson; Eran Stark; Eilon Vaadia; Naftali Tishby

The study of complex information processing systems requires appropriate theoretical tools to help unravel their underlying design principles. Information theory is one such tool, and has been utilized extensively in the study of the neural code. Although much progress has been made in information theoretic methodology, there is still no satisfying answer to the question: “What is the information that a given property of the neural population activity (e.g., the responses of single cells within the population) carries about a set of stimuli?” Here, we answer such questions via the minimum mutual information (MinMI) principle. We quantify the information in any statistical property of the neural response by considering all hypothetical neuronal populations that have the given property and finding the one that contains the minimum information about the stimuli. All systems with higher information values necessarily contain additional information processing mechanisms and, thus, the minimum captures the information related to the given property alone. MinMI may be used to measure information in properties of the neural response, such as that conveyed by responses of small subsets of cells (e.g., singles or pairs) in a large population and cooperative effects between subunits in networks. We show how the framework can be used to study neural coding in large populations and to reveal properties that are not discovered by other information theoretic methods.
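
A schematic statement of the MinMI quantity described above, in notation of our own choosing rather than the paper's: among all joint distributions of response R and stimulus S that reproduce the measured statistics, take the one with the least mutual information.

```latex
I_{\min}\bigl(\{f_k\}\bigr)
  \;=\; \min_{p(r,s)\,\in\,\mathcal{P}} \; I(R;S),
\qquad
\mathcal{P} \;=\; \Bigl\{\, p \;:\;
  \mathbb{E}_{p}\bigl[f_k(R,S)\bigr] = \hat{\mathbb{E}}\bigl[f_k(R,S)\bigr]\ \ \forall k,
  \;\; p(s) = \hat{p}(s) \,\Bigr\}
```

Here the f_k are the measured properties of the population response (for example, single-cell response statistics) and the hatted quantities denote their empirical values.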


Meeting of the Association for Computational Linguistics | 2016

Collective Entity Resolution with Multi-Focal Attention

Amir Globerson; Nevena Lazic; Soumen Chakrabarti; Amarnag Subramanya; Michael Ringgaard; Fernando Pereira

Entity resolution is the task of linking each mention of an entity in text to the corresponding record in a knowledge base (KB). Coherence models for entity resolution encourage all referring expressions in a document to resolve to entities that are related in the KB. We explore attention-like mechanisms for coherence, where the evidence for each candidate is based on a small set of strong relations, rather than relations to all other entities in the document. The rationale is that document-wide support may simply not exist for non-salient entities, or entities not densely connected in the KB. Our proposed system outperforms state-of-the-art systems on the CoNLL 2003, TAC KBP 2010, 2011, and 2012 tasks.
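
The scoring idea from the abstract, in toy form and with hypothetical names: each candidate entity is supported only by its k strongest KB relations to the other candidates in the document, rather than by all of them.

```python
import numpy as np

def multifocal_coherence(relation_scores, k=2):
    """For each candidate entity (row), sum only its k strongest relation scores to
    the other candidates instead of all of them.  `relation_scores` is an (n, n)
    matrix of pairwise KB-relatedness scores; assumes k < n."""
    scores = relation_scores.astype(float).copy()
    np.fill_diagonal(scores, -np.inf)            # a candidate does not support itself
    top_k = np.sort(scores, axis=1)[:, -k:]      # the k strongest supporting relations
    return top_k.sum(axis=1)
```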


Meeting of the Association for Computational Linguistics | 2014

Steps to Excellence: Simple Inference with Refined Scoring of Dependency Trees

Yuan Zhang; Tao Lei; Regina Barzilay; Tommi S. Jaakkola; Amir Globerson

Much of the recent work on dependency parsing has been focused on solving inherent combinatorial problems associated with rich scoring functions. In contrast, we demonstrate that highly expressive scoring functions can be used with substantially simpler inference procedures. Specifically, we introduce a sampling-based parser that can easily handle arbitrary global features. Inspired by SampleRank, we learn to take guided stochastic steps towards a high-scoring parse. We introduce two samplers for traversing the space of trees, Gibbs and Metropolis-Hastings with Random Walk. The model outperforms state-of-the-art results when evaluated on 14 languages of non-projective CoNLL datasets. Our sampling-based approach naturally extends to joint prediction scenarios, such as joint parsing and POS correction. The resulting method outperforms the best reported results on the CATiB dataset, approaching the performance of parsing with gold tags.
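
A hedged sketch of the guided stochastic search: propose a small local change to the current head assignment and accept it with a Metropolis-Hastings-style rule under exp(score). The scorer, proposal, and names are illustrative assumptions; the paper additionally learns the scoring function SampleRank-style and also uses a Gibbs sampler.

```python
import math
import random

def has_cycle(heads):
    """heads[i] is the head of word i (0 is the artificial root); heads[0] is unused.
    Returns True if the assignment is not a valid dependency tree."""
    for start in range(1, len(heads)):
        seen, node = set(), start
        while node != 0:
            if node in seen:
                return True
            seen.add(node)
            node = heads[node]
    return False

def mh_parse(score, n_words, steps=2000, temp=1.0, seed=0):
    """Random-walk search over head assignments: reattach a random word to a random
    head and accept with probability min(1, exp((s_new - s_old) / temp)).
    `score(heads)` may use arbitrary global features, since no decomposition of the
    scoring function is needed for this kind of inference."""
    rng = random.Random(seed)
    heads = [0] * (n_words + 1)                  # start with every word attached to the root
    s = score(heads)
    best, best_s = list(heads), s
    for _ in range(steps):
        w = rng.randrange(1, n_words + 1)
        h = rng.randrange(0, n_words + 1)
        if h == w:
            continue
        proposal = list(heads)
        proposal[w] = h
        if has_cycle(proposal):
            continue                             # keep only valid (possibly non-projective) trees
        s_new = score(proposal)
        if s_new >= s or rng.random() < math.exp((s_new - s) / temp):
            heads, s = proposal, s_new
            if s > best_s:
                best, best_s = list(heads), s
    return best
```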


International Conference on Computer Vision | 2013

Higher Order Matching for Consistent Multiple Target Tracking

Chetan Arora; Amir Globerson

This paper addresses the data assignment problem in multi-frame, multi-object tracking in video sequences. Traditional methods employing maximum weight bipartite matching offer limited temporal modeling. It has recently been shown [6, 8, 24] that incorporating higher order temporal constraints improves the assignment solution. Finding a maximum weight matching with higher order constraints is, however, NP-hard, and the solutions proposed until now have either been greedy [8] or rely on greedy rounding of the solution obtained from spectral techniques [15]. We propose a novel algorithm to find an approximate solution to the data assignment problem with higher order temporal constraints using the method of dual decomposition and the MPLP message passing algorithm [21]. We compare the proposed algorithm with implementations of [8] and [15] and show that the proposed technique provides better solutions, with a bound on the approximation factor for each inferred solution.
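
Dual decomposition in its generic form, to illustrate the splitting idea only: the shared variables are duplicated across two easy subproblems, and a subgradient step on the Lagrange multipliers pushes the copies toward agreement. The higher-order matching factors and the MPLP updates used in the paper are not reproduced; all names are illustrative.

```python
import numpy as np

def dual_decomposition(solve1, solve2, n, iters=100, step=0.5):
    """Generic dual decomposition for  min_x f1(x) + f2(x)  over binary vectors x.
    The variable is duplicated (x1 for f1, x2 for f2), the coupling x1 = x2 is
    relaxed with multipliers `lam`, and each solver independently minimizes its own
    objective plus a linear term:  solve1(c) = argmin_x f1(x) + c.x  and
    solve2(c) = argmin_x f2(x) + c.x."""
    lam = np.zeros(n)
    x1 = x2 = np.zeros(n)
    for t in range(1, iters + 1):
        x1 = solve1(lam)                     # subproblem 1 sees +lam
        x2 = solve2(-lam)                    # subproblem 2 sees -lam
        lam += (step / t) * (x1 - x2)        # (super)gradient ascent step on the dual
        if np.array_equal(x1, x2):
            break                            # agreement: certificate of an exact solution
    return x1, x2, lam
```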


North American Chapter of the Association for Computational Linguistics | 2015

Template Kernels for Dependency Parsing

Hillel Taub-Tabib; Yoav Goldberg; Amir Globerson

A common approach to dependency parsing is scoring a parse via a linear function of a set of indicator features. These features are typically manually constructed from templates that are applied to parts of the parse tree. The templates define which properties of a part should combine to create features. Existing approaches consider only a small subset of the possible combinations, due to statistical and computational efficiency considerations. In this work we present a novel kernel which facilitates efficient parsing with feature representations corresponding to a much larger set of combinations. We integrate the kernel into a parse reranking system and demonstrate its effectiveness on four languages from the CoNLL-X shared task.
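
One classic way to score "all combinations of properties" with a kernel is the all-subsets construction sketched below, in which each agreeing property doubles the kernel value, so every conjunction of properties acts as an implicit feature. This exact form is an assumption for illustration, not necessarily the kernel derived in the paper; the reranking helper is likewise hypothetical.

```python
def all_subsets_kernel(part_a, part_b):
    """Kernel over two parse-tree parts, each a dict of property -> value, e.g.
    {'head_pos': 'NN', 'mod_pos': 'JJ', 'dist': 2}.  Each agreeing property doubles
    the value, so K = prod_p (1 + [a_p == b_p]) = number of agreeing property
    subsets, i.e. every conjunction ('template') of properties is an implicit feature."""
    k = 1.0
    for prop, value in part_a.items():
        if part_b.get(prop) == value:
            k *= 2.0
    return k

def kernel_score(tree_parts, support, kernel=all_subsets_kernel):
    """Score a candidate parse as a weighted sum of kernel values against a support
    set of (weight, part) pairs, e.g. inside a kernelized reranker."""
    return sum(w * kernel(p, sp) for (w, sp) in support for p in tree_parts)
```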

Collaboration


Dive into Amir Globerson's collaborations.

Top Co-Authors

Naftali Tishby, Hebrew University of Jerusalem
Tommi S. Jaakkola, Massachusetts Institute of Technology
Ofer Meshi, Hebrew University of Jerusalem
Roi Livni, Hebrew University of Jerusalem
Nir Rosenfeld, Hebrew University of Jerusalem
Regina Barzilay, Massachusetts Institute of Technology