Publications

Featured research published by Martin Jaggi.


Symposium on Computational Geometry | 2009

Coresets for polytope distance

Bernd Gärtner; Martin Jaggi

Following recent work of Clarkson, we translate the coreset framework to the problems of finding the point closest to the origin inside a polytope, finding the shortest distance between two polytopes, Perceptrons, and soft- as well as hard-margin Support Vector Machines (SVM). We prove asymptotically matching upper and lower bounds on the size of coresets, showing that µ-coresets of size (1 + o(1)) E*/µ always exist as µ → 0, and that this is best possible. The crucial quantity E* is what we call the excentricity of a polytope, or of a pair of polytopes. Additionally, we prove linear convergence speed of Gilbert's algorithm, one of the earliest known approximation algorithms for polytope distance, and generalize both the algorithm and the proof to the two-polytope case. Interestingly, our coreset bounds also imply that we can, for the first time, prove matching upper and lower bounds for the sparsity of Perceptron and SVM solutions.
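Gilbert's algorithm, mentioned in the abstract, is simple enough to state in a few lines. Below is a minimal sketch of the classical method for the first problem named above, finding the point of minimum norm in the convex hull of a point set; the function name, tolerance, and test data are our own, not the authors' code.

```python
import numpy as np

def gilbert_min_norm_point(P, tol=1e-8, max_iter=1000):
    """Gilbert's algorithm: approximate the minimum-norm point in conv(P).

    P is an (n, d) array of polytope vertices.  A sketch of the classical
    method the abstract refers to, not the paper's implementation.
    """
    x = P[0].copy()                       # start at an arbitrary vertex
    for _ in range(max_iter):
        s = P[np.argmin(P @ x)]           # support point: argmin_p <x, p>
        gap = x @ (x - s)                 # duality gap; 0 at the optimum
        if gap <= tol:
            break
        d = s - x
        gamma = min(1.0, max(0.0, -(x @ d) / (d @ d)))  # exact line search
        x = x + gamma * d                 # convex step stays inside conv(P)
    return x

# Distance from the origin to a small triangle not containing it:
tri = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0]])
print(np.linalg.norm(gilbert_min_norm_point(tri)))  # ≈ sqrt(2)
```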


North American Chapter of the Association for Computational Linguistics | 2018

Unsupervised Learning of Sentence Embeddings Using Compositional N-Gram Features

Matteo Pagliardini; Prakhar Gupta; Martin Jaggi

The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question of whether similar methods could be derived to improve embeddings (i.e., semantic representations) of word sequences as well. We present a simple but efficient unsupervised objective for training distributed representations of sentences. Our method outperforms the state-of-the-art unsupervised models on most benchmark tasks, highlighting the robustness of the produced general-purpose sentence embeddings.
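The compositional step the title refers to can be illustrated in a few lines: a sentence embedding is the average of learned unigram and n-gram source embeddings. The sketch below assumes the lookup table has already been trained (the paper learns it with a CBOW-like objective); the function name and the toy table are our own.

```python
import numpy as np

def sentence_embedding(tokens, vec, dim=300, use_bigrams=True):
    """Compose a sentence vector by averaging unigram and bigram embeddings.

    `vec` maps a unigram or "w1_w2" bigram key to a d-dimensional vector;
    unknown keys are skipped.  Only the compositional averaging step is
    shown; the learned lookup table is assumed given.
    """
    feats = list(tokens)
    if use_bigrams:
        feats += [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    vs = [vec[f] for f in feats if f in vec]
    return np.mean(vs, axis=0) if vs else np.zeros(dim)

# Toy lookup table standing in for trained embeddings:
rng = np.random.default_rng(0)
vec = {k: rng.normal(size=300) for k in ["the", "cat", "sat", "the_cat", "cat_sat"]}
emb = sentence_embedding(["the", "cat", "sat"], vec)
print(emb.shape)  # (300,)
```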


North American Chapter of the Association for Computational Linguistics | 2016

SwissCheese at SemEval-2016 Task 4: Sentiment Classification Using an Ensemble of Convolutional Neural Networks with Distant Supervision

Jan Deriu; Maurice Gonzenbach; Fatih Uzdilli; Aurelien Lucchi; Valeria De Luca; Martin Jaggi

In this paper, we propose a classifier for predicting message-level sentiment of English micro-blog messages from Twitter. Our method builds upon the convolutional sentence embedding approach proposed by Severyn and Moschitti (2015a; 2015b). We leverage large amounts of data with distant supervision to train an ensemble of 2-layer convolutional neural networks whose predictions are combined using a random forest classifier. Our approach was evaluated on the datasets of the SemEval-2016 competition (Task 4), outperforming all other approaches on the Message Polarity Classification task.
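The final ensembling step described above can be sketched as follows: the class-probability outputs of several CNNs become meta-features for a random forest. The CNN outputs below are random placeholders; only the combination pattern is illustrated, under our own variable names.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Per-message class probabilities from several independently trained CNNs are
# stacked as meta-features, and a random forest learns to combine them.
rng = np.random.default_rng(0)
n_msgs, n_cnns, n_classes = 500, 4, 3          # 3 = negative/neutral/positive
cnn_probs = rng.dirichlet(np.ones(n_classes), size=(n_msgs, n_cnns))
X_meta = cnn_probs.reshape(n_msgs, n_cnns * n_classes)
y = rng.integers(0, n_classes, size=n_msgs)    # stand-in gold labels

meta = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_meta, y)
print(meta.predict(X_meta[:5]))
```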


Journal of Computational Geometry | 2012

An Exponential Lower Bound on the Complexity of Regularization Paths

Bernd Gärtner; Martin Jaggi; Clément Maria

For a variety of regularized optimization problems in machine learning, algorithms that compute the entire solution path have been developed recently. Most of these problems are quadratic programs parameterized by a single regularization parameter, as for example the Support Vector Machine (SVM). Solution path algorithms compute not only the solution for one particular value of the regularization parameter but the entire path of solutions, making the selection of an optimal parameter much easier. It had been assumed that these piecewise linear solution paths have only linear complexity, i.e. linearly many bends. We prove that for the support vector machine this complexity can be exponential in the number of training points in the worst case. More strongly, we construct a single instance of n input points in d dimensions for an SVM such that at least Θ(2^{n/2}) = Θ(2^d) distinct subsets of support vectors occur as the regularization parameter changes.
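The phenomenon in the last sentence can be observed empirically by sweeping the regularization parameter and recording which support vector sets occur. This sketch uses random data rather than the paper's worst-case construction, so it will only find a modest number of distinct sets; all names and grid choices are ours.

```python
import numpy as np
from sklearn.svm import SVC

# Count how many distinct support vector sets appear as C is swept.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
y = np.sign(rng.normal(size=20)).astype(int)   # random binary labels

sv_sets = set()
for C in np.logspace(-3, 3, 200):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    sv_sets.add(frozenset(clf.support_.tolist()))
print(f"{len(sv_sets)} distinct support vector sets along the C-grid")
```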


International World Wide Web Conference | 2017

Leveraging Large Amounts of Weakly Supervised Data for Multi-Language Sentiment Classification

Jan Milan Deriu; Aurelien Lucchi; Valeria De Luca; Aliaksei Severyn; Simone Müller; Mark Cieliebak; Thomas Hofmann; Martin Jaggi

This paper presents a novel approach to multi-lingual sentiment classification in short texts. This is a challenging task, as the amount of training data in languages other than English is very limited. Previously proposed multi-lingual approaches typically require establishing a correspondence to English, for which powerful classifiers are already available. In contrast, our method does not require such supervision. We leverage large amounts of weakly supervised data in various languages to train a multi-layer convolutional network, and demonstrate the importance of pre-training such networks. We thoroughly evaluate our approach on various multi-lingual datasets, including the recent SemEval-2016 sentiment prediction benchmark (Task 4), where we achieved state-of-the-art performance. We also compare the performance of our model trained individually for each language to a variant trained for all languages at once. We show that the latter model reaches slightly worse, but still acceptable, performance compared to the single-language model, while benefiting from better generalization properties across languages.
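A minimal sketch of the two-phase training scheme described above: pre-train a small text CNN on abundant weakly labeled data, then fine-tune on the scarce annotated set. The architecture, hyperparameters, and placeholder tensors below are simplified stand-ins, not the paper's exact model.

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """A small convolutional text classifier (simplified stand-in)."""
    def __init__(self, vocab=10_000, dim=64, n_classes=3):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.conv = nn.Sequential(
            nn.Conv1d(dim, 128, kernel_size=5), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
        )
        self.head = nn.Linear(128, n_classes)

    def forward(self, ids):                     # ids: (batch, seq_len)
        h = self.emb(ids).transpose(1, 2)       # -> (batch, dim, seq_len)
        return self.head(self.conv(h).squeeze(-1))

def run_epoch(model, ids, labels, lr):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss = nn.functional.cross_entropy(model(ids), labels)
    opt.zero_grad(); loss.backward(); opt.step()

model = TextCNN()
weak_ids = torch.randint(0, 10_000, (256, 40))   # placeholder weak data
weak_y = torch.randint(0, 3, (256,))             # e.g. emoticon-derived labels
gold_ids = torch.randint(0, 10_000, (32, 40))    # placeholder annotated data
gold_y = torch.randint(0, 3, (32,))

run_epoch(model, weak_ids, weak_y, lr=1e-3)      # phase 1: distant supervision
run_epoch(model, gold_ids, gold_y, lr=1e-4)      # phase 2: supervised fine-tune
```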


European Symposium on Algorithms | 2010

Approximating parameterized convex optimization problems

Joachim Giesen; Martin Jaggi; Sören Laue

We extend Clarkson's framework by considering parameterized convex optimization problems over the unit simplex that depend on one parameter. We provide a simple and efficient scheme for maintaining an ε-approximate solution (and a corresponding ε-coreset) along the entire parameter path, and we prove correctness and optimality of the method. Practically relevant instances of the abstract parameterized optimization problem include regularization paths of support vector machines, multiple kernel learning, and minimum enclosing balls of moving points.
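The path-following idea can be sketched with a warm-started Frank-Wolfe solver over the simplex: as the parameter moves, the previous solution is reused as the starting point. The toy parameterized quadratic below is our own stand-in for the applications listed above, not the paper's scheme with its approximation guarantees.

```python
import numpy as np

def frank_wolfe_simplex(grad, x, steps=50):
    """Frank-Wolfe over the unit simplex, warm-started from x."""
    for k in range(steps):
        g = grad(x)
        s = np.zeros_like(x); s[np.argmin(g)] = 1.0   # best simplex vertex
        x = x + (2.0 / (k + 2)) * (s - x)
    return x

# Maintain an approximate solution along a parameter path t -> min_x f_t(x),
# here f_t(x) = 0.5 x^T Q x + t c^T x over the simplex (toy instance).
rng = np.random.default_rng(0)
B = rng.normal(size=(10, 10)); Q = B @ B.T; c = rng.normal(size=10)

x = np.full(10, 1.0 / 10)                 # start at the simplex barycenter
for t in np.linspace(0.0, 2.0, 21):       # sweep the parameter path
    x = frank_wolfe_simplex(lambda z: Q @ z + t * c, x)  # warm start from x
```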


Optimization Methods & Software | 2017

Distributed optimization with arbitrary local solvers

Chenxin Ma; Jakub Konečný; Martin Jaggi; Virginia Smith; Michael I. Jordan; Peter Richtárik; Martin Takáč

With the growth of data and the necessity of distributed optimization methods, solvers that work well on a single machine must be re-designed to leverage distributed computation. Recent work in this area has been limited by focusing heavily on developing highly specific methods for the distributed environment. These special-purpose methods are often unable to fully leverage the competitive performance of their well-tuned and customized single-machine counterparts. Further, they are unable to easily integrate improvements that continue to be made to single-machine methods. To this end, we present a framework for distributed optimization that allows the flexibility of arbitrary solvers to be used on each (single) machine locally, yet maintains competitive performance against other state-of-the-art special-purpose distributed methods. We give strong primal-dual convergence rate guarantees for our framework that hold for arbitrary local solvers. We demonstrate the impact of local solver selection both theoretically and in an extensive experimental comparison. Finally, we provide thorough implementation details for our framework, highlighting areas for practical performance gains.
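The framework's central pattern, local subproblems solved by an arbitrary solver with only updates communicated, can be sketched as follows. This is a simplified simulation of that pattern on least squares with a plain gradient-descent local solver; it is not the authors' exact algorithm or its guaranteed aggregation scheme.

```python
import numpy as np

# Data is partitioned across simulated machines; each machine improves a
# local subproblem with *any* local solver, and only updates are aggregated.
rng = np.random.default_rng(0)
A = rng.normal(size=(400, 20)); b = rng.normal(size=400)
parts = np.array_split(np.arange(400), 4)     # 4 simulated machines

def local_solver(Ak, bk, w, steps=20, lr=1e-3):
    """Arbitrary local solver: plain gradient descent on the local loss."""
    dw = np.zeros_like(w)
    for _ in range(steps):
        g = Ak.T @ (Ak @ (w + dw) - bk)
        dw -= lr * g
    return dw

w = np.zeros(20)
for _ in range(50):                           # communication rounds
    updates = [local_solver(A[p], b[p], w) for p in parts]
    w += sum(updates) / len(parts)            # average the local updates
print(0.5 * np.linalg.norm(A @ w - b) ** 2)   # global least-squares objective
```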


IEEE Transactions on Geoscience and Remote Sensing | 2017

Learning Aerial Image Segmentation From Online Maps

Pascal Kaiser; Jan Dirk Wegner; Aurelien Lucchi; Martin Jaggi; Thomas Hofmann; Konrad Schindler

This paper deals with semantic segmentation of high-resolution (aerial) images, where a semantic class label is assigned to each pixel via supervised classification as a basis for automatic map generation. Recently, deep convolutional neural networks (CNNs) have shown impressive performance and have quickly become the de-facto standard for semantic segmentation, with the added benefit that task-specific feature design is no longer necessary. However, a major downside of deep learning methods is that they are extremely data-hungry, thus aggravating the perennial bottleneck of supervised classification: obtaining enough annotated training data. On the other hand, it has been observed that they are rather robust against noise in the training labels. This opens up the intriguing possibility of avoiding the annotation of huge amounts of training data, and instead training the classifier from existing legacy data or crowd-sourced maps that can exhibit high levels of noise. The question addressed in this paper is: can training with large-scale, publicly available labels replace a substantial part of the manual labeling effort and still achieve sufficient performance? Such data will inevitably contain a significant portion of errors, but in return virtually unlimited quantities of it are available in large parts of the world. We adapt a state-of-the-art CNN architecture for semantic segmentation of buildings and roads in aerial images, and compare its performance when using different training data sets, ranging from manually labeled, pixel-accurate ground truth of the same city to automatic training data derived from OpenStreetMap data from distant locations. Our results indicate that satisfying performance can be obtained with significantly less manual annotation effort, by exploiting noisy large-scale training data.
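On the data side, the approach boils down to pairing aerial tiles with masks rasterized from OpenStreetMap instead of manual annotations. A minimal sketch of such a dataset wrapper follows, with placeholder arrays and an assumed offline rasterization step; the class name and layout are ours.

```python
import numpy as np
import torch
from torch.utils.data import Dataset

class OSMTiles(Dataset):
    """Pairs aerial image tiles with masks rasterized from OpenStreetMap.

    Labels come from existing map data rather than manual annotation, so
    they are plentiful but noisy.  Rasterization is assumed done offline.
    """
    def __init__(self, images, masks):
        self.images, self.masks = images, masks   # aligned (N,3,H,W)/(N,H,W)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, i):
        return torch.from_numpy(self.images[i]), torch.from_numpy(self.masks[i])

# Placeholder tiles and masks (0 = background, 1 = building, 2 = road):
imgs = np.random.rand(8, 3, 128, 128).astype(np.float32)
masks = np.random.randint(0, 3, size=(8, 128, 128)).astype(np.int64)
tile0, mask0 = OSMTiles(imgs, masks)[0]
print(tile0.shape, mask0.shape)
```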


SIAM Journal on Optimization | 2018

Optimal Affine-Invariant Smooth Minimization Algorithms

Alexandre d'Aspremont; Cristóbal Guzmán; Martin Jaggi

We formulate an affine-invariant implementation of the accelerated first-order algorithm of [Y. Nesterov, Dokl. Math., 27 (1983), pp. 372-376]. Its complexity bound is proportional to an affine-invariant regularity constant defined with respect to the Minkowski gauge of the feasible set. We extend these results to more general problems, optimizing Hölder smooth functions using […]
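For reference, the classical accelerated scheme the paper starts from can be sketched as follows; the affine-invariant modification itself is not reproduced here, and the function name and test problem are our own.

```python
import numpy as np

def nesterov_agd(grad, x0, L, iters=200):
    """Classical accelerated gradient scheme (Nesterov, 1983) for an
    L-smooth convex function.  The paper develops an affine-invariant
    variant of this method, which this sketch does not attempt."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(iters):
        x_new = y - grad(y) / L            # gradient step from extrapolation
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_new + ((t - 1) / t_new) * (x_new - x)   # momentum extrapolation
        x, t = x_new, t_new
    return x

# Quadratic test: f(x) = 0.5 ||Ax - b||^2, L = largest eigenvalue of A^T A.
rng = np.random.default_rng(0)
A = rng.normal(size=(30, 10)); b = rng.normal(size=30)
L = np.linalg.eigvalsh(A.T @ A).max()
x_star = nesterov_agd(lambda x: A.T @ (A @ x - b), np.zeros(10), L)
print(np.linalg.norm(A.T @ (A @ x_star - b)))  # near-zero gradient
```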


European Symposium on Algorithms | 2012

Optimizing over the growing spectrahedron

Joachim Giesen; Martin Jaggi; Sören Laue


Collaboration


Dive into Martin Jaggi's collaborations.

Top Co-Authors

Sebastian U. Stich
École Polytechnique Fédérale de Lausanne

Virginia Smith
University of California

Sai Praneeth Reddy Karimireddy
École Polytechnique Fédérale de Lausanne