Mario Fritz | Researchain

Publication

Featured research published by Mario Fritz.

European Conference on Computer Vision | 2010

Adapting visual category models to new domains

Kate Saenko; Brian Kulis; Mario Fritz; Trevor Darrell

Domain adaptation is an important emerging topic in computer vision. In this paper, we present one of the first studies of domain shift in the context of object recognition. We introduce a method that adapts object models acquired in a particular visual domain to new imaging conditions by learning a transformation that minimizes the effect of domain-induced changes in the feature distribution. The transformation is learned in a supervised manner and can be applied to categories for which there are no labeled examples in the new domain. While we focus our evaluation on object recognition tasks, the transform-based adaptation technique we develop is general and could be applied to nonimage data. Another contribution is a new multi-domain object database, freely available for download. We experimentally demonstrate the ability of our method to improve recognition on categories with few or no target domain labels and moderate to large changes in the imaging conditions.

Ubiquitous Computing | 2008

Discovery of activity patterns using topic models

Tâm Huynh; Mario Fritz; Bernt Schiele

In this work we propose a novel method to recognize daily routines as a probabilistic combination of activity patterns. The use of topic models enables the automatic discovery of such patterns in a user's daily routine. We report experimental results that show the ability of the approach to model and recognize daily routines without user annotation.

International Conference on Machine Learning | 2005

The 2005 PASCAL visual object classes challenge

Mark Everingham; Andrew Zisserman; Christopher K. I. Williams; Luc Van Gool; Moray Allan; Christopher M. Bishop; Olivier Chapelle; Navneet Dalal; Thomas Deselaers; Gyuri Dorkó; Stefan Duffner; Jan Eichhorn; Jason Farquhar; Mario Fritz; Christophe Garcia; Thomas L. Griffiths; Frédéric Jurie; Daniel Keysers; Markus Koskela; Jorma Laaksonen; Diane Larlus; Bastian Leibe; Hongying Meng; Hermann Ney; Bernt Schiele; Cordelia Schmid; Edgar Seemann; John Shawe-Taylor; Amos J. Storkey; Sandor Szedmak

The PASCAL Visual Object Classes Challenge ran from February to March 2005. The goal of the challenge was to recognize objects from a number of visual object classes in realistic scenes (i.e. not pre-segmented objects). Four object classes were selected: motorbikes, bicycles, cars and people. Twelve teams entered the challenge. In this chapter we provide details of the datasets, algorithms used by the teams, evaluation criteria, and results achieved.

International Conference on Computer Vision | 2011

A category-level 3-D object dataset: Putting the Kinect to work

Allison Janoch; Sergey Karayev; Yangqing Jia; Jonathan T. Barron; Mario Fritz; Kate Saenko; Trevor Darrell

Recent proliferation of a cheap but quality depth sensor, the Microsoft Kinect, has brought the need for a challenging category-level 3D object detection dataset to the fore. We review current 3D datasets and find them lacking in variation of scenes, categories, instances, and viewpoints. Here we present our dataset of color and depth image pairs, gathered in real domestic and office environments. It currently includes over 50 classes, with more images added continuously by a crowd-sourced collection effort. We establish baseline performance in a PASCAL VOC-style detection task, and suggest two ways that inferred world size of the object may be used to improve detection. The dataset and annotations can be downloaded at http://www.kinectdata.com.
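
For readers unfamiliar with PASCAL VOC-style detection scoring, the sketch below shows the usual overlap test: a predicted box counts as correct when its intersection-over-union (IoU) with a ground-truth box exceeds 0.5. The function names, the `(x_min, y_min, x_max, y_max)` box layout, and the use of continuous coordinates are assumptions made for illustration; the abstract does not spell out the exact protocol used for this dataset.

```python
from typing import Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max) in continuous coordinates.
Box = Tuple[float, float, float, float]


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_correct_detection(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """VOC-style criterion: the overlap must exceed the threshold."""
    return iou(pred, truth) > threshold
```

A full evaluation would additionally rank detections by confidence and compute average precision per class; this sketch covers only the per-box overlap test.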

International Conference on Computer Vision | 2015

Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images

Mateusz Malinowski; Marcus Rohrbach; Mario Fritz

We address a question answering task on real-world images that is set up as a Visual Turing Test. By combining the latest advances in image representation and natural language processing, we propose Neural-Image-QA, an end-to-end formulation of this problem for which all parts are trained jointly. In contrast to previous efforts, we are facing a multi-modal problem where the language output (answer) is conditioned on visual and natural language input (image and question). Our approach Neural-Image-QA doubles the performance of the previous best approach on this problem. We provide additional insights into the problem by analyzing how much information is contained only in the language part, for which we provide a new human baseline. To study human consensus, which is related to the ambiguities inherent in this challenging task, we propose two novel metrics and collect additional answers, which extend the original DAQUAR dataset to DAQUAR-Consensus.

European Conference on Computer Vision | 2004

On the Significance of Real-World Conditions for Material Classification

Eric Hayman; Barbara Caputo; Mario Fritz; Jan Olof Eklundh

Classifying materials from their appearance is a challenging problem, especially if illumination and pose conditions are permitted to change: highlights and shadows caused by 3D structure can radically alter a sample’s visual texture. Despite these difficulties, researchers have demonstrated impressive results on the CUReT database which contains many images of 61 materials under different conditions. A first contribution of this paper is to further advance the state-of-the-art by applying Support Vector Machines to this problem. To our knowledge, we record the best results to date on the CUReT database.

International Conference on Computer Vision | 2005

Integrating representative and discriminant models for object category detection

Mario Fritz; Bastian Leibe; Barbara Caputo; Bernt Schiele

Category detection is a lively area of research. While categorization algorithms tend to agree in using local descriptors, they differ in the choice of the classifier, with some using generative models and others discriminative approaches. This paper presents a method for object category detection which integrates a generative model with a discriminative classifier. For each object category, we generate an appearance codebook, which becomes a common vocabulary for the generative and discriminative methods. Given a query image, the generative part of the algorithm finds a set of hypotheses and estimates their support in location and scale. Then, the discriminative part verifies each hypothesis on the same codebook activations. The new algorithm exploits the strengths of both original methods, minimizing their weaknesses. Experiments on several databases show that our new approach performs better than its building blocks taken separately. Moreover, experiments on two challenging multi-scale databases show that our new algorithm outperforms previously reported results.

Computer Vision and Pattern Recognition | 2015

Appearance-based gaze estimation in the wild

Xucong Zhang; Yusuke Sugano; Mario Fritz; Andreas Bulling

Appearance-based gaze estimation is believed to work well in real-world settings, but existing datasets have been collected under controlled laboratory conditions and methods have not been evaluated across multiple datasets. In this work we study appearance-based gaze estimation in the wild. We present the MPIIGaze dataset that contains 213,659 images we collected from 15 participants during natural everyday laptop use over more than three months. Our dataset is significantly more variable than existing ones with respect to appearance and illumination. We also present a method for in-the-wild appearance-based gaze estimation using multimodal convolutional neural networks that significantly outperforms state-of-the-art methods in the most challenging cross-dataset evaluation. We present an extensive evaluation of several state-of-the-art image-based gaze estimation algorithms on three current datasets, including our own. This evaluation provides clear insights and allows us to identify key research challenges of gaze estimation in the wild.

International Conference on Computer Vision | 2011

The NBNN kernel

Tinne Tuytelaars; Mario Fritz; Kate Saenko; Trevor Darrell

Naive Bayes Nearest Neighbor (NBNN) has recently been proposed as a powerful, non-parametric approach for object classification, that manages to achieve remarkably good results thanks to the avoidance of a vector quantization step and the use of image-to-class comparisons, yielding good generalization. In this paper, we introduce a kernelized version of NBNN. This way, we can learn the classifier in a discriminative setting. Moreover, it then becomes straightforward to combine it with other kernels. In particular, we show that our NBNN kernel is complementary to standard bag-of-features based kernels, focussing on local generalization as opposed to global image composition. By combining them, we achieve state-of-the-art results on Caltech101 and 15 Scenes datasets. As a side contribution, we also investigate how to speed up the NBNN computations.
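
As background for the abstract above, here is a minimal sketch of the plain NBNN decision rule that the paper builds on, not the kernelized version it introduces. Each local descriptor of the query image is matched to its nearest neighbor among all descriptors of a class (an image-to-class comparison, with no vector quantization), and the class with the smallest summed squared distance wins. All names and the toy data are hypothetical, and a brute-force search stands in for the approximate nearest-neighbor structures a practical system would use.

```python
from typing import Dict, List, Sequence

Descriptor = Sequence[float]


def squared_distance(a: Descriptor, b: Descriptor) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def image_to_class_distance(
    image: List[Descriptor], class_descriptors: List[Descriptor]
) -> float:
    """Sum, over the image's descriptors, of the distance to the nearest
    descriptor pooled from all training images of one class."""
    return sum(
        min(squared_distance(d, ref) for ref in class_descriptors)
        for d in image
    )


def nbnn_classify(
    image: List[Descriptor], classes: Dict[str, List[Descriptor]]
) -> str:
    """Return the label whose pooled descriptors are closest to the image."""
    return min(
        classes,
        key=lambda label: image_to_class_distance(image, classes[label]),
    )


# Toy 2-D descriptors (hypothetical data, for illustration only).
toy_classes = {
    "cup": [(0.0, 0.0), (0.1, 0.2)],
    "chair": [(5.0, 5.0), (4.8, 5.1)],
}
toy_query = [(0.2, 0.1), (0.0, 0.3)]
print(nbnn_classify(toy_query, toy_classes))  # prints "cup"
```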

The International Journal of Robotics Research | 2012

A geometric approach to robotic laundry folding

Stephen Miller; Jur van den Berg; Mario Fritz; Trevor Darrell; Ken Goldberg; Pieter Abbeel

We consider the problem of autonomous robotic laundry folding, and propose a solution to the perception and manipulation challenges inherent to the task. At the core of our approach is a quasi-static cloth model which allows us to neglect the complex dynamics of cloth under significant parts of the state space, allowing us to reason instead in terms of simple geometry. We present an algorithm which, given a 2D cloth polygon and a desired sequence of folds, outputs a motion plan for executing the corresponding manipulations, deemed g-folds, on a minimal number of robot grippers. We define parametrized fold sequences for four clothing categories: towels, pants, short-sleeved shirts, and long-sleeved shirts, each represented as polygons. We then devise a model-based optimization approach for visually inferring the class and pose of a spread-out or folded clothing article from a single image, such that the resulting polygon provides a parse suitable for these folding primitives. We test the manipulation and perception tasks individually, and combine them to implement an autonomous folding system on the Willow Garage PR2. This enables the PR2 to identify a clothing article spread out on a table, execute the computed folding sequence, and visually track its progress over successive folds.
