Publication


Featured research published by Arjun Jain.


Computer Vision and Pattern Recognition | 2012

Articulated people detection and pose estimation: Reshaping the future

Leonid Pishchulin; Arjun Jain; Mykhaylo Andriluka; Thorsten Thormählen; Bernt Schiele

State-of-the-art methods for human detection and pose estimation require many training samples for best performance. While large, manually collected datasets exist, the captured variations w.r.t. appearance, shape and pose are often uncontrolled, thus limiting the overall performance. In order to overcome this limitation we propose a new technique to extend an existing training set that allows us to explicitly control pose and shape variations. For this we build on recent advances in computer graphics to generate samples with realistic appearance and background while modifying body shape and pose. We validate the effectiveness of our approach on the tasks of articulated human detection and articulated pose estimation. We report close to state-of-the-art results on the popular Image Parsing [25] human pose estimation benchmark and demonstrate superior performance for articulated human detection. In addition, we define a new challenge of combined articulated human detection and pose estimation in real-world scenes.
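A minimal sketch of the underlying idea, not the authors' pipeline: sample controlled pose and shape perturbations around existing annotated examples to extend a training set. The `Sample` class, `perturb`, and `render_placeholder` are hypothetical stand-ins for the paper's graphics-based reshaping and rendering.

```python
# Minimal sketch (not the authors' code): extending a training set by sampling
# controlled pose/shape variations around existing annotated examples.
# The render step is a placeholder for the graphics-based reshaping pipeline.
import random
from dataclasses import dataclass

@dataclass
class Sample:
    image_id: str
    pose: list          # e.g. joint angles
    shape: list         # e.g. body-shape coefficients
    synthetic: bool = False

def perturb(values, scale):
    """Jitter parameters to explicitly control the variation that is added."""
    return [v + random.gauss(0.0, scale) for v in values]

def render_placeholder(pose, shape):
    """Stand-in for the graphics-based reshaping/rendering step."""
    return f"rendered(pose={pose[:2]}..., shape={shape[:2]}...)"

def extend_training_set(real_samples, per_sample=5, pose_scale=0.1, shape_scale=0.05):
    augmented = list(real_samples)
    for s in real_samples:
        for k in range(per_sample):
            pose = perturb(s.pose, pose_scale)
            shape = perturb(s.shape, shape_scale)
            render_placeholder(pose, shape)   # would produce a new training image
            augmented.append(Sample(f"{s.image_id}_aug{k}", pose, shape, synthetic=True))
    return augmented

real = [Sample("img0", pose=[0.1, 0.4, -0.2], shape=[1.0, 0.2])]
print(len(extend_training_set(real)))   # 1 real + 5 synthetic samples
```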


International Conference on Computer Graphics and Interactive Techniques | 2010

MovieReshape: tracking and reshaping of humans in videos

Arjun Jain; Thorsten Thormählen; Hans-Peter Seidel; Christian Theobalt

We present a system for quick and easy manipulation of the body shape and proportions of a human actor in arbitrary video footage. The approach is based on a morphable model of 3D human shape and pose that was learned from laser scans of real people. The algorithm commences by spatio-temporally fitting the pose and shape of this model to the actor in either single-view or multi-view video footage. Once the model has been fitted, semantically meaningful attributes of body shape, such as height, weight or waist girth, can be interactively modified by the user. The changed proportions of the virtual human model are then applied to the actor in all video frames by performing an image-based warping. By this means, we can conveniently perform spatio-temporal reshaping of human actors in video footage, which we demonstrate on a variety of video sequences.
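A minimal sketch under stated assumptions, not the paper's implementation: semantic attribute edits (height, weight, waist girth) are mapped to morphable-model shape coefficients by an assumed linear mapping, and the reshaped model would drive a per-frame image warp. `ATTRIBUTE_TO_SHAPE` and `warp_frame_placeholder` are illustrative placeholders.

```python
# Minimal sketch (assumptions, not the paper's implementation): semantic body
# attributes are mapped to shape coefficients of a morphable model by a linear
# mapping, and the changed shape drives a per-frame image warp.
import numpy as np

# Assumed: rows = shape coefficients, cols = semantic attributes (learned offline).
ATTRIBUTE_TO_SHAPE = np.array([[0.8, 0.1, 0.0],
                               [0.0, 0.9, 0.3],
                               [0.1, 0.2, 0.7]])

def edit_shape(shape_coeffs, d_height=0.0, d_weight=0.0, d_waist=0.0):
    """Apply a user edit in semantic attribute space to the fitted shape."""
    delta = ATTRIBUTE_TO_SHAPE @ np.array([d_height, d_weight, d_waist])
    return shape_coeffs + delta

def warp_frame_placeholder(frame, old_coeffs, new_coeffs):
    """Stand-in for the image-based warp driven by the reshaped model."""
    return frame  # a real implementation would move pixels accordingly

fitted = np.zeros(3)                          # shape fitted to the actor
reshaped = edit_shape(fitted, d_weight=1.5)   # user increases weight
video = [np.zeros((4, 4, 3)) for _ in range(3)]
edited = [warp_frame_placeholder(f, fitted, reshaped) for f in video]
print(reshaped)
```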


Eurographics | 2012

Exploring Shape Variations by 3D-Model Decomposition and Part-based Recombination

Arjun Jain; Thorsten Thormählen; Tobias Ritschel; Hans-Peter Seidel

We present a system that allows new shapes to be created by blending between shapes taken from a database. We treat the shape as a composition of parts; blending is performed by recombining parts from different shapes according to constraints deduced by shape analysis. The analysis involves shape segmentation, contact analysis, and symmetry detection. The system can be used to rapidly instantiate new models that have similar symmetry and adjacency structure to the database shapes, yet vary in appearance.
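A toy sketch of part-based recombination, heavily simplified from the paper: parts are swapped between database shapes while keeping a shared slot (adjacency) structure. The slot labels stand in for the segmentation, contact, and symmetry analysis described above.

```python
# Toy sketch (illustrative only): recombining parts from different database
# shapes while preserving a shared adjacency structure. Slot labels are a
# strong simplification of the paper's segmentation/contact/symmetry analysis.
import itertools

# Each database shape is a dict: slot label -> part name.
database = [
    {"seat": "seatA", "legs": "legsA", "back": "backA"},
    {"seat": "seatB", "legs": "legsB", "back": "backB"},
]

def recombine(shapes):
    """Enumerate new shapes that keep the slot structure shared by all inputs."""
    slots = sorted(set.intersection(*(set(s) for s in shapes)))
    choices = [[s[slot] for s in shapes] for slot in slots]
    for combo in itertools.product(*choices):
        yield dict(zip(slots, combo))

for variant in recombine(database):
    print(variant)
```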


Asian Conference on Computer Vision | 2014

MoDeep: A Deep Learning Framework Using Motion Features for Human Pose Estimation

Arjun Jain; Jonathan Tompson; Yann LeCun; Christoph Bregler

In this work, we propose a novel and efficient method for articulated human pose estimation in videos using a convolutional network architecture, which incorporates both color and motion features. We propose a new human body pose dataset, FLIC-motion (available at http://cs.nyu.edu/~ajain/accv2014/), which extends the FLIC dataset [1] with additional motion features. We apply our architecture to this dataset and report significantly better performance than current state-of-the-art pose detection systems.
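A minimal PyTorch sketch, not the MoDeep architecture: a small convolutional network that fuses RGB frames with 2-channel motion features (e.g. optical flow) and predicts one heatmap per joint. All layer sizes and the joint count are illustrative assumptions.

```python
# Minimal sketch (not the MoDeep architecture): a small ConvNet that takes RGB
# frames concatenated with 2-channel motion features and predicts per-joint
# heatmaps. Layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

class PoseNet(nn.Module):
    def __init__(self, num_joints=14):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3 + 2, 32, kernel_size=5, padding=2),  # RGB + flow channels
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        self.heatmaps = nn.Conv2d(64, num_joints, kernel_size=1)

    def forward(self, rgb, flow):
        x = torch.cat([rgb, flow], dim=1)   # fuse color and motion features
        return self.heatmaps(self.features(x))

net = PoseNet()
rgb = torch.randn(1, 3, 128, 128)
flow = torch.randn(1, 2, 128, 128)
print(net(rgb, flow).shape)   # torch.Size([1, 14, 32, 32])
```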


Computer Vision and Pattern Recognition | 2011

Learning people detection models from few training samples

Leonid Pishchulin; Arjun Jain; Christian Wojek; Mykhaylo Andriluka; Thorsten Thormählen; Bernt Schiele

People detection is an important task for a wide range of applications in computer vision. State-of-the-art methods learn appearance-based models requiring tedious collection and annotation of large data corpora. Also, obtaining datasets that represent all relevant variations with sufficient accuracy for the intended application domain is often a non-trivial task. Therefore this paper investigates how 3D shape models from computer graphics can be leveraged to ease training data generation. In particular we employ a rendering-based reshaping method in order to generate thousands of synthetic training samples from only a few persons and views. We evaluate our data generation method for two different people detection models. Our experiments on a challenging multi-view dataset indicate that data from as few as eleven persons suffices to achieve good performance. When we additionally combine our synthetic training samples with real data we even outperform existing state-of-the-art methods.
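A rough sketch of the experimental idea, with training and evaluation replaced by placeholders: vary the number of distinct persons used to generate synthetic samples, and compare synthetic-only against synthetic-plus-real training sets. `generate_synthetic` and `train_and_evaluate` are hypothetical stubs.

```python
# Rough sketch (illustrative only): how detection performance could be studied
# as a function of the number of persons behind the synthetic data, and with
# real data added. Training and evaluation are placeholders.
def generate_synthetic(person_ids, views=8, variants_per_view=20):
    """Placeholder for the rendering-based reshaping pipeline."""
    return [f"synthetic:{p}:{v}:{k}" for p in person_ids
            for v in range(views) for k in range(variants_per_view)]

def train_and_evaluate(train_samples):
    """Placeholder: a real setup would train and score a people detector."""
    return len(train_samples)

real_data = [f"real:{i}" for i in range(200)]
for n_persons in (3, 5, 11):
    synth = generate_synthetic(range(n_persons))
    score_synth = train_and_evaluate(synth)
    score_mixed = train_and_evaluate(synth + real_data)   # synthetic + real
    print(n_persons, score_synth, score_mixed)
```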


Computer Vision and Pattern Recognition | 2015

Efficient ConvNet-based marker-less motion capture in general scenes with a low number of cameras

Ahmed Elhayek; E. de Aguiar; Arjun Jain; Jonathan Tompson; Leonid Pishchulin; Mykhaylo Andriluka; Christoph Bregler; Bernt Schiele; Christian Theobalt

We present a novel method for accurate marker-less capture of articulated skeleton motion of several subjects in general scenes, indoors and outdoors, even from input filmed with as few as two cameras. Our approach unites a discriminative image-based joint detection method with a model-based generative motion tracking algorithm through a combined pose optimization energy. The discriminative part-based pose detection method, implemented using Convolutional Networks (ConvNets), estimates unary potentials for each joint of a kinematic skeleton model. These unary potentials are used to probabilistically extract pose constraints for tracking by using weighted sampling from a pose posterior guided by the model. In the final energy, these constraints are combined with an appearance-based model-to-image similarity term. Poses can be computed very efficiently using iterative local optimization, as ConvNet detection is fast and our formulation yields a combined pose estimation energy with analytic derivatives. In combination, this enables tracking of full articulated joint angles at state-of-the-art accuracy and temporal stability with a very low number of cameras.
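A toy example of the kind of combined energy described here, not the paper's formulation: a detection-driven term and an appearance-driven term are summed and minimized by local optimization. The joint positions, weights, and quadratic terms are made-up assumptions.

```python
# Toy sketch (not the paper's energy): combining a discriminative term
# (agreement with joint detections) and a generative model-to-image similarity
# term, then minimizing the sum by local optimization.
import numpy as np
from scipy.optimize import minimize

detected = np.array([[0.2, 0.3], [0.5, 0.6], [0.8, 0.4]])    # joint detections (unary term)
reference = np.array([[0.25, 0.3], [0.5, 0.55], [0.75, 0.45]])  # appearance-term target

def energy(flat_pose, w_det=1.0, w_app=0.5):
    pose = flat_pose.reshape(-1, 2)
    e_det = np.sum((pose - detected) ** 2)     # detection / unary potential term
    e_app = np.sum((pose - reference) ** 2)    # model-to-image similarity term
    return w_det * e_det + w_app * e_app

x0 = np.zeros(detected.size)                   # initial pose estimate
result = minimize(energy, x0, method="L-BFGS-B")
print(result.x.reshape(-1, 2))                 # optimized joint positions
```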


Computer Vision and Pattern Recognition | 2010

Exploiting global connectivity constraints for reconstruction of 3D line segments from images

Arjun Jain; Christian Kurz; Thorsten Thormählen; Hans-Peter Seidel

Given a set of 2D images, we propose a novel approach for the reconstruction of straight 3D line segments that represent the underlying geometry of static 3D objects in the scene. Such an algorithm is especially useful for the automatic 3D reconstruction of man-made environments. The main contribution of our approach is the generation of an improved reconstruction by imposing global topological constraints given by connections between neighbouring lines. Additionally, our approach does not employ explicit line matching between views, thus making it more robust against image noise and partial occlusion. Furthermore, we suggest a technique to merge independent reconstructions generated from different base images, which also helps to remove outliers. The proposed algorithm is evaluated on synthetic and real scenes by comparison with ground truth.
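An illustrative sketch, much cruder than the paper's global constraints: endpoints of neighbouring 3D line segments that are known to be connected are snapped to their common mean. The segment coordinates and connection list are made up.

```python
# Illustrative sketch (not the paper's method): enforcing connectivity between
# neighbouring reconstructed 3D line segments by snapping connected endpoints
# to their common mean, a crude stand-in for global topological constraints.
import numpy as np

# Each segment: (start point, end point) in 3D.
segments = [
    (np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.02, 0.0])),
    (np.array([1.03, 0.0, 0.01]), np.array([1.0, 1.0, 0.0])),
]
# Connections: ((segment index, endpoint index), (segment index, endpoint index))
connections = [((0, 1), (1, 0))]

def enforce_connectivity(segments, connections):
    segs = [list(map(np.copy, s)) for s in segments]
    for (i, a), (j, b) in connections:
        joint = 0.5 * (segs[i][a] + segs[j][b])   # shared corner point
        segs[i][a] = joint
        segs[j][b] = joint
    return segs

for start, end in enforce_connectivity(segments, connections):
    print(start, "->", end)
```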


International Conference on Computer Graphics and Interactive Techniques | 2012

Material memex: automatic material suggestions for 3D objects

Arjun Jain; Thorsten Thormählen; Tobias Ritschel; Hans-Peter Seidel

The material found on 3D objects and their parts in our everyday surroundings is highly correlated with the geometric shape of the parts and their relation to other parts of the same object. This work proposes to model this context-dependent correlation by learning it from a database containing several hundred objects and their materials. Given a part-based 3D object without materials, the learned model can be used to fully automatically assign plausible material parameters, including diffuse color, specularity, gloss, and transparency. Further, we propose a user interface that provides material suggestions. This user interface can be used, for example, to refine the automatic suggestions. Once a refinement has been made, the model incorporates this information, and the automatic assignment is incrementally improved. Results are given for objects with different numbers of parts and with different topological complexity. A user study validates that our method significantly simplifies and accelerates the material assignment task compared to other approaches.
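A toy sketch of context-dependent material suggestion, far simpler than the learned model described above: materials are suggested by voting over the closest matches in a tiny hand-made database keyed on part labels and neighbouring parts.

```python
# Toy sketch (illustrative only): suggesting a material for a part by voting
# over the closest matches in a tiny hand-made database of
# (part label, neighbouring parts) -> material examples.
from collections import Counter

database = [
    ("leg",  {"seat"},          "wood"),
    ("leg",  {"seat", "back"},  "metal"),
    ("seat", {"leg", "back"},   "fabric"),
    ("back", {"seat"},          "fabric"),
]

def suggest_material(part_label, neighbour_labels, k=2):
    def score(entry):
        label, neighbours, _ = entry
        return (label == part_label) * 2 + len(neighbour_labels & neighbours)
    ranked = sorted(database, key=score, reverse=True)[:k]
    votes = Counter(material for _, _, material in ranked)
    return votes.most_common(1)[0][0]

print(suggest_material("leg", {"seat"}))   # -> 'wood'
```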


ACM Multimedia | 2008

A system for automatic detection and recognition of advertising trademarks in sports videos

Lamberto Ballan; Marco Bertini; Arjun Jain

In this technical demonstration we show the current version of our trademark detection and recognition system, which has been developed in collaboration with a sports marketing firm with the aim of evaluating the visibility of advertising trademarks in broadcast sporting events. We propose a semi-automatic system for detecting and retrieving trademark appearances in sports videos. A human annotator supervises the results of the automatic annotation through an interface that shows the time and position of the detected trademarks; the aim of the system is therefore to provide high recall, so that the supervisor can safely skip the parts of the video marked as not containing a trademark, thus speeding up their work.
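A small illustrative sketch of the recall-oriented design choice: lowering the detection threshold trades precision for recall, so the supervisor only needs to review the flagged segments. The scores and labels are made up.

```python
# Illustrative sketch: choosing a detection threshold that favours recall, so a
# human supervisor only reviews flagged segments. Scores and labels are made up.
def recall_precision(scores, labels, threshold):
    flagged = [s >= threshold for s in scores]
    tp = sum(f and l for f, l in zip(flagged, labels))
    fn = sum((not f) and l for f, l in zip(flagged, labels))
    fp = sum(f and (not l) for f, l in zip(flagged, labels))
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision

scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]          # per-segment trademark scores
labels = [True, True, True, True, False, False]   # ground-truth presence

for threshold in (0.7, 0.5, 0.2):
    r, p = recall_precision(scores, labels, threshold)
    print(f"threshold={threshold}: recall={r:.2f}, precision={p:.2f}")
```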


British Machine Vision Conference | 2011

In Good Shape: Robust People Detection Based on Appearance and Shape

Leonid Pishchulin; Arjun Jain; Christian Wojek; Thorsten Thormählen; Bernt Schiele

Robustly detecting people in real-world scenes is a fundamental and challenging task in computer vision. State-of-the-art approaches use powerful learning methods and manually annotated image data. Importantly, these learning-based approaches rely on the fact that the collected training data is representative of all relevant variations necessary to detect people. Rather than collecting and annotating ever more training data, this paper explores the possibility of using a 3D human shape and pose model from computer graphics to add relevant shape information and learn more powerful people detection models. By sampling from the space of 3D shapes we are able to control data variability while covering the major shape variations of humans, which are often difficult to capture when collecting real-world training images. We evaluate our data generation method for a people detection model based on pictorial structures. As we show on a challenging multi-viewpoint dataset, the additional information contained in the 3D shape model helps to outperform models trained on image data alone.
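A minimal sketch of sampling from a shape space to control data variability, with assumed numbers: coefficients are drawn from a PCA-style Gaussian model and clipped to plausible ranges; the render step is a placeholder.

```python
# Minimal sketch (illustrative only): sampling body-shape coefficients from a
# PCA-style shape space to cover the major modes of human shape variation.
# Means, standard deviations, and the render step are assumptions.
import numpy as np

rng = np.random.default_rng(0)
num_modes = 5                                   # principal shape components
stddevs = np.array([3.0, 2.0, 1.5, 1.0, 0.5])   # spread per mode (assumed)

def sample_shapes(n, clip=2.5):
    """Draw shape coefficients, clipped so the samples stay plausible."""
    coeffs = rng.normal(0.0, stddevs, size=(n, num_modes))
    return np.clip(coeffs, -clip * stddevs, clip * stddevs)

def render_placeholder(coeffs):
    """Stand-in for rendering a training image from the 3D shape model."""
    return f"image(shape={np.round(coeffs, 2)})"

for c in sample_shapes(3):
    print(render_placeholder(c))
```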

Collaboration


Dive into Arjun Jain's collaborations.

Top Co-Authors

Sharat Chandran

Indian Institute of Technology Bombay
