Publication


Featured research published by Mihir Jain.


Computer Vision and Pattern Recognition | 2014

Action Localization with Tubelets from Motion

Mihir Jain; Jan C. van Gemert; Hervé Jégou; Patrick Bouthemy; Cees G. M. Snoek

This paper considers the problem of action localization, where the objective is to determine when and where certain actions appear. We introduce a sampling strategy to produce 2D+t sequences of bounding boxes, called tubelets. Compared to state-of-the-art alternatives, this strategy drastically reduces the number of hypotheses while retaining those likely to include the action of interest. Our method is inspired by a recent technique introduced for object localization in images. Beyond applying this technique to video for the first time, we revisit the strategy for 2D+t sequences obtained from super-voxels. Our sampling strategy advantageously exploits a criterion that reflects how action-related motion deviates from background motion. We demonstrate the merit of our approach by extensive experiments on two public datasets: UCF Sports and MSR-II. Our approach significantly outperforms the state-of-the-art on both datasets, while restricting the search for actions to a fraction of the possible bounding box sequences.
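
A rough sense of what a tubelet is can be given in a few lines of code. The sketch below is not the authors' implementation; it only shows how a super-voxel volume (here a hypothetical `(T, H, W)` integer label array from any off-the-shelf video segmentation) turns into a 2D+t sequence of per-frame bounding boxes:

```python
import numpy as np

def tubelet_from_supervoxel(labels: np.ndarray, sv_id: int) -> dict:
    """Map one super-voxel id to {frame index: (x_min, y_min, x_max, y_max)}."""
    tubelet = {}
    for t, frame in enumerate(labels):
        ys, xs = np.nonzero(frame == sv_id)
        if xs.size == 0:  # the super-voxel does not appear in this frame
            continue
        tubelet[t] = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
    return tubelet

def all_tubelets(labels: np.ndarray) -> dict:
    # Label 0 is treated here as background and skipped.
    return {sv: tubelet_from_supervoxel(labels, sv)
            for sv in np.unique(labels) if sv != 0}

# Toy volume: 4 frames of 8x8 with one region drifting right one pixel per frame.
labels = np.zeros((4, 8, 8), dtype=int)
for t in range(4):
    labels[t, 2:5, 3 + t:6 + t] = 1
print(all_tubelets(labels)[1])
```

The sampling strategy then scores and merges such super-voxels; the motion criterion driving that merging is sketched under the journal version of this work (IJCV 2017) further down.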


Computer Vision and Pattern Recognition | 2015

What do 15,000 object categories tell us about classifying and localizing actions?

Mihir Jain; Jan C. van Gemert; Cees G. M. Snoek

This paper contributes to the automatic classification and localization of human actions in video. Whereas motion is the key ingredient in modern approaches, we assess the benefit of having objects in the video representation. Rather than considering a handful of carefully selected and localized objects, we conduct an empirical study on the benefit of encoding 15,000 object categories for actions, using 6 datasets totaling more than 200 hours of video and covering 180 action classes. Our key contributions are: i) the first in-depth study of encoding objects for actions; ii) we show that objects matter for actions and are often semantically relevant as well; iii) we establish that actions have object preferences, so rather than using all objects, selection is advantageous for action recognition; iv) we reveal that object-action relations are generic, which allows these relationships to be transferred from one domain to another; and v) objects, when combined with motion, improve the state-of-the-art for both action classification and localization.
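
To make the representation concrete, here is a minimal sketch, not the paper's pipeline: a video is encoded as average-pooled per-frame object-classifier scores, optionally restricted to the most responsive objects, and fused with a motion descriptor. The function `object_scores` is a hypothetical stand-in for any pretrained 15,000-way object classifier:

```python
import numpy as np

N_OBJECTS = 15_000

def object_scores(frame: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: a real system would run an object CNN here."""
    rng = np.random.default_rng(int(frame.sum()) % 2**32)
    return rng.random(N_OBJECTS)

def video_object_encoding(frames, top_k=None) -> np.ndarray:
    enc = np.mean([object_scores(f) for f in frames], axis=0)
    if top_k is not None:                     # "actions have object preferences":
        enc[np.argsort(enc)[:-top_k]] = 0.0   # keep only the top-k objects
    return enc

frames = [np.full((8, 8), v) for v in (1.0, 2.0, 3.0)]
motion = np.zeros(128)  # e.g. an encoding of dense-trajectory motion features
video_repr = np.concatenate([video_object_encoding(frames, top_k=100), motion])
print(video_repr.shape)  # (15128,)
```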


British Machine Vision Conference | 2015

APT: Action localization Proposals from dense Trajectories

Jan C. van Gemert; Mihir Jain; Ella Gati; Cees G. M. Snoek

This paper is on action localization in video with the aid of spatio-temporal proposals. To avoid the computationally expensive segmentation step of existing methods, we propose bypassing segmentation completely by generating proposals directly from the dense trajectories used to represent videos during classification. Our Action localization Proposals from dense Trajectories (APT) use an efficient proposal generation algorithm to handle the high number of trajectories in a video. Our spatio-temporal proposals are faster than current methods and outperform current proposals in localization and classification accuracy on the UCF Sports, UCF 101, and MSR-II video datasets. Corrected version: we fixed a mistake in our UCF-101 ground truth. Numbers are different; conclusions are unchanged.
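
As a rough illustration of the idea, the sketch below clusters dense trajectories and reads one box per frame off each cluster. Plain k-means stands in for the paper's efficient proposal generation algorithm, and the trajectory array is synthetic:

```python
import numpy as np
from sklearn.cluster import KMeans

def proposals_from_trajectories(trajs: np.ndarray, n_clusters: int = 3):
    """trajs: (n_traj, T, 2) array of (x, y) points tracked over T frames."""
    n_traj, T, _ = trajs.shape
    feats = trajs.reshape(n_traj, -1)  # simple position-based descriptor
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(feats)
    proposals = []
    for c in range(n_clusters):
        pts = trajs[labels == c]       # all trajectories in this cluster
        boxes = [(pts[:, t, 0].min(), pts[:, t, 1].min(),
                  pts[:, t, 0].max(), pts[:, t, 1].max()) for t in range(T)]
        proposals.append(boxes)        # one spatio-temporal proposal
    return proposals

rng = np.random.default_rng(0)
trajs = rng.random((60, 15, 2)) + rng.integers(0, 3, (60, 1, 1))  # 3 loose groups
print(len(proposals_from_trajectories(trajs)), "proposals of 15 boxes each")
```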


International Conference on Computer Vision | 2015

Objects2action: Classifying and Localizing Actions without Any Video Example

Mihir Jain; Jan C. van Gemert; Thomas Mensink; Cees G. M. Snoek

The goal of this paper is to recognize actions in video without the need for examples. Unlike traditional zero-shot approaches, we do not demand the design and specification of attribute classifiers and class-to-attribute mappings to allow transfer from seen classes to unseen classes. Our key contribution is objects2action, a semantic word embedding spanned by a skip-gram model of thousands of object categories. Action labels are assigned to the object encoding of an unseen video based on a convex combination of action and object affinities. Our semantic embedding has three main characteristics to accommodate the specifics of actions. First, we propose a mechanism to exploit multiple-word descriptions of actions and objects. Second, we incorporate the automated selection of the most responsive objects per action. Finally, we demonstrate how to extend our zero-shot approach to the spatio-temporal localization of actions in video. Experiments on four action datasets demonstrate the potential of our approach.
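
The scoring rule itself is compact enough to sketch. Below, toy vectors stand in for real skip-gram embeddings and real object-classifier outputs; the point is only the mechanism, a convex combination of object-action affinities in the embedding space:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def zero_shot_scores(object_probs, obj_emb, act_emb, top_k=2):
    """object_probs: (n_obj,) scores for one video; obj_emb: (n_obj, d) and
    act_emb: (n_act, d) word embeddings. Returns one score per action."""
    top = np.argsort(object_probs)[-top_k:]          # most responsive objects
    w = object_probs[top] / object_probs[top].sum()  # convex weights
    return np.array([sum(wi * cosine(obj_emb[o], a) for wi, o in zip(w, top))
                     for a in act_emb])

obj_emb = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])  # toy object embeddings
act_emb = np.array([[0.9, 0.1], [0.1, 0.9]])              # toy action embeddings
video_obj_probs = np.array([0.6, 0.1, 0.3])               # classifier output
print(zero_shot_scores(video_obj_probs, obj_emb, act_emb))
```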


Computer Vision and Image Understanding | 2018

VideoLSTM convolves, attends and flows for action recognition

Zhenyang Li; Kirill Gavrilyuk; Efstratios Gavves; Mihir Jain; Cees G. M. Snoek

We present a new architecture for end-to-end sequence learning of actions in video, which we call VideoLSTM. Rather than adapting the video to the peculiarities of established recurrent or convolutional architectures, we adapt the architecture to fit the requirements of the video medium. Starting from the soft-Attention LSTM, VideoLSTM makes three novel contributions. First, video has a spatial layout; to exploit this spatial correlation, we hardwire convolutions into the soft-Attention LSTM architecture. Second, motion not only informs us about the action content but also better guides the attention towards the relevant spatio-temporal locations; we introduce motion-based attention. Finally, we demonstrate how the attention from VideoLSTM can be used for action localization relying on just the action class label. Experiments and comparisons on challenging datasets for action classification and localization support our claims.
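
The two attention ideas can be sketched compactly. This is a minimal numpy illustration, not the released model: attention is computed with convolutions (here 1x1, as a channel mix) so it keeps the spatial layout, and a motion map, here a flow-magnitude image, biases where that attention lands:

```python
import numpy as np

def conv1x1(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """1x1 convolution of a (C, H, W) map with a (C_out, C) channel-mixing kernel."""
    return np.tensordot(w, x, axes=([1], [0]))

def spatial_softmax(z: np.ndarray) -> np.ndarray:
    """Softmax over all H*W locations of an (H, W) score map."""
    e = np.exp(z - z.max())
    return e / e.sum()

def attention_step(x_t, h_prev, wx, wh, flow_mag):
    score = np.tanh(conv1x1(x_t, wx) + conv1x1(h_prev, wh)).sum(axis=0)
    a_t = spatial_softmax(score + flow_mag)  # motion biases the attention map
    return a_t[None] * x_t                   # attended input for the LSTM step

C, H, W = 4, 6, 6
rng = np.random.default_rng(0)
x_t, h_prev = rng.random((C, H, W)), rng.random((C, H, W))
wx, wh = rng.random((C, C)), rng.random((C, C))
flow_mag = np.zeros((H, W)); flow_mag[2, 3] = 5.0  # strong motion at one spot
print(attention_step(x_t, h_prev, wx, wh, flow_mag).shape)  # (4, 6, 6)
```

The recurrent cell update itself is omitted; any standard (Conv)LSTM step would consume the attended input.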


International Journal of Computer Vision | 2017

Tubelets: Unsupervised Action Proposals from Spatiotemporal Super-Voxels

Mihir Jain; Jan C. van Gemert; Hervé Jégou; Patrick Bouthemy; Cees G. M. Snoek

This paper considers the problem of localizing actions in videos as sequences of bounding boxes. The objective is to generate action proposals that are likely to include the action of interest, ideally achieving high recall with few proposals. Our contributions are threefold. First, inspired by selective search for object proposals, we introduce an approach to generate action proposals from spatiotemporal super-voxels in an unsupervised manner; we call them Tubelets. Second, along with static features from individual frames, our approach advantageously exploits motion. We introduce independent motion evidence as a feature to characterize how an action deviates from the background, and explicitly incorporate this motion information in various stages of the proposal generation. Finally, we introduce spatiotemporal refinement of Tubelets for more precise localization of actions, and pruning to keep the number of Tubelets limited. We demonstrate the suitability of our approach by extensive experiments on action proposal quality and action localization on three public datasets: UCF Sports, MSR-II, and UCF101. For action proposal quality, our unsupervised proposals beat all other existing approaches on the three datasets. For action localization, we show top performance on both the trimmed videos of UCF Sports and UCF101 as well as the untrimmed videos of MSR-II.
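
The independent motion evidence feature can be sketched in a few lines. The paper fits a parametric background motion model; the global mean flow below is a crude stand-in, used only to show the shape of the computation:

```python
import numpy as np

def independent_motion_evidence(flow: np.ndarray) -> np.ndarray:
    """flow: (H, W, 2) optical flow. Returns an (H, W) evidence map in [0, 1)."""
    background = flow.reshape(-1, 2).mean(axis=0)   # dominant (camera) motion
    residual = np.linalg.norm(flow - background, axis=-1)
    return 1.0 - np.exp(-residual)                  # saturating evidence score

flow = np.zeros((6, 6, 2))
flow[2:4, 2:4] = [3.0, 0.0]  # a small patch moving against a static background
print(independent_motion_evidence(flow).round(2))
```

Pixels whose flow departs from the dominant motion score close to 1, which is what lets the proposal generation favor super-voxels containing action-related motion.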


ECCV THUMOS Challenge Workshop | 2014

University of Amsterdam at THUMOS Challenge 2014

Mihir Jain; Jan C. van Gemert; Cees G. M. Snoek


ACM Multimedia | 2011

Asymmetric Hamming Embedding

Mihir Jain; Hervé Jégou; Patrick Gros


Archive | 2017

Action Localization in Sequential Data with Attention Proposals from a Recurrent Network

Mihir Jain; Zhenyang Li; Efstratios Gavves; Cornelis Gerardus Maria Snoek


Archive | 2015

University of Amsterdam at THUMOS 2015

Mihir Jain; Jan C. van Gemert; Pascal Mettes; Cees G. M. Snoek

Collaboration


Dive into Mihir Jain's collaborations.

Top Co-Authors

Zhenyang Li

University of Amsterdam

Bernard Ghanem

King Abdullah University of Science and Technology

Victor Escorcia

King Abdullah University of Science and Technology
