Network

Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Krishna Kumar Singh is active.

Publications

Featured research published by Krishna Kumar Singh.


European Conference on Computer Vision | 2016

End-to-End Localization and Ranking for Relative Attributes

Krishna Kumar Singh; Yong Jae Lee

We propose an end-to-end deep convolutional network to simultaneously localize and rank relative visual attributes, given only weakly-supervised pairwise image comparisons. Unlike previous methods, our network jointly learns the attribute’s features, localization, and ranker. The localization module of our network discovers the most informative image region for the attribute, which is then used by the ranking module to learn a ranking model of the attribute. Our end-to-end framework is also significantly faster than previous methods. We show state-of-the-art ranking results on various relative attribute datasets, and our qualitative localization results clearly demonstrate our network’s ability to learn meaningful image patches.
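
The pairwise supervision above can be made concrete with a small example. Below is a minimal PyTorch sketch of a RankNet-style pairwise ranking loss over predicted attribute scores; the function and variable names are illustrative and the attribute network that produces the scores is omitted, so this is a sketch under assumptions rather than the authors' implementation.

```python
import torch
import torch.nn.functional as F

def pairwise_ranking_loss(score_a, score_b, target):
    """RankNet-style loss for relative attributes (illustrative).

    score_a, score_b: predicted attribute strengths for two images.
    target: 1.0 if image A shows more of the attribute, 0.0 otherwise.
    """
    # Probability that A ranks above B, from the score difference.
    prob_a_over_b = torch.sigmoid(score_a - score_b)
    return F.binary_cross_entropy(prob_a_over_b, target)

# Toy usage with dummy scores standing in for the network's outputs.
score_a = torch.tensor([1.3, 0.2])
score_b = torch.tensor([0.9, 0.8])
target = torch.tensor([1.0, 0.0])   # A > B for pair 1, B > A for pair 2
print(pairwise_ranking_loss(score_a, score_b, target).item())
```

The sigmoid of the score difference turns ranking into binary classification over pairs, which is what allows a ranker to train from weak pairwise comparisons alone.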


Computer Vision and Pattern Recognition | 2016

Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection

Krishna Kumar Singh; Fanyi Xiao; Yong Jae Lee

The status quo approach to training object detectors requires expensive bounding box annotations. Our framework takes a markedly different direction: we transfer tracked object boxes from weakly-labeled videos to weakly-labeled images to automatically generate pseudo ground-truth boxes, which replace manually annotated bounding boxes. We first mine discriminative regions in the weakly-labeled image collection that frequently/rarely appear in the positive/negative images. We then match those regions to videos and retrieve the corresponding tracked object boxes. Finally, we design a Hough transform algorithm to vote for the best box to serve as the pseudo GT for each image, and use these boxes to train an object detector. Together, these lead to state-of-the-art weakly-supervised detection results on the PASCAL 2007 and 2010 datasets.
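
To illustrate the voting step, here is a minimal NumPy sketch in which each retrieved tracked box votes for every candidate with its IoU, and the candidate with the highest vote mass is kept as the pseudo ground truth. This greedy IoU voting is an illustrative simplification, not the paper's exact Hough transform algorithm.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) form."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def vote_pseudo_gt(candidate_boxes):
    """Pick the box most consistent with all retrieved tracked boxes:
    each box 'votes' for every candidate with its IoU, and the
    candidate with the highest total vote mass wins."""
    votes = [sum(iou(b, other) for other in candidate_boxes)
             for b in candidate_boxes]
    return candidate_boxes[int(np.argmax(votes))]

boxes = [(10, 10, 100, 100), (12, 8, 105, 98), (200, 50, 260, 120)]
print(vote_pseudo_gt(boxes))  # one of the two mutually overlapping boxes
```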


Workshop on Applications of Computer Vision | 2016

KrishnaCam: Using a longitudinal, single-person, egocentric dataset for scene understanding tasks

Krishna Kumar Singh; Kayvon Fatahalian; Alexei A. Efros

We record, analyze, and present to the community KrishnaCam, a large (7.6 million frames, 70 hours) egocentric video stream along with GPS position, acceleration, and body orientation data spanning nine months of the life of a computer vision graduate student. We explore and exploit the inherent redundancies in this rich visual data stream to answer simple scene understanding questions such as: How much novel visual information does the student see each day? Given a single egocentric photograph of a scene, can we predict where the student might walk next? We find that given our large video database, simple nearest-neighbor methods are surprisingly adept baselines for these tasks, even in scenes and scenarios where the camera wearer has never been before. For example, we demonstrate the ability to predict the near-future trajectory of the student in a broad set of outdoor situations that includes following sidewalks, stopping to wait for a bus, taking a daily path to work, and the lack of movement while eating food.
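
The nearest-neighbor baseline can be sketched in a few lines: embed the query frame, find the most similar database frame, and reuse the trajectory recorded after it. The features and tracks below are toy placeholders; the actual frame representation used in the paper is not reproduced here.

```python
import numpy as np

def predict_trajectory(query_feat, db_feats, db_future_tracks):
    """Nearest-neighbor trajectory prediction (illustrative).

    db_feats: (N, D) features of database frames.
    db_future_tracks: list of N arrays, the GPS track recorded
    after each database frame was captured.
    Returns the future track of the most similar database frame.
    """
    dists = np.linalg.norm(db_feats - query_feat, axis=1)
    return db_future_tracks[int(np.argmin(dists))]

# Toy database: 3 frames with 4-D features and short future tracks.
db_feats = np.array([[0.1, 0.2, 0.3, 0.4],
                     [0.9, 0.8, 0.7, 0.6],
                     [0.5, 0.5, 0.5, 0.5]])
db_tracks = [np.array([[0, 0], [0, 1]]),
             np.array([[5, 5], [6, 5]]),
             np.array([[2, 2], [2, 3]])]
query = np.array([0.88, 0.79, 0.72, 0.61])
print(predict_trajectory(query, db_feats, db_tracks))
```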


Computer Vision and Pattern Recognition | 2017

Identifying First-Person Camera Wearers in Third-Person Videos

Chenyou Fan; Jangwon Lee; Mingze Xu; Krishna Kumar Singh; Yong Jae Lee; David J. Crandall; Michael S. Ryoo

We consider scenarios in which multiple people wear body-worn cameras while a third-person static camera also captures the scene, and we wish to perform joint scene understanding, object tracking, activity recognition, and other tasks. To do this, we need to establish person-level correspondences across first- and third-person videos, which is challenging because the camera wearer is not visible in his/her own egocentric video, preventing the use of direct feature matching. In this paper, we propose a new semi-Siamese Convolutional Neural Network architecture to address this novel challenge. We formulate the problem as learning a joint embedding space for first- and third-person videos that considers both spatial- and motion-domain cues. A new triplet loss function is designed to minimize the distance between correct first- and third-person matches while maximizing the distance between incorrect ones. This end-to-end approach performs significantly better than several baselines, in part by learning first- and third-person features optimized for matching jointly with the distance measure itself.
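
The triplet objective is standard enough to sketch directly with PyTorch's built-in triplet margin loss; the random embeddings below stand in for outputs of the semi-Siamese network, whose architecture is not reproduced here, and the margin value is an assumption.

```python
import torch
import torch.nn as nn

# Anchor: embedding of a person seen in the third-person video.
# Positive: embedding of that same person's first-person video.
# Negative: embedding of a different wearer's first-person video.
# The margin and the 128-D embedding size are assumptions for illustration.
triplet_loss = nn.TripletMarginLoss(margin=0.5)

anchor = torch.randn(8, 128)    # batch of third-person embeddings
positive = torch.randn(8, 128)  # matching first-person embeddings
negative = torch.randn(8, 128)  # mismatched first-person embeddings

# Pulls correct first-/third-person matches together and pushes
# incorrect ones at least `margin` apart, as the abstract describes.
print(triplet_loss(anchor, positive, negative).item())
```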


Indian Conference on Computer Vision, Graphics and Image Processing | 2012

Geometry directed browser for personal photographs

Aditya Deshpande; Siddharth Choudhary; P. J. Narayanan; Krishna Kumar Singh; Kaustav Kundu; Aditya Singh; Apurva Kumar

Browsers of personal digital photographs all essentially follow the slide show paradigm, sequencing through the photos in the order they are taken. A more engaging way to browse personal photographs, especially of a large space like a popular monument, should involve the geometric context of the space. In this paper, we present a geometry directed photo browser that enables users to browse their personal pictures with the underlying geometry of the space to guide the process. The browser uses a pre-computed package of geometric information about the monument for this task. The package is used to register a set of photographs taken by the user with the common geometric space of the monument. This involves localizing the images to the monument space by computing the camera matrix corresponding to each. We use a state-of-the-art method for fast localization. Registered photographs can be browsed using a visualization module that shows them in the proper geometric context with respect to a point-based 3D model of the monument. We present the design of the geometry-directed browser and demonstrate its utility for a few sequences of personal images of well-known monuments. We believe personal photo browsers can provide an enhanced sense of one's own experience with a monument using the underlying geometric context of the monument.
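
Computing a camera matrix from 2D-3D correspondences is a perspective-n-point (PnP) problem. Below is a minimal sketch using OpenCV's solvePnP with made-up correspondences and assumed pinhole intrinsics; the paper uses a fast state-of-the-art localizer, so this is only the textbook version of the registration step.

```python
import numpy as np
import cv2

# Dummy 2D-3D correspondences between image features and the monument's
# point model (in practice found by descriptor matching; values made up).
points_3d = np.array([[0, 0, 0], [2, 0, 0], [0, 2, 0], [2, 2, 0],
                      [1, 1, 1], [0.5, 1.5, 0.8]], dtype=np.float64)
points_2d = np.array([[320, 240], [480, 236], [316, 400], [478, 396],
                      [400, 310], [360, 350]], dtype=np.float64)

# Assumed pinhole intrinsics: focal length 800 px, principal point at center.
K = np.array([[800, 0, 320],
              [0, 800, 240],
              [0, 0, 1]], dtype=np.float64)

ok, rvec, tvec = cv2.solvePnP(points_3d, points_2d, K, None)
R, _ = cv2.Rodrigues(rvec)        # rotation vector -> 3x3 rotation matrix
P = K @ np.hstack([R, tvec])      # 3x4 camera matrix registering the photo
print(ok, P.shape)
```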


Web Search and Data Mining | 2018

Who Will Share My Image?: Predicting the Content Diffusion Path in Online Social Networks

Wenjian Hu; Krishna Kumar Singh; Fanyi Xiao; Jinyoung Han; Chen-Nee Chuah; Yong Jae Lee

Content popularity prediction has been extensively studied due to its importance and interest for both users and hosts of social media sites like Facebook, Instagram, Twitter, and Pinterest. However, existing work mainly focuses on modeling popularity using a single metric such as the total number of likes or shares. In this work, we propose Diffusion-LSTM, a memory-based deep recurrent network that learns to recursively predict the entire diffusion path of an image through a social network. By combining user social features and image features, and encoding the diffusion path taken thus far with an explicit memory cell, our model predicts the diffusion path of an image more accurately compared to alternate baselines that either encode only image or social features, or lack memory. By mapping individual users to user prototypes, our model can generalize to new users not seen during training. Finally, we demonstrate our model's capability of generating diffusion trees, and show that the generated trees closely resemble ground-truth trees.
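
A minimal sketch of the recurrent core described above: an LSTM consumes the image features together with the features of each sharer along the path so far, and predicts logits over user prototypes for the next sharer. All dimensions, names, and the single-LSTM structure are assumptions for illustration, not the authors' released model.

```python
import torch
import torch.nn as nn

class DiffusionPathPredictor(nn.Module):
    """Toy recurrent model: at each step, combine image features with the
    current sharer's features and predict which user prototype shares next."""

    def __init__(self, img_dim=512, user_dim=64, hidden=256, n_prototypes=100):
        super().__init__()
        self.lstm = nn.LSTM(img_dim + user_dim, hidden, batch_first=True)
        self.next_user = nn.Linear(hidden, n_prototypes)

    def forward(self, img_feat, user_feats):
        # img_feat: (B, img_dim); user_feats: (B, T, user_dim), path so far.
        T = user_feats.size(1)
        img_seq = img_feat.unsqueeze(1).expand(-1, T, -1)
        h, _ = self.lstm(torch.cat([img_seq, user_feats], dim=-1))
        return self.next_user(h)   # (B, T, n_prototypes) logits per step

model = DiffusionPathPredictor()
logits = model(torch.randn(2, 512), torch.randn(2, 5, 64))
print(logits.shape)  # torch.Size([2, 5, 100])
```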


International Conference on Computer Vision | 2017

Hide-and-Seek: Forcing a Network to be Meticulous for Weakly-Supervised Object and Action Localization

Krishna Kumar Singh; Yong Jae Lee


arXiv: Distributed, Parallel, and Cluster Computing | 2013

CPU and/or GPU: Revisiting the GPU vs. CPU Myth

Kishore Kothapalli; Dip Sankar Banerjee; P. J. Narayanan; Surinder Sood; Aman Kumar Bahl; Shashank Sharma; Shrenik Lad; Krishna Kumar Singh; Kiran Kumar Matam; Sivaramakrishna Bharadwaj; Rohit Nigam; Parikshit Sakurikar; Aditya Deshpande; Ishan Misra; Siddharth Choudhary; Shubham Gupta


European Conference on Computer Vision | 2018

DOCK: Detecting Objects by transferring Common-sense Knowledge

Krishna Kumar Singh; Santosh Kumar Divvala; Ali Farhadi; Yong Jae Lee


Archive | 2016

Video Summarization Using Semantic Information

Myung Hwangbo; Krishna Kumar Singh; Teahyung Lee; Omesh Tickoo

Collaboration

Dive into Krishna Kumar Singh's collaborations.

Top Co-Authors

Yong Jae Lee, University of California
Ali Farhadi, University of Washington
Fanyi Xiao, University of California
Aditya Deshpande, International Institute of Information Technology
P. J. Narayanan, International Institute of Information Technology
Siddharth Choudhary, International Institute of Information Technology
Chen-Nee Chuah, University of California
Chenyou Fan, Indiana University Bloomington