Kuldeep Kulkarni | Researchain

Publication

Featured research published by Kuldeep Kulkarni.

Computer Vision and Pattern Recognition | 2016

ReconNet: Non-Iterative Reconstruction of Images from Compressively Sensed Measurements

Kuldeep Kulkarni; Suhas Lohit; Pavan K. Turaga; Ronan Kerviche; Amit Ashok

The goal of this paper is to present a non-iterative and, more importantly, extremely fast algorithm to reconstruct images from compressively sensed (CS) random measurements. To this end, we propose a novel convolutional neural network (CNN) architecture which takes in CS measurements of an image as input and outputs an intermediate reconstruction. We call this network ReconNet. The intermediate reconstruction is fed into an off-the-shelf denoiser to obtain the final reconstructed image. On a standard dataset of images, we show significant improvements in reconstruction results (both in terms of PSNR and time complexity) over state-of-the-art iterative CS reconstruction algorithms at various measurement rates. Further, through qualitative experiments on real data collected using our block single pixel camera (SPC), we show that our network is highly robust to sensor noise and can recover images of visibly better quality than competitive algorithms at extremely low sensing rates of 0.1 and 0.04. To demonstrate that our algorithm can recover semantically informative images even at a low measurement rate of 0.01, we present a robust proof-of-concept real-time visual tracking application.
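
To make the notion of a measurement rate concrete, the following is a minimal sketch of block-wise compressive measurement, y = Φx, at the three rates quoted in the abstract. It does not reproduce ReconNet itself. The 8x8 block size, the Gaussian form of the sensing matrix, and the names `measurement_matrix`, `measure`, and `num_measurements` are assumptions made for illustration only.

```python
import random


def measurement_matrix(m, n, seed=0):
    """Random Gaussian sensing matrix with m rows and n columns (assumed form)."""
    rng = random.Random(seed)
    scale = 1.0 / m ** 0.5
    return [[rng.gauss(0.0, scale) for _ in range(n)] for _ in range(m)]


def measure(phi, x):
    """Compute y = phi @ x for one flattened image block."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def num_measurements(rate, n):
    """Number of measurements for a block of n pixels (at least one)."""
    return max(1, int(rate * n))


# Hypothetical 8x8 block, flattened to 64 pixel values in [0, 1].
block = [((i * 7) % 11) / 10.0 for i in range(64)]

results = {}
for rate in (0.1, 0.04, 0.01):
    m = num_measurements(rate, len(block))
    y = measure(measurement_matrix(m, len(block)), block)
    results[rate] = (m, len(y))
    print(f"rate={rate}: {m} measurements for {len(block)} pixels")
```

At a rate of 0.01, a 64-pixel block is summarized by a single number, which gives a sense of how little information the reconstruction network has to work with at the lowest rate.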

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2016

Reconstruction-Free Action Inference from Compressive Imagers

Kuldeep Kulkarni; Pavan K. Turaga

Persistent surveillance from camera networks, such as at parking lots, UAVs, etc., often results in large amounts of video data, resulting in significant challenges for inference in terms of storage, communication and computation. Compressive cameras have emerged as a potential solution to deal with the data deluge issues in such applications. However, inference tasks such as action recognition require high-quality features, which implies reconstructing the original video data. Much work in compressive sensing (CS) theory is geared towards solving the reconstruction problem, where state-of-the-art methods are computationally intensive and provide low-quality results at high compression rates. Thus, reconstruction-free methods for inference are much desired. In this paper, we propose reconstruction-free methods for action recognition from compressive cameras at high compression ratios of 100 and above. Recognizing actions directly from CS measurements requires features which are mostly nonlinear and thus not easily applicable. This leads us to search for such properties that are preserved in compressive measurements. To this end, we propose the use of spatio-temporal smashed filters, which are compressive domain versions of pixel-domain matched filters. We conduct experiments on publicly available databases and show that one can obtain recognition rates that are comparable to the oracle method in the uncompressed setup, even for high compression ratios.
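
The idea behind a smashed filter can be sketched in a few lines: a random projection approximately preserves inner products, so a matched filter can be applied to the measurements instead of the pixels. The example below is a toy under stated assumptions, not the spatio-temporal filter design of the paper. The 1-D sinusoidal templates, the Gaussian matrix, the compression ratio of 2, and all function names are hypothetical.

```python
import math
import random


def gaussian_matrix(m, n, seed=0):
    """Random Gaussian projection with entries of variance 1/m (assumed form)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, scale) for _ in range(n)] for _ in range(m)]


def project(phi, x):
    """Compressive measurement y = phi @ x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def smashed_classify(phi, y, templates):
    """Correlate y with each projected template and return the best label."""
    scores = {name: dot(y, project(phi, t)) for name, t in templates.items()}
    return max(scores, key=scores.get), scores


n, m = 256, 128  # toy sizes; the paper targets far higher compression
templates = {
    "slow": [math.sin(2 * math.pi * 3 * i / n) for i in range(n)],
    "fast": [math.sin(2 * math.pi * 7 * i / n) for i in range(n)],
}
phi = gaussian_matrix(m, n)

# Only the measurements of the unknown signal are observed, never its pixels.
y = project(phi, templates["fast"])
label, scores = smashed_classify(phi, y, templates)
print(label, {k: round(v, 1) for k, v in scores.items()})
```

The score for the matching template concentrates around the squared norm of the signal, while the score for the mismatched template concentrates around zero, which is why the comparison still works without reconstruction.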

International Conference on Image Processing | 2012

Recurrence textures for human activity recognition from compressive cameras

Kuldeep Kulkarni; Pavan K. Turaga

Recent advances in camera architectures and associated mathematical representations now enable compressive acquisition of images and videos at low data rates. In such a setting, we consider the problem of human activity recognition, which is an important inference problem in many security and surveillance applications. We propose a framework for understanding human activities as a non-linear dynamical system, and propose a robust, generalizable feature that can be extracted directly from the compressed measurements without reconstructing the original video frames. The proposed feature is termed recurrence texture and is motivated by recurrence analysis of non-linear dynamical systems. We show that it is possible to obtain discriminative features directly from the compressed stream and demonstrate the utility of the proposed feature in recognition of activities at very low data rates.
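
Recurrence analysis starts from a recurrence matrix, which marks the pairs of time instants at which a system revisits nearly the same state. The sketch below builds such a matrix from a sequence of measurement vectors. It is only an illustration of the general technique: how the paper turns this matrix into a recurrence texture is not shown, and the 2-D periodic sequence, the threshold `eps`, and the function name are assumptions.

```python
import math


def recurrence_matrix(series, eps):
    """Binary matrix with R[i][j] = 1 when states i and j lie within eps."""
    n = len(series)
    return [
        [1 if math.dist(series[i], series[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical sequence of 2-D measurement vectors that repeats every 4 frames.
frames = [(math.cos(math.pi * t / 2), math.sin(math.pi * t / 2)) for t in range(8)]

R = recurrence_matrix(frames, eps=0.1)
for row in R:
    print("".join(str(v) for v in row))
```

A periodic activity shows up as diagonal lines parallel to the main diagonal, spaced by the period, so the visual texture of the matrix carries information about the underlying dynamics.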

International Conference on Image Processing | 2016

Direct inference on compressive measurements using convolutional neural networks

Suhas Lohit; Kuldeep Kulkarni; Pavan K. Turaga

Compressive imagers, e.g. the single-pixel camera (SPC), acquire measurements in the form of random projections of the scene instead of pixel intensities. Compressive Sensing (CS) theory allows accurate reconstruction of the image even from a small number of such projections. However, in practice, most reconstruction algorithms perform poorly at low measurement rates and are computationally very expensive. But perfect reconstruction is not the goal of high-level computer vision applications. Instead, we are interested only in determining certain properties of the image. Recent work has shown that effective inference is possible directly from the compressive measurements, without reconstruction, using correlational features. In this paper, we show that convolutional neural networks (CNNs) can be employed to extract discriminative non-linear features directly from CS measurements. Using these features, we demonstrate that effective high-level inference can be performed. Experimentally, using handwritten digit recognition (MNIST dataset) and image recognition (ImageNet) as examples, we show that recognition is possible even at low measurement rates of about 0.1.

Computer Vision and Pattern Recognition | 2017

Compressive Light Field Reconstructions Using Deep Learning

Mayank Gupta; Arjun Jauhari; Kuldeep Kulkarni; Suren Jayasuriya; Alyosha Molnar; Pavan K. Turaga

Light field imaging is limited by the computational processing demands of high sampling in both spatial and angular dimensions. Single-shot light field cameras sacrifice spatial resolution to sample angular viewpoints, typically by multiplexing incoming rays onto a 2D sensor array. While this resolution can be recovered using compressive sensing, these iterative solutions are slow in processing a light field. We present a deep learning approach using a new, two-branch network architecture, consisting jointly of an autoencoder and a 4D CNN, to recover a high-resolution 4D light field from a single coded 2D image. This network decreases reconstruction time significantly while achieving average PSNR values of 26-32 dB on a variety of light fields. In particular, reconstruction time is decreased from 35 minutes to 6.7 minutes as compared to the dictionary method for equivalent visual quality. These reconstructions are performed at small sampling/compression ratios as low as 8%, allowing for cheaper coded light field cameras. We test our network reconstructions on synthetic light fields, simulated coded measurements of real light fields captured from a Lytro Illum camera, and real coded images from a custom CMOS diffractive light field camera. The combination of compressive light field capture with deep learning allows the potential for real-time light field video acquisition systems in the future.

European Conference on Computer Vision | 2016

Weakly Supervised Learning of Heterogeneous Concepts in Videos

Sohil Shah; Kuldeep Kulkarni; Arijit Biswas; Ankit Gandhi; Om Deshmukh; Larry S. Davis

Typical textual descriptions that accompany online videos are ‘weak’: i.e., they mention the important heterogeneous concepts in the video but not their corresponding spatio-temporal locations. However, certain location constraints on these concepts can be inferred from the description. The goal of this paper is to present a generalization of the Indian Buffet Process (IBP) that can (a) systematically incorporate heterogeneous concepts in an integrated framework, and (b) enforce location constraints, for efficient classification and localization of the concepts in the videos. Finally, we develop posterior inference for the proposed formulation using mean-field variational approximation. Comparative evaluations on the Casablanca and the A2D datasets show that the proposed approach significantly outperforms other state-of-the-art techniques: 24% relative improvement for pairwise concept classification in the Casablanca dataset and 9% relative improvement for localization in the A2D dataset as compared to the most competitive baseline.
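
For background, the standard Indian Buffet Process that the paper generalizes is a prior over binary matrices in which each row (here, a video segment) switches on a subset of an unbounded set of latent features (concepts). The sketch below samples from that standard prior only. It includes neither the heterogeneous concepts nor the location constraints of the proposed generalization, and the function names, the concentration parameter `alpha`, and the item count are assumptions.

```python
import math
import random


def sample_poisson(rng, lam):
    """Knuth's multiplication method; adequate for small lam."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_ibp(num_items, alpha, seed=0):
    """Draw a binary item-by-feature matrix from the standard IBP prior."""
    rng = random.Random(seed)
    counts = []  # counts[k] = number of earlier items that use feature k
    rows = []
    for i in range(1, num_items + 1):
        # Reuse an existing feature with probability (its popularity) / i.
        row = [1 if rng.random() < c / i else 0 for c in counts]
        # Then introduce Poisson(alpha / i) brand-new features.
        new = sample_poisson(rng, alpha / i)
        row.extend([1] * new)
        counts = [c + r for c, r in zip(counts, row)] + [1] * new
        rows.append(row)
    width = len(counts)
    # Pad earlier rows with zeros for features that appeared later.
    return [r + [0] * (width - len(r)) for r in rows]


Z = sample_ibp(num_items=5, alpha=2.0)
for row in Z:
    print(row)
```

The number of columns is not fixed in advance, which is the property that makes this family of priors attractive when the number of concepts in a video collection is unknown.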

Computer Vision and Pattern Recognition | 2015