P. J. Narayanan

International Institute of Information Technology, Hyderabad

Publication

Featured research published by P. J. Narayanan.

IEEE International Conference on High Performance Computing, Data, and Analytics | 2007

Accelerating large graph algorithms on the GPU using CUDA

Pawan Harish; P. J. Narayanan

Large graphs involving millions of vertices are common in many practical applications and are challenging to process. Practical-time implementations using high-end computers are reported but are accessible only to a few. Graphics Processing Units (GPUs) of today have high computation power and low price. They have a restrictive programming model and are tricky to use. The G80 line of Nvidia GPUs can be treated as a SIMD processor array using the CUDA programming model. We present a few fundamental algorithms, including breadth-first search, single-source shortest path, and all-pairs shortest path, using CUDA on large graphs. We can compute the single-source shortest path on a 10 million vertex graph in 1.5 seconds using the Nvidia 8800GTX GPU costing $600. In some cases, the optimal sequential algorithm is not the fastest on the GPU architecture. GPUs have great potential as high-performance co-processors.

IEEE MultiMedia | 1997

Virtualized reality: constructing virtual worlds from real scenes

Takeo Kanade; Peter Rander; P. J. Narayanan

International Conference on Computer Vision | 1998

Constructing virtual worlds using dense stereo

P. J. Narayanan; Peter Rander; Takeo Kanade

A new visual medium, Virtualized Reality, immerses viewers in a virtual reconstruction of real-world events. The Virtualized Reality world model consists of real images and depth information computed from these images. Stereoscopic reconstructions provide a sense of complete immersion, and users can select their own viewpoints at view time, independent of the actual camera positions used to capture the event.

Computer Vision and Pattern Recognition | 2008

CUDA cuts: Fast graph cuts on the GPU

Vibhav Vineet; P. J. Narayanan

We present Virtualized Reality, a technique to create virtual worlds out of dynamic events using densely distributed stereo views. The intensity image and depth map for each camera view at each time instant are combined to form a Visible Surface Model. Immersive interaction with the virtualized event is possible using a dense collection of such models. Additionally, a Complete Surface Model of each instant can be built by merging the depth maps from different cameras into a common volumetric space. The corresponding model is compatible with traditional virtual models and can be interacted with immersively using standard tools. Because both VSMs and CSMs are fully three-dimensional, virtualized models can also be combined and modified to build larger, more complex environments, an important capability for many non-trivial applications. We present results from 3D Dome, our facility to create virtualized models.

Archive | 2006

Computer Vision – ACCV 2006

P. J. Narayanan; Shree K. Nayar; Heung-Yeung Shum

Graph cuts have become a powerful and popular optimization tool for energies defined over an MRF and have found applications in image segmentation, stereo vision, image restoration, etc. The maxflow/mincut algorithm to compute graph cuts is computationally heavy. The best-reported implementation of graph cuts takes over 100 milliseconds even on images of size 640×480 and cannot be used for real-time applications or when iterated applications are needed. The commodity Graphics Processor Unit (GPU) has recently emerged as an economical and fast computation co-processor. In this paper, we present an implementation of the push-relabel algorithm for graph cuts on the GPU. We can perform over 60 graph cuts per second on 1024×1024 images and over 150 graph cuts per second on 640×480 images on an Nvidia 8800 GTX. The time for each complete graph cut is about 1 millisecond when only a few weights change from the previous graph, as on dynamic graphs resulting from videos. The CUDA code with a well-defined interface can be downloaded for anyone's use.
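The push-relabel idea named in the graph cuts abstract above can be illustrated with a small sequential sketch. This is not the authors' CUDA implementation: it is a generic textbook push-relabel max-flow on a dense capacity matrix, and the function name `push_relabel_max_flow` and the matrix representation are assumptions made for illustration. By the max-flow/min-cut theorem, the value it returns equals the weight of the minimum cut.

```python
def push_relabel_max_flow(capacity, source, sink):
    """Generic push-relabel max-flow on a dense capacity matrix (illustrative only)."""
    n = len(capacity)
    flow = [[0] * n for _ in range(n)]  # skew-symmetric: flow[v][u] == -flow[u][v]
    height = [0] * n
    excess = [0] * n
    height[source] = n

    # Saturate every edge leaving the source to create the initial preflow.
    for v in range(n):
        cap = capacity[source][v]
        if cap > 0:
            flow[source][v] = cap
            flow[v][source] = -cap
            excess[v] += cap
            excess[source] -= cap

    # Active vertices hold excess flow; source and sink are never active.
    active = [v for v in range(n) if v not in (source, sink) and excess[v] > 0]
    while active:
        u = active[-1]
        pushed = False
        for v in range(n):
            residual = capacity[u][v] - flow[u][v]
            # Push only "downhill" by exactly one height level.
            if residual > 0 and height[u] == height[v] + 1:
                delta = min(excess[u], residual)
                flow[u][v] += delta
                flow[v][u] -= delta
                excess[u] -= delta
                excess[v] += delta
                if v not in (source, sink) and excess[v] == delta:
                    active.append(v)  # v had no excess before, so it is newly active
                pushed = True
                if excess[u] == 0:
                    break
        if excess[u] == 0:
            active.remove(u)
        elif not pushed:
            # Relabel: lift u just above its lowest residual neighbour.
            height[u] = 1 + min(
                height[v] for v in range(n) if capacity[u][v] - flow[u][v] > 0
            )

    return sum(flow[source][v] for v in range(n))
```

Push and relabel decisions depend only on a vertex and its neighbours, which is the property that makes this family of algorithms a natural fit for per-pixel parallel execution on image grids.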

International Parallel and Distributed Processing Symposium | 2009

Singular value decomposition on GPU using CUDA

Sheetal Lahabar; P. J. Narayanan

Computer Vision – ACCV 2006: 7th Asian Conference on Computer Vision, Hyderabad, India, January 13-16, 2006, Proceedings, Part I. Lecture Notes in Computer Science.

High Performance Graphics | 2009

Fast minimum spanning tree for large graphs on the GPU

Vibhav Vineet; Pawan Harish; Suryakant Patidar; P. J. Narayanan

Linear algebra algorithms are fundamental to many computing applications. Modern GPUs are suited for many general-purpose processing tasks and have emerged as inexpensive high-performance co-processors due to their tremendous computing power. In this paper, we present the implementation of singular value decomposition (SVD) of a dense matrix on the GPU using the CUDA programming model. SVD is implemented using the twin steps of bidiagonalization followed by diagonalization. It has not been implemented on the GPU before. Bidiagonalization is implemented using a series of Householder transformations, which map well to BLAS operations. Diagonalization is performed by applying the implicitly shifted QR algorithm. Our complete SVD implementation significantly outperforms the MATLAB and Intel® Math Kernel Library (MKL) LAPACK implementations on the CPU. We show a speedup of up to 60 over the MATLAB implementation and up to 8 over the Intel MKL implementation on an Intel Dual Core 2.66 GHz PC on NVIDIA GTX 280 for large matrices. We also give results for very large matrices on NVIDIA Tesla S1070.
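The first of the two steps named in the SVD abstract above, bidiagonalization by Householder transformations, can be sketched in plain Python. This is a minimal CPU illustration under stated assumptions, not the paper's BLAS-based GPU code: the function names `householder_vector` and `bidiagonalize` are hypothetical, the input is a list of rows with at least as many rows as columns, and the orthogonal factors are not accumulated. The diagonalization step (implicitly shifted QR) is not shown.

```python
import math


def householder_vector(x):
    """Unit vector v such that (I - 2 v v^T) x is a multiple of e_1, or None if x is zero."""
    norm = math.sqrt(sum(t * t for t in x))
    if norm == 0.0:
        return None
    v = list(x)
    v[0] += math.copysign(norm, x[0])  # sign choice avoids cancellation
    vnorm = math.sqrt(sum(t * t for t in v))
    return [t / vnorm for t in v]


def bidiagonalize(a):
    """Reduce an m x n matrix (m >= n) to upper bidiagonal form; returns a new matrix."""
    m, n = len(a), len(a[0])
    b = [[float(x) for x in row] for row in a]
    for k in range(n):
        # Left reflection: zero column k below the diagonal.
        v = householder_vector([b[i][k] for i in range(k, m)])
        if v is not None:
            for j in range(k, n):
                dot = sum(v[i - k] * b[i][j] for i in range(k, m))
                for i in range(k, m):
                    b[i][j] -= 2.0 * dot * v[i - k]
        # Right reflection: zero row k to the right of the superdiagonal.
        if k < n - 2:
            v = householder_vector(b[k][k + 1:])
            if v is not None:
                for i in range(k, m):
                    dot = sum(v[j - k - 1] * b[i][j] for j in range(k + 1, n))
                    for j in range(k + 1, n):
                        b[i][j] -= 2.0 * dot * v[j - k - 1]
    return b
```

Because every reflection is orthogonal, the bidiagonal result has the same singular values as the input. The inner loops are matrix-vector products and rank-one updates, which is why this step maps onto BLAS-style operations.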

IEEE International Conference on High Performance Computing, Data, and Analytics | 2009

A performance prediction model for the CUDA GPGPU platform

Kishore Kothapalli; Rishabh Mukherjee; M. Suhail Rehman; Suryakant Patidar; P. J. Narayanan; Kannan Srinathan

Graphics Processor Units are used for many general-purpose processing tasks due to the high compute power available on them. Regular, data-parallel algorithms map well to the SIMD architecture of current GPUs. Irregular algorithms on discrete structures like graphs are harder to map to them. Efficient data-mapping primitives can play a crucial role in mapping such algorithms onto the GPU. In this paper, we present a minimum spanning tree algorithm on Nvidia GPUs under CUDA, as a recursive formulation of Borůvka's approach for undirected graphs. We implement it using scalable primitives such as scan, segmented scan and split. The irregular steps of supervertex formation and recursive graph construction are mapped to primitives like split to categories involving vertex ids and edge weights. We obtain 30 to 50 times speedup over the CPU implementation on most graphs and 3 to 10 times speedup over our previous GPU implementation. We construct the minimum spanning tree on a 5 million node and 30 million edge graph in under 1 second on one quarter of the Tesla S1070 GPU.
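Borůvka's approach, on which the minimum spanning tree abstract above builds, can be sketched sequentially as follows. This is an illustrative CPU version that tracks components with a union-find structure. It does not reproduce the paper's recursive formulation or its scan, segmented scan and split primitives, and the function name `boruvka_mst` and the edge-list format are assumptions.

```python
def boruvka_mst(num_vertices, edges):
    """Return (total_weight, chosen_edges) for an undirected graph.

    edges is a list of (u, v, weight) tuples. For a disconnected graph the
    result is a minimum spanning forest.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    total = 0
    components = num_vertices
    while components > 1:
        # For every component, find the lightest edge leaving it.
        cheapest = {}
        for i, (u, v, w) in enumerate(edges):
            cu, cv = find(u), find(v)
            if cu == cv:
                continue
            for c in (cu, cv):
                best = cheapest.get(c)
                # Ties are broken by edge index so equal weights cannot form a cycle.
                if best is None or (w, i) < (edges[best][2], best):
                    cheapest[c] = i
        if not cheapest:
            break  # remaining components are not connected to each other

        # Merge components along the selected edges (supervertex formation).
        for i in set(cheapest.values()):
            u, v, w = edges[i]
            cu, cv = find(u), find(v)
            if cu != cv:
                parent[cu] = cv
                chosen.append((u, v, w))
                total += w
                components -= 1
    return total, chosen
```

Each round at least halves the number of components, and the per-component minimum search is independent across components, which is what makes the method attractive for data-parallel hardware.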

IEEE Transactions on Circuits and Systems for Video Technology | 2011

Person De-Identification in Videos

Prachi Agrawal; P. J. Narayanan

The significant growth in the computational power of modern Graphics Processing Units (GPUs), coupled with the advent of general-purpose programming environments like NVIDIA's CUDA, has seen GPUs emerge as a very popular parallel computing platform. Until recently, there has not been a performance model for GPGPUs. The absence of such a model makes it difficult to definitively assess the suitability of the GPU for solving a particular problem and is a significant impediment to the mainstream adoption of GPUs as a massively parallel (super)computing platform. In this paper, we present a performance prediction model for the CUDA GPGPU platform. This model encompasses the various facets of the GPU architecture like scheduling, memory hierarchy, and pipelining, among others. We also perform experiments that demonstrate the effects of various memory access strategies. The proposed model can be used to analyze pseudocode for a CUDA kernel to obtain a performance estimate, in a way that is similar to performing asymptotic analysis. We illustrate the usage of our model and its accuracy with three case studies: matrix multiplication, list ranking, and histogram generation.

IEEE Transactions on Visualization and Computer Graphics | 2010

Real-Time Ray Tracing of Implicit Surfaces on the GPU

Jag Mohan Singh; P. J. Narayanan

Advances in cameras and web technology have made it easy to capture and share large amounts of video data with a large number of people. A large number of cameras oversee public and semi-public spaces today. These raise concerns about the unintentional and unwarranted invasion of the privacy of individuals caught in the videos. To address these concerns, automated methods to de-identify individuals in these videos are necessary. De-identification does not aim at destroying all information involving the individuals. Its ideal goal is to obscure the identity of the actor without obscuring the action. This paper outlines the scenarios in which de-identification is required and the issues brought out by those. We also present an approach to de-identify individuals from videos. Our approach involves tracking and segmenting individuals in a conservative voxel space involving x, y, and time. A de-identification transformation is applied per frame using these voxels to obscure the identity. Face, silhouette, gait, and other characteristics need to be obscured, ideally. We show results of our scheme on a number of videos and for several variations of the transformations. We present the results of applying algorithmic identification on the transformed videos. We also present the results of a user study to evaluate how well humans can identify individuals from the transformed videos.