Narendra Ahuja
University of Illinois at Urbana–Champaign
Publication
Featured research published by Narendra Ahuja.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2002
Ming-Hsuan Yang; David J. Kriegman; Narendra Ahuja
Images containing faces are essential to intelligent vision-based human-computer interaction, and research efforts in face processing include face recognition, face tracking, pose estimation and expression recognition. However, many reported methods assume that the faces in an image or an image sequence have been identified and localized. To build fully automated systems that analyze the information contained in face images, robust and efficient face detection algorithms are required. Given a single image, the goal of face detection is to identify all image regions which contain a face, regardless of its 3D position, orientation and lighting conditions. Such a problem is challenging because faces are non-rigid and have a high degree of variability in size, shape, color and texture. Numerous techniques have been developed to detect faces in a single image, and the purpose of this paper is to categorize and evaluate these algorithms. We also discuss relevant issues such as data collection, evaluation metrics and benchmarking. After analyzing these algorithms and identifying their limitations, we conclude with several promising directions for future research.
ACM Computing Surveys | 1992
Yong Koo Hwang; Narendra Ahuja
Motion planning is one of the most important areas of robotics research. The complexity of the motion-planning problem has hindered the development of practical algorithms. This paper surveys the work on gross-motion planning, including motion planners for point robots, rigid robots, and manipulators in stationary, time-varying, constrained, and movable-object environments. The general issues in motion planning are explained. Recent approaches and their performances are briefly described, and possible future research directions are discussed.
International Conference on Robotics and Automation | 1992
Yong Koo Hwang; Narendra Ahuja
A path-planning algorithm for the classical mover's problem in three dimensions using a potential field representation of obstacles is presented. A potential function similar to the electrostatic potential is assigned to each obstacle, and the topological structure of the free space is derived in the form of minimum potential valleys. Path planning is done at two levels. First, a global planner selects a robot's path from the minimum potential valleys and its orientations along the path that minimize a heuristic estimate of the path length and the chance of collision. Then, a local planner modifies the path and orientations to derive the final collision-free path and orientations. If the local planner fails, a new path and orientations are selected by the global planner and subsequently examined by the local planner. This process is continued until a solution is found or there are no paths left to be examined. The algorithm solves a much wider class of problems than other heuristic algorithms and at the same time runs much faster than exact algorithms (typically 5 to 30 min on a Sun 3/260).
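As a rough illustration of the potential-field idea, the sketch below assigns a 1/distance potential to two point obstacles and picks the lowest-potential sample on the line between them. The point obstacles, the `potential` and `valley_along_line` helpers, and the sampling grid are all assumptions made for illustration; the paper's actual potential function, its three-dimensional valley extraction, and its two-level planner are not reproduced here.

```python
import math

def potential(point, obstacles):
    """Sum of 1/distance terms, loosely analogous to an electrostatic potential."""
    return sum(1.0 / math.hypot(point[0] - ox, point[1] - oy)
               for ox, oy in obstacles)

def valley_along_line(samples, obstacles):
    """Return the sample with the lowest potential (a crude 'valley' point)."""
    return min(samples, key=lambda p: potential(p, obstacles))

# Hypothetical scene: two point obstacles on the x-axis.
obstacles = [(0.0, 0.0), (4.0, 0.0)]
# Candidate positions strictly between the obstacles, spaced 0.25 apart.
samples = [(0.5 + 0.25 * i, 0.0) for i in range(13)]
valley = valley_along_line(samples, obstacles)
```

In this symmetric toy scene the lowest-potential sample lies midway between the obstacles, which is the intuition behind routing a path along potential minima.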
Computer Vision and Pattern Recognition | 2012
Tianzhu Zhang; Bernard Ghanem; Si Liu; Narendra Ahuja
In this paper, we formulate object tracking in a particle filter framework as a multi-task sparse learning problem, which we denote as Multi-Task Tracking (MTT). Since we model particles as linear combinations of dictionary templates that are updated dynamically, learning the representation of each particle is considered a single task in MTT. By employing popular sparsity-inducing ℓ_{p,q} mixed norms (p ∈ {2, ∞} and q = 1), we regularize the representation problem to enforce joint sparsity and learn the particle representations together. As compared to previous methods that handle particles independently, our results demonstrate that mining the interdependencies between particles improves tracking performance and overall computational complexity. Interestingly, we show that the popular L1 tracker [15] is a special case of our MTT formulation (denoted as the L11 tracker) when p = q = 1. The learning problem can be efficiently solved using an Accelerated Proximal Gradient (APG) method that yields a sequence of closed form updates. As such, MTT is computationally attractive. We test our proposed approach on challenging sequences involving heavy occlusion, drastic illumination changes, and large pose variations. Experimental results show that MTT methods consistently outperform state-of-the-art trackers.
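To make the joint-sparsity idea concrete, here is a minimal sketch of the ℓ_{2,1} mixed norm (p = 2, q = 1) and its standard proximal operator, which shrinks each row of a coefficient matrix as a group. The row-per-template layout, the function names, and the sample numbers are assumptions for illustration; this is the textbook closed-form shrinkage step, not the paper's full APG solver.

```python
import math

def mixed_norm_2_1(rows):
    """l_{2,1} norm: sum over rows of each row's Euclidean norm."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in rows)

def prox_l21(rows, lam):
    """Row-wise group soft-thresholding, the proximal operator of lam * l_{2,1}.

    Each row is scaled by max(0, 1 - lam / ||row||_2), so rows with a small
    norm are zeroed out together (joint sparsity across columns).
    """
    out = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0.0 else 0.0
        out.append([scale * v for v in row])
    return out

# Hypothetical coefficients: rows = dictionary templates, columns = particles.
coeffs = [[3.0, 4.0], [0.1, 0.1]]
shrunk = prox_l21(coeffs, lam=1.0)
```

The first row (norm 5) is scaled down but kept, while the weak second row is removed for every particle at once, which is what distinguishes the mixed norm from applying an ℓ1 penalty to each particle independently.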
IEEE Transactions on Pattern Analysis and Machine Intelligence | 1989
Juyang Weng; Thomas S. Huang; Narendra Ahuja
This paper deals with estimating motion parameters and the structure of the scene from point (or feature) correspondences between two perspective views. An algorithm is presented that gives a closed-form solution for motion parameters and the structure of the scene. The algorithm utilizes redundancy in the data to obtain more reliable estimates in the presence of noise. An approach is introduced to estimating the errors in the motion parameters computed by the algorithm. Specifically, the standard deviation of the error is estimated in terms of the variance of the errors in the image coordinates of the corresponding points. The estimated errors indicate the reliability of the solution as well as any degeneracy or near degeneracy that causes the failure of the motion estimation algorithm. The presented approach to error estimation applies to a wide variety of problems that involve least-squares optimization or pseudoinverse. Finally, the relationships between errors and the parameters of motion and imaging system are analyzed. The results of the analysis show, among other things, that the errors are very sensitive to the translation direction and the range of the field of view. Simulations are conducted to demonstrate the performance of the algorithms and error estimation as well as the relationships between the errors and the parameters of motion and imaging systems. The algorithms are tested on images of real-world scenes with point correspondences computed automatically.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 1989
William Hoff; Narendra Ahuja
An approach is described that integrates the processes of feature matching, contour detection, and surface interpolation to determine the three-dimensional distance, or depth, of objects from a stereo pair of images. Integration is necessary to ensure that the detected surfaces are smooth. Surface interpolation takes into account detected occluding and ridge contours in the scene; interpolation is performed within regions enclosed by these contours. Planar and quadratic patches are used as local models of the surface. Occluded regions in the image are identified, and are not used for matching and interpolation. A coarse-to-fine algorithm is presented that generates a multiresolution hierarchy of surface maps, one at each level of resolution. Experimental results are given for a variety of stereo images.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2002
Ming-Hsuan Yang; Narendra Ahuja; Mark Tabb
We present an algorithm for extracting and classifying two-dimensional motion in an image sequence based on motion trajectories. First, a multiscale segmentation is performed to generate homogeneous regions in each frame. Regions between consecutive frames are then matched to obtain two-view correspondences. Affine transformations are computed from each pair of corresponding regions to define pixel matches. Pixel matches over consecutive image pairs are concatenated to obtain pixel-level motion trajectories across the image sequence. Motion patterns are learned from the extracted trajectories using a time-delay neural network. We apply the proposed method to recognize 40 hand gestures of American Sign Language. Experimental results show that motion patterns of hand gestures can be extracted and recognized accurately using motion trajectories.
IEEE Transactions on Circuits and Systems for Video Technology | 2001
Rakesh Dugad; Narendra Ahuja
Given a video frame in terms of its 8×8 block-DCT coefficients, we wish to obtain a downsized or upsized version of this frame also in terms of 8×8 block-DCT coefficients. The DCT, being a linear unitary transform, is distributive over matrix multiplication. This fact has been used for downsampling video frames in the DCT domain. However, this involves matrix multiplication with the DCT of the downsampling matrix. This multiplication can be costly enough to trade off any gains obtained by operating directly in the compressed domain. We propose an algorithm for downsampling and also upsampling in the compressed domain which is computationally much faster, produces visually sharper images, and gives significant improvements in PSNR (typically 4 dB better compared to bilinear interpolation). Specifically, the downsampling method requires 1.25 multiplications and 1.25 additions per pixel of original image compared to 4.00 multiplications and 4.75 additions required by the method of Chang et al. (1995). Moreover, the downsampling and upsampling schemes combined together preserve all the low-frequency DCT coefficients of the original image. This implies tremendous savings for coding the difference between the original frame (unsampled image) and its prediction (the upsampled image). This is desirable for many applications based on scalable encoding of video. The method presented can also be used with transforms other than DCT, such as Hadamard or Fourier.
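The low-frequency preservation property can be illustrated with a one-dimensional sketch: two adjacent 8-coefficient blocks are halved by keeping their four lowest DCT coefficients, and upsampling restores those coefficients exactly. This assumes an orthonormal DCT-II, a 1D signal instead of 8×8 image blocks, and a 1/√2 rescaling between block sizes; the helper names are hypothetical, and the paper's fast implementation and its operation counts are not reproduced.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    return [math.sqrt((1.0 if k == 0 else 2.0) / n)
            * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
            for k in range(n)]

def idct(coeffs):
    """Inverse of the orthonormal DCT-II above."""
    n = len(coeffs)
    return [sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * coeffs[k]
                * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k in range(n))
            for i in range(n)]

def downsample_pair(block_a, block_b):
    """Two 8-coefficient DCT blocks -> one 8-coefficient DCT block (2:1)."""
    # Keep the 4 lowest frequencies, rescale for the shorter transform length.
    half_a = idct([c / math.sqrt(2.0) for c in block_a[:4]])
    half_b = idct([c / math.sqrt(2.0) for c in block_b[:4]])
    return dct(half_a + half_b)

def upsample_to_pair(block):
    """One 8-coefficient DCT block -> two 8-coefficient DCT blocks (1:2)."""
    samples = idct(block)
    result = []
    for half in (samples[:4], samples[4:]):
        low = [c * math.sqrt(2.0) for c in dct(half)]
        result.append(low + [0.0] * 4)  # high frequencies are set to zero
    return result[0], result[1]

# Hypothetical sample data for two neighbouring blocks.
block_a = dct([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
block_b = dct([5.0, 3.0, 8.0, 1.0, 9.0, 2.0, 7.0, 4.0])
merged = downsample_pair(block_a, block_b)
up_a, up_b = upsample_to_pair(merged)
```

After the round trip, the first four coefficients of `up_a` and `up_b` match those of `block_a` and `block_b`, so only the discarded high-frequency coefficients remain to be coded as a residual.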
International Conference on Image Processing | 1998
Ming-Hsuan Yang; Narendra Ahuja
We propose a new method to detect human faces in color images. A human skin color model is built to capture the chromatic properties based on multivariate statistical analysis. Given a color image, multiscale segmentation is used to generate homogeneous regions at multiple scales. From the coarsest to the finest scale, regions of skin color are merged until the shape is approximately elliptic. Postprocessing is performed to determine whether a merged region contains a human face and, if necessary, to include non-skin-colored facial features such as the eyes and mouth. Experimental results show that human faces in color images can be detected regardless of size, orientation and viewpoint.
Computer Vision and Pattern Recognition | 2009
Qingxiong Yang; Kar-Han Tan; Narendra Ahuja
We propose a new bilateral filtering algorithm with computational complexity invariant to filter kernel size, so-called O(1) or constant time in the literature. By showing that a bilateral filter can be decomposed into a number of constant time spatial filters, our method yields a new class of constant time bilateral filters that can have arbitrary spatial and arbitrary range kernels. In contrast, the currently available constant time algorithm requires the use of specific spatial or specific range kernels. Also, our algorithm lends itself to a parallel implementation leading to the first real-time O(1) algorithm that we know of. Meanwhile, our algorithm yields higher quality results since we are effectively quantizing the range function instead of quantizing both the range function and the input image. Empirical experiments show that our algorithm not only gives higher PSNR, but is about 10× faster than the state-of-the-art. It also has a small memory footprint, needing only 2% of the memory required by the state-of-the-art to obtain the same quality as the exact filter on 8-bit images. We also show that our algorithm can be easily extended for O(1) median filtering. Our bilateral filtering algorithm was tested in a number of applications, including HD video conferencing, video abstraction, highlight removal, and multi-focus imaging.
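The decomposition described above can be sketched in one dimension: for each of a few intensity levels, a range-weighted copy of the signal is smoothed with a spatial filter whose per-sample cost does not depend on the window size, and each output sample is interpolated between the two nearest levels. The box spatial kernel (computed with prefix sums), the Gaussian range kernel, the linear interpolation, and all function names are assumptions made for this sketch; it is not the paper's implementation.

```python
import math

def box_sum(values, radius):
    """Windowed sums via prefix sums; cost per sample is independent of radius."""
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
    n = len(values)
    return [prefix[min(n, i + radius + 1)] - prefix[max(0, i - radius)]
            for i in range(n)]

def range_weight(diff, sigma_r):
    """Gaussian range kernel (one possible choice)."""
    return math.exp(-(diff * diff) / (2.0 * sigma_r * sigma_r))

def bilateral_by_levels(signal, radius, sigma_r, levels):
    """Approximate bilateral filter; `levels` must be sorted and span the signal."""
    components = []
    for k in levels:
        w = [range_weight(k - v, sigma_r) for v in signal]
        num = box_sum([wi * v for wi, v in zip(w, signal)], radius)
        den = box_sum(w, radius)
        components.append([a / b if b > 0.0 else 0.0 for a, b in zip(num, den)])
    out = []
    for i, v in enumerate(signal):
        hi = next(idx for idx, k in enumerate(levels) if k >= v)
        lo = max(hi - 1, 0)
        if levels[hi] == v or hi == lo:
            out.append(components[hi][i])
        else:
            t = (v - levels[lo]) / (levels[hi] - levels[lo])
            out.append((1.0 - t) * components[lo][i] + t * components[hi][i])
    return out

def bilateral_brute(signal, radius, sigma_r):
    """Direct reference with the same box spatial kernel."""
    out = []
    for i, v in enumerate(signal):
        window = signal[max(0, i - radius):i + radius + 1]
        weights = [range_weight(u - v, sigma_r) for u in window]
        out.append(sum(w * u for w, u in zip(weights, window)) / sum(weights))
    return out

# Hypothetical 1D signal with a sharp edge.
signal = [10.0, 12.0, 11.0, 200.0, 205.0, 198.0, 10.0]
exact_levels = sorted(set(signal))
fast = bilateral_by_levels(signal, radius=2, sigma_r=50.0, levels=exact_levels)
slow = bilateral_brute(signal, radius=2, sigma_r=50.0)
```

When every intensity in the signal is used as a level, the decomposed filter agrees with the direct one; using fewer levels trades accuracy for fewer spatial filtering passes.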