Dibyendu Nandy
University of Illinois at Chicago
Publication
Featured research published by Dibyendu Nandy.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 1998
Jezekiel Ben-Arie; Dibyendu Nandy
A novel method for representing 3D objects that unifies viewer and model centered object representations is presented. A unified 3D frequency-domain representation, called volumetric frequency representation (VFR), encapsulates both the spatial structure of the object and a continuum of its views in the same data structure. The frequency-domain image of an object viewed from any direction can be directly extracted employing an extension of the projection slice theorem, where each Fourier-transformed view is a planar slice of the volumetric frequency representation. The VFR is employed for pose-invariant recognition of complex objects, such as faces. The recognition and pose estimation is based on an efficient matching algorithm in a four-dimensional Fourier space. Experimental examples of pose estimation and recognition of faces in various poses are also presented.
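The slice property described above can be checked numerically in its simplest, axis-aligned form. The following is a minimal sketch of the classical projection slice theorem, not the authors' extension of it or their VFR implementation. It assumes a small synthetic 4×4×4 volume and naive DFT sums, and the helper names (`dft3_at`, `dft2_at`, `project_along_z`, `max_slice_error`) are hypothetical. It confirms that the 2-D DFT of the view obtained by summing along z matches the kz = 0 plane of the 3-D DFT.

```python
import cmath
import random


def dft3_at(vol, kx, ky, kz):
    """Naive 3-D DFT of a cubic volume, evaluated at one frequency."""
    n = len(vol)
    return sum(
        vol[x][y][z] * cmath.exp(-2j * cmath.pi * (kx * x + ky * y + kz * z) / n)
        for x in range(n) for y in range(n) for z in range(n)
    )


def dft2_at(img, kx, ky):
    """Naive 2-D DFT of a square image, evaluated at one frequency."""
    n = len(img)
    return sum(
        img[x][y] * cmath.exp(-2j * cmath.pi * (kx * x + ky * y) / n)
        for x in range(n) for y in range(n)
    )


def project_along_z(vol):
    """Orthographic view of the volume: sum the samples along the z axis."""
    return [[sum(ray) for ray in plane] for plane in vol]


def max_slice_error(vol):
    """Largest gap between the view's 2-D DFT and the kz = 0 slice of the 3-D DFT."""
    n = len(vol)
    view = project_along_z(vol)
    return max(
        abs(dft2_at(view, kx, ky) - dft3_at(vol, kx, ky, 0))
        for kx in range(n) for ky in range(n)
    )


random.seed(0)  # synthetic data, seeded for repeatability only
N = 4
volume = [[[random.random() for _ in range(N)] for _ in range(N)] for _ in range(N)]
slice_error = max_slice_error(volume)
print(f"max |view spectrum - central slice| = {slice_error:.2e}")
```

Views from oblique directions correspond to tilted planes through the origin of the 3-D spectrum, which would require interpolation between frequency samples and is left out of this sketch.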
international conference on image processing | 1998
Jezekiel Ben-Arie; Dibyendu Nandy
A framework for the reconstruction of smooth surface shapes from shading images is presented. The method is based on using a backpropagation based neural network for learning brightness patterns and associating them with range data. The network is designed to reconstruct surface range from localized intensity patches of 7×7 pixels. Two methods for training the network are investigated, one based on a novel weight diffusion process which enforces a local smoothness constraint and the other using the eigen coefficients of the input and output patterns which make the training computationally efficient. An elegant and simple method for integrating reconstructed surface patches by minimizing the sum squared error in overlapped areas is derived. Results are shown for reconstruction of simple shapes like cylinders, hyperboloids and paraboloids as well as complex shapes like facial structure from intensity images.
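One way to picture the patch-integration step is as a least-squares alignment of overlapping height profiles. The sketch below is an illustration under stated assumptions, not the derivation from the paper: it supposes each patch is recovered only up to an unknown additive depth offset, it works on 1-D profiles rather than 2-D patches, and the function names are hypothetical. Under that assumption, the offset that minimizes the sum squared error in the overlap is the mean difference between the two patches there.

```python
def best_offset(ref_overlap, new_overlap):
    """Offset c minimizing sum((r - (n + c))**2) over the overlap: the mean of r - n."""
    diffs = [r - n for r, n in zip(ref_overlap, new_overlap)]
    return sum(diffs) / len(diffs)


def merge_profiles(ref, new, overlap):
    """Merge two 1-D height profiles whose trailing/leading `overlap` samples coincide."""
    c = best_offset(ref[-overlap:], new[:overlap])
    shifted = [v + c for v in new]
    # Average the two estimates where the patches overlap.
    blended = [(r + s) / 2 for r, s in zip(ref[-overlap:], shifted[:overlap])]
    return ref[:-overlap] + blended + shifted[overlap:]


# Synthetic parabolic profile, split into two overlapping patches.
truth = [x * x / 10 for x in range(10)]
left = truth[:6]                       # samples 0..5
right = [h - 3.0 for h in truth[4:]]   # samples 4..9, shifted by an "unknown" offset
merged = merge_profiles(left, right, overlap=2)
max_merge_error = max(abs(m - t) for m, t in zip(merged, truth))
print(f"max error after merging = {max_merge_error:.2e}")
```

Setting the derivative of the overlap error with respect to `c` to zero gives the mean-difference formula used in `best_offset`; a fuller treatment would also handle noise and 2-D overlaps.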
IEEE Transactions on Image Processing | 2001
Dibyendu Nandy; Jezekiel Ben-Arie
In this paper, we develop a novel framework for robust recovery of three-dimensional (3-D) surfaces of faces from single images. The underlying principle is shape from recognition, i.e., the idea that pre-recognizing face parts can constrain the space of possible solutions to the image irradiance equation, thus allowing robust recovery of the 3-D structure of a specific part. Parts of faces like nose, lips and eyes are recognized and localized using robust expansion matching filter templates under varying pose and illumination. Specialized backpropagation based neural networks are then employed to recover the 3-D shape of particular face parts. Representation using principal components allows efficient encoding of classes of objects such as nose, lips, etc. The specialized networks are designed and trained to map the principal component coefficients of the part images to another set of principal component coefficients that represent the corresponding 3-D surface shapes. To achieve robustness to viewing conditions, the network is trained with a wide range of illumination and viewing directions. A method for merging recovered 3-D surface regions by minimizing the sum squared error in overlapping areas is also derived. Quantitative analysis of the reconstruction of the surface parts in varying illumination and pose shows relatively small errors, indicating that the method is robust and accurate. Several examples showing recovery of the complete face also illustrate the efficacy of the approach.
IEEE Transactions on Image Processing | 1999
Dibyendu Nandy; Jezekiel Ben-Arie
A novel generalized feature extraction method based on the expansion matching (EXM) method and on the Karhunen-Loeve transform (KLT) is presented. The method provides an efficient way to locate complex features of interest like corners and junctions with a reduced number of filtering operations. The EXM method is used to design optimal detectors for a set of model elementary features. The KL representation of these model EXM detectors is used to filter the image and detect candidate interest points from the energy peaks of the eigen coefficients. The KL coefficients at these candidate points are then used to efficiently reconstruct the response and differentiate real junctions and corners from arbitrary features in the image. The method is robust to additive noise and is able to successfully extract, classify, and find the myriad compositions of corner and junction features formed by combinations of two or more edges or lines. This method differs from previous works in several aspects. First, it treats the features not as distinct entities, but as combinations of elementary features. Second, it employs an optimal set of elementary feature detectors based on the EXM approach. Third, the method incorporates a significant reduction in computational complexity by representing a large set of EXM filters by a relatively small number of eigen filters derived by the KL transform of the basic EXM filter set. This is a novel application of the KL transform, which is usually employed to represent signals and not impulse responses as in our present work.
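The computational saving from eigen filters follows from linearity: if each filter in a bank is a weighted sum of a few eigen filters, its response is the same weighted sum of the eigen-filter responses. The sketch below illustrates this on a toy 1-D bank and is not the EXM design procedure. The steered stand-in kernels, the uncentered correlation matrix, the power-iteration eigen solver, and all function names are assumptions made for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_eigvecs(mat, k, iters=200):
    """Leading k eigenvectors of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(mat)
    mat = [row[:] for row in mat]  # work on a copy
    vecs = []
    for _ in range(k):
        v = [1.0 / (i + 1) for i in range(n)]  # arbitrary non-zero start vector
        for _ in range(iters):
            w = [dot(row, v) for row in mat]
            norm = math.sqrt(dot(w, w))
            v = [x / norm for x in w]
        lam = dot(v, [dot(row, v) for row in mat])  # Rayleigh quotient
        vecs.append(v)
        # Deflate: remove the component just found before seeking the next one.
        mat = [[mat[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return vecs


# Toy bank: 5 "steered" filters built from two stand-in kernels (not EXM filters).
g_odd = [-2.0, -1.0, 0.0, 1.0, 2.0]
g_even = [1.0, -0.5, -1.0, -0.5, 1.0]
angles = [0.0, 0.3, 0.6, 0.9, 1.2]
bank = [[math.cos(t) * a + math.sin(t) * b for a, b in zip(g_odd, g_even)] for t in angles]

# Uncentered correlation matrix of the bank, then its two leading eigen filters.
size = len(g_odd)
corr = [[sum(f[i] * f[j] for f in bank) for j in range(size)] for i in range(size)]
eigen_filters = top_eigvecs(corr, k=2)

# Filter one signal window with the 2 eigen filters only, then rebuild all 5 responses.
window = [0.2, 1.0, 0.4, -0.3, 0.7]
eigen_responses = [dot(e, window) for e in eigen_filters]
rebuilt = [sum(dot(f, e) * r for e, r in zip(eigen_filters, eigen_responses)) for f in bank]
direct = [dot(f, window) for f in bank]
max_response_error = max(abs(a - b) for a, b in zip(rebuilt, direct))
print(f"max response error = {max_response_error:.2e}")
```

The toy bank spans exactly two dimensions, so two eigen filters reproduce every response; a real filter set would generally be approximated, with the number of retained eigen filters trading accuracy against cost.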
computer vision and pattern recognition | 1999
Dibyendu Nandy; Jezekiel Ben-Arie
In this paper a novel framework for the recovery of 3D surfaces of faces from single images is developed. The underlying principle is shape from recognition, i.e. the idea that pre-recognizing face parts can constrain the space of possible solutions to the image irradiance equation, thus allowing robust recovery of the 3D structure of a specific part. Shape recovery of the recognized part is based on specialized backpropagation based neural networks, each of which is employed in the recovery of a particular face part. Representation using principal components allows efficient encoding of classes of objects such as nose, lips, etc. The specialized networks are designed and trained to map the principal component coefficients of the shading images to another set of principal component coefficients that represent the corresponding 3D surface shapes. A method for integrating recovered 3D surface regions by minimizing the sum squared error in overlapping areas is also derived. Quantitative analysis of the reconstruction of the surface parts shows relatively small errors, indicating that the method is robust and accurate. The recovery of a complete face is performed by minimal squared error merging of face parts.
international conference on image processing | 1998
Dibyendu Nandy; Jezekiel Ben-Arie
A novel method for extracting parametric junction and corner features in images is presented. By treating each complex feature as a combination of elementary line and edge features, the method provides an efficient way to locate features of interest with a reduced number of filtering operations. The expansion matching (EXM) method is used to design optimal detectors for a set of elementary model shape features. Next, the principal components of the Karhunen-Loeve (KL) representation of these model EXM detectors are used to filter the image and extract candidate interest points derived from the energy peaks of the eigen coefficients. The KL coefficients at these candidate points are then used to efficiently reconstruct the response and differentiate real junctions and corners from arbitrary features in the image. The method is robust to additive noise and is able to successfully extract, classify and find the myriad compositions of corner and junction features formed by combinations of two or more elementary edges or lines.
international conference on image processing | 1997
Dibyendu Nandy; Jezekiel Ben-Arie
A novel method unifying viewer and model centered approaches for representing structurally complex 3-D objects like human faces is presented. The unified 3-D frequency-domain representation (called volumetric/iconic spectral signatures-V/ISS) encapsulates both the spatial structure of the object and a continuum of its views. The projection slice theorem is used to directly extract the frequency-domain image of an object as viewed from any direction. Each such Fourier-transformed view is a planar slice of the volumetric frequency representation. The V/ISS representation is employed for pose-invariant recognition of complex objects such as faces. The recognition and pose estimation is based on an efficient matching algorithm in a four dimensional Fourier space. Experimental examples of pose estimation and recognition of faces are presented.
computer vision and pattern recognition | 1997
Jezekiel Ben-Arie; Dibyendu Nandy
A novel method for representing 3-D objects that unifies viewer and model centered object representations is presented. A unified 3-D frequency-domain representation (called volumetric/iconic spectral signatures-V/ISS) encapsulates both the spatial structure of the object and a continuum of its views in the same data structure. The frequency-domain image of an object viewed from any direction can be directly extracted employing an extension of the projection slice theorem, where each Fourier-transformed view is a planar slice of the volumetric frequency representation. The V/ISS representation can be employed for pose-invariant recognition of complex objects such as faces. The recognition and pose estimation is based on an efficient matching algorithm in a four dimensional Fourier space. Experimental examples of pose estimation and recognition of faces are also presented.
Annals of Biomedical Engineering | 1996
Dibyendu Nandy; Jezekiel Ben-Arie
We present in this paper a connectionist model that extracts interaural intensity differences (IID) from head-related transfer functions (HRTF) in the form of spectral cues to localize broadband high-frequency auditory stimuli, in both azimuth and elevation. A novel discriminative matching measure (DMM) is defined and optimized to characterize matching this IID spectrum. The optimal DMM approach and a novel backpropagation-based fuzzy model of localization are shown to be capable of localizing sources in azimuth, using only spectral IID cues. The fuzzy neural network model is extended to include localization in elevation. The use of training data with additive noise provides robustness to input errors. Outputs are modeled as two-dimensional Gaussians that act as membership functions for the fuzzy sets of sound locations. Error back-propagation is used to train the network to correlate input patterns and the desired output patterns. The fuzzy outputs are used to estimate the location of the source by detecting Gaussians using the max-energy paradigm. The proposed model shows that HRTF-based spectral IID patterns can provide sufficient information for extracting localization cues using a connectionist paradigm. Successful recognition in the presence of additive noise in the inputs indicates that the computational framework of this model is robust to errors made in estimating the IID patterns. The localization errors for such noisy patterns at various elevations and azimuths are compared and found to be within limits of localization blurs observed in humans.
midwest symposium on circuits and systems | 1993
Dibyendu Nandy; K.R. Rao; Jezekiel Ben-Arie
In this paper we consider multiple template matching techniques for auditory localization. In our approach, auditory localization is based on extracting localization cues from the ratios of the incoming sound spectra at the two ears. Localization cues can be extracted from such ratios by matching them with stored templates of ratios of head related transfer functions. Here we compare the performance of several matching techniques in their ability to accurately extract localization cues from such ratios. We introduce a new Discriminative Matching Measure (DMM), a similarity measure to be optimized, and formulate a novel linear matching scheme which optimizes this measure. The DMM is similar to our Discriminative Signal-to-Noise Ratio measure. We compare the performance of several linear techniques, namely correlation, normalized correlation, and our novel optimal matching method, and also a non-linear method based on the backpropagation algorithm.
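Of the linear techniques compared, normalized correlation is the simplest to sketch. The example below is illustrative only: the template values are synthetic numbers rather than measured ratios of head related transfer functions, the azimuth labels and function names are hypothetical, and mean removal before normalization is an assumed convention. The DMM-optimal matcher is not reproduced because its definition is not given here.

```python
import math


def normalized_correlation(a, b):
    """Cosine of the angle between two mean-removed spectra, in [-1, 1]."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    return sum(x * y for x, y in zip(da, db)) / denom


def localize(observed, templates):
    """Return the label of the template that best matches the observed ratio spectrum."""
    return max(templates, key=lambda label: normalized_correlation(observed, templates[label]))


# Synthetic interaural ratio spectra (dB per frequency band), keyed by azimuth in degrees.
templates = {
    -45: [-6.0, -4.0, -7.5, -3.0, -5.0],
    0: [0.5, -0.5, 0.0, 0.4, -0.3],
    45: [6.0, 4.0, 7.5, 3.0, 5.0],
}

# Observation: the +45 degree template, attenuated and slightly perturbed.
perturbation = [0.2, -0.1, 0.3, 0.0, -0.2]
observed = [0.8 * v + d for v, d in zip(templates[45], perturbation)]
estimate = localize(observed, templates)
print("estimated azimuth (degrees):", estimate)
```

Because the measure is normalized, an overall gain change in the observed spectrum does not alter the match, which is one reason it is a common baseline against plain correlation.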