Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Takeo Kanade is active.

Publication


Featured research published by Takeo Kanade.


International Journal of Computer Vision | 1992

Shape and motion from image streams under orthography: a factorization method

Carlo Tomasi; Takeo Kanade

Inferring scene geometry and camera motion from a stream of images is possible in principle, but is an ill-conditioned problem when the objects are distant with respect to their size. We have developed a factorization method that can overcome this difficulty by recovering shape and motion under orthography without computing depth as an intermediate step. An image stream can be represented by the 2F×P measurement matrix of the image coordinates of P points tracked through F frames. We show that under orthographic projection this matrix is of rank 3. Based on this observation, the factorization method uses the singular-value decomposition technique to factor the measurement matrix into two matrices which represent object shape and camera rotation respectively. Two of the three translation components are computed in a preprocessing stage. The method can also handle and obtain a full solution from a partially filled-in measurement matrix that may result from occlusions or tracking failures. The method gives accurate results, and does not introduce smoothing in either shape or motion. We demonstrate this with a series of experiments on laboratory and outdoor image streams, with and without occlusions.
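
The core computation is compact enough to sketch. Below is a minimal numpy illustration of the rank-3 SVD factorization described in the abstract; it assumes the measurement matrix has already been registered (per-frame translations subtracted) and omits the metric upgrade that enforces orthonormal camera axes, so the factors are recovered only up to an affine ambiguity.

```python
import numpy as np

def factor_shape_and_motion(W):
    """Rank-3 factorization of a registered 2F x P measurement matrix W.

    Sketch of the SVD step: with translations already subtracted,
    W is approximately rank 3 under orthography, so its top three
    singular components split into motion and shape factors
    (up to an affine ambiguity; the metric upgrade is omitted).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U3, s3, Vt3 = U[:, :3], s[:3], Vt[:3, :]
    M = U3 * np.sqrt(s3)               # motion: 2F x 3 camera rows
    S = np.sqrt(s3)[:, None] * Vt3     # shape: 3 x P point coordinates
    return M, S
```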


IEEE International Conference on Automatic Face and Gesture Recognition | 2000

Comprehensive database for facial expression analysis

Takeo Kanade; Jeffrey F. Cohn; Yingli Tian

Within the past decade, significant effort has occurred in developing methods of facial expression analysis. Because most investigators have used relatively limited data sets, the generalizability of these various methods remains unknown. We describe the problem space for facial expression analysis, which includes level of description, transitions among expressions, eliciting conditions, reliability and validity of training and test data, individual differences in subjects, head orientation and scene complexity, image characteristics, and relation to non-verbal behavior. We then present the CMU-Pittsburgh AU-Coded Face Expression Image Database, which currently includes 2105 digitized image sequences from 182 adult subjects of varying ethnicity, performing multiple tokens of most primary FACS action units. This database is the most comprehensive testbed to date for comparative studies of facial expression analysis.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2002

Limits on super-resolution and how to break them

Simon Baker; Takeo Kanade

Nearly all super-resolution algorithms are based on the fundamental constraints that the super-resolution image, when appropriately warped and down-sampled to model the image formation process, should generate the low-resolution input images. (These reconstruction constraints are normally combined with some form of smoothness prior to regularize their solution.) We derive a sequence of analytical results which show that the reconstruction constraints provide less and less useful information as the magnification factor increases. We also validate these results empirically and show that, for large enough magnification factors, any smoothness prior leads to overly smooth results with very little high-frequency content. Next, we propose a super-resolution algorithm that uses a different kind of constraint in addition to the reconstruction constraints. The algorithm attempts to recognize local features in the low-resolution images and then enhances their resolution in an appropriate manner. We call such a super-resolution algorithm a hallucination or recogstruction algorithm. We tried our hallucination algorithm on two different data sets, frontal images of faces and printed Roman text. We obtained significantly better results than existing reconstruction-based algorithms, both qualitatively and in terms of RMS pixel error.
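
To make the reconstruction constraints concrete, the sketch below models image formation with the simplest possible assumptions (identity warp, box blur, integer down-sampling); it is not the paper's exact formation model. Each low-resolution pixel supplies one linear equation on the high-resolution unknowns, so at magnification M there are only 1/M² equations per unknown, which is the intuition behind the result that the constraints become less informative as M grows.

```python
import numpy as np

def downsample(x, factor):
    """Toy image-formation model: box blur plus integer subsampling.

    Each output pixel is the mean of a factor x factor block, i.e. one
    linear constraint on the high-resolution image x. A super-resolved
    estimate must reproduce the observed low-resolution images under
    this map (the reconstruction constraints).
    """
    h, w = x.shape                      # assumes h, w divisible by factor
    return x.reshape(h // factor, factor,
                     w // factor, factor).mean(axis=(1, 3))
```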


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2001

Recognizing action units for facial expression analysis

Yingli Tian; Takeo Kanade; Jeffrey F. Cohn

Most automatic expression analysis systems attempt to recognize a small set of prototypic expressions, such as happiness, anger, surprise, and fear. Such prototypic expressions, however, occur rather infrequently. Human emotions and intentions are more often communicated by changes in one or a few discrete facial features. In this paper, we develop an Automatic Face Analysis (AFA) system to analyze facial expressions based on both permanent facial features (brows, eyes, mouth) and transient facial features (deepening of facial furrows) in a nearly frontal-view face image sequence. The AFA system recognizes fine-grained changes in facial expression into action units (AUs) of the Facial Action Coding System (FACS), instead of a few prototypic expressions. Multistate face and facial component models are proposed for tracking and modeling the various facial features, including lips, eyes, brows, cheeks, and furrows. During tracking, detailed parametric descriptions of the facial features are extracted. With these parameters as the inputs, a group of action units (neutral expression, six upper face AUs and 10 lower face AUs) are recognized whether they occur alone or in combinations. The system has achieved average recognition rates of 96.4 percent (95.4 percent if neutral expressions are excluded) for upper face AUs and 96.7 percent (95.6 percent with neutral expressions excluded) for lower face AUs. The generalizability of the system has been tested by using independent image databases collected and FACS-coded for ground-truth by different research teams.
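
The recognition stage reduces to mapping a per-frame vector of tracked feature parameters to AU labels that may co-occur. The sketch below only illustrates that mapping, with placeholder data and illustrative feature names; the paper's system uses its own parametric descriptions and neural networks.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Placeholder parametric features per frame, e.g. lip height/width,
# eye opening, brow distance, furrow presence (illustrative names only).
X = np.random.rand(200, 15)
# Multi-label targets: each column is one AU, and AUs may co-occur.
Y = np.random.rand(200, 7) > 0.7

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500)
clf.fit(X, Y)            # multi-label fit: AUs alone or in combination
predicted_aus = clf.predict(X[:5])
```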


Computer Vision and Pattern Recognition | 2010

The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression

Patrick Lucey; Jeffrey F. Cohn; Takeo Kanade; Jason M. Saragih; Zara Ambadar; Iain A. Matthews

In 2000, the Cohn-Kanade (CK) database was released for the purpose of promoting research into automatically detecting individual facial expressions. Since then, the CK database has become one of the most widely used test-beds for algorithm development and evaluation. During this period, three limitations have become apparent: 1) While AU codes are well validated, emotion labels are not, as they refer to what was requested rather than what was actually performed, 2) The lack of a common performance metric against which to evaluate new algorithms, and 3) Standard protocols for common databases have not emerged. As a consequence, the CK database has been used for both AU and emotion detection (even though labels for the latter have not been validated), comparison with benchmark algorithms is missing, and use of random subsets of the original database makes meta-analyses difficult. To address these and other concerns, we present the Extended Cohn-Kanade (CK+) database. The number of sequences is increased by 22% and the number of subjects by 27%. The target expression for each sequence is fully FACS coded and emotion labels have been revised and validated. In addition to this, non-posed sequences for several types of smiles and their associated metadata have been added. We present baseline results using Active Appearance Models (AAMs) and a linear support vector machine (SVM) classifier using a leave-one-out subject cross-validation for both AU and emotion detection for the posed data. The emotion and AU labels, along with the extended image data and tracked landmarks will be made available July 2010.
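
The evaluation protocol lends itself to a short sketch: leave-one-subject-out cross-validation with a linear SVM, as the baseline above describes. Feature vectors would come from the AAM shape and appearance parameters; here X, y, and the subject ids are placeholders, not CK+ data.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_predict
from sklearn.svm import LinearSVC

X = np.random.rand(120, 50)               # placeholder AAM feature vectors
y = np.random.randint(0, 7, 120)          # placeholder emotion labels
subjects = np.random.randint(0, 20, 120)  # subject id for each sequence

# Leave-one-subject-out: every fold holds out all sequences of one
# subject, so no subject appears in both training and test data.
pred = cross_val_predict(LinearSVC(), X, y,
                         cv=LeaveOneGroupOut(), groups=subjects)
print("accuracy:", np.mean(pred == y))
```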


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1994

A stereo matching algorithm with an adaptive window: theory and experiment

Takeo Kanade; Masatoshi Okutomi

A central problem in stereo matching by computing correlation or sum of squared differences (SSD) lies in selecting an appropriate window size. The window size must be large enough to include enough intensity variation for reliable matching, but small enough to avoid the effects of projective distortion. If the window is too small and does not cover enough intensity variation, it gives a poor disparity estimate, because the signal (intensity variation) to noise ratio is low. If, on the other hand, the window is too large and covers a region in which the depth of scene points (i.e., disparity) varies, then the position of maximum correlation or minimum SSD may not represent correct matching due to different projective distortions in the left and right images. For this reason, a window size must be selected adaptively depending on local variations of intensity and disparity. The authors present a method to select an appropriate window by evaluating the local variation of the intensity and the disparity. The authors employ a statistical model of the disparity distribution within the window. This modeling enables the authors to assess how disparity variation, as well as intensity variation, within a window affects the uncertainty of disparity estimate at the center point of the window. As a result, the authors devise a method which searches for a window that produces the estimate of disparity with the least uncertainty for each pixel of an image: the method controls not only the size but also the shape (rectangle) of the window. The authors have embedded this adaptive-window method in an iterative stereo matching algorithm: starting with an initial estimate of the disparity map, the algorithm iteratively updates the disparity estimate for each point by choosing the size and shape of a window till it converges. The stereo matching algorithm has been tested on both synthetic and real images, and the quality of the disparity maps obtained demonstrates the effectiveness of the adaptive window method.
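
The fixed-window SSD matching that the adaptive method builds on is easy to sketch. The function below scores candidate disparities for a single pixel with one square window; the paper's contribution, choosing the window's size and shape per pixel from a statistical model of intensity and disparity variation, is deliberately omitted.

```python
import numpy as np

def ssd_disparity(left, right, row, col, max_disp, half_win):
    """Fixed-window SSD matching for one pixel of a rectified pair.

    Assumes (row, col) is far enough from the borders that every slice
    below stays in bounds. The adaptive-window method would instead
    search over half_win (and the window shape) per pixel.
    """
    r0, r1 = row - half_win, row + half_win + 1
    c0, c1 = col - half_win, col + half_win + 1
    patch = left[r0:r1, c0:c1]
    scores = [np.sum((patch - right[r0:r1, c0 - d:c1 - d]) ** 2)
              for d in range(max_disp + 1)]
    return int(np.argmin(scores))      # disparity with minimum SSD
```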


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1985

Stereo by Intra- and Inter-Scanline Search Using Dynamic Programming

Yuichi Ohta; Takeo Kanade

This paper presents a stereo matching algorithm using the dynamic programming technique. The stereo matching problem, that is, obtaining a correspondence between right and left images, can be cast as a search problem. When a pair of stereo images is rectified, pairs of corresponding points can be searched for within the same scanlines. We call this search intra-scanline search. This intra-scanline search can be treated as the problem of finding a matching path on a two-dimensional (2D) search plane whose axes are the right and left scanlines. Vertically connected edges in the images provide consistency constraints across the 2D search planes. Inter-scanline search in a three-dimensional (3D) search space, which is a stack of the 2D search planes, is needed to utilize this constraint. Our stereo matching algorithm uses edge-delimited intervals as elements to be matched, and employs the above mentioned two searches: one is inter-scanline search for possible correspondences of connected edges in right and left images and the other is intra-scanline search for correspondences of edge-delimited intervals on each scanline pair. Dynamic programming is used for both searches which proceed simultaneously: the former supplies the consistency constraint to the latter while the latter supplies the matching score to the former. An interval-based similarity metric is used to compute the score. The algorithm has been tested with different types of images including urban aerial images, synthesized images, and block scenes, and its computational requirement has been discussed.
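
A stripped-down version of the intra-scanline search can be written as a shortest-path dynamic program on the 2D search plane. The sketch below matches raw pixel intensities with a fixed occlusion cost; the paper instead matches edge-delimited intervals and couples neighboring scanlines through vertically connected edges, which this simplification leaves out.

```python
import numpy as np

def scanline_dp(left_row, right_row, occ_cost=1.0):
    """Minimum-cost path on the 2D search plane for one scanline pair.

    D[i, j] is the best cost of matching the first i left pixels against
    the first j right pixels; moves are match (diagonal) or occlusion
    (horizontal/vertical). Backtracking the argmin moves would recover
    the correspondence path.
    """
    n, m = len(left_row), len(right_row)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, :] = occ_cost * np.arange(m + 1)
    D[:, 0] = occ_cost * np.arange(n + 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = (left_row[i - 1] - right_row[j - 1]) ** 2
            D[i, j] = min(D[i - 1, j - 1] + match,  # match the two pixels
                          D[i - 1, j] + occ_cost,   # left pixel occluded
                          D[i, j - 1] + occ_cost)   # right pixel occluded
    return D[n, m]
```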


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1993

A multiple-baseline stereo

Masatoshi Okutomi; Takeo Kanade

A stereo matching method that uses multiple stereo pairs with various baselines generated by a lateral displacement of a camera to obtain precise distance estimates without suffering from ambiguity is presented. Matching is performed simply by computing the sum of squared-difference (SSD) values. The SSD functions for individual stereo pairs are represented with respect to the inverse distance and are then added to produce the sum of SSDs. This resulting function is called the SSSD-in-inverse-distance. It is shown that the SSSD-in-inverse-distance function exhibits a unique and clear minimum at the correct matching position, even when the underlying intensity patterns of the scene include ambiguities or repetitive patterns. The authors first define a stereo algorithm based on the SSSD-in-inverse-distance and present a mathematical analysis to show how the algorithm can remove ambiguity and increase precision. Experimental results with real stereo images are presented to demonstrate the effectiveness of the algorithm.
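
The key trick, evaluating each baseline's SSD on a common inverse-distance axis before summing, can be sketched directly. The code below assumes rectified, purely laterally displaced cameras so that disparity is d = f·B·z for baseline B and inverse distance z, and that all slices stay in bounds; both are simplifications of the paper's setup.

```python
import numpy as np

def sssd_inverse_distance(ref, others, baselines, row, col,
                          half_win, inv_depths, focal=1.0):
    """SSSD-in-inverse-distance for one pixel (simplified sketch).

    For each candidate inverse distance z, the disparity in the image
    with baseline B is d = focal * B * z, so the per-pair SSD curves
    align on the z axis and can be summed; the summed curve has a
    sharper, less ambiguous minimum than any single pair's.
    """
    r0, r1 = row - half_win, row + half_win + 1
    c0, c1 = col - half_win, col + half_win + 1
    patch = ref[r0:r1, c0:c1]
    sssd = np.zeros(len(inv_depths))
    for img, B in zip(others, baselines):
        for k, z in enumerate(inv_depths):
            d = int(round(focal * B * z))       # disparity for this pair
            sssd[k] += np.sum((patch - img[r0:r1, c0 - d:c1 - d]) ** 2)
    return inv_depths[np.argmin(sssd)]          # minimum of the summed SSDs
```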


Computer Graphics and Image Processing | 1980

Color information for region segmentation

Yuichi Ohta; Takeo Kanade; Toshiyuki Sakai

In color image processing various kinds of color features can be calculated from the tristimuli R, G, and B. We attempt to derive a set of effective color features by systematic experiments of region segmentation. An Ohlander-type segmentation algorithm by recursive thresholding is employed as a tool for the experiment. At each step of segmenting a region, new color features are calculated for the pixels in that region by the Karhunen-Loève transformation of R, G, and B data. By analyzing more than 100 color features which are thus obtained during segmenting eight kinds of color pictures, we have found that a set of color features, (R+G+B)/3, R−B, and (2G−R−B)/2, are effective. These three features are significant in this order and in many cases a good segmentation can be achieved by using only the first two. The effectiveness of our color feature set is discussed by a comparative study with various other sets of color features which are commonly used in image analysis. The comparison is performed in terms of both the quality of segmentation results and the calculation involved in transforming data of R, G, and B to other forms.
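
The three features are a fixed linear transform of R, G, and B, so they take only a few lines to compute:

```python
import numpy as np

def ohta_features(rgb):
    """Compute I1 = (R+G+B)/3, I2 = R-B, I3 = (2G-R-B)/2
    from an H x W x 3 float RGB image."""
    R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    I1 = (R + G + B) / 3.0         # intensity-like component
    I2 = R - B                     # first chromatic component
    I3 = (2.0 * G - R - B) / 2.0   # second chromatic component
    return I1, I2, I3
```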


Computer Vision and Pattern Recognition | 1996

Neural network-based face detection

Henry A. Rowley; Shumeet Baluja; Takeo Kanade

We present a neural network-based face detection system. A retinally connected neural network examines small windows of an image and decides whether each window contains a face. The system arbitrates between multiple networks to improve performance over a single network. We use a bootstrap algorithm for training the networks, which adds false detections into the training set as training progresses. This eliminates the difficult task of manually selecting non-face training examples, which must be chosen to span the entire space of non-face images. Comparisons with other state-of-the-art face detection systems are presented; our system has better performance in terms of detection and false-positive rates.
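
The bootstrap idea, folding the network's own false detections back into the negative training set, can be sketched as below. The window scanner and the classifier interface (fit/predict_proba, e.g. an sklearn model standing in for the retinally connected network) are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def scan_windows(img, size=20, step=10):
    """Yield flattened size x size sub-windows over a grayscale image."""
    for r in range(0, img.shape[0] - size + 1, step):
        for c in range(0, img.shape[1] - size + 1, step):
            yield img[r:r + size, c:c + size].ravel()

def bootstrap_train(network, face_windows, nonface_images,
                    rounds=3, threshold=0.5):
    """Bootstrap training: retrain, then harvest false detections.

    `nonface_images` are known to contain no faces, so any window the
    current network accepts is a false detection and becomes a new
    negative example in the next round.
    """
    negatives = [next(scan_windows(img)) for img in nonface_images]  # seed
    for _ in range(rounds):
        X = np.vstack([np.asarray(face_windows), np.asarray(negatives)])
        y = np.array([1] * len(face_windows) + [0] * len(negatives))
        network.fit(X, y)
        for img in nonface_images:
            for win in scan_windows(img):
                if network.predict_proba([win])[0, 1] > threshold:
                    negatives.append(win)    # false detection -> negative
    return network
```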

Collaboration


Dive into Takeo Kanade's collaborations.

Top Co-Authors

Martial Hebert
Carnegie Mellon University

Mei Chen
State University of New York System

Charles E. Thorpe
Carnegie Mellon University

Yaser Sheikh
Carnegie Mellon University