Kiyoharu Aizawa
University of Tokyo
Publications
Featured research published by Kiyoharu Aizawa.
Signal Processing: Image Communication | 1989
Kiyoharu Aizawa; Hiroshi Harashima; Takahiro Saito
The initial conception of a model-based analysis synthesis image coding (MBASIC) system is described, and a construction method for a three-dimensional (3-D) facial model, including synthesis methods for facial expressions, is presented. The proposed MBASIC system is an image coding method that utilizes a 3-D model of the object to be reproduced. An input image is first analyzed, and an output image is then synthesized using the 3-D model. Very low bit rate image transmission can be realized because the encoder sends only the required analysis parameters. Output images can be reconstructed without the noise corruption that reduces naturalness because the decoder synthesizes images from a similar 3-D model. To construct a 3-D model of a person's face, a method is developed which uses a 3-D wire frame face model; a full-face image is then projected onto this wire frame model. For the synthesis of facial expressions, two different methods are proposed: a clip-and-paste method and a facial structure deformation method.
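A minimal sketch of the texture-projection step described above, assuming a frontal orthographic projection: each wireframe vertex is assigned a coordinate in the full-face image by dropping its depth component and normalizing. The vertex array, image size, and function name are illustrative, not the paper's actual face model.

```python
# Minimal sketch: assign texture (UV) coordinates to a 3-D wire frame
# face model by orthographically projecting a full-face image onto it.
# The toy vertices and image size below are invented for illustration.
import numpy as np

def project_texture(vertices, image_w, image_h):
    """Map each 3-D vertex (x, y, z) to a UV coordinate in [0, 1]^2 by
    dropping z (frontal orthographic view) and normalizing to the
    extent of the face in the full-face image."""
    xy = vertices[:, :2]
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    uv = (xy - lo) / (hi - lo)                 # normalize to [0, 1]^2
    pix = uv * np.array([image_w - 1, image_h - 1])  # texture pixel coords
    return uv, pix

# Toy wireframe: four vertices of a coarse face patch.
verts = np.array([[0.0, 0.0, 1.0],
                  [1.0, 0.0, 0.9],
                  [0.0, 1.5, 0.8],
                  [1.0, 1.5, 0.7]])
uv, pix = project_texture(verts, image_w=256, image_h=256)
print(pix)
```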
Proceedings of the IEEE | 1995
Kiyoharu Aizawa; Thomas S. Huang
The paper gives an overview of model-based approaches applied to image coding, by looking at image source models. In these model-based schemes, which differ from conventional waveform coding methods, the 3-D properties of the scenes are taken into consideration, and very low bit rate image transmission can be achieved. The 2-D and 3-D model-based approaches are explained; among them, a 3-D model-based method using a 3-D facial model and a 2-D model-based method utilizing 2-D deformable triangular patches are described. Work related to 3-D model-based coding of facial images and some of the remaining problems are also described.
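To make the 2-D deformable-triangular-patch idea concrete, here is a small sketch (not from the paper) that solves for the affine map a single triangle undergoes when its three vertices move; a mesh-based coder applies one such map per patch. The coordinates are invented for illustration.

```python
# One deformable triangular patch: three vertex displacements determine
# an affine warp A with A @ [x, y, 1]^T = destination point.
import numpy as np

def triangle_affine(src_tri, dst_tri):
    """Solve for the 2x3 affine matrix of one mesh triangle."""
    src = np.hstack([src_tri, np.ones((3, 1))])   # 3x3: rows [x, y, 1]
    return np.linalg.solve(src, dst_tri).T        # 2x3 affine matrix

src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
dst = np.array([[0.1, 0.0], [1.2, 0.1], [0.0, 1.1]])
A = triangle_affine(src, dst)
p = np.array([0.3, 0.3, 1.0])   # a point inside the source triangle
print(A @ p)                     # its position after the patch deforms
```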
IEEE Transactions on Circuits and Systems for Video Technology | 1994
Chang Seok Choi; Kiyoharu Aizawa; Hiroshi Harashima; Tsuyoshi Takebe
This paper proposes new methods for analyzing image sequences and updating the textures of a three-dimensional (3-D) facial model, and describes a method for synthesizing various facial expressions. These three methods are key technologies for a model-based image coding system. The input image analysis technique directly and robustly estimates the 3-D head motions and facial expressions without any two-dimensional (2-D) entity correspondences. This technique avoids 2-D correspondence mismatch errors and provides quality reproduction of the original images by fully incorporating the synthesis rules. The analysis algorithm is verified through quantitative and subjective evaluations. Two methods are presented for updating the texture of the facial model to improve the quality of the synthesized images. The first focuses on the facial parts whose brightness changes strongly with facial expression, in order to reduce the transmission bit rate; the second covers all brightness changes caused by the 3-D head motions as well as the facial expressions. The transmission bit rates are estimated for each update method. For synthesizing the output images, rules are described that simulate the facial muscular actions, since the muscles produce the facial expressions. These rules make it easier to synthesize high-quality facial images representing various facial expressions.
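The correspondence-free analysis can be read as an analysis-by-synthesis loop: search for the motion parameters that make the model-synthesized image match the input, rather than matching 2-D feature points. Below is a deliberately tiny stand-in, assuming the "model" is just a translated template and using a generic optimizer; the paper estimates full 3-D head motion and expression parameters.

```python
# Toy analysis-by-synthesis: recover a 2-D shift by minimizing the
# brightness mismatch between synthesized and observed images, with no
# feature correspondences. All data here is synthetic.
import numpy as np
from scipy.ndimage import gaussian_filter, shift as nd_shift
from scipy.optimize import minimize

rng = np.random.default_rng(0)
template = gaussian_filter(rng.random((64, 64)), sigma=3.0)  # smooth texture
observed = nd_shift(template, (2.5, -1.0))    # unknown "head motion"

def residual(params):
    synth = nd_shift(template, params)        # synthesize from the model
    return np.mean((synth - observed) ** 2)   # brightness mismatch

est = minimize(residual, x0=[0.0, 0.0], method="Nelder-Mead")
print(est.x)                                  # approaches (2.5, -1.0)
```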
ACM Workshop on Continuous Archival and Retrieval of Personal Experiences | 2004
Kiyoharu Aizawa; Datchakorn Tancharoen; Shinya Kawasaki; Toshihiko Yamasaki
In this paper, we present continuous capture of our life log with various sensors plus additional data, and propose effective retrieval methods using this context and content. Our life log system contains video, audio, acceleration sensor, gyro, GPS, annotations, documents, web pages, and emails. In our previous studies [8], [9], we showed a retrieval methodology that mainly depends on context information from sensor data. In this paper, we extend that methodology with two additional functions: (1) spatio-temporal sampling for extracting key frames for summarization, and (2) conversation scene detection. With the first, key frames for summarization are extracted using time and location data (GPS). Because our life log captures dense location data, we can also make use of its derivatives, that is, the speed and acceleration of the person's movement, and the summarizing key frames are selected accordingly. With the second, we introduce visual and audio content analysis for conversation scene detection; this complements our previous work on context-based retrieval, which differs from the majority of studies in image/video retrieval that focus on content-based retrieval. Detected conversation scenes will serve as very important tags for life log data retrieval. We describe our present system and the additional functions, as well as preliminary results for the additional functions.
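A minimal sketch of the spatio-temporal sampling idea, assuming GPS fixes arrive as (timestamp, latitude, longitude) tuples: distance and dwell time derived from dense location data drive key-frame selection, so summaries react to movement and pauses. The thresholds and helper names are invented, not the paper's.

```python
# Key-frame sampling from dense GPS fixes: keep a frame when the wearer
# has moved far enough, or when a long pause elapses (pauses are
# informative too). Track data and thresholds are illustrative.
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two GPS fixes."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp/2)**2 + math.cos(p1)*math.cos(p2)*math.sin(dl/2)**2
    return 2 * r * math.asin(math.sqrt(a))

def sample_key_frames(fixes, min_dist_m=25.0, min_gap_s=30.0):
    """fixes: list of (timestamp_s, lat, lon), time-ordered."""
    keys = [fixes[0]]
    for t, lat, lon in fixes[1:]:
        t0, la0, lo0 = keys[-1]
        if (haversine_m(la0, lo0, lat, lon) >= min_dist_m
                or t - t0 >= min_gap_s):
            keys.append((t, lat, lon))
    return keys

track = [(0, 35.6895, 139.6917), (10, 35.6896, 139.6918),
         (45, 35.6896, 139.6918), (60, 35.6930, 139.6950)]
print(sample_key_frames(track))
```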
International Conference on Acoustics, Speech, and Signal Processing | 1989
Shigeo Morishima; Kiyoharu Aizawa; Hiroshi Harashima
The authors propose and compare two types of model-based facial motion coding schemes: synthesis by rules and synthesis by parameters. In synthesis by rules, facial motion images are synthesized on the basis of rules extracted by analyzing training image samples that cover all of the phonemes and coarticulation. This system can be used as an automatic facial animation synthesizer from text input, or as a man-machine interface using the facial motion image. In synthesis by parameters, facial motion images are synthesized on the basis of a codeword index of speech parameters. Experimental results indicate good performance for both systems, which can create natural facial motion images at a very low transmission rate. Details of the 3-D modeling, synthesis algorithms, and performance are discussed.
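A toy rendering of "synthesis by parameters", under the assumption that a vector-quantized speech frame indexes a codebook whose entries carry mouth-shape parameters for the face model; every value and name below is invented for illustration.

```python
# Synthesis by parameters, schematically: VQ a speech frame against a
# speech codebook, then use the same index to look up mouth-shape
# parameters that drive the facial model. Codebooks are toy values.
import numpy as np

mouth_codebook = np.array([[0.2, 1.0],    # code 0 -> (jaw open, lip width)
                           [0.8, 0.6],    # code 1
                           [0.1, 1.3]])   # code 2

def nearest_code(speech_frame, speech_codebook):
    """VQ step: index of the closest speech-parameter codeword."""
    d = np.linalg.norm(speech_codebook - speech_frame, axis=1)
    return int(np.argmin(d))

speech_codebook = np.array([[0.0, 0.0], [1.0, 0.5], [0.3, 1.2]])
frame = np.array([0.9, 0.45])             # one analyzed speech frame
idx = nearest_code(frame, speech_codebook)
print(idx, mouth_codebook[idx])           # parameters sent to the synthesizer
```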
Signal Processing: Image Communication | 1993
Takashi Komatsu; Toru Igarashi; Kiyoharu Aizawa; Takahiro Saito
Towards the development of a very high definition (VHD) image acquisition system, we previously developed a signal-processing-based approach using multiple cameras. The approach produces an improved-resolution image with a sufficiently high signal-to-noise ratio by processing and integrating multiple images taken simultaneously with multiple cameras. Originally, this approach used multiple cameras with the same pixel aperture, but in that case severe limitations apply both to the arrangement of the cameras and to the configuration of the scene if the spatial uniformity of the resultant resolution is to be guaranteed. To overcome this difficulty completely, this work proposes the use of multiple cameras with different pixel apertures and develops a new, alternately iterative signal processing algorithm applicable to the different-aperture case. Experimental simulations clearly show that the use of multiple different-aperture cameras is promising and that the alternately iterative algorithm behaves satisfactorily.
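The following sketch shows only the common skeleton of such a reconstruction: alternating Landweber-style correction steps over cameras whose (toy, 1-D) box apertures differ. It is an assumption-laden stand-in for intuition, not the paper's algorithm.

```python
# Fuse observations from cameras with different pixel apertures by
# alternating correction steps, one camera at a time. The 1-D signal
# and box apertures are invented; real apertures are 2-D.
import numpy as np

def aperture_matrix(n_hi, width, phase):
    """Each row averages `width` consecutive high-res samples starting
    at `phase`, stepping by `width` (a box pixel aperture)."""
    rows = []
    for start in range(phase, n_hi - width + 1, width):
        r = np.zeros(n_hi)
        r[start:start + width] = 1.0 / width
        rows.append(r)
    return np.array(rows)

n = 16
truth = np.sin(np.linspace(0, 3 * np.pi, n))
A = aperture_matrix(n, width=2, phase=0)   # camera A: 2-sample aperture
B = aperture_matrix(n, width=3, phase=1)   # camera B: different aperture
obs = [(A, A @ truth), (B, B @ truth)]

est = np.zeros(n)
for _ in range(200):
    for H, y in obs:                        # alternate over the cameras
        est += 0.5 * H.T @ (y - H @ est)    # Landweber correction step

print(np.round(est - truth, 2))            # residual error of the fusion
```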
Computer Vision and Pattern Recognition | 2012
Satoshi Ikehata; David P. Wipf; Yasuyuki Matsushita; Kiyoharu Aizawa
This paper presents a robust photometric stereo method that effectively compensates for various non-Lambertian corruptions such as specularities, shadows, and image noise. We construct a constrained sparse regression problem that enforces both Lambertian, rank-3 structure and sparse, additive corruptions. A solution method is derived using a hierarchical Bayesian approximation to accurately estimate the surface normals while simultaneously separating the non-Lambertian corruptions. Extensive evaluations are performed that show state-of-the-art performance using both synthetic and real-world images.
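The model the paper builds on is the Lambertian relation i = L n plus sparse additive corruptions; the paper solves it with a hierarchical Bayesian sparse regression. As a point of reference only, here is a much simpler robust baseline for the same per-pixel problem, using iteratively reweighted least squares to downweight shadow and specular observations; the lights and outlier pattern are synthetic.

```python
# Robust per-pixel photometric stereo baseline (IRLS approximating an
# L1 fit), not the paper's hierarchical Bayesian method.
import numpy as np

rng = np.random.default_rng(1)
L = rng.normal(size=(20, 3))                 # 20 light directions
L /= np.linalg.norm(L, axis=1, keepdims=True)
n_true = np.array([0.3, 0.4, 0.866])         # unit surface normal
i = L @ n_true                               # Lambertian intensities
i[3] += 2.0                                  # a specular outlier
i[11] = 0.0                                  # an attached shadow

def robust_normal(lights, intensities, iters=20, eps=1e-4):
    w = np.ones(len(intensities))
    for _ in range(iters):
        sw = np.sqrt(w)
        n, *_ = np.linalg.lstsq(sw[:, None] * lights,
                                sw * intensities, rcond=None)
        r = intensities - lights @ n
        w = 1.0 / (np.abs(r) + eps)          # big residual -> low weight
    return n / np.linalg.norm(n)

print(robust_normal(L, i), n_true)           # estimate vs. ground truth
```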
IEEE Transactions on Electron Devices | 1997
Kiyoharu Aizawa; Y. Egi; Takayuki Hamamoto; Mitsutoshi Hatori; Masahide Abe; Hirotaka Maruyama; H. Otake
We propose a novel integration of image compression and sensing in order to enhance the performance of an image sensor. By integrating a compression function onto the sensor focal plane, the image signal to be read out from the sensor is significantly reduced, and the pixel rate of the sensor can consequently be increased. The potential applications of the proposed sensor are in high pixel-rate imaging, such as high frame-rate image sensing and high-resolution image sensing. The compression scheme we employ is conditional replenishment, which detects and encodes moving areas. In this paper, we introduce two architectures for on-sensor compression: one is the pixel-parallel approach and the other is the column-parallel approach. We prototyped a VLSI chip of the proposed sensor based on the pixel-parallel architecture. We show the design and describe the results of experiments obtained with the prototype chip.
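Conditional replenishment itself fits in a few lines: compare the current frame with a stored reference, read out only pixels whose change exceeds a threshold, and update the reference with the new values. The sketch below is a software illustration with invented frames and threshold; on the proposed sensor this detection runs on the focal plane itself.

```python
# Conditional replenishment: transmit only the moving-area pixels.
import numpy as np

def replenish(reference, frame, threshold=16):
    moving = np.abs(frame.astype(int) - reference.astype(int)) > threshold
    addresses = np.argwhere(moving)           # pixel positions to send
    values = frame[moving]                    # new values for those pixels
    reference[moving] = frame[moving]         # keep the reference in sync
    return addresses, values

rng = np.random.default_rng(2)
ref = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)
cur = ref.copy()
cur[2:4, 2:4] = 255                           # a small moving area
addr, vals = replenish(ref, cur)
print(len(addr), "of", cur.size, "pixels read out")
```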
IEEE Transactions on Image Processing | 2005
Akira Kubota; Kiyoharu Aizawa
We present a novel filtering method for reconstructing an all-in-focus image or an arbitrarily focused image from two images that are focused differently. The method can arbitrarily manipulate the degree of blur of the objects using linear filters, without segmentation. The filters are uniquely determined from a linear imaging model in the Fourier domain. An effective and accurate blur estimation method is developed. Simulation results show that the accuracy and computational time of the proposed method improve on the previous iterative method, and that the effect of blur estimation error on the quality of the reconstructed image is very small. The method also performs well for real images, producing reconstructions without visible artifacts.
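The paper's filters are derived analytically from the linear imaging model; as rough intuition for the task (not the authors' derivation), the sketch below fuses two differently focused images with soft per-pixel sharpness weights, avoiding any hard segmentation. The toy scene and parameters are invented.

```python
# Crude all-in-focus fusion of two differently focused images using
# smoothed local Laplacian energy as soft weights.
import numpy as np
from scipy.ndimage import gaussian_filter, laplace

def fuse_two_focus(img_near, img_far, sigma=2.0):
    """The locally sharper image dominates, with soft transitions."""
    s1 = gaussian_filter(np.abs(laplace(img_near)), sigma)
    s2 = gaussian_filter(np.abs(laplace(img_far)), sigma)
    w = s1 / (s1 + s2 + 1e-12)
    return w * img_near + (1.0 - w) * img_far

# Toy scene: left half sharp in one capture, right half in the other.
x = np.linspace(0, 8 * np.pi, 128)
scene = np.tile(np.sin(x), (128, 1))
near = scene.copy(); near[:, 64:] = gaussian_filter(scene, 3.0)[:, 64:]
far = scene.copy();  far[:, :64] = gaussian_filter(scene, 3.0)[:, :64]
fused = fuse_two_focus(near, far)
print(np.abs(fused - scene).mean())          # lower than either input's error
```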
IEEE Transactions on Circuits and Systems for Video Technology | 2000
Kiyoharu Aizawa; Kazuya Kodama; Akira Kubota
We propose a novel approach for producing special visual effects by fusing multiple differently focused images. This method differs from conventional image fusion techniques because it enables us to arbitrarily generate object-based visual effects such as blurring, enhancement, and shifting. Notably, the method does not need any segmentation. Using a linear imaging model, it directly generates the desired image from multiple differently focused images.
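As a crude, hypothetical illustration of this fusion idea (not the paper's model-derived generation): once two differently focused captures exist, even a pixel-wise linear combination sweeps the apparent focus between them with no segmentation mask. The scene and weights below are invented.

```python
# Blur manipulation by linear combination of two focal captures.
import numpy as np
from scipy.ndimage import gaussian_filter

x = np.linspace(0, 8 * np.pi, 128)
scene = np.tile(np.sin(x), (128, 1))
near = scene.copy(); near[:, 64:] = gaussian_filter(scene, 3.0)[:, 64:]
far = scene.copy();  far[:, :64] = gaussian_filter(scene, 3.0)[:, :64]

def refocus(img_a, img_b, alpha):
    """Sweeping alpha from 0 to 1 shifts the apparent focus between
    the two captures, a soft object-based blur effect."""
    return alpha * img_a + (1.0 - alpha) * img_b

mid_focus = refocus(near, far, alpha=0.5)   # both depth layers mildly blurred
print(mid_focus.shape)
```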