Kazuhiro Nakadai
Kyoto University
Publications
Featured research published by Kazuhiro Nakadai.
International Conference on Industrial, Engineering & Other Applications of Applied Intelligent Systems | 2002
Hiroshi G. Okuno; Kazuhiro Nakadai; Hiroaki Kitano
Social interaction is essential in improving human-robot interfaces. Behaviors for social interaction may include paying attention to a new sound source, moving toward it, or keeping face-to-face with a moving speaker. Some sound-centered behaviors are difficult to attain, because mixtures of sounds are not handled well or auditory processing is too slow for real-time applications. Recently, Nakadai et al. developed a real-time auditory and visual multiple-talker tracking technology that associates auditory and visual streams. The system is implemented on an upper-torso humanoid, and real-time talker tracking is attained with a delay of 200 msec by distributed processing on four PCs connected by Gigabit Ethernet. Focus-of-attention is programmable and allows a variety of behaviors. The system demonstrates non-verbal social interaction: a receptionist robot focuses on an associated stream, while a companion robot focuses on an auditory stream.
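As a rough illustration of the audio-visual stream association described above, the following Python sketch pairs auditory and visual streams whose direction estimates agree in angle and time. All class names, fields, and thresholds are assumptions made for illustration, not the paper's actual system.

```python
# Hypothetical stream records: each carries a direction estimate (degrees)
# and a timestamp (seconds). Illustrative only; not the paper's data model.
class Stream:
    def __init__(self, kind, direction, t):
        self.kind = kind            # "auditory" or "visual"
        self.direction = direction  # estimated direction in degrees
        self.t = t                  # timestamp in seconds

def associate(auditory, visual, max_angle=10.0, max_dt=0.2):
    """Pair each auditory stream with the visual stream closest in
    direction, accepting the pair only if the two estimates agree
    within max_angle degrees and max_dt seconds (assumed thresholds)."""
    pairs = []
    for a in auditory:
        best = min(
            (v for v in visual if abs(v.t - a.t) <= max_dt),
            key=lambda v: abs(v.direction - a.direction),
            default=None,
        )
        if best is not None and abs(best.direction - a.direction) <= max_angle:
            pairs.append((a, best))
    return pairs
```

An associated (audio + visual) stream would then be a natural focus-of-attention target for the receptionist behavior, while a companion behavior could track auditory streams alone.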
International Conference on Robotics and Automation | 2011
Takeshi Mizumoto; Kazuhiro Nakadai; Takami Yoshida; Ryu Takeda; Takuma Otsuka; Toru Takahashi; Hiroshi G. Okuno
This paper presents the design and implementation of selectable sound separation functions on the telepresence system “Texai” using the robot audition software “HARK.” An operator of Texai can “walk” around a faraway office to attend a meeting or talk with people through video-conference instead of meeting in person. With a normal microphone, the operator has difficulty grasping the auditory scene around the Texai; for example, he/she cannot tell the number or the locations of sounds. To solve this problem, we design selectable sound separation functions with 8 microphones in two modes, overview and filter modes, and implement them using HARK's sound source localization and separation. The overview mode visualizes the directions of arrival of surrounding sounds, while the filter mode provides only the sounds that originate from the range of directions the operator specifies. These functions enable the operator to be aware of a sound even if it comes from behind the Texai, and to concentrate on a particular sound. The design and implementation were completed in five days thanks to the portability of HARK. Experimental evaluations with actual and simulated data show that the resulting system localizes sound sources with a tolerance of 5 degrees.
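The two interaction modes can be pictured with a short sketch. This is a minimal Python illustration of the mode logic only, assuming localization results are already available; the actual HARK processing network and Texai interface are not shown, and the function and field names are hypothetical.

```python
# Each localized source is assumed to carry an id and a direction of
# arrival in degrees (e.g., produced upstream by sound source localization).
def overview(localized_sources):
    """Overview mode: report the direction of arrival of every detected
    source, so the operator can survey the whole auditory scene."""
    for src in localized_sources:
        print(f"source {src['id']}: {src['doa_deg']:+.0f} deg")

def filter_mode(localized_sources, lo_deg, hi_deg):
    """Filter mode: keep only the sources whose DOA falls inside the
    operator's chosen range [lo_deg, hi_deg]."""
    return [s for s in localized_sources if lo_deg <= s['doa_deg'] <= hi_deg]

# Example: one talker behind the robot, one in front-right.
scene = [{'id': 0, 'doa_deg': -170.0}, {'id': 1, 'doa_deg': 30.0}]
overview(scene)                      # shows both, including the one behind
print(filter_mode(scene, 0.0, 60.0)) # keeps only source 1
```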
Intelligent Robots and Systems | 2005
Shunsuke Kurotaki; Noriaki Suzuki; Kazuhiro Nakadai; Hiroshi G. Okuno; Hideharu Amano
In this paper, we report the design and implementation of a sound source separation system using a dynamically reconfigurable device. A robot in real-world environments should be able to handle mixtures of multiple sound signals. The active direction-pass filter (ADPF), which extracts sound from a specific direction using a pair of microphones, has been developed as such a sound source separation method. The ADPF has been used as a front end for an automatic speech recognition system, and recognition of three simultaneous speech signals has been reported. The ADPF, however, requires considerable computational power, while the battery capacity and the physical size of a robot are limited. To reduce the power consumption and the size of the system, we adopted the Dynamically Reconfigurable Processor (DRP) developed by NEC Electronics. We implemented the ADPF on the DRP and investigated the effectiveness of dynamically reconfigurable devices for such applications. A preliminary experiment shows that the ADPF on the DRP separates a mixture of sound sources in real time with practical accuracy.
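To make the direction-pass idea concrete, here is a single-frame Python sketch in the spirit of a direction-pass filter: spectral bins whose interaural phase difference matches the delay expected from the target direction are passed, others are attenuated. The mic spacing, tolerance, and sign convention are assumptions; the actual ADPF (and its DRP implementation) differs in detail.

```python
import numpy as np

def direction_pass_filter(left, right, theta, fs=16000, d=0.18, c=343.0,
                          tol=0.3):
    """Pass the spectral bins of a stereo frame whose interaural phase
    difference (IPD) is consistent with a source at angle theta (radians).
    d: assumed mic spacing [m]; tol: assumed IPD tolerance [rad]."""
    L, R = np.fft.rfft(left), np.fft.rfft(right)
    freqs = np.fft.rfftfreq(len(left), 1.0 / fs)
    # IPD expected for a plane wave arriving from direction theta.
    expected_ipd = 2 * np.pi * freqs * d * np.sin(theta) / c
    # Observed IPD between the two channels.
    ipd = np.angle(L * np.conj(R))
    # Wrap the mismatch to (-pi, pi] before thresholding.
    diff = np.angle(np.exp(1j * (ipd - expected_ipd)))
    mask = (np.abs(diff) < tol).astype(float)
    return np.fft.irfft(L * mask, n=len(left))
```

This per-bin masking is cheap but must run for every frame and frequency bin, which suggests why a software ADPF strains an onboard CPU and why offloading it to a reconfigurable device is attractive.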
Industrial and Engineering Applications of Artificial Intelligence and Expert Systems | 2003
Hiroshi G. Okuno; Kazuhiro Nakadai; Hiroaki Kitano
Controlling robot behaviors has recently become more important as active perception for robots, in particular active audition in addition to active vision, has made remarkable progress. We are studying how to create social humanoids that perform actions empowered by real-time audio-visual tracking of multiple talkers. In this paper, we present personality as a means of controlling non-verbal behaviors. It consists of two dimensions, dominance vs. submissiveness and friendliness vs. hostility, based on the Interpersonal Theory in psychology. The upper-torso humanoid SIG, equipped with a real-time audio-visual multiple-talker tracking system, is used as a testbed for social interaction. As a companion robot with a friendly personality, it turns toward a new sound source to show its attention, while with a hostile personality it turns away from a new sound source. As a receptionist robot with a dominant personality, it keeps its attention on the current customer, while with a submissive personality its attention to the current customer is interrupted by a new one.
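The mapping from the two personality axes to the behaviors described above can be summarized in a tiny decision sketch. This is only an illustration of the described behavior outcomes under assumed trait ranges; the actual SIG control architecture is not reproduced here.

```python
# Traits are assumed to lie in [-1, 1]: dominance (+dominant/-submissive)
# and friendliness (+friendly/-hostile). Names are hypothetical.
def react_to_new_sound(friendliness, dominance, attending_to_customer):
    """Choose a non-verbal reaction to a newly detected sound source."""
    if attending_to_customer:
        # Receptionist role: dominance decides whether attention is held.
        return "keep attention" if dominance > 0 else "switch to new source"
    # Companion role: friendliness decides approach vs. avoidance.
    return "turn toward new source" if friendliness > 0 else "turn away"
```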
Archive | 2010
Kazuhiro Nakadai; Hiroshi Nakajima; Keisuke Nakamura
Journal of the Robotics Society of Japan | 2003
Kazuhiro Nakadai; Ken-ichi Hidai; Hiroshi Mizoguchi; Hiroshi G. Okuno; Hiroaki Kitano
Archive | 2011
Kazuhiro Nakadai; Keisuke Nakamura
Archive | 2006
Kazuhiro Nakadai; Hiroshi Tsujino; Hirofumi Nakajima
Journal of the Robotics Society of Japan | 2013
Keita Okutani; Takami Yoshida; Keisuke Nakamura; Kazuhiro Nakadai
Archive | 2012
Kazuhiro Nakadai; Ince Goekhan