Publication


Featured research published by Do Yeong Kim.


International Conference on Acoustics, Speech, and Signal Processing | 2006

The CU-HTK Mandarin Broadcast News Transcription System

Rohit Sinha; Mark J. F. Gales; Do Yeong Kim; Xunying Liu; Khe Chai Sim; Philip C. Woodland

This paper discusses the development of the CU-HTK Mandarin broadcast news (BN) transcription system. The Mandarin BN task includes a significant amount of English data. Hence, techniques have been investigated to allow the same system to handle both Mandarin and English by augmenting the Mandarin training sets with English acoustic and language model training data. A range of acoustic models were built, including models based on Gaussianised features, speaker adaptive training and feature-space MPE. A multi-branch system architecture is described in which multiple acoustic model types, alternate phone sets and segmentations can be used in a system combination framework to generate the final output. The final system shows state-of-the-art performance over a range of test sets.


IEEE Automatic Speech Recognition and Understanding Workshop | 2003

Recent advances in broadcast news transcription

Do Yeong Kim; Gunnar Evermann; Thomas Hain; David Mrva; S. E. Tranter; Lan Wang; Philip C. Woodland

This paper describes recent advances in the CU-HTK Broadcast News English (BN-E) transcription system and its performance in the DARPA/NIST Rich Transcription 2003 Speech-to-Text (RT-03) evaluation. Heteroscedastic linear discriminant analysis (HLDA) and discriminative training, which were previously developed in the context of the recognition of conversational telephone speech, have been successfully applied to the BN-E task for the first time. A number of new features have also been added. These include gender-dependent (GD) discriminative training and modified discriminative training using lattice regeneration and combination. On the 2003 evaluation set, the system gave an overall word error rate of 10.7% in less than 10 times real time (10×RT).


International Conference on Acoustics, Speech, and Signal Processing | 2005

Development of the CU-HTK 2004 broadcast news transcription systems

Do Yeong Kim; Ho Yin Chan; Gunnar Evermann; Mark J. F. Gales; David Mrva; Khe Chai Sim; Philip C. Woodland

The paper describes our recent work on improving broadcast news transcription and presents details of the CU-HTK broadcast news English (BN-E) transcription system for the DARPA/NIST rich transcription 2004 speech-to-text (RT04) evaluation. A key focus has been building a system using an order of magnitude more acoustic training data than we have previously attempted. We have also investigated a range of techniques to improve both minimum phone error (MPE) training and the efficient creation of MPE-based narrow-band models. The paper describes two alternative system structures that run in under 10×RT and a further system that runs in less than 1×RT. This final system gives lower word error rates than our 2003 system that ran in 10×RT.


IEEE Transactions on Audio, Speech, and Language Processing | 2006

Progress in the CU-HTK broadcast news transcription system

Mark J. F. Gales; Do Yeong Kim; Philip C. Woodland; Ho Yin Chan; David Mrva; Rohit Sinha; S. E. Tranter


Conference of the International Speech Communication Association | 2003

MMI-MAP and MPE-MAP for acoustic model adaptation

Daniel Povey; Mark J. F. Gales; Do Yeong Kim; Philip C. Woodland


Conference of the International Speech Communication Association | 2004

Using VTLN for broadcast news transcription

Do Yeong Kim; Srinivasan Umesh; Mark J. F. Gales; Thomas Hain; Philip C. Woodland


Archive | 1997

Application of VTS to Environment Compensation with Noise Statistics

Nam Soo Kim; Do Yeong Kim; Byung Goo Kong; Sang Ryong Kim


Archive | 2004

SuperEARS: Multi-Site Broadcast News System

Philip C. Woodland; Ricky Ho Yin Chan; Gunnar Evermann; Mark J. F. Gales; Do Yeong Kim; Xiao Liu; David Mrva; Khe Chai Sim; Liqiang Wang; Kin Man Yu; John Makhoul; Richard M. Schwartz; Luong T. Nguyen; S. Masoukas; Bing Xiang; Mohamed Afify; Sherif M. Abdou; Jean-Luc Gauvain; Lori Lamel; Holger Schwenk; Gilles Adda; Fabrice Lefèvre; Dimitra Vergyri; Wei Wang; Jingfang Zheng; Anand Venkataraman; Ramana Rao Gadde; Andreas Stolcke


Conference of the International Speech Communication Association | 1997

Model-based approach for robust speech recognition in noisy environments with multiple noise sources

Do Yeong Kim; Nam Soo Kim; Chong Kwan Un


Archive | 2004

Recent developments at Cambridge in broadcast news transcription

Do Yeong Kim; Ho Yin Chan; Gunnar Evermann; Mark J. F. Gales; David Mrva; Khe Chai Sim; Philip C. Woodland

Collaboration


Top Co-Authors

David Mrva, University of Cambridge
Khe Chai Sim, National University of Singapore
Ho Yin Chan, University of Cambridge
Thomas Hain, University of Sheffield
Nam Soo Kim, Seoul National University
Rohit Sinha, Indian Institute of Technology Guwahati