Kisoo Kwon
Seoul National University
Publication
Featured research published by Kisoo Kwon.
IEEE Signal Processing Letters | 2015
Tae Gyoon Kang; Kisoo Kwon; Jong Won Shin; Nam Soo Kim
Non-negative matrix factorization (NMF) is one of the most well-known techniques that are applied to separate a desired source from mixture data. In the NMF framework, a collection of data is factorized into a basis matrix and an encoding matrix. The basis matrix for mixture data is usually constructed by augmenting the basis matrices for independent sources. However, target source separation with the concatenated basis matrix turns out to be problematic if there exists some overlap between the subspaces that the bases for the individual sources span. In this letter, we propose a novel approach to improve encoding vector estimation for target signal extraction. Estimating encoding vectors from the mixture data is viewed as a regression problem and a deep neural network (DNN) is used to learn the mapping between the mixture data and the corresponding encoding vectors. To demonstrate the performance of the proposed algorithm, experiments were conducted in the speech enhancement task. The experimental results show that the proposed algorithm outperforms the conventional encoding vector estimation scheme.
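As a rough illustration of the conventional scheme this abstract uses as its baseline (not the DNN-based estimator the letter proposes), the sketch below factorizes nonnegative data into a basis matrix and an encoding matrix, then extracts a target source using a concatenated basis. It assumes a Euclidean cost with multiplicative updates and a ratio mask for reconstruction. The function names `nmf`, `encode`, and `separate_target` are hypothetical, and none of these choices are taken from the paper.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize nonnegative V into W (basis) @ H (encoding).

    Assumption: Euclidean cost with multiplicative updates.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def encode(V, W, n_iter=200, eps=1e-9, seed=0):
    """Estimate the encoding matrix H for a fixed basis matrix W."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

def separate_target(V_mix, W_target, W_interf, eps=1e-9):
    """Extract the target source from a mixture with a concatenated basis."""
    W = np.hstack([W_target, W_interf])   # basis for the mixture data
    H = encode(V_mix, W)
    k = W_target.shape[1]
    target = W_target @ H[:k]             # part explained by the target bases
    total = W @ H + eps
    return V_mix * target / total         # ratio mask applied to the mixture
```

When the subspaces spanned by `W_target` and `W_interf` overlap, the rows of `H` can trade energy between the two sources, which is the ambiguity the abstract describes.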
IEEE Signal Processing Letters | 2015
Kisoo Kwon; Jong Won Shin; Nam Soo Kim
This letter presents a speech enhancement technique that combines statistical models and non-negative matrix factorization (NMF) with on-line update of the speech and noise bases. Statistical model-based enhancement methods are known to be less effective against non-stationary noise, while template-based enhancement techniques handle it quite well. However, template-based enhancement techniques usually rely on a priori information. To overcome the shortcomings of both approaches, we propose a novel speech enhancement method that combines the statistical model-based enhancement scheme with the NMF-based gain function. For better performance in time-varying noise environments, both the speech and noise bases of NMF are adapted simultaneously with the help of the estimated speech presence probability. Experimental results showed that the proposed method outperformed not only the statistical model-based and NMF approaches, but also their combination in various noise environments.
International Conference on Acoustics, Speech, and Signal Processing | 2014
Kisoo Kwon; Jong Won Shin; Sukanya Sonowat; In Kyu Choi; Nam Soo Kim
Speech enhancement based on statistical models has shown good performance, but the performance degrades when the environmental noise is highly non-stationary, because these methods assume stationary noise. In contrast, template-based enhancement methods are more robust to non-stationary noise, but they depend heavily on a priori information present in the training data. To overcome both shortcomings, we propose a novel speech enhancement method that combines the statistical model-based enhancement scheme with template-based enhancement. To reduce the dependency on a priori information, the speech and noise bases are updated simultaneously using the estimated speech presence probability, which is obtained from the statistical model-based enhancement. Experimental results showed that the proposed method outperformed not only the statistical model-based and non-negative matrix factorization (NMF) approaches, but also their combination implemented with an existing basis update rule, in various kinds of noise.
Asia Pacific Signal and Information Processing Association Annual Summit and Conference | 2016
Kisoo Kwon; Jong Won Shin; In Kyu Choi; Hyung Yong Kim; Nam Soo Kim
Nonnegative matrix factorization (NMF) is a matrix factorization technique that can find meaningful latent nonnegative components. However, since the objective function is non-convex, the source separation performance can degrade when the iterative update of the basis matrix gets stuck in a poor local minimum. Most existing work updates the basis iteratively from a random initialization to minimize a certain objective function, although a few approaches, such as singular value decomposition, have been proposed for systematic initialization of the basis matrix. In this paper, we propose a novel basis estimation method, similar to the Linde-Buzo-Gray algorithm, inspired by the similarity between basis training and vector quantization. Experiments on audio source separation showed that the proposed method outperformed NMF with random initialization by about 1.64 dB and 1.43 dB in signal-to-distortion ratio when the target sources were speech and violin, respectively.
International Conference on Acoustics, Speech, and Signal Processing | 2016
Kisoo Kwon; Jong Won Shin; Nam Soo Kim
Non-negative matrix factorization (NMF) is an unsupervised technique that represents a nonnegative data matrix as a product of nonnegative basis and encoding matrices. The encoding matrix obtained in the training phase contains information on how each basis vector is utilized. The histogram for each row of this matrix, which corresponds to a specific basis, turned out to be sparse, while the level of sparsity varied significantly from basis to basis. In this paper, the distribution of each component of an encoding vector is modeled as an independent exponential or gamma distribution, and a new objective function with the log-likelihood of the current encoding vector is proposed. Experimental results on audio source separation demonstrate that the utilization of the prior knowledge on the encoding matrix based on sparse statistical models can enhance the source separation performance.
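One way to picture the exponential-prior idea in this abstract is the sketch below. It fits a per-basis exponential rate to each row of a training encoding matrix, then adds the negative log-likelihood of the encoding (a weighted sum of its entries) to a Euclidean NMF cost. The Euclidean cost, the `weight` trade-off parameter, and the function names are assumptions made for illustration. The paper's actual objective and update rules are not given in the abstract and may differ.

```python
import numpy as np

def exponential_rates(H_train, eps=1e-9):
    """Maximum-likelihood rate of an exponential model for each row of H_train.

    For an exponential distribution the ML rate is 1 / sample mean.
    """
    return 1.0 / (H_train.mean(axis=1) + eps)

def encode_with_exponential_prior(V, W, rates, weight=0.1,
                                  n_iter=200, eps=1e-9, seed=0):
    """Estimate H for a fixed basis W under a per-basis exponential prior.

    Assumed objective (hypothetical, not from the paper):
        0.5 * ||V - W H||^2 + weight * sum_k rates[k] * sum_t H[k, t]
    The second term is the negative log-likelihood of H under independent
    exponential priors, up to a constant.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    penalty = weight * rates[:, None]   # larger rate => stronger sparsity
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + penalty + eps)
    return H
```

Because each basis gets its own rate, bases whose training encodings were sparser are penalized more strongly, which mirrors the observation that sparsity varies from basis to basis.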
The Journal of Korean Institute of Communications and Information Sciences | 2012
Shin-Jae Kang; Chang-woo Han; Kisoo Kwon; Nam Soo Kim
This paper presents a decoding technique for speech recognition that uses uncertainty information from a feature compensation method to improve recognition performance in low-SNR conditions. Traditional feature compensation algorithms have difficulty estimating clean feature parameters in adverse environments. These algorithms focus on point estimation of the desired features, which degrades speech recognition performance when incorrectly estimated features enter the decoder. In this paper, we apply the uncertainty information from a well-known feature compensation method, such as IMM, to the recognition engine. The applied technique shows better performance on the Aurora-2 DB.
IEICE Transactions on Information and Systems | 2015
Kisoo Kwon; Jong Won Shin; Nam Soo Kim
Conference of the International Speech Communication Association | 2014
Tae Gyoon Kang; Kisoo Kwon; Jong Won Shin; Nam Soo Kim
The Journal of Korean Institute of Communications and Information Sciences | 2013
Kisoo Kwon; Yu Gwang Jin; Soo Hyun Bae; Nam Soo Kim
Applied Acoustics | 2018
Kisoo Kwon; Jong Won Shin; Nam Soo Kim