Nam In Park
Gwangju Institute of Science and Technology
Publication
Featured research published by Nam In Park.
Sensors | 2011
Nam In Park; Hong Kook Kim; Min A Jung; Seong Ro Lee; Seung Ho Choi
In this paper, a packet loss concealment (PLC) algorithm for CELP-type speech coders is proposed in order to improve the quality of decoded speech under burst packet loss conditions in a wireless sensor network. Conventional receiver-based PLC algorithms in the G.729 speech codec are usually based on speech correlation to reconstruct the decoded speech of lost frames by using parameter information obtained from the previous correctly received frames. However, this approach has difficulty in reconstructing voice onset signals since the parameters such as pitch, linear predictive coding coefficient, and adaptive/fixed codebooks of the previous frames are mostly related to silence frames. Thus, in order to reconstruct speech signals in the voice onset intervals, we propose a multiple codebook-based approach that includes a traditional adaptive codebook and a new random codebook composed of comfort noise. The proposed PLC algorithm is designed as a PLC algorithm for G.729 and its performance is then compared with that of the PLC algorithm currently employed in G.729 via a perceptual evaluation of speech quality, a waveform comparison, and a preference test under different random and burst packet loss conditions. It is shown from the experiments that the proposed PLC algorithm provides significantly better speech quality than the PLC algorithm employed in G.729 under all the test conditions.
IEEE Transactions on Consumer Electronics | 2013
Kwang Myung Jeon; Nam In Park; Hong Kook Kim; Myung Kyu Choi; Kwang Il Hwang
This paper proposes a new non-stationary noise suppression method to reduce the mechanical noise generated when audio signals are recorded with a digital camera. The proposed method first utilizes a non-negative matrix factorization (NMF) technique to estimate the noise spectrum of the mechanical noise from the noisy audio spectrum. After that, the mechanical noise contaminated in the audio signal is suppressed by multi-band spectral subtraction. In particular, the NMF technique estimates the noise spectrum in a frame-wise manner in order for the proposed method to operate in real-time. The performance of the proposed mechanical noise suppression method is evaluated in terms of log-spectral distortion, cepstral distortion, subjective quality, and computational complexity. In addition, it is compared with the performance of conventional methods. It is shown from the evaluation that the proposed method provides lower log-spectral and cepstral distortions with better subjective preference than conventional methods. Moreover, the complexity of the proposed method is low enough to be implemented on a commercially available digital camera.
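The multi-band spectral subtraction step described in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function name, the band edges, the per-band over-subtraction factors, and the spectral floor are all hypothetical, and the noise power spectrum is assumed to be given (in the paper it is estimated frame by frame with NMF).

```python
def multiband_spectral_subtraction(noisy_power, noise_power, band_edges, alphas, floor=0.01):
    """Subtract a scaled noise estimate from one frame, band by band.

    noisy_power, noise_power: per-bin power spectra of equal length.
    band_edges: bin indices delimiting the bands (hypothetical layout).
    alphas: one over-subtraction factor per band (hypothetical values).
    floor: fraction of the noisy power kept as a lower bound per bin.
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("spectra must have the same number of bins")
    if len(alphas) != len(band_edges) - 1:
        raise ValueError("need exactly one alpha per band")

    cleaned = list(noisy_power)
    for band, alpha in enumerate(alphas):
        for k in range(band_edges[band], band_edges[band + 1]):
            residual = noisy_power[k] - alpha * noise_power[k]
            # Clamp to a spectral floor so no bin becomes negative.
            cleaned[k] = max(residual, floor * noisy_power[k])
    return cleaned


# Toy frame with four bins split into two bands.
frame = multiband_spectral_subtraction(
    noisy_power=[4.0, 4.0, 1.0, 1.0],
    noise_power=[1.0, 1.0, 1.0, 1.0],
    band_edges=[0, 2, 4],
    alphas=[2.0, 1.0],
    floor=0.1,
)
```

In the toy frame, the first band keeps half of its power, while the second band, where the noise estimate equals the noisy power, is clamped to the floor.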
International Conference on Consumer Electronics | 2013
Kwang Myung Jeon; Nam In Park; Hong Kook Kim; Myung Kyu Choi; Kwang Il Hwang
This paper proposes a new noise suppression method to reduce zoom noise generated when audio signals are recorded with a digital camera. The proposed method is based on multiband spectral subtraction that can suppress spectral components of noise related to reference zoom-noise in the modified discrete cosine transform domain. In particular, in the proposed method, each frame is classified as either a noise frame or a non-noise frame, and depending on this classification, the reference zoom-noise is updated and the degree of suppression is controlled. It is shown from performance evaluation that noise due to a zooming operation of digital cameras is successfully suppressed while maintaining audio quality.
International Conference on Future Generation Communication and Networking | 2011
Nam In Park; Young Han Lee; Hong Kook Kim
In this paper, an artificial bandwidth extension (ABE) algorithm from narrowband to wideband is proposed in order to improve the quality of narrowband speech. The proposed ABE algorithm is based on spectral band replication in the modified discrete cosine transform (MDCT) domain with no additional bits. In particular, the patch index search for the replication band is restricted so that the harmonic structure of the wideband speech is maintained after ABE. In the proposed ABE algorithm, we first determine whether the current analysis frame of speech signals is voiced or unvoiced. A harmonic spectral replication or a correlation-based replication approach is then applied for the voiced or unvoiced frame, respectively. The proposed ABE algorithm is finally embedded into the G.729 speech decoder as a post-processor. A subjective evaluation using a MUSHRA test shows that the mean opinion score of the wideband speech signals extended by the proposed ABE method is 75.5, which is around 14% higher than that of the narrowband speech signals.
International Conference on Future Generation Communication and Networking | 2010
Nam In Park; Hong Kook Kim; Min A Jung; Seong Ro Lee; Seung Ho Choi
In this paper, a packet loss concealment (PLC) algorithm for CELP-type speech coders is proposed to improve the quality of decoded speech under burst packet loss. A conventional PLC algorithm is usually based on speech correlation to reconstruct the decoded speech of lost frames by using the parameters obtained from the previous frames that are assumed to be correctly received. However, this approach is apt to fail to reconstruct voice onset signals since the parameters such as pitch, LPC coefficients, and adaptive/fixed codebooks of the previous frames are mostly related to silence frames. Thus, in order to reconstruct speech signals in the voice onset intervals, we propose a multiple codebook-based approach that includes a traditional adaptive codebook and a new random codebook composed of comfort noise. The proposed PLC algorithm is designed as a PLC algorithm for G.729 and its performance is then compared with that of the PLC algorithm employed in G.729 by means of perceptual evaluation of speech quality (PESQ), a waveform comparison, and an A-B preference test under different random and burst packet loss conditions. It is shown from the experiments that the proposed PLC algorithm provides significantly better speech quality than the PLC of G.729, especially under burst packet loss and voice onset conditions.
International Journal of Distributed Sensor Networks | 2015
Jin Ah Kang; Nam In Park; Hong Kook Kim; Seong Ro Lee
In this paper, an adaptive speech streaming method is proposed to improve the perceived speech quality (PSQ) of voice over wireless multimedia sensor networks (WMSNs). First of all, the proposed method estimates the PSQ of the received speech data under different network conditions that are represented by the packet loss rates (PLRs). Simultaneously, the proposed method classifies each frame of the speech signal as either an onset or a non-onset frame. Based on the estimated PSQ and the speech class, it determines an appropriate bit rate for the redundant speech data (RSD) that are transmitted with the primary speech data (PSD) to help reconstruct the speech signals of any lost frames. In particular, when the estimated PLR is high, the bit rate of the RSD should be increased by decreasing that of the PSD. Thus, the bandwidth of the PSD is changed from wideband to narrowband, and an artificial bandwidth extension technique is applied to the decoded narrowband speech. It is shown from the simulation that the proposed method significantly improves the decoded speech quality under packet loss conditions in a WMSN, compared to a decoder-based packet loss concealment method and a conventional redundant speech transmission method.
International Journal of Distributed Sensor Networks | 2014
Nam In Park; Jin Ah Kang; Seong Ro Lee; Hong Kook Kim
A packet loss concealment (PLC) algorithm is proposed to improve the quality of decoded speech when packet losses occur in a wireless sensor network. The proposed algorithm is mainly based on artificial bandwidth extension (ABE) from narrowband to wideband. It consists of three main functions: packet loss concealment in the narrowband, ABE in the modified discrete cosine transform (MDCT) domain, and smoothing of wideband MDCT coefficients with those of the last good frame. The performance of the proposed PLC algorithm is evaluated by replacing the PLC algorithm employed in ITU-T Recommendation G.729.1 with the proposed one. The experimental results show that the proposed PLC algorithm provides significantly better speech quality than the PLC in ITU-T G.729.1.
IEEE Transactions on Consumer Electronics | 2013
Kwang Myung Jeon; Nam In Park; Hong Kook Kim; Myung Kyu Choi; Kwang Il Hwang
In this paper, an audio denoising method is proposed for improving the quality of handheld audio recording devices. The proposed method reduces noise differently depending on the block size in the modified discrete cosine transform (MDCT) analysis of an audio coder. Specifically, denoising for a long block is performed by multi-band spectral subtraction (MBSS) with perceptually weighted scalefactor bands, while that for a short block is performed by subband power scaling to maintain coherence of power with the previously-denoised long block. In order to evaluate the performance of the proposed method, it is first embedded into MPEG-2 advanced audio coding (AAC) that is popularly used for audio recording devices. Then, its performance is compared with that of a conventional audio denoising method based on block thresholding in terms of cepstral distortion, subjective quality, and computational complexity. It is shown from performance comparison that the proposed method outperforms the block thresholding method in both objective and subjective measurements. Moreover, the complexity of the proposed method is sufficiently lowered to be implemented on most resource-constrained handheld audio recording devices, unlike the conventional method.
FGIT-SIP/MulGraB | 2010
Nam In Park; Seon Man Kim; Hong Kook Kim; Ji Woon Kim; Myeong Bo Kim; Su Won Yun
In this paper, we propose a video-zoom driven audio-zoom algorithm in order to provide audio zooming effects in accordance with the degree of video-zoom. The proposed algorithm is designed based on a super-directive beamformer operating with a 4-channel microphone system, in conjunction with a soft masking process that considers the phase differences between microphones. Thus, the audio-zoom processed signal is obtained by multiplying an audio gain derived from a video-zoom level by the masked signal. Finally, a real-time audio-zoom system is implemented on an ARM Cortex-A8 with a clock speed of 600 MHz after optimization at several levels, including algorithmic, C-code, and memory optimizations. To evaluate the complexity of the proposed real-time audio-zoom system, 21.3 seconds of test data sampled at 48 kHz is used. The experiments show that the processing time for the proposed audio-zoom system occupies 14.6% or less of the ARM clock cycles. It is also shown from experiments performed in a semi-anechoic chamber that the signal from the front direction can be amplified by approximately 10 dB compared to the other directions.
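The complexity figures quoted in this abstract can be turned into an approximate processing time. The sketch below assumes that the 14.6% figure is the fraction of the clock cycles available while the 21.3-second clip plays on the 600 MHz core, which the abstract does not state explicitly. The function name is hypothetical.

```python
def processing_budget(audio_seconds, clock_hz, load_fraction):
    """Return (cycles used, busy time in seconds) for a given CPU load.

    Assumes load_fraction is measured against the cycles available
    during the playback duration of the audio clip.
    """
    total_cycles = audio_seconds * clock_hz
    used_cycles = load_fraction * total_cycles
    return used_cycles, used_cycles / clock_hz


# Figures from the abstract: 21.3 s of audio, 600 MHz clock, 14.6% load.
used_cycles, busy_seconds = processing_budget(21.3, 600e6, 0.146)
```

Under that assumption, the upper bound corresponds to roughly 1.87 billion cycles, or about 3.1 seconds of processing for the 21.3-second clip.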
International Conference on Signal Processing | 2014
Seok Hee Jeong; Chan Jun Chun; Nam In Park; Hong Kook Kim
In this paper, we address the issue of the real-time implementation of an artificial stereo audio extension method that provides stereophonic audio from a mono audio source. First, the method is programmed in floating-point arithmetic. Then, the computationally complex parts of the floating-point operations are converted into fixed-point operations to allow them to operate in real time. The performance of the real-time implementation is evaluated by measuring the processing time and the root mean square (RMS) value. Consequently, it is shown that partial fixed-point arithmetic programming reduces the processing time by 48.7%, with an RMS value of -90.50 dB, compared to the floating-point arithmetic programming.
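The kind of comparison described here can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a Q15 fixed-point format, a single gain multiplication as the converted operation, and an RMS value defined as the RMS of the difference between the floating-point and fixed-point outputs relative to full scale. The abstract specifies none of these choices, and all names are hypothetical.

```python
import math

Q15_SCALE = 1 << 15  # Q15: 1 sign bit, 15 fractional bits (assumed format)


def to_q15(x):
    """Quantize a float in [-1, 1) to a saturated Q15 integer."""
    return max(-Q15_SCALE, min(Q15_SCALE - 1, int(round(x * Q15_SCALE))))


def q15_mul(a, b):
    """Multiply two Q15 integers and rescale the product back to Q15."""
    return (a * b) >> 15


def rms_db(values):
    """RMS level of a sequence in dB relative to full scale (1.0)."""
    mean_square = sum(v * v for v in values) / len(values)
    return 10.0 * math.log10(mean_square) if mean_square > 0 else float("-inf")


# Toy signal: a half-scale sine wave scaled by a constant gain.
signal = [0.5 * math.sin(2.0 * math.pi * k / 32) for k in range(256)]
gain = 0.7

float_out = [gain * s for s in signal]
fixed_out = [q15_mul(to_q15(gain), to_q15(s)) / Q15_SCALE for s in signal]

# RMS of the fixed-point error relative to the floating-point output.
error_db = rms_db([f - x for f, x in zip(float_out, fixed_out)])
```

A wider accumulator or rounding instead of truncation in `q15_mul` would lower the error further, at some cost in cycles; the trade-off between processing time and RMS error is the one the abstract reports.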