Network


Latest external collaborations at the country level.

Hotspot


Research topics in which Inseok Heo is active.

Publication


Featured research published by Inseok Heo.


Journal of the Acoustical Society of America | 2011

Automated detection of alarm sounds

Robert A. Lutfi; Inseok Heo

Two approaches to the automated detection of alarm sounds are compared, one based on a change in overall sound level (RMS), the other on a change in periodicity, as given by the power of the normalized autocorrelation function (PNA). Receiver operating characteristics in each case were obtained for different exemplars of four classes of alarm sounds (bells/chimes, buzzers/beepers, horns/whistles, and sirens) embedded in four noise backgrounds (cafeteria, park, traffic, and music). The results suggest that PNA combined with RMS may be used to improve current alarm-sound alerting technologies for the hard-of-hearing.
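
A minimal sketch of the two detection statistics named above, assuming frame-based processing; the frame length, background tracking, and the exact normalization used in the study are not specified here, and summing squared autocorrelation values at nonzero lags is one plausible reading of "power of the normalized autocorrelation function."

```python
import numpy as np

def rms_level(frame):
    """Overall sound level of a frame (RMS)."""
    return np.sqrt(np.mean(frame ** 2))

def pna(frame):
    """Power of the normalized autocorrelation function (PNA).

    The autocorrelation is normalized by its zero-lag value, so a
    strongly periodic frame yields values near 1 at the period lag
    and hence a large summed power at nonzero lags.
    """
    x = frame - np.mean(frame)
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    ac = ac / (ac[0] + 1e-12)      # normalize by zero-lag power
    return np.sum(ac[1:] ** 2)     # power in the nonzero-lag terms

# Hypothetical usage: flag a frame as a possible alarm when either
# statistic jumps relative to a running background estimate.
```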


Journal of the Acoustical Society of America | 2014

Factors affecting auditory streaming of random tone sequences

An-Chieh Chang; Inseok Heo; Jungmee Lee; Christophe N. J. Stoelinga; Robert A. Lutfi

As the frequency separation of the A and B tones in an ABAABA tone sequence increases, the tones are heard to split into separate auditory streams (the fission threshold). The phenomenon is identified with our ability to ‘hear out’ individual sound sources in natural, multisource acoustic environments. One important difference, however, between natural sounds and the tone sequences used in most streaming studies is that natural sounds often vary unpredictably from one moment to the next. In the present study, fission thresholds were measured for ABAABA tone sequences made more or less predictable by sampling the frequencies, levels, or durations of the tones at random from normal distributions having different values of sigma (0–800 cents, 0–8 dB, and 0–40 ms, respectively, for frequency, level, and duration). Frequency variation on average had the greatest effect on threshold, but the function relating threshold to sigma was non-monotonic, first increasing then decreasing for the largest value of sigma. Difference...
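
A sketch of how such a randomized sequence might be generated, assuming the jittered values are drawn independently for each tone; the base frequency and separation below are illustrative values, not those of the study.

```python
import numpy as np

rng = np.random.default_rng(0)

def abaaba_frequencies(f_a=500.0, delta_cents=600.0, sigma_cents=200.0,
                       n_triplets=10):
    """Per-tone frequencies (Hz) for an ABA-ABA-... sequence, with
    frequency jitter drawn from a normal distribution of standard
    deviation sigma_cents applied independently to every tone."""
    f_b = f_a * 2 ** (delta_cents / 1200)      # B sits delta_cents above A
    pattern = ["A", "B", "A"] * n_triplets
    nominal = np.array([f_a if t == "A" else f_b for t in pattern])
    jitter = rng.normal(0.0, sigma_cents, size=nominal.size)
    return nominal * 2 ** (jitter / 1200)      # jitter applied in cents

freqs = abaaba_frequencies()
```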


Trends in hearing | 2016

A Detection-Theoretic Analysis of Auditory Streaming and Its Relation to Auditory Masking

An-Chieh Chang; Robert A. Lutfi; Jungmee Lee; Inseok Heo

Research on hearing has long been challenged with understanding our exceptional ability to hear out individual sounds in a mixture (the so-called cocktail party problem). Two general approaches to the problem have been taken using sequences of tones as stimuli. The first has focused on our tendency to hear sequences, sufficiently separated in frequency, split into separate cohesive streams (auditory streaming). The second has focused on our ability to detect a change in one sequence, ignoring all others (auditory masking). The two phenomena are clearly related, but that relation has never been evaluated analytically. This article offers a detection-theoretic analysis of the relation between multitone streaming and masking that underscores the expected similarities and differences between these phenomena and the predicted outcome of experiments in each case. The key to establishing this relation is the function linking performance to the information divergence of the tone sequences, D_KL (a measure of the statistical separation of their parameters). A strong prediction is that streaming and masking of tones will be a common function of D_KL provided that the statistical properties of sequences are symmetric. Results of experiments are reported supporting this prediction.
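
As an illustration of the divergence measure invoked above, here is the closed-form Kullback-Leibler divergence between two univariate normal distributions, a plausible parameterization for tone sequences whose parameters are normally distributed; the paper's exact formulation may differ.

```python
import numpy as np

def kl_normal(mu0, sigma0, mu1, sigma1):
    """D_KL( N(mu0, sigma0^2) || N(mu1, sigma1^2) ), in nats."""
    return (np.log(sigma1 / sigma0)
            + (sigma0 ** 2 + (mu0 - mu1) ** 2) / (2 * sigma1 ** 2)
            - 0.5)

# Example: two sequences whose tone frequencies (in cents) differ in
# mean by 200 cents with a common spread of 50 cents.
print(kl_normal(mu0=0.0, sigma0=50.0, mu1=200.0, sigma1=50.0))  # 8.0
```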


Advances in Experimental Medicine and Biology | 2016

Individual Differences in Behavioural Decision Weights Related to Irregularities in Cochlear Mechanics

Jungmee Lee; Inseok Heo; An-Chieh Chang; Kristen Bond; Christophe N. J. Stoelinga; Robert A. Lutfi; Glenis R. Long

An unexpected finding of previous psychophysical studies is that listeners show highly replicable, individualistic patterns of decision weights on frequencies affecting their performance in spectral discrimination tasks (what has been referred to as individual listening styles). We, like many other researchers, have attributed these listening styles to peculiarities in how listeners attend to sounds, but we now believe they partially reflect irregularities in cochlear micromechanics modifying what listeners hear. The most striking evidence for cochlear irregularities is the presence of low-level spontaneous otoacoustic emissions (SOAEs) measured in the ear canal and the systematic variation in stimulus-frequency otoacoustic emissions (SFOAEs), both of which result from back-propagation of waves in the cochlea. SOAEs and SFOAEs vary greatly across individual ears and have been shown to affect behavioural thresholds, behavioural frequency selectivity, and judged loudness for tones. The present paper reports pilot data providing evidence that SOAEs and SFOAEs are also predictive of the relative decision weight listeners give to a pair of tones in a level discrimination task. In one condition the frequency of one tone was selected to be near that of an SOAE and the frequency of the other was selected to be in a frequency region for which there was no detectable SOAE. In a second condition the frequency of one tone was selected to correspond to an SFOAE maximum and the frequency of the other tone to an SFOAE minimum. In both conditions a statistically significant correlation was found between the average relative decision weight on the two tones and the difference in OAE levels.
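
The reported analysis reduces to correlating, across listeners, the relative decision weight on the two tones with the difference in OAE levels at their frequencies. A sketch of that computation follows, with made-up arrays standing in for the pilot data, which are not reproduced here.

```python
import numpy as np
from scipy import stats

# Hypothetical per-listener values for illustration only.
relative_weight = np.array([0.72, 0.55, 0.63, 0.48, 0.69, 0.51])  # weight on tone near SOAE
oae_level_diff = np.array([9.5, 2.1, 6.0, -1.2, 8.3, 0.4])        # dB, SOAE region minus control

r, p = stats.pearsonr(oae_level_diff, relative_weight)
print(f"r = {r:.2f}, p = {p:.3f}")
```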


MECHANICS OF HEARING: PROTEIN TO PERCEPTION: Proceedings of the 12th International Workshop on the Mechanics of Hearing | 2015

Possible role of cochlear nonlinearity in the detection of mistuning of a harmonic component in a harmonic complex

Christophe N. J. Stoelinga; Inseok Heo; Glenis R. Long; Jungmee Lee; Robert A. Lutfi; An-Chieh Chang

The human auditory system has a remarkable ability to “hear out” a wanted sound (target) in the background of unwanted sounds. One important property of sound which helps us hear out the target is inharmonicity. When a single harmonic component of a harmonic complex is slightly mistuned, that component is heard to separate from the rest. At high harmonic numbers, where components are unresolved, the harmonic segregation effect is thought to result from detection of modulation of the time envelope (roughness cue) resulting from the mistuning. Neurophysiological research provides evidence that such envelope modulations are represented early in the auditory system, at the level of the auditory nerve. When the mistuned harmonic is a low harmonic, where components are resolved, the harmonic segregation is attributed to more centrally-located auditory processes, leading the harmonic components to form a perceptual group heard separately from the mistuned component. Here we consider an alternative explanation that a...
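
A sketch of the stimulus class discussed above: a harmonic complex in which a single component is mistuned by a few percent. The sampling rate, number of harmonics, and mistuning amount are illustrative assumptions, not the study's parameters.

```python
import numpy as np

FS = 44100  # assumed sampling rate (Hz)

def complex_tone(f0=200.0, n_harmonics=12, mistuned_harmonic=None,
                 mistuning_pct=3.0, dur=0.5):
    """Harmonic complex with one optionally mistuned component."""
    t = np.arange(int(FS * dur)) / FS
    x = np.zeros_like(t)
    for k in range(1, n_harmonics + 1):
        f = k * f0
        if k == mistuned_harmonic:
            f *= 1 + mistuning_pct / 100   # shift this component off-harmonic
        x += np.sin(2 * np.pi * f * t)
    return x / n_harmonics

reference = complex_tone()                     # fully harmonic
probe = complex_tone(mistuned_harmonic=4)      # 4th harmonic mistuned by 3%
```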


Journal of the Acoustical Society of America | 2015

Cochlear fine structure predicts behavioral decision weights in a multitone level discrimination task

Jungmee M. Lee; Glenis R. Long; Inseok Heo; Christophe N. J. Stoelinga; Robert A. Lutfi

Listeners show highly replicable, idiosyncratic patterns of decision weights across frequency affecting their performance in multitone level discrimination tasks. The different patterns are attributed to peculiarities in how listeners attend to sounds. However, evidence is presented in the current study that they reflect individual differences in cochlear micromechanics, which can be evaluated using otoacoustic emissions (OAEs). Spontaneous OAEs (SOAEs) and the fine structure of stimulus-frequency OAEs (SFOAEs) were measured in a group of normal-hearing listeners. The same group of listeners performed a two-tone, sample-level discrimination task wherein the frequency of one tone was selected to correspond to an SOAE and the other was selected well away from any SOAE. Tone levels were either 50 or 30 dB SPL. The relative decision weight of the two tones for each listener and condition was estimated from a standard COSS analysis of the trial-by-trial data [Berg (1989), J. Acoust. Soc. Am. 86, 1743–1746]. A st...
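
A common way to estimate relative decision weights from trial-by-trial data is to relate the listener's binary responses to the per-tone level perturbations. The correlation-based sketch below is a simple stand-in for the COSS analysis cited above, not a reimplementation of it.

```python
import numpy as np

def relative_weights(levels, responses):
    """Estimate per-tone decision weights from trial-by-trial data.

    levels:    (n_trials, n_tones) per-tone level perturbations (dB)
    responses: (n_trials,) binary decisions (0/1)

    Correlates each tone's level with the response, then normalizes,
    so the result sums to 1 in absolute value.
    """
    r = np.array([np.corrcoef(levels[:, k], responses)[0, 1]
                  for k in range(levels.shape[1])])
    return r / np.sum(np.abs(r))

# Simulated observer weighting tone 0 twice as heavily as tone 1.
rng = np.random.default_rng(1)
lv = rng.normal(0, 2, size=(2000, 2))
resp = (2 * lv[:, 0] + lv[:, 1] + rng.normal(0, 1, 2000) > 0).astype(int)
print(relative_weights(lv, resp))   # approximately [0.67, 0.33]
```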


IEEE Transactions on Audio, Speech, and Language Processing | 2015

Classification based on speech rhythm via a temporal alignment of spoken sentences

Inseok Heo; William A. Sethares

How much information is contained in the rhythm of speech? Is it possible to tell, just from the rhythm of the speech, whether the speaker is male or female? Is it possible to tell whether the speaker is native or nonnative? This paper provides a new way to address such questions. Traditional investigations into speech rhythm approach the problem by manually annotating the speech and investigating a preselected collection of features such as the durations of vowels or inter-phoneme timings. This paper presents a method that can automatically align the audio of multiple people speaking the same sentence. The output of the alignment procedure is a mapping (from the micro-timing of one speaker to that of another) that can be used as a surrogate for speech rhythm. The method is applied to a large online corpus of speakers and shows that it is possible to classify the speakers based on these mappings alone. Several technical aspects are discussed. First, the spectrograms switch between different-length analysis windows (based on whether the speech is voiced or unvoiced) to ameliorate the time-frequency trade-off. These variable-window spectrograms are fed into a dynamic time warping algorithm to produce a timing map which represents the speech rhythm. The accuracy of the alignment is evaluated by a technique of transitive validation, and the timing maps are used to form a feature vector for the classification. The method is applied to the online Speech Accent Archive corpus. In the gender discrimination experiments, the proposed method was only about 5% worse than a state-of-the-art classifier based on spectral feature vectors. In the native-speaker discrimination task, classification based on speech rhythm was about 15% better than classification using spectral information.
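
A compact dynamic time warping sketch over spectrogram frames, illustrating the kind of timing map the paper derives; the variable-window spectrogram and the transitive-validation step are not reproduced here, and the Euclidean frame distance is an assumption.

```python
import numpy as np

def dtw_path(A, B):
    """Dynamic time warping between two feature sequences.

    A: (m, d) and B: (n, d) arrays of spectral frames.
    Returns the warping path as (i, j) index pairs; the path acts as
    the timing map from one utterance to the other.
    """
    m, n = len(A), len(B)
    cost = np.full((m + 1, n + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d = np.linalg.norm(A[i - 1] - B[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1],
                                 cost[i - 1, j],
                                 cost[i, j - 1])
    # Backtrack from (m, n) to recover the alignment.
    path, i, j = [], m, n
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1],
                              cost[i - 1, j],
                              cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]
```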


Journal of the Acoustical Society of America | 2013

Towards a model of informational masking: The Simpson-Fitter metric of target-masker separation

Lynn Gilbertson; An-Chieh Chang; Jacob Stamas; Inseok Heo; Robert A. Lutfi

Informational masking (IM) is the term used to describe masking that appears to have its origin at some central level of the auditory nervous system beyond the cochlea. Supporting a central origin are the two major factors associated with IM: trial-by-trial uncertainty regarding the masker and perceived similarity of target and masker. Here preliminary evidence is provided suggesting these factors exert their influence through a single critical determinant of IM, the stochastic separation of target and masker given by the Simpson-Fitter d_a [Lutfi et al. (2012), J. Acoust. Soc. Am. 132, EL109–113]. Target and maskers were alternating sequences of tones or words with frequencies (F0s for words) selected at random on each presentation. The listener's task was to discriminate a frequency difference in the target tones or to identify the target words. Performance in both tasks was found to be constant across conditions in which the mean difference (similarity), variance (uncertainty), or covariance (similarity) of ta...
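
For normally distributed target and masker parameters, the separation metric named above has a standard closed form: the mean difference scaled by the root-mean-square of the two standard deviations. A minimal sketch under that assumption (the study's multivariate generalization is not reproduced):

```python
import numpy as np

def d_a(mu_t, sigma_t, mu_m, sigma_m):
    """Simpson-Fitter d_a for two normal distributions: mean separation
    of target and masker scaled by their RMS standard deviation."""
    return (mu_t - mu_m) / np.sqrt((sigma_t ** 2 + sigma_m ** 2) / 2)

# Example: target and masker frequencies 100 Hz apart, both with 50 Hz spread.
print(d_a(mu_t=1000.0, sigma_t=50.0, mu_m=900.0, sigma_m=50.0))  # 2.0
```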


Journal of the Acoustical Society of America | 2013

Differences in masked localization of speech and noise

Inseok Heo; Lynn Gilbertson; An-Chieh Chang; Jacob Stamas; Robert A. Lutfi

Dichotic masking studies using noise are commonly referenced in regard to their implications for “cocktail party listening,” wherein target and maskers are speech. In the present study, masker decision weights (MDWs) are reported suggesting that speech and noise are processed differently in dichotic masking. The stimuli were words or Gaussian-noise bursts played in sequence as masker-target-masker triads. The apparent location of the words (noise bursts), from left to right, was varied independently and at random on each presentation using KEMAR HRTFs. In the two-interval, forced-choice procedure, listeners were instructed to identify whether the second-interval target was to the left or right of the first. For wide spatial separations between target and masker, noise-MDWs were typically negative, indicating that target location was judged relative to the masker. For small spatial separations between target and masker, noise-MDWs were typically positive, suggesting that target location was more often confused with t...
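
A sketch of how masker decision weights might be estimated from such trial-by-trial data, regressing the left/right judgment on the azimuths of the target and the two maskers. The linear-probability fit and sign conventions here are assumptions, standing in for whatever model the study used.

```python
import numpy as np

def masker_decision_weights(target_az, masker1_az, masker2_az, responses):
    """Regress binary right/left judgments on stimulus azimuths.

    Azimuths in degrees (right positive); responses are 1 when the
    listener reported the target to the right, else 0. A negative
    masker weight indicates the target was judged relative to that
    masker rather than independently of it.
    """
    X = np.column_stack([target_az, masker1_az, masker2_az,
                         np.ones_like(target_az)])
    beta, *_ = np.linalg.lstsq(X, responses, rcond=None)
    return beta[:3]   # weights on target, masker 1, masker 2
```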


Journal of the Acoustical Society of America | 2008

Psychoacoustic measures of blind audio source separation performance

Mingu Lee; Inseok Heo; Nakjin Choi; Koeng-Mo Sung

In this paper, an improved method for evaluating the performance of blind audio source separation (BASS) is discussed. In previous studies, such as E. Vincent et al., IEEE Transactions on Speech and Audio Processing, 2006, several methods are introduced for measuring the quality of BASS algorithms, e.g., the source-to-distortion ratio (SDR), source-to-interferences ratio (SIR), sources-to-noise ratio (SNR), and sources-to-artifacts ratio (SAR). However, those methods do not take the human auditory system into consideration. An improved method is developed by applying preprocessing and using a weighted inner product in the frequency domain instead of a simple inner product in the time domain. The proposed method incorporates well-known psychoacoustic characteristics, e.g., masking effects and equal-loudness contours. In comparison with the conventional quality measures, the proposed method shows better correlation with the results of carefully designed listening tests.
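
For reference, the baseline SDR from the Vincent et al. framework cited above, given a known decomposition of the estimated source into target, interference, noise, and artifact terms. The decomposition itself (the inner-product projections) is omitted, and the perceptual weighting proposed in this paper is not reproduced.

```python
import numpy as np

def sdr(s_target, e_interf, e_noise, e_artif):
    """Source-to-distortion ratio (dB) for the decomposition
    s_hat = s_target + e_interf + e_noise + e_artif."""
    distortion = e_interf + e_noise + e_artif
    return 10 * np.log10(np.sum(s_target ** 2) / np.sum(distortion ** 2))
```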

Collaboration


Inseok Heo's most frequent collaborators.

Top Co-Authors

Robert A. Lutfi, University of Wisconsin-Madison
An-Chieh Chang, University of Wisconsin-Madison
Jungmee Lee, Northwestern University
Lynn Gilbertson, University of Wisconsin-Madison
Glenis R. Long, City University of New York
Koeng-Mo Sung, Seoul National University
Mingu Lee, Seoul National University
Nakjin Choi, Seoul National University
Kristen Bond, University of Wisconsin-Madison