Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Tamás Bohm is active.

Publication


Featured research published by Tamás Bohm.


Philosophical Transactions of the Royal Society B | 2012

Multistability in auditory stream segregation: a predictive coding view

István Winkler; Susan L. Denham; Robert Mill; Tamás Bohm; Alexandra Bendixen

Auditory stream segregation involves linking temporally separate acoustic events into one or more coherent sequences. For any non-trivial sequence of sounds, many alternative descriptions can be formed, only one or very few of which emerge in awareness at any time. Evidence from studies showing bi-/multistability in auditory streaming suggests that some, perhaps many, of the alternative descriptions are represented in the brain in parallel and that they continuously vie for conscious perception. Here, based on a predictive coding view, we consider the nature of these sound representations and how they compete with each other. Predictive processing helps to maintain perceptual stability by signalling the continuation of previously established patterns as well as the emergence of new sound sources. It also provides a measure of how well each of the competing representations describes the current acoustic scene. This account of auditory stream segregation has been tested on perceptual data obtained in the auditory streaming paradigm.
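The competition described here can be illustrated with a deliberately minimal sketch. The Python below is my own toy construction, not the authors' model: two hypothetical predictors score how well they anticipate each incoming tone, and the one with the most accumulated evidence stands in for the representation that currently dominates awareness.

```python
# Toy sketch (not the authors' model): two competing predictive
# representations of a tone sequence. Each one predicts the next tone;
# leaky evidence grows with prediction success and shrinks with failure,
# and the best-supported representation "dominates awareness".

def predict_alternate(history):
    """Expect strict A/B alternation (e.g., one integrated stream)."""
    return "B" if history and history[-1] == "A" else "A"

def predict_repeat(history):
    """Expect the previous tone to continue (a within-stream repetition)."""
    return history[-1] if history else "A"

candidates = {"alternating": predict_alternate, "repeating": predict_repeat}
evidence = {name: 0.0 for name in candidates}

history = []
for tone in "ABABABAAAA":  # the underlying regularity changes midway
    for name, predict in candidates.items():
        hit = predict(history) == tone
        evidence[name] = 0.9 * evidence[name] + (1.0 if hit else -1.0)
    history.append(tone)
    print(f"heard {tone}: dominant = {max(evidence, key=evidence.get)}")
```

When the regularity changes midway, the dominant representation switches, which is the sense in which prediction quality arbitrates the competition.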


Frontiers in Neuroscience | 2014

Stable individual characteristics in the perception of multiple embedded patterns in multistable auditory stimuli

Susan L. Denham; Tamás Bohm; Alexandra Bendixen; Orsolya Szalárdy; Zsuzsanna Kocsis; Robert Mill; István Winkler

The ability of the auditory system to parse complex scenes into component objects in order to extract information from the environment is very robust, yet the processing principles underlying this ability are still not well understood. This study was designed to investigate the proposal that the auditory system constructs multiple interpretations of the acoustic scene in parallel, based on the finding that, when listening to a long repetitive sequence, listeners report switching between different perceptual organizations. Using the “ABA-” auditory streaming paradigm, we trained listeners until they could reliably recognize all possible embedded patterns of length four which could in principle be extracted from the sequence, and in a series of test sessions we investigated their spontaneous reports of those patterns. With the training allowing them to identify and mark a wider variety of possible patterns, participants spontaneously reported many more patterns than the ones traditionally assumed (Integrated vs. Segregated). Despite receiving consistent training and despite the apparent randomness of perceptual switching, we found that individual switching patterns were idiosyncratic; i.e., the perceptual switching patterns of each participant were more similar to their own switching patterns in different sessions than to those of other participants. These individual differences were preserved even between test sessions held a year after the initial experiment. Our results support the idea that the auditory system attempts to extract an exhaustive set of embedded patterns which can be used to generate expectations of future events and which, by competing for dominance, give rise to (changing) perceptual awareness, with the characteristics of pattern discovery and perceptual competition having a strong idiosyncratic component. Perceptual multistability thus provides a means for characterizing both general mechanisms and individual differences in human perception.
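To make "all possible embedded patterns of length four" concrete, here is a small enumeration sketch. The construction is my own simplification of the paradigm, not the study's stimulus code (the actual pattern set may be defined differently, for instance by allowing patterns to span cycle boundaries): every non-empty subset of positions in one ABA- cycle defines a candidate pattern, with unattended positions heard as silence.

```python
# Hedged illustration: enumerate candidate length-four patterns embedded
# in the repeating "ABA-" streaming cycle by keeping a subset of cycle
# positions and replacing the rest with silence "-".
from itertools import combinations

cycle = "ABA-"
positions = range(len(cycle))
patterns = set()
for k in range(1, len(cycle) + 1):
    for keep in combinations(positions, k):
        pattern = "".join(cycle[i] if i in keep else "-" for i in positions)
        if any(c != "-" for c in pattern):  # skip the all-silent pattern
            patterns.add(pattern)

print(sorted(patterns))
# Includes the classic organizations "A-A-" (segregated A stream),
# "-B--" (segregated B stream) and "ABA-" (integrated), alongside the
# additional embedded patterns that trained listeners could report.
```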


Journal of the Acoustical Society of America | 2014

The effects of rhythm and melody on auditory stream segregation

Orsolya Szalárdy; Alexandra Bendixen; Tamás Bohm; Lucy A. Davies; Susan L. Denham; István Winkler

While many studies have assessed the efficacy of similarity-based cues for auditory stream segregation, much less is known about whether and how the larger-scale structure of sound sequences supports stream formation and the choice of sound organization. Two experiments investigated the effects of musical melody and rhythm on the segregation of two interleaved tone sequences. The two sets of tones fully overlapped in pitch range but differed from each other in interaural time and intensity. Unbeknownst to the listeners, each of the interleaved sequences was created from the notes of a different song. In different experimental conditions, the notes and/or their timing could either follow those of the songs or be scrambled or, in the case of timing, set to be isochronous. Listeners were asked to continuously report whether they heard a single coherent sequence (integrated) or two concurrent streams (segregated). Although temporal overlap between tones from the two streams proved to be the strongest cue for stream segregation, significant effects of tonality and familiarity with the songs were also observed. These results suggest that regular temporal patterns are utilized as cues in auditory stream segregation and that long-term memory is involved in this process.
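A minimal sketch of the condition manipulations described above, using hypothetical note data in place of the two songs (the real materials, the interaural time and intensity differences, and the exact timing grid are assumptions here):

```python
# Sketch of the stimulus manipulations: each melody's notes and/or
# timing can follow the source song, be scrambled, or (for timing)
# be forced onto an isochronous grid; the two melodies are then
# interleaved tone by tone.
import random

def make_condition(notes, iois, scramble_notes=False, isochronous=False):
    """Return (notes, inter-onset intervals) for one manipulated melody."""
    notes = random.sample(notes, len(notes)) if scramble_notes else list(notes)
    iois = [0.25] * len(iois) if isochronous else list(iois)  # 250 ms grid
    return notes, iois

# Hypothetical excerpts standing in for the two songs (MIDI note numbers).
song_a = ([60, 62, 64, 65, 67, 65, 64, 62],
          [0.25, 0.25, 0.5, 0.25, 0.25, 0.25, 0.25, 0.5])
song_b = ([67, 64, 60, 64, 67, 72, 67, 64],
          [0.5, 0.25, 0.25, 0.25, 0.25, 0.5, 0.25, 0.25])

a_notes, _ = make_condition(*song_a, scramble_notes=False, isochronous=True)
b_notes, _ = make_condition(*song_b, scramble_notes=True, isochronous=True)

# Interleave tone by tone: A1, B1, A2, B2, ... (the interaural
# differences between the two sets are not modeled here).
interleaved = [tone for pair in zip(a_notes, b_notes) for tone in pair]
print(interleaved)
```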


Conference on Information Sciences and Systems | 2011

CHAINS: Competition and cooperation between fragmentary event predictors in a Model of Auditory Scene Analysis

Robert Mill; Tamás Bohm; Alexandra Bendixen; István Winkler; Susan L. Denham

This paper presents an algorithm called Chains for separating temporal patterns of events that are mixed together. The algorithm is motivated by the task the auditory system faces when it attempts to analyse an acoustic mixture to determine the sources that contribute to it, and in particular, sources that emit regular sequences. The task is complicated by the fact that a mixture can be interpreted in several ways. For example, a complex pattern may issue from a complex source; or, alternatively, it may arise from the interaction of many simple sources. The idea pursued here is that the brain attempts to account for an incoming sequence in terms of short, fragmentary sequences, called chains. Chains are built as the input arrives and, once built, are used to predict inputs. A group of chains can coalesce to form an organisation, in which the member chains alternately generate predictions. A chain fails upon making an incorrect prediction, and any organisation it belongs to collapses. Several incompatible organisations can exist in parallel. The Chains algorithm thus remains open to multiple interpretations of a sequence. Perceptual multistability, in which the perceptual experience of an ambiguous stimulus switches spontaneously from one interpretation to another, seems to require a similar flexibility of representation.
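The build-predict-fail cycle described here compresses into a short sketch. The following Python is a paraphrase of the idea, not the published CHAINS implementation: an organisation is a group of chains that alternately predict the next event, a single wrong prediction collapses the organisation, and mutually incompatible organisations survive in parallel as long as each keeps predicting correctly.

```python
# Compressed paraphrase of the CHAINS scheme (not the published code):
# chains are fragmentary periodic predictors; member chains of an
# organisation take turns predicting the next event, and one incorrect
# prediction collapses the whole organisation.

class Chain:
    def __init__(self, pattern):
        self.pattern = pattern  # fragmentary event sequence, e.g. "AB"
        self.pos = 0            # index of the next predicted event

    def predict(self):
        return self.pattern[self.pos]

    def advance(self):
        self.pos = (self.pos + 1) % len(self.pattern)

class Organisation:
    def __init__(self, chains):
        self.chains = chains    # member chains alternate predictions
        self.turn = 0
        self.alive = True

    def step(self, event):
        chain = self.chains[self.turn % len(self.chains)]
        if chain.predict() != event:  # one wrong prediction kills the group
            self.alive = False
        else:
            chain.advance()
            self.turn += 1

# Two incompatible interpretations of the mixture "ABABAB...":
# one complex source versus two simple sources taking turns.
orgs = [
    Organisation([Chain("AB")]),             # a single source emitting "AB"
    Organisation([Chain("A"), Chain("B")]),  # two alternating simple sources
]
for event in "ABABABA":
    for org in orgs:
        if org.alive:
            org.step(event)
print([org.alive for org in orgs])  # both survive: the input stays ambiguous
```

That both organisations remain alive on an ambiguous input is the point: the algorithm stays open to multiple interpretations, mirroring perceptual multistability.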


Conference on Information Sciences and Systems | 2011

A multimodal-corpus data collection system for cognitive acoustic scene analysis

Julius Georgiou; Philippe O. Pouliquen; Andrew S. Cassidy; Guillaume Garreau; Charalambos M. Andreou; Guillermo Stuarts; Cyrille d'Urbal; Andreas G. Andreou; Susan L. Denham; Thomas Wennekers; Robert Mill; István Winkler; Tamás Bohm; Orsolya Szalárdy; Georg M. Klump; Simon J. Jones; Alexandra Bendixen

We report on the design and the collection of a multi-modal data corpus for cognitive acoustic scene analysis. Sounds are generated by stationary and moving sources (people), that is, by omni-directional speakers mounted on people's heads. One or two subjects walk along predetermined systematic and random paths, in synchrony and out of sync. Sound is captured by multiple microphone systems, including a directional array of four MEMS microphones, two electret microphones situated in the ears of a stuffed gerbil head, and a Head Acoustics head-and-shoulder unit with ICP microphones. Three micro-Doppler units operating at different frequencies were employed to capture the gait and articulatory signatures as well as the locations of the people in the scene. Three ground-vibration sensors recorded the footsteps of the walking people. A 3D MESA camera and a webcam provided 2D and 3D visual data for system calibration and ground truth. Data were collected in three environments: a well-controlled environment (an anechoic chamber), an indoor environment (a large classroom), and the natural environment of an outdoor courtyard. A software tool has been developed for browsing and visualizing the data.


Phonetica | 2009

Do Listeners Store in Memory a Speaker's Habitual Utterance-Final Phonation Type?

Tamás Bohm; Stefanie Shattuck-Hufnagel

Earlier studies report systematic differences across speakers in the occurrence of utterance-final irregular phonation; the work reported here investigated whether human listeners remember this speaker-specific information and can access it when necessary (a prerequisite for using this cue in speaker recognition). Listeners personally familiar with the voices of the speakers were presented with pairs of speech samples: one with the original and the other with transformed final phonation type. Asked to select the member of the pair that was closer to the talker’s voice, most listeners tended to choose the unmanipulated token (even though they judged them to sound essentially equally natural). This suggests that utterance-final pitch period irregularity is part of the mental representation of individual speaker voices, although this may depend on the individual speaker and listener to some extent.


Journal of the Acoustical Society of America | 2006

Is nonmodal phonation at the end of utterances a cue for speaker recognition by humans?

Tamás Bohm

Many studies of nonmodal phonation report a high rate of interspeaker variation. These experiments also showed that the rate and type of glottalization are characteristic of the speaker. Still, it remains an open question whether human listeners use this information as a cue for recognizing familiar voices. A listening experiment was conducted to investigate this issue, concentrating on utterance‐final irregularities. Short utterances produced by four speakers were copy-synthesized with the Klatt synthesizer in two conditions: with modal and with nonmodal voice quality in their final regions. Because two of the speakers were reliable ‘‘glottalizers’’ and two seldom ‘‘glottalized,’’ one of the two synthesis conditions reflected the speaker’s usual utterance‐final voice quality and the other did not. These pairs of stimuli were used to test the effect of the presence or absence of utterance‐final nonmodal phonation on the identification of voices familiar to the subjects. When asked to choose which one of t...


Archive | 2005

Design issues of a corpus-based speech synthesizer

Andras Nagy; Péter Pesti; Géza Németh; Tamás Bohm


Conference of the International Speech Communication Association | 2009

Relation of formants and subglottal resonances in Hungarian vowels

Tamás Gábor Csapó; Zsuzsanna Bárkányi; Tekla Etelka Gráczi; Tamás Bohm; Steven M. Lulich


Conference of the International Speech Communication Association | 2007

Utterance-final glottalization as a cue for familiar speaker recognition

Tamás Bohm; Stefanie Shattuck-Hufnagel

Collaboration


Dive into Tamás Bohm's collaborations.

Top Co-Authors

István Winkler

Hungarian Academy of Sciences

Alexandra Bendixen

Chemnitz University of Technology

Orsolya Szalárdy

Hungarian Academy of Sciences

Géza Németh

Budapest University of Technology and Economics

Robert W. Mill

University of Nottingham

Stefanie Shattuck-Hufnagel

Massachusetts Institute of Technology
