Publication


Featured research published by Dj Dik Hermes.


Journal of the Acoustical Society of America | 1991

The frequency scale of speech intonation

Dj Dik Hermes; Joost C. van Gestel

In intonation research, prominence-lending pitch movements have either been described on a linear or on a logarithmic frequency scale. An experiment has been carried out to check whether pitch movements in speech intonation are perceived on one of these two scales or on a psychoacoustic scale representing the frequency selectivity of the auditory system. This last scale is intermediary between the other two scales. Subjects matched the excursion size of prominence-lending pitch movements in utterances resynthesized in different pitch registers. Their task was to adjust the excursion size in a comparison stimulus in such a way that it lent equal prominence to the corresponding syllable in a fixed test stimulus. The comparison stimulus and the test stimulus had pitches running parallel on either the logarithmic frequency scale, the psychoacoustic scale, or the linear frequency scale. In one-half of the experimental sessions, the test stimulus was presented in the low register, while the comparison stimulus was presented in the high register, and, conversely, for the other half of the sessions. The result is that, in all cases, stimuli are matched in such a way that the average excursion sizes in different registers are equal on the psychoacoustic scale.
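The comparison of the three scales can be illustrated with a small sketch. It assumes that the psychoacoustic scale is an ERB-rate scale and uses the Glasberg and Moore (1990) formula as a stand-in; the abstract does not name the formula, and the function names and example frequencies below are hypothetical.

```python
import math

def hz_to_erb_rate(f_hz):
    """ERB-rate of a frequency in Hz (Glasberg & Moore 1990 form).
    Assumed stand-in for the psychoacoustic scale of the abstract."""
    return 21.4 * math.log10(0.00437 * f_hz + 1.0)

def erb_rate_to_hz(e):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

def matched_peak(low_base, low_peak, high_base, scale):
    """Peak frequency (Hz) in the high register whose excursion from
    high_base equals the low-register excursion on the given scale."""
    if scale == "linear":
        # Equal excursion in Hz.
        return high_base + (low_peak - low_base)
    if scale == "log":
        # Equal excursion in semitones, i.e. equal frequency ratio.
        return high_base * (low_peak / low_base)
    if scale == "erb":
        # Equal excursion in ERB-rate units.
        excursion = hz_to_erb_rate(low_peak) - hz_to_erb_rate(low_base)
        return erb_rate_to_hz(hz_to_erb_rate(high_base) + excursion)
    raise ValueError("unknown scale: " + scale)

# Hypothetical example: a 100 -> 150 Hz movement in the low register,
# matched in a high register that starts at 200 Hz.
for scale in ("linear", "erb", "log"):
    print(scale, round(matched_peak(100.0, 150.0, 200.0, scale), 1))
```

Under these assumptions the ERB-rate match falls between the linear and the logarithmic match, in line with the statement that the psychoacoustic scale is intermediary between the other two.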


Hearing Research | 1981

Spectro-temporal characteristics of single units in the auditory midbrain of the lightly anaesthetised grass frog (Rana temporaria L.) investigated with noise stimuli

Dj Dik Hermes; Ad Aertsen; P. I. M. Johannesma; Jos J. Eggermont

About 30% of the auditory units in the midbrain of the lightly anaesthetised grass frog respond in a sustained way to stationary pseudorandom noise. This response is described by the spectro-temporal receptive field (STRF), the regions in the spectro-temporal domain where the average second-order functional of those parts of the stimulus ensemble that precede the action potentials differ from the average second-order functional of the stimulus ensemble. By means of the STRF frequency selectivity, postactivation suppression and lateral suppression can quantitatively be studied under one and the same experimental condition. Auditory units that respond to stationary noise are localised in those parts of the torus where fibres enter from the olivary nucleus. They are characterised by relatively short latencies to tones and probably represent the first information-processing stage in the torus semicircularis.
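A rough sketch of the idea behind the STRF follows. It assumes that a precomputed spectrogram (a list of frequency channels, each a list of intensities over time) can stand in for the second-order functional of the stimulus, and it simply subtracts the ensemble average from the pre-spike average. The function names and the toy data are hypothetical and are not taken from the paper.

```python
def pre_event_average(spec, event_idx, window):
    """Average the `window` time bins of spec[f] that end at each event index."""
    usable = [t for t in event_idx if t >= window - 1]
    avg = [[0.0] * window for _ in spec]
    for t in usable:
        for f, channel in enumerate(spec):
            for k in range(window):
                avg[f][k] += channel[t - window + 1 + k] / len(usable)
    return avg

def strf_estimate(spec, spike_idx, window):
    """Pre-spike average minus the average over all stimulus windows."""
    n_t = len(spec[0])
    ensemble = pre_event_average(spec, range(window - 1, n_t), window)
    pre_spike = pre_event_average(spec, spike_idx, window)
    return [[s - e for s, e in zip(row_s, row_e)]
            for row_s, row_e in zip(pre_spike, ensemble)]

# Toy data: channel 0 has energy one bin before every spike,
# channel 1 is constant and therefore carries no information.
spec = [[0, 1, 0, 0, 1, 0, 0, 1, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]
spikes = [2, 5, 8]
strf = strf_estimate(spec, spikes, window=2)
```

In this toy case the estimate is positive only in channel 0 one bin before the spike, and stays near zero in the constant channel.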


Journal of the Acoustical Society of America | 1990

Vowel-onset detection

Dj Dik Hermes

An algorithm is presented that correctly detects the large majority of vowel onsets in fluent speech. The algorithm is based on the simple assumption that vowel onsets are characterized by the appearance of rapidly increasing resonance peaks in the amplitude spectrum. Application to carefully articulated, isolated words results in a high number of false alarms, predominantly before consonants that can function as vowels in a different context such as another language or as a syllabic consonant. After applying some modifications in the setting of some parameters, this number of false alarms for isolated words can be reduced significantly, without the risk of a large number of missed detections. The temporal accuracy of the algorithm is better than 20 ms. This accuracy is determined with respect to the perceptual moment of occurrence of a vowel onset as determined by a phonetician.


Speech Communication | 1991

Synthesis of breathy vowels: some research methods

Dj Dik Hermes

When vowels are synthesised by means of a source-filter model, a delta-pulse train is often used as a source signal. Although breathiness can to some extent be simulated by using a sophisticated glottal-source model, a more natural simulation of breathiness requires the addition of aspiration noise. When stationary noise is used, however, the noise is to a large extent perceived as coming from a separate sound source which hardly contributes to the breathy timbre of the vowel. This problem can be solved by using noise with a temporal envelope of the same periodicity as the pulse train. In a simple source-filter model, a combination of lowpass-filtered pulses and synchronous highpass-filtered noise bursts of equal energy was used as a source signal. In this way, the noise was no longer perceived as a separate sound, but integrated perceptually with the strictly periodic part of the signal. It will be shown that this integration consists of both a reduction of the loudness of the separate noise stream and a timbre change in the breathy vowel.
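The source signal described above can be sketched as follows. This is a minimal illustration under stated assumptions: the one-pole filters, the rectangular burst envelope, and all parameter values are hypothetical stand-ins, since the abstract does not specify the filters or the shape of the envelope.

```python
import random

def one_pole_lowpass(x, a=0.9):
    """Simple one-pole lowpass filter (assumed stand-in)."""
    y, prev = [], 0.0
    for s in x:
        prev = a * prev + (1.0 - a) * s
        y.append(prev)
    return y

def one_pole_highpass(x, a=0.9):
    """Highpass as the input minus its lowpass-filtered version."""
    return [s - l for s, l in zip(x, one_pole_lowpass(x, a))]

def energy(x):
    return sum(s * s for s in x)

def breathy_source(n_samples=800, period=80, burst_len=20, seed=0):
    """Return (lowpass pulses, synchronous highpass noise bursts, their sum)."""
    rng = random.Random(seed)
    pulses = [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    low = one_pole_lowpass(pulses)
    # Gate the highpass noise with an envelope of the same period as the pulses.
    filtered = one_pole_highpass(noise)
    bursts = [s if n % period < burst_len else 0.0
              for n, s in enumerate(filtered)]
    # Scale the noise bursts so that both parts have equal energy.
    gain = (energy(low) / energy(bursts)) ** 0.5
    bursts = [gain * s for s in bursts]
    return low, bursts, [l + b for l, b in zip(low, bursts)]

low, bursts, source = breathy_source()
```

In a full source-filter model the returned source would then be passed through a vowel filter, which is not shown here.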


Prosody: Theory and experiment. Studies presented to Gösta Bruce | 2000

The Perception of Prosodic Prominence

Jmb Jacques Terken; Dj Dik Hermes

We say that a linguistic entity is prosodically prominent when it stands out from its environment by virtue of its prosodic characteristics. That is, we define prominence as a property of a linguistic entity relative to an entity or a set of entities in its environment. Although the definition is cast in relative terms, it includes monosyllabic utterances, because they stand out from silence. In the acoustic domain, the primary prosodic properties bringing about these relative differences are amplitude, duration and ‘F0’ (we use F0 as a shorthand form for the inverse of the quasi-periodicity of the speech signal). The corresponding perceptual properties are loudness, duration or length, and pitch. Also, particular aspects of timbre come into play, such as those relating to vowel reduction, spectral slope or tilt, etc., and properties relating to voice quality, e.g. creak.


Hearing Research | 1981

Spectro-temporal characterization of auditory neurons: Redundant or necessary?

Jos J. Eggermont; Ad Aertsen; Dj Dik Hermes; P. I. M. Johannesma

For neurons in the auditory midbrain of the grass frog, the use of a combined spectro-temporal characterization has been evaluated against the separate characterizations of frequency sensitivity and temporal response properties. By factoring the joint density function of stimulus intensity preceding a spike, I(f, t), into two marginal density functions I1(f) and I2(t), one may, under the assumption of statistical independence, reconstruct the joint density by multiplication: I1(f)·I2(t). The reconstructed I(f, t) is compared to the original I(f, t) for 83 neurons: in 23% of them the reconstructed I(f, t) appeared to be vastly different from the original I(f, t). These units appeared to be located predominantly in the ventral parts of the auditory midbrain and had a latency exceeding 30 ms. On the basis of the action-potential waveforms, the absence of non-separable I(f, t) in the incoming nerve fiber population is concluded. A spectro-temporal characterization of auditory neurons seems mandatory for investigations in and central from the auditory midbrain.
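The separability check can be sketched in a few lines. The sketch normalizes a joint intensity table, forms the two marginals, multiplies them, and measures how far the product is from the original. The distance measure (half the summed absolute difference) is an assumption made for illustration; the abstract does not say how "vastly different" was judged, and the function names are hypothetical.

```python
def marginals(joint):
    """Return (I1 over frequency rows, I2 over time columns), both normalized."""
    total = sum(sum(row) for row in joint)
    i1 = [sum(row) / total for row in joint]
    i2 = [sum(row[t] for row in joint) / total for t in range(len(joint[0]))]
    return i1, i2

def reconstruct(joint):
    """Product of the marginals: the joint density under independence."""
    i1, i2 = marginals(joint)
    return [[a * b for b in i2] for a in i1]

def separability_error(joint):
    """Half the summed absolute difference between the normalized joint
    density and the product of its marginals (0 when fully separable)."""
    total = sum(sum(row) for row in joint)
    rec = reconstruct(joint)
    return 0.5 * sum(abs(joint[f][t] / total - rec[f][t])
                     for f in range(len(joint))
                     for t in range(len(joint[0])))

# A separable table (outer product of [1, 2, 1] and [1, 3]) ...
separable = [[1.0, 3.0], [2.0, 6.0], [1.0, 3.0]]
# ... and a non-separable one, where the preferred frequency shifts with time.
non_separable = [[1.0, 0.0], [0.0, 1.0]]

print(separability_error(separable))
print(separability_error(non_separable))
```

A unit whose error stays close to zero is adequately described by separate frequency and temporal characterizations, whereas a large error indicates that the combined spectro-temporal description is needed.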


Speech Communication | 2004

Perception of the size and speed of rolling balls by sound

Mmj Mark Houben; Ag Armin Kohlrausch; Dj Dik Hermes

In everyday life, we listen to the properties of sources that generate sound, not to properties of the sound itself. But what properties of the sound source can we identify and what is it in the sound that informs us about these properties? This paper reports three experiments investigating the auditory perception of the size and the speed of wooden balls rolling over a wooden plate on the basis of recorded sounds. Experiment I showed that listeners are able to choose the larger ball from paired sounds. Experiment II showed that listeners are able to discriminate between the sounds of balls rolling with different speeds. However, some listeners reversed the labeling of the speed. In experiment III, the interaction between size and speed was tested. Results indicated that if the size and the speed of a rolling ball are varied simultaneously, listeners generally are still able to identify the larger ball, but the judgment of speed is influenced by the variation in size. An analysis of the spectral and temporal properties of the recorded sounds listeners may use in their decisions suggested a conflict in available cues when varying both size and speed, which is in line with the observed interaction effect.


Journal of the Acoustical Society of America | 1994

Perception of prominence in speech intonation induced by rising and falling pitch movements

Dj Dik Hermes; Hh Hans Rump

The object of this study was to investigate whether subjects are able to compare the prominence caused by different types of accent-lending pitch movements and, if so, whether some pitch movements lend more prominence to a syllable than others. These experiments were carried out with the utterance /ma’mama/, with the second syllable accented by either a rise, a fall, or a rise–fall. Subjects adjusted the variable excursion size of a comparison stimulus to the fixed excursion size of a test stimulus in such a way that the accented syllable in test and comparison stimuli had equal prominence. The rise–fall was only presented in one “standard” position, while the fall and the rise were tested for five different temporal positions in the syllable. Subjects were found to be quite capable of equating the prominence of syllables accented by the following types of pitch movement: the rise–fall in standard position, the rise starting before the vowel onset, and the fall whatever its temporal position in the syll...


Journal of the Acoustical Society of America | 1997

Timing of pitch movements and accentuation of syllables in Dutch

Dj Dik Hermes

In this study, the relation between the timing of a rising or falling pitch movement and the syllable it accentuates is investigated. The five-syllable utterance /mamamamama/ was provided with a relatively fast rising or falling pitch movement. The timing of the movement was systematically varied and Dutch subjects were asked to indicate which syllable they perceived as accented. In order to find out where in the pitch movement the cue which induces the percept of accentuation is located, the duration of the pitch movement was varied. In order to find out which segments of the utterance this characteristic is linked to, the duration of the /m/ was varied. The results showed that the percept of accentuation is induced by a change in pitch at the start of the movement. The moment at which the course of pitch starts to change significantly determines which syllable is perceived as accented. If this moment lies some tens of milliseconds before the P-center, i.e., the perceptual moment of occurrence of the syllable, the preceding syllable is perceived as accented. For a rise, a high accent is perceived; for a fall, a low accent. If the pitch change occurs after this moment, the syllable with this P-center is perceived as accented. For the rise, a low accent is then perceived; for the fall, a high accent. This will be discussed in the light of earlier research on accentuation and of theoretical knowledge about pitch accents.


IEEE International Workshop on Haptic Audio Visual Environments and Games | 2008

Sound and tangible interface for shape evaluation and modification

Monica Bordegoni; Francesco Ferrise; Simon Shelley; Miguel Bruns Alonso; Dj Dik Hermes

One of the recent research topics in the area of design and virtual prototyping is offering designers tools for creating and modifying shapes in a natural and interactive way. Multimodal interaction is part of this research. It allows conveying to the users information through different sensory channels. The use of more modalities than touch and vision augments the sense of presence in the virtual environment and can be useful to present the same information in various ways. In addition, multimodal interaction can sometimes be used to augment the perception of the user by transferring information that is not generally perceived in the real world, but which can be emulated by the virtual environment. The paper presents a prototype of a system that allows designers to evaluate the quality of a shape with the aid of touch, vision and sound. Sound is used to communicate geometrical data, relating to the virtual object, which are practically undetectable through touch and vision. In addition, the paper presents the preliminary work carried out on this prototype and the results of the first tests made in order to demonstrate the feasibility. The problems related to the development of this kind of application and the realization of the prototype itself are highlighted. This paper also focuses on the potentialities and the problems relating to the use of multimodal interaction, in particular the auditory channel.

Collaboration


Top Co-Authors


David House

Royal Institute of Technology


Simon Shelley

Eindhoven University of Technology


Ad Aertsen

University of Freiburg


Jos J. Eggermont

Radboud University Nijmegen


Cnj Christophe Stoelinga

Eindhoven University of Technology


Miguel Bruns Alonso

Eindhoven University of Technology


Sjl Mozziconacci

Eindhoven University of Technology