Publication


Featured research published by Baris Bozkurt.


IEEE Transactions on Audio, Speech, and Language Processing | 2010

Three Dimensions of Pitched Instrument Onset Detection

Andre Holzapfel; Yannis Stylianou; Ali Cenk Gedik; Baris Bozkurt

In this paper, we suggest a novel group delay based method for the onset detection of pitched instruments. It is proposed to approach the problem of onset detection by examining three dimensions separately: phase (i.e., group delay), magnitude and pitch. The suggested onset detectors for phase, pitch and magnitude are evaluated on a new, publicly available and fully onset-annotated database of monophonic recordings, which is balanced in terms of included instruments and onset samples per instrument and contains different performance styles. Results show that the accuracy of onset detection depends on the type of instrument as well as on the style of performance. Combining the information contained in the three dimensions by means of a fusion at decision level leads to an improvement of onset detection by about 8% in terms of F-measure, compared to the best single dimension.
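The F-measure used to report the results above can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: the function name `onset_f_measure`, the 50 ms tolerance window, and the greedy one-to-one matching are all assumptions made for the example.

```python
def onset_f_measure(detected, reference, tolerance=0.05):
    """Precision, recall and F-measure for onset times given in seconds.

    A detection counts as correct if it lies within `tolerance` seconds of a
    not-yet-matched reference onset (tolerance value is an assumption).
    """
    detected = sorted(detected)
    reference = sorted(reference)
    i = j = 0
    true_positives = 0
    # Walk both sorted lists once, matching each reference onset at most once.
    while i < len(detected) and j < len(reference):
        diff = detected[i] - reference[j]
        if abs(diff) <= tolerance:
            true_positives += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # detection too early for this reference: false positive
        else:
            j += 1  # reference has no detection nearby: missed onset
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


p, r, f = onset_f_measure([0.10, 0.52, 1.30], [0.12, 0.50, 0.90, 1.31])
```

In this toy case all three detections match a reference onset and one reference onset is missed, so precision is 1.0 and recall is 0.75.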


Speech Communication | 2007

Chirp group delay analysis of speech signals

Baris Bozkurt; Laurent Couvreur; Thierry Dutoit

This study proposes new group delay estimation techniques that can be used for analyzing resonance patterns of short-term discrete-time signals and more specifically speech signals. Phase processing, or equivalently group delay processing, of speech signals is known to be difficult due to large spikes in the phase/group delay functions that mask the formant structure. In this study, we first analyze in detail the z-transform zero patterns of short-term speech signals in the z-plane and discuss the sources of spikes on group delay functions, namely the zeros located close to the unit circle. We show that windowing largely influences these patterns, and therefore short-term phase processing. Through a systematic study, we then show that reliable phase/group delay estimation for speech signals can be achieved by appropriate windowing, and that group delay functions can reveal formant information as well as some of the characteristics of the glottal flow component in speech signals. However, such phase estimation is highly sensitive to noise, and robust extraction of group delay based parameters remains difficult in real acoustic conditions even with appropriate windowing. As an alternative, we propose processing of chirp group delay functions, i.e., group delay functions computed on a circle other than the unit circle in the z-plane, which can be guaranteed to be spike-free. We finally present one application in feature extraction for automatic speech recognition (ASR). We show that chirp group delay representations are potentially useful for improving ASR performance.
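The idea of computing group delay on a circle other than the unit circle can be sketched with NumPy. This is an illustration under stated assumptions, not the authors' implementation: the function name `chirp_group_delay`, the default radius `rho=1.12`, and the FFT size are hypothetical choices. It relies on two standard identities: evaluating X(z) at z = rho·e^{jω} equals the DFT of x[n]·rho^{-n}, and the group delay equals Re{DFT(n·y[n]) / DFT(y[n])}.

```python
import numpy as np


def chirp_group_delay(x, rho=1.12, n_fft=512):
    """Group delay of x evaluated on the circle |z| = rho.

    rho = 1.0 gives the ordinary group delay on the unit circle.
    The default radius is an illustrative assumption.
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x), dtype=float)
    # X(rho * e^{jw}) is the DFT of the exponentially weighted signal.
    y = x * rho ** (-n)
    Y = np.fft.fft(y, n_fft)
    Yn = np.fft.fft(n * y, n_fft)
    # Group delay = -d(arg X)/dw = Re{ DFT(n*y) / DFT(y) }.
    # Note: this divides by Y, so it blows up if a zero lies exactly on a bin.
    return np.real(Yn / Y)


# A zero at z = 0.999 sits almost on the unit circle and produces a large
# spike there; evaluating on a larger circle moves away from that zero.
unit = chirp_group_delay([1.0, -0.999], rho=1.0)
chirp = chirp_group_delay([1.0, -0.999], rho=1.12)
```

Comparing `np.max(np.abs(unit))` with `np.max(np.abs(chirp))` shows the spike shrinking once the evaluation circle no longer passes close to the zero.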


Computer Speech & Language | 2012

A comparative study of glottal source estimation techniques

Thomas Drugman; Baris Bozkurt; Thierry Dutoit

Source-tract decomposition (or glottal flow estimation) is one of the basic problems of speech processing. For this, several techniques have been proposed in the literature. However, studies comparing different approaches are almost nonexistent. Besides, experiments have been systematically performed either on synthetic speech or on sustained vowels. In this study we compare three of the main representative state-of-the-art methods of glottal flow estimation: closed-phase inverse filtering, iterative and adaptive inverse filtering, and mixed-phase decomposition. These techniques are first submitted to an objective assessment test on synthetic speech signals. Their sensitivity to various factors affecting the estimation quality, as well as their robustness to noise are studied. In a second experiment, their ability to label voice quality (tensed, modal, soft) is studied on a large corpus of real connected speech. It is shown that changes of voice quality are reflected by significant modifications in glottal feature distributions. Techniques based on the mixed-phase decomposition and on a closed-phase inverse filtering process turn out to give the best results on both clean synthetic and real speech signals. On the other hand, iterative and adaptive inverse filtering is recommended in noisy environments for its high robustness.


Journal of New Music Research | 2008

An automatic pitch analysis method for Turkish maqam music

Baris Bozkurt

Automatic pitch analysis of large audio databases is essential for studies on music information retrieval and developing a pitch scale theory for Turkish maqam music. However, no such study is available. In this article, we first determine the main obstacle as the alignment of frequency analysis results from multiple files. We then propose a new method to automatically detect the tonic of a recording, align the data, and estimate overall frequency histograms from large databases. We show that such histograms can be successfully used for pitch scale (tuning) studies on the recordings of Tanburi Cemil Bey, an undisputed master of the genre.


Signal Processing | 2010

Pitch-frequency histogram-based music information retrieval for Turkish music

Ali Cenk Gedik; Baris Bozkurt

This study reviews the use of pitch histograms in music information retrieval studies for western and non-western music. The problems in applying the pitch-class histogram-based methods developed for western music to non-western music and specifically to Turkish music are discussed in detail. The main problems are the assumptions used to reduce the dimension of the pitch histogram space, such as mapping to a low and fixed dimensional pitch-class space, the hard-coded use of western music theory, the use of the standard diapason (A4 = 440 Hz), and analysis based on tonality and tempered tuning. We argue that it is more appropriate to use higher dimensional pitch-frequency histograms without such assumptions for Turkish music. We show in two applications, automatic tonic detection and makam recognition, that high dimensional pitch-frequency histogram representations can be successfully used in Music Information Retrieval (MIR) applications without such prior assumptions, using data-driven models.


Speech Communication | 2011

Causal-anticausal decomposition of speech using complex cepstrum for glottal source estimation

Thomas Drugman; Baris Bozkurt; Thierry Dutoit

Complex cepstrum is known in the literature for linearly separating causal and anticausal components. Relying on advances achieved by the Zeros of the Z-Transform (ZZT) technique, we here investigate the possibility of using complex cepstrum for glottal flow estimation on a large-scale database. Via a systematic study of the windowing effects on the deconvolution quality, we show that the complex cepstrum causal-anticausal decomposition can be effectively used for glottal flow estimation when specific windowing criteria are met. It is also shown that this complex cepstral decomposition gives glottal estimates similar to those obtained with the ZZT method. However, as complex cepstrum uses FFT operations instead of requiring the factoring of high-degree polynomials, the method benefits from a much higher speed. Finally, in our tests on a large corpus of real expressive speech, we show that the proposed method has the potential to be used for voice quality analysis.


IEEE Signal Processing Letters | 2005

Zeros of Z-transform representation with application to source-filter separation in speech

Baris Bozkurt; Boris Doval; Christophe d'Alessandro; Thierry Dutoit

We propose a new spectral representation called the zeros of z-transform (ZZT), which is an all-zero representation of the z-transform of the signal. We show that separate patterns exist in ZZT representations of speech signals for the glottal flow and the vocal tract contributions. A decomposition method for source-tract separation is presented based on ZZT. The ZZT-decomposition consists in grouping the zeros into two sets, according to their location in the z-plane. This type of decomposition leads to separating glottal flow contribution (without a return phase) from vocal tract contribution in the z domain.
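The grouping step described above can be sketched with NumPy's polynomial root finder. This is a minimal illustration, not the authors' code: the function name `zzt_split` is hypothetical, and splitting at the unit circle is an assumption, since the abstract only says that zeros are grouped by their location in the z-plane. Root finding on long frames is numerically delicate, so the example uses a very short sequence.

```python
import numpy as np


def zzt_split(frame):
    """Return (zeros inside the unit circle, zeros on or outside it).

    The z-transform X(z) = sum_n x[n] z^(-n) has the same non-trivial zeros
    as the polynomial whose coefficients are x[0], x[1], ..., x[N-1]
    (highest power first), which is the ordering np.roots expects.
    """
    frame = np.asarray(frame, dtype=float)
    zeros = np.roots(frame)
    radius = np.abs(zeros)
    inside = zeros[radius < 1.0]    # assumed boundary: the unit circle
    outside = zeros[radius >= 1.0]
    return inside, outside


# Toy signal built from one zero at z = 0.5 and one at z = 2.
x = np.convolve([1.0, -0.5], [1.0, -2.0])
inside, outside = zzt_split(x)
```

For the toy signal, `inside` holds the zero at 0.5 and `outside` the zero at 2; each group can then be turned back into a sequence with `np.poly` if a time-domain component is wanted.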


Journal of New Music Research | 2009

Weighing Diverse Theoretical Models on Turkish Maqam Music Against Pitch Measurements: A Comparison of Peaks Automatically Derived from Frequency Histograms with Proposed Scale Tones

Baris Bozkurt; Ozan Yarman; M. Kemal Karaosmanoğlu; Can Akkoç

Since the early 20th century, various theories have been advanced in order to mathematically explain and notate modes of Traditional Turkish music known as maqams. In this article, maqam scales according to various theoretical models based on different tunings are compared with pitch measurements obtained from select recordings of master Turkish performers in order to study their level of match with analysed data. Chosen recordings are subjected to a fully computerized sequence of signal processing algorithms for the automatic determination of the set of relative pitches for each maqam scale: f0 estimation, histogram computation, tonic detection + histogram alignment, and peak picking. For nine well-recognized maqams, automatically derived relative pitches are compared with scale tones defined by theoretical models using quantitative distance measures. We analyse and interpret histogram peaks based on these measures to find the theoretical models most conforming with all the recordings, and hence, with the quotidian performance trends influenced by them.
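Part of the processing sequence named above (histogram computation, alignment to the tonic, and peak picking) can be sketched as follows. This is a simplified illustration under several assumptions: f0 estimates and the tonic frequency are taken as given, and the function name `relative_pitch_peaks`, the 10-cent bin width, the analysed range, and the peak threshold are hypothetical choices rather than values from the article.

```python
import numpy as np


def relative_pitch_peaks(f0_hz, tonic_hz, bin_cents=10.0,
                         lo=-1200.0, hi=2400.0, min_rel_height=0.1):
    """Histogram f0 values in cents relative to the tonic and pick peaks.

    Returns the bin centres (in cents) of local maxima whose height is at
    least `min_rel_height` times the tallest bin.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    voiced = f0[f0 > 0]  # treat non-positive f0 values as unvoiced frames
    # Alignment: express every pitch as an interval above the tonic.
    cents = 1200.0 * np.log2(voiced / tonic_hz)
    edges = np.arange(lo, hi + bin_cents, bin_cents)
    hist, _ = np.histogram(cents, bins=edges)
    centers = edges[:-1] + bin_cents / 2.0
    # Peak picking on interior bins: higher than the left neighbour and
    # not lower than the right neighbour.
    mid = hist[1:-1]
    is_peak = ((mid > hist[:-2]) & (mid >= hist[2:])
               & (mid >= min_rel_height * hist.max()))
    return centers[1:-1][is_peak]


# Synthetic track: tonic at 220 Hz, a fifth above it, and unvoiced frames.
f0_track = [220.0] * 50 + [330.0] * 30 + [0.0] * 5
peaks = relative_pitch_peaks(f0_track, tonic_hz=220.0)
```

The returned peak positions can then be compared with the scale tones of a theoretical model, for example by measuring the distance in cents between each peak and the nearest proposed tone.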


Journal of New Music Research | 2014

Computational analysis of Turkish makam music: review of state-of-the-art and challenges

Baris Bozkurt; Ruhi Ayangil; Andre Holzapfel

This text reviews the computational analysis literature for Turkish makam music, discussing in detail the challenges involved and presenting a perspective for further studies. For that purpose, the basic concepts of Turkish makam music and the description of melodic, rhythmic and timbral aspects are considered in detail. Studies on tuning analysis, automatic transcription, automatic melodic analysis, automatic makam and usul detection are reviewed. Technological and data resource needs for further advancement are discussed and available sources are presented.


Proceedings of the 1st International Workshop on Digital Libraries for Musicology | 2014

A Corpus for Computational Research of Turkish Makam Music

Burak Uyar; Hasan Sercan Atli; Sertan Şentürk; Baris Bozkurt; Xavier Serra

Each music tradition has its own characteristics in terms of melodic, rhythmic and timbral properties as well as semantic understandings. To analyse, discover and explore these culture-specific characteristics, we need music collections which are representative of the studied aspects of the music tradition. For Turkish makam music, there are various resources available such as audio recordings, music scores, lyrics and editorial metadata. However, most of these resources are not typically suited for computational analysis, are hard to access, do not have sufficient quality or do not include adequate descriptive information. In this paper we present a corpus of Turkish makam music created within the scope of the CompMusic project. The corpus is intended for computational research and the primary considerations during the creation of the corpus reflect some criteria, namely, purpose, coverage, completeness, quality and re-usability. So far, we have gathered approximately 6000 audio recordings, 2200 music scores with lyrics and 27000 instances of editorial metadata related to Turkish makam music. The metadata include information about makams, recordings, scores, compositions, artists etc. as well as the interrelations between them. In this paper, we also present several test datasets of Turkish makam music. Test datasets contain manual annotations by experts and they provide ground truth for specific computational tasks to test, calibrate and improve the research tools. We hope that this research corpus and the test datasets will facilitate academic studies in several fields such as music information retrieval and computational musicology.

Collaboration


Dive into Baris Bozkurt's collaborations.

Top Co-Authors

Christophe d'Alessandro

Centre national de la recherche scientifique

Xavier Serra

Pompeu Fabra University

Laurent Couvreur

Faculté polytechnique de Mons

Ali Cenk Gedik

İzmir Institute of Technology

Erdem Ünal

Scientific and Technological Research Council of Turkey