
Horia Cucu

Politehnica University of Bucharest

Publication

Featured research published by Horia Cucu.

IEEE Automatic Speech Recognition and Understanding Workshop | 2011

Investigating the role of machine translated text in ASR domain adaptation: Unsupervised and semi-supervised methods

Horia Cucu; Laurent Besacier; Corneliu Burileanu; Andi Buzo

This study investigates the use of machine-translated text for ASR domain adaptation. The proposed methodology applies when domain-specific data is available only in language X, whereas the goal is to develop a domain-specific system in language Y. Two semi-supervised methods are introduced and compared with a fully unsupervised approach, which serves as the baseline. While both the unsupervised and the semi-supervised approaches make it possible to quickly develop an accurate domain-specific ASR system, the semi-supervised approaches outperform the unsupervised one by 10% to 29% relative, depending on the amount of human post-processed data available. An in-depth analysis explaining how the machine-translated text improves the performance of the domain-specific ASR system is given at the end of the paper.
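The "relative" gain quoted in the abstract above can be read as a relative error reduction. The sketch below shows that arithmetic only. It assumes the metric is an error rate such as word error rate (the abstract does not name the metric), and the function name and the numbers are hypothetical, not taken from the paper.

```python
def relative_error_reduction(baseline_error: float, improved_error: float) -> float:
    """Return the relative error reduction, in percent, of a system over a baseline.

    Assumption: both arguments are error rates (e.g. WER in percent),
    so lower values are better.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - improved_error) / baseline_error


if __name__ == "__main__":
    # Hypothetical error rates, for illustration only (not results from the paper).
    print(relative_error_reduction(20.0, 18.0))
```

Under this reading, a drop from a 20% to an 18% error rate is a 2-point absolute gain but a 10% relative gain.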

International Conference on Database Theory | 2010

Romanian Spoken Language Resources and Annotation for Speaker Independent Spontaneous Speech Recognition

Corneliu Burileanu; Andi Buzo; Cristina Sorina Petre; Diana Ghelmez-Hanes; Horia Cucu

This paper presents studies and early results aimed at building a robust spontaneous speech recognition system for the Romanian language. We address several issues that arose along the way, such as building a large and accurate database within a reasonable time. A short description of the database is given, together with statistics showing its evolution over several stages of the project. Embedded training was used to train the triphones; as a consequence, the alignment problem was studied and a solution is proposed. The final purpose of this work is to obtain substantial speech recognition results for Romanian that can serve as a baseline for further work.

Speech Communication | 2014

SMT-based ASR domain adaptation methods for under-resourced languages: Application to Romanian

Horia Cucu; Andi Buzo; Laurent Besacier; Corneliu Burileanu

This study investigates the possibility of using statistical machine translation to create domain-specific language resources. We propose a methodology that aims to create a domain-specific automatic speech recognition (ASR) system for a low-resourced language when in-domain text corpora are available only in a high-resourced language. Several translation scenarios (both unsupervised and semi-supervised) are used to obtain domain-specific textual data. Moreover, this paper shows that a small amount of manually post-edited text is enough to develop other natural language processing systems that, in turn, can be used to automatically improve the machine-translated text, leading to a significant boost in ASR performance. An in-depth analysis explaining why and how the machine-translated text improves the performance of the domain-specific ASR system is also provided at the end of the paper. As by-products of this core domain-adaptation methodology, the paper also presents the first large-vocabulary continuous speech recognition system for Romanian, and introduces a diacritics restoration module to process the Romanian text corpora, as well as an automatic phonetization module needed to extend the Romanian pronunciation dictionary.

2013 7th Conference on Speech Technology and Human - Computer Dialogue (SpeD) | 2013

Text spotting in large speech databases for under-resourced languages

Andi Buzo; Horia Cucu; Corneliu Burileanu

Lightly supervised acoustic modeling in under-resourced languages raises new issues due to the poor accuracy of Automatic Speech Recognition (ASR) systems for such languages and the quality of the speech transcriptions that may be found. Under these conditions, the common alignment techniques are not always capable of aligning the ASR output with the approximate transcription. We propose two alignment methods that overcome these issues. The first approach applies an image processing algorithm to the matching matrix of the two texts to be aligned, while the second is based on segmental DTW. Both approaches outperform the currently used Dynamic Time Warping (DTW) technique, extracting on average 29% and 27% more speech data, respectively.
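For orientation, the kind of DTW text alignment that the abstract above uses as its point of comparison can be sketched as follows. This is a minimal illustration assuming word-level sequences and a 0/1 mismatch cost. It is not the paper's image-processing or segmental DTW method, and the function names and sample sentences are hypothetical.

```python
from typing import List, Tuple


def dtw_align(hyp: List[str], ref: List[str]) -> List[Tuple[int, int]]:
    """Align two word sequences with classic DTW and a 0/1 mismatch cost.

    Returns the warping path as (hyp_index, ref_index) pairs.
    """
    n, m = len(hyp), len(ref)
    if n == 0 or m == 0:
        return []
    inf = float("inf")
    # cost[i][j] = cheapest cumulative cost of a path ending at words (i-1, j-1).
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = 0.0 if hyp[i - 1] == ref[j - 1] else 1.0
            cost[i][j] = local + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )
    # Backtrack from the last cell, preferring the diagonal move on ties.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return path


def matched_pairs(hyp: List[str], ref: List[str]) -> List[Tuple[int, int]]:
    """Keep only the path cells where the two words are identical."""
    return [(i, j) for i, j in dtw_align(hyp, ref) if hyp[i] == ref[j]]


if __name__ == "__main__":
    # Hypothetical ASR output and approximate transcription.
    asr_output = "the cat sat on mat".split()
    transcript = "the cat sat on the mat".split()
    print(matched_pairs(asr_output, transcript))
```

In a lightly supervised setting, runs of matched words like these would mark the speech segments whose transcription can be trusted. How such segments are selected in the paper is not stated in the abstract.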

2013 7th Conference on Speech Technology and Human - Computer Dialogue (SpeD) | 2013

Multilingual query by example spoken term detection for under-resourced languages

Andi Buzo; Horia Cucu; Mihai Safta; Corneliu Burileanu

We propose a query-by-example approach to multilingual Spoken Term Detection for under-resourced languages based on Automatic Speech Recognition. The approach overcomes the main difficulties met under these conditions by providing a new method for building multilingual acoustic models from little annotated data and a highly scalable way of searching approximate Automatic Speech Recognition transcriptions. The acoustic models are obtained by adapting well-trained phonemes to those of the target languages. The mapping is made according to the International Phonetic Alphabet phoneme classification and a confusion matrix. Weighting of the query length and of the alignment spread is incorporated into the Dynamic Time Warping technique to improve the search. Experimental validation was conducted on a standard data set consisting of 3 hours of mixed African languages. The recorded speech is of telephone quality and is a mix of read and spontaneous speech.

2015 International Conference on Speech Technology and Human-Computer Dialogue (SpeD) | 2015

Speech database acquisition for assisted living environment applications

Mihai Dogariu; Horia Cucu; Andi Buzo; Dragos Burileanu; Octavian Fratu

Home automation has become a subject of increasing interest for both industry and research, as awareness of such systems is growing and their benefits are easy to see. The new trend is to develop smart homes where commands can be given by speech. This way of communicating, besides being the most natural, offers flexibility to users, especially those with limited mobility. While the state of the art has reached a high level of performance for widely used languages, little effort has been devoted to the Romanian language. The main reason for this is the lack of an annotated speech database recorded in real-life conditions. This paper focuses on the methodology of acquiring four different speech corpora with various end-user scenarios in mind. The commands corpus is meant to be used in home automation development, the cough corpus is meant to help research on detecting distress situations, the spontaneous speech corpus will aid distant speech recognition applications, and the multi-room, multi-person, multi-language corpus can be used for research on speaker detection and identification. All of these were recorded in a completely automated and functional smart home. The small number of such environments available to the public makes these corpora valuable from an experimental point of view.

Advances in Electrical and Computer Engineering | 2015

Enhancing ASR Systems for Under-Resourced Languages through a Novel Unsupervised Acoustic Model Training Technique

Horia Cucu; Andi Buzo; Laurent Besacier; Corneliu Burileanu

Statistical speech and language processing techniques, requiring large amounts of training data, are currently state-of-the-art in automatic speech recognition. For high-resourced, int ...

2015 International Conference on Speech Technology and Human-Computer Dialogue (SpeD) | 2015

Exploring multi-language resources for unsupervised spoken term discovery

Bogdan Ludusan; Alexandru Caranica; Horia Cucu; Andi Buzo; Corneliu Burileanu; Emmanuel Dupoux

With information processing and retrieval of spoken documents becoming an important topic, there is a need for systems that perform automatic segmentation of audio streams. Among such algorithms, spoken term discovery allows the extraction of word-like units (terms) directly from the continuous speech signal, in an unsupervised manner and without any knowledge of the language at hand. Since the performance of any downstream application depends on the quality of the terms found, it is relevant to try to obtain higher-quality automatic terms. In this paper we investigate whether the use of input features derived from multi-language resources helps the process of term discovery. For this, we employ an open-source phone recognizer to extract posterior probabilities and phone segment decisions for several languages. We examine the features obtained from a single language and from combinations of languages, based on the spoken term discovery results attained on two different datasets of English and Xitsonga. Furthermore, a comparison with the results obtained with standard spectral features is performed and the implications of the work are discussed.

2015 International Conference on Speech Technology and Human-Computer Dialogue (SpeD) | 2015

Estimating competing speaker count for blind speech source separation

Valentin Andrei; Horia Cucu; Andi Buzo; Corneliu Burileanu

We present a method for estimating the number of simultaneous speakers for direct integration with blind speech source separation algorithms. The method was developed to use single-microphone recordings but is fully compatible with microphone-array approaches. Speech source separation algorithms based on independent component analysis, multiband analysis or spectral learning need the number of concurrent speakers as an input parameter. This number is estimated using pattern matching between the spectrogram of the speech mixture and those associated with a set of single-speaker references. The method was shown to scale up to at least 10 concurrent speakers. Additionally, we highlight the separation performance of various speech separation algorithms using mixtures of 3 competing speakers.

World Conference on Information Systems and Technologies | 2018

Multilingual Low-Resourced Prototype System for Voice-Controlled Intelligent Building Applications

Alexandru Caranica; Lucian Georgescu; Alexandru Vulpe; Horia Cucu

With speech recognition databases spanning most of the widely used languages around the globe, there is a strong incentive to build linguistically diverse, voice-driven applications in different languages and in diverse acoustic conditions. Although state-of-the-art speech processing has achieved great performance for the most widely used languages, little effort has been made for under-resourced languages such as Romanian. Moreover, most of these systems are not focused on supporting specific voice recognition scenarios, such as assistive applications for elderly or disabled people, or they assume a triggered, close-talking voice interaction. This paper focuses on building a prototype system for the Romanian language, to be used in distant speech recognition scenarios, for voice-driven applications in intelligent homes or buildings. Speech databases in Romanian, previously recorded by our research group in real-life conditions, are used. For a baseline comparison, an English recognition engine is also implemented.