Alexandru Caranica | Researchain

Publication

Featured research published by Alexandru Caranica.

2015 International Conference on Speech Technology and Human-Computer Dialogue (SpeD) | 2015

Exploring multi-language resources for unsupervised spoken term discovery

Bogdan Ludusan; Alexandru Caranica; Horia Cucu; Andi Buzo; Corneliu Burileanu; Emmanuel Dupoux

With information processing and retrieval of spoken documents becoming an important topic, there is a need for systems that perform automatic segmentation of audio streams. Among such algorithms, spoken term discovery allows the extraction of word-like units (terms) directly from the continuous speech signal, in an unsupervised manner and without any knowledge of the language at hand. Since the performance of any downstream application depends on the quality of the terms found, it is relevant to try to obtain higher-quality automatic terms. In this paper we investigate whether the use of input features derived from multi-language resources helps the process of term discovery. For this, we employ an open-source phone recognizer to extract posterior probabilities and phone segment decisions for several languages. We examine the features obtained from a single language and from combinations of languages based on the spoken term discovery results attained on two different datasets, of English and Xitsonga. Furthermore, a comparison to the results obtained with standard spectral features is performed and the implications of the work are discussed.

2009 Advanced Technologies for Enhanced Quality of Life | 2009

Communication between the Sensor Levels for Monitoring Subjects with Disabilities

Mihaela Hnatiuc; Alexandru Caranica

The intelligent sensors used to detect the subject’s behavior are classified into priority levels, superior and inferior. The levels may or may not work in parallel. The system decides which level is used depending on the event. The communication modes between levels are presented in this paper. The parallel/distributed structure is introduced because the sensor levels work in parallel. Testing of an embedded system, used as a communication node between sensors and microsystems, is detailed in the final part of this paper.

World Conference on Information Systems and Technologies | 2018

Multilingual Low-Resourced Prototype System for Voice-Controlled Intelligent Building Applications

Alexandru Caranica; Lucian Georgescu; Alexandru Vulpe; Horia Cucu

With speech recognition databases spanning most of the widely used languages around the globe, there is a strong incentive to build linguistically diverse, voice-driven applications, in different languages and in diverse acoustic conditions. Although state-of-the-art speech processing has achieved great performance for most widely used languages, little effort has been made for under-resourced languages, such as Romanian. Moreover, most of these systems are not focused on supporting specific voice recognition scenarios, such as assistive applications for elderly or disabled people, nor do they consider a triggered close-talking voice interaction. This paper focuses on building a prototype system for the Romanian language, to be used in distant speech recognition scenarios, for voice-driven speech applications in intelligent homes or buildings. Speech databases in Romanian, previously recorded by our research group in real-life conditions, are used. For a baseline comparison, an English recognition engine is also implemented.

World Conference on Information Systems and Technologies | 2018

Tenable Smart Building Security Flow Architecture Using Open Source Tools

Alexandru Caranica; Alexandru Vulpe; Octavian Fratu

Nowadays it’s rare to find newly opened buildings or sites under construction that don’t aspire to be “smart” or “intelligent”. There are wired systems throughout the building that offer the basic infrastructure for VoIP systems, TCP/IP communication, IP or CCTV cameras, systems that require low latency and large amounts of bandwidth. Complementary to these systems, we also find a variety of wireless devices, from data collection sensors to access points, environmental panels or light control systems. Buildings today are certainly “smarter” and more power efficient than they were five or ten years ago, and all these new devices, which make up the Internet of Things (IoT) category, pose new issues for all building managers: securing and enforcing digital policies on all these IoT nodes and devices. This paper focuses on building a “vendor neutral” security architecture, based on open source tools, suitable for a wide range of scenarios and building types: school campuses, small or home offices, small building shops, etc. The proposed system architecture is described, together with a preliminary evaluation of the prototype system.

2017 International Conference on Speech Technology and Human-Computer Dialogue (SpeD) | 2017

Speech recognition results for voice-controlled assistive applications

Alexandru Caranica; Horia Cucu; Corneliu Burileanu; François Portet; Michel Vacher

Until recently, controlling a “smart home” consisted of setting up a series of applications and automation tools: scheduling when the air conditioning system could cool the room, turning on the lighting system at sunset, or just using one’s phone to control several TV appliances or the garage door. Recent advances in speech recognition technology have made voice-controlled smart homes attainable, and many companies and communities are providing interfaces or home boxes to make this voice control available. However, they lack customization ability, and interoperability with appliances or applications is not guaranteed. Moreover, most of these systems are not focused on supporting specific voice recognition scenarios, such as assistive applications for elderly or disabled people, nor do they consider a triggered close-talking voice interaction. Although state-of-the-art speech processing has achieved great performance for most widely used languages, little to no effort has been made for under-resourced languages, such as Romanian. This paper focuses on a set of experiments in building a series of acoustic and grammar models for the Romanian language, to be used in distant speech recognition scenarios, for voice-driven speech applications in intelligent homes or buildings, using speech databases in Romanian previously recorded by our research group in real-life conditions.

Content-Based Multimedia Indexing | 2016

Exploring an unsupervised, language independent, spoken document retrieval system

Alexandru Caranica; Horia Cucu; Andi Buzo

With the increasing availability of spoken documents in different languages, there is a need for systems that perform automatic and unsupervised search on audio streams containing speech, in a document retrieval scenario. We are interested in retrieving information from multilingual speech data, from spoken documents such as broadcast news, video archives or even telephone conversations. The ultimate goal of a Spoken Document Retrieval System is to enable vocabulary-independent search over large collections of speech content, to find written or spoken “queries” or reoccurring speech data. If the language is known, the task is relatively simple. One could use a large vocabulary continuous speech recognition (LVCSR) tool to produce highly accurate word transcripts, which are then indexed, and query terms are retrieved from the index. However, if the language is unknown, and hence the queries are not part of the recognizer’s vocabulary, the relevant audio documents cannot be retrieved. Thus, search metrics are affected, and the documents retrieved are no longer relevant to the user. In this paper we investigate whether the use of input features derived from multi-language resources helps the process of unsupervised spoken term detection, independent of the language. Moreover, we explore the use of multi-objective search, by combining both language detection and LVCSR-based search with unsupervised Spoken Term Detection (STD). In order to achieve this, we make use of multiple open-source tools and in-house acoustic and language models, to propose a language-independent spoken document retrieval system.
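
The known-language path described in this abstract (transcribe, index, look up query terms) can be illustrated with a minimal sketch. It assumes the word transcripts already exist as plain strings; the function names `build_index` and `search` and the sample documents are hypothetical and are not taken from the paper. The sketch also shows the failure case the abstract mentions: a query term that never appears in any transcript returns nothing.

```python
from collections import defaultdict


def build_index(transcripts: dict[str, str]) -> dict[str, set[str]]:
    """Build a word -> document-id inverted index from word transcripts.

    `transcripts` maps a document id to its (hypothetical) LVCSR output.
    """
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in transcripts.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return dict(index)


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the documents matching the first term, then intersect.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    # Illustrative transcripts only; not data from the paper.
    docs = {
        "d1": "breaking news from bucharest",
        "d2": "weather news for tomorrow",
    }
    idx = build_index(docs)
    print(sorted(search(idx, "news")))
    # A term outside the recognizer's vocabulary never reaches the index,
    # so the lookup comes back empty.
    print(sorted(search(idx, "xitsonga")))
```

An index of this kind only finds what the recognizer was able to transcribe, which is the limitation that motivates the unsupervised, language-independent approach explored in the paper.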

International Conference on Telecommunications | 2015

On transcribing informally-pronounced numbers in Romanian speech

Horia Cucu; Alexandru Caranica; Andi Buzo; Corneliu Burileanu

The pronunciation model, a mapping between the lexicon words and their phonetic representation, has a key role in automatic speech recognition. Although often neglected, the accuracy of this model significantly influences the accuracy of the whole system. This study discusses within-word and cross-word pronunciation variations for Romanian numbers and proposes solutions to model them in the phonetic dictionary and the language model of an existing speech recognition system for Romanian. The evaluation is performed on a read speech corpus comprising rational numbers with up to three decimal digits. The experiments show a relative WER improvement of 14% over the baseline when within-word pronunciation variations are taken into account and an additional relative WER improvement of 63% when cross-word pronunciation variations are also modelled.
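
For readers unfamiliar with the metric, the sketch below shows how word error rate (WER) and a relative WER improvement are commonly computed. It is a generic illustration, not the paper's evaluation code: the function names are hypothetical, and the sample strings and WER values are made up for the example and are not results from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline error removed by the new system."""
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
    # Hypothetical WERs: going from 20% to 15% is a 25% relative gain.
    print(relative_improvement(0.20, 0.15))
```

A relative improvement is measured against the previous system's error, so a large relative figure can correspond to a small absolute change when the starting WER is already low.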

Advanced Topics in Optoelectronics, Microelectronics, and Nanotechnologies 2014 | 2015

An automatic speech recognition system with speaker-independent identification support

Alexandru Caranica; Corneliu Burileanu

The novelty of this work lies in the application of an open source research software toolkit (CMU Sphinx) to train, build and evaluate a speech recognition system, with speaker-independent support, for voice-controlled hardware applications. Moreover, we propose to use the trained acoustic model to successfully decode offline voice commands on embedded hardware, such as the Raspberry Pi, a low-cost ARMv6 SoC. This type of single-board computer, mainly used for educational and research activities, can serve as a proof-of-concept software and hardware stack for low-cost voice automation systems.

2015 International Conference on Speech Technology and Human-Computer Dialogue (SpeD) | 2015

Sound event recognition in smart environments

Gheorghe Pop; Alexandru Caranica; Horia Cucu; Dragos Burileanu

A rich body of research has been reported lately on sound event recognition (SER), a particular case of audio signal classification (ASC), which in turn is part of the more general research field of auditory scene analysis (ASA). The classification of sound events in a given environment is generally more precise with fewer classes and with better knowledge of the sound events expected to occur in each class. Various techniques have been described in the literature that allow good performance when sound events are strictly repeating. In an effort to develop an application that ultimately recognizes all sound events in a given context, this work describes an application of SER in smart environments that aims at recognizing cough sounds. Such techniques cannot rely on the strict repeatability of sound events. They must move towards recognition of sound events that are rather similar to any one of a set of established models. The main working modes we examined were to model cough as non-speech utterances and to search for a match against a database of established models.

MediaEval | 2015