Publication


Featured research published by António Joaquim Serralheiro.


International Conference on Acoustics, Speech, and Signal Processing | 2009

Non-speech audio event detection

José Portelo; Miguel Bugalho; Isabel Trancoso; João Paulo Neto; Alberto Abad; António Joaquim Serralheiro

Audio event detection is one of the tasks of the European project VIDIVIDEO. This paper focuses on the detection of non-speech events, and as such only searches for events in audio segments that have been previously classified as non-speech. Preliminary experiments with a small corpus of sound effects have shown the potential of this type of corpus for training purposes. This paper describes our experiments with SVM and HMM-based classifiers, using a 290-hour corpus of sound effects. Although we have only built detectors for 15 semantic concepts so far, the method seems easily portable to other concepts. The paper reports experiments with multiple features, different kernels and several analysis windows. Preliminary experiments on documentaries and films yielded promising results, despite the difficulties posed by the mixtures of audio events that characterize real sounds.


International Conference on Acoustics, Speech, and Signal Processing | 2005

Application of Kalman and RLS adaptive algorithms to non-linear loudspeaker controller parameter estimation: a case study

Ricardo Adriano Ribeiro; António Joaquim Serralheiro; Moisés Piedade

The loudspeaker is a nonlinear transducer that produces harmonic distortion. Nonlinear controllers are used to reduce this distortion, but they require parameters that are well tuned to the loudspeaker. Unfortunately, loudspeaker parameters are not well known and vary during normal operation. Based on a simplified nonlinear model of the loudspeaker and a modification to the adaptive filters, nonlinear systems were developed for estimating the parameters of the mechanical part and of the electrical part of the loudspeaker. The estimated parameters are directly usable by the controllers and require no additional conversion calculations. The Kalman and RLS adaptive algorithms were applied to these systems, and simulation results show that they converge, although the electrical part estimation system was about 15 times slower than the mechanical one.
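As a rough illustration of the recursive least squares (RLS) idea mentioned in this abstract, the sketch below estimates two parameters of a model that is linear in its parameters. It is not the authors' loudspeaker model or their modified adaptive filter: the function names, the forgetting factor, the noise level and the "true" parameter values are all assumptions chosen for the example.

```python
import random


def rls_estimate(samples, n_params, forgetting=0.99, delta=1000.0):
    """Textbook RLS for y = theta . x; returns the final estimate of theta."""
    theta = [0.0] * n_params
    # P tracks the inverse input correlation matrix; a large start means low confidence.
    P = [[delta if i == j else 0.0 for j in range(n_params)] for i in range(n_params)]
    for x, y in samples:
        Px = [sum(P[i][j] * x[j] for j in range(n_params)) for i in range(n_params)]
        denom = forgetting + sum(x[i] * Px[i] for i in range(n_params))
        gain = [value / denom for value in Px]
        # A priori error: measurement minus prediction with the current estimate.
        error = y - sum(theta[i] * x[i] for i in range(n_params))
        theta = [theta[i] + gain[i] * error for i in range(n_params)]
        # P <- (P - gain * x^T P) / forgetting, using x^T P = Px^T because P is symmetric.
        P = [[(P[i][j] - gain[i] * Px[j]) / forgetting for j in range(n_params)]
             for i in range(n_params)]
    return theta


def make_samples(true_theta, count, noise=0.01, seed=0):
    """Synthetic (input, output) pairs from a hypothetical linear system."""
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        x = [rng.uniform(-1.0, 1.0) for _ in true_theta]
        y = sum(t * v for t, v in zip(true_theta, x)) + rng.gauss(0.0, noise)
        samples.append((x, y))
    return samples


# Hypothetical "true" parameters; the estimate should settle close to them.
estimate = rls_estimate(make_samples([2.5, -0.7], 500), n_params=2)
print(estimate)
```

A forgetting factor below 1 lets the estimate follow parameters that drift during operation, at the cost of a noisier estimate.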


European Conference on Research and Advanced Technology for Digital Libraries | 2002

Word Alignment in Digital Talking Books Using WFSTs

António Joaquim Serralheiro; Diamantino Caseiro; Hugo Meinedo; Isabel Trancoso

This paper describes the motivation and the method that we used for aligning digital spoken books, and the results obtained both at a word level and at a phone level. This alignment will allow specific access interfaces for persons with special needs, and also tools for easily detecting and indexing units (words, sentences, topics) in the spoken books. The tool was implemented in a Weighted Finite State Transducer framework, which provides an efficient way to combine different types of knowledge sources, such as alternative pronunciation rules. With this tool, a 2-hour long spoken book was aligned in a single step in much less than real time.


2009 EAEEIE Annual Conference | 2009

Development and test of an antenna simulator for transmitter-receiver radio P/PRC525

Angelo Silva; António Joaquim Serralheiro; Maria João Martins; Moisés Piedade

The work reported in this paper resulted from a joint cooperation between the Faculty of Engineering (I.S.T) of the Technical University of Lisbon, Portugal, the Portuguese Army, and a private company, EID, which is active in the development of electronic devices for special applications. The rationale behind this project is the need to implement an electronic device that simulates the antennas used in the military tactical radios P/PRC525 [2] in different operation scenarios.


Processing of the Portuguese Language | 2008

An Approach to Natural Language Equation Reading in Digital Talking Books

Carlos Juzarte Rolo; António Joaquim Serralheiro

Mathematical equations are indispensable in mathematics textbooks, and also in physics, communications and, in general, any technology-related text. Their usage in Digital Talking Books (DTB) [1] can be eased if their counterparts in text and/or spoken form can be generated automatically. Therefore, an automatic system that converts equations into text, and later into speech, is needed to broaden the scope of DTBs. DTBs are based on different types of data, structured according to some standard. They also require a player or browser that allows users to navigate, to index and to retrieve information (text, sound, images, etc.). The player was developed using a model-based framework for adaptive multi-modal environments [2]. Besides supporting the features described in the DTB standard, the player introduces features complementing the synchronized presentation of text and audio, such as: addition of content-related images; variable synchronization units, ranging from word to paragraph; annotation-controlled navigation; definition of new reading paths; adaptation of the visual elements; behavioral adaptation reflecting user interaction, amongst others.
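To make the equation-to-text step concrete, here is a minimal sketch that reads a simple equation aloud in English. It is not the system described in the paper: it assumes the equation is written in Python expression syntax, and the word choices and the names `read_aloud` and `equation_to_text` are hypothetical.

```python
import ast

# Spoken word and precedence for each supported binary operator.
OPERATORS = {
    ast.Add: ("plus", 1),
    ast.Sub: ("minus", 1),
    ast.Mult: ("times", 2),
    ast.Div: ("over", 2),
    ast.Pow: ("to the power of", 3),
}


def read_aloud(node, parent_prec=0, wrap_if_equal=False):
    """Turn an expression tree into words, voicing parentheses only when needed."""
    if isinstance(node, ast.BinOp):
        word, prec = OPERATORS[type(node.op)]
        is_pow = isinstance(node.op, ast.Pow)
        non_assoc = isinstance(node.op, (ast.Sub, ast.Div))
        left = read_aloud(node.left, prec, wrap_if_equal=is_pow)
        right = read_aloud(node.right, prec, wrap_if_equal=non_assoc)
        text = f"{left} {word} {right}"
        if prec < parent_prec or (prec == parent_prec and wrap_if_equal):
            return f"open parenthesis {text} close parenthesis"
        return text
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return f"minus {read_aloud(node.operand, 3)}"
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return str(node.value)
    raise ValueError(f"unsupported expression element: {type(node).__name__}")


def equation_to_text(equation):
    """Read 'left = right' as '<left> equals <right>'."""
    left, right = equation.split("=")
    sides = [ast.parse(side.strip(), mode="eval").body for side in (left, right)]
    return " equals ".join(read_aloud(side) for side in sides)


print(equation_to_text("E = m * c ** 2"))   # E equals m times c to the power of 2
print(equation_to_text("y = (a + b) / c"))  # y equals open parenthesis a plus b close parenthesis over c
```

Voicing parentheses only where precedence requires them keeps the spoken form short while leaving the grouping unambiguous for the listener.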


Processing of the Portuguese Language | 2008

A Spoken Dialog System Speech Interface Based on a Microphone Array

Gustavo Esteves Coelho; António Joaquim Serralheiro; João Paulo Neto

In this paper we present a Spoken Dialog System (SDS) with a Microphone Array (MA). Our goal is to create a hands-free home automation system with a speech interface to control home devices. The MA interface enables ubiquitous speech acquisition for the SDS. The implemented system allows any user, in any position in a room, to establish a dialog with a virtual butler that is able to control a wide range of home appliances (room lights, air conditioner, window shades and hi-fi features). This virtual butler has a 3D animated face that, while the dialog is engaged, is able to steer towards the user's position and respond to his or her commands with synthesized speech. The presented results show that the MA, as a distant-talk interface, performs quite well and is a step towards a more realistic human-machine interaction.


International Conference on Computational Cybernetics | 2004

New insights into pseudo-fractional ARMA modelling

Manuel Duarte Ortigueira; António Joaquim Serralheiro

In this paper the modelling of fractional linear systems through ARMA models is addressed. The study uses a recursive algorithm for impulse response ARMA modelling, which leads us to propose suitable models for this problem.


Archive | 1995

On the Performance of SCHMM for Isolated Word Recognition and Rejection

Carlos Teixeira; Isabel Trancoso; António Joaquim Serralheiro

A common problem with isolated word recognition systems arises when an untrained user speaks an unwanted word, outside the active vocabulary. This word will be recognised as one of the keywords, thus steering the dialogue into a wrong direction. The use of garbage or sink models (SM) is a known technique to avoid those extraneous words being recognised as vocabulary words. Each lexical word from the active vocabulary is represented in the recognition process by at least one word model (WM). A single SM intends to be a general description for a wide number of lexical items - all those which do not belong to the limited active vocabulary. Our previous work [3] has indicated that multiple SM’s can improve the rejection score when compared with a single SM in the context of a Continuous Hidden Markov Model (CHMM) with a single observation component. This improvement is related to the vocabulary size. For very small vocabularies, there are no advantages in using more than one SM, whereas for larger vocabularies, better results can be achieved with multiple models. When searching for the optimal number of multiple SM’s, an upper bound seems to be imposed by the available amount of speech training material. In fact, this amount should be particularly relevant for training sink models as they intend to represent the whole word universe (minus the small keyword vocabulary set). The parametric description provided by a single gaussian distribution is known to be a poor model for the observation probability density function (pdf). However, due to the restricted amount of speech training material, the use of multiple gaussian mixtures to describe the observation pdf’s did not improve our results. In the present work, we compare the performance of continuous and semi-continuous HMM (SCHMM) recognisers for dealing with the problem of word rejection. 
The latter type of recogniser has several advantages over the former, both when training material is scarce, which is indeed one of the critical factors in this study, and in terms of computational complexity. This approach combines a common set of pdf’s in a codebook with the word or sub-word models themselves. The codebook and the models can be easily initialised and re-estimated separately using different sets of training material, or mutually optimised using the unified modelling approach described in [1]. Separate software tools for processing each stage of training and testing were developed, providing a complete SCHMM recognition platform. In the present work some effort was also spent in finding how to combine the initialisation steps. The tests reported here enable us to compare CHMM and SCHMM while using multiple SM’s. Another issue to be addressed is the type and amount of speech material to be used for training SM’s. HMM clustering techniques for selecting the speech material used to train each sink model, in the context of multiple sink modelling, are discussed in [2].
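The rejection rule implied by sink models can be sketched in a few lines: the utterance is scored against every word model and every sink model, and it is rejected when a sink model wins. The model labels and log-likelihood values below are invented for illustration, and the function name `recognise_or_reject` is hypothetical; in a real recogniser the scores would come from HMM decoding.

```python
def recognise_or_reject(log_likelihoods, sink_labels):
    """Return the best-scoring keyword, or None if a sink model scores best."""
    best_label = max(log_likelihoods, key=log_likelihoods.get)
    if best_label in sink_labels:
        return None  # out-of-vocabulary word: reject instead of forcing a keyword
    return best_label


SINKS = {"sink_1", "sink_2"}

# Hypothetical scores for an utterance outside the active vocabulary.
unknown_word = {"lights": -412.3, "windows": -398.7, "music": -420.1,
                "sink_1": -405.0, "sink_2": -391.2}

# Hypothetical scores for an utterance of the keyword "lights".
keyword = {"lights": -350.4, "windows": -402.9, "music": -415.6,
           "sink_1": -388.0, "sink_2": -379.5}

print(recognise_or_reject(unknown_word, SINKS))  # None
print(recognise_or_reject(keyword, SINKS))       # lights
```

With several sink models, each one only has to cover part of the out-of-vocabulary space, which is the motivation the abstract gives for multiple SM's on larger vocabularies.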


International Conference on Acoustics, Speech, and Signal Processing | 1988

Quantization issues in harmonic coders (speech coding)

Isabel Trancoso; Joaquim S. Rodrigues; Luís B. Almeida; Jorge S. Marques; António Joaquim Serralheiro; D. Santos; José Tribolet

The introduction of vector quantization in the harmonic coder framework can affect several sets of parameters, namely the amplitudes and the phases of the harmonic coefficients. A systematic experimental study is reported in which this type of quantization is compared with traditional scalar techniques for quantizing the harmonic parameters. The experiments showed that it is possible to quantize voiced speech below 8 kb/s and still obtain high-quality synthetic signals.
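The difference between the two quantization styles can be shown with a toy example: vector quantization sends one codebook index for a whole parameter vector, whereas scalar quantization spends bits on every parameter separately. The codebook, the input vector and the 4-bit scalar budget below are made-up values, not figures from the paper.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize_vector(vector, codebook):
    """Return the index of the codeword closest to the input vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def bits_per_vector(codebook_size):
    """Bits needed to transmit one codebook index."""
    return math.ceil(math.log2(codebook_size))


# Hypothetical 4-entry codebook of 3 harmonic amplitudes each.
codebook = [
    [1.0, 0.5, 0.25],
    [0.2, 0.9, 0.4],
    [0.6, 0.6, 0.6],
    [0.0, 0.1, 0.8],
]
amplitudes = [0.55, 0.7, 0.5]

index = quantize_vector(amplitudes, codebook)
vq_bits = bits_per_vector(len(codebook))
scalar_bits = len(amplitudes) * 4  # assumed 4 bits per amplitude

print(index, vq_bits, scalar_bits)  # 2 2 12
```

The saving in bits comes at the price of a codebook search and of distortion that depends on how well the codebook matches the data.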


Conference of the International Speech Communication Association | 1997

Recognition of non-native accents.

Carlos Teixeira; Isabel Trancoso; António Joaquim Serralheiro

Collaboration

Top co-authors of António Joaquim Serralheiro:

Isabel Trancoso, Instituto Superior Técnico
Diamantino Caseiro, Technical University of Lisbon
João Paulo Neto, Technical University of Lisbon
Moisés Piedade, Instituto Superior Técnico