Luis A. S. V. de Sa | Researchain

Publication

Featured research published by Luis A. S. V. de Sa.

processing of the portuguese language | 2008

Development of a Speech Recognizer with the Tecnovoz Database

José Lopes; Cláudio Neves; Arlindo Veiga; Alexandre M. A. Maciel; Carla Lopes; Fernando Perdigão; Luis A. S. V. de Sa

This paper describes the development of a robust speech recognizer using a database collected in the scope of the Tecnovoz project. The speech recognition system is speaker independent, robust to noise, and operates on a small-footprint embedded hardware platform. The paper addresses the database, the training of the acoustic models, the noise-suppression front-end and the recognizer's confidence measure. Although the database was designed specifically for small-vocabulary tasks, the best system performance was obtained using triphone models rather than whole-word models.

processing of the portuguese language | 2014

Acoustic Similarity Scores for Keyword Spotting

Arlindo Veiga; Carla Lopes; Luis A. S. V. de Sa; Fernando Perdigão

This paper presents a study of keyword spotting systems based on the acoustic similarity between a filler model and a keyword model. The classifier uses the ratio between the keyword model likelihood and the generic (filler) model likelihood to detect relevant peak values that indicate keyword occurrences. We have changed the standard keyword spotting scheme to allow keyword detection in a single forward step. We propose a new log-likelihood ratio normalization to minimize the effect of word length on classifier performance. Tests show the effectiveness of our normalization method against two other methods. Experiments were performed on continuous speech utterances from the Portuguese TECNOVOZ database (read sentences) with keywords of several lengths.
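
The likelihood-ratio score in this abstract can be illustrated with a minimal sketch. It assumes that per-frame log-likelihoods are already available from the keyword and filler models, and it divides the accumulated log-likelihood ratio by the number of frames. That is a common duration normalization, not necessarily the one proposed in the paper, and all function names and the threshold value are hypothetical.

```python
def normalized_llr(keyword_loglik, filler_loglik):
    """Average per-frame log-likelihood ratio over a candidate segment.

    Dividing by the frame count is an assumed, generic normalization that
    keeps long keywords from scoring higher just because they span more frames.
    """
    if len(keyword_loglik) != len(filler_loglik):
        raise ValueError("both models must score the same frames")
    if not keyword_loglik:
        raise ValueError("segment must contain at least one frame")
    total = sum(k - f for k, f in zip(keyword_loglik, filler_loglik))
    return total / len(keyword_loglik)


def is_keyword(keyword_loglik, filler_loglik, threshold=0.5):
    """Flag a detection when the normalized score exceeds a (hypothetical) threshold."""
    return normalized_llr(keyword_loglik, filler_loglik) > threshold
```

With this normalization, a short and a long segment that have the same per-frame advantage over the filler model receive the same score.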

Microprocessing and Microprogramming | 1990

A parallel architecture for real-time video coding

Luis A. S. V. de Sa; Vitor Silva; Fernando Perdigão; Sérgio M. M. de Faria; Pedro A. Amado Assunção

A computing architecture capable of coding video signals in real time is described. The codec uses several digital signal processors (DSPs) which can be easily programmed to implement the recent H.261 algorithm approved by the CCITT. The DSPs are organized as a single instruction multiple data (SIMD) computing architecture. Every image in a sequence is divided into regions of horizontal strips and each region is processed by its own processor. The principle is used in both the encoder and the decoder. These local processors code (decode) one horizontal strip of data which, using the terminology of the H.261 standard, corresponds to two groups of blocks (GOBs). They also communicate with a central processor which multiplexes (demultiplexes) the coded data from (for) the processors in the encoder (decoder). In the case of the encoder, this central processor also controls a data buffer for bit-rate adaptation. Lateral communication between adjacent processors is also permitted. This allows comparisons between blocks situated in neighbouring regions, as required by most motion estimation algorithms.
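
The strip partition can be sketched as follows, assuming the H.261 CIF picture format (352×288 luminance samples, twelve GOBs of 176×48 samples arranged two per row). The abstract does not state which picture format the codec used, so the constants and the function name are illustrative assumptions.

```python
CIF_WIDTH, CIF_HEIGHT = 352, 288  # assumed picture format: H.261 CIF luminance
GOB_WIDTH, GOB_HEIGHT = 176, 48   # size of one group of blocks (GOB) in CIF


def strip_assignments():
    """Map each horizontal strip (two side-by-side GOBs) to one local processor."""
    gobs_per_row = CIF_WIDTH // GOB_WIDTH
    strips = []
    for index in range(CIF_HEIGHT // GOB_HEIGHT):
        first_gob = index * gobs_per_row + 1  # H.261 numbers GOBs from 1
        strips.append({
            "processor": index,
            "rows": (index * GOB_HEIGHT, (index + 1) * GOB_HEIGHT),
            "gobs": tuple(range(first_gob, first_gob + gobs_per_row)),
        })
    return strips
```

Under these assumptions a CIF picture yields six strips, so six local processors plus the central one; motion estimation near a strip boundary is what requires the lateral links between neighbours.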

conference on computer as a tool | 2011

Talking avatar for web-based interfaces

José Nunes; Luis A. S. V. de Sa; Fernando Perdigão

In this paper we present an approach for creating interactive, speaking avatar models based on standard face images. We start from a 3D human face model that can be adjusted to a particular face. In order to adjust the 3D model from a 2D image, a new two-step method is presented. First, a process based on Procrustes analysis is applied to find the best match for the input key points, obtaining the rotation, translation and scale needed to best fit the model to the photo. Then, using the resulting model, we refine the face mesh by applying a linear transform to each vertex. For visual speech animation, we consider a total of 15 different positions, the visemes, to accurately model the articulation of the Portuguese language. For normalization purposes, each viseme is defined from the generic neutral face. The animation is rendered with linear time interpolation, given a sequence of visemes and their instants of occurrence.
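
The first fitting step can be illustrated with a simplified sketch. It solves the least-squares similarity (Procrustes) fit between two sets of matching 2D key points, using complex numbers to represent points. The paper fits a 3D model to a photo, so this 2D version and the function name are assumptions made only for illustration.

```python
import cmath


def fit_similarity(model_points, photo_points):
    """Least-squares scale, rotation and translation mapping model -> photo.

    Points are (x, y) pairs. Treating them as complex numbers, the mapping is
    w = a * z + b, where |a| is the scale and arg(a) the rotation angle.
    """
    if len(model_points) != len(photo_points) or len(model_points) < 2:
        raise ValueError("need at least two matching key points")
    z = [complex(x, y) for x, y in model_points]
    w = [complex(x, y) for x, y in photo_points]
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    z_centered = [p - z_mean for p in z]
    w_centered = [p - w_mean for p in w]
    energy = sum(abs(p) ** 2 for p in z_centered)
    if energy == 0:
        raise ValueError("model key points must not all coincide")
    a = sum(p.conjugate() * q for p, q in zip(z_centered, w_centered)) / energy
    b = w_mean - a * z_mean
    return abs(a), cmath.phase(a), (b.real, b.imag)
```

The function returns the scale, the rotation angle in radians and the translation; reflections are not modelled in this sketch.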

international conference on signal processing | 2008

Efficient noise-robust speech recognition front-end based on the ETSI standard

Cláudio Neves; Arlindo Veiga; Luis A. S. V. de Sa; Fernando Perdigão

A powerful feature extraction system for noise-robust speech recognition was standardized by ETSI. The system was developed for distributed speech recognition (DSR) and includes an advanced front-end (AFE) to be implemented in client terminals, which send the extracted parameters to a remote server that runs a speech recognition engine. With a view to integrating a noise-robust front-end into an embedded speech recognition system, which performs the feature extraction and speech recognition tasks simultaneously, we propose a modified implementation of the front-end with lower computational requirements. Using the Aurora 2 speech database, we evaluate the impact on performance of the blind equalization (BE) block, the gain factorization (GF) block and the SNR-dependent waveform processing (SWP) block that are used in the AFE. We conclude that our modified front-end, which uses cepstral mean normalization (CMN) and drops BE, GF and SWP, outperforms the AFE in a practical task.
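
Cepstral mean normalization itself is simple to sketch. The version below subtracts the per-utterance mean of each cepstral coefficient; the abstract does not say whether the paper uses a per-utterance or a running mean, so this choice and the function name are assumptions.

```python
def cepstral_mean_normalization(frames):
    """Subtract the utterance-level mean from each cepstral coefficient.

    `frames` is a list of equal-length feature vectors, one per frame.
    """
    if not frames:
        return []
    count = len(frames)
    dims = len(frames[0])
    means = [sum(frame[d] for frame in frames) / count for d in range(dims)]
    return [[frame[d] - means[d] for d in range(dims)] for frame in frames]
```

Because a fixed channel appears as an additive constant in the cepstral domain, removing the mean cancels it at the cost of one pass over the utterance.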

conference of the industrial electronics society | 2009

Design of one cycle control for low distortion bipolar switching inverters

Carlos Ferreira; Beatriz Borges; Luis A. S. V. de Sa

Constant-frequency One Cycle Control (OCC), as proposed by Smedley, is a powerful method for controlling switching power converters. The dimensioning of this control method for two-level AC output converters is not common knowledge. This paper describes OCC for two-level switched power converters. Its instability problems and their solutions are analysed, and dimensioning equations are derived. This technique allows the construction of stable, high-bandwidth power converters with low output Total Harmonic Distortion (THD) and high Power Supply Rejection Ratio (PSRR). The analytical expressions obtained are compared with experimental results, showing good correlation.

SPIE/IS&T 1992 Symposium on Electronic Imaging: Science and Technology | 1992

DSP-based hardware for real-time video coding

Luis A. S. V. de Sa; Vitor Silva; Luis J. de la Cruz; Sérgio M. M. de Faria; Pedro J. Amado; Antonio Navarro; Fernando Lopes; Joao Carlos Silvestre

An important application of digital image processing is the compression of video sequences by one or two orders of magnitude with minor picture quality degradation. Elaborate algorithms are used to achieve this data compression. They eliminate both spatial and temporal redundancy by using transform, differential, and variable-length coding techniques. Two of these algorithms are the CCITT H.261 algorithm for videotelephony and the ISO MPEG algorithm for CD-ROM motion video. The hardware implementation of these algorithms is a formidable task in view of the number of operations (more than 1 GFLOPS) that may be necessary. This paper discusses the compression and decompression of real-time video using a multiprocessor system based on digital signal processors. The system partitions each picture into horizontal strips, each processed by a local processor unit that combines a TMS320C30 signal processor and an A121 discrete cosine transform processor. In the encoder, each strip processor inputs raw data from a video acquisition module through a common parallel video bus and outputs compressed data to a supervisor module through a common serial supervisor bus. In the decoder, the data flows along the inverse path, i.e., the processors receive data from a supervisor module and transmit data to a display module. All operations within the horizontal strips are independent of each other except when motion estimation is used. In this case, the processing elements have to access regions of the picture that are allocated to neighboring processors. The number of processors is related to the frame rate and the resolution of the image.

visual communications and image processing | 1990

Parallel architecture for real-time video communications

Luis A. S. V. de Sa; Vitor Silva; Fernando Perdigão; Sérgio M. M. de Faria; Pedro A. Amado Assunção

A video codec based on several parallel digital signal processors is described. The digital signal processors (DSPs) can be easily programmed to implement the H.261 algorithm and are organized as a single instruction multiple data (SIMD) computing architecture. Both the encoder and the decoder divide a picture into regions of horizontal strips and use one local processor per region. These local processors code (decode) one horizontal strip of data which, using the terminology of the H.261 standard, corresponds to two groups of blocks (GOBs). They also communicate with a central processor which multiplexes (demultiplexes) the coded data from (for) the processors in the encoder (decoder). In the case of the encoder, the central processor also controls a data buffer for bit-rate adaptation. Lateral communication between adjacent processors is implemented to allow comparisons between blocks situated in neighbouring regions, as required by most motion estimation algorithms.

Applications of Digital Image Processing XII | 1990

Implementing A 64kbit/s Video Codec On DSP Hardware

Luis A. S. V. de Sa; Victor Silva

A modular hardware architecture for video coding at p × 64 kbit/s data rates is described. The codec uses several digital signal processors (DSPs) and can be viewed as a single instruction multiple data (SIMD) computing architecture. Every image in a sequence is divided into regions of horizontal strips and each region is processed by its own processor. These local processors communicate with a central processor which codes (decodes) the cosine-transformed frame differences. Lateral communication between adjacent processors is also permitted. This is done by memory sharing and allows comparisons between blocks situated in neighbouring regions, as required by most motion estimation algorithms. The codec is built using the modern TMS320C30 digital signal processor. The number of processors used in both the coder and the decoder depends on the application. This is a consequence of the modular design and allows the machine to be configured to suit a particular algorithm complexity or a desired quality of the coded image.

visual communications and image processing | 1997

Subband coding of image sequences using multiple vector quantizers

Emanuel Martins; Vitor Silva; Luis A. S. V. de Sa

One efficient way to compress digital images is subband coding. Subband coding using vector quantization could be a competitor to DCT-like image compression schemes. In this paper we describe an image sequence compression algorithm based on difference image coding techniques, with block motion compensation, segmentation of the difference image into rectangles using quadtrees, decomposition of the rectangles into subbands and vector quantization of the subbands. The vector quantization scheme uses multiple vector quantizers, which yields a better bitrate allocation. The quantization of each subband is performed by 3 different tree-structured vector quantizers (TSVQ) at variable tree depths. The rate-distortion curves of all the rectangles are scanned to get the best global R-D combination. The best combination parameters are coded and used to quantize the subbands of all the rectangles. The results show a slightly better performance of this scheme relative to the optimal scalar quantization of subbands. The coding speed of this VQ scheme is only 3 times slower than a single vector quantization per vector.
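
How a tree-structured vector quantizer encodes at a variable depth can be sketched as below. The node layout (a dictionary with a `centroid` and a list of `children`) and the function names are hypothetical, and the sketch covers only the encoding descent, not the codebook training or the rate-distortion search described in the abstract.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def tsvq_encode(vector, root, max_depth):
    """Descend the codebook tree, picking the nearest child at each level.

    Returns the branch indices taken (the code) and the centroid reached
    (the reproduction vector). Stopping at a smaller max_depth spends fewer
    bits and gives a coarser reproduction.
    """
    node, path = root, []
    while node["children"] and len(path) < max_depth:
        distances = [squared_distance(vector, child["centroid"])
                     for child in node["children"]]
        branch = distances.index(min(distances))
        path.append(branch)
        node = node["children"][branch]
    return path, node["centroid"]
```

Each extra level of a binary tree adds one bit to the code, which is why varying the depth per subband gives a simple handle on rate versus distortion.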
