Jesús Andrés-Ferrer
Polytechnic University of Valencia
Publication
Featured research published by Jesús Andrés-Ferrer.
empirical methods in natural language processing | 2008
Saša Hasan; Juri Ganitkevitch; Hermann Ney; Jesús Andrés-Ferrer
This paper describes a lexical trigger model for statistical machine translation. We present various methods using triplets incorporating long-distance dependencies that can go beyond the local context of phrases or n-gram based language models. We evaluate the presented methods on two translation tasks in a reranking framework and compare them to the related IBM model 1. We show slightly improved translation quality in terms of BLEU and TER and address various constraints to speed up the training based on Expectation-Maximization and to lower the overall number of triplets without loss in translation performance.
Pattern Recognition Letters | 2014
Adrià Giménez; Ihab Khoury; Jesús Andrés-Ferrer; Alfons Juan
Hidden Markov Models (HMMs) are now widely used for off-line handwriting recognition in many languages. As in speech recognition, they are usually built from shared, embedded HMMs at symbol level, where state-conditional probability density functions in each HMM are modeled with Gaussian mixtures. In contrast to speech recognition, however, it is unclear which kind of features should be used and, indeed, very different feature sets are in use today. Among them, we have recently proposed to directly use columns of raw, binary image pixels, which are fed into embedded Bernoulli (mixture) HMMs, that is, embedded HMMs in which the emission probabilities are modeled with Bernoulli mixtures. The idea is to bypass feature extraction and to ensure that no discriminative information is filtered out during feature extraction, which in some sense is integrated into the recognition model. In this work, column bit vectors are extended by means of a sliding window of adequate width to better capture image context at each horizontal position of the word image. Using these windowed Bernoulli mixture HMMs, good results are reported on the well-known IAM and RIMES databases of Latin script, and in particular, state-of-the-art results are provided on the IfN/ENIT database of Arabic handwritten words.
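The sliding-window extension of column bit vectors described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `windowed_columns`, the restriction to odd window widths, and the zero padding at the image borders are all assumptions made for the example.

```python
def windowed_columns(image, width):
    """Return one bit vector per column by stacking `width` neighbouring columns.

    `image` is a list of rows of 0/1 pixels. Columns that fall outside the
    image are treated as all zeros (an assumption of this sketch).
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd number")
    rows = len(image)
    cols = len(image[0]) if rows else 0
    half = width // 2
    features = []
    for t in range(cols):
        vec = []
        # Concatenate the columns t-half .. t+half, column by column.
        for c in range(t - half, t + half + 1):
            for r in range(rows):
                vec.append(image[r][c] if 0 <= c < cols else 0)
        features.append(vec)
    return features
```

With `width=1` the function returns the plain columns, so each vector has as many bits as the image has rows; a wider window multiplies that dimension by `width`.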
international conference on acoustics, speech, and signal processing | 2013
Adria A. Martinez-Villaronga; Miguel A. del Agua; Jesús Andrés-Ferrer; Alfons Juan
Video lectures are currently being digitised all over the world for their enormous value as a reference resource. Many of these lectures are accompanied by slides. The slides offer a great opportunity for improving ASR system performance. We propose a simple yet powerful extension to the linear interpolation of language models for adapting language models with slide information. Two types of slides are considered: correct slides, and slides automatically extracted from the videos with OCR. Furthermore, we compare both time-aligned and unaligned slides. Results report an improvement of up to 3.8 absolute WER points when using correct slides. Surprisingly, when using automatic slides obtained with poor OCR quality, the ASR system still improves by up to 2.2 absolute WER points.
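For orientation, plain linear interpolation of language models, the baseline that this paper extends, can be sketched as below. The example uses add-one smoothed unigram models and fixed weights purely for illustration; the helper names, the toy vocabulary, and the weights 0.7/0.3 are assumptions, and the paper's proposed extension is not reproduced here.

```python
from collections import Counter

def unigram_model(tokens, vocab):
    """Add-one smoothed unigram model over a fixed vocabulary (illustrative choice)."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def interpolate(models, weights):
    """Linear interpolation: p(w) = sum_i weights[i] * models[i][w]."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    vocab = models[0].keys()
    return {w: sum(lam * m[w] for lam, m in zip(weights, models)) for w in vocab}

# Toy data: a general model and a model estimated from slide text.
vocab = {"hidden", "markov", "model", "the", "lecture"}
general = unigram_model("the lecture the model".split(), vocab)
slides = unigram_model("hidden markov model".split(), vocab)
adapted = interpolate([general, slides], [0.7, 0.3])
```

Words that occur in the slides, such as "hidden", receive more probability in `adapted` than in `general`, which is the effect that slide-based adaptation relies on.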
international conference on frontiers in handwriting recognition | 2012
Patrick Doetsch; Mahdi Hamdani; Hermann Ney; Adrià Giménez; Jesús Andrés-Ferrer; Alfons Juan
In this paper, a vertical repositioning method based on the center of gravity is investigated for handwriting recognition systems and evaluated on databases containing Arabic and French handwriting. Experiments show that vertical distortion in images has a large impact on the performance of HMM-based handwriting recognition systems. Recently, good results were obtained with Bernoulli HMMs (BHMMs) using preprocessing with vertical repositioning of binarized images. In order to isolate the effect of the preprocessing from the BHMM model, experiments were conducted with Gaussian HMMs and the LSTM-RNN tandem HMM approach, with relative improvements of 33% WER on the Arabic and up to 62% on the French database.
international conference on document analysis and recognition | 2011
Adrià Giménez; Jesús Andrés-Ferrer; Alfons Juan; Nicolás Serrano
Bernoulli-based models such as Bernoulli mixtures or Bernoulli HMMs (BHMMs) have been successfully applied to several handwritten text recognition (HTR) tasks, which range from character recognition to continuous and isolated handwritten words. All these models belong to the generative model family and, hence, are usually trained by (joint) maximum likelihood estimation (MLE). Despite the good properties of the MLE criterion, there are better training criteria such as maximum mutual information (MMI). MMI is a widespread criterion that is mainly employed to train discriminative models such as log-linear (or maximum entropy) models. Inspired by the Bernoulli mixture classifier, in this work a log-linear model for binary data is proposed, the so-called mixture of multi-class logistic regression. The proposed model is proved to be equivalent to the Bernoulli mixture classifier. In this way, we give a discriminative training framework for Bernoulli mixture models. The proposed discriminative training framework is applied to a well-known Indian digit recognition task.
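The link between Bernoulli models and log-linear models can be checked numerically for the simplest case. The sketch below covers a single Bernoulli component only, not the mixture treated in the paper, and the function names are hypothetical. It shows that the Bernoulli log-likelihood of a bit vector x equals a linear function of x with weights log(p_d / (1 - p_d)) and bias sum_d log(1 - p_d).

```python
import math

def bernoulli_log_likelihood(x, p):
    """log p(x) for independent Bernoulli variables with parameters p (0 < p_d < 1)."""
    return sum(math.log(pd) if xd else math.log(1 - pd) for xd, pd in zip(x, p))

def log_linear_form(x, p):
    """The same quantity written as weights . x + bias."""
    weights = [math.log(pd / (1 - pd)) for pd in p]
    bias = sum(math.log(1 - pd) for pd in p)
    return sum(w * xd for w, xd in zip(weights, x)) + bias
```

Both functions return the same value for any bit vector, for example `x = [1, 0, 1]` with `p = [0.9, 0.2, 0.6]`.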
Pattern Analysis and Applications | 2010
Jesús Andrés-Ferrer; Alfons Juan
The naive Bayes assumption in text classification has the advantage of greatly simplifying maximum likelihood estimation of unknown class-conditional word occurrence probabilities. However, these estimates are usually modified by application of a heuristic parameter smoothing technique to avoid (over-fitted) null estimates. In this work, we advocate the reduction of the parameter domain instead of parameter smoothing. This leads to a constrained domain maximum likelihood estimation problem for which we provide an iterative algorithm that solves it optimally.
IberSPEECH 2014 Proceedings of the Second International Conference on Advances in Speech and Language Technologies for Iberian Languages - Volume 8854 | 2014
M. A. del-Agua; Adrià Giménez; Nicolás Serrano; Jesús Andrés-Ferrer; Jorge Civera; Alberto Sanchis; Alfons Juan
Over the past few years, online multimedia educational repositories have increased in number and popularity. The main aim of the transLectures project is to develop cost-effective solutions for producing accurate transcriptions and translations for large video lecture repositories, such as VideoLectures.NET or the Universitat Politècnica de València's repository, poliMedia. In this paper, we present the transLectures-UPV toolkit, TLK, which has been specifically designed to meet the requirements of the transLectures project, but can also be used as a conventional ASR toolkit. The main features of the current release include HMM training and decoding with speaker adaptation techniques (fCMLLR). TLK has been tested on the VideoLectures.NET and poliMedia repositories, yielding very competitive results. TLK has been released under the permissive open source Apache License v2.0 and can be directly downloaded from the transLectures website.
international conference on image analysis and processing | 2013
Ihab Khoury; Adrià Giménez; Alfons Juan; Jesús Andrés-Ferrer
Hidden Markov Models (HMMs) are now widely used for off-line text recognition in many languages and, in particular, Arabic. In previous work, we proposed to directly use columns of raw, binary image pixels, which are fed into embedded Bernoulli (mixture) HMMs, that is, embedded HMMs in which the emission probabilities are modeled with Bernoulli mixtures. The idea was to bypass feature extraction and to ensure that no discriminative information is filtered out during feature extraction, which in some sense is integrated into the recognition model. More recently, we extended the column bit vectors by means of a sliding window of adequate width to better capture image context at each horizontal position of the word image. However, these models might have limited capability to properly model vertical image distortions. In this paper, we consider three methods of window repositioning after window extraction to overcome this limitation. Each sliding window is translated (repositioned) to align its center to the center of mass. Using this approach, state-of-the-art results are reported on the Arabic Printed Text Recognition (APTI) database.
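Repositioning a window so that its center coincides with the center of mass can be sketched for the vertical direction as follows. The abstract does not spell out the three repositioning methods, so this example covers only a vertical shift; the function name, the rounding of the shift, and the zero fill for vacated rows are assumptions.

```python
def reposition_vertically(window):
    """Shift the rows of a binary window so its vertical center of mass sits on the middle row.

    `window` is a list of rows of 0/1 pixels. Rows shifted outside the window
    are dropped and vacated rows are filled with zeros (assumptions of this sketch).
    """
    rows = len(window)
    cols = len(window[0]) if rows else 0
    mass = sum(sum(row) for row in window)
    if mass == 0:
        return [row[:] for row in window]  # empty window: nothing to align
    center_of_mass = sum(r * sum(row) for r, row in enumerate(window)) / mass
    shift = round((rows - 1) / 2 - center_of_mass)
    out = [[0] * cols for _ in range(rows)]
    for r, row in enumerate(window):
        target = r + shift
        if 0 <= target < rows:
            out[target] = row[:]
    return out
```

A window whose ink sits entirely in the top row is moved so that the ink lands on the middle row, which removes the vertical offset before the bit vector is passed to the model.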
SSPR&SPR'10 Proceedings of the 2010 joint IAPR international conference on Structural, syntactic, and statistical pattern recognition | 2010
Jesús Andrés-Ferrer; Germán Sanchis-Trilles; Francisco Casacuberta
In this paper, we present a novel clustering approach based on the use of kernels as similarity functions and the C-means algorithm. Several word-sequence kernels are defined and extended to verify the properties of similarity functions. Afterwards, these monolingual word-sequence kernels are extended to bilingual word-sequence kernels, and applied to the task of monolingual and bilingual sentence clustering. The motivation of this proposal is to group similar sentences into clusters so that specialised models can be trained for each cluster, with the purpose of reducing both the size and complexity of the initial task. We provide empirical evidence that the use of bilingual kernels can lead to better clusters, in terms of intra-cluster perplexities.
international conference on acoustics, speech, and signal processing | 2009
Jesús Andrés-Ferrer; Hermann Ney
The problem of estimating the parameters of an n-gram language model is a typical problem of estimating small probabilities. So far, two methods have been proposed and used to handle this problem: (1) the empirical Bayes method, resulting in the Turing-Good estimates, which do not have any constraints and tend to be very noisy; and (2) discounting models like absolute (or linear) discounting, which are heavily constrained and typically have only a single free parameter. Both methods can be formulated in a leaving-one-out framework. In this paper, we study methods that lie between these two extremes. We design models with various types of constraints and derive efficient algorithms for estimating the parameters of these models. We propose two novel types of constraints or models: interval constraints and the exact extended Kneser-Ney model. The proposed methods are implemented and applied to language modelling in order to compare the methods in terms of perplexities. The results show that the new constrained methods outperform other unconstrained methods.
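As a point of reference for the heavily constrained end of this spectrum, a bigram model with absolute discounting and a single free parameter `d` can be sketched as below. This is the common interpolated variant with a unigram back-off distribution, chosen here for illustration; it is not one of the models proposed in the paper, and the function name and default `d=0.5` are assumptions.

```python
from collections import Counter, defaultdict

def absolute_discounting(tokens, d=0.5):
    """Return p(w | h) for a bigram model with absolute discounting (0 < d <= 1)."""
    unigrams = Counter(tokens)
    total = len(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    history_count = Counter()
    followers = defaultdict(set)
    for (h, w), c in bigrams.items():
        history_count[h] += c
        followers[h].add(w)

    def prob(w, h):
        p_uni = unigrams[w] / total
        n_h = history_count[h]
        if n_h == 0:
            return p_uni  # unseen history: fall back to the unigram estimate
        # Subtract d from every seen bigram count ...
        discounted = max(bigrams[(h, w)] - d, 0.0) / n_h
        # ... and redistribute the freed mass according to the unigram model.
        backoff_mass = d * len(followers[h]) / n_h
        return discounted + backoff_mass * p_uni

    return prob
```

For the toy sequence `"a b a b a c".split()`, the probabilities `prob(w, "a")` sum to one over the vocabulary, and the unseen bigram "a a" still receives non-zero probability through the back-off term.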