
Publication


Featured research published by Nelly Barbot.


Computational Approaches to Analogical Reasoning | 2014

Analogical Proportions in a Lattice of Sets of Alignments Built on the Common Subwords in a Finite Language

Laurent Miclet; Nelly Barbot; Baptiste Jeudy

We define the locally maximal subwords and locally minimal superwords common to a finite set of words, along with the corresponding sets of alignments. We give a partial order relation between such sets of alignments, as well as two operations between them, and show that the constructed family of sets of alignments has a lattice structure. The study of analogical proportions in lattices suggests using this structure as a basis for machine learning, with the aim of inducing a generalization of the set of words.
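As an illustration of the building blocks above, here is a minimal brute-force sketch (in Python, which the paper does not use; the function names are hypothetical) that computes the subwords, taken here as subsequences, common to a finite set of words and keeps the locally maximal ones, i.e. those that are not subwords of another common subword. It is exponential in word length and only meant for toy examples.

```python
from itertools import combinations

def subwords(w):
    """All subwords (subsequences) of w, including the empty word."""
    return {"".join(w[i] for i in idx)
            for r in range(len(w) + 1)
            for idx in combinations(range(len(w)), r)}

def is_subword(u, w):
    """True if u is a subsequence of w."""
    it = iter(w)
    return all(c in it for c in u)

def maximal_common_subwords(words):
    """Common subwords of all words, maximal for the subword order."""
    common = set.intersection(*(subwords(w) for w in words))
    return {u for u in common
            if not any(u != v and is_subword(u, v) for v in common)}

print(maximal_common_subwords(["abcab", "acba"]))  # e.g. {'acb', 'aca', ...}
```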


Text, Speech and Dialogue | 2006

Comparing B-spline and spline models for F0 modelling

Damien Lolive; Nelly Barbot; Olivier Boëffard

This article describes a new approach to estimating F0 curves using B-spline and spline models characterized by a knot sequence and associated control points. The free parameters of the model are the number of knots and their locations. The free-knot placement, which is an NP-hard problem, is done using a global MLE (maximum likelihood estimation) within a simulated-annealing strategy. Experiments are conducted in a speech processing context on a 7,000-syllable French corpus. We estimate the two competing models for increasing numbers of free parameters and show that the B-spline model performs slightly better than the spline model in terms of RMS error.
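A hedged sketch of the fitting step, assuming SciPy's LSQUnivariateSpline as the least-squares B-spline estimator and a crude random search in place of the paper's simulated-annealing strategy for free-knot placement; the synthetic F0 contour and all names are illustrative, not the authors' setup.

```python
import numpy as np
from scipy.interpolate import LSQUnivariateSpline

rng = np.random.default_rng(0)

# Synthetic stand-in for an F0 contour (Hz) sampled at a constant step.
t = np.linspace(0.0, 1.0, 200)
f0 = 120 + 30 * np.sin(2 * np.pi * 2 * t) + rng.normal(0, 2, t.size)

def rms_for_knots(knots):
    """Least-squares cubic B-spline fit with the given interior knots."""
    try:
        spline = LSQUnivariateSpline(t, f0, knots, k=3)
    except ValueError:   # knots violating Schoenberg-Whitney conditions
        return np.inf
    return np.sqrt(np.mean((spline(t) - f0) ** 2))

# Crude random search over free knot locations (the paper uses simulated
# annealing; this loop is only a placeholder for that global optimisation).
n_knots = 6
best = min((sorted(rng.uniform(0.05, 0.95, n_knots)) for _ in range(500)),
           key=rms_for_knots)
print("best knots:", np.round(best, 3), "RMS:", round(rms_for_knots(best), 2))
```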


IEEE Journal of Selected Topics in Signal Processing | 2010

B-Spline Model Order Selection With Optimal MDL Criterion Applied to Speech Fundamental Frequency Stylization

Damien Lolive; Nelly Barbot; Olivier Boëffard

In the speech processing field, the stylization of fundamental frequency (F0) has been the subject of numerous works. Models proposed in the literature rely on knowledge stemming from phonology and linguistics. We propose an approach to F0 curve stylization that requires as few linguistic assumptions as possible, in the framework of B-spline models. A B-spline model, characterized by a sequence of knots with which control points are associated, enables the formalization of discontinuities in the derivatives of the observed sequence of values. Beyond the implementation of a B-spline model to stylize an open curve sampled at a constant step, we address the problem of choosing the optimal model order. We propose a parsimony criterion based on a minimum description length (MDL) approach in order to optimize the number of knots. We derive several criteria relying on bounds estimated from parameter values and demonstrate the optimality of these choices in the theoretical MDL framework. We introduce a notion of variable parameter precision, which enables a good compromise between modeling precision and the degrees of freedom of the estimated models. Experiments performed on a French speech corpus compare three MDL criteria. The combination of a B-spline model and the MDL methodology enables efficient modeling of F0 curves, providing an RMS error of around 1 Hz while allowing a relatively high compression rate of about 40%.
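The sketch below illustrates model order selection with a generic two-part MDL-style criterion (data code length plus parameter code length), not the bound-based, variable-precision criteria derived in the paper; the synthetic contour and uniform knot placement are assumptions made for the example.

```python
import numpy as np
from scipy.interpolate import LSQUnivariateSpline

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 300)
f0 = 110 + 25 * np.sin(2 * np.pi * 3 * t) ** 2 + rng.normal(0, 1.5, t.size)

def mdl_score(n_knots, k=3):
    """Generic two-part MDL: data code length + parameter code length.
    A stand-in for the paper's bound-based criteria."""
    knots = np.linspace(0, 1, n_knots + 2)[1:-1]   # uniform interior knots
    spline = LSQUnivariateSpline(t, f0, knots, k=k)
    rss = float(np.sum((spline(t) - f0) ** 2))
    n, n_params = t.size, n_knots + k + 1          # number of control points
    return 0.5 * n * np.log(rss / n) + 0.5 * n_params * np.log(n)

best = min(range(2, 40), key=mdl_score)
print("MDL-optimal number of interior knots:", best)
```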


Computational Linguistics | 2015

Large linguistic corpus reduction with SCP algorithms

Nelly Barbot; Olivier Boëffard; Jonathan Chevelu; Arnaud Delhay

Linguistic corpus design is a critical concern for building rich annotated corpora useful in different application domains. For example, speech technologies such as ASR (automatic speech recognition) or TTS (text-to-speech) need a huge amount of speech data to train data-driven models or to produce synthetic speech. Collecting data always entails costs (recording speech, verifying annotations, etc.), and as a rule of thumb, the more data you gather, the more costly your application will be. Within this context, we present in this article solutions to reduce the amount of linguistic text content while maintaining the level of linguistic richness required by a model or an application. This problem can be formalized as a set covering problem (SCP), and we evaluate two algorithmic heuristics applied to the design of large text corpora in English and French covering phonological information or POS labels. The first algorithm considered is a standard greedy solution with an agglomerative/splitting strategy; the second, which we propose, is based on Lagrangian relaxation. The latter approach provides a lower bound on the cost of each covering solution. This lower bound can be used as a metric to evaluate the quality of a reduced corpus, whatever algorithm is applied. Experiments show that a suboptimal algorithm like the greedy one achieves good results: the cost of its solutions is not far from the lower bound (about 4.35% for 3-phoneme coverings). Constraints in SCP are usually binary; we propose here a generalization where the constraints on each covering feature can be multi-valued.
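A minimal sketch of the standard greedy set-covering heuristic mentioned above, applied to a toy corpus mapping sentences to the features they cover; the Lagrangian-relaxation algorithm and its lower bound are not reproduced here, and the unit sentence cost is an assumption.

```python
def greedy_corpus_reduction(corpus):
    """corpus: {sentence_id: set of features (e.g. diphones, POS labels)}.
    Returns a sub-corpus whose union of features covers the whole corpus."""
    to_cover = set().union(*corpus.values())
    selected, covered = [], set()
    while covered != to_cover:
        # Each sentence costs 1; a real design would weight by length.
        sid = max(corpus, key=lambda s: len(corpus[s] - covered))
        if not corpus[sid] - covered:
            break                      # no sentence adds new features
        selected.append(sid)
        covered |= corpus[sid]
    return selected

toy = {
    "s1": {"a-b", "b-c"},
    "s2": {"b-c", "c-a"},
    "s3": {"a-b", "c-a", "a-a"},
}
print(greedy_corpus_reduction(toy))   # e.g. ['s3', 's1']
```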


Text, Speech and Dialogue | 2006

Evaluating language models within a predictive framework: an analysis of ranking distributions

Pierre Alain; Olivier Boëffard; Nelly Barbot

Perplexity is a widely used criterion for comparing language models without any task assumptions. However, its main drawback is that perplexity presupposes probability distributions and hence cannot compare heterogeneous models. As an evaluation framework, we propose in this article to abandon perplexity and to extend Shannon's entropy idea, which is based on model prediction performance, using rank-based statistics. Our methodology is able to predict joint word sequences independently of the task or model assumptions. Experiments are carried out on the English language with different kinds of language models. We show that long-term prediction language models are not more effective than standard n-gram models. Ranking distributions follow exponential laws, as already observed in predicting letter sequences. These distributions show a second mode not observed with letters, and we propose an interpretation of this mode in this article.
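A minimal sketch of the rank-based idea, assuming a toy bigram count model as the predictor: for each test position, all vocabulary words are ranked by model score and the rank of the word that actually occurred is recorded, so heterogeneous models can be compared through their rank distributions without any probability assumption.

```python
from collections import Counter, defaultdict

def rank_distribution(train, test):
    """Rank of each actual next word under a toy bigram count model.
    Any scoring model could be plugged in; ranks make no probability
    assumption, which is the point of the framework."""
    bigrams = defaultdict(Counter)
    vocab = set(train) | set(test)
    for w1, w2 in zip(train, train[1:]):
        bigrams[w1][w2] += 1
    ranks = Counter()
    for w1, w2 in zip(test, test[1:]):
        # Order candidates by bigram count (ties broken alphabetically).
        ordered = sorted(vocab, key=lambda w: (-bigrams[w1][w], w))
        ranks[ordered.index(w2) + 1] += 1
    return ranks

train = "the cat sat on the mat the cat ran".split()
test = "the cat sat on the mat".split()
print(rank_distribution(train, test))   # Counter mapping rank -> frequency
```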


Archive | 2008

Pitch and Duration Transformation with Non-parallel Data

Damien Lolive; Nelly Barbot; Olivier Boëffard


ISCA Speech Synthesis Workshop (SSW8) | 2013

Evaluation of contextual descriptors for HMM-based speech synthesis in French

Sébastien Le Maguer; Nelly Barbot; Olivier Boëffard


SSW | 2007

Lagrangian relaxation for optimal corpus design

Jonathan Chevelu; Nelly Barbot; Olivier Boëffard; Arnaud Delhay


European Conference on Artificial Intelligence | 2014

From analogical proportions in lattices to proportional analogies in formal concepts

Laurent Miclet; Nelly Barbot; Henri Prade


Language Resources and Evaluation | 2008

Comparing Set-Covering Strategies for Optimal Corpus Design

Jonathan Chevelu; Nelly Barbot; Olivier Boëffard; Arnaud Delhay

Collaboration


Top co-authors of Nelly Barbot:

Laurent Miclet (École Normale Supérieure)
Henri Prade (University of Technology)