Xavier Bost
University of Avignon
Publication
Featured research published by Xavier Bost.
Spoken Language Technology Workshop | 2014
Xavier Bost; Georges Linarès
Speaker diarization, usually denoted as the “who spoke when” task, turns out to be particularly challenging when applied to fictional films, where many characters talk in various acoustic conditions (background music, sound effects...). Despite this acoustic variability, such movies exhibit specific visual patterns in the dialogue scenes. In this paper, we introduce a two-step method to achieve speaker diarization in TV series: speaker diarization is first performed locally in the scenes detected as dialogues; then, the hypothesized local speakers are merged in a second agglomerative clustering process, with the constraint that speakers locally hypothesized to be distinct must not be assigned to the same cluster. The performance of our approach is compared to that obtained by standard speaker diarization tools applied to the same data.
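The second step described above, agglomerative clustering under a "must not merge" constraint, can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional feature vectors, the Euclidean centroid distance, the stopping threshold, and the function name `constrained_agglomerative` are all assumptions made for illustration. Local speakers that come from the same scene are treated as the ones hypothesized to be distinct.

```python
import math
from itertools import combinations


def constrained_agglomerative(points, scene_ids, threshold):
    """Merge local speakers bottom-up, never joining two speakers of one scene.

    points     -- one feature vector per local speaker (hypothetical features)
    scene_ids  -- scene in which each local speaker was hypothesized
    threshold  -- assumed stopping criterion on the centroid distance
    """
    clusters = [[i] for i in range(len(points))]

    def centroid(cluster):
        dims = len(points[0])
        return [sum(points[i][d] for i in cluster) / len(cluster) for d in range(dims)]

    def allowed(a, b):
        # Cannot-link constraint: clusters sharing a scene contain speakers
        # that were locally hypothesized to be distinct.
        return not ({scene_ids[i] for i in a} & {scene_ids[i] for i in b})

    while True:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            if not allowed(clusters[a], clusters[b]):
                continue
            dist = math.dist(centroid(clusters[a]), centroid(clusters[b]))
            if dist <= threshold and (best is None or dist < best[0]):
                best = (dist, a, b)
        if best is None:
            return clusters
        _, a, b = best  # a < b, so deleting b leaves index a valid
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]


# Two scenes with two local speakers each (made-up data).
points = [(0.0, 0.0), (5.0, 5.0), (0.1, 0.0), (0.2, 0.1)]
scene_ids = [0, 0, 1, 1]
result = constrained_agglomerative(points, scene_ids, threshold=1.0)
print(result)
```

In this toy input, speaker 3 is acoustically close to speakers 0 and 2, but it shares a scene with speaker 2, so it stays in its own cluster once 0 and 2 have been merged.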
International Conference on Acoustics, Speech, and Signal Processing | 2015
Xavier Bost; Georges Linarès; Serigne Gueye
Speaker diarization may be difficult to achieve when applied to narrative films, where speakers usually talk in adverse acoustic conditions: background music, sound effects, wide variations in intonation may hide the inter-speaker variability and make audio-based speaker diarization approaches error prone. On the other hand, such fictional movies exhibit strong regularities at the image level, particularly within dialogue scenes. In this paper, we propose to perform speaker diarization within dialogue scenes of TV series by combining the audio and video modalities: speaker diarization is first performed by using each modality; the two resulting partitions of the instance set are then optimally matched, before the remaining instances, corresponding to cases of disagreement between both modalities, are finally processed. The results obtained by applying such a multi-modal approach to fictional films turn out to outperform those obtained by relying on a single modality.
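The matching step between the audio-based and video-based partitions can be sketched as follows. This is an illustrative assumption rather than the paper's procedure: the matching criterion (maximum total overlap), the brute-force search over one-to-one assignments, and the name `match_partitions` are hypothetical, and the exhaustive search only suits a handful of clusters. The sketch stops at identifying the disputed instances and does not say how they are processed afterwards.

```python
from collections import Counter
from itertools import permutations


def match_partitions(audio_labels, video_labels):
    """Match audio clusters to video clusters one-to-one by maximum overlap.

    Returns the mapping, the instances on which both modalities agree under
    that mapping, and the remaining (disputed) instances.
    """
    overlap = Counter(zip(audio_labels, video_labels))
    audio_ids = sorted(set(audio_labels))
    video_ids = sorted(set(video_labels))
    # Pad with None so every audio cluster receives a (possibly empty) match.
    padded = video_ids + [None] * max(0, len(audio_ids) - len(video_ids))

    best_score, best_map = -1, {}
    for perm in permutations(padded, len(audio_ids)):
        score = sum(overlap[(a, v)] for a, v in zip(audio_ids, perm))
        if score > best_score:
            best_score, best_map = score, dict(zip(audio_ids, perm))

    agreed = [i for i, (a, v) in enumerate(zip(audio_labels, video_labels))
              if best_map[a] == v]
    agreed_set = set(agreed)
    disputed = [i for i in range(len(audio_labels)) if i not in agreed_set]
    return best_map, agreed, disputed


# Made-up cluster labels for six speech instances.
audio = ["A", "A", "B", "B", "B", "A"]
video = ["x", "x", "y", "y", "x", "y"]
mapping, agreed, disputed = match_partitions(audio, video)
print(mapping, agreed, disputed)
```

Here the best assignment pairs audio cluster A with video cluster x and B with y, leaving instances 4 and 5 as the cases of disagreement between the two modalities.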
Advances in Social Networks Analysis and Mining | 2016
Xavier Bost; Vincent Labatut; Serigne Gueye; Georges Linarès
Modern popular TV series often develop complex storylines spanning several seasons, but are usually watched in quite a discontinuous way. As a result, the viewer generally needs a comprehensive summary of the previous season's plot before the new one starts. Generating such summaries first requires identifying and characterizing the dynamics of the series subplots. One way of doing so is to study the underlying social network of interactions between the characters involved in the narrative. The standard tools used in the Social Networks Analysis field to extract such a network rely on an integration of time, either over the whole considered period, or as a sequence of several time-slices. However, they turn out to be inappropriate in the case of TV series, because the scenes shown onscreen alternately focus on parallel storylines, and do not necessarily respect a traditional chronology. In this article, we introduce narrative smoothing, a novel, still exploratory, network extraction method. It smooths the relationship dynamics based on the plot properties, aiming to overcome some of the limitations of the standard approaches. In order to assess our method, we apply it to a new corpus of 3 popular TV series, and compare it to both standard approaches. Our results are promising, showing that narrative smoothing leads to more relevant observations when it comes to the characterization of the protagonists and their relationships. It could be used as a basis for further modeling the intertwined storylines constituting TV series plots.
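The two standard extraction approaches mentioned above, integration over the whole period and a sequence of time-slices, can be sketched in a few lines. The sketch rests on assumptions not stated in the abstract: an interaction is taken to be co-occurrence in a scene, a slice is a fixed number of consecutive scenes, and the character names are invented. It does not implement narrative smoothing itself.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(scenes):
    """Cumulative network: edge weight = number of scenes shared by two characters."""
    weights = Counter()
    for characters in scenes:
        for pair in combinations(sorted(set(characters)), 2):
            weights[pair] += 1
    return weights


def time_slices(scenes, size):
    """Dynamic network: one co-occurrence network per window of `size` scenes."""
    return [cooccurrence(scenes[i:i + size]) for i in range(0, len(scenes), size)]


# Two parallel storylines shown in alternation (made-up scene list).
scenes = [["Ann", "Bob"], ["Cat", "Dan"], ["Ann", "Bob"], ["Cat", "Dan"]]
cumulative = cooccurrence(scenes)
slices = time_slices(scenes, size=1)
print(cumulative)
print(slices)
```

With narrow slices, the Ann and Bob relationship disappears from every other slice simply because the other storyline is on screen, while the cumulative network hides any temporal dynamics. This is the kind of limitation the abstract attributes to the standard approaches.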
Computer Speech & Language | 2015
Xavier Bost; G. Senay; Marc El-Bèze; R. De Mori
Highlights:
A multiple classification method for multiple theme hypothesization is proposed.
Four methods, one of which is new, are initially used and separately evaluated.
A new sequential decision strategy for multiple theme hypothesization is introduced.
A new hypothesis refinement component is presented, based on ASR word lattice.
Results show that the strategy makes it possible to obtain reliable service surveys.
The paper deals with the automatic analysis of real-life telephone conversations between an agent and a customer of a customer care service (CCS). The application domain is the public transportation system in Paris and the purpose is to collect statistics about customer problems in order to monitor the service and decide priorities on the intervention for improving user satisfaction.
Of primary importance for the analysis is the detection of themes that are the object of customer problems. Themes are defined in the application requirements and are part of the application ontology that is implicit in the CCS documentation.
Due to the variety of the customer population, the structure of conversations with an agent is unpredictable. A conversation may be about one or more themes. Theme mentions can be interleaved with mentions of facts that are irrelevant for the application purpose. Furthermore, in certain conversations theme mentions are localized in specific conversation segments while in other conversations mentions cannot be localized. As a consequence, approaches to feature extraction with and without mention localization are considered.
Application domain relevant themes identified by an automatic procedure are expressed by specific sentences whose words are hypothesized by an automatic speech recognition (ASR) system. The ASR system is error prone. The word error rates can be very high for many reasons. Among them it is worth mentioning unpredictable background noise, speaker accent, and various types of speech disfluencies.
As the application task requires the composition of proportions of theme mentions, a sequential decision strategy is introduced in this paper for performing a survey of the large number of conversations made available in a given time period. The strategy has to sample the conversations to form a survey containing enough data analyzed with high accuracy so that proportions can be estimated with sufficient accuracy.
Due to the unpredictable type of theme mentions, it is appropriate to consider methods for theme hypothesization based on global as well as local feature extraction. Two systems based on each type of feature extraction will be considered by the strategy. One of the four methods is novel. It is based on a new definition of density of theme mentions and on the localization of high density zones whose boundaries do not need to be precisely detected.
The sequential decision strategy starts by grouping theme hypotheses into sets of different expected accuracy and coverage levels. For those sets for which accuracy can be improved with a consequent increase of coverage, a new system with new features is introduced. Its execution is triggered only when specific preconditions are met on the hypotheses generated by the basic four systems.
Experimental results are provided on a corpus collected in the call center of the Paris transportation system known as RATP. The results show that surveys with high accuracy and coverage can be composed with the proposed strategy and systems. This makes it possible to apply a previously published proportion estimation approach that takes into account hypothesization errors.
arXiv: Multimedia | 2018
Xavier Bost; Vincent Labatut; Serigne Gueye; Georges Linarès
Identifying and characterizing the dynamics of modern TV series subplots is an open problem. One way is to study the underlying social network of interactions between the characters. Standard dynamic network extraction methods rely on temporal integration, either over the whole considered period, or as a sequence of several time-slices. However, they turn out to be inappropriate in the case of TV series, because the scenes shown onscreen alternately focus on parallel storylines, and do not necessarily respect a traditional chronology. In this article, we introduce narrative smoothing, a novel network extraction method taking advantage of the plot properties to solve some of the limitations of these standard methods. We apply our method to a corpus of 3 popular series, and compare it to both standard approaches. Narrative smoothing leads to more relevant observations when it comes to the characterization of the protagonists and their relationships, confirming its appropriateness to model the intertwined storylines constituting the plots.
CLEF (Working Notes) | 2013
Jean-Valère Cossu; Benjamin Bigot; Ludovic Bonnefoy; Mohamed Morchid; Xavier Bost; Grégory Senay; Richard Dufour; Vincent Bouvier; Juan-Manuel Torres-Moreno; Marc El-Bèze
arXiv: Computation and Language | 2013
Xavier Bost; Ilaria Brunetti; Luis Adrián Cabrera-Diego; Jean-Valère Cossu; Andréa Carneiro Linhares; Mohamed Morchid; Juan-Manuel Torres-Moreno; Marc El-Bèze; Richard Dufour
Conference of the International Speech Communication Association | 2013
Xavier Bost; Marc El-Bèze; Renato De Mori
7ème Conférence sur les modèles et l'analyse de réseaux : approches mathématiques et informatiques (MARAMI) | 2016
Xavier Bost; Vincent Labatut; Serigne Gueye; Georges Linarès
arXiv: Computation and Language | 2015
Jean-Valère Cossu; Ludovic Bonnefoy; Xavier Bost; Marc El-Bèze