Richard Dufour
University of Avignon
Publications
Featured research published by Richard Dufour.
Pattern Recognition Letters | 2014
Mohamed Morchid; Richard Dufour; Pierre-Michel Bousquet; Georges Linarès; Juan-Manuel Torres-Moreno
Social networks have become a major actor in massive information propagation. In the context of the Twitter platform, its popularity is due in part to the capability of relaying messages (i.e. tweets) posted by users. This particular mechanism, called retweet, allows users to massively share tweets they consider potentially interesting for others. In this paper, we propose to study the behavior of tweets that have been massively retweeted in a short period of time. We first analyze specific tweet features through a Principal Component Analysis (PCA) to better understand the behavior of highly forwarded tweets as opposed to those retweeted only a few times. We then propose to automatically detect massively retweeted messages. The qualitative study is used to select the features allowing the best classification performance. We show that selecting only the most correlated features leads to the best classification accuracy (F-measure of 65.7%), with a gain of about 2.4 points compared to using the complete set of features.
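To make the pipeline concrete, here is a minimal sketch of the two steps described above (PCA-based feature analysis, then classification on the most correlated features) using scikit-learn on synthetic data; the feature matrix, labels and the number of retained features are illustrative assumptions, not the authors' setup.

```python
# A minimal sketch, assuming a hypothetical feature matrix X (one row of tweet
# features per message) and binary labels y (1 = massively retweeted).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 12))          # 12 hypothetical tweet features
y = rng.integers(0, 2, size=1000)        # 1 = massively retweeted

# Qualitative step: PCA to inspect how features load on the main components.
pca = PCA(n_components=2).fit(X)
loadings = pca.components_               # per-feature loadings on each component

# Keep only the features most correlated with the first component,
# then evaluate a classifier with the reduced feature set.
top = np.argsort(np.abs(loadings[0]))[-5:]
score = cross_val_score(SVC(), X[:, top], y, scoring="f1", cv=5).mean()
print(f"F1 with the {len(top)} most correlated features: {score:.3f}")
```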
International Conference on Acoustics, Speech, and Signal Processing | 2014
Mohamed Morchid; Richard Dufour; Pierre-Michel Bousquet; Mohamed Bouallegue; Georges Linarès; Renato De Mori
In this paper, we study the impact of dialogue representations and classification methods on the task of theme identification for telephone conversation services whose automatic transcriptions are highly imperfect. Two dialogue representations are first compared: the classical Term Frequency-Inverse Document Frequency with Gini purity criterion (TF-IDF-Gini) method and the Latent Dirichlet Allocation (LDA) approach. We then propose to study an original classification method that takes advantage of the LDA topic space representation, highlighted as the best dialogue representation. To do so, two assumptions about topic representation led us to choose a Gaussian process (GP) based method. This approach is compared with a Support Vector Machine (SVM) classification method. Results show that the GP approach is a better solution for dealing with the multiple-theme complexity of a dialogue, regardless of the conditions studied (manual or automatic transcriptions). We finally discuss the impact of topic space reduction on classification accuracy.
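A minimal sketch of the comparison described above, assuming toy dialogue transcriptions and theme labels in place of the DECODA data: LDA topic posteriors serve as the dialogue representation, and a Gaussian process classifier is compared with an SVM in that topic space.

```python
# Illustrative data only; the real corpus and preprocessing are not reproduced here.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

docs = ["lost my travel card", "bus schedule question", "refund for my pass"] * 50
themes = ["lost_item", "schedule", "refund"] * 50

# Dialogue representation: LDA topic posteriors instead of raw TF-IDF-Gini weights.
counts = CountVectorizer().fit_transform(docs)
topics = LatentDirichletAllocation(n_components=10, random_state=0).fit_transform(counts)

# Classification in the topic space: Gaussian process vs. SVM.
for name, clf in [("GP", GaussianProcessClassifier()), ("SVM", SVC())]:
    acc = cross_val_score(clf, topics, themes, cv=5).mean()
    print(f"{name} accuracy in the topic space: {acc:.3f}")
```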
International Conference on Acoustics, Speech, and Signal Processing | 2013
Benjamin Bigot; Grégory Senay; Georges Linarès; Corinne Fredouille; Richard Dufour
Detecting and characterizing, in audiovisual documents, speech utterances in which person names are pronounced is an important cue for spoken content analysis. This paper tackles the problem of retrieving spoken person names in the 1-best ASR outputs of broadcast TV shows. Our assumption is that a person name is a latent variable produced by the lexical context it appears in. Thereby, a spoken name can be derived from ASR outputs even if it has not been proposed by the speech recognition system. A new context model is proposed in order to capture the lexical and structural information surrounding a spoken name. The fundamental hypothesis of this study has been validated on broadcast TV documents available in the context of the REPERE challenge.
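A rough, purely illustrative sketch of the underlying idea (not the authors' exact context model): a classifier is trained to predict a person name from the lexical context surrounding it, so that a name can still be hypothesized when the ASR 1-best does not contain it. The contexts and names below are invented.

```python
# Hypothetical (context window -> name) training pairs; NAME marks the name slot.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

contexts = [
    "the prime minister NAME announced today",
    "interview with the singer NAME about her new album",
    "goal scored by NAME in the last minute",
] * 20
names = ["francois_fillon", "mylene_farmer", "karim_benzema"] * 20

model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
model.fit([c.replace("NAME", "") for c in contexts], names)

# Given an ASR context where the name itself was misrecognized or dropped,
# the context model still proposes the most likely spoken name.
print(model.predict(["the prime minister announced today"]))
```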
IEEE Automatic Speech Recognition and Understanding Workshop | 2009
Richard Dufour; Yannick Estève; Paul Deléglise; Frédéric Béchet
Processing spontaneous speech is one of the many challenges that automatic speech recognition (ASR) systems have to deal with. The main evidence characterizing spontaneous speech is disfluencies (filled pauses, repetitions, repairs and false starts), and many studies have focused on the detection and correction of these disfluencies. In this study we define spontaneous speech as unprepared speech, in opposition to prepared speech, where utterances contain well-formed sentences close to those that can be found in written documents. Disfluencies are of course very good indicators of unprepared speech, but they are not the only ones: ungrammaticality and language register are also important, as are prosodic patterns. This paper proposes a set of acoustic and linguistic features that can be used for characterizing and detecting spontaneous speech segments in large audio databases. Moreover, we introduce a strategy that takes advantage of a global classification process using a probabilistic model, which significantly improves spontaneous speech detection.
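A hedged sketch of how such a detector could be assembled: per-segment acoustic and linguistic features feed a probabilistic classifier, and the resulting segment posteriors are then smoothed over the show to mimic a global classification step. The feature set, classifier and smoothing window are assumptions for illustration, not the paper's exact components.

```python
# Synthetic data standing in for real segment-level features.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Hypothetical features: speech rate, filled-pause rate, repetition rate, LM perplexity.
X = rng.normal(size=(200, 4))
y = rng.integers(0, 2, size=200)            # 1 = spontaneous segment

clf = LogisticRegression().fit(X, y)
p = clf.predict_proba(X)[:, 1]              # local (segment-level) posteriors

# Global step: smooth posteriors over neighbouring segments of the same show.
window = 5
p_smooth = np.convolve(p, np.ones(window) / window, mode="same")
decision = p_smooth > 0.5
print(f"{decision.sum()} segments labelled spontaneous after global smoothing")
```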
Empirical Methods in Natural Language Processing | 2014
Mohamed Morchid; Mohamed Bouallegue; Richard Dufour; Georges Linarès; Driss Matrouf; Renato De Mori
Various studies have highlighted that topic-based approaches give a powerful spoken content representation of documents. Nonetheless, these documents may contain more than one main theme, and their automatic transcription inevitably contains errors. In this study, we propose an original and promising framework based on a compact representation of a textual document to solve issues related to topic space granularity. First, various topic spaces are estimated with different numbers of classes from a Latent Dirichlet Allocation. Then, this multiple topic space representation is compacted into a compact representation, called a c-vector, originally developed in the context of speaker recognition. Experiments are conducted on the DECODA corpus of conversations. Results show the effectiveness of the proposed multi-view compact representation paradigm. Our identification system reaches an accuracy of 85%, with a significant gain of 9 points compared to the baseline (best single topic space configuration).
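A simplified sketch of the multi-view idea, not the exact c-vector recipe (which borrows i-vector style factor analysis from speaker recognition): several LDA topic spaces of different sizes are estimated, their per-document topic posteriors are concatenated, and the resulting long vector is compacted. Data and dimensions are illustrative.

```python
# Plain factor analysis stands in here for the c-vector compaction model.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation, FactorAnalysis
from sklearn.feature_extraction.text import CountVectorizer

docs = ["lost my travel card on the bus", "how much is a monthly pass",
        "the metro was late again this morning"] * 30
counts = CountVectorizer().fit_transform(docs)

# Multiple topic spaces with different granularities (numbers of classes).
views = [LatentDirichletAllocation(n_components=k, random_state=0).fit_transform(counts)
         for k in (5, 10, 20)]
multi_view = np.hstack(views)               # expanded, redundant representation

# Compaction step: remove poorly relevant dimensions of the multi-view vector.
compact = FactorAnalysis(n_components=8, random_state=0).fit_transform(multi_view)
print(compact.shape)                        # one compact vector per document
```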
Spoken Language Technology Workshop | 2014
Mohamed Morchid; Richard Dufour; Mohamed Bouallegue; Georges Linarès
The performance of Automatic Speech Recognition (ASR) systems drops dramatically when transcribing conversations recorded in noisy conditions, and speech analytics suffer from this poor automatic transcription quality. To tackle this difficulty, one solution consists in mapping transcriptions into a space of hidden topics. This abstract representation makes it possible to cope with the drawbacks of the ASR process. The best-known and most commonly used one is the topic-based representation obtained from Latent Dirichlet Allocation (LDA), and several studies have demonstrated the effectiveness and reliability of this high-level representation. During the LDA learning process, the distribution of words in each topic is estimated automatically. Nonetheless, in the context of a classification task, no consideration is given to the targeted classes. Thus, if the targeted application is to find the main theme related to a dialogue, this information should be taken into consideration. In this paper, we propose to compare a classical topic-based representation of a dialogue with a new one based not only on the dialogue content itself (words), but also on the theme related to the dialogue. This original representation is based on the author-topic (AT) model. The effectiveness of the proposed representation is evaluated on a classification task using automatic transcriptions of dialogues between an agent and a customer of the Paris Transportation Company. Experiments confirm that the author-topic model approach clearly outperforms the classical topic representation, with a substantial gain of more than 7% in terms of correctly labeled conversations.
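As an illustration of the representation described above, the sketch below uses gensim's AuthorTopicModel with dialogue themes playing the role of "authors"; the dialogues, theme labels and model sizes are invented, and this is only an approximation of the paper's setup.

```python
# Toy author-topic model where each theme acts as an "author" of its dialogues.
from gensim.corpora import Dictionary
from gensim.models import AuthorTopicModel

dialogues = [["lost", "card", "refund"], ["bus", "late", "schedule"],
             ["monthly", "pass", "price"], ["card", "refund", "form"]]
theme2doc = {"lost_item": [0, 3], "traffic": [1], "fares": [2]}

dictionary = Dictionary(dialogues)
corpus = [dictionary.doc2bow(d) for d in dialogues]

at_model = AuthorTopicModel(corpus=corpus, author2doc=theme2doc,
                            id2word=dictionary, num_topics=4, passes=10)

# The model yields a topic distribution per theme, which can then serve as the
# theme-aware dialogue representation for classification.
for theme in theme2doc:
    print(theme, at_model.get_author_topics(theme))
```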
Speech Communication | 2014
Richard Dufour; Yannick Estève; Paul Deléglise
Processing spontaneous speech is one of the many challenges that automatic speech recognition systems have to deal with. The main characteristics of this kind of speech are disfluencies (filled pauses, repetitions, false starts, etc.), and many studies have focused on their detection and correction. Spontaneous speech is defined in opposition to prepared speech, where utterances contain well-formed sentences close to those found in written documents. Acoustic and linguistic features made available by the use of an automatic speech recognition system are proposed to characterize and detect spontaneous speech segments in large audio databases. To better define this notion of spontaneous speech, the segments of an 11-hour corpus (French broadcast news) were manually labeled according to three classes of spontaneity. We first present a study of these features, and then propose a two-level strategy to automatically assign a class of spontaneity to each speech segment. The proposed system reaches 73.0% precision and 73.5% recall on highly spontaneous speech segments, and 66.8% precision and 69.6% recall on prepared speech segments. A quantitative study shows that the classes of spontaneity provide useful information for characterizing speaker roles. This is confirmed by extending the speech spontaneity characterization approach to build an efficient automatic speaker role recognition system.
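A small hedged sketch of the extension mentioned at the end: per-speaker proportions of the spontaneity classes serve as features for an automatic speaker role classifier. The roles, proportions and classifier choice are invented for illustration.

```python
# One row per speaker: proportions of (high spontaneous, low spontaneous, prepared) segments.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

X = np.array([[0.7, 0.2, 0.1],   # mostly spontaneous -> interviewee
              [0.1, 0.2, 0.7],   # mostly prepared    -> anchor
              [0.3, 0.4, 0.3]])  # mixed              -> reporter
y = ["interviewee", "anchor", "reporter"]

role_clf = RandomForestClassifier(random_state=0).fit(X, y)
print(role_clf.predict([[0.05, 0.15, 0.80]]))   # likely an anchor
```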
International Conference on Acoustics, Speech, and Signal Processing | 2012
Richard Dufour; Géraldine Damnati; Delphine Charlet
This paper addresses the issue of error region detection and characterization in LVCSR transcriptions. It is a well-known phenomenon that errors are not independent and tend to co-occur in automatic transcriptions. We are interested in automatically detecting these so-called error regions. Additionally, in the context of information extraction from TV broadcast news (TVBN) shows, being able to automatically characterize detected error regions is a crucial step towards the definition of suitable recovery strategies. In this paper we propose to classify error regions into four classes, with a particular focus on errors affecting person names. We propose several sequential detection + classification approaches and an integrated sequence labeling approach. We show that our best classification system reaches 70% classification accuracy on automatically detected error regions. Additionally, the overall system is able to detect and correctly characterize 29.6% of the error regions corresponding to a person name, with a precision of 61.9%.
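A simplified sketch of the sequential detection + classification setting: each recognized word receives an error/correct label from confidence-style features, contiguous error words are grouped into error regions, and each region could then be passed to a second classifier (e.g. person-name error vs. other). Features, data and thresholds are illustrative only.

```python
# Synthetic word-level features standing in for real ASR confidence measures.
import numpy as np
from itertools import groupby
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))        # e.g. ASR confidence, LM score, word length
y = rng.integers(0, 2, size=500)     # 1 = word inside an error region

detector = LogisticRegression().fit(X, y)
word_labels = detector.predict(X)

# Group contiguous error words into error regions (start index, length).
regions, i = [], 0
for label, run in groupby(word_labels):
    n = len(list(run))
    if label == 1:
        regions.append((i, n))
    i += n
print(f"{len(regions)} error regions detected")
```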
IEEE Transactions on Audio, Speech, and Language Processing | 2015
Mohamed Morchid; Mohamed Bouallegue; Richard Dufour; Georges Linarès; Driss Matrouf; Renato De Mori
Mapping text documents into an LDA-based topic space is a classical way to extract a high-level representation of text documents. Unfortunately, LDA is highly sensitive to hyper-parameters related to the number of classes, or to word and topic distributions, and there is no systematic way to pre-estimate optimal configurations. Moreover, various hyper-parameter configurations offer complementary views of the document. In this paper, we propose a method based on a two-step process that, first, expands the representation space by using a set of topic spaces and, second, compacts the representation space by removing poorly relevant dimensions. These two steps are based respectively on multi-view LDA-based representation spaces and factor-analysis models. This model provides a view-independent representation of documents while extracting complementary information from a massive multi-view representation. Experiments are conducted on the DECODA conversation corpus and the Reuters-21578 textual dataset. Results show the effectiveness of the proposed multi-view compact representation paradigm. The proposed categorization system reaches an accuracy of 86.5% with automatic transcriptions of conversations from the DECODA corpus and a Macro-F1 of 80% on a classification task over the well-known Reuters-21578 corpus, with a significant gain compared to the baseline (best single topic space configuration), as well as to methods and document representations previously studied.
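The two figures reported above are standard accuracy and macro-averaged F1; as a small reminder, this is how such scores are computed with scikit-learn on hypothetical predictions (the labels below are invented, not the DECODA or Reuters outputs).

```python
# Accuracy and macro-F1 on toy reference/prediction label lists.
from sklearn.metrics import accuracy_score, f1_score

y_true = ["earn", "acq", "crude", "earn", "acq"]
y_pred = ["earn", "acq", "earn", "earn", "crude"]
print(accuracy_score(y_true, y_pred))                    # proportion of correct labels
print(f1_score(y_true, y_pred, average="macro"))         # unweighted mean of per-class F1
```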
6th International Workshop on Spoken Dialog Systems (IWSDS 2015) | 2015
Yannick Estève; Mohamed Bouallegue; Carole Lailler; Mohamed Morchid; Richard Dufour; Georges Linarès; Driss Matrouf; Renato De Mori
This paper describes research on the possibility of integrating different types of word and semantic features for automatically identifying the themes of real-life telephone conversations in a customer care service (CCS). The features are all the words of the application vocabulary, the probabilities of selected discriminative words obtained with latent Dirichlet allocation (LDA), and semantic features obtained with limited human supervision from words and patterns expressing entities and relations of the application ontology. A deep neural network (DNN) is proposed for integrating these features. Experimental results on manual and automatic conversation transcriptions are presented, showing the effective contribution of the integration. The results show how to automatically select a large subset of the test corpus with high precision and recall, making it possible to automatically obtain theme mention proportions over different time periods.
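A hedged sketch of the feature-integration idea: bag-of-words counts, LDA topic posteriors and a few hand-built semantic indicator features are concatenated and fed to a small feed-forward network (here scikit-learn's MLPClassifier stands in for the paper's DNN). All data, feature choices and layer sizes are illustrative assumptions.

```python
# Synthetic dialogues and theme labels; the semantic indicators are toy
# stand-ins for ontology-based entity/relation patterns.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neural_network import MLPClassifier

docs = ["lost my card on line four", "is there a refund for the strike",
        "when does the last metro leave"] * 40
themes = ["lost_item", "refund", "schedule"] * 40

counts = CountVectorizer().fit_transform(docs).toarray()          # word features
topics = LatentDirichletAllocation(n_components=10, random_state=0).fit_transform(counts)
semantic = np.array([[("card" in d), ("refund" in d), ("metro" in d)] for d in docs],
                    dtype=float)                                  # toy semantic indicators

features = np.hstack([counts, topics, semantic])                  # integrated feature vector
dnn = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=500, random_state=0)
dnn.fit(features, themes)
print(dnn.score(features, themes))
```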