
Publications


Featured research published by Johann Poignant.


International Conference on Multimedia and Expo | 2012

From Text Detection in Videos to Person Identification

Johann Poignant; Laurent Besacier; Georges Quénot; Franck Thollard

We present in this article a video OCR system that detects and recognizes overlaid text in video, as well as its application to person identification in video documents. We proceed in several steps. First, text detection and temporal tracking are performed. After adaptation of the images to a standard OCR system, a final post-processing step combines multiple transcriptions of the same text box. The semi-supervised adaptation of this system to a particular video type (broadcast video from a French TV channel) is proposed and evaluated. The system is efficient, running three times faster than real time (including the OCR step) on a desktop Linux box. Both text detection and recognition are evaluated individually and through a person recognition task, where we show that combining OCR and audio (speaker) information can greatly improve the performance of a state-of-the-art audio-based person identification system.
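The post-processing step that combines multiple transcriptions of the same text box could be sketched, under the simplifying assumption that the transcriptions are already character-aligned (a real system would align them first, e.g. by edit distance), as a position-wise majority vote. Function name and data shapes below are illustrative, not the paper's actual implementation:

```python
from collections import Counter

def combine_transcriptions(transcriptions):
    # Position-wise majority vote over characters across several noisy
    # OCR readings of the same overlaid text box.
    if not transcriptions:
        return ""
    width = max(len(t) for t in transcriptions)
    combined = []
    for i in range(width):
        votes = Counter(t[i] for t in transcriptions if i < len(t))
        combined.append(votes.most_common(1)[0][0])
    return "".join(combined)

# Three noisy readings of the same text box; the vote recovers the text.
print(combine_transcriptions(["Jahn Smith", "John Smlth", "John Smith"]))  # John Smith
```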


International Conference on Computer Vision | 2012

Fusion of speech, faces and text for person identification in TV broadcast

Hervé Bredin; Johann Poignant; Makarand Tapaswi; Guillaume Fortier; Viet Bac Le; Thibault Napoléon; Hua Gao; Claude Barras; Sophie Rosset; Laurent Besacier; Jakob J. Verbeek; Georges Quénot; Frédéric Jurie; Hazim Kemal Ekenel

The Repere challenge is a project aiming at evaluating systems for supervised and unsupervised multimodal recognition of people in TV broadcast. In this paper, we describe, evaluate and discuss the QCompere consortium submissions to the 2012 Repere evaluation campaign dry run. Speaker identification (and face recognition) can be greatly improved when combined with name detection through video optical character recognition. Moreover, we show that unsupervised multimodal person recognition systems can achieve performance nearly as good as supervised monomodal ones (which rely on several hundred identity models).
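One way the combination described here could work, as a minimal late-fusion sketch with hypothetical data shapes (the consortium's actual fusion is more elaborate): a written name detected by video OCR during a speaker turn takes precedence over the audio-only biometric hypothesis, with a fallback to the best-scoring speaker model.

```python
def fuse_identity(audio_scores, ocr_names):
    # `audio_scores`: hypothetical dict of speaker-model name -> score.
    # `ocr_names`: written names co-occurring with this speaker turn.
    if ocr_names:
        # Trust the most frequently co-occurring written name.
        return max(set(ocr_names), key=ocr_names.count)
    if audio_scores:
        # No written name: fall back to the best biometric hypothesis.
        return max(audio_scores, key=audio_scores.get)
    return None

# A repeatedly OCR'd name overrides a weak audio-only hypothesis.
print(fuse_identity({"model_042": 0.31}, ["Claude Barras", "Claude Barras"]))
```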


Multimedia Tools and Applications | 2016

Naming multi-modal clusters to identify persons in TV broadcast

Johann Poignant; Guillaume Fortier; Laurent Besacier; Georges Quénot

Person identification in TV broadcast is one of the main tools for indexing this type of video. The classical way is to use biometric face and speaker models but, to cover a decent number of persons, costly annotations are needed. In recent years, several works have proposed to use other sources of names for identifying people, such as pronounced names and written names. The main idea is to form face/speaker clusters based on their similarities and to propagate these names onto the clusters. In this paper, we propose a method that takes advantage of written names during the diarization process, in order both to name clusters and to prevent the fusion of two clusters named differently. First, we extract written names with the LOOV tool (Poignant et al. 2012); these names are associated with their co-occurring speaker turns / face tracks. Simultaneously, we build a multi-modal matrix of distances between speaker turns and face tracks. Agglomerative clustering is then performed on this matrix under the constraint of never merging clusters associated with different names. We also integrate the predictions of a few biometric models (anchors, some journalists) to directly identify speaker turns / face tracks before the clustering process. Our approach was evaluated on the REPERE corpus and reached an F-measure of 68.2 % for speaker identification and 60.2 % for face identification. Adding a few biometric models improves these results, leading to 82.4 % and 65.6 % for speaker and face identification respectively. By comparison, a mono-modal, supervised person identification system with 706 speaker models trained on matching development data and additional TV and radio data provides a 67.8 % F-measure, while 908 face models provide only a 30.5 % F-measure.
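The naming constraint during agglomerative clustering can be sketched as follows. This is a toy single-link version with assumed data shapes (`distances` maps a frozen pair of track indices to a distance; `names` maps every track index to its written name, or None if unnamed), not the paper's actual multi-modal implementation:

```python
def single_link(distances, a, b):
    # Single-link distance between two clusters of track indices.
    return min(distances[frozenset((i, j))] for i in a for j in b)

def constrained_clustering(distances, names, threshold):
    # Repeatedly merge the two closest clusters below the threshold, but
    # never merge clusters carrying two different written names.
    clusters = [({i}, names[i]) for i in sorted(names)]
    while True:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                (a, na), (b, nb) = clusters[x], clusters[y]
                if na is not None and nb is not None and na != nb:
                    continue  # constraint: differently named clusters stay apart
                d = single_link(distances, a, b)
                if d < threshold and (best is None or d < best[0]):
                    best = (d, x, y)
        if best is None:
            return clusters
        _, x, y = best
        (a, na), (b, nb) = clusters[x], clusters[y]
        merged = (a | b, na if na is not None else nb)  # merged cluster keeps its name
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
```

With tracks 0 and 2 named differently, even a small distance between them cannot trigger a merge, while the unnamed track 1 is absorbed into its nearest named cluster and inherits that name.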


Content-Based Multimedia Indexing | 2011

Text detection and recognition for person identification in videos

Johann Poignant; Franck Thollard; Georges Quénot; Laurent Besacier

This article presents a demo of person search in audiovisual broadcast using only the text available in a video and in resources external to the video. We also present the different steps used to recognize characters in video for multi-modal person recognition systems. Text detection relies on text features (texture, color, contrast, geometry, temporal information). The text recognition itself is performed by the free Google Tesseract OCR software. The method was successfully evaluated on a broadcast news corpus of 59 videos from the France 2 French TV channel.
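The temporal-information cue listed above can be sketched as linking per-frame text boxes into tracks when they overlap strongly across consecutive frames. Box format, the overlap threshold, and function names here are assumptions for illustration, not the demo's actual code:

```python
def iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def track_text_boxes(frames, min_iou=0.7):
    # `frames` is a list of per-frame box lists. A box extends an active
    # track when it overlaps the track's last box strongly enough;
    # otherwise it starts a new track.
    tracks, active = [], []
    for f, boxes in enumerate(frames):
        next_active = []
        for box in boxes:
            for track in active:
                # Skip tracks already extended in this frame.
                if track[-1][0] != f and iou(track[-1][1], box) >= min_iou:
                    track.append((f, box))
                    next_active.append(track)
                    break
            else:
                track = [(f, box)]
                tracks.append(track)
                next_active.append(track)
        active = next_active
    return tracks
```

A static overlay that persists across frames yields one long track, whose frames can then each be sent to the OCR engine and combined.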


International Conference on Multimodal Interfaces | 2015

A Visual Analytics Approach to Finding Factors Improving Automatic Speaker Identifications

Pierrick Bruneau; Mickaël Stefas; Hervé Bredin; Johann Poignant; Thomas Tamisier; Claude Barras

Classification quality criteria such as precision, recall, and F-measure are generally the basis for evaluating contributions in automatic speaker recognition. Specifically, comparisons are mostly carried out via mean values estimated on a set of media. Whilst this approach is relevant for assessing improvement over the state of the art, or for ranking participants in an automatic annotation challenge, it gives little insight to system designers in terms of cues for improving algorithms, hypothesis formulation, and evidence display. This paper presents a design study of a visual and interactive approach to analyzing errors made by automatic annotation algorithms. A timeline-based tool emerged from prior steps of this study. A critical review, driven by user interviews, exposes caveats and refines user objectives. The next step of the study is then initiated by sketching designs that combine elements of the current prototype with principles newly identified as relevant.


Content-Based Multimedia Indexing | 2014

Automatic propagation of manual annotations for multimodal person identification in TV shows

Mateusz Budnik; Johann Poignant; Laurent Besacier; Georges Quénot

In this paper, an approach to propagating human annotations for person identification in a multimodal context is proposed. The system combines speaker diarization and face clustering to produce multimodal clusters. Whole multimodal clusters are then annotated, rather than single tracks, with labels spread by propagation. An optical character recognition system provides the initial annotations. Four different strategies for selecting annotation candidates are tested. The initial results of annotation propagation are promising. With a proper active-learning selection strategy, human annotator involvement could be reduced even further.
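The cluster-level propagation described here can be sketched in a few lines: a label obtained for any one track is spread to every track of the same multimodal cluster. The data shapes (`clusters` as cluster id -> track ids, `annotated` as track id -> name) are illustrative assumptions:

```python
def propagate_annotations(clusters, annotated):
    # Spread each human- or OCR-provided label to the whole cluster that
    # contains the annotated track.
    labels = {}
    for tracks in clusters.values():
        # Find any existing annotation inside this cluster.
        name = next((annotated[t] for t in tracks if t in annotated), None)
        if name is not None:
            for t in tracks:
                labels[t] = name  # one annotation labels the whole cluster
    return labels

# Annotating a single track labels its entire cluster; unannotated
# clusters stay unlabeled and are candidates for the next annotation round.
clusters = {"c1": ["t1", "t2", "t3"], "c2": ["t4"]}
print(propagate_annotations(clusters, {"t2": "Johann Poignant"}))
```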


International Symposium on Multimedia | 2016

Post-Hoc Interactive Analytics of Errors in the Context of a Person Discovery Task

Pierrick Bruneau; Mickaël Stefas; Johann Poignant; Hervé Bredin; Claude Barras

Part of the research effort in automatic person discovery in multimedia content consists in analyzing the errors made by algorithms. However, exploring the space of models relating algorithmic errors in person discovery to intrinsic properties of the associated shots (e.g. a person facing the camera), coined post-hoc analysis in this paper, requires data curation and statistical model tuning, which can be cumbersome. In this paper we present a visual and interactive tool that facilitates this exploration. A case study is conducted with multimedia researchers to validate the tool. Real data obtained from the MediaEval person discovery task was used for this experiment. Our approach yielded novel insight that was previously completely unsuspected.


Cooperative Design, Visualization, and Engineering | 2014

Collaborative Annotation of Multimedia Resources

Pierrick Bruneau; Mickaël Stefas; Mateusz Budnik; Johann Poignant; Hervé Bredin; Thomas Tamisier; Benoît Otjacques

Reference multimedia corpora for use in automated indexing algorithms require substantial manual work. The Camomile project advocates the joint progress of automated annotation methods and of tools for improving the benchmark resources. This paper presents work in progress on interactive visualization of annotations, and perspectives on harnessing the collaboration between manual annotators, algorithm designers, and benchmark administrators.


Conference of the International Speech Communication Association | 2012

Unsupervised Speaker Identification using Overlaid Texts in TV Broadcast

Johann Poignant; Hervé Bredin; Viet Bac Le; Laurent Besacier; Claude Barras; Georges Quénot


MediaEval | 2015

Multimodal Person Discovery in Broadcast TV at MediaEval 2015

Johann Poignant; Hervé Bredin; Claude Barras

Collaboration


An overview of Johann Poignant's collaborations.

Top Co-Authors

Laurent Besacier (Centre national de la recherche scientifique)
Georges Quénot (Centre national de la recherche scientifique)
Hervé Bredin (Centre national de la recherche scientifique)
Claude Barras (Centre national de la recherche scientifique)
Mateusz Budnik (Centre national de la recherche scientifique)
Sophie Rosset (Centre national de la recherche scientifique)
Hazim Kemal Ekenel (Istanbul Technical University)
Franck Thollard (Centre national de la recherche scientifique)
Viet Bac Le (Centre national de la recherche scientifique)
Philippe Mulhem (Joseph Fourier University)