
Georges Quénot

Centre national de la recherche scientifique

Publication

Featured research published by Georges Quénot.

International Conference on Acoustics, Speech, and Signal Processing | 1992

The 'orthogonal algorithm' for optical flow detection using dynamic programming

Georges Quénot

An algorithm for optical flow detection is introduced. It is based on an iterative search for a displacement field that minimizes the L1 or L2 distance between two images. Both images are sliced into parallel and overlapping strips. Corresponding strips are aligned using dynamic programming. Two passes are performed using orthogonal slicing directions. This process is iterated in a pyramidal fashion by reducing the spacing and width of the strips. This algorithm provides very-high-quality matching for calibrated patterns as well as for human visual sensation. The results appear to be at least as good as those obtained with classical optical flow detection methods.
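
The strip-alignment step can be illustrated with a much simplified sketch. It assumes each strip has already been reduced to a 1-D intensity profile and aligns two such profiles with a DTW-style dynamic program under the L1 distance. The function name `align_strips` is hypothetical, and the 2-D strips, the orthogonal passes and the pyramidal iteration described in the abstract are not reproduced here.

```python
def align_strips(a, b):
    """Align two 1-D intensity profiles by dynamic programming (DTW-style).

    Illustrative simplification, not the paper's algorithm.
    Returns (total L1 cost, list of matched index pairs (i, j)).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # L1 distance between two samples
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    # Backtrack from the end to recover the matched pairs.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        best = min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
        if best == cost[i - 1][j - 1]:
            i, j = i - 1, j - 1
        elif best == cost[i - 1][j]:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return cost[n][m], path
```

For each matched pair `(i, j)`, the difference `j - i` can be read as a local displacement along the strip direction.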

International Conference on Multimedia and Expo | 2012

From Text Detection in Videos to Person Identification

Johann Poignant; Laurent Besacier; Georges Quénot; Franck Thollard

This article presents a video OCR system that detects and recognizes overlaid text in video, as well as its application to person identification in video documents. The system proceeds in several steps. First, text detection and temporal tracking are performed. After adaptation of the images to a standard OCR system, a final post-processing step combines multiple transcriptions of the same text box. The semi-supervised adaptation of this system to a particular video type (video broadcast from a French TV channel) is proposed and evaluated. The system is efficient: it runs 3 times faster than real time (including the OCR step) on a desktop Linux box. Both text detection and recognition are evaluated individually and through a person recognition task, where it is shown that the combination of OCR and audio (speaker) information can greatly improve the performance of a state-of-the-art audio-based person identification system.
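
The final post-processing step, combining several transcriptions of one tracked text box, could be approximated by a simple majority vote. This is only an assumed combination rule for illustration; the abstract does not say which rule the system uses, and `combine_transcriptions` is a hypothetical name.

```python
from collections import Counter


def combine_transcriptions(transcriptions):
    """Pick the most frequent OCR output for one tracked text box.

    Assumed rule (majority vote); empty outputs are ignored.
    """
    cleaned = [t.strip() for t in transcriptions if t.strip()]
    if not cleaned:
        return ""
    return Counter(cleaned).most_common(1)[0][0]
```

A vote over frames tends to discard isolated recognition errors, since the same overlaid text is usually visible in many consecutive frames.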

Content-Based Multimedia Indexing | 2013

Descriptor optimization for multimedia indexing and retrieval

Bahjat Safadi; Georges Quénot

In this paper, we propose and evaluate a method for optimizing descriptors used for content-based multimedia indexing and retrieval. A large variety of descriptors are commonly used for this purpose. However, the most efficient ones often have characteristics that prevent them from being easily used in large-scale systems. They may have very high dimensionality (up to tens of thousands of dimensions) and/or be suited to a distance that is costly to compute (e.g. χ²). The proposed method combines a PCA-based dimensionality reduction with pre- and post-PCA non-linear transformations. The resulting transformation is globally optimized. The produced descriptors have a much lower dimensionality while performing at least as well, and often significantly better, with the Euclidean distance than the original high-dimensionality descriptors with their optimal distance. Our approach also includes a hyper-parameter optimization procedure based on the use of a fast kNN classifier and on a polynomial fit to overcome the instability of the MAP metric. The method has been validated and evaluated on a variety of descriptors using the TRECVid 2010 semantic indexing task data. It has been applied at large scale for the TRECVid 2012 semantic indexing task on tens of descriptors of various types, with initial dimensionalities ranging from 15 up to 32,768. The same transformation can also be used for multimedia retrieval in the context of query by example and/or relevance feedback.

EURASIP Journal on Image and Video Processing | 2007

Image and video indexing using networks of operators

Stéphane Ayache; Georges Quénot; Jérôme Gensel

This article presents a framework for the design of concept detection systems for image and video indexing. This framework integrates all the data and processing types in a homogeneous way. The semantic gap is crossed in a number of steps, each producing a small increase in the abstraction level of the handled data. All the data inside the semantic gap, and on both sides of it, are seen as a homogeneous type called numcept, and all the processing modules between the various numcepts are seen as a homogeneous type called operator. Concepts are extracted from the raw signal using networks of operators operating on numcepts. These networks can be represented as data-flow graphs, and the introduced homogenizations allow fusing elements regardless of their nature. Low-level descriptors can be fused with intermediate or final concepts. This framework has been used to build a variety of indexing networks for images and videos and to evaluate many aspects of them. Using annotated corpora and protocols of the 2003 to 2006 TRECVID evaluation campaigns, the benefit brought by the use of individual features, the use of several modalities, the use of various fusion strategies, and the use of topologic and conceptual contexts was measured. The framework proved its efficiency for the design and evaluation of a series of network architectures while factorizing the training effort for common sub-networks.

Conference on Information and Knowledge Management | 2011

Re-ranking by local re-scoring for video indexing and retrieval

Bahjat Safadi; Georges Quénot

Video retrieval can be done by ranking the samples according to the probability scores predicted by classifiers. It is often possible to improve the retrieval performance by re-ranking the samples. In this paper, we propose a re-ranking method that improves the performance of semantic video indexing and retrieval by re-evaluating the scores of the shots according to the homogeneity and the nature of the video they belong to. Compared to previous works, the proposed method provides a framework for re-ranking via the homogeneous distribution of video shot content in a temporal sequence. The experimental results show that the proposed re-ranking method improves the system performance by about 18% on average on the TRECVID 2010 semantic indexing task, a video collection with homogeneous content. For TRECVID 2008, a collection of videos with non-homogeneous content, the system performance is improved by about 11-13%.
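
As a rough illustration of re-scoring shots by the video they belong to, the sketch below blends each shot score with the mean score of its video. The blending formula, the `gamma` parameter and the function name `rerank` are assumptions made for this example; the abstract does not give the paper's actual re-scoring rule.

```python
from collections import defaultdict


def rerank(shots, gamma=0.5):
    """Re-rank shots using a video-level score (assumed formula).

    shots: list of (video_id, shot_id, score), scores in (0, 1].
    gamma: weight of the video-level evidence (0 keeps the original ranking).
    """
    by_video = defaultdict(list)
    for video_id, _, score in shots:
        by_video[video_id].append(score)
    # Video-level score: mean of the scores of the shots in that video.
    video_mean = {v: sum(s) / len(s) for v, s in by_video.items()}
    rescored = [
        (video_id, shot_id,
         score ** (1 - gamma) * video_mean[video_id] ** gamma)
        for video_id, shot_id, score in shots
    ]
    return sorted(rescored, key=lambda t: t[2], reverse=True)
```

With this rule, a moderately scored shot in a video whose other shots score highly can move above a slightly better shot from a video with little supporting evidence.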

International Journal of Digital Multimedia Broadcasting | 2012

Automatic Story Segmentation for TV News Video Using Multiple Modalities

Emilie Dumont; Georges Quénot

While video content is often stored in rather large files or broadcast in continuous streams, users are often interested in retrieving only a particular passage on a topic of interest to them. It is, therefore, necessary to split video documents or streams into shorter segments corresponding to appropriate retrieval units. We propose here a method for the automatic segmentation of TV news videos into stories, based on multiple descriptors. The selected multimodal features are complementary and give good insights about story boundaries. Once extracted, these features are expanded with a local temporal context and combined by an early fusion process. The story boundaries are then predicted using machine learning techniques. We evaluate the system through experiments conducted using the TRECVID 2003 data and protocol of the story boundary detection task, and we show that the proposed approach outperforms the state-of-the-art methods while requiring a very small amount of manual annotation.
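
The two feature-preparation steps named above (expansion with a local temporal context, then early fusion) can be sketched as follows. The function names, the symmetric window and the edge padding are assumptions for illustration; the classifier that predicts boundaries is not shown.

```python
def add_temporal_context(features, radius=1):
    """Expand each feature vector with those of its temporal neighbours.

    features: list of per-candidate feature vectors (lists of numbers).
    Edges are padded by repeating the first or last vector (assumption).
    """
    n = len(features)
    expanded = []
    for t in range(n):
        row = []
        for dt in range(-radius, radius + 1):
            k = min(max(t + dt, 0), n - 1)  # clamp to valid indices
            row.extend(features[k])
        expanded.append(row)
    return expanded


def early_fusion(modalities):
    """Concatenate, per candidate, the vectors of all modalities."""
    return [sum((list(v) for v in vectors), []) for vectors in zip(*modalities)]
```

Each modality is expanded separately and the expanded vectors are then concatenated, so a single classifier sees all modalities and their temporal neighbourhood at once.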

Custom Integrated Circuits Conference | 1991

A data-flow processor for real-time low-level image processing

Georges Quénot; Bertrand Zavidovique

A chip featuring two coupled data-flow processors (DFPs) has been designed. It is to be mesh-connected into large processor arrays dedicated primarily to image processing. Each processor operates on 25 Mbyte/s data flows and performs up to 50 million 8- or 16-bit arithmetic operations per second. The chip has been processed in a 1-μm CMOS technology. It includes 160,000 transistors in an 84 mm² die area; its clock runs at 25 MHz; and it is packaged in a 144-pin PGA package. The approach is to perform computations on the fly on a data flow that comes from a digital video camera. The set of available operators on the DFP has been defined to cover as widely as possible the range of low-level image processing functions.

International Conference on Computer Vision | 2012

Hierarchical late fusion for concept detection in videos

Sabin Tiberius Strat; Alexandre Benoit; Hervé Bredin; Georges Quénot; Patrick Lambert

We deal with the issue of combining dozens of classifiers into a better one, for concept detection in videos. We compare three fusion approaches that share a common structure: they all start with a classifier clustering stage, continue with an intra-cluster fusion and end with an inter-cluster fusion. The main difference between them comes from the first stage. The first approach relies on a priori knowledge about the internals of each classifier (low-level descriptors and classification algorithm) to group the set of available classifiers by similarity. The second and third approaches obtain classifier similarity measures directly from their output and group them using agglomerative clustering for the second approach and community detection for the third one.

International Conference on Computer Vision | 2012

Fusion of speech, faces and text for person identification in TV broadcast

Hervé Bredin; Johann Poignant; Makarand Tapaswi; Guillaume Fortier; Viet Bac Le; Thibault Napoléon; Hua Gao; Claude Barras; Sophie Rosset; Laurent Besacier; Jakob J. Verbeek; Georges Quénot; Frédéric Jurie; Hazim Kemal Ekenel

The Repere challenge is a project aiming at the evaluation of systems for supervised and unsupervised multimodal recognition of people in TV broadcast. In this paper, we describe, evaluate and discuss the QCompere consortium's submissions to the 2012 Repere evaluation campaign dry run. Speaker identification (and face recognition) can be greatly improved when combined with name detection through video optical character recognition. Moreover, we show that unsupervised multimodal person recognition systems can achieve performance nearly as good as supervised monomodal ones (with several hundred identity models).

ACM Multimedia | 2008

Rushes summarization by IRIM consortium: redundancy removal and multi-feature fusion

Georges Quénot; Jenny Benois-Pineau; Boris Mansencal; Eliana Rossi; Matthieu Cord; Frédéric Precioso; David Gorisse; Patrick Lambert; Bertrand Augereau; Lionel Granjon; Denis Pellerin; Michèle Rombaut; Stéphane Ayache

In this paper, we present the first participation of IRIM, a consortium of French laboratories, in the TRECVID 2008 BBC Rushes Summarization task. Our approach resorts to video skimming. We propose two methods to reduce redundancy, as rushes include several takes of scenes. We also take into account low- and mid-level semantic features in an ad hoc fusion method in order to retain only significant content.
