Fabrice Souvannavong
Institut Eurécom
Publication
Featured research published by Fabrice Souvannavong.
multimedia information retrieval | 2004
Fabrice Souvannavong; Bernard Merialdo; Benoit Huet
We present a complete and efficient framework for video shot indexing and retrieval. Video shots are described by their key-frames, which are themselves described by their regions. Region-based approaches suffer from the complexity of the segmentation and comparison tasks. A compact region-based shot representation is usually obtained through vector quantization. We therefore introduce latent semantic analysis (LSA) to reduce the noise inherent in the segmentation and quantization processes. Then, to better capture the content of video shots, we propose two original methods. The first takes advantage of a multi-scale segmentation of frames, while the second uses multiple frames to represent a shot. Both approaches require more computation time during pre-processing but not for the indexing and comparison tasks, since the extra information is included in the original signatures of shots. Finally, we introduce a relevance feedback loop to optimize the search and propose a new method to optimize the effect of LSA. In the experimental section, we evaluate latent semantic analysis and the proposed approaches on two problems, namely object retrieval and semantic content estimation.
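As a rough illustration of the LSA step described in this abstract, the sketch below builds a tiny occurrence matrix of quantized regions ("visual words") per shot, keeps the top singular directions, and compares projected signatures by cosine similarity. It assumes NumPy is available; the function names, the toy counts, and the choice of k are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_lsa(counts, k):
    """Return the top-k left singular vectors of a (words x shots) count matrix."""
    u, _s, _vt = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k]

def project(basis, signature):
    """Project a visual-word signature into the k-dimensional latent space."""
    return basis.T @ signature

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Toy data (made up): 6 visual words (rows) counted over 4 shots (columns).
# Shots 0-1 share words 0-2; shots 2-3 share words 3-5.
counts = np.array([
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 3],
    [0, 0, 3, 2],
    [0, 0, 1, 1],
], dtype=float)

basis = fit_lsa(counts, k=2)
shots = [project(basis, counts[:, j]) for j in range(counts.shape[1])]

# A query containing only visual word 0.
query = project(basis, np.array([1, 0, 0, 0, 0, 0], dtype=float))
ranking = sorted(range(len(shots)), key=lambda j: -cosine(query, shots[j]))
print(ranking)
```

In this toy case the query is ranked closest to shots 0 and 1, which share its visual words, while shots 2 and 3 fall to the bottom of the ranking.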
international conference on multimedia and expo | 2004
Fabrice Souvannavong; Bernard Merialdo; Benoit Huet
Low-level features are now becoming insufficient to build efficient content-based retrieval systems. Users are no longer interested in retrieving visually similar content; they expect retrieval systems to find documents with similar semantic content. Bridging the gap between low-level features and semantic content is a challenging task necessary for future retrieval systems. Latent semantic indexing (LSI) was successfully introduced to efficiently index text documents. We propose to adapt this technique to efficiently represent the visual content of video shots for semantic content detection. Although we restrict our approach to visual features, it can be extended with minor changes to audio and motion features to build a multi-modal system. The semantic content is then detected using two classifiers: k-nearest neighbors and neural network classifiers. Finally, in the experimental section we show the performance of each classifier and the performance gain obtained with LSI features compared to traditional features.
international conference on image processing | 2004
Lukas Hohl; Fabrice Souvannavong; Bernard Merialdo; Benoit Huet
The work presented in this paper aims at reducing the semantic gap between low level video features and semantic video objects. The proposed method for finding associations between segmented frame region characteristics relies on the strength of Latent Semantic Analysis. Our previous experiments [1] have shown the potential of this approach but also uncovered some of its limitations. Here, we will present a method using the structural information within an LSA framework. Moreover, we will demonstrate the performance gain of combining visual (low level) and structural information.
conference on image and video retrieval | 2004
Fabrice Souvannavong; Bernard Merialdo; Benoit Huet
Low-level features are now becoming insufficient to build efficient content-based retrieval systems. Users are not interested any longer in retrieving visually similar content, but they expect retrieval systems to also find documents with similar semantic content. Bridging the gap between low-level features and semantic content is a challenging task necessary for future retrieval systems. Latent Semantic Analysis (LSA) was successfully introduced to efficiently index text documents by detecting synonyms and the polysemy of words. We have successfully proposed an adaptation of LSA to model video content for object retrieval and semantic content estimation. Following this idea we now present a new model composed of multiple LSA’s (M-LSA) to better represent the video content. In the experimental section, we make a comparison of LSA and M-LSA on two problems, namely object retrieval and semantic content estimation.
conference on image and video retrieval | 2004
Lukas Hohl; Fabrice Souvannavong; Bernard Merialdo; Benoit Huet
The work presented in this paper aims at reducing the semantic gap between low level video features and semantic video objects. The proposed method for finding associations between segmented frame region characteristics relies on the strength of Latent Semantic Analysis (LSA). Our previous experiments [1], using color histograms and Gabor features, have rapidly shown the potential of this approach but also uncovered some of its limitations. The use of structural information is necessary, yet rarely employed for such a task. In this paper we address two important issues. The first is to verify that using structural information does indeed improve performance, while the second concerns the manner in which this additional information is integrated within the framework. Here, we propose two methods using the structural information. The first adds structural constraints indirectly to the LSA during the preprocessing of the video, while the other includes the structure directly within the LSA. Moreover, we will demonstrate that when the structure is added directly to the LSA, the performance gain of combining visual (low level) and structural information is convincing.
international conference on information fusion | 2006
Fabrice Souvannavong; Benoit Huet
In this paper we introduce a new method for fusing classifier outputs. It is inspired by the behavior knowledge space model, with the extra ability to work on continuous input values. This property makes it possible to deal with heterogeneous classifiers and, in particular, does not require any decision to be made at the classifier level. We propose to build a set of units, defining a knowledge space, with respect to the classifier output spaces. A new sample is then classified according to the unit it belongs to and some statistics computed on each unit. Several methods to create cells and make the final decision are proposed and compared to k-nearest neighbor and decision tree schemes. The evaluation is conducted on the task of video content retrieval, which reveals the efficiency of our approach.
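The idea of classifying a sample by the unit it falls into can be sketched as follows. The abstract says several ways of creating cells and making the final decision are proposed but does not detail them, so this sketch assumes the simplest possible choices: a uniform grid over classifier scores in [0, 1] and the fraction of positive training samples as the per-cell statistic. All names, the toy data, and the fallback prior are hypothetical.

```python
from collections import defaultdict

def cell_of(scores, bins):
    """Map a tuple of classifier scores in [0, 1] to a grid cell index."""
    return tuple(min(int(s * bins), bins - 1) for s in scores)

def fit_cells(samples, labels, bins):
    """Count positive and total training samples falling in each cell."""
    stats = defaultdict(lambda: [0, 0])  # cell -> [positives, total]
    for scores, label in zip(samples, labels):
        entry = stats[cell_of(scores, bins)]
        entry[0] += int(label)
        entry[1] += 1
    return dict(stats)

def fuse(stats, scores, bins, prior=0.5):
    """Fused score: fraction of positives in the sample's cell, or the prior if unseen."""
    pos, total = stats.get(cell_of(scores, bins), (0, 0))
    return pos / total if total else prior

# Toy training set (made up): continuous outputs of two classifiers, plus labels.
train_scores = [(0.9, 0.8), (0.85, 0.9), (0.9, 0.85), (0.8, 0.3), (0.2, 0.1), (0.1, 0.3)]
train_labels = [1, 1, 0, 0, 0, 0]

stats = fit_cells(train_scores, train_labels, bins=2)
high_high = fuse(stats, (0.95, 0.7), bins=2)  # cell seen in training
unseen = fuse(stats, (0.3, 0.9), bins=2)      # empty cell, falls back to the prior
print(high_high, unseen)
```

No threshold is applied to the individual classifier scores here; only the fused per-cell statistic would be thresholded to make a final decision.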
Archive | 2003
Fabrice Souvannavong
IEE Proceedings - Vision, Image, and Signal Processing | 2005
Fabrice Souvannavong; Lukas Hohl; Bernard Merialdo; Benoit Huet
IEE Proceedings - Vision, Image, and Signal Processing | 2005
Fabrice Souvannavong; Bernard Merialdo; Benoit Huet
text retrieval conference | 2002
Fabrice Souvannavong