Michel Crucianu
Conservatoire national des arts et métiers
Publication
Featured research published by Michel Crucianu.
Pattern Recognition | 2008
Nizar Grira; Michel Crucianu; Nozha Boujemaa
Clustering algorithms are increasingly employed for the categorization of image databases, in order to provide users with database overviews and make their access more effective. By including information provided by the user, the categorization process can produce results that come closer to users' expectations. To make such a semi-supervised categorization approach acceptable for the user, this information must be of a very simple nature and the amount of information the user is required to provide must be minimized. We propose here an effective semi-supervised clustering algorithm, active fuzzy constrained clustering (AFCC), that minimizes a competitive agglomeration cost function with fuzzy terms corresponding to pairwise constraints provided by the user. In order to minimize the number of constraints required, we define an active mechanism for the selection of candidate constraints. The comparisons performed on a simple benchmark and on a ground truth image database show that with AFCC the results of clustering can be significantly improved with few constraints, making this semi-supervised approach an attractive alternative in the categorization of image databases.
international conference on document analysis and recognition | 2003
Jean-Yves Ramel; Michel Crucianu; Nicole Vincent; Claudie Faure
We are concerned with the extraction of tables from exchange format representations of very diverse composite documents. We put forward a flexible representation scheme for complex tables, based on a clear distinction between the physical layout of a table and its logical structure. Relying on this scheme, we develop a new method for the detection and the extraction of tables by an analysis of the graphic lines. To deal with tables that lack all or most of the graphic marks, one must focus on the regularities of the text elements alone. We propose such a method, based on a multi-level analysis of the layout of text components on a page. A general graph representation of the relative positions of blocks of text is exploited.
Multimedia Systems | 2008
Marin Ferecatu; Nozha Boujemaa; Michel Crucianu
We address the challenge of semantic gap reduction for image retrieval through an improved support vector machines (SVM)-based active relevance feedback framework, together with a hybrid visual and conceptual content representation and retrieval. We introduce a new feature vector based on projecting the keywords associated with an image on a set of “key concepts” with the help of an external lexical database. We then put forward two improvements to the SVM-based relevance feedback method. First, to optimize the transfer of information between the user and the system, we introduce a new active learning selection criterion that minimizes redundancy between the candidate images shown to the user. Second, as most image classes span a wide range of scales in the description space, we argue that the insensitivity of the SVM to the scale of the data is desirable in this context and we show how to obtain it by using specific kernel functions. Experimental evaluations show that the joint use of the new concept-based feature vector and the visual features with our relevance feedback scheme can significantly improve the quality of the results.
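The scale-insensitivity argument can be illustrated with a small sketch. The abstract does not name the kernel functions used, so the triangular kernel K(x, y) = -||x - y|| is assumed here purely as an example of a kernel whose relative values do not change when the data are rescaled; the function names are illustrative.

```python
import math

def triangular_kernel(x, y):
    """Negative Euclidean distance (assumed example of a scale-insensitive kernel)."""
    return -math.dist(x, y)

def gaussian_kernel(x, y, gamma=1.0):
    """Standard RBF kernel with a fixed bandwidth parameter gamma."""
    return math.exp(-gamma * math.dist(x, y) ** 2)

def scale(v, s):
    """Multiply every coordinate of v by the factor s."""
    return [s * c for c in v]

a, b, c = [0.0, 0.0], [1.0, 0.0], [3.0, 4.0]
s = 2.0

# Ratios between triangular kernel values are unchanged by rescaling the data.
ratio_before = triangular_kernel(a, c) / triangular_kernel(a, b)
ratio_after = (triangular_kernel(scale(a, s), scale(c, s))
               / triangular_kernel(scale(a, s), scale(b, s)))

# The same ratio for a fixed-bandwidth Gaussian kernel depends on the scale.
rbf_before = gaussian_kernel(a, c) / gaussian_kernel(a, b)
rbf_after = (gaussian_kernel(scale(a, s), scale(c, s))
             / gaussian_kernel(scale(a, s), scale(b, s)))
```

With the triangular kernel, multiplying all data by a factor only multiplies every kernel value by that factor, which is why a classifier built on it does not need a bandwidth tuned to the spread of each class.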
conference on image and video retrieval | 2007
Sébastien Poullot; Olivier Buisson; Michel Crucianu
Scalability is the key issue in making content-based copy detection (CBCD) methods practical for very large image and video databases. Since copies are transformed versions of original documents, CBCD involves some form of retrieval by similarity using as queries the descriptions of potential copies. To enhance the scalability of an existing competitive CBCD method, we introduce here three improvements of this retrieval process: a Z-grid for building the index, uniformity-based sorting and adapted partitioning of the components. Retrieval speed is significantly increased, enabling us to monitor with a single computer one TV channel against a database of 120,000 hours of video.
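The abstract does not define the Z-grid. Assuming it relates to ordering grid cells along a Z-order (Morton) space-filling curve, the minimal sketch below shows how such an ordering can be computed for 2D cell coordinates by interleaving bits. The function names and the 2D setting are illustrative assumptions, not the paper's index structure.

```python
def interleave_bits(x, y, bits=16):
    """Morton code: bit i of x goes to bit 2i, bit i of y goes to bit 2i + 1."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code

def z_order_sort(cells, bits=16):
    """Sort (x, y) grid cells by their position along the Z-order curve."""
    return sorted(cells, key=lambda p: interleave_bits(p[0], p[1], bits))

cells = [(3, 5), (0, 0), (1, 0), (0, 1), (1, 1)]
ordered = z_order_sort(cells)
```

Cells that are close on the curve tend to be close in space, so a one-dimensional sorted index can serve neighbourhood queries in the original space.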
ieee international conference on fuzzy systems | 2005
Nizar Grira; Michel Crucianu; Nozha Boujemaa
Traditional clustering algorithms usually rely on a pre-defined similarity measure between unlabelled data to attempt to identify natural classes of items. When compared to what a human expert would provide on the same data, the results obtained may be disappointing if the similarity measure employed by the system is too different from the one a human would use. To obtain clusters fitting user expectations better, we can exploit, in addition to the unlabelled data, some limited form of supervision, such as constraints specifying whether two data items belong to the same cluster or not. The resulting approach is called semi-supervised clustering. In this paper, we put forward a new semi-supervised clustering algorithm, pairwise-constrained competitive agglomeration: clustering is performed by minimizing a competitive agglomeration cost function with a fuzzy term corresponding to the violation of constraints. We present comparisons performed on a simple benchmark and on an image database.
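To make the idea of a fuzzy term for constraint violation concrete, here is a minimal sketch of one plausible form of such a penalty. It assumes a membership matrix `u` where `u[i][k]` is the degree to which item `i` belongs to cluster `k`; the exact cost function of the paper, including the competitive agglomeration part and any weighting, is not given in the abstract and is not reproduced here.

```python
def constraint_violation(u, must_link, cannot_link):
    """Fuzzy penalty for violated pairwise constraints (illustrative form).

    u[i][k] is the membership of item i in cluster k.
    must_link and cannot_link are lists of (i, j) index pairs.
    """
    n_clusters = len(u[0])
    cost = 0.0
    for i, j in must_link:
        # Penalize membership mass that puts i and j in different clusters.
        cost += sum(u[i][k] * u[j][l]
                    for k in range(n_clusters)
                    for l in range(n_clusters) if k != l)
    for i, j in cannot_link:
        # Penalize membership mass that puts i and j in the same cluster.
        cost += sum(u[i][k] * u[j][k] for k in range(n_clusters))
    return cost

# Three items, two clusters: items 0 and 1 in cluster 0, item 2 in cluster 1.
u = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
satisfied = constraint_violation(u, must_link=[(0, 1)], cannot_link=[(0, 2)])
violated = constraint_violation(u, must_link=[(0, 2)], cannot_link=[(0, 1)])
```

A penalty of this kind is zero when every constraint is respected and grows smoothly as memberships drift towards violating assignments, which lets it be added to a fuzzy clustering objective.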
multimedia information retrieval | 2005
Nizar Grira; Michel Crucianu; Nozha Boujemaa
We consider data clustering problems where a limited amount of high-level semantic information, in the form of pairwise must-link and cannot-link constraints, can be acquired from the user. This form of supervision will guide the categorization of image databases in order to provide overviews that better fit user expectations. We propose here an effective semi-supervised clustering algorithm, Active Fuzzy Constrained Clustering (AFCC), that minimizes a competitive agglomeration-based cost function with fuzzy terms corresponding to pairwise constraints provided by the user. In order to minimize the number of constraints required, we define an active mechanism for the selection of candidates for constraints. The comparisons performed on a simple benchmark and on a ground truth image database show that with AFCC the results of clustering can be significantly improved with few constraints, making this semi-supervised approach an attractive alternative in the categorization of image databases.
multimedia information retrieval | 2004
Marin Ferecatu; Michel Crucianu; Nozha Boujemaa
User-defined classes in large generalist image databases are often composed of several groups of images and span very different scales in the space of low-level visual descriptors. The interactive retrieval of such image classes is then very difficult. To address this challenge, we propose and evaluate here two general improvements of SVM-based relevance feedback methods. First, to optimize the transfer of information between the user and the system, we focus on the criterion employed by the system for selecting the images presented to the user at every feedback round. We put forward a new active learning selection criterion that minimizes redundancy between the candidate images shown to the user. Second, for image classes having very different scales, we find that a high sensitivity of the SVM to the scale of the data brings about a low retrieval performance. We then argue that insensitivity to scale is desirable in this context and we show how to obtain it by the use of specific kernel functions. The experimental evaluation of both ranking and classification performance on several image databases confirms the effectiveness of our selection criterion and of the use of kernels that reduce the sensitivity of SVMs to the scale of the data.
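The selection criterion itself is not detailed in the abstract. As a hedged sketch of the general idea, the code below first keeps the candidates whose decision values are closest to zero (the most ambiguous ones), then greedily picks those that are least similar, under a kernel, to the images already chosen. The function name, the pool size parameter, and the greedy strategy are all assumptions made for illustration.

```python
import math

def select_diverse_ambiguous(items, scores, kernel, pool_size, n_select):
    """Pick ambiguous items (small |score|) that are mutually dissimilar.

    scores[i] stands for a classifier decision value for items[i];
    kernel(a, b) is a similarity, larger meaning more redundant.
    """
    pool = sorted(range(len(items)), key=lambda i: abs(scores[i]))[:pool_size]
    if not pool:
        return []
    selected = [pool[0]]  # start from the most ambiguous item
    while len(selected) < min(n_select, len(pool)):
        remaining = [i for i in pool if i not in selected]
        # Choose the item whose highest similarity to the selection is lowest.
        best = min(remaining,
                   key=lambda i: max(kernel(items[i], items[j]) for j in selected))
        selected.append(best)
    return selected

def gaussian(a, b):
    """Gaussian similarity between two feature vectors."""
    return math.exp(-math.dist(a, b) ** 2)

items = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [9.0, 9.0]]
scores = [0.05, 0.06, 0.2, 3.0]
chosen = select_diverse_ambiguous(items, scores, gaussian, pool_size=3, n_select=2)
```

In this toy example the second item is almost a duplicate of the first, so the sketch skips it in favour of a more distant ambiguous item, which is the kind of redundancy reduction the abstract describes.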
acm multimedia | 2010
Yi Yu; Michel Crucianu; Vincent Oria; Ernesto Damiani
In order to improve the reliability and the scalability of content-based retrieval of variant audio tracks from large music databases, we suggest a new multi-stage LSH scheme that consists in (i) extracting compact but accurate representations from audio tracks by exploiting the LSH idea to summarize audio tracks, and (ii) adequately organizing the resulting representations in LSH tables, retaining almost the same accuracy as an exact kNN retrieval. In the first stage, we use major bins of successive chroma features to calculate a multi-probe histogram (MPH) that is concise but retains the information about local temporal correlations. In the second stage, based on the order statistics (OS) of the MPH, we propose a new LSH scheme, OS-LSH, to organize and probe the histograms. The representation and organization of the audio tracks are storage efficient and support robust and scalable retrieval. Extensive experiments over a large dataset with 30,000 real audio tracks confirm the effectiveness and efficiency of the proposed scheme.
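The abstract does not spell out how the order statistics of a histogram become hash keys. The sketch below assumes one simple possibility: the key is the tuple of indices of the k largest bins, so that histograms with the same dominant bins fall into the same bucket. The names `os_key` and `build_table` and the toy histograms are hypothetical.

```python
from collections import defaultdict

def os_key(histogram, k):
    """Indices of the k largest bins, by decreasing value (ties broken by index)."""
    order = sorted(range(len(histogram)), key=lambda i: (-histogram[i], i))
    return tuple(order[:k])

def build_table(histograms, k):
    """Group track identifiers by the hash key of their histogram."""
    table = defaultdict(list)
    for track_id, hist in histograms.items():
        table[os_key(hist, k)].append(track_id)
    return table

# Toy histograms: "a_variant" is a slightly perturbed version of "a".
tracks = {
    "a": [5, 1, 9, 2],
    "a_variant": [6, 1, 8, 3],
    "b": [1, 9, 2, 7],
}
table = build_table(tracks, k=2)
candidates = table.get(os_key([5, 2, 10, 1], k=2), [])
```

Because small perturbations of bin values rarely change which bins dominate, a key of this kind tends to send variants of the same track to the same bucket while keeping the table compact.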
acm multimedia | 2009
Sébastien Poullot; Michel Crucianu; Shin'ichi Satoh
Content-based video copy detection is relevant for structuring large video databases. The use of local features leads to good robustness to most types of photometric or geometric transformations. However, to achieve both good precision and good recall when the transformations are strong, feature configurations should be taken into account. This usually leads to complex matching operations that are incompatible with scalable copy detection. We suggest a computationally inexpensive solution for including a minimal amount of configuration information that significantly improves the balance between overall detection quality and scalability.
international conference on image processing | 2005
Nizar Grira; Michel Crucianu; Nozha Boujemaa
As image collections become ever larger, effective access to their content requires a meaningful categorization of the images. Such a categorization can rely on clustering methods working on image features, but should greatly benefit from any form of supervision the user can provide, related to the visual content. Semi-supervised clustering - learning from both labelled and unlabelled data - has consequently become a topic of significant interest. In this paper we present a new semi-supervised clustering algorithm, pairwise-constrained competitive agglomeration, which is based on a fuzzy cost function that takes pairwise constraints into account.