Laurent Amsaleg

Centre national de la recherche scientifique

Publication

Featured research published by Laurent Amsaleg.

Conference on Image and Video Retrieval | 2009

Evaluation of GIST descriptors for web-scale image search

Matthijs Douze; Hervé Jégou; Harsimrat Sandhawalia; Laurent Amsaleg; Cordelia Schmid

The GIST descriptor has recently received increasing attention in the context of scene recognition. In this paper we evaluate the search accuracy and complexity of the global GIST descriptor for two applications for which a local description is usually preferred: same location/object recognition and copy detection. We identify the cases in which a global description can reasonably be used. The comparison is performed against a state-of-the-art bag-of-features representation. To evaluate the impact of GIST's spatial grid, we compare GIST with a bag-of-features restricted to the same spatial grid as in GIST. Finally, we propose an indexing strategy for global descriptors that optimizes the trade-off between memory usage and precision. Our scheme provides reasonable accuracy in some widespread application cases together with very high efficiency: in our experiments, querying an image database of 110 million images takes 0.18 second per image on a single machine. For common copyright attacks, this efficiency is obtained without noticeably sacrificing the search accuracy compared with state-of-the-art approaches.

Pattern Recognition Letters | 2010

Locality sensitive hashing: A comparison of hash function types and querying mechanisms

Loïc Paulevé; Hervé Jégou; Laurent Amsaleg

It is well known that high-dimensional nearest neighbor retrieval is very expensive. Dramatic performance gains are obtained using approximate search schemes, such as the popular Locality-Sensitive Hashing (LSH). Several extensions have been proposed to address the limitations of this algorithm, in particular by choosing more appropriate hash functions to better partition the vector space. All the proposed extensions, however, rely on a structured quantizer for hashing, which fits real data sets poorly and limits performance in practice. In this paper, we compare several families of space hashing functions in a real setup, namely when searching for high-dimensional SIFT descriptors. The comparison of random projections, lattice quantizers, k-means and hierarchical k-means reveals that an unstructured quantizer significantly improves the accuracy of LSH, as it closely fits the data in the feature space. We then compare two querying mechanisms introduced in the literature with the one originally proposed in LSH, and discuss their respective merits and limitations.
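The contrast between a structured and an unstructured hash function can be illustrated with a minimal sketch. It is not the authors' code: the function names, the bucket width `w`, and the toy Lloyd-style k-means below are assumptions made for illustration only. A random-projection hash quantizes a projection on a fixed grid regardless of the data, whereas a k-means hash assigns each vector to the nearest centroid learned from the data.

```python
import math
import random


def random_projection_hash(x, a, b, w):
    """Structured hash: quantize the projection of x onto direction a.

    a (direction), b (offset) and w (bucket width) are illustrative
    parameters; the grid they define does not depend on the data.
    """
    projection = sum(ai * xi for ai, xi in zip(a, x))
    return math.floor((projection + b) / w)


def kmeans_hash(x, centroids):
    """Unstructured hash: index of the centroid nearest to x."""
    return min(range(len(centroids)), key=lambda i: math.dist(x, centroids[i]))


def kmeans(points, k, iters=10, seed=0):
    """Toy Lloyd iterations that learn k centroids from the data."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            buckets[kmeans_hash(p, centroids)].append(p)
        # Recompute each centroid as the mean of its bucket;
        # an empty bucket keeps its previous centroid.
        centroids = [
            [sum(coord) / len(bucket) for coord in zip(*bucket)]
            if bucket else centroids[i]
            for i, bucket in enumerate(buckets)
        ]
    return centroids
```

In both cases, vectors that receive the same hash value fall in the same bucket; the difference lies only in how the bucket boundaries are chosen.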

International Conference on Acoustics, Speech, and Signal Processing | 2011

Searching in one billion vectors: Re-rank with source coding

Hervé Jégou; Romain Tavenard; Matthijs Douze; Laurent Amsaleg

Recent indexing techniques inspired by source coding have been shown to successfully index billions of high-dimensional vectors in memory. In this paper, we propose an approach that re-ranks the neighbor hypotheses obtained by these compressed-domain indexing methods. In contrast to the usual post-verification scheme, which performs exact distance calculation on the short-list of hypotheses, the estimated distances are refined based on short quantization codes, to avoid reading the full vectors from disk. We have released a new public dataset of one billion 128-dimensional vectors and proposed an experimental setup to evaluate high-dimensional indexing algorithms on a realistic scale. Experiments show that our method accurately and efficiently re-ranks the neighbor hypotheses using little memory compared to the full vector representation.
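One way to picture re-ranking from short codes is the following sketch. It assumes a two-level scheme in which a coarse code approximates each vector and a second, short code quantizes the residual; the codebooks, function names and the `shortlist` parameter are hypothetical and are not taken from the paper. The short-list is built from coarse estimates, then reordered using the refined reconstructions, so the original vectors are never read.

```python
import math


def nearest(x, codebook):
    """Index of the codeword closest to x."""
    return min(range(len(codebook)), key=lambda i: math.dist(x, codebook[i]))


def encode(x, coarse, refine):
    """Return (coarse index, refinement index) for vector x."""
    c = nearest(x, coarse)
    residual = [xi - ci for xi, ci in zip(x, coarse[c])]
    return c, nearest(residual, refine)


def decode(code, coarse, refine, use_refinement=True):
    """Reconstruct an approximation of the vector from its codes."""
    c, r = code
    if not use_refinement:
        return list(coarse[c])
    return [ci + ri for ci, ri in zip(coarse[c], refine[r])]


def search(query, codes, coarse, refine, shortlist=3, k=1):
    """Two-stage search that only ever touches the stored codes."""
    # Stage 1: rank all items by distance to their coarse reconstruction.
    first = sorted(
        range(len(codes)),
        key=lambda i: math.dist(query, decode(codes[i], coarse, refine, False)),
    )[:shortlist]
    # Stage 2: re-rank the short-list with the refined reconstruction.
    reranked = sorted(
        first,
        key=lambda i: math.dist(query, decode(codes[i], coarse, refine)),
    )
    return reranked[:k]
```

The memory cost per vector in this sketch is two small integers, which is the kind of trade-off the abstract describes: a better ordering of the short-list without storing or reading the full vectors.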

ACM International Workshop on Multimedia Databases | 2003

Robust content-based image searches for copyright protection

Sid-Ahmed Berrani; Laurent Amsaleg; Patrick Gros

This paper proposes a novel content-based image retrieval scheme for image copy identification. Its goal is to detect matches between a set of doubtful images and the ones stored in the database of the legal holders of the photographs. If an image was stolen and used to create a pirated copy, it tries to identify from which original image that copy was created. The image recognition scheme is based on local differential descriptors. Therefore, the matching process takes into account a large set of variations that might have been applied to stolen images in order to create pirated copies. The high cost and the complexity of this image recognition scheme require a very efficient retrieval process since many individual queries must be executed before being able to construct the final result. This paper therefore proposes to use a novel search method that trades the precision of each individual search for reduced query execution time. This imprecision has little impact on the overall recognition performance since the final result is a consolidation of many partial results. However, it dramatically accelerates queries. This result has then been corroborated by a theoretical study. Experiments show the efficiency and the robustness of the proposed scheme.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2009

NV-Tree: An Efficient Disk-Based Index for Approximate Search in Very Large High-Dimensional Collections

Herwig Lejsek; Friðrik Heiðar Ásmundsson; Björn Þór Jónsson; Laurent Amsaleg

Over the last two decades, much research effort has been spent on nearest neighbor search in high-dimensional data sets. Most of the approaches published thus far have, however, only been tested on rather small collections. When large collections have been considered, high-performance environments have been used, in particular systems with a large main memory. Accessing data on disk has largely been avoided because disk operations are considered to be too slow. It has been shown, however, that using large amounts of memory is generally not an economic choice. Therefore, we propose the NV-tree, which is a very efficient disk-based data structure that can give good approximate answers to nearest neighbor queries with a single disk operation, even for very large collections of high-dimensional data. Using a single NV-tree, the returned results have high recall but contain a number of false positives. By combining two or three NV-trees, most of those false positives can be avoided while retaining the high recall. Finally, we compare the NV-tree to locality sensitive hashing, a popular method for ε-distance search. We show that they return results of similar quality, but the NV-tree uses many fewer disk reads.

ACM Multimedia | 2006

Scalability of local image descriptors: a comparative study

Herwig Lejsek; Friðrik Heiðar Ásmundsson; Björn Þór Jónsson; Laurent Amsaleg

Computer vision researchers have recently proposed several local descriptor schemes. Due to lack of database support, however, these descriptors have only been evaluated using small image collections. Recently, we have developed the PvS-framework, which allows efficient querying of large local descriptor collections. In this paper, we use the PvS-framework to study the scalability of local image descriptors. We propose a new local descriptor scheme and compare it to three other well-known schemes. Using a collection of almost thirty thousand images, we show that the new scheme gives the best results in almost all cases. We then give two stop rules to reduce query processing time and show that in many cases only a few query descriptors must be processed to find matching images. Finally, we test our descriptors on a collection of over three hundred thousand images, resulting in over 200 million local descriptors, and show that even at such a large scale the results are still of high quality, with no change in query processing time.

Pattern Analysis and Applications | 2001

Content-based retrieval using local descriptors: Problems and issues from a database perspective

Laurent Amsaleg; Patrick Gros

Most existing content-based image retrieval systems built above a very large database typically compute a single descriptor per image, based for example on colour histograms. Therefore, these systems can only return images that are globally similar to the query image, but cannot return images that contain some of the objects that are in the query. Recent image processing techniques, however, focused on fine-grain image recognition to address the need of detecting similar objects in images. Fine-grain image recognition typically relies on computing many local descriptors per image. These techniques obviously increase the recognition power of retrieval systems, but also raise new problems in the design of fundamental lower-level functions such as indexes and secondary storage management. This paper addresses these problems: it shows that the three most efficient multi-dimensional indexing techniques known today do not efficiently cope with the deep changes in the retrieval process caused by the use of local descriptors. This paper also identifies several research directions to investigate before being able to build efficient image database systems supporting fine-grain recognition.

Distributed and Parallel Databases | 1998

Dynamic Query Operator Scheduling for Wide-Area Remote Access

Laurent Amsaleg; Michael J. Franklin; Anthony Tomasic

Distributed databases operating over wide-area networks, such as the Internet, must deal with the unpredictable nature of the performance of communication. The response times of accessing remote sources can vary widely due to network congestion, link failure, and other problems. In such an unpredictable environment, the traditional iterator-based query execution model performs poorly. We have developed a class of methods, called query scrambling, for dealing explicitly with the problem of unpredictable response times. Query scrambling dynamically modifies query execution plans on the fly in reaction to unexpected delays in data access. In this paper we focus on the dynamic scheduling of query operators in the context of query scrambling. We explore various choices for dynamic scheduling and examine, through a detailed simulation, the effects of these choices. Our experimental environment considers pipelined and non-pipelined join processing in a client with multiple remote data sources and delayed or possibly bursty arrivals of data. Our performance results show that scrambling rescheduling is effective in hiding the impact of delays on query response time for a number of different delay scenarios.

International Conference on Multimedia Retrieval | 2013

Indexing and searching 100M images with map-reduce

Diana Moise; Denis Shestakov; Gylfi Thór Gudmundsson; Laurent Amsaleg

Most researchers working on high-dimensional indexing agree on the following three trends: (i) the size of the multimedia collections to index is now reaching millions if not billions of items, (ii) the computers we use every day now come with multiple cores and (iii) hardware becomes more available, thanks to easier access to Grids and/or Clouds. This paper shows how the Map-Reduce paradigm can be applied to indexing algorithms and demonstrates that great scalability can be achieved using Hadoop, a popular Map-Reduce-based framework. Dramatic performance improvements are not, however, guaranteed a priori: such frameworks are rigid, they severely constrain the possible access patterns to data, and RAM, a scarce resource, has to be shared. Furthermore, algorithms require major redesign, and may have to settle for sub-optimal behavior. The benefits, however, are many: simplicity for programmers, automatic distribution, fault tolerance, failure detection and automatic re-runs and, last but not least, scalability. We share our experience of adapting a clustering-based high-dimensional indexing algorithm to the Map-Reduce model, and of testing it at large scale with Hadoop as we index 30 billion SIFT descriptors. We foresee that lessons drawn from our work could minimize time, effort and energy invested by other researchers and practitioners working in similar directions.

Conference on Information and Knowledge Management | 2003

Approximate searches: k-neighbors + precision

Sid-Ahmed Berrani; Laurent Amsaleg; Patrick Gros

It is known that all multi-dimensional index structures fail to accelerate content-based similarity searches when the feature vectors describing images are high-dimensional. It is possible to circumvent this problem by relying on approximate search schemes trading off result quality for reduced query execution time. Most approximate schemes, however, provide no control, or only complex control, over the precision of the searches, especially when retrieving the k nearest neighbors (NNs) of query points. In contrast, this paper describes an approximate search scheme for high-dimensional databases where the precision of the search can be probabilistically controlled when retrieving the k NNs of query points. It allows a fine and intuitive control over this precision by setting at run time the maximum probability for a vector that would be in the exact answer set to be missed in the approximate set of answers eventually returned. This paper also presents a performance study of the implementation using real datasets showing its reliability and efficiency. It shows, for example, that our method is 6.72 times faster than the sequential scan when it handles more than 5 × 10^6 24-dimensional vectors, even when the probability of missing one of the true nearest neighbors is below 0.01.
