Lionel Lacassagne
University of Paris
Publication
Featured research published by Lionel Lacassagne.
International Workshop on Computer Architecture for Machine Perception | 2007
Arnaud Verdant; Antoine Dupret; Herve Mathias; Patrick Villard; Lionel Lacassagne
Video surveillance aims at detecting the intrusion of unexpected individuals or objects. Common motion detection systems consume a large amount of power regardless of the scene activity, even when no motion is observed. This paper presents algorithms for low-power motion detection and their possible implementation. Their main interest is that they adapt the sensor's acuity to the scene activity. With specific algorithms, relevant motion information can be extracted from images with lowered spatial and temporal resolution. Reducing the amount of data to analyze, along with its spatial and temporal redundancy, yields a drastic reduction in power consumption.
Embedded Systems for Real-Time Multimedia | 2005
Daniel Etiemble; Samir Bouaziz; Lionel Lacassagne
We have implemented customized SIMD 16-bit floating point instructions on a NIOS II processor. On several image processing and media benchmarks for which the accuracy and dynamic range of this format are sufficient, a speed-up ranging from 1.5 to more than 2 is obtained versus the integer implementation. The hardware overhead remains limited and is compatible with the capacities of today's FPGAs.
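The accuracy and dynamic range trade-off mentioned in this abstract can be illustrated with a minimal sketch. It assumes the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits), which may differ from the customized 16-bit format used in the paper. The helper names `to_binary16` and `fits_binary16` are hypothetical.

```python
import struct


def to_binary16(x):
    """Round x to the nearest IEEE 754 binary16 value, returned as a Python float."""
    # The "e" struct format packs a half-precision float (assumed format, see lead-in).
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fits_binary16(x):
    """Return True if x is within the finite range that binary16 can encode."""
    try:
        struct.pack("<e", x)
        return True
    except OverflowError:
        return False


if __name__ == "__main__":
    # Typical 8-bit pixel values are exact; fractions and large sums are rounded.
    for value in (255.0, 0.1, 2049.0, 65504.0):
        print(value, "->", to_binary16(value))
    print("1e5 representable:", fits_binary16(1e5))
```

Under this assumption, 8-bit pixel values and small filter coefficients are stored with little or no error, while accumulations that exceed the format's range or its 11-bit significand lose information, which is why the format suits only some benchmarks.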
International Conference on Signal and Image Processing Applications | 2009
Kanur Aneja; Florence Laguzet; Lionel Lacassagne; Alain Mérigot
This paper proposes a fast method for image segmentation. After an optimal split of the image into rectangular regions, this paper focuses on the fast merging of these regions. Because the computation time is very small, the method is suitable for real-time applications while producing a segmentation good enough for tracking purposes.
International Conference on Pattern Recognition | 2010
Michèle Gouiffès; Florence Laguzet; Lionel Lacassagne
This paper proposes an extension to mean shift tracking. We introduce the color connectedness degrees (CCD) which, beyond providing statistical information about the target to track, embed information about the amount of connectedness of the color intervals that compose the target. With a small increase in complexity, this approach provides better robustness and tracking quality than the use of the RGB space. This is supported by experiments performed on several sequences showing vehicles and pedestrians in various contexts.
International Conference on Image Processing | 2007
Arnaud Verdant; Antoine Dupret; Herve Mathias; Patrick Villard; Lionel Lacassagne
A robust algorithm based on recursive operations is presented, designed for implementation on an analog CMOS image sensor. It adapts the sensor's acuity to the scene activity. The main interest of the presented motion detection with adaptive thresholding is that, in the context of an embedded stationary camera, such a system can focus on targets at high resolution while keeping the background at low resolution. A drastic reduction in power consumption is achieved by greatly reducing the amount of processed data.
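As an illustration of motion detection built from recursive operations with an adaptive threshold, here is a minimal sketch. It is not the algorithm of the paper: it assumes an exponential running average for the background and for the per-pixel deviation, and the function name `detect_motion` and the parameters `alpha`, `k` and `min_threshold` are hypothetical.

```python
def detect_motion(frames, alpha=0.25, k=3.0, min_threshold=4.0):
    """Return one binary motion mask per frame after the first.

    frames: list of images, each flattened to a list of pixel intensities.
    Every estimate is updated recursively, so only the previous estimate
    and the current pixel are needed (no frame history is stored).
    """
    background = [float(p) for p in frames[0]]
    deviation = [0.0] * len(background)
    masks = []
    for frame in frames[1:]:
        mask = []
        for i, pixel in enumerate(frame):
            diff = abs(pixel - background[i])
            # Adaptive threshold: follows the recent variability of this pixel.
            threshold = max(min_threshold, k * deviation[i])
            mask.append(1 if diff > threshold else 0)
            # Recursive (first-order) updates of background and deviation.
            background[i] += alpha * (pixel - background[i])
            deviation[i] += alpha * (diff - deviation[i])
        masks.append(mask)
    return masks


if __name__ == "__main__":
    frames = [
        [10, 10, 10, 10],
        [10, 10, 10, 10],
        [10, 200, 10, 10],  # an object appears at pixel 1
    ]
    print(detect_motion(frames))
```

In a resolution-adaptive system, a mask like this computed on a subsampled image could indicate which regions deserve full-resolution readout; that coupling is left out of the sketch.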
Future Generation Computer Systems | 2018
Olfa Haggui; Claude Tadonki; Lionel Lacassagne; Fatma Ezahra Sayadi; Bouraoui Ouni
Corner detection is a key kernel for many image processing procedures including pattern recognition and motion detection. The latter, for instance, mainly relies on the corner points for which spatial analyses are performed, typically on (probably live) videos or temporal flows of images. Thus, highly efficient corner detection is essential to meet the real-time requirement of associated applications. In this paper, we consider the corner detection algorithm proposed by Harris, whose main workflow is a composition of basic operators represented by their approximations using 3 × 3 matrices. The corresponding data access patterns follow a stencil model, which is known to require careful memory organization and management. Cache misses and other hindering factors on NUMA architectures need to be skillfully addressed in order to reach an efficient scalable implementation. In addition, with increasingly wide vector registers, an efficient SIMD version should be designed and explicitly implemented. In this paper, we study a direct and explicit implementation of common and novel optimization strategies, and provide a NUMA-aware parallelization. Experimental results on a dual-socket Intel Broadwell-E/EP show noticeably good scalability.
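The composition of 3 × 3 operators described in this abstract can be sketched as follows. This is a plain, unoptimized reference version under stated assumptions, not the paper's implementation: it assumes Sobel kernels for the gradients, a 3 × 3 box sum for the smoothing step, and the common constant k = 0.04. The names `apply3x3` and `harris_response` are hypothetical.

```python
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))
BOX = ((1, 1, 1), (1, 1, 1), (1, 1, 1))


def apply3x3(image, kernel):
    """Apply a 3x3 stencil to every interior pixel; border pixels stay 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(
                kernel[j][i] * image[y + j - 1][x + i - 1]
                for j in range(3)
                for i in range(3)
            )
    return out


def harris_response(image, k=0.04):
    """Harris measure det(M) - k * trace(M)^2 from chained 3x3 stencils."""
    h, w = len(image), len(image[0])
    ix = apply3x3(image, SOBEL_X)
    iy = apply3x3(image, SOBEL_Y)
    # Pointwise products of the gradients.
    ixx = [[ix[y][x] * ix[y][x] for x in range(w)] for y in range(h)]
    iyy = [[iy[y][x] * iy[y][x] for x in range(w)] for y in range(h)]
    ixy = [[ix[y][x] * iy[y][x] for x in range(w)] for y in range(h)]
    # Local sums give the entries of the structure matrix M.
    sxx, syy, sxy = apply3x3(ixx, BOX), apply3x3(iyy, BOX), apply3x3(ixy, BOX)
    return [
        [
            sxx[y][x] * syy[y][x] - sxy[y][x] ** 2 - k * (sxx[y][x] + syy[y][x]) ** 2
            for x in range(w)
        ]
        for y in range(h)
    ]


if __name__ == "__main__":
    # 8x8 image with a bright square in the bottom-right quadrant.
    img = [[100 if (y >= 4 and x >= 4) else 0 for x in range(8)] for y in range(8)]
    r = harris_response(img)
    print("corner:", r[4][4] > 0, "edge:", r[6][4] < 0, "flat:", r[1][1] == 0)
```

Each stage reads a 3 × 3 neighborhood of the previous stage's output, which is the stencil access pattern the abstract refers to; fusing the stages and vectorizing them is where the memory and SIMD optimizations would apply.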
Machine Vision Applications | 2016
Ahmed Chamseddine Ben Abdallah; Michèle Gouiffès; Lionel Lacassagne
This paper presents a modular system for both abnormal event detection and categorization in videos. Complementary normalcy models are built both globally at the image level and locally within pixel blocks. Three features are analyzed: (1) spatio-temporal evolution of binary motion where foreground pixels are detected using an enhanced background subtraction method that keeps track of temporarily static pixels; (2) optical flow, using a robust pyramidal KLT technique; and (3) motion temporal derivatives. At the local level, a normalcy MOG model is built for each block and for each flow feature and is made more compact using PCA. Then, the activity is analyzed qualitatively using a set of compact hybrid histograms embedding both optical flow orientation (or temporal gradient orientation) and foreground statistics. A compact binary signature of maximal size 13 bits is extracted from these different features for event characterization. The performance of the system is illustrated on different datasets of videos recorded by static cameras. The experiments show that the anomalies are well detected even though the method is not dedicated to any one of the addressed scenarios.
Journal of Systems Architecture | 2017
F. Lemaitre; B. Couturier; Lionel Lacassagne
Many linear algebra libraries, such as the Intel MKL, Magma or Eigen, provide fast Cholesky factorization. These libraries are suited for big matrices but perform slowly on small ones. Even though state-of-the-art studies are beginning to take an interest in small matrices, these usually feature a few hundred rows. Fields like Computer Vision or High Energy Physics use tiny matrices. In this paper we show that it is possible to speed up the Cholesky factorization for tiny matrices by grouping them in batches and using highly specialized code. We provide High Level Transformations that accelerate the factorization for current multi-core and many-core SIMD architectures (SSE, AVX2, KNC, AVX512, Neon, Altivec). We focus on the fact that, on some architectures, compilers are unable to vectorize and on other architectures, vectorizing compilers are not efficient. Thus hand-made SIMDization is mandatory. With these transformations combined with SIMD, we achieve a speedup from ×14 to ×28 for the whole resolution in single precision compared to the naive code on an AVX2 machine, and a speedup from ×6 to ×14 in double precision, both with strong scalability.
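The idea of grouping tiny matrices in batches can be sketched as follows. This is only an illustration of the data layout, not the paper's code: it assumes an interleaved layout where element (i, j) of every matrix is stored contiguously, so that the innermost loop runs over the batch, which is the axis a SIMD implementation would vectorize. Plain Python gains no speed from this; the function name `batched_cholesky` is hypothetical.

```python
import math


def batched_cholesky(batch):
    """Factor a batch of symmetric positive definite matrices as L * L^T.

    batch[i][j] is a list holding element (i, j) of every matrix in the batch.
    The result uses the same layout and contains the lower factors L.
    """
    n = len(batch)
    count = len(batch[0][0])
    lower = [[[0.0] * count for _ in range(n)] for _ in range(n)]
    for j in range(n):
        for i in range(j, n):
            # Innermost loop over the batch: identical work for every matrix,
            # with no dependency between lanes (the SIMD-friendly axis).
            for b in range(count):
                s = batch[i][j][b]
                for k in range(j):
                    s -= lower[i][k][b] * lower[j][k][b]
                if i == j:
                    lower[i][j][b] = math.sqrt(s)
                else:
                    lower[i][j][b] = s / lower[j][j][b]
    return lower


if __name__ == "__main__":
    # Two 2x2 matrices: [[4, 2], [2, 5]] and [[9, 3], [3, 5]].
    batch = [
        [[4.0, 9.0], [2.0, 3.0]],
        [[2.0, 3.0], [5.0, 5.0]],
    ]
    print(batch_result := batched_cholesky(batch))
```

Because the matrix size is tiny and fixed, the loops over i, j and k could be fully unrolled for each size, which is the kind of specialization the abstract describes.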
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2018
Florian Lemaitre; B. Couturier; Lionel Lacassagne
System tracking is an old problem and has been heavily optimized throughout the past. However, in High Energy Physics, many small systems are tracked in real-time using Kalman filtering and no implementation satisfying those constraints currently exists. In this paper, we present a code generator used to speed up Cholesky Factorization and Kalman Filter for small matrices. The generator is easy to use and produces portable and heavily optimized code. We focus on current SIMD architectures (SSE, AVX, AVX512, Neon, SVE, Altivec and VSX). Our Cholesky factorization outperforms any existing libraries: from ×3 to ×10 faster than MKL. The Kalman Filter is also faster than existing implementations, and achieves 4 · 10^9 iter/s on a 2x24C Intel Xeon.
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2018
Darius Mercadier; Pierre-Évariste Dagand; Lionel Lacassagne; Gilles Muller
Bitslicing is a programming technique commonly used in cryptography that consists in implementing a combinational circuit in software. It results in a massively parallel program immune to cache-timing attacks by design. However, writing a program in bitsliced form requires extreme minutia. This paper introduces Usuba, a synchronous dataflow language producing bitsliced C code. Usuba is both a domain-specific language -- providing syntactic support for the implementation of cryptographic algorithms -- as well as a domain-specific compiler -- taking advantage of well-defined semantic invariants to perform various optimizations before handing the generated code to an (optimizing) C compiler. On the Data Encryption Standard (DES) algorithm, we show that Usuba outperforms a reference, hand-tuned implementation by 15% (using Intel's 64-bit general-purpose registers and depending on the underlying C compiler) whilst our implementation also transparently supports modern SIMD extensions (SSE, AVX, AVX-512), other architectures (ARM Neon, IBM Altivec) as well as multicore processors through an OpenMP backend.
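To make the bitslicing idea concrete, here is a minimal sketch that evaluates a ripple-carry adder circuit on several independent inputs at once. It is unrelated to Usuba's generated code and uses an adder instead of a cipher for brevity. Each Python integer acts as one "slice" holding the same bit position of every input; the helper names are hypothetical.

```python
def values_to_slices(values, width):
    """Transpose: slice i has bit `lane` set iff bit i of values[lane] is set."""
    slices = [0] * width
    for lane, value in enumerate(values):
        for i in range(width):
            if (value >> i) & 1:
                slices[i] |= 1 << lane
    return slices


def slices_to_values(slices, lanes):
    """Inverse transpose: rebuild one integer per lane from the slices."""
    return [
        sum(((s >> lane) & 1) << i for i, s in enumerate(slices))
        for lane in range(lanes)
    ]


def bitsliced_add(a_slices, b_slices):
    """Ripple-carry adder written as a circuit of XOR/AND/OR gates.

    Each gate is one bitwise operation that processes every lane at once.
    There are no data-dependent branches or table lookups.
    """
    carry = 0
    out = []
    for a, b in zip(a_slices, b_slices):
        out.append(a ^ b ^ carry)                # sum bit of a full adder
        carry = (a & b) | (carry & (a ^ b))      # carry-out of a full adder
    out.append(carry)
    return out


if __name__ == "__main__":
    a = [1, 2, 3, 7]
    b = [5, 6, 3, 7]
    total = bitsliced_add(values_to_slices(a, 3), values_to_slices(b, 3))
    print(slices_to_values(total, len(a)))
```

With 64-bit registers the same gate sequence would process 64 inputs per instruction, and wider SIMD registers would extend this further; the cost is the transposition of inputs and outputs.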