Corinne Ancourt | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Corinne Ancourt is active.

Explore More

Publication

Featured researches published by Corinne Ancourt.

acm sigplan symposium on principles and practice of parallel programming | 1991

Scanning polyhedra with DO loops

Corinne Ancourt; François Irigoin

Supercompilers perform complex program transformations which often result in new loop bounds. This paper shows that, under the usual assumptions in automatic parallelization, most transformations on loop nests can be expressed as affine transformations on integer sets defined by polyhedra and that the new loop bounds can be computed with algorithms using Fourier’s pairwise elimination method although it is not exact for integer sets. Sufficient conditions to use pairivise elimination on integer sets and to extend it to pseudo-linear constraints are also given. A tradeoff has to be made between dynamic overhead due to some bound slackness and compilation complexity but the resulting code is always correct. These algorithms can be used to interchange or block loops regardless of the loop bounds or the blocking strategy and to safely exchange array parts between two levela of a memory hierarchy or between neighboring processors in a distributed memory machine.

Scientific Programming | 1997

A linear algebra framework for static High Performance Fortran code distribution

Corinne Ancourt; Fabien Coelho; François Irigoin; Ronan Keryell

High Performance Fortran (HPF) was developed to support data parallel programming for single-instruction multiple-data (SIMD) and multiple-instruction multiple-data (MIMD) machines with distributed memory. The programmer is provided a familiar uniform logical address space and specifies the data distribution by directives. The compiler then exploits these directives to allocate arrays in the local memories, to assign computations to elementary processors, and to migrate data between processors when required. We show here that linear algebra is a powerful framework to encode HPF directives and to synthesize distributed code with space-efficient array allocation, tight loop bounds, and vectorized communications for INDEPENDENT loops. The generated code includes traditional optimizations such as guard elimination, message vectorization and aggregation, and overlap analysis. The systematic use of an affine framework makes it possible to prove the compilation scheme correct.

application-specific systems, architectures, and processors | 1997

Automatic data mapping of signal processing applications

Corinne Ancourt; Denis Barthou; Christophe Guettier; François Irigoin; Bertrand Jeannet; Jean Jourdan; Juliette Mattioli

This paper presents a technique to map automatically a complete digital signal processing (DSP) application onto a parallel machine with distributed memory. Unlike other applications where coarse or medium grain scheduling techniques can be used, DSP applications integrate several thousand of tasks and hence necessitate fine grain considerations. Moreover finding an effective mapping imperatively require to take into account both architectural resources constraints and real time constraints. The main contribution of this paper is to show how it is possible to handle and to solve data partitioning, and fine-grain scheduling under the above operational constraints using concurrent constraints logic programming languages (CCLP). Our concurrent resolution technique undertaking linear and nonlinear constraints takes advantage of the special features of signal processing applications and provides a solution equivalent to a manual solution for the representative panoramic analysis (PA) application.

International Journal of Parallel Programming | 1995

Minimal data dependence abstractions for loop transformations: extended version

Yi-Qing Yang; Corinne Ancourt; François Irigoin

Many abstractions of program dependences have already been proposed, such as the Dependence Distance, the Dependence Direction Vector, the Dependence Level or the Dependence Cone. These different abstractions have different precisions. Theminimal abstraction associated to a transformation is the abstraction that contains the minimal amount of information necessary to decide when such a transformation is legal. Minimal abstractions for loop reordering and unimodular transformations are presented. As an example, the dependence cone, which approximates dependences by a convex cone of the dependence distance vectors, is the minimal abstraction for unimodular transformations. It also contains enough information for legally applying all loop reordering transformations and finding the same set of valid mono- and multi-dimensional linear schedules as the dependence distance set.

Technique Et Science Informatiques | 2012

Polyèdres et compilation

François Irigoin; Mehdi Amini; Corinne Ancourt; Fabien Coelho; Béatrice Creusillet; Ronan Keryell

La premiere utilisation de polyedres pour resoudre un probleme de compilation, la parallelisation automatique de boucles en presence dappels de procedure, a ete decrite et implemente il y a pres de trente ans. Le modele polyedrique est maintenant reconnu internationalement et est en phase dintegration dans le compilateur GCC, bien que la complexite exponentielle des algorithmes associes ait ete pendant tres longtemps un motif justifiant leur refus pur et simple. Lobjectif de cet article est de donner de nombreux exemples dutilisation des polyedres dans un compilateur optimiseur et de montrer quils permettent de poser des conditions simples pour garantir la legalite de transformations.

high performance computing and communications | 2013

Automatic Generation of Communications for Redundant Multi-dimensional Data Parallel Redistributions

Corinne Ancourt; Teodora Petrisor; François Irigoin; Eric Lenormand

In this paper we concentrate on embedded parallel architectures with heterogeneous memory management systems combining shared and local memories, and more precisely we focus on efficient data communications between the various architecture parts. We formulate explicit data transfers in a polyhedral context and give several strategies for managing efficient communications for redundantly stored/read data. This allows automatic DMA-style code generation for a variety of data mappings onto parallel processing elements. Our approach is validated on a wide series of data redistribution examples linked with a domain-specific parallelisation framework developed in Thales, SpearDE. We give the solution for efficient data transfers mathematically as well as under the form of generated C code.

computational science and engineering | 2016

Automatic Code Generation of Distributed Parallel Tasks

Nelson Lossing; Corinne Ancourt; François Irigoin

With the advent of clustered systems, more and more parallel computing is required. However a lot of programming skills is needed to write a parallel codes, especially when you want to benefit from the various parallel architectural resources, with heterogeneous units and complex memory organizations. We present in this paper a method that generates automatically, step by step, a task-parallel distributed code from a sequential program. It has been implemented in an existing source-to-source compiler PIPS. Our approach provides two main advantages 1) all the program transformations are simple and applied on source code, thus are visible by the user, 2) a proof of correctness of the parallelization process can be made. This ensures that we end up with a correct distributed-task program for distributed-memory machines. To our knowledge, it is the first tool that automatically generates a distributed code for task parallelization.

computational science and engineering | 2016

Threewise: a local variance algorithm for GPU

Florian Gouin; Corinne Ancourt; Christophe Guettier

Abstract-Variance computation is commonly used in many fields like in image processing to improve local contrasts. This article is not only about developing and placing an algorithm of variance computation for graphical processors, it will also introduce its optimisation in terms of precision and computing time in relation to architectural constraints of graphical processors. Our algorithm enables to improve complexity in O(N logN) and brings a speedup of 112 compared to the classical formulation and of 4 regarding the optimized Pairwise algorithm

parallel computing | 2013