
Publication


Featured research published by Arie Tal.


Compiler Construction | 2005

Generalized index-set splitting

Christopher Barton; Arie Tal; Bob Blainey; José Nelson Amaral

This paper introduces Index-Set Splitting (ISS), a technique that splits a loop containing several conditional statements into several loops with less complex control flow. Unlike the classic loop unswitching technique, ISS splits loops when the conditional is loop variant. ISS uses an Index Sub-range Tree (IST) to identify the structure of the conditionals in the loop and to select which conditionals should be eliminated. This decision is based on an estimate of the code growth for each split: a greedy algorithm spends a pre-determined code growth budget. ISTs separate the decision about which splits to perform from the actual code generation for the split loops. The use of ISS to improve a loop fusion framework is then discussed. ISS opportunity identification in the SPEC2000 benchmark suite and three other suites demonstrates that ISS is a general technique that may benefit other compilers.
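As a rough illustration of the idea in this abstract, the sketch below splits a loop whose conditional depends on the loop index into two loops with no conditional in the body. The function names (`original_loop`, `split_loop`) and the specific condition `i < m` are assumptions chosen for the example, not taken from the paper, and the sketch does not model the Index Sub-range Tree or the code growth budget.

```python
def original_loop(a, b, m):
    """One loop with a loop-variant conditional on the index i."""
    out = list(a)
    for i in range(len(out)):
        if i < m:              # condition changes as i advances
            out[i] += b[i]
        else:
            out[i] -= b[i]
    return out


def split_loop(a, b, m):
    """The same computation after splitting the index set at m."""
    out = list(a)
    n = len(out)
    cut = min(max(m, 0), n)    # clamp the split point into [0, n]
    for i in range(0, cut):    # sub-range where i < m always holds
        out[i] += b[i]
    for i in range(cut, n):    # sub-range where i < m never holds
        out[i] -= b[i]
    return out
```

Both functions should return the same list for any `m`, including values outside the index range, because the split point is clamped. The trade-off the abstract mentions is visible even here: the loop body is duplicated, so each split costs code size.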


International Workshop on OpenMP | 2003

Busy-wait barrier synchronization using distributed counters with local sensor

Guansong Zhang; Francisco Martínez; Arie Tal; Bob Blainey

Barrier synchronization is an important and performance-critical primitive in many parallel programming models, including the popular OpenMP model. In this paper, we compare the performance of several software implementations of barrier synchronization and introduce a new implementation, distributed counters with local sensor, which considerably reduces overhead on POWER3 and POWER4 SMP systems. Through experiments with the EPCC OpenMP benchmark, we demonstrate a 79% reduction in overhead on a 32-way POWER4 system and an 87% reduction in overhead on a 16-way POWER3 system compared with a fetch-and-add implementation. Since these improvements are primarily attributed to reduced L2 and L3 cache misses, we expect the relative performance of our implementation to increase with the number of processors in an SMP and as memory latencies lengthen relative to cache latencies.
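The abstract does not describe the implementation itself, so the following is only a minimal sketch of one plausible reading of the name: each thread announces arrival in its own counter and spins on its own release flag, instead of all threads contending on a single shared counter. The class `CounterSensorBarrier`, the choice of thread 0 as the gatherer, and the helper `run_demo` are all hypothetical. Python also cannot reproduce the cache behavior the paper measures; the sketch shows only the synchronization logic.

```python
import threading
import time


class CounterSensorBarrier:
    """Busy-wait barrier with one arrival counter and one release flag per thread."""

    def __init__(self, n_threads):
        self.n = n_threads
        self.arrived = [0] * n_threads  # slot i is written only by thread i
        self.sensor = [0] * n_threads   # slot i is spun on only by thread i

    def wait(self, tid):
        episode = self.arrived[tid] + 1
        self.arrived[tid] = episode           # announce arrival in own slot
        if tid == 0:
            # Assumption: thread 0 gathers the arrivals of all other threads.
            for other in range(1, self.n):
                while self.arrived[other] < episode:
                    time.sleep(0)             # yield while spinning
            for other in range(self.n):
                self.sensor[other] = episode  # release each waiter via its own flag
        else:
            while self.sensor[tid] < episode:
                time.sleep(0)                 # spin on the local flag only


def run_demo(n_threads=4, rounds=20):
    """Return how many times a thread passed the barrier before all had arrived."""
    barrier = CounterSensorBarrier(n_threads)
    counts = [0] * rounds
    lock = threading.Lock()
    violations = []

    def worker(tid):
        for r in range(rounds):
            with lock:
                counts[r] += 1                # work done before the barrier
            barrier.wait(tid)
            if counts[r] != n_threads:        # someone had not arrived yet
                violations.append((tid, r))

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(violations)
```

Calling `run_demo(4, 20)` should return 0 if the barrier holds every thread until all have arrived in each round. The counters only ever increase, which avoids the reset step a single shared counter would need between rounds.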


Archive | 2005

Efficient application deployment on dynamic clusters

Alain Azagury; Yair Koren; Benny Rochwerger; Arie Tal


Archive | 2004

Method and apparatus for a generic language interface to apply loop optimization transformations

Robert James Blainey; Arie Tal


Archive | 2002

Unrolling transformation of nested loops

Arie Tal; Robert James Blainey


Archive | 2004

Method, system and computer program product for hierarchical loop optimization of machine executable code

Christopher Barton; Arie Tal


Archive | 2009

Method and apparatus for automatic second-order predictive commoning

Arie Tal; Dina Tal


Archive | 2007

Method and system for managing heuristic properties

Arie Tal


Archive | 2005

Method and apparatus for a programming framework for pattern matching and transformation of intermediate language expression trees

Christopher Barton; Arie Tal


Archive | 2005

Pattern matching and transformation of intermediate language expression trees

Erik Pierre Charlebois; Arie Tal
