Publication


Featured research published by Alexandre E. Eichenberger.


international symposium on microarchitecture | 1998

Effective cluster assignment for modulo scheduling

Erik Nystrom; Alexandre E. Eichenberger

Clustering is one solution to the demand for wide-issue machines and fast clock cycles because it allows for smaller, less-ported register files and simpler bypass logic while remaining scalable. Much of the previous work on scheduling for clustered architectures has focused on acyclic code. While minimizing schedule length of acyclic code is paramount, the primary objective when scheduling cyclic code is to maximize the throughput or steady-state performance. This paper investigates a pre-modulo scheduling pass that performs cluster assignment in a way that minimizes performance degradation due to explicit communication required as the loops are split over clusters. The proposed cluster assignment algorithm annotates and adjusts the graph for use by the scheduler so that any traditional modulo scheduling algorithm, having no knowledge of clustering, can produce a valid and efficient schedule for a clustered machine.


programming language design and implementation | 1997

Efficient formulation for optimal modulo schedulers

Alexandre E. Eichenberger; Edward S. Davidson

Modulo scheduling algorithms based on optimal solvers have been proposed to investigate and tune the performance of modulo scheduling heuristics. While recent advances have broadened the scope for which the optimal approach is applicable, this approach increasingly suffers from large execution times. In this paper, we propose a more efficient formulation of the modulo scheduling space that significantly decreases the execution time of solvers based on integer linear programs. For example, the total execution time is reduced by a factor of 8.6 when 782 loops from the Perfect Club, SPEC, and Livermore Fortran Kernels are scheduled for minimum register requirements using the more efficient formulation instead of the traditional formulation. Experimental evidence further indicates that significantly larger loops can be scheduled under realistic machine constraints.
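As background for the abstract above, the resource constraint that any modulo schedule must satisfy can be sketched in a few lines. This is only an illustration of the general modulo-scheduling idea, not the integer linear programming formulation proposed in the paper. The function name, the operation names, and the assumption that each operation occupies one unit of one resource class for a single cycle are all hypothetical, and dependence constraints are deliberately not checked.

```python
from collections import Counter

def modulo_resource_conflicts(schedule, resource_of, units, ii):
    """Return the (resource, slot) pairs that are oversubscribed.

    schedule:    op name -> issue cycle within one iteration (hypothetical input)
    resource_of: op name -> resource class the op occupies for one cycle
    units:       resource class -> number of identical units
    ii:          initiation interval (cycles between successive iterations)
    """
    usage = Counter()
    for op, cycle in schedule.items():
        # In steady state an op issued at cycle t occupies slot t mod II in
        # every iteration, so ops from overlapping iterations meet there.
        usage[(resource_of[op], cycle % ii)] += 1
    return sorted(key for key, n in usage.items() if n > units[key[0]])


# Hypothetical four-operation loop body on a machine with one memory port
# and one ALU.
schedule = {"load_a": 0, "load_b": 2, "mul": 3, "store": 4}
resource_of = {"load_a": "mem", "load_b": "mem", "mul": "alu", "store": "mem"}
units = {"mem": 1, "alu": 1}

print(modulo_resource_conflicts(schedule, resource_of, units, ii=2))
print(modulo_resource_conflicts(schedule, resource_of, units, ii=3))
```

With three memory operations and a single memory port, an initiation interval of 2 cannot work for any placement, whereas 3 leaves room for each memory operation in its own slot.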


international symposium on microarchitecture | 1999

Balance scheduling: weighting branch tradeoffs in superblocks

Alexandre E. Eichenberger; Waleed Meleis

Since there is generally insufficient instruction-level parallelism within a single basic block, higher performance is achieved by speculatively scheduling operations in superblocks. This is difficult in general because each branch competes for the processor's limited resources. Previous work manages the performance tradeoffs that exist between branches only indirectly. We show here that dependence and resource constraints can be used to gather explicit knowledge about scheduling tradeoffs between branches. The first contribution of this paper is a set of new, tighter lower bounds on the execution times of superblocks that specifically account for the dependence and resource conflicts between pairs of branches. The second contribution of this paper is a novel superblock scheduling heuristic that finds high-performance schedules by determining the operations that each branch needs to be scheduled early and selecting branches with compatible needs that favor beneficial branch tradeoffs. Performance evaluations for superblocks from SPECint95 indicate that our bounds are very tight and that our scheduling heuristic outperforms well-known superblock scheduling algorithms.


Algorithmica | 2002

An Experimental Study of Algorithms for Weighted Completion Time Scheduling

Ivan D. Baev; Waleed Meleis; Alexandre E. Eichenberger

We consider the total weighted completion time scheduling problem for parallel identical machines and precedence constraints, P | prec | Σ w_i C_i. This important and broad class of problems is known to be NP-hard, even for restricted special cases, and the best known approximation algorithms have worst-case performance that is far from optimal. However, little is known about the experimental behavior of algorithms for the general problem. This paper represents the first attempt to describe and evaluate comprehensively a range of weighted completion time scheduling algorithms. We first describe a family of combinatorial scheduling algorithms that optimally solve the single-machine problem, and show that they can be used to achieve good performance for the multiple-machine problem. These algorithms are efficient and find schedules that are on average within 1.5 percent of optimal over a large synthetic benchmark consisting of trees, chains, and instances with no precedence constraints. We then present several ways to create feasible schedules from nonintegral solutions to a new linear programming relaxation for the multiple-machine problem. The best of these linear programming-based approaches finds schedules that are within 0.2 percent of optimal over our benchmark. Finally, we describe how the scheduling phase in profile-based program compilation can be expressed as a weighted completion time scheduling problem and apply our algorithms to a set of instances extracted from the SPECint95 compiler benchmark. For these instances with arbitrary precedence constraints, the best linear programming-based approach finds optimal solutions in 78 percent of cases. Our results demonstrate that careful experimentation can help lead the way to high-quality algorithms, even for difficult optimization problems.
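To make the objective Σ w_i C_i concrete, the sketch below handles only the simplest special case: a single machine with no precedence constraints, where the classical Smith's rule (order jobs by non-increasing weight-to-processing-time ratio) is known to be optimal. It is not one of the algorithms evaluated in the paper, and the job names and numbers are made up for illustration. A brute-force search over all orders is included as a sanity check.

```python
from itertools import permutations

def weighted_completion_time(order, proc, weight):
    """Sum of w_i * C_i when jobs run back to back in the given order."""
    t = 0
    total = 0
    for job in order:
        t += proc[job]              # t is now the completion time C_i
        total += weight[job] * t
    return total

def smith_order(proc, weight):
    """Order jobs by non-increasing weight / processing-time ratio."""
    return sorted(proc, key=lambda j: weight[j] / proc[j], reverse=True)


# Hypothetical instance: processing times and weights for three jobs.
proc = {"a": 3, "b": 1, "c": 2}
weight = {"a": 3, "b": 2, "c": 1}

order = smith_order(proc, weight)
cost = weighted_completion_time(order, proc, weight)
# Exhaustive check, feasible only for tiny instances.
best = min(weighted_completion_time(p, proc, weight) for p in permutations(proc))
print(order, cost, best)
```

Once precedence constraints or several machines are added, this greedy rule no longer guarantees an optimal schedule, which is the regime the abstract is concerned with.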


international conference on parallel architectures and compilation techniques | 1998

Efficient edge profiling for ILP-processors

Alexandre E. Eichenberger; Sheldon M. Lobo

Compilers for VLIW and superscalar machines increasingly use dynamic application behavior or profiling information in optimizations such as instruction scheduling, speculative code motion, and code layout. Hence it is extremely useful to develop inexpensive techniques that gather accurate profiling information. This paper presents novel edge profiling techniques that greatly reduce run-time overhead by efficiently exploiting instruction level parallelism between application and instrumentation. Best results are achieved when speculatively executing a software pipelined version of the instrumentation code. For an 8-wide issue machine, measurements for the SPECint95 benchmarks indicate a 10-fold reduction in overhead (from 32.8% to 3.3%), when compared with previous techniques.
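For readers unfamiliar with the term, an edge profile is simply a count of how often each control-flow edge is taken. The sketch below computes one from a recorded sequence of executed basic blocks. It only illustrates the data being gathered, under the assumption that such a block trace is available; it says nothing about the low-overhead instrumentation techniques the paper proposes, and the block names are hypothetical.

```python
from collections import Counter

def edge_profile(block_trace):
    """Count how often each control-flow edge (src, dst) is traversed."""
    # Pair every executed block with its successor in the trace.
    return Counter(zip(block_trace, block_trace[1:]))


# Hypothetical trace: a loop body that executes three times.
trace = ["entry", "loop", "loop", "loop", "exit"]
profile = edge_profile(trace)
for edge, count in sorted(profile.items()):
    print(edge, count)
```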


IEEE Transactions on Computers | 2001

Scheduling superblocks with bound-based branch trade-offs

Waleed Meleis; Alexandre E. Eichenberger; Ivan D. Baev

Since instruction-level parallelism in basic blocks is often limited, compilers increase performance by creating superblocks that allow operations to be issued speculatively. This is difficult in general because each branch competes for the processor's limited resources. Previous work manages the performance trade-offs that exist between branches only indirectly. We show here that dependence and resource constraints can be used to gather explicit knowledge about scheduling trade-offs between branches. This paper's first contribution is a set of new, tighter lower bounds on the execution times of superblocks that specifically account for the dependence and resource conflicts between pairs of branches. This paper's second contribution is a novel superblock scheduling heuristic that finds high-performance schedules by determining the operations that each branch needs to be scheduled early and selecting branches with compatible needs that favor beneficial branch trade-offs. Performance evaluations for superblocks from SPECint95 indicate that our bounds are very tight and that our scheduling heuristic outperforms well-known superblock scheduling algorithms.


international symposium on microarchitecture | 2000

An integrated approach to accelerate data and predicate computations in hyperblocks

Alexandre E. Eichenberger; Waleed Meleis; Suman Maradani

To exploit increased instruction-level parallelism available in modern processors, we describe the formation and optimization of tracenets, an integrated approach to reducing the length of the critical path in data and predicated computation. By tightly integrating selective path expansion and path optimization within hyperblocks, our algorithm is able to produce highly optimized code without exploring the exponentially large number of paths included in a hyperblock. Our approach extracts more of the implicit predicate correlations in hyperblocks and uses a precise model of predicate correlations to aggressively accelerate data and predicate computations. Experimental results indicate that tracenets can significantly reduce the number of dynamic execution cycles.


international conference on parallel processing | 2000

Lower bounds on precedence-constrained scheduling for parallel processors

Ivan D. Baev; Waleed Meleis; Alexandre E. Eichenberger

We consider two general precedence-constrained scheduling problems that have wide applicability in the areas of parallel processing, high-performance compiling, and digital system synthesis. These problems are intractable, so it is important to be able to compute tight bounds on their solutions. A tight lower bound on makespan scheduling can be obtained by replacing precedence constraints with release and due dates, giving a problem that can be efficiently solved. We demonstrate that recursively applying this approach yields a bound that is provably tighter than other known bounds and is experimentally shown to achieve the optimal value at least 86.5% of the time over a synthetic benchmark. We compute the best known lower bound on weighted completion time scheduling by applying the recent discovery of a new algorithm for solving a related scheduling problem. Experiments show that this bound significantly outperforms the linear programming-based bound. We have therefore demonstrated that combinatorial algorithms can be a valuable alternative to linear programming for computing tight bounds on large scheduling problems.
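As a point of comparison for the bounds discussed above, the following sketch computes two textbook lower bounds on makespan: the critical-path length of the precedence graph and the total work divided by the number of machines. The earliest finish times it derives from the predecessors play the role of simple release dates. This is a much weaker bound than the recursive one described in the abstract and is shown only to illustrate what a makespan lower bound looks like; the instance and all names are hypothetical.

```python
import math

def makespan_lower_bound(proc, preds, machines):
    """max(critical-path length, ceil(total work / machines)).

    proc:  job -> processing time
    preds: job -> iterable of immediate predecessors (must form a DAG)
    """
    finish = {}

    def earliest_finish(job):
        # A job cannot start before all of its predecessors have finished.
        if job not in finish:
            start = max((earliest_finish(p) for p in preds.get(job, ())), default=0)
            finish[job] = start + proc[job]
        return finish[job]

    critical_path = max(earliest_finish(j) for j in proc)
    work_bound = math.ceil(sum(proc.values()) / machines)
    return max(critical_path, work_bound)


# Hypothetical instance: c waits for a and b, and d waits for c.
proc = {"a": 2, "b": 3, "c": 1, "d": 4}
preds = {"c": ["a", "b"], "d": ["c"]}

print(makespan_lower_bound(proc, preds, machines=2))
print(makespan_lower_bound(proc, preds, machines=1))
```

On two machines the chain b, c, d dominates; on one machine the total work dominates instead. The recursive helper is fine for small graphs, though very deep precedence chains would need an iterative traversal.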


symposium on discrete algorithms | 1999

Algorithms for total weighted completion time scheduling

Ivan D. Baev; Waleed Meleis; Alexandre E. Eichenberger


Archive | 1996

Modulo scheduling, machine representations, and register-sensitive algorithms

Alexandre E. Eichenberger; Santosh G. Abraham; Edward S. Davidson

Collaboration


Top Co-Authors

Erik Nystrom

North Carolina State University

Sheldon M. Lobo

North Carolina State University
