
Publication


Featured research published by Kirk Schloegel.


Journal of Parallel and Distributed Computing | 1997

Multilevel Diffusion Schemes for Repartitioning of Adaptive Meshes

Kirk Schloegel; George Karypis; Vipin Kumar

For a large class of irregular mesh applications, the structure of the mesh changes from one phase of the computation to the next. Eventually, as the mesh evolves, the adapted mesh has to be repartitioned to ensure good load balance. If this new graph is partitioned from scratch, it may lead to an excessive migration of data among processors. In this paper, we present schemes for computing repartitionings of adaptively refined meshes that perform diffusion of vertices in a multilevel framework. These schemes try to minimize vertex movement without significantly compromising the edge-cut. We present heuristics to control the tradeoff between edge-cut and vertex migration costs. We also show that multilevel diffusion produces results with improved edge-cuts over single-level diffusion, and is better able to make use of heuristics to control the tradeoff between edge-cut and vertex migration costs than single-level diffusion.
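The two quantities that these schemes trade off can be made concrete. The following is a minimal Python sketch, not taken from the paper: it assumes the graph is given as a list of weighted edges and a partitioning as a dict from vertex to subdomain. The function names `edge_cut` and `vertex_migration` are hypothetical.

```python
def edge_cut(edges, part):
    """Total weight of edges whose endpoints lie in different subdomains."""
    return sum(w for u, v, w in edges if part[u] != part[v])


def vertex_migration(old_part, new_part, size=None):
    """Total size of vertices whose subdomain changes between partitionings.

    `size` optionally maps each vertex to the amount of data it carries;
    when omitted, every vertex counts as one unit (an assumption).
    """
    return sum(
        (size[v] if size is not None else 1)
        for v in new_part
        if new_part[v] != old_part[v]
    )


# A 4-cycle split across two subdomains, then repartitioned.
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]
old = {0: 0, 1: 0, 2: 1, 3: 1}
new = {0: 0, 1: 1, 2: 1, 3: 1}

print(edge_cut(edges, old), edge_cut(edges, new), vertex_migration(old, new))
```

A repartitioner is judged on both numbers at once: a lower edge-cut reduces communication in each iteration, while a lower migration count reduces the one-time cost of moving data.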


Concurrency and Computation: Practice and Experience | 2002

Parallel static and dynamic multi‐constraint graph partitioning

Kirk Schloegel; George Karypis; Vipin Kumar

Sequential multi‐constraint graph partitioners have been developed to address the static load balancing requirements of multi‐phase simulations. These work well when (i) the graph that models the computation fits into the memory of a single processor, and (ii) the simulation does not require dynamic load balancing. The efficient execution of very large or dynamically adapting multi‐phase simulations on high‐performance parallel computers requires that the multi‐constraint partitionings are computed in parallel. This paper presents a parallel formulation of a multi‐constraint graph‐partitioning algorithm, as well as a new partitioning algorithm for dynamic multi‐phase simulations. We describe these algorithms and give experimental results conducted on a 128‐processor Cray T3E. These results show that our parallel algorithms are able to efficiently compute partitionings of similar edge‐cuts as serial multi‐constraint algorithms, and can scale to very large graphs. Our dynamic multi‐constraint algorithm is also able to minimize the data redistribution required to balance the load better than a naive scratch‐remap approach. We have shown that both of our parallel multi‐constraint graph partitioners are as scalable as the widely‐used parallel graph partitioner implemented in PARMETIS. Both of our parallel multi‐constraint graph partitioners are very fast, as they are able to compute three‐constraint 128‐way partitionings of a 7.5 million vertex graph in under 7 s on 128 processors of a Cray T3E.


conference on high performance computing (supercomputing) | 2000

A Unified Algorithm for Load-balancing Adaptive Scientific Simulations

Kirk Schloegel; George Karypis; Vipin Kumar

Adaptive scientific simulations require that periodic repartitioning occur dynamically throughout the course of the computation. The repartitionings should be computed so as to minimize both the inter-processor communications incurred during the iterative mesh-based computation and the data redistribution costs required to balance the load. Recently developed schemes for computing repartitionings provide the user with only a limited control of the tradeoffs among these objectives. This paper describes a new Unified Repartitioning Algorithm that can trade off one objective for the other depending on a user-defined parameter describing the relative costs of these objectives. We show that the Unified Repartitioning Algorithm is able to reduce the precise overheads associated with repartitioning as well as or better than other repartitioning schemes for a variety of problems, regardless of the relative costs of performing inter-processor communication and data redistribution. Our experimental results show that this scheme is extremely fast and scalable to large problems.
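One way to picture a single user-defined parameter weighing the two objectives is a combined cost of the form alpha * (edge-cut) + (data moved). The exact objective is not stated in the abstract, so the Python sketch below is an assumption for illustration only; `alpha`, `repartitioning_cost`, and the example graph are hypothetical.

```python
def repartitioning_cost(edges, old_part, new_part, alpha, size=None):
    """Combined cost: alpha * edge-cut + amount of data moved (assumed form)."""
    cut = sum(w for u, v, w in edges if new_part[u] != new_part[v])
    moved = sum(
        (size[v] if size is not None else 1)
        for v in new_part
        if new_part[v] != old_part[v]
    )
    return alpha * cut + moved


# 2x3 grid: horizontal edges weigh 3, vertical edges weigh 1.
edges = [(0, 1, 3), (1, 2, 3), (3, 4, 3), (4, 5, 3),
         (0, 3, 1), (1, 4, 1), (2, 5, 1)]
old = {0: 0, 1: 0, 3: 0, 4: 0, 2: 1, 5: 1}           # imbalanced: 4 vs 2 vertices
scratch_like = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}  # cut 3, moves 3 vertices
diffusion_like = {0: 0, 3: 0, 4: 0, 1: 1, 2: 1, 5: 1}  # cut 7, moves 1 vertex

for alpha in (0.25, 4):
    candidates = {"scratch_like": scratch_like, "diffusion_like": diffusion_like}
    best = min(candidates,
               key=lambda name: repartitioning_cost(edges, old, candidates[name], alpha))
    print(alpha, best)
```

With a small `alpha` (redistribution is expensive relative to communication) the low-migration candidate wins; with a large `alpha` the low-edge-cut candidate wins.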


european conference on parallel processing | 2000

Parallel Multilevel Algorithms for Multi-constraint Graph Partitioning

Kirk Schloegel; George Karypis; Vipin Kumar

Sequential multi-constraint graph partitioners have been developed to address the load balancing requirements of multi-phase simulations. The efficient execution of large multi-phase simulations on high performance parallel computers requires that the multi-constraint partitionings are computed in parallel. This paper presents a parallel formulation of a recently developed multi-constraint graph partitioning algorithm. We describe this algorithm and give experimental results conducted on a 128-processor Cray T3E. We show that our parallel algorithm is able to efficiently compute partitionings of similar edge-cuts as serial multi-constraint algorithms, and can scale to very large graphs. Our parallel multi-constraint graph partitioner is able to compute a three-constraint 128-way partitioning of a 7.5 million node graph in about 7 seconds on 128 processors of a Cray T3E.
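In a multi-constraint setting each vertex carries a vector of weights (for example, one per simulation phase), and every component must be balanced across subdomains at the same time. The Python sketch below only measures that balance; it is not the partitioning algorithm from the paper, and the name `constraint_imbalances` and the data layout are assumptions.

```python
def constraint_imbalances(weights, part, nparts):
    """Per-constraint imbalance: heaviest subdomain weight / average weight.

    `weights` maps each vertex to a tuple with one weight per constraint.
    A value of 1.0 means that constraint is perfectly balanced.
    """
    ncon = len(next(iter(weights.values())))
    totals = [[0.0] * ncon for _ in range(nparts)]
    for v, wvec in weights.items():
        for c, w in enumerate(wvec):
            totals[part[v]][c] += w
    result = []
    for c in range(ncon):
        column = [totals[p][c] for p in range(nparts)]
        average = sum(column) / nparts
        result.append(max(column) / average if average > 0 else 1.0)
    return result


# Two phases: vertices 0 and 1 work only in phase 1, vertices 2 and 3 only in phase 2.
weights = {0: (1, 0), 1: (1, 0), 2: (0, 1), 3: (0, 1)}
by_phase = {0: 0, 1: 0, 2: 1, 3: 1}   # equal total weight, but each phase runs on one processor
mixed = {0: 0, 2: 0, 1: 1, 3: 1}      # each phase is split evenly

print(constraint_imbalances(weights, by_phase, 2))
print(constraint_imbalances(weights, mixed, 2))
```

Both partitionings give each subdomain the same total weight, but only the second balances every phase, which is the situation a single-constraint partitioner cannot distinguish.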


european conference on parallel processing | 1999

A New Algorithm for Multi-objective Graph Partitioning

Kirk Schloegel; George Karypis; Vipin Kumar

Recently, a number of graph partitioning applications have emerged with additional requirements that the traditional graph partitioning model alone cannot effectively handle. One such class of problems is those in which multiple objectives, each of which can be modeled as a sum of weights of the edges of a graph, must be simultaneously optimized. This class of problems can be solved utilizing a multi-objective graph partitioning algorithm. We present a new formulation of the multi-objective graph partitioning problem and describe an algorithm that computes partitionings with respect to this formulation. We explain how this algorithm provides the user with a fine-tuned control of the tradeoffs among the objectives, results in predictable partitionings, and is able to handle both similar and dissimilar objectives. We show that this algorithm is better able to find a good tradeoff among the objectives than partitioning with respect to a single objective only. Finally, we show that by modifying the input preference vector, the multi-objective graph partitioning algorithm is able to gracefully trade off decreases in one objective for increases in the others.
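A common way to let a preference vector steer several edge-weight objectives is to collapse them into one scalar weight per edge. The Python sketch below assumes each objective is first normalized by a reference cut value (for instance, the best cut found when optimizing that objective alone) so that dissimilar objectives become comparable. This normalization and the name `combine_edge_weights` are assumptions, not details given in the abstract.

```python
def combine_edge_weights(edges, preference, reference_cuts):
    """Collapse per-objective edge weights into a single scalar weight.

    edges:          list of (u, v, (w_1, ..., w_k)) with one weight per objective
    preference:     (p_1, ..., p_k), relative importance of each objective
    reference_cuts: (c_1, ..., c_k), positive normalizers, one per objective
    """
    combined = []
    for u, v, weights in edges:
        scalar = sum(p * w / c
                     for p, w, c in zip(preference, weights, reference_cuts))
        combined.append((u, v, scalar))
    return combined


edges = [(0, 1, (2, 10)), (1, 2, (4, 5))]
reference_cuts = (2, 10)

print(combine_edge_weights(edges, (0.5, 0.5), reference_cuts))
print(combine_edge_weights(edges, (1, 0), reference_cuts))
```

The combined graph can then be handed to any single-objective partitioner; changing the preference vector shifts which edges are expensive to cut.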


IEEE Transactions on Parallel and Distributed Systems | 2001

Wavefront diffusion and LMSR: algorithms for dynamic repartitioning of adaptive meshes

Kirk Schloegel; George Karypis; Vipin Kumar

Current multilevel repartitioning schemes tend to perform well on certain types of problems while obtaining worse results for other types of problems. We present two new multilevel algorithms for repartitioning adaptive meshes that improve the performance of multilevel schemes for the types of problems on which current schemes perform poorly while maintaining similar or better results for those problems on which current schemes perform well. Specifically, we present a new scratch-remap scheme called Locally-matched Multilevel Scratch-remap (or simply LMSR) for repartitioning of adaptive meshes. LMSR tries to compute a high-quality partitioning that has a large amount of overlap with the original partitioning. We show that LMSR generally decreases the data redistribution costs required to balance the load compared to current scratch-remap schemes. We present a new diffusion-based scheme that we refer to as Wavefront Diffusion. In Wavefront Diffusion, the flow of vertices moves in a wavefront from overweight to underweight subdomains. We show that Wavefront Diffusion obtains significantly lower data redistribution costs while maintaining similar or better edge-cut results compared to existing diffusion algorithms. We also compare Wavefront Diffusion with LMSR and show that these provide a trade-off between edge-cut and data redistribution costs for a wide range of problems. Our experimental results on a Cray T3E, an IBM SP2, and a cluster of Pentium Pro workstations show that both schemes are fast and scalable. For example, both are capable of repartitioning a seven million vertex graph in under three seconds on 128 processors of a Cray T3E. Our schemes obtained relative speedups of between nine and 12 when the number of processors was increased by a factor of 16 on a Cray T3E.


conference on high performance computing (supercomputing) | 1998

Dynamic Repartitioning of Adaptively Refined Meshes

Kirk Schloegel; George Karypis; Vipin Kumar

One ingredient which is viewed as vital to the successful conduct of many large-scale numerical simulations is the ability to dynamically repartition the underlying adaptive finite element mesh among the processors so that the computations are balanced and interprocessor communication is minimized. This requires that a sequence of partitions of the computational mesh be computed during the course of the computation in which the amount of data migration necessary to realize subsequent partitions is minimized, while all of the domains of a given partition contain a roughly equal amount of computational weight. Recently, parallel multilevel graph repartitioning techniques have been developed that can quickly compute high-quality repartitions for adaptive and dynamic meshes while minimizing the amount of data which needs to be migrated between processors. These algorithms can be categorized as either schemes which compute a new partition from scratch and then intelligently remap this partition to the original partition (hereafter referred to as scratch-remap schemes), or multilevel diffusion schemes. Scratch-remap schemes work quite well for graphs which are highly imbalanced in localized areas. On slightly to moderately imbalanced graphs and those in which imbalance occurs globally throughout the graph, however, they result in excessive vertex migration compared to multilevel diffusion algorithms. On the other hand, diffusion-based schemes work well for slightly imbalanced graphs and for those in which imbalance occurs globally throughout the graph. However, these schemes perform poorly on graphs that are highly imbalanced in localized areas, as the propagation of diffusion over long distances results in excessive edge-cut and vertex migration results. In this paper, we present two new schemes for adaptive repartitioning: Locally-Matched Multilevel Scratch-Remap (or LMSR) and Wavefront Diffusion. The LMSR scheme performs purely local coarsening and partition remapping in a multilevel context. In Wavefront Diffusion, the flow of vertices moves in a wavefront from overbalanced to underbalanced domains. We present experimental evaluations of our LMSR and Wavefront Diffusion algorithms on synthetically generated adaptive meshes as well as on some application meshes. We show that our LMSR algorithm decreases the amount of vertex migration required to balance the graph and produces repartitionings of similar quality compared to state-of-the-art scratch-remap schemes. Furthermore, we show that our LMSR algorithm is more scalable in terms of execution time compared to state-of-the-art scratch-remap schemes. We show that our Wavefront Diffusion algorithm obtains significantly lower vertex migration requirements, while maintaining similar edge-cut results compared to state-of-the-art multilevel diffusion algorithms, especially for highly imbalanced graphs. Furthermore, we compare Wavefront Diffusion with LMSR and show that the former will result in lower vertex migration requirements and the latter will result in higher quality edge-cut results. These results hold true regardless of the distance which diffusion is required to propagate in order to balance the graph. Finally, we discuss the run times of our schemes which are both capable of repartitioning an eight million node graph in under three seconds on a 128-processor Cray T3E.
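The remapping step of a generic scratch-remap scheme can be sketched as follows: after a fresh partition is computed, its subdomain labels are renamed so that each new subdomain keeps the label of the old subdomain it overlaps most. The Python code below uses a simple greedy matching and unit vertex sizes. It is an illustrative assumption, not LMSR or any specific published remapping procedure, and `remap_to_original` is a hypothetical name.

```python
from collections import Counter


def remap_to_original(old_part, new_part):
    """Relabel `new_part` so its labels overlap the old partition as much as
    a greedy largest-overlap-first matching allows."""
    # overlap[(new_label, old_label)] = number of vertices shared by the two subdomains
    overlap = Counter((new_part[v], old_part[v]) for v in new_part)
    mapping, used_old = {}, set()
    for (new_label, old_label), _count in sorted(
            overlap.items(), key=lambda item: (-item[1], item[0])):
        if new_label not in mapping and old_label not in used_old:
            mapping[new_label] = old_label
            used_old.add(old_label)
    # Any new subdomain left unmatched takes one of the unused labels.
    all_labels = set(old_part.values()) | set(new_part.values())
    free = sorted(all_labels - used_old)
    for new_label in sorted(set(new_part.values())):
        if new_label not in mapping:
            mapping[new_label] = free.pop(0)
    return {v: mapping[label] for v, label in new_part.items()}


def moved(old_part, new_part):
    """Number of vertices whose subdomain label changes."""
    return sum(1 for v in new_part if new_part[v] != old_part[v])


old = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1}       # imbalanced original partition
scratch = {0: 1, 1: 1, 2: 1, 3: 0, 4: 0, 5: 0}   # balanced, but labels are arbitrary
remapped = remap_to_original(old, scratch)

print(moved(old, scratch), moved(old, remapped))
```

The partition itself (and therefore its edge-cut) is unchanged by relabeling; only the amount of data that must migrate differs.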


foundations of computer science | 2001

Graph partitioning for dynamic, adaptive and multi-phase scientific simulations

Kirk Schloegel; George Karypis; Vipin Kumar

The efficient execution of scientific simulations on HPC systems requires a partitioning of the underlying mesh among the processors such that the load is balanced and the inter-processor communication is minimized. Graph partitioning algorithms have been applied with much success for this purpose. However, the parallelization of multi-phase and multi-physics computations poses new challenges that require fundamental advances in graph partitioning technology. In addition, most existing graph partitioning algorithms are not suited for the newer heterogeneous high-performance computing platforms. This talk will describe research efforts in our group that are focused on developing novel multi-constraint and multi-objective graph partitioning algorithms that can support the advancing state-of-the-art in numerical simulation technologies. In addition, we will present our preliminary work on new partitioning algorithms that are well suited for heterogeneous architectures.


european conference on parallel processing | 1997

Repartitioning of Adaptive Meshes: Experiments with Multilevel Diffusion

Kirk Schloegel; George Karypis; Vipin Kumar

For a large class of irregular grid applications, the structure of the mesh changes from one phase of the computation to the next. Eventually, as the graph evolves, the adapted mesh has to be repartitioned to ensure good load balance. If this new graph is partitioned from scratch, it will lead to an excessive migration of data among processors. In this paper, we present a new scheme for computing repartitionings of adaptively refined meshes. This scheme performs diffusion of vertices in a multilevel framework and minimizes vertex movement without significantly compromising the edge-cut.


international parallel and distributed processing symposium | 2000

Graph Partitioning for Dynamic, Adaptive, and Multi-phase Computations

Vipin Kumar; Kirk Schloegel; George Karypis

Algorithms that find good partitionings of highly unstructured graphs are critical in developing efficient algorithms for problems in a variety of domains such as scientific simulations that require solution to large sparse linear systems, VLSI design, and data mining. Even though this problem is NP-hard, efficient multilevel algorithms have been developed that can find good partitionings of static irregular meshes. The problem of graph partitioning becomes a lot more challenging when the graph is dynamically evolving (e.g., in adaptive computations), or if computation in multiple phases needs to be balanced simultaneously. This talk will discuss these challenges, and then describe some of our recent research in addressing them.

Collaboration


Dive into Kirk Schloegel's collaborations.

Top Co-Authors

Vipin Kumar

University of Minnesota
