Andrey N. Chernikov | Researchain

Publication

Featured research published by Andrey N. Chernikov.

IEEE Transactions on Parallel and Distributed Systems | 2004

A load balancing framework for adaptive and asynchronous applications

Kevin J. Barker; Andrey N. Chernikov; Nikos Chrisochoides; Keshav Pingali

We describe the design of a flexible load balancing framework and runtime software system for supporting the development of adaptive applications on distributed-memory parallel computers. The runtime system supports a global namespace, transparent object migration, automatic message forwarding and routing, and automatic load balancing. These features can be used at the discretion of the application developer in order to simplify program development and to eliminate complex bookkeeping associated with mobile data objects. An evaluation of this system in the context of a three-dimensional tetrahedral advancing front parallel mesh generator shows that overall runtime improvements of 15 percent compared to common stop-and-repartition load balancing methods, 30 percent compared to explicit intrusive load balancing methods, and 42 percent compared to no load balancing are possible on large processor configurations. At the same time, the overheads attributable to the runtime system are a fraction of 1 percent of the total runtime. The parallel advancing front method is a coarse-grained and highly adaptive application and therefore exercises all of the features of the runtime system.

Mathematics and Computers in Simulation | 2007

Parallel unstructured mesh generation by an advancing front method

Yasushi Ito; Alan M. Shih; Anil K. Erukala; Bharat K. Soni; Andrey N. Chernikov; Nikos Chrisochoides; Kazuhiro Nakahashi

Mesh generation is a critical step in high fidelity computational simulations. High-quality and high-density meshes are required to accurately capture the complex physical phenomena. A robust approach for a parallel framework has been developed to generate large-scale meshes in a short period of time. A coarse tetrahedral mesh is generated first to provide the basis of block interfaces and then is partitioned into a number of sub-domains using METIS partitioning algorithms. A volume mesh is generated on each sub-domain in parallel using an advancing front method. Dynamic load balancing is achieved by evenly distributing work among the processors. All the sub-domains are combined to create a single volume mesh. The combined volume mesh can be smoothed to remove the artifacts in the interfaces between sub-domains. A void region is defined inside each sub-domain to reduce the data points during the smoothing operation. The scalability of the parallel mesh generation is evaluated to quantify the improvement on shared- and distributed-memory computer systems.

International Conference on Supercomputing | 2004

Practical and efficient point insertion scheduling method for parallel guaranteed quality Delaunay refinement

Andrey N. Chernikov; Nikos Chrisochoides

We describe a parallel scheduler for guaranteed quality parallel mesh generation and refinement methods. We prove a sufficient condition for the new points to be independent, which permits the concurrent insertion of more than two points without destroying the conformity and Delaunay properties of the mesh. The scheduling technique we present is much more efficient than existing coloring methods and thus it is suitable for practical use. The condition for concurrent point insertion is based on comparing the distance between the candidate points against the upper bound on triangle circumradius in the mesh. Our experimental data show that the scheduler introduces a small overhead (on the order of 1-2% of the total execution time) and that it requires local and structured communication, compared to the irregular, variable, and unpredictable communication of the other existing practical parallel guaranteed quality mesh generation and refinement method. Finally, on a cluster of more than 100 workstations using a simple (block) decomposition, our data show that we can generate about 900 million elements in less than 300 seconds.
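The distance-based condition described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' scheduler: the function name `select_independent`, the greedy selection order, and the multiplier `factor` are assumptions made for illustration, and the abstract does not state the exact constant. The sketch only shows the general idea of accepting candidate points whose pairwise distance is at least some multiple of the circumradius upper bound `r_bar`.

```python
import math


def select_independent(candidates, r_bar, factor=4.0):
    """Greedily pick 2D candidate points that are pairwise far apart.

    r_bar  -- upper bound on triangle circumradius in the mesh
    factor -- hypothetical multiplier; the abstract does not give the constant
    """
    threshold = factor * r_bar
    chosen = []
    for p in candidates:
        # Accept p only if it is far enough from every point already chosen.
        if all(math.dist(p, q) >= threshold for q in chosen):
            chosen.append(p)
    return chosen


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.0, 4.5)]
independent = select_independent(points, r_bar=1.0)
```

With `r_bar=1.0` and the assumed multiplier, the point `(1.0, 0.0)` is rejected because it lies too close to `(0.0, 0.0)`, while the remaining three points are accepted as candidates for concurrent insertion.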

International Conference on Supercomputing | 2008

Three-dimensional Delaunay refinement for multi-core processors

Andrey N. Chernikov; Nikos Chrisochoides

We develop the first ever fully functional three-dimensional guaranteed quality parallel graded Delaunay mesh generator. First, we prove a criterion and a sufficient condition of Delaunay-independence of Steiner points in three dimensions. Based on these results, we decompose the iteration space of the sequential Delaunay refinement algorithm by selecting independent subsets from the set of the candidate Steiner points without resorting to rollbacks. We use an octree which overlaps the mesh for a coarse-grained decomposition of the set of candidate Steiner points based on their location. We partition the worklist containing poor quality tetrahedra into independent lists associated with specific separated leaves of the octree. Finally, we describe an example parallel implementation using a publicly available state-of-the-art sequential Delaunay library (Tetgen). This work provides a case study for the design of abstractions and parallel frameworks for the use of complex, labor-intensive sequential codes on multicore architectures.

ACM Transactions on Mathematical Software | 2008

Algorithm 872: Parallel 2D constrained Delaunay mesh generation

Andrey N. Chernikov; Nikos Chrisochoides

Delaunay refinement is a widely used method for the construction of guaranteed quality triangular and tetrahedral meshes. We present an algorithm and software for parallel constrained Delaunay mesh generation in two dimensions. Our approach is based on the decomposition of the original mesh generation problem into N smaller subproblems which are meshed in parallel. The parallel algorithm is asynchronous with small messages which can be aggregated, and it exhibits low communication costs. On a heterogeneous cluster of more than 100 processors, our implementation can generate over one billion triangles in less than 3 minutes, while the single-node performance is comparable to that of the fastest sequential guaranteed quality Delaunay meshing library known to us (Triangle).

SIAM Journal on Scientific Computing | 2006

Parallel Guaranteed Quality Delaunay Uniform Mesh Refinement

Andrey N. Chernikov; Nikos Chrisochoides

We present a theoretical framework for developing parallel guaranteed quality Delaunay mesh generation software that allows us to use commercial off-the-shelf sequential Delaunay meshers for two-dimensional geometries. In this paper, we describe our approach for constructing uniform meshes, that is, the meshes in which all elements have approximately the same size. Our uniform distributed- and shared-memory implementations are based on a simple (block) coarse-grained mesh decomposition. Our method requires only local communication, which is bulk and structured as opposed to fine and unpredictable communication of the other existing practical parallel guaranteed quality mesh generation and refinement techniques. Our experimental data show that on a cluster of more than 100 workstations we can generate about 0.9 billion elements in less than 5 minutes in the absence of work-load imbalances. Preliminary results for this paper were presented in [A. N. Chernikov and N. P. Chrisochoides, “Practical and efficient point insertion scheduling method for parallel guaranteed quality Delaunay refinement,” in Proceedings of the 18th Annual International Conference on Supercomputing, ACM Press, New York, 2004, pp. 48-57]. Our work in progress includes extending the presented approach, which can efficiently generate only uniform meshes, to nonuniform graded meshes.

International Conference on Supercomputing | 2005

Multigrain parallel Delaunay mesh generation: challenges and opportunities for multithreaded architectures

Christos D. Antonopoulos; Xiaoning Ding; Andrey N. Chernikov; Filip Blagojevic; Dimitrios S. Nikolopoulos; Nikos Chrisochoides

Given the importance of parallel mesh generation in large-scale scientific applications and the proliferation of multilevel SMT-based architectures, it is imperative to gain insight into the interaction between meshing algorithms and these systems. We focus on Parallel Constrained Delaunay Mesh (PCDM) generation. We exploit coarse-grain parallelism at the subdomain level and fine-grain parallelism at the element level. This multigrain data parallel approach targets clusters built from low-end, commercially available SMTs. Our experimental evaluation shows that current SMTs are not capable of executing fine-grain parallelism in PCDM. However, experiments on a simulated SMT indicate that with modest hardware support it is possible to exploit fine-grain parallelism opportunities. The exploitation of fine-grain parallelism results in higher performance than a pure MPI implementation and closes the gap between the performance of PCDM and the state-of-the-art sequential mesher on a single physical processor. Our findings extend to other adaptive and irregular multigrain parallel algorithms.

Journal of Parallel and Distributed Computing | 2009

A multigrain Delaunay mesh generation method for multicore SMT-based architectures

Christos D. Antonopoulos; Filip Blagojevic; Andrey N. Chernikov; Nikos Chrisochoides; Dimitrios S. Nikolopoulos

Given the proliferation of layered, multicore- and SMT-based architectures, it is imperative to deploy and evaluate important, multi-level, scientific computing codes, such as meshing algorithms, on these systems. We focus on Parallel Constrained Delaunay Mesh (PCDM) generation. We exploit coarse-grain parallelism at the subdomain level, medium-grain at the cavity level and fine-grain at the element level. This multi-grain data parallel approach targets clusters built from commercially available SMTs and multicore processors. The exploitation of the coarser degree of granularity facilitates scalability both in terms of execution time and problem size on loosely-coupled clusters. The exploitation of medium-grain parallelism allows performance improvement at the single node level. Our experimental evaluation shows that the first generation of SMT cores is not capable of taking advantage of fine-grain parallelism in PCDM. Many of our experimental findings with PCDM extend to other adaptive and irregular multigrain parallel algorithms as well.

IMR | 2009

Towards Exascale Parallel Delaunay Mesh Generation

Nikos Chrisochoides; Andrey N. Chernikov; Andriy Fedorov; Andriy Kot; Leonidas Linardakis; Panagiotis A. Foteinos

Mesh generation is a critical component for many (bio-)engineering applications. However, parallel mesh generation codes, which are essential for these applications to take the fullest advantage of the high-end computing platforms, belong to the broader class of adaptive and irregular problems, and are among the most complex, challenging, and labor intensive to develop and maintain. As a result, parallel mesh generation is one of the last applications to be installed on new parallel architectures. In this paper we present a way to remedy this problem for new highly-scalable architectures. We present a multi-layered tetrahedral/triangular mesh generation approach capable of delivering and sustaining close to 10^18 concurrent work units. We achieve this by leveraging concurrency at different granularity levels using a hybrid algorithm, and by carefully matching these levels to the hierarchy of the hardware architecture. This paper makes two contributions: (1) a new evolutionary path for developing multi-layered parallel mesh generation codes capable of increasing the concurrency of the state-of-the-art parallel mesh generation methods by at least 10 orders of magnitude and (2) a new abstraction for multi-layered runtime systems that target parallel mesh generation codes, to efficiently orchestrate intra- and inter-layer data movement and load balancing for current and emerging multi-layered architectures with deep memory and network hierarchies.

SIAM Journal on Scientific Computing | 2012

Generalized Insertion Region Guides for Delaunay Mesh Refinement

Andrey N. Chernikov; Nikos Chrisochoides

Mesh generation by Delaunay refinement is a widely used technique for constructing guaranteed quality triangular and tetrahedral meshes. The quality guarantees are usually provided in terms of the bounds on circumradius-to-shortest-edge ratio and on the grading of the resulting mesh. Traditionally circumcenters of skinny elements and middle points of boundary faces and edges are used for the positions of inserted points. However, recently variations of the traditional algorithms are being proposed that are designed to achieve certain optimization objectives by inserting new points in neighborhoods of the center points. In this paper we propose a general approach to the selection of point positions by defining one-, two-, and three-dimensional selection regions such that any point insertion strategy based on these regions is automatically endowed with the theoretical guarantees proven here. In particular, for the input models defined by planar linear complexes under the assumption that no input angle is le...
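The circumradius-to-shortest-edge ratio named in this abstract as the usual quality measure can be computed for a single triangle as in the following Python sketch. The function name `radius_edge_ratio` and the 2D tuple representation of vertices are assumptions for illustration; the sketch uses the standard identity R = abc / (4 * area) and is not taken from the paper.

```python
import math


def radius_edge_ratio(a, b, c):
    """Return circumradius / shortest edge length for the 2D triangle (a, b, c)."""
    # Edge lengths opposite each vertex.
    la = math.dist(b, c)
    lb = math.dist(a, c)
    lc = math.dist(a, b)
    # Triangle area from the 2D cross product of two edge vectors.
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    area = abs(cross) / 2.0
    if area == 0.0:
        raise ValueError("degenerate triangle: vertices are collinear")
    circumradius = la * lb * lc / (4.0 * area)
    return circumradius / min(la, lb, lc)


equilateral = radius_edge_ratio((0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2))
right_isosceles = radius_edge_ratio((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
```

A lower ratio indicates a better-shaped triangle: the equilateral triangle attains the minimum possible value of 1/sqrt(3), while skinny triangles with a very short edge produce large ratios.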