Network


Latest external collaboration at the country level.

Hotspot


Dive into the research topics where Jian Tao is active.

Publication


Featured research published by Jian Tao.


Scientific Programming | 2013

From physics model to results: An optimizing framework for cross-architecture code generation

Marek Blazewicz; Ian Hinder; David M. Koppelman; Steven R. Brandt; Milosz Ciznicki; Michal Kierzynka; Frank Löffler; Jian Tao

Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high-performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, making it possible to use large-scale CPU/GPU systems efficiently for complex applications without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms.
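To make the pipeline concrete, the sketch below shows the kind of higher-order finite-difference stencil such a framework generates from a tensorial PDE description. It is a hand-written illustration, not actual Chemora output, and the grid size is an arbitrary assumption.

```cpp
// Minimal illustration (not Chemora output): a fourth-order central
// finite-difference approximation of d2u/dx2, the kind of stencil the
// framework generates from a tensorial PDE description.
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    const double pi = 3.14159265358979323846;
    const int n = 101;               // number of grid points (assumed)
    const double h = 1.0 / (n - 1);  // grid spacing
    std::vector<double> u(n), d2u(n, 0.0);
    for (int i = 0; i < n; ++i) u[i] = std::sin(2.0 * pi * i * h);

    // Fourth-order stencil:
    // (-u[i-2] + 16 u[i-1] - 30 u[i] + 16 u[i+1] - u[i+2]) / (12 h^2)
    for (int i = 2; i < n - 2; ++i)
        d2u[i] = (-u[i - 2] + 16.0 * u[i - 1] - 30.0 * u[i]
                  + 16.0 * u[i + 1] - u[i + 2]) / (12.0 * h * h);

    // The exact second derivative is -(2 pi)^2 sin(2 pi x).
    double exact = -4.0 * pi * pi * std::sin(2.0 * pi * 0.25);
    std::printf("error at x = 0.25: %.3e\n", std::fabs(d2u[25] - exact));
}
```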


Scientific Programming | 2011

CaKernel - A parallel application programming framework for heterogeneous computing architectures

Marek Blazewicz; Steven R. Brandt; Michal Kierzynka; Krzysztof Kurowski; Bogdan Ludwiczak; Jian Tao; Jan Węglarz

With the recent advent of new heterogeneous computing architectures, there is still a lack of parallel problem-solving environments that can help scientists use hybrid supercomputers easily and efficiently. Many scientific simulations that use structured grids to solve partial differential equations rely on stencil computations. Stencil computations have become crucial in solving many challenging problems in various domains, e.g., engineering or physics. Although many parallel stencil computing approaches have been proposed, in most cases they solve only particular problems. As a result, scientists struggle when it comes to implementing a new stencil-based simulation, especially on high-performance hybrid supercomputers. In response to this need, we extend our previous work on CaCUDA, a parallel programming framework for CUDA, to also support OpenCL. We present CaKernel, a tool that simplifies the development of parallel scientific applications on hybrid systems. CaKernel is built on the highly scalable and portable Cactus framework. In the CaKernel framework, Cactus manages the inter-process communication via MPI, while CaKernel manages the code running on Graphics Processing Units (GPUs) and the interactions between them. As a non-trivial test case, we have developed a 3D CFD code to demonstrate the performance and scalability of the automatically generated code.
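For illustration, the following is a minimal hand-written example of the stencil pattern CaKernel targets: a 7-point Jacobi sweep over a 3D structured grid. The grid size and boundary handling are assumptions; the real framework generates and launches such kernels on GPUs.

```cpp
// Minimal hand-written example of a 7-point Jacobi sweep over a 3D
// structured grid with a one-cell halo (grid size is an assumption).
#include <cstdio>
#include <utility>
#include <vector>

int main() {
    const int N = 32;  // interior grid size (assumed)
    auto idx = [N](int i, int j, int k) {
        const int S = N + 2;  // include ghost cells
        return (i * S + j) * S + k;
    };
    std::vector<double> in((N + 2) * (N + 2) * (N + 2), 0.0), out(in);
    in[idx(N / 2, N / 2, N / 2)] = 1.0;  // point source

    for (int sweep = 0; sweep < 10; ++sweep) {
        for (int i = 1; i <= N; ++i)
            for (int j = 1; j <= N; ++j)
                for (int k = 1; k <= N; ++k)
                    out[idx(i, j, k)] =
                        (in[idx(i - 1, j, k)] + in[idx(i + 1, j, k)] +
                         in[idx(i, j - 1, k)] + in[idx(i, j + 1, k)] +
                         in[idx(i, j, k - 1)] + in[idx(i, j, k + 1)]) / 6.0;
        std::swap(in, out);  // double-buffered update, as in real stencil codes
    }
    std::printf("center value after 10 sweeps: %g\n", in[idx(N / 2, N / 2, N / 2)]);
}
```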


ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2012

Using GPUs to accelerate stencil-based computation kernels for the development of large scale scientific applications on heterogeneous systems

Jian Tao; Marek Blazewicz; Steven R. Brandt

We present CaCUDA - a GPGPU kernel abstraction and a parallel programming framework for developing highly efficient, large scale scientific applications using stencil computations on hybrid CPU/GPU architectures. CaCUDA is built upon the Cactus computational toolkit, an open source problem solving environment designed for scientists and engineers. Due to the flexibility and extensibility of the Cactus toolkit, the addition of a GPGPU programming framework required no changes to the Cactus infrastructure, guaranteeing that existing features and modules continue to work without modification. CaCUDA was tested and benchmarked using a 3D CFD code based on a finite difference discretization of the Navier-Stokes equations.
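The sketch below illustrates the idea of a kernel abstraction in this spirit: the numerical update is written once and handed to a generic driver, so the traversal or launch strategy can change without touching the physics. All names are hypothetical, and a plain CPU loop stands in for a CUDA launch.

```cpp
// Illustrative sketch of a kernel abstraction (names are hypothetical):
// the numerical update is a functor passed to a generic driver, so the
// traversal strategy can be swapped without changing the kernel itself.
#include <cstdio>
#include <vector>

template <typename Stencil>
void apply_interior(std::vector<double>& out, const std::vector<double>& in,
                    Stencil s) {
    // Generic traversal over interior points (1D for brevity).
    for (std::size_t i = 1; i + 1 < in.size(); ++i)
        out[i] = s(in, i);
}

int main() {
    std::vector<double> u(16, 0.0), unew(16, 0.0);
    u[8] = 1.0;
    // The "kernel": a 3-point average, independent of how it is executed.
    auto avg3 = [](const std::vector<double>& v, std::size_t i) {
        return (v[i - 1] + v[i] + v[i + 1]) / 3.0;
    };
    apply_interior(unew, u, avg3);
    std::printf("unew[8] = %g\n", unew[8]);
}
```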


Physical Review D | 2005

Computational relativistic astrophysics with adaptive mesh refinement: Testbeds

Edwin Evans; Sai Iyer; Wai-Mo Suen; Jian Tao; Randy Wolfmeyer; Hui-Min Zhang

We have carried out numerical simulations of strongly gravitating systems based on the Einstein equations coupled to the relativistic hydrodynamic equations using adaptive mesh refinement (AMR) techniques. We show AMR simulations of NS binary inspiral and coalescence carried out on a workstation with an accuracy equivalent to that of a 1025³ regular unigrid simulation, which is, to the best of our knowledge, larger than all previous simulations of similar NS systems on supercomputers. We believe this capability opens new possibilities in general relativistic simulations.
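A toy example of the AMR principle at work (not the authors' code): cells where the solution varies rapidly are flagged for refinement, which is how a workstation-sized AMR run can match the effective resolution of a far larger unigrid one. The threshold and test profile are illustrative assumptions.

```cpp
// Toy illustration of AMR flagging: cells whose solution gradient
// exceeds a threshold receive a finer patch, concentrating resolution
// where the fields vary rapidly. Threshold and profile are assumptions.
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    const int n = 64;
    const double threshold = 0.5;  // refinement criterion (assumed)
    std::vector<double> u(n);
    for (int i = 0; i < n; ++i)    // sharp feature near the middle of the domain
        u[i] = std::tanh(20.0 * (i / double(n - 1) - 0.5));

    int flagged = 0;
    const double h = 1.0 / (n - 1);
    for (int i = 1; i < n - 1; ++i)
        if (std::fabs(u[i + 1] - u[i - 1]) / (2.0 * h) > threshold)
            ++flagged;  // this cell would receive a refined patch

    std::printf("%d of %d cells flagged for refinement\n", flagged, n);
}
```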


Proceedings of the 15th ACM Mardi Gras conference on From lightweight mash-ups to lambda grids: Understanding the spectrum of distributed computing requirements, applications, tools, infrastructures, interoperability, and the incremental adoption of key capabilities | 2008

A case study for petascale applications in astrophysics: simulating gamma-ray bursts

Christian D. Ott; Gabrielle Allen; Edward Seidel; Jian Tao; Burkhard Zink

Petascale computing will allow astrophysicists to investigate astrophysical objects, systems, and events that cannot be studied by current observational means and that were previously excluded from computational study by sheer lack of CPU power and appropriate codes. Here we present a pragmatic case study, focusing on the simulation of gamma-ray bursts as a science driver for petascale computing. We estimate the computational requirements for such simulations and delineate in what way petascale and peta-grid computing can be utilized in this context.
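The flavor of such a requirement estimate can be shown with back-of-envelope arithmetic; the grid size, variable count, flop figures, and step count below are illustrative assumptions, not numbers from the paper.

```cpp
// Back-of-envelope sketch of a petascale requirement estimate. All
// figures here are illustrative assumptions, not the paper's numbers.
#include <cstdio>

int main() {
    const double points   = 1024.0 * 1024.0 * 1024.0;  // 1024^3 grid (assumed)
    const double vars     = 100.0;   // evolved GR-MHD variables (assumed)
    const double bytes    = 8.0;     // double precision
    const double flops_pt = 1.0e4;   // flops per point per step (assumed)
    const double steps    = 1.0e6;   // time steps (assumed)

    const double mem_tb     = points * vars * bytes / 1.0e12;
    const double work_eflop = points * flops_pt * steps / 1.0e18;
    std::printf("memory: %.2f TB, total work: %.1f exaflops\n", mem_tb, work_eflop);
    // At a sustained petaflop/s this is about 10^4 seconds of compute,
    // i.e. hours rather than years: the scale where such runs become feasible.
}
```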


Engineering Mechanics Conference | 2013

An HPC framework for large scale simulations and visualizations of oil spill trajectories

Jian Tao; Werner Benger; Kelin Hu; Edwin Mathews; Marcel Ritter; Peter Diener; Carola Kaiser; Haihong Zhao; Gabrielle Allen; Qin Chen

The objective of this work is to build a high-performance computing framework for simulating, analyzing, and visualizing oil spill trajectories driven by winds and ocean currents. We adopt a particle model for the oil and track the trajectories of oil particles using 2D surface currents and winds, which can either be measured directly or estimated with sophisticated coastal storm and ocean circulation models. Our work is built upon the Cactus computational framework. The numerical implementation of the particle model, as well as the model coupling modules, will become crucial parts of our upcoming full 3D oil spill modeling toolkit. Employing high performance computing and networking, the simulation time can be greatly reduced. Given timely injection of measurement data, our work can help predict oil trajectories and facilitate oil cleanup, especially after a tropical cyclone.
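A minimal sketch of the particle model described above: each oil parcel is advected by the surface current plus a wind-drift term. The 3% wind-drift factor and the uniform fields are common simplifying assumptions used here for illustration; the real system reads gridded current and wind data from circulation models.

```cpp
// Minimal particle-advection sketch: oil parcels drift with the surface
// current plus a fraction of the wind. Uniform fields and the 3% drift
// factor are illustrative assumptions.
#include <cstdio>

struct Vec2 { double x, y; };

int main() {
    const double dt = 600.0;         // time step: 10 minutes
    const double windFactor = 0.03;  // fraction of wind speed felt by the slick
    Vec2 p{0.0, 0.0};                // particle position (meters)
    Vec2 current{0.2, 0.05};         // surface current (m/s), assumed uniform
    Vec2 wind{5.0, -2.0};            // wind at 10 m (m/s), assumed uniform

    for (int step = 0; step < 144; ++step) {  // 144 steps = one day
        p.x += (current.x + windFactor * wind.x) * dt;
        p.y += (current.y + windFactor * wind.y) * dt;
    }
    std::printf("position after 24 h: (%.0f m, %.0f m)\n", p.x, p.y);
}
```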


TeraGrid Conference | 2011

Runtime analysis tools for parallel scientific applications

Oleg Korobkin; Gabrielle Allen; Steven R. Brandt; Eloisa Bentivegna; Peter Diener; Jinghua Ge; Frank Löffler; Jian Tao

This paper describes the Alpaca runtime tools. These tools leverage the component infrastructure of the Cactus Framework in a novel way to enable runtime steering, monitoring, and interactive control of a simulation. Simulation data can be observed graphically, or by inspecting values of variables. When GPUs are available, images can be generated using volume ray casting on the live data. In response to observed error conditions or automatic triggers, users can pause the simulation to modify or repair data, or change runtime parameters. In this paper we describe the design of our implementation of these features and illustrate their value with three use cases.
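The steering pattern the paper describes can be sketched schematically (this is not Alpaca's API): between evolution steps the runtime evaluates triggers and, when an error condition is observed, repairs data or changes a parameter before resuming.

```cpp
// Schematic sketch of runtime steering (not Alpaca's API): check an
// automatic trigger between steps; on an error condition, repair the
// data and adjust a steerable parameter, then resume.
#include <cmath>
#include <cstdio>

int main() {
    double dt = 0.5;  // steerable runtime parameter (value assumed)
    double u = 1.0;
    for (int step = 0; step < 40; ++step) {
        u += dt * u;  // stand-in for one evolution step
        if (!std::isfinite(u) || u > 1.0e3) {  // automatic trigger condition
            std::printf("step %d: trigger fired (u = %g), repairing and halving dt\n",
                        step, u);
            u = 1.0;    // "repair" the data
            dt *= 0.5;  // steer the parameter, then resume
        }
    }
    std::printf("final u = %g, dt = %g\n", u, dt);
}
```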


Scientific Programming | 2016

A New Parallel Method for Binary Black Hole Simulations

Quan Yang; Zhihui Du; Zhoujian Cao; Jian Tao; David A. Bader

Simulating binary black hole (BBH) systems is a computationally intensive problem that can lead to great scientific discovery. Exploring more parallelism to take advantage of the large number of computing resources of modern supercomputers is the key to achieving high performance for BBH simulations. In this paper, we propose a scalable Mesh-based Parallel Method (MPM) that exploits both inter- and intra-mesh parallelism to improve the performance of BBH simulation. At the same time, we leverage GPUs to accelerate the computation. Different kinds of performance tests were conducted on Blue Waters. Compared with the existing method, our MPM improves the speedup relative to the normalized speed of 32 MPI processes from 5x to 8x; for the GPU-accelerated version, it improves the speedup from 12x to 28x. Experimental results also show that when only CPU computing resources, or only limited GPU computing resources, are available, our MPM can employ two special scheduling mechanisms to achieve better performance. Furthermore, our scalable GPU-accelerated MPM achieves almost ideal weak scaling up to 2048 GPU computing nodes, which enables our software to handle even larger BBH simulations efficiently.
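The two parallelism levels the MPM exploits can be sketched as follows; the mesh and process counts are made up for illustration, and the real method additionally handles GPU scheduling and load balancing.

```cpp
// Illustrative sketch of two-level mesh parallelism: whole meshes are
// distributed across groups of processes (inter-mesh), and each mesh is
// split among the ranks of its group (intra-mesh). Counts are made up.
#include <cstdio>

int main() {
    const int meshes = 6;   // refinement meshes in the BBH grid hierarchy (assumed)
    const int procs  = 12;  // MPI processes (assumed)
    const int perMesh = procs / meshes;  // intra-mesh group size

    for (int m = 0; m < meshes; ++m) {
        int first = m * perMesh;
        std::printf("mesh %d -> ranks %d..%d (each rank owns 1/%d of the mesh)\n",
                    m, first, first + perMesh - 1, perMesh);
    }
}
```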


Archive | 2014

Simulation Management Systems Developed by the Northern Gulf Coastal Hazards Collaboratory (NG-CHC): An Overview of Cyberinfrastructure to Support the Coastal Modeling Community in the Gulf of Mexico

Robert R. Twilley; Steve R. Brandt; Darlene Breaux; John Cartwright; James B. Chen; Greg Easson; Patrick J. Fitzpatrick; Kenneth J. Fridley; Sara J. Graves; Sandra L. Harper; Carola Kaiser; Alexander Maestre; Manil Maskey; William H. McAnally; John A. McCorquodale; Ehab A. Meselhe; Tina Miller-Way; Kyeong Park; João Pereira; Thomas Richardson; Jian Tao; Amelia K. Ward; Jerry D. Wiggert; Derek G. Williamson

Given the significance of the natural and built assets of the Gulf of Mexico region, the three states of Alabama, Louisiana, and Mississippi leveraged their unique partnerships, proximity, and significant prior investments in cyberinfrastructure (CI) to develop the Northern Gulf Coastal Hazards Collaboratory (NG-CHC). This collaboratory was established to catalyze collaborative research via enhanced CI to reduce the region's vulnerability to natural and human disasters by facilitating high-performance modeling to test hypotheses focused on engineering design, coastal system response, and risk management of coastal hazards. The objective of the NG-CHC is to advance research and inspire collaboration through highly available, innovation-enabling CI, with a particular focus on geosciences and engineering from the watershed to the coast. An integrated CI capable of simulating all relevant interacting processes is needed to implement a system that captures the dynamic nature of coastal surface processes. The NG-CHC has implemented CI to locate appropriate data and computational resources, create the workflows associated with different simulation demands, and provide visualization tools for analysis of results. Three simulation management systems, SIMULOCEAN, SULIS, and ASGS, were implemented, each with a defined suite of hypotheses and institutional participants to run collaboratory experiments. The NG-CHC focused on developing suites of CI tools centered on handling the functional needs of each simulation management system in a collaborative environment. The NG-CHC also developed curriculum units, computer games, and simulations to extend the knowledge of coastal hazards to students from middle school to college. Education and outreach activities were developed to increase public understanding and support for sustainable coastal practices. The elements of the CI toolbox within the NG-CHC describe generic tools needed to promote a 'collaborative modeling environment' in other coastal systems.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2009

Benchmarking parallel I/O performance for a large scale scientific application on the TeraGrid

Frank Löffler; Jian Tao; Gabrielle Allen

This paper reports on experiences benchmarking I/O performance with a large scale scientific application on leading computational facilities of the NSF TeraGrid network. Instead of focusing only on the raw file I/O bandwidth provided by different machine architectures, we test the I/O performance and scalability of the computational tools and libraries used in current production simulations as a whole, with a focus mostly on bulk transfers. We find that the I/O performance of our production code scales very well but is at some point limited by the I/O system itself. This limitation occurs at a low percentage of the computational size of the machines, which shows that, at least for the application used in this paper, the I/O system can be an important limiting factor in scaling up to the full size of the machine.
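A minimal sketch of the bulk-transfer measurement idea: time one large contiguous write and report bandwidth. The real benchmarks exercise the production code's I/O layer in parallel across many nodes; the file size and file name here are assumptions.

```cpp
// Minimal single-node sketch of a bulk-write bandwidth measurement.
// File size and name are assumptions for illustration only.
#include <chrono>
#include <cstdio>
#include <vector>

int main() {
    const std::size_t bytes = std::size_t(256) * 1024 * 1024;  // 256 MiB (assumed)
    std::vector<char> buf(bytes, 42);

    auto t0 = std::chrono::steady_clock::now();
    if (FILE* f = std::fopen("io_bench.dat", "wb")) {
        std::fwrite(buf.data(), 1, buf.size(), f);  // one bulk transfer
        std::fclose(f);
    }
    auto t1 = std::chrono::steady_clock::now();
    double s = std::chrono::duration<double>(t1 - t0).count();
    std::printf("wrote %zu MiB in %.2f s (%.1f MiB/s)\n",
                bytes >> 20, s, (bytes >> 20) / s);
}
```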

Collaboration


Dive into Jian Tao's collaboration.

Top Co-Authors

Frank Löffler (Louisiana State University)
Steven R. Brandt (Louisiana State University)
Hui-Min Zhang (University of Washington)
Wai-Mo Suen (Washington University in St. Louis)
Marek Blazewicz (Poznań University of Technology)
Peter Diener (Louisiana State University)
Oleg Korobkin (Louisiana State University)
Michal Kierzynka (Poznań University of Technology)