Xuan Shi | Researchain

Publication

Featured research published by Xuan Shi.

GIScience & Remote Sensing | 2013

Kriging interpolation over heterogeneous computer architectures and systems

Xuan Shi; Fei Ye

Heterogeneous computer architectures are the emerging and future trend in computer engineering, but they pose major challenges in algorithm redesign and software re-engineering. This article reviews the evolutionary trend in hardware advancement and discusses how to utilize multiple modern graphics processing units (GPUs) and central processing units (CPUs) to accelerate geospatial computation. In particular, we deployed the supercomputing power of Keeneland to demonstrate the significant performance improvement in spatial interpolation using Kriging, which is computationally intensive over large data. It is concluded that heterogeneous computer systems show a clear acceleration when a larger scale of spatial data is processed.
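To make the cost of Kriging concrete, here is a minimal sketch of ordinary Kriging at a single target location. The abstract does not say which Kriging variant or variogram model the authors used; the linear variogram, the function names, and the sample data are all assumptions for illustration. Each target location needs its own linear solve, and those solves are independent of one another, which is what makes the method a natural candidate for GPU/CPU parallelism.

```python
import math

def variogram(h, slope=1.0):
    """Linear variogram gamma(h) = slope * h (an assumed, illustrative model)."""
    return slope * h

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ordinary_kriging(points, values, target):
    """Estimate the value at `target` from sampled `points` and `values`.

    Builds the ordinary Kriging system: variogram values between samples,
    plus one Lagrange row/column that forces the weights to sum to 1.
    """
    n = len(points)
    a = [[variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]  # drop the Lagrange multiplier
    estimate = sum(w * v for w, v in zip(weights, values))
    return estimate, weights

# Hypothetical samples at the corners of a unit square.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
observed = [1.0, 2.0, 3.0, 4.0]
center_estimate, center_weights = ordinary_kriging(samples, observed, (0.5, 0.5))
```

In a grid interpolation, `ordinary_kriging` would be called once per output cell, so the work grows with both the number of samples and the number of cells.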

International Journal of Geographical Information Science | 2016

A hybrid parallel cellular automata model for urban growth simulation over GPU/CPU heterogeneous architectures

Qingfeng Guan; Xuan Shi; Miaoqing Huang; Chenggang Lai

As an important spatiotemporal simulation approach and an effective tool for developing and examining spatial optimization strategies (e.g., land allocation and planning), geospatial cellular automata (CA) models often require multiple data layers and consist of complicated algorithms in order to deal with the complex dynamic processes of interest and the intricate relationships and interactions between the processes and their driving factors. Also, massive amounts of data may be used in CA simulations as high-resolution geospatial and non-spatial data are widely available. Thus, geospatial CA models can be both computationally intensive and data intensive, demanding long computing times and vast memory space. Based on a hybrid parallelism that combines processes with discrete memory and threads with global memory, we developed a parallel geospatial CA model for urban growth simulation over the heterogeneous computer architecture composed of multiple central processing units (CPUs) and graphics processing units (GPUs). Experiments with the datasets of California showed that the overall computing time for a 50-year simulation dropped from 13,647 seconds on a single CPU to 32 seconds using 64 GPU/CPU nodes. We conclude that the hybrid parallelism of geospatial CA over the emerging heterogeneous computer architectures provides scalable solutions to enabling complex simulations and optimizations with massive amounts of data that were previously infeasible, sometimes impossible, using individual computing approaches.
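The timings reported in this abstract can be turned into a speedup figure with simple arithmetic. The sketch below uses only the two times and the node count given above; the function names are hypothetical.

```python
def speedup(t_baseline, t_parallel):
    """Ratio of baseline run time to parallel run time."""
    return t_baseline / t_parallel

def speedup_per_node(t_baseline, t_parallel, nodes):
    """Speedup divided by the number of nodes used."""
    return speedup(t_baseline, t_parallel) / nodes

# Figures from the abstract: 13,647 s on a single CPU, 32 s on 64 GPU/CPU nodes.
total = speedup(13647, 32)                  # about 426x
per_node = speedup_per_node(13647, 32, 64)  # about 6.7x per node
```

The per-node ratio is well above 1 because the baseline is a single CPU while each of the 64 nodes combines GPU and CPU resources, so it should not be read as a conventional parallel-efficiency figure.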

GIScience & Remote Sensing | 2014

Unsupervised image classification over supercomputers Kraken, Keeneland and Beacon

Xuan Shi; Miaoqing Huang; Haihang You; Chenggang Lai; Zhong Chen

The iterative self-organizing data analysis technique algorithm (ISODATA) was implemented over supercomputers Kraken, Keeneland and Beacon to explore scalable and high-performance solutions for image processing and analytics using emerging advanced computer architectures. When 10 classes are extracted from one 18-GB image tile, the calculation can be reduced from several hours to no more than 90 seconds when 100 CPU, GPU or MIC processors are utilized. High-performance scalability tests were further implemented over Kraken using 10,800 processors to extract various numbers of classes from 12 image tiles totalling 216 gigabytes. As the first geospatial computations over GPU clusters (Keeneland) and MIC clusters (Beacon), the success of this research illustrates a solid foundation for exploring the potential of scalable and high-performance geospatial computation for the next-generation cyber-enabled image analytics.
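For readers unfamiliar with ISODATA, the following is a simplified sketch of its core loop on one-dimensional pixel values: assign every pixel to the nearest class center, then recompute each center as the mean of its members. It deliberately omits ISODATA's cluster splitting and merging steps, so it behaves like plain k-means, and it is not the authors' implementation. All names and sample values are hypothetical.

```python
def assign(pixels, centers):
    """Label each pixel with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            for p in pixels]

def update(pixels, labels, centers):
    """Recompute each center as the mean of its pixels; keep empty classes unchanged."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, label in zip(pixels, labels) if label == k]
        new_centers.append(sum(members) / len(members) if members else old)
    return new_centers

def classify(pixels, centers, max_iter=20, tol=1e-6):
    """Iterate assignment and update until the centers stop moving."""
    for _ in range(max_iter):
        labels = assign(pixels, centers)
        new_centers = update(pixels, labels, centers)
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, assign(pixels, centers)

# Hypothetical pixel intensities forming two obvious groups.
final_centers, final_labels = classify([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0])
```

The assignment step treats every pixel independently, which is the part that distributes naturally across many processors; only the per-class sums and counts need to be combined between iterations.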

High Performance Computing and Communications | 2013

Accelerating Geospatial Applications on Hybrid Architectures

Chenggang Lai; Miaoqing Huang; Xuan Shi; Haihang You

Accelerators have become critical in the process of developing supercomputers with exascale computing capability. In this work, we examine the potential of two of the latest acceleration technologies, Nvidia K20 Kepler GPU and Intel Many Integrated Core (MIC) Architecture, for accelerating geospatial applications. We first apply a set of benchmarks under 3 different configurations, i.e., MPI+CPU, MPI+GPU, and MPI+MIC. This set of benchmarks includes an embarrassingly parallel application, a loosely communicating application, and an intensely communicating application. It is found that the straightforward MPI implementation on MIC cores can achieve the same amount of performance speedup as the hybrid MPI+GPU implementation when the same number of processors is used. Further, we demonstrate the potential of hardware accelerators for advancing scientific research using an urban sprawl simulation application. The parallel implementation of the urban sprawl simulation using 16 Tesla M2090 GPUs can realize a 155× speedup compared with the single-node implementation, while achieving good strong scalability.

International Journal of High Performance Computing Applications | 2017

Study of parallel programming models on computer clusters with Intel MIC coprocessors

Miaoqing Huang; Chenggang Lai; Xuan Shi; Zhijun Hao; Haihang You

Coprocessors based on the Intel Many Integrated Core (MIC) Architecture have been adopted in many high-performance computer clusters. Typical parallel programming models, such as MPI and OpenMP, are supported on MIC processors to achieve the parallelism. In this work, we conduct a detailed study on the performance and scalability of the MIC processors under different programming models using the Beacon computer cluster. Our findings are as follows. (1) The native MPI programming model on the MIC processors is typically better than the offload programming model, which offloads the workload to MIC cores using OpenMP. (2) On top of the native MPI programming model, multithreading inside each MPI process can further improve the performance for parallel applications on computer clusters with MIC coprocessors. (3) Given a fixed number of MPI processes, it is a good strategy to schedule these MPI processes to as few MIC processors as possible to reduce the cross-processor communication overhead. (4) The hybrid MPI programming model, in which data processing is distributed to both MIC cores and CPU cores, can outperform the native MPI programming model.

International Parallel and Distributed Processing Symposium | 2014

Comparison of Parallel Programming Models on Intel MIC Computer Cluster

Chenggang Lai; Zhijun Hao; Miaoqing Huang; Xuan Shi; Haihang You

Coprocessors based on the Intel Many Integrated Core (MIC) Architecture have been adopted in many high-performance computer clusters. Typical parallel programming models, such as MPI and OpenMP, are supported on MIC processors to achieve the parallelism. In this work, we conduct a detailed study on the performance and scalability of the MIC processors under different programming models using the Beacon computer cluster. Our findings are as follows. (1) The native MPI programming model on the MIC processors is typically better than the offload programming model, which offloads the workload to MIC cores using OpenMP, on the Beacon computer cluster. (2) On top of the native MPI programming model, multithreading inside each MPI process can further improve the performance for parallel applications on computer clusters with MIC coprocessors. (3) Given a fixed number of MPI processes, it is a good strategy to schedule these MPI processes to as few MIC processors as possible to reduce the cross-processor communication overhead. (4) The hybrid MPI programming model, in which data processing is distributed to both MIC cores and CPU cores, can outperform the native MPI programming model.

Big Earth Data | 2018

Efficient utilization of multi-core processors and many-core co-processors on supercomputer beacon for scalable geocomputation and geo-simulation over big earth data

Chenggang Lai; Xuan Shi; Miaoqing Huang

Digital earth science data originating from sensors aboard satellites and platforms such as airplanes, UAVs, and mobile systems are increasingly available at high spectral, spatial, vertical, and temporal resolution. When such big earth science data are processed and analyzed via geocomputation solutions, or utilized in geospatial simulation or modeling, considerable computing power and resources are necessary to complete the tasks. While classic computer clusters equipped with central processing units (CPUs) and the new computing resources of graphics processing units (GPUs) have been deployed in handling big earth data, coprocessors based on Intel’s Many Integrated Core (MIC) Architecture are emerging and have been adopted in many high-performance computer clusters. This paper introduces how to efficiently utilize Intel’s Xeon Phi multicore processors and MIC coprocessors for scalable geocomputation and geo-simulation by implementing two algorithms, Maximum Likelihood Classification (MLC) and Cellular Automata (CA), on supercomputer Beacon, a cluster of MICs. Four different programming models are examined, including (1) the native model, (2) the offload model, (3) the symmetric model, and (4) the hybrid-offload model. It can be concluded that while different kinds of parallel programming models can enable big data handling efficiently, the hybrid-offload model can achieve the best performance and scalability. These different programming models can be applied and extended to other types of geocomputation to handle big earth data.

SIGSPATIAL Special | 2017

Accelerating the calculation of minimum set of viewpoints for maximum coverage over digital elevation model data by hybrid computer architecture and systems

Chenggang Lai; Miaoqing Huang; Xuan Shi

This paper introduces how to accelerate the calculation of the minimum set of viewpoints for the maximum coverage over digital elevation model data using Intel's Xeon Phi and a computer cluster equipped with Intel's Many-Integrated-Core (MIC) coprocessors. This data- and computation-intensive process consists of a series of geocomputation tasks, including 1) the automatic generation of control viewpoints through map algebra calculation and hydrological modeling approaches; 2) the creation of the joint viewshed derived from the viewshed of all viewpoints to establish the maximum viewshed coverage of the given digital elevation model (DEM) data; and 3) the identification of a minimum set of viewpoints that cover the maximum terrain area of the joint viewshed. The parallel implementation on the hybrid computer cluster was able to achieve more than 100× performance speedup in comparison to the sequential implementation. The outcome of the computation has broad societal impacts since the research questions and solutions can be applied to real-world applications and decision-making practice.
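Tasks 2 and 3 above amount to a set-cover problem: build the joint viewshed as the union of all individual viewsheds, then pick a small set of viewpoints that still covers it. The abstract does not state which selection algorithm the authors used. The sketch below substitutes the standard greedy set-cover heuristic, which gives an approximation rather than a guaranteed minimum. The function name, viewpoint ids, and cell ids are hypothetical.

```python
def greedy_viewpoints(viewsheds):
    """Pick viewpoints greedily until the joint viewshed is fully covered.

    viewsheds maps a viewpoint id to the set of terrain cell ids visible from it.
    Returns the chosen viewpoint ids (in selection order) and the joint viewshed.
    """
    joint = set().union(*viewsheds.values())  # task 2: joint viewshed
    uncovered = set(joint)
    chosen = []
    while uncovered:
        # Take the viewpoint that sees the most still-uncovered cells;
        # ties go to the first id in sorted order.
        best = max(sorted(viewsheds), key=lambda v: len(viewsheds[v] & uncovered))
        chosen.append(best)
        uncovered -= viewsheds[best]
    return chosen, joint

# Hypothetical viewsheds over six terrain cells.
example = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6}, "D": {1, 6}}
selected, joint_viewshed = greedy_viewpoints(example)
```

In this example two viewpoints are enough to cover everything that all four can see together. Computing each individual viewshed is the expensive part on real DEM data, and those computations are independent per viewpoint.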

Transactions in GIS | 2014