Publication


Featured research published by Huming Zhu.


International Parallel and Distributed Processing Symposium | 2012

Parallel Multi-Temporal Remote Sensing Image Change Detection on GPU

Huming Zhu; Yu Cao; Zhiqiang Zhou; Maoguo Gong

Change detection is an important technique in damage assessment. As the number of remote sensing images and the complexity of algorithms rise, the demand for processing power increases. In this paper, we propose PLog-FLICM, a parallel algorithm for change detection. It is implemented with the AMD Accelerated Parallel Processing (APP) SDK v2, which is based on the Open Computing Language (OpenCL). The parallel characteristics and implementation details of the proposed PLog-FLICM algorithm are presented. Experiments on several Synthetic Aperture Radar (SAR) images demonstrate that the proposed algorithm outperforms other algorithms and that the parallel design greatly reduces the computational time of change detection. It achieves speedups of between 63 and 145 times on an AMD Radeon HD 6870 Graphics Processing Unit (GPU).


IEEE International Conference on High Performance Computing Data and Analytics | 2013

Parallel unsupervised Synthetic Aperture Radar image change detection on a graphics processing unit

Huming Zhu; Yu Cao; Zhiqiang Zhou; Maoguo Gong; Licheng Jiao

Change detection is now routinely applied in many application domains, such as damage assessment, environmental monitoring and agricultural surveys. As the number of remote sensing images and the complexity of algorithms rise, the demand for processing power increases. In this paper, we propose PLog-FLICM, a parallel algorithm for change detection, which consists of two steps: (1) generate the difference image with the log-ratio operator; (2) detect changes in the difference image with a modified fuzzy c-means clustering algorithm. PLog-FLICM is implemented with the AMD Accelerated Parallel Processing SDK, which is based on the Open Computing Language (OpenCL). The parallel characteristics and implementation details of the proposed PLog-FLICM algorithm are presented. Experiments on several Synthetic Aperture Radar images demonstrate that the proposed algorithm outperforms other algorithms and that the parallel design greatly reduces the computational time of change detection. Furthermore, we investigate the performance portability of PLog-FLICM across different central processing unit and graphics processing unit platforms. Experimental results show that it also achieves good parallel performance on these platforms.
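The two steps named in this abstract can be illustrated with a minimal serial sketch. It is not the authors' PLog-FLICM: the clustering step below is plain two-cluster fuzzy c-means on scalar values, without the local-information term that FLICM adds, and the function names, the `eps` offset, and the fuzzifier `m = 2` are assumptions made for illustration.

```python
import math

def log_ratio(img1, img2, eps=1.0):
    """Step 1: per-pixel |log((b + eps) / (a + eps))| difference image.
    eps is an assumed offset that avoids log(0) on dark pixels."""
    return [[abs(math.log((b + eps) / (a + eps))) for a, b in zip(r1, r2)]
            for r1, r2 in zip(img1, img2)]

def fuzzy_c_means_1d(values, m=2.0, iters=50):
    """Step 2 (simplified): standard two-cluster fuzzy c-means on scalars."""
    centers = [min(values), max(values)]
    memberships = []
    for _ in range(iters):
        memberships = []
        for v in values:
            # Membership of v in cluster k is proportional to d_k^(-2/(m-1)).
            inv = [(abs(v - c) + 1e-12) ** (-2.0 / (m - 1.0)) for c in centers]
            total = sum(inv)
            memberships.append([x / total for x in inv])
        # Each center is the membership-weighted mean of all values.
        centers = [
            sum((u[k] ** m) * v for u, v in zip(memberships, values))
            / sum(u[k] ** m for u in memberships)
            for k in range(2)
        ]
    return centers, memberships

def change_map(img1, img2):
    """Return a binary map: 1 = changed, 0 = unchanged."""
    diff = log_ratio(img1, img2)
    flat = [v for row in diff for v in row]
    centers, memberships = fuzzy_c_means_1d(flat)
    changed = 0 if centers[0] > centers[1] else 1  # larger center = changed
    width = len(diff[0])
    labels = [1 if u[changed] > 0.5 else 0 for u in memberships]
    return [labels[i:i + width] for i in range(0, len(labels), width)]
```

Both the log-ratio and the membership update are computed independently per pixel, which is the kind of data parallelism a GPU kernel can exploit.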


Concurrency and Computation: Practice and Experience | 2017

An OpenCL‐accelerated parallel immunodominance clone selection algorithm for feature selection

Huming Zhu; Yanfei Wu; Pei Li; Peng Zhang; Zhe Ji; Maoguo Gong

The immunodominance clone selection algorithm (ICSA) is a robust and effective metaheuristic method for the feature selection problem. However, ICSA is usually slow in finding the optimal solution. In this paper, we propose a parallel immunodominance clone selection algorithm (PICSA) on the Graphics Processing Unit (GPU) to accelerate ICSA for feature selection. The parallel program considerably accelerates the feature selection operator. The immunodominance operator, which efficiently connects local and global information, lets the algorithm escape local optima easily and reach the global optimum. Compared with other parallel languages, the Open Computing Language (OpenCL) has advantages in both efficiency and portability. Therefore, we use OpenCL to implement the algorithm on Intel Many Integrated Core (MIC) and different GPU platforms. Experimental results on high-dimensional UCI machine learning and image texture datasets demonstrate that PICSA achieves a good acceleration ratio while maintaining classification accuracy similar to that of the serial ICSA program. In addition, the OpenCL-based implementation of PICSA shows good portability across MIC and different GPU platforms.


Journal of Computational Science | 2016

A parallel Non-Local means denoising algorithm implementation with OpenMP and OpenCL on Intel Xeon Phi Coprocessor

Huming Zhu; Yanfei Wu; Pei Li; Duo Wang; Wei Shi; Peng Zhang; Licheng Jiao

The Non-Local Means (NLM) denoising algorithm calculates a similarity weight between the pixel being denoised and the pixels in its search area by means of a similarity function. For denoising textures and edge regions, NLM performs better than many other existing denoising algorithms because it uses the redundant information in images. However, NLM is slow because of its large computational cost. Recently, the Intel Xeon Phi coprocessor (based on the Intel Many Integrated Core architecture, MIC) has shown a large advantage in accelerating computation. Therefore, we design OpenMP and OpenCL parallelization strategies based on the serial NLM algorithm for the MIC architecture, and conduct experiments on CPU, GPU, and MIC with images of different sizes. The experiments suggest that the OpenMP-based NLM algorithm performs better on Xeon Phi 7120 than on Xeon E5 2692 when the image size is greater than or equal to 1024 × 1024, that the OpenCL-based NLM algorithm performs better on Xeon Phi 7120 than on NVIDIA Kepler K20M GPU, and that the OpenCL-based NLM algorithm performs slightly better than the OpenMP-based NLM algorithm when both are implemented on Intel Xeon Phi 7120.
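For readers unfamiliar with NLM, the weighting scheme can be sketched serially as below. This is a textbook-style illustration, not the paper's implementation: the Gaussian weight `exp(-d / h^2)` on the mean squared patch difference, the clamped border handling, and the parameter names `patch`, `search`, and `h` are assumptions.

```python
import math

def nlm_denoise(img, patch=1, search=2, h=10.0):
    """Denoise a 2D list of numbers with a basic Non-Local Means filter.

    patch  : patch radius (patch side is 2 * patch + 1)
    search : search-window radius around each pixel
    h      : filtering strength; larger h smooths more
    """
    rows, cols = len(img), len(img[0])

    def px(r, c):
        # Clamp coordinates so patches near the border stay inside the image.
        return img[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    def patch_dist(r1, c1, r2, c2):
        # Mean squared difference between the patches centered on two pixels.
        total = 0.0
        for dr in range(-patch, patch + 1):
            for dc in range(-patch, patch + 1):
                diff = px(r1 + dr, c1 + dc) - px(r2 + dr, c2 + dc)
                total += diff * diff
        return total / (2 * patch + 1) ** 2

    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):          # every output pixel is independent,
        for c in range(cols):      # so these two loops can run in parallel
            weight_sum = 0.0
            acc = 0.0
            for sr in range(max(r - search, 0), min(r + search, rows - 1) + 1):
                for sc in range(max(c - search, 0), min(c + search, cols - 1) + 1):
                    w = math.exp(-patch_dist(r, c, sr, sc) / (h * h))
                    weight_sum += w
                    acc += w * img[sr][sc]
            out[r][c] = acc / weight_sum
    return out
```

The cost per pixel grows with the product of the search-window area and the patch area, which is why the algorithm is a natural target for OpenMP or OpenCL parallelization over output pixels.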


International Geoscience and Remote Sensing Symposium | 2015

Distributed SAR image change detection based on Spark

Huming Zhu; Yuqi Guo; Mingwei Niu; Guodong Yang; Licheng Jiao

SAR image change detection is a fundamental process in many applications such as damage assessment, natural disaster monitoring and urban planning. As the scale of images and the complexity of algorithms rise, sequential methods have become increasingly inefficient. In this paper, we propose a distributed parallel image change detection method based on Spark, an in-memory cluster computing framework that provides built-in support for iterative jobs. The proposed method can make full use of the power of a cluster or a set of commodity computers to process large-scale SAR images. Unlike traditional image change detection, a distributed parallel kernel fuzzy c-means (KFCM) clustering algorithm, integrated with Spark, is used to partition the change map into changed and unchanged areas. Our experimental results on several large-scale SAR images show good effectiveness and acceleration. Compared with Hadoop-based KFCM, the speedup reaches a maximum of 18.9.


Multimedia Tools and Applications | 2018

Parallel implementations of frame rate up-conversion algorithm using OpenCL on heterogeneous computing devices

Huming Zhu; Duo Wang; Peng Zhang; Zheng Luo; Licheng Jiao; Hong Han

As a video post-processing technology, frame rate up-conversion (FRUC) converts a low frame rate video into a higher one by inserting intermediate frames between adjacent original frames. Because the computational cost grows rapidly as video resolution and frame rate increase, accelerating FRUC by parallel computing is an appropriate approach. In this paper, an effective parallel FRUC algorithm is proposed, which consists mainly of two parts: a parallel motion estimation algorithm (the three-dimensional recursive search algorithm, 3DRS) and a parallel motion compensation algorithm. We design macro-block-level and candidate-motion-vector-level parallelism strategies of different granularity in the motion estimation module, and pixel-level parallelism in the motion compensation module. The proposed parallel FRUC algorithm has been tested on different hardware platforms. The results show that the method achieves significant speedups of up to 96× for 1920 × 1080 video and 254× for 3840 × 2160 video when compared with a sequential implementation on CPU. Moreover, the OpenCL program of the parallel FRUC algorithm shows good portability on various GPU platforms.
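The structure described here (pick a motion vector per block from a set of candidates, then compensate pixel by pixel) can be sketched as follows. This is a simplified serial illustration, not 3DRS: the candidates are a fixed hypothetical list of half-vectors rather than recursively propagated predictions, matching is bilateral (the previous frame is sampled at `p - h` and the next at `p + h`), and all names are assumptions.

```python
def interpolate_frame(prev, curr, block=2,
                      candidates=((0, 0), (0, 1), (0, -1), (1, 0), (-1, 0))):
    """Build the frame halfway between prev and curr (2D lists of numbers).

    Each candidate is a half motion vector (hy, hx): content at p - h in
    prev is assumed to sit at p in the new frame and at p + h in curr.
    """
    rows, cols = len(prev), len(prev[0])

    def at(frame, r, c):
        # Clamp coordinates so lookups past the border reuse edge pixels.
        return frame[max(0, min(r, rows - 1))][max(0, min(c, cols - 1))]

    out = [[0.0] * cols for _ in range(rows)]
    for br in range(0, rows, block):        # blocks are independent:
        for bc in range(0, cols, block):    # macro-block-level parallelism
            cells = [(r, c)
                     for r in range(br, min(br + block, rows))
                     for c in range(bc, min(bc + block, cols))]

            def cost(h):
                # Sum of absolute differences along the candidate trajectory.
                return sum(abs(at(prev, r - h[0], c - h[1])
                               - at(curr, r + h[0], c + h[1]))
                           for r, c in cells)

            # Candidate costs are independent of each other as well.
            hy, hx = min(candidates, key=cost)
            for r, c in cells:              # pixel-level compensation
                out[r][c] = (at(prev, r - hy, c - hx)
                             + at(curr, r + hy, c + hx)) / 2.0
    return out
```

The three nested levels (blocks, candidates, pixels) correspond to the granularities of parallelism the abstract mentions, although how the paper maps them onto OpenCL work-items is not shown here.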


Architectural Support for Programming Languages and Operating Systems | 2017

Distributed SAR Image Change Detection with OpenCL-Enabled Spark

Huming Zhu; Jianing Kou; Linyan Qiu; Yuqi Guo; Mingwei Niu; Maoguo Gong; Licheng Jiao

Distributed processing frameworks have been widely used in the remote sensing field. Spark, a popular distributed computing framework, has been used to process big remote sensing data. However, it is inefficient because the application is not only data intensive but also computation intensive. For example, in Synthetic Aperture Radar (SAR) image change detection, clustering analysis can consume a lot of computing time and memory when dealing with big remote sensing data. Coprocessors (GPU, MIC, etc.) offer high compute power and can handle computation-intensive tasks. In this paper, we propose an OpenCL-enabled Spark framework to accelerate the Kernel Fuzzy C-Means (KFCM) algorithm for SAR image change detection. The computation-intensive operations of KFCM are transferred to the coprocessors of the cluster through the proposed OpenCL-enabled Spark framework. The experimental results on real SAR images indicate that the implementation on OpenCL-enabled Spark is efficient and scalable.


International Geoscience and Remote Sensing Symposium | 2016

SAR image change detection based on Spark-FLICM algorithm

Huming Zhu; Yuqi Guo; Mingwei Niu; Linyan Qiu; Licheng Jiao; Maoguo Gong

In this paper, we propose a Spark-based fuzzy local information C-Means (FLICM) algorithm for synthetic aperture radar (SAR) image change detection. As the volume and resolution of SAR images increase, current serial clustering algorithms are no longer suitable for handling big data, and scalable solutions are indispensable. The proposed algorithm, built on the Spark framework, implements FLICM according to the MapReduce programming model. The membership degrees are computed in parallel during the Map phase, where the local information is obtained from broadcast arrays, and the cluster centers are computed during the Reduce phase. With a cluster composed of five nodes, experiments on real SAR images show good performance.
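The Map/Reduce split described above can be illustrated without Spark. The sketch below is an assumption-laden simplification: it uses plain fuzzy c-means on scalar pixel values (the FLICM local-information term and the broadcast neighborhood arrays are omitted), Python lists stand in for data partitions, and `functools.reduce` stands in for the cluster-wide Reduce phase.

```python
from functools import reduce

def map_partition(pixels, centers, m=2.0):
    """Map: compute memberships for one partition and emit, per cluster,
    the partial sums sum(u^m * x) and sum(u^m)."""
    num = [0.0] * len(centers)
    den = [0.0] * len(centers)
    for x in pixels:
        inv = [(abs(x - c) + 1e-12) ** (-2.0 / (m - 1.0)) for c in centers]
        total = sum(inv)
        for k, v in enumerate(inv):
            um = (v / total) ** m
            num[k] += um * x
            den[k] += um
    return num, den

def combine(a, b):
    """Reduce: add the partial sums produced by two partitions."""
    return ([x + y for x, y in zip(a[0], b[0])],
            [x + y for x, y in zip(a[1], b[1])])

def update_centers(partitions, centers, m=2.0):
    """One clustering iteration: map every partition, reduce, divide."""
    num, den = reduce(combine,
                      (map_partition(p, centers, m) for p in partitions))
    return [n / d for n, d in zip(num, den)]
```

Because only per-cluster sums cross partition boundaries, the amount of data shuffled per iteration is independent of the image size; the current centers are the only state every partition needs at the start of an iteration.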


International Geoscience and Remote Sensing Symposium | 2016

Parallel implementation of the FLICM algorithm for SAR image change detection on Intel MIC

Huming Zhu; Le Lu; Yucong Fan; Pei Li; Qingyu Zhang; Licheng Jiao

Image change detection has a wide range of applications in various fields, such as damage assessment, environmental monitoring and agricultural surveys. As the number of remote sensing images and the complexity of algorithms rise, the demand for processing power increases. In this paper, we present a parallel FLICM algorithm for SAR image change detection on Intel MIC (Many Integrated Core), a new type of many-core coprocessor. The proposed algorithm is implemented in MIC offload mode using OpenMP (Open Multi-Processing). The parallel characteristics and implementation details of the proposed parallel FLICM algorithm are presented. Experimental results demonstrate that the optimized parallel algorithm greatly reduces the computational time of change detection. The speedup is up to 20× compared with the runtime of the serial algorithm on CPU.


International Conference on Parallel and Distributed Systems | 2016

Accelerating Learning to Rank via SVM with OpenCL and OpenMP on Heterogeneous Platforms

Huming Zhu; Zheng Luo; Yanfei Wu; Pei Li; Peng Zhang; Shuiping Gou; Licheng Jiao

The support vector machine (SVM) is a popular algorithm for learning to rank, but the training speed of SVM is a bottleneck on large data sets. Recently, heterogeneous computing platforms, such as the graphics processing unit (GPU) and Many Integrated Core (MIC), have shown a large advantage in the high performance computing domain. Open Computing Language (OpenCL) and Open Multi-Processing (OpenMP) are two popular parallel programming interfaces for different heterogeneous platforms. To resolve the speed problem of Rank SVM (RSVM), it is important to compare the performance of different parallel programming models on different heterogeneous platforms. We designed an OpenMP-based parallel learning to Rank SVM (PLRSVM) for multi-core CPU and MIC, and an OpenCL-based PLRSVM for multi-core CPU, GPU and MIC. The experimental results show how performance differs between the OpenMP-based program and the OpenCL-based program. The OpenCL-based program significantly speeds up the SVM training process and shows good portability on heterogeneous devices. The experiments also suggest that selecting a suitable programming model according to the hardware platform and the structure of the serial algorithm is an important step toward high performance in a parallel algorithm.
