Manzhu Yu | Researchain

Publication

Featured research published by Manzhu Yu.

PLOS ONE | 2015

Enabling big geoscience data analytics with a cloud-based, MapReduce-enabled and service-oriented workflow framework.

Zhenlong Li; Chaowei Yang; Baoxuan Jin; Manzhu Yu; Kai Liu; Min Sun; Matthew Zhan

Geoscience observations and model simulations are generating vast amounts of multi-dimensional data. Effectively analyzing these data is essential for geoscience studies. However, the tasks are challenging for geoscientists because processing the massive amount of data is both computing and data intensive, in that data analytics requires complex procedures and multiple tools. To tackle these challenges, a scientific workflow framework is proposed for big geoscience data analytics. The framework leverages cloud computing, MapReduce, and Service Oriented Architecture (SOA). Specifically, HBase is adopted for storing and managing big geoscience data across distributed computers. A MapReduce-based algorithm framework is developed to support parallel processing of geoscience data. A service-oriented workflow architecture is built to support on-demand complex data analytics in the cloud environment. A proof-of-concept prototype tests the performance of the framework. Results show that this innovative framework significantly improves the efficiency of big geoscience data analytics by reducing data processing time and simplifying data analytical procedures for geoscientists.

PLOS ONE | 2014

A Service Brokering and Recommendation Mechanism for Better Selecting Cloud Services

Zhipeng Gui; Chaowei Yang; Jizhe Xia; Qunying Huang; Kai Liu; Zhenlong Li; Manzhu Yu; Min Sun; Nanyin Zhou; Baoxuan Jin

Cloud computing is becoming the new generation computing infrastructure, and many cloud vendors provide different types of cloud services. Choosing the best cloud services for specific applications is very challenging. Addressing this challenge requires balancing multiple factors, such as business demands, technologies, policies, and preferences, in addition to the computing requirements. This paper recommends a mechanism for selecting the best public cloud service at the levels of Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). A systematic framework and associated workflow include cloud service filtration, solution generation, evaluation, and selection of public cloud services. Specifically, we propose the following: a hierarchical information model for integrating heterogeneous cloud information from different providers and a corresponding cloud information collecting mechanism; a cloud service classification model for categorizing and filtering cloud services and an application requirement schema for providing rules for creating application-specific configuration solutions; and a preference-aware solution evaluation mode for evaluating and recommending solutions according to the preferences of application providers. To test the proposed framework and methodologies, a cloud service advisory tool prototype was developed, after which relevant experiments were conducted. The results show that the proposed system collects, updates, and records cloud information from multiple mainstream public cloud services in real time; generates feasible cloud configuration solutions according to user specifications and acceptable cost prediction; assesses solutions from multiple aspects (e.g., computing capability, potential cost, and Service Level Agreement, SLA); offers rational recommendations based on user preferences and practical cloud provisioning; and visually presents and compares solutions through an interactive web Graphical User Interface (GUI).

Archive | 2015

Contemporary Computing Technologies for Processing Big Spatiotemporal Data

Chaowei Yang; Min Sun; Kai Liu; Qunying Huang; Zhenlong Li; Zhipeng Gui; Yunfeng Jiang; Jizhe Xia; Manzhu Yu; Chen Xu; Peter Lostritto; Nanying Zhou

Geographic phenomena evolve in a four-dimensional spatiotemporal world. To capture geographic phenomena at different scales, large amounts of data (big data) are produced with specific spatiotemporal patterns. Phenomena evolution and the principles driving the evolution provide pathways for developing methodological solutions to process big spatiotemporal data. Based on experiences gained from several projects, such as climate studies and cloud computing, we introduce in this chapter modern computing technologies required for processing big data, including (1) sensor web, Earth observations, and model simulations for collecting and generating big data, (2) flexible and standards-based systems for managing big data for easy discovery and access, (3) multidimensional visual analytics for exploring and analyzing big spatiotemporal data, and (4) grid, cloud, and GPU computing for addressing the computing-intensive challenges. We discuss through exemplar projects how these cutting-edge computing technologies are utilized to handle big spatiotemporal data. We expect this chapter to set a computing research context for future big data handling at different spatiotemporal granules.

International Journal of Geographical Information Science | 2015

Forming a global monitoring mechanism and a spatiotemporal performance model for geospatial services

Jizhe Xia; Chaowei Phil Yang; Kai Liu; Zhenlong Li; Min Sun; Manzhu Yu

Geographic information service (GIService) has become popular in the last decade for developing applications that address global challenges. Performance is one of the most important criteria to help users select distributed online GIServices for developing geospatial applications, including those for natural hazards and emergency response. However, performance accuracy is limited by the single-location-based evaluation mechanism, while service performance is dynamic in space and time between end-users and services. We propose a spatiotemporal performance evaluation mechanism to improve the accuracy. Specifically, a cloud and volunteer computing mechanism is proposed to collect performance information of globally distributed GIServices. A global spatiotemporal performance model is designed to integrate spatiotemporal dynamics for better performance evaluation for users from different regions at different times. This model is tested to support GIService selection in global spatial data infrastructures (SDIs). The experiment confirms that the proposed model provides more accurate evaluations for global users and better supports geospatial resource utilization in SDIs than previous mechanisms. The methodology can be adopted to improve the services of other regional and global distributed operational systems.

Archive | 2013

Accelerating Geocomputation with Cloud Computing

Qunying Huang; Zhenlong Li; Jizhe Xia; Yunfeng Jiang; Chen Xu; Kai Liu; Manzhu Yu; Chaowei Yang

The scientific and engineering advancements in the twenty-first century pose computing-intensive challenges in managing Big Data, using complex algorithms to extract information and knowledge from Big Data, and simulating physical and social phenomena. Cloud computing is considered the next-generation computing platform, with the potential to address these computing challenges and redefine the possibilities of geoscience and digital Earth. This chapter introduces through examples how cloud computing can help accelerate geocomputation with: (1) easy and fast access to computing resources that can be available in seconds to minutes, (2) elastic computing resources to handle spike computing loads, (3) high-end computing capacity to address large-scale computing demands, and (4) distributed services and computing to handle distributed geoscience problems, data, and users.

International Journal of Geographical Information Science | 2017

A 3D multi-threshold, region-growing algorithm for identifying dust storm features from model simulations

Manzhu Yu; Chaowei Yang

Dust storms cause significant damage to health, property, and the environment worldwide every year. To help mitigate the damage, dust forecast models simulate and predict upcoming dust events, providing valuable information to scientists, decision makers, and the public. These simulation outputs are four-dimensional (i.e., latitude, longitude, elevation, and time) and represent spatially heterogeneous dust storm features and their evolution over space and time. This research investigates and proposes an automatic multi-threshold, region-growing-based algorithm to identify critical dust storm features from 3D dust storm simulations. A multi-threshold scheme is defined for the identification of dust storm features with different dust concentrations. Based on the multiple thresholds, dust storm features are iteratively identified by a region-growing algorithm that splits a clustered dust storm feature into multiple sub-features. The proposed approach is compared with three commonly used methods in image processing and thunderstorm identification. The proposed approach outperforms the other three methods in sensitivity and quantitative/qualitative accuracy. This research approach may also be slightly adjusted to identify critical 3D features from simulation outputs for other severe weather and geographical phenomena.
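As a rough illustration of the multi-threshold, region-growing idea described in this abstract, the following Python sketch may help. It is not the authors' implementation: the 6-connected neighbourhood, the breadth-first growth rule, the dictionary-based grid, and all function names (`connected_regions`, `grow_from_seeds`, `identify_features`) are assumptions made for this example only.

```python
from collections import deque

# Assumed 6-connected neighbourhood in a 3D grid indexed as (z, y, x).
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def connected_regions(cells):
    """Split a set of (z, y, x) cells into 6-connected regions."""
    remaining = set(cells)
    regions = []
    while remaining:
        seed = remaining.pop()
        region, queue = {seed}, deque([seed])
        while queue:
            z, y, x = queue.popleft()
            for dz, dy, dx in NEIGHBOURS:
                nb = (z + dz, y + dy, x + dx)
                if nb in remaining:
                    remaining.remove(nb)
                    region.add(nb)
                    queue.append(nb)
        regions.append(region)
    return regions


def grow_from_seeds(parent, seeds):
    """Grow every seed region breadth-first until the parent feature is covered."""
    owner = {cell: i for i, seed in enumerate(seeds) for cell in seed}
    queue = deque(owner)
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in NEIGHBOURS:
            nb = (z + dz, y + dy, x + dx)
            if nb in parent and nb not in owner:
                owner[nb] = owner[(z, y, x)]
                queue.append(nb)
    grown = [set() for _ in seeds]
    for cell, i in owner.items():
        grown[i].add(cell)
    return grown


def identify_features(concentration, thresholds):
    """concentration: dict (z, y, x) -> value; thresholds: ascending list."""
    base = {c for c, v in concentration.items() if v >= thresholds[0]}
    features = connected_regions(base)
    for t in thresholds[1:]:
        next_features = []
        for feature in features:
            # High-concentration cores inside this feature at the stricter threshold.
            cores = connected_regions({c for c in feature if concentration[c] >= t})
            if len(cores) >= 2:
                # Several cores: split the feature by growing each core outward.
                next_features.extend(grow_from_seeds(feature, cores))
            else:
                next_features.append(feature)
        features = next_features
    return features
```

In this sketch a feature is split only when a stricter threshold reveals two or more separate cores inside it; cells between cores go to whichever core reaches them first, which is one simple tie-breaking choice among many.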

PLOS ONE | 2016

Developing Subdomain Allocation Algorithms Based on Spatial and Communicational Constraints to Accelerate Dust Storm Simulation

Zhipeng Gui; Manzhu Yu; Chaowei Yang; Yunfeng Jiang; Songqing Chen; Jizhe Xia; Qunying Huang; Kai Liu; Zhenlong Li; Mohammed Anowarul Hassan; Baoxuan Jin

Dust storms have serious, disastrous impacts on the environment, human health, and assets. The development and application of dust storm models have contributed significantly to better understanding and predicting the distribution, intensity, and structure of dust storms. However, dust storm simulation is a data- and computing-intensive process. To improve computing performance, high performance computing has been widely adopted by dividing the entire study area into multiple subdomains and allocating each subdomain to different computing nodes in a parallel fashion. Inappropriate allocation may introduce imbalanced task loads and unnecessary communications among computing nodes. Therefore, allocation is a key factor that may impact the efficiency of parallel processing. An allocation algorithm is expected to consider the computing cost and communication cost for each computing node to minimize total execution time and reduce overall communication cost for the entire simulation. This research introduces three algorithms to optimize the allocation by considering the spatial and communicational constraints: 1) an Integer Linear Programming (ILP) based algorithm from a combinatorial optimization perspective; 2) a K-Means and Kernighan-Lin combined heuristic algorithm (K&K) integrating geometric and coordinate-free methods by merging local and global partitioning; 3) an automatic seeded region growing based geometric and local partitioning algorithm (ASRG). The performance and effectiveness of the three algorithms are compared based on different factors. Further, we adopt the K&K algorithm as the demonstrated algorithm for the experiment of dust model simulation with the non-hydrostatic mesoscale model (NMM-dust) and compare the performance with the MPI default sequential allocation. The results demonstrate that the K&K method significantly improves the simulation performance with better subdomain allocation. This method can also be adopted for other relevant atmospheric and numerical modeling.
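To make the trade-off between load balance and communication concrete, here is a small Python sketch that scores a subdomain-to-node allocation. It does not implement ILP, K&K, or ASRG; it only evaluates two simplified proxies assumed for illustration (subdomains per node as computing cost, neighbouring subdomain pairs on different nodes as communication cost). The function names and the two example layouts are hypothetical, and `sequential_allocation` is a simplified stand-in rather than the actual MPI default.

```python
from collections import Counter


def allocation_cost(allocation):
    """allocation[r][c] is the node id hosting subdomain (r, c).

    Returns (max subdomains on any node, number of adjacent subdomain
    pairs placed on different nodes). Both are simplified proxies.
    """
    rows, cols = len(allocation), len(allocation[0])
    load = Counter(node for row in allocation for node in row)
    cut = 0
    for r in range(rows):
        for c in range(cols):
            # Count each horizontal and vertical neighbour pair once.
            if c + 1 < cols and allocation[r][c] != allocation[r][c + 1]:
                cut += 1
            if r + 1 < rows and allocation[r][c] != allocation[r + 1][c]:
                cut += 1
    return max(load.values()), cut


def sequential_allocation(rows, cols, nodes):
    """Assign subdomains to nodes in row-major order, in equal consecutive runs."""
    per_node = (rows * cols + nodes - 1) // nodes
    return [[(r * cols + c) // per_node for c in range(cols)] for r in range(rows)]


def block_allocation(rows, cols, node_rows, node_cols):
    """Assign compact rectangular blocks; assumes rows and cols divide evenly."""
    br, bc = rows // node_rows, cols // node_cols
    return [[(r // br) * node_cols + (c // bc) for c in range(cols)] for r in range(rows)]
```

For a 4 x 4 grid of subdomains on four nodes, `allocation_cost(sequential_allocation(4, 4, 4))` gives `(4, 12)` while `allocation_cost(block_allocation(4, 4, 2, 2))` gives `(4, 8)`: the same load per node, but fewer cross-node neighbour pairs for the compact layout.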

Computers & Geosciences | 2018

A framework for natural phenomena movement tracking – Using 4D dust simulation as an example

Manzhu Yu; Chaowei Yang; Baoxuan Jin

Natural phenomena that evolve in space and time are often highly dynamic. Numerical simulations and Earth observations have provided the capability to capture and study the complex evolution of natural phenomena in a discrete fashion. Automatically extracting events from these datasets is in demand but challenging. Building on previous research on feature identification, this research presents a movement tracking framework to analyze the evolution and dynamic movements of detected events. The framework consists of three components: feature identification, movement tracking, and track simplification. Based on the proposed framework, dust storm events are systematically detected and analyzed concerning their dynamic movements from a 4D (x, y, z, and t) simulation dataset over North Africa, the Mediterranean, and the Middle East from December 2013 to November 2014. The systematic research includes single event, multi-event, and seasonal analyses. Evaluation of the detected dust events shows that the tracked dust events align well with observations, with ∼80% identification accuracy and consistency in the movement pattern. To briefly demonstrate its capability, we adopted the proposed framework to detect precipitation events from 3D (x, y, and t) precipitation observation data.

International Journal of Geographical Information Science | 2017

Predicting the visualization intensity for interactive spatio-temporal visual analytics: a data-driven view-dependent approach

Jing Li; Tong Zhang; Qing Liu; Manzhu Yu

The continually increasing size of geospatial data sets poses a computational challenge when conducting interactive visual analytics using conventional desktop-based visualization tools. In recent decades, improvements in parallel visualization using state-of-the-art computing techniques have significantly enhanced our capacity to analyse massive geospatial data sets. However, only a few strategies have been developed to maximize the utilization of parallel computing resources to support interactive visualization. In particular, an efficient visualization intensity prediction component is lacking from most existing parallel visualization frameworks. In this study, we propose a data-driven view-dependent visualization intensity prediction method, which can dynamically predict the visualization intensity based on the distribution patterns of spatio-temporal data. The predicted results are used to schedule the allocation of visualization tasks. We integrated this strategy with a parallel visualization system deployed in a compute unified device architecture (CUDA)-enabled graphics processing unit (GPU) cloud. To evaluate the flexibility of this strategy, we performed experiments using dust storm data sets produced from a regional climate model. The results of the experiments showed that the proposed method yields stable and accurate prediction results with acceptable computational overheads under different types of interactive visualization operations. The results also showed that our strategy improves the overall visualization efficiency by incorporating intensity-based scheduling.

international conference on agro geoinformatics | 2013