Guozhou Zheng
Zhejiang University
Publication
Featured research published by Guozhou Zheng.
Concurrency and Computation: Practice and Experience | 2006
Huajun Chen; Zhaohui Wu; Yuxin Mao; Guozhou Zheng
In the presence of a Database Grid where a huge number of highly diverse, widely distributed, autonomously managed databases can be involved in a sharing cycle, database tools and middleware should be well suited for schema mediation and query processing in a semantically meaningful way. In this paper, an implemented system called DartGrid is presented. DartGrid is intended to provide a semantic infrastructure for building database grid applications. We explore the essential and fundamental roles played by Resource Description Framework (RDF) semantics for database grids and implement a set of semantically enabled tools and grid services such as semantic browser, semantic mapping tools, ontology service, semantic query service and semantic registration service. We propose an RDF‐View‐based approach for relational schema mediation and describe the view‐based semantic query rewriting algorithm implemented in DartGrid. DartGrid has been used to build a real database grid application for Traditional Chinese Medicine in China.
trust security and privacy in computing and communications | 2014
Ningyu Zhang; Guozhou Zheng; Huajun Chen; Jiaoyan Chen; Xi Chen
In recent years, the scale of spatial data has grown rapidly, and its storage has run into many problems. Traditional DBMSs can handle some big spatial data efficiently. However, popular open source relational database systems are overwhelmed by the high insertion rates, querying requirements and terabytes of data that these systems must handle. On the other hand, key-value storage can effectively support large-scale operations. To address the storage and querying of big vector spatial data, we propose HBase Spatial, a scalable spatial data storage system based on HBase. First, we analyze the distributed storage model of HBase. Then, we design a distributed storage and index model. Finally, experiments with both big sample sets and typical benchmarks on a cluster, compared against MongoDB and MySQL, demonstrate the advantages of our storage model and index algorithm: the model effectively improves the query speed of big spatial data and provides a good solution for storage.
international conference on computational science | 2004
Zhaohui Wu; Huajun Chen; Changhuang Changhuang; Guozhou Zheng; Jiefeng Xu
On the web, one critical challenge is how to globally publish, seamlessly integrate and transparently locate geographically distributed database resources in such “open” settings. This paper proposes a semantic-based approach that supports the global sharing of database resources using the grid as the platform. We have built a semantic query system, called DartGrid, with the following features: a) database providers are organized as an ontology-based virtual organization; through uniformly defined domain semantics, databases can be semantically registered and seamlessly integrated to provide database services, and b) we raise the level of interaction with the database system to a domain-cognizant model in which query requests are specified in the terminology and knowledge of the domain(s), which enables users to publish, discover and query databases purely at a semantic or knowledge level. We explore the essential and fundamental roles played by data semantics, and implement several innovative semantic functionalities such as semantic browsing, semantic query and semantic registration. We also report on application results from Traditional Chinese Medicine (TCM), a domain that requires data-intensive collaboration.
Neural Computing and Applications | 2016
Jiaoyan Chen; Huajun Chen; Xiangyi Wan; Guozhou Zheng
In the big data era, the extreme learning machine (ELM) can be a good solution for learning from large sample data, as it offers high generalization performance and fast training speed. However, emerging big and distributed data blocks may still challenge the method, as they can lead to large-scale training that is hard to finish on a common commodity machine in a limited time. In this paper, we propose a MapReduce-based distributed framework named MR-ELM to enable large-scale ELM training. Under the framework, ELM submodels are trained in parallel on the distributed data blocks across the cluster and then combined into a complete single-hidden-layer feedforward neural network. Both the classification and regression capabilities of MR-ELM have been theoretically proven, and its generalization performance is shown, on many typical benchmarks, to be as high as that of the original ELM and some common ELM ensemble methods. Compared with the original ELM and other parallel ELM algorithms, MR-ELM is a general and scalable ELM training framework for both classification and regression, and it is suitable for big data learning in cloud environments where the data are usually distributed rather than located on one machine.
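The train-then-combine idea in this abstract can be illustrated with a small sketch. It is not the MR-ELM algorithm itself, whose details the abstract does not give. It assumes one plausible combination rule: each submodel's hidden nodes are concatenated, and the output weights are scaled by 1/K so that the merged single-hidden-layer network averages the K submodels. The function names, the ridge term and the toy data are all hypothetical, and the "map" step is run sequentially here instead of on a cluster.

```python
import math
import random

def hidden(x, W, b):
    """Sigmoid hidden-layer activations for one input vector x."""
    return [1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + bj)))
            for w, bj in zip(W, b)]

def solve(A, y):
    """Solve A z = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [yi] for row, yi in zip(A, y)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def train_elm(X, t, n_hidden, rng, ridge=1e-3):
    """Train one ELM submodel: random hidden layer, ridge least-squares output weights."""
    d = len(X[0])
    W = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = [hidden(x, W, b) for x in X]
    # Normal equations: (H^T H + ridge * I) beta = H^T t
    A = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
          for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(h[i] * ti for h, ti in zip(H, t)) for i in range(n_hidden)]
    return W, b, solve(A, rhs)

def predict(model, x):
    """Output of a single-hidden-layer network for one input vector x."""
    W, b, beta = model
    return sum(w_out * h for w_out, h in zip(beta, hidden(x, W, b)))

def combine(models):
    """Assumed merge rule: concatenate hidden nodes, scale output weights by 1/K."""
    k = len(models)
    W = [w for m in models for w in m[0]]
    b = [bj for m in models for bj in m[1]]
    beta = [v / k for m in models for v in m[2]]
    return W, b, beta

# Toy data (made up): three blocks of 20 points, target t = x0 + x1.
rng = random.Random(0)
blocks = []
for _ in range(3):
    Xb = [[rng.random(), rng.random()] for _ in range(20)]
    blocks.append((Xb, [x[0] + x[1] for x in Xb]))

# "Map": one submodel per data block. "Reduce": merge into one network.
models = [train_elm(Xb, tb, 5, rng) for Xb, tb in blocks]
merged = combine(models)
```

Under this assumed rule, `predict(merged, x)` equals the mean of the submodel predictions for any input `x`, so the merged model is still a single network with 15 hidden nodes.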
international conference on control and automation | 2013
Jiaoyan Chen; Guozhou Zheng; Huajun Chen
Land cover classification of remote sensing (RS) data plays a key role in various spatio-temporal applications. Moreover, scalability and efficiency have become the most important challenges as the volume of RS data grows. In this paper, we propose a novel MapReduce-accelerated extreme learning machine (ELM) ensemble classifier called ELM-MapReduce for large-scale land cover classification. First, ELM-MapReduce adopts an ELM ensemble learning algorithm for higher accuracy and stability. Second, ELM-MapReduce is accelerated by MapReduce for higher scalability and efficiency. Third, experiments on large-scale real-world RS data demonstrate the advantages of ELM-MapReduce.
grid computing | 2004
Huajun Chen; Zhaohui Wu; Guozhou Zheng; Yuxing Mao
In a grid where a huge number of databases can be involved in a sharing cycle, database tools and middleware should be well suited for schema mediation and query processing in a semantically meaningful way. Dart is an implemented prototype system whose goal is to provide a semantic solution for database resource sharing that can be deployed in grid settings. This paper is particularly concerned with the problems of schema mediation in DartGrid. Our approach mainly involves the following notions: a) we use RDF/OWL to define the mediated ontologies for integration, b) we devise a set of rules for automatically converting a relational schema to an RDF/OWL description, called the source data semantic, c) we define the source data semantic (source schema) as a view of the shared ontologies (mediated schema), d) queries are formulated and posed on the shared ontologies. A set of grid services is developed to implement the above functionalities.
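Notion b) above, converting a relational schema to an RDF/OWL description, can be sketched as follows. The abstract does not list DartGrid's actual conversion rules, so this uses a common textbook mapping as an assumption: each table becomes an `owl:Class`, each plain column an `owl:DatatypeProperty`, and each foreign key an `owl:ObjectProperty`. The namespace, the table names and the `schema_to_triples` helper are hypothetical.

```python
# Assumed mapping from SQL column types to XSD datatypes (illustrative only).
XSD = {"INTEGER": "xsd:integer", "VARCHAR": "xsd:string", "DATE": "xsd:date"}

def schema_to_triples(tables, base="http://example.org/schema#"):
    """Turn a relational schema description into (subject, predicate, object) triples."""
    triples = []
    for table, columns in tables.items():
        cls = f"<{base}{table}>"
        # Rule 1 (assumed): every table becomes an OWL class.
        triples.append((cls, "rdf:type", "owl:Class"))
        for col, spec in columns.items():
            prop = f"<{base}{table}.{col}>"
            if "references" in spec:
                # Rule 2 (assumed): a foreign key becomes an object property
                # whose range is the class of the referenced table.
                triples.append((prop, "rdf:type", "owl:ObjectProperty"))
                triples.append((prop, "rdfs:domain", cls))
                triples.append((prop, "rdfs:range", f"<{base}{spec['references']}>"))
            else:
                # Rule 3 (assumed): any other column becomes a datatype property.
                triples.append((prop, "rdf:type", "owl:DatatypeProperty"))
                triples.append((prop, "rdfs:domain", cls))
                triples.append((prop, "rdfs:range", XSD.get(spec["type"], "xsd:string")))
    return triples

# Hypothetical two-table schema, loosely themed on the TCM application domain.
tables = {
    "herb": {"id": {"type": "INTEGER"}, "name": {"type": "VARCHAR"}},
    "formula": {
        "id": {"type": "INTEGER"},
        "main_herb": {"type": "INTEGER", "references": "herb"},
    },
}
triples = schema_to_triples(tables)
for s, p, o in triples:
    print(s, p, o, ".")
```

A description produced this way could then be related to a shared ontology as a view (notion c), but that mapping step is outside this sketch.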
international workshop on analytics for big geospatial data | 2013
Jiaoyan Chen; Huajun Chen; Jeff Z. Pan; Ming Wu; Ningyu Zhang; Guozhou Zheng
Recently, severe smog has struck many cities in China, including the capital, Beijing. The chief culprit of China's smog, namely PM2.5, is affected by various factors including air pollutants, weather, climate, geographical location, urbanization, etc. To analyze the factors, we collect about 35,000,000 air quality records and about 30,000,000 weather records from sensors in 77 Chinese cities in 2013. Moreover, two big data sets, Geoname and DBPedia, are combined to supply data on climate, geographical location and urbanization. To deal with big spatio-temporal data for big smog analysis, we propose a MapReduce-based framework named BigSmog. It mainly conducts parallel correlation analysis of the factors and scalable training of artificial neural networks (ANNs) for spatio-temporal approximation of the concentration of PM2.5. In the experiments, BigSmog displays high scalability for big smog analysis with big spatio-temporal data. The analysis shows that air pollutants influence the short-term concentration of PM2.5 more than the weather does, and that geographical location and climate, rather than urbanization, play the major role in determining a city's long-term PM2.5 pollution level. Moreover, the trained ANNs can accurately approximate the concentration of PM2.5.
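As an illustration of how a correlation analysis can be parallelized in MapReduce style, the sketch below computes a Pearson correlation from per-block sufficient statistics. It is a generic technique, not necessarily the one used in BigSmog. The function names are hypothetical and the sample values are made up; they are not taken from the paper's data.

```python
from functools import reduce
from math import sqrt

def map_block(pairs):
    """Map step: summarise one data block as (n, sum x, sum y, sum xx, sum yy, sum xy)."""
    n = sx = sy = sxx = syy = sxy = 0.0
    for x, y in pairs:
        n += 1
        sx += x
        sy += y
        sxx += x * x
        syy += y * y
        sxy += x * y
    return (n, sx, sy, sxx, syy, sxy)

def reduce_stats(a, b):
    """Reduce step: sufficient statistics from two blocks simply add up."""
    return tuple(u + v for u, v in zip(a, b))

def pearson(stats):
    """Pearson correlation coefficient from the merged statistics."""
    n, sx, sy, sxx, syy, sxy = stats
    cov = n * sxy - sx * sy
    var_x = n * sxx - sx * sx
    var_y = n * syy - sy * sy
    return cov / sqrt(var_x * var_y)

# Made-up (pollutant, PM2.5) style readings split across three blocks.
blocks = [
    [(1.0, 2.0), (2.0, 4.1)],
    [(3.0, 6.2), (4.0, 7.9)],
    [(5.0, 10.1)],
]

# Each block could be summarised on a different worker before merging.
total = reduce(reduce_stats, map(map_block, blocks))
r_parallel = pearson(total)

# Same computation on the unsplit data, for comparison.
r_single = pearson(map_block([p for block in blocks for p in block]))
```

Because the six sums are additive, the blocks can be processed independently and in any order, and only a six-number summary per block has to cross the network.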
Neurocomputing | 2014
Jiaoyan Chen; Guozhou Zheng; Cong Fang; Ningyu Zhang; Huajun Chen; Zhaohui Wu
Land-cover change detection plays an increasingly important role in environmental protection and many other fields. However, current land-cover change detection methods suffer from low accuracy and low efficiency, especially when dealing with large-scale remote sensing (RS) data. This paper presents a novel extreme learning machine (ELM) based land-cover change detection method with high testing accuracy and fast processing speed. The evaluation results show that ELM outperforms traditional methods, e.g., SVM and BP networks, in terms of training speed and generalization performance when applied to land-cover classification. In our experiments, we apply our method to the analysis of rapid land use change in the Taihu Lake region over the past decade.
web science | 2014
Ningyu Zhang; Guozhou Zheng; Huajun Chen; Xi Chen; Jiaoyan Chen
Urban waterlogging has become one of the most serious urban hazards in big cities around the world, especially in China. However, existing methods fail to cover all locations and to forecast the waterlogging trend. Meanwhile, the past decade has witnessed an astounding growth in the number of online social media services. For example, when a rainstorm occurs, people post a large number of tweets about it, which makes it possible to detect urban waterlogging promptly simply by analyzing the tweets. In this paper, we present a semantic method that can monitor urban waterlogging using social sensors. Currently, we use ontology and fuzzy reasoning to analyze waterlogging locations and their severity, and build apps to monitor and forecast waterlogging in more than ten cities in China. With this method, people can easily monitor all the possible urban waterlogging locations along with their severity and trend, which may reduce the possibility of traffic congestion in a rainstorm.
asia pacific web conference | 2005
Zhaohui Wu; Huajun Chen; Yuxing Mao; Guozhou Zheng
This paper demonstrates the Dart Database Grid system developed by the Grid Research Center of Zhejiang University. Dart Database Grid is built upon several Semantic Web standards and the Globus grid toolkit. It is mainly intended to provide a Dynamic, Adaptive, RDF-mediated and Transparent (DART) approach to database integration for the Semantic Web. This work has been applied to integrate data resources from the application domain of Traditional Chinese Medicine.