Ningyu Zhang | Researchain

Publication

Featured research published by Ningyu Zhang.

Trust, Security and Privacy in Computing and Communications | 2014

HBaseSpatial: A Scalable Spatial Data Storage Based on HBase

Ningyu Zhang; Guozhou Zheng; Huajun Chen; Jiaoyan Chen; Xi Chen

In recent years, the scale of spatial data has grown enormously, and its storage has encountered many problems. Traditional DBMSs can efficiently handle some big spatial data. However, popular open source relational database systems are overwhelmed by the high insertion rates, querying requirements and terabytes of data that these systems must handle. On the other hand, key-value storage can effectively support large-scale operations. To resolve the problems of storing and querying big vector spatial data, we propose HBaseSpatial, a scalable spatial data storage based on HBase. First, we analyze the distributed storage model of HBase. Then, we design a distributed storage and index model. Finally, the advantages of our storage model and index algorithm are demonstrated by experiments with both big sample sets and typical benchmarks on a cluster, in comparison with MongoDB and MySQL. The results show that our model can effectively enhance the query speed of big spatial data and provide a good solution for storage.

Sensors | 2016

Semantic Framework of Internet of Things for Smart Cities: Case Studies

Ningyu Zhang; Huajun Chen; Xi Chen; Jiaoyan Chen

In recent years, the advancement of sensor technology has led to the generation of heterogeneous Internet-of-Things (IoT) data by smart cities. Thus, the development and deployment of various aspects of IoT-based applications are necessary to mine the potential value of data to the benefit of people and their lives. However, the variety, volume, heterogeneity, and real-time nature of data obtained from smart cities pose considerable challenges. In this paper, we propose a semantic framework that integrates the IoT with machine learning for smart cities. The proposed framework retrieves and models urban data for certain kinds of IoT applications based on semantic and machine-learning technologies. Moreover, we propose two case studies: pollution detection from vehicles and traffic pattern detection. The experimental results show that our system is scalable and capable of accommodating a large number of urban regions with different types of IoT applications.

International Workshop on Analytics for Big Geospatial Data | 2013

When big data meets big smog: a big spatio-temporal data framework for China severe smog analysis

Jiaoyan Chen; Huajun Chen; Jeff Z. Pan; Ming Wu; Ningyu Zhang; Guozhou Zheng

Recently, severe smog has struck many cities in China, including the capital, Beijing. The chief culprit of China's smog, namely PM2.5, is affected by various factors including air pollutants, weather, climate, geographical location, urbanization, etc. To analyze these factors, we collect about 35,000,000 air quality records and about 30,000,000 weather records from sensors in 77 Chinese cities in 2013. Moreover, two big data sets named Geoname and DBPedia are also combined for the data on climate, geographical location and urbanization. To deal with big spatio-temporal data for big smog analysis, we propose a MapReduce-based framework named BigSmog. It mainly conducts parallel correlation analysis of the factors and scalable training of artificial neural networks (ANNs) for spatio-temporal approximation of the concentration of PM2.5. In the experiments, BigSmog displays high scalability for big smog analysis with big spatio-temporal data. The analysis results show that air pollutants influence the short-term concentration of PM2.5 more than the weather does, and that geographical location and climate, rather than urbanization, play a major role in determining a city's long-term PM2.5 pollution level. Moreover, the trained ANNs can accurately approximate the concentration of PM2.5.
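As a rough illustration of the correlation analysis mentioned in this abstract, the sketch below computes the Pearson correlation between PM2.5 and each candidate factor. It is a sequential toy version, not the BigSmog MapReduce implementation: the function names (`pearson`, `correlate_factors`), the field names (`pm25`, `so2`, `wind`) and the sample values are all invented for illustration, and the abstract does not state which correlation measure the authors used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def correlate_factors(records, target, factors):
    """Correlate every factor column with the target column."""
    target_values = [r[target] for r in records]
    return {f: pearson([r[f] for r in records], target_values) for f in factors}


# Invented toy records (not real measurements): pm25 rises with so2
# and falls as wind speed increases.
records = [
    {"pm25": 30, "so2": 10, "wind": 5},
    {"pm25": 50, "so2": 20, "wind": 3},
    {"pm25": 70, "so2": 30, "wind": 1},
]
scores = correlate_factors(records, "pm25", ["so2", "wind"])
```

Each factor is scored independently of the others, so in a distributed setting the per-factor computations could be spread across workers; that partitioning is an assumption here, not a description of the paper's design.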

Neurocomputing | 2014

Time-series processing of large scale remote sensing data with extreme learning machine

Jiaoyan Chen; Guozhou Zheng; Cong Fang; Ningyu Zhang; Huajun Chen; Zhaohui Wu

Land-cover change detection plays an increasingly important role in environmental protection and many other fields. However, current land-cover change detection methods suffer from low accuracy and low efficiency, especially when dealing with large-scale remote sensing (RS) data. This paper presents a novel extreme learning machine (ELM) based land-cover change detection method with high testing accuracy and fast processing speed. The evaluation results show that ELM outperforms traditional methods, e.g., SVM and BP network, in terms of training speed and generalization performance when applied to land-cover classification. In our experiments, we apply our method to the analysis of rapid land use change in the Taihu Lake region over the past decade.

ISPRS International Journal of Geo-Information | 2016

Forecasting Public Transit Use by Crowdsensing and Semantic Trajectory Mining: Case Studies

Ningyu Zhang; Huajun Chen; Xi Chen; Jiaoyan Chen

With the growing development of smart cities, public transit forecasting has begun to attract significant attention. In this paper, we propose an approach for forecasting passenger boarding choices and public transit passenger flow. Our prediction model is based on mining common user behaviors from semantic trajectories and enriching features using knowledge from geographic and weather data. All the experimental data come from the Ridge Nantong Limited bus company and the Alibaba platform, which is also open to the public. We evaluate our approach using various data sources, including points of interest (POI), weather conditions, and public bus information in Guangzhou, to demonstrate its effectiveness. Experimental results show that our proposal performs better than baselines in the prediction of passenger boarding choices and public transit passenger flow.

Computational Intelligence and Neuroscience | 2016

Social Media Meets Big Urban Data

Ningyu Zhang; Huajun Chen; Jiaoyan Chen; Xi Chen

The design and development of smart cities bring both opportunities and challenges. Addressing them requires large amounts of data. However, circumstances vary between cities because of differing infrastructures and populations, which leads to data sparsity. In this paper, we propose a transfer learning method for urban waterlogging disaster analysis, which provides the basis for traffic management agencies to generate proactive traffic operation strategies in order to alleviate congestion. Existing work on urban waterlogging mostly relies on past and current conditions, as well as sensors and cameras, while there may not be a sufficient number of sensors to cover the relevant areas of a city. To this end, it would be helpful if we could transfer waterlogging knowledge between cities. We examine whether it is possible to use the copious amounts of information from social media and satellite data to improve urban waterlogging analysis. Moreover, we analyze the correlation between severity, road networks, terrain, and precipitation. Finally, we use a multiview discriminant transfer learning method to transfer knowledge to small cities. Experimental results involving cities in China and India show that our proposed framework is effective.

International Conference on Big Data | 2013

OWL reasoning over big biomedical data

Xi Chen; Huajun Chen; Ningyu Zhang; Jiaoyan Chen; Zhaohui Wu

Recently, the accumulation of biomedical data on the Web (e.g., vast amounts of protein sequences, genes, gene products, drugs, diseases and chemical compounds) has shaped a big network of isolated professional knowledge. Embedded with domain knowledge from different disciplines, all relating to human biological systems, the decentralized data repositories are implicitly connected by human expert knowledge. Many biomedical data sources are published separately in the form of semantic ontologies represented in Web Ontology Language (OWL) syntax, which is naturally based on linked graphs. When we are faced with such massive, disparate and interlinked data, biomedical data analysis becomes a challenge. In this paper, we present a general OWL reasoning framework for the analysis of big biomedical data and implement a MapReduce-based property chain reasoning prototype system. The OWL reasoning method is ideally suited to problems involving complex semantic associations because it is able to infer logical consequences based on a set of asserted rules or axioms. The MapReduce framework is used to solve the problem of scalability. In our experiment, we focus on the discovery of associations between Traditional Chinese Medicine (TCM) and Western Medicine (WM). The results show that the system achieves high performance, accuracy and scalability.
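To make the idea of property chain reasoning concrete, the following minimal sketch applies one chain rule, "if (a p b) and (b q c) then (a r c)", using a map-style grouping step followed by a reduce-style join. It runs in a single process and is not the authors' prototype; the function name `apply_property_chain` and the example predicates and entities (`hasIngredient`, `targets`, `mayAffect`, `herbA`, and so on) are hypothetical.

```python
from collections import defaultdict


def apply_property_chain(triples, p, q, r):
    """Infer (a, r, c) whenever (a, p, b) and (b, q, c) are both asserted."""
    # "Map" step: key every relevant triple by the joining node b.
    by_join_node = defaultdict(lambda: {"left": set(), "right": set()})
    for s, pred, o in triples:
        if pred == p:
            by_join_node[o]["left"].add(s)   # s --p--> b, keyed by b
        if pred == q:
            by_join_node[s]["right"].add(o)  # b --q--> o, keyed by b
    # "Reduce" step: for each key, pair every left subject with every right object.
    inferred = set()
    for group in by_join_node.values():
        for a in group["left"]:
            for c in group["right"]:
                inferred.add((a, r, c))
    return inferred


# Hypothetical triples; only herbA has a complete two-step chain.
triples = {
    ("herbA", "hasIngredient", "compoundX"),
    ("compoundX", "targets", "proteinY"),
    ("herbB", "hasIngredient", "compoundZ"),
}
result = apply_property_chain(triples, "hasIngredient", "targets", "mayAffect")
```

Because the join only needs the triples that share a key, each key's group can in principle be processed on a different worker, which is the property a MapReduce-style deployment would rely on.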

Web Intelligence | 2015

SparkRDF: Elastic Discreted RDF Graph Processing Engine With Distributed Memory

Xi Chen; Huajun Chen; Ningyu Zhang; Songyang Zhang

With the explosive growth of semantic data on the Web over the past years, many large-scale RDF knowledge bases with billions of facts are being generated. This poses significant challenges for the storage and querying of big RDF graphs. Current systems still have many limitations in processing big RDF graphs, including scalability and real-time performance. In this paper, we introduce SparkRDF, an elastic discreted RDF graph processing engine with distributed memory. To reduce the high I/O and communication cost in distributed processing platforms, SparkRDF implements SPARQL queries on Spark, a novel in-memory distributed computing framework for big data processing. All the intermediate results are modeled as Resilient Discreted SubGraphs, which are cached in distributed memory to support fast iterative join operations. To cut down the search space and avoid memory overhead, we split the RDF graph into small Multi-layer Elastic SubGraphs based on relations and classes. For SPARQL query optimization, SparkRDF deploys a series of optimization strategies, leading to an effective reduction in the size of intermediate results, the number of joins and the cost of communication. Our extensive evaluation demonstrates that SparkRDF can execute non-selective joins faster than both current state-of-the-art distributed and centralized stores, while being able to process other queries in real time, scaling linearly with the amount of data.

International Journal of Distributed Sensor Networks | 2015

Large-scale real-time semantic processing framework for Internet of Things

Xi Chen; Huajun Chen; Ningyu Zhang; Jue Huang; Wen Zhang

Advanced sensor technology, together with cloud computing and big data, is generating large-scale, heterogeneous and real-time IoT (Internet of Things) data. To make full use of the data, the development and deployment of ubiquitous IoT-based applications in various aspects of our daily life are urgently needed. However, the characteristics of IoT sensor data, including heterogeneity, variety, volume, and real-time nature, bring many challenges to processing the sensor data effectively. Semantic Web technologies are viewed as a key to the development of IoT. While most existing efforts focus mainly on the modeling, annotation, and representation of IoT data, there has been little work on the background processing of large-scale streaming IoT data. In this paper, we present a large-scale real-time semantic processing framework and implement an elastic distributed streaming engine for IoT applications. The proposed engine efficiently captures and models different scenarios for all kinds of IoT applications based on the popular distributed computing platform Spark. Based on the engine, a typical use case on home environment monitoring is given to illustrate the efficiency of our engine. The results show that our system can scale to a large number of sensor streams with different types of IoT applications.

Web Science | 2014

Monitoring Urban Waterlogging Disaster Using Social Sensors

Ningyu Zhang; Guozhou Zheng; Huajun Chen; Xi Chen; Jiaoyan Chen

Urban waterlogging has become one of the most serious urban hazards in big cities around the world, especially in Chinese cities. However, existing methods fail to cover all locations and to forecast the waterlogging trend. Meanwhile, the past decade has witnessed an astounding growth in the number of online social media services. For example, when a rainstorm occurs, people post a large number of tweets related to the rainstorm, which enables prompt detection of urban waterlogging simply by analyzing the tweets. In this paper, we present a semantic method that can monitor urban waterlogging using social sensors. Currently, we use ontology and fuzzy reasoning to analyze waterlogging locations and their severity, and we build apps to monitor and forecast waterlogging in more than ten cities in China. With this method, people can easily monitor all the possible urban waterlogging locations along with their severity and trend, which may reduce the possibility of traffic congestion in a rainstorm.
