Xiangguo Zhao | Researchain

Publication

Featured research published by Xiangguo Zhao.

Neurocomputing | 2011

XML document classification based on ELM

Xiangguo Zhao; Guoren Wang; Xin Bi; Peizhen Gong; Yuhai Zhao

In this paper, we describe an XML document classification framework based on extreme learning machine (ELM). On the basis of the Structured Link Vector Model (SLVM), an optimized Reduced Structured Vector Space Model (RS-VSM) is proposed to incorporate structural information into feature vectors more efficiently and to optimize the computation of document similarity. We apply ELM to XML document classification to achieve good performance at extremely high speed compared with conventional learning machines (e.g., support vector machine). A voting-ELM (v-ELM) algorithm is then proposed to improve the accuracy of the ELM classifier. The Revoting of Equal Votes (REV) method and the Revoting of Confusing Classes (RCC) method are also proposed to postprocess the voting result of v-ELM and further improve the performance. Experiments conducted on real-world classification problems demonstrate that the voting-ELM classifiers presented in this paper achieve better performance than ELM algorithms with respect to precision, recall and F-measure.
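As an illustration of the voting idea only, the sketch below combines the labels predicted by several independently trained classifiers with a plain majority vote and flags samples whose top vote counts are equal. The abstract does not give the actual v-ELM, REV or RCC rules, so the function name `majority_vote`, the tie flag and the sample labels are all hypothetical; a revoting step such as REV would presumably act on the flagged samples.

```python
from collections import Counter

def majority_vote(predictions):
    """Return (label, tied) for one sample.

    predictions: labels predicted for the same sample by several classifiers.
    tied is True when at least two labels share the highest vote count,
    which is the case a revoting step would have to resolve.
    """
    counts = Counter(predictions).most_common()
    top_label, top_count = counts[0]
    tied = len(counts) > 1 and counts[1][1] == top_count
    return top_label, tied

# Hypothetical votes: one inner list per document, one entry per classifier.
votes_per_sample = [
    ["sports", "sports", "politics"],
    ["tech", "politics", "tech", "politics"],
]
results = [majority_vote(v) for v in votes_per_sample]
for label, tied in results:
    print(label, "(tie, needs revoting)" if tied else "")
```

When a tie is flagged, the returned label is only a placeholder; how the tie is actually broken is left open here.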

Neurocomputing | 2015

Distributed Extreme Learning Machine with kernels based on MapReduce

Xin Bi; Xiangguo Zhao; Guoren Wang; Pan Zhang; Chao Wang

Extreme Learning Machine (ELM) has shown good generalization performance and extremely fast learning speed in many learning applications. Recently, it has been proved that ELM outperforms Support Vector Machine (SVM) with fewer constraints from the optimization point of view. ELM provides unified learning schemes with a wide range of feature mappings. Among these unified algorithms, ELM with kernels applies kernels instead of random feature mappings. However, with the exponentially increasing volume of training data in massive learning applications, centralized ELM with kernels suffers from the great memory consumption of large matrix operations. Besides, due to the high communication cost, some of these matrix operations cannot be directly implemented on a shared-nothing distributed computing model such as MapReduce. This paper proposes a distributed solution named Distributed Kernelized ELM (DK-ELM), which realizes an implementation of ELM with kernels on MapReduce. Distributed kernel matrix calculation and matrix-vector multiplication are also applied to realize parallel calculation of DK-ELM. Extensive experiments on massive datasets are conducted to verify both the scalability and training performance of DK-ELM. Experimental results show that DK-ELM has good scalability for massive learning applications.
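To make the idea of distributed kernel matrix calculation more concrete, here is a minimal single-process sketch that computes a kernel matrix block by block, in the spirit of a map step (each block of rows is computed independently) followed by a reduce step (rows are collected by index). It is not the DK-ELM algorithm: the RBF kernel, the `gamma` value, the block size and all function names are assumptions, and every "worker" here sees the full dataset, which a real shared-nothing job would have to avoid or pay for in communication.

```python
import math

def rbf(x, y, gamma=0.5):
    """RBF kernel exp(-gamma * ||x - y||^2); the kernel choice is an assumption."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def map_block(block_start, block, data, gamma=0.5):
    """'Map' step: compute the kernel rows for one block of samples."""
    return [(block_start + i, [rbf(x, y, gamma) for y in data])
            for i, x in enumerate(block)]

def kernel_matrix_by_blocks(data, block_size=2, gamma=0.5):
    """Compute the full kernel matrix from independently computed row blocks."""
    rows = {}
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        for idx, row in map_block(start, block, data, gamma):
            rows[idx] = row  # 'reduce' step: collect rows by their global index
    return [rows[i] for i in range(len(data))]

# Hypothetical toy data: five two-dimensional samples.
data = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [1.0, 1.0], [3.0, 1.0]]
K = kernel_matrix_by_blocks(data)
```

Because each block depends only on its own rows and the data it is compared against, the blocks can in principle be computed in any order or on different machines.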

Cognitive Computation | 2017

FE-ELM: A New Friend Recommendation Model with Extreme Learning Machine

Zhen Zhang; Xiangguo Zhao; Guoren Wang

Friend recommendation is one of the most popular services in location-based social network (LBSN) platforms, recommending people who may be of interest or familiar to users. In addition to the social and textual properties of ordinary social networks, an LBSN also has a spatial-temporal property. However, existing methods use only one or two of these three properties rather than all of them, which may lead to low recommendation accuracy. Moreover, these existing methods are usually inefficient. In this paper, we propose a new friend recommendation model, called feature extraction-extreme learning machine (FE-ELM), to address these shortcomings; friend recommendation is regarded as a binary classification problem. Classification is an important task in the cognitive computation community. First, we use new strategies in our FE-ELM model to extract the spatial-temporal feature, social feature, and textual feature. These features make full use of all the above properties of LBSN and ensure the recommendation accuracy. Second, our FE-ELM model also takes advantage of the extreme learning machine (ELM) classifier. ELM has fast learning speed and ensures the recommendation efficiency. Extensive experiments verify the accuracy and efficiency of the FE-ELM model.

World Wide Web | 2014

Probability based voting extreme learning machine for multiclass XML documents classification

Xiangguo Zhao; Xin Bi; Baiyou Qiao

This paper presents a novel solution based on Extreme Learning Machine (ELM) for multiclass XML document classification. ELM is a generalized Single-hidden Layer Feedforward Network (SLFN) with extremely fast learning capacity. An improved vector model, DSVM (Distribution based Structured Vector Model), is proposed to represent XML documents with more structural information and more precise semantic information. The XML document classifiers are built on PV-ELM (Probability based Voting ELM) with a postprocessing method, ε-RCC (ε-Revoting of Confusing Classes), to refine the voting results. To evaluate the overall performance of this solution, a series of experiments is conducted on two real datasets of online news feeds. The experimental results show that DSVM represents the XML documents more effectively and that PV-ELM with ε-RCC achieves higher accuracy than the original ELM algorithm for multiclass classification.
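For readers unfamiliar with ELM, the following is a minimal sketch of a basic ELM classifier: hidden-layer weights are drawn at random and never trained, and only the output weights are solved in closed form with a regularized least-squares step. It does not implement DSVM, PV-ELM or ε-RCC. The sigmoid activation, the regularization constant `C`, the hidden-layer size, the toy data and all function names are assumptions chosen for illustration.

```python
import math
import random

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [A[i][:] + B[i][:] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def hidden(X, W, b):
    """Sigmoid hidden-layer outputs, one row per sample."""
    return [[1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(Wj, xi)) + bj)))
             for Wj, bj in zip(W, b)] for xi in X]

def train_elm(X, T, n_hidden=10, C=1000.0, seed=0):
    """Random hidden layer, then beta = (I/C + H^T H)^-1 H^T T."""
    rng = random.Random(seed)
    d = len(X[0])
    W = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = hidden(X, W, b)
    HtH = [[sum(h[i] * h[j] for h in H) + (1.0 / C if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    HtT = [[sum(h[i] * t[k] for h, t in zip(H, T)) for k in range(len(T[0]))]
           for i in range(n_hidden)]
    return W, b, solve(HtH, HtT)

def predict(model, X):
    """Return the index of the largest output node for each sample."""
    W, b, beta = model
    outputs = [[sum(h[i] * beta[i][k] for i in range(len(beta)))
                for k in range(len(beta[0]))] for h in hidden(X, W, b)]
    return [max(range(len(o)), key=o.__getitem__) for o in outputs]

# Hypothetical toy data: two well-separated classes with one-hot targets.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [1.0, 1.0], [0.9, 0.8], [0.8, 1.0]]
labels = [0, 0, 0, 1, 1, 1]
T = [[1.0, 0.0] if y == 0 else [0.0, 1.0] for y in labels]
model = train_elm(X, T)
print(predict(model, X))
```

The absence of iterative weight updates is what gives ELM its training speed: the only costly step is one linear solve whose size is set by the number of hidden nodes.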

Neurocomputing | 2016

Uncertain XML documents classification using Extreme Learning Machine

Xiangguo Zhao; Xin Bi; Guoren Wang; Zhen Zhang; Hongbo Yang

Driven by emerging network data exchange and storage, XML document classification has become increasingly important. Most existing representation models and conventional learning algorithms are defined on certain XML documents. However, in many real-world applications, XML datasets contain inherent uncertainty, which brings greater challenges to the classification problem. In this paper, we propose a novel solution to classify uncertain XML documents, including an uncertain XML document representation and two uncertain learning algorithms based on Extreme Learning Machine. Experimental results show that our approaches exhibit prominent performance on the uncertain XML document classification problem.

Neurocomputing | 2010

Efficiently mining local conserved clusters from gene expression data

Guoren Wang; Yuhai Zhao; Xiangguo Zhao; Botao Wang; Baiyou Qiao

Extensive studies have shown that mining gene expression data is important for both bioinformatics research and biomedical applications. However, most existing studies focus only on either co-regulated gene clusters or emerging patterns. In fact, another analysis scheme, i.e., simultaneously mining phenotypes and diagnostic genes, is also biologically significant but has received relatively little attention so far. In this paper, we explore a novel concept of local conserved gene cluster (LC-Cluster) to address this problem. Specifically, an LC-Cluster contains a subset of genes and a subset of conditions such that the genes show steady expression values (instead of the coherent pattern of rising and falling synchronously defined by some previous work) only on the subset of conditions but not along all given conditions. To avoid the exponential growth in subspace search, we further present two efficient algorithms, namely FALCONER and E-FALCONER, to mine the complete set of maximal LC-Clusters from gene expression data sets based on an enumeration tree. Extensive experiments conducted on both real gene expression data sets and synthetic data sets show that (1) our approaches are efficient and effective, (2) our approaches outperform the existing enumeration tree based algorithms, and (3) our approaches can discover a number of LC-Clusters that are potentially of high biological significance.

Journal of Computer Science and Technology | 2017

Efficient Processing of Distributed Twig Queries Based on Node Distribution

Xin Bi; Xiangguo Zhao; Guoren Wang

Massive XML data are increasingly generated for the representation, storage and exchange of web information. Twig query processing over massive XML data has become a research focus. However, most traditional algorithms cannot be directly implemented in a distributed manner. Some of the existing distributed algorithms generate a lot of useless intermediate results and execute many join operations of partial results in most cases; others require prior knowledge of the query pattern before XML partitioning, storage and query processing, which is impractical for large-scale data or frequently arriving new queries. To improve efficiency and scalability, in this paper we propose a 3-phase distributed algorithm, DisT3, based on a node distribution mechanism to avoid unnecessary intermediate results. Furthermore, we propose a lightweight local index, ReP, with an enhanced XML partitioning approach using an arbitrary partitioning strategy, and based on ReP we propose an improved 2-phase distributed algorithm, DisT2ReP, to further reduce the communication cost. After the performance guarantees are analyzed, extensive experiments are conducted to verify the efficiency and scalability of our proposed algorithms in distributed twig query applications.

Archive | 2016

Record Linkage for Event Identification in XML Feeds Stream Using ELM

Xin Bi; Xiangguo Zhao; Wenhui Ma; Zhen Zhang; Heng Zhan

Most news portals and social media networks use RSS feeds for information distribution and content sharing. Event identification improves the service quality of feed providers with respect to content distribution and event browsing. However, challenges arise from the representation of structural information and the real-time requirements of feed stream mining. In this paper, we focus on the record linkage problem, which classifies stream content into known categories. To realize fast and efficient record linkage over XML feed streams, we design two classification strategies: a classifier based on ensemble ELMs and an incremental classifier based on OS-ELM. Experimental results show that our solutions provide effective and efficient record linkage for event identification applications.

Asia-Pacific Web Conference | 2015

Distributed XML Twig Query Processing Using MapReduce

Xin Bi; Guoren Wang; Xiangguo Zhao; Zhen Zhang; Shuang Chen

Twig query processing is one of the core operations of XML queries. Centralized holistic twig algorithms suffer great efficiency losses when large-scale XML documents are partitioned and stored in the cloud. Previous work on distributed twig query processing has some limitations, e.g., complete dependence on prior knowledge of query patterns, iteration of MapReduce jobs, etc. In this paper, our arbitrary XML partitioning and storage strategy requires no knowledge of the query pattern; twig queries can be efficiently processed in a single-round MapReduce job with good scalability. Extensive experiments are conducted to verify the efficiency and scalability of our algorithms.

World Wide Web | 2015

ELM based approximate dynamic cycle matching for homogeneous symmetric Pub/Sub system

Botao Wang; Pingping Liu; Guoren Wang; Xiangguo Zhao

The number of cycle matchings increases exponentially with the number of subscriptions and the maximum length of cycle matchings, so a large amount of space is needed to store intermediate results. Approximate cycle matching aims to store only a small part of the intermediate results while finding as many cycle matchings as possible. The existing solution prunes the intermediate results using a threshold on the probability of a subscription being matched, which neglects the degree of dispersion of the probabilities. In this paper, we propose an approximate dynamic cycle matching algorithm based on classifying intermediate results with extreme learning machine. We first introduce a method of incorporating probability information into the feature vector, and then propose the approximate cycle matching algorithm. Further, we propose a dynamic classification strategy, considering that the data distribution of subscriptions may change over time. The proposed approximate cycle matching algorithm and the dynamic classification strategy are evaluated in a simulated environment. The results show that, compared with approximate cycle matching based on a probability threshold, approximate cycle matching based on ELM classification is faster, and the dynamic classification strategy is more efficient and convenient. ELM is more suitable for approximate dynamic cycle matching than SVM with regard to response time.
