Bo Xiao

Beijing University of Posts and Telecommunications

Publication

Featured research published by Bo Xiao.

IEEE International Conference on Network Infrastructure and Digital Content | 2009

An optimized item-based collaborative filtering recommendation algorithm

Jinbo Zhang; Zhiqing Lin; Bo Xiao; Chuang Zhang

Collaborative filtering is a very important technology in e-commerce. Unfortunately, as the number of users and commodities grows, user rating data becomes extremely sparse, which degrades the effectiveness of collaborative filtering recommendation systems. To address this issue, an optimized item-based collaborative filtering recommendation algorithm is proposed. When calculating the similarity of two items, we obtain the ratio of users who rated both items to those who rated each of them, and take this ratio into account in the similarity measure. The experimental results show that the proposed algorithm can improve the quality of collaborative filtering.
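
The abstract does not give the exact formula, so the following is only a minimal sketch of the idea: an item-item cosine similarity scaled by a co-rating ratio. The function name, the data layout, and the choice of the Jaccard ratio (users who rated both items divided by users who rated either item) are assumptions made for illustration, not the authors' published method.

```python
from math import sqrt


def weighted_item_similarity(ratings, item_a, item_b):
    """Cosine similarity over co-rating users, scaled by a co-rating ratio.

    ratings: dict mapping user -> dict mapping item -> numeric rating.
    The Jaccard-style ratio below is an assumed reading of the abstract.
    """
    users_a = {u for u, r in ratings.items() if item_a in r}
    users_b = {u for u, r in ratings.items() if item_b in r}
    common = users_a & users_b
    if not common:
        return 0.0

    # Plain cosine similarity restricted to users who rated both items.
    dot = sum(ratings[u][item_a] * ratings[u][item_b] for u in common)
    norm_a = sqrt(sum(ratings[u][item_a] ** 2 for u in common))
    norm_b = sqrt(sum(ratings[u][item_b] ** 2 for u in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    cosine = dot / (norm_a * norm_b)

    # Assumed weighting: a pair supported by few shared raters is down-weighted.
    ratio = len(common) / len(users_a | users_b)
    return cosine * ratio
```

With sparse data, two items rated identically by a single shared user would otherwise look perfectly similar; the ratio reduces that score when most raters of either item never rated the other.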

IEEE International Conference on Network Infrastructure and Digital Content | 2009

A high-precision forum crawler based on vertical crawling

Qing Gao; Bo Xiao; Zhiqing Lin; Xiyao Chen; Bing Zhou

In this paper, we present a specialized crawler for Internet forums. Unlike general crawlers and focused crawlers, it extracts structured information directly, obtains the most valuable web resources with the fewest system resources, filters out useless information as far as possible, and ultimately supplies users with high-precision information. The crawler adopts a template-based processing method that uses regular expressions to extract structured information. The URL queue is initialized with the URLs listed in a seeds file, and valuable URLs are extracted from web pages and added to the queue during the crawling process. Once the time of a post falls outside the specified time span, or the web information is unchanged, the crawler skips it to avoid wasting system resources. Experimental results demonstrate that our crawler can collect real-time forum information more efficiently and precisely than other crawlers.
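
The template-and-queue design described above can be sketched roughly as follows. The regular expression, the HTML markup it targets, and the names `POST_TEMPLATE`, `extract_posts`, `crawl`, and `fetch` are all hypothetical; the paper's actual templates are not given in the abstract. Only the time-span skip is shown, and the unchanged-page check is omitted.

```python
import re
from collections import deque
from datetime import datetime, timedelta

# Hypothetical template for one forum's thread-list markup.
POST_TEMPLATE = re.compile(
    r'<a class="thread" href="(?P<url>[^"]+)">(?P<title>[^<]+)</a>\s*'
    r'<span class="time">(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2})</span>'
)


def extract_posts(html, now, max_age):
    """Return structured posts matched by the template, skipping stale ones."""
    posts = []
    for match in POST_TEMPLATE.finditer(html):
        posted = datetime.strptime(match.group("time"), "%Y-%m-%d %H:%M")
        if now - posted > max_age:
            continue  # outside the configured time span: do not follow it
        posts.append(
            {"url": match.group("url"), "title": match.group("title"), "time": posted}
        )
    return posts


def crawl(seeds, fetch, now, max_age):
    """Breadth-first crawl; fetch is a caller-supplied function url -> html."""
    queue = deque(seeds)
    seen = set(seeds)
    results = []
    while queue:
        html = fetch(queue.popleft())
        for post in extract_posts(html, now, max_age):
            results.append(post)
            if post["url"] not in seen:
                seen.add(post["url"])
                queue.append(post["url"])
    return results
```

Passing `fetch` in as a parameter keeps the sketch free of network code, so the extraction logic can be exercised against canned pages.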

International Conference on Future Computer and Communication | 2010

A distributed vertical crawler using crawling-period based strategy

Bing Zhou; Bo Xiao; Zhiqing Lin; Chuang Zhang

Due to the explosive growth in the number of web pages, centralized crawlers can no longer crawl the web efficiently. Many distributed crawlers are in wide use; however, none of them is suitable for template-customized vertical crawling. In this paper, we present a distributed, template-customized vertical crawler designed specifically for crawling Internet forums. We describe in detail the client-server architecture of the system and the function of every module; the design can easily be extended to other fields. We also propose a crawling-period based distribution strategy, with which the crawler manager can balance the number of crawling tasks against the resources of each crawler, and the crawler can flexibly process websites with different update frequencies. In addition, we define a communication protocol between the crawlers and the crawler manager and describe how to solve the duplicated crawling problem in the distributed system. The performance of the centralized vertical crawler and that of the distributed vertical crawler are compared experimentally. The results demonstrate that running all the crawlers in the distributed system in parallel can greatly enhance crawling efficiency.
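
The abstract does not specify the scheduling rules or the protocol, so the sketch below is one plausible reading under stated assumptions: each site has its own crawling period, the manager hands a due site to the least-loaded crawler, and a site is not rescheduled until its crawler reports completion, which prevents duplicated crawling. The class `CrawlerManager` and all of its methods are hypothetical.

```python
import heapq


class CrawlerManager:
    """Toy manager that schedules sites by crawling period (assumed design)."""

    def __init__(self, crawler_ids):
        self.load = {cid: 0 for cid in crawler_ids}  # tasks in flight per crawler
        self.period = {}      # site -> crawling period
        self.schedule = []    # min-heap of (next_due_time, site)
        self.in_flight = {}   # site -> crawler currently working on it

    def add_site(self, site, period, first_due=0):
        if site in self.period:
            return  # already registered: avoid scheduling the same site twice
        self.period[site] = period
        heapq.heappush(self.schedule, (first_due, site))

    def dispatch(self, now):
        """Assign every due site to the currently least-loaded crawler."""
        assignments = []
        while self.schedule and self.schedule[0][0] <= now:
            _, site = heapq.heappop(self.schedule)
            crawler = min(self.load, key=self.load.get)
            self.load[crawler] += 1
            self.in_flight[site] = crawler
            assignments.append((site, crawler))
        return assignments

    def complete(self, site, now):
        """Called when a crawler reports completion; reschedule one period later."""
        crawler = self.in_flight.pop(site)
        self.load[crawler] -= 1
        heapq.heappush(self.schedule, (now + self.period[site], site))
```

Because a dispatched site leaves the schedule until `complete` is called, a frequently updated forum with a short period cannot be handed to two crawlers at once, while a slow forum with a long period consumes few crawler slots.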

Journal of Software | 2008

Credible Association Rule and Its Mining Algorithm Based on Maximum Clique

Bo Xiao; Qian-Fang Xu; Zhiqing Lin; Jun Guo; Chun-Guang Li

Existing association-rule mining algorithms mainly rely on a support-based pruning strategy to prune their combinatorial search space. This strategy is not very effective for mining potentially interesting low-support patterns. To solve this problem, the paper presents a novel concept of association pattern called the credible association rule (CAR), in which each item has the same support level. Confidence, rather than the traditional support, directly reflects the credibility of the rule. This paper also proposes a MaxCliqueMining algorithm, which creates 2-item credible sets using an adjacency matrix and then generates all rules based on maximum cliques. Several propositions are proved that establish the properties of CAR and the feasibility and validity of the algorithm. Experimental results on the alarm dataset and Pumsb dataset demonstrate the effectiveness and accuracy of this

Sino-Foreign Interchange Conference on Intelligent Science and Intelligent Data Engineering | 2011

An evaluation on different graphs for semi-supervised learning

Chun-Guang Li; Xianbiao Qi; Jun Guo; Bo Xiao

Graph-based Semi-Supervised Learning (SSL) has been an active topic in machine learning for about a decade. It is well known that graph construction is the central concern in recent work, since an effective graph structure can significantly boost the final performance. In this paper, we first review several different graphs for graph-based SSL. We then conduct a series of experiments on benchmark data sets to give a comprehensive evaluation of the advantages and shortcomings of each. Experimental results show that: a) when the data lie on independent subspaces and there are enough labeled data, the low-rank representation based method performs best, and b) in the majority of cases, the local sparse representation based method performs best, especially when the number of labeled data is small.

IEEE International Conference on Network Infrastructure and Digital Content | 2012

A practical approach to topic detection based on credible association rule mining

Lihua Wu; Bo Xiao; Zhiqing Lin; Yueming Lu

Topic detection aims to develop automatic methods for identifying topically related documents within a stream of data; many approaches have been developed to classify documents with predefined knowledge. This paper presents a new approach to topic detection and tracking based on the credible association rule (CAR). It considers topic detection without any prior knowledge of the category structure or the possible categories. Topic features are selected primarily based on CAR. Results on the test set show a marginal improvement from using CAR and its maximal clique mining algorithm. The CAR maximal clique mining algorithm has been applied in a real topic detection and tracking system, which has given us considerable experience in adjusting and refining the algorithm. The algorithm also provides many useful interface extensions for other modules of the system to use.
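
As a rough illustration of how maximal cliques could yield topic term groups, the sketch below treats each document as a set of terms, links two terms when they are mutually "credible", and enumerates maximal cliques with the basic Bron-Kerbosch algorithm. The credibility test used here (the smaller of the two directional confidences must reach a threshold) and the function names are assumptions; the abstract does not state the paper's exact criterion or algorithm.

```python
from itertools import combinations


def credible_pairs(documents, min_conf):
    """Build an adjacency map linking terms that pass an assumed credibility test."""
    support = {}
    pair_support = {}
    for doc in documents:
        terms = sorted(set(doc))
        for term in terms:
            support[term] = support.get(term, 0) + 1
        for a, b in combinations(terms, 2):
            pair_support[(a, b)] = pair_support.get((a, b), 0) + 1

    adj = {term: set() for term in support}
    for (a, b), count in pair_support.items():
        # Assumed criterion: min(conf(a -> b), conf(b -> a)) >= min_conf.
        if count / max(support[a], support[b]) >= min_conf:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


def topic_term_groups(documents, min_conf):
    """Keep cliques of two or more terms as candidate topic features."""
    adj = credible_pairs(documents, min_conf)
    return [c for c in maximal_cliques(adj) if len(c) >= 2]
```

Note that the test depends on relative co-occurrence rather than absolute frequency, so a rare but tightly co-occurring group of terms can still form a clique.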

IEEE International Conference on Network Infrastructure and Digital Content | 2012

A tweets recommendation algorithm based on user relationship and text emotional tendentiousness

Xi Wang; Bo Xiao; Zhiqing Lin; Yueming Lu

Micro-blog, also known as Twitter, is a platform based on user relationships for sharing, spreading, and obtaining information. How to share messages on micro-blogs efficiently according to users' interests, by analyzing user information, has therefore become a key research topic. Based on an analysis of micro-blog user relationships and the theory of the Analytic Hierarchy Process from management science, this paper establishes a user influence model to help explain how the target users affect the core user. Drawing on Word Activation Forces theory, we put forward an algorithm that expresses the emotional tendency of each noun in all the tweets. We then compute and select the corresponding tweets to recommend to the core user, and the experiment in this paper demonstrates the effectiveness of the algorithm.

Wireless Personal Communications | 2012

The Study of Content Security for Mobile Internet

Qianfang Xu; Jun Guo; Bo Xiao

While the vast amount of information carried over the Mobile Internet and its high speed provide unprecedented convenience for users, the Mobile Internet faces a growing threat from a lack of security. Maintaining and improving safety and security is crucial for the Mobile Internet to thrive and develop. At the content level, users face an increasing amount of malicious or spam content, which jeopardizes the public's interest in legitimate Internet content. Mobile Internet information security has therefore become an important research topic. In this paper, we first propose a framework for a content security management system for the Mobile Internet. We then discuss how to acquire relevant information from the Mobile Internet quickly and efficiently, how to process and analyze the vast amount of information collected, how to quickly discover negative or illegal information within the network, and how to provide detection and early warning for potential hot topics. We also study how to audit and evaluate the information content so that the relevant security management actions can be taken.

IEEE International Conference on Network Infrastructure and Digital Content | 2010

Blog extraction with template-independent wrapper

Zhixuan Zhang; Chuang Zhang; Zhiqing Lin; Bo Xiao

With the development of the blogosphere, millions of users all around the world contribute rich information to blogs. However, little work has been done on blog extraction so far. Unlike the traditional template-dependent wrapper, the template-independent wrapper in this paper extracts not only blog articles but also the blogroll. In our method, blog extraction is formalized as a machine learning problem, and a template-independent wrapper is learned from labeled blog pages from a single site. Testing pages are obtained from 10 popular Chinese blog sites. Experimental results on 300 real blog pages indicate that the proposed method can correctly extract data from blogs with an accuracy of 90% or above.

IEEE International Conference on Network Infrastructure and Digital Content | 2009

A visualization algorithm for alarm association mining

Qianfang Xu; Chunguang Li; Bo Xiao; Jun Guo

Current algorithms for mining alarm association rules are constrained by the minimum support threshold, so they can only obtain association rules among frequently occurring alarm events; furthermore, the rules cannot be displayed visually. This paper presents a novel visualization algorithm for alarm correlation mining based on non-linear reduced-feature mapping. The algorithm first projects the alarms into a multidimensional space according to their co-occurrence strength, then reduces the dimensionality of the space, and finally presents the relationships among the alarms to the user visually. Experimental results on synthetic and real datasets demonstrate that this algorithm not only discovers correlations among alarms but also identifies faults in the telecommunications network based on the graph transformation.