Yuhua Li

Huazhong University of Science and Technology

Publication

Featured research published by Yuhua Li.

Conference on Information and Knowledge Management | 2009

Community mining on dynamic weighted directed graphs

Dongsheng Duan; Yuhua Li; Yanan Jin; Zhengding Lu

This paper focuses on community mining, including community discovery and change-point detection, on dynamic weighted directed graphs (DWDG). Real networks such as e-mail, co-author and financial networks can be modeled as DWDG. Community mining on DWDG has not been studied thoroughly, although community mining on static (or dynamic undirected unweighted) graphs has been studied extensively. In this paper, Stream-Group is proposed to solve community mining on DWDG. For community discovery, a two-step approach is presented to discover the community structure of a weighted directed graph (WDG) in one time-slice: (1) the first step constructs compact communities according to each node's single compactness, which indicates the degree to which a node belongs to a community in terms of the graph's relevance matrix; (2) the second step merges compact communities along the direction of maximum increment of the modularity. For change-point detection, a measure of the similarity between partitions is presented to determine whether a change-point appears along the time axis, and an incremental algorithm is presented to update the partition of a graph segment when a newly arriving graph is added to the segment. The effectiveness and efficiency of our algorithms are validated by experiments on both synthetic and real networks. Results show that our algorithms achieve a good trade-off between effectiveness and efficiency in discovering communities and change-points.
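
As a rough illustration of the quantity that the merging step tries to increase, the sketch below computes modularity for a weighted directed graph. The abstract does not give the paper's formula, so this assumes the commonly used directed form Q = (1/m) Σ_ij [A_ij − k_i^out k_j^in / m] δ(c_i, c_j); the function name and the edge-dictionary input format are illustrative, not taken from the paper.

```python
from collections import defaultdict

def directed_modularity(edges, community):
    """Modularity of a partition of a weighted directed graph.

    edges: dict mapping (source, target) -> non-negative weight.
    community: dict mapping node -> community label.
    Assumes the standard directed modularity, not necessarily the
    paper's exact definition.
    """
    m = sum(edges.values())  # total edge weight
    if m == 0:
        return 0.0

    # Weighted out-degree and in-degree of every node.
    out_w = defaultdict(float)
    in_w = defaultdict(float)
    for (u, v), w in edges.items():
        out_w[u] += w
        in_w[v] += w

    # Observed weight that stays inside a community.
    internal = sum(w for (u, v), w in edges.items()
                   if community[u] == community[v])

    # Expected internal weight under the null model:
    # sum over communities of (total out-weight * total in-weight) / m.
    out_c = defaultdict(float)
    in_c = defaultdict(float)
    for node, label in community.items():
        out_c[label] += out_w[node]
        in_c[label] += in_w[node]
    expected = sum(out_c[label] * in_c[label] for label in out_c) / m

    return (internal - expected) / m
```

A greedy merge in the spirit of step (2) would evaluate this value for each candidate pair of communities and merge the pair that yields the largest increase.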

Artificial Intelligence Review | 2012

Incremental K-clique clustering in dynamic social networks

Dongsheng Duan; Yuhua Li; Ruixuan Li; Zhengding Lu

Clustering entities into dense parts is an important issue in social network analysis. Real social networks usually evolve over time, and efficiently clustering dynamic social networks remains an open problem. In this paper, a dynamic social network is modeled as an initial graph with an infinite change stream, called the change stream model, which naturally eliminates the parameter setting problem of the snapshot graph model. Based on the change stream model, the incremental version of the well-known k-clique clustering problem is studied, and incremental k-clique clustering algorithms are proposed based on a local DFS (depth-first search) forest updating technique. It is theoretically proved that the proposed algorithms outperform the corresponding static ones and the incremental spectral clustering algorithm in terms of time complexity. The practical performance of our algorithms is extensively evaluated and compared with the baseline algorithms on the ENRON and DBLP datasets. Experimental results show that the incremental k-clique clustering algorithms are much more efficient than the corresponding static ones, do not suffer from the accumulating errors of the incremental spectral clustering algorithm, and can capture the evolving details of the clusters that snapshot graph model based algorithms miss.
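
For readers unfamiliar with the static problem, here is a minimal brute-force sketch of k-clique clustering. It assumes the clique-percolation definition (a cluster is the union of k-cliques that can be reached from one another through cliques sharing k-1 nodes) and is only suitable for small graphs. It is not the incremental DFS-forest algorithm proposed in the paper, and the helper names are illustrative.

```python
from itertools import combinations

def build_adjacency(edges):
    """Turn an iterable of undirected (u, v) pairs into an adjacency dict."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def k_clique_clusters(adj, k):
    """Static, brute-force k-clique clustering (clique percolation)."""
    nodes = sorted(adj)
    # Enumerate all k-cliques by checking every k-subset of nodes.
    cliques = [frozenset(c) for c in combinations(nodes, k)
               if all(v in adj[u] for u, v in combinations(c, 2))]

    # Union-find over cliques: adjacent cliques share exactly k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # Each cluster is the union of the nodes of its connected cliques.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(group) for group in groups.values())
```

Recomputing this from scratch after every edge change is what an incremental algorithm is meant to avoid; clusters may overlap, as node 4 does in a graph where two groups of triangles meet at a single node.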

International Conference on Data Mining | 2012

RankTopic: Ranking Based Topic Modeling

Dongsheng Duan; Yuhua Li; Ruixuan Li; Rui Zhang; Aiming Wen

Topic modeling has become a widely used tool for document management due to its superior performance. However, few topic models distinguish the importance of documents on different topics. In this paper, we investigate how to utilize the importance of documents to improve topic modeling and propose to incorporate link based ranking into topic modeling. Specifically, topical PageRank is used to compute the topic level ranking of documents, which indicates the importance of documents on different topics. By treating the topical ranking of a document as the probability of the document being involved in the corresponding topic, a generalized relation is built between ranking and topic modeling. Based on this relation, a ranking based topic model, RankTopic, is proposed. With RankTopic, a mutual enhancement framework is established between ranking and topic modeling. Extensive experiments on paper citation data and Twitter data are conducted to compare the performance of RankTopic with that of some state-of-the-art topic models. Experimental results show that, with a properly set balancing parameter, RankTopic performs much better than some baseline models and is comparable with the state-of-the-art link combined relational topic model (RTM) in generalization performance, document clustering and classification. It is also demonstrated in both quantitative and qualitative ways that topics detected by RankTopic are more interpretable than those detected by some baseline models and still competitive with RTM.
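
The idea of a per-topic ranking can be sketched with a topic-biased (personalized) PageRank, run once per topic with the teleport vector weighted by each document's affinity to that topic. This is a simplification chosen for illustration; the abstract does not specify the exact topical PageRank formulation, and the function name, the damping value and the input format are assumptions.

```python
def topic_biased_pagerank(links, topic_weight, damping=0.85, iterations=100):
    """Power-iteration PageRank with a topic-weighted teleport vector.

    links: dict mapping document -> list of documents it links to (cites).
    topic_weight: dict mapping every document -> non-negative weight of
        the document on one topic (an assumed input, e.g. from a topic model).
    Returns a dict of scores that sum to 1 for this topic.
    """
    nodes = list(topic_weight)
    total = sum(topic_weight.values())
    if total <= 0:
        raise ValueError("topic weights must have a positive sum")
    teleport = {n: topic_weight[n] / total for n in nodes}

    rank = dict(teleport)
    for _ in range(iterations):
        # Teleport mass, distributed according to topic affinity.
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            targets = links.get(n, [])
            if targets:
                share = damping * rank[n] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling document: spread its mass via the teleport vector.
                for target in nodes:
                    new_rank[target] += damping * rank[n] * teleport[target]
        rank = new_rank
    return rank
```

Because the scores for one topic sum to 1, they can be read as a distribution over documents, which is the kind of probabilistic interpretation the abstract describes.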

IEEE Transactions on Knowledge and Data Engineering | 2014

LIMTopic: A Framework of Incorporating Link Based Importance into Topic Modeling

Dongsheng Duan; Yuhua Li; Ruixuan Li; Rui Zhang; Xiwu Gu; Kunmei Wen

Topic modeling has become a widely used tool for document management. However, few topic models distinguish the importance of documents on different topics. In this paper, we propose a framework, LIMTopic, to incorporate link based importance into topic modeling. To instantiate the framework, RankTopic and HITSTopic are proposed by incorporating topical PageRank and topical HITS into topic modeling, respectively. Specifically, ranking methods are first used to compute the topical importance of documents. Then, a generalized relation is built between link importance and topic modeling. We empirically show that LIMTopic converges after a small number of iterations in most experimental settings. The necessity of incorporating link importance into topic modeling is justified based on KL divergences between topic distributions converted from topical link importance and those computed by basic topic models. To investigate the document network summarization performance of topic models, we propose a novel measure called the log-likelihood of the ranking-integrated document-word matrix. Extensive experimental results show that LIMTopic performs better than baseline models in generalization performance, document clustering and classification, topic interpretability and document network summarization performance. Moreover, RankTopic has comparable performance with the relational topic model (RTM), and HITSTopic performs much better than baseline models in document clustering and classification.
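
The KL-divergence comparison mentioned above can be illustrated with a small sketch. It assumes that a document's topical importance scores are converted to a topic distribution by simple normalization, which the abstract does not confirm; the function names and the smoothing constant are illustrative.

```python
import math

def normalize(scores):
    """Convert non-negative scores into a probability distribution."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return [s / total for s in scores]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) = sum_i p_i * log(p_i / q_i), in nats.

    Terms with p_i == 0 contribute nothing; q_i is floored at eps
    so that a zero in q does not cause a division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)
```

A large divergence between the importance-derived distribution and the one produced by a basic topic model indicates that link importance carries information the basic model does not capture.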

Artificial Intelligence Review | 2014

Detecting network communities using regularized spectral clustering algorithm

Liang Huang; Ruixuan Li; Hong Chen; Xiwu Gu; Kunmei Wen; Yuhua Li

The steadily growing scale of online social networks makes it difficult for traditional algorithms to detect communities. We introduce an efficient and fast algorithm to detect community structure in social networks. Instead of using the eigenvectors as in spectral clustering algorithms, we construct a target function for detecting communities. The communities of the whole social network are partitioned using this target function. We also analyze and estimate the generalization error of the algorithm. The performance of the algorithm is compared with that of the standard spectral clustering algorithm on different well-known instances of social networks with a community structure, both computer generated and from the real world. The experimental results demonstrate the effectiveness of the algorithm.

Artificial Intelligence Review | 2014

Name disambiguation in scientific cooperation network by exploiting user feedback

Yuhua Li; Aiming Wen; Quan Lin; Ruixuan Li; Zhengding Lu

Name disambiguation is a critical problem in scientific cooperation networks. Ambiguous author names may occur due to the existence of multiple authors with the same name. Although much research has been conducted, the problem is still not resolved and is becoming even more serious. In this paper, we focus on this problem. A method of exploiting user feedback for name disambiguation in scientific cooperation networks is proposed, which makes use of user feedback to enhance performance. Furthermore, to make the user feedback more effective, we divide user feedback into three types and assign different weights to them. To evaluate the effectiveness of our proposed method, experiments are conducted with standard public collections. We compare the performance of our proposal with baseline methods. Results show that the proposed algorithm outperforms previous methods that do not use user interaction. In addition, we investigate how different types of user feedback affect the disambiguation results.

Multimedia and Ubiquitous Engineering | 2009

A Directed Labeled Graph Frequent Pattern Mining Algorithm Based on Minimum Code

Yuhua Li; Quan Lin; Gang Zhong; Dongsheng Duan; Yanan Jin; Wei Bi

Most existing frequent subgraph mining algorithms deal with undirected unlabeled marked graphs. A few of them target directed graphs or labeled graphs. But in the real world, many connections have a direction and a label, so directed labeled graph mining is more meaningful. We are analyzing a financial network that can be modeled as a directed weighted graph, and we are interested in the patterns that are frequent. A graph pattern is a uniform expression of graphs that have different marked nodes but the same structure. In our application we consider direction and weight as part of the pattern. This differs from subgraph mining, which considers the labels of nodes. This paper proposes a new algorithm, mSpan, for directed labeled graph frequent pattern mining. Based on FP-growth, the algorithm obtains a minimum edge code and an abstract node code sequence to identify a directed graph pattern uniquely through minimum extension. It can solve the graph pattern isomorphism problem and the redundant extension problem. The experiment shows that mSpan can mine all frequent directed, labeled graph patterns.

Web Age Information Management | 2007

Towards a type-2 fuzzy description logic for semantic search engine

Ruixuan Li; Xiaolin Sun; Zhengding Lu; Kunmei Wen; Yuhua Li

Classical description logics are limited to dealing with crisp concepts and relationships, which makes it difficult to represent and process imprecise information in real applications. In this paper we present a type-2 fuzzy version of ALC and describe its syntax, semantics and reasoning algorithms, as well as the implementation of the logic with type-2 fuzzy OWL. Compared with type-1 fuzzy ALC, a system based on type-2 fuzzy ALC can define imprecise knowledge more exactly by using membership degree intervals. To evaluate the ability of type-2 fuzzy ALC to handle vague information, we apply it to a semantic search engine for building the fuzzy ontology and carry out experiments comparing it with other search schemes. The experimental results show that the type-2 fuzzy ALC based system can increase the number of relevant hits and improve the precision of the semantic search engine.
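
To make the notion of a membership degree interval concrete, the sketch below models interval-valued membership and the usual connectives. It assumes Zadeh-style min/max semantics applied to the interval bounds, which is one common choice; the abstract does not state the paper's exact semantics, and the class and function names are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MembershipInterval:
    """Membership degree given as an interval [lower, upper] within [0, 1]."""
    lower: float
    upper: float

    def __post_init__(self):
        if not 0.0 <= self.lower <= self.upper <= 1.0:
            raise ValueError("need 0 <= lower <= upper <= 1")

def conjunction(a, b):
    """Concept intersection: minimum applied to each bound."""
    return MembershipInterval(min(a.lower, b.lower), min(a.upper, b.upper))

def disjunction(a, b):
    """Concept union: maximum applied to each bound."""
    return MembershipInterval(max(a.lower, b.lower), max(a.upper, b.upper))

def negation(a):
    """Concept complement: bounds are flipped around 1."""
    return MembershipInterval(1.0 - a.upper, 1.0 - a.lower)
```

A type-1 fuzzy degree is the special case where lower equals upper, so an interval such as [0.25, 0.5] expresses uncertainty about the degree itself.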

Tsinghua Science & Technology | 2010

Disambiguating authors by pairwise classification

Quan Lin; Bo Wang; Yuan Du; Xuezhi Wang; Yuhua Li; Songcan Chen

Name ambiguity is a critical problem in many applications, in particular in online bibliography systems such as DBLP, ACM, and CiteSeerX. Despite many studies, this problem is still not resolved and is becoming even more serious, especially with the increasing popularity of Web 2.0. This paper addresses the problem in the academic researcher social network ArnetMiner using a supervised method that exploits all side information, including co-author, organization, paper citation, title similarity, author's homepage, web constraint, and user feedback. The method automatically determines the person number k. Tests on the researcher social network with up to 100 different names show that the method significantly outperforms the baseline method, which uses an unsupervised attribute-augmented graph clustering algorithm.

IEEE International Conference on Services Computing | 2004

Ontology-based universal knowledge grid: enabling knowledge discovery and integration on the grid

Yuhua Li; Zhengding Lu

This paper proposes an ontology-based grid architecture model, the universal knowledge grid (UKG), for building large-scale distributed knowledge systems on the grid. UKG emphasizes geographically distributed high-performance knowledge discovery applications and knowledge integration services. Five core components have been identified: an intelligent composer serving as the user interface; an ontology server providing data integration, data mining, and knowledge integration ontology services; a metadata directory server maintaining the metadata that describes all the data, tools and knowledge; a metadata database (db) that stores the metadata; and a knowledge base serving as a container for knowledge. An application example on foreign exchange management (FEM) is also presented.