Xiujuan Xu | Researchain


Publication

Featured research published by Xiujuan Xu.

Expert Systems With Applications | 2009

Credit scoring algorithm based on link analysis ranking with support vector machine

Xiujuan Xu; Chunguang Zhou; Zhe Wang

Credit scoring is very important in business, especially in banking. The goal is to classify an applicant as a good or a bad credit risk by evaluating his or her credit. We systematically propose three link analysis algorithms that preprocess the input of a support vector machine, to estimate an applicant's credit and so decide whether a bank should grant the applicant a loan. The proposed algorithms have two major phases, called the input weighted adjustor and classification by support vector machine-based models. In the first phase, we derive the link relations among applicants by link analysis and integrate these relations, obtained from the applicants' information, into the input vector of the next phase. In the second phase, an algorithm based on the general support vector machine model is proposed. A real-world credit dataset is used to evaluate the performance of the proposed algorithms with 10-fold cross-validation. The results show that the genetic link analysis ranking methods achieve higher classification accuracy.
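
The 10-fold cross-validation protocol mentioned in the abstract can be sketched as follows. This is a generic illustration and not the authors' code: the helper names `k_fold_indices` and `cross_validate`, the toy labels, and the majority-class stand-in classifier are all assumptions; a real experiment would train the SVM-based models on each training split instead.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def cross_validate(labels, fit_predict, k=10):
    """Mean accuracy over k folds; fit_predict(train_idx, test_idx) returns predictions."""
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        predicted = fit_predict(train_idx, test_idx)
        correct = sum(p == labels[j] for p, j in zip(predicted, test_idx))
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k

# Made-up labels: 1 = good credit, 0 = bad credit.
labels = [1] * 70 + [0] * 30

def majority_baseline(train_idx, test_idx):
    # Hypothetical stand-in for a trained model: predict the majority training label.
    ones = sum(labels[j] for j in train_idx)
    majority = 1 if ones * 2 >= len(train_idx) else 0
    return [majority] * len(test_idx)

mean_accuracy = cross_validate(labels, majority_baseline)
```

Each sample is used for testing exactly once, so the averaged accuracy is not tied to a single lucky or unlucky train/test split.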

Applied Mathematics and Computation | 2007

RFIMiner: A regression-based algorithm for recently frequent patterns in multiple time granularity data streams

Lifeng Jia; Zhe Wang; Nan Lu; Xiujuan Xu; Dongbin Zhou; Yan Wang

In this paper, we propose an algorithm for computing and maintaining recently frequent patterns, which are more stable and smaller than the data stream itself, and for dynamically updating them with incoming transactions. Our study makes two main contributions. First, a regression-based data stream model is proposed to differentiate new and old transactions. The model maps transactions onto multiple time granularities and can automatically adjust the transactional fading rate through a fading factor, which defines the desired lifetime of transaction information in the data stream. Second, we develop RFIMiner, a single-scan algorithm for mining recently frequent patterns from data streams. The algorithm exploits a special property of suffix-trees, so the suffix-trees need not be traversed when patterns are discovered. To suit suffix-trees, we also adopt a new method called Depth-first and Bottom-up Inside Itemset Growth to find more recently frequent patterns from known frequent ones. Moreover, it avoids redundant computation and the generation of candidate patterns. We conduct detailed experiments to evaluate the performance of the algorithm in several aspects. The results confirm that the new method scales well and mines recently frequent itemsets in data streams with better quality and efficiency.

granular computing | 2006

An agent-based dual-tier algorithm for clustering data streams

Dongbin Zhou; Lifeng Jia; Zhe Wang; Xiujuan Xu; Chunguang Zhou

The characteristics of data streams make it difficult for clustering algorithms to satisfy the requirements on efficiency and effectiveness. This paper proposes a data stream clustering algorithm with a dual-tier structure that employs the agent method. In the on-line process, a set of agents working simultaneously collect similar data points into sub-clusters by applying a heuristic strategy. In the off-line process, summary information from the on-line component is further analyzed to obtain the final clusters. The algorithm also supports time-window queries on streams. The empirical evidence shows that this method can obtain high-quality clusters with low time complexity.

... analysis over an arbitrary period of the stream, etc. As for stream clustering, a common method is to divide the streaming data into chunks, so that algorithms for static sets can be used on each subset separately (2). In recent years, stream algorithms have developed into a two-phase structure (3), (4). Usually, a dual framework includes two parts: the on-line component and the off-line component. The former is responsible for the fast but rough processing of streaming data and for saving the summary information to meet the one-pass restriction, while the latter takes advantage of that information to conduct high-level analysis. At present, stream algorithms still face some problems, for example: sensitivity to the initial data points; poor cluster quality due to the loss of global information caused by dividing the stream; and high time complexity. A novel dual-tier clustering algorithm for data streams, AGCluStream, is proposed in this paper. The on-line algorithm uses agents to make similar points denser in local areas and records the temporary distribution of data according to the pyramidal time frame (3). The off-line algorithm uses these records to conduct time-window analysis and higher-level clustering analysis. AGCluStream does not divide the stream, and it adopts an incomplete-partition strategy to maintain the global information more effectively.
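
The on-line/off-line split described above can be illustrated with a minimal sketch. It is not AGCluStream: the agent heuristics and the pyramidal time frame are omitted, and the `MicroCluster` class, the `radius` and `merge_dist` thresholds, and the toy stream are all assumptions made for illustration. The on-line step keeps only compact summaries in one pass, and the off-line step groups those summaries into final clusters.

```python
import math

def distance(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class MicroCluster:
    """Summary of nearby points: a count and a per-dimension linear sum."""
    def __init__(self, point):
        self.n = 1
        self.linear_sum = list(point)

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def absorb(self, point):
        self.n += 1
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]

def online_step(micro_clusters, point, radius):
    """One-pass update: absorb the point into the nearest summary or open a new one."""
    best, best_dist = None, float("inf")
    for mc in micro_clusters:
        d = distance(mc.centroid(), point)
        if d < best_dist:
            best, best_dist = mc, d
    if best is not None and best_dist <= radius:
        best.absorb(point)
    else:
        micro_clusters.append(MicroCluster(point))

def offline_merge(micro_clusters, merge_dist):
    """Group summaries whose centroids are chained within merge_dist (single linkage)."""
    parent = list(range(len(micro_clusters)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    centroids = [mc.centroid() for mc in micro_clusters]
    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            if distance(centroids[i], centroids[j]) <= merge_dist:
                parent[find(i)] = find(j)

    groups = {}
    for i, mc in enumerate(micro_clusters):
        groups.setdefault(find(i), []).append(mc)
    return list(groups.values())

# Made-up two-dimensional stream with two dense regions.
stream = [(0.0, 0.0), (0.1, 0.1), (5.0, 5.0), (5.1, 4.9), (0.9, 0.0), (5.0, 5.2)]
summaries = []
for p in stream:
    online_step(summaries, p, radius=0.5)
final_clusters = offline_merge(summaries, merge_dist=1.5)
```

Because the off-line step works on the summaries rather than on the raw points, it can be rerun with different parameters without rereading the stream.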

fuzzy systems and knowledge discovery | 2005

SuffixMiner: efficiently mining frequent itemsets in data streams by suffix-forest

Lifeng Jia; Chunguang Zhou; Zhe Wang; Xiujuan Xu

We propose a new algorithm, SuffixMiner, which eliminates the need for multiple passes through the data when finding all frequent itemsets in data streams, takes full advantage of a special property of suffix-trees to avoid generating candidate itemsets and traversing each suffix-tree during itemset growth, and uses a new itemset growth method to mine all frequent itemsets in data streams. Experimental results show that SuffixMiner not only scales well when mining frequent itemsets over data streams, but also outperforms the Apriori and FP-Growth algorithms.

advanced data mining and applications | 2005

Mining recent frequent itemsets in data streams by radioactively attenuating strategy

Lifeng Jia; Zhe Wang; Chunguang Zhou; Xiujuan Xu

We propose a novel approach for mining recent frequent itemsets. The approach has three key contributions. First, it is a single-scan algorithm that uses a special property of suffix-trees to guarantee that all frequent itemsets are mined. During itemset growth, it is unnecessary to traverse the suffix-trees, which are the data structure that stores the summary information of the data. Second, our algorithm adopts a novel method for itemset growth, which includes two special kinds of itemset growth operations, to avoid generating any candidate itemset. Third, we devise a new regressive strategy modeled on the natural decay of radioactive elements and apply it in the algorithm to distinguish the influence of the latest transactions from that of obsolete ones. We conduct detailed experiments to evaluate the algorithm. The results confirm that the new method scales well and achieves better quality and efficiency.
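
The abstract does not give the exact form of the attenuating strategy. As a hedged illustration, the sketch below applies the standard radioactive decay law, N(t) = N0 * exp(-lambda * t) with lambda = ln 2 / half-life, to item counts, so that old occurrences fade while recent ones dominate. The `DecayedCounter` class, its `half_life` parameter, and the toy transactions are assumptions, not the authors' implementation.

```python
import math

class DecayedCounter:
    """Item counts in which every past occurrence fades by exp(-lam * age)."""
    def __init__(self, half_life):
        self.lam = math.log(2) / half_life  # decay constant derived from the half-life
        self.counts = {}     # item -> decayed count at its last update
        self.last_seen = {}  # item -> time of its last update

    def value(self, item, now):
        """Decayed count of the item as of time `now`."""
        if item not in self.counts:
            return 0.0
        age = now - self.last_seen[item]
        return self.counts[item] * math.exp(-self.lam * age)

    def add(self, item, now):
        """Fade the stored count to `now`, then add the new occurrence."""
        self.counts[item] = self.value(item, now) + 1.0
        self.last_seen[item] = now

# Made-up stream: one transaction per time step.
counter = DecayedCounter(half_life=10)
for t, transaction in enumerate([{"a", "b"}, {"a"}, {"b", "c"}]):
    for item in transaction:
        counter.add(item, now=t)
```

Updating lazily, only when an item is seen or queried, avoids touching every stored count at every time step, which matters in a single-scan setting.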

Lecture Notes in Computer Science | 2005

DualRank: a dual-phase algorithm for optimal profit mining in retailing market

Xiujuan Xu; Lifeng Jia; Zhe Wang; Chunguang Zhou

We systematically propose a dual-phase algorithm, DualRank, to mine the optimal profit in the retail market. DualRank has two major phases, called the general profit mining phase and the profit optimizing phase. In the first phase, the novel sub-algorithm ItemRank integrates the random distribution of items into profit mining to improve the ordering of items. In the second phase, two novel optimizing sub-algorithms are proposed to improve the results generated in the first phase. Taking into account the cross-selling effect and the self-profit of items, DualRank can solve the item ordering problem objectively and mechanically. We conduct detailed experiments to evaluate DualRank, and the results confirm that the new method is well suited to profit mining, with better quality and efficiency.
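
The abstract does not define ItemRank or the optimizing sub-algorithms. As a rough illustration of how cross-selling and self-profit could be combined in a link-analysis ranking, the sketch below runs a PageRank-style power iteration in which each item's own profit acts as the prior and score flows along cross-selling links. The `rank_items` function, the `credit` graph (where `credit[a][b]` is the share of item `a`'s sales attributed to item `b`), and the toy numbers are all hypothetical.

```python
def rank_items(credit, profit, damping=0.85, iterations=100):
    """PageRank-style scores: self-profit prior plus score passed along credit links."""
    items = sorted(profit)
    total_profit = sum(profit.values())
    base = {i: profit[i] / total_profit for i in items}  # normalized self-profit
    rank = dict(base)
    for _ in range(iterations):
        new_rank = {i: (1 - damping) * base[i] for i in items}
        for src in items:
            out = credit.get(src, {})
            out_total = sum(out.values())
            if out_total == 0:
                # Item credits no other item: spread its score by the profit prior.
                for dst in items:
                    new_rank[dst] += damping * rank[src] * base[dst]
            else:
                for dst, weight in out.items():
                    new_rank[dst] += damping * rank[src] * weight / out_total
        rank = new_rank
    return rank

# Made-up example: ink and paper sales are attributed to the printer.
profit = {"printer": 1.0, "ink": 3.0, "paper": 1.0}
credit = {"ink": {"printer": 1.0}, "paper": {"printer": 1.0}}
rank = rank_items(credit, profit)
ordering = sorted(rank, key=rank.get, reverse=True)
```

In this toy example the printer ranks first despite its low self-profit, because the sales of the more profitable ink are credited to it.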

Pattern Recognition Letters | 2009