Yuejin Zhang

Central University of Finance and Economics

Publication

Featured research published by Yuejin Zhang.

international conference on data mining | 2010

Find Intelligent Knowledge by Second-Order Mining: Three Cases from China

Guangli Nie; Lingling Zhang; Yuejin Zhang; Wei Deng; Yong Shi

This paper discusses second-order mining of the results of data mining. There is still a gap between the knowledge that can direct a company's operations and the knowledge obtained from data mining. We treat the knowledge from data mining as primary knowledge and the knowledge from second-order data mining as intelligent knowledge. We discuss the importance of intelligent knowledge and how to derive it from primary knowledge through a second-order mining process. Three cases from China demonstrate the methods for carrying out second-order mining. These cases concern central bank credit scoring, credit card churn prediction, and email services, respectively.

Procedia Computer Science | 2017

Determinants of loan funded successful in online P2P Lending.

Yuejin Zhang; Haifeng Li; Mo Hai; Jiaxuan Li; Aihua Li

Online peer-to-peer (P2P) lending has developed rapidly in recent years. In this article, an empirical study was conducted using a public dataset from Paipaidai, the largest online P2P lending platform in China. We analyze the factors that determine the probability of obtaining a loan in online P2P lending. The results indicate that annual interest rate, repayment period, description, credit grade, number of successful loans, number of failed loans, gender, and borrowed credit score are significant factors in whether a loan is successfully funded on the Paipaidai platform.

international conference on data mining | 2017

Finding Top-k Fuzzy Frequent Itemsets from Databases

Haifeng Li; Yue Wang; Ning Zhang; Yuejin Zhang

Frequent itemset mining is an important task in data mining. Fuzzy data mining can describe the results of frequent itemset mining more accurately. Nevertheless, the full set of frequent itemsets is redundant for users. A better approach is to show only the top-k results. In this paper, we define the score of a fuzzy frequent itemset and pose the problem of top-k fuzzy frequent itemset mining, which, to the best of our knowledge, has not been studied before. To address this problem, we employ a data structure named TopKFFITree to store a superset of the mining results, which is significantly smaller than the set of all fuzzy frequent itemsets. We then present an algorithm named TopK-FFI to build and maintain the data structure. In this algorithm, we prune most of the fuzzy frequent itemsets immediately, based on the monotonicity of the itemset score. Theoretical analysis and experimental studies over 4 datasets demonstrate that our proposed algorithm efficiently decreases runtime and memory cost and significantly outperforms the naive algorithm Top-k-FFI-Miner.
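
The pruning idea in the abstract above can be illustrated with a small sketch. It is not the paper's TopK-FFI algorithm or its TopKFFITree structure. It assumes, purely for illustration, that the score of an itemset is its fuzzy support: the sum over transactions of the minimum membership degree of its items. Under that assumption the score never increases when an item is added, so any candidate that cannot enter the current top-k can be discarded together with all of its supersets. The function names are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Transaction = Dict[str, float]  # item -> membership degree in [0, 1]


def fuzzy_support(itemset: Tuple[str, ...], transactions: List[Transaction]) -> float:
    """Assumed score: sum over transactions of the minimum membership degree."""
    return sum(min(t.get(item, 0.0) for item in itemset) for t in transactions)


def top_k_fuzzy_itemsets(
    transactions: List[Transaction], k: int
) -> List[Tuple[float, Tuple[str, ...]]]:
    """Return up to k (score, itemset) pairs with the highest fuzzy support."""
    items = sorted({item for t in transactions for item in t})
    heap: List[Tuple[float, Tuple[str, ...]]] = []  # min-heap of the current top-k

    def extend(prefix: Tuple[str, ...], start: int) -> None:
        for idx in range(start, len(items)):
            candidate = prefix + (items[idx],)
            score = fuzzy_support(candidate, transactions)
            # Supersets cannot score higher than the candidate, so a candidate
            # that fails to beat the current k-th best prunes its whole subtree.
            if score <= 0.0 or (len(heap) == k and score <= heap[0][0]):
                continue
            if len(heap) < k:
                heapq.heappush(heap, (score, candidate))
            else:
                heapq.heapreplace(heap, (score, candidate))
            extend(candidate, idx + 1)

    extend((), 0)
    return sorted(heap, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    db = [
        {"a": 0.8, "b": 0.6},
        {"a": 0.5, "c": 0.9},
        {"a": 0.7, "b": 0.4, "c": 0.2},
    ]
    for score, itemset in top_k_fuzzy_itemsets(db, k=2):
        print(itemset, round(score, 3))
```

The min-heap keeps the k-th best score at its root, so the pruning threshold rises as better itemsets are found and more of the search space is skipped.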

Procedia Computer Science | 2017

A Performance Evaluation of Classification Algorithms for Big Data

Mo Hai; You Zhang; Yuejin Zhang

The performance of two typical classification algorithms in Spark, random forest and naive Bayes, is evaluated using four metrics: classification accuracy, speedup, scaleup, and sizeup. Experiments are performed on datasets and clusters of different scales. The results show that: (1) the accuracy of the two algorithms is high; (2) speedup does not increase linearly, and the number of nodes at which speedup is maximal differs with dataset size; (3) the scaleup of random forest peaks when the number of nodes is 2, and then decreases as the number of nodes increases; (4) for random forest, when the number of nodes is 2, sizeup increases sharply with dataset size, and when the number of nodes is greater than 2, sizeup increases more slowly with dataset size; for naive Bayes, when the number of nodes is smaller than 6, sizeup increases with dataset size, but when the number of nodes is 6 and the dataset is larger than Sogou_5, sizeup changes little as the dataset grows.
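
The abstract above does not define the three scalability metrics. The sketch below uses the definitions common in the parallel data mining literature, which is an assumption about what the paper measures: speedup fixes the dataset and adds nodes, scaleup grows the dataset and the cluster by the same factor, and sizeup fixes the cluster and grows the dataset. The function and parameter names are hypothetical, and all arguments are measured running times in the same unit.

```python
def speedup(time_one_node: float, time_m_nodes: float) -> float:
    """Same dataset on 1 node vs. m nodes; the ideal value is m."""
    return time_one_node / time_m_nodes


def scaleup(time_base_one_node: float, time_scaled_m_nodes: float) -> float:
    """Base dataset on 1 node vs. an m-times larger dataset on m nodes.

    The ideal value is 1.0, meaning run time stays flat as both grow together.
    """
    return time_base_one_node / time_scaled_m_nodes


def sizeup(time_scaled_data: float, time_base_data: float) -> float:
    """Same cluster; a k-times larger dataset vs. the base dataset.

    A value below k means the run time grows more slowly than the data.
    """
    return time_scaled_data / time_base_data


if __name__ == "__main__":
    # Illustrative timings in seconds, not measurements from the paper.
    print(speedup(100.0, 40.0))
    print(scaleup(100.0, 125.0))
    print(sizeup(300.0, 100.0))
```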

Procedia Computer Science | 2017

Discovering Top-k Probabilistic Frequent Itemsets from Uncertain Databases

Haifeng Li; Yuejin Zhang; Ning Zhang

Probabilistic frequent itemset mining finds the itemsets whose support exceeds a threshold with a given probabilistic confidence in an uncertain database. Nevertheless, when the threshold is small, the mining results become massive and are not easy for users to understand. In this paper, we focus on this problem and propose a method to obtain the top-k probabilistic frequent itemsets, which, to the best of our knowledge, has not been addressed before. A scoring function is defined to evaluate the level of itemsets. We introduce a compact data structure, named TopKPFITree, to maintain the mining results and some other information. Furthermore, an efficient algorithm, TopKPFIM, is proposed to build the TopKPFITree and obtain the results. Our experimental results over uncertain datasets show that our algorithm significantly outperforms the naive algorithm.
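
To make the notion of a probabilistic frequent itemset concrete, the sketch below computes the probability that an itemset's support reaches a minimum support count. It is not the paper's TopKPFIM algorithm or scoring function. It assumes that each transaction contains the itemset independently with a known probability, so the support follows a Poisson binomial distribution that can be evaluated with a short dynamic program. An itemset would then count as probabilistic frequent when this probability meets the chosen confidence. The function name and parameters are hypothetical.

```python
from typing import List


def frequentness_probability(contain_probs: List[float], min_sup: int) -> float:
    """Return P(support >= min_sup) under independent per-transaction probabilities.

    contain_probs[i] is the probability that transaction i contains the itemset.
    """
    # dist[j] = probability that exactly j processed transactions contain the
    # itemset; the last slot accumulates every count of min_sup or more.
    dist = [1.0] + [0.0] * min_sup
    for p in contain_probs:
        updated = [0.0] * (min_sup + 1)
        for count, mass in enumerate(dist):
            if mass == 0.0:
                continue
            updated[count] += mass * (1.0 - p)             # itemset absent
            updated[min(count + 1, min_sup)] += mass * p   # itemset present
        dist = updated
    return dist[min_sup]


if __name__ == "__main__":
    probs = [0.9, 0.6, 0.5, 0.2]  # illustrative values, not data from the paper
    confidence = 0.7
    prob = frequentness_probability(probs, min_sup=2)
    print(round(prob, 4), prob >= confidence)
```

Capping the count at min_sup keeps the table at min_sup + 1 entries, so the cost is proportional to the number of transactions times min_sup.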

Procedia Computer Science | 2017

Evaluating the well-qualified borrowers from PaiPaiDai.

Haifeng Li; Yuejin Zhang; Ning Zhang

Online peer-to-peer lending is a new but useful financing method for small enterprises that is conducted through a website. In this paper, we use existing personal information to identify potential borrowers who are well qualified. A neural network is used to help a borrower decide whether he or she can successfully obtain a loan from the P2P company. Our data is obtained from the PaiPaiDai website and preprocessed. Our experimental results show that our method achieves a precision exceeding 95%.

Procedia Computer Science | 2018