Yuejin Zhang
Central University of Finance and Economics
Publication
Featured research published by Yuejin Zhang.
International Conference on Data Mining | 2010
Guangli Nie; Lingling Zhang; Yuejin Zhang; Wei Deng; Yong Shi
This paper discusses second-order mining, that is, mining the results of data mining. A gap remains between the knowledge that can direct a company's operations and the knowledge obtained from data mining. We treat the knowledge obtained from data mining as primary knowledge and the knowledge obtained from second-order data mining as intelligent knowledge. We discuss the importance of intelligent knowledge and how to derive it from primary knowledge through a second-order mining process. Three cases from China demonstrate methods for carrying out second-order mining. The cases concern central bank credit scoring, credit card churn prediction, and email services, respectively.
Procedia Computer Science | 2017
Yuejin Zhang; Haifeng Li; Mo Hai; Jiaxuan Li; Aihua Li
Online peer-to-peer (P2P) lending has developed rapidly in recent years. In this article, we conduct an empirical study using a public dataset from Paipaidai, the largest online P2P lending platform in China. We analyze the factors that determine the probability of obtaining a loan in online P2P lending. The results indicate that annual interest rate, repayment period, description, credit grade, number of successful loans, number of failed loans, gender, and borrowed credit score are significant factors in whether a loan is successfully funded on the Paipaidai platform.
International Conference on Data Mining | 2017
Haifeng Li; Yue Wang; Ning Zhang; Yuejin Zhang
Frequent itemset mining is an important task in data mining, and fuzzy data mining can describe its results more accurately. Nevertheless, the full set of frequent itemsets is redundant for users; a better approach is to show only the top-k results. In this paper, we define the score of a fuzzy frequent itemset and pose the problem of top-k fuzzy frequent itemset mining, which, to the best of our knowledge, has not been studied before. To address this problem, we employ a data structure named TopKFFITree to store a superset of the mining results, which is significantly smaller than the set of all fuzzy frequent itemsets. We then present an algorithm named TopK-FFI to build and maintain the data structure. The algorithm prunes most of the fuzzy frequent itemsets immediately, based on the monotonicity of the itemset score. Theoretical analysis and experiments over four datasets demonstrate that the proposed algorithm reduces runtime and memory cost and significantly outperforms the naive algorithm Top-k-FFI-Miner.
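The abstract above does not give the score definition or the layout of TopKFFITree, so the following is only a brute-force sketch of what a top-k fuzzy frequent itemset query computes. It assumes the score is the commonly used fuzzy support (the sum, over transactions, of the minimum membership degree among the itemset's items). The names `fuzzy_support` and `naive_top_k`, the `max_size` cap, and the sample data are hypothetical and are not taken from the paper.

```python
from heapq import nlargest
from itertools import combinations


def fuzzy_support(itemset, transactions):
    """Assumed score: sum over transactions of the minimum membership
    degree among the itemset's items (a missing item counts as 0)."""
    return sum(min(t.get(item, 0.0) for item in itemset) for t in transactions)


def naive_top_k(transactions, k, max_size=3):
    """Enumerate every itemset up to max_size and keep the k best scores.
    This is a brute-force baseline, not the TopK-FFI algorithm."""
    items = sorted({item for t in transactions for item in t})
    scored = []
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            score = fuzzy_support(itemset, transactions)
            if score > 0:
                scored.append((score, itemset))
    return nlargest(k, scored)


# Hypothetical fuzzy transactions: item -> membership degree in [0, 1].
transactions = [
    {"A.low": 0.8, "B.high": 0.6},
    {"A.low": 0.5, "B.high": 0.9, "C.mid": 0.4},
    {"B.high": 0.7, "C.mid": 1.0},
]
top2 = naive_top_k(transactions, k=2)
```

Under this assumed score, adding an item to an itemset can never raise its score, because the per-transaction minimum can only stay the same or drop. That is the kind of monotonic property a pruning step can exploit to skip supersets of low-scoring itemsets.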
Procedia Computer Science | 2017
Mo Hai; You Zhang; Yuejin Zhang
The performance of two typical classification algorithms in Spark, random forest and naive Bayes, is evaluated using four metrics: classification accuracy, speedup, scaleup, and sizeup. Experiments are performed on datasets and clusters of different scales. The results show that: (1) the accuracy of both algorithms is high; (2) speedup does not increase linearly, and the number of nodes at which speedup peaks differs across dataset sizes; (3) the scaleup of random forest peaks at 2 nodes and then decreases as the number of nodes grows; (4) for random forest, sizeup increases sharply with dataset size when the number of nodes is 2, and more slowly when the number of nodes is greater than 2; for naive Bayes, sizeup increases with dataset size when the number of nodes is smaller than 6, whereas with 6 nodes and a dataset larger than Sogou_5, sizeup changes little as the dataset grows.
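The abstract does not spell out how speedup, scaleup, and sizeup are computed. The sketch below uses the definitions commonly found in the parallel data mining literature, which is an assumption about this paper. The function names and the timing values are hypothetical and are not measurements from the study.

```python
def speedup(t_one_node, t_m_nodes):
    """Same dataset: runtime on 1 node divided by runtime on m nodes."""
    return t_one_node / t_m_nodes


def scaleup(t_base, t_scaled):
    """Runtime on 1 node with dataset D divided by runtime on m nodes
    with a dataset m times larger. A value of 1.0 is ideal."""
    return t_base / t_scaled


def sizeup(t_base, t_larger):
    """Same number of nodes: runtime on a dataset n times larger
    divided by runtime on the base dataset."""
    return t_larger / t_base


# Hypothetical runtimes in seconds for one fixed dataset, keyed by node count.
runtime_by_nodes = {1: 120.0, 2: 70.0, 4: 48.0, 6: 45.0}
speedups = {m: speedup(runtime_by_nodes[1], t) for m, t in runtime_by_nodes.items()}
```

With timings like these, speedup grows with the number of nodes but flattens well below the node count, which is the non-linear behaviour the abstract reports.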
Procedia Computer Science | 2017
Haifeng Li; Yuejin Zhang; Ning Zhang
Probabilistic frequent itemset mining finds the itemsets in an uncertain database whose support exceeds a threshold with a given probabilistic confidence. However, when the threshold is small, the mining results are massive and hard for users to understand. In this paper, we focus on this problem and propose a method to obtain the top-k probabilistic frequent itemsets, which, to the best of our knowledge, has not been addressed before. A scoring function is defined to evaluate the level of itemsets. We introduce a compact data structure, named TopKPFITree, to maintain the mining results and some auxiliary information. Furthermore, an efficient algorithm, TopKPFIM, is proposed to build the TopKPFITree and obtain the results. Our experimental results over uncertain datasets show that our algorithm significantly outperforms the naive algorithm.
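To make "support larger than the threshold with a given probabilistic confidence" concrete, here is a minimal sketch of the standard dynamic-programming computation of an itemset's frequentness probability. It assumes each transaction contains the itemset independently with a known probability. The abstract does not specify the paper's scoring function or TopKPFITree, so neither is modelled here, and the function name `frequentness_probability` is hypothetical.

```python
def frequentness_probability(probs, min_support):
    """P(support >= min_support), assuming transaction i contains the
    itemset independently with probability probs[i]."""
    # dist[j] = probability that exactly j of the transactions processed
    # so far contain the itemset.
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            nxt[j] += q * (1.0 - p)   # itemset absent from this transaction
            nxt[j + 1] += q * p       # itemset present in this transaction
        dist = nxt
    return sum(dist[min_support:])


# Hypothetical per-transaction probabilities for a single itemset.
example = frequentness_probability([0.5, 0.5], min_support=1)
```

An itemset would then count as probabilistically frequent when this value reaches the chosen confidence level. Ranking itemsets for a top-k query needs a scoring function, which the abstract mentions but does not define.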
Procedia Computer Science | 2017
Haifeng Li; Yuejin Zhang; Ning Zhang
Online peer-to-peer lending is a new but useful financing method for small enterprises, conducted through websites. In this paper, we use existing personal information to identify well-qualified potential borrowers. A neural network is used to help a borrower determine whether they can successfully obtain a loan from the P2P company. Our data is obtained from the PaiPaiDai website and is preprocessed. Our experimental results show that our method achieves a precision exceeding 95%.
Procedia Computer Science | 2018
Haifeng Li; Xiaohua Chen; Yuejin Zhang; Mo Hai; Hanqing Hu
Procedia Computer Science | 2018
Luhua Zhang; Yuejin Zhang; Jun Wang; Xiao Wang; Haifeng Li; Runtong Cheng
Procedia Computer Science | 2018
Haifeng Li; Yuejin Zhang; Mo Hai; Hanqing Hu
Procedia Computer Science | 2018
Wei Song; Yuejin Zhang; Jun Wang; Haifeng Li; Yajing Meng; Runtong Cheng