Lean Yu

Beijing University of Chemical Technology

Publication

Featured research published by Lean Yu.

IEEE Transactions on Evolutionary Computation | 2009

Evolving Least Squares Support Vector Machines for Stock Market Trend Mining

Lean Yu; Huanhuan Chen; Shouyang Wang; Kin Keung Lai

In this paper, an evolving least squares support vector machine (LSSVM) learning paradigm with a mixed kernel is proposed to explore stock market trends. In the proposed learning paradigm, a genetic algorithm (GA), one of the most popular evolutionary algorithms (EAs), is first used to select input features for LSSVM learning, i.e., evolution of input features. Then, another GA is used for parameter optimization of the LSSVM, i.e., evolution of algorithmic parameters. Finally, the evolving LSSVM learning paradigm with the best feature subset, optimal parameters, and a mixed kernel is used to predict the direction of stock market movement from historical data series. For illustration and evaluation purposes, three important stock indices, the S&P 500 Index, the Dow Jones Industrial Average (DJIA) Index, and the New York Stock Exchange (NYSE) Index, are used as testing targets. The experimental results reveal that the proposed evolving LSSVM can produce forecasting models that are easier to interpret, because they use a small number of predictive features, and are more efficient than other parameter optimization methods. Furthermore, the produced forecasting model significantly outperforms the other forecasting models listed in the paper in terms of the hit ratio. These findings imply that the proposed evolving LSSVM learning paradigm is a promising approach to stock market tendency exploration.
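The hit ratio used as the comparison metric above can be illustrated with a short sketch. It assumes one common definition (the share of time steps where the forecast moves in the same direction as the actual series, both measured from the previous actual value); the paper's exact definition is not given in the abstract, and the function name and sample numbers are hypothetical.

```python
def hit_ratio(actual, predicted):
    """Share of steps where the predicted direction matches the actual one.

    predicted[t] is the forecast for time t; directions are measured
    relative to the previous actual value actual[t - 1].
    """
    def sign(v):
        return (v > 0) - (v < 0)

    hits = 0
    steps = len(actual) - 1
    for t in range(1, len(actual)):
        actual_dir = sign(actual[t] - actual[t - 1])
        predicted_dir = sign(predicted[t] - actual[t - 1])
        if actual_dir == predicted_dir:
            hits += 1
    return hits / steps


# Hypothetical index levels and forecasts, for illustration only.
actual = [100, 102, 101, 105, 104]
predicted = [100, 101, 103, 104, 106]
ratio = hit_ratio(actual, predicted)
print(ratio)
```

In this toy series the forecast gets the direction right on two of the four steps.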

International Journal of Information Technology and Decision Making | 2015

A Novel CEEMD-Based EELM Ensemble Learning Paradigm for Crude Oil Price Forecasting

Ling Tang; Wei Dai; Lean Yu; Shouyang Wang

To enhance the prediction accuracy for crude oil prices, a novel ensemble learning paradigm coupling complementary ensemble empirical mode decomposition (CEEMD) and the extended extreme learning machine (EELM) is proposed. This method is an improved model under the effective decomposition-and-ensemble framework, designed especially for nonlinear, complex, and irregular data. In the proposed method, CEEMD, a recent extension of the competitive empirical mode decomposition (EMD) family, is first applied to divide the original data (i.e., a difficult task) into a number of components (i.e., relatively easy subtasks). Then, EELM, a recently developed, powerful, fast, and stable intelligent learning technique, is used to predict each extracted component individually. Finally, the predicted results are aggregated into an ensemble result as the final prediction using a simple addition ensemble method. With the crude oil spot prices of WTI and Brent as sample data, the empirical results demonstrate that the novel CEEMD-based EELM ensemble model statistically outperforms all listed benchmarks (including typical forecasting techniques and similar ensemble models with other decomposition and ensemble tools) in prediction accuracy. The results also indicate that the novel model is a promising forecasting tool for complicated time series data with high volatility and irregularity.
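The three steps described above (decompose, predict each component, add the predictions) can be sketched as follows. This is only an illustration of the framework: a moving-average split into trend and residual stands in for CEEMD, a last-value-plus-drift rule stands in for EELM, and all function names and prices are hypothetical.

```python
def moving_average(series, window):
    """Trailing moving average; early points use the data available so far."""
    out = []
    for i in range(len(series)):
        lo = max(0, i - window + 1)
        chunk = series[lo:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def decompose(series, window=3):
    """Stand-in for CEEMD: split the series into a trend and a residual.

    The two components add back up to the original series.
    """
    trend = moving_average(series, window)
    residual = [x - t for x, t in zip(series, trend)]
    return [trend, residual]


def predict_next(component):
    """Stand-in for EELM: last value plus the average one-step change."""
    if len(component) < 2:
        return component[-1]
    drift = (component[-1] - component[0]) / (len(component) - 1)
    return component[-1] + drift


def decomposition_ensemble_forecast(series, window=3):
    components = decompose(series, window)          # step 1: decomposition
    forecasts = [predict_next(c) for c in components]  # step 2: individual prediction
    return sum(forecasts)                           # step 3: simple addition ensemble


prices = [50.0, 51.0, 52.0, 53.0, 54.0, 55.0]  # hypothetical prices
forecast = decomposition_ensemble_forecast(prices)
print(forecast)
```

Because the stand-in predictor is linear, decomposing does not change the forecast in this toy case; the accuracy gains reported in the abstract come from fitting a nonlinear learner to each component.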

Applied Soft Computing | 2017

A non-iterative decomposition-ensemble learning paradigm using RVFL network for crude oil price forecasting

Ling Tang; Yao Wu; Lean Yu

To address the time consumption and parameter sensitivity of emerging decomposition-ensemble models, this paper develops a non-iterative learning paradigm without an iterative training process. Unlike most existing decomposition-ensemble models, which use statistical or iterative approaches as individual forecasting tools, the proposed work employs an efficient and fast non-iterative algorithm, the random vector functional link (RVFL) network, with randomly fixed weights and direct input-output links. Three major steps are included: decomposition via ensemble empirical mode decomposition (EEMD), prediction via the RVFL network, and ensemble via linear addition. With crude oil prices as the study sample, the proposed EEMD-based RVFL network performs significantly better in terms of prediction accuracy than not only single algorithms such as the RVFL network, extreme learning machine (ELM), kernel ridge regression, random forest, back propagation neural network, least squares support vector regression, and autoregressive integrated moving average, but also their respective EEMD-based ensemble variants. In terms of speed, the RVFL network, developed in 1994, ranks first among all the listed methods, and the EEMD-based RVFL network beats all the ensemble methods and most single methods, possibly because an RVFL network with direct input-output links needs far fewer enhancement nodes, and hence a shorter computational time, than networks without the direct links, such as the ELM developed in 2006.

Applied Soft Computing | 2017

LSSVR ensemble learning with uncertain parameters for crude oil price forecasting

Lean Yu; Huijuan Xu; Ling Tang

An LSSVR ensemble learning paradigm with uncertain parameters is proposed. The user-defined parameters in LSSVR are treated as uncertain variables. Uncertain parameters are first used to formulate diverse individual members. Uncertainties are then offset via ensemble weighted averaging. An empirical study of crude oil price prediction verifies the effectiveness of the model. Least squares support vector regression (LSSVR) is an effective and competitive approach for crude oil price prediction, but its performance suffers from parameter sensitivity and long tuning time. This paper treats the user-defined parameters as uncertain (or random) factors to construct an LSSVR ensemble learning paradigm in four major steps. First, probability distributions of the user-defined parameters in LSSVR are designed using the grid method for low upper bound estimation (LUBE). Second, random sets of parameters are generated according to the designed probability distributions to formulate diverse individual LSSVR members. Third, each individual member produces its own prediction. Finally, all individual results are combined into the final output via ensemble weighted averaging, with the probabilities serving as the corresponding weights. A computational experiment using the crude oil spot price of West Texas Intermediate (WTI) verifies the effectiveness of the proposed LSSVR ensemble learning paradigm with uncertain parameters compared with some existing LSSVR variants (using other popular parameter selection algorithms), in terms of prediction accuracy and time saving.
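The sampling and weighted-averaging steps above can be sketched as follows. This is a rough illustration under stated assumptions: exponential smoothing with a single parameter alpha stands in for an LSSVR member, the discrete distribution is set by hand rather than designed by the grid method, and all names and numbers are hypothetical.

```python
import random


def smooth_forecast(series, alpha):
    """Stand-in for one LSSVR member: exponential smoothing with parameter alpha."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level


def ensemble_forecast(series, param_probs, n_members=5, seed=0):
    """Sample parameters from a discrete distribution, then average the
    member forecasts with weights proportional to each parameter's probability."""
    rng = random.Random(seed)
    values = list(param_probs)
    probs = [param_probs[v] for v in values]
    # Step 2: random parameter sets drawn from the designed distribution.
    sampled = rng.choices(values, weights=probs, k=n_members)
    # Step 3: each member makes its own prediction.
    forecasts = [smooth_forecast(series, a) for a in sampled]
    # Step 4: weighted averaging, with probabilities as weights.
    weights = [param_probs[a] for a in sampled]
    return sum(w * f for w, f in zip(weights, forecasts)) / sum(weights)


# Hypothetical parameter distribution and price series.
param_probs = {0.2: 0.25, 0.5: 0.5, 0.8: 0.25}
wti = [60.0, 61.5, 59.8, 62.3, 63.1]
forecast = ensemble_forecast(wti, param_probs)
print(forecast)
```

No single parameter value has to be tuned: members built from likelier parameter values simply count for more in the average.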

International Conference on Innovative Computing, Information and Control | 2008

An AI-Agent-Based Trapezoidal Fuzzy Ensemble Forecasting Model for Crude Oil Price Prediction

Lean Yu; Shouyang Wang; Bo Wen; Kin Keung Lai

In this study, an AI-agent-based trapezoidal fuzzy ensemble forecasting model is proposed for crude oil price prediction. In the proposed ensemble model, several single AI models are first used as predictors for crude oil price prediction. Then the prediction results produced by these single AI-based predictors are fuzzified into fuzzy prediction representations. Subsequently, these fuzzified representations are fused into a fuzzy consensus, i.e., an aggregated fuzzy prediction. Finally, the aggregated prediction is defuzzified into a crisp value as the final prediction result. For testing purposes, two typical crude oil price prediction experiments are presented.
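The fuzzify, aggregate, and defuzzify steps can be sketched as follows. The abstract does not specify the rules, so this sketch makes assumptions: each agent is taken to produce several predictions, which are summarized as a trapezoidal fuzzy number (a, b, c, d); the consensus is an equally weighted corner-wise average; and defuzzification takes the mean of the four corners. Function names and prices are hypothetical.

```python
from statistics import median


def fuzzify(predictions):
    """Summarize one agent's predictions as a trapezoid (a, b, c, d).

    a and d are the extremes; b and c are the medians of the lower
    and upper halves, so a <= b <= c <= d always holds.
    """
    s = sorted(predictions)
    half = max(1, len(s) // 2)
    return (s[0], median(s[:half]), median(s[-half:]), s[-1])


def aggregate(trapezoids, weights=None):
    """Fuse trapezoids into a fuzzy consensus by weighted corner-wise averaging."""
    n = len(trapezoids)
    if weights is None:
        weights = [1.0 / n] * n
    return tuple(
        sum(w * t[i] for w, t in zip(weights, trapezoids)) for i in range(4)
    )


def defuzzify(trapezoid):
    """Turn a trapezoid into a crisp value (mean of its four corners)."""
    return sum(trapezoid) / 4.0


# Hypothetical predictions from two AI agents.
agent_predictions = [
    [70.0, 71.0, 72.0, 73.0],
    [69.0, 70.0, 71.0, 72.0],
]
consensus = aggregate([fuzzify(p) for p in agent_predictions])
crisp = defuzzify(consensus)
print(consensus, crisp)
```

Keeping the spread of each agent's predictions until the final step lets disagreement among agents shape the consensus before it is reduced to one number.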

Applied Soft Computing | 2017

An EEMD-based multi-scale fuzzy entropy approach for complexity analysis in clean energy markets

Ling Tang; Huiling Lv; Lean Yu

An EEMD-based multi-scale fuzzy entropy approach is proposed to analyze the complexity characteristics of clean energy markets. The divide and conquer strategy is introduced to provide a more comprehensive complexity measurement tool for both the overall dynamics and various inner features with different time scales. The proposed EEMD-based multi-scale fuzzy entropy approach for complexity analysis can provide a new perspective for understanding market dynamics. To measure the efficiency of clean energy markets, a multi-scale complexity analysis approach is proposed. Due to the coexisting characteristics of clean energy markets, the divide and conquer strategy is introduced to provide a more comprehensive complexity analysis framework for both overall dynamics and hidden features (in different time scales), and to identify the leading factors contributing to the complexity. In the proposed approach, ensemble empirical mode decomposition (EEMD), a competitive multi-scale analysis tool, is first implemented to capture meaningful features hidden in the original market system. Second, fuzzy entropy, an effective complexity measurement, is employed to analyze both the whole system and inner features. In the empirical analysis, the nuclear energy and hydropower markets in China and the US are investigated, and some interesting results are obtained. For overall dynamics, the US clean energy markets show a significantly higher complexity level than China's markets, implying market maturity and efficiency of US clean energy relative to China. For inner features, similar features (in terms of similar time scales) in different markets present similar complexity levels. For different inner features, there are some distinct differences in clean energy markets between the US and China. China's markets are mainly driven by upward long-term trends with low-level complexity, while short-term fluctuations with high-level complexity are the leading features for the US markets. All these results demonstrate that the proposed EEMD-based multi-scale fuzzy entropy approach can provide a new analysis tool to understand the complexity of clean energy markets.

Systems, Man and Cybernetics | 2008

A generalized Intelligent-agent-based fuzzy group forecasting model for oil price prediction

Lean Yu; Shouyang Wang; Kin Keung Lai

In this study, a generalized intelligent-agent-based fuzzy group forecasting model is proposed for oil price prediction. In the proposed model, several single intelligent-agent-based predictors with much disagreement are first created for crude oil price prediction. Then the prediction results produced by these single intelligent predictors are fuzzified into fuzzy prediction representations. In particular, several fuzzification methods are extended into a consolidated framework to generalize the subsequent computation. Subsequently, these fuzzified prediction representations are integrated into a fuzzy consensus, i.e., an aggregated fuzzy prediction. Finally, the aggregated fuzzy prediction is defuzzified into a crisp value as the final prediction result. For verification and testing purposes, two typical oil price series are used to conduct the experiments.

International Journal of Information Technology and Decision Making | 2016

Prediction-Based Multi-Objective Optimization for Oil Purchasing and Distribution with the NSGA-II Algorithm

Lean Yu; Zebin Yang; Ling Tang

Due to the uncertainty in oil markets, this paper proposes a novel approach for oil purchasing and distribution optimization by incorporating price and demand prediction, i.e., the prediction-based oil purchasing-and-distribution optimization model. In particular, the proposed method bridges the latest information technology and decision-making techniques by introducing a recently proposed information technology (i.e., the extreme learning machine (ELM)) into the oil purchasing-and-distribution optimization model. The proposed model involves two main steps: market prediction and planning optimization. In market prediction, the ELM technique is employed to provide fast training and accurate forecasts of oil prices and demand. In planning optimization, two objectives, general profit maximization and inventory risk minimization, are considered, and the most popular multi-objective evolutionary algorithm (MOEA), the nondominated sorting genetic algorithm II (NSGA-II), is implemented to search for approximate Pareto optimal solutions. For illustration and verification, the US motor gasoline market is used as the study sample, and the experimental results demonstrate the superiority of the proposed prediction-based optimization approach over its benchmark models (without market prediction and/or planning optimization), in terms of the highest profit and the lowest risk.
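The notion of Pareto optimality behind the planning step can be illustrated with a small non-dominated filter over the two objectives (maximize profit, minimize risk). This sketch covers only the dominance check that NSGA-II builds on, not the full algorithm (no ranking into successive fronts, crowding distance, selection, crossover, or mutation). The plan values and function names are hypothetical.

```python
def dominates(a, b):
    """True if plan a is at least as good as b on both objectives
    (higher profit, lower risk) and strictly better on at least one."""
    profit_a, risk_a = a
    profit_b, risk_b = b
    no_worse = profit_a >= profit_b and risk_a <= risk_b
    strictly_better = profit_a > profit_b or risk_a < risk_b
    return no_worse and strictly_better


def pareto_front(plans):
    """Keep the plans that no other plan dominates."""
    return [
        p for p in plans
        if not any(dominates(q, p) for q in plans if q is not p)
    ]


# Hypothetical (profit, risk) pairs for candidate purchasing plans.
plans = [(100, 5), (120, 7), (90, 6), (120, 9), (80, 2)]
front = pareto_front(plans)
print(front)
```

Here (90, 6) is dropped because (100, 5) earns more at lower risk, and (120, 9) is dropped because (120, 7) earns the same at lower risk; the remaining plans represent different profit and risk trade-offs.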

International Journal of Information Technology and Decision Making | 2017

Importance Sampling for Credit Portfolio Risk with Risk Factors Having t-Copula

Rongda Chen; Ze Wang; Lean Yu

This paper proposes an efficient simulation method for calculating credit portfolio risk when risk factors have heavy-tailed distributions. To model heavy tails, the features of returns on the underlying assets are captured by a multivariate t-copula. Moreover, we develop a three-step importance sampling (IS) procedure in the t-copula credit portfolio risk measure model for further variance reduction. In addition, we apply the Levenberg–Marquardt algorithm, a nonlinear optimization technique, to estimate the mean-shift vector of the systematic risk factors after the change of probability measure. Numerical results show that the methods developed in the t-copula model produce a large variance reduction relative to the plain Monte Carlo method and estimate the tail probability of the credit portfolio loss distribution more accurately.
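The idea of importance sampling with a mean shift can be shown on a much simpler problem than the one in the paper. The sketch below estimates a tail probability for a single standard normal variable, not a t-copula portfolio, and sets the shift by hand to the threshold instead of estimating it with Levenberg–Marquardt. Function names and settings are hypothetical.

```python
import math
import random


def plain_mc_tail(threshold, n, rng):
    """Plain Monte Carlo estimate of P(Z > threshold) for Z ~ N(0, 1)."""
    hits = sum(1 for _ in range(n) if rng.gauss(0.0, 1.0) > threshold)
    return hits / n


def is_tail(threshold, shift, n, rng):
    """Importance sampling estimate: draw from N(shift, 1) and reweight."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(shift, 1.0)
        if x > threshold:
            # Likelihood ratio of N(0, 1) to N(shift, 1), evaluated at x.
            total += math.exp(-shift * x + 0.5 * shift * shift)
    return total / n


threshold = 4.0
exact = 0.5 * math.erfc(threshold / math.sqrt(2.0))
rng = random.Random(1)
plain = plain_mc_tail(threshold, 20000, rng)
shifted = is_tail(threshold, threshold, 20000, rng)
print(exact, plain, shifted)
```

With the same number of draws, plain Monte Carlo sees almost no samples beyond the threshold, while the shifted sampler places about half of its draws there and corrects for the change of measure through the likelihood ratio.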

Applied Soft Computing | 2018

A DBN-based resampling SVM ensemble learning paradigm for credit classification with imbalanced data

Lean Yu; Rongtian Zhou; Ling Tang; Rongda Chen

Credit risk assessment is often accompanied by imbalanced sample data. For this reason, this paper proposes a deep belief network (DBN) based resampling support vector machine (SVM) ensemble learning paradigm to solve the imbalanced data problem in credit classification. In this paradigm, a bagging algorithm is first used to generate variable training subsets that are rebalanced and suitable in size. Then the SVM model is used as the individual base classifier to formulate diverse ensemble input members. Finally, the DBN model is applied as an ensemble method to fuse the input members and aggregate the classification results. In addition, the weights of different classes are changed by introducing a revenue matrix as a revenue-sensitive technique, which helps make the results more reasonable. The experimental results indicate that classification performance is improved effectively when the DBN-based ensemble strategy is integrated with resampling techniques, especially for imbalanced data problems, implying that the proposed DBN-based resampling SVM ensemble learning paradigm is a promising tool for credit risk classification with imbalanced data.
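The first step, using bagging to build rebalanced training subsets, can be sketched as follows. The abstract does not give the resampling rule, so this sketch assumes each subset bootstraps the same number of samples from every class, with that number set to the size of the smallest class. The SVM base classifiers and the DBN fusion step are not shown, and all names and data are hypothetical.

```python
import random


def balanced_bags(samples, labels, n_bags, seed=0):
    """Build bootstrap subsets with an equal number of samples per class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    # Assumed rule: every class contributes as many draws as the rarest class has.
    size = min(len(xs) for xs in by_class.values())
    bags = []
    for _ in range(n_bags):
        bag = []
        for y, xs in by_class.items():
            # Draw with replacement so the bags differ from one another.
            bag.extend((x, y) for x in rng.choices(xs, k=size))
        rng.shuffle(bag)
        bags.append(bag)
    return bags


# Hypothetical data: 2 defaulters (label 1) and 10 non-defaulters (label 0).
samples = list(range(12))
labels = [1] * 2 + [0] * 10
bags = balanced_bags(samples, labels, n_bags=3)
for bag in bags:
    print(bag)
```

Each subset would then train one base classifier, so the minority class is no longer swamped within any single training set.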
