Weixuan Li
Pacific Northwest National Laboratory
Publications
Featured research published by Weixuan Li.
Journal of The Royal Statistical Society Series B-statistical Methodology | 2002
Yingcun Xia; Howell Tong; Weixuan Li; Lixing Zhu
Summary. Searching for an effective dimension reduction space is an important problem in regression, especially for high dimensional data. We propose an adaptive approach based on semiparametric models, which we call the (conditional) minimum average variance estimation (MAVE) method, within quite a general setting. The MAVE method has the following advantages. Most existing methods must undersmooth the nonparametric link function estimator to achieve a faster rate of consistency for the estimator of the parameters (than for that of the nonparametric function). In contrast, a faster consistency rate can be achieved by the MAVE method even without undersmoothing the nonparametric link function estimator. The MAVE method is applicable to a wide range of models, with fewer restrictions on the distribution of the covariates, to the extent that even time series can be included. Because of the faster rate of consistency for the parameter estimators, it is possible for us to estimate the dimension of the space consistently. The relationship of the MAVE method with other methods is also investigated. In particular, a simple outer product gradient estimator is proposed as an initial estimator. In addition to theoretical results, we demonstrate the efficacy of the MAVE method for high dimensional data sets through simulation. Two real data sets are analysed by using the MAVE approach.
Water Resources Research | 2016
Jiangjiang Zhang; Weixuan Li; Lingzao Zeng; Laosheng Wu
Surrogate models are commonly used in Bayesian approaches such as Markov Chain Monte Carlo (MCMC) to avoid repetitive CPU-demanding model evaluations. However, the approximation error of a surrogate may lead to biased estimation of the posterior distribution. This bias can be corrected by constructing a very accurate surrogate or by implementing MCMC in a two-stage manner. Since the two-stage MCMC requires extra original model evaluations after surrogate evaluations, the computational cost is still high. If the measurement information is incorporated, a locally accurate surrogate can be adaptively constructed at low computational cost. Based on this idea, we integrate Gaussian process (GP) and MCMC to adaptively construct locally accurate surrogates for Bayesian experimental design in groundwater contaminant source identification problems. Moreover, the uncertainty estimate of the GP approximation error is incorporated in the Bayesian formula to avoid over-confident estimation of the posterior distribution. The proposed approach is tested with a numerical case study. Without sacrificing estimation accuracy, the new approach achieves a speed-up of about 200 times compared to our previous work, which implemented MCMC in a two-stage manner.
Journal of Computational Physics | 2014
Weixuan Li; Guang Lin; Dongxiao Zhang
The probabilistic collocation-based Kalman filter (PCKF) is a recently developed approach for solving inverse problems. It resembles the ensemble Kalman filter (EnKF) in every aspect except that it represents and propagates model uncertainty by polynomial chaos expansion (PCE) instead of an ensemble of model realizations. Previous studies have shown PCKF is a more efficient alternative to EnKF for many data assimilation problems. However, the accuracy and efficiency of PCKF depend on an appropriate truncation of the PCE series. Having more polynomial chaos basis functions in the expansion helps to capture uncertainty more accurately but increases computational cost. Selection of basis functions is particularly important for high-dimensional stochastic problems because the number of polynomial chaos basis functions required to represent model uncertainty grows dramatically as the number of input parameters (random dimensions) increases. In classic PCKF algorithms, the PCE basis functions are pre-set based on users' experience. Also, for sequential data assimilation problems, the basis functions kept in the PCE expression remain unchanged in different Kalman filter loops, which could limit the accuracy and computational efficiency of classic PCKF algorithms. To address this issue, we present a new algorithm that adaptively selects PCE basis functions for different problems and automatically adjusts the number of basis functions in different Kalman filter loops. The algorithm is based on adaptive functional ANOVA (analysis of variance) decomposition, which approximates a high-dimensional function with the summation of a set of low-dimensional functions. Thus, instead of expanding the original model into PCE, we implement the PCE expansion on these low-dimensional functions, which is much less costly. We also propose a new adaptive criterion for ANOVA that is more suited for solving inverse problems. The new algorithm was tested with different examples and demonstrated great effectiveness in comparison with non-adaptive PCKF and EnKF algorithms.
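The idea of approximating a high-dimensional function by a sum of low-dimensional functions can be illustrated with a minimal sketch. It assumes the anchored (cut) variant of functional ANOVA truncated at first order, which may differ from the decomposition and adaptive criterion used in the paper; the function names, the test functions, and the anchor point are all hypothetical.

```python
from typing import Callable, List, Sequence


def first_order_anchored_anova(
    f: Callable[[Sequence[float]], float], anchor: Sequence[float]
) -> Callable[[Sequence[float]], float]:
    """Build a first-order anchored-ANOVA approximation of f (illustrative only).

    f(x) is approximated by f0 + sum_i f_i(x_i), where f0 = f(anchor) and
    f_i(x_i) = f(anchor with component i replaced by x_i) - f0.
    """
    base: List[float] = list(anchor)
    f0 = f(base)

    def component(i: int, xi: float) -> float:
        # One-dimensional term: vary input i only, keep the rest at the anchor.
        point = list(base)
        point[i] = xi
        return f(point) - f0

    def approx(x: Sequence[float]) -> float:
        return f0 + sum(component(i, xi) for i, xi in enumerate(x))

    return approx


def additive(x: Sequence[float]) -> float:
    # Hypothetical model with no interactions between inputs.
    return x[0] ** 2 + 3.0 * x[1] - 2.0 * x[2]


def coupled(x: Sequence[float]) -> float:
    # Same model plus an x0*x1 interaction that first-order terms cannot capture.
    return additive(x) + x[0] * x[1]


anchor = [0.5, 0.5, 0.5]
x = [1.0, 2.0, -1.0]
additive_error = abs(first_order_anchored_anova(additive, anchor)(x) - additive(x))
coupled_error = abs(first_order_anchored_anova(coupled, anchor)(x) - coupled(x))
print(additive_error, coupled_error)
```

For the additive function the first-order sum reproduces the model, while the interaction term leaves a residual of (x0 - c0)(x1 - c1). An adaptive scheme would add higher-order terms only for input pairs where such residuals matter.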
Water Resources Research | 2016
Jun Man; Jiangjiang Zhang; Weixuan Li; Lingzao Zeng; Laosheng Wu
The ensemble Kalman filter (EnKF) has been widely used in parameter estimation for hydrological models. The focus of most previous studies was to develop more efficient analysis (estimation) algorithms. On the other hand, it is intuitively understandable that a well-designed sampling (data-collection) strategy should provide more informative measurements and subsequently improve the parameter estimation. In this work, a Sequential Ensemble-based Optimal Design (SEOD) method, coupled with EnKF, information theory and sequential optimal design, is proposed to improve the performance of parameter estimation. Based on the first- and second-order statistics, different information metrics including the Shannon entropy difference (SD), degrees of freedom for signal (DFS) and relative entropy (RE) are used to design the optimal sampling strategy, respectively. The effectiveness of the proposed method is illustrated by synthetic one- and two-dimensional unsaturated flow case studies. It is shown that the designed sampling strategies can provide more accurate parameter estimation and state prediction compared with conventional sampling strategies. Optimal sampling designs based on various information metrics perform similarly in our cases. The effect of ensemble size on the optimal design is also investigated. Overall, a larger ensemble size improves the parameter estimation and the convergence of the optimal sampling strategy. Although the proposed method is applied to unsaturated flow problems in this study, it is equally applicable to other hydrological problems.
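The three information metrics named above can be sketched for the simplest possible case. The example assumes a single scalar parameter with Gaussian prior and posterior, observed directly with Gaussian noise; under those assumptions the standard closed forms below apply. The paper evaluates the metrics from ensemble statistics, so the function names, the scalar Kalman update, and the numbers here are illustrative assumptions, not the SEOD implementation.

```python
import math
from typing import Tuple


def kalman_update_scalar(
    prior_mean: float, prior_var: float, obs: float, obs_var: float
) -> Tuple[float, float]:
    """Scalar Kalman update for a directly observed parameter (assumed setup)."""
    gain = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + gain * (obs - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


def shannon_entropy_difference(prior_var: float, post_var: float) -> float:
    # SD: reduction in Gaussian entropy, 0.5 * ln(prior_var / post_var).
    return 0.5 * math.log(prior_var / post_var)


def degrees_of_freedom_for_signal(prior_var: float, post_var: float) -> float:
    # DFS: scalar form of trace(I - P_post * P_prior^-1).
    return 1.0 - post_var / prior_var


def relative_entropy(
    prior_mean: float, prior_var: float, post_mean: float, post_var: float
) -> float:
    # RE: Kullback-Leibler divergence of the posterior from the prior.
    ratio = post_var / prior_var
    shift = (post_mean - prior_mean) ** 2 / prior_var
    return 0.5 * (ratio - 1.0 - math.log(ratio) + shift)


# Hypothetical numbers: prior N(0, 4), one measurement y = 2 with noise variance 4.
prior_mean, prior_var = 0.0, 4.0
post_mean, post_var = kalman_update_scalar(prior_mean, prior_var, obs=2.0, obs_var=4.0)

sd = shannon_entropy_difference(prior_var, post_var)
dfs = degrees_of_freedom_for_signal(prior_var, post_var)
re = relative_entropy(prior_mean, prior_var, post_mean, post_var)
print(sd, dfs, re)
```

SD and DFS depend only on the variances, whereas RE also accounts for the shift of the mean. In a design loop, one such score would be computed for each candidate measurement and the highest-scoring candidate selected.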
Water Resources Research | 2009
Weixuan Li; Zhiming Lu; Dongxiao Zhang
Archive | 2002
Yingcun Xia; Howell Tong; Weixuan Li; Lixing Zhu
Journal of Computational Physics | 2016
Weixuan Li; Guang Lin; Bing Li
Journal of Computational Physics | 2015
Weixuan Li; Guang Lin
Water Resources Research | 2017
Jiangjiang Zhang; Weixuan Li; Guang Lin; Lingzao Zeng; Laosheng Wu
annual simulation symposium | 2015
Weixuan Li; Dongxiao Zhang; Guang Lin
