
Publications


Featured research published by Thomas C. M. Lee.


Journal of the American Statistical Association | 2006

Structural Break Estimation for Nonstationary Time Series Models

Richard A. Davis; Thomas C. M. Lee; Gabriel A. Rodriguez-Yam

This article considers the problem of modeling a class of nonstationary time series using piecewise autoregressive (AR) processes. The number and locations of the piecewise AR segments, as well as the orders of the respective AR processes, are assumed unknown. The minimum description length principle is applied to compare various segmented AR fits to the data. The goal is to find the “best” combination of the number of segments, the lengths of the segments, and the orders of the piecewise AR processes. Such a “best” combination is implicitly defined as the optimizer of an objective function, and a genetic algorithm is implemented to solve this difficult optimization problem. Numerical results from simulation experiments and real data analyses show that the procedure has excellent empirical properties. The segmentation of multivariate time series is also considered. Assuming that the true underlying model is a segmented autoregression, this procedure is shown to be consistent for estimating the location of the breaks.
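
The search over segmentations can be illustrated without the genetic algorithm. The sketch below is not the paper's implementation: it restricts attention to AR(1) segments, uses illustrative code-length terms rather than the paper's exact MDL formula, and finds a single candidate break by exhaustive search.

```python
import numpy as np

rng = np.random.default_rng(0)

# Piecewise AR(1) series: phi = 0.9 before t = 200 and phi = -0.9 after.
n = 400
x = np.zeros(n)
for t in range(1, n):
    phi = 0.9 if t < 200 else -0.9
    x[t] = phi * x[t - 1] + rng.standard_normal()

def seg_code_length(seg):
    """Illustrative code length of one AR(1) segment: least-squares fit
    of phi, then (m/2) log sigma^2 for the residuals plus a parameter cost."""
    phi = (seg[1:] @ seg[:-1]) / (seg[:-1] @ seg[:-1])
    resid = seg[1:] - phi * seg[:-1]
    m = len(resid)
    return 0.5 * m * np.log(resid @ resid / m) + 0.5 * np.log(m)

def mdl(x, tau):
    """Two-segment MDL: code lengths of both segments plus log n
    to encode the break location."""
    return seg_code_length(x[:tau]) + seg_code_length(x[tau:]) + np.log(len(x))

tau_hat = min(range(20, n - 20), key=lambda tau: mdl(x, tau))
print(tau_hat)  # expected to fall near the true break at t = 200
```

With unknown numbers of breaks and unknown AR orders, as in the article, the candidate space explodes combinatorially, which is what makes the genetic algorithm necessary.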


Journal of Time Series Analysis | 2008

Break Detection for a Class of Nonlinear Time Series Models

Richard A. Davis; Thomas C. M. Lee; Gabriel A. Rodriguez-Yam

This article considers the problem of detecting break points for a nonstationary time series. Specifically, the time series is assumed to follow a parametric nonlinear time-series model in which the parameters may change values at fixed times. In this formulation, the number and locations of the break points are assumed unknown. The minimum description length (MDL) is used as a criterion for estimating the number of break points, the locations of break points and the parametric model in each segment. The best segmentation found by minimizing MDL is obtained using a genetic algorithm. The implementation of this approach is illustrated using generalized autoregressive conditionally heteroscedastic (GARCH) models, stochastic volatility models and generalized state-space models as the parametric model for the segments. Empirical results show good performance of the estimates of the number of breaks and their locations for these various models.


Computational Statistics & Data Analysis | 2003

Smoothing parameter selection for smoothing splines: a simulation study

Thomas C. M. Lee

Smoothing splines are a popular method for performing nonparametric regression. Most important in the implementation of this method is the choice of the smoothing parameter. This article provides a simulation study of several smoothing parameter selection methods, including two so-called risk estimation methods. To the best of the author's knowledge, the empirical performances of these two risk estimation methods have not previously been reported in the literature. Empirical conclusions and recommendations based on the simulation results are provided. One noteworthy observation is that the popular generalized cross-validation method was outperformed by an improved Akaike information criterion that shares the same assumptions and computational complexity.
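
For context, both criteria are easy to state for any linear smoother. The sketch below is a hypothetical setup that uses a Gaussian kernel smoother as a stand-in for a smoothing spline and computes generalized cross-validation and the improved Akaike information criterion from the smoother ("hat") matrix.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: smooth signal plus noise.
n = 150
t = np.linspace(0, 1, n)
y = np.sin(2 * np.pi * t) + 0.3 * rng.standard_normal(n)

def hat_matrix(t, h):
    """Smoother matrix of a Gaussian kernel (Nadaraya-Watson) smoother.
    A smoothing spline is also a linear smoother, so the criteria below
    apply to it in exactly the same way."""
    W = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
    return W / W.sum(axis=1, keepdims=True)

def gcv(y, H):
    """Generalized cross-validation score."""
    n = len(y)
    rss = np.sum((y - H @ y) ** 2)
    return n * rss / (n - np.trace(H)) ** 2

def aicc(y, H):
    """Improved (bias-corrected) Akaike information criterion."""
    n = len(y)
    rss = np.sum((y - H @ y) ** 2)
    tr = np.trace(H)
    return np.log(rss / n) + 1 + 2 * (tr + 1) / (n - tr - 2)

bandwidths = np.linspace(0.01, 0.2, 30)
h_gcv = min(bandwidths, key=lambda h: gcv(y, hat_matrix(t, h)))
h_aicc = min(bandwidths, key=lambda h: aicc(y, hat_matrix(t, h)))
print(h_gcv, h_aicc)  # the two selected smoothing parameters
```

Note that both criteria depend on the smoother only through the residual sum of squares and the trace of the hat matrix, which is why they share assumptions and computational complexity.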


The Annals of Applied Statistics | 2010

An MDL approach to the climate segmentation problem

QiQi Lu; Robert Lund; Thomas C. M. Lee

This paper proposes an information theory approach to estimate the number of changepoints and their locations in a climatic time series. A model is introduced that has an unknown number of changepoints and allows for series autocorrelations, periodic dynamics, and a mean shift at each changepoint time. An objective function gauging the number of changepoints and their locations, based on a minimum description length (MDL) information criterion, is derived. A genetic algorithm is then developed to optimize the objective function. The methods are applied in the analysis of a century of monthly temperatures from Tuscaloosa, Alabama.
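
A stripped-down version of the objective, covering only the mean-shift part of the model (no autocorrelation or periodic dynamics, and with illustrative code-length terms rather than the paper's derived MDL), can be minimized by brute force on a short series.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)

# A series of 120 "monthly" values with one mean shift at t = 60.
n = 120
y = rng.standard_normal(n)
y[60:] += 3.0

def mdl(y, cps):
    """Illustrative MDL score for a piecewise-constant-mean model:
    Gaussian code length per segment plus log-costs for the number of
    changepoints, their locations, and the per-segment parameters."""
    bounds = [0] + list(cps) + [len(y)]
    score = np.log(len(cps) + 1) + len(cps) * np.log(len(y))
    for a, b in zip(bounds[:-1], bounds[1:]):
        seg = y[a:b]
        score += 0.5 * len(seg) * np.log(np.var(seg)) + np.log(len(seg))
    return score

# Brute-force search over a coarse grid in place of the genetic algorithm.
grid = range(12, n - 11, 12)
candidates = [()] + [(c,) for c in grid] + list(combinations(grid, 2))
best = min(candidates, key=lambda cps: mdl(y, cps))
print(best)  # expected to recover the single shift at t = 60
```

The same criterion structure (fit terms plus changepoint-count and location costs) is what the genetic algorithm optimizes in the full model with autocorrelation and seasonal means.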


Journal of the American Statistical Association | 2000

A Minimum Description Length-Based Image Segmentation Procedure, and its Comparison with a Cross-Validation-Based Segmentation Procedure

Thomas C. M. Lee

Image segmentation is a very important problem in image analysis, as quite often it is a key component of a good practical solution to a real-life imaging problem. It aims to partition a digital image into a set of nonoverlapping homogeneous regions. One approach to segmenting an image is to fit a piecewise constant function to the image and define the segmentation by the discontinuity points of the fitted function. The article's first contribution is to present a new and automatic segmentation procedure that follows this piecewise constant function fitting approach. This procedure is based on Rissanen's minimum description length (MDL) principle and consists of two components: (a) an MDL-based criterion in which the “best” segmentation (i.e., the “best” fitted piecewise constant function) is defined as its minimizer and (b) a fast merging algorithm that attempts to locate this minimizer. As a second contribution, the new MDL-based procedure is compared with a cross-validation-based segmentation procedure. Empirical results from a simulation study suggest the new MDL-based procedure is superior. Some possible extensions of the MDL-based procedure are also described.
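
The two components, an MDL-style criterion and a merging algorithm, can be sketched in one dimension. The stopping threshold below is an illustrative stand-in for a real code length (and the noise level is assumed known), with the greedy merge run on a piecewise-constant row of "pixels".

```python
import numpy as np

rng = np.random.default_rng(3)

# A piecewise-constant "image slice" with three regions plus noise.
sigma = 0.3
signal = np.concatenate([np.zeros(20), np.full(20, 5.0), np.full(20, 10.0)])
y = signal + sigma * rng.standard_normal(signal.size)

# Start with every pixel as its own region; each region tracks (count, sum).
regions = [[1, v] for v in y]

def merge_cost(a, b):
    """Increase in residual sum of squares if two adjacent regions merge."""
    (n1, s1), (n2, s2) = a, b
    return n1 * n2 / (n1 + n2) * (s1 / n1 - s2 / n2) ** 2

# Greedily merge the cheapest adjacent pair while the description-length
# saving of having one fewer region exceeds the increase in fit cost.
# The constant 3 * log2(n) per region is a rough stand-in, not the
# paper's code length.
threshold = 2 * sigma ** 2 * 3 * np.log2(y.size)
while len(regions) > 1:
    costs = [merge_cost(regions[i], regions[i + 1])
             for i in range(len(regions) - 1)]
    i = int(np.argmin(costs))
    if costs[i] >= threshold:
        break
    n1, s1 = regions[i]
    n2, s2 = regions[i + 1]
    regions[i : i + 2] = [[n1 + n2, s1 + s2]]

print([r[0] for r in regions])  # region sizes
```

Tracking only counts and sums keeps each merge O(1), which is the essence of a fast merging algorithm; a 2-D version merges over a region adjacency graph instead of a line.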


Journal of Computational and Graphical Statistics | 2006

Robust SiZer for Exploration of Regression Structures and Outlier Detection

Jan Hannig; Thomas C. M. Lee

The SiZer methodology is a valuable tool for conducting exploratory data analysis. In this article a robust version of SiZer is developed for the regression setting. This robust SiZer is capable of producing SiZer maps with different degrees of robustness. By inspecting such SiZer maps, either as a series of plots or in the form of a movie, the structures hidden in a dataset can be more effectively revealed. It is also demonstrated that the robust SiZer can be used to help identify outliers. Results from both real data and simulated examples will be provided.
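
The core idea of downweighting outlying observations can be sketched with a kernel smoother whose residuals receive Huber weights; the bandwidth, tuning constant, and iteration count below are illustrative, and a full SiZer map would repeat this over many bandwidths with inference on the fitted slopes.

```python
import numpy as np

rng = np.random.default_rng(4)

# Smooth signal with one gross outlier injected at t[50].
n = 101
t = np.linspace(0, 1, n)
y = np.sin(2 * np.pi * t) + 0.2 * rng.standard_normal(n)
y[50] += 15.0

def kernel_weights(t, h):
    return np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)

def smooth(t, y, h, robust_iters=0, c=1.345):
    """Kernel smoother; with robust_iters > 0, residuals are downweighted
    by Huber weights and the fit is recomputed. Varying how hard one
    downweights corresponds to varying the degree of robustness."""
    K = kernel_weights(t, h)
    w = np.ones(len(y))
    for _ in range(robust_iters + 1):
        W = K * w[None, :]
        fit = (W @ y) / W.sum(axis=1)
        r = y - fit
        s = np.median(np.abs(r)) / 0.6745  # robust residual scale
        w = np.minimum(1.0, c * s / np.maximum(np.abs(r), 1e-12))
    return fit

plain = smooth(t, y, h=0.05)
robust = smooth(t, y, h=0.05, robust_iters=3)
truth = np.sin(2 * np.pi * t)
```

Comparing the two fits near the contaminated point also illustrates the outlier-detection use: the robust fit stays near the signal, so the outlier stands out in its residual.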


Journal of the American Statistical Association | 2016

Generalized Fiducial Inference: A Review and New Results

Jan Hannig; Hari Iyer; Randy C. S. Lai; Thomas C. M. Lee

R. A. Fisher, the father of modern statistics, proposed the idea of fiducial inference during the first half of the 20th century. While his proposal led to interesting methods for quantifying uncertainty, other prominent statisticians of the time did not accept Fisher’s approach as it became apparent that some of Fisher’s bold claims about the properties of fiducial distribution did not hold up for multi-parameter problems. Beginning around the year 2000, the authors and collaborators started to reinvestigate the idea of fiducial inference and discovered that Fisher’s approach, when properly generalized, would open doors to solve many important and difficult inference problems. They termed their generalization of Fisher’s idea generalized fiducial inference (GFI). The main idea of GFI is to carefully transfer randomness from the data to the parameter space using an inverse of a data-generating equation without the use of Bayes’ theorem. The resulting generalized fiducial distribution (GFD) can then be used for inference. After more than a decade of investigations, the authors and collaborators have developed a unifying theory for GFI, and provided GFI solutions to many challenging practical problems in different fields of science and industry. Overall, they have demonstrated that GFI is a valid, useful, and promising approach for conducting statistical inference. The goal of this article is to deliver a timely and concise introduction to GFI, to present some of the latest results, as well as to list some related open research problems. It is the authors’ hope that their contributions to GFI will stimulate the growth and usage of this exciting approach for statistical inference. Supplementary materials for this article are available online.
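
A classical toy example, not taken from the article, makes the data-generating-equation idea concrete: for a single observation from Uniform(0, theta) with X = theta * U, inverting the equation turns Monte Carlo draws of U into draws from a generalized fiducial distribution for theta.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy example: one observation x from Uniform(0, theta).
# Data-generating equation: X = theta * U with U ~ Uniform(0, 1).
# Inverting it, theta = X / U, transfers the randomness of U to theta
# without invoking Bayes' theorem.
x = 4.2
gfd_sample = x / rng.uniform(size=200_000)

# The GFD sample is then used for inference directly, e.g. a point
# summary and a one-sided 95% fiducial bound for theta.
med = np.median(gfd_sample)            # analytically x / 0.5 = 2x
lower = np.quantile(gfd_sample, 0.05)  # analytically x / 0.95
print(med, lower)
```

In realistic multi-parameter problems the inverse of the data-generating equation is not available in closed form, which is where the unifying theory and computational machinery reviewed in the article come in.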


Technometrics | 2005

Model selection for the competing-risks model with and without masking

Radu V. Craiu; Thomas C. M. Lee

The competing-risks model is useful in settings in which individuals (or units) may die (or fail) because of various causes. It can also be the case that for some of the items, the cause of failure is known only up to a subgroup of all causes, in which case we say that the failure is group-masked. A widely used approach for competing-risks data with and without masking involves the specification of cause-specific hazard rates. Often, because of the availability of likelihood methods for estimation and testing, piecewise constant hazards are used. The piecewise constant rates also offer model flexibility and computational convenience. However, for such piecewise constant hazard models, the choice of the endpoints for each interval on which the hazards are constant is usually a subjective one. In this article we discuss and propose the use of model selection methods that are data-driven and automatic. We compare three model selection procedures based on the minimum description length principle, the Bayes information criterion, and the Akaike information criterion. A fast-splitting algorithm is the computational tool used to select among an enormous number of possible models. We test the effectiveness of the methods through numerical studies, including a real dataset with masked failure causes.
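
A minimal version of the model-selection idea, with one failure cause, no masking or censoring, and BIC standing in for the full three-way comparison, fits piecewise constant hazards by maximum likelihood and scores candidate cut points.

```python
import numpy as np

rng = np.random.default_rng(6)

# Failure times whose hazard jumps from 0.2 to 2.0 at t = 5
# (simulated by inverting the cumulative hazard).
n = 400
e = rng.exponential(size=n)
times = np.where(e < 1.0, e / 0.2, 5.0 + (e - 1.0) / 2.0)

def max_loglik(times, cuts):
    """Profile log-likelihood of a piecewise-constant hazard with the
    given interior cut points (MLE rate = deaths / exposure per piece)."""
    edges = [0.0] + list(cuts) + [np.inf]
    ll = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        deaths = np.sum((times >= a) & (times < b))
        exposure = np.sum(np.clip(times, a, b) - a)
        if deaths == 0:
            continue
        ll += deaths * (np.log(deaths / exposure) - 1.0)
    return ll

def bic(times, cuts):
    k = len(cuts) + 1  # number of hazard levels
    return -2.0 * max_loglik(times, cuts) + k * np.log(len(times))

grid = np.arange(1.0, 10.0, 1.0)
candidates = [()] + [(c,) for c in grid]
best = min(candidates, key=lambda cuts: bic(times, cuts))
print(best)  # expected: a single cut at the true change, t = 5
```

In the article the candidate set over cause-specific hazards is enormous, so a fast splitting algorithm replaces this brute-force enumeration, and MDL and AIC are compared alongside BIC.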


Computational Statistics & Data Analysis | 1998

Maximum likelihood restoration and choice of smoothing parameter in deconvolution of image data subject to Poisson noise

H. Malcolm Hudson; Thomas C. M. Lee

Image degradation by blurring is a well-known phenomenon often described by the mathematical operation of convolution. Fourier methods are well developed for recovery, or restoration, of the true image from an observed image affected by convolution blur and additive constant variance Gaussian noise. One focus of this paper is to describe another statistical restoration method which is available when the image data exhibits Poisson variability. This is a common situation when counts of recorded activity form the image, as in medical imaging contexts. We apply Maximum Likelihood (ML) and Maximum Penalized Likelihood (MPL) procedures to deconvolve image data which has been degraded by blurring and Poisson variability in recorded activity. A second focus is formulation and comparison of automated selection procedures for regularization (smoothing) parameters in this context.
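
For Poisson data with a known blur, the ML deconvolution can be computed by an EM iteration, better known as the Richardson-Lucy algorithm. The 1-D setup below is an illustrative sketch; the penalized (MPL) variant would add a roughness term to the update, with the smoothing parameter chosen by the kind of automated procedures the paper compares.

```python
import numpy as np

rng = np.random.default_rng(7)

# True 1-D "image": two point sources on a flat background.
x_true = np.full(64, 1.0)
x_true[20] = 40.0
x_true[42] = 25.0

# Circular convolution with a normalized 5-point blur kernel; because the
# kernel is symmetric, the transpose of the blur operator is the blur itself.
kernel = np.array([0.1, 0.2, 0.4, 0.2, 0.1])

def blur(v):
    return np.convolve(np.r_[v[-2:], v, v[:2]], kernel, mode="valid")

# Observed counts: blurred truth with Poisson variability.
y = rng.poisson(blur(x_true)).astype(float)

# Richardson-Lucy iterations: EM for Poisson maximum likelihood.
x = np.full_like(y, y.mean())
for _ in range(200):
    x = x * blur(y / np.maximum(blur(x), 1e-12))

print(x.sum(), y.sum())  # the update conserves total counts
```

The count-conservation property (each update leaves the total recorded activity unchanged) is one reason this iteration is attractive for emission-tomography-style data.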


Signal Processing | 2007

Robust estimation of the self-similarity parameter in network traffic using wavelet transform

Haipeng Shen; Zhengyuan Zhu; Thomas C. M. Lee

This article studies the problem of estimating the self-similarity parameter of network traffic traces. A robust wavelet-based procedure is proposed that yields estimates less sensitive to commonly encountered nonstationary traffic conditions, such as sudden level shifts and breaks. The two main ingredients of the proposed procedure are (i) a robust regression technique for estimating the parameter from the wavelet coefficients of the traces, and (ii) an automatic level-shift removal algorithm for removing sudden jumps in the traces. Simulation experiments are conducted to compare the proposed estimator with existing wavelet-based estimators. The proposed estimator is also applied to real traces obtained from the Abilene Backbone Network and a university campus network. Results from both the simulation experiments and the real-trace applications suggest that the proposed estimator is superior.
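
The wavelet-regression idea behind such estimators can be sketched with a Haar transform and a robust Theil-Sen slope standing in for the paper's specific robust regression (and with the level-shift removal step omitted). White noise, whose self-similarity parameter is exactly 0.5, serves as a convenient check.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(8)

# White noise has self-similarity (Hurst) parameter H = 0.5 exactly.
x = rng.standard_normal(2 ** 14)

# Haar wavelet detail energies at scales j = 1, 2, ...
log2_energy, scales = [], range(1, 8)
a = x.copy()
for j in scales:
    d = (a[0::2] - a[1::2]) / np.sqrt(2)  # detail coefficients
    a = (a[0::2] + a[1::2]) / np.sqrt(2)  # approximation for next scale
    log2_energy.append(np.log2(np.mean(d ** 2)))

# For self-similar traffic the energies scale like 2^(j(2H - 1)), so H is
# read off the slope of log2 energy against scale. A Theil-Sen median of
# pairwise slopes resists a few corrupted scales.
slopes = [(e2 - e1) / (j2 - j1)
          for (j1, e1), (j2, e2) in combinations(list(zip(scales, log2_energy)), 2)]
H = (float(np.median(slopes)) + 1) / 2
print(H)  # close to 0.5 for white noise
```

On a real trace, a level shift inflates the coarse-scale energies and biases a least-squares slope upward, which is exactly what the robust regression and the level-shift removal step are designed to counteract.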

Collaboration


Dive into Thomas C. M. Lee's collaborations.

Top Co-Authors

Jan Hannig, University of North Carolina at Chapel Hill
Alexander Aue, University of California
Hee-Seok Oh, Seoul National University
Curtis B. Storlie, Los Alamos National Laboratory
Debashis Paul, University of California
Haonan Wang, Colorado State University
Douglas Nychka, National Center for Atmospheric Research