Haeran Cho
London School of Economics and Political Science
Publication
Featured research published by Haeran Cho.
Journal of The Royal Statistical Society Series B-statistical Methodology | 2012
Haeran Cho; Piotr Fryzlewicz
The paper considers variable selection in linear regression models where the number of covariates is possibly much larger than the number of observations. High dimensionality of the data brings in many complications, such as (possibly spurious) high correlations between the variables, which result in marginal correlation being unreliable as a measure of association between the variables and the response. We propose a new way of measuring the contribution of each variable to the response which takes into account high correlations between the variables in a data-driven way. The proposed tilting procedure provides an adaptive choice between the use of marginal correlation and tilted correlation for each variable, where the choice is made depending on the values of the hard thresholded sample correlation of the design matrix. We study the conditions under which this measure can successfully discriminate between the relevant and the irrelevant variables and thus be used as a tool for variable selection. Finally, an iterative variable screening algorithm is constructed to exploit the theoretical properties of tilted correlation, and its good practical performance is demonstrated in a comparative simulation study.
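The adaptive choice described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `tilted_scores`, the `threshold` parameter, and the rescaling of the projected correlation are assumptions made for illustration, and the paper's exact definitions may differ. For each variable, the sketch hard-thresholds the sample correlation matrix to find highly correlated covariates, uses the marginal correlation when there are none, and otherwise correlates the response with the part of the variable left after projecting those covariates out.

```python
import numpy as np

def tilted_scores(X, y, threshold):
    """One association score per column of X (illustrative sketch only)."""
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # standardise columns
    ys = (y - y.mean()) / y.std()
    C = (Xs.T @ Xs) / n                         # sample correlation matrix
    np.fill_diagonal(C, 0.0)
    # Hard thresholding: keep only the large off-diagonal correlations.
    C_thr = np.where(np.abs(C) > threshold, C, 0.0)
    scores = np.empty(p)
    for j in range(p):
        neighbours = np.flatnonzero(C_thr[j])
        xj = Xs[:, j]
        if neighbours.size == 0:
            scores[j] = xj @ ys / n             # plain marginal correlation
            continue
        Z = Xs[:, neighbours]
        coef, *_ = np.linalg.lstsq(Z, xj, rcond=None)
        rj = xj - Z @ coef                      # xj with neighbours projected out
        denom = rj @ xj                         # assumed rescaling (squared residual norm)
        scores[j] = (rj @ ys) / denom if denom > 1e-12 else 0.0
    return scores
```

In a toy design where one irrelevant covariate is a noisy copy of a relevant one, its marginal correlation with the response is large, while the score from this sketch is close to zero.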
Statistica Sinica | 2012
Haeran Cho; Piotr Fryzlewicz
In this paper, we propose a fast, well-performing, and consistent method for segmenting a piecewise-stationary, linear time series with an unknown number of breakpoints. The time series model we use is the nonparametric Locally Stationary Wavelet model, in which a complete description of the piecewise-stationary second-order structure is provided by wavelet periodograms computed at multiple scales and locations. The initial stage of our method is a new binary segmentation procedure, with a theoretically justified and rapidly computable test criterion that detects breakpoints in wavelet periodograms separately at each scale. This is followed by within-scale and across-scales post-processing steps, leading to consistent estimation of the number and locations of breakpoints in the second-order structure of the original process. An extensive simulation study demonstrates good performance of our method.
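As background for the binary segmentation step mentioned in this abstract, the following Python sketch shows generic CUSUM-based binary segmentation for changes in the mean of a sequence. It is not the paper's procedure: it does not compute wavelet periodograms or apply the paper's test criterion or post-processing, and the names `cusum`, `binary_segmentation`, `threshold` and `min_len` are hypothetical.

```python
import numpy as np

def cusum(x):
    """Absolute CUSUM statistic for every split b = 1, ..., n-1."""
    n = len(x)
    b = np.arange(1, n)
    left = np.cumsum(x)[:-1]        # sum of x[:b]
    right = x.sum() - left          # sum of x[b:]
    stat = (np.sqrt((n - b) / (n * b)) * left
            - np.sqrt(b / (n * (n - b))) * right)
    return np.abs(stat)

def binary_segmentation(x, threshold, start=0, min_len=2):
    """Recursively split x wherever the maximal CUSUM exceeds threshold."""
    n = len(x)
    if n < 2 * min_len:
        return []
    stats = cusum(x)
    # stats[i] corresponds to split b = i + 1; keep min_len points on each side.
    window = stats[min_len - 1 : n - min_len]
    b = int(np.argmax(window)) + min_len
    if stats[b - 1] <= threshold:
        return []
    left = binary_segmentation(x[:b], threshold, start, min_len)
    right = binary_segmentation(x[b:], threshold, start + b, min_len)
    return left + [start + b] + right
```

A returned value `b` means the segment is split between positions `b - 1` and `b` (0-based). The threshold governs how many breakpoints are reported, and in this sketch it must be supplied by the user.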
Electronic Journal of Statistics | 2016
Haeran Cho
In this paper, we consider the problem of (multiple) change-point detection in panel data. We propose the double CUSUM statistic, which utilises the cross-sectional change-point structure by examining the cumulative sums of ordered CUSUMs at each point. The efficiency of the proposed change-point test is studied, and is reflected in the rate at which the cross-sectional size of a change is permitted to converge to zero while it is still detectable. Also, the consistency of the proposed change-point detection procedure based on the binary segmentation algorithm is established in terms of both the total number and locations (in time) of the estimated change-points. Motivated by the representation properties of the Generalised Dynamic Factor Model, we propose a bootstrap procedure for test criterion selection, which accounts for both cross-sectional and within-series correlations in high-dimensional data. The empirical performance of the double CUSUM statistic, equipped with the proposed bootstrap scheme, is investigated in a comparative simulation study against state-of-the-art methods. As an application, we analyse the log returns of S&P 100 component stock prices over a period of one year.
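The idea of taking "cumulative sums of ordered CUSUMs at each point" can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's definition: the weighting `(m * (2n - m) / (2n)) ** phi`, the default `phi`, and the function names `panel_cusum` and `double_cusum` are assumptions, and the bootstrap scheme for choosing the test criterion is not included.

```python
import numpy as np

def panel_cusum(X):
    """CUSUM series for each row of an (n_series, T) panel; shape (n_series, T-1)."""
    n, T = X.shape
    b = np.arange(1, T)
    left = np.cumsum(X, axis=1)[:, :-1]
    right = X.sum(axis=1, keepdims=True) - left
    return (np.sqrt((T - b) / (T * b)) * left
            - np.sqrt(b / (T * (T - b))) * right)

def double_cusum(X, phi=0.5):
    """Return (statistic, time split b, number m of leading series)."""
    n, T = X.shape
    C = np.abs(panel_cusum(X))
    ordered = -np.sort(-C, axis=0)          # decreasing across series, per time point
    csum = np.cumsum(ordered, axis=0)       # sums of the m largest values
    total = csum[-1]
    m = np.arange(1, n + 1)[:, None]
    weight = (m * (2 * n - m) / (2 * n)) ** phi   # assumed weighting
    # Contrast the m largest CUSUMs with the remaining ones at each time point.
    D = weight * (csum / m - (total - csum) / (2 * n - m))
    m_idx, b_idx = np.unravel_index(np.argmax(D), D.shape)
    return float(D[m_idx, b_idx]), int(b_idx) + 1, int(m_idx) + 1
```

The second CUSUM is taken across the ordered cross-section, so the maximising `m` indicates roughly how many series appear to share the change at time `b`.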
Statistics and Computing | 2011
Haeran Cho; Piotr Fryzlewicz
We compare two state-of-the-art non-linear techniques for nonparametric function estimation via piecewise constant approximation: the taut string and the Unbalanced Haar methods. While it is well-known that the latter is multiscale, it is not obvious that the former can also be interpreted as multiscale. We provide a unified multiscale representation for both methods, which offers an insight into the relationship between them as well as suggesting lessons both methods can learn from each other.
Journal of Econometrics | 2018
Matteo Barigozzi; Haeran Cho; Piotr Fryzlewicz
We propose the first comprehensive treatment of high-dimensional time series factor models with multiple change-points in their second-order structure. We operate under the most flexible definition of piecewise stationarity, estimate the number and locations of change-points consistently, and identify whether they originate in the common or idiosyncratic components. Through the use of wavelets, we transform the problem of change-point detection in the second-order structure of a high-dimensional time series into the (relatively easier) problem of change-point detection in the means of high-dimensional panel data. Also, our methodology circumvents the difficult issue of the accurate estimation of the true number of factors in the presence of multiple change-points by adopting a screening procedure. We further show that consistent factor analysis is achieved over each segment defined by the change-points estimated by the proposed methodology. In extensive simulation studies, we observe that factor analysis prior to change-point detection improves the detectability of change-points, and identify and describe an interesting ‘spillover’ effect in which substantial breaks in the idiosyncratic components get, naturally enough, identified as change-points in the common components, which prompts us to regard the corresponding change-points as also acting as a form of ‘factors’. Our methodology is implemented in the R package factorcpt, available from CRAN.
arXiv: Methodology | 2016
Haeran Cho
In this paper, we introduce a new method for testing the stationarity of time series, where the test statistic is obtained from measuring and maximising the difference in the second-order structure over pairs of randomly drawn intervals. The asymptotic normality of the test statistic is established for both Gaussian and a range of non-Gaussian time series, and a bootstrap procedure is proposed for estimating the variance of the main statistics. Further, we show the consistency of our test under local alternatives. Due to the flexibility inherent in the random, unsystematic sub-samples used for test statistic construction, the proposed method is able to identify the intervals of significant departure from stationarity without any dyadic constraints, which is an advantage over other tests employing systematic designs. We demonstrate its good finite sample performance on both simulated and real data, particularly in detecting localised departures from stationarity.
Journal of The Royal Statistical Society Series B-statistical Methodology | 2015
Haeran Cho; Piotr Fryzlewicz
European Review of Economic History | 2012
Olga Christodoulaki; Haeran Cho; Piotr Fryzlewicz
Archive | 2008
Haeran Cho; Piotr Fryzlewicz
LSE Research Online Documents on Economics | 2014
Piotr Fryzlewicz; Haeran Cho