Tengyao Wang
University of Cambridge
Publications
Featured research published by Tengyao Wang.
Annals of Statistics | 2016
Tengyao Wang; Quentin Berthet; Richard J. Samworth
In recent years, sparse principal component analysis has emerged as an extremely popular dimension reduction technique for high-dimensional data. The theoretical challenge, in the simplest case, is to estimate the leading eigenvector of a population covariance matrix under the assumption that this eigenvector is sparse. An impressive range of estimators have been proposed; some of these are fast to compute, while others are known to achieve the minimax optimal rate over certain Gaussian or sub-Gaussian classes. In this paper, we show that, under a widely-believed assumption from computational complexity theory, there is a fundamental trade-off between statistical and computational performance in this problem. More precisely, working with new, larger classes satisfying a restricted covariance concentration condition, we show that there is an effective sample size regime in which no randomised polynomial time algorithm can achieve the minimax optimal rate. We also study the theoretical performance of a (polynomial time) variant of the well-known semidefinite relaxation estimator, revealing a subtle interplay between statistical and computational efficiency.
Archive | 2016
Tengyao Wang
Spectral methods have become increasingly popular in designing fast algorithms for modern high-dimensional datasets. This thesis looks at several problems in which spectral methods play a central role. In some cases, we also show that such procedures have essentially the best performance among all randomised polynomial time algorithms by exhibiting statistical and computational trade-offs in those problems. In the first chapter, we prove a useful variant of the well-known Davis–Kahan theorem, which is a spectral perturbation result that allows us to bound the distance between population eigenspaces and their sample versions. We then propose a semi-definite programming algorithm for the sparse principal component analysis (PCA) problem, and analyse its theoretical performance using the perturbation bounds we derived earlier. It turns out that the parameter regime in which our estimator is consistent is strictly smaller than the consistency regime of a minimax optimal (yet computationally intractable) estimator. We show through reduction from a well-known hard problem in computational complexity theory that the difference in consistency regimes is unavoidable for any randomised polynomial time estimator, hence revealing subtle statistical and computational trade-offs in this problem. Such computational trade-offs also exist in the problem of restricted isometry certification. Certifiers for restricted isometry properties can be used to construct design matrices for sparse linear regression problems. Similar to the sparse PCA problem, we show that there is also an intrinsic gap between the class of matrices certifiable using unrestricted algorithms and using polynomial time algorithms. Finally, we consider the problem of high-dimensional changepoint estimation, where we estimate the time of change in the mean of a high-dimensional time series with piecewise constant mean structure.
Motivated by real world applications, we assume that changes only occur in a sparse subset of all coordinates. We apply a variant of the semi-definite programming algorithm in sparse PCA to aggregate the signals across different coordinates in a near optimal way so as to estimate the changepoint location as accurately as possible. Our statistical procedure shows superior performance compared to existing methods in this problem.
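As a rough illustration of the changepoint problem described in this abstract, the sketch below locates a single change in mean in one univariate series using a CUSUM statistic. It is a simplified, one-coordinate stand-in and not the thesis's sparse-projection procedure; the function name `cusum_changepoint` and the toy data are assumptions made for this example.

```python
import math

def cusum_changepoint(x):
    """Estimate a single change in mean for a univariate series.

    Hypothetical helper for illustration only: it scans every split
    point t and returns the one maximising the CUSUM statistic,
    i.e. the scaled absolute difference between the mean of
    x[:t] and the mean of x[t:].
    """
    n = len(x)
    total = sum(x)
    best_t, best_stat = None, -1.0
    left = 0.0
    for t in range(1, n):
        left += x[t - 1]            # running sum of x[:t]
        right = total - left        # sum of x[t:]
        stat = math.sqrt(t * (n - t) / n) * abs(right / (n - t) - left / t)
        if stat > best_stat:
            best_t, best_stat = t, stat
    return best_t, best_stat

# Toy series (made up): mean 0 for 10 points, then mean 5 for 10 points.
location, statistic = cusum_changepoint([0.0] * 10 + [5.0] * 10)
print(location)
```

In the high-dimensional setting of the abstract, the coordinates would first be aggregated along a sparse direction before a scan of this kind is applied; that aggregation step is not shown here.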
Pancreas | 2017
Michael Feretis; Tengyao Wang; Satheesh Iype; Adam Duckworth; Rebecca Brais; Bristi Basu; Neville V. Jamieson; Emmanuel Huguet; Anita Balakrishnan; Asif Jah; Raaj K. Praseedom; S. Harper; Siong-Seng Liau
Objectives: The aims of this study were to (i) identify independent predictors of survival after pancreaticoduodenectomy for ampullary cancer and (ii) develop a prognostic model of survival. Methods: Data were analyzed retrospectively on 110 consecutive patients who underwent pancreaticoduodenectomy between 2002 and 2013. Subjects were categorized into 3 nodal subgroups as per the recently proposed nodal subclassification: N0 (node negative), N1 (1–2 metastatic nodes), or N2 (≥3 metastatic nodes). Clinicopathological features and overall survival were compared by Kaplan-Meier and Cox regression analyses. Results: The overall 1-, 3-, and 5-year survival rates were 79.8%, 42.2%, and 34.9%, respectively. The 1-, 3-, and 5-year survival rates for the N0 group were 85.2%, 71.9%, and 67.4%, respectively. The 1-, 3-, and 5-year survival rates were 81.5%, 49.4%, and 49.4% for the N1 subgroup and 75%, 19.2%, and 6.4% for the N2 subgroup (log rank, P < 0.0001). In a multivariate Cox regression analysis, vascular invasion and lymph node ratio were the only independent predictors of survival. Hence, a prediction model of survival was constructed based on those 2 variables. Conclusions: Using data from a carefully selected cohort of patients, we created a pilot prognostic model of postresectional survival. The proposed model may help clinicians to guide treatments in the adjuvant setting.
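For readers unfamiliar with the Kaplan-Meier method named in the abstract above, the following is a minimal sketch of the product-limit estimator. The function name `kaplan_meier` and the five-patient toy data are hypothetical and are not taken from the study.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each subject
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one death occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set.
        n_at_risk -= removed
    return curve

# Toy data (made up): times in years, 1 = death observed, 0 = censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Survival curves for subgroups such as N0, N1, and N2 would be obtained by applying the same estimator to each subgroup separately; the log-rank comparison and the Cox regression are not shown.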
Journal of the Australian Mathematical Society | 2012
Vladimir Bolotnikov; Tengyao Wang; Joshua M. Weiss
A characterization of generalized Schur functions in terms of their Taylor coefficients was established by M. G. Krein and H. Langer in [14]. We establish a boundary analog of this characterization.
Journal of Computational and Applied Mathematics | 2011
Tengyao Wang; Joshua M. Weiss
We devise an efficient algorithm that, given points $z_1, \ldots, z_k$ in the open unit disk $\mathbb{D}$ and a set of complex numbers $\{f_{i,0}, f_{i,1}, \ldots, f_{i,n_i-1}\}$ assigned to each $z_i$, produces a rational function $f$ with a single (multiple) pole in $\mathbb{D}$, such that $f$ is bounded on the unit circle by a predetermined positive number, and its Taylor expansion at $z_i$ has $f_{i,0}, f_{i,1}, \ldots, f_{i,n_i-1}$ as its first $n_i$ coefficients.
Biometrika | 2015
Yi Yu; Tengyao Wang; Richard J. Samworth
Journal of the Royal Statistical Society: Series B (Statistical Methodology) | 2018
Tengyao Wang; Richard J. Samworth
Archive | 2013
Tengyao Wang; Nitin Viswanathan
International Conference on Machine Learning | 2013
Sébastien Bubeck; Tengyao Wang; Nitin Viswanathan
arXiv: Statistics Theory | 2017
Qiyang Han; Tengyao Wang; Sabyasachi Chatterjee; Richard J. Samworth