Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Linda H. Zhao is active.

Publication


Featured research published by Linda H. Zhao.


Journal of the American Statistical Association | 2005

Statistical Analysis of a Telephone Call Center: A Queueing-Science Perspective

Lawrence D. Brown; Noah Gans; Avishai Mandelbaum; Anat Sakov; Haipeng Shen; Sergey Zeltyn; Linda H. Zhao

A call center is a service network in which agents provide telephone-based services. Customers who seek these services are delayed in tele-queues. This article summarizes an analysis of a unique record of call center operations. The data comprise a complete operational history of a small banking call center, call by call, over a full year. Taking the perspective of queueing theory, we decompose the service process into three fundamental components: arrivals, customer patience, and service durations. Each component involves different basic mathematical structures and requires a different style of statistical analysis. Some of the key empirical results are sketched, along with descriptions of the varied techniques required. Several statistical techniques are developed for analysis of the basic components. One of these techniques is a test that a point process is a Poisson process. Another involves estimation of the mean function in a nonparametric regression with lognormal errors. A new graphical technique is introduced for nonparametric hazard rate estimation with censored data. Models are developed and implemented for forecasting of Poisson arrival rates. Finally, the article surveys how the characteristics deduced from the statistical analyses form the building blocks for theoretically interesting and practically useful mathematical models for call center operations.
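
As background, the sketch below (Python with NumPy on simulated data; not the authors' code, and all parameter values are hypothetical) illustrates two of the building blocks the abstract mentions: the lognormal shape of service durations, checked on the log scale, and one classical route to testing whether arrivals are Poisson, namely that, conditional on the number of arrivals in a block, a homogeneous Poisson process places its arrival times uniformly.

```python
# Sketch on simulated data; parameter values are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# (1) Service durations: lognormal on the original scale, so the
#     log-durations should look roughly normal.
durations = rng.lognormal(mean=5.0, sigma=1.0, size=5000)
log_d = np.log(durations)
print("log-duration mean/sd:", log_d.mean().round(2), log_d.std().round(2))

# (2) Arrivals within one short block of length T.  Under the Poisson
#     hypothesis, the arrival times divided by T should look uniform(0, 1).
T, rate = 1.0, 120.0
n = rng.poisson(rate * T)
arrival_times = np.sort(rng.uniform(0.0, T, size=n))
u = arrival_times / T
ecdf = np.arange(1, n + 1) / n
ks_stat = np.max(np.abs(ecdf - u))          # Kolmogorov-Smirnov-type distance
print("KS-type distance from uniformity:", round(float(ks_stat), 4))
```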


International Conference on Machine Learning | 2009

Supervised learning from multiple experts: whom to trust when everyone lies a bit

Vikas C. Raykar; Shipeng Yu; Linda H. Zhao; Anna Jerebko; Charles Florin; Gerardo Hermosillo Valadez; Luca Bogoni; Linda Moy

We describe a probabilistic approach for supervised learning when we have multiple experts/annotators providing (possibly noisy) labels but no absolute gold standard. The proposed algorithm evaluates the different experts and also gives an estimate of the actual hidden labels. Experimental results indicate that the proposed method is superior to the commonly used majority voting baseline.
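
A minimal sketch of the kind of EM scheme the abstract describes, restricted to binary labels and omitting the feature-based classifier that the full method learns jointly; the function name, the simulated annotators, and all parameter values are illustrative, not the authors' implementation. The last two print statements compare the EM-based labels with the majority-vote baseline mentioned in the abstract.

```python
import numpy as np

def em_annotators(Y, n_iter=50):
    """Y: (n_items, n_annotators) array of 0/1 labels from each annotator."""
    n, m = Y.shape
    mu = Y.mean(axis=1)                      # init: soft labels = mean vote
    for _ in range(n_iter):
        # M-step: prevalence, plus each annotator's sensitivity/specificity
        p = mu.mean()
        alpha = (mu[:, None] * Y).sum(0) / mu.sum()                   # P(y_j=1 | y=1)
        beta = ((1 - mu)[:, None] * (1 - Y)).sum(0) / (1 - mu).sum()  # P(y_j=0 | y=0)
        # E-step: posterior probability that the hidden true label is 1
        a = p * np.prod(alpha**Y * (1 - alpha)**(1 - Y), axis=1)
        b = (1 - p) * np.prod(beta**(1 - Y) * (1 - beta)**Y, axis=1)
        mu = a / (a + b)
    return mu, alpha, beta

# Tiny demo: three reliable annotators and one who labels almost at random.
rng = np.random.default_rng(1)
truth = rng.integers(0, 2, size=200)
noise = [0.05, 0.10, 0.10, 0.45]             # per-annotator flip probability
Y = np.stack([np.where(rng.random(200) < e, 1 - truth, truth) for e in noise], axis=1)
mu, alpha, beta = em_annotators(Y)
print("estimated sensitivities:", alpha.round(2))
print("accuracy of EM labels:   ", ((mu > 0.5) == truth).mean())
print("accuracy of majority vote:", ((Y.mean(1) > 0.5) == truth).mean())
```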


The Annals of Applied Statistics | 2011

An autoregressive approach to house price modeling

Chaitra H. Nagaraja; Lawrence D. Brown; Linda H. Zhao

A statistical model for predicting individual house prices is proposed utilizing only information regarding sale price, time of sale, and location (ZIP code). This model is composed of a fixed time effect and a random ZIP (postal) code effect combined with an autoregressive component. The latter piece is applied only to homes sold repeatedly while the former two components are applied to all of the data. In addition, the autoregressive component incorporates heteroscedasticity in the errors. To evaluate the proposed model, single-family home sales for twenty U.S. metropolitan areas from July 1985 through September 2004 are analyzed. The model is shown to have better predictive abilities than the benchmark S&P/Case-Shiller model, which is a repeat sales model, and a conventional mixed effects model. It is also shown that the time effect in the proposed model can be converted into a house price index. Finally, the special case of Los Angeles, CA is discussed as an example of history repeating itself with regard to the current housing market meltdown.
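
To make the structure concrete, here is a simulation-only sketch (Python with NumPy) of a simplified single-market version of the model described above: a fixed time effect, a random ZIP-code effect, and an AR(1)-type link between repeat sales of the same house whose weight decays with the gap between sales. The parameter values and the 40% repeat-sale share are invented for illustration; this is not the authors' fitting code.

```python
import numpy as np

rng = np.random.default_rng(2)
n_periods, n_zips, n_houses = 60, 25, 2000

beta = np.cumsum(rng.normal(0.004, 0.01, n_periods))   # fixed time effect (log index)
tau = rng.normal(0.0, 0.10, n_zips)                    # random ZIP-code effects
phi, sigma = 0.95, 0.08                                # AR coefficient, error sd

records = []                                           # (house, zip, time, log price)
for h in range(n_houses):
    z = rng.integers(n_zips)
    t = rng.integers(0, n_periods // 2)
    eps = rng.normal(0.0, sigma)
    records.append((h, z, t, beta[t] + tau[z] + eps))
    if rng.random() < 0.4:                             # roughly 40% of houses sell again
        gap = rng.integers(1, n_periods - t)
        t2 = t + gap
        # repeat-sale error: previous error shrunk by phi**gap, plus fresh noise
        # scaled so that the marginal error variance stays at sigma**2
        eps2 = phi**gap * eps + rng.normal(0.0, sigma * np.sqrt(1 - phi**(2 * gap)))
        records.append((h, z, t2, beta[t2] + tau[z] + eps2))

print("simulated sales:", len(records))
```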


Journal of Computational and Graphical Statistics | 2010

Fast Computation of Kernel Estimators

Vikas C. Raykar; Ramani Duraiswami; Linda H. Zhao

The computational complexity of evaluating the kernel density estimate (or its derivatives) at m evaluation points given n sample points scales quadratically as O(nm)—making it prohibitively expensive for large datasets. While approximate methods like binning could speed up the computation, they lack a precise control over the accuracy of the approximation. There is no straightforward way of choosing the binning parameters a priori in order to achieve a desired approximation error. We propose a novel computationally efficient ε-exact approximation algorithm for the univariate Gaussian kernel-based density derivative estimation that reduces the computational complexity from O(nm) to linear O(n+m). The user can specify a desired accuracy ε. The algorithm guarantees that the actual error between the approximation and the original kernel estimate will always be less than ε. We also apply our proposed fast algorithm to speed up automatic bandwidth selection procedures. We compare our method to the best available binning methods in terms of the speed and the accuracy. Our experimental results show that the proposed method is almost twice as fast as the best binning methods and is around five orders of magnitude more accurate. The software for the proposed method is available online.
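
For orientation, the sketch below (Python with NumPy) spells out the direct O(nm) evaluation that the paper's ε-exact algorithm is designed to replace; the fast expansion itself is not reproduced here, and the data and bandwidth rule are illustrative.

```python
import numpy as np

def kde_direct(x_eval, x_sample, h):
    """Gaussian KDE evaluated the slow, exact way: one term per (i, j) pair."""
    z = (x_eval[:, None] - x_sample[None, :]) / h          # (m, n) matrix
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(x_sample) * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(3)
x = rng.normal(size=10_000)                # n sample points
grid = np.linspace(-4, 4, 512)             # m evaluation points
h = 1.06 * x.std() * len(x) ** (-1 / 5)    # Silverman's rule-of-thumb bandwidth
density = kde_direct(grid, x, h)
print("estimated density near 0:", round(float(density[np.argmin(np.abs(grid))]), 3))
```

The direct evaluation materializes an m-by-n matrix of kernel terms, which is exactly the quadratic cost that the proposed ε-exact expansion removes.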


Journal of Nonparametric Statistics | 2009

Sharp adaptive estimation by a blockwise method

T. Tony Cai; Mark G. Low; Linda H. Zhao

We consider a blockwise James–Stein estimator for nonparametric function estimation in suitable wavelet or Fourier bases. The estimator can be readily explained and implemented. We show that the estimator is asymptotically sharp adaptive in minimax risk over any Sobolev ball containing the true function. Further, for a moderately broad range of bounded sets in the Besov space our estimator is asymptotically nearly sharp adaptive in the sense that it comes within the Donoho–Liu constant, 1.24, of being exactly sharp adaptive. Other parameter spaces are also considered. The paper concludes with a Monte Carlo study comparing the performance of our estimator with that of three other popular wavelet estimators. Our procedure generally (but not always) outperforms two of these and is comparable to, or perhaps slightly better than, the third.
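
As a rough illustration of the mechanism (not of the sharp-adaptation result itself), the following sketch applies textbook blockwise James–Stein shrinkage in a Gaussian sequence model; the block size and shrinkage constant are a plain default choice rather than the specific blocking analysed in the paper, and the decaying "coefficients" are simulated.

```python
import numpy as np

def blockwise_james_stein(y, sigma, block_size):
    """Shrink consecutive blocks of noisy coefficients toward zero."""
    theta_hat = np.zeros_like(y)
    for start in range(0, len(y), block_size):
        block = y[start:start + block_size]
        L = len(block)
        shrink = max(0.0, 1.0 - (L - 2) * sigma**2 / np.sum(block**2))
        theta_hat[start:start + block_size] = shrink * block
    return theta_hat

# Demo: rapidly decaying coefficients (as for a smooth function) plus noise.
rng = np.random.default_rng(4)
n, sigma = 1024, 1.0
theta = 25.0 / (1.0 + np.arange(n)) ** 1.5
y = theta + sigma * rng.normal(size=n)
theta_hat = blockwise_james_stein(y, sigma, block_size=32)
print("risk of raw data:    ", round(float(np.mean((y - theta) ** 2)), 3))
print("risk of blockwise JS:", round(float(np.mean((theta_hat - theta) ** 2)), 3))
```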


Evaluation Review | 2013

Covariance Adjustments for the Analysis of Randomized Field Experiments

Richard A. Berk; Emil Pitkin; Lawrence D. Brown; Andreas Buja; Edward I. George; Linda H. Zhao

Background: It has become common practice to analyze randomized experiments using linear regression with covariates. Improved precision of treatment effect estimates is the usual motivation. In a series of important articles, David Freedman showed that this approach can be badly flawed. Recent work by Winston Lin offers partial remedies, but important problems remain. Results: In this article, we address those problems through a reformulation of the Neyman causal model. We provide a practical estimator and valid standard errors for the average treatment effect. Proper generalizations to well-defined populations can follow. Conclusion: In most applications, the use of covariates to improve precision is not worth the trouble.
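
For context, the sketch below shows the unadjusted Neyman-style analysis that this literature takes as its starting point: the difference-in-means estimate of the average treatment effect with its conservative standard error. It is background to the abstract, not the covariance-adjusted estimator the article develops; the simulated experiment and its effect size are hypothetical.

```python
import numpy as np

def neyman_ate(y, treat):
    """y: outcomes; treat: 0/1 assignment from a randomized experiment."""
    y1, y0 = y[treat == 1], y[treat == 0]
    ate = y1.mean() - y0.mean()
    # Conservative (Neyman) variance: group sample variances over group sizes.
    se = np.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
    return ate, se

# Demo with a hypothetical experiment: true effect 2.0, noisy outcomes.
rng = np.random.default_rng(5)
n = 500
treat = rng.integers(0, 2, size=n)
x = rng.normal(size=n)                      # a covariate that adds outcome noise
y = 1.0 + 2.0 * treat + 3.0 * x + rng.normal(size=n)
ate, se = neyman_ate(y, treat)
print(f"ATE estimate {ate:.2f} (SE {se:.2f})")
```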


Sociological Methods & Research | 2014

Misspecified Mean Function Regression: Making Good Use of Regression Models That Are Wrong

Richard A. Berk; Lawrence D. Brown; Andreas Buja; Edward I. George; Emil Pitkin; Kai Zhang; Linda H. Zhao

There are over three decades of largely unrebutted criticism of regression analysis as practiced in the social sciences. Yet, regression analysis broadly construed remains for many the method of choice for characterizing conditional relationships. One possible explanation is that the existing alternatives sometimes can be seen by researchers as unsatisfying. In this article, we provide a different formulation. We allow the regression model to be incorrect and consider what can be learned nevertheless. To this end, the search for a correct model is abandoned. We offer instead a rigorous way to learn from regression approximations. These approximations, not “the truth,” are the estimation targets. There exist estimators that are asymptotically unbiased and standard errors that are asymptotically correct even when there are important specification errors. Both can be obtained easily from popular statistical packages.
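
One concrete device behind the claim about asymptotically correct standard errors is the heteroscedasticity-consistent sandwich estimator applied to OLS viewed as a best linear approximation. The sketch below illustrates that idea on a deliberately misspecified linear fit; it is an illustration of the general approach, not the article's full development, and the simulated data are invented.

```python
import numpy as np

def ols_sandwich(X, y):
    """Return OLS coefficients and HC0 sandwich standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = X.T @ (X * resid[:, None] ** 2)          # sum_i e_i^2 x_i x_i'
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Demo: the true mean function is quadratic, but we fit a straight line.
rng = np.random.default_rng(6)
n = 2000
x = rng.uniform(-2, 2, size=n)
y = x + 0.5 * x**2 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta, se = ols_sandwich(X, y)
print("slope of the best linear approximation:", round(float(beta[1]), 3),
      "+/-", round(float(se[1]), 3))
```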


Bernoulli | 2007

Trade-offs between global and local risks in nonparametric function estimation

T. Tony Cai; Mark G. Low; Linda H. Zhao

The problem of loss adaptation is investigated: given a fixed parameter space, the goal is to construct an estimator that adapts to the loss function in the sense that the estimator is optimal both globally and locally at every point. Given the class of estimator sequences that achieve the minimax rate, over a fixed Besov space, for estimating the entire function, a lower bound is given on the performance for estimating the function at each point. This bound is larger by a logarithmic factor than the usual minimax rate for estimation at a point when the global and local minimax rates of convergence differ. A lower bound for the maximum global risk is given for estimators that achieve optimal minimax rates of convergence at every point. An inequality concerning estimation in a two-parameter statistical problem plays a key role in the proof. It can be considered as a generalization of an inequality due to Brown and Low and may be of independent interest. A particular wavelet estimator is constructed which is globally optimal and which attains this lower bound for the local risk.


Statistics & Probability Letters | 2002

Bayesian nonparametric point estimation under a conjugate prior

Xuefeng Li; Linda H. Zhao

Estimation of a nonparametric regression function at a point is considered. The function is assumed to lie in a Sobolev space, Sq, of order q. The asymptotic squared-error performance of Bayes estimators corresponding to Gaussian priors is investigated as the sample size, n, increases. It is shown that for any such fixed prior on Sq the Bayes procedures do not attain the optimal minimax rate over balls in Sq. This result complements that in Zhao (Ann. Statist. 28 (2000) 532) for estimating the entire regression function, but the proof is rather different.
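
The conjugate calculation at the heart of this setting is easy to state in the Gaussian sequence-model form of the problem: with noise variance 1/n and a coordinatewise Gaussian prior, the posterior mean is a fixed linear shrinker. The sketch below spells this out; the prior variances and the true coefficients are illustrative choices, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000                                  # "sample size" -> noise variance 1/n
k = 2000                                    # number of coefficients kept
q = 2                                       # smoothness order of the Sobolev-type prior
theta = 5.0 / np.arange(1, k + 1) ** (q + 0.5)            # true coefficients (illustrative)
y = theta + rng.normal(scale=1 / np.sqrt(n), size=k)

tau2 = 1.0 / np.arange(1, k + 1) ** (2 * q + 1)           # prior variances tau_i^2
posterior_mean = tau2 / (tau2 + 1.0 / n) * y              # conjugate shrinkage rule

print("squared error of raw coefficients:", round(float(np.sum((y - theta) ** 2)), 5))
print("squared error of Bayes estimate:  ", round(float(np.sum((posterior_mean - theta) ** 2)), 5))
```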


Archive | 2013

What You Can Learn from Wrong Causal Models

Richard A. Berk; Lawrence D. Brown; Edward I. George; Emil Pitkin; Mikhail Traskin; Kai Zhang; Linda H. Zhao

It is common for social science researchers to provide estimates of causal effects from regression models imposed on observational data. The many problems with such work are well documented and widely known. The usual response is to claim, with little real evidence, that the causal model is close enough to the “truth” that sufficiently accurate causal effects can be estimated. In this chapter, a more circumspect approach is taken. We assume that the causal model is a substantial distance from the truth and then consider what can be learned nevertheless. To that end, we distinguish between how nature generated the data, a “true” model representing how this was accomplished, and a working model that is imposed on the data. The working model will typically be “wrong.” Nevertheless, unbiased or asymptotically unbiased estimates from parametric, semiparametric, and nonparametric working models can often be obtained in concert with appropriate statistical tests and confidence intervals. However, the estimates are not of the regression parameters typically assumed. Estimates of causal effects are not provided. Correlation is not causation. Nor is partial correlation, even when dressed up as regression coefficients. However, we argue that insights about causal effects do not require estimates of causal effects. We also discuss what can be learned when our alternative approach is not persuasive.

Collaboration


Dive into Linda H. Zhao's collaborations.

Top Co-Authors

Lawrence D. Brown, University of Pennsylvania
Richard A. Berk, University of Pennsylvania
Andreas Buja, University of Pennsylvania
Edward I. George, University of Pennsylvania
Emil Pitkin, University of Pennsylvania
Kai Zhang, University of North Carolina at Chapel Hill
Mark G. Low, University of Pennsylvania
T. Tony Cai, University of Pennsylvania