
Publication


Featured research published by David H. Annis.


Journal of Quantitative Analysis in Sports | 2005

Hybrid Paired Comparison Analysis, with Applications to the Ranking of College Football Teams

David H. Annis; Bruce A. Craig

Existing paired comparison models used for ranking football teams primarily focus on either wins and losses or points scored (either via each team's total or a margin of victory). While reasonable, each approach fails to produce satisfactory rankings in frequently arising situations because it ignores additional data. We propose a new, hybrid model incorporating both wins and constituent scores and show that it outperforms its competitors and is robust against model mis-specification based on a series of simulation studies. We conclude by illustrating the method using the 2003-04 and 2004-05 college football seasons.
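The win/loss class of models this abstract contrasts with is typified by the Bradley-Terry paired comparison model. As a rough sketch of that baseline (not of the hybrid model itself), the following fits Bradley-Terry strengths to a made-up schedule with the standard MM update; all team names and results are invented for illustration.

```python
from collections import defaultdict

def bradley_terry(games, iters=200):
    """Fit Bradley-Terry strengths to (winner, loser) pairs using the
    standard MM update; a higher strength means a better rating."""
    teams = sorted({t for g in games for t in g})
    r = {t: 1.0 for t in teams}
    wins = defaultdict(int)
    for w, _ in games:
        wins[w] += 1
    for _ in range(iters):
        new = {}
        for t in teams:
            denom = 0.0
            for w, l in games:
                if t in (w, l):
                    opp = l if t == w else w
                    denom += 1.0 / (r[t] + r[opp])
            new[t] = wins[t] / denom
        total = sum(new.values())  # renormalize for numerical stability
        r = {t: v * len(teams) / total for t, v in new.items()}
    return r

# Toy schedule: A beat B and C, B beat C, and C upset A.
games = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]
ratings = bradley_terry(games)
ranking = sorted(ratings, key=ratings.get, reverse=True)  # A first despite the upset
```

Note how the conflicting comparisons (C beat A, yet A has the better overall record) are reconciled by the overall records; the hybrid model in the paper additionally feeds in the constituent scores.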


The American Statistician | 2006

A Comparison of Potential Playoff Systems for NCAA I-A Football

David H. Annis; Samuel S. Wu

This article discusses the properties of various knockout tournament designs and presents theoretical results. Potential playoff schemes for Division I-A football are examined via simulation studies. Several metrics are used to assess the relative merits of playoff scenarios, which differ in the number, selection, and seeding of playoff teams. Most of these metrics suggest that college football would benefit from a limited playoff system. Interestingly, for the class of playoff systems examined, the number of teams influences performance far more than does the seeding procedure.
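The kind of simulation study described can be sketched in miniature: simulate a seeded single-elimination bracket under a simple strength model and estimate how often the strongest team wins. The strength model, the pairing rule, and all numbers below are illustrative assumptions, not the paper's.

```python
import random

def play(si, sj, rng):
    """Team with strength si beats team with strength sj w.p. si / (si + sj)."""
    return rng.random() < si / (si + sj)

def best_team_wins(n_teams, n_sims=4000, seed=0):
    """Estimate how often the strongest team survives a seeded
    single-elimination bracket (seed 1 vs seed n, 2 vs n-1, ...).
    Strengths n_teams..1 are an arbitrary illustrative choice."""
    rng = random.Random(seed)
    strengths = [n_teams - k for k in range(n_teams)]  # team 0 is strongest
    wins = 0
    for _ in range(n_sims):
        field = list(range(n_teams))
        while len(field) > 1:
            half = len(field) // 2
            field = [a if play(strengths[a], strengths[b], rng) else b
                     for a, b in zip(field[:half], reversed(field[half:]))]
        wins += field[0] == 0
    return wins / n_sims
```

Under this toy model the strongest team wins a 4-team bracket noticeably more often than a 16-team one, which loosely echoes the finding that the number of teams matters a great deal.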


The American Statistician | 2005

Rethinking the Paper Helicopter: Combining Statistical and Engineering Knowledge

David H. Annis

Box's paper helicopter has been used to teach experimental design for more than a decade. It is simple, inexpensive, and provides real data for an involved, multifactor experiment. Unfortunately, it can also further an all-too-common practice that Professor Box himself has repeatedly cautioned against, namely ignoring the fundamental science while rushing to solve problems that may not be sufficiently understood. Often this slighting of the science so as to get on with the statistics is justified by referring to Box's oft-quoted maxim that “All models are wrong, however some are useful.” Nevertheless, what is equally true, to paraphrase both Professor Box and George Orwell, is that “All models are wrong, but some are more wrong than others.” To experiment effectively it is necessary to understand the relevant science so as to distinguish between what is usefully wrong and what is dangerously wrong. This article presents an improved analysis of Box's helicopter problem relying on statistical and engineering knowledge and shows that this leads to an enhanced paper helicopter, requiring fewer experimental trials and achieving superior performance. In fact, of the 20 experimental trials run for validation—10 each of the proposed aerodynamic design and the conventional full factorial optimum—the longest 10 flight times all belong to the aerodynamic optimum, while the shortest 10 all belong to the conventional full factorial optimum. I further discuss how ancillary engineering knowledge can be incorporated into thinking about—and teaching—experimental design.
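The full factorial analysis the helicopter exercise is built around can be illustrated with a two-level design and main-effect estimates. The factor names and flight times below are invented for illustration; they are not Box's or Annis's data.

```python
from itertools import product

# Two-level full factorial design in three factors (2^3 = 8 runs).
factors = ["paper_weight", "body_width", "wing_length"]
design = list(product([-1, +1], repeat=len(factors)))

# Hypothetical flight times in seconds, one per run, matching the
# order of `design` (these numbers are invented, not experimental data).
y = [2.1, 2.9, 2.0, 2.8, 1.7, 2.4, 1.6, 2.3]

def main_effect(j):
    """Mean response at the factor's high level minus at its low level."""
    hi = [yi for row, yi in zip(design, y) if row[j] == +1]
    lo = [yi for row, yi in zip(design, y) if row[j] == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

effects = {f: main_effect(j) for j, f in enumerate(factors)}
# With these invented data, longer wings help (+0.75 s), heavier paper
# hurts (-0.45 s), and body width barely matters (-0.10 s).
```

The article's point is that such purely statistical effect estimates should be interpreted alongside the aerodynamics rather than in place of it.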


Journal of Quantitative Analysis in Sports | 2006

Optimal End-Game Strategy in Basketball

David H. Annis

When faced with protecting a three-point lead in the waning seconds of a basketball game, which is the preferable strategy: playing defense or fouling the offense before it can attempt a game-tying shot? Gonzaga University head coach Mark Few was faced with such a decision against Michigan State in the semi-finals of the Maui Invitational (November 22, 2005) and elected to play defense. The strategy backfired, as Michigan State's Maurice Ager made a three-point basket at the buzzer to force overtime. (Gonzaga eventually won in triple overtime.) Was this failure to hold the lead at the end of regulation bad luck or bad strategy? Put another way, which strategy (conventional defense or intentionally fouling) maximizes the defensive team's chances of winning the game? Drawing on the Gonzaga/Michigan State game for inspiration, this paper addresses this question and concludes that, contrary to popular belief, intentionally fouling is preferable to playing tight defense.
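The shape of this comparison can be mimicked with a crude Monte Carlo model. Every probability below (three-point percentage, free-throw percentage, rebound and putback odds, overtime as a coin flip) is an invented assumption, and the model ignores the clock and any later possessions, so it only illustrates the calculation, not the paper's actual analysis.

```python
import random

# All probabilities here are invented for illustration.
P_THREE = 0.33  # trailing team hits the game-tying three against defense
P_FT = 0.75     # free-throw percentage
P_OREB = 0.15   # offensive rebound off an intentionally missed free throw
P_TIE2 = 0.05   # the rebound is converted into a game-tying score
P_OT = 0.50     # overtime treated as a coin flip

def leading_team_wins(foul, rng):
    """One simulated final possession, leading by three. Simplification:
    the game ends after this possession unless it goes to overtime."""
    if not foul:
        if rng.random() < P_THREE:       # tying three goes in
            return rng.random() < P_OT   # coin-flip overtime
        return True
    # Intentional foul: trailing team makes the first free throw, then
    # intentionally misses the second hoping for a tying putback.
    if rng.random() < P_FT and rng.random() < P_OREB and rng.random() < P_TIE2:
        return rng.random() < P_OT
    return True

def win_rate(foul, n=50_000, seed=1):
    rng = random.Random(seed)
    return sum(leading_team_wins(foul, rng) for _ in range(n)) / n
```

With these made-up inputs the fouling strategy wins more often, in line with the paper's conclusion; changing the assumed probabilities changes the margin.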


The American Statistician | 2010

Estimation in Reversible Markov Chains

David H. Annis; Peter C. Kiessler; Robert Lund; Tara L. Steuber

This article examines estimation of the one-step-ahead transition probabilities in a reversible Markov chain on a countable state space. A symmetrized moment estimator is proposed that exploits the reversible structure. Examples are given where the symmetrized estimator has superior asymptotic properties to those of a naive estimator, implying that knowledge of reversibility can sometimes improve estimation. The asymptotic mean and variance of the estimators are quantified. The results are proven using only elementary results such as the law of large numbers and the central limit theorem.
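The contrast between a naive estimator and a symmetrized one can be sketched on a small birth-death chain, which is automatically reversible. The symmetrized form below, (n_ij + n_ji) / (2 n_i), is one natural moment estimator exploiting detailed balance; it is an illustration and may differ from the exact estimator in the article.

```python
from collections import Counter
import random

def count_transitions(path):
    """Transition-pair counts and visit counts from one sample path."""
    pairs = Counter(zip(path, path[1:]))
    visits = Counter(path[:-1])
    return pairs, visits

def naive(pairs, visits, i, j):
    return pairs[(i, j)] / visits[i]

def symmetrized(pairs, visits, i, j):
    # Under detailed balance, i->j and j->i transition counts agree in
    # expectation, so their average relative to visits to i estimates p_ij.
    return (pairs[(i, j)] + pairs[(j, i)]) / (2 * visits[i])

# A reversible birth-death chain on {0, 1, 2}; the true p_01 is 0.5.
P = {0: [(1, 0.5), (0, 0.5)],
     1: [(0, 0.25), (2, 0.25), (1, 0.5)],
     2: [(1, 0.5), (2, 0.5)]}

def step(s, rng):
    u, acc = rng.random(), 0.0
    for t, p in P[s]:
        acc += p
        if u < acc:
            return t
    return s

rng = random.Random(7)
path = [0]
for _ in range(20_000):
    path.append(step(path[-1], rng))
pairs, visits = count_transitions(path)
```

Both estimators are consistent here; the article's point is that the symmetrized version can have better asymptotic behavior when reversibility holds.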


The American Statistician | 2007

Dyadic Data Analysis

David H. Annis

neighbors, classification and regression trees (six pages), and neural networks (17 pages). The mechanics of the latter are adequately described, but there is too much emphasis on arithmetic, and little effort is made to intuitively justify the prediction process. At the end of the chapter, there is one paragraph on multiple regression analysis and one sentence on logistic regression. A major disappointment to me was the almost exclusive reliance in the examples on a rather old automobile fuel efficiency dataset (there is one observation for a Datsun 1200 vehicle). I had hoped to see some real business applications. After a four-page “Deployment” chapter, the book ends with a “Conclusions” chapter containing one large-scale example involving data on the incidence of diabetes among Pima Indians. Here we find histograms, box plots, a two-sample t test, some derived association rules which I did not find overly insightful, and a brief summary of prediction results via neural networks. I think students coming out of an undergraduate regression course could do a fine job of analyzing these data without resorting to the bells and whistles of data mining. And once again, where are the business applications? Although the text does give a brief snapshot of the subject, it is lacking in detail, applications, and opportunities for practice. Someone considering becoming involved in a data mining project or teaching an introductory course in the subject would be advised to learn much more than what MSD offers. Good information sources are the much more ambitious books by Hastie, Tibshirani, and Friedman (2001) (the best-selling Springer statistics book ever, thanks to purchases by those outside our discipline) and Bishop (2006).


Journal of Quantitative Analysis in Sports | 2007

Dimension Reduction for Hybrid Paired Comparison Models

David H. Annis

Rating -- and subsequently ranking -- college football teams requires making sense of sometimes conflicting pair-wise comparisons. Classical statistical techniques fall into one of two classes: win/loss models, which focus on binary outcomes, and point-scoring models, which consider the distribution of component scores. Annis and Craig (2005) illustrate deficiencies of both and propose a hybrid method that considers both sources of data. Their method, while providing satisfactory results in many circumstances, can be difficult to implement numerically. This paper presents a refinement of their hybrid rating algorithm that preserves their original intent but greatly simplifies its implementation. Like its predecessor, the new model is robust to model misspecification.


Journal of the American Statistical Association | 2005

Permutation, Parametric, and Bootstrap Tests of Hypotheses

David H. Annis

Kenneth Lange’s latest book deftly blends theoretical and practical concepts about optimization theory in the field of statistics. In the foreword, Lange states that the text is intended for graduate students in statistics, although I think that it may also be accessible for upper-level undergraduate students with a rigorous background in pure mathematics. The book can be divided into two sections. The first five chapters of the text contain very little information on practical optimization methods, but rather act as a primer on the mathematical analysis necessary to understand the analytic underpinnings of modern optimization. Although it may not have been the author’s intent, the first five chapters may have significant educational value beyond computing as a primer that could be titled “A Review of Real Analysis for Statisticians.” Lange gives very brief but clear descriptions of analytical ideas such as convergence, connectedness, and differentiation that would be useful for any first-year statistics doctoral student lacking the proper undergraduate real analysis courses needed for advanced statistics courses; the only necessary topic missing is that of measure theory. This book helps develop the student’s intuition for abstract concepts in analysis much better than many undergraduate analysis texts. Chapter 2 contains an overview of six of the “seven C’s of analysis” (convergence, complete, closed, compact, continuous, and connected). Although the chapter is only 21 pages long, any student who devotes proper attention to it will learn very quickly whether such material constitutes a review of the basic analysis necessary for statistical inference or whether he or she needs to seek outside reference materials for more details on the fundamental theorems and ideas presented (e.g., Bolzano–Weierstrass, intermediate value theorem, uniform continuity).
Next comes a chapter devoted to the concepts of differentiation that are essential for proofs in optimization. Because the bulk of the optimization algorithms used in statistics require these tools, the material covered is absolutely essential (although hopefully covered in a previous advanced calculus course). Chapter 4 covers a topic even more specific to optimization, Karush–Kuhn–Tucker (KKT) theory. Although the material in this chapter is extremely useful for describing the algorithms to come, the steep gradient of mathematical sophistication was a bit jarring; students may quickly find the nature of the course changed after becoming accustomed to the more introductory material in the first three chapters. Chapter 5, on convexity (the seventh “C”), is a jewel of the text. Convexity can be a unifying concept in statistics, yet is rarely given sufficient attention as such. The author returns to the style of the first three chapters and presents clear definitions and examples in the area of convexity. Students who have a firm grasp of the material in this chapter should gain maximum benefit from the presentation of the algorithms in the second half of the book. The second half of the text gives more practical direction in the art of statistical optimization. The usual suspects (e.g., the EM algorithm, Newton’s and quasi-Newton methods) receive their own chapters, although even in the discussion of these standard optimization methods, Lange embeds the methodology in a more general framework. His discussion of the majorization and minimization algorithm provides a common framework in which to discuss other methods (which have seen significantly greater use in statistics). Chapter 6, on majorization–minimization (MM) algorithms, brings the first real applications of the material in the first half to statistical optimization problems.
The philosophy of the MM algorithm is presented in a straightforward manner, and relevant examples of potential implementations of the algorithm are presented for linear regression and the Bradley–Terry model of ranking. As someone who has taught optimization, I found the examples in the text interesting but wished more applications like those had been provided as exercises for students. Chapters 7 and 8 cover the two most important optimization algorithms in statistics: the EM algorithm and Newton’s method-based maximization. The one unique feature of these chapters (compared to other texts) is the consideration of both algorithms in light of the principles behind the MM algorithm. One can find interesting but accessible examples throughout both chapters, including, but not limited to, factor analysis, image analysis, and generalized linear models. From a researcher’s perspective, I found the material in Chapters 9–11 the most relevant. Conjugate gradient, convex programming, interior point methods, and duality have not been extensively covered in statistical computing textbooks. Such modern optimization methods have been explored in statistical optimization practice only very recently. I was glad to see them given a fair bit of explanation in the last three chapters. Convergence analysis of optimization algorithms has typically received only cursory treatment when new statistical optimization algorithms are presented; the coverage here should be useful to someone interested in exploring the limits of current and future statistical optimization algorithms. The material in Chapter 10 facilitates theoretical (rather than simulated) comparison of algorithms in different contexts. Although I found Optimization to be an extremely engaging textbook, I find it difficult to advocate its use as a single graduate text for a statistical computing course.
The book covers neither stochastic versions of the EM algorithm nor simulated annealing, two of the more popular optimization algorithms used by statisticians. The exercises are entirely theoretical in nature, which may not be appropriate for most courses in graduate-level statistical computing. The number of graduate programs in statistics that could devote an entire semester-long computing course to the finer (and theoretical) points of maximization algorithms is most likely small. However, the text is ideal for graduate students or researchers beginning research on optimization problems in statistics. There is little doubt that someone who worked through the text as part of a reading course or a specialized graduate seminar would benefit greatly from the author’s perspective, gaining a more intricate understanding of why optimization algorithms work, rather than merely how to implement them.
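A classic small instance of the MM idea the review highlights is minimizing sum_i |x - a_i| (whose minimizer is the sample median) by repeatedly minimizing a quadratic majorizer, which reduces each step to a weighted mean. The data and starting point below are arbitrary; this is a textbook-style illustration, not an example from Lange's book.

```python
def mm_median(a, x=0.0, iters=100, eps=1e-12):
    """Minimize sum_i |x - a_i| by MM: majorize each |x - a_i| with a
    quadratic touching it at the current iterate, then minimize the
    surrogate, which gives a weighted-mean update (weights 1/|x - a_i|)."""
    for _ in range(iters):
        w = [1.0 / max(abs(x - ai), eps) for ai in a]
        x = sum(wi * ai for wi, ai in zip(w, a)) / sum(w)
    return x

data = [1.0, 2.0, 3.5, 7.0, 10.0]
est = mm_median(data)  # converges toward the sample median, 3.5
```

Each surrogate minimization is trivial even though the original objective is non-differentiable, which is exactly the appeal of the MM framework.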


Journal of the American Statistical Association | 2008

Bayes Linear Statistics: Theory and Methods

David H. Annis

Annis reviews Bayes Linear Statistics: Theory and Methods by Michael Goldstein and David Wooff.


Journal of the American Statistical Association | 2006

Bayesian Statistics and Marketing

David H. Annis

writing style is clear and practical, with an emphasis on how the methods can be applied to real applications. In many areas, the book outlines methods on the edge of practice, which will be helpful for those active in research. The authors suggest that the fundamental prerequisites for the book are “knowledge of probability and statistics at least at the level of a first university course in a quantitative discipline” and “familiarity with undergraduate calculus and linear algebra.” Given some of the more technical material presented, some additional statistics background beyond a single course would be highly beneficial. I do, however, disagree with the authors when they say that some previous exposure to finance, economics, or insurance is beneficial but not absolutely necessary. The examples and discussion of concepts will not be accessible to a reader without good exposure to some of these areas. Chapter 1 provides a helpful and detailed overview of many aspects of financial risk, including a brief history of the subject, why managing risk is important from various perspectives, and the new regulatory framework. Chapter 2 presents the core building blocks of risk management, including loss distributions, value at risk, and exploratory summaries of market risk. Chapters 3–7 discuss multivariate models, financial time series, copulas, aggregate risk, and extreme value theory. By presenting some of the key theoretical results, combined with numerous examples and plots to help visualize results, the book gives helpful introductions and details about these topics. Chapters 8 and 9 present a detailed overview of credit risk management and dynamic credit risk models from a variety of perspectives. Finally, Chapter 10 presents some material on operational risk and insurance analytics. A short appendix provides a skeleton overview of some basic definitions, probability distributions, and likelihood inference approaches.
For those not familiar with these, some additional references certainly would be needed. One of this book’s valuable features is the breadth of methods explored from a variety of disciplines. This allows the reader to get a clearer sense of what options are available and which approach is best suited for a particular application. Given the book’s ambitious mandate and the wide assortment of topics considered, it is not surprising that sometimes the book is a bit terse and dense. Some additional background from supplemental books will be needed to fill in missing details. Nonetheless, overall this book is an excellent resource for anyone interested in understanding how best to quantify financial risk.
