Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Galit Shmueli is active.

Publication


Featured research published by Galit Shmueli.


Statistical Science | 2010

To Explain or to Predict?

Galit Shmueli

Statistical modeling is a powerful tool for developing and testing theories by way of causal explanation, prediction, and description. In many disciplines there is near-exclusive use of statistical modeling for causal explanation and the assumption that models with high explanatory power are inherently of high predictive power. Conflation between explanation and prediction is common, yet the distinction must be understood for progressing scientific knowledge. While this distinction has been recognized in the philosophy of science, the statistical literature lacks a thorough discussion of the many differences that arise in the process of modeling for an explanatory versus a predictive goal. The purpose of this article is to clarify the distinction between explanatory and predictive modeling, to discuss its sources, and to reveal the practical implications of the distinction to each step in the modeling process.
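The distinction is easy to see in a few lines of analysis. The sketch below uses synthetic data (not from the paper) and made-up variable names to contrast the two workflows: the explanatory fit is judged by coefficient estimates and p-values on the full sample, while the predictive fit is judged by out-of-sample error on a holdout.

```python
"""Explanatory vs. predictive use of the same linear model (illustrative only)."""
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 + 0.2 * x2 + rng.normal(scale=1.0, size=n)

X = sm.add_constant(np.column_stack([x1, x2]))

# Explanatory modeling: fit on all data, interpret coefficients and p-values.
explanatory_fit = sm.OLS(y, X).fit()
print(explanatory_fit.params)    # estimated effects, read against the theory
print(explanatory_fit.pvalues)   # inference on the hypothesized relationships

# Predictive modeling: hold out data and measure out-of-sample accuracy.
train, test = slice(0, 400), slice(400, n)
predictive_fit = sm.OLS(y[train], X[train]).fit()
rmse = np.sqrt(np.mean((y[test] - predictive_fit.predict(X[test])) ** 2))
print(f"holdout RMSE: {rmse:.3f}")
```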


Management Information Systems Quarterly | 2011

Predictive analytics in information systems research

Galit Shmueli; Otto R. Koppius

This research essay highlights the need to integrate predictive analytics into information systems research and shows several concrete ways in which this goal can be accomplished. Predictive analytics include empirical methods (statistical and other) that generate data predictions as well as methods for assessing predictive power. Predictive analytics not only assist in creating practically useful models, they also play an important role alongside explanatory modeling in theory building and theory testing. We describe six roles for predictive analytics: new theory generation, measurement development, comparison of competing theories, improvement of existing models, relevance assessment, and assessment of the predictability of empirical phenomena. Despite the importance of predictive analytics, we find that they are rare in the empirical IS literature. Extant IS literature relies nearly exclusively on explanatory statistical modeling, where statistical inference is used to test and evaluate the explanatory power of underlying causal models, and predictive power is assumed to follow automatically from the explanatory model. However, explanatory power does not imply predictive power and thus predictive analytics are necessary for assessing predictive power and for building empirical models that predict well. To show that predictive analytics and explanatory statistical modeling are fundamentally disparate, we show that they are different in each step of the modeling process. These differences translate into different final models, so that a pure explanatory statistical model is best tuned for testing causal hypotheses and a pure predictive model is best in terms of predictive power. We convert a well-known explanatory paper on TAM to a predictive context to illustrate these differences and show how predictive analytics can add theoretical and practical value to IS research.
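As one illustration of the "comparison of competing theories" role listed above, the sketch below compares two hypothetical model specifications for a TAM-style outcome by cross-validated out-of-sample error rather than in-sample fit. The data and variable names are synthetic and are not the paper's converted TAM example.

```python
"""Comparing two candidate specifications by predictive power (synthetic data)."""
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 1000
usefulness = rng.normal(size=n)
ease_of_use = rng.normal(size=n)
intention = 0.6 * usefulness + 0.3 * ease_of_use + rng.normal(scale=1.0, size=n)

theory_a = np.column_stack([usefulness])                # simpler specification
theory_b = np.column_stack([usefulness, ease_of_use])   # richer specification

for name, X in [("theory A", theory_a), ("theory B", theory_b)]:
    # 5-fold cross-validated RMSE as the out-of-sample yardstick
    rmse = -cross_val_score(LinearRegression(), X, intention,
                            cv=5, scoring="neg_root_mean_squared_error")
    print(f"{name}: mean out-of-sample RMSE = {rmse.mean():.3f}")
```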


Information Systems Research | 2013

Research Commentary---Too Big to Fail: Large Samples and the p-Value Problem

Mingfeng Lin; Henry C. Lucas; Galit Shmueli

The Internet has provided IS researchers with the opportunity to conduct studies with extremely large samples, frequently well over 10,000 observations. There are many advantages to large samples, but researchers using statistical inference must be aware of the p-value problem associated with them. In very large samples, p-values go quickly to zero, and solely relying on p-values can lead the researcher to claim support for results of no practical significance. In a survey of large sample IS research, we found that a significant number of papers rely on a low p-value and the sign of a regression coefficient alone to support their hypotheses. This research commentary recommends a series of actions the researcher can take to mitigate the p-value problem in large samples and illustrates them with an example of over 300,000 camera sales on eBay. We believe that addressing the p-value problem will increase the credibility of large sample IS research as well as provide more insights for readers.
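A small simulation makes the problem concrete. In the sketch below (simulated data, not the eBay sample), the true effect is practically negligible, yet at n = 300,000 its p-value is essentially zero, so judging by a low p-value and the coefficient's sign alone would appear to "support" the hypothesis.

```python
"""Why a tiny, practically meaningless effect looks 'significant' at large n."""
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 300_000
x = rng.normal(size=n)
y = 0.01 * x + rng.normal(size=n)   # true effect explains roughly 0.01% of variance

fit = sm.OLS(y, sm.add_constant(x)).fit()
print(f"coefficient: {fit.params[1]:.4f}")
print(f"p-value:     {fit.pvalues[1]:.2e}")   # near zero despite the trivial effect
print(f"R-squared:   {fit.rsquared:.5f}")     # practical significance is negligible
```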


Proceedings of the National Academy of Sciences of the United States of America | 2002

Early statistical detection of anthrax outbreaks by tracking over-the-counter medication sales

Anna Goldenberg; Galit Shmueli; Richard A. Caruana; Stephen E. Fienberg

The recent series of anthrax attacks has reinforced the importance of biosurveillance systems for the timely detection of epidemics. This paper describes a statistical framework for monitoring grocery data to detect a large-scale but localized bioterrorism attack. Our system illustrates the potential of data sources that may be more timely than traditional medical and public health data. The system includes several layers, each customized to grocery data and tuned to finding footprints of an epidemic. We also propose an evaluation methodology that is suitable in the absence of data on large-scale bioterrorist attacks and disease outbreaks.
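The sketch below is not the paper's multi-layer system, only a single-layer stand-in that conveys the idea: estimate a day-of-week baseline for simulated over-the-counter sales from a period assumed outbreak-free, then flag days whose residual exceeds a simple threshold. All data are simulated.

```python
"""Single-layer illustration of monitoring daily over-the-counter sales (simulated)."""
import numpy as np

rng = np.random.default_rng(7)
days = 120
dow_effect = np.tile([100, 95, 90, 92, 98, 130, 125], days // 7 + 1)[:days]
sales = rng.poisson(dow_effect).astype(float)
sales[100:105] += np.array([20, 45, 80, 60, 30])   # injected outbreak footprint

train = slice(0, 90)                                # assumed outbreak-free period
baseline = np.array([sales[train][d::7].mean() for d in range(7)])
residual = sales - baseline[np.arange(days) % 7]    # remove day-of-week pattern
sigma = residual[train].std()

alarms = np.flatnonzero(residual > 3 * sigma)       # simple 3-sigma alerting rule
print("alarm days:", alarms)                        # expected near the injected peak
```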


Technometrics | 2010

Statistical Challenges Facing Early Outbreak Detection in Biosurveillance

Galit Shmueli; Howard Burkom

Modern biosurveillance is the monitoring of a wide range of prediagnostic and diagnostic data for the purpose of enhancing the ability of the public health infrastructure to detect, investigate, and respond to disease outbreaks. Statistical control charts have been a central tool in classic disease surveillance and also have migrated into modern biosurveillance; however, the new types of data monitored, the processes underlying the time series derived from these data, and the application context all deviate from the industrial setting for which these tools were originally designed. Assumptions of normality, independence, and stationarity are typically violated in syndromic time series. Target values of process parameters are time-dependent and hard to define, and data labeling is ambiguous in the sense that outbreak periods are not clearly defined or known. Additional challenges include multiplicity in several dimensions, performance evaluation, and practical system usage and requirements. Our focus is mainly on the monitoring of time series to provide early alerts of anomalies to stimulate investigation of potential outbreaks, with a brief summary of methods to detect significant spatial and spatiotemporal case clusters. We discuss the statistical challenges in monitoring modern biosurveillance data, describe the current state of monitoring in the field, and survey the most recent biosurveillance literature.
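For concreteness, the sketch below applies a textbook one-sided CUSUM chart to a simulated syndromic count series. It assumes a known in-control mean and hand-picked chart parameters, which is exactly the sort of assumption the abstract argues is hard to justify for real syndromic data.

```python
"""One-sided CUSUM chart on a simulated syndromic count series (illustrative only)."""
import numpy as np

rng = np.random.default_rng(3)
mu0 = 50.0                                  # assumed known in-control daily mean
counts = rng.poisson(mu0, size=100).astype(float)
counts[70:] += 12                           # simulated sustained outbreak shift

k, h = 5.0, 20.0                            # reference value and decision limit
s = 0.0
alarm_day = None
for t, x in enumerate(counts):
    s = max(0.0, s + (x - mu0 - k))         # upper one-sided CUSUM recursion
    if s > h and alarm_day is None:
        alarm_day = t

print("first alarm on day:", alarm_day)     # typically a few days after day 70
```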


Information Systems Research | 2008

Consumer Surplus in Online Auctions

Ravi Bapna; Wolfgang Jank; Galit Shmueli

Despite the growing research interest in Internet auctions, particularly those on eBay, little is known about quantifiable consumer surplus levels in such mechanisms. Using an ongoing novel field experiment that involves real bidders participating in real auctions, and voting with real dollars, we collect and examine a unique dataset to estimate consumer surplus in eBay auctions. The estimation procedure relies mainly on knowing the highest bid, which is not disclosed by eBay, but is available to us from our experiment. At the outset we assume a private value second-price sealed-bid auction setting, as well as a lack of alternative buying options within or outside eBay. Our analysis, based on a sample of 4514 eBay auctions, indicates that consumers extract a median surplus of at least $4 per eBay auction. This estimate is unbiased under the above assumptions, and otherwise it is a lower bound. The distribution of surplus is highly skewed given the diverse nature of the data. We find that eBay's auctions generate at least $7.05 billion in total consumer surplus in the year 2003 and may generate up to $7.68 billion if the private value sealed-bid assumption does not hold. We check for the validity of our assumptions and the robustness of our estimates using an additional dataset from 2005 and a randomly sampled validation dataset from eBay.
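Under the second-price, private-value view described above, per-auction surplus is simply the winner's (normally undisclosed) highest bid minus the final price paid. The sketch below computes median and total surplus for a handful of made-up auctions; the numbers are illustrative and are not the paper's data.

```python
"""Consumer surplus per auction under the second-price, private-value view."""
import numpy as np

# (highest_bid, final_price) pairs in dollars -- hypothetical, for illustration
auctions = np.array([
    [25.00,  21.50],
    [10.00,   9.99],
    [103.50, 76.00],
    [15.26,  15.26],   # winner paid their full bid: zero surplus
])

surplus = auctions[:, 0] - auctions[:, 1]   # highest bid minus price paid
print("median surplus per auction:", np.median(surplus))
print("total surplus in sample:   ", surplus.sum())
```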


Journal of Business & Economic Statistics | 2008

Explaining and Forecasting Online Auction Prices and Their Dynamics Using Functional Data Analysis

Shanshan Wang; Wolfgang Jank; Galit Shmueli

Online auctions have become increasingly popular in recent years, and as a consequence there is a growing body of empirical research on this topic. Most of that research treats data from online auctions as cross-sectional, and consequently ignores the changing dynamics that occur during an auction. In this article we take a different look at online auctions and propose to study an auction's price evolution and associated price dynamics. Specifically, we develop a dynamic forecasting system to predict the price of an ongoing auction. By dynamic, we mean that the model can predict the price of an auction "in progress" and can update its prediction based on newly arriving information. Forecasting price in online auctions is challenging because traditional forecasting methods cannot adequately account for two features of online auction data: (1) the unequal spacing of bids and (2) the changing dynamics of price and bidding throughout the auction. Our dynamic forecasting model accounts for these special features by using modern functional data analysis techniques. Specifically, we estimate an auction's price velocity and acceleration and use these dynamics, together with other auction-related information, to develop a dynamic functional forecasting model. We also use the functional context to systematically describe the empirical regularities of auction dynamics. We apply our method to a novel set of Harry Potter and Microsoft Xbox data and show that our forecasting model outperforms traditional methods.
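The functional-data idea can be sketched in a few lines: smooth one auction's step-like price path over normalized auction time with a spline, then read off its first and second derivatives as price velocity and acceleration. The bid path below is hypothetical, and the paper's full model, which combines these dynamics with other auction covariates, is omitted.

```python
"""Price velocity and acceleration from a single (made-up) auction's price path."""
import numpy as np
from scipy.interpolate import UnivariateSpline

# (normalized time in [0, 1], live auction price) -- hypothetical 7-day auction
t = np.array([0.00, 0.05, 0.20, 0.45, 0.70, 0.85, 0.95, 0.99, 1.00])
price = np.array([0.99, 1.25, 3.50, 4.00, 6.75, 9.00, 14.5, 21.0, 22.5])

curve = UnivariateSpline(t, price, k=4, s=1.0)   # smoothed price curve
velocity = curve.derivative(1)                   # first derivative of the curve
acceleration = curve.derivative(2)               # second derivative of the curve

grid = np.linspace(0, 1, 5)
print("price:       ", np.round(curve(grid), 2))
print("velocity:    ", np.round(velocity(grid), 2))
print("acceleration:", np.round(acceleration(grid), 2))
```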


Bayesian Analysis | 2006

Conjugate Analysis of the Conway-Maxwell-Poisson Distribution

Joseph B. Kadane; Galit Shmueli; Thomas P. Minka; Sharad Borle; Peter Boatwright



Statistical Science | 2006

Functional Data Analysis in Electronic Commerce Research

Wolfgang Jank; Galit Shmueli



Journal of Computational and Graphical Statistics | 2005

Visualizing Online Auctions

Galit Shmueli; Wolfgang Jank


Collaboration


Dive into Galit Shmueli's collaborations.

Top Co-Authors

Ravi Bapna
University of Minnesota

Suneel Babu Chatla
National Tsing Hua University