Publication


Featured research published by David John Gagne.


international conference on data mining | 2008

Spatiotemporal Relational Probability Trees: An Introduction

Amy McGovern; Nathan C. Hiers; Matthew W. Collier; David John Gagne; Rodger A. Brown

We introduce spatiotemporal relational probability trees (SRPTs), probability estimation trees for relational data that can vary in both space and time. The SRPT algorithm addresses the exponential increase in search complexity through sampling. We validate the SRPT using a simulated data set and we empirically demonstrate the SRPT algorithm on two real-world data sets.


Journal of Atmospheric and Oceanic Technology | 2009

Classification of Convective Areas Using Decision Trees

David John Gagne; Amy McGovern; Jerry Brotzge

This paper presents an automated approach for classifying storm type from weather radar reflectivity using decision trees. Recent research indicates a strong relationship between storm type (morphology) and severe weather, and such information can aid in the warning process. Furthermore, new adaptive sensing tools, such as the Center for Collaborative Adaptive Sensing of the Atmosphere’s (CASA’s) weather radar, can make use of storm-type information in real time. Given the volume of weather radar data from those tools, manual classification of storms is not possible when dealing with real-time data streams. An automated system can more quickly and efficiently sort through real-time data streams and return value-added output in a form that can be more easily manipulated and understood. The method of storm classification in this paper combines two machine learning techniques: K-means clustering and decision trees. K-means segments the reflectivity data into clusters, and decision trees classify eac...


Statistical Analysis and Data Mining | 2011

Using spatiotemporal relational random forests to improve our understanding of severe weather processes

Amy McGovern; David John Gagne; Nathaniel Troutman; Rodger A. Brown; Jeffrey B. Basara; John K. Williams

Major severe weather events can cause a significant loss of life and property. We seek to revolutionize our understanding of and our ability to predict such events through the mining of severe weather data. Because weather is inherently a spatiotemporal phenomenon, mining such data requires a model capable of representing and reasoning about complex spatiotemporal dynamics, including temporally and spatially varying attributes and relationships. We introduce an augmented version of the Spatiotemporal Relational Random Forest, which is a random forest that learns with spatiotemporally varying relational data. Our algorithm maintains the strength and performance of random forests but extends their applicability, including the estimation of variable importance, to complex spatiotemporal relational domains. We apply the augmented Spatiotemporal Relational Random Forest to three severe weather data sets. These are: predicting atmospheric turbulence across the continental United States, examining the formation of tornadoes near strong frontal boundaries, and understanding the spatial evolution of drought across the southern plains of the United States. The results on such a wide variety of real-world domains demonstrate the extensive applicability of the Spatiotemporal Relational Random Forest. Our long-term goal is to significantly improve the ability to predict and warn about severe weather events. We expect that the tools and techniques we develop will be applicable to a wide range of complex spatiotemporal phenomena.


Machine Learning | 2014

Enhancing understanding and improving prediction of severe weather through spatiotemporal relational learning

Amy McGovern; David John Gagne; John K. Williams; Rodger A. Brown; Jeffrey B. Basara

Severe weather, including tornadoes, thunderstorms, wind, and hail, annually causes significant loss of life and property. We are developing spatiotemporal machine learning techniques that will enable meteorologists to improve the prediction of these events by improving their understanding of the fundamental causes of the phenomena and by building skillful empirical predictive models. In this paper, we present significant enhancements of our Spatiotemporal Relational Probability Trees that enable autonomous discovery of spatiotemporal relationships as well as learning with arbitrary shapes. We focus our evaluation on two real-world case studies using our technique: predicting tornadoes in Oklahoma and predicting aircraft turbulence in the United States. We also discuss how to evaluate success for a machine learning algorithm in the severe weather domain, which will enable new methods such as ours to transfer from research to operations, provide a set of lessons learned for embedded machine learning applications, and discuss how to field our technique.


Bulletin of the American Meteorological Society | 2015

Solar Energy Prediction: An International Contest to Initiate Interdisciplinary Research on Compelling Meteorological Problems

Amy McGovern; David John Gagne; Lucas Eustaquio; Gilberto Titericz; Benjamin Lazorthes; Owen Zhang; Gilles Louppe; Peter Prettenhofer; Jeffrey B. Basara; Thomas M. Hamill; David Margolin

As meteorological observing systems and models grow in complexity and number, the size of the data becomes overwhelming for humans to analyze using traditional techniques. Computer scientists, and specifically machine learning and data mining researchers, are developing frameworks for analyzing big data. The AMS Committee on Artificial Intelligence and its Applications to Environmental Science aims to bring AI researchers and environmental scientists together to increase the synergy between the two. The AI committee has sponsored 4 previous contests on a variety of meteorological problems including wind energy, storm classification, winter hydrometeor classification, and air pollution, with the goal of bringing together the two fields of research. Although these were successful, the audience was limited to existing environmental science researchers (usually 10-20 teams of people primarily within the AMS community). For the 2013/14 contest, we expanded to a global audience by focusing on the compelling problem of solar energy prediction and by having the established forum Kaggle host our contest. Using this forum, we had over 160 teams from all around the world participate. Improved solar energy forecasting is a necessary component of making solar energy a viable alternative power source. This paper summarizes our experiences in the 2013/14 contest, discusses the data in detail, and presents the winning prediction methods. The contest data come from the NOAA/ESRL Global Ensemble Forecasting System Reforecast Version 2 and the Oklahoma Mesonet with sponsorship from EarthRisk Technologies. All winning methods utilized gradient boosted regression trees but differed in parameter choices and interpolation methods.


Proceedings of the 2011 workshop on Knowledge discovery, modeling and simulation | 2011

Machine learning enhancement of storm scale ensemble precipitation forecasts

David John Gagne; Amy McGovern; Ming Xue

Precipitation forecasts provide both a crucial service for the general populace and a challenging forecasting problem due to the complex, multi-scale interactions required for precipitation formation. The Center for the Analysis and Prediction of Storms (CAPS) Storm Scale Ensemble Forecast (SSEF) system is a promising method of providing high-resolution forecasts of the intensity and uncertainty in precipitation forecasts. The SSEF incorporates multiple models with varied parameterization scheme combinations and produces forecasts every 4 km over the continental US. The SSEF precipitation forecasts exhibit significant negative biases and placement errors. In order to correct these issues, multiple machine learning algorithms have been applied to the SSEF precipitation forecasts to correct the forecasts using the NSSL National Mosaic and Multisensor QPE (NMQ) grid as verification. The 2010 SSEF was used for training. Two levels of post-processing are performed. In the first, probabilities of any precipitation are determined and used to find optimal thresholds for the precipitation areas. Then, three types of forecasts are produced in those areas. First, the probability of the 1-hour accumulated precipitation exceeding a threshold is predicted with random forests, logistic regression, and multivariate adaptive regression splines (MARS). Second, deterministic forecasts based on a correction from the ensemble mean are made with linear regression, random forests, and MARS. Third, fixed probability interval forecasts are made with quantile regressions and quantile regression forests. Models are generated from points sampled from the western, central, and eastern sections of the domain. Verification statistics and case study results show improvements in the reliability and skill of the forecasts compared to the original ensemble while controlling for the over-prediction of the precipitation areas and without sacrificing smaller scale details from the model runs.


Weather and Forecasting | 2014

Machine Learning Enhancement of Storm-Scale Ensemble Probabilistic Quantitative Precipitation Forecasts

David John Gagne; Amy McGovern; Ming Xue

Probabilistic quantitative precipitation forecasts challenge meteorologists due to the wide variability of precipitation amounts over small areas and their dependence on conditions at multiple spatial and temporal scales. Ensembles of convection-allowing numerical weather prediction models offer a way to produce improved precipitation forecasts and estimates of the forecast uncertainty. These models allow for the prediction of individual convective storms on the model grid, but they often displace the storms in space, time, and intensity, which results in added uncertainty. Machine learning methods can produce calibrated probabilistic forecasts from the raw ensemble data that correct for systemic biases in the ensemble precipitation forecast and incorporate additional uncertainty information from aggregations of the ensemble members and additional model variables. This study utilizes the 2010 Center for Analysis and Prediction of Storms Storm-Scale Ensemble Forecast system and the National Severe ...


conference on intelligent data understanding | 2012

Machine learning enhancement of Storm Scale Ensemble precipitation forecasts

David John Gagne; Amy McGovern; Ming Xue

Precipitation forecasts provide both a crucial service for the general populace and a challenging forecasting problem due to the complex, multi-scale interactions required for precipitation formation. The Center for the Analysis and Prediction of Storms (CAPS) Storm Scale Ensemble Forecast (SSEF) system is a promising method of providing high-resolution forecasts of the intensity and uncertainty in precipitation forecasts. The SSEF incorporates multiple models with varied parameterization scheme combinations and produces forecasts every 4 km over the continental US. The SSEF precipitation forecasts exhibit significant negative biases and placement errors. In order to correct these issues, multiple machine learning algorithms have been applied to the SSEF precipitation forecasts to correct the forecasts using the NSSL National Mosaic and Multisensor QPE (NMQ) grid as verification. The 2010 SSEF was used for training. Two levels of post-processing are performed. In the first, probabilities of any precipitation are determined and used to find optimal thresholds for the precipitation areas. Then, three types of forecasts are produced in those areas. First, the probability of the 1-hour accumulated precipitation exceeding a threshold is predicted with random forests, logistic regression, and multivariate adaptive regression splines (MARS). Second, deterministic forecasts based on a correction from the ensemble mean are made with linear regression, random forests, and MARS. Third, fixed probability interval forecasts are made with quantile regressions and quantile regression forests. Models are generated from points sampled from the western, central, and eastern sections of the domain. Verification statistics and case study results show improvements in the reliability and skill of the forecasts compared to the original ensemble while controlling for the over-prediction of the precipitation areas and without sacrificing smaller scale details from the model runs.


Bulletin of the American Meteorological Society | 2017

Using Artificial Intelligence to Improve Real-Time Decision-Making for High-Impact Weather

Amy McGovern; Kimberly L. Elmore; David John Gagne; Sue Ellen Haupt; Christopher D. Karstens; Ryan Lagerquist; Travis M. Smith; John K. Williams

High-impact weather events, such as severe thunderstorms, tornadoes, and hurricanes, cause significant disruptions to infrastructure, property loss, and even fatalities. High-impact events can also positively impact society, such as the impact on savings through renewable energy. Prediction of these events has improved substantially with greater observational capabilities, increased computing power, and better model physics, but there is still significant room for improvement. Artificial intelligence (AI) and data science technologies, specifically machine learning and data mining, bridge the gap between numerical model prediction and real-time guidance by improving accuracy. AI techniques also extract otherwise unavailable information from forecast models by fusing model output with observations to provide additional decision support for forecasters and users. In this work, we demonstrate that applying AI techniques along with a physical understanding of the environment can significantly improve ...


Journal of Applied Meteorology and Climatology | 2012

Tornadic Supercell Environments Analyzed Using Surface and Reanalysis Data: A Spatiotemporal Relational Data-Mining Approach

David John Gagne; Amy McGovern; Jeffrey B. Basara; Rodger A. Brown

Oklahoma Mesonet surface data and North American Regional Reanalysis data were integrated with the tracks of over 900 tornadic and nontornadic supercell thunderstorms in Oklahoma from 1994 to 2003 to observe the evolution of near-storm environments with data currently available to operational forecasters. These data are used to train a complex data-mining algorithm that can analyze the variability of meteorological data in both space and time and produce a probabilistic prediction of tornadogenesis given variables describing the near-storm environment. The algorithm was assessed for utility in four ways. First, its probability forecasts were scored. The algorithm did produce some useful skill in discriminating between tornadic and nontornadic supercells as well as in producing reliable probabilities. Second, its selection of relevant attributes was assessed for physical significance. Surface thermodynamic parameters, instability, and bulk wind shear were among the most significant attributes. Thir...

Collaboration


Dive into David John Gagne's collaboration.

Top Co-Authors

Ming Xue

University of Oklahoma

John K. Williams

National Center for Atmospheric Research

Rodger A. Brown

National Oceanic and Atmospheric Administration

Sue Ellen Haupt

National Center for Atmospheric Research

Christopher D. Karstens

National Oceanic and Atmospheric Administration
