Benchmarking Forecasting Models for Space Weather Drivers

Abstract

Space weather indices are commonly used to drive operational forecasts of various geospace systems, including the thermosphere for mass density and satellite drag. The drivers serve as proxies for various processes that cause energy flow and deposition in the geospace system. Forecasts of neutral mass density are a major source of uncertainty in operational orbit prediction and collision avoidance for objects in low Earth orbit (LEO). Because the system is strongly driven, the accuracy of space weather driver forecasts is crucial for operations. The High Accuracy Satellite Drag Model (HASDM), currently employed by the United States Air Force in an operational environment, is driven by four (4) solar and two (2) geomagnetic proxies. Space Environment Technologies (SET) is contracted by the space command to provide forecasts for the drivers. This work performs a comprehensive assessment of the performance of the driver forecast models. The goal is to provide a benchmark for future improvements of the forecast models. Using an archived data set spanning six (6) years and 15,000 forecasts across solar cycle 24, we quantify the temporal statistics of the model performance.


Confidential manuscript submitted to Space Weather


Richard J. Licata, W. Kent Tobiska, and Piyush M. Mehta

Department of Mechanical and Aerospace Engineering, West Virginia University, Morgantown, West Virginia, USA.
Space Environment Technologies, Pacific Palisades, California, USA.

Key Points:
• Four solar (F10.7, S10.7, M10.7, and Y10.7) and two geomagnetic (ap and Dst) driver indices used as inputs by the operational HASDM system.
• Temporal statistics using six years of historical data set for driver forecasts.
• Baseline for future developments within the community.

∗ Corresponding author: Richard J. Licata

arXiv preprint [physics.space-ph]


Accurately quantifying mass density in the thermosphere remains a challenge for the community. The difficulty stems from the highly dynamic nature of the thermosphere, an environment driven by a number of factors ranging from solar extreme ultraviolet (EUV) and geomagnetic heating to gravity waves in the lower atmosphere. Emmert (2015) provides a thorough overview of the physical drivers and their effects on thermospheric density. Current capabilities limit our ability to predict satellites' trajectories with precision in an operational setting. During large solar and geomagnetic storms, operators struggle to locate many resident space objects, let alone have the means to predict their orbits (Berger et al., 2020).

Storz et al.

Bowman et al.

JB2008 models neutral density in the thermosphere using global exospheric temperature equations that leverage four solar indices to simulate thermospheric heating from different sources of solar energy (Tobiska et al.; Bowman et al.). The F10.7 proxy has a strong correlation with solar extreme ultraviolet (EUV) irradiance, which has led to its long-time use as a measure of solar EUV energy. S10.7 is an index indicative of the activity of the integrated 26-34 nm bandpass solar chromospheric EUV emission, which penetrates to the middle thermosphere and is absorbed by atomic oxygen. The M10.7 proxy is used as a measure of far ultraviolet (FUV) photospheric 160 nm Schumann-Runge Continuum emissions, which penetrate to the lower thermosphere and cause molecular oxygen dissociation. The fourth solar index is Y10.7, a composite of Xb and Lyman-alpha. It serves as a composite measure of solar coronal 0.1-0.8 nm X-ray emissions and 121.6 nm Lyman-alpha, both of which penetrate to the mesosphere and participate in water chemistry. To forecast these indices/proxies, SET uses a linear predictive algorithm that captures persistence and recurrence (Tobiska et al.).

The geomagnetic drivers are the ap and Dst indices. The ap index is a measure of global geomagnetic activity derived from twelve observatories that fall between 48° N and 63° S in latitude (McClain and Vallado, 2001). Using ap during quiet geomagnetic conditions results in low density errors, but Dst proves to be a more effective driver during storm times (Bowman et al.). Dst is an index that represents the strength of the storm-time ring current in the inner magnetosphere (Tobiska et al.). Dst forecasts come from the Anemomilos algorithm, which provides a forecast with a maximum prediction window of six days (Tobiska et al., 2013).

The ap index does not have an algorithm or model to provide forecasts to the JB2008 model. The three-hourly ap forecasts are interpolated values from the National Oceanic and Atmospheric Administration (NOAA) Space Weather Prediction Center's (SWPC) Kp forecasts. Additionally, they are generated from an ensemble of individual human forecasters' predictions informed by model output (the University of Michigan's Geospace Model since 2017) (Steenburgh et al., 2013; Singer, 2013; Haiducek et al., 2017). Even though SWPC only recently switched to using the Geospace Model, these data represent the official NOAA SWPC forecast and we use them as such.

Errors in the space weather driver forecasts cause errors in the resulting densities, thereby impairing satellite conjunction analyses. Bussy-Virat et al. (2018) recently performed a study to show the effects of driver uncertainty on the probability of collision between two space objects. A similar study was performed more recently by Licata et al. (2019), incorporating additional forecasts and further conditioning distributions.

Figure 1. (left) Deterministic and probabilistic F10.7 forecasts in addition to the true variation during the time period. (right) Satellite position distributions relative to the true position after encountering six days of probabilistic densities resulting from the corresponding F10.7 fluctuations. White arrows represent the position using deterministic F10.7 values.

The probabilistic F10.7 forecasts in Figure 1 were generated using the statistical measures identified in the current study. The maximum change in the driver (dF10.7) from one time step to the next was constrained, with the limit chosen through further statistical analyses. Each driver forecast was input to a quasi-physical model of the mass density, built using a recurrent neural network, to forecast a resulting 3D density grid that would be used in orbit propagation (Mehta et al., 2018; Licata and Mehta). In that analysis, the mean probabilistic position was more accurate than the deterministic position. Figure 1 is a derivative of this work.

We expand upon the work of Bussy-Virat et al. (2018) by using (i) all solar and geomagnetic drivers that are used in operations, (ii) a large historical data set covering a period of six (6) years, (iii) an extended forecast window of up to six (6) days, and (iv) the initial driver values to characterize model performance as a function of the solar and geomagnetic activity.

The outline of the current paper is as follows: the following section introduces the techniques and thresholds used to bin solar and geomagnetic drivers. This is done separately for the two domains, which require distinct methods. Next, the resulting uncertainty figures are presented and discussed, followed by the conclusion.
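The step-limited probabilistic forecasts described for Figure 1 can be illustrated with a minimal sketch. The function name `sample_driver_paths`, the Gaussian error model, and the sign convention (error = forecast minus true value) are assumptions made here for illustration only; the sampling scheme actually used to produce Figure 1 is not specified in this text.

```python
import random


def sample_driver_paths(deterministic, bias, sigma, max_step, n_paths=100, seed=0):
    """Draw plausible driver time series around a deterministic forecast.

    deterministic: forecast values, one per time step
    bias, sigma:   assumed mean and standard deviation of the forecast
                   error (forecast minus truth) at each time step
    max_step:      largest allowed change between consecutive time steps
    """
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        path = []
        for k, value in enumerate(deterministic):
            # Remove a sampled error from the forecast to get a candidate truth.
            candidate = value - rng.gauss(bias[k], sigma[k])
            if path:
                # Enforce the step-to-step constraint on the driver change.
                lower, upper = path[-1] - max_step, path[-1] + max_step
                candidate = min(max(candidate, lower), upper)
            path.append(candidate)
        paths.append(path)
    return paths
```

Each returned path could then be passed to a density model in place of the single deterministic forecast. The per-step bias and standard deviation would come from statistics of the kind benchmarked in this paper.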

The SET algorithms produce files every three hours, generating updated six-day forecasts for solar and geomagnetic indices. These forecasts have a temporal resolution of three hours. In addition, they archive the observed values for each time step. To conduct this analysis, forecasts from October 2012 through the end of 2018 were used, with the exception of some missing or corrupted forecasts. In total, there were over 15,000 files to leverage for this study.

To examine the solar and geomagnetic indices in comparable terms, a consistent approach had to be determined. To provide the clearest possible representation for all indices, different methods are used for solar indices and geomagnetic indices but are kept consistent within each domain. Each index was split into separate sub-populations depending on the initial forecasted value. Populations that ended up with fewer than 100 forecasts are not shown, because there are insufficient data to draw statistical conclusions.
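The sub-population split described above can be sketched as follows. This is a minimal illustration that uses the F10.7 thresholds listed in Table 1. The function names and the representation of a forecast as a plain list of values are hypothetical and are not taken from the SET file format.

```python
from collections import defaultdict

# Upper bounds (sfu) of the Low, Moderate, and Elevated F10.7 levels (Table 1).
F107_EDGES = (75.0, 150.0, 190.0)
LEVELS = ("low", "moderate", "elevated", "high")
MIN_FORECASTS = 100  # populations smaller than this are not shown


def activity_level(value, edges=F107_EDGES):
    """Return the activity level for an initial forecasted value."""
    for level, upper in zip(LEVELS, edges):
        if value <= upper:
            return level
    return LEVELS[-1]


def split_by_initial_value(forecasts, edges=F107_EDGES, min_count=MIN_FORECASTS):
    """Group forecasts by the level of their first value, dropping sparse groups."""
    groups = defaultdict(list)
    for forecast in forecasts:
        groups[activity_level(forecast[0], edges)].append(forecast)
    return {level: members for level, members in groups.items()
            if len(members) >= min_count}
```

The same functions apply to the other solar indices by passing their own threshold tuple as `edges`.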

The task of generating statistical results for the four (4) solar indices investigated (F10.7, S10.7, M10.7, and Y10.7) was relatively straightforward. The forecasts are generated using SET's SOLAR2000 algorithm (Tobiska et al., 2000). Figure 2 depicts how the solar indices are distributed based on the initially forecasted value.

Figure 2. Distributions of initially forecasted values for each solar index with partitions shown in red.

The thresholds for F10.7 had been specified and used in previous work (Mehta, 2013; Licata et al., 2019). The share of F10.7 forecasts in each activity level was used to classify the remaining solar indices in the absence of a natural partition. A natural partition within the distribution is seen at 150 sfu for S10.7. This value was chosen as a threshold because it did not greatly disrupt the number of forecasts in the adjacent activity levels. The four levels of solar activity are defined in Table 1.

Table 1. Activity level thresholds for the four solar indices.

F10.7 | Low: F10.7 ≤ 75 sfu | Moderate: 75 < F10.7 ≤ 150 sfu | Elevated: 150 < F10.7 ≤ 190 sfu | High: F10.7 > 190 sfu
S10.7 | Low: S10.7 ≤ … | Moderate: … < S10.7 ≤ … | Elevated: … < S10.7 ≤ … | High: S10.7 > …
M10.7 | Low: M10.7 ≤ … | Moderate: … < M10.7 ≤ … | Elevated: … < M10.7 ≤ … | High: M10.7 > …
Y10.7 | Low: Y10.7 ≤ … | Moderate: … < Y10.7 ≤ … | Elevated: … < Y10.7 ≤ … | High: Y10.7 > …

[…] more positive than the issued value. Because all of the solar indices are updated daily, there are twenty-four distributions for each (four magnitude-based and six temporal partitions).

The analysis of the two geomagnetic indices, ap and Dst, was more intricate. Not only are the uncertainties functions of their magnitudes and time from epoch, they also vary with solar activity level. To analyze ap, three geomagnetic activity levels were chosen: low, moderate, and active. In analyzing Dst, six geomagnetic activity levels were chosen, consistent with the NOAA G-scale as operationally applied by SET. Table 2 states the thresholds for ap and Dst.

To allocate the geomagnetic forecasts, the largest value in the forecast for ap and the most negative value for Dst is the controlling factor. In addition, the forecast is classified by the initial forecasted F10.7 value. Since the distributions have a finer temporal resolution and a solar dependency, there are 1,152 distributions for ap and Dst.
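The allocation rule and the distribution counts above can be checked with a short sketch. The helper names are hypothetical. The count assumes one distribution per activity level, solar level, and three-hourly step of the six-day window, which is an interpretation of the text rather than a stated formula.

```python
def controlling_value(forecast, index):
    """Value that decides the geomagnetic bin of a whole forecast."""
    if index == "ap":
        return max(forecast)   # largest ap value in the forecast
    if index == "Dst":
        return min(forecast)   # most negative Dst value in the forecast
    raise ValueError(f"unknown geomagnetic index: {index}")


def n_distributions(n_geo_levels, n_solar_levels=4, days=6, step_hours=3):
    """Number of error distributions for one geomagnetic index."""
    steps = days * 24 // step_hours   # 48 three-hourly steps in six days
    return n_geo_levels * n_solar_levels * steps
```

Under this assumption, six Dst levels give 6 × 4 × 48 = 1,152 distributions, which matches the figure quoted above, and three ap levels would give 3 × 4 × 48 = 576. The solar count follows the same pattern with daily steps: 4 × 6 = 24.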

Table 2. Bin thresholds for geomagnetic activity, ap and Dst.

ap | Low: ap ≤ … | Moderate: … < ap ≤ … | Active: ap > …
Dst | G0: Dst ≥ −… | G1: −… > Dst ≥ −… | G2: −… > Dst ≥ −… | G3: −… > Dst ≥ −… | G4: −… > Dst ≥ −… | G5: Dst ≤ −…

[…] Dst signifies an error twice the magnitude of the long-term mean Dst, with the prediction more negative than the issued value. The long-term mean values for ap and Dst are 9.2 and -8.8 nT, respectively.

In the resulting uncertainty figures, the mean and standard deviation of forecast error (as a function of time from the current epoch) are presented for each activity level. This way, biases can be identified and the algorithm's temporal uncertainty can be determined. Figure 3 shows the performance of the F10.7 forecast algorithm.

At low and moderate levels of solar activity, the F10.7 algorithm is fairly unbiased. It is not until elevated and high solar activity that a bias accumulates, showing a tendency to over-forecast the index. The standard deviation of the error grows with time from epoch for all activity levels, as expected, showing that the forecast uncertainty increases with time. The algorithm performs well when the first forecasted F10.7 value is below 150 sfu, which accounted for approximately 87% of the forecasts.

Figure 3. F10.7 algorithm performance across four levels of solar activity.

Figure 4 provides the algorithm performance for S10.7. There is little bias through low, moderate, and elevated activity levels (over 98% of forecasts), displaying strong overall performance. The uncertainty at these activity levels is similar to that of F10.7, but the performance at high solar activity is not as stable. For high solar activity, there is a dominant tendency to over-forecast in addition to a large uncertainty. The uncertainty also does not consistently grow with time.

The F10.7 and S10.7 algorithms are both vulnerable to high solar activity, but their overall effectiveness is visible. The limitation during high activity is due to the volatility of the Sun during solar maximum, i.e., the inability to accurately forecast flares and the lack of information on active region growth at the solar East limb and on the solar far side. The algorithms for the remaining indices prove to be more robust to solar activity. The M10.7 performance is presented in Figure 5.

For M10.7, there is a minimal bias of ±2% for the lower two activity levels, but at low solar activity, there is a slight tendency to under-predict. At elevated and high solar activity, the bias accumulates with time and increases in intensity. Across all levels, the uncertainty starts below 4% and grows steadily with time. An interesting characteristic that contrasts with the prior two indices is the lower uncertainty at high solar activity. The difference in performance is not drastic relative to the other conditions.

Figure 4. S10.7 algorithm performance across four levels of solar activity.

To conclude the analysis of the solar indices, Figure 6 shows the performance of the Y10.7 algorithm. Relative to the previous three indices, the Y10.7 algorithm is considerably more robust to activity levels and has less overall uncertainty. In the first two activity levels, the bias is less than ±1% for nearly the entire prediction window. The uncertainty grows with time for all activity levels, but its magnitude is smaller than for the other indices. The bias never exceeds 5% and the uncertainty never exceeds 12%.

As previously stated, the geomagnetic indices were more difficult to analyze due to an increase in dependencies and a finer time resolution. Each geomagnetic index has its own set of activity levels, but both are also binned by the previous F10.7 thresholds. The performance of the ap forecasts is shown in Figure 7.
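The statistics shown in these figures (mean and standard deviation of forecast error as a function of time from epoch) can be sketched as below. The function name, the sign convention (error = forecast minus issued value, so a positive mean indicates over-forecasting), and the optional normalization by the magnitude of a long-term mean are assumptions for illustration. The exact normalization used for each index is not fully recoverable from this text.

```python
from statistics import mean, pstdev


def error_statistics(forecasts, issued, scale=None):
    """Bias and spread of forecast error at each lead-time step.

    forecasts: list of forecasts, each a sequence of values per time step
    issued:    matching list of issued (observed) values
    scale:     optional reference value; if given, errors are expressed
               as a percentage of its magnitude
    Returns a list of (mean_error, std_error) tuples, one per time step.
    """
    n_steps = len(forecasts[0])
    stats = []
    for k in range(n_steps):
        # Assumed convention: positive error means the forecast was too high.
        errors = [f[k] - o[k] for f, o in zip(forecasts, issued)]
        if scale is not None:
            errors = [100.0 * e / abs(scale) for e in errors]
        stats.append((mean(errors), pstdev(errors)))
    return stats
```

With the long-term mean Dst of -8.8 nT as `scale`, a forecast that is 17.6 nT more negative than the issued value yields an error of about -200%, that is, twice the magnitude of the long-term mean. Applying the function separately to each activity-level population gives one bias curve and one uncertainty curve per panel.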

Figure 5. M10.7 algorithm performance across four levels of solar activity.

Unlike the solar indices, there are multiple conditions with insufficient data to conduct the analysis. The most distinct difference in the ap forecast performance, relative to the other indices, is the discontinuity at the three-day mark. As mentioned in the Introduction, the forecasts only have a three-day prediction window. The forecasts are provided by the judgement of an array of space weather forecasters at NOAA SWPC with the aid of the Geospace Model.

Figure 7 shows uncertainty results for a six-day prediction window to be consistent with the other indices, even though SET sets every ap value to zero after three days. There are still interesting results in the latter three days of the forecasts across the different conditions. For example, the magnitude of under-prediction (when ap is set to zero) is different for each condition, as is the volatility of ap, shown by the standard deviation. Even so, the most important aspect of Figure 7 is the first three days, when forecasts are provided.

Figure 6. Y10.7 algorithm performance across four levels of solar activity.

During low geomagnetic activity (across all solar activity levels), there is no significant bias detected. With moderate geomagnetic activity, there is a general over-prediction that decreases over the three-day provided forecast. It shows a possible path for prediction improvement by relying on persistence when ap is high at the start of the forecasts. Another key finding is shown by the right-most panels, where there is only a single forecast that has a value greater than 50 (in units of 2 nT). This reflects the difficulty in quantifying the intensity of a storm, even with the aid of a physics-based model.

The last algorithm analyzed is SET's Anemomilos for Dst forecasts, shown in Figure 8. The G5 row is not shown because there was only a single forecast where a G5 storm was expected. There are only 9/24 conditions with enough forecasts to perform the analysis, but the remaining results provide insight into the strengths and weaknesses of the algorithm.

Figure 7. ap forecast uncertainty for the twelve solar and geomagnetic conditions.

In the top-left subplot (when conditions are quiet), the forecasts remain relatively unbiased, and the uncertainty slowly increases with time. Figure 8 shows a general tendency to predict Dst to be more positive for nearly all G0 and G1 conditions, with the exception of G1 low solar activity conditions. In this case, the algorithm has a strong bias to expect Dst to be ∼… nT more negative than the issued values over the first four days of the forecast. After this strong bias through day four, the algorithm tends to neutralize it. This is interpreted as accurate prediction of Dst recovery to quiet conditions.

Figure 8. Dst forecast uncertainty for the combined solar and geomagnetic conditions.

The bias for G1-G3, moderate solar activity conditions shows a strong temporal dependency, transitioning from under-prediction to over-prediction in each case. G2 moderate solar activity is a case with a peculiar trend of the uncertainty decaying with time from epoch. This is also the case for G3 moderate solar activity. However, that case has extreme and unclear results, with both the bias and uncertainty changing rapidly with an inverse relationship. This behavior points to a need for improvement in these conditions. One source of the Dst variability in G0-G3 conditions is high-speed streams (HSS), and Anemomilos does not model these events.

The analysis of the SET algorithms used by the JB2008 and HASDM models provides a clear picture of the performance of the current standard for density model driver forecasts. This work showed the many strengths of these predictive algorithms while also identifying conditions where improvements can be made. In general, the forecasts for solar indices at low and moderate activity levels have comparably low uncertainty and virtually no bias. This performance degrades to an extent at elevated and especially high activity levels, where the Sun is more volatile.

The best performing algorithm is that for Y10.7, whose forecasting method is the most complex of the four solar indices investigated. The algorithm for M10.7 also has low uncertainty and low bias at the two lower solar activity levels. The forecasts for F10.7 and S10.7 prove to be more uncertain, with generally higher biases. Both indices had strong tendencies to over-predict at high solar activity.

The geomagnetic indices, ap and Dst, proved to be difficult to predict even using the two diverse methods. The forecasts for ap are determined by a team of forecasters with the aid of a model, and there was still a low probability of detection for geomagnetic storms. In most conditions, however, there was little or no bias in the predictions. The three-day prediction window also ended up being a limitation, and results from a full six-day forecast would be intriguing. The Dst algorithm performed well during G0 (quiet) conditions but showed unusual trends with increased geomagnetic activity.

A major limitation of this study was the lack of forecasts in certain conditions. This was particularly problematic for the geomagnetic indices, and the most extreme index value was used to bin forecasts in order to offset this limitation. Even with this technique, a large percentage of conditions had insufficient data to perform the uncertainty analysis. In the future, we hope to add forecasts to the analysis and update the results in order to cover more conditions.

This work is intended to provide the community with a performance baseline for future algorithm and model development, in an effort to improve our capability to accurately forecast density and determine satellite trajectories.

The authors would like to acknowledge Bruce Bowman, Dave Bouwer, and Alfredo Cruz of Space Environment Technologies for access to forecasts and relevant documentation to perform this study, in addition to providing important insight. This work was made possible by NASA West Virginia Space Grant Consortium, Training Grant

References

Berger, T. E., M. J. Holzinger, E. K. Sutton, and J. P. Thayer (2020), Flying Through Uncertainty, Space Weather, (1), e2019SW002373, doi:10.1029/2019SW002373.

Bowman, B., W. K. Tobiska, F. Marcos, C. Huang, C. Lin, and W. Burke (2008), A New Empirical Thermospheric Density Model JB2008 Using New Solar and Geomagnetic Indices, in AIAA/AAS Astrodynamics Specialist Conference, AIAA 2008-6438.

Bowman, B., W. K. Tobiska, F. Marcos, C. Huang, C. Lin, and W. Burke (2012), A New Empirical Thermospheric Density Model JB2008 Using New Solar and Geomagnetic Indices, doi:10.2514/6.2008-6438.

Bussy-Virat, C. D., A. J. Ridley, and J. W. Getchius (2018), Effects of Uncertainties in the Atmospheric Density on the Probability of Collision Between Space Objects, Space Weather, (5), 519–537, doi:10.1029/2017SW001705.

Emmert, J. (2015), Thermospheric mass density: A review, Advances in Space Research, doi:10.1016/j.asr.2015.05.038.

Haiducek, J. D., D. T. Welling, N. Y. Ganushkina, S. K. Morley, and D. S. Ozturk (2017), SWMF Global Magnetosphere Simulations of January 2005: Geomagnetic Indices and Cross-Polar Cap Potential, Space Weather, (12), 1567–1587, doi:10.1002/2017SW001695.

ISO 14222 (2013), Space environment (natural and artificial) - Earth upper atmosphere, Standard, International Organization for Standardization, Geneva, CH.

Licata, R. J., and P. M. Mehta (2019), Physics-informed Machine Learning for Probabilistic Space Weather Modeling and Forecasting: Thermosphere and Satellite Drag, doi:10.13140/RG.2.2.32538.18880.

Licata, R. J., and P. M. Mehta (2020), Physics-informed Machine Learning with Autoencoders and LSTM for Probabilistic Space Weather Modeling and Forecasting, doi:10.13140/RG.2.2.17039.74401.

Licata, R. J., P. M. Mehta, and C. Kay (2019), Data-Driven Framework for Space Weather Modeling with Uncertainty Treatment towards Space Situational Awareness and Space Traffic Management, in Astrodynamics Specialist Conference, AAS 19-603.

McClain, W., and D. Vallado (2001), Fundamentals of Astrodynamics and Applications, Space Technology Library, pp. 556-557, Springer Netherlands.

Mehta, P. M. (2013), Thermospheric density and satellite drag modeling, Ph.D. thesis, University of Kansas.

Mehta, P. M., R. Linares, and E. K. Sutton (2018), A Quasi-Physical Dynamic Reduced Order Model for Thermospheric Mass Density via Hermitian Space-Dynamic Mode Decomposition, Space Weather, (5), 569–588, doi:10.1029/2018SW001840.

Singer, H. (2013), Report on the Selection of Geospace Model(s) For Transition to Operations at NOAA's Space Weather Prediction Center (SWPC), Tech. rep., Space Weather Prediction Center.

Steenburgh, R., D. Biesecker, and G. Millward (2013), From Predicting Solar Activity to Forecasting Space Weather: Practical Examples of Research-to-Operations and Operations-to-Research, pp. 239–254, doi:10.1007/978-1-4939-1182-0_17.

Storz, M., B. Bowman, and J. Branson (2005), High Accuracy Satellite Drag Model (HASDM), doi:10.2514/6.2002-4886.

Tobiska, W., T. Woods, F. Eparvier, R. Viereck, L. Floyd, D. Bouwer, G. Rottman, and O. White (2000), The SOLAR2000 empirical solar irradiance model and forecast tool, Journal of Atmospheric and Solar-Terrestrial Physics, (14), 1233–1250, doi:10.1016/S1364-6826(00)00070-5.

Tobiska, W. K., B. Bowman, and S. D. Bouwer (2008a), Solar and Geomagnetic Indices for the JB2008 Thermosphere Density Model, chap. 4, COSPAR CIRA Draft.

Tobiska, W. K., S. D. Bouwer, and B. R. Bowman (2008b), The development of new solar indices for use in thermospheric density modeling, Journal of Atmospheric and Solar-Terrestrial Physics, (5), 803–819, doi:10.1016/j.jastp.2007.11.001.

Tobiska, W. K., D. Knipp, W. J. Burke, D. Bouwer, J. Bailey, D. Odstrcil, M. P. Hagan, J. Gannon, and B. R. Bowman (2013), The Anemomilos prediction methodology for Dst, Space Weather, (9), 490–508, doi:10.1002/swe.20094.