Deep Learning Emulation of Multi-Angle Implementation of Atmospheric Correction (MAIAC)
Kate Duffy, Thomas Vandal, Weile Wang, Ramakrishna Nemani, Auroop R. Ganguly
Sustainability and Data Sciences Laboratory, Department of Civil and Environmental Engineering, Northeastern University, 360 Huntington Avenue, Boston, MA; NASA Ames Research Center, Moffett Blvd, Mountain View, CA; Bay Area Environmental Research Institute, P.O. Box 25, Moffett Field, CA; California State University, Monterey Bay, Seaside, CA.
Abstract—New generation geostationary satellites make solar reflectance observations available at a continental scale with unprecedented spatiotemporal resolution and spectral range. Generating quality land monitoring products requires correction of the effects of atmospheric scattering and absorption, which vary in time and space according to geometry and atmospheric composition. Many atmospheric radiative transfer models, including that of Multi-Angle Implementation of Atmospheric Correction (MAIAC), are too computationally complex to be run in real time, and rely on precomputed look-up tables. Additionally, uncertainty in measurements and models for remote sensing receives insufficient attention, in part due to the difficulty of obtaining sufficient ground measurements. In this paper, we present an adaptation of Bayesian Deep Learning (BDL) to emulation of the MAIAC atmospheric correction algorithm. Emulation approaches learn a statistical model as an efficient approximation of a physical model, while machine learning methods have demonstrated performance in extracting spatial features and learning complex, nonlinear mappings. We demonstrate stable surface reflectance retrieval by emulation (R² between MAIAC and emulator SR is 0.63, 0.75, 0.86, 0.84, 0.95, and 0.91 for the Blue, Green, Red, NIR, SWIR1, and SWIR2 bands, respectively), accurate cloud detection (86%), and well-calibrated, geolocated uncertainty estimates. Our results support BDL-based emulation as an accurate and efficient (up to 6x speedup) method for approximating atmospheric correction, where built-in uncertainty estimates stand to open new opportunities for model assessment and support informed use of SR-derived quantities in multiple domains.
Index Terms—emulation, atmospheric correction, MAIAC, deep learning, Bayesian deep learning, uncertainty quantification, Himawari-8, geostationary
I. INTRODUCTION

Corresponding author: Kate Duffy, [email protected]

Operational land surface monitoring and scientific studies benefit from satellite-based observations at unprecedented spatiotemporal resolution. New-generation geostationary satellites include the Japanese Space Agency's Himawari-8 and the National Oceanic and Atmospheric Administration's (NOAA) Geostationary Operational Environmental Satellite (GOES) series. Geosynchronous orbits, which have traditionally been leveraged for communications and weather monitoring satellites, enable sensors to produce continental and regional-scale scans at intervals of as little as 30 seconds. Such high-temporal-resolution observations have applications in the study of diurnal processes, near-real-time monitoring of natural hazards, and creation of relatively cloud-free daily composites. In comparison to previous geostationary sensors, these satellites have improved spatial resolution and spectral range. These characteristics lend new-generation geostationary satellites to applications for land surface monitoring and invite comparison to sensors like the land-monitoring flagship Moderate Resolution Imaging Spectroradiometer (MODIS).

As remote sensing helps to propel earth science into the big data era, sciences face a challenge in processing and making use of terabytes of observational data, much of which has unknown accuracy [1]. Several types of uncertainty exist, including aleatoric uncertainty from measurement noise and epistemic uncertainty from incomplete knowledge about modeled processes. One such modeled process is the spatially, temporally, and spectrally varying interaction of reflected energy with gases and aerosols in the Earth's atmosphere [2]. Scattering and absorption effects are particularly strong in the visible and near infrared spectra and depend on the location and properties of atmospheric aerosols and water vapor. These complex interactions, combined with the challenges of adjacency effects, heterogeneous landscapes, and rugged terrain, make atmospheric effects difficult to correct. Removing these perturbations, which can vary reflectance by up to 15%, prevents atmospheric variability from being interpreted as land surface change, and enables generation of reliable monitoring products [2].
Approaches to separate surface reflectance from atmospheric signals range from simple methods like dark body subtraction to sophisticated land-atmosphere models that numerically simulate the transfers of energy in the atmosphere by absorption, scattering, and emission.

Developed for MODIS, the Multi-Angle Implementation of Atmospheric Correction (MAIAC) algorithm has been adapted to retrieve surface reflectance and atmospheric composition for the geostationary satellite Himawari-8. MAIAC uses a semi-analytical solution of the kernel-based RossThick LiSparse (RTLS) model [3]. MAIAC uses time series of up to sixteen days and stored information about the characteristics of each location to help separate the contributions of surface and atmosphere to the observed signal [4], [5], [6]. Running an atmospheric radiative transfer model in real time is computationally complex. Instead, MAIAC relies on the generation of look-up tables (LUT) with precomputed values. LUTs are precomputed at a grid density chosen with consideration to both accuracy and memory requirement, with values retrieved by linear interpolation between calculated values [4].

Fig. 1. Schematic of emulator approach, where a physics-based model is replaced by an efficient surrogate model.

Another approach to reducing the runtime computation of expensive models is through emulation. Emulation is an approach to modeling that replaces a physics-based model or model component with a learned component, which acts as a fast approximation of the model physics. The objective of emulation is not to develop a new parametrization, but to efficiently and accurately reproduce an existing one, which has been carefully developed and validated based on domain knowledge.
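The LUT strategy can be illustrated with a small sketch. The radiative transfer stand-in, the choice of grid variables (aerosol optical depth and solar zenith angle), and the grid densities below are all hypothetical; the point is only the trade MAIAC makes: pay the expensive computation once per grid node, then answer arbitrary queries by linear interpolation between nodes.

```python
import numpy as np

# Hypothetical 2-D look-up table: path reflectance as a function of
# aerosol optical depth (AOD) and solar zenith angle (SZA).
# Grid density trades accuracy against memory, as in MAIAC's LUTs.
aod_grid = np.linspace(0.0, 1.0, 11)   # precomputed AOD nodes
sza_grid = np.linspace(0.0, 80.0, 9)   # precomputed SZA nodes (degrees)

def expensive_rt_model(aod, sza):
    # Stand-in for a radiative transfer computation (illustrative only).
    return 0.05 + 0.1 * aod + 0.001 * sza + 0.02 * aod * np.cos(np.radians(sza))

# Precompute the table once (the expensive step).
lut = expensive_rt_model(aod_grid[:, None], sza_grid[None, :])

def lut_lookup(aod, sza):
    """Bilinear interpolation between precomputed LUT nodes (the cheap step)."""
    i = np.clip(np.searchsorted(aod_grid, aod) - 1, 0, len(aod_grid) - 2)
    j = np.clip(np.searchsorted(sza_grid, sza) - 1, 0, len(sza_grid) - 2)
    ta = (aod - aod_grid[i]) / (aod_grid[i + 1] - aod_grid[i])
    ts = (sza - sza_grid[j]) / (sza_grid[j + 1] - sza_grid[j])
    return ((1 - ta) * (1 - ts) * lut[i, j] + ta * (1 - ts) * lut[i + 1, j]
            + (1 - ta) * ts * lut[i, j + 1] + ta * ts * lut[i + 1, j + 1])
```

At query time the interpolation costs a few arithmetic operations regardless of how expensive the node computation was; interpolation error shrinks with grid density, which is exactly the accuracy-versus-memory trade described above.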
Emulation using statistical models and shallow neural networks has been applied to various earth science applications including climate modeling [7], [8], [9], hydrology [10], and atmospheric modeling [11], [12], [13]. Where emulation can meaningfully accelerate modeling, the need for computing time and resources is reduced. Efficient surrogate models can be used for sensitivity analysis and uncertainty quantification within computing resource-limited contexts. Emulation has also shown potential for scientific insight, such as into the relationships between high-dimensional model inputs and outputs, especially where process dynamics are not well understood [9].

The capability of neural networks to learn complex, nonlinear mappings has been used in remote sensing for several decades. Many recent works have utilized convolutional neural networks (CNN), a class of algorithm that can extract features from spatial data. The ability to leverage spatial correlations in image-like inputs has led to achievements for remote sensing tasks including object detection [14], land use and land cover (LULC) classification [15], [16], [17], and prediction of quantities ranging from agricultural yield [18] to poverty [19]. Non-deep learning algorithms have been applied to predict remote sensing products including multispectral surface reflectance [20] and vegetation indices [21].

While machine learning has demonstrated ability for extracting credible insights from complex datasets in multiple geoscience domains [22], [23], [24], reasons for caution remain in many applications. Deep neural networks are limited in physical interpretability and generally do not have built-in quantification of predictive uncertainty. Uncertainty assessment is useful for informed use of machine learning, and is also necessary for generation of some high-level land products from surface reflectance.
For example, the MODIS leaf area index (LAI) and fraction of photosynthetically active radiation (FPAR) algorithm is calibrated using uncertainty on its inputs [25], [26]. Bayesian emulators have been used to mimic systems from biology to built infrastructures [27], [28]. Approximations of Bayesian inference can be used to extract information about both aleatoric and epistemic uncertainty from deep learning (DL) models [29].

In this paper we develop methodology integrating emulation and Bayesian Deep Learning (BDL). The approach is adapted to remote sensing, a domain where big data and complex models present both a challenge and an opportunity for flexible, data-driven methods. We demonstrate a BDL-based emulation of MAIAC's surface reflectance retrieval and cloud classification routines with built-in Bayesian uncertainty quantification. We test the performance of the MAIAC emulator over various land cover types and seasons and find stable performance. Additionally, we assess the calibration of uncertainty estimates and quantify the increase in speed compared to MAIAC.

The main contributions of this paper are in both science and methods.

• Our methods innovation consists of the adaptation of Bayesian deep learning to emulation of a physics model in remote sensing.

• The resulting advancement in scientific insight comes from the well-calibrated and geolocated estimates of model uncertainty, previously not available for MAIAC.

The rest of this paper is organized as follows. Section II reviews the study area and data sets used. Section III introduces the proposed methods for emulation. Section IV presents results and evaluation. Finally, conclusions are drawn in Section V.
Fig. 2. Top of atmosphere reflectance and sample results for MAIAC and DCVDSR emulator. Columns 1-3 are in RGB true colors. Rows from top: Indochina Peninsula and the South China Sea, eastern China and the Bohai Sea, southwestern Australia.
II. STUDY AREA AND DATA SETS

Datasets used in this study are from the Advanced Himawari Imager (AHI) sensor carried by the Japanese geostationary satellite Himawari-8. In the GeoNEX processing pipeline, Himawari Standard Data (HSD) scans are georeferenced and converted to gridded data. The resulting gridded data sets follow a geographic coordinate system with a 120° by 120° extent (E85° - E205°, N60° - S60°). The domain is divided into 6° by 6° tiles defined by fixed latitude and longitude. Himawari-8's full disk, which encompasses the entire view as seen from the satellite, covers the continent of Australia and eastern Asia. Full disk scans are repeated every ten minutes.

The Advanced Himawari Imager has sixteen observing bands encompassing visible, near-infrared (NIR), short wave infrared (SWIR), and thermal infrared (TIR), with spatial resolution ranging from 0.5 to 2 km. Bands one through six are solar reflective bands, spectrally similar to NASA's MODIS. All bands are resampled to a common 0.01° resolution, which corresponds to 1 km at the equator [Table I].

TABLE I
HIMAWARI-8 AHI SOLAR REFLECTIVE BANDS FOR LAND SURFACE OBSERVATION.

Himawari-8 AHI Band      Blue   Green  Red    NIR    SWIR1  SWIR2
Center wavelength (µm)   0.46   0.51   0.64   0.86   1.6    2.3
Spatial resolution (km)  1.0    1.0    1.0    0.5    2.0    2.0

A. Himawari-8 AHI TOA Reflectance
B. Himawari-8 AHI Surface Reflectance
C. MODIS MCD12Q1 Land Cover Type
Land cover types are identified using the MODIS global land cover classification, which is produced annually from combined Terra and Aqua observations [31]. MCD12Q1 incorporates five distinct classification schemes. We use the International Geosphere Biosphere Programme (IGBP) global vegetation classification scheme. This scheme delineates 17 distinct classes including 11 natural, 3 developed/mixed, and 3 non-vegetated. MCD12Q1 is interpolated from 500 meter resolution to the 0.01 degree grid of the AHI datasets.

III. METHODS
A. Emulator model
Mapping from TOA reflectance to SR is typically handled by computationally expensive models which simulate nonlinear physics and incorporate ancillary information about atmospheric conditions. MAIAC is a state-of-the-art method for accomplishing atmospheric correction. In our approach to emulation, several deep networks are learned to approximate the MAIAC atmospheric correction algorithm.
1) Bayesian Deep Learning:
Typical deep neural networks are learned as deterministic functions which fail to capture inherent uncertainty in model parameters (epistemic) and data (aleatoric). However, quantifying these uncertainties is critical for decision making in applications from autonomous driving to the physical sciences. Several approaches based in Bayesian probability theory have been applied for uncertainty quantification (UQ) in neural networks. Bayesian neural networks (BNN) are a well-defined approach to capturing these uncertainties that aims to learn probability distributions over the functional parameters, such as the neural network weights and biases. However, performing inference on the full posterior distribution is intractable for networks with more than 2 hidden layers. This has led to the development of more efficient approximations of Bayesian inference.

Bayesian Deep Learning has been shown to be an effective approach to modeling uncertainty in BNNs by defining a variational (factorized) approximation, $q_\theta(W) = \prod_{l=1}^{L} q_\theta(w_l)$, of the true posterior distribution, $p(W)$, for variational parameters $\theta$ and $W = \{w_l\}_{l=1}^{L}$ [29]. Dropout, the process of randomly removing nodes from deep neural networks [32], is applied before each layer, $l$, in the model to approximate $q(w_l)$. This approach can be thought of as "thinning" the network. The optimization objective for this variational interpretation of a BNN, $f$, can be written as follows [33]:

$$\hat{L}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \log p(y_i \mid f^W(x_i)) + \mathrm{KL}(q_\theta(W) \,\|\, p(W)) = L_X(\theta) + \mathrm{KL}(q_\theta(W) \,\|\, p(W)) \qquad (1)$$

for data samples $\{x_i, y_i\}_{i=1}^{N}$. The first term expresses the log likelihood of the model, while the Kullback-Leibler divergence (KL) term acts as a regularizer by discouraging separation between the approximate posterior and the model prior.

During inference, stochastic forward passes generate $T$ independent and identically distributed samples. From these, we can empirically approximate the model's predictive distribution, the variance of which expresses the model's confidence interval. With $T$ samples of $[\hat{y}, \hat{\sigma}]$ from the Bayesian network $f^W(X)$, the unbiased estimates of the first two moments of the predictive distribution are:

$$\mathrm{E}[y] = \frac{1}{T} \sum_{t=1}^{T} \hat{y}_t \qquad (2)$$

$$\mathrm{Var}[y] = \frac{1}{T} \sum_{t=1}^{T} \left(\hat{y}_t^2 + \hat{\sigma}_t^2\right) - \left(\frac{1}{T} \sum_{t=1}^{T} \hat{y}_t\right)^2 \qquad (3)$$

Dropout is already applied in many deep learning models to discourage overfitting, and thus allows UQ without adding any computational complexity [32]. Uncertainty estimation from dropout encompasses both epistemic uncertainty, reducible through collection of more data, and aleatoric uncertainty from measurement noise. All models used in this work are Bayesian models implemented with concrete dropout. Concrete dropout is a variant that adapts dropout probability to obtain well-calibrated uncertainty estimates [33].
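The moment estimates in Eqs. (2)-(3) are straightforward to compute from stochastic forward passes. A minimal sketch, with a toy one-parameter "network" standing in for $f^W$ (the slope, noise scales, and $T$ here are illustrative assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def f_W(x):
    """Toy stand-in for one stochastic forward pass of a Bayesian network:
    returns (y_hat, sigma_hat). The scatter across calls mimics the
    dropout-induced (epistemic) variability; sigma_hat is the predicted
    aleatoric standard deviation. Illustrative only."""
    y_hat = 0.3 * x + rng.normal(0.0, 0.02)   # epistemic scatter across passes
    sigma_hat = 0.05                          # predicted aleatoric std
    return y_hat, sigma_hat

def predictive_moments(x, T=1000):
    """Eq. (2)-(3): empirical mean and variance from T i.i.d. passes."""
    ys, sigmas = zip(*(f_W(x) for _ in range(T)))
    ys, sigmas = np.array(ys), np.array(sigmas)
    mean = ys.mean()                             # E[y]
    var = (ys**2 + sigmas**2).mean() - mean**2   # Var[y]
    return mean, var
```

The recovered variance combines both terms: the spread of the sampled means (epistemic) plus the average predicted noise variance (aleatoric), matching the decomposition described in the text.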
2) Discrete-Continuous Distribution:
Prediction tasks generally fall into one of two categories: regression tasks predict a continuous quantity, while classification tasks are concerned with assigning a class label. MAIAC's atmospheric correction and cloud classification algorithms generate surface reflectance, a continuous variable ranging between 0 and 1, and a binary cloud classification. We learn a discrete-continuous model to perform both regression and classification tasks in one probabilistic model [34]. To this end, the model is conditioned to predict the probability of a pixel being clear sky. For the Bayesian network $f^W(X)$ described in Section III-A1, the mean, variance, and probability are sampled as follows:

$$[\hat{y}, \hat{\sigma}^2, \hat{\phi}] = f^W(X) \qquad (4)$$

$$\hat{p} = \mathrm{Sigmoid}(\hat{\phi}) \qquad (5)$$

This conditioning results in a two-part loss function, with the first term capturing the cross-entropy of the cloud label and the cloud prediction, and the second term capturing the conditional regression loss at clear sky pixels. Here, $y_i$ is the binary clear-sky indicator used in the classification term, and $D$ is the number of pixels, indexed by $i$.

$$L_X(\theta) = \underbrace{-\frac{1}{D} \sum_i \left[ y_i \log(\hat{p}_i) + (1 - y_i) \log(1 - \hat{p}_i) \right]}_{\text{binary classification loss}} + \underbrace{\frac{1}{D} \sum_{i,\, y_i > 0} \left[ \frac{1}{2\hat{\sigma}_i^2} \| y_i - \hat{y}_i \|^2 + \frac{1}{2} \log \hat{\sigma}_i^2 \right]}_{\text{conditional regression loss}} \qquad (6)$$

TABLE II
EVALUATION OF SURFACE REFLECTANCE FROM THE THREE CANDIDATE EMULATOR MODELS.

                                        Blue     Green    Red      NIR      SWIR1    SWIR2
SR_MAIAC                                0.064    0.084    0.155    0.307    0.327    0.231
SR_MAIAC - SR_emulator     DCFC        -0.001   -0.003   -0.015   -0.012   -0.012   -0.031
                           DCCNN        0.002    0.006
                           DCVDSR      -0.004   -0.009    0.010
(CV_MAIAC - CV_emulator)
  / CV_MAIAC (%)           DCCNN       31       24       10
Correlation                DCCNN        0.818    0.866    0.948    0.95     0.977    0.96
                           DCVDSR       0.797    0.864    0.93     0.917    0.977    0.956
Conditional RMSE           DCFC         0.0221   0.0217   0.0235   0.0233
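The loss in Eq. (6) can be sketched directly. Two details below are our assumptions for numerical convenience, not statements from the paper: the network is taken to predict $\log \hat{\sigma}^2$ (a common stability trick), and the clear-sky indicator is passed as a separate binary array.

```python
import numpy as np

def discrete_continuous_loss(y, clear, y_hat, log_sigma2, phi):
    """Two-part loss of Eq. (6): binary cross-entropy on the clear-sky
    probability, plus a heteroscedastic regression loss at clear pixels.
    y: target SR; clear: 1 at clear-sky pixels, 0 otherwise;
    y_hat, log_sigma2, phi: per-pixel network outputs (Eq. 4)."""
    p_hat = 1.0 / (1.0 + np.exp(-phi))   # Sigmoid(phi), Eq. (5)
    D = y.size
    bce = -np.sum(clear * np.log(p_hat)
                  + (1 - clear) * np.log(1 - p_hat)) / D
    sigma2 = np.exp(log_sigma2)
    reg = np.sum(clear * (0.5 * (y - y_hat) ** 2 / sigma2
                          + 0.5 * log_sigma2)) / D
    return bce + reg
```

Note how the regression term is self-regularizing: inflating $\hat{\sigma}^2$ discounts a pixel's squared error but pays a $\log \hat{\sigma}^2$ penalty, so the network cannot hide poor fits behind unbounded predicted variance.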
3) Implementation and Training:
We implement three full Bayesian architectures conditioned to learn a discrete-continuous distribution as in Vandal et al. (2018) [34]: a discrete-continuous fully connected neural network (DCFC), a discrete-continuous convolutional neural network (DCCNN), and a discrete-continuous very deep super resolution network (DCVDSR). DCVDSR, inspired by image super-resolution networks, is a convolutional neural network similar to DCCNN but incorporates a skip connection between the first and last hidden layers. All models have 3 layers with 512 hidden units per layer and ReLU activations. DCCNN and DCVDSR have filter sizes of 3.

Over 200 GB of data from a two year period is divided into training (2016) and testing (2017) sets. Models are implemented in TensorFlow 2.0 and trained on 50 by 50 pixel patches using stochastic gradient descent with Adam optimization [35] and a batch size of 16. For concrete dropout, the hyperparameters τ and prior length-scale are set to small fixed values.

B. Assessment of Emulated SR and Cloud Products
Due to the challenges associated with obtaining ground truth reflectance observations, comparison with an existing, comprehensively validated product can be used to assess the performance of a new reflectance product [36]. Our methodology for assessment of emulated data products follows standard methods for assessment of a reflectance product. As both the MAIAC SR and emulator SR are predicted from AHI TOA reflectance, all pixels are guaranteed coincident, coangled, and colocated, and can be directly compared.

For clear sky pixels, agreement and error between MAIAC and emulator SR are evaluated for each solar reflective band. Additionally, the ability of the emulator to discriminate between clear sky and non-clear sky pixels is evaluated in the assessment of the emulator cloud product. To evaluate the stability of emulator performance under varied land cover conditions, results are presented for common MODIS land cover classifications.

IV. RESULTS AND DISCUSSION
A. Model Evaluation
We adapt three deep learning architectures to learn a fast approximation of MAIAC's surface reflectance and cloud retrieval algorithm. We compare the three models to identify the best-performing architecture, based on lowest error in SR prediction and highest accuracy in cloud identification.
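The three architectures differ mainly in how spatial context enters the mapping. As an illustration of the DCVDSR design described in Section III, here is a minimal numpy forward pass: 3x3 convolutions with ReLU activations, dropout before each layer (kept active at inference so repeated passes give Bayesian samples), and a skip connection from the first to the last hidden layer. The channel widths, the parameter layout, and the three-channel output head (mean, log-variance, cloud logit) are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def conv2d(x, w, b):
    """Minimal 'same'-padded 2-D convolution, (C_in, H, W) -> (C_out, H, W)."""
    c_out, c_in, kh, kw = w.shape
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros((c_out, h, wd))
    for o in range(c_out):
        for i in range(c_in):
            for di in range(kh):
                for dj in range(kw):
                    out[o] += w[o, i, di, dj] * xp[i, di:di + h, dj:dj + wd]
        out[o] += b[o]
    return out

def dcvdsr_forward(x, params, drop_p=0.1, rng=None):
    """Sketch of a DCVDSR-style pass: dropout before each conv layer
    (MC dropout, active at inference), ReLU hidden activations, and a
    skip connection adding the first hidden layer to the last."""
    if rng is None:
        rng = np.random.default_rng(0)
    def drop(h):  # inverted dropout, kept on for Bayesian sampling
        return h * (rng.random(h.shape) > drop_p) / (1 - drop_p)
    h1 = np.maximum(conv2d(drop(x), *params[0]), 0)
    h2 = np.maximum(conv2d(drop(h1), *params[1]), 0)
    h3 = conv2d(drop(h2), *params[2]) + h1          # skip: first -> last hidden
    return conv2d(h3, *params[3])                    # head: [y_hat, log_sigma2, phi]
```

Because dropout stays on, two calls with different random states return different outputs for the same input; averaging many such passes yields the predictive moments of Eqs. (2)-(3).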
1) Surface Reflectance:
A comparison between basic statistics of the surface reflectance datasets obtained from MAIAC and the emulators is presented in Table II. Here, the mean value of surface reflectance is the average intensity of clear sky pixels in each band. Coefficient of variation (CV) is a measure of relative dispersion of the data, calculated by relating the standard deviation and mean of a distribution. For surface reflectance, CV relates to the radiometric stability characteristic of the sensor, with lower CV indicating greater stability. Comparison of MAIAC and emulator CV suggests that the predictions of the fully connected model (DCFC) most closely match the dispersion of MAIAC SR. The emulator models generally capture the relative magnitudes of variation in each wavelength while underestimating variation of the SR distribution. Underestimation of observed variation is most pronounced for the blue band across models. We also evaluate the spatial autocorrelation, or the degree to which values of a single variable are correlated due to nearness in space. We calculate Moran's I for each of the SR datasets as an indicator of the extent to which similar values cluster in space (I = 1), values are randomly located (I = 0), or similar values are dispersed in space (I = -1). We observe positive values of Moran's I in all datasets, finding MAIAC with I = 0.81 and emulators with approximately I = 0.94. The emulator datasets exhibit greater clustering of like values, reflecting characteristics of both surface reflectance and cloud products.

Correlation coefficient and conditional RMSE are also presented in Table II. Conditional RMSE refers to RMSE evaluated at clear sky pixels only. Correlation coefficients indicate that the strongest linear relationship is between MAIAC SR and DCFC emulator SR. Evaluation of performance across models suggests that mappings in some bands may be easier to learn (SWIR1, SWIR2), and others more difficult to learn (Blue, Green). Conversely, conditional RMSE is generally lowest for the DCVDSR model. As the square root of the variance of residuals, RMSE indicates the absolute fit of the model to the data, and can be thought of as more germane to predictive ability than correlation.

Pixel by pixel comparison is presented using density plots in Figure 3. The plots suggest strong coherence between MAIAC and emulator SR. A 1:1 line is displayed for visual comparison, while the slope and intercept of the best fit line are displayed on the plots. Outliers are generally located above the 1:1 line, indicating that the emulator models may not capture the upper tail of the SR distribution.

Fig. 3. Density plot illustrating the relationship between MAIAC SR and emulator (DCVDSR) SR for six solar reflective bands.

Histograms of differences between MAIAC and emulator surface reflectance are plotted in Figure 4. Locations are randomly sampled over the entire domain for the year 2017. For an ideal model, differences between observed and modeled values should be small and unbiased. Distributions are generally symmetric and centered around near-zero means, indicating minimal bias toward overestimation or underestimation by DCVDSR.

Fig. 4. Histogram of differences between MAIAC SR and emulator (DCVDSR) SR for six solar reflective bands.

Fig. 5. Analysis of the emulator cloud prediction. (a) Cloud classification accuracy with varying decision thresholds. (b) ROC curve for evaluation of classification performance.
2) Cloud Identification:
Evaluation of the emulator cloud prediction is performed by pixelwise comparison between the MAIAC cloud mask and the emulator cloud mask. As described in Section III-A2, the model is conditioned to predict $\hat{p}$ as the probability of a pixel being clear sky. By selecting a decision threshold value of $p$, cloud classification proceeds by casting pixels with $\hat{p} < p$ as non-clear sky and $\hat{p} > p$ as clear sky. Continuously varying the decision threshold $p$ and calculating the resulting classification accuracy indicates the optimal mask probability, $p$, for each trained model. Figure 5 presents a plot of classification accuracy with varying decision threshold.

Figure 5 also presents the ROC curve, used to assess the discrimination ability of binary classifiers. True positive rate (TPR) is plotted against false positive rate (FPR) at various thresholds. Area under the curve (AUC) provides a measure of how well the model can discriminate between two classes, with a maximum value of one for perfect classification.

Performance by accuracy for cloud classification is similar across the evaluated emulator models (Table III). Sensitivity refers to the true positive rate, or the proportion of clear sky pixels that are correctly classified. Specificity refers to the true negative rate, or the proportion of non-clear pixels that are correctly classified. A high specificity classifier will screen high aerosol pixels, while a less conservative, higher sensitivity classifier carries more chance of cloud contamination. Such cloud contamination has a potentially strong negative effect on SR retrieval. The three emulator models are generally more conservative, achieving greater classification accuracy for non-clear pixels than clear sky pixels.

TABLE III
CLOUD CLASSIFICATION ACCURACY, SENSITIVITY, AND SPECIFICITY.

Model    Accuracy (%)   Sensitivity   Specificity
DCFC     0.8656         0.7017
DCCNN
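The threshold sweep and the accuracy/sensitivity/specificity definitions above can be sketched directly (the probabilities and labels below are synthetic):

```python
import numpy as np

def classification_metrics(p_hat, label, threshold=0.5):
    """Cloud-mask metrics against a reference mask (here, MAIAC's).
    label == 1 marks clear sky; pixels with p_hat >= threshold are cast
    as clear sky, and p_hat < threshold as non-clear sky."""
    pred = p_hat >= threshold
    tp = np.sum(pred & (label == 1)); fn = np.sum(~pred & (label == 1))
    tn = np.sum(~pred & (label == 0)); fp = np.sum(pred & (label == 0))
    accuracy = (tp + tn) / label.size
    sensitivity = tp / (tp + fn)   # true positive rate (clear sky)
    specificity = tn / (tn + fp)   # true negative rate (non-clear)
    return accuracy, sensitivity, specificity

def best_threshold(p_hat, label, grid=np.linspace(0.05, 0.95, 19)):
    """Sweep decision thresholds and return the accuracy-maximising one."""
    accs = [classification_metrics(p_hat, label, t)[0] for t in grid]
    return grid[int(np.argmax(accs))]
```

The same (sensitivity, 1 - specificity) pairs across the threshold grid trace out the ROC curve shown in Figure 5.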
It should be noted that the assessment of classification accuracy uses MAIAC cloud masks as ground truth. Cloud masks produced from MAIAC contain uncertainties and inaccuracies of their own, and it is possible that the ability of CNNs to incorporate spatial information produces an advantage in cloud classification. Visual assessment of cloud predictions from MAIAC and the emulator often indicates greater spatial coherence of emulator cloud masks, and lesser appearance of some undesirable model artifacts (Figure 2).
3) Stability of Model over Varied Conditions:
Homogeneous vegetation areas are identified using MODIS MCD12Q1 Land Cover Type 1, and performance of the MAIAC emulator is evaluated for each land cover type separately. Performance, including conditional RMSE of SR and cloud classification accuracy by the DCVDSR emulator, is presented in Figure 6. Results are presented for the nine most abundant classes in the test set.

Fig. 6. (a) Emulator (DCVDSR) performance according to the International Geosphere Biosphere Programme (IGBP) global vegetation classification scheme. (b) Emulator (DCVDSR) performance in the Northern Hemisphere across seasons for 2017. Dotted lines represent overall mean performance for all land covers and all seasons in the Northern Hemisphere, respectively.

Both regression error and cloud classification accuracy are relatively stable for vegetated categories including forests, shrubland, and savanna. Barren land results in poorer performance. The optical properties of highly reflective surfaces present a challenge to atmospheric correction, and as such may also result in poor performance for MAIAC [37]. Therefore, it is necessary to consider that the results in Figure 6 represent comparison to MAIAC's estimates, rather than to ground truth values.

Seasonal performance of the emulator is also presented in Figure 6. Seasonal analysis is used to evaluate performance under annual fluctuations in vegetation phenology. Spring green-up, fall senescence, and transitions between wet and dry seasons result in SR variation of several absolute percent in vegetated areas [6]. SR error and cloud classification accuracy are generally stable throughout the year, but evidence slightly poorer performance and greater spread in fall months (SR prediction) and winter months (cloud classification). Seasonal performance is evaluated separately for each hemisphere for consistency of seasons. Similar results were found for the Southern Hemisphere.
B. Uncertainty Quantification
Bayesian deep learning models capture predictive uncertainty in regression tasks by producing a probabilistic output. As described in Section III-A1, we use variational inference to produce an ensemble of predictions for each sample, then compute unbiased estimates of the first and second moments of the predictive distribution at each pixel. From the second moment, the standard deviation expresses the magnitude of predictive uncertainty. For MAIAC, uncertainty generally grows in proportion to surface brightness [5]. This is also observed in the emulator uncertainty width (Figure 7).

We assess the quality of the uncertainty measurements by evaluating the uncertainty calibration, or whether the model captures the uncertainty in the observed data. We compare the model's predictive distribution to the observed values by evaluating the frequency of residuals lying in various probability thresholds within the predicted distribution [38]. Figure 7 presents each model's uncertainty calibration. A perfectly calibrated model, which captures the distribution of the observed data, would match the 1:1 line. All three models underestimate uncertainty to some extent, meaning they are overconfident in their predictions. Of the three, DCVDSR has the most well-calibrated uncertainty.
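The calibration check can be sketched as follows: assuming a Gaussian predictive distribution at each pixel (our simplifying assumption for this sketch), count how often the observed value falls in the central q-probability interval and compare that frequency to q. The synthetic data below make the "well calibrated" and "overconfident" cases explicit.

```python
import numpy as np
from math import erf, sqrt

def calibration_curve(y, mu, sigma, levels=np.linspace(0.1, 0.9, 9)):
    """Observed frequency of y inside the central q-probability interval
    mu +/- z_q * sigma of a Gaussian predictive distribution, per level q.
    A well-calibrated model gives frequency ~ q (the 1:1 line)."""
    def z_for(q):
        # two-sided standard-normal quantile via bisection, using
        # P(|Z| <= z) = erf(z / sqrt(2)); avoids a scipy dependency
        lo, hi = 0.0, 10.0
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if erf(mid / sqrt(2)) < q:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)
    freqs = np.array([np.mean(np.abs(y - mu) <= z_for(q) * sigma)
                      for q in levels])
    return levels, freqs
```

When predicted sigmas are too small relative to the actual residuals, every observed frequency falls below its level, i.e. the curve sits under the 1:1 line, which is the overconfidence pattern reported for all three emulator models.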
Fig. 7. (a) Plot of pixel intensity versus uncertainty reveals increasing uncertainty with increasing surface brightness. (b) Uncertainty calibration evaluates the frequency of observed values (y-axis) within predicted probability ranges (x-axis).

C. Performance
The spatiotemporal resolution and spatial extent of AHI scans result in the generation of over 50 TB of TOA reflectance per year. In this section we consider the nontrivial computing time necessary to retrieve surface reflectance from these scans. To evaluate the deep learning emulator models, we assess inference from one forward pass (static network) and ten stochastic forward passes (Bayesian sampling network). A single inference with the static network is sufficient to produce SR and cloud products; Bayesian sampling produces the same with uncertainty quantification.

Processing speeds are presented in Table IV. Emulator inference is evaluated on one GPU, while MAIAC, accelerated using precomputed look-up tables, is run on one CPU. Among the compared emulator models, processing speed decreases with increasing complexity. Inference with Bayesian sampling is generally comparable in speed to MAIAC, while inference on the static network represents between a 3.75x (DCVDSR) and 6x (DCFC) speedup.
TABLE IV
PROCESSING SPEED OF MAIAC AND EMULATOR MODELS.

                 Examples per second
Model        Static network   Bayesian sampling network
MAIAC        0.40             n/a
DCFC         2.4              0.60
DCCNN        1.8              0.33
DCVDSR       1.5              0.25
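The speedups quoted in the text follow directly from the Table IV throughputs:

```python
# Static-network throughput (examples per second) from Table IV.
throughput = {"MAIAC": 0.40, "DCFC": 2.4, "DCCNN": 1.8, "DCVDSR": 1.5}

# Speedup of each emulator's static network relative to MAIAC.
speedup = {m: t / throughput["MAIAC"]
           for m, t in throughput.items() if m != "MAIAC"}
# DCFC: 6.0x, DCCNN: 4.5x, DCVDSR: 3.75x
```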
V. CONCLUSIONS
In this work we evaluate the usefulness of deep learning-based emulation to approximate the MAIAC algorithms for surface reflectance retrieval and cloud identification. Discrete-continuous Bayesian neural networks are learned to emulate MAIAC with built-in uncertainty quantification. Using the full 120° by 120° view of Himawari-8 as a broad study area, we find that predictions from the emulator models are consistent with MAIAC and robust over varied land cover types and seasons. Analysis demonstrates well-calibrated uncertainty estimates for the proposed MAIAC emulator. The ability to generate probabilistic mappings from observed data to the geophysical variables of interest has potential applications including sensitivity analysis and model assessment enabled by geolocated estimates of uncertainty.

While this paper focuses on emulation of atmospheric correction for reflected solar radiation, future work in deep learning-based approximation may be applicable to probabilistic prediction of other quantities and has potential for efficiently exploiting large volumes of satellite data.

REFERENCES

[1] G. M. Foody and P. M. Atkinson, Uncertainty in Remote Sensing and GIS. John Wiley & Sons, 2003.
[2] E. Vermote, C. Justice, M. Claverie, and B. Franch, "Preliminary analysis of the performance of the Landsat 8/OLI land surface reflectance product," Remote Sensing of Environment, vol. 185, pp. 46-56, 2016.
[3] W. Lucht, C. B. Schaaf, and A. H. Strahler, "An algorithm for the retrieval of albedo from space using semiempirical BRDF models," IEEE Transactions on Geoscience and Remote Sensing, vol. 38, no. 2, pp. 977-998, 2000.
[4] A. Lyapustin, J. Martonchik, Y. Wang, I. Laszlo, and S. Korkin, "Multiangle implementation of atmospheric correction (MAIAC): 1. Radiative transfer basis and look-up tables," Journal of Geophysical Research: Atmospheres, vol. 116, no. D3, 2011.
[5] A. Lyapustin, Y. Wang, I. Laszlo, R. Kahn, S. Korkin, L. Remer, R. Levy, and J. Reid, "Multiangle implementation of atmospheric correction (MAIAC): 2. Aerosol algorithm," Journal of Geophysical Research: Atmospheres, vol. 116, no. D3, 2011.
[6] A. I. Lyapustin, Y. Wang, I. Laszlo, T. Hilker, F. G. Hall, P. J. Sellers, C. J. Tucker, and S. V. Korkin, "Multi-angle implementation of atmospheric correction for MODIS (MAIAC): 3. Atmospheric correction," Remote Sensing of Environment, vol. 127, pp. 385-393, 2012.
[7] S. Castruccio, D. J. McInerney, M. L. Stein, F. Liu Crouch, R. L. Jacob, and E. J. Moyer, "Statistical emulation of climate model projections based on precomputed GCM runs," Journal of Climate, vol. 27, no. 5, pp. 1829-1844, 2014.
[8] P. Holden and N. Edwards, "Dimensionally reduced emulation of an AOGCM for application to integrated assessment modelling," Geophysical Research Letters, vol. 37, no. 21, 2010.
[9] P. B. Holden, N. R. Edwards, P. H. Garthwaite, and R. D. Wilkinson, "Emulation and interpretation of high-dimensional climate model outputs," Journal of Applied Statistics, vol. 42, no. 9, pp. 2038-2055, 2015.
[10] M. A. Schnorbus and A. J. Cannon, "Statistical emulation of streamflow projections from a distributed hydrological model: Application to CMIP3 and CMIP5 climate projections for British Columbia, Canada," Water Resources Research, vol. 50, no. 11, pp. 8907-8926, 2014.
[11] V. M. Krasnopolsky, M. S. Fox-Rabinovitz, and D. V. Chalikov, "New approach to calculation of atmospheric model physics: Accurate and fast neural network emulation of longwave radiation in a climate model," Monthly Weather Review, vol. 133, no. 5, pp. 1370-1383, 2005.
[12] L. Martino, J. Vicent, and G. Camps-Valls, "Automatic emulator and optimized look-up table generation for radiative transfer models," pp. 1457-1460, IEEE, 2017.
[13] J. Rivera, J. Verrelst, J. Gómez-Dans, J. Muñoz-Marí, J. Moreno, and G. Camps-Valls, "An emulator toolbox to approximate radiative transfer models with statistical learning,"
Remote Sensing , vol. 7, no. 7,pp. 9347–9370, 2015.[14] G. Cheng, P. Zhou, and J. Han, “Learning rotation-invariant convolu-tional neural networks for object detection in vhr optical remote sensingimages,”
IEEE Transactions on Geoscience and Remote Sensing , vol. 54,no. 12, pp. 7405–7415, 2016.[15] S. Basu, S. Ganguly, S. Mukhopadhyay, R. DiBiano, M. Karki, andR. Nemani, “Deepsat: a learning framework for satellite imagery,”in
Proceedings of the 23rd SIGSPATIAL international conference onadvances in geographic information systems , p. 37, ACM, 2015.[16] M. Castelluccio, G. Poggi, C. Sansone, and L. Verdoliva, “Land use clas-sification in remote sensing images by convolutional neural networks,” arXiv preprint arXiv:1508.00092 , 2015.[17] L. Mou, P. Ghamisi, and X. X. Zhu, “Deep recurrent neural networks forhyperspectral image classification,”
IEEE Transactions on Geoscienceand Remote Sensing , vol. 55, no. 7, pp. 3639–3655, 2017.[18] J. You, X. Li, M. Low, D. Lobell, and S. Ermon, “Deep gaussian processfor crop yield prediction based on remote sensing data,” in
Thirty-FirstAAAI Conference on Artificial Intelligence , 2017.[19] M. Xie, N. Jean, M. Burke, D. Lobell, and S. Ermon, “Transfer learningfrom deep features for remote sensing and poverty mapping,” in
ThirtiethAAAI Conference on Artificial Intelligence , 2016.[20] S. Zhu, B. Lei, and Y. Wu, “Retrieval of hyperspectral surface reflectancebased on machine learning,”
Remote Sensing , vol. 10, no. 2, p. 323, 2018.[21] Y. Chen, K. Sun, C. Chen, T. Bai, T. Park, W. Wang, R. R. Nemani,and R. B. Myneni, “Generation and evaluation of lai and fpar productsfrom himawari-8 advanced himawari imager (ahi) data,”
Remote Sensing ,vol. 11, no. 13, p. 1517, 2019. [22] L. Ma, Y. Liu, X. Zhang, Y. Ye, G. Yin, and B. A. Johnson, “Deep learn-ing in remote sensing applications: A meta-analysis and review,”
ISPRSjournal of photogrammetry and remote sensing , vol. 152, pp. 166–177,2019.[23] C. Shen, “A transdisciplinary review of deep learning research and itsrelevance for water resources scientists,”
Water Resources Research ,vol. 54, no. 11, pp. 8558–8593, 2018.[24] T. Vandal, E. Kodra, S. Ganguly, A. Michaelis, R. Nemani, and A. R.Ganguly, “Deepsd: Generating high resolution climate change projec-tions through single image super-resolution,” in
Proceedings of the 23rdacm sigkdd international conference on knowledge discovery and datamining , pp. 1663–1672, ACM, 2017.[25] C. Chen, Y. Knyazikhin, T. Park, K. Yan, A. Lyapustin, Y. Wang,B. Yang, and R. Myneni, “Prototyping of lai and fpar retrievals frommodis multi-angle implementation of atmospheric correction (maiac)data,”
Remote Sensing , vol. 9, no. 4, p. 370, 2017.[26] Y. Wang, Y. Tian, Y. Zhang, N. El-Saleous, Y. Knyazikhin, E. Vermote,and R. B. Myneni, “Investigation of product accuracy as a functionof input and model uncertainties: Case study with seawifs and modislai/fpar algorithm,”
Remote Sensing of Environment , vol. 78, no. 3,pp. 299–313, 2001.[27] Y. Kang and M. Krarti, “Bayesian-emulator based parameter identifica-tion for calibrating energy models for existing buildings,” in
Buildingsimulation , vol. 9, pp. 411–428, Springer, 2016.[28] I. Vernon, J. Liu, M. Goldstein, J. Rowe, J. Topping, and K. Lindsey,“Bayesian uncertainty analysis for complex systems biology models:emulation, global parameter searches and evaluation of gene functions,”
BMC systems biology , vol. 12, no. 1, p. 1, 2018.[29] Y. Gal and Z. Ghahramani, “Dropout as a bayesian approximation:Representing model uncertainty in deep learning,” in internationalconference on machine learning , pp. 1050–1059, 2016.[30] J. M. Agency,
Himawari-8/9 Himawari Standard Data User’s GuideVersion 1.2 , 2015.[31] D. Sulla-Menashe and M. A. Friedl, “User guide to collection 6 modisland cover (mcd12q1 and mcd12c1) product,”
USGS: Reston, VA, USA ,2018.[32] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhut-dinov, “Dropout: a simple way to prevent neural networks from overfit-ting,”
The journal of machine learning research , vol. 15, no. 1, pp. 1929–1958, 2014.[33] Y. Gal, J. Hron, and A. Kendall, “Concrete dropout,” in
Advances inNeural Information Processing Systems , pp. 3581–3590, 2017.[34] T. Vandal, E. Kodra, J. Dy, S. Ganguly, R. Nemani, and A. R. Ganguly,“Quantifying uncertainty in discrete-continuous and skewed data withbayesian deep learning,” in
Proceedings of the 24th ACM SIGKDDInternational Conference on Knowledge Discovery & Data Mining ,pp. 2377–2386, ACM, 2018.[35] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,”
CoRR , vol. abs/1412.6980, 2014.[36] M. Feng, C. Huang, S. Channan, E. F. Vermote, J. G. Masek, and J. R.Townshend, “Quality assessment of landsat surface reflectance productsusing modis data,”
Computers & Geosciences , vol. 38, no. 1, pp. 9–22,2012.[37] A. Lyapustin, Y. Wang, I. Laszlo, and S. Korkin, “Improved cloud andsnow screening in maiac aerosol retrievals using spectral and spatialanalysis,”
Atmospheric Measurement Techniques , vol. 5, no. 4, pp. 843–850, 2012.[38] A. Kendall and Y. Gal, “What uncertainties do we need in bayesiandeep learning for computer vision?,” in