Deep Learning Emulation of Multi-Angle Implementation of Atmospheric Correction (MAIAC)
Kate Duffy, Thomas Vandal, Weile Wang, Ramakrishna Nemani, Auroop R. Ganguly
Sustainability and Data Sciences Laboratory, Department of Civil and Environmental Engineering, Northeastern University, 360 Huntington Avenue, Boston, MA; NASA Ames Research Center, Moffett Blvd, Mountain View, CA; Bay Area Environmental Research Institute, P.O. Box 25, Moffett Field, CA; California State University, Monterey Bay, Seaside, CA.
Abstract—New generation geostationary satellites make solar reflectance observations available at a continental scale with unprecedented spatiotemporal resolution and spectral range. Generating quality land monitoring products requires correction of the effects of atmospheric scattering and absorption, which vary in time and space according to geometry and atmospheric composition. Many atmospheric radiative transfer models, including that of Multi-Angle Implementation of Atmospheric Correction (MAIAC), are too computationally complex to be run in real time, and rely on precomputed look-up tables. Additionally, uncertainty in measurements and models for remote sensing receives insufficient attention, in part due to the difficulty of obtaining sufficient ground measurements. In this paper, we present an adaptation of Bayesian Deep Learning (BDL) to emulation of the MAIAC atmospheric correction algorithm. Emulation approaches learn a statistical model as an efficient approximation of a physical model, while machine learning methods have demonstrated performance in extracting spatial features and learning complex, nonlinear mappings. We demonstrate stable surface reflectance retrieval by emulation (R² between MAIAC and emulator SR is 0.63, 0.75, 0.86, 0.84, 0.95, and 0.91 for the Blue, Green, Red, NIR, SWIR1, and SWIR2 bands, respectively), accurate cloud detection (86%), and well-calibrated, geolocated uncertainty estimates. Our results support BDL-based emulation as an accurate and efficient (up to 6x speedup) method for approximating atmospheric correction, where built-in uncertainty estimates stand to open new opportunities for model assessment and support informed use of SR-derived quantities in multiple domains.
Index Terms—emulation, atmospheric correction, MAIAC, deep learning, Bayesian deep learning, uncertainty quantification, Himawari-8, geostationary
I. INTRODUCTION

Corresponding author: Kate Duffy, [email protected]

Operational land surface monitoring and scientific studies benefit from satellite-based observations at unprecedented spatiotemporal resolution. New-generation geostationary satellites include the Japanese Space Agency's Himawari-8 and the National Oceanic and Atmospheric Administration's (NOAA) Geostationary Operational Environmental Satellite (GOES) series. Geosynchronous orbits, which have traditionally been leveraged for communications and weather monitoring satellites, enable sensors to produce continental and regional-scale scans at intervals of as little as 30 seconds. Such high-temporal-resolution observations have applications in the study of diurnal processes, near-real-time monitoring of natural hazards, and creation of relatively cloud-free daily composites. In comparison to previous geostationary sensors, these satellites have improved spatial resolution and spectral range. These characteristics lend new-generation geostationary satellites to applications for land surface monitoring and invite comparison to sensors like the land-monitoring flagship Moderate Resolution Imaging Spectroradiometer (MODIS).

As remote sensing helps to propel earth science into the big data era, sciences face a challenge in processing and making use of terabytes of observational data, much of which has unknown accuracy [1]. Several types of uncertainty exist, including aleatoric uncertainty from measurement noise and epistemic uncertainty from incomplete knowledge about modeled processes. One such modeled process is the spatially, temporally, and spectrally varying interaction of reflected energy with gases and aerosols in the Earth's atmosphere [2]. Scattering and absorption effects are particularly strong in the visible and near infrared spectra and depend on the location and properties of atmospheric aerosols and water vapor. These complex interactions, combined with the challenges of adjacency effects, heterogeneous landscapes, and rugged terrain, make atmospheric effects difficult to correct. Removing these perturbations, which can vary reflectance by up to 15%, prevents atmospheric variability from being interpreted as land surface change, and enables generation of reliable monitoring products [2].
Approaches to separate surface reflectance from atmospheric signals range from simple methods like dark body subtraction to sophisticated land-atmosphere models that numerically simulate the transfers of energy in the atmosphere by absorption, scattering, and emission.

Developed for MODIS, the Multi-Angle Implementation of Atmospheric Correction (MAIAC) algorithm has been adapted to retrieve surface reflectance and atmospheric composition for the geostationary satellite Himawari-8. MAIAC uses a semi-analytical solution of the kernel-based RossThick LiSparse (RTLS) model [3]. MAIAC uses time series of up to sixteen days and stored information about the characteristics of each location to help separate the contributions of surface and atmosphere to the observed signal [4], [5], [6]. Running an atmospheric radiative transfer model in real time is computationally complex. Instead, MAIAC relies on the generation of look-up tables (LUT) with precomputed values. LUTs are precomputed at a grid density chosen with consideration to both accuracy and memory requirement, with values retrieved by linear interpolation between calculated values [4].

Fig. 1. Schematic of emulator approach, where a physics-based model is replaced by an efficient surrogate model.

Another approach to reducing the runtime computation of expensive models is through emulation. Emulation is an approach to modeling that replaces a physics-based model or model component with a learned component, which acts as a fast approximation of the model physics. The objective of emulation is not to develop a new parametrization, but to efficiently and accurately reproduce an existing one, which has been carefully developed and validated based on domain knowledge.
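The LUT strategy can be illustrated with a small sketch. The radiative transfer stand-in, the choice of grid variables (aerosol optical depth and solar zenith angle), and the grid densities below are all hypothetical; the point is only the trade MAIAC makes: pay the expensive computation once per grid node, then answer arbitrary queries by linear interpolation between nodes.

```python
import numpy as np

# Hypothetical 2-D look-up table: path reflectance as a function of
# aerosol optical depth (AOD) and solar zenith angle (SZA).
# Grid density trades accuracy against memory, as in MAIAC's LUTs.
aod_grid = np.linspace(0.0, 1.0, 11)   # precomputed AOD nodes
sza_grid = np.linspace(0.0, 80.0, 9)   # precomputed SZA nodes (degrees)

def expensive_rt_model(aod, sza):
    # Stand-in for a radiative transfer computation (illustrative only).
    return 0.05 + 0.1 * aod + 0.001 * sza + 0.02 * aod * np.cos(np.radians(sza))

# Precompute the table once (the expensive step).
lut = expensive_rt_model(aod_grid[:, None], sza_grid[None, :])

def lut_lookup(aod, sza):
    """Bilinear interpolation between precomputed LUT nodes (the cheap step)."""
    i = np.clip(np.searchsorted(aod_grid, aod) - 1, 0, len(aod_grid) - 2)
    j = np.clip(np.searchsorted(sza_grid, sza) - 1, 0, len(sza_grid) - 2)
    ta = (aod - aod_grid[i]) / (aod_grid[i + 1] - aod_grid[i])
    ts = (sza - sza_grid[j]) / (sza_grid[j + 1] - sza_grid[j])
    return ((1 - ta) * (1 - ts) * lut[i, j] + ta * (1 - ts) * lut[i + 1, j]
            + (1 - ta) * ts * lut[i, j + 1] + ta * ts * lut[i + 1, j + 1])
```

At query time the interpolation costs a few arithmetic operations regardless of how expensive the node computation was; interpolation error shrinks with grid density, which is exactly the accuracy-versus-memory trade described above.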
Emulation using statistical models and shallow neural networks has been applied to various earth science applications including climate modeling [7], [8], [9], hydrology [10], and atmospheric modeling [11], [12], [13]. Where emulation can meaningfully accelerate modeling, the need for computing time and resources is reduced. Efficient surrogate models can be used for sensitivity analysis and uncertainty quantification within computing resource-limited contexts. Emulation has also shown potential for scientific insight, such as into the relationships between high-dimensional model inputs and outputs, especially where process dynamics are not well understood [9].

The capability of neural networks to learn complex, nonlinear mappings has been used in remote sensing for several decades. Many recent works have utilized convolutional neural networks (CNN), a class of algorithm that can extract features from spatial data. The ability to leverage spatial correlations in image-like inputs has led to achievements for remote sensing tasks including object detection [14], land use and land cover (LULC) classification [15], [16], [17], and prediction of quantities ranging from agricultural yield [18] to poverty [19]. Non-deep learning algorithms have been applied to predict remote sensing products including multispectral surface reflectance [20] and vegetation indices [21].

While machine learning has demonstrated ability for extracting credible insights from complex datasets in multiple geoscience domains [22], [23], [24], reasons for caution remain in many applications. Deep neural networks are limited in physical interpretability and generally do not have built-in quantification of predictive uncertainty. Uncertainty assessment is useful for informed use of machine learning, and is also necessary for generation of some high-level land products from surface reflectance.
For example, the MODIS leaf area index (LAI) and fraction of photosynthetically active radiation (FPAR) algorithm is calibrated using uncertainty on its inputs [25], [26]. Bayesian emulators have been used to mimic systems from biology to built infrastructures [27], [28]. Approximations of Bayesian inference can be used to extract information about both aleatoric and epistemic uncertainty from deep learning (DL) models [29].

In this paper we develop methodology integrating emulation and Bayesian Deep Learning (BDL). The approach is adapted to remote sensing, a domain where big data and complex models present both a challenge and an opportunity for flexible, data-driven methods. We demonstrate a BDL-based emulation of MAIAC's surface reflectance retrieval and cloud classification routines with built-in Bayesian uncertainty quantification. We test the performance of the MAIAC emulator over various land cover types and seasons and find stable performance. Additionally, we assess the calibration of uncertainty estimates and quantify the increase in speed compared to MAIAC.

The main contributions of this paper are in both science and methods.

• Our methods innovation consists of the adaptation of Bayesian deep learning to emulation of a physics model in remote sensing.

• The resulting advancement in scientific insight comes from the well-calibrated and geolocated estimates of model uncertainty, previously not available for MAIAC.

The rest of this paper is organized as follows. Section II reviews the study area and data sets used. Section III introduces the proposed methods for emulation. Section IV presents results and evaluation. Finally, conclusions are drawn in Section V.
Fig. 2. Top of atmosphere reflectance and sample results for MAIAC and DCVDSR emulator. Columns 1-3 are in RGB true colors. Rows from top: Indochina Peninsula and the South China Sea, eastern China and the Bohai Sea, southwestern Australia.
II. STUDY AREA AND DATA SETS

Datasets used in this study are from the Advanced Himawari Imager (AHI) sensor carried by the Japanese geostationary satellite Himawari-8. In the GeoNEX processing pipeline, Himawari Standard Data (HSD) scans are georeferenced and converted to gridded data. The resulting gridded data sets follow a geographic coordinate system with a 120° by 120° extent (E85° - E205°, N60° - S60°). The domain is divided into 6° by 6° tiles defined by fixed latitude and longitude. Himawari-8's full disk, which encompasses the entire view as seen from the satellite, covers the continent of Australia and eastern Asia. Full disk scans are repeated every ten minutes.

The Advanced Himawari Imager has sixteen observing bands encompassing visible, near-infrared (NIR), short wave infrared (SWIR), and thermal infrared (TIR), with spatial resolution ranging from 0.5 to 2 km. Bands one through six are solar reflective bands, spectrally similar to NASA's MODIS. All bands are resampled to a common 0.01° resolution, which corresponds to 1 km at the equator [Table I].

TABLE I
HIMAWARI-8 AHI SOLAR REFLECTIVE BANDS FOR LAND SURFACE OBSERVATION.

Himawari-8 AHI Band      Blue   Green  Red    NIR    SWIR1  SWIR2
Center wavelength (µm)   0.46   0.51   0.64   0.86   1.6    2.3
Spatial resolution (km)  1.0    1.0    1.0    0.5    2.0    2.0

A. Himawari-8 AHI TOA Reflectance
B. Himawari-8 AHI Surface Reflectance
C. MODIS MCD12Q1 Land Cover Type
Land cover types are identified using the MODIS global land cover classification, which is produced annually from combined Terra and Aqua observations [31]. MCD12Q1 incorporates five distinct classification schemes. We use the International Geosphere Biosphere Programme (IGBP) global vegetation classification scheme. This scheme delineates 17 distinct classes including 11 natural, 3 developed/mixed, and 3 non-vegetated. MCD12Q1 is interpolated from 500 meter resolution to the 0.01 degree grid of the AHI datasets.

III. METHODS
A. Emulator model
Mapping from TOA reflectance to SR is typically handled by computationally expensive models which simulate nonlinear physics and incorporate ancillary information about atmospheric conditions. MAIAC is a state-of-the-art method for accomplishing atmospheric correction. In our approach to emulation, several deep networks are learned to approximate the MAIAC atmospheric correction algorithm.
1) Bayesian Deep Learning:
Typical deep neural networks are learned as deterministic functions which fail to capture inherent uncertainty in model parameters (epistemic) and data (aleatoric). However, quantifying these uncertainties is critical for decision making in applications from autonomous driving to the physical sciences. Several approaches based in Bayesian probability theory have been applied for uncertainty quantification (UQ) in neural networks. Bayesian neural networks (BNN) are a well-defined approach to capturing these uncertainties that aims to learn probability distributions over the functional parameters, such as the neural network weights and biases. However, performing inference on the full posterior distribution is intractable for networks with more than 2 hidden layers. This has led to the development of more efficient approximations of Bayesian inference.

Bayesian Deep Learning has been shown to be an effective approach to modeling uncertainty in BNNs by defining a variational (factorized) approximation, $q_\theta(W) = \prod_{l=1}^{L} q_\theta(w_l)$, of the true posterior distribution, $p(W)$, for variational parameters $\theta$ and $W = \{w_l\}_{l=1}^{L}$ [29]. Dropout, the process of randomly removing nodes from deep neural networks [32], is applied before each layer, $l$, in the model to approximate $q(w_l)$. This approach can be thought of as "thinning" the network. The optimization objective for this variational interpretation of a BNN, $f$, can be written as follows [33]:

$$\hat{L}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \log p(y_i \mid f^W(x_i)) + \mathrm{KL}(q_\theta(W) \,\|\, p(W)) = L_X(\theta) + \mathrm{KL}(q_\theta(W) \,\|\, p(W)) \qquad (1)$$

for data samples $\{x_i, y_i\}_{i=1}^{N}$. The first term expresses the log likelihood of the model, while the Kullback-Leibler divergence (KL) term acts as a regularizer by discouraging separation between the approximate posterior and the model prior.

During inference, stochastic forward passes generate $T$ independent and identically distributed samples. From these, we can empirically approximate the model's predictive distribution, the variance of which expresses the model's confidence interval. With $T$ samples of $[\hat{y}, \hat{\sigma}]$ from the Bayesian network $f^W(X)$, the unbiased estimates of the first two moments of the predictive distribution are:

$$\mathrm{E}[y] = \frac{1}{T} \sum_{t=1}^{T} \hat{y}_t \qquad (2)$$

$$\mathrm{Var}[y] = \frac{1}{T} \sum_{t=1}^{T} \left(\hat{y}_t^2 + \hat{\sigma}_t^2\right) - \left(\frac{1}{T} \sum_{t=1}^{T} \hat{y}_t\right)^2 \qquad (3)$$

Dropout is already applied in many deep learning models to discourage overfitting, and thus allows UQ without adding any computational complexity [32]. Uncertainty estimation from dropout encompasses both epistemic uncertainty, reducible through collection of more data, and aleatoric uncertainty from measurement noise. All models used in this work are Bayesian models implemented with concrete dropout. Concrete dropout is a variant that adapts dropout probability to obtain well-calibrated uncertainty estimates [33].
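The moment estimates in Eqs. (2)-(3) are straightforward to compute from stochastic forward passes. A minimal sketch, with a toy one-parameter "network" standing in for $f^W$ (the slope, noise scales, and $T$ here are illustrative assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def f_W(x):
    """Toy stand-in for one stochastic forward pass of a Bayesian network:
    returns (y_hat, sigma_hat). The scatter across calls mimics the
    dropout-induced (epistemic) variability; sigma_hat is the predicted
    aleatoric standard deviation. Illustrative only."""
    y_hat = 0.3 * x + rng.normal(0.0, 0.02)   # epistemic scatter across passes
    sigma_hat = 0.05                          # predicted aleatoric std
    return y_hat, sigma_hat

def predictive_moments(x, T=1000):
    """Eq. (2)-(3): empirical mean and variance from T i.i.d. passes."""
    ys, sigmas = zip(*(f_W(x) for _ in range(T)))
    ys, sigmas = np.array(ys), np.array(sigmas)
    mean = ys.mean()                             # E[y]
    var = (ys**2 + sigmas**2).mean() - mean**2   # Var[y]
    return mean, var
```

The recovered variance combines both terms: the spread of the sampled means (epistemic) plus the average predicted noise variance (aleatoric), matching the decomposition described in the text.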
2) Discrete-Continuous Distribution:
Prediction tasks generally fall into one of two categories: regression tasks predict a continuous quantity, while classification tasks are concerned with assigning a class label. MAIAC's atmospheric correction and cloud classification algorithms generate surface reflectance, a continuous variable ranging between 0 and 1, and a binary cloud classification. We learn a discrete-continuous model to perform both regression and classification tasks in one probabilistic model [34]. To this end, the model is conditioned to predict the probability of a pixel being clear sky. For the Bayesian network $f^W(X)$ described in Section III-A1, the mean, variance, and probability are sampled as follows:

$$[\hat{y}, \hat{\sigma}^2, \hat{\phi}] = f^W(X) \qquad (4)$$

$$\hat{p} = \mathrm{Sigmoid}(\hat{\phi}) \qquad (5)$$

This conditioning results in a two-part loss function, with the first term capturing the cross-entropy of the cloud label and the cloud prediction, and the second term capturing the conditional regression loss at clear sky pixels. Here, $y_i$ is the binary clear-sky indicator used in the classification term, and $D$ is the number of pixels, indexed by $i$.

$$L_X(\theta) = \underbrace{-\frac{1}{D} \sum_i \left[ y_i \log(\hat{p}_i) + (1 - y_i) \log(1 - \hat{p}_i) \right]}_{\text{binary classification loss}} + \underbrace{\frac{1}{D} \sum_{i,\, y_i > 0} \left[ \frac{1}{2\hat{\sigma}_i^2} \| y_i - \hat{y}_i \|^2 + \frac{1}{2} \log \hat{\sigma}_i^2 \right]}_{\text{conditional regression loss}} \qquad (6)$$

TABLE II
EVALUATION OF SURFACE REFLECTANCE FROM THE THREE CANDIDATE EMULATOR MODELS.

                                        Blue     Green    Red      NIR      SWIR1    SWIR2
SR_MAIAC                                0.064    0.084    0.155    0.307    0.327    0.231
SR_MAIAC - SR_emulator     DCFC        -0.001   -0.003   -0.015   -0.012   -0.012   -0.031
                           DCCNN        0.002    0.006
                           DCVDSR      -0.004   -0.009    0.010
(CV_MAIAC - CV_emulator)
  / CV_MAIAC (%)           DCCNN       31       24       10
Correlation                DCCNN        0.818    0.866    0.948    0.95     0.977    0.96
                           DCVDSR       0.797    0.864    0.93     0.917    0.977    0.956
Conditional RMSE           DCFC         0.0221   0.0217   0.0235   0.0233
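The loss in Eq. (6) can be sketched directly. Two details below are our assumptions for numerical convenience, not statements from the paper: the network is taken to predict $\log \hat{\sigma}^2$ (a common stability trick), and the clear-sky indicator is passed as a separate binary array.

```python
import numpy as np

def discrete_continuous_loss(y, clear, y_hat, log_sigma2, phi):
    """Two-part loss of Eq. (6): binary cross-entropy on the clear-sky
    probability, plus a heteroscedastic regression loss at clear pixels.
    y: target SR; clear: 1 at clear-sky pixels, 0 otherwise;
    y_hat, log_sigma2, phi: per-pixel network outputs (Eq. 4)."""
    p_hat = 1.0 / (1.0 + np.exp(-phi))   # Sigmoid(phi), Eq. (5)
    D = y.size
    bce = -np.sum(clear * np.log(p_hat)
                  + (1 - clear) * np.log(1 - p_hat)) / D
    sigma2 = np.exp(log_sigma2)
    reg = np.sum(clear * (0.5 * (y - y_hat) ** 2 / sigma2
                          + 0.5 * log_sigma2)) / D
    return bce + reg
```

Note how the regression term is self-regularizing: inflating $\hat{\sigma}^2$ discounts a pixel's squared error but pays a $\log \hat{\sigma}^2$ penalty, so the network cannot hide poor fits behind unbounded predicted variance.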
3) Implementation and Training:
We implement three full Bayesian architectures conditioned to learn a discrete-continuous distribution as in Vandal et al. (2018) [34]: a discrete-continuous fully connected neural network (DCFC), a discrete-continuous convolutional neural network (DCCNN), and a discrete-continuous very deep super resolution network (DCVDSR). DCVDSR, inspired by image super-resolution networks, is a convolutional neural network similar to DCCNN but incorporates a skip connection between the first and last hidden layers. All models have 3 layers with 512 hidden units per layer and ReLU activations. DCCNN and DCVDSR have filter sizes of 3.

Over 200 GB of data from a two year period is divided into training (2016) and testing (2017) sets. Models are implemented in TensorFlow 2.0 and trained on 50 by 50 pixel patches using stochastic gradient descent with Adam optimization [35] and a batch size of 16. For concrete dropout, the hyperparameters τ and prior length-scale are set to small fixed values.

B. Assessment of Emulated SR and Cloud Products
Due to the challenges associated with obtaining ground truth reflectance observations, comparison with an existing, comprehensively validated product can be used to assess the performance of a new reflectance product [36]. Our methodology for assessment of emulated data products follows standard methods for assessment of a reflectance product. As both the MAIAC SR and emulator SR are predicted from AHI TOA reflectance, all pixels are guaranteed coincident, coangled, and colocated, and can be directly compared.

For clear sky pixels, agreement and error between MAIAC and emulator SR are evaluated for each solar reflective band. Additionally, the ability of the emulator to discriminate between clear sky and non-clear sky pixels is evaluated in the assessment of the emulator cloud product. To evaluate the stability of emulator performance under varied land cover conditions, results are presented for common MODIS land cover classifications.

IV. RESULTS AND DISCUSSION
A. Model Evaluation
We adapt three deep learning architectures to learn a fast approximation of MAIAC's surface reflectance and cloud retrieval algorithm. We compare the three models to identify the best-performing architecture, based on lowest error in SR prediction and highest accuracy in cloud identification.
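The three architectures differ mainly in how spatial context enters the mapping. As an illustration of the DCVDSR design described in Section III, here is a minimal numpy forward pass: 3x3 convolutions with ReLU activations, dropout before each layer (kept active at inference so repeated passes give Bayesian samples), and a skip connection from the first to the last hidden layer. The channel widths, the parameter layout, and the three-channel output head (mean, log-variance, cloud logit) are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def conv2d(x, w, b):
    """Minimal 'same'-padded 2-D convolution, (C_in, H, W) -> (C_out, H, W)."""
    c_out, c_in, kh, kw = w.shape
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros((c_out, h, wd))
    for o in range(c_out):
        for i in range(c_in):
            for di in range(kh):
                for dj in range(kw):
                    out[o] += w[o, i, di, dj] * xp[i, di:di + h, dj:dj + wd]
        out[o] += b[o]
    return out

def dcvdsr_forward(x, params, drop_p=0.1, rng=None):
    """Sketch of a DCVDSR-style pass: dropout before each conv layer
    (MC dropout, active at inference), ReLU hidden activations, and a
    skip connection adding the first hidden layer to the last."""
    if rng is None:
        rng = np.random.default_rng(0)
    def drop(h):  # inverted dropout, kept on for Bayesian sampling
        return h * (rng.random(h.shape) > drop_p) / (1 - drop_p)
    h1 = np.maximum(conv2d(drop(x), *params[0]), 0)
    h2 = np.maximum(conv2d(drop(h1), *params[1]), 0)
    h3 = conv2d(drop(h2), *params[2]) + h1          # skip: first -> last hidden
    return conv2d(h3, *params[3])                    # head: [y_hat, log_sigma2, phi]
```

Because dropout stays on, two calls with different random states return different outputs for the same input; averaging many such passes yields the predictive moments of Eqs. (2)-(3).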
1) Surface Reflectance:
A comparison between basic statistics of the surface reflectance datasets obtained from MAIAC and the emulators is presented in Table II. Here, the mean value of surface reflectance is the average intensity of clear sky pixels in each band. Coefficient of variation (CV) is a measure of relative dispersion of the data, calculated by relating the standard deviation and mean of a distribution. For surface reflectance, CV relates to the radiometric stability characteristic of the sensor, with lower CV indicating greater stability. Comparison of MAIAC and emulator CV suggests that the predictions of the fully connected model (DCFC) most closely match the dispersion of MAIAC SR. The emulator models generally capture the relative magnitudes of variation in each wavelength while underestimating variation of the SR distribution. Underestimation of observed variation is most pronounced for the blue band across models. We also evaluate the spatial autocorrelation, or the degree to which values of a single variable are correlated due to nearness in space. We calculate Moran's I for each of the SR datasets as an indicator of the extent to which similar values cluster in space (I = 1), values are randomly located (I = 0), or similar values are dispersed in space (I = -1). We observe positive values of Moran's I in all datasets, finding MAIAC with I = 0.81 and emulators with approximately I = 0.94. The emulator datasets exhibit greater clustering of like values, reflecting characteristics of both surface reflectance and cloud products.

Correlation coefficient and conditional RMSE are also presented in Table II. Conditional RMSE refers to RMSE evaluated at clear sky pixels only. Correlation coefficients indicate that the strongest linear relationship is between MAIAC SR and DCFC emulator SR. Evaluation of performance across models suggests that mappings in some bands may be easier to learn (SWIR1, SWIR2), and others more difficult to learn (Blue, Green). Conversely, conditional RMSE is generally lowest for the DCVDSR model. As the square root of the variance of residuals, RMSE indicates the absolute fit of the model to the data, and can be thought of as more germane to predictive ability than correlation.

Pixel by pixel comparison is presented using density plots in Figure 3. The plots suggest strong coherence between MAIAC and emulator SR. A 1:1 line is displayed for visual comparison, while the slope and intercept of the best fit line are displayed on the plots. Outliers are generally located above the 1:1 line, indicating that the emulator models may not capture the upper tail of the SR distribution.

Fig. 3. Density plot illustrating the relationship between MAIAC SR and emulator (DCVDSR) SR for six solar reflective bands.

Histograms of differences between MAIAC and emulator surface reflectance are plotted in Figure 4. Locations are randomly sampled over the entire domain for the year 2017. For an ideal model, differences between observed and modeled values should be small and unbiased. Distributions are generally symmetric and centered around near-zero means, indicating minimal bias toward overestimation or underestimation by DCVDSR.

Fig. 4. Histogram of differences between MAIAC SR and emulator (DCVDSR) SR for six solar reflective bands.

Fig. 5. Analysis of the emulator cloud prediction. (a) Cloud classification accuracy with varying decision thresholds. (b) ROC curve for evaluation of classification performance.
2) Cloud Identification:
Evaluation of the emulator cloud prediction is performed by pixelwise comparison between the MAIAC cloud mask and the emulator cloud mask. As described in Section III-A2, the model is conditioned to predict $\hat{p}$ as the probability of a pixel being clear sky. By selecting a decision threshold value of $p$, cloud classification proceeds by casting pixels with $\hat{p} < p$ as non-clear sky and $\hat{p} > p$ as clear sky. Continuously varying the decision threshold $p$ and calculating the resulting classification accuracy indicates the optimal mask probability, $p$, for each trained model. Figure 5 presents a plot of classification accuracy with varying decision threshold.

Figure 5 also presents the ROC curve, used to assess the discrimination ability of binary classifiers. True positive rate (TPR) is plotted against false positive rate (FPR) at various thresholds. Area under the curve (AUC) provides a measure of how well the model can discriminate between two classes, with a maximum value of one for perfect classification.

Performance by accuracy for cloud classification is similar across the evaluated emulator models (Table III). Sensitivity refers to the true positive rate, or the proportion of clear sky pixels that are correctly classified. Specificity refers to the true negative rate, or the proportion of non-clear pixels that are correctly classified. A high specificity classifier will screen high aerosol pixels, while a less conservative, higher sensitivity classifier carries more chance of cloud contamination. Such cloud contamination has a potentially strong negative effect on SR retrieval. The three emulator models are generally more conservative, achieving greater classification accuracy for non-clear pixels than clear sky pixels.

TABLE III
CLOUD CLASSIFICATION ACCURACY, SENSITIVITY, AND SPECIFICITY.

Model    Accuracy (%)   Sensitivity   Specificity
DCFC     0.8656         0.7017
DCCNN
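The threshold sweep and the accuracy/sensitivity/specificity definitions above can be sketched directly (the probabilities and labels below are synthetic):

```python
import numpy as np

def classification_metrics(p_hat, label, threshold=0.5):
    """Cloud-mask metrics against a reference mask (here, MAIAC's).
    label == 1 marks clear sky; pixels with p_hat >= threshold are cast
    as clear sky, and p_hat < threshold as non-clear sky."""
    pred = p_hat >= threshold
    tp = np.sum(pred & (label == 1)); fn = np.sum(~pred & (label == 1))
    tn = np.sum(~pred & (label == 0)); fp = np.sum(pred & (label == 0))
    accuracy = (tp + tn) / label.size
    sensitivity = tp / (tp + fn)   # true positive rate (clear sky)
    specificity = tn / (tn + fp)   # true negative rate (non-clear)
    return accuracy, sensitivity, specificity

def best_threshold(p_hat, label, grid=np.linspace(0.05, 0.95, 19)):
    """Sweep decision thresholds and return the accuracy-maximising one."""
    accs = [classification_metrics(p_hat, label, t)[0] for t in grid]
    return grid[int(np.argmax(accs))]
```

The same (sensitivity, 1 - specificity) pairs across the threshold grid trace out the ROC curve shown in Figure 5.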
It should be noted that the assessment of classification accuracy uses MAIAC cloud masks as ground truth. Cloud masks produced from MAIAC contain uncertainties and inaccuracies of their own, and it is possible that the ability of CNNs to incorporate spatial information produces an advantage in cloud classification. Visual assessment of cloud predictions from MAIAC and the emulator often indicates greater spatial coherence of emulator cloud masks, and lesser appearance of some undesirable model artifacts (Figure 2).
3) Stability of Model over Varied Conditions:
Homogeneous vegetation areas are identified using MODIS MCD12Q1 Land Cover Type 1, and performance of the MAIAC emulator is evaluated for each land cover type separately. Performance, including conditional RMSE of SR and cloud classification accuracy by the DCVDSR emulator, is presented in Figure 6. Results are presented for the nine most abundant classes in the test set.

Fig. 6. (a) Emulator (DCVDSR) performance according to the International Geosphere Biosphere Programme (IGBP) global vegetation classification scheme. (b) Emulator (DCVDSR) performance in the Northern Hemisphere across seasons for 2017. Dotted lines represent overall mean performance for all land covers and all seasons in the Northern Hemisphere, respectively.

Both regression error and cloud classification accuracy are relatively stable for vegetated categories including forests, shrubland, and savanna. Barren land results in poorer performance. The optical properties of highly reflective surfaces present a challenge to atmospheric correction, and as such may also result in poor performance for MAIAC [37]. Therefore, it is necessary to consider that the results in Figure 6 represent comparison to MAIAC's estimates, rather than to ground truth values.

Seasonal performance of the emulator is also presented in Figure 6. Seasonal analysis is used to evaluate performance under annual fluctuations in vegetation phenology. Spring green-up, fall senescence, and transitions between wet and dry seasons result in SR variation of several absolute percent in vegetated areas [6]. SR error and cloud classification accuracy are generally stable throughout the year, but evidence slightly poorer performance and greater spread in fall months (SR prediction) and winter months (cloud classification). Seasonal performance is evaluated separately for each hemisphere for consistency of seasons. Similar results were found for the Southern Hemisphere.
B. Uncertainty Quantification
Bayesian deep learning models capture predictive uncertainty in regression tasks by producing a probabilistic output. As described in Section III-A1, we use variational inference to produce an ensemble of predictions for each sample, then compute unbiased estimates of the first and second moments of the predictive distribution at each pixel. From the second moment, the standard deviation expresses the magnitude of predictive uncertainty. For MAIAC, uncertainty generally grows in proportion to surface brightness [5]. This is also observed in the emulator uncertainty width (Figure 7).

We assess the quality of the uncertainty measurements by evaluating the uncertainty calibration, or whether the model captures the uncertainty in the observed data. We compare the model's predictive distribution to the observed values by evaluating the frequency of residuals lying in various probability thresholds within the predicted distribution [38]. Figure 7 presents each model's uncertainty calibration. A perfectly calibrated model, which captures the distribution of the observed data, would match the 1:1 line. All three models underestimate uncertainty to some extent, meaning they are overconfident in their predictions. Of the three, DCVDSR has the most well-calibrated uncertainty.
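The calibration check can be sketched as follows: assuming a Gaussian predictive distribution at each pixel (our simplifying assumption for this sketch), count how often the observed value falls in the central q-probability interval and compare that frequency to q. The synthetic data below make the "well calibrated" and "overconfident" cases explicit.

```python
import numpy as np
from math import erf, sqrt

def calibration_curve(y, mu, sigma, levels=np.linspace(0.1, 0.9, 9)):
    """Observed frequency of y inside the central q-probability interval
    mu +/- z_q * sigma of a Gaussian predictive distribution, per level q.
    A well-calibrated model gives frequency ~ q (the 1:1 line)."""
    def z_for(q):
        # two-sided standard-normal quantile via bisection, using
        # P(|Z| <= z) = erf(z / sqrt(2)); avoids a scipy dependency
        lo, hi = 0.0, 10.0
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if erf(mid / sqrt(2)) < q:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)
    freqs = np.array([np.mean(np.abs(y - mu) <= z_for(q) * sigma)
                      for q in levels])
    return levels, freqs
```

When predicted sigmas are too small relative to the actual residuals, every observed frequency falls below its level, i.e. the curve sits under the 1:1 line, which is the overconfidence pattern reported for all three emulator models.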
Fig. 7. (a) Plot of pixel intensity versus uncertainty reveals increasing uncertainty with increasing surface brightness. (b) Uncertainty calibration evaluates the frequency of observed values (y-axis) within predicted probability ranges (x-axis).

C. Performance
The spatiotemporal resolution and spatial extent of AHI scans result in the generation of over 50 TB of TOA reflectance per year. In this section we consider the nontrivial computing time necessary to retrieve surface reflectance from these scans. To evaluate the deep learning emulator models, we assess inference from one forward pass (static network) and ten stochastic forward passes (Bayesian sampling network). A single inference with the static network is sufficient to produce SR and cloud products; Bayesian sampling produces the same with uncertainty quantification.

Processing speeds are presented in Table IV. Emulator inference is evaluated on one GPU, while MAIAC, accelerated using precomputed look-up tables, is run on one CPU. Among the compared emulator models, processing speed decreases with increasing complexity. Inference with Bayesian sampling is generally comparable in speed to MAIAC, while inference on the static network represents between a 3.75x (DCVDSR) and 6x (DCFC) speedup.
TABLE IV
PROCESSING SPEED OF MAIAC AND EMULATOR MODELS.

                 Examples per second
Model        Static network   Bayesian sampling network
MAIAC        0.40             n/a
DCFC         2.4              0.60
DCCNN        1.8              0.33
DCVDSR       1.5              0.25
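The speedups quoted in the text follow directly from the Table IV throughputs:

```python
# Static-network throughput (examples per second) from Table IV.
throughput = {"MAIAC": 0.40, "DCFC": 2.4, "DCCNN": 1.8, "DCVDSR": 1.5}

# Speedup of each emulator's static network relative to MAIAC.
speedup = {m: t / throughput["MAIAC"]
           for m, t in throughput.items() if m != "MAIAC"}
# DCFC: 6.0x, DCCNN: 4.5x, DCVDSR: 3.75x
```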
V. CONCLUSIONS
In this work we evaluate the usefulness of deep learning-based emulation to approximate the MAIAC algorithms for surface reflectance retrieval and cloud identification. Discrete-continuous Bayesian neural networks are learned to emulate MAIAC with built-in uncertainty quantification. Using the full 120° by 120° view of Himawari-8 as a broad study area, we find that predictions from the emulator models are consistent with MAIAC and robust over varied land cover types and seasons. Analysis demonstrates well-calibrated uncertainty estimates for the proposed MAIAC emulator. The ability to generate probabilistic mappings from observed data to the geophysical variables of interest has potential applications including sensitivity analysis and model assessment enabled by geolocated estimates of uncertainty.

While this paper focuses on emulation of atmospheric correction for reflected solar radiation, future work in deep learning-based approximation may be applicable to probabilistic prediction of other quantities and has potential for efficiently exploiting large volumes of satellite data.

REFERENCES

[1] G. M. Foody and P. M. Atkinson, Uncertainty in Remote Sensing and GIS. John Wiley & Sons, 2003.
[2] E. Vermote, C. Justice, M. Claverie, and B. Franch, "Preliminary analysis of the performance of the Landsat 8/OLI land surface reflectance product," Remote Sensing of Environment, vol. 185, pp. 46-56, 2016.
[3] W. Lucht, C. B. Schaaf, and A. H. Strahler, "An algorithm for the retrieval of albedo from space using semiempirical BRDF models," IEEE Transactions on Geoscience and Remote Sensing, vol. 38, no. 2, pp. 977-998, 2000.
[4] A. Lyapustin, J. Martonchik, Y. Wang, I. Laszlo, and S. Korkin, "Multiangle implementation of atmospheric correction (MAIAC): 1. Radiative transfer basis and look-up tables," Journal of Geophysical Research: Atmospheres, vol. 116, no. D3, 2011.
[5] A. Lyapustin, Y. Wang, I. Laszlo, R. Kahn, S. Korkin, L. Remer, R. Levy, and J. Reid, "Multiangle implementation of atmospheric correction (MAIAC): 2. Aerosol algorithm," Journal of Geophysical Research: Atmospheres, vol. 116, no. D3, 2011.
[6] A. I. Lyapustin, Y. Wang, I. Laszlo, T. Hilker, F. G. Hall, P. J. Sellers, C. J. Tucker, and S. V. Korkin, "Multi-angle implementation of atmospheric correction for MODIS (MAIAC): 3. Atmospheric correction," Remote Sensing of Environment, vol. 127, pp. 385-393, 2012.
[7] S. Castruccio, D. J. McInerney, M. L. Stein, F. Liu Crouch, R. L. Jacob, and E. J. Moyer, "Statistical emulation of climate model projections based on precomputed GCM runs," Journal of Climate, vol. 27, no. 5, pp. 1829-1844, 2014.
[8] P. Holden and N. Edwards, "Dimensionally reduced emulation of an AOGCM for application to integrated assessment modelling," Geophysical Research Letters, vol. 37, no. 21, 2010.
[9] P. B. Holden, N. R. Edwards, P. H. Garthwaite, and R. D. Wilkinson, "Emulation and interpretation of high-dimensional climate model outputs," Journal of Applied Statistics, vol. 42, no. 9, pp. 2038-2055, 2015.
[10] M. A. Schnorbus and A. J. Cannon, "Statistical emulation of streamflow projections from a distributed hydrological model: Application to CMIP3 and CMIP5 climate projections for British Columbia, Canada," Water Resources Research, vol. 50, no. 11, pp. 8907-8926, 2014.
[11] V. M. Krasnopolsky, M. S. Fox-Rabinovitz, and D. V. Chalikov, "New approach to calculation of atmospheric model physics: Accurate and fast neural network emulation of longwave radiation in a climate model," Monthly Weather Review, vol. 133, no. 5, pp. 1370-1383, 2005.
[12] L. Martino, J. Vicent, and G. Camps-Valls, "Automatic emulator and optimized look-up table generation for radiative transfer models," pp. 1457-1460, IEEE, 2017.
[13] J. Rivera, J. Verrelst, J. Gómez-Dans, J. Muñoz-Marí, J. Moreno, and G. Camps-Valls, "An emulator toolbox to approximate radiative transfer models with statistical learning,"
Remote Sensing , vol. 7, no. 7,pp. 9347–9370, 2015.[14] G. Cheng, P. Zhou, and J. Han, “Learning rotation-invariant convolu-tional neural networks for object detection in vhr optical remote sensingimages,”
IEEE Transactions on Geoscience and Remote Sensing , vol. 54,no. 12, pp. 7405–7415, 2016.[15] S. Basu, S. Ganguly, S. Mukhopadhyay, R. DiBiano, M. Karki, andR. Nemani, “Deepsat: a learning framework for satellite imagery,”in
Proceedings of the 23rd SIGSPATIAL international conference onadvances in geographic information systems , p. 37, ACM, 2015.[16] M. Castelluccio, G. Poggi, C. Sansone, and L. Verdoliva, “Land use clas-sification in remote sensing images by convolutional neural networks,” arXiv preprint arXiv:1508.00092 , 2015.[17] L. Mou, P. Ghamisi, and X. X. Zhu, “Deep recurrent neural networks forhyperspectral image classification,”
IEEE Transactions on Geoscienceand Remote Sensing , vol. 55, no. 7, pp. 3639–3655, 2017.[18] J. You, X. Li, M. Low, D. Lobell, and S. Ermon, “Deep gaussian processfor crop yield prediction based on remote sensing data,” in
Thirty-FirstAAAI Conference on Artificial Intelligence , 2017.[19] M. Xie, N. Jean, M. Burke, D. Lobell, and S. Ermon, “Transfer learningfrom deep features for remote sensing and poverty mapping,” in
ThirtiethAAAI Conference on Artificial Intelligence , 2016.[20] S. Zhu, B. Lei, and Y. Wu, “Retrieval of hyperspectral surface reflectancebased on machine learning,”
Remote Sensing , vol. 10, no. 2, p. 323, 2018.[21] Y. Chen, K. Sun, C. Chen, T. Bai, T. Park, W. Wang, R. R. Nemani,and R. B. Myneni, “Generation and evaluation of lai and fpar productsfrom himawari-8 advanced himawari imager (ahi) data,”
Remote Sensing ,vol. 11, no. 13, p. 1517, 2019. [22] L. Ma, Y. Liu, X. Zhang, Y. Ye, G. Yin, and B. A. Johnson, “Deep learn-ing in remote sensing applications: A meta-analysis and review,”
ISPRSjournal of photogrammetry and remote sensing , vol. 152, pp. 166–177,2019.[23] C. Shen, “A transdisciplinary review of deep learning research and itsrelevance for water resources scientists,”
Water Resources Research ,vol. 54, no. 11, pp. 8558–8593, 2018.[24] T. Vandal, E. Kodra, S. Ganguly, A. Michaelis, R. Nemani, and A. R.Ganguly, “Deepsd: Generating high resolution climate change projec-tions through single image super-resolution,” in
Proceedings of the 23rdacm sigkdd international conference on knowledge discovery and datamining , pp. 1663–1672, ACM, 2017.[25] C. Chen, Y. Knyazikhin, T. Park, K. Yan, A. Lyapustin, Y. Wang,B. Yang, and R. Myneni, “Prototyping of lai and fpar retrievals frommodis multi-angle implementation of atmospheric correction (maiac)data,”
Remote Sensing , vol. 9, no. 4, p. 370, 2017.[26] Y. Wang, Y. Tian, Y. Zhang, N. El-Saleous, Y. Knyazikhin, E. Vermote,and R. B. Myneni, “Investigation of product accuracy as a functionof input and model uncertainties: Case study with seawifs and modislai/fpar algorithm,”
Remote Sensing of Environment , vol. 78, no. 3,pp. 299–313, 2001.[27] Y. Kang and M. Krarti, “Bayesian-emulator based parameter identifica-tion for calibrating energy models for existing buildings,” in
Buildingsimulation , vol. 9, pp. 411–428, Springer, 2016.[28] I. Vernon, J. Liu, M. Goldstein, J. Rowe, J. Topping, and K. Lindsey,“Bayesian uncertainty analysis for complex systems biology models:emulation, global parameter searches and evaluation of gene functions,”
BMC systems biology , vol. 12, no. 1, p. 1, 2018.[29] Y. Gal and Z. Ghahramani, “Dropout as a bayesian approximation:Representing model uncertainty in deep learning,” in internationalconference on machine learning , pp. 1050–1059, 2016.[30] J. M. Agency,
Himawari-8/9 Himawari Standard Data User’s GuideVersion 1.2 , 2015.[31] D. Sulla-Menashe and M. A. Friedl, “User guide to collection 6 modisland cover (mcd12q1 and mcd12c1) product,”
USGS: Reston, VA, USA ,2018.[32] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhut-dinov, “Dropout: a simple way to prevent neural networks from overfit-ting,”
The journal of machine learning research , vol. 15, no. 1, pp. 1929–1958, 2014.[33] Y. Gal, J. Hron, and A. Kendall, “Concrete dropout,” in
Advances inNeural Information Processing Systems , pp. 3581–3590, 2017.[34] T. Vandal, E. Kodra, J. Dy, S. Ganguly, R. Nemani, and A. R. Ganguly,“Quantifying uncertainty in discrete-continuous and skewed data withbayesian deep learning,” in
Proceedings of the 24th ACM SIGKDDInternational Conference on Knowledge Discovery & Data Mining ,pp. 2377–2386, ACM, 2018.[35] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,”
CoRR , vol. abs/1412.6980, 2014.[36] M. Feng, C. Huang, S. Channan, E. F. Vermote, J. G. Masek, and J. R.Townshend, “Quality assessment of landsat surface reflectance productsusing modis data,”
Computers & Geosciences , vol. 38, no. 1, pp. 9–22,2012.[37] A. Lyapustin, Y. Wang, I. Laszlo, and S. Korkin, “Improved cloud andsnow screening in maiac aerosol retrievals using spectral and spatialanalysis,”
Atmospheric Measurement Techniques , vol. 5, no. 4, pp. 843–850, 2012.[38] A. Kendall and Y. Gal, “What uncertainties do we need in bayesiandeep learning for computer vision?,” in