A multi-sensor data-driven methodology for all-sky passive microwave inundation retrieval
HHydrol. Earth Syst. Sci., 21, 2685–2700, 2017https://doi.org/10.5194/hess-21-2685-2017© Author(s) 2017. This work is distributed underthe Creative Commons Attribution 3.0 License.
A multi-sensor data-driven methodology for all-sky passivemicrowave inundation retrieval
Zeinab Takbiri , Ardeshir M. Ebtehaj , and Efi Foufoula-Georgiou Department of Civil, Environmental and Geo- Engineering and St. Anthony Falls Laboratory, University of Minnesota,Twin Cities, Minneapolis, Minnesota, USA Department of Civil and Environmental Engineering, University of California, Irvine, California, USA
Correspondence to:
Zeinab Takbiri ([email protected])Received: 26 October 2016 – Discussion started: 1 November 2016Revised: 14 March 2017 – Accepted: 7 May 2017 – Published: 8 June 2017
Abstract.
We present a multi-sensor Bayesian passive mi-crowave retrieval algorithm for flood inundation mapping athigh spatial and temporal resolutions. The algorithm takesadvantage of observations from multiple sensors in optical,short-infrared, and microwave bands, thereby allowing fordetection and mapping of the sub-pixel fraction of inundatedareas under almost all-sky conditions. The method relies ona nearest-neighbor search and a modern sparsity-promotinginversion method that make use of an a priori dataset inthe form of two joint dictionaries. These dictionaries con-tain almost overlapping observations by the Special SensorMicrowave Imager and Sounder (SSMIS) on board the De-fense Meteorological Satellite Program (DMSP) F17 satel-lite and the Moderate Resolution Imaging Spectroradiometer(MODIS) on board the Aqua and Terra satellites. Evaluationof the retrieval algorithm over the Mekong Delta shows thatit is capable of capturing to a good degree the inundation di-urnal variability due to localized convective precipitation. Atlonger timescales, the results demonstrate consistency withthe ground-based water level observations, denoting that themethod is properly capturing inundation seasonal patternsin response to regional monsoonal rain. The calculated Eu-clidean distance, rank-correlation, and also copula quantileanalysis demonstrate a good agreement between the outputsof the algorithm and the observed water levels at monthlyand daily timescales. The current inundation products are ata resolution of 12.5 km and taken twice per day, but a higherresolution (order of 5 km and every 3 h) can be achievedusing the same algorithm with the dictionary populated bythe Global Precipitation Mission (GPM) Microwave Imager(GMI) products.
Capturing the diurnal spatiotemporal dynamics of inundationover coastal regions, deltaic surfaces, and river floodplainsrequires high-resolution observations in both time and space,which are not available from the typical sparse ground-basedsensors. Satellite observations from the visible to the mi-crowave bands of the electromagnetic spectrum have beenwidely used for mapping floods, estimating surface waterstorages, river discharge values, and water levels (Smith,1997). In the visible bands ( (cid:24) (cid:24) (cid:24)
80) is much higher than the dry soil ( (cid:24)
4) andthus the inundated areas are substantially less emissive andradiometrically colder than the surrounding soils and vegeta-tion covers. Moreover, emission from smooth water surfacesis more polarized than that from rough soils and vegetatedsurfaces (Ulaby et al., 1982; Papa et al., 2006; Prigent et al.,2007). This polarization signal has also been used throughempirical thresholding approaches to distinguish water sur-faces from other land surface types (Allison et al., 1979; Sip-pel et al., 1994, 1998; Brakenridge et al., 2005, 2007).
Published by Copernicus Publications on behalf of the European Geosciences Union.686 Z. Takbiri et al.: A multi-sensor data-driven methodology
Flood mapping from space was first accomplished usingvisible to near-infrared (VNIR) observations (0.4–1.1 µm),by the multispectral scanner system (MSS) sensors onboard Landsat-1 (Rango and Anderson, 1974; Rango andSalmonson, 1974; McGinnis and Rango, 1975). In thesepioneering works, flooded areas were mapped where thenear-infrared surface reflectance was below a certain thresh-old as water absorption is strong in this region. More re-cently, Brakenridge and Anderson (2006) showed that thevisible red band 1 (0.62–0.67 µm) and near-infrared (NIR)band 2 (0.84–0.87 µm) from the Moderate Resolution Imag-ing Spectroradiometer (MODIS) aboard the Terra and Aquasatellites can be used to detect water over land surfaces. Theymapped several hundreds of flood events at different sitesall over the world by classification of water via threshold-ing over the NIR band and the normalized difference veg-etation index, NDVI D . NIR (cid:0) red /=.
NIR C red / introducedby Rouse et al. (1974). To better discriminate the vegetationfrom inundated areas in threshold-based methods, Ticehurstet al. (2013) and Guerschman et al. (2011) used a new in-dex – called the normalized difference water index, NDWI D . red (cid:0) MIR /=. red C MIR / introduced by Gao (1996) andlater modified to MNDWI D . green (cid:0) MIR /=. green C MIR / by Xu (2006). This index exploits the mid-infrared (MIR;1.23–1.25 µm) part of the spectrum to improve the map-ping. In all thresholding methods, the shadows of terrains andclouds are usually miss-classified as inundated areas. There-fore, Kuenzer et al. (2015) used the topography and cloudinformation data as ancillary variables to obtain improvedestimates of the interannual dynamics of areas covered withwater over five deltaic regions with high annual cloud cover.The use of passive microwaves (PMW) to map floodedareas was pioneered by Allison et al. (1979), Giddingsand Choudhury (1989), and Choudhury (1991). Allison etal. (1979) used horizontal polarization of brightness temper-atures (Tb) at 19.3 GHz, from the electrically scanning mi-crowave radiometer (ESMR) on board the Nimbus-5 satel-lite, to delineate flooded regions in Australia. Giddings andChoudhury (1989) reported the 37 GHz vertical and hori-zontal polarization differences (i.e., Tb (cid:0) Tb / , from theScanning Multi-frequency Microwave Radiometer (SMMR)on board the Nimbus-7 satellite, as the most responsive chan-nel to identify the seasonal changes in the extent of flood-plains over South America. Temimi et al. (2005) used theempirical basin wetness index (BWI) defined by Basist etal. (1998), to obtain real-time water surface fraction (WSF)in the Mackenzie River basin, using multi-frequency infor-mation at 19, 37, and 85 GHz. To minimize the contamina-tion effects of atmospheric emission and variations of surfacetemperatures, Brakenridge et al. (2007) exploited the ratio ofTb values over inundated and dry surfaces at 36 GHz andpresented promising results over several river sites all overthe globe, using the PMW observations by the AdvancedMicrowave Scanning Radiometer – Earth Observing System(AMSR-E). De Groeve (2010) also used the same method and instrument to map floods for several hundreds of loca-tions for the Global Disaster Alert and Coordination System.While visible and shortwave-infrared bands often providesub-kilometer resolution for inundation mapping, their capa-bility is very limited in a cloudy sky. This limitation is usu-ally very restrictive over prone-to-flooding watersheds anddeltas in tropical regions with a high-frequency of heavy pre-cipitation events. For instance, a long-term analysis of Land-sat data revealed that due to cloud contamination, only 30 %of overpasses are useful for inundation mapping (Melack etal., 1994). Because of this limitation, most of the relatedsatellite products, including the MODIS inundation prod-ucts, are available mostly in monthly, seasonal, and/or an-nual timescales (Ordoyne and Friedl, 2008). However, mi-crowaves can penetrate clouds – and to some extent hy-drometeors in frequencies (cid:20)
37 GHz – to provide water in-undation mapping in almost all weather conditions. Unfortu-nately, due to the coarse resolution of microwave data, e.g., . (cid:2) / km at 19 GHz to . (cid:2) / km at 183 GHz for theSSMIS), only large water bodies can be detected and sub-pixel inundated areas cannot be directly identified (Smith,1997). Presently, there exist several sensors on board dif-ferent satellites that overlap in the spatial and time domainsthat sample land–atmosphere signals at different wavelengthsof the electromagnetic spectrum. Therefore, it is impera-tive to integrate these multi-sensor observations to overcometheir individual shortcomings and improve retrievals of land–atmosphere parameters and the extent of flooded areas (Pri-gent et al., 2001, 2007; Crétaux et al., 2011; Temimi et al.,2011; Schroeder et al., 2010).In this paper, we develop a method to retrieve sub-pixelinundation fraction (“inundation” referring to regions wherewater covers the land surface, excluding permanent waterbodies) only from passive microwave observations based ona set of paired VNIR and passive microwave training sam-ples. In particular, as training samples, we use global ob-servations of VNIR data from the MODIS on board Terra(launched in 2000) and Aqua satellites (launched in 2002)and passive microwave data from the Special Sensor Mi-crowave Imager and Sounder (SSMIS) on board DefenseMeteorological Satellite Program (DMSP) satellites F16–F18. Several years of observations (2000–present) by thesetwo sensors allow us to collect adequate overlapping datato link coarse-scale SSMIS passive microwave data to high-resolution MODIS VNIR data in the form of an organizeddataset. Obviously, this collection of almost coincident ob-servations does not contain direct information about sur-face inundation in a cloudy sky, as the radiative signals inVNIR wavelengths cannot penetrate clouds. However, overland, it is well understood (see Ferraro et al., 1986; Grody,1991; Wilheit et al., 1994) that hydrometeors and the atmo-spheric profile do not significantly affect the low-frequency <
60 GHz brightness temperatures. Therefore, the informa-tion content of the dataset over low-frequency channels isindependent of the atmospheric profile and can be used to a good degree of accuracy to recover inundated surfaces un-der cloudy conditions as well. It should be acknowledgedthat there is an uncertainty for the inundation retrieval underheavy rainy/cloudy skies when only the information in theclear-sky dataset is used. However, we expect that this uncer-tainty will be small since the information of the underlyingsurfaces in low-frequency channels of the collected datasetremains almost the same over different atmospheric condi-tions.The collected dataset has a large number of linked pairsof inundation fraction data from MODIS data SSMIS multi-frequency brightness temperatures. For algorithmic devel-opment, the dataset is organized into two fat matrices: thebrightness temperature and inundation dictionaries. For anobserved pixel-level brightness temperature, the proposedpassive retrieval algorithm uses the nearest-neighbor searchto isolate a few vectors in the dictionary of brightness temper-atures and their corresponding inundation fraction and thenuse them to estimate the unknown inundation fraction. Theproposed retrieval algorithm is applied to estimate daily in-undation fraction at spatial resolution of 12.5 km over theMekong in 2015. The main motivation for selecting thisdelta as a case study is that approximately 90 % of theMekong region is covered by clouds during the rainy season(Leinenkugel et al., 2013), which severely hampers the use ofinundation mapping in the VNIR bands. We retrieve the in-undation fraction twice per day using the proposed algorithmover the Mekong Delta and compare the results with the floodproducts of VNIR data during clear skies. We also evaluatedthe results against the daily and monthly water level data ob-tained from 11 gauges over the Mekong Delta (Fig. 1) to ex-amine consistency of the retrievals with the regional inunda-tion patterns.This paper is organized as follows. Section 2 explains the apriori dataset and the formation of the dictionaries and Sect. 3provides detailed information about the retrieval algorithm.Implementation of the method and validation are explainedin Sect. 4. Section 5 presents concluding remarks and direc-tions for future research.
The 60 000 km Mekong Delta is in South Vietnam (seeFig. 1) with a tropical monsoon climate system. The deltawith its agricultural industry is one of the most importantsources of food supply to Southeast Asia. This critical regionis home to nearly 20 million people, approximately 22 % ofthe population of Vietnam, and is one of the most denselypopulated regions in the world. The area has been exposedto exacerbated erosion due to human activities and increasedsea level rise and lowland flood events in the recent decades(e.g., Syvitski et al., 2005; Ericson et al., 2006; Nicholls andCazenave, 2010; Tessler et al., 2015). Improved quantifica-tion of (near) real-time inundation of the Mekong Delta can
Figure 1.
Map and digital elevation of the Mekong River basin(area D
795 000 km ) and its delta. The study area is delineated bya pink rectangle. The 11 stations (from Mekong River Commission)that monitor the water level are also marked by pink stars. help (1) to improve flood forecasting by identifying the in-undated and thus soil saturated zones and (2) to identify ero-sional and depositional hotspots that can improve geomor-phologic and ecosystem modeling. The proposed retrieval al-gorithm is applied to estimate sub-daily inundation fractionat resolution of 12.5 km over some of the lower regions ofthe Mekong Delta in calendar year 2015 (Fig. 1).Two sources of information are used to build a datasetthat connects almost coincident VNIR water inundation dataand multi-frequency passive microwave data. The VNIR dataconsist of the daily NASA standard MODIS near-real-time(NRT) water product (MWP-3D3ON; i.e., 3-Days imagery,three observations, and no shadow masking) with approx-imately 250 m spatial resolution (Nigro et al., 2014) fromboth Terra and Aqua satellites. The Terra and Aqua satel-lites both have a sun-synchronous orbit. They rotate around the earth in opposite directions: Terra has an ascending orbitwith the local equatorial crossing time of 10:30 LT and Aquahas a descending orbit with the local equatorial crossing timeof 01:30 p.m. MWP products are binary information of in-undation based on the Dartmouth Flood Observatory (DFO)algorithm, which uses a thresholding scheme on MODIS ob-servations at band 1 (0.62–0.67 µm), band 2 (0.84–0.87 µm),and band 7 (2.10–2.15 µm). To minimize the contaminationeffects of cloud and terrain shadows, we focus on 3-day com-posite MWP products (3D3ON). Clearly, the use of the 3-daycomposite MODIS-MWP data can affect daily inundation re-trievals; however, in the context of the presented algorithmthis is the best choice because, daily MODIS-MWP compos-ites are very uncertain due to the terrain shadows and clouds(Nigro et al., 2014). Typically, there are numerous missingpixels in the daily products, which reduce the sample sizedramatically. These errors are significantly reduced in 3-daycomposite products, as it is less likely that clouds (and theirshadows) stay at the same spot during a 3 day period (Nigroet al., 2014).The microwave data are obtained from the DMSP SSM/I-SSMIS Pathfinder Daily Equal-Area Scalable Earth Grid(EASE-Grid; see Armstrong and Brodzik, 1995) brightnesstemperatures distributed by the National Snow and Ice DataCenter (NSIDC). These datasets are at four central frequen-cies 19, 22, 37, and 91 GHz. All channels are vertically andhorizontally polarized except channel 22 GHz. The effec-tive resolution of the highest frequency channel is (cid:24) (cid:24)
25 km. DMSP SSM/I-SSMIS brightness temperaturedata products are from observations by the SSM/I and SSMISradiometer on board the DMSP F8, 11, 13, or 17. Since De-cember 2006, the F17 satellite has been the only operationalsatellite from the DMSP series, which carries on board theSSMIS instrument with equatorial crossing times of 05:30–06:30 a.m. and 17:30–18:30 p.m. for the descending and as-cending orbits, respectively. It is important to note that be-cause these satellites revisit every point on Earth at the samelocal time, repeatedly, the paired MODIS-MWP with DMSPSSMIS data have a fixed diurnal time difference in the entiredataset. Since the MODIS-MWP data are from the combina-tion of Terra and Aqua observations, their time tag is advan-tageous in the sense that it allows us to enrich the number ofsamples for the diurnal cycle of inundation dynamics.The first step for building the a priori dataset is to matchthe different space–time resolutions of the multi-sensor infor-mation. To unify the spatial resolution of the microwave data,the brightness temperatures of the three lower-frequencychannels are mapped onto the latitude–longitude grids ofthe high-frequency channel of 91 GHz with a resolution (cid:24) f that represents the ra- Figure 2.
A schematic showing construction steps of the a prioridataset for dictionaries. The top slab is the upscaled MODIS-MWPand the other slabs are the brightness temperature data at seven fre-quency bands. Each vector on the left is created by stacking a pixel-level information of the multi-frequency brightness temperatures bythe SSMIS radiometer and their corresponding inundation fractionsfrom the MWP product at 12.5 km resolution. This process is re-peated for each orbit to generate a large number of vectors and formseparate dictionaries for ascending and descending orbits using allsatellite overpasses in 5 years from 2010 to 2014. N D n (cid:2) m is thenumber of collected vectors for 1 day in a year. The same process isconducted for each day in 5 years (2010–2014) to create the dictio-naries with M D (cid:2) P i D N i vectors. tio of the number of inundated sub-pixels to the total numberof sub-pixels within a pixel size of 12.5 km. For matchingthe timescales of Tb and MWP values, the Tb values are av-eraged over a 3-day time window to minimize the possibleeffects of cloud contamination in the VNIR data. Figure 2demonstrates schematically the process of producing the ex-plained dataset. The proposed retrieval algorithm uses the link betweentwo available coincidental datasets, passive microwave (SS-MIS) and VNIR (MODIS-MWP), to retrieve inundation inthe cloudy days. First, the overlapped clear-sky pixels ofMODIS-MWP and SSMIS for 5 years (2010–2014) are col-lected over the study area to create two coincidental dictio-naries: the SSMIS dictionary and the MODIS-MWP dictio-nary. The SSMIS dictionary consists of 8-dimensional vec-tors of brightness temperature (Tb), where 8 is the number offrequency channels, and the MODIS-MWP dictionary con- sists of scalar values of inundation fractions for each corre-sponding pixel in Tb. In other words, the inundation frac-tion for each Tb in the brightness temperature dictionaryis known. The algorithm uses the information embedded inthese two dictionaries to estimate the unknown inundationfractions for each Tb observation vector. First, it searches thebrightness temperature dictionary to find the K most similarvectors in the Euclidean sense to the Tb observation vectorthrough the K -nearest-neighbors algorithm. Then, for these K -nearest-neighbors, the corresponding known scalar valuesin the inundation fraction dictionary are picked. If the ratioof the number of inundated vectors in K -nearest neighborsis greater than a threshold (which will be explained later),this pixel is called inundated and the algorithm goes to theestimation step. In the estimation step, the coefficients thatcan optimally estimate the Tb observation vector based on its K -nearest neighbors are calculated through a least-squaresregularization approach. Those coefficients are then used tolinearly combine the K known inundation fractions that areassociated with the neighboring Tb vectors for calculatingthe unknown inundation fraction. The above detection andestimation steps are repeated for each orbit at a pixel level of12.5 km over the study area. The algorithm is mathematicallydescribed in what follows.To organize the dataset in an algebraically tractable man-ner, M vectors of microwave brightness temperatures b i D . Tb i ; Tb i ; : : :; Tb ni / T R n at n frequency channels are col-lected. These vectors form the column space of an n -by- M matrix B D [ b j b j : : : j b M ] R n (cid:2) M , called a brightnesstemperature dictionary, where M (cid:29) n . Analogously, the cor-responding inundation fraction values f f i g Mi D can be col-lected in the column space of the inundation dictionary F D (cid:2) f j f j : : : j f M (cid:3) R (cid:2) M . For each vector b i in the dictionaryof brightness temperatures there is an inundation fraction f i from MODIS-MWP. The collection of these pairs from his-torical observations forms the two dictionaries B and F . Thealgorithm follows two sequential steps: a detection and anestimation step. In the detection step, for each observed vec-tor of brightness temperature b obs , the algorithm first finds its K -neighboring brightness temperatures in B in the Euclideansense and stores them in the column space of B s R n (cid:2) K .Then, knowing the column indices of the neighboring bright-ness temperatures, it isolates their corresponding inundationfraction values in F s R (cid:2) K . In this step, if at least p (cid:2) K number of nearby inundation fraction values in F s are non-zero, the algorithm assumes that b obs is over an inundatedpixel and attempts to estimate the fraction of inundation inthe estimation step. Here, p . (cid:0) / is the detection proba-bility parameter. It should be also noted that the K -nearest-neighbor algorithm in this paper does not directly constrainits search to any specific time or location. In other words, forevery pixel-level vector of Tb, the K -nearest-neighbors algo-rithm searches the entire dictionary regardless of any specifictime or spatial coherency. In the estimation step, the method assumes that b obs can beestimated by a linear combination of a few column vectors of B s as follows: b obs D B s c C e ; (1)where the vector c R K contains a set of representationcoefficients to be estimated and e R n is the error vector.Clearly, for an observed vector of brightness temperatures b obs , the goal is to estimate its unknown inundation fractionvalue O f . We assume that the two paired dictionaries B s and F s represent similar manifolds in a geometric sense that theirlocal structures can be approximated well with the same lin-ear model. This allows us to assume that the representationcoefficients in vector c from Eq. (1) can be used to estimatethe inundation fraction O f as follows: O f D F s c : (2)As a result, using a classic-weighted least-squares method,the representation coefficients c can be estimated as O c D argmin c n k W (cid:16) b obs (cid:0) B s c k (cid:17)o ; (3)where W is a weight matrix (to be discussed later in thissection) that characterizes the importance of each channel inthe retrieval scheme. The number of K -nearest neighbors isoften larger than the number of frequency channels, k (cid:29) n ,making B s a rank-deficient matrix and the above problem ill-posed. To make the optimization problem (Eq. 3) well-posed,we use a mixed ‘ – ‘ -norm regularization as follows: c D Argmin c n k W . b obs (cid:0) B s c / k C (cid:21) k c k C (cid:21) k c k o subject to c ; T c D ; (4)which has been successfully used for passive microwaveprecipitation retrievals (Ebtehaj et al., 2015a, b). The non-negativity of the coefficients assures positivity of the bright-ness temperatures and the sum-to-one constraint enforcesan unbiased estimation. The regularization involves both the ‘ -norm k c k D K P i D j c i j and the ‘ -norm k c k D . K P i D j c i j / .The parameters (cid:21) and (cid:21) in Eq. (4) are regularization param-eters that enforce a trade-off between the two regularizations ‘ and ‘ . In this mixed regularization, the ‘ -norm leveragessparsity in the solution (i.e., forces some of the elements of c to be zero) while the ‘ -norm increases the stability of thesolution as the neighboring brightness temperatures in B s arelikely to be highly correlated (see Zou and Hastie, 2005). Ineffect, due to the use of a mixed regularization, this regu-larization promotes group sparsity (i.e., some blocks of therepresentation coefficients are zero) while it keeps the solu-tion sufficiently stable. In other words, it acknowledges thefact that there are a few clusters of nearby brightness tem-peratures that can properly explain the observation. By en-forcing the ‘ -norm we select vectors that are parts of clus-ters of brightness temperatures, while the ‘ -norm handles Yes Yes
Is the number of inundated pixels within the Knn greater than p K × ? Find K -nearest neighbors of obsi b from B to create s B i = N Find the K corresponding i f from F to create F s ˆ f = 0 ( ) { } Argmin λ λ − + + sc W b B c c c ˆ ˆ s F c i f ≈ i = N No End No
Create B and F dictionaries Input Tb images at 7 frequency Channels that has N pixels (number of obs b vectors) Input fixed parameter: , , , , K p λ λ α
Input coincidental s f and Tbs Start Figure 3.
Flowchart of the inundation retrieval algorithm for N pix-els in each orbit, where Knn stands for the K -nearest neighbor. Seetext for definitions of the notations and detailed explanation. the potential correlation between those clustered neighborsand makes the problem sufficiently stable. The proposed al-gorithm is summarized in a flowchart shown in Fig. 3.As previously noted, in the current implementation ofthe proposed retrieval algorithm, we focus on (almost) co-incidental observations of the brightness temperatures andinundation fractions by the SSMIS and MODIS instru-ments, respectively. The dictionaries B and F are constructedusing 5 years of overlapping data (2010–2014) over theMekong Delta (latitude: 0–10 (cid:14) N and longitude: 100–110 (cid:14)
E)at 12.5 km grid resolution (Fig. 1).To build the dictionary, only the clear-sky MODIS-MWPproducts were considered. At resolution 12.5 km, we labeleda pixel as clear-sky when less than 50 % of the VNIR dataat resolution 250 m is flagged as non-cloudy. Because theMODIS sensor has a much higher resolution than the foot-print of SSMIS and because the number of cloud-free sam-ples over the Mekong are very limited, a threshold above zerois deployed to keep a certain number of partially cloudy pix-els and make sure that the dictionary will not be undersam- pled. For choosing the threshold, we conducted some sensi-tivity analysis (not shown here) and found a 50 % threshold,as a fair probability choice, results in a minimum of potentialbiases.Since the DMSP satellites have two different equatorialcrossing times, here, we use two sets of dictionaries for Tbvalues in the ascending (day or morning) and descending(night or evening) orbits. From all the available coincidentobservations, we randomly chose 2 (cid:2) pairs of brightnesstemperatures and inundation fractions in each ascending anddescending dictionary. The purpose of stratifying the dictio-naries into ascending and descending orbits is to exclude theeffects of Tb modulations from the retrieval process causedby the systematic diurnal variation of surface temperature. Inother words, the same inundation fraction has different PMWspectral signature in a daytime versus a nighttime overpasslargely due to the diurnal variability of skin temperature, pre-cipitation, and soil moisture (see Mears et al., 2002; Ramageand Isacks, 2003; Norouzi et al., 2012). Figure 4a presentsthe systematic difference between the Tbs of the ascendingversus descending tracks for various ranges of pixel-levelinundated fractions. In effect, in this figure, the Tbs in thedictionaries are grouped into five intervals based on theircorresponding inundation fraction (from 0 to 1) in F . Thenfor each interval, the average of Tb values is shown. Theplot clearly demonstrates that the daytime Tbs are thermallywarmer than their nighttime counterparts and this differencebegins to shrink when the inundation fraction increases. Itis worth noting that the difference between ascending anddescending brightness temperatures is larger over the low-frequency channels ( (cid:20)
37 GHz) as they respond more to theland surface structural variability than the higher-frequencychannels that capture atmospheric signatures. Figure 4b de-picts j Tb A (cid:0) Tb D j where Tb A and Tb D stands for ascendingand descending overpasses, respectively. It can be observedthat high values of j Tb A (cid:0) Tb D j depict the coastlines, i.e.,regions with the transient presence and/or absence of waterover land.The probability of detection, p . (cid:0) / , determines if apixel is inundated or not if the number of inundated vectorsin K -nearest neighbors is (cid:21) p (cid:2) K . We found that the in-undation detection with K (cid:21)
50 gives a reasonable rate forthe probability of hit and false alarms. In other words, theprobability of detection does not change significantly for alarger number of nearest neighbors. In the estimation step, tocharacterize the weight matrix W R n (cid:2) n , we used the coef-ficients of variation of each channel in response to changesin the inundation fraction (see Fig. 5). In other words, we as-sume that those channels that exhibit more variability withrespect to changes in inundation fraction contain more infor-mation about inundation and shall be given more weight inthe estimation process. One might ask why it is important toconsider the high-frequency channels (e.g., 91 V, H GHz) de-spite the fact that they show minimal sensitivity to the inun-dation fraction (Fig. 5) and land surface emissivity compared Figure 4. (a)
The systematic difference between passive microwave observations from the ascending (solid lines) and descending orbits(broken lines) as a function of five different sub-pixel intervals of inundation fractions. (b)
July to December daily average of absolutedifferences between the ascending (Tb A / and descending (Tb D / brightness temperatures at vertically polarized 19 GHz channel. The valuesof j Tb A (cid:0) Tb D j mainly capture the coastal regions with significant variability in their surface emissivity values due frequent diurnal tidaleffects. Figure 5.
The normalized coefficients of variation (right panel) of the brightness temperatures (Tb) (left panel) averaged over the entiredataset for different intervals of inundation fractions. Here, N Tb denotes the average of brightness temperatures over the inundation fractions.The coefficients of variation of each channel are used to determine the channel weights for the retrieval algorithm. Channels 19 H GHz and37 V GHz are the most responsive channels to the variability of inundation fraction and are given higher weights. to lower-frequency channels. The high-frequency channelsmainly capture the information content of the atmosphericprofile. Therefore, incorporating them in the proposed re-trieval framework allows us to indirectly consider the effectof atmospheric conditions by narrowing down the search for K -nearest neighbors to those Tb candidates that best matchboth the underlying land surface emissivity and the atmo-spheric conditions.For implementation of the algorithm, the regularizationparameters are set as (cid:21) D (cid:21). (cid:0) (cid:11)/ and (cid:21) D (cid:11)(cid:21) , where (cid:11) . ; / . Here, through cross-validation studies, through cross-validation we empirically found that (cid:21) D :
001 and (cid:11) D : The inundation fractions were estimated during the wet pe-riod of calendar year 2015 from July to December when thewater levels across the delta begin to rise and eventually re-cede (see Fig. 6). The wet season of the region is largelycharacterized by heavy precipitation as a result of the inter-actions of two monsoons including the Indian monsoon andthe East Asia–western North Pacific summer monsoon (Del-gado et al., 2012).To study the performance of the detection step we com-puted the probability of hit
P . O f > j MWP > / and falsealarm P . O f > j MWP D / of the algorithm outputs. Ouranalysis indicates that the probability of hit is around 0.92for both the dry and wet season, demonstrating the capabil-ity of the algorithm in detecting the inundated areas. How-ever, the probability of false alarm is around 0.12 for the Figure 6.
Inundated map of the Mekong Delta in the wet (July–December) and dry (January–June) seasons for the ascending orbits. Theresults of the proposed retrieval algorithm are presented using the ascending dictionary (top row) against the upscaled MODIS near-real-time(NRT) water product (MWP) data (bottom row). Overall, a good agreement is observed with some overestimation of inundated areas by theproposed algorithm compared to MODIS-MWP data around the river banks. dry season and reaches the value of 0.34 for the wet sea-son, which might be due to the generalization of the algo-rithm and MODIS missing data during the wet season. TheMODIS daily data, especially in the wet season, contain alarge number of missing values due to cloud blockages andfrequent heavy rains over the study area. In fact, while wewere collecting the overlapping data for constructing the dic-tionaries, we observed that over 88 % of the MWP productshave some missing portion in the 12.5 km resolution. As aresult, it is very likely that the MWP data underestimate theactual inundation fraction of regions with prolonged precipi-tation events.Figure 6 shows that the algorithm is capable of identifyinghotspots of inundation when its outputs are compared withthe MODIS-MWP; however, the algorithm slightly overesti-mates the inundation fractions for some pixels farther fromthe coastlines, most of which are completely dry in MWP.Here for brevity, we only show the results for ascendingoverpasses, while similar spatial patterns are observed fordescending overpasses. Figure 6 also shows some overesti-mation of inundation fractions near the riverbanks of ma-jor rivers. This might be due to the high soil moisture con-tent ( (cid:21) y axis that have corresponding zero inundatedpixels in MODIS-MWP data. However, in January to June,when there are fewer clouds, the inundation fractions fromthe proposed algorithm are generally more correlated withthe MODIS-MWP data but slightly underestimated (Fig. 7b).This underestimation probably also exists in wet months but Figure 7.
Scatterplots of daily inundation fractions ( f / from the retrieval algorithm against those from MODIS-MWP in wet (a) and dryseasons (b) shown in Fig. 6. The scatterplots demonstrate larger inundation fractions from the retrieval algorithm in July to December (a) compared to MODIS-MWP data. However, in January to June, when there are fewer clouds, the inundation fractions from the proposedalgorithm are more correlated with the MODIS-MWP data, with only a slight underestimation of their variability. is masked because of the all-sky retrieval capability of theproposed algorithm in the presence of the clouds and heavyrains. The reason for this underestimation might stem fromthe choice of 50 % threshold for selecting the clear-sky pix-els. In other words, there are a set of brightness tempera-tures for which the corresponding MODIS data are partiallycloudy and potentially underestimate the actual inundationfraction. As a result, it is likely that those pairs will be iso-lated, used in the retrievals, and eventually lead to some un-derestimation in passive microwave retrievals.As previously mentioned, the interannual climatology ofthe Mekong Delta is highly affected by two tropical mon-soons that characterize the seasonal patterns of precipitation,river stages, and water levels (Delgado et al., 2012). To betterunderstand whether the results of our retrieval algorithm fol-low the regional climatology, the monthly percentage of in-undated area over the Mekong Delta is calculated and shownagainst the monthly water level data in Fig. 8a. The monthlywater level data are obtained by averaging over all 11 stationsshown in Fig. 1. The specific goal is to compare the monthlyvariability of the algorithm outputs with the MWP productsand investigate whether they are consistent with the regionalvariations of the surface water level (river stage), which isconsidered a surrogate for the extent of inundation. It shouldbe acknowledged that this approach is not a direct validation;however, it can provide insight into the performance and cli-matological consistency of the proposed model as the surfacewater level data are positively correlated with the extent ofthe inundated surfaces.The seasonal variations in the monthly percentage of thetotal inundated surfaces from the proposed model follow thetrend of monthly water level data better than the standardMWP products (Fig. 8a). We can see that during the wetmonths of June to November, the MWP data report muchless inundated area than the outputs of the proposed algo-rithm, whereas this pattern is reversed during the dry months of January to March. As previously noted, we suspect thatthe differences in the wet season are due to the large portionof missing data in the MWP products because of the highcloud coverage in the rainy season (Fig. 8b). For quantita-tive comparison of the outputs of the algorithm with MWP, aEuclidean distance between normalized version of the algo-rithm outputs and water level data is calculated and comparedwith its MWP counterpart. The Euclidean distance betweenwater level and the retrieved inundation from ascending anddescending orbits is 3.46 and 3.56, respectively, while thisdistance for MWP and water level data is about 7.89, which ismore than twice the distances calculated from the microwaveretrieval results. This indicates the superior performance ofthe proposed inundation fraction retrievals as compared tothe MWP products, chiefly because of its all-sky skills dur-ing the rain dominant seasons.When is compared to MODIS-MWP, the inundated areaobtained by the retrieval algorithm in the dry months(Fig. 8a) shows some underestimation. One reason for thisunderestimation is the general limitation of the empiricalBayesian estimation method regarding the extreme events(see Coles and Powell, 1996, and the references therein) andwe suspect that it is not just limited to the months of Jan-uary to March but it affects the retrievals at other months toa lesser extent, as well. This limitation arises by the samplescarcity of large flooding scenarios during the warm monthsof the year, which probably lead to the underestimation of in-undation fractions related to those events by our retrieval al-gorithm. We expect that by improving the representativenessof the dataset – especially for extreme events in the summermonths – this shortcoming can be significantly improved.A closer look at Fig. 8a also reveals slightly larger in-undated surfaces in each month for the ascending (eveningoverpasses) compared to the descending (morning over-passes) tracks. This small difference between the ascendingand descending retrievals can be attributed to the expected
Figure 8.
The monthly inundated areas of the Mekong Delta calculated from the proposed retrieval algorithm and MODIS near-real-time(NRT) water product (MWP) data in comparison with ground-based monthly water level data. (a)
Comparison of the total inundated surfaceof the Mekong Delta from MWP products and the retrieval algorithm from ascending and descending dictionaries. From visual inspection, itis obvious that the retrieval algorithm can better follow variations of the water levels compared to MWP. More inundation over the dry seasonis reported by MWP products than the wet season, which contradicts the causality between rivers’ stages and the extent of inundated areas. (b)
The total fraction of land surface areas that are labeled as missing in MWP product because of atmospheric contaminations. The largerdeviations of the MWP products from water level data during the wet months might be attributed to the larger percentage of missing values. diurnal patterns of the precipitation over the Mekong Delta.Indeed, it is well documented (Gupta 2005) that localizedconvective precipitation events are more likely during theevening, which can increase the extent of the inundated ar-eas. To further assess the proposed algorithm performanceat a daily scale, we compare the dependence of the totalarea of ascending daily inundation fractions of the algorith-mic outputs with the average daily water level data, usingSpearman’s rank-correlation coefficient. The rationale is thata stronger rank correlation of an inundation product with thewater level data implies an improved retrieval. The correla-tion coefficient between the daily water level of the rivers andthe total inundated surfaces of the Mekong Delta is equal to0.22, which drops to (cid:0)
In this paper, we introduced a methodology to retrieve in-undation from space for almost all-sky conditions to reducethe gaps that exist in using satellite data in visible to mi-
Figure 9.
The empirical copula (joint probability distribution of quantiles) of the average daily water level and total daily inundated areasfrom the proposed retrieval algorithm (red curves) and MODIS-MWP data (black curves) for 2015. These plots indicate that our productshave stronger dependence to water levels than the MWP products (more L-shaped curves) for both the downstream (a) and upstream (b) regions of the Mekong Delta. The shaded areas (which quantify the difference between the degree of dependence of our products and theMWP products to the daily water levels) are larger in the upstream region, indicating an enhanced performance of the proposed algorithm toretrieve inundation fraction where potential inundation areas are better defined due to topography, e.g., around major riverbanks. crowave bands. The key idea of the proposed method was toexplore the links between overlapping daily high-resolutionobservations in the visible and near-infrared bands fromthe MODIS and the lower-resolution passive microwave ob-servations from the Special Sensor Microwave Imager andSounder (SSMIS) sensor. The developed multi-frequency in-undation retrieval algorithm uses the K -nearest matchingmethod in conjunction with a sparsity-promoting regulariza-tion technique. The proposed method demonstrated promis-ing results in resolving the spatial patterns of inundation,compared with the MODIS-MWP data. Over the monthswith high cloud coverage, the monthly results are consistentwith the seasonal dynamics of water level variation, which iscontrolled by tropical monsoons in the Mekong Delta. Anal-ysis also showed that, at a daily timescale, the outputs of thealgorithm exhibit stronger dependency with the water leveldata than the MWP data.There were three major sources of uncertainty in the pro-posed retrieval model in this paper. The first one related to theuse of the 3-day composite MODIS-MWP data (daily prod-ucts of MODIS-MWP were avoided due to missing valuesand cloud blockages), which might have introduced somebias in the daily retrievals due to mismatch of timescales.This source of error can be significantly reduced if theMODIS dictionary is populated with more accurate dailyproducts. The second source of error related to the lack ofadequate fully clear-sky samples in our dictionary and there-fore the need to define a cloud coverage threshold in orderto increase the sample size. Using partially cloudy MODISdata was the main reason for some observed underestimationof inundation fractions, especially in the dry months (Figs. 7and 8), which can be mitigated by increasing the sample size.The last source of error was more related to the general lim-itation of the Bayesian estimation method regarding the re- trieval of extreme events (see Coles and Powell, 1996, andreferences therein). This limitation is due to scarcity of largefloods in the dictionary, which can be treated by adding morescenarios of extreme events to the dataset from different ge-ographic locations.One of the limitations of the proposed algorithm (becauseof the spatial resolution of microwave data used in this pa-per) was its lack of information about the spatial patterns ofinundation within the 12.5 km pixels. The spatial pattern ofthe estimated inundation fractions can be further enhancedby using the guidance of a high-resolution topographic data(see Galantowicz, 2002). The database can also expand to in-clude some high-resolution cloud-free imageries from newlylaunched satellites, such as Sentinel-2, which can aid in cap-turing the high-resolution inundation areas. Finally, expand-ing the dictionary to include data from the passive microwavechannels of the new satellites, such as Global PrecipitationMission (GPM) Microwave Imager (GMI), will increase thespatial resolution of the retrievals to approximately 5 km. Inthis paper, the seasonality and also different land surfaceclasses have not been directly taken into account in the re-trieval algorithm. Future research should include the stratifi-cation of the dictionary based on different land surface typesand time periods (e.g., seasons). Code availability.
MATLAB code available at ftp://ebtehaj.safl.umn.edu/Codes/ShARP_Demo/. The dictionary of overlappedMODIS-MWP and SSMIS and also the resulting database are pub-licly available upon request (email to [email protected]). The datathat have been used to create the dictionary are available at https://floodmap.modaps.eosdis.nasa.gov/ (NASA Goddard’s HydrologyLaboratory, 2016) and ftp://sidads.colorado.edu/pub/DATASETS/nsidc0032_ease_grid_tbs/global/ for or MODIS-MWP and SSMISdata, respectively.
Let X and X denote two random variables with marginalcumulative distributions F .x / (cid:17) P T X (cid:20) x U and F .x / (cid:17) P T X (cid:20) x U with the cumulative joint distribution function F .x ; x / (cid:17) P [ X (cid:20) x ; X (cid:20) x ]. According to the Sklar’stheorem (Nelsen, 1999), the cumulative joint distribution F .x ; x / of X and X is equal to the cumulative joint dis-tribution function C.u ; u / of the quantiles u D F .x / and u D F .x / by F .x ; x / D P [ X (cid:20) x ; X (cid:20) x ] D P h X (cid:20) F (cid:0) .u /; X (cid:20) F (cid:0) .u / i (cid:17) C [ U (cid:20) u ; U (cid:20) u ] D C .u ; u ; / (A1) where C .u ; u / , is the cumulative copula with uniformmarginal random variables F .x / and F .x / on the inter-val T ; U . The multivariate density function f .x ; x / , if itexists, can be calculated by taking the derivative of C and F ,which results in the following: f .x ; x / D c.u ; u / (cid:1) f .x / (cid:1) f .x / D c .F .X /; F .X // (cid:1) f .x / (cid:1) f .x /: (A2)It shows the copula density function c.u ; u / separates thejoint distribution function f .x ; x / from its marginal prob-ability distribution functions f .x / and f .x / ; therefore, itcan capture the probabilistic dependence between the tworandom variables x and x by quantifying the strength ofthe relationship between their corresponding quantiles. Table B1.
Acronyms and abbreviations.SSMIS Special Sensor Microwave Imager and SounderSSM/I Special Sensor Microwave ImagerDMSP Defense Meteorological Satellite ProgramMSS Multispectral scanner systemVNIR Visible to near infraredMODIS Moderate Resolution Imaging SpectroradiometerNIR Near infraredMIR Mid-infraredPMW Passive microwavesESMR Electrically scanning microwave radiometerSMMR Multi-frequency Microwave RadiometerBWI Basin wetness indexWSF Water surface fractionAMSR-E Advanced Microwave Scanning Radiometer - Earth Observing SystemNRT Near-real timeNSIDC National Snow and Ice Data CenterDFO Dartmouth Flood ObservatoryMODIS-MWP MODIS near-real-time (NRT) water productCDF Cumulative probability function M Number of vectors of microwave brightness temperatures B k Number of nearest neighbors B Brightness temperature dictionary f Inundation fraction F Inundation dictionary b obs Observed vector of brightness temperature K Number of nearest neighbors B s Sub-dictionary of BF s Sub-dictionary of F c Vector of representation coefficients O f Estimated inundation fraction W Weight matrix n Number of frequency channels p Detection probability . ; /‘ & ‘ Regularizations norms (cid:21) & (cid:21) Regularization parameters
Competing interests.
The authors declare that they have no conflictof interest.
Acknowledgements.
This work was supported by the NASA GlobalPrecipitation Measurement Program under grants NNX13AG33Gand NNX16AO56G. It was also partially supported by NSF underthe Belmont Forum DELTAS project (EAR-1342944) and the LIFEproject (EAR-1242458). The MODIS-MWP data over the MekongDelta were kindly provided by Dan Slayback from the NASAGoddard Space Flight Center. The first author would like to thankProfessor Robert Brakenridge for his advice on this research duringthe AGU Fall Meeting 2015.Edited by: M. McCabeReviewed by: two anonymous referees
References