Despeckling Sentinel-1 GRD images by deep learning and application to narrow river segmentation

Abstract

This paper presents a despeckling method for Sentinel-1 GRD images based on the recently proposed framework "SAR2SAR": a self-supervised training strategy. Training the deep neural network on collections of Sentinel-1 GRD images leads to a despeckling algorithm that is robust to space-variant spatial correlations of speckle. Despeckled images improve the detection of structures like narrow rivers. We apply a detector based on exogenous information and a linear features detector and show that rivers are better segmented when the processing chain is applied to images pre-processed by our despeckling neural network.



Nicolas Gasnier †, Emanuele Dalsasso †, Loïc Denis ‡, Florence Tupin †
† LTCI, Télécom Paris, Institut Polytechnique de Paris, Palaiseau, France
‡ Univ Lyon, UJM-Saint-Etienne, CNRS, Institut d'Optique Graduate School, Laboratoire Hubert Curien UMR 5516, F-42023, SAINT-ETIENNE, France


Index Terms: SAR, denoising, deep learning, water detection, Sentinel-1

1. INTRODUCTION

Synthetic Aperture Radar (SAR) images are very useful for various fields related to Earth observation, thanks to the all-time, all-weather capability and short revisit time of a SAR system. For applications such as river monitoring that do not need interferometric information from the SLC images, Sentinel-1 GRD IW HD (Ground Range Detected, Interferometric Wide swath, High definition, Dual polarization) images provided by ESA are very popular because of their wide availability and ease of use compared to other SAR data.

However, they are affected by strong speckle fluctuations, with an ENL (equivalent number of looks) of 4.4. Their exploitation can therefore be challenging, as they are difficult to interpret. To analyze these images, speckle is commonly treated as a source of noise, and denoising techniques are often applied as pre-processing. Removing speckle from GRD images is delicate, especially given the spatially variable correlation of speckle, so specific techniques are needed. Among the state-of-the-art techniques, self-supervised deep learning approaches learn the speckle model directly from noisy images and can therefore provide high-quality results, even in the presence of spatially correlated speckle, where other despeckling methods produce strong artifacts.

One can ask whether such a denoising step can be beneficial even for approaches that have been designed to be robust to speckle noise. In this paper, we present a modified river detection method that is based on a first denoising step adapted from SAR2SAR [1], followed by a detection performed on the denoised images using a method that relies on exogenous information to guide the river detection [2]. Section 2 presents the rationale behind the proposed method, and experimental results are outlined in Section 3. Finally, some conclusions are drawn in Section 4.

2. METHOD

The proposed method consists of two steps: the first step filters the speckle from the GRD images using an adaptation of SAR2SAR to Sentinel-1 GRD images. The second step segments the narrow rivers on these filtered images.

A speckled intensity image $w$ can be related to the reflectivity $v$ through a multiplicative model $w = v \times u$, where $u$ is the speckle component. The measured $w$ is thus subject to strong fluctuations, which are signal-dependent. In order to stabilize these fluctuations, it is common to apply a homomorphic transform (i.e., a log transform), which turns the speckle into an additive term $s$:

$$y = x + s, \tag{1}$$

following a Fisher-Tippett distribution:

$$p(s) = \frac{L^L}{\Gamma(L)} \, e^{Ls} \exp(-L e^{s}), \tag{2}$$

where $y$ and $x$ respectively correspond to the log-intensity and log-reflectivity, and $L$ is the number of looks. Inspired by the noise2noise approach [3], it is demonstrated in [1] that, given several pairs of noisy samples $(y', y)$ drawn under the same conditional distribution, when training a neural network using:

$$\mathcal{L}\{f_\theta(y'), y\} = -\log p(y \mid f_\theta(y')) = f_\theta(y') - y + \exp(y - f_\theta(y')), \tag{3}$$

the training is asymptotically equivalent to a supervised training with the (unknown) log-reflectivities $x$:

$$\operatorname*{argmin}_{\theta} \; \mathbb{E}_{X, Y \mid X} \left[ \mathcal{L}\{f_\theta(y), x\} \right]. \tag{4}$$

This makes it possible to train a network using co-registered SAR images acquired at different dates.

Fig. 1. Summary of the self-supervised training of the SAR2SAR algorithm.

Following the same principle outlined in [1], a U-Net [4] network is first trained on noise-free reference images on which speckle noise with $L = 4$ is synthetically created. In this way, pairs of GRD images are simulated in a realistic way and the network is trained using only noisy images (step A). In order to learn specific characteristics of GRD images, in particular the spatially varying correlations, the network is fine-tuned on co-registered stacks of GRD images (step B). To compensate for changes occurring between images $y_1$ and $y_2$, the target image is replaced by $y_2 - \hat{x}_2 + \hat{x}_1$. The estimates of the reflectivities $\hat{x}_1$ and $\hat{x}_2$ are obtained thanks to network A.
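The loss of Eq. (3) and the change-compensated target can be illustrated with a minimal, dependency-free sketch. It is not the authors' training code: the function names and the toy values are hypothetical, and images are represented as flat lists of log-domain pixel values. Note that Eq. (3) coincides with the negative logarithm of Eq. (2) for $L = 1$; for a general $L$ the negative log-likelihood differs only by the positive factor $L$ and an additive constant, which does not change the minimizer.

```python
import math


def fisher_tippett_nll(s, looks):
    """Negative log of Eq. (2), evaluated at the log-speckle value s."""
    log_p = (looks * math.log(looks) - math.lgamma(looks)
             + looks * s - looks * math.exp(s))
    return -log_p


def sar2sar_loss(prediction, target):
    """Eq. (3) averaged over pixels: f(y') - y + exp(y - f(y'))."""
    terms = [p - t + math.exp(t - p) for p, t in zip(prediction, target)]
    return sum(terms) / len(terms)


def change_compensated_target(y2, x2_hat, x1_hat):
    """Fine-tuning target y2 - x2_hat + x1_hat, pixel-wise in the log domain."""
    return [y - b + a for y, b, a in zip(y2, x2_hat, x1_hat)]


# Toy log-domain values (hypothetical numbers, not Sentinel-1 data).
prediction = [0.2, -1.0, 0.5]
target = [0.4, -1.3, 0.5]
print(sar2sar_loss(prediction, target))

# The second date is shifted back toward the first date's estimated reflectivity.
print(change_compensated_target([3.0, 1.0], [2.5, 0.5], [1.0, 0.5]))  # [1.5, 1.0]
```

For a single pixel, the loss reaches its smallest value, 1, when the prediction equals the target, which is why it can be used directly as a regression objective in the log domain.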
Since network A was trained only on decorrelated speckle, Sentinel-1 images are downsampled by a factor 2 to reduce the impact of speckle correlations. The outputs of network A are then up-sampled by a factor 2 to produce the estimates $\hat{x}_1$ and $\hat{x}_2$. After this first fine-tuning step on real data, the network learned at step B is used to produce more accurate estimates $\hat{x}_1$ and $\hat{x}_2$ in the change compensation formula (the downsampling is no longer necessary since network B, directly trained on Sentinel-1 images, is robust to speckle correlations). A last network is obtained at the end of this refinement: network C.

The goal of the second step of our method is to get an accurate segmentation of the rivers in the image. To achieve this, we adapted a framework [2] that first uses a linear features detector and exogenous information to retrieve the river centerline. The river is then segmented around the centerline using a specific conditional random field (CRF) approach.

The exogenous information on the river consists of control points and can be found in prior databases such as Global River Widths from Landsat (GRWL) [5], in which the river centerlines are stored as sets of nodes. Such information can be of great help to distinguish rivers from other dark linear structures that are present in Sentinel-1 images, like large roads. However, due to discrepancies between the database centerline and the actual rivers on the image, both in shape and position, these river centerlines from the database cannot be directly used to segment the river, hence the need for a repositioning stage.

The linear features map is computed using a linear features detector based on the Generalized Likelihood Ratio Test (GLRT) [6]. As this detector is agnostic to the number of looks $L$, no modification was needed to adapt it to denoised GRD images.

As the linear features detected in the previous step can correspond to actual rivers, but also to roads, relief artifacts, or even to false-positive detections caused by corner reflectors, the second stage uses exogenous information from a river database to determine the shape and the position of the river centerline in the image. This stage is identical to its counterpart in [2] and consists in detecting the centerline as the least-cost path between two nodes that belong to the same river in the exogenous database and are a few kilometers apart. The cost array is computed from the previously computed linear features detector response using the same parameter $N_{pow} = 10$ as for noisy GRD images, with $D_{max}$ being the maximum value of $D$ in the image.

The last stage of our method consists in segmenting the river around the centerline obtained in the previous step, using a conditional random field (CRF) approach adapted from [2], with a simplified expression. The CRF energy is defined as the sum of a data term, a term that forces centerline pixels to be classified as water, and a regularization term, all of which depend on the class (land or river) of each pixel. The data term depends on the denoised image intensity $I$ and on the class: for the river class, the data term is quadratic: $(\log(I) - R_{\log})^2$, where $R_{\log}$ is the estimated log-reflectivity of the water. This distribution accounts for the fluctuations of river pixel intensities caused by the remaining speckle, if any, and also for the fluctuations of the river reflectivities caused by varying roughness of the water surface. The data term for noisy SAR image processing usually derives from a gamma distribution (to model speckle fluctuations).
For noisy images, variations of intensity caused by the spatial evolution of water reflectivity are considered negligible compared to those caused by speckle. This is no longer true when considering despeckled images. We therefore prefer a quadratic data term to account for spatial changes in water reflectivities. In the absence of a model for the land class and in order to prevent a bias toward it, the data term for the land class is set constant and proportional to the mean value of the data term, computed on all pixels of the centerline of the river, excluding the highest values.

We keep the same regularization term as in the method [2]: it consists in a weighted total variation penalization where weights are inversely proportional to the magnitude of the spatial gradient of the SAR image. This penalizes transitions between river and land, except where the gradient is strong. The orientation of the gradient is also considered, so that the segmentation aligns with an edge only if its direction is right, i.e., corresponds to a river-to-land transition.
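The two data terms described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of [2]: the proportionality constant (`scale`), the number of excluded highest values (`drop_highest`), the way $R_{\log}$ is chosen, and the toy intensities are all hypothetical, since the text does not specify them. The final loop labels pixels from the data terms alone and ignores the centerline constraint and the regularization term of the full CRF energy.

```python
import math


def river_data_term(intensity, r_log):
    """Quadratic river-class data term (log(I) - R_log)^2 for one pixel."""
    return (math.log(intensity) - r_log) ** 2


def land_data_term(centerline_intensities, r_log, scale=1.0, drop_highest=1):
    """Constant land-class term: scaled mean of the river term on the centerline.

    `scale` and `drop_highest` are hypothetical parameters: the text only says
    "proportional to the mean value" and "excluding the highest values".
    """
    terms = sorted(river_data_term(i, r_log) for i in centerline_intensities)
    kept = terms[:max(1, len(terms) - drop_highest)]
    return scale * sum(kept) / len(kept)


# Toy denoised intensities along a centerline; the last one is an outlier
# (for instance a bridge crossing the river).
centerline = [0.010, 0.012, 0.011, 0.009, 0.200]
r_log = math.log(0.010)  # assumed water log-reflectivity
land_cost = land_data_term(centerline, r_log)

# Data-term-only labeling of two pixels (no regularization, no centerline term).
for intensity in (0.0105, 0.15):
    river_cost = river_data_term(intensity, r_log)
    label = "river" if river_cost < land_cost else "land"
    print(intensity, label)
```

Excluding the highest values keeps a few bright centerline pixels from inflating the land cost, which would otherwise bias the segmentation toward the river class.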

3. RESULTS

In this section, we present the detection results obtained by applying the proposed approach to various Sentinel-1 GRD HD images with narrow rivers. To quantitatively assess the performance of the proposed method and to compare it with the original approach that does not use a denoising step, we compute three metrics using a manually defined ground truth: Recall (Rec), which is the proportion of actual water pixels that are classified as water; Precision (Pre), which is the proportion of actual water among all the pixels classified as water; and F-score, which is the harmonic mean of the precision and the recall.

The quantitative results are presented in Table 1, with a comparison with the baseline method [2] that does not use any denoising. For each metric and each image, the best result is in bold font. The comparison shows that the proposed approach improves the detection result over the baseline for all images but one (Redon), for which the F-scores are very close.

The results of the different steps of the method for a crop of image 3 (Gaoual) that shows the confluence between the Tomine and Koumba rivers near Gaoual (Guinea) are presented in Figure 2. Figure 2.D shows the ratio between the noisy image crop and its denoised counterpart, illustrating the very good despeckling performance of the neural network: no noticeable structure is present in the ratio image and the denoised image is very smooth. For a more extensive analysis of the denoising results, the reader can refer to our Git repository.
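The three metrics defined in this section can be computed from two binary masks as in the short sketch below. The function names and the toy masks are hypothetical and masks are flattened to lists of 0/1 labels (1 for water). As a sanity check, the three values that survive from Table 1 (90.92, 83.98, 87.31) are consistent with the harmonic-mean definition, assuming they are the recall, precision and F-score of a single row.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2.0 * precision * recall / total if total else 0.0


def segmentation_scores(predicted, truth):
    """Return (recall, precision, f_score) for two binary water masks."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)      # water found
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)  # false alarms
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)  # missed water
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision, f_score(precision, recall)


# Toy masks: one false alarm, no missed water pixel.
predicted = [1, 1, 0, 0]
truth = [1, 0, 0, 0]
print(segmentation_scores(predicted, truth))

# Values given in percent work the same way.
print(round(f_score(83.98, 90.92), 2))  # 87.31
```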

4. CONCLUSION

The approach presented in this paper combines and adapts recently proposed methods to improve narrow river detection. First, the SAR2SAR denoising approach that was originally developed for S1 single-look images has been re-trained with S1 GRD HD time series, learning the speckle model and its non-uniform spatial correlation directly from real data. This adapted method gives denoising results of high quality: it suppresses speckle noise while preserving fine structures without introducing notable artifacts. This is crucial for many remote sensing applications, among which narrow river detection. The images denoised with SAR2SAR-GRD are used as input for a modified narrow river detection method. Qualitative and quantitative experiments have shown that a preliminary denoising step significantly improves the results of the adapted river detection approach, compared to the original method working on noisy data.

In the light of these conclusions, the code of SAR2SAR-GRD is made publicly available, to let the remote sensing community benefit from its use to face the numerous challenges in Earth observation applications.

5. REFERENCES

[1] E. Dalsasso, L. Denis, and F. Tupin, "SAR2SAR: a self-supervised despeckling algorithm for SAR images," arXiv preprint arXiv:2006.15037, 2020.
[2] N. Gasnier, L. Denis, R. Fjørtoft, F. Liège, and F. Tupin, "Narrow river extraction from SAR images using exogenous information," HAL Preprint, 2020.
[3] J. Lehtinen, J. Munkberg, J. Hasselgren, S. Laine, T. Karras, M. Aittala, and T. Aila, "Noise2noise: Learning image restoration without clean data," arXiv preprint arXiv:1803.04189, 2018.
[4] O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," in MICCAI. Springer, 2015.
[5] G. H. Allen and T. M. Pavelsky, "Global extent of rivers and streams," Science, 2018.
[6] N. Gasnier, L. Denis, and F. Tupin, "Generalized likelihood ratio tests for linear structure detection in SAR images," in EUSAR 2021, 2021.

https://gitlab.telecom-paris.fr/RING/SAR2SAR

Table 1. Comparison of results between the proposed method and the baseline method (columns: Image, Baseline method, Proposed method). Surviving values: 90.92, 83.98, 87.31.

Fig. 2.