Robust Rational Polynomial Camera Modelling for SAR and Pushbroom Imaging
Roland Akiki, Roger Marí, Carlo de Franchis, Jean-Michel Morel, Gabriele Facciolo
RRobust Rational Polynomial Camera Modellingfor SAR and Pushbroom Imaging
Roland Akiki , Roger Marí Carlo de Franchis , Jean-Michel Morel Gabriele Facciolo Université Paris-Saclay, CNRS, ENS Paris-Saclay, Centre Borelli, France Kayrros SAS
Abstract
The Rational Polynomial Camera (RPC) model canbe used to describe a variety of image acquisition sys-tems in remote sensing, notably optical and SyntheticAperture Radar (SAR) sensors. RPC functions relate3D to 2D coordinates and vice versa, regardless ofphysical sensor specificities, which has made them anessential tool to harness satellite images in a genericway. This article describes a terrain-independent al-gorithm to accurately derive a RPC model from a setof 3D-2D point correspondences based on a regular-ized least squares fit. The performance of the methodis assessed by varying the point correspondences andthe size of the area that they cover. We test the algo-rithm on SAR and optical data, to derive RPCs fromphysical sensor models or from other RPC modelsafter composition with corrective functions.
Developing a remote sensing application requires aset of tools, one of which is geolocation. Geolocationrelates the 3D world coordinates to the 2D image.This is represented by means of a projection function P : R → R , that maps 3D points to the imageplane, and its inverse, the localization function L : R × R → R . When all the physical phenomenaand components involved in the acquisition processare known, the geolocation functions can be definedby a chain of operations that model such factors, in ( r, c ) ( X, Y, Z )RPC P projection localizationRPC L Figure 1: The RPC model is derived using a grid of3D CNPs (Control Points) and its projection ontothe satellite image.what is known as a physical or rigorous sensor model.Pushbroom scanners are the most common opti-cal satellite image acquisition system, typically con-sisting of a single line of pixel sensors mounted ona platform that captures each line of the image ata different moment in time. As a result, the ex-terior orientation parameters, i.e. the perspectivecenter and the attitude angles, change from line toline. The intrinsic parameters (e.g. pixel size, focallength, lens distortion), related to the physical de-sign of the sensor, are constant across the image [1].A detailed description of a simplified physical sensormodel for pushbroom scanners can be found in [2]. Inthe case of Synthetic Aperture Radar (SAR) images,the most used physical sensor model is the Range-1 a r X i v : . [ ee ss . I V ] F e b oppler model detailed in [3]. SAR satellites send anelectromagnetic wave that is reflected on the ground.The image is acquired line by line (similar to a push-broom system), and the position of a ground patchin the image is related to its distance to the sensor,known as the range . The Range-Doppler model isconstructed based on ephemeris data (time, positionand velocity samples along the orbit) and the acqui-sition timing information. The ephemeris data needsto be interpolated to obtain continuous geolocationfunctions along the orbit.Image vendors have adopted the generic RationalPolynomial Camera (RPC) model to save customersfrom having to deal with the complex specificities ofrigorous sensor models. The RPC model is indepen-dent of physical properties and offers flexibility towork with different coordinate systems. RPCs havebecome essential metadata to process satellite imagesin a generic way, from different sources and for multi-ple tasks, e.g. photogrammetry and radargrammetrybased 3D reconstruction or image ortho-rectificationand coregistration.In this article, we describe a terrain-independentalgorithm to fit a RPC model from a physical sensormodel or any other geolocation model. Our contri-butions are:- An open-source implementation of the methodas an easy-to-use Python package, which is avail-able at https://github.com/cmla/rpcfit .- An evaluation of the algorithm’s precision androbustness based on real scenarios. We test ourmethod using Sentinel-1 and WorldView-3 im-ages, to fit a SAR physical sensor model or cor-rect an existing RPC model by composing it witha complementary transformation. The RPC model defines the projection function P as r n = a ( X n , Y n , Z n ) b ( X n , Y n , Z n ) c n = e ( X n , Y n , Z n ) f ( X n , Y n , Z n ) , (1) Also referred to as Rational Polynomial Coefficients cam-era model or Rational Function Model in the literature. where a, b, e, f are cubic polynomials.
X, Y, Z repre-sent the longitude, latitude and height of a 3D point;and r, c are the row and column of its projection onthe image plane.Equation 1 uses normalized coordinates for bet-ter numerical stability, hence the subscript n . Nor-malized values are in the range [-1, 1] and they areobtained using two scalars, an offset and a scale fac-tor: X n = ( X − X offset ) /X scale , where X could be r, c, X, Y or Z from Equation 1.Each RPC polynomial p is defined by 20 coeffi-cients as p ( X, Y, Z ) = p + p Z + p Y + p X + p ZY + p ZX + p Y X + p X + p Y + p Z + p ZY X + p Z Y + p Z X + p Y Z + p Y X + p ZX + p Y X + p Z + p Y + p X , (2)where p i is the i -th coefficient of p . Since we set p = 1 for the RPC denominator polynomials, a totalof 78 coefficients need to be determined to define a , b , e and f in Equation 1. RPCs have been used for high-resolution opticalsatellite imaging since the launch of Ikonos in 1999[1, 4–6]. In the last decade, they have been proven tobe extremely accurate for SAR acquisition systemsas well [7, 8].The RPC model of a satellite image can be con-structed using a set of correspondences between im-age and object space coordinates. Depending on thenature of these correspondences, the literature canbe classified into terrain-dependent or independentmethods (or a combination of both). Terrain-dependent strategies use Ground Control Points(GCPs), whose object and image coordinates areknown in advance relying on manual labeling or on-site measurements. Oppositely, terrain-independentmethods derive virtual sets of 2D-3D point corre-spondences from other geolocation functions, usuallya physical sensor model. Once the point correspon-dences are available, least squares algorithms are typ-ically used to estimate the RPC coefficients that min-2mize the error between the projected 3D points andtheir image locations.Several works have underlined the importanceof using uniformly distributed points in sufficientamount, covering the different parts of the image andthe whole altitude range of the scene [4, 6]. As a re-sult, regularized least squares methods have becomewidely used to gain robustness to different configura-tions and enforce well-conditioned normal equations[4, 6, 8, 9]. Additionally, terrain-dependent strategieshave explored the selection of optimal and balancedsubsets of GCPs, e.g. [4] propose a bucketing strategyor [9] study the benefits of encouraging correspon-dences located at building edges in urban scenarios.In contrast, terrain-independent methods can arbi-trarily generate regular sets of points, but requirespecial care to the boundaries and density of thestructure, e.g. [8] investigate the impact of differentnumber of elevation layers for flat and mountainousareas.
We follow a terrain-independent approach similar to[4] to fit an RPC model to another input geoloca-tion model. The data used to fit the model consistsof a 3D grid of uniformly distributed Control Points(CNPs) within some longitude, latitude and altitudeboundaries (Fig. 1). The 2D image point of eachCNP can be obtained by projecting it with the inputgeolocation model, so that each sample results in 5normalized values, i.e. ( X i , Y i , Z i , r i , c i ) . For simplic-ity, we drop the subscript n of normalized coordinatesand replace it by i to refer to the sample index.Using N CNPs, Equation 1 can be rewritten asa system of equations, in matrix form, following thederivation of [4]:
W T I − W G = 0 , (3)where W = diag (cid:20) b ( X ) , ..., b ( X N ) , f ( X ) , ..., f ( X N ) (cid:21) b ( X i ) = b ( X i , Y i , Z i ) ,T = block diag [ M r , M c ] M r , M c ∈ R N × M r i = (cid:2) , Z i , Y i , ..., X i , − r i Z i , − r i Y i , ..., − r i X i (cid:3) M c i = (cid:2) , Z i , Y i , ..., X i , − c i Z i , − c i Y i , ..., − c i X i (cid:3) ,I = [ a , ..., a , b , ..., b , e , ..., e , f , ..., f ] T ,G = [ r , ..., r N , c , ..., c N ] T . In Equation 3, W is a weight matrix with shape N × N , b ( X i ) denotes the RPC polynomial b eval-uated with the 3D coordinates of the i -th CNP; T is the design matrix with shape N × ; I is thesolution vector with the 78 RPC coefficients neces-sary to determine a , b , e and f in Equation 1; and G is a vector of length N containing the CNPs imagecoordinates.Equation 3 can be solved by least squares mini-mization to estimate I , using the normal equation T T W T I − T T W G = 0 . (4)To increase numerical stability, ridge estimation reg-ularization [8, 9] is often added so that the normalequation becomes ( T T W T + h E ) I − T T W G = 0 , (5)where E is the identity matrix and h is a scalar con-trolling the regularization that is applied. To choosethe best regularization factor h , the L-curve crite-rion was introduced in [8]. This heuristic computesthe log norm of the solution (log (cid:107) I (cid:107) h ) versus the lognorm of the residual (log (cid:107) W T I − W G (cid:107) h ) across dif-ferent values of h that extend from the minimal tothe maximal singular value of T. This curve usuallyhas a L-shape, in which the optimum corresponds tothe maximum regularization parameter that achievesa small residual. The value corresponding to the cor-ner of the curve, at the position of maximal curvature,is taken to set h automatically. For non-weightedregularized least squares (i.e. weights are set to theidentity, W = E ), the L-curve criterion is fast sincethe curvature can be computed with closed form ex-pressions [10].Therefore, we first set W (0) = E (where the super-script denotes the iteration number) and use the L-curve criterion to determine the optimal h and an ini-tial solution I (0) . Then, for i ≥ , W ( i ) is determined3rom I ( i − and is plugged in Equation 5 to solve for I ( i ) iteratively (the SVD least squares solver is usedfor stability). The iterations stop when the changein terms of RMSE between the RPC projected CNPsand their image coordinates becomes lower than atolerance value. After convergence some final ICCV(Iteration by Correcting Characteristic Value [8]) it-erations are computed to remove possible biases in-troduced by the regularization. The same stoppingcriterion based on the RMSE improvement is used.Each ICCV iteration k can be expressed as ( T T ( W ( k ) ) T + E ) I ( k ) = T T ( W ( k ) ) G + I ( k − . (6) • SAR.
32 Sentinel-1 SAR images (Table 1). Thedata is in Interferometric Wide Swath (IW) mode,each product contains 3 subswaths, and each sub-swath contains multiple bursts that need to bestitched together to get a continuous image. Weconstruct the Range-Doppler physical sensor modeland use the method from Section 3 to fit an equiv-alent RPC model for each image of the dataset.•
Optical.
47 WorldView-3 panchromatic images(Table 1), from the 2016
IARPA Multi-View Stereo3D Mapping Challenge [11]. The original RPCs ofthe images exhibit small inaccuracies, mainly dueto inexact knowledge of the satellite attitude an-gles, which cause a 3D point to be projected to noncorresponding pixels across different images. Bun-dle adjustment (BA) algorithms are a well-knownapproach to correct RPC errors [5, 12, 13]. We ap-ply a BA similar to [12] to correc the projectionfunction P of each RPC into a new P BA , expressedas P BA ( X ) = P ( R ( X − T − C ) + C ) , (7)where X is a 3D point. That is, each RPC is cor-rected by applying a translation T followed by arotation R around an approximate camera center C , before applying the original projection P . C is SAR dataset Optical dataset platform S1 A/B WorldView-3number of images 32 47geographic area Albania Argentina(lon, lat) center (18.82,41.02) (-58.61,-34.47)first acquisition date 2019-08-03 2014-11-15last acquisition date 2020-02-05 2016-01-13altitude range (m) [ − , − , Table 1: SAR and optical data used in the exper-iments. The altitude range of the area covered byeach collection of images is defined using the [min,max] values from the corresponding SRTM digital el-evation model ± m to consider tall buildings orfine irregularities beyond bare ground level.derived by regressing a projective model from eachRPC model. Our method from Section 3 is used tofit P BA from the composition of P with T, R , C . To apply the method described in Section 3, the al-titude limits of the grid of CNPs are set using thealtitude ranges in Table 1. To assess the RPC fit-ting, we use a grid of Check Points (CKPs), whichare located in the middle of each pair of consecutiveCNPs. The RMSE between the image coordinatesobtained by projecting the CKPs with the outputRPC and the image coordinates obtained using theinput geolocation model, measured in pixels, is usedas evaluation metric. We set a tolerance of − for the stopping criterion based on the RMSE im-provement, and a maximum of iterations for theweighted least squares and the ICCV iterations.Two type of experiments were conducted to assessthe performance and robustness of the method:- Varying surface area . For each image, we fit dif-ferent RPCs by gradually increasing the longi-tude and latitude limits of the grid of CNPs froma small square centered at the image center to alarger square including the entire image. Thenumber of CNPs is fixed, with × samplesin the longitude and latitude dimensions and 10elevation layers (25000 points in total).4 surface area (km ) X R M S E ( p x )
1e 4 0 5000 10000 15000 20000 surface area (km ) Y R M S E ( p x )
1e 4 (a) S A R
10 20 30 40 50 60 70 80 90 100 grid length X R M S E ( p x )
1e 4 10 20 30 40 50 60 70 80 90 100 grid length Y R M S E ( p x )
1e 4 (b) surface area (km ) X R M S E ( p x )
1e 6 0 50 100 150 200 surface area (km ) Y R M S E ( p x )
1e 4 (c) O p t i c a l
10 20 30 40 50 60 70 80 90 100 grid length X R M S E ( p x )
1e 6 10 20 30 40 50 60 70 80 90 100 grid length Y R M S E ( p x )
1e 3 (d)Figure 2: RPC fitting error varying grid length and surface area on the SAR (a-b) and optical data (c-d)described in Section 4.1. Each vertical bar corresponds to the [ − σ/ , µ, σ/ values of the root-mean-squareerror (RMSE) evaluated across the different images of the datasets, in each dimension of the image plane,where µ corresponds to the mean and σ to the standard deviation.- Varying grid length . For each image, we fit dif-ferent RPCs, by increasing the number of CNPsin each elevation layer, i.e. n × n where n is the grid length . The longitude and latitude bound-aries of the grid are fixed using the equivalentlimits of the image plane.Overall, the results presented in Fig. 2 show that theRPCs constructed with our method approximate thegeolocation models with very high accuracy for thetwo datasets and the two scenarios outlined in Sec-tion 4.1. The RMSE is in the order of − pixels orless in both dimensions of the image plane for the ma-jority of configurations that were tested, which em-phasizes the robustness of the method.The experiments with different grid lengths(Fig. 2b and 2d) show that 10 samples in the longi-tude and latitude directions is already a good choiceand increasing this number beyond 20 does not re-sult in significant improvements. The experimentswith varying surface area (Fig. 2a and 2c) show thatboth the overall RMSE values and its variation acrossthe different images increase with the size of the areabeing fitted. This is probably due to the fact thatthe SAR physical sensor model is less smooth forlarge neighborhoods. A similar behavior is obtainedwith the optical dataset, where the original RPCs areknown to behave locally as an affine camera [5, 13]. This article described an automatic algorithm to fitthe RPC model of a satellite image in a terrain-independent manner. The inputs of the method area regular grid of 3D points (CNPs), with multiple el-evation layers, and the 2D locations of the points onthe image plane. We evaluated the method on realscenarios using collections of SAR and optical satel-lite images, and assessed its performance by varyingthe CNPs configuration. Finally, we release an open-source implementation of the algorithm as an easy-to-use Python package.
Acknowledgements
Work partly financed by IDEX Paris-Saclay IDI2016, ANR-11-IDEX-0003-02, Office of Naval re-search grant N00014-17-1-2552 and N00014-20-S-B001, DGA Astrid project « filmer la Terre »n o ANR-17-ASTR-0013-01, MENRT, and by a grantfrom Région Île-de-France.
References [1] J. Grodecki, “IKONOS stereo feature extraction- RPC approach,” in
ASPRS Annual Confer- nce , 2001.[2] C. de Franchis, E. Meinhardt-Llopis, D. Gres-lou, and G. Facciolo, “Attitude refinement fororbiting pushbroom cameras: a simple polyno-mial fitting method,” Image Processing On Line ,vol. 2015, pp. 328–361, 2015.[3] J.C. Curlander, “Location of spaceborne SARimagery,”
TGRS , vol. GE-20, no. 3, pp. 359–364, 1982.[4] C.V. Tao and Y. Hu, “A comprehensive study ofthe rational function model for photogrammet-ric processing,”
Photogrammetric Engineering &Remote Sensing , vol. 67-12, 2001.[5] C.S. Fraser, G. Dial, and J. Grodecki, “Sensororientation via RPCs,”
ISPRS , vol. 60, no. 3,pp. 182–194, 2006.[6] T. Long, W. Jiao, and G. He, “RPC estimationvia (cid:96) -norm regularized least squares (L1LS),” TGRS , vol. 53, no. 8, 2015.[7] G. Zhang, W. Fei, Z. Li, X. Zhu, and D. Li,“Evaluation of the RPC model for spaceborneSAR imagery,”
Photogrammetric Engineering &Remote Sensing , vol. 76-6, pp. 727–733, 2010.[8] L. Zhang, X. He, T. Balz, X. Wei, and M. Liao,“Rational function modeling for spaceborne SARdatasets,”
ISPRS , vol. 66, no. 1, pp. 133–145,2011.[9] Y. Wang, Y. Zhang, and N. Su, “RPC estimationvia feature points for urban areas,” in
IGARSS ,2016, pp. 6684–6687.[10] P.C. Hansen and D.P. O’Leary, “The use of theL-curve in the regularization of discrete ill-posedproblems,”
SIAM Journal on Scientific Comput-ing , vol. 14, no. 6, pp. 1487–1503, 1993.[11] M. Bosch, Z. Kurtz, S. Hagstrom, andM. Brown, “A multiple view stereo benchmarkfor satellite imagery,” in
AIPR Workshop , 2016,pp. 1–9. [12] R.A. Beyer, O. Alexandrov, and S. McMichael,“The Ames Stereo Pipeline: NASA’s open sourcesoftware for deriving and processing terraindata,”
Earth and Space Science , vol. 5, no. 9,pp. 537–548, 2018.[13] R. Marí, C. de Franchis, E. Meinhardt-Llopis,and G. Facciolo, “To bundle adjust or not:A comparison of relative geolocation correctionstrategies for satellite multi-view stereo,” in