A rasterized ray-tracer pipeline for real-time, multi-device sonar simulation
Rômulo Cerqueira a,c,∗, Tiago Trocoli a, Jan Albiez b, Luciano Oliveira c
a Brazilian Institute of Robotics, SENAI CIMATEC, Salvador, Bahia, Brazil
b Kraken Robotik GmbH, Bremen, Germany
c Intelligent Vision Research Lab, Federal University of Bahia, Salvador, Bahia, Brazil
Abstract
Simulating sonar devices requires modeling complex underwater acoustics while simultaneously rendering time-efficient data. Existing methods focus on a basic implementation of one sonar type, where most of the sound properties are disregarded. In this context, this work presents a multi-device sonar simulator capable of processing an underwater scene by a hybrid pipeline on GPU: Rasterization computes the primary intersections, while only the reflective areas are ray-traced. Our proposed system launches few rays when compared to a full ray-tracing based method, achieving a significant performance gain without quality loss in the final rendering. The resulting reflections are then characterized as two sonar parameters: Echo intensity and pulse distance. Underwater acoustic features, such as speckle noise, transmission loss, reverberation and material properties of observable objects, are also computed in the final generated acoustic image. Visual and numerical performance assessments demonstrated the effectiveness of the proposed simulator in rendering underwater scenes in comparison to real-world sonar devices.
Keywords:
Acoustic images, Imaging sonar simulation, Rasterization, Ray-tracing, Multipath propagation, Underwater robotics.

∗ Corresponding author: Rômulo Cerqueira.
Email addresses: [email protected] (Rômulo Cerqueira), [email protected] (Tiago Trocoli), [email protected] (Jan Albiez), [email protected] (Luciano Oliveira)
Preprint submitted to arxiv.org January 13, 2020

1. Introduction

The number of underwater structures in the offshore industry has significantly increased over the last decades, and so has the need for monitoring, inspection and intervention of these structures (Wang et al., 2018). Since autonomy is necessary to reduce mission expenses, the offshore industry has been leading the development of autonomous underwater vehicles (AUVs) to accomplish the main field tasks. With a pre-programmed mission and onboard sensors, AUVs are able to make completely autonomous decisions, returning to the surface only for servicing.

The accomplishment of AUV tasks demands dealing with challenges inherent to the undersea environment. For instance, beneath the water, optical cameras are affected by turbidity and lighting conditions, thus restricting the image quality to short visible ranges. On the other hand, imaging sonars take advantage of the low attenuation of sound waves in order to cover larger areas than those covered by optical cameras, although producing noisy data with low resolution.

AUV real-world experimentation is challenging, mainly due to the human resources, time consumption and hazards involved in deploying and testing the underwater vehicles in the target domain. While initial experiments can be performed in water tanks (e.g., low-level control and basic prototyping), high-level tests require trials in deep open waters (e.g., way-point navigation, mapping and autonomous control). An unexpected behavior of an AUV may result in unrecoverable equipment, causing a considerable financial loss. This way, the simulation of underwater sensors and reproducible environments is essential to cope with insufficient data, as well as to develop effective algorithms before tests in the wild.

To contribute to the development of underwater acoustic-based systems, this paper introduces a novel simulator able to reproduce the operation of different sonar devices.
Rendering of a virtual scene is accelerated by a selective rasterization and ray-tracing scheme on GPU, where the computational resources are allocated only to reflective regions. Subsequently, the resulting reflections are converted to the acoustic scene representation on CPU, including several phenomena present in the sonar images.
By considering the complexity of the process of transmitting sound through the water, several mathematical and computational models have been proposed to approximate the calculation of acoustic propagation (Etter, 2018). Ray-based methods are the most common solutions to simulate underwater sonar systems (Bell & Linnett, 1997; Guériot et al., 2007; Gu et al., 2013; Kwak et al., 2015; DeMarco et al., 2015; Saç et al., 2015; Mai et al., 2018; Soares, 2016), although other approaches can also be considered (Coiras & Groen, 2009; Cerqueira et al., 2017; Gwon et al., 2017). All simulation methods try to mimic one or more types of sonar devices.
Side scan sonar (SSS) simulation: Bell & Linnett (1997) presented a simulator for SSS imagery based on optical ray-tracing, where a group of rays is projected to insonify the scene and produce the acoustic data; fractal models are used to represent the surface roughness of the seafloor; stochastic influences such as noise and reverberation are neglected in that work. Instead of propagating many individual rays, Guériot et al. (2007) developed a volume-based approach with a tube tracing technique; the tubes are composed of four rays, which intersect a certain area to allow computing the backscattered energy; the few launched rays optimized the sonar rendering, while surface details and transmitted signal characteristics are suppressed. By using a frequency domain-based method, Coiras & Groen (2009) produced frames from a virtual SSS by using the Fourier transform; the returned intensity relies on the angle of incidence applied to a basic Lambert illumination model; physical effects, such as noise and multi-path returns, are considered, although the method was not designed to operate online. With a simplified Lambert diffusion model, Gwon et al. (2017) generated SSS data integrated with the UWSim simulator and the ROS framework; acoustic frames are degraded with speckle (low frequency) and Rayleigh (high frequency) noises; although the performance of feature matching methods decays in images containing multiplicative noise, due to the variance to intensity and affine changes, the authors applied the SIFT, SURF, ORB and AKAZE algorithms to evaluate the similarity between two consecutive frames, obtaining a very low number of inliers for all feature extractors.
Forward-looking sonar (FLS) simulation: Gu et al. (2013) modeled an FLS system, where the rays are comprised of basic lines, equivalent to the number of pixels of the sonar image to be emulated; the reflection representation is severely reduced to only three colors (black, gray and white). Kwak et al. (2015) improved Gu et al.'s method by introducing a sound attenuation effect in order to produce gray-scale sonar images; by assuming a mirror-like reflection model, the sonar system only considers specular reflections, so that the method is only successful for smooth surfaces. Saç et al. (2015) introduced an acoustic model by combining ray-tracing with the frequency domain; the intensity and the range of the sonar data are calculated by the Lambert diffusion model and the Euclidean distance, respectively; the high average time to compute a single FLS frame prevents the use of the method in real-time operations. DeMarco et al. (2015) detailed an FLS simulator integrated with the Gazebo simulator and ROS for diving assistance; the ray path mimics the sound wave to generate a point cloud; the simulated images are compared with real ones, although the reflectivity of the objects and the noise models are analytically defined. Mai et al. (2018) conceived a simulator based on ray propagation to produce acoustic data; by assuming only the freshwater component, the sound attenuation is partially considered, while other physical properties of sound are ignored; the time consumption to calculate one single frame has not been well established by Mai et al. (2018).
Mechanical scanning imaging sonar (MSIS) simulation: Soares (2016) fused the ray-tracing and additive noise models, proposed in Bell & Linnett (1997) and Coiras & Groen (2009), respectively, to produce single beam data; in that work, no image distortion induced by robot movement was considered; the simulated frames were later used to feed an underwater localization system based on Hilbert maps. Cerqueira et al. (2017) introduced a GPU-based simulator to reproduce the operation of two sonar devices; by deferred shading, the rasterization rendering was exploited to compute the acoustic parameters (i.e., echo intensity, pulse distance and azimuth angle); sound phenomena such as multiplicative noise and material properties were addressed, while multipath returns, attenuation and additive noise were not; experiments comparing real-world acoustic images certified the use of the simulator by real-time applications.

Table 1: Summary of state-of-the-art works on imaging sonar simulation.

Works: (1) Bell & Linnett (1997); (2) Guériot et al. (2007); (3) Coiras & Groen (2009); (4) Gu et al. (2013); (5) Kwak et al. (2015); (6) Saç et al. (2015); (7) DeMarco et al. (2015); (8) Soares (2016); (9) Gwon et al. (2017); (10) Cerqueira et al. (2017); (11) Mai et al. (2018); (12) Ours.

                                      (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
Type      SSS                          ●   ●   ●   ○   ○   ○   ○   ○   ●   ○    ○    ○
          FLS                          ○   ○   ○   ●   ●   ●   ●   ○   ○   ●    ●    ●
          MSIS                         ○   ○   ○   ○   ○   ○   ○   ●   ○   ●    ○    ●
Model     Frequency domain             ○   ○   ●   ○   ○   ●   ○   ○   ○   ○    ○    ○
          Tube tracing                 ○   ●   ○   ○   ○   ○   ○   ○   ○   ○    ○    ○
          Ray-tracing                  ●   ○   ○   ●   ●   ●   ●   ●   ○   ○    ●    ●
          Rasterization                ○   ○   ○   ○   ○   ○   ○   ○   ○   ●    ○    ●
Features  Reflection model             ●   ●   ●   ○   ◐   ●   ◐   ●   ◐   ●    ●    ●
          Surface irregularities       ●   ○   ○   ○   ○   ○   ○   ○   ○   ●    ○    ●
          Surface reflectance          ○   ○   ○   ○   ○   ○   ◐   ○   ○   ●    ○    ●
          Attenuation                  ●   ○   ○   ○   ◐   ○   ○   ○   ○   ○    ◐    ●
          Speckle noise                ○   ○   ◐   ○   ○   ○   ◐   ◐   ◐   ◐    ○    ●
          Reverberation                ○   ○   ◐   ○   ○   ◐   ○   ◐   ○   ○    ○    ◐
          Robotics framework support   ○   ○   ○   ○   ○   ○   ●   ○   ○   ●    ●    ●
Eval.     Qualitative                  ●   ●   ●   ●   ●   ●   ●   ●   ●   ●    ●    ●
          Computation time             ○   ○   ○   ○   ○   ◐   ◐   ○   ○   ●    ◐    ●
          Quantitative                 ○   ○   ○   ○   ○   ○   ○   ○   ○   ●    ○    ●

● = provides property; ◐ = partially provides property; ○ = does not provide property.
This paper proposes a sonar simulator that extends the work in (Cerqueira et al., 2017) by combining rasterization and ray-tracing to optimize the acoustic reflections and to fill in the missing physical phenomena. A comparative summary between the state-of-the-art works and ours is detailed in Table 1. Instead of simulating a specific sonar type (Bell & Linnett, 1997; Guériot et al., 2007; Coiras & Groen, 2009; Gu et al., 2013; DeMarco et al., 2015; Kwak et al., 2015; Saç et al., 2015; Soares, 2016; Gwon et al., 2017; Mai et al., 2018), our proposed method is able to reproduce the operation of FLS and MSIS sensors. A selective rasterized ray-tracer is integrated on GPU, where the computational resources are restricted to only reflective regions; this combination enables multipath reflections (not present in rasterization-based works, such as in Cerqueira et al. (2017)), launching few rays with the same final result in comparison with full ray-tracing and tube tracing methods (Bell & Linnett, 1997; Guériot et al., 2007; Gu et al., 2013; DeMarco et al., 2015; Kwak et al., 2015; Saç et al., 2015; Soares, 2016; Gwon et al., 2017; Mai et al., 2018). Additionally, the number of intersection tests of the ray-tracing model is significantly reduced by using bounding volumes and a ray-box intersection algorithm, accelerating the rendering time as a consequence. The sonar simulator is already integrated with a robotics framework (i.e., Rock), supporting the integration with real and simulated robotic systems, a feature also present in (DeMarco et al., 2015; Cerqueira et al., 2017; Mai et al., 2018). The echo intensity from observable objects depends on surface normal directions, material reflectivity and sound attenuation properties, differently from existing approaches (Gu et al., 2013; DeMarco et al., 2015; Gwon et al., 2017), where the reflection value is empirically defined.
Yet, the reflection model is valid for any type of surface representation, in opposition to (DeMarco et al., 2015; Saç et al., 2015). Five of the analyzed works consider either additive or multiplicative noise, while the speckle effect is just partially simulated (Coiras & Groen, 2009; Saç et al., 2015; Soares, 2016; Gwon et al., 2017; Cerqueira et al., 2017). In our work, speckle noise is fully reproduced. Our experiments comprise qualitative, computation-time and quantitative evaluations between simulated and real-world sonar data, assessing the time efficiency and rendering quality of the generated acoustic images.

Figure 1: Viewing volume of imaging sonar. The observable scene is defined by the minimum and maximum ranges r_min and r_max, maximum azimuth angle θ_max, and maximum elevation angle φ_max.
2. Working with underwater sonars
Sonar systems use the propagation of sound waves to detect and locate objects underwater. These systems are grouped into two basic types: Passive and active (Rossing, 2015). A passive sonar essentially listens for the sound waves made by submerged objects; in contrast, an active sonar transmits sound pulses, and then listens for echoes. Imaging sonars are classified as active devices.

To compose an acoustic image, an active sonar insonifies the scene with a sound wave. The visible area is delimited by the maximum azimuth angle θ_max, the maximum elevation angle φ_max, and the minimum r_min and maximum r_max ranges, as illustrated in Fig. 1. In case a sound wave hits an object, the returning echo is sampled as a function of range and bearing, since the speed of sound in water is known. The transducer reading in a given direction composes a beam, while each distance sampled along this beam is named bin. The strength of the backscattered energy in each bin determines the echo intensity from an insonified object. Combining the array of transducer readings, the group of echo intensities forms an image of the reflective surfaces in front of the sonar head.

Figure 2: Model of the imaging sonar projection. A spherical point Q(r, θ, φ) is projected into a point P on an image plane. Considering an orthographic approximation, the point P is mapped onto P̂, which is equivalent to all points along the same arc.

A 3D point is usually expressed in Cartesian coordinates as [x, y, z]^T. A sonar system has its reference frame defined in spherical coordinates as Q = [r, θ, φ]^T, with range r, azimuth angle θ, and elevation angle φ, as depicted in Fig. 2. The conversion from Cartesian to spherical coordinates is given by

Q = [r, θ, φ]^T = [√(x² + y² + z²), tan⁻¹(y/x), tan⁻¹(√(x² + y²)/z)]^T.   (1)

Since the elevation angle φ is missing during the process of acoustic projection, the sonar system measures the range r and azimuth angle θ on the zero-elevation plane, as an approximation to an orthographic projection (Johannsson et al., 2010). This two-dimensional system is named polar coordinates and follows a nonlinear model defined as

P̂ = [x, y]^T = [r cos θ, r sin θ]^T.   (2)

2.2. Sound attenuation

When a sound pulse propagates through the water, the acoustic energy is gradually converted into heat by spherical spreading, absorption and chemical properties of the sea. This effect decreases the signal amplitude exponentially with distance, and the total acoustic attenuation in the ocean is expressed by three additive components: Relaxation of boric acid (H₃BO₃) molecules below 1 kHz, relaxation of magnesium sulphate (MgSO₄) below 100 kHz, and viscosity of pure water (Bjørnø, 2017). A common attenuation model is proposed by Ainslie & McColm (1998), where the attenuation coefficient is expressed as

α = α_B + α_M + α_F,   (3)

with the boric acid component α_B defined as

α_B = 0.106 (f₁ f² / (f₁² + f²)) e^((pH − 8)/0.56),   (4)

f₁ = 0.78 (S/35)^(1/2) e^(T/26),   (5)

the magnesium sulphate component α_M defined as

α_M = 0.52 (1 + T/43)(S/35)(f₂ f² / (f₂² + f²)) e^(−z/6),   (6)

f₂ = 42 e^(T/17),   (7)

and the freshwater component α_F defined as

α_F = 0.00049 f² e^(−(T/27 + z/17)),   (8)

where α is the intensity absorption coefficient in dB/km, f is the frequency in kHz, S is the salinity in parts per thousand (ppt), pH is the acidity, z is the depth in km, and T is the water temperature in degrees Celsius.

2.3. Speckle noise

Due to the coherent nature of the scattering phenomena, sonar images are affected by speckle noise, a granular pattern which severely deteriorates the visual quality and reduces relevant features such as edges and shapes.
This type of noise produces random variations of image intensity, which cause light and dark pixels and interfere with further operations, such as object detection and segmentation. The noisy image, Î, has been expressed as (Mateo & Fernández-Caballero, 2009)

Î(r, θ) = I(r, θ) η_m(r, θ) + η_a(r, θ),   (9)

where (r, θ) are the polar coordinates, I is the noise-free image, and η_m and η_a are the multiplicative and additive noise components, respectively.

When active sonars transmit sound pulses, incoming echoes are usually returned from several different sources. The result of these unwanted echoes is named reverberation, which is mainly caused by the multiple path propagation and successive interactions of the transmitted signal, weakening the sound intensity (Hodges, 2010). Sources of reverberation in the ocean include the surface, the seafloor and the volume of water.
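As a sketch of how the noise model of Eq. (9) can be applied per pixel, consider the following; the function name and the Gaussian parameters are illustrative, not the simulator's tuned values:

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Speckle model of Eq. (9), per pixel: noisy = clean * eta_m + eta_a.
// The Gaussian parameters are placeholders, not the simulator's tuned values.
std::vector<double> addSpeckle(const std::vector<double>& clean,
                               double mult_mean, double mult_sigma,
                               double add_sigma, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> eta_m(mult_mean, mult_sigma); // multiplicative
    std::normal_distribution<double> eta_a(0.0, add_sigma);        // additive, zero mean
    std::vector<double> noisy(clean.size());
    for (std::size_t i = 0; i < clean.size(); ++i)
        noisy[i] = clean[i] * eta_m(gen) + eta_a(gen);
    return noisy;
}
```

Seeding the generator makes a frame's noise reproducible, which is convenient when comparing simulated and real images.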
The raw sonar data is generated as a function of range and bearing, resulting in a polar image I(r, θ) with an echo intensity value for each pixel. For better human interpretation, this polar representation can be converted to Cartesian coordinates I(x, y) using Eq. (2). In Cartesian coordinates, the fan-shaped image preserves the target geometry. The conversion from polar to Cartesian space entails a non-uniform resolution, where the bins closest to the sonar origin are superimposed, while the far ones are interpolated, yielding image distortions and object flattening. The raw polar data and the corresponding representation in Cartesian space are illustrated in Fig. 3.

Figure 3: Different types of acoustic data representations: (a) raw data as Cartesian image; (b) raw data as polar image. A wrecked ferry was captured with an FLS Tritech Gemini 720i sensor from a real AUV. In this paper, the simulated images are displayed in Cartesian coordinates to retain the characteristics of the insonified objects (a), while the polar image is applied during similarity evaluation without loss of original data (b).
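A minimal display-time remap from the polar image to the Cartesian fan can be sketched as follows; the nearest-neighbor lookup and the image parameterization (bins-by-beams vector, fan centered on the range axis, sonar origin at the bottom-center) are simplifying assumptions, since real viewers interpolate the far bins:

```cpp
#include <cmath>
#include <vector>

// Remap polar sonar data I(r, theta) to a Cartesian fan image by inverting
// Eq. (2) per output pixel. Nearest-neighbor lookup for brevity.
std::vector<float> polarToCartesian(const std::vector<float>& polar,
                                    int bins, int beams, double fov_rad,
                                    int width, int height) {
    std::vector<float> cart(width * height, 0.0f);
    const double cx = width / 2.0;                   // sonar origin at bottom-center
    for (int py = 0; py < height; ++py) {
        for (int px = 0; px < width; ++px) {
            const double x = height - 1 - py;        // range axis
            const double y = px - cx;                // cross-range axis
            const double r = std::sqrt(x * x + y * y);
            const double theta = std::atan2(y, x);
            if (r >= bins || std::fabs(theta) > fov_rad / 2.0) continue;
            const int bin  = static_cast<int>(r);
            const int beam = static_cast<int>((theta / fov_rad + 0.5) * (beams - 1) + 0.5);
            cart[py * width + px] = polar[bin * beams + beam];
        }
    }
    return cart;
}
```

Iterating over output pixels (rather than input bins) avoids holes in the interpolated far field, which is the usual design choice for this kind of remap.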
3. Simulating acoustic images based on a rasterized ray-tracing pipeline
The pipeline of the proposed sonar simulator is depicted in Fig. 4, and bridges two domains. On the GPU domain, the engine computes reflections using an approach based on a selective rasterized ray-tracing. The resulting sonar rendering parameters are processed into the simulated acoustic data on the CPU domain, where the sonar image is presented. This approach is detailed in the following subsections.
The underwater environment is defined with the Rock-Gazebo integration (Watanabe et al., 2015). Gazebo handles the physical simulation, where the hydrostatic and hydrodynamic forces and moments are modelled and applied on underwater vehicles, and provides access to the simulated objects and data; osgOcean, a plugin for OpenSceneGraph, renders the ocean with several visual effects, such as sunlight, ocean surface foam, water turbidity, and light absorption and scattering. The Rock framework manages the communication and synchronization between the system components.
Figure 4: Overview of the proposed imaging sonar simulation. On the GPU domain: (i) a virtual camera captures the observable scene; (ii) by rasterization, the primary reflections are computed by using the surface normal and depth values from G-buffers; (iii) only the reflective areas are ray-traced for secondary reflections; (iv) the signal attenuation model decays the amplitude of the total reflections; (v) two sonar parameters are rendered: Echo intensity and pulse distance. On the CPU domain, the shader data is sorted in beam parts, where: (vi) a distance histogram correlates the pixels with the respective bins; (vii) the bin intensity is computed by energy normalization; (viii) noise simulation degrades the sonar data; (ix) the noisy bin intensity is stored as a sonar data structure on Rock.

A virtual camera, properly configured with the desired sonar settings (i.e., pose, field of view, range and resolution), samples the underwater simulated scene (Fig. 4 (i)). In the vertex and fragment shaders, the captured rendering area passes through a rasterization and selective ray-tracing scheme, where deferred shading provides the information needed to compute the primary reflections, and only the reflective areas are ray-traced for secondary reflections. This effectively enables multipath propagation, avoids a full iteration of intersection tests, and produces the same result as a full ray-tracer.

The first reflection comes from the closest intersection of the source wave with a scene object in 3D space. In order to improve the performance of finding the closest intersections, this work uses the deferred shading technique to mimic the first reflections of a sound wave. Rather than launching individual rays through the virtual environment, the primary reflections take advantage of two pieces of geometric information stored in G-buffers (position and normal vectors in world space) to compute the sonar rendering parameters during the rasterization process (Fig. 4 (ii)):

• Pulse distance: Reproduces the length of the sound wave path. This parameter uses the depth information to compute the Euclidean distance between the camera center and the object surface, as defined by r in Eq. (1).

• Echo intensity: Simulates the backscattered power of the sound wave.
The value is initially obtained from the normal incidence with respect to the virtual camera.

Multiple factors can affect the strength of the reflected sound waves. In order to produce more realistic sonar images, four phenomena are considered here: Surface irregularities, material reflectance, sound attenuation and speckle noise. The first property enables the Lambertian diffuse reflection by applying a normal map, an RGB texture which changes the normal directions and, as a consequence, fakes roughness on the object surface with no additional polygons. The material reflectance deals with the acoustic reflectivity of sound waves, whose echoes are stronger from objects with densities different from water. So rocks, air-filled objects and compact gas reflect more sonar energy than softer surface types, like plastic and mud (Christ & Wernli Sr, 2013). In this context, when an object has its reflectivity defined, this value is multiplied by the echo intensity. These two characteristics are detailed in Cerqueira et al. (2017). Next, the influence of the sound attenuation and speckle noise effects is presented.
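The two rendering parameters above can be sketched on the CPU side for clarity; the vector type, the function names and the cosine (normal-incidence) intensity term are illustrative of the per-pixel shader computation, not the shader code itself:

```cpp
#include <cmath>

// Per-pixel primary-reflection parameters from the G-buffer (Fig. 4 (ii)).
struct Vec3 { double x, y, z; };

static double dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static double length(const Vec3& v) { return std::sqrt(dot(v, v)); }

// Pulse distance: Euclidean distance between camera center and surface point.
double pulseDistance(const Vec3& camera, const Vec3& worldPos) {
    Vec3 d{worldPos.x - camera.x, worldPos.y - camera.y, worldPos.z - camera.z};
    return length(d);
}

// Echo intensity: cosine of the incidence angle between the surface normal and
// the direction back to the camera, scaled by the material reflectivity.
double echoIntensity(const Vec3& camera, const Vec3& worldPos,
                     const Vec3& normal, double reflectivity) {
    Vec3 toCam{camera.x - worldPos.x, camera.y - worldPos.y, camera.z - worldPos.z};
    const double c = dot(normal, toCam) / (length(normal) * length(toCam));
    return (c > 0.0 ? c : 0.0) * reflectivity;   // back-facing surfaces return nothing
}
```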
Ray-tracing extends the wave propagation theory to simulate effects like reflections, ambient occlusion and shadows, but at a great computational cost. For highly complex scenes, this model generally becomes time-consuming due to the excessive amount of intersection tests. In this work, the world position and normal vectors from the G-buffers are used to compute the primary reflections of the sound wave through the virtual scene, identifying where each ray starts and in which direction it should be reflected. Then the proposed ray-tracer starts the secondary reflections by selecting all pixels with surface normal values greater than zero to be traced (Fig. 4 (iii)). In practice, this scheme propagates few rays when compared to a full ray-tracer, resulting in a significant speed-up with no significant loss of information.

Testing whether a ray intersects any surface requires analyzing all objects in the scene. According to the complexity of the geometric surfaces, these scene objects can be described by simple shapes like spheres, cylinders and planes, or by mathematical models such as polygon meshes and splines for highly detailed representations. In this context, ray-geometry intersection methods would have to deal with each supported type of surface, drastically increasing the complexity of the implementation. Here, all objects in the virtual underwater scene are depicted as triangulated meshes by using tessellation at rendering time, and the triangle data (i.e., vertices, surface normals, and centroids) feed the shader as textures. This way, any arbitrary surface can be rendered, since each camera ray is tested against every individual triangle composing each polygonal object in the scene with ray-triangle intersections.

The amount of time to compute ray-triangle intersections is directly proportional to the number of triangles in the scene. Rendering time can be saved by reducing the number of intersection tests (Akenine-Möller et al., 2018).
The selective ray-tracer is accelerated by bounding volumes and a classic axis-aligned bounding box (AABB) algorithm (Williams et al., 2005), as follows: For each object in the scene, one box encapsulates all vertices; if a ray does not intersect a box, it cannot intersect any triangle within this bounding volume; otherwise, the ray is tested against each triangle contained in the box with the Möller-Trumbore intersection algorithm (Möller & Trumbore, 1997); in case of a new intersection, the pulse distance and echo intensity values between the triangle and the ray origin are stored in the resulting image. This approach reproduces the secondary reflections while saving a significant number of calls to the ray-triangle routine, and is summarized in Algorithm 1.
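A minimal slab-based ray/AABB test, in the spirit of Williams et al. (2005), might look like the following sketch; the struct layout is an assumption, and axis-parallel rays are handled only through IEEE infinity arithmetic:

```cpp
#include <limits>
#include <utility>

// Slab test: a ray that misses the box cannot hit any triangle inside it,
// so the per-triangle Möller-Trumbore test is skipped for that object.
struct Ray  { double ox, oy, oz, dx, dy, dz; };   // origin + direction
struct AABB { double min[3], max[3]; };

bool intersectBox(const Ray& r, const AABB& b) {
    const double orig[3] = { r.ox, r.oy, r.oz };
    const double dir[3]  = { r.dx, r.dy, r.dz };
    double tmin = -std::numeric_limits<double>::infinity();
    double tmax =  std::numeric_limits<double>::infinity();
    for (int i = 0; i < 3; ++i) {
        const double inv = 1.0 / dir[i];          // +/-inf when dir[i] == 0
        double t0 = (b.min[i] - orig[i]) * inv;
        double t1 = (b.max[i] - orig[i]) * inv;
        if (inv < 0.0) std::swap(t0, t1);
        if (t0 > tmin) tmin = t0;
        if (t1 < tmax) tmax = t1;
        if (tmax < tmin) return false;            // slab intervals do not overlap
    }
    return tmax >= 0.0;                           // box not entirely behind the ray
}
```

Only rays that pass this cheap test are forwarded to the per-triangle routine, which is where the reduction in intersection calls comes from.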
Algorithm 1 Selective ray-tracer on GPU
  function SecondaryReflections(first)
    second ← (0, 0)
    for all n in first.normals such that n > 0 do
      [orig, dir] ← GetWorldCoordinates(n)
      ray ← CalculateRay(orig, dir)
      for all box in boxes do
        intersection ← IntersectBox(ray, box)
        if intersection.hit then
          for all triangle in box do
            intersection ← IntersectTriangle(ray, triangle)
            if intersection.hit then
              normal ← triangle.normal
              distance ← Length(ray, triangle)
              second.writeReflection(normal, distance)
            end if
          end for
        end if
      end for
    end for
  end function

After the computation of the primary and secondary reflections, the corresponding results are blended into a unified shader image with echo intensity and pulse distance values, and finally the signal attenuation effect is applied (Fig. 4 (iv)). Since water is a dissipative medium, the sound intensity decreases exponentially with the travelled distance, by absorption and spreading, while propagating. Equation (3) expresses the attenuation coefficient α, which can be converted to Np/km as follows:

1 dB = (1 / (20 log₁₀ e)) Np ≈ 0.1151 Np,  γ = 0.1151 α.   (10)

The sound pressure decays according to

p_d = p₀ e^(−γd).   (11)

Within the same medium, the sound intensity is proportional to the average of the squared pressure (Dunn et al., 2015):

I ∝ p².   (12)

Therefore,

I_d = I₀ e^(−2γd),   (13)

where the initial intensity I₀ is reduced to I_d at a distance d (in km), with the attenuation coefficient γ in Np/km. An example of the effect is shown in Fig. 5, where the attenuation coefficient weakens the acoustic intensity with increasing propagation distance.

Figure 5: Example of different attenuation coefficient values, α, applied to the scene rendering of a cone. In the shader images, the blue and green channels express the pulse distance and echo intensity data, respectively, for (a) α = 0 dB/km and (c) α = 0.013 dB/km. The corresponding sonar images are depicted in (b) and (d). By applying the sound attenuation effect, the echo intensity decreases exponentially with distance from the source.
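The full attenuation chain, that is, the Ainslie & McColm coefficient of Eqs. (3)-(8) followed by the dB-to-Np conversion and the exponential decay of Eqs. (10)-(13), can be sketched as follows (function names are illustrative):

```cpp
#include <cmath>

// Attenuation coefficient of Eqs. (3)-(8), in dB/km, for frequency f [kHz],
// temperature T [degrees C], salinity S [ppt], depth z [km] and acidity pH.
double alphaAinslieMcColm(double f, double T, double S, double z, double pH) {
    const double ff = f * f;
    const double f1 = 0.78 * std::sqrt(S / 35.0) * std::exp(T / 26.0); // boric acid, Eq. (5)
    const double f2 = 42.0 * std::exp(T / 17.0);                       // MgSO4, Eq. (7)
    const double aB = 0.106 * (f1 * ff) / (f1 * f1 + ff) * std::exp((pH - 8.0) / 0.56);
    const double aM = 0.52 * (1.0 + T / 43.0) * (S / 35.0)
                    * (f2 * ff) / (f2 * f2 + ff) * std::exp(-z / 6.0);
    const double aF = 0.00049 * ff * std::exp(-(T / 27.0 + z / 17.0));
    return aB + aM + aF;                                               // Eq. (3)
}

// Decay an echo intensity over d km: dB -> Np conversion, pressure decay,
// then squaring for intensity (Eqs. (10)-(13)).
double attenuateIntensity(double I0, double alpha_dB_per_km, double d_km) {
    const double gamma = 0.1151 * alpha_dB_per_km;   // Eq. (10), Np/km
    const double p = std::exp(-gamma * d_km);        // pressure decay, Eq. (11)
    return I0 * p * p;                               // intensity goes as p^2
}
```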
The final pulse distance and echo intensity values are organized as the blue and green channels of the shader image, respectively (see sonar rendering parameters in Fig. 4 (v)). These values range from 0 to 1. For the echo intensity, zero means no energy, while one denotes the maximum reflection returned. For the pulse distance, the minimum and maximum values express the near and far planes, respectively. At the end, the sonar parameters are rendered to a floating-point RGBA texture, using a framebuffer object (FBO), to avoid loss of precision, mainly for the pulse distance values.

3.3. Generating the sonar image on CPU
On CPU domain, the resulting sonar rendering parameters are converted intothe respective acoustic data. While the azimuth angle is radially spaced overthe virtual camera, the elevation angle is lost during sonar projection geometry.This process implies all pixels belong a column have the same bearing angle.The shader image columns are then divided into a number of beam parts. Foreach beam section, a distance histogram groups the pixels in bins, accordingto pulse distance values (Fig. 4 (vi)). Finally, the accumulated bin intensityvalue, I , is computed with an energy normalization function (Fig. 4 (vii)),given by I ( r, θ ) = N (cid:88) x =1 N S ( i x ) , (14)where ( r, θ ) are polar coordinates, N is the number of pixels with respect toone bin, x is the pixel index, and S is a sigmoid function applied over the echointensity i x .Due to the acquisition process and complexity of underwater sound prop-agation, acoustic devices suffer from speckle noise and random variations ofecho intensity. All these make further data interpretation difficult. To sim-ulate the speckle noise in the resulting image, Eq. (9) is used according to noise simulation functions (Fig. 4 (viii)). The multiplicative component fol-lows a non-uniform Gaussian distribution, while the additive one is denoted bya Gaussian random variable with zero mean and standard deviation σ . Thenoise model is repeated for each acoustic frame.The simulation ends with the conversion of noisy intensity values into a data structure of corresponding beam (Fig. 4). The sonar data is latterdisplayed in Cartesian coordinates on Rock framework, according to Eq. (2).18 . Experimental evaluation Our sonar simulator was implemented in C++, OpenCV and OpenScene-Graph on CPU. Shaders relies in massive parallelism available on GPU to renderthe underwater scene using rasterization and ray-tracing, and Ruby scripts con-nect and monitor components on Rock framework. 
All experiments were performed on an Intel Core i7-8750H 2.20 GHz, with 16 GB DDR3 RAM, an NVIDIA GeForce GTX 1060 video card and the Ubuntu 16.04 64-bit operating system.
To evaluate the visual quality of the images generated by the simulator, FLS and MSIS devices, equipping a virtual AUV, were simulated to insonify four different scenarios. In the first scene, illustrated in Fig. 6(a), a wrecked ferry was used; the shape of the ferry is insonified in the FLS image, as well as the corrugated seabed after a normal mapping technique, as can be seen in the sonar chart of Fig. 6(b); since the material reflectance is defined, the target is distinguishable from the other scene components. The second scene consists of a subsea cooler connected to pipelines in an oil production field (see Fig. 6(c)); the front faces of the targets and the shadows occluding part of the scene are clearly visible in the FLS chart image (see Fig. 6(d)); the echo intensity of the acoustic image is perturbed by a speckle noise pattern; also, the attenuation effect weakens the intensity of the bins farthest from the sonar head. The third scene contains a destroyed car on the seafloor, as depicted in Fig. 7(a); using the MSIS in horizontal orientation, the regions with an approximately perpendicular angle to the sonar viewpoint, or multiple returns, are identified as the brightest areas in the sonar chart of Fig. 7(b); the image of this sonar chart is also characterized by the granular disturbance of the speckle noise. An offshore Christmas tree is the main target of the last scene (see Fig. 7(c)); an MSIS vertically mounted on the AUV captures the slice scanning of the seafloor and the Christmas tree (see Fig. 7(d)).
Figure 6: Demonstration of acoustic images generated by the sonar simulation system: (a) and (c) are the insonified targets in the underwater environment; (b) and (d) present the sonar data produced by the FLS device. The simulated representation of the wrecked ferry in Fig. 3 is depicted by (a) and (b).
For all experiments, the initial bins presented low intensity values, caused by the lack of acoustic feedback at short ranges. The rendering of complex scenes was also addressed, highlighting the details present on the geometries of the insonified objects. Moreover, acoustic shadows contain valuable information for the accurate interpretation of the sonar images; depending on the angle of incidence, the shadows can present more details than the sonar acoustic return, as illustrated by the pipelines in Fig. 6(d).
Figure 7: Demonstration of acoustic images generated by the sonar simulation system: (a) and (c) are the insonified targets in the underwater environment; (b) and (d) denote the virtual acoustic representations by the MSIS device mounted in horizontal and vertical orientations, respectively.
To evaluate the computational cost of our simulator, we built a data set containing four different geometric shapes randomly positioned along the sonar viewport, for each frame: cylinder, box, sphere and cone. These geometric shapes provide a good variation in the amount of triangle meshes during the tessellation process. To measure the execution time, three metrics were used: average time, standard deviation and frame rate.

Table 2: Processing time to generate samples from the FLS sensor with different configurations. For each setup (number of beams, number of bins and field of view (w × h)), the columns report the average time (ms), standard deviation (ms) and frame rate (fps) of Cerqueira et al. (2017) and of our method.

Table 3: Processing time to generate samples from the MSIS sensor with different configurations. For each setup (number of bins and field of view (w × h)), the columns report the average time (ms), standard deviation (ms) and frame rate (fps) of Cerqueira et al. (2017) and of our method.

[…]° × 20° and 256 beams, has refresh rates of 5-30 fps (range dependent).

In comparison with other simulators, for the FLS device, our rates are superior to those listed by DeMarco et al. (2015) (3 fps), Mai et al. (2018) (1 fps) and Saç et al. (2015) (2.5 min), even with the additional acoustic phenomena present in the simulated sonar image.
For the MSIS type, to the best of our knowledge, there is no previous work with rates for comparison.

The number of bins and beams also impacts the simulation performance. The former is directly proportional to the image resolution; the amount of pixels to be processed increases with the number of bins. The latter determines the number of beam sections of the shader images to be rendered.

For quantitative analysis, two different scenarios were sampled by real FLS and MSIS sensors equipped on the FlatFish AUV (Albiez et al., 2015). In the former scenario, a Tritech Gemini 720i insonified a subsea safety isolation valve (SSIV) mockup on the seabed in Todos os Santos Bay, Salvador, Brazil; the latter scenario is comprised of a Tritech Micron DST sonar horizontally mounted to capture the tank walls surrounding the AUV at DFKI RIC, Bremen, Germany. Figure 8 presents the FlatFish AUV during these trials. The aforementioned experiments were repeated in the virtual underwater scenario with the same targets, and our sonar system generated the corresponding acoustic representations. A summary of the experiments is depicted in Figs. 9 and 10.

Figure 8: Trials with the FlatFish AUV during acoustic data acquisition for quantitative evaluation: (a) at DFKI RIC, Bremen, Germany; (b) in Todos os Santos Bay, Salvador, Brazil.

Similarity measurements between real and simulated sonar images depend on the device configuration, environment characteristics, observable objects and acquisition viewpoint. Four metrics were chosen to compute the similarity between the acoustic frames of the real and simulated images: mean-squared error (MSE), peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM) (Yang et al., 2008) and multi-scale structural similarity index measure (MS-SSIM) (Wang et al., 2003). In order to preserve the original data, polar-coordinate based images were used in this evaluation. Table 4 summarizes the results in comparison with the method proposed in (Cerqueira et al., 2017). Values in the table are normalized: zero represents minimum similarity, while one denotes maximum correlation.

Table 4: Similarity evaluation results between real and simulated sonar images.

Device  Target  |  Cerqueira et al. (2017)       |  Ours
                |  MSE    PSNR   SSIM   MS-SSIM  |  MSE    PSNR   SSIM   MS-SSIM
FLS     SSIV    |  0.990  0.669  0.361  0.628    |  0.994  0.690  0.405  0.683
MSIS    Tank    |  0.996  0.761  0.834  0.852    |  0.996  0.760  0.832  0.849

In the FLS experiment, the values of our proposed work for MSE, PSNR, SSIM and MS-SSIM were higher than those found in (Cerqueira et al., 2017), mainly explained by the attenuation, additive noise and reverberation phenomena present in the complex and fully detailed image. Conversely, SSIM individually presents lower performance for our proposed system, due to the sensitivity of this metric to changes in local intensity and contrast patterns on a very simple scene image (see Figs. 9(d) and (e)). In the MSIS experiment, MSE, PSNR, SSIM and MS-SSIM showed values approximately equal to the ones found in Cerqueira et al. (2017), which can be justified by the simplicity of the object edges insonified by the MSIS device. Indeed, the level of attenuation, speckle noise and reverberation in the image was not enough to define a gain of the simulated image over the real one, specifically regarding image quality.
5. Discussion and conclusions
Existing methods focus on simplified implementations of one specific sonar type, where the majority of underwater sound properties are disregarded. The simulator proposed here was able to reproduce the operation of two different sonar devices: FLS and MSIS. All experimental scenarios were defined to demonstrate phenomena usually found in real sonar images, such as speckle noise, transmission loss and material properties of insonified surfaces. It is noteworthy that the sea level is not considered during sonar rendering, turning this particular reverberation component absent from the computation of the final acoustic image.
Figure 9: Experimental results with the FLS device: (a) an SSIV mockup; (b) the acoustic data captured by the Tritech Gemini 720i sonar equipped on the FlatFish AUV; (c) a 3D model of the SSIV; (d) and (e) are the acoustic data generated by our simulator and by the one proposed in Cerqueira et al. (2017), respectively.

Figure 10: Experimental results with the MSIS device: (a) the DFKI tank; (b) the acoustic data captured by the Tritech Micron DST sonar horizontally mounted on the FlatFish AUV; (c) the simulated tank; (d) and (e) are the acoustic data generated by our simulator and by the one proposed in Cerqueira et al. (2017), respectively.

If the paper is accepted, the code of the proposed simulator will become available to the whole research community.

References

Ainslie, M., & McColm, J. (1998). A simplified formula for viscous and chemical absorption in sea water. Acoustical Society of America Journal, 1671–1672.

Akenine-Möller, T., Haines, E., & Hoffman, N. (2018). Real-time rendering. AK Peters/CRC Press.

Albiez, J., Joyeux, S., Gaudig, C., Hilljegerdes, J., Kroffke, S., Schoo, C., Arnold, S., Mimoso, G., Alcantara, P., Saback, R., Britto, J., Cesar, D., Neves, G., Watanabe, T., Paranhos, P. M., Reis, M., & Kirchner, F. (2015). FlatFish - a compact AUV for subsea resident inspection tasks. In MTS/IEEE OCEANS Conference (pp. 1–8).

Bell, J., & Linnett, L. (1997). Simulation and analysis of synthetic sidescan sonar images. IEE Proceedings - Radar, Sonar and Navigation, 219–226.

Bjørnø, L. (2017). Applied Underwater Acoustics. Elsevier Science.

Cerqueira, R., Trocoli, T., Neves, G., Joyeux, S., Albiez, J., & Oliveira, L. (2017). A novel GPU-based sonar simulator for real-time applications. Computers & Graphics, 66–76.

Christ, R. D., & Wernli Sr, R. L. (2013). The ROV manual: a user guide for remotely operated vehicles. Butterworth-Heinemann.

Coiras, E., & Groen, J. (2009). Simulation and 3D reconstruction of side-looking sonar images. In S. Silva (Ed.), Advances in Sonar Technology, chapter 1 (pp. 1–15). InTech.

DeMarco, K., West, M., & Howard, A. (2015). A computationally-efficient 2D imaging sonar model for underwater robotics simulations in Gazebo. In MTS/IEEE OCEANS Conference (pp. 1–8).

Dunn, F., Hartmann, W., Campbell, D., & Fletcher, N. (2015). Springer handbook of acoustics. Springer.

Etter, P. (2018). Underwater Acoustic Modeling and Simulation. CRC Press, Taylor & Francis Group.

Gu, J.-H., Joe, H.-G., & Yu, S. (2013). Development of image sonar simulator for underwater object recognition. In MTS/IEEE OCEANS Conference (pp. 1–6). IEEE.

Guériot, D., Sintes, C., & Garello, R. (2007). Sonar data simulation based on tube tracing. In OCEANS 2007 - Europe (pp. 1–6).

Gwon, D., Kim, J., Kim, M. H., Park, H. G., Kim, T. Y., & Kim, A. (2017). Development of a side scan sonar module for the underwater simulator. In (pp. 662–665).

Hodges, R. (2010). Underwater Acoustics: Analysis, Design and Performance of Sonar. John Wiley & Sons.

Hurtós Vilarnau, N. (2014). Forward-looking sonar mosaicing for underwater environments. Ph.D. thesis, Universitat de Girona.

Johannsson, H., Kaess, M., Englot, B., Hover, F., & Leonard, J. (2010). Imaging sonar-aided navigation for autonomous underwater harbor surveillance. In (pp. 4396–4403).

Kwak, S., Ji, Y., Yamashita, A., & Asama, H. (2015). Development of acoustic camera-imaging simulator based on novel model. In IEEE International Conference on Environment and Electrical Engineering (EEEIC) (pp. 1719–1724).

Mai, N., Ji, Y., Woo, H., Tamura, Y., Yamashita, A., & Asama, H. (2018). Acoustic image simulator based on active sonar model in underwater environment. In (pp. 775–780).

Mateo, J., & Fernández-Caballero, A. (2009). Finding out general tendencies in speckle noise reduction in ultrasound images. Expert Systems with Applications, 7786–7797.

Möller, T., & Trumbore, B. (1997). Fast, minimum storage ray-triangle intersection. Journal of Graphics Tools, 21–28.

Rossing, T. (2015). Springer handbook of acoustics. Springer Science & Business Media.

Saç, H., Leblebicioğlu, K., & Akar, G. B. (2015). 2D high-frequency forward-looking sonar simulator based on continuous surfaces approach. Turkish Journal of Electrical Engineering and Computer Sciences, 2289–2303.

Soares, E. (2016). Underwater simulation and mapping using imaging sonar through ray theory and Hilbert Maps. Master's thesis, Federal University of Rio de Janeiro.

Wang, P., Tian, X., Peng, T., & Luo, Y. (2018). A review of the state-of-the-art developments in the field monitoring of offshore structures. Ocean Engineering, 148–164.

Wang, Z., Simoncelli, E. P., & Bovik, A. C. (2003). Multiscale structural similarity for image quality assessment. In The Thirty-Seventh Asilomar Conference on Signals, Systems and Computers (pp. 1398–1402). volume 2.

Watanabe, T., Neves, G., Cerqueira, R., Trocoli, T., Reis, M., Joyeux, S., & Albiez, J. (2015). The Rock-Gazebo integration and a real-time AUV simulation. In (pp. 132–138).

Williams, A., Barrus, S., Morley, R., & Shirley, P. (2005). An efficient and robust ray-box intersection algorithm. Journal of Graphics Tools, 10, 49–54. Taylor & Francis.

Yang, C., Zhang, J.-Q., Wang, X.-R., & Liu, X. (2008). A novel similarity based quality metric for image fusion. Information Fusion, 156–160. doi:10.1016/j.inffus.2006.09.001.