Underwater Acoustic Communication Receiver Using Deep Belief Network

Abstract

Underwater environments create a challenging channel for communications. In this paper, we design a novel receiver system that uses a machine learning technique, the Deep Belief Network (DBN), to combat the signal distortion caused by the Doppler effect and multi-path propagation. We evaluate the performance of the proposed receiver system in both simulation experiments and sea trials. The proposed receiver system comprises DBN based de-noising and classification of the received signal. First, the received signal is segmented into frames, and each frame is individually pre-processed using a novel pixelization algorithm. Then, using the DBN based de-noising algorithm, features are extracted from these frames and used to reconstruct the received signal. Finally, the reconstructed signal is classified by a second DBN. The proposed DBN based receiver system shows better performance in channels influenced by the Doppler effect and multi-path propagation, with a performance improvement of 13.2dB at 10^{-3} Bit Error Rate (BER).



Abigail Lee-Leon∗†, Chau Yuen∗, Dorien Herremans∗
∗Singapore University of Technology and Design (SUTD), 8 Somapah Road, Singapore 487372
†Thales Solutions Asia Pte Ltd, 21 Changi North Rise, Singapore 498788


Index Terms: Underwater Acoustic Communications, Receiver Systems, Machine Learning, Signal Processing, Doppler Effect

I. INTRODUCTION

Underwater Acoustic Communications (UWAC) is a knowledge-rich field that has, in recent years, gained a tremendous amount of interest for its many applications in ocean exploration, defense, and marine commercial industries. A few notable applications are underwater exploration [1], underwater mine detection [2], and underwater communications between submarines or underwater nodes [3]. Due to a rapidly growing need for data-heavy underwater systems, the expectations and requirements of underwater system design have risen to a point where a growing number of researchers are turning to unconventional methods like machine learning (ML) and deep learning (DL) to combat the challenging underwater environment. In this paper, we propose a novel receiver system that utilizes the capabilities of Deep Belief Networks (DBNs) to redesign the de-noising and demodulation blocks of the communication system.

Generally, conventional signal processing algorithms in communications are based on strong mathematical foundations and are designed specifically for a variety of specific channels and system models [4], [5]. For instance, Binary Phase-Shift Keying (BPSK) modulation was designed for the detection of a constellation symbol in a channel of additive white Gaussian noise (AWGN) [6]. These signal processing algorithms are constructed on expert knowledge of tractable channel models, which in turn are established on a simplification of Maxwell's equations [7].

UWAC signals, however, are not electromagnetic in nature [8], [9]. As such, the UWAC channel is widely characterized as one of the most complex channels to model, and no definitive model has yet been developed. Its high complexity is mostly derived from its fast-varying characteristics, such as the Doppler effect and the propagation properties. Since underwater communications rely heavily on sound, the channel behavior correlates strongly with the properties of sound [8].
By understanding how sounds are influenced during sea trials, one can optimize the efficiency of the communication through adaptation. As sound propagates underwater at a very low speed of approximately 1500 m/s [9], and propagation occurs over multiple paths, it is very common to observe a delay spread over tens or even hundreds of milliseconds, which results in frequency-selective signal distortion. The low propagation speed also makes the Doppler effect extreme whenever the transmitter or receiver moves.

Multi-path propagation in the ocean is governed by three effects: (1) sound reflecting off underwater surfaces like bubbles and the seabed, (2) sound refraction in the water due to density change, and (3) energy loss [10]. These effects cause an elongation of the path traveled, and thus a time delay. The first also creates reverberation, which causes a reflection phase change and a reflection amplitude change. The second is a consequence of the spatial variability of sound speed, which depends on temperature, salinity, and pressure. These factors vary with depth and location. The final effect is heavily dependent on the signal frequency, as well as the pH level of the water. This dependence is a consequence of absorption, where the signal energy is converted to heat. In addition to the absorption loss, the signal typically experiences a spreading loss, which increases with distance [8]. To correct for the intersymbol interference (ISI) caused by the propagation, the works in [11], [12], [13] used an adaptive equalizer to flexibly compensate for the changes in the channel.

Another distinguishing property of UWAC is the channel's time variability, which has two sources: (1) inherent changes in the propagation medium and (2) transmitter/receiver motion. Inherent changes include long-term changes like seasonal temperatures and instantaneous changes caused by shipping routes and moving water surfaces. These factors result in both a scattering of the signal and a Doppler spreading due to the changing path length [10].
A combination of these factors makes it challenging to build a sufficiently accurate channel model. To combat Doppler shifts, many Doppler scale estimation techniques have been proposed, as seen in [14], [15].

In recent years, ML and more specifically DL have gained recognition for their performance in fields known for their high modelling complexity [16], such as image recognition [17], natural language processing [18], and handwriting analysis [19]. Researchers have now begun to explore the applications of ML and DL in the area of communications. For example, recent research by Wang et al. [20] exploited deep learning to detect signal modulations in underwater channels. Other studies like [21], [22] used deep neural network (NN)-based auto encoders to demodulate received signals. As such, we can expect that applying ML techniques to communication blocks, as a promising solution to the complex channel problem, will yield significant improvements in decoding the physical layer. Works in [23], [24] investigate the use of DL based orthogonal frequency-division multiplexing (OFDM) receivers to recover signals corrupted by the UWAC channel.

In this paper, we explore a receiver system that fundamentally rethinks the traditional communications system design. The receiver system utilizes DBNs, more specifically a greedy layer-wise ML algorithm that is able to automatically learn a new latent representation of the data. The two main functions of the receiver system are (1) de-noising the received signal and (2) classifying the signals into their binary representations. The features extracted by the DBN are considered as the properties of the input data and are formed by considering the output layer of the DBN. For the first aspect of the proposed receiver, we train a DBN such that it learns to extract features of the received signal. The trained DBN distinguishes the features of the segmented pre-processed signal and groups them with the same "clean" framed training data.
Furthermore, the DBN is capable of reconstructing the input data based on their reduced, learned representation. After the DBN, the classification part of our system uses the features, which can be tuned by back-propagation, for classification.

The main contributions of this paper are as follows:

1) Developed a de-noising DBN model. The trained DBN model distinguishes the features extracted from a segmented pre-processed signal. It then groups these features with the same "clean" framed training data.
2) Redesigned the demodulation block using a classifying DBN model that utilizes feature extraction and back-propagation for classification of the received signals.
3) The simulation results show that our proposed DBN based system remains relatively robust against the different characteristics of the UWAC channel. Furthermore, a sea trial was conducted to verify the performance of the proposed DBN-based receiver system in a real-life environment.

The remainder of this paper is organized as follows. Section II describes the communication system, underwater channel, and receiver system models. Section III illustrates the proposed DBN based de-noising technique. Section IV describes the proposed DBN based demodulation technique used in the novel receiver system. Section V discusses the results of the proposed receiver system and the sea trial used for validation. Finally, conclusions are drawn in Section VI.

II. SYSTEM MODEL OVERVIEW

In this section, we describe the proposed end-to-end communication system, represented in Fig. 1, comprising a single transmitter and receiver. The following subsections describe the overall communication system model used to test our proposed system, the derivation of the underwater acoustic channel model, and an overview of the proposed receiver system.

A. Communication System Model

First, let $y(n)$ be the representation of the convoluted transmitted bits, $Y(n)$, and the binary representation of the modulated target transmitted signal $z(t)$ during the $n$-th transmission. A series of transmission symbols $y(n)$ is translated into different transmission signal waveforms $z(t)$ via a Phase Shift Keying (PSK) modulator as described in [25].

Second, let $x(t)$ represent the overall transmitted signal consisting of $z(t)$ and a Hyperbolic Frequency Modulated (HFM) pilot [26]. $x(t)$ is then relayed through a channel model to obtain the received signal $s(t)$:

$$s(t) = H(t) \cdot x(t) + n_o(t) \tag{1}$$

where $H(t)$ is the channel equation, $\cdot$ represents the dot product, and $n_o(t)$ is the Additive White Gaussian Noise (AWGN).

Finally, we generate a set of training data $[z_t, y_t]$, $t = 1, 2, ..., n$, where $z_t$ is a training signal, $y_t$ is the corresponding BPSK binary label vector consisting of 0s and 1s, and $n$ is the number of training signals. Once detection and removal of the HFM pilot is completed, the desired section of the received signal, $s'(t)$, acts as the input for the proposed receiver system. The algorithm builds a model from the training data $L$ such that, for a given $s'(t)$, the trained model is able to predict a reconstructed waveform $\tilde{z}(n)$ and its corresponding label $\tilde{y}(n)$, thereby predicting the received bits $\tilde{Y}(n)$.

B. Underwater Channel

Underwater acoustic communication channels are regarded by researchers as one of the most complex communication channels to model. Multi-path propagation and Doppler effects are recognized as two of the most challenging factors of the underwater acoustic channel [27], [28], [29]. The more common techniques to approximately simulate the underwater acoustic channel vary from signal-to-noise ratio (SNR)-based channel models that rely on empirical equations, as seen in [8], to models that are based on the assumption of Rayleigh signal fading, as in [30], [31].

In this subsection, the channel model used for the simulation is presented, taking into consideration multi-path propagation and the Doppler effect.

1) Multi-path propagation:

Multi-path propagation in the ocean is mostly governed by sound reflecting off underwater surfaces like bubbles and the seabed [32]. These effects cause an elongation of the path traveled, and thus a time delay. The received signal in a multi-path environment can be generally represented as seen in [33], [34]:

$$s(t) = \sum_{i=0}^{N} A(t) \cdot x(t - \tau_i) \tag{2}$$

where $A$ represents the reverberation created by the reflection and scattering. This phenomenon results in a reflection phase change and a propagation loss [32]. As such, Eq. 2 can be further expressed as:

$$s(t) = \sum_{i=0}^{N} a_{a_i} \, a_{b_i}(\theta(t)) \circ x(t - \tau_i) + n_o(t) \tag{3}$$

where $a_{a_i}$ is the amplitude variation caused by the reverberation and fading of the channel, $\circ$ represents the Hadamard product, and $a_{b_i}(\theta(t))$ is the phase variation, modeled as seen in [35]:

$$a_b(\theta(t)) = [\,1 \;\; e^{-j\theta_1(t)} \;\; \dots \;\; e^{-j\theta_{k-1}(t)}\,]$$

where $j$ represents the imaginary unit, $k$ is the length of the signal and $\theta_k$ is the phase shift corresponding to the change in angle.
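The delayed-sum structure of Eq. 2 can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the simulator used in the paper: the path gains are real constants, the delays are whole numbers of samples, the phase term of Eq. 3 is omitted, and the bit-to-phase mapping is arbitrary. The function names `bpsk_waveform`, `multipath` and `add_awgn` and the example gains and delays are hypothetical; the carrier, sampling and bit rates are borrowed from Table V.

```python
import math
import random

def bpsk_waveform(bits, fc, fs, rb):
    """Passband BPSK: bit 1 -> phase 0, bit 0 -> phase pi (assumed mapping)."""
    samples_per_bit = int(fs / rb)
    wave = []
    for k, bit in enumerate(bits):
        phase = 0.0 if bit == 1 else math.pi
        for n in range(samples_per_bit):
            t = (k * samples_per_bit + n) / fs
            wave.append(math.cos(2 * math.pi * fc * t + phase))
    return wave

def multipath(x, gains, delays):
    """Sum of scaled, delayed copies: s[n] = sum_i gains[i] * x[n - delays[i]]."""
    s = [0.0] * (len(x) + max(delays))
    for a, d in zip(gains, delays):
        for n, xn in enumerate(x):
            s[n + d] += a * xn
    return s

def add_awgn(s, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return [v + rng.gauss(0.0, sigma) for v in s]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
x = bpsk_waveform([1, 0, 1, 1], fc=2000.0, fs=40000.0, rb=1000.0)
# Three paths with hypothetical gains and sample delays.
s = add_awgn(multipath(x, gains=[1.0, 0.5, 0.25], delays=[0, 12, 30]), 0.1, rng)
```

With a single path of unit gain and zero delay, `multipath` returns the input unchanged, which is a quick sanity check that the delayed copies are aligned correctly.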

2) Doppler Effect:

In underwater communications, a combination of the low speed of underwater sound propagation and the relative movement of the transmitter and receiver introduces the Doppler effect [8]. Let $\Delta_{rt}$ denote the speed of the relative movement of the transmitter and the receiver, and $f_c$ denote the carrier frequency of the transmitted signal. The carrier frequency at the receiver is given by:

$$f = \left(1 + \frac{\Delta_{rt}}{\Delta_s}\right) f_c \tag{4}$$

where $\Delta_s$ denotes the sound propagation speed underwater. Note that $\Delta_{rt}$ is positive when the receiver is moving toward the transmitter; otherwise $\Delta_{rt}$ is negative.

Fig. 1: End-to-end System Model

In the time domain, the Doppler effect can be construed as a lengthening or compression of the transmitted waveform [36], [37]. It can be depicted as:

$$s(t) = x((1 - \alpha)t) + n_o(t) \tag{5}$$

where $\alpha$ is the Doppler coefficient.

Taking into consideration the above contributing characteristics, the channel model used in this paper is:

$$s(t) = \sum_{i=0}^{N} a_{a_i} \, a_{b_i}(\theta(t)) \circ x((1 - \alpha_i)t - \tau_i) + n_o(t) \tag{6}$$

C. Receiver System Model

The receiver model comprises two blocks: (1) de-noising and (2) demodulation.

In the de-noising component, the input signal $s'(t)$ is first converted and normalized into a pixelized matrix $m(t)$ via the proposed pre-processing method. $m(t)$ is then partitioned into $i$ matrices $m_i(t)$ to meet the requirement of the proposed algorithm for feature extraction. The learned features of the training data are used to find the closest match to the features of $m_i(t)$, which is then used as a basis for reconstruction via:

$$\tilde{z}(t) = \Psi(W \cdot m_i(t) + b) \tag{7}$$

where $\Psi(\cdot)$ is a learning function, and $W$ and $b$ represent the weights and bias of the network.

In the demodulation block, the reconstructed waveform $\tilde{z}$ is classified to a label $\tilde{y}(n)$ via:

$$\tilde{y}(n) = \Phi(\tilde{z}(t)) \tag{8}$$

where $\Phi(\cdot)$ is a learning function.

The focus of this paper is then to optimize the learning functions, $\Psi(\cdot)$ and $\Phi(\cdot)$, and their corresponding weights and biases.

III. PROPOSED DBN-BASED DE-NOISING

In this section, the de-noising algorithm, consisting of both the pre-processing method and the de-noising DBN, is described. An overview of our proposed algorithm is shown in Fig. 2. The input is the received signal $s'(t)$ and the output is the reconstructed signal $\tilde{z}(t)$.

Fig. 2: Overview of De-noising Block

A. Pre-processing

The pixelization method is defined as:

$$m(t) = f(\mathrm{normalized}(s'(t)))$$

The input signal $s'(t)$ is first normalized into the range of 0 to 1. We then proceed to pixelize the signal to form $m(t)$. Let $Pix$ be the number of pixels (length-wise), which controls the resolution of the pixelization. The implemented pixelization algorithm is shown in Algorithm 1. The input and output of the pixelization are shown in Fig. 3a and Fig. 3b respectively.

Lastly, $m(t)$ is resized to various resolutions, as shown in Fig. 3. This allows more features to be extracted and used for the reconstruction.

Fig. 3: Visualization of the pre-processing block: (a) received signal $s'(t)$, a $1 \times n$ matrix; (b) pre-processed signal $m_i(t)$. The diagram presents the received signal (input) and the pixelized matrix $m_i(t)$ (output). The received signal is pre-processed into 4 matrices to provide the DBN based de-noising algorithm with more features to extract.

B. De-noising DBN (stacked RBMs)

DBNs are probabilistic generative algorithms which provide a joint probability distribution over observable data and labels. Restricted Boltzmann Machines (RBMs) are the building blocks of a DBN. Hence, in this section we first briefly describe RBMs and then explore DBNs. Fig. 4 illustrates the concept of stacking RBMs to form a DBN.

Fig. 4: Overview of a DBN consisting of stacked RBMs

A Boltzmann Machine (BM) is a particular form of a Markov Random Field (MRF) whose energy function is linear in its free parameters. Some of its variables (hidden units) are never observed, which allows the machine to represent complicated distributions internally.

Algorithm 1 Pixelization algorithm
Input: s' and Pix
Output: m
  length_s ← length of s'
  m ← ones(Pix, Fl)
  Res ← Pix
  range ← −Res : 0
  for i ← 1 to length_s do
    D ← range' − s'(i)
    A ← min(abs(D))
    loc ← D[A]
    m(loc, i) ← 0
  return m
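One possible reading of Algorithm 1 is sketched below in Python. It is only an interpretation: the printed listing has lost some constants, so the sketch assumes that the quantization levels are evenly spaced from 1 down to 0, that the matrix has `pix` rows, and that the marked pixel is set to 0 in a matrix of ones. The function name `pixelize` is hypothetical.

```python
def pixelize(signal, pix):
    """Map a 1-D signal to a pix x len(signal) matrix of ones with one 0 per column.

    Each column marks the amplitude level closest to the normalized sample.
    Assumes pix >= 2 and evenly spaced levels from 1 (top row) to 0 (bottom row).
    """
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0                      # guard against a flat signal
    norm = [(v - lo) / span for v in signal]     # min-max normalization to [0, 1]
    levels = [1.0 - r / (pix - 1) for r in range(pix)]
    m = [[1] * len(signal) for _ in range(pix)]
    for i, v in enumerate(norm):
        # Row whose level is nearest to the sample (first row wins on ties).
        loc = min(range(pix), key=lambda r: abs(levels[r] - v))
        m[loc][i] = 0
    return m

# Three samples at the bottom, middle and top of the range, five levels.
example = pixelize([0.0, 0.5, 1.0], 5)
```

In this example the single 0 sits in the bottom row for the first column, the middle row for the second, and the top row for the third, so the matrix traces the waveform like a black curve on a white image.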

1) RBMs:

The energy function of the joint configuration in Boltzmann machines is given as follows:

$$E(v, h) = -\sum_{k=1}^{u_v}\sum_{j=1}^{u_h} h_j W_{jk} v_k - \sum_{k=1}^{u_v} b_k v_k - \sum_{j=1}^{u_h} c_j h_j = -h^T W v - b^T v - c^T h \tag{9}$$

where the visible nodes $v$ correspond to the input and $u_v$ is the number of visible nodes, the hidden nodes $h$ represent the latent features and $u_h$ is the number of hidden nodes, $W$ represents the weights linking the nodes of the visible layer to the hidden layer, and $b$ and $c$ are the bias terms of the visible and hidden nodes respectively.

The free energy can also be expressed in the following form:

$$F(v) = -\log \sum_{h} e^{-E(v,h)} = -\log \sum_{h} e^{h^T W v + b^T v + c^T h} = -b^T v - \sum_{j=1}^{u_h} \log \sum_{h_j} e^{h_j (c_j + W_j v)} \tag{10}$$

where $W_j$ denotes the $j$-th row of $W$. Because the visible units are conditionally independent of one another given the hidden units, and vice versa, the following equations hold:

$$P(v \mid h) = \prod_{k=1}^{u_v} p(v_k \mid h) \tag{11}$$

$$P(h \mid v) = \prod_{j=1}^{u_h} p(h_j \mid v) \tag{12}$$

When binary units are used, so that $v_k, h_j \in \{0, 1\}$, a probabilistic version of the usual neural activation is obtained:

$$P(v_k = 1 \mid h) = \mathrm{sigm}(b_k + W_{\cdot k}^T h) \tag{13}$$

$$P(h_j = 1 \mid v) = \mathrm{sigm}(c_j + W_j v) \tag{14}$$

where $W_{\cdot k}$ denotes the $k$-th column of $W$. The free energy of an RBM with binary units becomes

$$F(v) = -b^T v - \sum_{j=1}^{u_h} \log\left(1 + e^{c_j + W_j v}\right) \tag{15}$$

Since RBMs are energy based algorithms, i.e. they associate a scalar energy with each configuration of the variables of interest, training them corresponds to modifying that energy function so that its shape has desirable properties, such as low energy for the desired configurations.

Energy-based probabilistic models define a probability distribution through an energy function, as follows:

$$P(v, h) = \frac{1}{Z} \exp(-E(v, h)) \tag{16}$$

where $Z$ is the partition function, obtained by summing over all joint configurations:

$$Z = \sum_{v} \sum_{h} \exp(-E(v, h)) \tag{17}$$

To optimize the parameters of the network at each layer $k$, the objective in Eq. 18 is minimized via partial differentiation with respect to $W$, $b$, and $c$:

$$g_k(v, h) = -\frac{1}{m} \sum_{j=1}^{m} \log P(v_k^j, h_k^j) \tag{18}$$
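The conditional activations and the free energy of a binary RBM can be checked numerically with a few lines of Python. The sketch below is illustrative only: the weights, biases and input are small hypothetical values (not trained parameters), `W` is stored with one row per hidden unit, and `b` and `c` are taken as the visible and hidden biases, matching the energy function of Eq. 9. The helper `free_energy_brute_force` enumerates every hidden configuration, which is feasible only for a handful of hidden units.

```python
import math
from itertools import product

def sigm(a):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-a))

def p_h_given_v(v, W, c):
    """P(h_j = 1 | v) for every hidden unit j (Eq. 14)."""
    return [sigm(cj + sum(w * x for w, x in zip(row, v))) for row, cj in zip(W, c)]

def p_v_given_h(h, W, b):
    """P(v_k = 1 | h) for every visible unit k (Eq. 13)."""
    return [sigm(b[k] + sum(row[k] * hj for row, hj in zip(W, h)))
            for k in range(len(b))]

def free_energy(v, W, b, c):
    """Closed form for binary hidden units (Eq. 15)."""
    visible_term = sum(bk * vk for bk, vk in zip(b, v))
    hidden_term = sum(math.log1p(math.exp(cj + sum(w * x for w, x in zip(row, v))))
                      for row, cj in zip(W, c))
    return -visible_term - hidden_term

def free_energy_brute_force(v, W, b, c):
    """-log sum_h exp(-E(v, h)), enumerating all binary hidden states (Eq. 10)."""
    total = 0.0
    for h in product((0, 1), repeat=len(c)):
        energy = -sum(h[j] * W[j][k] * v[k]
                      for j in range(len(c)) for k in range(len(b)))
        energy -= sum(bk * vk for bk, vk in zip(b, v))
        energy -= sum(cj * hj for cj, hj in zip(c, h))
        total += math.exp(-energy)
    return -math.log(total)

# Hypothetical toy RBM: 3 visible units, 2 hidden units.
W = [[0.5, -1.0, 0.2], [0.3, 0.4, -0.6]]
b = [0.1, -0.2, 0.3]
c = [0.05, -0.1]
v = [1, 0, 1]
hidden_probs = p_h_given_v(v, W, c)
```

The two free-energy functions agree to numerical precision on this toy example, which is the identity that lets Eq. 10 collapse into the closed form of Eq. 15.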

2) Stacking RBMs into a DBN:

A DBN comprises stacked restricted Boltzmann machines with a fast-learning algorithm that allows the structure to achieve better results with less computational effort.

It models the joint distribution between an observed vector $x$ and $l$ hidden layers $h^k$ as follows:

$$P(x, h^1, ..., h^l) = \left[\prod_{k=0}^{l-2} P(h^k \mid h^{k+1})\right] P(h^{l-1}, h^l) \tag{19}$$

where $x = h^0$, $P(h^k \mid h^{k+1})$ is a conditional distribution for the visible units conditioned on the hidden units of the RBM at level $k$, and $P(h^{l-1}, h^l)$ is the visible-hidden joint distribution (output).

TABLE I: The de-noising DBN structure consists of 2 RBMs.

Input:                 m_i(t)
Structure:
  No. of layers:       2
  Nodes:               [875, 625]
  Activation function: [Sig, Sig]
  Epochs:              1000
Output:                z̃(t)
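How a stack of RBMs turns a frame into features and back into a reconstruction can be sketched in Python as follows. This is a shape-level illustration under stated assumptions: the parameters are random and untrained, the layer sizes (16, 8, 4) are hypothetical stand-ins for the [875, 625] nodes of Table I, and the pass uses deterministic sigmoid activations with the same weights in both directions rather than sampled binary states. All function names are hypothetical.

```python
import math
import random

def sigm(a):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-a))

def make_rbm(n_visible, n_hidden, rng):
    """Randomly initialized RBM parameters (untrained, for illustration only)."""
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_visible)] for _ in range(n_hidden)]
    return {"W": W, "b": [0.0] * n_visible, "c": [0.0] * n_hidden}

def up(rbm, v):
    """Visible -> hidden activation probabilities."""
    return [sigm(cj + sum(w * x for w, x in zip(row, v)))
            for row, cj in zip(rbm["W"], rbm["c"])]

def down(rbm, h):
    """Hidden -> visible activation probabilities, reusing the same weights."""
    return [sigm(rbm["b"][k] + sum(row[k] * hj for row, hj in zip(rbm["W"], h)))
            for k in range(len(rbm["b"]))]

def encode(stack, x):
    """Feed x upward through every RBM; the top activations are the features."""
    for rbm in stack:
        x = up(rbm, x)
    return x

def reconstruct(stack, x):
    """Encode x, then project the features back down to the input layer."""
    h = encode(stack, x)
    for rbm in reversed(stack):
        h = down(rbm, h)
    return h

rng = random.Random(0)
stack = [make_rbm(16, 8, rng), make_rbm(8, 4, rng)]   # two stacked RBMs
frame = [rng.random() for _ in range(16)]             # stand-in for a flattened m_i(t)
features = encode(stack, frame)
restored = reconstruct(stack, frame)
```

In greedy layer-wise training, each RBM in `stack` would be fitted in turn on the output of `up` from the layer below; the sketch leaves that step out and only shows how data flows through the finished stack.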

3) Training:

To train a DBN such that it can perform matrix de-noising, the normalized pixel values of the pixelized signal are used as input. Using min-max normalization, $m_i(t)$ is transformed into floating-point values in the range of 0 to 1. Unlike the first and last layers of the DBN, the hidden layers consist of binary nodes. The main idea is to train a DBN to associate a noisy $m_i(t)$ with an $m_i(t)$ that has lower noise or no noise. This idea can be implemented by learning the features extracted from the noisy and clean $m_i(t)$ contents. These features are then represented by some nodes at the last layer of the network.

The network is trained with a variety of noisy $m_i(t)$ as input and clean $m_i(t)$ as the desired output. Noise nodes are detected using a standard measure called relative activity: the relative activity of a node is defined as the difference between the two values that the node takes when the network is fed a clean $m_i(t)$ and its corresponding noisy $m_i(t)$. As a result, if a particular node is a noise node, it should have a higher relative activity. On the other hand, if it is a clean, noiseless node, it should have a lower relative activity. This is justified by the fact that the activation of the nodes representing $m_i(t)$ should be the same for both the clean $m_i(t)$ and its corresponding noisy $m_i(t)$.

By performing the above for all $m_i(t)$ and averaging the values of the last layer's nodes, the average relative activity of the last layer is computed. The nodes with a higher average relative activity are viewed as noise nodes. Once the noise nodes are discovered, the next step is to lower their activity by selecting the average value of all the noise nodes as their neutral values. As such, the noise nodes are then considered inactive and a clean, noiseless $m_i(t)$ can be reconstructed.

C. Results of DBN based De-noising

In this subsection, we evaluate the proposed DBN based de-noising technique. As a baseline for comparison, we used the conventional MLE method devised in [38] and the de-noising auto encoder in [39]. To analyze only the de-noising capability, the system used was uncoded.

For the following simulation experiments, the simulated BPSK dataset contains 100,000 transmitted signal periods, of which 50% is used for training, 20% for validation and the remaining 30% for testing. The dataset was generated using Eq. 6 and Table II. The $f_c$, sampling frequency $f_s$ and bit rate $R_b$ of the BPSK signals were set at kHz, kHz and kbits/s. The frequency of random change, $f_\delta$, was 2kHz. For consistency, the de-noising auto encoder used as a comparison in this section was trained using the same dataset.

TABLE II: Mean and standard deviation of random distribution for simulated channel parameters

Parameter    µ      σ
a_a
a_b(θ)       π      π
τ            f_s    f_s

First, we conducted a simulation experiment to evaluate the ability of the proposed DBN based de-noising technique to remove noise in channels with extremely high noise. Fig. 5 shows the Bit Error Rate (BER) of the proposed DBN based de-noising technique under the AWGN channel. As a baseline for comparison, we have provided the BER of the MLE and the de-noising auto encoder to highlight the substantial gains for highly negative EbNo. A reason for this could be the existence of noise-invariant properties in the features extracted by the DBN in the proposed de-noising technique. Evidence of this can be seen in the convergence of the algorithm's performance to the baseline as the noise level decreases, which reduces the benefit of the noise-invariant property. At BER of − , the proposed DBN based de-noising technique has a significantly smaller gain of 2.4dB over the de-noising auto encoder and 2.6dB over the MLE.

Fig. 5: De-noising BER accuracy comparison with uncoded BPSK under AWGN channel. For consistency, MLE was used as the demodulation technique for all 3 methods.

Fig. 6 shows a visualization of the de-noising outcome. At EbNo = -30dB, the received signal is highly distorted by the channel noise. However, the proposed technique is still able to partially predict the waveform shape. At EbNo = -5dB, the waveform can be almost perfectly reconstructed.

Fig. 6: Visualization of the proposed DBN based de-noising algorithm under AWGN channel, presenting the received signal (input) and the reconstructed signal (output). The reconstructed signal, depicted by the magenta line, is shown in relation to the transmitted signal (ideal output), depicted in green.

The second simulation experiment tested the algorithm's ability to remain robust against multi-path propagation. The channel-distorted received signals used as test cases in this experiment were modelled by Eq. 2. The number of multi-paths in the dataset was distributed as 40% 1-path, 30% 2-paths and 30% 3-paths. The results of the experiment are shown in Fig. 7. With an increasing number of paths, the proposed DBN based de-noising algorithm achieves considerable BER gains while remaining relatively stable in comparison to the de-noising auto encoder and MLE.

Fig. 7: De-noising BER accuracy comparison with uncoded BPSK under multi-path channel, modelled in Eq. 3. For consistency, MLE was used as the demodulation technique for all 3 methods.

Finally, to test the influence of the Doppler effect on the proposed DBN based de-noising algorithm, a simulation experiment was conducted using Eq. 5 for the channel. Fig. 8 depicts the results under three different scenarios, where the α = 0 . , , . resulting in a f c = 1 kHz, kHz, kHz. The BER of the proposed algorithm is observed to be similar in all three scenarios. This implies that, for a certain range of $\alpha$, the algorithm remains relatively robust to the influence of the Doppler effect.

Fig. 8: De-noising demodulation BER accuracy comparison with uncoded BPSK, modelled in Eq. 5. For consistency, MLE was used as the demodulation technique for all 3 methods.

IV. PROPOSED DBN-BASED DEMODULATION

In this section, the demodulation algorithm, consisting of a classification DBN, is described.

A. Classification DBN

For the classification DBN, the same general stacked energy based RBM algorithm is used as described in Section III-B. The input is the reconstructed signal $\tilde{z}(t)$ and the output consists of the respective binary labels $\tilde{y}(n)$ of 0s and 1s.

B. Determining the structure of Classification DBN

To determine the classification DBN structure, we investigated the influence of different network structures on the performance of the algorithm in the classification task at EbNo = . The best classification BER results were obtained using the [1250, ] structure. Although the structure [1250, ] seems to achieve approximately the same results, the time needed for training is significantly larger.

TABLE III: Effects of training epoch and number of nodes on Classification DBN

Number of units         Results
[Layer 1, Layer 2]      Epoch Number    BER       Time (s)
[650, ]                 200             0.1712    65.79
[650, ]                 500             0.1386    378.2
[650, ]
[1250, ]                200             0.0854    1261
[1250, ]                500             0.0825    5731
[1250, ]
[1250, ]                200             0.0883    5228
[1250, ]                500             0.0844    18674
[1250, ]

To minimize complexity and maximize the performance of the algorithm, we have chosen to use the structure illustrated in Table IV.

TABLE IV: The final hyperparameters used in our proposed classification DBN, which consists of 2 layers of RBMs.

Input:                 z̃(t)
Structure:
  No. of layers:       2
  Nodes:               [1250, 50]
  Activation function: [Sig, Sig]
  Epochs:              1000
Output:                ỹ(t)

C. Results of Classification DBN

In this subsection, we evaluate the proposed classification DBN. As a baseline for comparison, we used the conventional MLE method devised in [38] to illustrate the similar performance of the demodulation techniques.

For the following simulation experiments, the simulated dataset contains 100,000 transmitted signal periods, of which 50% is used for training, 20% for validation, and the remaining 30% for testing. The dataset was generated using an AWGN channel model over a range of EbNo = -10dB to 30dB. For a fair comparison with MLE, the $f_c$ of the BPSK signals is set at 2kHz.

Fig. 9: Demodulation BER accuracy comparison with BPSK and QPSK

Using the MLE as a baseline, this experiment illustrates that the demodulation performance of the proposed classification DBN is similar to that of MLE. The results are shown in Fig. 9. This implies that the classification DBN has learned to extract significant features from the PSK modulation scheme. For a broader comparison, the proposed algorithm is also evaluated on Quadrature Phase Shift Keying (QPSK), derived in [25], without much extra training. As seen, at BER − , the algorithm's performance for QPSK is 0.67dB worse than that of MLE. A more inclusive training dataset for higher-order modulation schemes could increase the performance of the algorithm in this area.

V. RESULTS AND DISCUSSION

This section evaluates the proposed receiver as a whole, as seen in Fig. 1. First, the performance of the receiver is analyzed using the simulated underwater model shown in Eq. 6. Then, the conditions of the conducted sea trial are described. Finally, the collected sea trial data is used to validate the real application of the proposed receiver system.

The data frame of the testing dataset used in both the simulation experiments and sea trials is shown in Fig. 10. The pilot consists of a single up-sweep and a down-sweep HFM signal, which is used for detection of the incoming received data signal. The HFM modulated signal has a bit rate of 50 bits/s and a frequency range of 1-4kHz. The data frame includes 416 bits of coded BPSK modulated signals. The specifications of the data structure are recorded in Table V.

Fig. 10: Transmitted Data Structure

TABLE V: Specifications of the simulated and sea trial data set

Experiment Parameters    Value
Sampling Rate            40 kHz
Bit Rate of HFM          50 bits/s
f_c of HFM               1-4 kHz
Bit Rate of BPSK         1 kbits/s
f_c* of BPSK             1, 2, 3 & 4 kHz

A. Simulation Overall Results

In this subsection, we evaluate the overall proposed receiver system. To assess performance in an underwater environment, we employ 5 systems for evaluation: (1) MLE demodulation, (2) de-noising auto encoder with MLE demodulation, (3) the proposed DBN based receiver, (4) DL orthogonal frequency-division multiplexing (OFDM) [23], and (5) SIC DL [24].

For the following simulation experiments, the training data and channel model used to train the individual parts of the proposed receiver system were the same as stated in Section IV and V. For an equitable comparison, the de-noising auto encoder did not go through any extra training.

The simulated BPSK testing dataset contains 10,000 transmitted signal periods. The dataset was generated using Eq. 6 and the random distributions seen in Table VI. The number of multi-paths in the dataset was distributed as 40% 1-path, 30% 2-paths and 30% 3-paths. Each multi-path cluster contains 60% kHz $f_\delta$ and 40% kHz $f_\delta$. The increase in $f_\delta$ is used to further simulate the complex occurrence of underwater scattering. To fairly evaluate the performance of the proposed receiver against the two systems mentioned above, the $f_c$ of the BPSK signals is set at 2kHz.

TABLE VI: Mean and standard deviation of random distribution for simulated overall channel parameters

Parameter    µ      σ
a_a
a_b(θ)       π      π
α            .
τ            f_s    f_s

In a previous investigation [40], we discovered that the feature extraction ability of the DBN creates a characteristic that is invariant to the influence of the Doppler effect. Therefore, we assume that even though the classification DBN was only trained on $f_c$ = 2 kHz, its performance will not be significantly degraded by the range of $f_c$ used in the testing dataset.

Fig. 11: BER accuracy comparison with coded BPSK and OFDM-BPSK under simulated underwater conditions, modelled by Eq. 6

Fig. 11 depicts the performance of the five systems in the testing scenario described above. The proposed receiver (consisting of both DBN de-noising and DBN classification) is seen to outperform the other algorithms over a large range of EbNo for both uncoded BPSK and OFDM-BPSK. This implies that the proposed receiver system is able to remain invariant to changes in instantaneous amplitude, phase and frequency, such that the shown coding gain can be achieved.

Table VII compares the computational complexity of the five algorithms, where $n$ represents the input size for each function. The results show that our proposed system requires a large amount of training time in comparison to the auto encoder and MLE. However, as shown in Fig. 11, our proposed algorithm outperformed the auto encoder and MLE by 7.8dB and 12dB at EbNo = 5 and EbNo = 0 respectively for the BPSK modulated system.

TABLE VII: Algorithmic Computational Complexity Comparison

Algorithm          Training      Testing
MLE                -             O(n)
Auto-encoder       O(1000n)      O(100n)
DL OFDM [23]       O(19200n)     O(9600n)
DL SIC [24]        O(9600n)      O(4800n)
Proposed System    O(16000n)     O(100n)

B. Sea Trial Set-up

The communication system used in the underwater acoustic sea trial is depicted in Fig.12a. Before transmission, the desired transmitted signal x(t) is converted from digital values to analog sensor signals using a National Instruments-Data Acquisition (NI-DAQ) hardware unit. The signal is then amplified before being transmitted. At the receiver end, the signal is first received by the hydrophone and amplified by ISO-TECH IPS-3303. The corresponding NI-DAQ then translates the analog sensor signal to digital values for the proposed communication system.

In March 2019, a sea trial was conducted in the waters near Selat Pauh, Singapore, where the bottom is muddy and the deepest depth is approximately 25m. The waters are considered to be relatively stationary, with occasional disturbance from the large vessels traveling to the port. In this trial, the distance between the transmitter and receiver and their depth were kept at about 300m and 9m, respectively, with variations of 50m and 1m due to the changing currents. The carrier frequency of the BPSK modulated signal was varied at 1kHz intervals for different trials. The trials were conducted at f_c = 1 kHz, f_c = 2 kHz, f_c = 3 kHz and f_c = 4 kHz. Due to the limitations of the hardware used in the trial, the sampling rate was set at 40 kHz. The specifications of the sea trial are recorded in Table.VIII.

Fig. 12: Sea Trial On-site Set-up. (a) Sea Trial End-to-End System Diagram, (b) Transmitter Set-up, (c) Receiver Set-up

TABLE VIII: Specifications of the Sea Trial

Trial Parameters                              Value
Distance between Transducer & Hydrophone      300m ± 50m
Depth of Transducer                           9m ± 1m

C. Sea Trial Results

In this subsection, the data collected from the sea trial described in Section V-B is used to validate the real application of the proposed receiver system. The results are shown in Table.IX. The estimated average α was calculated using the up and down sweeps of the HFM pilot signal, and the SNR was estimated using MATLAB. For this evaluation, we collected data for 10 trials.

TABLE IX: Sea Trial Data Results and Accuracy Comparison between MLE with Doppler Synchronization and Proposed Receiver System

Trial No.   f_c     SNR        α      BER of MLE + Doppler Sync.   BER of Proposed Receiver System
1           4kHz    -6.081     1.02   0.485                        0.0148
2           2kHz    -1.845     1.00   0.435                        0.0330
3           3kHz    -4.818     1.12   0.490                        0.0093
4           1kHz    -4.638     1.00   0.465                        0.0102
5           4kHz    -21.409    1.09   0.535                        0.0790
6           4kHz    -22.019    0.90   0.495                        0.0857
7           4kHz    -28.468    0.87   0.492                        0.0993
8           2kHz    -23.466    1.01   0.486                        0.0899
9           1kHz    -28.781    1.14   0.502                        0.1240
10          1kHz    -24.951    0.99   0.512                        0.0917
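As a quick arithmetic check on Table IX, the BER columns can be averaged overall and per trial day. The sketch below simply transcribes the table; the day grouping (Trials 1-4, 5-6 and 7-10) follows the discussion of the results, and the helper names `mean` and `summarize` are illustrative only.

```python
# Table IX rows: (trial, f_c in kHz, SNR, alpha,
#                 BER of MLE + Doppler sync., BER of proposed receiver)
TABLE_IX = [
    (1, 4, -6.081, 1.02, 0.485, 0.0148),
    (2, 2, -1.845, 1.00, 0.435, 0.0330),
    (3, 3, -4.818, 1.12, 0.490, 0.0093),
    (4, 1, -4.638, 1.00, 0.465, 0.0102),
    (5, 4, -21.409, 1.09, 0.535, 0.0790),
    (6, 4, -22.019, 0.90, 0.495, 0.0857),
    (7, 4, -28.468, 0.87, 0.492, 0.0993),
    (8, 2, -23.466, 1.01, 0.486, 0.0899),
    (9, 1, -28.781, 1.14, 0.502, 0.1240),
    (10, 1, -24.951, 0.99, 0.512, 0.0917),
]

# Day grouping as stated in the discussion of the sea trial results.
DAYS = {"Day 1": range(1, 5), "Day 2": range(5, 7), "Day 3": range(7, 11)}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def summarize(rows):
    """Average BER of both receivers over the given trials."""
    return {
        "mle": mean(r[4] for r in rows),
        "proposed": mean(r[5] for r in rows),
    }


overall = summarize(TABLE_IX)
per_day = {
    day: summarize([r for r in TABLE_IX if r[0] in trials])
    for day, trials in DAYS.items()
}

for day, s in per_day.items():
    print(f"{day}: MLE+sync {s['mle']:.4f}, proposed {s['proposed']:.4f}")
print(f"All trials: MLE+sync {overall['mle']:.4f}, proposed {overall['proposed']:.4f}")
```

Averaged over all ten trials, the BER is about 0.49 for MLE with Doppler synchronization and about 0.064 for the proposed receiver. The proposed receiver's per-day mean rises from roughly 0.017 on Day 1 to roughly 0.10 on Day 3.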

Trials 1-4 were conducted on Day 1, and the results obtained from the sea trial were significantly better than those seen in Fig.11, most notably in Trial 3, with a coded BER of 0.0093 in comparison to a BER of 0.045. This implies that during Day 1, the complexity of the channel was significantly lower than that of the simulation above. The data collected from Trials 5 and 6 on Day 2 showed a much less pronounced gain, with an improvement of 0.08. On Day 3, while carrying out Trials 7-10, we experienced heavy rain, which resulted in a more complex dataset. As such, the BER seen from the sea trials conducted on Day 3 shows similar performance to the simulated results. Overall, our proposed receiver system maintains a significant performance improvement, reducing the BER from the 0.435 to 0.535 range of the MLE with Doppler sync. to between 0.0093 and 0.1240 (Table.IX).

VI. CONCLUSION AND OUTLOOK

In this paper, we have proposed a novel receiver system that uses DBNs to redesign the de-noising and demodulation techniques for underwater acoustic communications. Our approach has also provided an interesting and important pathway for the application of machine learning techniques to underwater communications systems.

Firstly, although the receiver system only matches the performance of traditional systems in AWGN channels, without significant improvement, it does show better performance in the more realistically simulated underwater channels influenced by Doppler and multi-path. A comparison with the traditional MLE and the promising de-noising auto encoder was completed in various underwater scenarios. These simulated experiments revealed extremely competitive BER performance, with a performance improvement of 13.2dB at 10^{-3} BER. This demonstrates the powerful potential for machine learning to be used in more complex underwater acoustic channels. As a further investigation, we will increase the complexity by accommodating different mixtures of noise, such as Rayleigh noise and exponential noise.

As an additional step, we collected real-life data through a sea trial to analyze the performance of the proposed receiver in a real scenario. The results were promising, with a substantial improvement from a coded BER of roughly 0.5 using the traditional MLE method with Doppler synchronization to coded BERs between 0.0093 and 0.1240 with the proposed receiver (Table.IX). This implies the real possibility of designing machine learning based underwater acoustic communication systems.

Finally, the strength of using DBNs to design our proposed receiver is shown by its seemingly learned ability to comprehend and classify differing sets of received signals. Despite the varying parameters (frequency, amplitude, phase and time frames between each random shift) of the scenarios under which we have chosen to examine the receivers, our proposed receiver has remained relatively invariant, with the largest variation, measured at a fixed uncoded BER, being 5.2dB between the 1-path and 3-path cases. This phenomenon suggests that the DBNs have successfully extracted meaningful features from the signals that could potentially be invariant to the fluctuations of the underwater channel.

REFERENCES

[1] P. B. Sujit, J. Sousa, and F. L. Pereira, “UAV and AUVs coordination for ocean exploration,” in

OCEANS 2009-EUROPE, May 2009, pp. 1–7.
[2] T. Yeu, S. Yoon, S. Park, Sup-Hong, H. Kim, C. Lee, J. Choi, and K. Sung, “Study on path tracking approach for underwater mining robot,” in , May 2012, pp. 1–5.
[3] S. Al-Dharrab, M. Uysal, and T. M. Duman, “Cooperative underwater acoustic communications [accepted from open call],” IEEE Communications Magazine, vol. 51, no. 7, pp. 146–153, July 2013.
[4] T. Rappaport, Wireless communications: Principles and practice. Prentice Hall, 1996.
[5] P. Almers, E. Bonek, A. Burr, N. Czink, M. Debbah, V. Degli-Esposti, H. Hofstetter, P. Kyösti, D. Laurenson, G. Matz, A. Molisch, C. Oestges, and H. Özcelik, “Survey of channel and radio propagation models for wireless MIMO systems,” EURASIP Journal on Wireless Communications and Networking, vol. 2007, Feb 2007.
[6] L. Hong and K. C. Ho, “BPSK and QPSK modulation classification with unknown signal level,” in MILCOM 2000 Proceedings. 21st Century Military Communications. Architectures and Technologies for Information Superiority (Cat. No.00CH37155), vol. 2, Oct 2000, pp. 976–980.
[7] F. K. Jondral, “From Maxwell’s equations to cognitive radio,” in , May 2008, pp. 1–5.
[8] M. Stojanovic and J. Preisig, “Underwater acoustic communication channels: Propagation models and statistical characterization,” IEEE Communications Magazine, vol. 47, no. 1, pp. 84–89, January 2009.
[9] M. Yusof and S. Kabir, “An overview of sonar and electromagnetic waves for underwater communication,” IETE Technical Review, vol. 29, p. 307, 07 2012.
[10] R. J. Urick, Principles of underwater sound, rev. ed. McGraw-Hill New York, 1975.
[11] M. Stojanovic, J. A. Catipovic, and J. G. Proakis, “Phase-coherent digital communications for underwater acoustic channels,” IEEE Journal of Oceanic Engineering, vol. 19, no. 1, pp. 100–111, Jan 1994.
[12] M. Stojanovic, “Recent advances in high-speed underwater acoustic communications,” IEEE Journal of Oceanic Engineering, vol. 21, no. 2, pp. 125–136, April 1996.
[13] D. B. Kilfoyle and A. B. Baggeroer, “The state of the art in underwater acoustic telemetry,” IEEE Journal of Oceanic Engineering, vol. 25, no. 1, pp. 4–27, Jan 2000.
[14] L. Wan, Z. Wang, S. Zhou, T. Yang, and Z. Shi, “Performance comparison of Doppler scale estimation methods for underwater acoustic OFDM,” Journal of Electrical and Computer Engineering, vol. 2012, 2012.
[15] S. Anwar, C. Yuen, H. Sun, Y. L. Guan, and Z. Babar, “A novel receiver design of nonorthogonal FDM systems in underwater acoustics communication,” IEEE Systems Journal, pp. 1–10, 2019.
[16] Z. Zheng, “Analysis and comparison of dimensional reduction based on capture data,” in , April 2010, pp. 163–164.
[17] J. S. Mashford, “A neural network image classification system for automatic inspection,” in Proceedings of ICNN’95 - International Conference on Neural Networks, vol. 2, Nov 1995, pp. 713–717.
[18] D. Herremans and C.-H. Chuan, “The emergence of deep learning: new opportunities for music and audio technologies,” Neural Computing and Applications, 04 2019.
[19] M. M. A. Ghosh and A. Y. Maghari, “A comparative study on handwriting digit recognition using neural networks,” in , Oct 2017, pp. 77–81.
[20] Y. Wang, H. Zhang, Z. Sang, L. Xu, C. Cao, and T. A. Gulliver, “Modulation classification of underwater communication with deep learning network,” Computational Intelligence and Neuroscience, vol. 2019, pp. 1–12, 04 2019.
[21] V. Q. Dang and Y. Pei, “A study on feature extraction of handwriting data using kernel method-based autoencoder,” in , Sep. 2018, pp. 1–6.
[22] T. O’Shea and J. Hoydis, “An introduction to deep learning for the physical layer,” IEEE Transactions on Cognitive Communications and Networking, vol. 3, no. 4, pp. 563–575, Dec 2017.
[23] Y. Z. X. L. Youwen Zhang, Junxuan Li and J. Li, “Deep learning based underwater acoustic OFDM communications,” Applied Acoustics, 2019.
[24] J. Wang, S. Ma, Y. Cui, H. Sun, M. Zhou, B. Wang, J. Li, and L. Liu, “Signal detection for full-duplex cognitive underwater acoustic communications with SIC using model-driven deep learning network,” in , 2019, pp. 1–6.
[25] F. Gardner, “A BPSK/QPSK timing-error detector for sampled receivers,” IEEE Transactions on Communications, vol. 34, no. 5, pp. 423–429, May 1986.
[26] M. Kim, T. Im, Y. Cho, K. Kim, and H. Ko, “HFM design for timing synchronization in underwater communications systems,” in OCEANS 2017 - Aberdeen, June 2017, pp. 1–4.
[27] K. C. H. Blom, H. S. Dol, A. B. J. Kokkeler, and G. J. M. Smit, “Blind equalization of underwater acoustic channels using implicit higher-order statistics,” in , Aug 2016, pp. 1–5.
[28] M. Johnson, L. Freitag, and M. Stojanovic, “Improved Doppler tracking and correction for underwater acoustic communications,” in , vol. 1, April 1997, pp. 575–578.
[29] M. Stojanovic, J. Catipovic, and J. G. Proakis, “Adaptive multichannel combining and equalization for underwater acoustic communications,” Acoustical Society of America Journal, vol. 94, no. 3, pp. 1621–1631, Sep 1993.
[30] J. W. Chavhan and G. G. Sarate, “Channel estimation model for underwater acoustic sensor network,” in , May 2015, pp. 978–981.
[31] M. A. Munoz Gutierrez, P. L. Prospero Sanchez, and J. V. do Vale Neto, “An eigenpath underwater acoustic communication channel simulation,” in Proceedings of OCEANS 2005 MTS/IEEE, Sep. 2005, pp. 355–362, vol. 1.
[32] T. C. Yang, “Characteristics of underwater acoustic communication channels in shallow water,” in OCEANS 2011 IEEE - Spain, June 2011, pp. 1–8.
[33] A. K. Morozov and J. C. Preisig, “Underwater acoustic communications with multi-carrier modulation,” in OCEANS 2006, Sep. 2006, pp. 1–6.
[34] L. M. Wolff, E. Szczepanski, and S. Badri-Hoeher, “Acoustic underwater channel and network simulator,” in , May 2012, pp. 1–6.
[35] Xiao Liu and J. Bousquet, “Acoustic Doppler compensation using feedforward retiming for underwater coherent transmission,” in OCEANS 2015 - MTS/IEEE Washington, Oct 2015, pp. 1–5.
[36] G. Eynard and C. Laot, “Blind Doppler compensation scheme for single carrier digital underwater communications,” in OCEANS 2008, Sep. 2008, pp. 1–5.
[37] W. Pan, P. Liu, F. Chen, F. Ji, and J. Feng, “Doppler-shift estimation of flat underwater channel using data-aided least-square approach,” International Journal of Naval Architecture and Ocean Engineering, vol. 7, 03 2015.
[38] P. Stoica, R. L. Moses, B. Friedlander, and T. Soderstrom, “Maximum likelihood estimation of the parameters of multiple sinusoids from noisy measurements,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 3, pp. 378–392, March 1989.
[39] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning
2019 IEEE VTS Asia Pacific Wireless Communications Symposium (APWCS)