Wiener Filter versus Recurrent Neural Network-based 2D-Channel Estimation for V2X Communications
Moritz Benedikt Fischer, Sebastian Dörner, Sebastian Cammerer, Takayuki Shimizu, Bin Cheng, Hongsheng Lu, Stephan ten Brink
Institute of Telecommunications, Pfaffenwaldring 47, University of Stuttgart, 70569 Stuttgart, Germany, {fischer,doerner,cammerer,tenbrink}@inue.uni-stuttgart.de
InfoTech Lab, Toyota Motor North America, {takayuki.shimizu,bin.cheng,hongsheng.lu}@toyota.com

Abstract. We compare the potential of neural network (NN)-based channel estimation with classical linear minimum mean square error (LMMSE)-based estimators, also known as Wiener filtering. For this, we propose a low-complexity recurrent neural network (RNN)-based estimator that allows channel equalization of a sequence of channel observations based on independent time- and frequency-domain long short-term memory (LSTM) cells. Motivated by Vehicle-to-Everything (V2X) applications, we simulate time- and frequency-selective channels with orthogonal frequency division multiplex (OFDM) and extend our channel models in such a way that a continuous degradation from line-of-sight (LoS) to non-line-of-sight (NLoS) conditions can be emulated. It turns out that the NN-based system can not only compete with the LMMSE equalizer, but can also be trained to be resilient against system parameter mismatch. We thereby showcase the conceptual simplicity of such a data-driven system design: it not only enables more robustness against, e.g., signal-to-noise-ratio (SNR) or Doppler spread estimation mismatches, but also allows the same equalizer to be used over a wider range of input parameters without re-building (or re-estimating) the filter coefficients. Particular attention has been paid to ensuring compatibility with the existing IEEE 802.11p piloting scheme for V2X communications. Finally, feeding the payload data symbols as an additional equalizer input unlocks further performance gains. We show significant gains over conventional LMMSE equalization for highly dynamic channel conditions if such a data-augmented equalization scheme is used.
I. INTRODUCTION
In wireless communications, spectrally efficient data transmission strongly depends on the availability of precise knowledge of the current channel state information (CSI). Thus, the CSI needs to be estimated at the receiver, either explicitly or implicitly. In particular, for channels that are time- and frequency-selective due to motion and multi-path propagation, respectively, channel estimation becomes a key challenge in today's OFDM-based transceiver implementations. Although channel estimation is a well-known field of classical signal processing-based research, its performance is limited by the following challenges: (a) a mismatch between the mathematically tractable model and the actual physical channel, i.e., an inaccuracy of the underlying channel models, (b) computational complexity and also the induced latency of the algorithms, and (c) robustness w.r.t. the system parameter estimation such as SNR or velocity. All of this leads to a discrepancy between the theoretically optimal solution and the actually achieved performance of channel equalization when deployed in a real system.

This work has been supported by Toyota Motor North America.

Although the channel changes continuously, the number of transmitted pilots used for channel estimation should be as low as possible to maximize the transmission rate. Consequently, the channel state in between pilot positions must be interpolated. Several such pilot-aided channel estimation methods have evolved. They essentially differ in the quality of this interpolation, but also in the assumptions they require about the channel, i.e., how the channel is correlated over time and frequency. A simple channel estimator is the least squares (LS) estimator with bilinear interpolation. The 2D-LMMSE estimator is a more advanced linear filter, which yields an optimal estimate of the channel under certain conditions [1], [2]; however, it is computationally complex and relies on knowledge of the channel's statistical properties, which boils down to the task of estimating the channel covariance matrix. To reduce the complexity, it can be replaced by two sequentially executed 1D-LMMSE estimators [3].

Recently, deep learning for communications has attracted a lot of attention in academia and also in industry for virtually any possible application [4], [5], [6]. This new paradigm of a data-centric system design allows learning equalizers that are perfectly aware of the channel statistics, including all of its (potential) impairments. In [7] an NN structure for channel estimation is derived from the minimum mean square error (MMSE) filter.
The main focus of [8] lies on the use of additional metadata to obtain a more accurate channel estimate. Further, learning an efficient pilot arrangement together with a channel estimator for massive multiple-input multiple-output (MIMO) has been done in [9]. A joint demapping and decoding scheme is presented in [10] for a complete message frame. Further investigations on joint estimation and demapping, as well as the transition towards end-to-end learning of the whole system, including learning of the pilot arrangement and superimposed pilots, are presented in [11]. It has been observed in [10] and [11] that a data-aware channel estimation can further improve the system's performance by analyzing the payload data in addition to the specific pilot positions. Finally, the authors of [12] propose an NN-based OFDM receiver that implicitly estimates the required CSI.

Fig. 1: Tapped-delay line channel model with the delay of the first channel tap set to zero.

The main focus of our work is to showcase and analyze the universality and flexibility of NN-based solutions. For this, we seek to train our network over a wide range of possible input parameters. Motivated by the result in [13], which showed that RNNs are an attractive architecture for sequential data processing in communications, a novel RNN-based channel estimator is presented in this work. The presented channel estimation scheme is evaluated over time- and frequency-selective channels and compared to conventional channel estimation techniques such as 2D-LMMSE and 2x1D-LMMSE. For this, we use the piloting scheme as given in the IEEE 802.11p standard [14].
By investigating the influence of channel parameters such as velocity, noise and strength of a possible LoS connection on the accuracy of the estimated channel, we study whether it is beneficial to use an RNN-based channel estimator instead of conventional channel estimation schemes. Besides this, we analyze to what extent gains are possible while relying on inaccurate knowledge of the statistical properties and parameters of the channel. In Sec. IV, we show that a data-augmented version of the same RNN equalizer can extract further statistical information about the channel state by observing the received payload data symbols.

II. SYSTEM SETUP
A. Channel model
We assume a time- and frequency-selective wireless channel described by its time-variant channel impulse response

$$h(t,\tau) = \sum_{\ell=1}^{L} a_\ell(t)\,\delta(\tau - \tau_\ell) \tag{1}$$

where $L$ describes the number of channel taps, $a_\ell(t)$ denotes the complex-valued gain with an average power $p_\ell$ and $\tau_\ell$ represents the delay of the $\ell$-th channel tap. Each channel tap corresponds to one resolvable subpath of all paths in the multi-path environment, which is common for V2X communications. If transmitter and receiver are physically separated and there is no line-of-sight between them, the channel gain $a_\ell(t)$ is zero-mean complex Gaussian distributed, resulting in Rayleigh fading channel taps. This is a well-suited assumption if all received signal components are reflected or diffracted by objects of the environment. For further details we refer the reader to [15], [16]. In case there is a strong LoS component, it is more appropriate to model the first channel tap as Rician fading. The $K$-factor of the model is defined as the ratio between the power of the LoS component and the power of all the remaining resolvable multi-path components. We follow the simulation framework as described in [17].

Throughout this work, we assume OFDM with cyclic prefix (CP), which enables single-tap equalization. Let $X_{k,n}$ be the $k$-th complex symbol that should be transmitted over the $n$-th sub-carrier, where $k \in \mathbb{N}$ is the index of discrete time and $n \in \{0, \ldots, N_\mathrm{Sub}-1\}$ is the sub-carrier over which the symbol should be transmitted. The symbol $X_{k,n}$ can either be a data carrying symbol or a pilot symbol that can be utilized for the channel estimation. The pilot arrangement for the IEEE 802.11p standard is schematically shown in Fig. 2. The mapping of the symbols $X_{k,n}$ onto the orthogonal sub-carriers can be expressed by an inverse discrete Fourier transform (IDFT), and a CP is used to eliminate intersymbol interference (ISI). Let $\mathbf{X}_k = (X_{k,0}, X_{k,1}, \ldots, X_{k,N_\mathrm{Sub}-1})^T$ denote the complex-valued vector containing the symbols per sub-carrier of the $k$-th OFDM symbol, and let $\mathbf{s}_k = (s_{k,0}, s_{k,1}, \ldots, s_{k,N_\mathrm{Sub}-1})^T$ be the complex vector containing the $N_\mathrm{Sub}$ output samples of the IDFT, representing the $k$-th OFDM symbol in the time domain. For simplicity, we neglect the CP in our description.

Fig. 2: IEEE 802.11p pilot arrangement (sub-carrier $n$ over discrete time $k$; pilot, data and null positions).

If the maximal delay of the channel is smaller than the duration of the CP, the received symbols $\mathbf{r}_k$ in the time domain after the removal of the CP are given by

$$\mathbf{r}_k = \mathbf{h}_k \circledast \mathbf{s}_k + \mathbf{n}_k \tag{2}$$

where $\mathbf{h}_k$ is the sampled version of the channel impulse response $h(t,\tau)$, $\mathbf{n}_k$ denotes an additive white Gaussian noise (AWGN) term and $\circledast$ denotes the circular convolution. The received symbols in the frequency domain are defined as

$$\mathbf{Y}_k = \mathbf{H}_k \mathbf{X}_k + \mathbf{N}_k \tag{3}$$

where $\mathbf{H}_k$ is the sampled channel transfer function, i.e., the discrete Fourier transform (DFT) of the sampled channel impulse response $\mathbf{h}_k$. In the following we assume that the individual entries of $\mathbf{N}_k$ are independent and identically complex Gaussian distributed with zero mean and a variance of $\sigma^2$. Hence,

$$\mathbf{N}_k \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I}_{N_\mathrm{Sub}}) \tag{4}$$

where $\mathbf{I}_{N_\mathrm{Sub}}$ is the identity matrix of size $N_\mathrm{Sub} \times N_\mathrm{Sub}$. At the receiver, the symbols $\mathbf{Y}_k$ are equalized by scaling each sub-carrier such that

$$\hat{\mathbf{Y}}_k = \frac{\mathbf{Y}_k \hat{\mathbf{H}}_k^{*}}{|\hat{\mathbf{H}}_k|^2}. \tag{5}$$

For OFDM frames of $n_\mathrm{T}$ OFDM symbols, (3) can be extended to

$$\mathbf{Y} = \mathbf{H} \circ \mathbf{X} + \mathbf{N} \tag{6}$$
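where $\mathbf{Y} \in \mathbb{C}^{n_\mathrm{T} \times n_\mathrm{F}}$ is the received symbol matrix, $\mathbf{H} \in \mathbb{C}^{n_\mathrm{T} \times n_\mathrm{F}}$ is the channel matrix, $\mathbf{X} \in \mathbb{C}^{n_\mathrm{T} \times n_\mathrm{F}}$ is the transmitted symbol matrix and $\mathbf{N} \in \mathbb{C}^{n_\mathrm{T} \times n_\mathrm{F}}$ is the AWGN matrix. Furthermore, $\circ$ denotes the Hadamard product, i.e., the element-wise multiplication.

Fig. 3: Block diagram of the 2D-Wiener filter and the RNN-based channel estimator. (a) 2D-Wiener filter: input $\mathbf{Y}_\mathrm{p}$ and the model properties $v$, $K$, $\sigma^2$; output $\hat{\mathbf{H}}_\mathrm{LMMSE}$. (b) RNN: inputs $\mathbf{Y}_\mathrm{p}$ and $\mathbf{Y}$, trained on data; output $\hat{\mathbf{H}}_\mathrm{RNN}$.

B. Wiener filter baseline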
The Wiener filter as shown in Fig. 3a, also denoted as LMMSE estimator, is a linear filter that minimizes the mean squared error (MSE) between the true channel and the estimated channel. To this end, it utilizes the received pilot symbols, an estimated SNR and knowledge of the statistical properties of the channel. More precisely, it needs to know the channel's autocorrelation function and the noise variance. For the derivation of the 2D-Wiener filter, (6) is rewritten in vectorized form as

$$\bar{\mathbf{Y}} = \bar{\mathbf{X}}\,\bar{\mathbf{H}} + \bar{\mathbf{N}} \tag{7}$$

$$\bar{\mathbf{Y}} = \mathrm{diag}(\mathrm{vec}(\mathbf{X}))\,\mathrm{vec}(\mathbf{H}) + \mathrm{vec}(\mathbf{N}) \tag{8}$$

where $\bar{\mathbf{X}}$ is a diagonal matrix with all transmitted symbols on the main diagonal and $\bar{\mathbf{Y}}$, $\bar{\mathbf{H}}$ and $\bar{\mathbf{N}}$ are the vectorized versions of the received symbols, the channel matrix and the AWGN matrix, respectively (the bar only distinguishes these vectorized quantities from the matrices in (6)). In the same way, the received pilot symbols are defined as

$$\mathbf{Y}_\mathrm{p} = \mathbf{X}_\mathrm{p}\mathbf{H}_\mathrm{p} + \mathbf{N}_\mathrm{p} \tag{9}$$

$$\mathbf{Y}_\mathrm{p} = \mathrm{diag}(\boldsymbol{\Pi}\,\mathrm{vec}(\mathbf{X}))\,\boldsymbol{\Pi}\,\mathrm{vec}(\mathbf{H}) + \boldsymbol{\Pi}\,\mathrm{vec}(\mathbf{N}) \tag{10}$$

where $\boldsymbol{\Pi} \in \mathbb{N}^{p \times n_\mathrm{T} n_\mathrm{F}}$ is a pilot selection matrix and $p$ is the number of pilots distributed over the OFDM frame. The pilot selection matrix contains a single one per row to select the pilot position; all other elements of a row are set to zero.

Following [1] and [2], the channel estimate of the 2D-Wiener filter is given by

$$\hat{\mathbf{H}}_\mathrm{LMMSE} = \mathbf{W}\,\mathbf{X}_\mathrm{p}^{-1}\,\mathbf{Y}_\mathrm{p} \tag{11}$$

with the filter matrix $\mathbf{W}$ defined as

$$\mathbf{W} = \mathbf{R}_{HH}\boldsymbol{\Pi}^T \left( \boldsymbol{\Pi} \left( \mathbf{R}_{HH} + \sigma^2 \mathbf{I}_{n_\mathrm{T} n_\mathrm{F}} \right) \boldsymbol{\Pi}^T \right)^{-1}. \tag{12}$$

Note that the multiplication of $\mathbf{X}_\mathrm{p}^{-1}$ with $\mathbf{Y}_\mathrm{p}$ is the LS estimate of the channel at the pilot positions.

Fig. 4: Block diagram of the RNN-based channel estimator. The additional input $\mathbf{Y}$ is only used in Sec. IV for data-augmented channel estimation.

$\mathbf{R}_{HH}$ denotes the autocorrelation matrix of the channel, which we assume to be a separable composition of the temporal and the spectral autocorrelation of the channel. It can therefore be expressed as the Kronecker product [18]

$$\mathbf{R}_{HH} = \mathbf{R}_\mathrm{T} \otimes \mathbf{R}_\mathrm{F}. \tag{13}$$

This is a valid assumption for wide-sense stationary uncorrelated scattering (WSSUS) channel models. In general, the autocorrelation of unknown channels has to be determined empirically.

The temporal autocorrelation matrix $\mathbf{R}_\mathrm{T}$ and the spectral autocorrelation matrix $\mathbf{R}_\mathrm{F}$ are both Toeplitz matrices with their elements given by the temporal autocorrelation function

$$r_\mathrm{T}(k T_\mathrm{S}) = J_0(2\pi f_\mathrm{D,max} T_\mathrm{S} k) \tag{14}$$

and the spectral autocorrelation function

$$r_\mathrm{F}(m \Delta f) = \sum_{l=0}^{L-1} p_l\, e^{j 2\pi \tau_l \Delta f m}, \tag{15}$$

respectively. Here, $T_\mathrm{S}$ is the duration of an OFDM symbol with CP, $f_\mathrm{D,max} = \frac{v}{c} f_\mathrm{c}$ denotes the maximal Doppler frequency of the channel, $v$ denotes the velocity, $c$ is the wave propagation speed, i.e., the speed of light, $f_\mathrm{c}$ is the carrier frequency, $\Delta f$ is the sub-carrier spacing and $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind.

To reduce the computational complexity of the 2D-Wiener filter, it can be replaced by a 2x1D-Wiener filter, in which the two-dimensional channel estimation problem is separated into two one-dimensional tasks that are solved one after the other. The first Wiener filter interpolates the channel over the time axis, and the second Wiener filter estimates the channel over the frequency axis. For further details we refer the reader to [3], [1] and [2].

As can be seen from (12), (14) and (15) and as illustrated in Fig. 3a, the filter matrices of the 2D-Wiener filter and the 2x1D-Wiener filter depend on several parameters of the channel, including velocity, noise power and $K$-factor. This additional knowledge of the channel parameters has to be provided to the Wiener filters by external parameter estimators.
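As an illustration of (11) to (15), the following NumPy sketch builds the filter matrix for a small toy frame. It is a minimal sketch under stated assumptions, not the simulation code behind the results: the frame size, the pilot indices, the four equally weighted taps, and the values `8e-6` s and `156.25e3` Hz used for $T_\mathrm{S}$ and $\Delta f$ are illustrative choices, the pilots are assumed to have unit modulus, the channel is assumed to be pure NLoS ($K = 0$) with unit average power, and $J_0$ is evaluated from its integral representation so that only NumPy is needed.

```python
import numpy as np

C_LIGHT = 3.0e8  # wave propagation speed in m/s (rounded value, assumption)


def max_doppler(v_mps, f_c):
    """Maximal Doppler frequency f_D,max = v / c * f_c."""
    return v_mps / C_LIGHT * f_c


def bessel_j0(x, num=512):
    """J_0(x) from (1/pi) * int_0^pi cos(x sin(theta)) d(theta), midpoint rule."""
    theta = (np.arange(num) + 0.5) * np.pi / num
    x = np.asarray(x, dtype=float)
    return np.mean(np.cos(x[..., None] * np.sin(theta)), axis=-1)


def lag_matrix(n):
    """Matrix of index differences i - j, which gives the Toeplitz structure."""
    idx = np.arange(n)
    return idx[:, None] - idx[None, :]


def wiener_2d(n_t, n_f, pilot_idx, sigma2, f_d_max, t_s, delta_f, taus, powers):
    """Return the filter matrix W of (12) and R_HH of (13)."""
    r_t = bessel_j0(2 * np.pi * f_d_max * t_s * lag_matrix(n_t))      # (14)
    phase = 2j * np.pi * delta_f * taus[:, None, None] * lag_matrix(n_f)[None]
    r_f = np.sum(powers[:, None, None] * np.exp(phase), axis=0)       # (15)
    r_hh = np.kron(r_t, r_f)                                          # (13)
    sel = np.zeros((len(pilot_idx), n_t * n_f))                       # Pi
    sel[np.arange(len(pilot_idx)), pilot_idx] = 1.0
    gram = sel @ (r_hh + sigma2 * np.eye(n_t * n_f)) @ sel.T
    return r_hh @ sel.T @ np.linalg.inv(gram), r_hh


# Toy example; all numbers below are illustrative assumptions.
n_t, n_f = 4, 6
pilot_idx = np.array([0, 5, 12, 17, 23])      # flat indices k * n_f + n
taus = np.linspace(0.0, 500e-9, 4)            # hypothetical tap delays
powers = np.full(4, 0.25)                     # equally weighted, unit total power
f_d = max_doppler(300 / 3.6, 5.9e9)           # 300 km/h at 5.9 GHz
W, R_hh = wiener_2d(n_t, n_f, pilot_idx, 10 ** (-15 / 10), f_d,
                    8e-6, 156.25e3, taus, powers)

h_ls = np.ones(len(pilot_idx), dtype=complex)  # placeholder LS estimates at pilots
H_hat = (W @ h_ls).reshape(n_t, n_f)           # estimate (11) on the full grid
```

The row-major flattening (index $k\,n_\mathrm{F} + n$) is chosen here so that `np.kron(r_t, r_f)` matches the ordering implied by (13); a column-major `vec` would swap the two Kronecker factors.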
Furthermore, every time one of these channel parameters changes, the filter matrices have to be recalculated.

III. RNN-BASED CHANNEL ESTIMATION
A. Neural network structure and training
The structure of the proposed RNN-based channel estimator is illustrated in Fig. 4 and mainly consists of three bidirectional recurrent LSTM cells, which consecutively process the input data in the frequency, then the time and then again the frequency dimension. This keeps the computational complexity feasible. We use recurrent NNs to exploit temporal and frequency correlations of the input data, but instead of simple recurrent neurons, we use LSTM units to circumvent the inherent vanishing gradient problem [19].

The RNN only takes the received pilot symbols $\mathbf{Y}_\mathrm{p}$ and the actually sent (known) pilot symbols $\mathbf{X}_\mathrm{p}$ as inputs for the two-dimensional channel estimation task. Only in Sec. IV, we additionally provide all the received symbols $\mathbf{Y}$ (including $\mathbf{Y}_\mathrm{p}$) to the RNN for data-augmented equalization. This is indicated by the red box in Fig. 4.

No explicit assumptions on channel statistics and noise variance are fed to the RNN; all of this information is implicitly provided through the data-driven training approach (cf. Fig. 3b). This is also the reason why we additionally feed $\mathbf{Y}$ (which includes random noisy payload data symbols) to the RNN (only in Sec. IV), as this allows the NN to better learn and estimate these channel statistics by itself. The first preprocessing step of the RNN-based estimator is the calculation of the LS estimate at all pilot positions, $\hat{\mathbf{H}}_\mathrm{p,LS} = \mathbf{X}_\mathrm{p}^{-1}\mathbf{Y}_\mathrm{p}$. Thereafter, the resulting vector is reshaped into a matrix $\hat{\mathbf{H}}_\mathrm{p,LS}$ of target shape $\mathbb{C}^{n_\mathrm{T} \times n_\mathrm{F}}$, and zeros are inserted at positions where data is transmitted. Similar to [10], all inputs of shape $\mathbb{C}^{n_\mathrm{T} \times n_\mathrm{F}}$, including the pilot mask and the positions in time and frequency dimension, are then stacked into one large input tensor.

After a complex- to real-valued conversion, the input data is processed by a bidirectional LSTM cell in frequency direction (i.e., vertically in the resource grid in Fig. 2). This means that the frequency axis of the data is used as the LSTM cell's time dimension.
The output of the initial LSTM cell is thus an estimate for each symbol that depends only on the input data for the specific symbol and on the incoming cell states from decisions on previous and subsequent sub-carrier symbols. After the initial LSTM cell, the outputs are reshaped and the time and frequency dimensions are transposed, so that the second LSTM cell can now operate in the actual time domain (i.e., horizontally in the resource grid in Fig. 2). This second LSTM cell thereby has no direct connection in the frequency domain, as its states only traverse through the time domain. Finally, the second cell's output is again reshaped and transposed so that a third LSTM cell can operate in the frequency domain again.

The third LSTM cell's output is then further processed by two time-distributed dense layers (TDDLs), i.e., dense layers that are applied to each time-frequency step separately with the same shared weights. The number of densely connected neurons in the first layer equals twice the number of LSTM units used in the third cell, as the only purpose of this rectified linear unit (ReLU) activated dense layer is to combine the bidirectional LSTM cell's forward and backward output. The second TDDL consists of two linearly activated neurons, which finally provide an estimate of the real and imaginary part of $H_{k,n}$. We chose this 3x1D RNN structure to minimize the NN's complexity for a single time-frequency step as much as possible, with only recurrent state information traversing in the time or frequency dimension. This approach renders the architecture highly scalable for all kinds of input data dimensions. Throughout this work, we use the proposed structure with 64 LSTM units per cell.

During training of the RNN-based channel estimator, stochastic gradient descent (SGD) and back-propagation through time (BPTT) are applied. Since the channel estimation problem can be classified as a regression task, we use the MSE loss; further training parameters are summarized in Tab. I.

TABLE I: Training parameters
Parameter                                          Value
Epochs                                             100
Iterations per epoch                               1000
Batch size                                         200
Learning rate                                      0.001
Velocities                                         0 km/h to 300 km/h
SNR                                                5 dB to 30 dB
K-factors                                          0 to 5
Fraction of pure NLoS scenarios in training set    50%
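The reshape-and-transpose logic of the 3x1D structure described above can be sketched in plain NumPy. This is only a shape-level illustration under loudly stated assumptions: a simple bidirectional tanh recurrence stands in for the bidirectional LSTM cells, the weights are random and untrained, only 8 units are used instead of 64, and the three input features per resource element (real part, imaginary part, pilot mask) are a hypothetical reduced input. All function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)


def init_rnn(n_in, n_units):
    """Random weights for one direction of a simple tanh recurrence."""
    s = 1.0 / np.sqrt(n_in + n_units)
    return {"w_x": rng.normal(0.0, s, (n_in, n_units)),
            "w_h": rng.normal(0.0, s, (n_units, n_units)),
            "b": np.zeros(n_units)}


def run_rnn(p, x):
    """x: (batch, steps, n_in) -> hidden states (batch, steps, n_units)."""
    h = np.zeros((x.shape[0], p["b"].size))
    out = []
    for t in range(x.shape[1]):
        h = np.tanh(x[:, t] @ p["w_x"] + h @ p["w_h"] + p["b"])
        out.append(h)
    return np.stack(out, axis=1)


def bidir(p_fwd, p_bwd, x):
    """Concatenate a forward pass and a time-reversed backward pass."""
    fwd = run_rnn(p_fwd, x)
    bwd = run_rnn(p_bwd, x[:, ::-1])[:, ::-1]
    return np.concatenate([fwd, bwd], axis=-1)


def along_axis(params, x, axis):
    """Run the recurrence over time (axis=1) or frequency (axis=2).

    x has shape (batch, n_t, n_f, features); the other grid axis is folded
    into the batch, so no state is exchanged along it.
    """
    if axis == 1:
        x = x.transpose(0, 2, 1, 3)            # -> (batch, n_f, n_t, features)
    b, rows, steps, feat = x.shape
    y = bidir(*params, x.reshape(b * rows, steps, feat))
    y = y.reshape(b, rows, steps, -1)
    return y.transpose(0, 2, 1, 3) if axis == 1 else y


def estimate(x, layers, dense1, dense2):
    h = along_axis(layers[0], x, axis=2)       # frequency direction
    h = along_axis(layers[1], h, axis=1)       # time direction
    h = along_axis(layers[2], h, axis=2)       # frequency direction again
    h = np.maximum(h @ dense1[0] + dense1[1], 0.0)   # first TDDL, ReLU
    out = h @ dense2[0] + dense2[1]                  # second TDDL, linear
    return out[..., 0] + 1j * out[..., 1]            # real and imaginary part


units, feats = 8, 3
layers = [(init_rnn(n_in, units), init_rnn(n_in, units))
          for n_in in (feats, 2 * units, 2 * units)]
dense1 = (rng.normal(0.0, 0.1, (2 * units, 2 * units)), np.zeros(2 * units))
dense2 = (rng.normal(0.0, 0.1, (2 * units, 2)), np.zeros(2))

x = rng.normal(size=(2, 5, 7, feats))          # (batch, n_t, n_f, features)
h_hat = estimate(x, layers, dense1, dense2)    # complex grid of shape (2, 5, 7)
```

Folding the non-recurrent grid axis into the batch dimension is what keeps the per-step cost small: each pass only carries state along one axis, and the dense layers share their weights across all time-frequency positions.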
Note that, to obtain a universal channel estimator, the variety in the training data has to be high. For this, the RNN is trained with velocities and SNR values that are uniformly distributed within the given intervals. In contrast, the $K$-factors used during training are not uniformly distributed; instead, the training data is divided into two halves. One half of each training batch is created with NLoS conditions, i.e., $K = 0$, and the other half is created with uniformly distributed $K$-factors with $K \in (0, 5]$.

B. Simulation results
We consider the V2X wireless channel with a carrier frequency of 5.9 GHz. The parameters assumed for the simulations are guided by the IEEE 802.11p standard [14] and are shown in detail in Tab. II. A pilot arrangement as depicted in Fig. 2 is used.

TABLE II: Simulation parameters
Parameter                                          Value
Carrier frequency $f_\mathrm{c}$                   5.9 GHz
Bandwidth $B$                                      10 MHz
OFDM symbol duration with CP $T_\mathrm{S}$        ... µs
Cyclic prefix duration $T_\mathrm{CP}$             ... µs
Sub-carrier spacing $\Delta f$                     ... kHz
Number of sub-carriers $N_\mathrm{Sub}$            ...
Number of channel taps $L$                         ...
Maximum delay $\tau_\mathrm{max}$                  500 ns
Power delay profile (PDP)                          equally weighted
Fading statistics                                  Rayleigh or Rician

Fig. 5: MSE comparison for different velocities at an SNR of 15 dB and a $K$-factor of $K = 0$ (MSE in dB over velocity $v$ in km/h). The influence of up to ±...% velocity estimation mismatch on the 2D-LMMSE is visualized by the shaded red area around the LMMSE channel-matched curve.

In Fig. 5 to Fig. 8 the novel RNN-based channel estimator is compared with the 2D-Wiener filter and the 2x1D-Wiener filter (with reduced computational complexity) for varying velocities, SNR values and $K$-factors. The Wiener filters are recalculated for every evaluation point, whereas the RNN-based channel estimator does not change for varying channel parameters. Besides the Wiener filters based on accurate knowledge of the statistical properties and parameters of the channel (channel-matched LMMSE), more practicable estimated versions are also included (channel-mediated LMMSE) that are valid over a wider range of input parameters. Their filter matrices are estimated over ... channel realizations for uniformly distributed velocities between ... km/h and ... km/h, SNR values between ... and ..., and $K$-factors between $K = 0$ and $K = 5$. They are assumed to be applicable for a range of channel parameters and, therefore, a recalculation of the Wiener filters at runtime is no longer needed. To summarize, the following filters are shown:
• Channel-matched LMMSE with genie-aided (perfect) parameter knowledge at the receiver. We show the 2D-LMMSE version and its 2x1D simplification for scenarios where it provides further insights.
• Channel-mediated LMMSE with estimated parameters (averaged over many channel realizations), in an attempt to obtain a universal LMMSE estimator applicable over a wide range of parameters, serving as the best possible conventional competitor to the (later) universal RNN approach.
• RNN universal, trained over a wide range of input parameters, which does not require any other explicit parameter knowledge during inference.

Fig. 6: MSE comparison for different SNRs (MSE in dB over SNR in dB). The influence of up to ±...% SNR estimation mismatch on the 2D-LMMSE is visualized by the shaded red area around the LMMSE channel-matched curve.

The results reveal that the RNN-based channel estimator is universally applicable over a wide range of parameters and can adapt to varying and also difficult channel conditions, e.g., high user velocities as shown in Fig. 5. It can be seen that a high velocity turns channel estimation into a difficult task and, thus, a general degradation of all estimators can be observed due to the high time-selectivity of the channel. Somewhat to our surprise, the RNN-based channel estimator can compete with the channel-matched LMMSE equalizer without being fed a velocity estimate. This can be intuitively explained by the practice of training over a wide range of different channel parameters, thereby forcing the RNN to inherently learn to handle these effects solely based on the received channel observations. Note that the visible degradation of the 2x1D LMMSE performance is mainly caused by this standard's pilot positioning scheme.

Fig. 7: MSE comparison for different $K$-factors, from NLoS to LoS, at a velocity of ... km/h and an SNR of 15 dB.

Fig. 6 analyzes the impact of the channel SNR, and the red region around the LMMSE equalizer shows the influence of a ±... SNR parameter estimation mismatch. We want to emphasize that the RNN-based channel estimator has an inherent advantage compared to the Wiener filter, as it only takes received channel observations as input and no further parameter estimation is required. Neither a recalculation during runtime nor a separate parameter estimation block is needed. This reduces its design complexity compared to a 2D-Wiener filter (assuming a continuous adaptation of the filter coefficients to newly estimated parameters).
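The sensitivity of a Wiener filter to an SNR estimation mismatch can be made concrete with a small analytic sketch. It is a simplified illustration, not the experiment behind Fig. 6: it assumes a one-dimensional frequency-only filter with unit-modulus pilots, four equally weighted taps up to 500 ns, a sub-carrier spacing of `156.25e3` Hz, and a mismatch expressed as an offset in dB; all of these values are illustrative. For an estimate $\hat{\mathbf{H}} = \mathbf{W}(\mathbf{H}_\mathrm{p} + \mathbf{N}_\mathrm{p})$ the error covariance is $\mathbf{R}_{HH} - \mathbf{W}\mathbf{R}_{\mathrm{p}H} - \mathbf{R}_{H\mathrm{p}}\mathbf{W}^H + \mathbf{W}(\mathbf{R}_\mathrm{pp} + \sigma^2\mathbf{I})\mathbf{W}^H$, evaluated with the true noise variance while $\mathbf{W}$ is designed with the assumed one.

```python
import numpy as np


def spectral_autocorr(n_f, delta_f, taus, powers):
    """Toeplitz R_F built from (15); entry (i, j) uses the lag m = i - j."""
    m = np.arange(n_f)[:, None] - np.arange(n_f)[None, :]
    phase = 2j * np.pi * delta_f * taus[:, None, None] * m[None, :, :]
    return np.sum(powers[:, None, None] * np.exp(phase), axis=0)


def wiener(r_hh, pilots, sigma2_assumed):
    """Filter matrix (12) for the given pilot indices and assumed noise variance."""
    r_hp = r_hh[:, pilots]                     # R_HH Pi^T
    r_pp = r_hh[np.ix_(pilots, pilots)]        # Pi R_HH Pi^T
    return r_hp @ np.linalg.inv(r_pp + sigma2_assumed * np.eye(len(pilots)))


def mse(w, r_hh, pilots, sigma2_true):
    """Average error power of W (H_p + N_p) under the true channel statistics."""
    r_hp = r_hh[:, pilots]
    r_pp = r_hh[np.ix_(pilots, pilots)]
    err = (r_hh - w @ r_hp.conj().T - r_hp @ w.conj().T
           + w @ (r_pp + sigma2_true * np.eye(len(pilots))) @ w.conj().T)
    return np.real(np.trace(err)) / r_hh.shape[0]


# Illustrative setup (assumptions, not the values of Tab. II).
taus = np.linspace(0.0, 500e-9, 4)
powers = np.full(4, 0.25)
r_f = spectral_autocorr(16, 156.25e3, taus, powers)
pilots = np.array([0, 5, 10, 15])

snr_true_db = 15.0
sigma2_true = 10 ** (-snr_true_db / 10)
mse_db = {}
for offset_db in (-5.0, 0.0, 5.0):             # assumed SNR minus true SNR
    sigma2_assumed = 10 ** (-(snr_true_db + offset_db) / 10)
    w = wiener(r_f, pilots, sigma2_assumed)
    mse_db[offset_db] = 10 * np.log10(mse(w, r_f, pilots, sigma2_true))
```

Because the matched filter minimizes the MSE among all linear estimators, `mse_db[0.0]` is the smallest of the three values; how large the penalty for a mismatch is depends on the pilot spacing and the channel statistics.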
The universality of the RNN and the absence of a separate parameter estimation lead to the conjecture that the RNN implicitly learns to estimate these parameters. This is also supported by the observation that it outperforms both channel-mediated methods, which also only use the received pilot symbols without additional knowledge of the exact parameters. In cases where the 2D-Wiener filter is known to be optimal, the RNN can compete with it and is, in the most relevant regions, on par with the 2D-Wiener filter in terms of MSE performance.

In Fig. 7 we now vary the channel properties by increasing the $K$-factor of our model. It is instructive to realize that this implies that the LoS components increasingly dominate the channel's behavior. Note that this does not necessarily render the problem more difficult (LoS conditions are often much easier to handle; thus, the overall performance also improves for all estimators with increasing $K$), but it is simply unexpected by the equalizer and, thus, difficult to handle due to the model mismatch. In other words, we have designed the equalizer for the wrong channel conditions, which is beneficial for the RNN, as it extracts the model only from the training data itself. We want to emphasize that this experiment operates the equalizers under mismatched conditions on purpose, i.e., we want to showcase the universality of the RNN-based system. Furthermore, when evaluating the performance for varying $K$-factors and $K$-factors larger than zero, as shown in Fig. 7, the 2D-Wiener filter is no longer optimal. Although it is still the best conventional channel estimation method for this task, the RNN is able to outperform the 2D-Wiener filter for the given mismatch. However, the importance of this experiment is not necessarily found in the pure performance gain, but also in the simplicity of the system design.

These gains in universality are also visible in Fig. 8, where the RNN's MSE performance is compared to a mismatched LMMSE estimator, which assumes $K = 0$, $v = 150$ km/h and an SNR of 15 dB. The shaded areas visualize regions for which the RNN provides a significantly better performance than the best LMMSE version. For readability, other estimators are not shown here. It can be clearly seen that the LMMSE equalizer performs slightly better if all parameters are well known; however, when introducing a mismatch of SNR, velocity or $K$-factor, the RNN clearly outperforms the classical approaches.

Fig. 8: MSE comparison for the mismatch of the velocity, for the mismatch of the SNR and for the mismatch of the $K$-factor (based on a velocity of 150 km/h, an SNR of 15 dB and a $K$-factor of 0). Curves: 2D LMMSE mediated ($K$), 2D LMMSE 15 dB, 2D LMMSE $v = 150$ km/h, 2D LMMSE $K = 0$, RNN universal (SNR), RNN universal ($v$), RNN universal ($K$); the shaded areas mark the universality gain.

Fig. 9: MSE comparison for different velocities at randomly chosen $K$-factors between 0 and 5 (at least 20% $K = 0$) and randomly chosen SNRs in the range between ... and .... The RNN-based equalizer now uses $\mathbf{Y}$ as additional input, i.e., data-augmented equalization is used.

Fig. 10: Bit error rate (BER) vs. $E_\mathrm{b}/N_0$ for quadrature amplitude modulation (QAM) constellations of $m = 2$ and $m = 4$ bit per complex symbol after soft-demapping and 40 iterations of belief propagation (BP) decoding. Curves: 2D LMMSE channel-mediated and RNN universal (data-augmented), each for $m = 2$ and $m = 4$.

IV. DATA-AUGMENTED EQUALIZATION
We now extend the RNN equalizer by additionally providing the data symbols (without involving any decision feedback) to the equalizer, similar to [10], [11], and thereby realize a data-aware equalization. This means we simply enable the additional input of $\mathbf{Y}$ in Fig. 4 (red box) and re-train the system. Note that we only re-use existing data that must be available at the receiver anyway but remains unused in the classical linear equalization schemes. The observed gains (shown in Fig. 9 and Fig. 10 and discussed in the following) can intuitively be explained by the fact that the data symbols, although unknown at the receiver, can only occur at specific positions of the regular QAM constellation grid. This information is typically not used; however, the conceptual elegance of NNs makes it easy to extract further statistical information about the current channel state even from data suffering from high uncertainty (unknown data bits and additional channel noise). So far, the gains of the RNN-based equalizer have been limited and could be found more in terms of universality than in terms of MSE performance gains when compared to the LMMSE equalizer. However, this experiment underlines the conceptual simplicity of such a data-driven system design w.r.t. re-using additional data sources. We simply re-train the system with the additional data input and do not need any reconsideration of the system setup itself. We empirically observe that this data augmentation provides significant additional gains in highly dynamic situations such as high velocity scenarios, which would otherwise require more pilots.

Note that we only provide the raw input $\mathbf{Y}$ without involving any further decisions, as is common for classical decision feedback and iterative channel estimation and decoding algorithms. However, the scheme could be extended to also account for decision feedback from the forward error correction, as done in [20].

Fig. 9 shows the MSE performance at varying velocities in a scenario with $K$-factors randomly chosen between 0 and 5 with at least 20% of NLoS conditions ($K = 0$) and the SNR randomly chosen between ... and 30 dB. Since all channel parameters are randomly chosen, Fig. 9 only shows the channel-mediated estimators, with filter coefficients optimized for the given parameter ranges, as conventional baseline. Clearly, the additional input of the received data symbols $\mathbf{Y}$ helps the RNN to track the variations of the channel, and it outperforms both comparable channel-mediated 2x1D and 2D LMMSE baseline systems over the whole range of investigated velocities.

To further highlight these advantages in terms of resilience and a more accurate channel estimation, we now examine whether the observed MSE gains in channel estimation carry over to the more tangible metric of final BER performance. To this end, we use $\hat{\mathbf{H}}$ for a posteriori probability (APP) soft-demapping and add an outer irregular low-density parity-check (LDPC) code of rate $r = \ldots$, length $n = 1296$ bit and 40 iterations of BP decoding. We also evaluate the BER performance in the more realistic channel scenario with $K$ randomly chosen from 0 to 5 with at least 20% of NLoS ($K = 0$) realizations and the velocity randomly chosen between 0 and 300 km/h at varying SNR. Fig. 10 shows the BER performance vs. $E_\mathrm{b}/N_0$ for randomly generated bits, modulated with quadrature phase shift keying (QPSK) ($m = 2$ bit per complex symbol) and 16-QAM ($m = 4$) constellations. As can be seen, the RNN's generally improved estimate $\hat{\mathbf{H}}_\mathrm{RNN}$ in terms of MSE also leads to improved final BER performance. This shows that soft-values demapped using $\hat{\mathbf{H}}_\mathrm{RNN}$ provide more information to the BP decoder at a given SNR than soft-values demapped using the conventional LMMSE estimate. Finally, in this specific scenario with varying parameters, we observe gains of up to 1 dB for $m = 2$ and even higher gains of up to 1.4 dB for $m = 4$.
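How a channel estimate enters the soft-demapper can be sketched for the QPSK case. The snippet below is a minimal illustration under stated assumptions, not the demapper used for Fig. 10: it assumes a Gray-mapped QPSK constellation $(\pm 1 \pm j)/\sqrt{2}$ where bit 0 maps to the positive axis value, equally likely bits, complex noise of variance $\sigma^2$, and it treats the estimate $\hat{H}$ as if it were the true channel. The function names are hypothetical. With these assumptions, the single-tap equalization (5) followed by the exact per-bit log-likelihood ratio gives $2\sqrt{2}\,|\hat{H}|^2\,\mathrm{Re}\{\hat{Y}\}/\sigma^2$ for the first bit and the same expression with the imaginary part for the second bit.

```python
import numpy as np


def equalize(y, h_hat):
    """Single-tap equalization (5): scale each received symbol by H^* / |H|^2."""
    return y * np.conj(h_hat) / np.abs(h_hat) ** 2


def qpsk_llr(y, h_hat, sigma2):
    """Per-bit LLRs log P(b=0 | y) / P(b=1 | y) for Gray-mapped QPSK.

    Assumes bit 0 -> +1/sqrt(2) and bit 1 -> -1/sqrt(2) on each axis and
    treats h_hat as the true channel (estimation error is ignored).
    """
    z = equalize(y, h_hat)
    scale = 2.0 * np.sqrt(2.0) * np.abs(h_hat) ** 2 / sigma2
    return np.stack([scale * z.real, scale * z.imag], axis=-1)


# Noise-free toy example with a random channel (illustrative values only).
rng = np.random.default_rng(1)
bits = rng.integers(0, 2, size=(100, 2))
x = ((1 - 2 * bits[:, 0]) + 1j * (1 - 2 * bits[:, 1])) / np.sqrt(2)
h = (rng.normal(size=100) + 1j * rng.normal(size=100)) / np.sqrt(2)
y = h * x

llr = qpsk_llr(y, h_hat=h, sigma2=0.1)
bits_hat = (llr < 0).astype(int)               # hard decision from the LLR sign
```

The factor $|\hat{H}|^2/\sigma^2$ weights each soft value by the per-sub-carrier SNR, which is why an inaccurate $\hat{H}$ affects not only the sign but also the reliability information passed to the BP decoder.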
The larger gain for the higher constellation order is also expected, as higher-order constellations require more accurate channel estimates to ensure good symbol and bit decisions.

V. CONCLUSION AND OUTLOOK
We have shown that NN-based estimation for time- and frequency-selective channels exhibits robust behavior w.r.t. inaccurate channel parameter knowledge. It turned out that NN-based channel estimation can compete with the (close to) optimal LMMSE estimator for a wide range of channel parameters and can even outperform the baseline in scenarios with inaccurate parameter estimates. When compared to the baseline without perfect velocity estimation and evaluated on a more realistic channel with LoS components, we observed significant gains. We have kept full compatibility with the IEEE 802.11p piloting scheme; however, extensions to other piloting schemes are straightforward. Finally, we have showcased the possibility of data-aware equalization, which is able to extract further statistical information about the channel from the payload data sequence, leading to significant MSE and consequently BER performance gains.

It remains open for future work to evaluate (and potentially train) the system with real-world data from larger measurement campaigns, in particular from V2X scenarios. We want to emphasize that the main contribution of this work is not necessarily found in plain performance improvements for fixed channel setups, but in showcasing the conceptual simplicity of such a data-driven and RNN-based system design.

REFERENCES

[1] R. Nilsson, O. Edfors, M. Sandell, and P. O. Borjesson, “An analysis of two-dimensional pilot-symbol assisted modulation for OFDM,” IEEE, 1997, pp. 71–74.
[2] P. Hoeher, S. Kaiser, and P. Robertson, “Two-dimensional pilot-symbol-aided channel estimation by Wiener filtering,” vol. 3, IEEE, 1997, pp. 1845–1848.
[3] X. Dong, W.-S. Lu, and A. C. Soong, “Linear interpolation in pilot symbol assisted channel estimation for OFDM,” IEEE Transactions on Wireless Communications, vol. 6, no. 5, pp. 1910–1920, 2007.
[4] T. O’Shea and J. Hoydis, “An Introduction to Deep Learning for the Physical Layer,” IEEE Trans. Cogn. Commun. Netw., vol. 3, no. 4, pp. 563–575, Dec. 2017.
[5] N. Farsad and A. Goldsmith, “Neural network detection of data sequences in communication systems,” IEEE Trans. on Signal Process., vol. 66, no. 21, pp. 5663–5678, 2018.
[6] E. Nachmani, Y. Be’ery, and D. Burshtein, “Learning to decode linear codes using deep learning,” in Allerton Conf. IEEE, 2016, pp. 341–346.
[7] D. Neumann, T. Wiese, and W. Utschick, “Learning the MMSE channel estimator,” IEEE Transactions on Signal Processing, vol. 66, no. 11, pp. 2905–2917, 2018.
[8] C. Luo, J. Ji, Q. Wang, X. Chen, and P. Li, “Channel state information prediction for 5G wireless communications: A deep learning approach,” IEEE Transactions on Network Science and Engineering, 2018.
[9] M. B. Mashhadi and D. Gunduz, “Pruning the pilots: Deep learning-based pilot design and channel estimation for MIMO-OFDM systems,” arXiv preprint arXiv:2006.11796, 2020.
[10] M. Honkala, D. Korpi, and J. M. Huttunen, “DeepRx: Fully convolutional deep learning receiver,” arXiv preprint arXiv:2005.01494, 2020.
[11] F. A. Aoudia and J. Hoydis, “End-to-end learning for OFDM: From neural receivers to pilotless communication,” arXiv preprint arXiv:2009.05261, 2020.
[12] H. Ye, G. Ye Li, and B. Juang, “Power of Deep Learning for Channel Estimation and Signal Detection in OFDM Systems,” IEEE Wireless Communications Letters, 2018.
[13] D. Tandler, S. Dörner, S. Cammerer, and S. ten Brink, “On recurrent neural networks for sequence-based processing in communications,” IEEE, 2019, pp. 537–543.
[14] IEEE Standards Association et al., “802.11p-2010-IEEE standard for information technology-local and metropolitan area networks-specific requirements-part 11: Wireless LAN medium access control (MAC) and physical layer (PHY) specifications amendment 6: Wireless access in vehicular environments,” 2010.
[15] A. F. Molisch, Wireless Communications, 2nd ed. John Wiley & Sons Ltd., 2016.
[16] G. L. Stüber, Principles of Mobile Communication, 4th ed. Springer International Publishing, 2017.
[17] C. Xiao, Y. R. Zheng, and N. C. Beaulieu, “Statistical simulation models for Rayleigh and Rician fading,” in IEEE International Conference on Communications, 2003. ICC ’03., vol. 5, 2003, pp. 3524–3529.
[18] M. Šimko, C. Mehlführer, M. Wrulich, and M. Rupp, “Doubly dispersive channel estimation with scalable complexity,” 2010, pp. 251–256.
[19] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, pp. 1735–1780, Dec. 1997.
[20] S. Cammerer, F. A. Aoudia, S. Dörner, M. Stark, J. Hoydis, and S. ten Brink, “Trainable communication systems: Concepts and prototype,”