Channel Estimation and Hybrid Combining for Wideband Terahertz Massive MIMO Systems

Abstract

Terahertz (THz) communication is widely considered as a key enabler for future 6G wireless systems. However, THz links are subject to high propagation losses and inter-symbol interference due to the frequency selectivity of the channel. Massive multiple-input multiple-output (MIMO) along with orthogonal frequency division multiplexing (OFDM) can be used to deal with these problems. Nevertheless, when the propagation delay across the base station (BS) antenna array exceeds the symbol period, the spatial response of the BS array varies across the OFDM subcarriers. This phenomenon, known as beam squint, renders narrowband combining approaches ineffective. Additionally, channel estimation becomes challenging in the absence of combining gain during the training stage. In this work, we address the channel estimation and hybrid combining problems in wideband THz massive MIMO with uniform planar arrays. Specifically, we first introduce a low-complexity beam squint mitigation scheme based on true-time-delay. Next, we propose a novel variant of the popular orthogonal matching pursuit (OMP) algorithm to accurately estimate the channel with low training overhead. Our channel estimation and hybrid combining schemes are analyzed both theoretically and numerically. Moreover, the proposed schemes are extended to the multi-antenna user case. Simulation results are provided showcasing the performance gains offered by our design compared to standard narrowband combining and OMP-based channel estimation.



Konstantinos Dovelos, Michail Matthaiou, Hien Quoc Ngo, and Boris Bellalta


Index Terms: Beam squint effect, compressive channel estimation, hybrid combining, massive MIMO, planar antenna arrays, wideband THz communication.

I. INTRODUCTION

Spectrum scarcity is the main bottleneck of current wireless systems. As a result, new frequency bands and signal processing techniques are needed to deal with this spectrum gridlock. In view of the enormous bandwidth available at Terahertz (THz) frequencies, communication over the THz band is considered as a key technology for future 6G wireless systems [1]. More particularly, the THz band, spanning from . to THz, offers bandwidths orders of magnitude larger than the millimeter wave (mmWave) band. For example, the licensed bandwidth in the mmWave band is usually up to GHz whilst that of the THz band is at least GHz [2]. On the other hand, as the frequency increases, the signals experience much more severe path attenuation compared to their mmWave and microwave counterparts, according to the Friis transmission formula. Thanks to the very short wavelength of THz signals, though, a very large number of antennas can be tightly packed into a small area to form a massive multiple-input multiple-output (MIMO) transceiver, and effectively compensate for the propagation losses by means of beamforming [3]. Therefore, THz massive MIMO is expected to be a key enabler for ultra-high-speed networks, such as terabit wireless personal/local area networks and femtocells [4].

[Footnote: K. Dovelos and B. Bellalta are with the Department of Information and Communication Technologies, Pompeu Fabra University, Barcelona, Spain. M. Matthaiou and H. Q. Ngo are with the Institute of Electronics, Communications and Information Technology (ECIT), Queen's University Belfast, Belfast, U.K.]

Despite the promising performance gains of THz massive MIMO systems, the wideband transmissions in conjunction with the large array aperture, with respect to the symbol period, give rise to spatial-frequency wideband (SFW) effects [5]. Specifically, the channel response becomes frequency-selective not only because of the delay spread of the multi-path channel, but also due to the large propagation delay across the array aperture [6]. As a result, the response of the BS array can be frequency-dependent even in a line-of-sight (LoS) scenario. When orthogonal frequency division multiplexing (OFDM) modulation is employed to combat inter-symbol interference, the spatial-wideband effect causes the direction-of-arrival (DoA) and direction-of-departure (DoD) of the signals to vary across the subcarriers. This phenomenon, termed beam squint, calls for frequency-dependent beamforming/combining, which is not available in a typical hybrid array architecture of THz massive MIMO. More particularly, narrowband beamforming/combining approaches can substantially reduce the array gain across the OFDM subcarriers, hence leading to performance degradation [7]. Consequently, beam squint compensation is of paramount importance for THz massive MIMO-OFDM systems.

Since accurate channel state information (CSI) is essential to effectively apply combining and/or beam squint mitigation, channel estimation under SFW effects is another important problem to address. Specifically, in the absence of combining gain during channel estimation, the detection of the paths present in the channel becomes challenging in the low signal-to-noise ratio (SNR) regime. Additionally, due to the massive number of BS antennas and the limited number of radio frequency (RF) chains in a hybrid array architecture, the channel estimation overhead becomes excessively large even for single-antenna users under standard approaches, such as the least squares (LS) method. In conclusion, THz massive MIMO brings new challenges in the signal processing design, and calls for carefully tailored solutions that take into account the unique propagation characteristics in THz bands.

A. Prior Work

In this section, we review prior work on channel estimation and hybrid beamforming in wideband mmWave/THz systems. The authors in [8] proposed a novel single-carrier transmission scheme for THz massive MIMO, which utilizes minimum mean-square error precoding and detection. Nevertheless, a narrowband antenna array model was considered, and hence the SFW effect was ignored. A stream of recent papers on wideband mmWave MIMO-OFDM systems (see [9]–[12], and references therein) proposed methods to jointly optimize the analog combiner and the digital precoder in order to maximize the achievable rate under the beam squint effect. In a similar spirit, [13] and [14] proposed a new analog beamforming codebook with wider beams to avoid the array gain degradation due to beam squint. These methods can enhance the achievable rate when the beam squint effect is mild. However, their performance becomes poor in THz MIMO systems due to the much larger signaling bandwidth and number of BS antennas compared to their mmWave counterparts [17]. To this end, [15] proposed a wideband codebook for beam training for uniform linear arrays (ULAs) using true-time-delay (TTD) [16]. However, this design is limited to ULAs and to beam alignment without explicitly estimating the channel. From the relevant literature on hybrid beamforming, we distinguish [17], which proposed a TTD-based hybrid beamformer for THz massive MIMO, however assuming ULAs and perfect CSI.

Despite the importance of channel estimation, there are only a few recent works in the literature investigating the channel estimation problem under the spatial-wideband effect. More particularly, the seminal paper [5] introduced the SFW for mmWave massive MIMO systems, and proposed a channel estimation algorithm by capitalizing on the asymptotic properties of SFW channels. However, the proposed algorithm relies on multiplying the channel of an $N$-element uniform linear array by an $N$-point discrete Fourier transform (DFT) matrix, and hence entails high training overhead when the number of RF chains is much smaller than the number of BS antennas. In a similar spirit, [18] employed the orthogonal matching pursuit (OMP) algorithm along with an energy-focusing preprocessing step to estimate the SFW channel, while minimizing the power leakage effect. Finally, [19] leveraged tools from compressive sensing (CS) theory to tackle the channel estimation problem in frequency-selective multiuser mmWave MIMO systems but in the absence of the spatial-wideband effect.

B. Contributions

In this paper, we address the channel estimation and hybrid combining problems in wideband THz MIMO. To this end, we assume OFDM modulation, which is the most popular transmission scheme over frequency-selective channels. The main contributions of the paper are summarized as follows:

• We model the SFW effect in THz MIMO-OFDM systems with a uniform planar array (UPA) at the BS. Note that prior studies (e.g., [20], [21]) on mmWave/THz communication with UPAs ignore the SFW effect. We next show that frequency-flat combining leads to substantial performance losses due to the severe beam squint effect occurring across OFDM subcarriers, and propose a beam squint compensation strategy using TTD [22] and virtual array partition. The purpose of the virtual array partition is to reduce the number of TTD elements needed to effectively mitigate beam squint. To this end, we derive the wideband combiner expression for a rectangular planar array, and establish its near-optimal performance with respect to fully-digital combining analytically, as well as through computer simulations.

• We propose a solution to the channel estimation problem under the SFW effect. Specifically, by exploiting the channel sparsity in the angular domain, we first adopt a sparse representation of the THz channel, and formulate the channel estimation problem as a CS problem. We then propose a solution based on the OMP algorithm, which is one of the most common and simple greedy CS methods. Contrary to existing works, we employ a wideband dictionary and show that channels across different OFDM subcarriers share a common support. This enables us to apply a variant of the simultaneous OMP algorithm, coined as generalized simultaneous OMP (GSOMP), which exploits the information of multiple subcarriers to increase the probability of successfully recovering the common support. We also evaluate the computational complexity of the GSOMP to showcase its efficiency with respect to the OMP. Numerical results show that the propounded estimator outperforms the OMP-based estimator in the low and moderate SNR regimes, whilst achieving the same accuracy in the high SNR regime.

• We analyze the mean-square error of the GSOMP scheme by providing the Cramér-Rao lower bound (CRLB). Moreover, we calculate the average achievable rate assuming imperfect channel gain knowledge at the BS. We then show numerically that when the angle quantization error involved in the sparse channel representation is negligible, the performance of the GSOMP-based estimator is very close to the CRLB. Additionally, the average achievable rate approaches that of the perfect channel knowledge case at moderate and high SNR values, hence corroborating the good performance of our design. Finally, we extend our analysis to the case of a multi-antenna user, and discuss the benefits of deploying multiple antennas at the user side.

The rest of this paper is organized as follows: Section II introduces the system and channel models. Section III describes the hybrid combining problem under the beam squint effect, and presents the proposed combining scheme. Section IV formulates the channel estimation problem, introduces the standard estimation methods, and explains the propounded algorithm for estimating the SFW channel. Section V extends the analysis to the multi-antenna user case. Section VI is devoted to numerical simulations. Finally, Section VII summarizes the main conclusions derived in this work.

Notation: Throughout the paper, $D_N(x) = \frac{\sin(Nx/2)}{N\sin(x/2)}$ is the Dirichlet sinc function; $\mathbf{A}$ is a matrix; $\mathbf{a}$ is a vector; $a$ is a scalar; $\mathbf{A}^{\dagger}$, $\mathbf{A}^{H}$, and $\mathbf{A}^{T}$ are the pseudoinverse, conjugate transpose, and transpose of $\mathbf{A}$, respectively; $\mathbf{A}(i)$ is the $i$th column of matrix $\mathbf{A}$; $\mathbf{A}(\mathcal{I})$ is the submatrix containing the columns of $\mathbf{A}$ given by the indices set $\mathcal{I}$; $|\mathcal{I}|$ is the cardinality of set $\mathcal{I}$; $\mathrm{tr}\{\mathbf{A}\}$ is the trace of $\mathbf{A}$; $\mathrm{blkdiag}(\mathbf{A}_1, \dots, \mathbf{A}_n)$ is the block diagonal matrix; $[\mathbf{A}]_{n,m}$ is the $(n,m)$th element of matrix $\mathbf{A}$; $\mathcal{F}\{\cdot\}$ denotes the continuous-time Fourier transform; $*$ denotes convolution; $\mathrm{Re}\{\cdot\}$ is the real part of a complex variable; $\mathbf{1}_{N \times M}$ is the $N \times M$ matrix with unit entries; $\mathbf{I}_N$ is the $N \times N$ identity matrix; $[\mathbf{v}]_n$ is the $n$th entry of vector $\mathbf{v}$; $\mathrm{supp}(\mathbf{v}) = \{n : [\mathbf{v}]_n \neq 0\}$ is the support of $\mathbf{v}$; $\otimes$ denotes the Kronecker product; $\odot$ is the element-wise product; $\|\mathbf{a}\|_1$ and $\|\mathbf{a}\|_2$ are the $\ell_1$-norm and $\ell_2$-norm of vector $\mathbf{a}$, respectively; $\delta(\cdot)$ is the Kronecker delta function; $\mathbb{E}\{\cdot\}$ denotes expectation; and $\mathcal{CN}(\boldsymbol{\mu}, \mathbf{R})$ is a complex Gaussian vector with mean $\boldsymbol{\mu}$ and covariance matrix $\mathbf{R}$.

Fig. 1: Illustration of the BS antenna array and its geometry considered in the system model. (a) Uplink setup. (b) Array geometry.

TABLE I
MAIN NOTATION USED IN THE SYSTEM MODEL

Notation | Description
$N_B$ | Number of BS antennas
$N_{RF}$ | Number of RF chains
$S$ | Number of subcarriers
$f_s$ | Frequency of the $s$th subcarrier
$B$ | Total signal bandwidth
$L$ | Number of NLoS paths
$\alpha_l(f)$ | Frequency-selective attenuation of the $l$th path
$\tau_l$ | ToA of the $l$th path
$(\phi_l, \theta_l)$ | DoA of the $l$th path
$\tau_{l,nm}$ | Time delay to the $(n,m)$th BS antenna over the $l$th path
$\tau_{nm}(\phi_l, \theta_l)$ | Time delay from the $(0,0)$th to the $(n,m)$th BS antenna
$x(t)$ | Baseband-equivalent of transmitted signal
$x(f)$ | Fourier transform of $x(t)$
$x_l(t)$ | Distorted version of $x(t)$ over the $l$th path
$\tilde{r}_{nm}(t)$ | Passband signal received by the $(n,m)$th BS antenna
$r_{nm}(t)$ | Baseband-equivalent of $\tilde{r}_{nm}(t)$
$r_{nm}(f)$ | Fourier transform of $r_{nm}(t)$
$d$ | Antenna spacing
$f_c$ | Carrier frequency
$c$ | Speed of light
$k_{abs}$ | Molecular absorption coefficient
$D$ | Distance between the BS and the user
$\Gamma_l(f)$ | Reflection coefficient of the $l$th NLoS path

II. SYSTEM MODEL

Consider the uplink of a THz massive MIMO system where the BS is equipped with an $N \times M$-element UPA, and serves a single-antenna user as depicted in Fig. 1(a); the multi-antenna user case is investigated in Section V. The total number of BS antennas is $N_B = NM$, and the baseband frequency response of the uplink channel is denoted by $\mathbf{h}(f) \in \mathbb{C}^{N_B \times 1}$. In the sequel, we present the channel and hybrid transceiver models used in this work.

A. THz Channel Model with Spatial-Wideband Effects

Due to limited scattering in THz bands, the propagation channel is represented by a ray-based model of $L+1$ rays [21], [23]. Hereafter, we assume that the $0$th ray corresponds to the LoS path, while the remaining $l = 1, \dots, L$ rays are non-line-of-sight (NLoS) paths. Specifically, each path $l = 0, \dots, L$ is characterized by its frequency-selective path attenuation $\alpha_l(f)$, time-of-arrival (ToA) $\tau_l$, and DoA $(\phi_l, \theta_l)$, where $\phi_l \in [-\pi, \pi]$ and $\theta_l \in [-\frac{\pi}{2}, \frac{\pi}{2}]$ are the azimuth and polar angles, respectively. In the far-field region of the BS antenna array, the total delay between the user and the $(n,m)$th BS antenna through the $l$th path, $\tau_{l,nm}$, is calculated as

$$\tau_{l,nm} = \tau_l + \tau_{nm}(\phi_l, \theta_l), \tag{1}$$

where $\tau_{nm}(\phi_l, \theta_l)$ accounts for the propagation delay across the BS array, and is measured with respect to the $(0,0)$th BS antenna. For a UPA placed in the $xy$-plane (see Fig. 1(b)), we then have [24]

$$\tau_{nm}(\phi_l, \theta_l) \triangleq \frac{d\,(n \sin\theta_l \cos\phi_l + m \sin\theta_l \sin\phi_l)}{c}, \tag{2}$$

where $d$ is the antenna separation, and $c$ is the speed of light. The channel frequency response is derived as follows. Let $x(t)$ be the baseband signal transmitted by the user, with $\mathcal{F}\{x(t)\} = x(f)$. The passband signal, $\tilde{r}_{nm}(t)$, received by the $(n,m)$th BS antenna is written in the noiseless case as [25]

$$\tilde{r}_{nm}(t) = \sum_{l=0}^{L} \sqrt{2}\, \mathrm{Re}\left\{ x_l(t - \tau_{l,nm})\, e^{j 2\pi f_c (t - \tau_{l,nm})} \right\}, \tag{3}$$

where $f_c$ is the carrier frequency, $x_l(t) \triangleq x(t) * \chi_l(t)$ is the distorted baseband waveform due to the frequency-selective attenuation of the $l$th path, and $\chi_l(t)$ models the said distortion; namely, $\mathcal{F}\{\chi_l(t)\} = \alpha_l(f)$ and $\mathcal{F}\{x_l(t)\} = \alpha_l(f)\, x(f)$ [26]. Next, the received passband signal $\tilde{r}_{nm}(t)$ is down-converted to the baseband signal $r_{nm}(t)$, which is given by

$$r_{nm}(t) = \sum_{l=0}^{L} e^{-j 2\pi f_c \tau_l}\, e^{-j 2\pi f_c \tau_{nm}(\phi_l, \theta_l)}\, x_l(t - \tau_{l,nm}). \tag{4}$$

[Footnote: Near-field considerations are provided in Section VI-D.]
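As a rough illustration of (2), the following Python sketch evaluates the largest delay across a UPA and compares it with the sampling period $1/B$. The carrier frequency, bandwidth, array size, and DoA are hypothetical values chosen only for illustration; they are not parameters taken from this paper.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def array_delay(n, m, phi, theta, d):
    """Delay tau_nm of (2), from the (0,0)th to the (n,m)th antenna, in seconds."""
    return d * (n * math.sin(theta) * math.cos(phi)
                + m * math.sin(theta) * math.sin(phi)) / C


# Hypothetical example values (assumptions, not taken from the paper).
fc = 300e9                               # carrier frequency in Hz
B = 30e9                                 # signal bandwidth in Hz
N = M = 100                              # UPA dimensions
d = C / (2 * fc)                         # half-wavelength antenna spacing
phi, theta = math.pi / 4, math.pi / 2    # DoA giving the largest delay

tau_max = array_delay(N - 1, M - 1, phi, theta, d)
print(f"max delay across the array: {tau_max * 1e12:.1f} ps")
print(f"sampling period 1/B:        {1e12 / B:.1f} ps")
print("delay exceeds sampling period:", tau_max > 1 / B)
```

When the delay across the array exceeds $1/B$, the approximation $x_l(t - \tau_{l,nm}) \approx x_l(t - \tau_l)$ no longer holds and the array response has to be treated as frequency-dependent.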

Taking the continuous-time Fourier transform of (4) yields

$$r_{nm}(f) = \mathcal{F}\{r_{nm}(t)\} = \sum_{l=0}^{L} \beta_l(f)\, e^{-j 2\pi (f_c + f) \tau_{nm}(\phi_l, \theta_l)}\, x(f)\, e^{-j 2\pi f \tau_l}, \tag{5}$$

where $\beta_l(f) \triangleq \alpha_l(f)\, e^{-j 2\pi f_c \tau_l}$ is the complex gain of the $l$th path. Lastly, collecting all $r_{nm}(f)$ into a vector $\mathbf{r}(f) \in \mathbb{C}^{N_B \times 1}$ gives the relation $\mathbf{r}(f) = \mathbf{h}(f)\, x(f)$, where

$$\mathbf{h}(f) = \sum_{l=0}^{L} \beta_l(f)\, \mathbf{a}(\phi_l, \theta_l, f)\, e^{-j 2\pi f \tau_l} \tag{6}$$

is the baseband frequency response of the uplink channel, and

$$\mathbf{a}(\phi, \theta, f) = \left[ 1, \dots, e^{-j 2\pi (f_c + f) \frac{d}{c} (n \sin\theta \cos\phi + m \sin\theta \sin\phi)}, \dots, e^{-j 2\pi (f_c + f) \frac{d}{c} ((N-1) \sin\theta \cos\phi + (M-1) \sin\theta \sin\phi)} \right]^T \tag{7}$$

is the array response vector of the BS. Here, the array response is frequency-dependent due to the spatial-wideband effect.

[Footnote: If the delay across the BS array is small relative to the symbol period, then $x_l(t - \tau_{l,nm}) \approx x_l(t - \tau_l)$. In this case, we have a spatially narrowband channel with frequency-flat array response vector, i.e., $\mathbf{a}(\phi, \theta, 0)$.]

We now introduce the path attenuation model. First, the so-called molecular absorption loss is no longer negligible at THz frequencies. Therefore, the path attenuation of the LoS path is calculated as [27]

$$|\beta_0(f)| = \alpha_0(f) = \frac{c}{4\pi (f_c + f) D}\, e^{-\frac{1}{2} k_{abs}(f_c + f) D}, \tag{8}$$

where $D$ denotes the distance between the BS and the user, and $k_{abs}(\cdot)$ is the molecular absorption coefficient determined by the composition of the propagation medium; different from mmWave channels, the major molecular absorption in THz bands comes from water vapor molecules [27]. For the NLoS paths, we consider single-bounce reflected rays, since the diffused and diffracted rays are heavily attenuated for distances larger than a few meters [28]. To this end, the reflection coefficient, $\Gamma_l(f)$, should be taken into account, which is specified as [29]

$$\Gamma_l(f) = \frac{\cos\phi_{i,l} - n_t \cos\phi_{t,l}}{\cos\phi_{i,l} + n_t \cos\phi_{t,l}}\, \exp\left( -\frac{8\pi^2 (f_c + f)^2 \sigma_{rough}^2 \cos^2\phi_{i,l}}{c^2} \right), \tag{9}$$

where $n_t \triangleq Z_0 / Z$ is the refractive index, $Z_0 = 377$ Ohm is the free-space impedance, $Z$ is the impedance of the reflecting material, $\phi_{i,l}$ is the incidence and reflection angle, $\phi_{t,l} = \arcsin\left( n_t^{-1} \sin\phi_{i,l} \right)$ is the refraction angle, and $\sigma_{rough}$ characterizes the roughness of the reflecting surface. The path attenuation of the $l$th NLoS path is finally given by [30]

$$|\beta_l(f)| = \alpha_l(f) = |\Gamma_l(f)|\, \alpha_0(f), \tag{10}$$

where $l = 1, \dots, L$.

Fig. 2: Illustration of the hybrid array structure considered in the system model.
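To make the channel model of Section II-A more concrete, the sketch below builds the array response vector (7) and a LoS-only instance of the channel (6) with the attenuation (8). It is a minimal illustration under several assumptions that are not from the paper: the absorption coefficient is treated as a constant `k_abs`, the LoS time-of-arrival is set to $D/c$, the antenna index is ordered as $nM + m$, and all numerical values are hypothetical.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def array_response(phi, theta, f, fc, d, N, M):
    """Array response a(phi, theta, f) of (7); entry (n, m) is stored at index n*M + m."""
    ux = math.sin(theta) * math.cos(phi)
    uy = math.sin(theta) * math.sin(phi)
    return [cmath.exp(-2j * math.pi * (fc + f) * d / C * (n * ux + m * uy))
            for n in range(N) for m in range(M)]


def los_attenuation(f, fc, D, k_abs):
    """LoS attenuation alpha_0(f) of (8); k_abs is held constant (simplifying assumption)."""
    return C / (4 * math.pi * (fc + f) * D) * math.exp(-0.5 * k_abs * D)


def los_channel(f, fc, d, N, M, phi, theta, D, k_abs):
    """LoS-only instance of (6): h(f) = beta_0(f) a(phi, theta, f) exp(-j 2 pi f tau_0)."""
    tau0 = D / C  # LoS time-of-arrival (assumption)
    beta0 = los_attenuation(f, fc, D, k_abs) * cmath.exp(-2j * math.pi * fc * tau0)
    delay_phase = cmath.exp(-2j * math.pi * f * tau0)
    a = array_response(phi, theta, f, fc, d, N, M)
    return [beta0 * delay_phase * a_nm for a_nm in a]


# Hypothetical example values (assumptions, not taken from the paper).
fc, B = 300e9, 30e9
N, M = 8, 8
d = C / (2 * fc)
phi, theta = math.pi / 4, math.pi / 3
D, k_abs = 10.0, 0.0033

h_center = los_channel(0.0, fc, d, N, M, phi, theta, D, k_abs)

# Normalized correlation between the array responses at band center and band edge.
a_center = array_response(phi, theta, 0.0, fc, d, N, M)
a_edge = array_response(phi, theta, B / 2, fc, d, N, M)
corr = abs(sum(x.conjugate() * y for x, y in zip(a_center, a_edge))) / (N * M)
print(f"|a^H(0) a(B/2)| / N_B = {corr:.4f}")
```

A correlation below one indicates that the array response at the band edge is no longer aligned with the one at the carrier frequency, which is the spatial-wideband effect captured by (7).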

B. Hybrid Transceiver Model

Due to the frequency selectivity of the THz channel, OFDM modulation is employed to combat inter-symbol interference. Specifically, we consider $S$ subcarriers over a signal bandwidth $B$. Then, the baseband frequency of the $s$th subcarrier is specified as $f_s = \left( s - \frac{S-1}{2} \right) \frac{B}{S}$, $s = 0, \dots, S-1$. A hybrid analog-digital architecture with $N_{RF} \ll N_B$ RF chains is also considered at the BS to facilitate efficient hardware implementation; each RF chain drives the array through $N_B$ analog phase shifters, as shown in Fig. 2. The hybrid combiner for the $s$th subcarrier is hence expressed as $\mathbf{F}[s] = \mathbf{F}_{RF} \mathbf{F}_{BB}[s] \in \mathbb{C}^{N_B \times N_{RF}}$, where $\mathbf{F}_{RF} \in \mathbb{C}^{N_B \times N_{RF}}$ is the frequency-flat RF combiner with elements of constant amplitude, i.e., $1/\sqrt{N_B}$, but variable phase, and $\mathbf{F}_{BB}[s] \in \mathbb{C}^{N_{RF} \times N_{RF}}$ is the baseband combiner. Finally, the post-processed baseband signal, $\mathbf{y}[s] \in \mathbb{C}^{N_{RF} \times 1}$, for the $s$th subcarrier is written as

$$\mathbf{y}[s] = \mathbf{F}^H[s]\, \mathbf{r}[s] = \mathbf{F}^H[s] \left( \sqrt{P_d}\, \mathbf{h}[s]\, x[s] + \mathbf{n}[s] \right), \tag{11}$$

where $\mathbf{r}[s] \triangleq \mathbf{r}(f_s)$ and $\mathbf{h}[s] \triangleq \mathbf{h}(f_s)$ are the received signal and uplink channel, respectively, $x[s] \triangleq x(f_s) \sim \mathcal{CN}(0, 1)$ is the data symbol transmitted at the $s$th subcarrier, $P_d$ denotes the average power per data subcarrier assuming equal power allocation among subcarriers, and $\mathbf{n}[s] \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I}_{N_B})$ is the additive noise vector.

Remark 1. A promising alternative to OFDM is single-carrier with frequency domain equalization (SC-FDE) due to its favorable peak-to-average power ratio (PAPR). In our work, we exploit the inherent characteristics of THz channels, i.e., high path loss and directional transmissions, which result in a coherence bandwidth of hundreds of MHz [28]. Therefore, a relatively small number of subcarriers is used, which is expected to yield a tolerable PAPR.

III. HYBRID COMBINING

A. The Beam Squint Problem

Even for a moderate number of BS antennas, the propagation delay across the array can exceed the sampling period due to the ultra-high bandwidth used in THz communication. As a result, the DoA/DoD varies across the OFDM subcarriers, and the array gain becomes frequency-selective. This phenomenon, known as beam squint in the array processing literature, calls for a frequency-dependent combining design which is feasible only in a fully-digital array architecture.

Fig. 3: Normalized array gain for various bandwidths; × -element UPA, $f_c = 300$ GHz, coherence bandwidth of MHz, and $(\phi, \theta) = (\pi/\ , \pi/\ )$.

To demonstrate the detrimental effect of beam squint when frequency-flat RF combining is employed, we consider an arbitrary ray impinging on the BS array with DoA $(\phi, \theta)$; therefore, we omit the subscript "$l$" hereafter. In the narrowband case, the uplink channel is described as $\beta\, \mathbf{a}(\phi, \theta, 0)$. Let $\mathbf{f}_{RF} = (1/\sqrt{N_B})\, \mathbf{f}$, with $\|\mathbf{f}\|^2 = N_B$, be an arbitrary RF combiner. For the combiner $\mathbf{f}_{RF}$, the power of the received signal is calculated as

$$\frac{|\beta|^2 \left| \mathbf{f}^H \mathbf{a}(\phi, \theta, 0) \right|^2}{N_B}\, P_d = |\beta|^2 N_B\, G(\phi, \theta, 0)\, P_d, \tag{12}$$

where $G(\phi, \theta, f) \triangleq |\mathbf{f}^H \mathbf{a}(\phi, \theta, f)|^2 / N_B^2$ is the normalized array gain. Choosing $\mathbf{f} = \mathbf{a}(\phi, \theta, 0)$ yields $G(\phi, \theta, 0) = 1$, and the maximum array gain is obtained. In a wideband THz system, though, the array gain varies across the OFDM subcarriers. In particular, we have that

$$G(\phi, \theta, f) = \frac{|\mathbf{a}^H(\phi, \theta, 0)\, \mathbf{a}(\phi, \theta, f)|^2}{N_B^2} = |D_N(2\pi f \Delta_x(\phi, \theta))|^2\, |D_M(2\pi f \Delta_y(\phi, \theta))|^2, \tag{13}$$

where $\Delta_x(\phi, \theta) \triangleq (d \sin\theta \cos\phi)/c$ and $\Delta_y(\phi, \theta) \triangleq (d \sin\theta \sin\phi)/c$; please refer to Appendix A for the proof. Figure 3 shows the array gain for various bandwidths, when the narrowband RF combiner $\mathbf{f}_{RF} = (1/\sqrt{N_B})\, \mathbf{a}(\phi, \theta, 0)$ is used. As we see, the array gain reduces substantially across the OFDM subcarriers. Furthermore, using the technique of [31], one can show that $G(\phi, \theta, f) \to 0$ as $NM \to \infty$. Contrary to narrowband massive MIMO, where the signal power increases monotonically with the number of BS antennas, here it may decrease. Consequently, beam squint compensation is of paramount importance for the successful deployment of THz massive MIMO systems.
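The closed form (13) can be checked numerically against the direct inner product of the array response vectors. The sketch below does so for one frequency offset; the array size, carrier frequency, offset, and DoA are hypothetical values used only for illustration.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def dirichlet(N, x):
    """Dirichlet sinc D_N(x) = sin(N x / 2) / (N sin(x / 2)); only x = 0 is special-cased."""
    if abs(x) < 1e-12:
        return 1.0
    return math.sin(N * x / 2) / (N * math.sin(x / 2))


def spatial_delays(phi, theta, d):
    """Per-antenna delays Delta_x and Delta_y defined below (13)."""
    dx = d * math.sin(theta) * math.cos(phi) / C
    dy = d * math.sin(theta) * math.sin(phi) / C
    return dx, dy


def gain_closed_form(f, dx, dy, N, M):
    """Right-hand side of (13)."""
    return dirichlet(N, 2 * math.pi * f * dx) ** 2 * dirichlet(M, 2 * math.pi * f * dy) ** 2


def gain_direct(f, fc, dx, dy, N, M):
    """|a^H(phi, theta, 0) a(phi, theta, f)|^2 / N_B^2 evaluated entry by entry."""
    acc = 0j
    for n in range(N):
        for m in range(M):
            a0 = cmath.exp(-2j * math.pi * fc * (n * dx + m * dy))
            af = cmath.exp(-2j * math.pi * (fc + f) * (n * dx + m * dy))
            acc += a0.conjugate() * af
    return abs(acc) ** 2 / (N * M) ** 2


# Hypothetical example values (assumptions, not taken from the paper).
fc, f_edge = 300e9, 15e9
N = M = 32
dx, dy = spatial_delays(math.pi / 4, math.pi / 3, C / (2 * fc))

g_closed = gain_closed_form(f_edge, dx, dy, N, M)
g_direct = gain_direct(f_edge, fc, dx, dy, N, M)
print(f"closed form: {g_closed:.6f}, direct: {g_direct:.6f}")
```

Increasing `N` and `M` in this sketch while keeping the frequency offset fixed gives a quick way to observe the loss of array gain discussed above.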

B. Proposed Combiner for Single-Path Channels

In this section, we introduce our wideband combining scheme for single-path channels, and then extend it to the multi-path case. To this end, we consider that the BS employs a single RF chain to combine the incoming signal, and hence the RF combiner is denoted by $\mathbf{f}_{RF}$. Next, we analyze the normalized array gain by decomposing the array into $N_{sb} \times M_{sb}$ virtual subarrays of $\tilde{N}\tilde{M}$ antennas each, where $\tilde{N} \triangleq N/N_{sb}$ and $\tilde{M} \triangleq M/M_{sb}$.

1) Virtual Array Partition:

The array response vector in (7) is decomposed as

$$\mathbf{a}(\phi, \theta, f) = \mathbf{a}_x(\phi, \theta, f) \otimes \mathbf{a}_y(\phi, \theta, f), \tag{14}$$

where $\mathbf{a}_x(\cdot)$ and $\mathbf{a}_y(\cdot)$ are defined, respectively, as

$$\mathbf{a}_x(\phi, \theta, f) \triangleq \left[ 1, \dots, e^{-j 2\pi (f_c + f) n \Delta_x(\phi, \theta)}, \dots, e^{-j 2\pi (f_c + f)(N-1) \Delta_x(\phi, \theta)} \right]^T \tag{15}$$

and

$$\mathbf{a}_y(\phi, \theta, f) \triangleq \left[ 1, \dots, e^{-j 2\pi (f_c + f) m \Delta_y(\phi, \theta)}, \dots, e^{-j 2\pi (f_c + f)(M-1) \Delta_y(\phi, \theta)} \right]^T. \tag{16}$$

Using the previously mentioned virtual array partition, we can write

$$\mathbf{a}_x(\phi, \theta, f) = \left[ \mathbf{a}_{x,1}^T(\phi, \theta, f), \dots, \mathbf{a}_{x,N_{sb}}^T(\phi, \theta, f) \right]^T, \tag{17}$$

$$\mathbf{a}_y(\phi, \theta, f) = \left[ \mathbf{a}_{y,1}^T(\phi, \theta, f), \dots, \mathbf{a}_{y,M_{sb}}^T(\phi, \theta, f) \right]^T, \tag{18}$$

where $\mathbf{a}_{x,n}(\phi, \theta, f)$ corresponds to the response vector of the $n$th virtual subarray, and is defined as

$$\mathbf{a}_{x,n}(\phi, \theta, f) \triangleq \left[ e^{-j 2\pi (f_c + f)(n-1)\tilde{N} \Delta_x(\phi, \theta)}, \dots, e^{-j 2\pi (f_c + f)(n\tilde{N} - 1) \Delta_x(\phi, \theta)} \right]^T. \tag{19}$$

Finally, each vector $\mathbf{a}_{x,n}(\phi, \theta, f)$ is expressed in terms of $\mathbf{a}_{x,1}(\phi, \theta, f)$, i.e., the response of the first subarray, as

$$\mathbf{a}_{x,n}(\phi, \theta, f) = e^{-j 2\pi (f_c + f)(n-1)\tilde{N} \Delta_x(\phi, \theta)}\, \mathbf{a}_{x,1}(\phi, \theta, f). \tag{20}$$

We stress that similar expressions hold for the vector $\mathbf{a}_y$. Using the virtual subarray notation, the normalized array gain $G(\phi, \theta, f)$ is recast as in (21). For an adequately small $\tilde{N}\tilde{M}$, we then have the approximation $D_{\tilde{N}}(2\pi f_s \Delta_x(\phi, \theta))\, D_{\tilde{M}}(2\pi f_s \Delta_y(\phi, \theta)) \approx 1$.

Fig. 4: Illustration of the TTD-based wideband combiner with virtual array partition; the circles with arrows represent frequency-flat phase shifters.

2) Size of Virtual Subarrays:

The size of each virtual subarray, $\tilde{N} \times \tilde{M}$, is selected such that the maximum delay across the first virtual subarray is smaller than the sampling period $1/B$. Specifically, the maximum delay, $\tau_{max}$, across the first subarray is given by (2) for $n = \tilde{N} - 1$, $m = \tilde{M} - 1$, $\sin\theta = 1$, and $\sin\phi = \cos\phi = 1/\sqrt{2}$, yielding $\tau_{max} = d(\tilde{N} + \tilde{M} - 2)/(\sqrt{2}\, c)$. For half-wavelength antenna spacing and $\tilde{N} = \tilde{M}$, the condition $\tau_{max} < 1/B$ reduces to $(\tilde{N} - 1) < \sqrt{2} f_c / B$, which is used to determine $\tilde{N}$.
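The sizing rule above is simple enough to evaluate directly. The following sketch returns the largest $\tilde{N}$ that satisfies $(\tilde{N} - 1) < \sqrt{2} f_c / B$ for square subarrays with half-wavelength spacing. The function name and the numerical values are illustrative assumptions; in practice $\tilde{N}$ would also have to divide $N$, which is not enforced here.

```python
import math


def max_subarray_dim(fc, B):
    """Largest integer N_tilde with (N_tilde - 1) < sqrt(2) * fc / B (square subarrays)."""
    bound = math.sqrt(2) * fc / B
    # The largest integer strictly below `bound` is ceil(bound) - 1, so N_tilde = ceil(bound).
    return math.ceil(bound)


def max_subarray_delay(n_tilde, fc):
    """tau_max = (N_tilde - 1) / (sqrt(2) fc) for d = c / (2 fc) and N_tilde = M_tilde."""
    return (n_tilde - 1) / (math.sqrt(2) * fc)


# Hypothetical example values (assumptions, not taken from the paper).
fc, B = 300e9, 30e9
n_tilde = max_subarray_dim(fc, B)
print("N_tilde =", n_tilde)
print("tau_max < 1/B:", max_subarray_delay(n_tilde, fc) < 1 / B)
```

A smaller bandwidth relaxes the bound and allows larger virtual subarrays, and hence fewer TTD elements for a given array.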

3) TTD-Based Combining:

The factor $\Omega(\phi, \theta, f) \leq 1$ in (21) accounts for the losses caused by the delay between consecutive virtual subarrays, and it can be canceled through a TTD network placed between virtual subarrays, as depicted in Fig. 4. Then, we obtain $\Omega(\phi, \theta, f_s) = 1$ by multiplying the signal at the $(n,m)$th virtual subarray by $e^{j 2\pi f_s \Delta_{mn}(\phi, \theta)}$, where $\Delta_{mn}(\phi, \theta) \triangleq (n-1)\tilde{N} \Delta_x(\phi, \theta) + (m-1)\tilde{M} \Delta_y(\phi, \theta)$ is the delay to be mitigated. Because all OFDM subcarriers share the same delay $\Delta_{mn}(\phi, \theta)$, it can be compensated using a single TTD element modeled as a linear filter with impulse response $\delta(t - \Delta_{mn}(\phi, \theta))$. Therefore, the wideband RF combiner is designed as

$$\mathbf{f}_{RF}[s] = \frac{1}{\sqrt{N_B}}\, \mathrm{vec}\left( \mathbf{A}(\phi, \theta, 0) \odot \mathbf{T}[s] \right), \tag{22}$$

where $\mathbf{T}[s] \triangleq \left[ e^{-j 2\pi f_s \Delta_{mn}(\phi, \theta)} \right]_{m=1, n=1}^{M_{sb}, N_{sb}} \otimes \mathbf{1}_{\tilde{M} \times \tilde{N}}$, $\mathbf{A}(\phi, \theta, 0) \triangleq \mathbf{a}_y(\phi, \theta, 0)\, \mathbf{a}_x^T(\phi, \theta, 0)$, and $\|\mathbf{f}_{RF}[s]\| = 1$.

Proposition 1. With the proposed combiner (22), we have

$$\left| \mathbf{f}_{RF}^H\, \mathbf{a}(\phi, \theta, f) \right|^2 = N_B\, |D_{\tilde{N}}(2\pi f \Delta_x)|^2\, |D_{\tilde{M}}(2\pi f \Delta_y)|^2, \tag{23}$$

where $D_N(x) = \frac{\sin(Nx/2)}{N\sin(x/2)}$ is the Dirichlet sinc function.

Proof. See Appendix B.

From (23), we conclude that for sufficiently small $\tilde{N}$ and $\tilde{M}$, an array gain $N_B$ is approximately achieved over the whole signal bandwidth $B$. Thus, the SNR at the $s$th OFDM subcarrier is $|\beta(f_s)|^2 N_B P_d / \sigma^2$. Lastly, $(N_{sb} M_{sb} - 1)$ TTD elements are employed per RF chain, where $N_{sb} = N/\tilde{N}$ and $M_{sb} = M/\tilde{M}$.

C. Proposed Combiner for Multi-Path Channels
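Since the multi-path design reuses the single-path combiner (22) column by column, it may help to see (22) and (23) evaluated numerically first. The sketch below builds $\mathbf{f}_{RF}[s]$ entry by entry (frequency-flat phase shifters steered at $f = 0$, plus one TTD delay per virtual subarray), and compares the resulting gain with the right-hand side of (23) and with the narrowband combiner. The helper names, the antenna ordering (index $nM + m$), and all numerical values are assumptions made for illustration.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def dirichlet(N, x):
    """Dirichlet sinc D_N(x); only the removable singularity at x = 0 is special-cased."""
    if abs(x) < 1e-12:
        return 1.0
    return math.sin(N * x / 2) / (N * math.sin(x / 2))


def array_response(f, fc, dx, dy, N, M):
    """a(phi, theta, f) of (7) written with Delta_x and Delta_y; index is n*M + m."""
    return [cmath.exp(-2j * math.pi * (fc + f) * (n * dx + m * dy))
            for n in range(N) for m in range(M)]


def ttd_combiner(fs, fc, dx, dy, N, M, Nt, Mt):
    """Wideband RF combiner of (22) for Nt x Mt virtual subarrays."""
    f_rf = []
    for n in range(N):
        for m in range(M):
            shifter = cmath.exp(-2j * math.pi * fc * (n * dx + m * dy))  # entry of A(phi, theta, 0)
            delta = (n // Nt) * Nt * dx + (m // Mt) * Mt * dy            # subarray delay Delta_mn
            ttd = cmath.exp(-2j * math.pi * fs * delta)                  # entry of T[s]
            f_rf.append(shifter * ttd / math.sqrt(N * M))
    return f_rf


def combining_gain(f_rf, a):
    """|f_RF^H a|^2."""
    return abs(sum(w.conjugate() * x for w, x in zip(f_rf, a))) ** 2


# Hypothetical example values (assumptions, not taken from the paper).
fc, fs = 300e9, 15e9
N = M = 32
Nt = Mt = 4
d = C / (2 * fc)
phi, theta = math.pi / 4, math.pi / 3
dx = d * math.sin(theta) * math.cos(phi) / C
dy = d * math.sin(theta) * math.sin(phi) / C

a = array_response(fs, fc, dx, dy, N, M)
gain_ttd = combining_gain(ttd_combiner(fs, fc, dx, dy, N, M, Nt, Mt), a)
gain_expected = (N * M * dirichlet(Nt, 2 * math.pi * fs * dx) ** 2
                 * dirichlet(Mt, 2 * math.pi * fs * dy) ** 2)
narrowband = [w / math.sqrt(N * M) for w in array_response(0.0, fc, dx, dy, N, M)]
gain_nb = combining_gain(narrowband, a)
print(f"TTD: {gain_ttd:.2f}, (23): {gain_expected:.2f}, narrowband: {gain_nb:.2f}, N_B: {N * M}")
```

The same routine could be called once per path to form the columns of $\mathbf{F}_{RF}[s]$ in the multi-path case.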

The propounded method can readily be applied to multi-path channels. For example, consider a THz channel comprising $L = 2$ NLoS paths. In a fully-digital array, the optimal combiner for the $s$th subcarrier is given by the maximum-ratio combiner $\mathbf{h}[s]/\|\mathbf{h}[s]\|$. By employing $N_{RF} = 2$ RF chains, we have that

$$\frac{\mathbf{h}[s]}{\|\mathbf{h}[s]\|} = \mathbf{F}_{RF}[s]\, \mathbf{F}_{BB}[s]\, \mathbf{1}_{2 \times 1}, \tag{24}$$

where

$$\mathbf{F}_{RF}[s] = \frac{1}{\sqrt{N_B}} \left[ \mathbf{a}(\phi_1, \theta_1, f_s) \;\; \mathbf{a}(\phi_2, \theta_2, f_s) \right], \tag{25}$$

$$\mathbf{F}_{BB}[s] = \frac{\sqrt{N_B}}{\|\mathbf{h}[s]\|} \begin{bmatrix} \beta_1(f_s)\, e^{-j 2\pi f_s \tau_1} & 0 \\ 0 & \beta_2(f_s)\, e^{-j 2\pi f_s \tau_2} \end{bmatrix}. \tag{26}$$

The columns of the wideband RF combiner $\mathbf{F}_{RF}[s]$ are then approximated using (22), whilst the vector $\mathbf{1}_{2 \times 1}$ with unit entries performs the addition of the two outputs of the baseband combiner. Note that $N_{RF} = L$ RF chains are required to implement the maximum-ratio combiner in a hybrid array architecture.

Remark 2. A few recent papers in the literature (e.g., [32] and references therein) suggested the use of TTD to provide frequency-dependent phase shifts at each antenna of an $N$-element ULA, yielding a wideband multi-beam architecture. In our work, we adopt a hybrid array architecture, where each frequency-independent phase shifter drives a single antenna whilst each TTD element controls a group of antennas, i.e., a virtual subarray. Moreover, we consider a UPA, and hence our design enables squint-free three-dimensional (3D) combining.

IV. SPARSE CHANNEL ESTIMATION

We have introduced an effective wideband combiner assuming that the BS has perfect knowledge of the uplink channel. In this section, we investigate the channel estimation problem under the spatial-wideband effect. More particularly, we first formulate a compressive sensing problem to estimate the channel at each subcarrier independently with reduced training overhead. We then propound a wideband dictionary and employ an estimation algorithm that leverages information from multiple subcarriers to increase the reliability of the channel estimates in the low and moderate SNR regimes.

The normalized array gain expression (21), referenced in Section III-B, reads

$$G(\phi, \theta, f) = \frac{\left| \mathbf{a}_{x,1}^H(\phi, \theta, 0)\, \mathbf{a}_{x,1}(\phi, \theta, f) \right|^2 \left| \mathbf{a}_{y,1}^H(\phi, \theta, 0)\, \mathbf{a}_{y,1}(\phi, \theta, f) \right|^2}{\tilde{N}^2 \tilde{M}^2} \underbrace{\frac{\left| \sum_{n=1}^{N_{sb}} \sum_{m=1}^{M_{sb}} e^{-j 2\pi (n-1)\tilde{N} f \Delta_x(\phi, \theta)}\, e^{-j 2\pi (m-1)\tilde{M} f \Delta_y(\phi, \theta)} \right|^2}{N_{sb}^2 M_{sb}^2}}_{\Omega(\phi, \theta, f)} = |D_{\tilde{N}}(2\pi f \Delta_x(\phi, \theta))|^2\, |D_{\tilde{M}}(2\pi f \Delta_y(\phi, \theta))|^2\, \Omega(\phi, \theta, f). \tag{21}$$

A. Problem Formulation

We assume a block-fading model where the channel coherence time is much larger than the training period. Specifically, the training period consists of $N_{slot}$ time slots. At each time slot $t = 1, \dots, N_{slot}$, the user transmits the pilot signal $x_t[s] = \sqrt{P_p}$, $\forall s \in \mathcal{S}$, where $\mathcal{S} \triangleq \{1, \dots, S\}$ denotes the set of OFDM subcarriers, and $P_p$ is the power per pilot subcarrier. In turn, the BS combines the pilot signal at each subcarrier $s \in \mathcal{S}$ using a training hybrid combiner $\mathbf{W}_t[s] \in \mathbb{C}^{N_B \times N_{RF}}$. Therefore, the post-processed signal at slot $t$, $\mathbf{y}_t[s] \in \mathbb{C}^{N_{RF} \times 1}$, is written as

$$\mathbf{y}_t[s] = \sqrt{P_p}\, \mathbf{W}_t^H[s]\, \mathbf{h}[s] + \mathbf{W}_t^H[s]\, \mathbf{n}_t[s], \tag{27}$$

where $\mathbf{n}_t[s] \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I}_{N_B})$ is the additive noise vector. Let $N_{beam} = N_{slot} N_{RF}$ denote the total number of pilot beams. After $N_{slot}$ training slots, the BS acquires the measurement vector $\bar{\mathbf{y}}[s] \triangleq [\mathbf{y}_1^T[s], \dots, \mathbf{y}_{N_{slot}}^T[s]]^T \in \mathbb{C}^{N_{beam} \times 1}$ for $\mathbf{h}[s]$ as

$$\bar{\mathbf{y}}[s] = \sqrt{P_p} \begin{bmatrix} \mathbf{W}_1^H[s] \\ \vdots \\ \mathbf{W}_{N_{slot}}^H[s] \end{bmatrix} \mathbf{h}[s] + \begin{bmatrix} \mathbf{W}_1^H[s]\, \mathbf{n}_1[s] \\ \vdots \\ \mathbf{W}_{N_{slot}}^H[s]\, \mathbf{n}_{N_{slot}}[s] \end{bmatrix} = \sqrt{P_p}\, \mathbf{W}^H[s]\, \mathbf{h}[s] + \bar{\mathbf{n}}[s], \tag{28}$$

where $\mathbf{W}[s] \triangleq [\mathbf{W}_1[s], \dots, \mathbf{W}_{N_{slot}}[s]] \in \mathbb{C}^{N_B \times N_{beam}}$, and $\bar{\mathbf{n}}[s] \in \mathbb{C}^{N_{beam} \times 1}$ denotes the effective noise. More particularly, $\mathbf{R}_{\bar{n}}[s] \triangleq \sigma^2\, \mathrm{diag}\left( \mathbf{W}_1^H[s] \mathbf{W}_1[s], \dots, \mathbf{W}_{N_{slot}}^H[s] \mathbf{W}_{N_{slot}}[s] \right)$ is the covariance matrix of the effective noise, which is colored in general. Regarding the pilot combiners, due to the hybrid array architecture, $\mathbf{W}[s] = \mathbf{W}_{RF} \mathbf{W}_{BB}[s]$, with $\mathbf{W}_{RF} = [\mathbf{W}_{RF,1}, \dots, \mathbf{W}_{RF,N_{slot}}] \in \mathbb{C}^{N_B \times N_{beam}}$ containing the RF pilot beams and $\mathbf{W}_{BB}[s] = \mathrm{blkdiag}(\mathbf{W}_{BB,1}[s], \dots, \mathbf{W}_{BB,N_{slot}}[s]) \in \mathbb{C}^{N_{beam} \times N_{beam}}$ comprising the $N_{RF} \times N_{RF}$ baseband combiners. The design of the pilot combiners is detailed in Section IV-D3.

B. Least Squares Estimator

From (28), we have $N_{\mathrm{beam}}$ observations, while $\mathbf{h}[s]$ includes $N_B$ variables. Thus, to obtain a good estimate of $\mathbf{h}[s]$, we need $N_{\mathrm{beam}} \geq N_B$. With this condition, the LS estimate is

$$\hat{\mathbf{h}}^{\mathrm{LS}}[s] = \mathbf{Q}_s^{\dagger}\,\bar{\mathbf{y}}[s], \tag{29}$$

where $\mathbf{Q}_s \triangleq \sqrt{P_p}\,\mathbf{W}^H[s] \in \mathbb{C}^{N_{\mathrm{beam}} \times N_B}$ is the sensing matrix. (We consider the LS instead of the minimum mean-square error (MMSE) method because we focus on estimators that exploit only instantaneous CSI.) The mean square error (MSE) of the LS estimator for the $s$th subcarrier is given by

$$J_s^{\mathrm{LS}} \triangleq \mathbb{E}\left\{\left\|\mathbf{h}[s] - \hat{\mathbf{h}}^{\mathrm{LS}}[s]\right\|^2\right\} = \mathrm{tr}\left(\mathbf{Q}_s^{\dagger}\,\mathbf{R}_{\bar{n}}[s]\,(\mathbf{Q}_s^{\dagger})^H\right). \tag{30}$$

The optimal $\mathbf{Q}_s$ satisfies $\mathbf{Q}_s^H\mathbf{Q}_s = P_p\,\mathbf{I}_{N_B}$ [33], [34]. In the hybrid array architecture under consideration, this is achieved by $\mathbf{W}_{\mathrm{BB}}[s] = \mathbf{I}_{N_B}$ and $\mathbf{W}_{\mathrm{RF}} = \mathbf{U} \in \mathbb{C}^{N_B \times N_B}$, where $\mathbf{U}$ is the DFT matrix generating the RF pilot beams [34]. We then have $\mathbf{R}_{\bar{n}}[s] = \sigma^2\mathbf{I}_{N_B}$, $\mathbf{Q}_s^{\dagger} = (1/\sqrt{P_p})\,\mathbf{U}$, and

$$J_s^{\mathrm{LS}} = \sigma^2 N_B / P_p. \tag{31}$$
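The closed form (31) can be checked numerically. The following is a minimal sketch, assuming a unitary DFT pilot combiner and toy values for $N_B$, $P_p$ and $\sigma^2$ that are not taken from the paper; the helper name `ls_estimate` is hypothetical.

```python
import numpy as np

def ls_estimate(y_bar, W, P_p):
    """LS estimate (29): h_hat = pinv(Q) y_bar, with Q = sqrt(P_p) W^H."""
    Q = np.sqrt(P_p) * W.conj().T
    return np.linalg.pinv(Q) @ y_bar

rng = np.random.default_rng(0)
N_B, P_p, sigma2 = 16, 1.0, 0.1      # assumed toy values, not from the paper

# Unitary DFT matrix used as pilot combiner (W_RF = U, W_BB = I), full training.
U = np.fft.fft(np.eye(N_B)) / np.sqrt(N_B)

# One fixed channel realization and many noise draws n ~ CN(0, sigma2 I).
h = (rng.standard_normal(N_B) + 1j * rng.standard_normal(N_B)) / np.sqrt(2)
trials, err = 2000, 0.0
for _ in range(trials):
    n = np.sqrt(sigma2 / 2) * (rng.standard_normal(N_B) + 1j * rng.standard_normal(N_B))
    y_bar = np.sqrt(P_p) * U.conj().T @ h + U.conj().T @ n   # measurement model (28)
    err += np.linalg.norm(h - ls_estimate(y_bar, U, P_p)) ** 2

mse_empirical = err / trials
mse_theory = sigma2 * N_B / P_p      # closed form (31)
```

With these settings the empirical MSE should sit close to $\sigma^2 N_B / P_p$, which grows linearly with the number of BS antennas.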

The LS estimator (29) requires $N_{\mathrm{beam}} \geq N_B$, and hence yields a prohibitively high training overhead when the number of RF chains is much smaller than the number of BS antennas.

C. Sparse Formulation and Orthogonal Matching Pursuit

By exploiting the angular sparsity of THz channels, we can have a sparse formulation of the channel estimation problem as follows. The physical channel in (6) is also expressed as

$$\mathbf{h}[s] = \mathbf{A}[s]\,\boldsymbol{\beta}[s], \tag{32}$$

where $\mathbf{A}[s] \triangleq [\mathbf{a}(\phi_0, \theta_0, f_s), \dots, \mathbf{a}(\phi_L, \theta_L, f_s)] \in \mathbb{C}^{N_B \times (L+1)}$, with $\mathbf{a}(\phi_l, \theta_l, f_s)$ being specified by (7) for $f = f_s$, is the so-called wideband array response matrix, and $\boldsymbol{\beta}[s] \triangleq [\beta_0(f_s)e^{-j2\pi f_s \tau_0}, \dots, \beta_L(f_s)e^{-j2\pi f_s \tau_L}]^T \in \mathbb{C}^{(L+1) \times 1}$ is the vector of channel gains. Next, consider a dictionary $\bar{\mathbf{A}}[s] \in \mathbb{C}^{N_B \times G}$ whose $G$ columns are the array response vectors associated with a predefined set of DoA. Then, the uplink channel can be approximated as

$$\mathbf{h}[s] \approx \bar{\mathbf{A}}[s]\,\bar{\boldsymbol{\beta}}[s], \tag{33}$$

where $\bar{\boldsymbol{\beta}}[s] \in \mathbb{C}^{G \times 1}$ has $L+1$ nonzero entries whose positions and values correspond to their DoA and path gains [35]. Therefore, (28) is recast as

$$\bar{\mathbf{y}}[s] = \boldsymbol{\Phi}_s\,\bar{\boldsymbol{\beta}}[s] + \bar{\mathbf{n}}[s], \tag{34}$$

where $\boldsymbol{\Phi}_s \triangleq \sqrt{P_p}\,\mathbf{W}^H[s]\,\bar{\mathbf{A}}[s] \in \mathbb{C}^{N_{\mathrm{beam}} \times G}$ is the equivalent sensing matrix. Since $(L+1) \ll G$, the channel gain vector $\bar{\boldsymbol{\beta}}[s]$ is $(L+1)$-sparse, and the channel estimation problem can be formulated as the sparse recovery problem [34]

$$\hat{\bar{\boldsymbol{\beta}}}[s] = \arg\min_{\bar{\boldsymbol{\beta}}[s]} \left\|\bar{\boldsymbol{\beta}}[s]\right\|_0 \quad \text{s.t.} \quad \left\|\bar{\mathbf{y}}[s] - \boldsymbol{\Phi}_s\bar{\boldsymbol{\beta}}[s]\right\|_2 \leq \epsilon, \tag{35}$$

where $\epsilon \leq \mathbb{E}\{\|\bar{\mathbf{n}}[s]\|_2\}$ is an appropriately chosen bound on the mean magnitude of the effective noise. The above optimization problem can be solved for each subcarrier independently, i.e., single measurement vector formulation. Finally, the estimate of $\mathbf{h}[s]$ is obtained as $\hat{\mathbf{h}}^{\mathrm{CS}}[s] = \bar{\mathbf{A}}[s]\,\hat{\bar{\boldsymbol{\beta}}}[s]$.

Several greedy algorithms have been proposed to find approximate solutions of the $\ell_0$-norm optimization problem. The orthogonal matching pursuit (OMP) algorithm [36] described in Algorithm 1 is one of the most common and simple greedy CS methods that can solve (35).

Algorithm 1 OMP-Based Estimator
Input: equivalent sensing matrix $\boldsymbol{\Phi}_s$ and measurement vector $\bar{\mathbf{y}}[s]$ for the $s$th subcarrier, and a threshold $\epsilon$.
  $\mathcal{I}_{-1} = \emptyset$, $\mathcal{G} = \{1, \dots, G\}$, $\mathbf{r}_{-2}[s] = \mathbf{0}$, $\mathbf{r}_{-1}[s] = \bar{\mathbf{y}}[s]$, and $l = 0$.
  while $\|\mathbf{r}_{l-1}[s] - \mathbf{r}_{l-2}[s]\|_2^2 > \epsilon$ do
    $g^\star = \arg\max_{g \in \mathcal{G}} \left|\boldsymbol{\Phi}_s^H(g)\,\mathbf{r}_{l-1}[s]\right|$
    $\mathcal{I}_l = \mathcal{I}_{l-1} \cup \{g^\star\}$
    $\mathbf{r}_l[s] = \left(\mathbf{I}_{N_{\mathrm{beam}}} - \boldsymbol{\Phi}_s(\mathcal{I}_l)\,\boldsymbol{\Phi}_s^{\dagger}(\mathcal{I}_l)\right)\bar{\mathbf{y}}[s]$
    $l = l + 1$
  end while
  $\hat{\bar{\boldsymbol{\beta}}}[s] = \boldsymbol{\Phi}_s^{\dagger}(\mathcal{I}_{l-1})\,\bar{\mathbf{y}}[s]$
  return $\hat{\mathbf{h}}^{\mathrm{CS}}[s] = \bar{\mathbf{A}}[s]\,\hat{\bar{\boldsymbol{\beta}}}[s]$.

Fig. 5: Cumulative distribution function (CDF) of the normalized array gain and quantization error for a single-path channel and a super-resolution dictionary with $G_x = 4N$ and $G_y = 4M$; 1,000 channel realizations, $N \times M$-element UPA, $\mathbf{f}_{\mathrm{RF}} = (1/\sqrt{N_B})\,\mathbf{a}(\bar{\omega}_x(q), \bar{\omega}_y(p), f_s)$, $B = 40$ GHz, $S = 400$ subcarriers, and $s = 200$th subcarrier.

D. Proposed Channel Estimator

1) Wideband Dictionary for UPAs:

For half-wavelength antenna separation, the array response vector (7) is recast as

$$\mathbf{a}(\omega_x, \omega_y, f) = \left[1, \dots, e^{-j2\pi\left(1 + \frac{f}{f_c}\right)(n\omega_x + m\omega_y)}, \dots, e^{-j2\pi\left(1 + \frac{f}{f_c}\right)((N-1)\omega_x + (M-1)\omega_y)}\right]^T, \tag{36}$$

where $\omega_x = \tfrac{1}{2}\sin\theta\cos\phi$ and $\omega_y = \tfrac{1}{2}\sin\theta\sin\phi$ are the spatial frequencies [37]. The one-to-one mapping between the spatial frequencies $(\omega_x, \omega_y)$ and the physical angles $(\phi, \theta)$ is given by the relationships

$$\phi = \tan^{-1}(\omega_y/\omega_x), \tag{37}$$

$$\theta = \sin^{-1}\left(2\sqrt{\omega_x^2 + \omega_y^2}\right). \tag{38}$$

Since both $\omega_x$ and $\omega_y$ lie in $[-1/2, 1/2]$, we can consider the grids of discrete spatial frequencies

$$\mathcal{G}_x = \{\bar{\omega}_x(q) = q/G_x,\ q = -(G_x - 1)/2, \dots, (G_x - 1)/2\}, \tag{39}$$

$$\mathcal{G}_y = \{\bar{\omega}_y(p) = p/G_y,\ p = -(G_y - 1)/2, \dots, (G_y - 1)/2\}, \tag{40}$$

where $G_x G_y = G$ is the overall dictionary size. For the $s$th subcarrier, we define the array response matrices $\bar{\mathbf{A}}_x[s] \in \mathbb{C}^{N \times G_x}$ and $\bar{\mathbf{A}}_y[s] \in \mathbb{C}^{M \times G_y}$ whose columns are the array response vectors $\mathbf{a}_x(\cdot, f_s)$ and $\mathbf{a}_y(\cdot, f_s)$ evaluated at the grid points of $\mathcal{G}_x$ and $\mathcal{G}_y$, respectively. Now, the dictionary $\bar{\mathbf{A}}[s] \triangleq \bar{\mathbf{A}}_x[s] \otimes \bar{\mathbf{A}}_y[s] \in \mathbb{C}^{N_B \times G}$ can be used to approximate the uplink channel $\mathbf{h}[s]$ at the $s$th subcarrier. Although this approximation entails quantization errors, they become small for large $G_x$ and $G_y$ [35]. More specifically, we can use a super-resolution dictionary with $G_x > N$ and $G_y > M$ to reduce the mismatch between the quantized and the actual channel. We evaluate the accuracy of the proposed dictionary by generating a DoA with $(\omega_x, \omega_y)$, which is then quantized to the closest value $(\bar{\omega}_x(q), \bar{\omega}_y(p))$. Figure 5 shows the cumulative distribution function (CDF) of the normalized array gain $\left|\mathbf{a}^H(\bar{\omega}_x(q), \bar{\omega}_y(p), f_s)\,\mathbf{a}(\omega_x, \omega_y, f_s)\right| / N_B$, and the quantization errors $|\omega_x - \bar{\omega}_x(q)|$ and $|\omega_y - \bar{\omega}_y(p)|$ of the spatial frequencies. As we observe, the errors are small and do not significantly affect the normalized array gain. Consequently, we can neglect the quantization errors, and assume that the DoA of each path lies on the dictionary grid. Note that for $G_x = N$ and $G_y = M$, the dictionary $\bar{\mathbf{A}}[s]$ reduces to the known virtual channel representation (VCR) [38] in the spatial-narrowband case. Lastly, a similar representation, termed extended VCR, was introduced in [39] for narrowband massive MIMO systems.
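The dictionary construction and the quantization experiment described above can be sketched as follows. This is an illustrative example only: the array size, subcarrier frequency, and off-grid DoA are assumed toy values, the function names are hypothetical, and the phase factor $(1 + f/f_c)$ assumes that $f$ is the baseband frequency, as in the user array response (45).

```python
import numpy as np

def steering(omega, n_elems, f, f_c):
    """1-D response e^{-j 2 pi (1 + f/f_c) n omega}, n = 0, ..., n_elems - 1."""
    n = np.arange(n_elems)
    return np.exp(-2j * np.pi * (1 + f / f_c) * n * omega)

def grid(G):
    """Spatial-frequency grid of (39)-(40): q/G, q = -(G-1)/2, ..., (G-1)/2."""
    return (np.arange(G) - (G - 1) / 2) / G

def wideband_dictionary(N, M, G_x, G_y, f, f_c):
    """Dictionary A_x kron A_y of size (N*M) x (G_x*G_y) at baseband frequency f."""
    A_x = np.stack([steering(w, N, f, f_c) for w in grid(G_x)], axis=1)
    A_y = np.stack([steering(w, M, f, f_c) for w in grid(G_y)], axis=1)
    return np.kron(A_x, A_y)

# Assumed toy setup: 8 x 8 UPA, super-resolution grids with G_x = 4N, G_y = 4M.
N, M, f_c, f = 8, 8, 300e9, 5e9
G_x, G_y = 4 * N, 4 * M
A_bar = wideband_dictionary(N, M, G_x, G_y, f, f_c)

# Off-grid DoA and its normalized array gain against every dictionary column.
w_x, w_y = 0.1234, -0.3021
a_true = np.kron(steering(w_x, N, f, f_c), steering(w_y, M, f, f_c))
gains = np.abs(A_bar.conj().T @ a_true) / (N * M)
best = int(np.argmax(gains))          # column index q * G_y + p of the closest atom
```

The column with the largest gain corresponds to the grid point closest to the true spatial frequencies, and with four-times oversampling its normalized gain stays close to one.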

2) Generalized Multiple Measurement Vector Problem:

Due to the frequency-dependent dictionary, the channel gain vectors $\{\bar{\boldsymbol{\beta}}[s]\}_{s=0}^{S-1}$ share the same support. Therefore, we can exploit the common support property and consider the problem in (35) as a generalized multiple measurement vector (GMMV) problem, where multiple sensing matrices are employed [40]. To tackle the GMMV problem, we employ the simultaneous OMP algorithm [41]. The proposed channel estimator is described in Algorithm 2.

Algorithm 2 GSOMP-Based Estimator
Input: set $\mathcal{S}$ of pilot subcarriers, sensing matrices $\boldsymbol{\Phi}_s$ and measurement vectors $\bar{\mathbf{y}}[s]$, $\forall s \in \mathcal{S}$, and a threshold $\epsilon$.
  $\mathcal{I}_{-1} = \emptyset$, $\mathcal{G} = \{1, \dots, G\}$, $\mathbf{r}_{-1}[s] = \bar{\mathbf{y}}[s]$, $\mathrm{MSE} = \sum_{s \in \mathcal{S}} \|\bar{\mathbf{y}}[s]\|_2^2$, and $l = 0$.
  while $\mathrm{MSE} > \epsilon$ do
    $g^\star = \arg\max_{g \in \mathcal{G} \setminus \mathcal{I}_{l-1}} \sum_{s \in \mathcal{S}} \left|\boldsymbol{\Phi}_s^H(g)\,\mathbf{r}_{l-1}[s]\right|$
    $\mathcal{I}_l = \mathcal{I}_{l-1} \cup \{g^\star\}$
    $\mathbf{r}_l[s] = \left(\mathbf{I}_{N_{\mathrm{beam}}} - \boldsymbol{\Phi}_s(\mathcal{I}_l)\,\boldsymbol{\Phi}_s^{\dagger}(\mathcal{I}_l)\right)\bar{\mathbf{y}}[s]$, $\forall s \in \mathcal{S}$
    $\mathrm{MSE} = \frac{1}{|\mathcal{S}|}\sum_{s \in \mathcal{S}} \|\mathbf{r}_l[s] - \mathbf{r}_{l-1}[s]\|_2^2$
    $l = l + 1$
  end while
  $\hat{\bar{\boldsymbol{\beta}}}[s] = \boldsymbol{\Phi}_s^{\dagger}(\mathcal{I}_{l-1})\,\bar{\mathbf{y}}[s]$, $\forall s \in \mathcal{S}$
  return $\hat{\mathbf{h}}^{\mathrm{CS}}[s] = \bar{\mathbf{A}}[s]\,\hat{\bar{\boldsymbol{\beta}}}[s]$, $\forall s \in \mathcal{S}$.

Regarding the stopping criterion of the OMP/GSOMP algorithm, we design the pilot combiners so that the effective noise is white. In this case, the average noise power is $\mathbb{E}\{\|\bar{\mathbf{n}}[s]\|^2\} = N_{\mathrm{beam}}\sigma^2$, and the threshold can be chosen as $\epsilon = N_{\mathrm{beam}}\sigma^2$, or a fraction of the average noise power. Additionally, a thresholding step can be incorporated into the algorithms, in which only the entries of the estimate $\hat{\bar{\boldsymbol{\beta}}}$ with power higher than the noise variance will be selected as detected paths. After estimating the spatial frequencies of each path, the physical angles are obtained through (37) and (38), which are then used in the TTD-based wideband combiner.
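Algorithm 2 can be sketched in a few lines of NumPy. The snippet below is a minimal illustration rather than the authors' implementation: the function name `gsomp`, the iteration cap `max_iter`, and the random Gaussian sensing matrices in the toy example are assumptions, and the correlation metric sums absolute values over subcarriers as written in Algorithm 2. Running it with a single subcarrier corresponds to the per-subcarrier OMP of Algorithm 1, up to the exclusion of already selected atoms.

```python
import numpy as np

def gsomp(Phi, Y, eps, max_iter=None):
    """Sketch of Algorithm 2.
    Phi: (S, N_beam, G) sensing matrices, Y: (S, N_beam) measurements.
    Returns the common support (list of atom indices) and the (S, G) gain estimates.
    """
    S, N_beam, G = Phi.shape
    max_iter = N_beam if max_iter is None else max_iter   # safety cap (assumption)
    support = []
    R = Y.copy()                                           # residuals r_{l-1}[s]
    mse = np.sum(np.abs(Y) ** 2)
    while mse > eps and len(support) < max_iter:
        # Correlate every atom with the residual, then sum over subcarriers.
        corr = np.abs(np.einsum('sng,sn->sg', Phi.conj(), R)).sum(axis=0)
        if support:
            corr[support] = -np.inf                        # search over G \ I_{l-1}
        support.append(int(np.argmax(corr)))
        R_new = np.empty_like(R)
        for s in range(S):
            sub = Phi[s][:, support]
            coef, *_ = np.linalg.lstsq(sub, Y[s], rcond=None)
            R_new[s] = Y[s] - sub @ coef                   # (I - Phi Phi^+) y
        mse = np.mean(np.sum(np.abs(R_new - R) ** 2, axis=1))
        R = R_new
    beta = np.zeros((S, G), dtype=complex)
    for s in range(S):
        coef, *_ = np.linalg.lstsq(Phi[s][:, support], Y[s], rcond=None)
        beta[s, support] = coef
    return support, beta

# Toy example (assumed sizes): 16 subcarriers sharing a 3-sparse support.
rng = np.random.default_rng(1)
S, N_beam, G, L, sigma2 = 16, 32, 64, 3, 1e-4
Phi = (rng.standard_normal((S, N_beam, G))
       + 1j * rng.standard_normal((S, N_beam, G))) / np.sqrt(2 * N_beam)
true_support = [5, 20, 41]
beta_true = np.zeros((S, G), dtype=complex)
beta_true[:, true_support] = np.exp(2j * np.pi * rng.uniform(size=(S, L)))
noise = np.sqrt(sigma2 / 2) * (rng.standard_normal((S, N_beam))
                               + 1j * rng.standard_normal((S, N_beam)))
Y = np.einsum('sng,sg->sn', Phi, beta_true) + noise
support, beta_hat = gsomp(Phi, Y, eps=N_beam * sigma2)
```

Because the loop stops only once the residual no longer changes, the returned support may contain one spurious atom with a near-zero gain; the thresholding step mentioned above would remove it.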

3) Pilot Beam Design:

The elements of the RF combiner $\mathbf{W}_{\mathrm{RF}}$ are selected from the set $\{-1/\sqrt{N_B}, 1/\sqrt{N_B}\}$ with equal probability. The reason we adopt a randomly formed RF combiner is that it has been shown to have a low mutual-column coherence, and therefore can be expected to attain a high recovery probability according to the compressive sensing theory [42]. This RF pilot design leads to a colored effective noise, whereas the SOMP algorithm is based on the assumption that the noise covariance matrix is diagonal. To this end, we design the baseband combiner such that the combined noise remains white. In particular, let $\mathbf{D}_t^H\mathbf{D}_t$ be the Cholesky decomposition of $\mathbf{W}_{\mathrm{RF},t}^H\mathbf{W}_{\mathrm{RF},t}$, where $\mathbf{D}_t \in \mathbb{C}^{N_{\mathrm{RF}} \times N_{\mathrm{RF}}}$ is an upper triangular matrix. Then, the baseband combiner of the $t$th slot is set to $\mathbf{W}_{\mathrm{BB},t}[s] = \mathbf{D}_t^{-1}$, and hence $\mathbf{W}[s] = \mathbf{W}_{\mathrm{RF}}\,\mathrm{blkdiag}(\mathbf{D}_1^{-1}, \dots, \mathbf{D}_{N_{\mathrm{slot}}}^{-1})$. Under this pilot beam design, the covariance matrix of the effective noise becomes $\mathbf{R}_{\bar{n}} = \sigma^2\mathbf{I}_{N_{\mathrm{beam}}}$, yielding the desired result. We finally point out that the combiners $\mathbf{W}[s]$ can be computed offline.

E. Performance of the Proposed Channel Estimator

1) Lower Bound Error Analysis:

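As previously mentioned, for semi-unitary combiners $\mathbf{W}_t[s]$ with $\mathbf{W}_t^H[s]\mathbf{W}_t[s] = \mathbf{I}_{N_{\mathrm{RF}}}$, $\forall t = 1, \dots, N_{\mathrm{slot}}$, the covariance matrix of the effective noise $\bar{\mathbf{n}}[s]$ is equal to $\sigma^2\mathbf{I}_{N_{\mathrm{beam}}}$. Next, we derive the CRLB assuming that the GSOMP recovers the exact support of $\bar{\boldsymbol{\beta}}[s]$, i.e., $\mathcal{I}_{l-1} = \mathrm{supp}(\bar{\boldsymbol{\beta}}[s]) = \mathcal{I}$. (This is a well accepted assumption in the related literature; see [19] and references therein.) To this end, we can define the following linear model for the $s$th subcarrier

$$\bar{\mathbf{y}}[s] = \boldsymbol{\Phi}_s(\mathcal{I})\,\tilde{\bar{\boldsymbol{\beta}}}[s] + \bar{\mathbf{n}}[s], \tag{41}$$

where $\tilde{\bar{\boldsymbol{\beta}}}[s] \in \mathbb{C}^{L \times 1}$ denotes the vector to be estimated, and $\bar{\mathbf{y}}[s]$ is distributed as $\mathcal{CN}\big(\boldsymbol{\Phi}_s(\mathcal{I})\tilde{\bar{\boldsymbol{\beta}}}[s], \sigma^2\mathbf{I}_{N_{\mathrm{beam}}}\big)$. The model in (41) is linear in the parameter vector $\tilde{\bar{\boldsymbol{\beta}}}[s]$, and the solution $\hat{\bar{\boldsymbol{\beta}}}[s] = \boldsymbol{\Phi}_s^{\dagger}(\mathcal{I})\,\bar{\mathbf{y}}[s]$ gives $\mathbb{E}\{\hat{\bar{\boldsymbol{\beta}}}[s]\} = \tilde{\bar{\boldsymbol{\beta}}}[s]$. Specifically, $\hat{\bar{\boldsymbol{\beta}}}[s]$ is the minimum variance unbiased estimator of $\tilde{\bar{\boldsymbol{\beta}}}[s]$, hence attaining the CRLB [43]. Next, the Fisher information matrix for (41) is calculated as

$$\mathbf{I}\big(\tilde{\bar{\boldsymbol{\beta}}}[s]\big) = \frac{1}{\sigma^2}\,\boldsymbol{\Phi}_s^H(\mathcal{I})\,\boldsymbol{\Phi}_s(\mathcal{I}). \tag{42}$$

The channel estimate for the $s$th subcarrier is acquired as $\hat{\mathbf{h}}^{\mathrm{CS}}[s] = \bar{\mathbf{A}}_s(\mathcal{I})\,\hat{\bar{\boldsymbol{\beta}}}[s]$, where $\bar{\mathbf{A}}_s(\mathcal{I})$ denotes the matrix with the columns of $\bar{\mathbf{A}}[s]$ given by the support $\mathcal{I}$. Let $J_s^{\mathrm{CS}}$ denote the MSE of the OMP. Since $\mathbb{E}\{\hat{\mathbf{h}}^{\mathrm{CS}}\} = \bar{\mathbf{A}}_s(\mathcal{I})\,\tilde{\bar{\boldsymbol{\beta}}}[s] \triangleq \boldsymbol{\psi}\big(\tilde{\bar{\boldsymbol{\beta}}}[s]\big)$, the CRLB for the $s$th subcarrier is given by [43]

$$J_s^{\mathrm{CS}} \geq \mathrm{tr}\left(\frac{\partial\boldsymbol{\psi}\big(\tilde{\bar{\boldsymbol{\beta}}}[s]\big)}{\partial\tilde{\bar{\boldsymbol{\beta}}}[s]}\,\mathbf{I}^{-1}\big(\tilde{\bar{\boldsymbol{\beta}}}[s]\big)\,\frac{\partial\boldsymbol{\psi}^H\big(\tilde{\bar{\boldsymbol{\beta}}}[s]\big)}{\partial\tilde{\bar{\boldsymbol{\beta}}}[s]}\right), \tag{43}$$

where $\partial\boldsymbol{\psi}\big(\tilde{\bar{\boldsymbol{\beta}}}[s]\big)/\partial\tilde{\bar{\boldsymbol{\beta}}}[s] = \bar{\mathbf{A}}_s(\mathcal{I})$.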
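As a rough numerical companion to (42) and (43), the sketch below builds whitened random pilot combiners via a per-slot Cholesky factorization, so that each $\mathbf{W}_t$ is semi-unitary, and evaluates the bound for a known support. All dimensions, the random steering matrix standing in for $\bar{\mathbf{A}}_s(\mathcal{I})$, and the function names are assumptions made for illustration.

```python
import numpy as np

def whitened_pilots(N_B, N_RF, N_slot, rng):
    """Random +-1/sqrt(N_B) RF beams, whitened per slot so that W_t^H W_t = I."""
    blocks = []
    for _ in range(N_slot):
        W_rf = rng.choice([-1.0, 1.0], size=(N_B, N_RF)) / np.sqrt(N_B)
        # NumPy returns a lower-triangular C with W^H W = C C^H, so D = C^H
        # is the upper-triangular factor with W^H W = D^H D.
        D = np.linalg.cholesky(W_rf.conj().T @ W_rf).conj().T
        blocks.append(W_rf @ np.linalg.inv(D))      # W_t = W_RF,t D_t^{-1}
    return np.hstack(blocks)                         # N_B x N_beam

def crlb_mse(Phi_I, A_I, sigma2):
    """Bound (43): tr(A_I J^{-1} A_I^H) with J = Phi_I^H Phi_I / sigma2 from (42)."""
    J = Phi_I.conj().T @ Phi_I / sigma2
    return float(np.real(np.trace(A_I @ np.linalg.inv(J) @ A_I.conj().T)))

# Assumed toy setup: partial training with N_beam = 16 < N_B = 32.
rng = np.random.default_rng(2)
N_B, N_RF, N_slot, L, P_p, sigma2 = 32, 2, 8, 3, 1.0, 0.05
W = whitened_pilots(N_B, N_RF, N_slot, rng)
omegas = rng.uniform(-0.5, 0.5, L)                   # hypothetical on-support DoAs
A_I = np.exp(-2j * np.pi * np.arange(N_B)[:, None] * omegas[None, :])
Phi_I = np.sqrt(P_p) * W.conj().T @ A_I
bound = crlb_mse(Phi_I, A_I, sigma2)
```

Since the model (41) is linear with white Gaussian noise, a Monte Carlo average of $\|\bar{\mathbf{A}}_s(\mathcal{I})(\hat{\bar{\boldsymbol{\beta}}}[s] - \tilde{\bar{\boldsymbol{\beta}}}[s])\|^2$ over noise draws should match `bound` when the support is known.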

2) Complexity Analysis:

In this section, we detail the computational complexity per iteration $l$ of the GSOMP scheme. Specifically, we have the following operations:
• The $\ell_2$-norm operations in the MSE computations have $\mathcal{O}(|\mathcal{S}| N_{\mathrm{beam}})$ complexity.
• The calculation of the product $\boldsymbol{\Phi}_s^H(g)\,\mathbf{r}_{l-1}[s]$ in the atom selection step is $\mathcal{O}(|\mathcal{S}| N_{\mathrm{beam}}(G - l))$ because there are $G - l$ elements to examine at the $l$th iteration, where $G$ is the size of the dictionary.
• Finding the maximum element from $G - l$ values in the same step is on the order of $\mathcal{O}(G - l)$.
• The LS operation in the residual update is $\mathcal{O}(l^3 + 2l^2 N_{\mathrm{beam}})$ for each pilot subcarrier. This is because $\boldsymbol{\Phi}(\mathcal{I}_l)$ is an $N_{\mathrm{beam}} \times l$ matrix, and hence its pseudoinverse entails $l^3 + l^2 N_{\mathrm{beam}}$ operations plus the multiplication with $\boldsymbol{\Phi}(\mathcal{I}_l)$ entailing $l^2 N_{\mathrm{beam}}$ additional multiplications.

Given the above, the overall online computational complexity is $\mathcal{O}\big(|\mathcal{S}|(N_{\mathrm{beam}}(G - l) + l^3 + 2l^2 N_{\mathrm{beam}}) + (G - l)\big)$. Note that the OMP has $\mathcal{O}(|\mathcal{S}| G)$ complexity for finding the maximum correlation between the measurement vector and the columns of the dictionary. As a result, the GSOMP leads to a computational reduction as well.

V. THE MULTI-ANTENNA USER CASE

We now discuss how the previous analysis can be extended to the case of a multi-antenna user. To this end, we consider a user with an $N_U$-element ULA. The frequency response of the uplink channel, $\mathbf{H}(f) \in \mathbb{C}^{N_B \times N_U}$, is then expressed as

$$\mathbf{H}(f) = \sum_{l=0}^{L} \beta_l(f)\,\mathbf{a}_B(\phi_l, \theta_l, f)\,\mathbf{a}_U^H(\varphi_l, f)\,e^{-j2\pi f \tau_l}, \tag{44}$$

where $\mathbf{a}_B(\cdot, \cdot, \cdot)$ denotes the response vector (7) of the BS array, $\varphi_l$ is the angle-of-departure (AoD) of the $l$th path from the user, and

$$\mathbf{a}_U(\varphi, f) \triangleq \left[1, e^{-j2\pi(f_c + f)\frac{d}{c}\sin\varphi}, \dots, e^{-j2\pi(f_c + f)(N_U - 1)\frac{d}{c}\sin\varphi}\right]^T \tag{45}$$

is the wideband response vector of the user array. At the BS, the post-processed baseband signal for the $s$th subcarrier is expressed as

$$\mathbf{y}[s] = \mathbf{F}^H[s]\left(\mathbf{H}[s]\,\mathbf{B}[s]\,\tilde{\mathbf{x}}[s] + \mathbf{n}[s]\right), \tag{46}$$

where $\mathbf{B}[s] \in \mathbb{C}^{N_U \times N_{\mathrm{RF}}^u}$ is the hybrid precoder when the user employs $N_{\mathrm{RF}}^u$ RF chains, $\tilde{\mathbf{x}}[s] = \mathbf{P}[s]\,\mathbf{x}[s]$ is the transmitted signal at the $s$th subcarrier, $\mathbf{P}[s] = \mathrm{diag}(p_{1,s}, \dots, p_{N_{\mathrm{RF}}^u,s})$ is the power allocation matrix, and $\mathbf{x}[s] \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{N_{\mathrm{RF}}^u})$ is the vector of data symbols. Furthermore, the power constraint $\sum_{s=0}^{S-1}\mathbb{E}\{\|\mathbf{B}[s]\tilde{\mathbf{x}}[s]\|^2\} \leq P_t$ should be satisfied, so that the transmit power does not exceed the user's power budget $P_t$.

A. Hybrid Combining and Beamforming

Consider a single-path channel with AoD $\varphi$ from the user and DoA $(\phi, \theta)$ at the BS. For the frequency-flat beamformer $(1/\sqrt{N_U})\,\mathbf{a}_U(\varphi, 0)$ and combiner $(1/\sqrt{N_B})\,\mathbf{a}_B(\phi, \theta, 0)$, the normalized array gain in (13) is recast as in (47), where $\Delta(\varphi) \triangleq d\sin\varphi/c$. Employing TTD-based combining and beamforming yields $G(\phi, \theta, \varphi, f) \approx 1$, and the SNR at the $s$th subcarrier is approximately equal to $|\beta(f_s)|^2 N_U N_B P_d/\sigma^2$. Compared to the single-antenna user case, we have an additional beamforming gain $N_U$.

Now consider, for instance, a multi-path channel of $L = 2$ NLoS paths. In a fully-digital array, the combiner and precoder maximizing the achievable rate are given by the singular value decomposition (SVD) of the channel matrix $\mathbf{H}[s]$ [11]. For our hybrid analog-digital array structure, we adopt a practical approach, as in [17]. We first decompose the channel matrix as $\mathbf{H}(f) = \mathbf{H}_B(f)\,\mathbf{H}_U^H(f)$, where

$$\mathbf{H}_B(f) = \left[\mathbf{a}_B(\phi_1, \theta_1, f),\ \mathbf{a}_B(\phi_2, \theta_2, f)\right], \tag{48}$$

and

$$\mathbf{H}_U(f) = \left[\beta_1(f)\,\mathbf{a}_U(\varphi_1, f)\,e^{-j2\pi f \tau_1},\ \beta_2(f)\,\mathbf{a}_U(\varphi_2, f)\,e^{-j2\pi f \tau_2}\right]. \tag{49}$$

Next, the RF combiner and beamformer are the matched filters of the channels $\mathbf{H}_B(f)$ and $\mathbf{H}_U^H(f)$, respectively, whereas the baseband combiner and precoder are designed using the SVD of the effective channel, when both ends have full CSI. Note that for a multi-path channel with $L > N_{\mathrm{RF}}$ paths, the user communicates at most $\min(L, N_{\mathrm{RF}})$ spatial streams to the BS in the absence of inter-stream interference through SVD-based transmission.

B. Sparse Channel Estimation

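The user employs a training codebook $\{\mathbf{v}_i \in \mathbb{C}^{N_U \times 1}, i = 1, \dots, N_{\mathrm{beam}}^u\}$, which consists of $N_{\mathrm{beam}}^u$ pilot RF beamformers. When the $i$th pilot beamformer is used during $N_{\mathrm{slot}}$ training slots, (28) is recast as

$$\bar{\mathbf{y}}_i[s] = \sqrt{P_p}\,\mathbf{W}^H[s]\,\mathbf{H}[s]\,\mathbf{v}_i + \bar{\mathbf{n}}_i[s]. \tag{50}$$

By collecting all vectors $\bar{\mathbf{y}}_i[s]$ into a single matrix $\mathbf{Y}[s] = [\bar{\mathbf{y}}_1[s], \dots, \bar{\mathbf{y}}_{N_{\mathrm{beam}}^u}[s]] \in \mathbb{C}^{N_{\mathrm{beam}} \times N_{\mathrm{beam}}^u}$, we can write

$$\mathbf{Y}[s] = \sqrt{P_p}\,\mathbf{W}^H[s]\,\mathbf{H}[s]\,\mathbf{V} + \mathbf{N}[s], \tag{51}$$

where $\mathbf{V} = [\mathbf{v}_1, \dots, \mathbf{v}_{N_{\mathrm{beam}}^u}] \in \mathbb{C}^{N_U \times N_{\mathrm{beam}}^u}$, and $\mathbf{N}[s] = [\bar{\mathbf{n}}_1[s], \dots, \bar{\mathbf{n}}_{N_{\mathrm{beam}}^u}[s]] \in \mathbb{C}^{N_{\mathrm{beam}} \times N_{\mathrm{beam}}^u}$. Utilizing the identity $\mathrm{vec}(\mathbf{ABC}) = (\mathbf{C}^T \otimes \mathbf{A})\,\mathrm{vec}(\mathbf{B})$, we express (51) in vector form as

$$\mathrm{vec}(\mathbf{Y}[s]) = \sqrt{P_p}\left(\mathbf{V}^T \otimes \mathbf{W}^H[s]\right)\mathrm{vec}(\mathbf{H}[s]) + \mathrm{vec}(\mathbf{N}[s]), \tag{52}$$

where $\mathrm{vec}(\mathbf{Y}[s]) \in \mathbb{C}^{N_{\mathrm{beam}} N_{\mathrm{beam}}^u \times 1}$ is the overall measurement vector, $\mathrm{vec}(\mathbf{H}[s]) \in \mathbb{C}^{N_B N_U \times 1}$ is the uplink channel to be estimated, and $\mathrm{vec}(\mathbf{N}[s]) \in \mathbb{C}^{N_{\mathrm{beam}} N_{\mathrm{beam}}^u \times 1}$ is the noise vector. Now, the proposed GSOMP-based estimator can readily be used by considering the equivalent sensing matrix $\boldsymbol{\Phi}_s = \sqrt{P_p}\left(\mathbf{V}^T \otimes \mathbf{W}^H[s]\right)\bar{\mathbf{A}}[s] \in \mathbb{C}^{N_{\mathrm{beam}} N_{\mathrm{beam}}^u \times G G_u}$, where $\bar{\mathbf{A}}[s] \triangleq \bar{\mathbf{A}}_u^*[s] \otimes (\bar{\mathbf{A}}_x[s] \otimes \bar{\mathbf{A}}_y[s]) \in \mathbb{C}^{N_B N_U \times G G_u}$ is the equivalent dictionary accounting also for the dictionary $\bar{\mathbf{A}}_u[s] \in \mathbb{C}^{N_U \times G_u}$ of size $G_u$ at the user side. Finally, the estimated channel is constructed as $\mathrm{vec}(\hat{\mathbf{H}}[s]) = \bar{\mathbf{A}}[s]\,\hat{\bar{\boldsymbol{\beta}}}[s]$.

TABLE II: MAIN SIMULATION PARAMETERS [27], [28]

| Parameter | Value |
|---|---|
| Bandwidth | $B = 40$ GHz |
| Carrier frequency | $f_c = 300$ GHz |
| Transmit power | $P_t = 10$ dBm |
| Power density of noise | $\sigma^2 = -[\ldots]$ dBm/Hz |
| Azimuth AoA | $\phi_l \sim \mathcal{U}(-\pi, \pi)$ |
| Polar AoA | $\theta_l \sim \mathcal{U}(-\pi/2, \pi/2)$ |
| LoS path length | $D = 15$ m |
| ToA of LoS | $\tau_0 = 50$ nsec |
| ToA of NLoS | $\tau_l \sim \mathcal{U}(50, 55)$ nsec |
| Absorption coefficient | $k_{\mathrm{abs}} = 0.[\ldots]$ m$^{-1}$ |
| Refractive index | $n_t = 2.[\ldots] - j[\ldots]$ |
| Roughness factor | $\sigma_{\mathrm{rough}} = 0.[\ldots] \cdot 10^{-[\ldots]}$ m |

VI. NUMERICAL RESULTS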

In this section, we conduct numerical simulations to evaluate the performance of the proposed channel estimator and hybrid combiner. To this end, we consider the following setup:
• Number of OFDM Subcarriers: For a NLoS multi-path scenario where $\tau_l \sim \mathcal{U}(50, 55)$ nsec, the delay spread is $D_s = 5$ nsec. The coherence bandwidth is then calculated as $B_c = 1/(2D_s) = 100$ MHz [25], which results in $S \approx B/B_c = 400$ subcarriers. On the other hand, for a LoS scenario, the delay spread is equal to the maximum delay across the UPA due to the spatial-wideband effect. This results in $S \approx 18$ subcarriers for an $N \times M$-element UPA and $B = 40$ GHz.
• Antenna Gain: Each BS antenna element has a directional power pattern, $\Lambda(\phi, \theta)$, which is specified according to the 3GPP standard as [48]

$$\Lambda(\phi, \theta) = \Lambda_{\max} - \min\left[-\Lambda_H(\phi) - \Lambda_V(\theta),\ \Lambda_{\mathrm{FBR}}\right], \tag{53}$$

where

$$\Lambda_H(\phi) = -\min\left[12\left(\frac{\phi}{\phi_{3\mathrm{dB}}}\right)^2,\ \Lambda_{\mathrm{FBR}}\right], \tag{54}$$

$$\Lambda_V(\theta) = -\min\left[12\left(\frac{\theta - 90^\circ}{\theta_{3\mathrm{dB}}}\right)^2,\ \mathrm{SLA}_v\right], \tag{55}$$

where $\min[\cdot, \cdot]$ denotes the minimum between the input arguments, $\Lambda_{\max}$ is the maximum gain in the boresight direction, $\phi_{3\mathrm{dB}} = 65^\circ$ and $\theta_{3\mathrm{dB}} = 65^\circ$ are the horizontal and vertical half-power beamwidths, respectively, $\Lambda_{\mathrm{FBR}} = 30$ dB is the front-to-back ratio, and $\mathrm{SLA}_v = 30$ dB is the side lobe attenuation in the vertical direction. We choose $\Lambda_{\max} = 50$ dBi [27]. At the user side, we assume omnidirectional antennas. The channel model is then recast by replacing $\mathbf{a}(\phi, \theta, f)$ with $\sqrt{\Lambda(\phi, \theta)}\,\mathbf{a}(\phi, \theta, f)$ [49].

The other simulation parameters are summarized in Table II.

$$G(\phi, \theta, \varphi, f) = \frac{\left|\mathbf{a}_B^H(\phi, \theta, 0)\,\mathbf{a}_B(\phi, \theta, f)\right|}{N_B}\,\frac{\left|\mathbf{a}_U^H(\varphi, f)\,\mathbf{a}_U(\varphi, 0)\right|}{N_U} = \left|D_N\big(2\pi f\Delta_x(\phi, \theta)\big)\right|\,\left|D_M\big(2\pi f\Delta_y(\phi, \theta)\big)\right|\,\left|D_{N_U}\big(2\pi f\Delta(\varphi)\big)\right|. \tag{47}$$

Fig. 6: NMSE versus SNR for a single-antenna user. The OMP, NBOMP, and GSOMP estimators are evaluated under partial training of $N_{\mathrm{beam}} = 0.[\ldots]\,N_B$ pilot beams; $N \times M$-element UPA, $N_{\mathrm{RF}} = 2$, NLoS channel with $L = 3$ paths, $S = 400$ subcarriers, and super-resolution dictionary with $G = 4N_B$.

A. Channel Estimation

1) Single-Antenna User:

Our main performance metric is the normalized mean-square error (NMSE) versus the average receive SNR for the estimators introduced previously. Specifically, for a given channel realization, the NMSE metric is defined as

$$\mathrm{NMSE} \triangleq \frac{1}{|\mathcal{S}|}\sum_{s \in \mathcal{S}} \mathbb{E}\left\{\left\|\mathbf{h}[s] - \hat{\mathbf{h}}[s]\right\|^2 \Big/ \left\|\mathbf{h}[s]\right\|^2\right\}, \tag{56}$$

where $\hat{\mathbf{h}}[s]$ denotes the estimate of the corresponding estimator. The NMSE is computed numerically over 100 channel realizations. The channel gains $\{\beta_l(f_s)\}_{l=1}^{L}$ are generated as $\mathcal{CN}(0, \sigma_\beta^2)$, with $\sigma_\beta^2 = 10^{-[\ldots]}$, i.e., $-[\ldots]$ dB, modeling the high path attenuation at THz frequencies [23]. (The path gains are generated in this way in order to have a single average SNR metric.) The average SNR is then calculated as $\mathrm{SNR} = \sigma_\beta^2 P_p / P_n$, where $P_p = P_t/|\mathcal{S}|$ is the power per pilot subcarrier, and $P_n = \Delta B\,\sigma^2$ is the noise power at each subcarrier, with $\Delta B \approx B/S$ being the subcarrier spacing.

In the first numerical experiment, we compare the following estimation schemes:
• The LS scheme under full training, i.e., $N_{\mathrm{beam}} = N_B$.
• The narrowband OMP-based estimator (NBOMP) with a frequency-flat dictionary [44], [45].
• The OMP-based estimator with the frequency-dependent dictionary of Section IV-D.
• The proposed GSOMP-based estimator and its CRLB.

The NMSE metrics for the LS method and the CRLB are computed using (31) and (43) in the numerator of (56), respectively. The NMSE attained by each scheme is depicted in Fig. 6(a). As we observe, the NMSE of the LS method is prohibitively high since it scales linearly with the number of BS antennas. Likewise, the NBOMP exhibits a very poor performance since it neglects the spatial-wideband effect. Moreover, the OMP-based estimator fails to successfully recover the common support in the low SNR regime, hence resulting in significant estimation errors. On the other hand, the proposed GSOMP-based estimator accurately detects the common support of the channel gain vectors for all SNR values ranging from $-[\ldots]$ dB to $[\ldots]$ dB, and thus attains the CRLB.

Next, we focus on state-of-the-art estimation techniques based on the OMP. To this end, we single out the work in [46], which proposed a nonuniform dictionary and an RF pilot beam design based on the DFT for a narrowband system with ULAs; henceforth, we will refer to this scheme as OMP-DFT. Here, we extend the said design to the UPA case with spatial-wideband effects, and compare it with our proposed method. As we see from Fig. 6(b), the GSOMP outperforms the OMP-DFT. The poor performance of the OMP-DFT stems from the fact that the dictionary and RF pilot beams become highly correlated for a large number of BS antennas and high SNR values. To see this, recall that the dictionary resembles a DFT matrix. Consequently, the product of the DFT-based pilot combiner and the dictionary tends to have multiple close-to-zero columns, hence destroying the incoherence of the equivalent sensing matrix.

2) Multi-Antenna User:

We now investigate how multiple user antennas affect the channel estimation performance at the BS. In order to have a fair comparison between the single-antenna and multi-antenna user cases, we fix the total number of antennas to $N_B N_U = 160$, and consider an $N \times M$-element UPA at the BS and an $N_U$-element ULA at the user. (In this way, the overhead of partial training, $0.[\ldots]\,N_B N_U$, is kept fixed too.) For $\varphi \sim \mathcal{U}(-\pi/2, \pi/2)$, the continuous spatial frequency $\omega = \tfrac{1}{2}\sin\varphi$ lies in the interval $[-1/2, 1/2]$. Thus, the user's dictionary consists of the spatial frequencies $\{\bar{\omega}(p) = p/G_u,\ p = -(G_u - 1)/2, \dots, (G_u - 1)/2\}$. The elements of the pilot RF beamformers $\{\mathbf{v}_i\}$ are selected from the set $\{-1/\sqrt{N_U}, 1/\sqrt{N_U}\}$ with equal probability.

Fig. 7: NMSE versus SNR for a user with an $N_U$-element ULA; $N \times M$-element UPA, $N_{\mathrm{RF}} = 2$, NLoS channel with $L = 3$ paths, $S = 400$ subcarriers, and super-resolution dictionaries with $G = 4N_B$ and $G_u = 4N_U$.

The NMSE is computed by replacing $\mathbf{h}[s]$ and $\hat{\mathbf{h}}[s]$ in (56) with $\mathrm{vec}(\mathbf{H}[s])$ and $\mathrm{vec}(\hat{\mathbf{H}}[s])$, respectively. The MSE of the LS scheme (31) is the same as in the single-antenna user case since we have kept the total number of antennas fixed. Figure 7 depicts the performance of the GSOMP and OMP. As observed, there is a slight increase in the NMSE compared to the single-antenna user case, i.e., Fig. 6(a). Furthermore, this increase becomes significant in the high SNR regime; nevertheless, the proposed estimator outperforms the OMP for low and moderate SNR values. The performance degradation occurs because the equivalent sensing matrices $\{\boldsymbol{\Phi}_s\}_{s=0}^{S-1}$ have higher total coherence compared to the single-antenna user case, which is defined for each matrix $\boldsymbol{\Phi}_s$ as [46]

$$\mu(\boldsymbol{\Phi}_s) \triangleq \sum_{i=1}^{G G_u}\ \sum_{j=1, j \neq i}^{G G_u} \frac{\left|\boldsymbol{\Phi}_s^H(i)\,\boldsymbol{\Phi}_s(j)\right|}{\left\|\boldsymbol{\Phi}_s(i)\right\|\,\left\|\boldsymbol{\Phi}_s(j)\right\|}. \tag{57}$$

It is worth pointing out that different pilot beam designs might change the performance of the estimators, which hinges on the coherence of the equivalent sensing matrices $\{\boldsymbol{\Phi}_s\}_{s=0}^{S-1}$.
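The total coherence in (57) is straightforward to evaluate numerically. The sketch below is an illustration with assumed random matrices; it also checks how the metric behaves for a Kronecker-structured matrix, which is the structure the multi-antenna sensing matrix takes by the mixed-product property $(\mathbf{A} \otimes \mathbf{B})(\mathbf{C} \otimes \mathbf{D}) = \mathbf{AC} \otimes \mathbf{BD}$. The function name `total_coherence` is hypothetical.

```python
import numpy as np

def total_coherence(Phi):
    """Total coherence (57): sum over i != j of |Phi_i^H Phi_j| / (||Phi_i|| ||Phi_j||)."""
    cols = Phi / np.linalg.norm(Phi, axis=0, keepdims=True)
    gram = np.abs(cols.conj().T @ cols)
    return float(gram.sum() - np.trace(gram))

# Assumed toy matrices standing in for the user-side and BS-side sensing factors.
rng = np.random.default_rng(4)
Phi_u = rng.standard_normal((2, 8)) + 1j * rng.standard_normal((2, 8))
Phi_b = rng.standard_normal((12, 40)) + 1j * rng.standard_normal((12, 40))

mu_u, mu_b = total_coherence(Phi_u), total_coherence(Phi_b)
mu_kron = total_coherence(np.kron(Phi_u, Phi_b))

# The normalized Gram matrix of a Kronecker product is the Kronecker product of
# the normalized Gram matrices, so the totals (including diagonals) multiply.
G_u, G_b = Phi_u.shape[1], Phi_b.shape[1]
mu_kron_pred = (mu_u + G_u) * (mu_b + G_b) - G_u * G_b
```

Under this structure, the total coherence of the combined matrix is always at least that of the BS-side factor alone, which is consistent with the degradation reported for the multi-antenna user case.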

3) Subcarrier Selection:

In the previous experiments, we assumed that the GSOMP-based estimator employs all the subcarriers, i.e., $|\mathcal{S}| = 400$, to estimate the common support of the channel gain vectors $\{\bar{\boldsymbol{\beta}}[s]\}_{s=0}^{S-1}$. However, this might lead to a very high computational burden. Thus, we can employ only a set of successive subcarriers to detect the common support, i.e., the support detection loop of Algorithm 2, and then use this support to estimate the channel at every subcarrier $s \in \mathcal{S}$, which corresponds to the final LS step of Algorithm 2. We refer to this scheme as GSOMP with subcarrier selection (GSOMP-SS). From Fig. 8, we observe that we can accurately estimate the uplink channel in the moderate SNR regime by employing only a small number of pilot subcarriers in the common support detection steps. Note, though, that using one subcarrier per $[\ldots]$ pilot subcarriers slightly increases the NMSE in the low SNR regime.

Fig. 8: NMSE versus SNR for a single-antenna user. In GSOMP-SS, one pilot subcarrier per $[\ldots]$ subcarriers is used to detect the common support; $N \times M$-element UPA, $N_{\mathrm{RF}} = 2$, NLoS channel with $L = 3$ paths, and $S = 400$ subcarriers.

Fig. 9: Normalized array gain for an $N \times M$-element UPA. In the proposed scheme, $N_{\mathrm{sb}} M_{\mathrm{sb}} - [\ldots]$ TTD elements are employed; LoS channel, $(\phi_0, \theta_0) = (\pi/[\ldots], \pi/[\ldots])$, and $S = 18$ subcarriers.

B. Hybrid Combining for Single-Antenna Users

1) Achievable Rate with Perfect CSI:

We start the performance assessment of our combiner by considering a LoS channel. In this case, the complex path gain is given by $\beta_0(f) = \alpha_0(f)\,e^{-j2\pi f_c \tau_0}$, where $\tau_0 = D/c$ is the ToA of the LoS path, and $\alpha_0(f)$ is specified according to (8). For each channel realization, perfect knowledge of the DoA is assumed at the BS, which can be acquired using the GSOMP estimator. We also consider the following cases:
• A fully-digital architecture where the BS employs the frequency-selective combiner $(1/\sqrt{N_B})\,\mathbf{a}(\phi_0, \theta_0, f)$.
• A hybrid architecture where the BS uses the narrowband combiner $(1/\sqrt{N_B})\,\mathbf{a}(\phi_0, \theta_0, 0)$.
• A hybrid architecture where the proposed combiner (22) is used, with $N_{\mathrm{sb}} = 10$ and $M_{\mathrm{sb}} = 10$ virtual subarrays.

The normalized array gain is plotted in Fig. 9, where we see that our combiner attains approximately the maximum gain over the entire signal bandwidth of $B = 40$ GHz. Next, we focus on the average achievable rate, which is calculated as

$$R = \sum_{s=1}^{S} \Delta B\,\mathbb{E}\left\{\log_2\left(1 + \frac{P_d\left|\mathbf{f}_{\mathrm{RF}}^H\mathbf{h}[s]\right|^2}{\Delta B\,\sigma^2}\right)\right\}, \tag{58}$$

where $P_d = P_t/S$ is the power per subcarrier, and $\mathbf{f}_{\mathrm{RF}}$ denotes the corresponding combiner. The results are given in Fig. 10.

Fig. 10: Average achievable rate under perfect CSI for a LoS channel; single-antenna user, $N \times M$-element UPA, $[\ldots]$ TTD elements in the proposed scheme, and $S = 18$ subcarriers.

Specifically, the achievable rates are $[\ldots]$ Gbps, $[\ldots]$ Gbps, and $[\ldots]$ Gbps for the digital, proposed, and narrowband schemes, respectively. Thus, the proposed combiner performs very close to the fully-digital scheme, while offering a $[\ldots]$ gain with respect to the narrowband combiner. Additionally, this is done by employing only $N_{\mathrm{sb}} M_{\mathrm{sb}} - [\ldots]$ TTD elements for an $N \times M$-element UPA, which yields an excellent trade-off between hardware complexity and performance. Lastly, note that transmission rates of at least $R = 0.[\ldots]$ Tbps at $D = 15$ meters can be achieved through an $N \times M$-element UPA, which would not be feasible with an equivalent ULA under a footprint constraint.
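To make the rate expression (58) and the effect of beam squint concrete, the following sketch compares a frequency-flat combiner with a frequency-selective one on a single-path LoS channel. The array size, DoA, unit path gain, and normalized noise level are assumptions chosen for illustration and do not reproduce the paper's simulation setup; the TTD-based combiner (22) itself is not implemented here.

```python
import numpy as np

def upa_response(w_x, w_y, N, M, f, f_c):
    """Array response (36) for half-wavelength spacing; f is the baseband frequency."""
    n = np.arange(N)[:, None]
    m = np.arange(M)[None, :]
    return np.exp(-2j * np.pi * (1 + f / f_c) * (n * w_x + m * w_y)).ravel()

def achievable_rate(H, F, P_d, delta_B, sigma2):
    """Rate (58) for one channel realization.
    H, F: (S, N_B) arrays holding h[s] and the unit-norm combiner used at subcarrier s."""
    gain = np.abs(np.sum(F.conj() * H, axis=1)) ** 2
    return float(np.sum(delta_B * np.log2(1 + P_d * gain / (delta_B * sigma2))))

# Assumed toy setup: 32 x 32 UPA, 18 subcarriers over 40 GHz at 300 GHz.
N, M, f_c, B, S = 32, 32, 300e9, 40e9, 18
w_x, w_y = 0.30, 0.20                       # hypothetical spatial frequencies
freqs = np.linspace(-B / 2, B / 2, S)
delta_B = B / S
P_d, sigma2 = 1.0, 1.0 / delta_B            # normalized so P_d / (delta_B sigma2) = 1

H = np.stack([upa_response(w_x, w_y, N, M, f, f_c) for f in freqs])   # unit path gain
F_ideal = H / np.sqrt(N * M)                                           # frequency-selective
F_narrow = np.tile(upa_response(w_x, w_y, N, M, 0.0, f_c) / np.sqrt(N * M), (S, 1))

rate_ideal = achievable_rate(H, F_ideal, P_d, delta_B, sigma2)
rate_narrow = achievable_rate(H, F_narrow, P_d, delta_B, sigma2)
```

The frequency-selective combiner collects the full array gain $N_B$ on every subcarrier, whereas the narrowband combiner loses gain toward the band edges, so `rate_narrow` falls below `rate_ideal`.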

2) Achievable Rate with Imperfect CSI:

We now evaluate the average achievable rate attained by the proposed combiner along with the GSOMP-based estimator. To this end, we consider a NLoS multi-path channel. The complex path gain of the $l$th NLoS path is $\beta_l(f) = \alpha_l(f)\,e^{-j2\pi f_c \tau_l}$, where $\tau_l$ is the ToA, and $\alpha_l(f)$ is calculated according to (10) assuming $\phi_{i,l} \sim \mathcal{U}(-\pi/[\ldots], \pi/[\ldots])$. Under imperfect CSI, the BS treats the channel estimate as the true channel, and combines the received signal with the maximum-ratio combiner $\hat{\mathbf{h}}[s]/\|\hat{\mathbf{h}}[s]\|$. Let $\mathbf{h}[s] = \hat{\mathbf{h}}[s] - \mathbf{e}[s]$, with $\mathbf{e}[s]$ denoting the channel estimation error for the $s$th subcarrier. The combined signal for the $s$th subcarrier is then written as

$$y[s] = \sqrt{P_d}\,\|\hat{\mathbf{h}}[s]\|\,x[s] - \sqrt{P_d}\,\frac{\hat{\mathbf{h}}^H[s]\,\mathbf{e}[s]}{\|\hat{\mathbf{h}}[s]\|}\,x[s] + \frac{\hat{\mathbf{h}}^H[s]}{\|\hat{\mathbf{h}}[s]\|}\,\mathbf{n}[s] = \sqrt{P_d}\,\|\hat{\mathbf{h}}[s]\|\,x[s] + n_{\mathrm{eff}}[s], \tag{59}$$

where $n_{\mathrm{eff}}[s] = \big(-\sqrt{P_d}\,\hat{\mathbf{h}}^H[s]\,\mathbf{e}[s]\,x[s] + \hat{\mathbf{h}}^H[s]\,\mathbf{n}[s]\big)/\|\hat{\mathbf{h}}[s]\|$ is the effective noise. Unfortunately, it is challenging to derive an achievable rate of channel model (59) since the effective noise is correlated with the desired signal. Nevertheless, as shown in the previous numerical results, the channel estimation error is small. Hence, it is reasonably assumed that, conditioned on the channel estimates, the effective noise is uncorrelated with the desired signal. Then, we obtain the following approximation for the equivalent SNR at the $s$th subcarrier [47]

$$\mathrm{SNR}_{\mathrm{eq}}[s] \approx \frac{P_d\,\|\hat{\mathbf{h}}[s]\|^2}{\Delta B\,\sigma^2 + P_d\,\hat{\mathbf{h}}^H[s]\,\mathbf{R}_e[s]\,\hat{\mathbf{h}}[s]/\|\hat{\mathbf{h}}[s]\|^2}, \tag{60}$$

where $\mathbf{R}_e[s] \triangleq \mathbb{E}\{\mathbf{e}[s]\,\mathbf{e}^H[s]\}$. The corresponding average achievable rate under imperfect CSI is then [47]

$$R \approx \sum_{s=1}^{S} \Delta B\,\mathbb{E}\left\{\log_2\left(1 + \mathrm{SNR}_{\mathrm{eq}}[s]\right)\right\}. \tag{61}$$

A closed-form expression for $\mathbf{R}_e[s]$ can be derived by assuming perfect recovery of the common support of the channel gain vectors. More specifically, from the CRLB analysis, we have that the error $\mathbf{e}[s] \triangleq \bar{\mathbf{A}}_s(\mathcal{I})\big(\hat{\bar{\boldsymbol{\beta}}}[s] - \tilde{\bar{\boldsymbol{\beta}}}[s]\big)$ is distributed as $\mathcal{CN}(\mathbf{0}, \mathbf{R}_e[s])$, where $\mathbf{R}_e[s] = \bar{\mathbf{A}}_s(\mathcal{I})\,\mathbf{I}^{-1}\big(\tilde{\bar{\boldsymbol{\beta}}}[s]\big)\,\bar{\mathbf{A}}_s^H(\mathcal{I})$. Figure 11 depicts the average achievable rate under perfect and imperfect CSI. In the imperfect CSI case, the common support of the channel gain vectors is computed by the GSOMP-based estimator. As observed from Fig. 11, the average achievable rate attained by the proposed channel estimator approaches that of the perfect CSI case.

Fig. 11: Average achievable rate under imperfect CSI for a NLoS channel with $L = 2$ paths; single-antenna user, $N \times M$-element UPA, $[\ldots]$ TTD elements per RF chain, and $S = 400$ subcarriers.

C. Hybrid SVD Transmission for Multi-Antenna Users

In this section, we consider a multi-antenna user. As previously shown, we can accurately estimate the channel using the GSOMP-based estimator, and hence perfect CSI is assumed. To have a fair comparison between the single-antenna and multi-antenna user cases, we fix the total number of antennas to $N_U N_B$ = 100 × , and we consider an × -element UPA at the BS and an -element ULA at the user. Due to the small user array size, we assume a fully-digital array at the user, where $N_{\mathrm{RF}}^{u} = N_U = 2$. Subsequently, we compare the following transmission schemes:

• Digital: the combiner $\mathbf{F}[s]$ and precoder $\mathbf{B}[s]$ are designed using the SVD of the channel $\mathbf{H}[s]$.
• Proposed: the wideband RF combiner $\mathbf{F}_{\mathrm{RF}}[s]$ implements the scaled matrix $\frac{1}{\sqrt{N_B}} \mathbf{H}_B(f)$, defined in (48), using TTD and virtual array partition. The baseband combiner $\mathbf{F}_{\mathrm{BB}}[s]$ and precoder $\mathbf{B}[s]$ are then designed using the SVD of the effective channel $\mathbf{F}_{\mathrm{RF}}^{H}[s]\, \mathbf{H}[s]$.
• Narrowband: the frequency-flat RF combiner $\mathbf{F}_{\mathrm{RF}}$ implements the scaled matrix $\frac{1}{\sqrt{N_B}} \mathbf{H}_B(0)$, defined in (48). The baseband combiner $\mathbf{F}_{\mathrm{BB}}[s]$ and precoder $\mathbf{B}[s]$ are then designed based on the SVD of the effective channel $\mathbf{F}_{\mathrm{RF}}^{H}\, \mathbf{H}[s]$.

The average achievable rate is calculated as

$$R = \sum_{s=0}^{S-1} \sum_{n=0}^{N_{\mathrm{RF}}^{u}} \Delta B \, \mathbb{E}\left\{ \log_2\left(1 + \frac{p_{n,s}\, \sigma_n^{2}\big(\mathbf{F}^{H}[s]\, \mathbf{H}[s]\, \mathbf{B}[s]\big)}{\Delta B\, \sigma^{2}}\right) \right\}, \tag{62}$$

where the set $\{p_{n,s}\}$ of powers is calculated using the waterfilling power allocation algorithm, and $\sigma_n(\cdot)$ denotes the $n$th singular value of the input matrix. Fig. 12 confirms the effectiveness of the proposed TTD-based method, which performs close to the fully-digital transmission scheme. More importantly, the deployment of a few antennas at the user side, along with waterfilling power allocation, boosts the average achievable rate compared to the single-antenna user case, enabling rates much higher than $R$ = 0. Tbps at a distance $D = 15$ m. Another benefit of having multiple user antennas is the reduction of the BS array size, which permits combating the spatial-wideband effect with a small number of TTD elements. In particular, for the × -element UPA under consideration, we have used $N_{\mathrm{sb}} = 10$ and $M_{\mathrm{sb}} = 5$ virtual subarrays, resulting in $N_{\mathrm{sb}} M_{\mathrm{sb}} -$ TTD elements.

Fig. 12: Average achievable rate for a NLoS channel with $L = 2$ paths; multi-antenna user with an -element ULA, × -element UPA, TTD elements per RF chain, and $S = 400$ subcarriers.
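The rate computation in (62) for the fully-digital scheme can be sketched numerically as follows. This is a minimal illustration rather than the simulation code behind Fig. 12: the function names `waterfilling` and `svd_rate`, the single sum-power constraint `total_power` shared by all subcarriers and streams, and the noise power spectral density `noise_psd` are assumptions introduced here, and the expectation in (62) is replaced by a single channel realization.

```python
import numpy as np


def waterfilling(gains, total_power):
    """Water-filling over parallel channels (hypothetical helper).

    gains[i] is the SNR per unit power of channel i, assumed strictly
    positive. Returns powers p[i] >= 0 that sum to total_power.
    """
    g = np.asarray(gains, dtype=float)
    order = np.argsort(g)[::-1]          # strongest channel first
    inv = 1.0 / g[order]
    powers = np.zeros_like(g)
    for k in range(g.size, 0, -1):
        # Water level if only the k strongest channels are active.
        level = (total_power + inv[:k].sum()) / k
        p = level - inv[:k]
        if p[-1] >= 0.0:                 # weakest active channel stays non-negative
            powers[order[:k]] = p
            break
    return powers


def svd_rate(H, total_power, noise_psd, delta_b):
    """Rate in bit/s in the form of (62) for one channel realization.

    H has shape (S, N_B, N_U): one channel matrix per subcarrier. With
    SVD-based combining and precoding, the singular values of
    F^H[s] H[s] B[s] equal those of H[s].
    """
    sv = np.linalg.svd(H, compute_uv=False)            # shape (S, min(N_B, N_U))
    gains = (sv ** 2 / (delta_b * noise_psd)).ravel()  # sigma_n^2 / (Delta_B * sigma^2)
    p = waterfilling(gains, total_power)
    return delta_b * np.sum(np.log2(1.0 + p * gains))


if __name__ == "__main__":
    # Toy i.i.d. Gaussian channel, used only to exercise the functions.
    rng = np.random.default_rng(0)
    H = (rng.standard_normal((4, 8, 2)) + 1j * rng.standard_normal((4, 8, 2))) / np.sqrt(2)
    print(svd_rate(H, total_power=1.0, noise_psd=1e-2, delta_b=1.0))
```

For the proposed and narrowband schemes, the same routine would be applied to the effective channel after RF combining instead of $\mathbf{H}[s]$, under the same assumptions.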

D. Near-Field Considerations

In the far-field region, the spherical wavefront degenerates to a plane wavefront, which allows the use of the parallel-ray approximation to derive the array response vector (7). Due to the large array aperture of THz massive MIMO, though, near-field considerations are of particular interest. Recall that near-field refers to distances smaller than the Fraunhofer distance $D_f \triangleq 2D_{\max}^{2}/\lambda$, where $D_{\max}$ is the maximum dimension of the antenna array, and $\lambda$ is the carrier wavelength. For a UPA with $N = M$, we have $D_{\max}^{2} = 2(N-1)^{2} d^{2}$, i.e., the squared length of its diagonal, which leads to $D_f = (N-1)^{2}\lambda$ for a half-wavelength spacing. Then, for $f_c = 300$ GHz and an × -element UPA, $D_f \approx$ . meters. As a result, the plane wave assumption may no longer hold at small distances from the BS [50]. In this case, a spherical wavefront is a more appropriate model [51]. Under this model, the array response matrix, $\mathbf{A}(\phi, \theta, f) \in \mathbb{C}^{M \times N}$, of the BS is defined as

$$[\mathbf{A}(\phi, \theta, f)]_{m,n} \triangleq e^{-j2\pi (f_c + f) \frac{D_{mn}(\phi, \theta)}{c}}, \tag{63}$$

where $D_{mn}(\phi, \theta) = \big((x - nd)^{2} + (y - md)^{2} + z^{2}\big)^{1/2}$ is the distance between the $(n, m)$th BS antenna and the scatterer with coordinates $(x, y, z)$; $x \triangleq D\cos\phi\sin\theta$, $y \triangleq D\sin\phi\sin\theta$, and $z \triangleq D\cos\theta$, where $D$ denotes the distance from the $(0, 0)$th BS antenna. The array response vector is then obtained as $\mathbf{a}(\phi, \theta, f) = \mathrm{vec}(\mathbf{A}(\phi, \theta, f))$. We now calculate the average achievable rate for the TTD-based combiner (22) under the plane and spherical wave models. The combiner is designed assuming a plane wavefront in both cases. From Fig. 13, a very good match between the two models is observed even for distances smaller than the Fraunhofer distance. Thus, the proposed combiner can be used at near-field distances without incurring a significant rate loss. However, we stress that a comprehensive study of the near-field effects under different array arrangements and sizes is left for future work.

Fig. 13: Average achievable rate of the TTD-based wideband combiner for a LoS channel; single-antenna user, and × -element UPA.

VII. CONCLUSIONS

We have proposed a solution to the channel estimation and hybrid combining problems in wideband THz massive MIMO. Specifically, we first derived the THz channel model with SFW effects for a UPA at the BS and a single-antenna user. We then showed that standard narrowband combining leads to a severe reduction of the array gain due to beam squint. To tackle this problem, we introduced a novel TTD-based wideband combiner with a low-complexity implementation due to the virtual subarray rationale. We next proposed a CS algorithm along with a wideband dictionary to reliably acquire the CSI with reduced training overhead under the spatial-wideband effect. To study the performance of the proposed schemes, we derived the CRLB and computed the achievable rate under imperfect CSI. We also extended our analysis to the multi-antenna user case, and presented numerical results.

Simulations demonstrated that our design provides nearly beam squint-free operation, and enables accurate CSI acquisition even in the low SNR regime. Regarding the insights drawn from our study, the deployment of multiple antennas at the user can alleviate the spatial-wideband effect by reducing the BS array size, while keeping the total number of antennas constant. As a result, the TTD-based wideband array can offer the power gain required to compensate for the very high propagation losses at THz bands. Additionally, in the case of multi-path propagation, it has been shown that SVD-based transmission can boost performance and permit rates of more than half a terabit per second over a distance of several meters. In conclusion, wideband massive MIMO is expected to be a key enabler for future THz wireless networks.

Regarding future work, it would be interesting to study the performance of wideband THz massive MIMO under hardware impairments, as well as to investigate the beam tracking problem in high-mobility scenarios. Moreover, it would be interesting to compare OFDM with SC-FDE, and derive an analytical expression for the PAPR metric.

APPENDIX A

For the normalized array gain, we have that

$$\frac{\left|\mathbf{a}^{H}(\phi, \theta, 0)\, \mathbf{a}(\phi, \theta, f)\right|}{N_B} = \frac{\left|\big(\mathbf{a}_x^{H}(\phi, \theta, 0) \otimes \mathbf{a}_y^{H}(\phi, \theta, 0)\big)\big(\mathbf{a}_x(\phi, \theta, f) \otimes \mathbf{a}_y(\phi, \theta, f)\big)\right|}{NM} = \frac{\left|\big(\mathbf{a}_x^{H}(\phi, \theta, 0)\, \mathbf{a}_x(\phi, \theta, f)\big)\big(\mathbf{a}_y^{H}(\phi, \theta, 0)\, \mathbf{a}_y(\phi, \theta, f)\big)\right|}{NM}.$$

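Then, it holds

$$\frac{\left|\mathbf{a}_x^{H}(\phi, \theta, 0)\, \mathbf{a}_x(\phi, \theta, f)\right|}{N} = \frac{1}{N}\left|\sum_{n=0}^{N-1} e^{-j2\pi f n \frac{d}{c}\sin\theta\cos\phi}\right| = \frac{1}{N}\left|\frac{1 - e^{-j2\pi f N \frac{d}{c}\sin\theta\cos\phi}}{1 - e^{-j2\pi f \frac{d}{c}\sin\theta\cos\phi}}\right| = \frac{1}{N}\left|\frac{\sin(N\pi f \Delta_x)}{\sin(\pi f \Delta_x)}\right| = D_N(2\pi f \Delta_x),$$

where $\Delta_x = \frac{d}{c}\sin\theta\cos\phi$. Likewise, we get

$$\frac{\left|\mathbf{a}_y^{H}(\phi, \theta, 0)\, \mathbf{a}_y(\phi, \theta, f)\right|}{M} = D_M(2\pi f \Delta_y),$$

where $\Delta_y = \frac{d}{c}\sin\theta\sin\phi$, which yields the desired result.

APPENDIX B

Using the identity $\mathbf{a}_x \otimes \mathbf{a}_y = \mathrm{vec}\big(\mathbf{a}_y \mathbf{a}_x^{T}\big)$, we have

$$\mathbf{A}(\phi, \theta, f) \triangleq \mathbf{a}_y(\phi, \theta, f)\, \mathbf{a}_x^{T}(\phi, \theta, f) = \begin{bmatrix} \mathbf{a}_{y,1}(\phi, \theta, f) \\ \vdots \\ \mathbf{a}_{y,M_{\mathrm{sb}}}(\phi, \theta, f) \end{bmatrix} \begin{bmatrix} \mathbf{a}_{x,1}^{T}(\phi, \theta, f), & \cdots, & \mathbf{a}_{x,N_{\mathrm{sb}}}^{T}(\phi, \theta, f) \end{bmatrix} = \begin{bmatrix} \mathbf{A}_{11}(\phi, \theta, f) & \cdots & \mathbf{A}_{1N_{\mathrm{sb}}}(\phi, \theta, f) \\ \mathbf{A}_{21}(\phi, \theta, f) & \cdots & \mathbf{A}_{2N_{\mathrm{sb}}}(\phi, \theta, f) \\ \vdots & \ddots & \vdots \\ \mathbf{A}_{M_{\mathrm{sb}}1}(\phi, \theta, f) & \cdots & \mathbf{A}_{M_{\mathrm{sb}}N_{\mathrm{sb}}}(\phi, \theta, f) \end{bmatrix}, \tag{64}$$

where $\mathbf{A}_{mn}(\phi, \theta, f) \triangleq \mathbf{a}_{y,m}(\phi, \theta, f)\, \mathbf{a}_{x,n}^{T}(\phi, \theta, f)$. We also have that

$$\mathbf{A}_{mn}(\phi, \theta, f) = \mathbf{a}_{y,m}(\phi, \theta, f)\, \mathbf{a}_{x,n}^{T}(\phi, \theta, f) = e^{-j2\pi (n-1)\tilde{N}(f_c + f)\Delta_x - j2\pi (m-1)\tilde{M}(f_c + f)\Delta_y}\, \mathbf{A}_{11}(\phi, \theta, f).$$

Using the above relationships, we can write

$$\mathbf{A}(\phi, \theta, 0) \odot \mathbf{T}[s] = \mathbf{v}_y \mathbf{v}_x^{T}, \tag{65}$$

where

$$\mathbf{v}_x = \left[ e^{-j2\pi (n-1)\tilde{N}(f_c + f)\Delta_x}\, \mathbf{a}_{x,1}(\phi, \theta, 0) \right]_{n=1}^{N_{\mathrm{sb}}}, \tag{66}$$

and

$$\mathbf{v}_y = \left[ e^{-j2\pi (m-1)\tilde{M}(f_c + f)\Delta_y}\, \mathbf{a}_{y,1}(\phi, \theta, 0) \right]_{m=1}^{M_{\mathrm{sb}}}. \tag{67}$$

Now consider a path with array response $\mathbf{a}(\phi, \theta, f)$. Then,

$$\begin{aligned} \mathbf{f}_{\mathrm{RF}}^{H}\, \mathbf{a}(\phi, \theta, f) &= \frac{1}{\sqrt{N_B}}\, \mathrm{vec}^{H}\big(\mathbf{A}(\phi, \theta, 0) \odot \mathbf{T}[s]\big)\, \mathbf{a}(\phi, \theta, f) \\ &= \frac{\sqrt{N_B}}{N_B} \big(\mathbf{v}_x^{H} \otimes \mathbf{v}_y^{H}\big)\big(\mathbf{a}_x(\phi, \theta, f) \otimes \mathbf{a}_y(\phi, \theta, f)\big) \\ &= \frac{\sqrt{N_B}}{N_B} \big(\mathbf{v}_x^{H}\, \mathbf{a}_x(\phi, \theta, f)\big)\big(\mathbf{v}_y^{H}\, \mathbf{a}_y(\phi, \theta, f)\big) \\ &= \sqrt{N_B}\; \frac{\mathbf{a}_{x,1}^{H}(\phi, \theta, 0)\, \mathbf{a}_{x,1}(\phi, \theta, f)}{\tilde{N}}\; \frac{\mathbf{a}_{y,1}^{H}(\phi, \theta, 0)\, \mathbf{a}_{y,1}(\phi, \theta, f)}{\tilde{M}}. \end{aligned}$$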

As a result, we obtain (23) in Proposition 1.

REFERENCES

[1] T. S. Rappaport et al., “Wireless communications and applications above 100 GHz: opportunities and challenges for 6G and beyond,” IEEE Access, vol. 7, pp. 78729–78757, 2019.
[2] T. Kürner, “Towards future THz communications systems,” Terahertz Sci. Technol., vol. 5, no. 1, pp. 11–17, 2012.
[3] J. Zhang et al., “Prospective multiple antenna technologies for beyond 5G,” IEEE J. Sel. Areas Commun., vol. 38, no. 8, pp. 1637–1660, Aug. 2020.
[4] H. J. Song and T. Nagatsuma, “Present and future of terahertz communications,” IEEE Trans. THz Sci. Technol., vol. 1, no. 1, pp. 256–263, Sep. 2011.
[5] B. Wang, F. Gao, S. Jin, H. Lin, and G. Y. Li, “Spatial- and frequency-wideband effects in millimeter wave massive MIMO systems,” IEEE Trans. Signal Process., vol. 66, no. 13, pp. 3393–3406, Jul. 2018.
[6] B. Wang et al., “Spatial-wideband effect in massive MIMO with application in mmWave systems,” IEEE Commun. Mag., vol. 56, no. 12, pp. 134–141, Dec. 2018.
[7] R. J. Mailloux, Phased Array Antenna Handbook. Norwood, MA, USA: Artech House, 2005.
[8] B. Peng, S. Wesemann, K. Guan, W. Templ, and T. Kürner, “Precoding and detection for broadband single carrier terahertz massive MIMO systems using LSQR algorithm,” IEEE Trans. Wireless Commun., vol. 18, no. 2, pp. 1026–1040, Feb. 2019.
[9] F. Sohrabi and W. Yu, “Hybrid analog and digital beamforming for mmWave OFDM large-scale antenna arrays,” IEEE J. Sel. Areas Commun., vol. 35, no. 7, pp. 1432–1443, July 2017.
[10] J. P. González-Coma, W. Utschick, and L. Castedo, “Hybrid LISA for wideband multiuser millimeter-wave communication systems under beam squint,” IEEE Trans. Wireless Commun., vol. 18, no. 2, pp. 1277–1288, Feb. 2019.
[11] S. Park, A. Alkhateeb, and R. W. Heath, Jr., “Dynamic subarrays for hybrid precoding in wideband mmwave MIMO systems,” IEEE Trans. Wireless Commun., vol. 16, no. 5, pp. 2907–2920, May 2017.
[12] L. Kong, S. Han, and C. Yang, “Hybrid precoding with rate and coverage constraints for wideband massive MIMO systems,” IEEE Trans. Wireless Commun., vol. 17, no. 7, pp. 4634–4647, July 2018.
[13] M. Cai et al., “Effect of wideband beam squint on codebook design in phased-array wireless systems,” in Proc. IEEE GLOBECOM, Dec. 2016.
[14] X. Liu and D. Qiao, “Space-time block coding-based beamforming for beam squint compensation,” IEEE Wireless Commun. Lett., vol. 8, no. 1, pp. 241–244, Feb. 2019.
[15] C. Lin, G. Y. Li, and L. Wang, “Subarray-based coordinated beamforming training for mmWave and sub-THz communications,” IEEE J. Sel. Areas Commun., vol. 35, no. 9, pp. 2115–2126, Sept. 2017.
[16] H. Hashemi, T. Chu, and J. Roderick, “Integrated true-time-delay-based ultra-wideband array processing,” IEEE Commun. Mag., vol. 46, no. 9, pp. 162–172, Sep. 2008.
[17] J. Tan and L. Dai, “Delay-phase precoding for THz massive MIMO with beam split,” in Proc. IEEE GLOBECOM, Dec. 2019.
[18] B. Wang, X. Li, F. Gao, and G. Y. Li, “Power leakage elimination for wideband mmWave massive MIMO-OFDM systems: An energy-focusing window approach,” IEEE Trans. Signal Process., vol. 67, no. 21, pp. 5479–5494, Nov. 2019.
[19] J. P. González-Coma, J. Rodríguez-Fernández, N. González-Prelcic, L. Castedo, and R. W. Heath, Jr., “Channel estimation and hybrid precoding for frequency selective multiuser mmWave MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 2, pp. 353–367, May 2018.
[20] C. Lin and G. Y. Li, “Indoor terahertz communications: How many antenna arrays are needed?,” IEEE Trans. Wireless Commun., vol. 14, no. 6, pp. 3097–3107, June 2015.
[21] H. Sarieddeen, M. S. Alouini, and T. Y. Al-Naffouri, “Terahertz-band ultra-massive spatial modulation MIMO,” IEEE J. Sel. Areas Commun., vol. 37, no. 9, pp. 2040–2052, Sept. 2019.
[22] Q. Ma, D. M. W. Leenaerts, and P. G. M. Baltus, “Silicon-based true-time-delay phased array front-ends at Ka-band,” IEEE Trans. Microw. Theory Techn., vol. 63, no. 9, pp. 2942–2952, Sep. 2015.
[23] C. Han and Y. Chen, “Propagation modeling for wireless communications in the terahertz band,” IEEE Commun. Mag., vol. 56, no. 6, pp. 96–101, June 2018.
[24] C. A. Balanis, Antenna Theory: Analysis and Design, John Wiley & Sons, 2012.
[25] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. New York, NY, USA: Cambridge Univ. Press, 2005.
[26] A. F. Molisch, “Ultrawideband propagation channels-theory, measurement, and modeling,” IEEE Trans. Veh. Technol., vol. 54, no. 5, pp. 1528–1545, Sept. 2005.
[27] A. A. Boulogeorgos, E. N. Papasotiriou, and A. Alexiou, “Analytical performance assessment of THz wireless systems,” IEEE Access, vol. 7, pp. 11436–11453, 2019.
[28] C. Han, A. O. Bicen, and I. F. Akyildiz, “Multi-ray channel modeling and wideband characterization for wireless communications in the terahertz band,” IEEE Trans. Wireless Commun., vol. 14, no. 5, pp. 2402–2412, May 2015.
[29] R. Piesiewicz et al., “Scattering analysis for the modeling of THz communication systems,” IEEE Trans. Antennas Propag., vol. 55, no. 11, pp. 3002–3009, Nov. 2007.
[30] C. Lin and G. Y. Li, “Adaptive beamforming with resource allocation for distance-aware multi-user indoor terahertz communications,” IEEE Trans. Commun., vol. 63, no. 8, pp. 2985–2995, Aug. 2015.
[31] O. E. Ayach, R. W. Heath, Jr., S. Abu-Surra, S. Rajagopal, and Z. Pi, “The capacity optimality of beam steering in large millimeter wave MIMO systems,” in Proc. IEEE SPAWC, June 2012, pp. 100–104.
[32] S. M. Perera, A. Madanayake, and R. J. Cintra, “Radix-2 self-recursive sparse factorizations of delay Vandermonde matrices for wideband multi-beam antenna arrays,” IEEE Access, vol. 8, pp. 25498–25508, 2020.
[33] W. U. Bajwa, J. Haupt, A. M. Sayeed, and R. Nowak, “Compressed channel sensing: a new approach to estimating sparse multipath channels,” Proc. IEEE, vol. 98, no. 6, pp. 1058–1076, June 2010.
[34] R. Méndez-Rial, C. Rusu, N. González-Prelcic, A. Alkhateeb, and R. W. Heath, Jr., “Hybrid MIMO architectures for millimeter wave communications: Phase shifters or switches?,” IEEE Access, vol. 4, pp. 247–267, 2016.
[35] R. W. Heath, Jr., N. González-Prelcic, S. Rangan, W. Roh, and A. M. Sayeed, “An overview of signal processing techniques for millimeter wave MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 436–453, Apr. 2016.
[36] T. T. Cai and L. Wang, “Orthogonal matching pursuit for sparse signal recovery with noise,” IEEE Trans. Inf. Theory, vol. 57, no. 7, pp. 4680–4688, July 2011.
[37] H. L. Van Trees, Optimum Array Processing: Detection, Estimation, and Modulation Theory. John Wiley & Sons, 2002.
[38] A. M. Sayeed, “Deconstructing multi-antenna fading channels,” IEEE Trans. Signal Process., vol. 50, no. 10, pp. 2563–2579, Oct. 2002.
[39] H. Xie, F. Gao, S. Zhang, and S. Jin, “A unified transmission strategy for TDD/FDD massive MIMO systems with spatial basis expansion model,” IEEE Trans. Veh. Technol., vol. 66, no. 4, pp. 3170–3184, Apr. 2017.
[40] Z. Gao, L. Dai, Z. Wang, and S. Chen, “Spatially common sparsity based adaptive channel estimation and feedback for FDD massive MIMO,” IEEE Trans. Signal Process., vol. 63, no. 23, pp. 6169–6183, Dec. 2015.
[41] J. A. Tropp, A. C. Gilbert, and M. J. Strauss, “Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit,” Signal Process., vol. 86, no. 3, pp. 572–588, 2006.
[42] J. A. Tropp and A. C. Gilbert, “Signal recovery from random measurements via orthogonal matching pursuit,” IEEE Trans. Inf. Theory, vol. 53, no. 12, pp. 4655–4666, Dec. 2007.
[43] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory, vol. 1. Upper Saddle River, NJ, USA: Prentice-Hall, 1993.
[44] J. Lee, G. Gil, and Y. H. Lee, “Exploiting spatial sparsity for estimating channels of hybrid MIMO systems in millimeter wave communications,” in Proc. IEEE GLOBECOM, Dec. 2014, pp. 3326–3331.
[45] Y. You, L. Zhang, and M. Liu, “IP aided OMP based channel estimation for millimeter wave massive MIMO communication,” in Proc. IEEE WCNC, Apr. 2019.
[46] J. Lee, G. Gil, and Y. H. Lee, “Channel estimation via orthogonal matching pursuit for hybrid MIMO systems in millimeter wave communications,” IEEE Trans. Commun., vol. 64, no. 6, pp. 2370–2386, June 2016.
[47] T. L. Marzetta, E. G. Larsson, H. Yang, and H. Q. Ngo, Fundamentals of Massive MIMO. Cambridge, UK: Cambridge University Press, 2016.
[48] Q. Nadeem, A. Kammoun, and M. Alouini, “Elevation beamforming with full dimension MIMO architectures in 5G systems: A tutorial,” IEEE Commun. Surveys Tuts., vol. 21, no. 4, pp. 3238–3273, Jul. 2019.
[49] O. E. Ayach, S. Rajagopal, S. A.-Surra, Z. Pi, and R. W. Heath, Jr., “Spatially sparse precoding in millimeter wave MIMO systems,” IEEE Trans. Wireless Commun., vol. 13, no. 3, pp. 1499–1513, Mar. 2014.
[50] M. Matthaiou, P. de Kerret, G. K. Karagiannidis, and J. A. Nossek, “Mutual information statistics and beamforming performance analysis of optimized LoS MIMO systems,” IEEE Trans. Commun., vol. 58, no. 11, pp. 3316–3329, Nov. 2010.
[51] F. Bøhagen, P. Orten, and G. E. Øien, “On spherical vs. plane wave modeling of line-of-sight MIMO channels,”