Generalized Expectation Consistent Signal Recovery for Nonlinear Measurements
Hengtao He, Chao-Kai Wen, Shi Jin
Abstract—In this paper, we propose a generalized expectation consistent signal recovery algorithm to estimate the signal x from the nonlinear measurements of a linear transform output z = Ax. This estimation problem has been encountered in many applications, such as communications with front-end impairments, compressed sensing, and phase retrieval. The proposed algorithm extends the prior art called generalized turbo signal recovery from a partial discrete Fourier transform matrix A to a class of general matrices. Numerical results show the excellent agreement of the proposed algorithm with the theoretical Bayesian-optimal estimator derived using the replica method.

Index Terms—Compressed sensing, signal recovery, quantization, state evolution, replica method.
I. INTRODUCTION
Signal reconstruction problems are encountered in many engineering fields. Compressed sensing (CS) [1], [2] aims to reconstruct a sparse signal in a high-dimensional space from a low-dimensional measurement space. Significant attention has been given to ℓ1-norm minimization because it is capable of recovering sparse signals at polynomial computational complexity. However, this approach is still generally far from optimal [3].

Given that the prior distribution of the signal is available, Bayesian inference offers an optimal recovery approach from the minimum mean square error (MMSE) perspective, although its exact execution is computationally intractable in most cases [4]. Approximate message passing (AMP), which is based on Gaussian approximations of loopy belief propagation, is a tractable and less complex alternative, and it has attracted considerable attention for such problems [5], [6]. Unfortunately, AMP and its generalization GAMP [7] are fragile with respect to the choice of matrix, and can perform poorly outside the special case of a zero-mean, i.i.d., sub-Gaussian matrix.

Ma et al. [8] developed a signal recovery (SR) algorithm under linear measurements, called Turbo-SR, with a partial discrete Fourier transform (DFT) matrix as the sensing matrix. Subsequently, Liu et al. [9] proposed the generalized Turbo-SR (GTurbo-SR) to address nonlinear measurements. Ma and Ping [10] further proposed the orthogonal AMP (OAMP) algorithm for general sensing matrices, but under linear measurements. In contrast to suboptimal developments along this line, such as AMP and GAMP, Turbo-SR, GTurbo-SR, and OAMP are optimal and have excellent convergence properties. The state evolutions of the three algorithms agree perfectly with those predicted by the theoretical replica method. However, these algorithms consider only either the partial DFT sensing matrix or linear measurements.

The purpose of this paper is to develop a novel algorithm for Bayesian SR with a much broader class of sensing matrices under nonlinear measurements. We employ an advanced mean-field method known as the expectation consistent (EC) approximation, developed in statistical mechanics [11], [12] and machine learning [13]. Recently, the "vector AMP" presented in [14] was shown to be interpretable as an instance of the generalized EC (GEC) algorithm [15].

Our work is inspired by [15]. Specifically, we present GEC-SR to recover sparse signals from nonlinear measurements, especially from low-resolution quantized output, which has been of particular interest in recent years. We show that the performance of our GEC-SR is superior to that of the "initial GEC" [15] because of its different update manner. When a partial DFT matrix is considered, GEC-SR reduces to GTurbo-SR [9]. In addition, we give the state evolution (SE) analysis and show that the analytical SE of GEC-SR is consistent with that obtained by the replica method. This consistency indicates the optimality of GEC-SR for nonlinear measurements with general sensing matrices.

(Hengtao He and S. Jin are with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China (e-mail: [email protected], [email protected]). C.-K. Wen is with the Institute of Communications Engineering, National Sun Yat-sen University, Kaohsiung 804, Taiwan (e-mail: [email protected]). The work of Hengtao He and S. Jin was supported in part by the National Science Foundation (NSFC) for Distinguished Young Scholars of China under Grant 61625106 and the National Natural Science Foundation of China under Grant 61531011. The work of C.-K. Wen was supported by the ITRI in Hsinchu, Taiwan, and the MOST of Taiwan under Grants MOST 103-2221-E-110-029-MY3.)
Notations—For any matrix A, A^H is the conjugate transpose of A, and tr(A) denotes the trace of A. In addition, I is the identity matrix, 0 is the zero matrix, Diag(v) is the diagonal matrix whose diagonal equals v, 1_n is the n-dimensional all-ones vector, d(Q) is the diagonalization operator, which returns a constant vector containing the average diagonal elements of Q, and ⟨a⟩ is the average operator, which returns a constant vector containing the average elements of a. In addition, ⊘ and ⊙ denote componentwise vector division and vector multiplication, respectively. A random vector z drawn from the proper complex Gaussian distribution of mean µ and covariance Ω is described by the probability density function

N_C(z; µ, Ω) = (1/det(πΩ)) e^{−(z−µ)^H Ω^{−1} (z−µ)}.

We use Dz to denote the real Gaussian integration measure Dz = φ(z) dz, φ(z) ≜ (1/√(2π)) e^{−z²/2}, and we use Dz_c = (e^{−|z|²}/π) dz to denote the complex Gaussian integration measure. Finally, Φ(x) ≜ ∫_{−∞}^{x} Dz denotes the cumulative Gaussian distribution function.

(One can introduce various iterative algorithms to the EC approximation. However, a proper update manner is important because an improper one might result in poor convergence, in particular for a small measurement ratio.)

II. PROBLEM DESCRIPTION
A. Observation Model
We consider the generalized linear model (GLM), where an N-dimensional random vector x ∈ C^N is observed through a linear output z = Ax, followed by a componentwise, probabilistic measurement channel

p(y | x) = ∏_{m=1}^{M} p(y_m | z_m),  z = Ax,   (1)

where A ∈ C^{M×N} is a known transform matrix. The sparse signal x is assumed to be i.i.d., with the n-th entry of x following the Bernoulli-Gaussian distribution

p(x) = (1 − ρ) δ(x) + ρ N_C(x; 0, ρ^{−1}),   (2)

where δ(x) is the Dirac function, and the variance of each x_n is normalized, that is, E{|x_n|²} = 1. We denote the measurement ratio by α = M/N (i.e., the number of measurements per variable). In addition, for ease of notation, we define

P_x = E{|x_n|²}  and  P_z = P_x · tr(AA^H)/M.   (3)

B. Quantized Measurements
In this study, we are interested in the measurements acquired through the complex-valued quantizer Q_c. Specifically, each complex-valued quantizer Q_c consists of two real-valued B-bit quantizers Q, defined as

ỹ_m = Q_c(y_m) ≜ Q(y_{R,m}) + j Q(y_{I,m}).   (4)

Therefore, the resulting quantized signal ỹ is provided by

ỹ = Q_c(y) = Q_c(z + w),   (5)

where w ∼ N_C(0, σ² I) represents the additive Gaussian noise. The output is assigned the value ỹ_m when the quantizer input falls in the corresponding interval (ỹ_m^low, ỹ_m^up] (namely, the b-th bin). For example, the quantized output of a typical uniform quantizer with quantizer step size ∆ is given by

ỹ_m ∈ { (−1/2 + b) ∆ ; b = −2^{B−1} + 1, . . . , 2^{B−1} } ≜ R_B,   (6)

and the associated lower and upper thresholds are given by

ỹ_m^low = ỹ_m − ∆/2, if ỹ_m > (1/2 − 2^{B−1}) ∆; and −∞, otherwise.   (7)

ỹ_m^up = ỹ_m + ∆/2, if ỹ_m < (2^{B−1} − 1/2) ∆; and ∞, otherwise.   (8)

We suppose that each entry of x is generated from the distribution (2) independently, that is, p(x) = ∏_{n=1}^{N} p(x_n). The componentwise, probabilistic measurement channel is given by

p(ỹ_m | z_m) = Ψ(ỹ_{R,m}; z_{R,m}, σ²/2) Ψ(ỹ_{I,m}; z_{I,m}, σ²/2),   (9)

where

Ψ(ỹ; z, c) = Φ((ỹ^up − z)/√c) − Φ((ỹ^low − z)/√c).   (10)

III. GENERALIZED EC SIGNAL RECOVERY
In this section, we present the GEC-SR. The block diagram of the GEC-SR is illustrated in Fig. 1, which consists of three modules: modules A, B, and C. Module A computes the posterior mean and variance of z, module C constrains the estimation onto the linear space z = Ax, and module B computes the posterior mean and variance of x. These procedures follow a circular manner, that is, A → C → B → C → A → · · ·. In addition, each module uses the turbo principle in iterative decoding, that is, each module passes extrinsic messages to its next module. The GEC-SR is different from the GTurbo-SR [9] and the "initial GEC" [15]; we discuss their differences in the following subsections.

Algorithm 1 specifies the iterative procedure of the GEC-SR. In Algorithm 1, the posterior means and variances of z and x are obtained from (11) and (15), respectively. We take the expectation and variance in (15a) and (15b) with respect to the posterior probability

p(x | r_x, v_x) = e^{log p(x) − ‖x − r_x‖²_{v_x}} / ∫ e^{log p(x) − ‖x − r_x‖²_{v_x}} dx,   (19)

where

‖a‖²_v ≜ ∑_{n=1}^{N} |a_n|² / v_n.   (20)

We can calculate the expectation and variance of each entry of x separately because the prior p(x) is separable, and thus we omit the index n in the following expressions. Using the Gaussian reproduction property [16], we obtain the explicit componentwise expressions

E{x | r, v} = C · (ρ^{−1} r)/(v + ρ^{−1}),   (21)

Var{x | r, v} = C ( (ρ^{−1} v)/(v + ρ^{−1}) + | (ρ^{−1} r)/(v + ρ^{−1}) |² ) − |x̂|²,   (22)

where

C = ρ N_C(0; r, v + ρ^{−1}) / [ (1 − ρ) N_C(0; r, v) + ρ N_C(0; r, v + ρ^{−1}) ].   (23)

Similarly, the posterior mean and variance of z in (11a) and (11b) are taken with respect to the posterior

p(z | r_z, v_z) = e^{log p(y | z) − ‖z − r_z‖²_{v_z}} / ∫ e^{log p(y | z) − ‖z − r_z‖²_{v_z}} dz.   (24)
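As an illustration of module B, the componentwise Bernoulli-Gaussian posterior estimates (21)-(23) can be sketched in a few lines of numpy; the function name `bg_denoise` is our own, and the code is a minimal sketch rather than the paper's implementation:

```python
import numpy as np

def bg_denoise(r, v, rho):
    """Posterior mean/variance of x under the Bernoulli-Gaussian prior (2),
    given the pseudo-data r = x + CN(0, v). Follows (21)-(23); the Gaussian
    component has variance 1/rho so that E{|x_n|^2} = 1."""
    s2 = 1.0 / rho                                   # variance of the Gaussian branch
    # complex Gaussian densities N_C(0; r, v) and N_C(0; r, v + 1/rho)
    g0 = np.exp(-np.abs(r) ** 2 / v) / (np.pi * v)
    g1 = np.exp(-np.abs(r) ** 2 / (v + s2)) / (np.pi * (v + s2))
    C = rho * g1 / ((1 - rho) * g0 + rho * g1)       # posterior support probability (23)
    gain = s2 / (v + s2)                             # Wiener gain of the Gaussian branch
    xhat = C * gain * r                              # posterior mean (21)
    var = C * (gain * v + np.abs(gain * r) ** 2) - np.abs(xhat) ** 2  # (22)
    return xhat, var
```

For ρ = 1 the prior is purely Gaussian with unit variance and the estimator collapses to the familiar Wiener shrinkage r/(v + 1) with posterior variance v/(v + 1), which is a quick way to sanity-check the formulas.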
Fig. 1. Block diagram of the GEC-SR algorithm: the posterior mean estimators of z (module A) and x (module B), connected through the linear space z = Ax (module C) by extrinsic-message exchanges.
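The linear-space module C of Algorithm 1 (steps (13) and (17)) is a joint Gaussian estimate under the constraint z = Ax. A direct matrix-inverse sketch, assuming numpy and treating v_x, v_z as variance vectors, is below (for illustration only; a practical implementation would avoid the explicit inverse and exploit structure in A):

```python
import numpy as np

def linear_space_update(A, r_x, v_x, r_z, v_z):
    """Module C: posterior mean/covariance of x and z under z = Ax,
    following (13a)-(13b) and (17a)-(17d)."""
    AH = A.conj().T
    # (13a): posterior covariance of x combining both pseudo-observations
    Qx = np.linalg.inv(np.diag(1.0 / v_x) + AH @ np.diag(1.0 / v_z) @ A)
    # (13b): posterior mean of x
    xhat = Qx @ (r_x / v_x + AH @ (r_z / v_z))
    # (17c)-(17d): induced statistics of z through the linear map
    Qz = A @ Qx @ AH
    zhat = A @ xhat
    return xhat, Qx, zhat, Qz
```

With A = I and unit variances, the update reduces to Qx = 0.5 I and x̂ = (r_x + r_z)/2, i.e., an equal-weight fusion of the two pseudo-observations, which matches the intuition that module C simply merges Gaussian messages on the constraint set.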
The mean and variance can also be computed in a componentwise manner. (11a) and (11b) are nonlinear because of the quantization, and their explicit expressions are provided in [17].

Under the linear constraint z = Ax, the posterior mean and covariance matrix of x are obtained in (13b) and (13a) with the corresponding posterior probability

p(x | r, v) = e^{−‖x − r_x‖²_{v_x} − ‖z − r_z‖²_{v_z}} / ∫ e^{−‖x − r_x‖²_{v_x} − ‖z − r_z‖²_{v_z}} dx.   (25)

The posterior mean and covariance matrix of z can then be obtained in (17), following the linear space z = Ax.

A. Relation of GEC-SR and Initial GEC
In the introduction, we mentioned that our work is inspired by the "initial GEC" algorithm from [15], which considers the standard linear measurement and the GLM. However, our algorithm differs from the initial GEC in its update manner. In the GEC-SR, we first estimate z from the nonlinear measurements ỹ and then estimate the signal x using the prior information from module C, whereas the initial GEC estimates x and z simultaneously. In addition, before computing the mean and covariance of z in (17c) and (17d), we compute the mean and covariance of x once again in (17a) and (17b). Because of these modifications, the GEC-SR algorithm converges faster than the initial GEC and agrees perfectly with the theoretical SE analysis predicted by the replica method. We show the theoretical SE analysis in the next section.

B. Relation of GEC-SR and GTurbo-SR
GTurbo-SR [9] is a promising algorithm for recovering sparse signals from nonlinear measurements; its idea is to use the turbo principle in iterative decoding to compute the extrinsic messages of x and z. A visual examination of the GEC-SR shows many similarities with the GTurbo-SR in terms of the iterative approach. In particular, the posterior probabilities of x and z in the GEC-SR are identical to those in the GTurbo-SR. Similarly, the computation of extrinsic information in the GEC-SR is identical to that in the GTurbo-SR. However, the GTurbo-SR only considers a sensing matrix A that is a partial DFT matrix, while general matrices can be applied in the GEC-SR. If we replace A by a partial DFT matrix in the GEC-SR, then the GEC-SR reduces to the GTurbo-SR.

IV. STATE EVOLUTION
In this section, we present the SE equations of the GEC-SR. From the statistical mechanics perspective, the iterative procedure of the GEC-SR is equivalent to finding the saddle points of the free energy defined by

F = −(1/N) E{log p(ỹ)}.   (26)

The calculation of F is very difficult. Fortunately, the replica method from statistical physics provides a highly sophisticated procedure to address this calculation. In the calculation, we use the assumption that N, M → ∞ while keeping M/N = α fixed and finite. Only the final analytical results in Proposition 1 are shown because of the space limitation.

Proposition 1 involves several new parameters. Most parameters (except for some auxiliary parameters) can be illustrated systematically by a scalar channel

r = x + w,   (27)

where w ∼ N_C(w; 0, η^{−1}). The MMSE estimate of x in (27) is given by

E{x | r} = ∫ x p(x | r) dx,   (28)

where p(x | r) = p(r | x) p(x)/p(r) and p(r | x) = (η/π) e^{−η |r − x|²}. We define the MMSE of this estimator as

mmse(η) = E{ |x − E{x | r}|² },   (29)

where the expectation is taken over the joint distribution p(r, x) = p(r | x) p(x). If x follows the Bernoulli-Gaussian distribution (2), mmse(η) can be obtained explicitly [18]:

mmse(η) = 1 − (η/(ηρ^{−1} + 1)) ∫ Dz_c |z|² / [ ρ + (1 − ρ) e^{−|z|² ηρ^{−1}} (ηρ^{−1} + 1) ].   (30)

Algorithm 1 GEC-SR for the GLM
Input: Nonlinear measurements ỹ, sensing matrix A, likelihood p(ỹ | z), and prior distribution p(x).
Output: Recovered signal x̂.
Initialize: t ← 0, r_z ← 0, r_x ← 0, v_z ← P_z, and v_x ← P_x.
1: while t < T_max do
  1) Compute the posterior mean and covariance of z:
     ẑ = E{z | r_z, v_z},   (11a)
     v_z^post = Var{z | r_z, v_z}.   (11b)
     Compute the extrinsic information of z:
     v_z = 1 ⊘ ( 1 ⊘ ⟨v_z^post⟩ − 1 ⊘ v_z ),   (12a)
     r_z = v_z ⊙ ( ẑ ⊘ ⟨v_z^post⟩ − r_z ⊘ v_z ).   (12b)
  2) Compute the mean and covariance of x from the linear space:
     Q_x = ( Diag(1 ⊘ v_x) + A^H Diag(1 ⊘ v_z) A )^{−1},   (13a)
     x̂ = Q_x ( r_x ⊘ v_x + A^H (r_z ⊘ v_z) ).   (13b)
     Compute the extrinsic information of x:
     v_x = 1 ⊘ ( 1 ⊘ d(Q_x) − 1 ⊘ v_x ),   (14a)
     r_x = v_x ⊙ ( x̂ ⊘ d(Q_x) − r_x ⊘ v_x ).   (14b)
  3) Compute the posterior mean and covariance of x:
     x̂ = E{x | r_x, v_x},   (15a)
     v_x^post = Var{x | r_x, v_x}.   (15b)
     Compute the extrinsic information of x:
     v_x = 1 ⊘ ( 1 ⊘ ⟨v_x^post⟩ − 1 ⊘ v_x ),   (16a)
     r_x = v_x ⊙ ( x̂ ⊘ ⟨v_x^post⟩ − r_x ⊘ v_x ).   (16b)
  4) Compute the mean and covariance of z from the linear space:
     Q_x = ( Diag(1 ⊘ v_x) + A^H Diag(1 ⊘ v_z) A )^{−1},   (17a)
     x̂ = Q_x ( r_x ⊘ v_x + A^H (r_z ⊘ v_z) ),   (17b)
     Q_z = A Q_x A^H,   (17c)
     ẑ = A x̂.   (17d)
     Compute the extrinsic information of z:
     v_z = 1 ⊘ ( 1 ⊘ d(Q_z) − 1 ⊘ v_z ),   (18a)
     r_z = v_z ⊙ ( ẑ ⊘ d(Q_z) − r_z ⊘ v_z ).   (18b)
2: return the recovered signal x̂.

For ease of exposition, we define two auxiliary equations:

η̃_z = E{ λ v_x v_z / (v_z + λ v_x) },   (31)

P_x − η̃_x = (1 − α) v_x + α E{ v_x v_z / (v_z + λ v_x) },   (32)

where λ denotes the eigenvalues of AA^H, the expectation with respect to λ is defined by E{f(λ)} = (1/M) ∑_{i=1}^{M} f(λ_i), (η̃_x, η̃_z, v_x, v_z) will be given in Proposition 1, and (P_x, P_z) have been defined in (3). In addition, we denote Ψ′(ỹ; z, c) = ∂Ψ(ỹ; z, c)/∂z.

Proposition 1:
The saddle points of the free energy can be obtained by the following iteration. Initialize t = 0, v_x^0 = P_x, and η_z^0 = 0. For t = 0, 1, 2, . . .:
1) Compute

η̃_z^t := ∑_{ỹ∈R_B} ∫ Dz [ Ψ′(ỹ; √(η_z^t) z, σ² + P_z − η_z^t) ]² / Ψ(ỹ; √(η_z^t) z, σ² + P_z − η_z^t),

v_z^{t+1} := 1/η̃_z^t − (P_z − η_z^t);

2) Obtain P_x − η̃_x^t from (32) for the given (v_x^t, v_z^{t+1}), and set

η_x^{t+1} := 1/(P_x − η̃_x^t) − 1/v_x^t;

3) v_x^{t+1} := ( 1/mmse(η_x^{t+1}) − η_x^{t+1} )^{−1};

4) Obtain η̃_z^{t+1} from (31) for the given (v_x^{t+1}, v_z^{t+1}), and set

P_z − η_z^{t+1} := ( 1/η̃_z^{t+1} − 1/v_z^{t+1} )^{−1}. □

As t → ∞, {η_x^t, η_z^t} converges to a saddle point of the free energy. The above iterative expressions also correspond to the SEs of the GEC-SR in Algorithm 1. In particular, mmse(η_x^t) represents the MSE of x̂.

If A is obtained by the random selection of a set of rows from the standard DFT matrix, then A is a row-orthogonal matrix with eigenvalues λ_i = 1 for i = 1, . . . , M. By combining all the coupled equations, we finally obtain

η̃_z^t := ∑_{ỹ∈R_B} ∫ Dz [ Ψ′(ỹ; √(P_z − v_x^t) z, σ² + v_x^t) ]² / Ψ(ỹ; √(P_z − v_x^t) z, σ² + v_x^t),   (33)

η_x^{t+1} := ( 1/(α η̃_z^t) − v_x^t )^{−1},   (34)

v_x^{t+1} := ( 1/mmse(η_x^{t+1}) − η_x^{t+1} )^{−1}.   (35)

The above iterative equations agree with those in the GTurbo-SR [9].

V. NUMERICAL RESULTS
In this section, we conduct numerical experiments to verify the accuracy of our analytical results. In all cases, we consider the recovery of x from the quantized output ỹ constructed from (9), where x is drawn i.i.d. zero-mean Bernoulli-Gaussian (the sparsity ρ is specified for each figure). The noise level σ² is fixed at a small value. The metric MSE is defined as

MSE = ‖x − x̂‖² / ‖x‖² = ‖x − x̂‖² / N.   (36)

We use the typical uniform quantizer with quantization step size ∆ = 2^{−B}, where B is the quantization resolution.
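The uniform quantizer of (4)-(8) used in the experiments can be sketched as follows (a minimal numpy implementation; the convention for points falling exactly on a bin boundary is our own assumption):

```python
import numpy as np

def quantize_real(y, B, delta):
    """Uniform B-bit quantizer of (6): outputs (b - 1/2)*delta with
    b in {-2^{B-1}+1, ..., 2^{B-1}}; the outer bins extend to +-infinity."""
    b = np.ceil(y / delta)                        # bin index of (.,  b*delta]
    b = np.clip(b, -2 ** (B - 1) + 1, 2 ** (B - 1))  # saturate the outer bins
    return (b - 0.5) * delta

def quantize_complex(y, B, delta):
    """Complex quantizer Q_c of (4): real and imaginary parts separately."""
    return quantize_real(y.real, B, delta) + 1j * quantize_real(y.imag, B, delta)
```

For B = 1 this reduces to a scaled sign quantizer with outputs ±∆/2, the low-resolution regime of particular interest in the paper.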
Fig. 2. Simulated and analytical MSEs of the GEC-SR under different quantization levels. The singular values of the sensing matrix A ∈ C^{M×N} are set as [λ_1 1_{M_1}; λ_2 1_{M_2}] with M_1 = 5000, M_2 = 734, and (λ_1, λ_2) = (1, ·).

Fig. 3. MSE results of Algorithm 1, GTurbo-SR, and initial GEC with a partial DFT sensing matrix under different sparsity levels (ρ = 0.4 and ρ = 0.1).
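The SE curves in these figures depend on the signal prior only through mmse(η) of (29). As a sanity check, mmse(η) can also be estimated by Monte Carlo instead of the closed form (30); the sketch below (assuming numpy; the function name and sample size are illustrative) draws Bernoulli-Gaussian samples, applies the scalar posterior mean of (21), and averages the squared error:

```python
import numpy as np

def mmse_bg_mc(eta, rho, n=200_000, seed=0):
    """Monte-Carlo estimate of mmse(eta) in (29) for the Bernoulli-Gaussian
    prior (2), with x observed through r = x + CN(0, 1/eta)."""
    rng = np.random.default_rng(seed)
    s2 = 1.0 / rho                                 # Gaussian-component variance
    on = rng.random(n) < rho                       # Bernoulli support pattern
    g = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    x = np.where(on, np.sqrt(s2 / 2) * g, 0)       # E|x|^2 = 1 by construction
    v = 1.0 / eta                                  # scalar-channel noise variance
    w = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    r = x + np.sqrt(v / 2) * w
    # scalar posterior mean, cf. (21)-(23)
    g0 = np.exp(-np.abs(r) ** 2 / v) / (np.pi * v)
    g1 = np.exp(-np.abs(r) ** 2 / (v + s2)) / (np.pi * (v + s2))
    C = rho * g1 / ((1 - rho) * g0 + rho * g1)
    xhat = C * (s2 / (v + s2)) * r
    return np.mean(np.abs(x - xhat) ** 2)
```

As expected, the estimate approaches the prior variance 1 as η → 0 (the observation carries no information) and vanishes as η → ∞.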
The simulation results are obtained by averaging over multiple independent realizations. Figure 2 plots the average MSEs achieved by the GEC-SR and the theoretical result derived by the replica method under a general matrix. We constructed A ∈ C^{M×N} from the singular value decomposition A = UDV^T, where the unitary matrices U and V are drawn uniformly with respect to the Haar measure. The singular values are set as [λ_1 1_{M_1}; λ_2 1_{M_2}] with M_1 = 5000, M_2 = 734, and (λ_1, λ_2) = (1, ·).

Figure 3 shows the corresponding MSEs of Algorithm 1, GTurbo-SR [9], and initial GEC [15] with a partial DFT sensing matrix under different sparsity levels; the quantization level is B bits. For comparison, the simulation scenarios completely follow those presented in [8], [9], where the system parameters are set as follows: α = 0.7, N = 8192, and M = 5734. The figure clearly demonstrates that the GEC-SR is identical to the GTurbo-SR when a partial DFT matrix is considered, and that the SE analysis precisely predicts the per-iteration performance. In addition, the initial GEC cannot converge to the fixed point when the signal is very sparse, but our GEC-SR algorithm is more robust because of its different update manner.

VI. CONCLUSION
In this paper, we developed a computationally feasible signal recovery approximation scheme, called GEC-SR, for nonlinear measurements affected by quantization. We showed that the performance of the GEC-SR is superior to that of the initial GEC for general sensing matrices, and that the GEC-SR reduces to the GTurbo-SR for partial DFT sensing matrices. Finally, we presented the SE analysis to precisely describe the asymptotic behavior of the GEC-SR algorithm.

REFERENCES
[1] D. L. Donoho, "Compressed sensing," IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.
[2] E. J. Candes and M. B. Wakin, "An introduction to compressive sampling," IEEE Signal Process. Mag., vol. 25, no. 2, pp. 21–30, Mar. 2008.
[3] Y. Shiraki and Y. Kabashima, "Typical reconstruction limits for distributed compressed sensing based on ℓ_p minimization and Bayesian optimal reconstruction," J. Statist. Mech., no. 5, p. P05029, 2015.
[4] S. Ji, Y. Xue, and L. Carin, "Bayesian compressive sensing," IEEE Trans. Signal Process., vol. 56, no. 6, pp. 2346–2356, Jun. 2008.
[5] D. L. Donoho, A. Maleki, and A. Montanari, "Message-passing algorithms for compressed sensing," Proceedings of the National Academy of Sciences, vol. 106, no. 45, pp. 18914–18919, 2009.
[6] F. Krzakala, M. Mézard, F. Sausset, Y. Sun, and L. Zdeborová, "Probabilistic reconstruction in compressed sensing: algorithms, phase diagrams, and threshold achieving matrices," J. Statist. Mech., no. 8, p. P08009, 2012.
[7] S. Rangan, "Generalized approximate message passing for estimation with random linear mixing," in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Saint Petersburg, Russia, Aug. 2011, pp. 2168–2172.
[8] J. Ma, X. Yuan, and L. Ping, "Turbo compressed sensing with partial DFT sensing matrix," IEEE Signal Process. Lett., vol. 22, no. 2, pp. 158–161, Feb. 2015.
[9] T. Liu, C.-K. Wen, S. Jin, and X. You, "Generalized turbo signal recovery for nonlinear measurements and orthogonal sensing matrices," in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Jul. 2016, pp. 2883–2887.
[10] J. Ma and L. Ping, "Orthogonal AMP," IEEE Access, vol. 5, pp. 2020–2033, 2017.
[11] M. Opper and O. Winther, "Adaptive and self-averaging Thouless-Anderson-Palmer mean-field theory for probabilistic modeling," Phys. Rev. E, vol. 64, p. 056131, Oct. 2001.
[12] M. Opper and O. Winther, "Expectation consistent approximate inference," J. Mach. Learn. Res., vol. 6, pp. 2177–2204, Dec. 2005.
[13] T. P. Minka, "A family of algorithms for approximate Bayesian inference," Ph.D. dissertation, Dept. Elect. Eng. Comput. Sci., MIT, Cambridge, MA, USA, 2001.
[14] P. Schniter, S. Rangan, and A. Fletcher, "Vector approximate message passing for the generalized linear model," arXiv preprint arXiv:1612.01186, 2016.
[15] A. Fletcher, M. Sahraee-Ardakan, S. Rangan, and P. Schniter, "Expectation consistent approximate inference: Generalizations and convergence," in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Jul. 2016, pp. 190–194.
[16] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning. Cambridge, MA, USA: MIT Press, 2006.
[17] C.-K. Wen, C.-J. Wang, S. Jin, K.-K. Wong, and P. Ting, "Bayes-optimal joint channel-and-data estimation for massive MIMO with low-precision ADCs," IEEE Trans. Signal Process., vol. 64, no. 2, pp. 2541–2556, May 2016.
[18] A. M. Tulino, G. Caire, S. Verdú, and S. Shamai, "Support recovery with sparsely sampled free random matrices,"