Prior Image-Constrained Reconstruction using Style-Based Generative Models
Varun A. Kelkar    Mark A. Anastasio

Abstract
Obtaining an accurate and reliable estimate of an object from highly incomplete imaging measurements remains a holy grail of imaging science. Deep learning methods have shown promise in learning object priors or constraints to improve the conditioning of an ill-posed imaging inverse problem. In this study, a framework for estimating an object of interest that is semantically related to a known prior image is proposed. An optimization problem is formulated in the disentangled latent space of a style-based generative model, and semantically meaningful constraints are imposed using the disentangled latent representation of the prior image. Stable recovery from incomplete measurements with the help of a prior image is theoretically analyzed. Numerical experiments demonstrating the superior performance of our approach as compared to related methods are presented.
1. Introduction
In recent years, generative models based on deep neural networks have risen to the forefront of machine learning research. By taking advantage of the approximately low-dimensional structure of natural image data, generative models such as generative adversarial networks (GANs) have been able to learn effective mappings from a simple, tractable, low-dimensional distribution to a complex image data distribution, giving them the ability to generate highly realistic-looking images, approximately from the distribution induced by the training data. Apart from image synthesis, generative models have found applications in density estimation (Dinh et al., 2016), image restoration (Ledig et al., 2017), object detection (Prakash & Karam, 2019), video generation (Cai et al., 2018), and style transfer (Zhang et al., 2018), to name a few. Recent research in GANs has achieved state-of-the-art performance in terms of the visual quality of the generated images, invertibility and image representation, and meaningful control of the semantic features of the image (Karras et al., 2019; 2020).

[University of Illinois at Urbana-Champaign, Champaign, IL 61801, USA. Correspondence to: Varun A. Kelkar, Mark A. Anastasio <[vak2, maa]@illinois.edu>.]

Generative models have also found applications for image reconstruction in computed imaging systems, where a computational procedure is employed to form an estimate of an object of interest from imaging measurements. When the measurements are insufficient to uniquely determine the object of interest, this procedure amounts to solving an ill-posed inverse problem, and requires additional prior information about the object distribution.
Approaches incorporating sparsity-based priors have been successful in obtaining accurate reconstructions from incomplete measurements (Candès & Wakin, 2008; Lustig et al., 2008). Seeking to better characterize the object distribution, generative models have been proposed as priors for solving ill-posed inverse problems in imaging (Bora et al., 2017). Improvements in the image-synthesis performance, stability, and quality of generative neural networks have in turn improved the performance of generative-model-constrained reconstruction methods.

In this study, style-based generative models are investigated for constraining image reconstruction problems in which the solution is known to be close to a given prior image. This scenario, known as prior image-constrained reconstruction, is of particular significance when, for instance, the same object evolving over time is to be imaged multiple times. It finds applications in several scientific and medical imaging situations, such as monitoring tumor progression or perfusion (Chen et al., 2008), multi-contrast magnetic resonance imaging (MRI) (Weizman et al., 2016), or sequential radar imaging (Becquaert et al., 2019). Traditionally, this problem has been solved by assuming that the difference between the object of interest and the prior image is sparse in some domain (Chen et al., 2008; Mota et al., 2017), and solving an optimization problem penalizing this difference. However, although many natural images are compressible in some domain, their differences may not be so. Hence, penalizing the $\ell_1$ norm of the difference between the object and the prior image in a linear transform domain may not be the best strategy to ensure that the ground truth is related to the prior image in a meaningful way. Style-based generative models are known to be able to control individual semantic features, or styles, in an image by varying the disentangled latent representation of the image at different scales. Here, the inverse problem is formulated as an optimization problem in the disentangled latent space of a style-based generative model. The disentangled latent representation of the prior image is computed, and prior image-based regularization is imposed by constraining the estimate to have certain styles equal to the corresponding styles of the prior image.

Related work.
Generative model-constrained reconstruction has been an active area of research since being first proposed by Bora et al. (2017). Several approaches have tried to reduce the representation error arising from the use of GANs as priors (Asim et al., 2020; Hussein et al., 2020). Other works have examined denoising and image reconstruction when the measurement noise is distributed in a complex way (Whang et al., 2020). Recently, StyleGAN has been used for image super-resolution (Menon et al., 2020) and for image reconstruction in general (Marinescu et al., 2020). Individual control over image semantics has been achieved (Karras et al., 2019; Tewari et al., 2020) by controlling the disentangled latent representation of the image. Since obtaining the disentangled latent representation of an image is crucial, several studies have focused on StyleGAN inversion (Wulff & Torralba, 2020; Abdal et al., 2019). Prior image-constrained compressed sensing has been studied previously, from the theoretical point of view (Mota et al., 2017) to applications (Chen et al., 2008; Becquaert et al., 2019). Some studies have used adaptive weights on the prior image and the image estimate to better model the differences between the ground truth and the prior image (Weizman et al., 2016).

This study is organized as follows. Section 2 describes the background of compressed sensing, generative model-constrained image reconstruction, prior image-constrained reconstruction, and style-based generative models. Section 3 describes the proposed approach motivated by StyleGANs. Section 4 presents a theoretical analysis of the problem at hand. Section 5 describes the setup of the proposed numerical studies, with the results described in Section 6. Finally, a discussion and conclusion are presented in Section 7.
2. Background
Several digital imaging systems can be approximately modeled by a linear imaging model, described as (Barrett & Myers, 2013)

$g = Hf + n$,   (1)

where $f \in \mathbb{E}^n$ is a vector that approximates the object to be imaged, $g \in \mathbb{E}^m$ corresponds to the imaging measurements, and $n \in \mathbb{E}^m$ represents the measurement noise. Here, $\mathbb{E}^l$ denotes an $l \in \mathbb{N}$ dimensional Euclidean space, and $H \in \mathbb{E}^{m \times n}$ is the linear operator that approximates the underlying physical model of the imaging system. Often, the measurements are incomplete ($m < n$) and, as such, $f$ cannot be uniquely recovered from $g$. In this case, in order to obtain a useful estimate of the true object, prior knowledge about $f$ is needed to constrain the domain of $H$. In recent decades, compressed sensing has emerged as a leading framework for solving such underdetermined systems of equations, by constraining $f$ to a set of vectors that are sparse in some domain, along with certain conditions on $H$. Specifically, stable recovery of $k$-sparse objects can be guaranteed if $H$ satisfies the restricted isometry property (RIP) over the set of $k$-sparse signals (Candès et al., 2006; Candès & Wakin, 2008).

Definition 2.1 (Restricted Isometry). Let $S_k$ be the set of all $k$-sparse vectors in $\mathbb{R}^n$. $H$ is said to satisfy the RIP over $S_k$ if there exists $\delta_k \in (0, 1)$ such that

$(1 - \delta_k)\|f\|^2 \le \|Hf\|^2 \le (1 + \delta_k)\|f\|^2$,   (2)

for all $f \in S_k$, and $\delta_k$ is not too close to 1, in a way prescribed in (Candès et al., 2006).

Prior image-constrained reconstruction is a scenario where the true object $\tilde{f}$ is related, or close, to a previously known prior image $f^{(PI)}$ (Chen et al., 2008; Mota et al., 2017). In traditional approaches to this problem, this similarity is imposed in the following way.
In addition to the conventional sparsity constraint with respect to a transform $\Phi$, it is assumed that the difference of $\tilde{f}$ and $f^{(PI)}$ is sparse with respect to a transform $\Psi$, such as the wavelet transform or the 2D finite-difference operator. Solving the inverse problem can then be cast as obtaining the solution to versions of the following optimization problem (Chen et al., 2008; Mota et al., 2017; Weizman et al., 2016):

$\hat{f} = \arg\min_f \|g - Hf\|^2 + \lambda\left(\alpha \left\|\Psi(f - f^{(PI)})\right\|_1 + (1 - \alpha)\|\Phi f\|_1\right)$.   (3)

It is important to note that the assumption of sparse differences between the prior image and the ground truth may not always be valid. This is because, although the difference of two $k$-sparse vectors is at worst $2k$-sparse, in real life, images are better modeled as being compressible, not sparse, with approximately the same set of coefficients carrying most of the signal energy, as evidenced in (Adcock et al., 2017). The difference of two such images is likely to annihilate the coefficients carrying the most energy, which may result in the difference not being compressible. Hence, it is important to develop new ways to quantify the similarity between the ground truth and the prior image.

Generative model-constrained reconstruction is a framework for image reconstruction where the domain of $H$ is constrained with the help of a generative model trained to approximate the distribution of objects (Bora et al., 2017; Asim et al., 2020; Marinescu et al., 2020). Let $G: \mathbb{R}^k \to \mathbb{R}^n$ be a generative model, typically parametrized by a deep neural network with parameters $\theta$. $G$ is trained on a dataset of images such that, under $G$, a sample $z \in \mathbb{R}^k$ from a tractable distribution such as $\mathcal{N}(0, I_k)$ maps to a sample $G(z)$ that approximately comes from the distribution of the training dataset images.
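As a concrete illustration of the traditional prior-image penalty in Eq. (3), the following sketch solves a simplified special case ($\alpha = 1$ and $\Psi$ taken as the identity, so only the difference from the prior image is penalized) with ISTA. The problem sizes, regularization weight, and step size are illustrative assumptions, not values from the paper.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def piccs_ista(g, H, f_pi, lam=0.05, n_iter=500):
    """ISTA for the special case of Eq. (3) with alpha = 1 and Psi = I:
        min_f ||g - H f||^2 + lam * ||f - f_pi||_1.
    The proximal step soft-thresholds the *difference* from the
    prior image rather than f itself."""
    step = 1.0 / (2.0 * np.linalg.norm(H, 2) ** 2)  # 1/L for the smooth term
    f = f_pi.copy()
    for _ in range(n_iter):
        grad = 2.0 * H.T @ (H @ f - g)              # gradient of the data term
        d = (f - step * grad) - f_pi
        f = f_pi + soft_threshold(d, step * lam)
    return f

# toy example: the true object differs from the prior image in only 5 pixels
rng = np.random.default_rng(0)
n, m = 100, 60
f_pi = rng.standard_normal(n)
diff = np.zeros(n)
diff[rng.choice(n, 5, replace=False)] = rng.standard_normal(5)
f_true = f_pi + diff
H = rng.standard_normal((m, n)) / np.sqrt(m)
f_hat = piccs_ista(H @ f_true, H, f_pi)
```

When the difference really is sparse, this estimate improves on the prior image itself; the point made in the text is that for realistic compressible images this sparse-difference assumption often fails.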
Here, $z$ is called the latent representation of $G(z)$. Since many real-life image datasets are approximately low dimensional, popular generative models such as GANs often have a low-dimensional domain, with dimensionality $k \ll n$. Taking advantage of this fact, Bora et al. proposed a way to guarantee stable reconstruction of an object in the range of $G$ from $O(k \log(Lr/\delta))$ measurements, if the object's latent representation has an $\ell_2$ norm of at most $r$, and if $H$ satisfies the following set-restricted eigenvalue condition (S-REC):

Definition 2.2 (Set-restricted eigenvalue condition). Let $S \subseteq \mathbb{R}^n$. A matrix $H \in \mathbb{R}^{m \times n}$ satisfies the set-restricted eigenvalue condition S-REC$(S, \gamma, \delta)$ for some constants $\gamma > 0$ and $\delta \ge 0$ if, for any $f_1, f_2 \in S$,

$\|H(f_1 - f_2)\| \ge \gamma\|f_1 - f_2\| - \delta$.   (4)

Intuitively, this property stipulates that two objects $f_1$ and $f_2$ in the range $R(G)$ of $G$ may give rise to measurements under $H$ that are close only if they themselves are close. Certain sensing matrices, such as iid Gaussian sensing matrices with an appropriate column length, have been shown to satisfy the S-REC (Bora et al., 2017). The guarantees of stable recovery are applicable to the solution of the following constrained optimization problem:

$\hat{z} = \arg\min_{z:\, \|z\| \le r} \|g - HG(z; \theta)\|^2, \quad \hat{f} \equiv G(\hat{z}; \theta)$,   (5)

where $g = H\tilde{f} + n$ is the measurement corresponding to the unknown true object $\tilde{f}$. Since the above objective is non-convex, standard gradient descent-based methods are not guaranteed to converge to the optimal solution. However, it is observed that, in practice, gradient-based methods give estimates of $\hat{z}$ that are close to the optimum, at least in the case when $\tilde{f} \in R(G)$. Although Bora et al. show numerical studies using a deep convolutional GAN (DCGAN), an optimization problem similar to Eq.
(5) can also be formulated for recent advanced GAN architectures, such as StyleGAN, with demonstrably improved empirical performance (Marinescu et al., 2020).

StyleGAN and its successor, StyleGAN2, are well known for producing highly realistic samples from a real-life natural image distribution. They are characterized by an architecture that consists of two sub-networks: (1) a mapping network $g_{\mathrm{mapping}}: \mathbb{R}^k \to \mathbb{R}^k$, and (2) an $L$-layer synthesis network $G: \mathbb{R}^{Lk} \to \mathbb{R}^n$. In the conventional image-generation mode, the mapping network maps a sample $z \in \mathcal{Z} \equiv \mathbb{R}^k$ from an iid standard normal distribution to a vector $u \in \mathcal{W} \equiv \mathbb{R}^k$. The input $w$ to the $L$-layer synthesis network $G$ is formed by stacking $L$ copies of $u$ to form a $K = kL$ dimensional vector $w \in \mathcal{W}^+ \equiv \mathbb{R}^K$. The $i$th copy of $u$ is fed as an input to the $i$th layer of $G$, representing the $i$th level of detail in the generated image. In addition, $G$ also takes as input a collection of latent-noise vectors $\eta$ that control minor stochastic variations of the generated image at different resolutions. The ability of a StyleGAN to control features of the generated image at different scales comes in part from this architecture, and in part from the style-mixing regularization used during training (Karras et al., 2019), which loosely corresponds to evaluating the training loss using images generated by a "mixed" $w$ vector, formed by stacking the $\mathcal{W}$-space outputs of different realizations of $z$. In addition to these basic characteristics, StyleGAN2 introduces path-length regularization, which aids in better conditioning of $G$ and reduces the representation error (Karras et al., 2020). For conventional image generation, a sampled $w^+$ vector is degenerate, containing $L$ copies of $u$, and hence lies in a $k$-dimensional subspace of $\mathcal{W}^+$. However, studies have shown that, from the point of view of projecting an image to the range of $G$, utilizing the entire $\mathcal{W}^+$ space has benefits in terms of lower representation error (Wulff & Torralba, 2020).
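The construction of the degenerate $w^+$ vector, and of a style-mixed vector formed from two latents, can be made concrete with a small sketch. The dimensions below are toy values (StyleGAN2 uses $k = 512$), and the stand-in $u$ vectors replace actual mapping-network outputs.

```python
import numpy as np

def to_w_plus(u, L):
    """Stack L copies of a W-space vector u into the extended
    W+ representation fed to the L-layer synthesis network."""
    return np.tile(u, L)

def style_mix(w_a, w_b, k, p1, p2):
    """Form a mixed W+ vector: coordinates 1..p1 and p2..K come from
    w_a, coordinates p1..p2 from w_b. p1 and p2 are multiples of k,
    0 <= p1 < p2 <= len(w_a), so whole styles are swapped."""
    w = w_a.copy()
    w[p1:p2] = w_b[p1:p2]
    return w

k, L = 4, 6                      # toy dimensions
u_a = np.arange(k, dtype=float)  # stand-ins for mapping-network outputs
u_b = -np.ones(k)
w_a, w_b = to_w_plus(u_a, L), to_w_plus(u_b, L)
w_mix = style_mix(w_a, w_b, k, p1=1 * k, p2=3 * k)
```

Here styles 2 and 3 of `w_mix` come from `u_b` while the remaining styles come from `u_a`, mirroring the "mixed" $w$ vectors used in style-mixing regularization.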
3. Approach
StyleGAN and StyleGAN2 are able to vary certain styles of an image while keeping certain other styles fixed. For example, for a StyleGAN trained on a dataset of faces, it is possible to vary the hairstyle and hair color while keeping the general structure of the face the same. For a medical imaging dataset, such as a dataset of multi-contrast brain MRI images, it is possible to control the contrast, or the exact placement of the folds, ventricles, and other fine-scale features, while keeping the general structure of the image the same. Hence, comparing the ground truth and the prior image in the latent space of a StyleGAN is a natural approach to quantify the similarity between them.

Motivated by the style-mixing experiments, one possible way to formulate the inverse problem at hand is as follows. Let $G: \mathbb{R}^K \to \mathbb{R}^n$ denote the synthesis network of a trained StyleGAN2. Note that the domain of $G$ is taken to be the extended space $\mathcal{W}^+$. Let $f^{(PI)} = G(w^{(PI)})$ denote a known prior image in the range of $G$. Then, the proposed measurement model can be written as

$g = H\tilde{f} + n, \quad \tilde{f} \in \{G(w) \ \mathrm{s.t.}\ w_{1:p_1} = w^{(PI)}_{1:p_1},\ w_{p_2:K} = w^{(PI)}_{p_2:K}\}$,   (6)

where $p_1, p_2$ are multiples of $k$, $0 \le p_1 < p_2 \le K$, and $w_{u:v}$ denotes the section of the vector $w$ from indices $u$ through $v$. Even assuming that $f^{(PI)}$ and $\tilde{f}$ are in-distribution images, there are some practical concerns about the measurement model described above. (1) In practice, $f^{(PI)}$ may not lie in $R(G)$, since $R(G)$ is a $K$-dimensional manifold in $\mathbb{R}^n$. (2) If $f^{(PI)} \in R(G)$, it is still possible that its disentangled latent representation $w^{(PI)}$ lies in an unstable region of $\mathcal{W}^+$, due to which it does not inherit the style-mixing properties of samples drawn from $G$. (3) If $f^{(PI)} \in R(G)$ and the style-mixing performance using $w^{(PI)}$ is consistent with that of images drawn from $G$, $\tilde{f}$, i.e.
the sought-after object, may still not lie in $\{G(w) \ \mathrm{s.t.}\ w_{1:p_1} = w^{(PI)}_{1:p_1},\ w_{p_2:K} = w^{(PI)}_{p_2:K}\}$ for any $p_1, p_2$. The following solutions are proposed in order to alleviate the aforementioned concerns. Concern (1) essentially refers to the representation error when approximating $f^{(PI)}$. It was observed that this is not a concern for in-distribution images if a latent representation $w^{(PI)}$ in the extended $\mathcal{W}^+$ space is sought; even if the latent-noise vectors $\eta$ are not optimized over, a close approximation to $f^{(PI)}$ can be obtained by gradient descent-based optimization, except for minor stochastic detail represented by $\eta$. In an attempt to resolve concern (2), Wulff & Torralba (2020) observed that a transformed version of $w$, given by $v = \mathrm{LReL}_\alpha(w)$, approximately follows a multivariate Gaussian distribution with mean $\bar{v}$ and covariance $\Sigma$. Here, $\mathrm{LReL}_\alpha(\cdot)$ denotes the leaky-ReLU nonlinear activation (Maas et al., 2013), defined as

$\mathrm{LReL}_\alpha(x)_i = \begin{cases} x_i, & x_i \ge 0, \\ \alpha x_i, & x_i < 0. \end{cases}$   (7)

The value of $\alpha$ is the reciprocal of the negative-side scaling value of the last leaky-ReLU layer in the mapping network $g_{\mathrm{mapping}}$. This means that it is possible to regularize the inversion of $G$ with the help of a Gaussian prior on $v$ (Wulff & Torralba, 2020). The inversion process can then be formulated in terms of the following optimization problem (Wulff & Torralba, 2020):

$w^{(PI)} = \arg\min_w \left\|f^{(PI)} - G(w)\right\|^2 + \lambda \|v - \bar{v}\|_\Sigma^2, \quad \mathrm{s.t.}\ v = \mathrm{LReL}_\alpha(w)$,   (8)

Algorithm 1
Projected Adam algorithm for minimizing the objective in Eq. (9).
Input:
Measurements $g$, prior image latent $w^{(PI)}$, regularization parameters $p_1, p_2, \lambda$, maximum iterations $n_{\mathrm{iter}}$; $L(w; \lambda)$: objective function from Eq. (9).
Initialize Adam optimizer parameters $(\alpha, \beta_1, \beta_2)$ (default parameters are used).
Initialize iteration number $t \leftarrow 0$.
Initialize $w^{[0]} \leftarrow w^{(PI)}$.
while $w^{[t]}$ not converged do
  Adam update (Kingma & Ba, 2014): $w^{[t]} \leftarrow \mathrm{ADAM}_{\alpha,\beta_1,\beta_2}(L(w^{[t]}; \lambda))$
  Projection step: $w^{[t]}_{1:p_1} \leftarrow w^{(PI)}_{1:p_1}$; $w^{[t]}_{p_2:K} \leftarrow w^{(PI)}_{p_2:K}$
  $t \leftarrow t + 1$
end while

where $\|x\|_\Sigma^2 = x^\top \Sigma^{-1} x$ is used to impose a prior on $v$ corresponding to a Gaussian distribution with mean $\bar{v}$ and covariance $\Sigma$. As observed in (Wulff & Torralba, 2020), $R(G)$-projected estimates of $f^{(PI)}$ obtained in this way inherit the style-mixing and stability properties of samples from $G$. The tradeoff between an accurate representation of $f^{(PI)}$ and the style-mixing properties is governed by the regularization parameter $\lambda$.

Concern (3) can be addressed by arguing that, since $\tilde{f}$ is an in-distribution image, it has minimal representation error when optimizing over $\mathcal{W}^+$. Also, $p_1, p_2$ can be treated as tunable regularization parameters, which manage the tradeoff between imposition of the prior from $f^{(PI)}$ and consistency with the measurements $g$.

Taking into account the above arguments, the inverse problem associated with Eq. (6) is formulated as the following optimization problem:

$\hat{w} = \arg\min_w \|g - HG(w)\|^2 + \lambda\|v - \bar{v}\|_\Sigma^2, \quad \mathrm{s.t.}\ w_{1:p_1} = w^{(PI)}_{1:p_1},\ w_{p_2:K} = w^{(PI)}_{p_2:K},\ v = \mathrm{LReL}_\alpha(w)$.   (9)

Although the above problem is non-convex, similar to previous works (Bora et al., 2017; Asim et al., 2020; Kelkar et al., 2020), useful estimates $\hat{f}$ can be obtained by iterative gradient descent-based optimization. Algorithm 1 shows the projected-Adam algorithm used for this purpose (Kingma & Ba, 2014).
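The building blocks of the penalty term in Eqs. (8)–(9), i.e. the leaky-ReLU transform of Eq. (7), its inverse, and the $\Sigma$-norm, can be sketched as follows. The value of $\alpha$ below is purely illustrative, since the paper ties $\alpha$ to the mapping network's last layer.

```python
import numpy as np

def lrel(x, alpha):
    """Leaky ReLU of Eq. (7): identity for x >= 0, slope alpha otherwise."""
    return np.where(x >= 0, x, alpha * x)

def lrel_inv(v, alpha):
    """Inverse leaky ReLU, mapping v back to w (requires alpha > 0)."""
    return np.where(v >= 0, v, v / alpha)

def sigma_norm_sq(v, v_bar, Sigma_inv):
    """Squared Sigma-norm ||v - v_bar||_Sigma^2 = (v - v_bar)^T Sigma^{-1} (v - v_bar),
    the Gaussian-prior penalty used in Eqs. (8)-(10)."""
    d = v - v_bar
    return float(d @ Sigma_inv @ d)

alpha = 0.2                               # illustrative value only
w = np.array([1.5, -2.0, 0.0, -0.5])
v = lrel(w, alpha)                        # Gaussianized latent
w_back = lrel_inv(v, alpha)               # round-trips back to w
penalty = sigma_norm_sq(v, np.zeros(4), np.eye(4))
```

With $\Sigma = I$ and $\bar{v} = 0$ the penalty reduces to the squared Euclidean norm of $v$; in practice $\bar{v}$ and $\Sigma$ are estimated from samples of the mapping network.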
In order to present a fair comparison, CSGM using StyleGAN2 is also formulated in a manner similar to Eq. (9):

$\hat{w} = \arg\min_w \|g - HG(w)\|^2 + \lambda\|v - \bar{v}\|_\Sigma^2, \quad \mathrm{s.t.}\ v = \mathrm{LReL}_\alpha(w)$.   (10)

The Adam algorithm was used for obtaining approximate solutions to the above problem (Kingma & Ba, 2014).
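Algorithm 1 can be sketched in a framework-agnostic way. In the sketch below, the generative network and the objective of Eq. (9) are replaced by a simple quadratic surrogate; a real implementation would obtain `grad_fn` by automatic differentiation of the objective in Eq. (9) through the trained StyleGAN2.

```python
import numpy as np

def projected_adam(grad_fn, w_pi, p1, p2, lr=0.01, beta1=0.9, beta2=0.999,
                   eps=1e-8, n_iter=1500):
    """Sketch of Algorithm 1: an Adam update on the full W+ vector,
    followed by a projection step that resets the frozen coordinates
    (1..p1 and p2..K) to the prior image's latent w_pi."""
    w = w_pi.astype(float).copy()
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, n_iter + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g          # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
        m_hat = m / (1 - beta1 ** t)             # bias corrections
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)
        w[:p1] = w_pi[:p1]                       # projection: re-impose the
        w[p2:] = w_pi[p2:]                       # prior image's styles
    return w

# toy surrogate objective: ||w - w_target||^2, styles outside (p1, p2) frozen
K, p1, p2 = 12, 4, 8
w_pi = np.zeros(K)
w_target = np.ones(K)
grad_fn = lambda w: 2.0 * (w - w_target)
w_hat = projected_adam(grad_fn, w_pi, p1, p2)
```

Only the middle block of coordinates moves toward the surrogate target; the remaining coordinates stay pinned to $w^{(PI)}$, exactly as the projection step of Algorithm 1 prescribes.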
4. Theoretical analysis
The theoretical analysis presented here is motivated by the results presented in (Bora et al., 2017). There, the authors provide a stable recovery guarantee in terms of the Lipschitz constant of the generative network. However, in practice, Lipschitz constants can be difficult to estimate, or a generative network may not be Lipschitz stable. Hence, a theoretical analysis in terms of the Jacobians of the generative network is presented here, in order to derive a limited, non-uniform guarantee for the stable recovery of typical in-distribution objects that lie in the range of the generative network. In order to do so, we utilize properties of StyleGAN2. The following assumptions are made based on the StyleGAN2 properties:

1.
Path-length regularity:

$\mathbb{E}_{w \sim p_w}\left(\|J(w)\|_F - a\right)^2 < b$,   (AS1)

where $J(w)$ denotes the Jacobian of $G$ evaluated at $w$, and $b > 0$ and $a = \mathbb{E}_w \|J(w)\|_F$ are global constants. This assumption is inspired by the path-length regularization used in (Karras et al., 2020).

2. Approximate local linearity:

$\mathbb{E}_{w \sim p_w} \max_{w':\, \|w' - w\| \le \epsilon} L(w', w) \le \beta(\epsilon)$,   (AS2)

where $L(w', w) = \|G(w') - G(w) - J(w)(w' - w)\|$. This property essentially measures how close to linear the behavior of $G$ is in an $\epsilon$-neighborhood around a point $w$. Empirically estimated values of $a$, $b$, and $\beta$ are presented in the supplementary file. As described in (Karras et al., 2020), $a$ and $b$ are relatively cheap to estimate computationally, by evaluating $\mathbb{E}_w \mathbb{E}_{y \sim \mathcal{N}(0, I_n)} \|J(w)^\top y\|$. Estimating $\beta(\epsilon)$ for a given value of $\epsilon$ is, however, not tractable in general; an approximate estimate was obtained by first computing the Jacobian at a point $w \sim p_w$, and then iteratively maximizing $L(w', w)$ using a projected gradient ascent-type algorithm. Using these two assumptions, the following results are derived.

Notation 4.1.
Let $p_w$ denote the distribution of the extended latent-space vector $w = [u_1^\top\, u_2^\top \ldots u_L^\top]^\top \in \mathcal{W}^+$, where $u_i = g_{\mathrm{mapping}}(z_i)$, $z_i \sim \mathcal{N}(0, I_k)$, with the $z_i$'s independently distributed. Let $p_v$ denote the distribution of $v = \mathrm{LReL}_\alpha(w)$, and let $\Sigma$ be the covariance matrix of $v$. Let $w^{(PI)}$ be a sample from $p_w$, and $0 \le p_1 < p_2 \le K$. Assume that $p_1$ and $p_2$ are multiples of $k$. Let

$B^{p_1,p_2}_w(r) := \{w \ \mathrm{s.t.}\ \|\mathrm{LReL}_\alpha(w) - \bar{v}\|_\Sigma \le r,\ w_{1:p_1} = w^{(PI)}_{1:p_1},\ w_{p_2:K} = w^{(PI)}_{p_2:K}\}$.

By Markov's inequality and concentration of norm, we have the following (Vershynin, 2018):
Lemma 4.1. If $w$ is a sample from $p_w$, then it satisfies the following three properties with probability at least $1 - O(1/K)$:

$\|J(w)\|_F \le \sqrt{K}\, a$,   (P1)

$\max_{w':\, \|w' - w\| \le \epsilon} L(w', w) \le \sqrt{K}\, \beta(\epsilon)$,   (P2)

$\left\|\Sigma^{-1/2}\, \mathrm{LReL}_\alpha(w)\right\| \le \sqrt{K}\,(1 + o(1))$.   (P3)

Lemma 4.2 (Set-restricted eigenvalue condition). Let $w^{(PI)}$ be a sample from $p_w$. Assume that $p_v$ is a Gaussian distribution with covariance matrix $\Sigma$. Let $\tilde{B}^{p_1,p_2}_w(r)$ be the set of all points in $B^{p_1,p_2}_w(r)$ satisfying properties P1 and P2. Let $\tau < 1$, $\delta > 0$. Let $H \in \mathbb{R}^{m \times n}$ be a matrix with elements $h_{ij} \sim \mathcal{N}(0, 1/m)$. For all $\delta' < \delta$, let $\beta(\delta'/a)$ go polynomially as $\delta'/a$, with $\beta(0) = 0$. If

$m = \Omega\left(\frac{p_2 - p_1}{\tau^2} \log \frac{a r \|\Sigma\|_F}{\delta}\right)$,   (11)

then $H$ satisfies the S-REC$(G(\tilde{B}^{p_1,p_2}_w(r)),\, 1 - \tau,\, \delta + \sqrt{K}\beta(\delta/a))$ with probability $1 - e^{-\Omega(\alpha^2 m)}$.

Based on this, the following reconstruction guarantee is arrived at.
Theorem 4.1.
Let $H \in \mathbb{R}^{m \times n}$ satisfy S-REC$(G(\tilde{B}^{p_1,p_2}_w(r)),\, \gamma,\, \delta + \sqrt{K}\beta(\delta/a))$. Let $n$ be the measurement noise. Let $w, w^{(PI)} \sim p_w$. Let $f^{(PI)} = G(w^{(PI)})$ be the known prior image. Let $\tilde{w} = [w^{(PI)\top}_{1:p_1}\ w^\top_{p_1:p_2}\ w^{(PI)\top}_{p_2:K}]^\top$, and let $\tilde{f} = G(\tilde{w})$ represent the object to be imaged. Let $g = H\tilde{f} + n$ be the imaging measurements. Let

$\hat{f} = \arg\min_{f \in G(\tilde{B}^{p_1,p_2}_w(r))} \|g - Hf\|$.   (12)

Then,

$\|\hat{f} - \tilde{f}\| \le \frac{1}{\gamma}\left(2\|n\| + \delta + \sqrt{K}\beta(\delta/a)\right)$   (13)

with probability $1 - O(1/K)$.

Note that instead of optimizing over $f \in G(\tilde{B}^{p_1,p_2}_w(r))$, Eq. (9) proposes to solve a relaxed version, over $G(B^{p_1,p_2}_w(r))$. In addition, as mentioned previously, Eq. (9) is non-convex and convergence is not guaranteed. Due to this, the proposed approach may not always yield an optimum of Eq. (12). However, whether or not the solution lies in $G(\tilde{B}^{p_1,p_2}_w(r))$ can always be checked by verifying that P1, P2, and P3 are satisfied. For estimating in-distribution objects in the range of $G$, it was observed empirically that the conditions P1, P2, and P3 are satisfied by the estimated $\hat{w}$.

Ground Truth   Prior Image   PICGM
Figure 1:
Ground truth, prior image, and images estimated from Gaussian measurements with $n/m = 10$ and $n/m = 50$ using the proposed approach, in the inverse-crime case.
Ensemble RMSE and SSIM values in the inverse-crime setting, for various subsampling ratios. The error bars show a span of one standard deviation.
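The constant $a$ appearing in assumption (AS1) of Section 4 is estimated via the Jacobian-vector-product identity $\mathbb{E}_{y \sim \mathcal{N}(0, I_n)} \|J(w)^\top y\|^2 = \|J(w)\|_F^2$, which avoids forming the Jacobian explicitly. The sketch below applies this identity to a toy differentiable generator; the real estimate would use JVPs of the trained StyleGAN2, and the toy map $G(w) = B\tanh(w)$ is purely an assumption for illustration.

```python
import numpy as np

def frob_norm_sq_mc(jvp_T, n, n_samples=2000, rng=None):
    """Monte-Carlo estimate of ||J(w)||_F^2 using
    E_{y ~ N(0, I_n)} ||J(w)^T y||^2 = ||J(w)||_F^2,
    the same trick used by StyleGAN2's path-length regularizer."""
    rng = np.random.default_rng(rng)
    acc = 0.0
    for _ in range(n_samples):
        y = rng.standard_normal(n)
        acc += np.sum(jvp_T(y) ** 2)
    return acc / n_samples

# toy generator G(w) = B tanh(w); its Jacobian is B @ diag(1 - tanh(w)^2)
rng = np.random.default_rng(0)
K, n = 16, 64
B = rng.standard_normal((n, K)) / np.sqrt(K)
w = rng.standard_normal(K)
d = 1.0 - np.tanh(w) ** 2
jvp_T = lambda y: d * (B.T @ y)          # computes J(w)^T y without forming J
est = frob_norm_sq_mc(jvp_T, n)
exact = np.sum((B * d) ** 2)             # ||B diag(d)||_F^2, exactly
```

Averaging such estimates over many draws of $w$ yields the global constant $a$ (up to the square root and the expectation ordering used in the paper).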
5. Numerical studies
The numerical studies were split into two parts: (1) an inverse-crime study, i.e., reconstruction of in-distribution objects sampled from $G$ from noisy measurements, using an iid Gaussian-distributed matrix as the forward model; and (2) a non-inverse-crime study, i.e., reconstruction of brain MRI images from noisy simulated MRI measurements.
1) Dataset and forward model:
For both parts, StyleGAN2 was trained on a composite brain MR image dataset consisting of a total of 200,676 T1- and T2-weighted images of size 256 × 256 from the fastMRI initiative database (Zbontar et al., 2018) and 866 images from the brain tumor progression dataset (Schmainda & Prah, 2018), resized to 256 × 256.

For the inverse-crime study, each $u_i$ was sampled independently from the mapping network, and the composite vector $w = [u_1, u_2, \ldots, u_L]$ was used to sample the image. This was treated as the prior image. Four of the style vectors were then replaced by new styles, each again sampled independently using the mapping network. This was treated as the ground-truth object to be recovered. Reconstruction performance was evaluated on a dataset of 50 such images. An iid Gaussian matrix $H \in \mathbb{R}^{m \times n}$ was used as the forward model. The reconstruction performance was evaluated for several undersampling ratios, ranging from $n/m = 5$ to $n/m = 50$. Real iid Gaussian noise with a signal-to-noise ratio (SNR) of 20 dB was added to the measurements.

For evaluating the reconstruction performance in the non-inverse-crime study, a dataset of 17 held-out image pairs from the tumor progression dataset was used. The first image of each pair was used as the prior image. Imaging measurements were simulated from the second image, which

Ground Truth   Prior Image   PLS-TV   CSGM   PICCS   PICGM
Figure 3:
Ground truth, prior image, and images reconstructed from simulated MRI measurements with $n/m = 4$, along with difference images, for the non-inverse-crime study.
Ground Truth Prior Image PLS-TV CSGM PICCS PICGM
Figure 4:
Ground truth, prior image, and images reconstructed from simulated MRI measurements with $n/m = 8$, along with difference images, for the non-inverse-crime study.

was treated as the ground truth. A Fourier undersampling forward model, described as $H = m \odot F \in \mathbb{R}^{m \times n}$, was used to simulate MR measurements. Here, $m$ corresponds to a binary mask, and $F$ corresponds to the 2D discrete Fourier transform. This forward operator fully samples a fraction of the lower frequencies and randomly subsamples a fraction of the high frequencies of the image. Five different undersampling ratios were considered, including $n/m = 2$, $4$, and $8$. Complex iid Gaussian noise with 20 dB SNR was added to the measurements.
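A minimal sketch of the forward model $H = m \odot F$ described above, together with noise added at a prescribed SNR, is given below. The sampling-density parameters (`center_frac`, `accel`) are illustrative assumptions; the paper does not specify the exact mask construction here.

```python
import numpy as np

def make_mask(shape, center_frac=0.08, accel=4, rng=None):
    """Binary k-space mask: fully sample a band of centered low
    frequencies and randomly subsample the rest. The parameters
    are illustrative, not the paper's values."""
    rng = np.random.default_rng(rng)
    mask = rng.random(shape) < (1.0 / accel)
    h = shape[0]
    c = max(int(center_frac * h), 1)
    mask[h // 2 - c // 2 : h // 2 + c // 2, :] = True
    return mask

def forward(f, mask):
    """H f = m ⊙ F f: masked, centered, orthonormal 2D DFT."""
    return mask * np.fft.fftshift(np.fft.fft2(f, norm="ortho"))

def adjoint(g, mask):
    """Adjoint H^H g: invert each unitary/diagonal factor in reverse."""
    return np.fft.ifft2(np.fft.ifftshift(mask * g), norm="ortho")

def add_noise_snr(g, snr_db, rng=None):
    """Complex iid Gaussian noise scaled to a prescribed measurement SNR."""
    rng = np.random.default_rng(rng)
    noise = rng.standard_normal(g.shape) + 1j * rng.standard_normal(g.shape)
    noise *= np.linalg.norm(g) / np.linalg.norm(noise) * 10 ** (-snr_db / 20)
    return g + noise

f = np.random.default_rng(0).random((64, 64))
mask = make_mask(f.shape, rng=1)
g0 = forward(f, mask)
g = add_noise_snr(g0, snr_db=20, rng=2)
f_zf = adjoint(g, mask)          # zero-filled reconstruction
```

Because the DFT is taken with `norm="ortho"` and the shift is a permutation, `adjoint` is exactly the conjugate transpose of `forward`, which is what iterative solvers such as Eq. (9) require for gradient computations.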
2) Generative network training details:
The StyleGAN2 architecture proposed in (Karras et al., 2020) was used, with an output image size of 256 × 256.
3) Baselines:
For obtaining estimates of the true object from $g$, the performance of the following reconstruction methods was qualitatively and quantitatively compared: (1) penalized least-squares with TV regularization (PLS-TV); (2) compressed sensing using StyleGAN2 (CSGM), per Eq. (10); (3) prior image-constrained compressed sensing (PICCS), per Eq. (3); and (4) the proposed method (PICGM), introduced in Eq. (9). Note that the first two methods do not utilize information from the prior image, while the last two do. For PICCS, the discrete difference operator (corresponding to the TV semi-norm) was used as the sparsifying transform $\Phi$, and for the transform $\Psi$, a 2D Haar wavelet transform of level 7 (i.e., equal to the number of resolution levels in StyleGAN2) was utilized. For evaluating PICGM, the latent representation $w^{(PI)}$ of the prior image was computed with the help of the trained StyleGAN using the procedure introduced in Eq. (8). For each image, the regularization parameters for all the methods were tuned using either a line search or a grid search, depending upon the number of regularization parameters. The image estimates obtained by use of each of the four methods were quantitatively evaluated using root mean-squared error (RMSE) and structural similarity (SSIM).

Figure 5: Ensemble RMSE values in the non-inverse-crime setting. The error bars show a span of one standard deviation.
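The evaluation metrics can be sketched as follows. Note that the SSIM below is a simplified single-window (global) variant using the standard constants, whereas typical implementations (and presumably the study's) average a locally windowed index over the image.

```python
import numpy as np

def rmse(x, y):
    """Root mean-squared error between two images."""
    return float(np.sqrt(np.mean((x - y) ** 2)))

def global_ssim(x, y, data_range=1.0):
    """Simplified global SSIM: the standard SSIM formula with
    C1 = (0.01 L)^2 and C2 = (0.03 L)^2, evaluated once over the
    whole image rather than averaged over local windows."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    return float((2 * mx * my + c1) * (2 * cxy + c2)
                 / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))

f = np.random.default_rng(0).random((32, 32))
noisy = f + 0.05 * np.random.default_rng(1).standard_normal(f.shape)
```

A perfect reconstruction yields RMSE 0 and SSIM 1; degradations move RMSE up and SSIM below 1.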
6. Results
Figure 1 shows some of the reconstructed images for the inverse-crime case, with $n/m = 10$ and $n/m = 50$, respectively. The RMSE and SSIM values over the ensemble of images are shown in Fig. 2. It can be seen that the proposed method performs well in terms of RMSE and SSIM even in severely undersampled cases such as $n/m = 50$.

Figures 3 and 4 show the images reconstructed from simulated MRI measurements with undersampling ratios $n/m = 4$ and $n/m = 8$, respectively. Although both PLS-TV and CSGM have difficulty in properly recovering the image, CSGM produces noticeably better-defined boundaries. Also, note that the ground truth and the prior image are visually similar in certain regions and different in others. Visually, the performance of the proposed method is the best. It can also be seen from the difference maps that the PICCS method has a significant amount of error concentrated in the tumor region and at the boundaries, as compared to the other regions. This is potentially detrimental in a medical imaging scenario. Ensemble RMSE and SSIM values for the non-inverse-crime study are shown in Figures 5
Figure 6:
Ensemble SSIM values in the non-inverse-crime setting. The error bars show a span of one standard deviation.

and 6. Additional images are included in the supplementary file. Algorithm 1 takes around 5 minutes to converge to an optimal solution.
7. Discussion and Conclusion
Firstly, as seen in the numerical studies presented here, the object to be imaged and the prior image can be similar in some respects while differing in others, due to factors such as evolution of the object over time, changes in the exact configuration of the imaging instrument, the depth at which the tomogram slice is recorded, changes in the defining features due to the tumor, and small changes in pose and orientation. Such changes are not always structured in a predictable way, and hence cannot be modeled properly by the sparse-differences model assumed in the case of PICCS. PICGM does not suffer from this limitation, since optimizing over the style vector accounts for these differences.

The theoretical analysis presented here potentially offers a way to analyze the stability of the reconstructed estimate. In particular, the presented analysis applies to non-uniform recovery of in-distribution objects in the range of the generative model that are sufficiently regular, in the sense that the Jacobian of the generative network computed at their latent representation is well behaved. This means that small changes in the estimated $\hat{w}$ due to measurement noise do not correspond to large changes in the estimated image, which is what is obtained via the presented recovery guarantees. Hence, if the proposed algorithm converges to a solution that does not satisfy the properties presented in Lemma 4.1, it has likely converged to a highly unstable and nonlinear region of the optimization landscape. This provides a certain degree of graceful failure in the face of hallucinations that may occur in the image reconstruction. Lastly, a task-informed approach to evaluating image reconstruction algorithms is necessary, in addition to the traditional image quality metrics employed in this study.

Acknowledgements
The authors would like to thank Sayantan Bhadra and Weimin Zhou for their help. This work was supported in part by NIH Awards EB020604, EB023045, NS102213, EB028652, and NSF Award DMS1614305.
References
Abdal, R., Qin, Y., and Wonka, P. Image2StyleGAN: How to embed images into the StyleGAN latent space? In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4432–4441, 2019.

Adcock, B., Hansen, A. C., Poon, C., and Roman, B. Breaking the coherence barrier: A new theory for compressed sensing. Forum of Mathematics, Sigma, 5:e4 1–84, 2017.

Asim, M., Daniels, G., Leong, O., Ahmed, A., and Hand, P. Invertible generative models for inverse problems: mitigating representation error and dataset bias. In Proceedings of the International Conference on Machine Learning, pp. 4577–4587, 2020.

Barrett, H. H. and Myers, K. J. Foundations of Image Science. John Wiley & Sons, 2013.

Becquaert, M., Cristofani, E., Lauwens, B., Vandewal, M., Stiens, J. H., and Deligiannis, N. Online sequential compressed sensing with multiple information for through-the-wall radar imaging. IEEE Sensors Journal, 19(11):4138–4148, 2019.

Bora, A., Jalal, A., Price, E., and Dimakis, A. G. Compressed sensing using generative models. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pp. 537–546. JMLR.org, 2017.

Cai, H., Bai, C., Tai, Y.-W., and Tang, C.-K. Deep video generation, prediction and completion of human action sequences. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 366–382, 2018.

Candès, E. J. and Wakin, M. B. An introduction to compressive sampling. IEEE Signal Processing Magazine, 25(2):21–30, 2008.

Candès, E. J., Romberg, J. K., and Tao, T. Stable signal recovery from incomplete and inaccurate measurements. Communications on Pure and Applied Mathematics, 59(8):1207–1223, 2006. doi: 10.1002/cpa.20124.

Chen, G.-H., Tang, J., and Leng, S. Prior image constrained compressed sensing (PICCS): a method to accurately reconstruct dynamic CT images from highly undersampled projection data sets. Medical Physics, 35(2):660–663, 2008.

Dinh, L., Sohl-Dickstein, J., and Bengio, S. Density estimation using Real NVP. arXiv preprint arXiv:1605.08803, 2016.

Dumer, I. Covering an ellipsoid with equal balls. Journal of Combinatorial Theory, Series A, 113(8):1667–1676, 2006.

Hussein, S. A., Tirer, T., and Giryes, R. Image-adaptive GAN based reconstruction. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pp. 3121–3129, 2020.

Karras, T., Laine, S., and Aila, T. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4401–4410, 2019.

Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., and Aila, T. Analyzing and improving the image quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8110–8119, 2020.

Kelkar, V. A., Bhadra, S., and Anastasio, M. A. Compressible latent-space invertible networks for generative model-constrained image reconstruction. arXiv preprint arXiv:2007.02462, 2020.

Kingma, D. P. and Ba, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

Ledig, C., Theis, L., Huszár, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., Wang, Z., et al. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4681–4690, 2017.

Lustig, M., Donoho, D. L., Santos, J. M., and Pauly, J. M. Compressed sensing MRI. IEEE Signal Processing Magazine, 25(2):72–82, 2008.

Maas, A. L., Hannun, A. Y., and Ng, A. Y. Rectifier nonlinearities improve neural network acoustic models. In Proc. ICML, volume 30, pp. 3. Citeseer, 2013.

Marinescu, R. V., Moyer, D., and Golland, P. Bayesian image reconstruction using deep generative models. arXiv preprint arXiv:2012.04567, 2020.

Menon, S., Damian, A., Hu, S., Ravi, N., and Rudin, C. PULSE: Self-supervised photo upsampling via latent space exploration of generative models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2437–2445, 2020.

Mota, J. F., Deligiannis, N., and Rodrigues, M. R. Compressed sensing with prior information: Strategies, geometry, and bounds. IEEE Transactions on Information Theory, 63(7):4472–4496, 2017.

Prakash, C. D. and Karam, L. J. It GAN do better: GAN-based detection of objects on images with varying quality. arXiv preprint arXiv:1912.01707, 2019.

Schmainda, K. and Prah, M. Data from Brain-Tumor-Progression. The Cancer Imaging Archive, 2018. URL http://doi.org/10.7937/K9/TCIA.2018.15quzvnb.

Tewari, A., Elgharib, M., Bharaj, G., Bernard, F., Seidel, H.-P., Pérez, P., Zollhofer, M., and Theobalt, C. StyleRig: Rigging StyleGAN for 3D control over portrait images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6142–6151, 2020.

Vershynin, R. High-Dimensional Probability: An Introduction with Applications in Data Science, volume 47. Cambridge University Press, 2018.

Weizman, L., Eldar, Y. C., and Ben Bashat, D. Reference-based MRI. Medical Physics, 43(10):5357–5369, 2016.

Whang, J., Lei, Q., and Dimakis, A. G. Compressed sensing with invertible generative models and dependent noise. arXiv preprint arXiv:2003.08089, 2020.

Wulff, J. and Torralba, A. Improving inversion and generation diversity in StyleGAN using a gaussianized latent space. arXiv preprint arXiv:2009.06529, 2020.

Zbontar, J., Knoll, F., Sriram, A., Muckley, M. J., Bruno, M., Defazio, A., Parente, M., Geras, K. J., Katsnelson, J., Chandarana, H., et al. fastMRI: An open dataset and benchmarks for accelerated MRI. arXiv preprint arXiv:1811.08839, 2018.

Zhang, Y., Zhang, Y., and Cai, W. Separating style and content for generalized style transfer. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8447–8455, 2018.
A. Theoretical Analysis
As described in the main manuscript, the theoretical analysis presented here provides a non-uniform recovery guarantee, which applies to in-distribution objects that are in the range of the StyleGAN $G$, further constrained by styles from the prior image. In contrast to the theoretical results presented in (Bora et al., 2017), where the Lipschitz constant of the generator network $G$ is used to bound the number of measurements, the analysis presented here is in terms of the expected Frobenius norm of its Jacobian. Due to this, the presented analysis applies to generative networks having Lipschitz constants that are large compared to the typical scaling of errors due to the network, or generative networks that are not Lipschitz stable, such as StyleGAN2. The price paid is that the guarantee is non-uniform and allows for an additional network-dependent term in the reconstruction error. Nevertheless, as we show, the proposed guarantees are useful in analyzing the behavior of generative model-constrained reconstruction in general, and of PICGM in particular.

We begin by defining the notation used.

Notation A.1.
1. Let $p_w$ denote the distribution of the extended latent space vector $w = [u_1\; u_2\; \ldots\; u_L]^\top \in \mathcal{W}^+$, where $u_i = g_{\mathrm{mapping}}(z_i)$, $z_i \sim \mathcal{N}(0, I_k)$, the $z_i$'s are independently distributed, and $I_k$ denotes the real $k \times k$ identity matrix.

2. Recall that, as evidenced by (Wulff & Torralba, 2020), it can be assumed that if $w \sim p_w$, then $v = \mathrm{LReL}_\alpha(w)$ is distributed as a multivariate Gaussian. Let $\bar{v}$ and $\Sigma$ be its mean and covariance matrix, respectively. Recall that $\mathrm{LReL}_\alpha$ denotes the leaky-ReLU nonlinear activation, defined as
$$\mathrm{LReL}_\alpha(x)_i = \begin{cases} x_i, & x_i \geq 0,\\ \alpha x_i, & x_i < 0. \end{cases} \qquad (14)$$
3. Let $p_1, p_2$ be positive integer multiples of $k$, with $0 \leq p_1 < p_2 \leq K$. Let $\mathcal{W}^+_{p_1,p_2}$ be the $P$-dimensional subspace of $\mathcal{W}^+$ containing all $w$ such that $w_{1:p_1} = 0$ and $w_{p_2:K} = 0$, where $P = p_2 - p_1$.

4. Let
$$B^K_w(r) := \{ w \;\mathrm{s.t.}\; \|\mathrm{LReL}_\alpha(w) - \bar{v}\|_\Sigma \leq r \}, \qquad B^K_v(r) := \{ v \;\mathrm{s.t.}\; \|v - \bar{v}\|_\Sigma \leq r \}.$$
Similarly, let
$$B^{p_1,p_2}_w(r) := \left\{ w \;\mathrm{s.t.}\; w \in B^K_w(r),\; w_{1:p_1} = w^{(\mathrm{PI})}_{1:p_1},\; w_{p_2:K} = w^{(\mathrm{PI})}_{p_2:K} \right\},$$
$$B^{p_1,p_2}_v(r) := \left\{ v \;\mathrm{s.t.}\; v \in B^K_v(r),\; v_{1:p_1} = v^{(\mathrm{PI})}_{1:p_1},\; v_{p_2:K} = v^{(\mathrm{PI})}_{p_2:K} \right\},$$
where for a vector $x \in \mathbb{R}^K$, $\|x\|_\Sigma^2 := x^\top \Sigma^{-1} x$. Note that $\alpha > 1$ and $0 \leq p_1 < p_2 \leq K$.

5. Let $J(w)$ denote the Jacobian of the synthesis network $G$ evaluated at $w$. Let $J_{p_1:p_2}(w)$ denote the Jacobian of $G$ with respect to $w_{p_1:p_2}$, evaluated at $w$.

We first prove the following series of lemmas.
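The basic objects in Notation A.1 — the leaky-ReLU map of Eq. (14) and the $\Sigma$-weighted norm — can be written down directly. The following numpy sketch illustrates them; the slope and dimensions here are arbitrary toy values, not those of the trained StyleGAN2.

```python
import numpy as np

def lrelu(x, alpha):
    """Elementwise leaky ReLU (Eq. 14): x_i if x_i >= 0, alpha*x_i otherwise."""
    return np.where(x >= 0, x, alpha * x)

def sigma_norm(x, Sigma):
    """Sigma-weighted norm: ||x||_Sigma^2 = x^T Sigma^{-1} x."""
    return float(np.sqrt(x @ np.linalg.solve(Sigma, x)))

rng = np.random.default_rng(0)
K, alpha = 8, 5.0            # alpha > 1, as in a gaussianized latent space

w = rng.standard_normal(K)
v = lrelu(w, alpha)

# lrelu(., alpha) is invertible, with inverse lrelu(., 1/alpha).
assert np.allclose(lrelu(v, 1.0 / alpha), w)

# For identity covariance, the Sigma-norm reduces to the Euclidean norm.
assert np.isclose(sigma_norm(v, np.eye(K)), np.linalg.norm(v))
```

The invertibility of $\mathrm{LReL}_\alpha$ checked here is the property used later in the proof of Lemma A.1 to transfer a covering of $B^{p_1,p_2}_v(r)$ back to $B^{p_1,p_2}_w(r)$.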
Lemma A.1.
Let $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_K$ be the singular values of $\sqrt{\Sigma}$, and let $\sigma = [\sigma_1\; \sigma_2\; \ldots\; \sigma_K]^\top$. For $r > 0$, if $\mathcal{N}^{p_1,p_2}_w(\epsilon)$ is an optimal $\epsilon$-net of $B^{p_1,p_2}_w(r)$, then
$$\log |\mathcal{N}^{p_1,p_2}_w(\epsilon)| \leq P \log\left[\frac{6}{\epsilon}\left(\epsilon + \frac{r\|\sigma\|}{\sqrt{K}}\right)\right].$$

Proof.
First, note that if $w \sim p_w$, then the sub-vectors of $w$ corresponding to the different style inputs, i.e. $w_{lk+1:(l+1)k}$, $l = 0, \ldots, L-1$, are independent and identically distributed. This implies that the singular values of $\sqrt{\Sigma}$ are degenerate to a certain degree. Specifically,
$$\sigma_{jL+1} = \sigma_{jL+2} = \cdots = \sigma_{(j+1)L}, \quad j = 0, 1, \ldots, k-1. \qquad (15)$$

Let $\sigma'_1 \geq \sigma'_2 \geq \cdots \geq \sigma'_P$ be the singular values of $\sqrt{\mathrm{Cov}(v_{p_1:p_2})}$, and let $\sigma' = [\sigma'_1\; \sigma'_2\; \ldots\; \sigma'_P]^\top$. Therefore, by Eq. (15),
$$\frac{\|\sigma'\|}{\sqrt{P}} = \frac{\|\sigma\|}{\sqrt{K}}. \qquad (16)$$

Note that $B^{p_1,p_2}_v(r)$ is an ellipsoid with center $\bar{v}_{p_1:p_2}$ and principal radii of lengths $\sigma'_i r$, $i = 1, 2, \ldots, P$. Assume for a moment that there exists an integer $p$ such that $\sigma'_p r > \epsilon \geq \sigma'_{p+1} r$. Let $\mathcal{N}_v(\epsilon)$ be an optimal $\epsilon$-net of $B^{p_1,p_2}_v(r)$. Therefore, by Theorem 2 in (Dumer, 2006),
$$\begin{aligned}
\log |\mathcal{N}_v(\epsilon)| &\leq \sum_{i=1}^{p} \log\left(\frac{r\sigma'_i}{\epsilon}\right) + P \log 6\\
&= \log\left[\epsilon^{-P} \prod_{i=1}^{p} (r\sigma'_i) \prod_{i=p+1}^{P} \epsilon\right] + P \log 6\\
&\leq P \log\left[\frac{1}{\epsilon}\left(\epsilon + \frac{r}{P}\sum_{i=1}^{P}\sigma'_i\right)\right] + P \log 6 \quad \text{(by the AM--GM inequality)}\\
&\leq P \log\left[\frac{1}{\epsilon}\left(\epsilon + \frac{r\|\sigma'\|}{\sqrt{P}}\right)\right] + P \log 6 \quad \text{(by the AM--RMS inequality)}\\
&= P \log\left[\frac{6}{\epsilon}\left(\epsilon + \frac{r\|\sigma\|}{\sqrt{K}}\right)\right] \quad \text{(by Eq. (16))}.
\end{aligned}$$
Observe that this bound is valid even if $\epsilon > \sigma'_1 r$ or $\epsilon < \sigma'_P r$. Since $\alpha > 1$, $\mathrm{LReL}_{1/\alpha}(\cdot)$ is a bijective function with Lipschitz constant 1. Therefore, for every $v_1 = \mathrm{LReL}_\alpha(w_1)$ and $v_2 = \mathrm{LReL}_\alpha(w_2)$, $\|w_1 - w_2\| \leq \|v_1 - v_2\|$.
Therefore,
$$\log |\mathcal{N}^{p_1,p_2}_w(\epsilon)| \leq P \log\left[\frac{6}{\epsilon}\left(\epsilon + \frac{r\|\sigma\|}{\sqrt{K}}\right)\right]. \qquad (17)$$

Now, the following assumptions about the StyleGAN are made.

1. Path-length regularity:
$$\mathbb{E}_{w \sim p_w}\left( \|J(w)\|_F - a \right)^2 < b, \qquad \mathrm{(AS1)}$$
where $b > 0$ and $a = \mathbb{E}_w \|J(w)\|_F$ are global constants. As described in the main manuscript, this assumption is inspired by the path-length regularization used in (Karras et al., 2020). Although during training $a$ is implemented as $\mathbb{E}_w \mathbb{E}_{y \sim \mathcal{N}(0, I_n)} \|J(w)^\top y\|$, by the Hanson--Wright inequality this concentrates to $\mathbb{E}_w \|J(w)\|_F$ when $y$ is high-dimensional (Vershynin, 2018). The value of $a$ was estimated by empirically evaluating $\mathbb{E}_{w \sim p_w} \mathbb{E}_{y \sim \mathcal{N}(0, I_n)} \|J(w)^\top y\|$ using 100 samples $w$ drawn from $p_w$ and 100 samples $y$ drawn from $\mathcal{N}(0, I_n)$ for each sample of $w$; $b$ was empirically estimated over the same set of samples using Eq. (AS1). The values of $a$ and $\sqrt{b}$ were estimated to be around 80.1 and 16.7, respectively, for the specific StyleGAN2 trained and used in this study.

2. Approximate local linearity:
$$\mathbb{E}_{w \sim p_w} \max_{w' : \|w' - w\| \leq \epsilon} L(w', w) \leq \beta(\epsilon), \qquad \mathrm{(AS2)}$$
where $L(w', w) = \|G(w') - G(w) - J(w)(w' - w)\|$. This property essentially measures how close $G$ is to its linear approximation in an $\epsilon$-neighborhood around a point $w$. For ease of notation, we write
$$\phi_{p_1,p_2}(\epsilon; w) := \max_{\substack{w' : \|w' - w\| \leq \epsilon \\ w' - w \in \mathcal{W}^+_{p_1,p_2}}} L(w', w), \qquad (18)$$
with $\phi_{p_1,p_2}(\epsilon; w) \geq 0$.
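The identity underlying the estimate of $a$ — that $\mathbb{E}_y \|J(w)^\top y\|$ concentrates around $\|J(w)\|_F$ for high-dimensional $y \sim \mathcal{N}(0, I_n)$ — can be checked numerically. The sketch below substitutes a fixed random matrix for the StyleGAN2 Jacobian; all sizes are arbitrary toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 2000, 32                  # "image" and latent dimensions (toy values)

J = rng.standard_normal((n, K))  # stand-in for the Jacobian of G at some w

# Monte-Carlo estimate of E_y ||J^T y|| over y ~ N(0, I_n).
ys = rng.standard_normal((500, n))
est = np.mean(np.linalg.norm(ys @ J, axis=1))

frob = np.linalg.norm(J)         # ||J||_F
print(est / frob)                # close to 1, per the concentration argument
```

Since $\mathbb{E}_y \|J^\top y\|^2 = \|J\|_F^2$ exactly, the ratio printed above deviates from 1 only through the concentration of the norm around its mean and Monte-Carlo error.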
Approximate estimates of $\beta(\epsilon)$ were obtained for several values of $\epsilon$ by first computing the Jacobian at a point $w \sim p_w$, and then iteratively maximizing $L(w', w)$ using a projected gradient ascent-type algorithm. Figure 7 shows the plot of $\beta(\epsilon)$ versus $\epsilon$ estimated over a dataset of 100 samples $w$ from $p_w$ for the StyleGAN2 trained and used in this work.

Lemma 4.1. If $w$ is a sample from $p_w$, then it satisfies the following three properties with probability at least $1 - O(1/K)$:
$$\|J(w)\|_F \leq \sqrt{K}\, a, \qquad \mathrm{(P1)}$$
$$\phi_{0,K}(\epsilon; w) \leq \sqrt{K}\, \beta(\epsilon), \qquad \mathrm{(P2)}$$
$$\left\| \Sigma^{-1/2} \left( \mathrm{LReL}_\alpha(w) - \bar{v} \right) \right\| \leq \sqrt{K}\,(1 + o(1)). \qquad \mathrm{(P3)}$$

Figure 7: Plot of $\beta(\epsilon)$ versus $\epsilon$.
For $w$ sampled from $p_w$, P1 and P2 are true with probability at least $1 - \frac{a^2 + b}{K a^2}$ and $1 - 1/K$, respectively, due to Markov's inequality. P3 is true with probability at least $1 - \Omega(e^{-cK})$ due to concentration of the norm. Therefore, by the union bound, $w$ satisfies all three properties with probability at least $1 - O(1/K)$.

As a consequence of Lemma 4.1, for $w_s = [\, w^{(\mathrm{PI})\top}_{1:p_1} \;\; w^\top_{p_1:p_2} \;\; w^{(\mathrm{PI})\top}_{p_2:K} \,]^\top$, the following properties also hold with probability at least $1 - O(1/K)$ if $w, w^{(\mathrm{PI})} \sim p_w$:
$$\|J_{p_1:p_2}(w_s)\|_F \leq \sqrt{K}\, a, \qquad \mathrm{(TP1)}$$
(since $J_{p_1:p_2}(w_s)$ is a sub-matrix of $J(w_s)$), and
$$\phi_{p_1,p_2}(\epsilon; w_s) \leq \sqrt{K}\, \beta(\epsilon). \qquad \mathrm{(TP2)}$$

If $w$ satisfies properties P1, P2 and P3, then $G(w)$ will be referred to as an in-distribution image in the range of $G$, since these are the properties of a typical sample from the StyleGAN.
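The projected gradient ascent procedure used above to estimate $\beta(\epsilon)$ — maximize the linearization error $L(w', w)$ over an $\epsilon$-ball around $w$ — can be sketched on a toy smooth map standing in for $G$. The map, step size, and iteration counts below are illustrative assumptions, not the settings used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)
K, n = 8, 32
A = rng.standard_normal((n, K)) / np.sqrt(K)

def G(w):
    return np.tanh(A @ w)                     # toy smooth "generator"

def J(w):
    return (1.0 - np.tanh(A @ w) ** 2)[:, None] * A   # analytic Jacobian of G

def beta_hat(w, eps, steps=500, lr=0.05):
    """Estimate max_{||w'-w||<=eps} ||G(w') - G(w) - J(w)(w'-w)|| by
    projected gradient ascent on the squared linearization error."""
    Jw = J(w)
    d = rng.standard_normal(K)
    d *= eps / np.linalg.norm(d)              # start on the eps-sphere
    for _ in range(steps):
        wp = w + d
        r = G(wp) - G(w) - Jw @ d             # linearization residual
        d += lr * (J(wp) - Jw).T @ r          # ascend on 0.5*||r||^2
        nd = np.linalg.norm(d)
        if nd > eps:                          # project back onto the eps-ball
            d *= eps / nd
    return np.linalg.norm(G(w + d) - G(w) - Jw @ d)

w0 = rng.standard_normal(K)
vals = [beta_hat(w0, e) for e in (0.1, 0.5, 1.0)]
print(vals)                                   # shrinks toward 0 as eps -> 0
```

Averaging such estimates over many samples of $w$ would give the empirical curve of $\beta(\epsilon)$ versus $\epsilon$ in the spirit of Fig. 7; being a local ascent, the procedure yields a lower bound on the true maximum.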
Let $\tilde{B}^{p_1,p_2}_w(r)$ be the set of all points in $B^{p_1,p_2}_w(r)$ satisfying properties P1 and P2. Lemma A.2.
Let $0 < \delta \leq \frac{a r \|\sigma\|}{\sqrt{K}}$. Let $\tilde{\mathcal{N}}_f(\delta)$ be a discrete set in $G(\tilde{B}^{p_1,p_2}_w(r))$ such that for every $f \in G(\tilde{B}^{p_1,p_2}_w(r))$, there exists an $f' \in \tilde{\mathcal{N}}_f(\delta)$ such that
$$\|f - f'\| \leq \delta + \sqrt{K}\,\beta(\delta/a). \qquad (19)$$
Then,
$$\log |\tilde{\mathcal{N}}_f(\delta)| \leq P \log\left( \frac{C\, a r \|\sigma\|}{\sqrt{P}\,\delta} \right), \qquad (20)$$
where $C$ is an absolute constant. Proof.
The outline of this proof is as follows. First, a $\delta/a$-covering of $\tilde{B}^{p_1,p_2}_w(r)$ is constructed. Then, each of the spherical balls covering $\tilde{B}^{p_1,p_2}_w(r)$ is transformed into approximately an ellipsoid, depending upon the Jacobian of $G$ at the center of the spherical ball. Each of these ellipsoids is then approximately covered by a $\delta$-net. The collection of all such approximate $\delta$-nets covering the individual ellipsoids gives an approximate $\delta$-net over $G(\tilde{B}^{p_1,p_2}_w(r))$, which is the required result.

Let $\epsilon = \delta/a$. Let $\tilde{\mathcal{N}}_w(\epsilon)$ be an optimal $\epsilon$-covering of $\tilde{B}^{p_1,p_2}_w(r)$, and let $\mathcal{N}_w(\epsilon)$ denote an optimal $\epsilon$-covering of $B^{p_1,p_2}_w(r)$. Since $\tilde{B}^{p_1,p_2}_w(r) \subseteq B^{p_1,p_2}_w(r)$ (Vershynin, 2018),
$$\log |\tilde{\mathcal{N}}_w(\epsilon)| \leq \log |\mathcal{N}_w(\epsilon/2)| \leq P \log\left[\frac{12}{\epsilon}\left(\frac{\epsilon}{2} + \frac{r\|\sigma\|}{\sqrt{K}}\right)\right] \quad \text{(using Lemma A.1)}.$$

Now, consider a point $w_1 \in \tilde{\mathcal{N}}_w(\epsilon)$. For every $w' \in \tilde{B}^{p_1,p_2}_w(r)$ such that $\|w' - w_1\| \leq \epsilon$, we have
$$\|G(w') - G(w_1)\| \leq \|J(w_1)(w' - w_1)\| + \sqrt{K}\,\beta(\epsilon). \qquad (21)$$
Therefore, up to an error of $\sqrt{K}\,\beta(\epsilon)$, $G(w') - G(w_1)$ lies in an ellipsoid $E(w_1)$ with principal radii $\varsigma_1 \epsilon, \varsigma_2 \epsilon, \ldots, \varsigma_P \epsilon$, where $\varsigma_1 \geq \varsigma_2 \geq \cdots \geq \varsigma_P$ are the singular values of $J_{p_1:p_2}(w_1)$.

Let $\mathcal{N}_\delta(w_1)$ be a $\delta$-covering of $E(w_1)$. For a moment, assume that there exists an integer $p$ such that $\varsigma_p > a \geq \varsigma_{p+1}$. Then, from Theorem 2 in (Dumer, 2006), we have
$$\begin{aligned}
\log |\mathcal{N}_\delta(w_1)| &\leq \sum_{i=1}^{p} \log\left(\frac{\varsigma_i \epsilon}{\delta}\right) + P \log 6 = \sum_{i=1}^{p} \log\left(\frac{\varsigma_i}{a}\right) + P \log 6\\
&= \log\left[ a^{-P} \prod_{i=1}^{p} \varsigma_i \prod_{i=p+1}^{P} a \right] + P \log 6\\
&\leq P \log\left[\frac{1}{a}\left(\frac{\|\varsigma\|}{\sqrt{P}} + a\right)\right] + P \log 6 \quad \text{(by the AM--GM and AM--RMS inequalities)}\\
&\leq P \log\left[\frac{1}{a}\left(\frac{\sqrt{K}\,a}{\sqrt{P}} + a\right)\right] + P \log 6 \quad \text{(using Property TP1)}\\
&\leq P \log\left(12\sqrt{\frac{K}{P}}\right).
\end{aligned}$$
Note that this bound holds even when $a \geq \varsigma_1$ or $a \leq \varsigma_P$.

Therefore, for every $w'$ such that $\|w' - w_1\| \leq \epsilon$, there exists a point $f_1 \in \mathcal{N}_\delta(w_1)$ such that
$$\|G(w_1) + J(w_1)(w' - w_1) - f_1\| \leq \delta,$$
which, by the triangle inequality and Eq. (21), implies
$$\|G(w') - f_1\| \leq \delta + \sqrt{K}\,\beta(\delta/a). \qquad (22)$$
This holds for all $w_1 \in \tilde{\mathcal{N}}_w(\epsilon)$. Therefore, a suitable candidate set for $\tilde{\mathcal{N}}_f(\delta)$ is
$$\tilde{\mathcal{N}}_f(\delta) = \bigcup_{w_1 \in \tilde{\mathcal{N}}_w(\epsilon)} \mathcal{N}_\delta(w_1). \qquad (23)$$
Therefore,
$$\begin{aligned}
\log |\tilde{\mathcal{N}}_f(\delta)| &\leq P \log\left(12\sqrt{\frac{K}{P}}\right) + P \log\left[\frac{12a}{\delta}\left(\frac{\delta}{2a} + \frac{r\|\sigma\|}{\sqrt{K}}\right)\right]\\
&\leq P \log\left(12\sqrt{\frac{K}{P}}\right) + P \log\left(\frac{18\, a r \|\sigma\|}{\delta\sqrt{K}}\right) \quad \left(\text{using } \tfrac{\delta}{2a} \leq \tfrac{r\|\sigma\|}{2\sqrt{K}}\right)\\
&\leq P \log\left(\frac{C\, a r \|\sigma\|}{\sqrt{P}\,\delta}\right),
\end{aligned}$$
which proves Eq. (20) with $C = 216$.

Proof of Theorem 4.1: First, we note that the result of Lemma A.2 applies to a decreasing sequence of $\delta$'s, $\delta_i = \delta_1/2^i$. Also, from Fig. 7, we note that $\beta(\epsilon)$ grows polynomially with $\epsilon$, with $\beta(0) = 0$. Due to these, Lemma 8.2 in (Bora et al., 2017) can be reformulated as follows, with the proof proceeding similarly to (Bora et al., 2017).

Lemma A.3.
Let $H \in \mathbb{R}^{m \times n}$ be a matrix with i.i.d. Gaussian entries having mean 0 and variance $1/m$. Let $0 < \delta \leq \frac{a r \|\sigma\|}{\sqrt{K}}$. For all $\delta' < \delta$, let $\beta(\delta'/a)$ grow polynomially in $\delta'/a$, with $\beta(0) = 0$. If
$$m = \Omega\left(P \log \frac{a r \|\sigma\|}{\delta}\right), \qquad (24)$$
then for any $f \in G(\tilde{B}^{p_1,p_2}_w(r))$, if $f' = \arg\min_{\hat{f} \in \tilde{\mathcal{N}}_f(\delta)} \|f - \hat{f}\|$, then $\|H(f - f')\| \leq O(\delta + \sqrt{K}\,\beta(\delta/a))$ with probability $1 - e^{-\Omega(m)}$.

The proof proceeds similarly to that of Lemma 8.2 in (Bora et al., 2017). Furthermore, similar to Lemma 4.1 in (Bora et al., 2017), Lemma A.2 and Lemma A.3 give rise to the following set-restricted eigenvalue condition of $H$ on $G(\tilde{B}^{p_1,p_2}_w(r))$:

Lemma A.4 (Set-restricted eigenvalue condition). Let $0 < \tau < 1$. Let $H \in \mathbb{R}^{m \times n}$ be a matrix with i.i.d. Gaussian entries with mean 0 and variance $1/m$. Let $0 < \delta \leq \frac{a r \|\sigma\|}{\sqrt{K}}$. For all $\delta' < \delta$, let $\beta(\delta'/a)$ grow polynomially in $\delta'/a$, with $\beta(0) = 0$. If
$$m = \Omega\left(\frac{P}{\tau^2} \log \frac{a r \|\sigma\|}{\delta}\right), \qquad (25)$$
then $H$ satisfies the S-REC$\left(G(\tilde{B}^{p_1,p_2}_w(r)),\; 1 - \tau,\; \delta + \sqrt{K}\,\beta(\delta/a)\right)$ with probability $1 - e^{-\Omega(\tau^2 m)}$.

Lemma A.4, Lemma 4.3 in (Bora et al., 2017), and Lemma 4.1 together imply Theorem 4.1, which is restated here for convenience.
Theorem 4.1.
Let $H \in \mathbb{R}^{m \times n}$ satisfy the S-REC$\left(G(\tilde{B}^{p_1,p_2}_w(r)),\; \gamma,\; \delta + \sqrt{K}\,\beta(\delta/a)\right)$. Let $n \in \mathbb{R}^m$. Let $w, w^{(\mathrm{PI})} \sim p_w$, and let $f^{(\mathrm{PI})} = G(w^{(\mathrm{PI})})$ be the known prior image. Let $\tilde{w} = [\, w^{(\mathrm{PI})\top}_{1:p_1} \;\; w^\top_{p_1:p_2} \;\; w^{(\mathrm{PI})\top}_{p_2:K} \,]^\top$, and let $\tilde{f} = G(\tilde{w})$ represent the object to-be-imaged. Let $g = H\tilde{f} + n$ be the imaging measurements, and let
$$\hat{f} = \arg\min_{f \in G(\tilde{B}^{p_1,p_2}_w(r))} \|g - Hf\|. \qquad (26)$$
Then,
$$\|\hat{f} - \tilde{f}\| \leq \frac{1}{\gamma}\left(2\|n\| + \delta + \sqrt{K}\,\beta(\delta/a)\right) \qquad (27)$$
with probability $1 - O(1/K)$.

B. Additional Figures
B.1. Inverse crime study: n/m = 5

Figure 8: Ground truth, prior image, and image estimated from Gaussian measurements with n/m = 5 using the proposed approach in the inverse crime case.
B.2. Inverse crime study: n/m = 10

Figure 9: Ground truth, prior image, and image estimated from Gaussian measurements with n/m = 10 using the proposed approach in the inverse crime case.
B.3. Inverse crime study: n/m = 20

Figure 10: Ground truth, prior image, and image estimated from Gaussian measurements with n/m = 20 using the proposed approach in the inverse crime case.
B.4. Inverse crime study: n/m = 50

Figure 11: Ground truth, prior image, and image estimated from Gaussian measurements with n/m = 50 using the proposed approach in the inverse crime case.
B.5. Non-inverse-crime study: n/m = 4

Figure 12: Ground truth, prior image, and images reconstructed from simulated MRI measurements with n/m = 4 (PLS-TV, CSGM, PICCS, and PICGM), along with difference images, for the non-inverse-crime study.
B.6. Non-inverse-crime study: n/m = 8

Figure 13: Ground truth, prior image, and images reconstructed from simulated MRI measurements with n/m = 8 (PLS-TV, CSGM, PICCS, and PICGM), along with difference images, for the non-inverse-crime study.
B.7. Non-inverse-crime study: n/m = 12

Figure 14: Ground truth, prior image, and images reconstructed from simulated MRI measurements with n/m = 12 (PLS-TV, CSGM, PICCS, and PICGM), along with difference images, for the non-inverse-crime study.