[PDF] Beyond multi-view deconvolution for inherently-aligned fluorescence tomography

Abstract

In multi-view fluorescence microscopy, each angular acquisition needs to be aligned with care to obtain an optimal volumetric reconstruction. Here, instead, we propose a neat protocol based on auto-correlation inversion, that leads directly to the formation of inherently aligned tomographies. Our method generates sharp reconstructions, with the same accuracy reachable after sub-pixel alignment but with improved point-spread-function. The procedure can be performed simultaneously with deconvolution further increasing the reconstruction resolution.

Full PDF

BBeyond multi-view deconvolution forinherently-aligned ﬂuorescence tomography

Daniele Ancora , Gianluca Valentini , Antonio Pifferi , and Andrea Bassi Dipartimento di Fisica, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133 Milano, Italy Istituto di Fotonica e Nanotecnologie, Consiglio Nazionale delle ricerche, Piazza Leonardo da Vinci 32, 20133Milano, Italy * corresponding author: [email protected] ABSTRACT

In multi-view ﬂuorescence microscopy, each angular acquisition needs to be aligned with care to obtain an optimal volumetricreconstruction. Here, instead, we propose a neat protocol based on auto-correlation inversion, that leads directly to theformation of inherently aligned tomographies. Our method generates sharp reconstructions, with the same accuracy reachableafter sub-pixel alignment but with improved point-spread-function. The procedure can be performed simultaneously withdeconvolution further increasing the reconstruction resolution.

Introduction

The ﬁeld of Computed Tomography (CT) experienced a silent revolution during the last decade. A strong demand driven bydeep-learning and data-mining has prompted hardware manufacturers to improve computing performances while keeping theprice affordable. Nowadays, high throughput computation is possible with graphic-card units (GPU). GPUs allow paralleldata-processing with performances beyond belief just a few years ago, radically changing the ﬁeld of signal processing. Inparticular, standard image processing tasks such as Fourier-transformation, convolution, and matrix operations experience aconstant-rate performance increase . GPUs are the ideal solution for the massive image processing tasks required by CT . Atvisible wavelengths, optical projection tomography (OPT) is an example of a CT technique applied for tomographic studies atmicroscopic level . By rotating the specimen and collecting its optical projections at multiple angles, it is possible to formthe reconstruction of the specimen via CT inversion. Another optical technique, light-sheet ﬂuorescence microscopy (LSFM),offers a straightforward way to optically section the sample, for the inspection of its internal structure . Even if LSFM is a directtomographic technique (i.e. it does not strictly require computation to generate a section of the sample) it is often desirableto observe the object from different angles to increase the quality of the reconstruction. LSFM suffers from non-isotropicresolution (the axial resolution is lower that the lateral) and in many cases, the sample is not visible as a whole, due to tissuescattering or absorption. Multi-view approaches address these problems, either relying on the sample rotation or exploitingmultiple objectives to observe the specimen from different angles . Before their fusion, each acquisition is registered (aligned)against a chosen reference , to correctly overlap the information captured at different angles. The registration is usuallyaccomplished by locating the best overlap between the volumes, eventually including beads around the specimen to enforce thealignment ﬁdelity . Here we discuss a new reconstruction strategy for the formation of an inherently-aligned tomographicview of a specimen. We exploit the property of the auto-correlation (we indicate it by the operator A ) to avoid any alignmentprocedure. At the same time, we demonstrate that the auto-correlation based reconstruction brings an improved resolution, dueto the rejections of second order-correlations of the point-spread-function in the A -space. The work is inspired by previousresults in OPT, where the auto-correlation is used to perform alignment-free reconstructions , because A commutes with theprojection operator . Here, instead, we calculate a tomographic auto-correlation of the sample based on multi-view light-sheetacquisitions. Fusing them, leaves us with an ensembled A , created without aligning the views. This constitutes our startingpoint for the reconstruction: by inverting A , we form a tomographic view aligned at sub-pixel level. Furthermore, we provethat this inversion turns into a reconstruction which is sharper than the average fusion carried out in direct space. Since itis desirable to take into account the resolution-loss determined by the ﬁnite aperture of the optical system, our protocol canfurther accomplish simultaneous deconvolution with a modiﬁed Bayesian A inversion. To implement our protocols, the use ofpowerful GPUs is essential, otherwise this problem would just remain a mere theoretical exercise. a r X i v : . [ ee ss . I V ] F e b igure 1. Reconstruction pipeline. A) Rendering of the reference view taken at ϕ = ◦ . The planes indicate the xy -cameraacquisition along the z -scanning direction. B) Orthogonal detection by rotating the sample at ϕ = ◦ . C) Aligned-average of12 measurements. The axes are taken according to the reference view ( x -lateral, y -transverse, z -longitudinal) D)Auto-correlation of the view at 0 ◦ . E) A of the view at 90 ◦ . F) A averaged through 12 angles. G) Reconstructions obtained byusing deauto-correlation methods For visual comparison, the upper part shows the result using the Schultz-Snyder protocol, thebottom one compares it with that of the Anchor-Update method. Results

Our reconstruction strategy is grounded on the property of the auto-correlation, that is centered in the shift-space. Eachobservation of the object is auto-correlated and concurs to the formation of the tomographic average A of the sample. Let ususe the subscript µ to indicate the stack obtained by camera acquisitions, and the superscript ϕ i to denote its angular orientationindexed by i . In an experimental measurement, we observe a blurred version of the object due to the point spread function(PSF) of the system h , further corrupted by the presence of the noise ε . We assume an additive ε that can be neglected in caseof high signal-to-noise ratio measurements. Now, let us arrange the auto-correlation in a more convenient form. Applying theoperator A to a given stack (Supplement Materials), we have that: χ µ ≡ A { o µ } = o µ (cid:63) o µ = ( o ∗ h ) (cid:63) ( o ∗ h ) (1) = χ ∗ H = o ∗ K . (2)Here, χ = o (cid:63) o is the ideal auto-correlation of the object, H = A { h } is the PSF in auto-correlation space and K = o (cid:63) H . Theﬁrst equality in Eq. 2 implies that the auto-correlation of the ideal object is blurred by a kernel H given by the auto-correlationof the direct space PSF h . The second indicates that χ µ can be seen as a convolution of the object with a blurring kernel thatcontains itself. We consider N evenly rotated measurements that we denote with the index ϕ i . The only pre-processing steprequired is the rotation of each measurement back to the reference angle 0 ◦ by − ϕ i . We subtract the mean value of a darkregion where the sample is not present. Denoting each observation as o ϕµ , and its corresponding A as χ ϕµ , the quantities ofinterest are the averages: o µ = N N ∑ i = o ϕ i µ , χ µ = N N ∑ i = χ ϕ i µ and H = N N ∑ i = H ϕ i . (3)Before computing the fusion as o µ (we use this as the ground truth), each measurement requires a perfect alignment against thereference. The χ µ is accurate, instead, because auto-correlations are centered by deﬁnition. Ideally, this implies that we canobtain an intrinsically aligned average-reconstruction from χ µ , provided that we have a robust way to carry out the inversion o ρ = A − { χ µ } . This problem falls within the class of phase retrieval (PR), since we have access to the Fourier modulus of areal object but the phase information is missing . With these two quantities in hand, we try to extrapolate intrinsically-alignedreconstructions by inverting the A with two schemes: ). Given χ µ = o ρ (cid:63) o ρ , ﬁnd o ρ ;II). Given χ µ = ( o (cid:63) o ) ∗ H , deblur it by H and ﬁnd o .At a ﬁrst glance, only the second scheme that deconvolves the PSF appears to provide a super-resolved reconstruction. However,scheme I) implies something even more interesting that we describe in Fig. 2. By averaging o ϕ i µ in direct space, the resultingvolume gets blurred by an average PSF given by h = / N ∑ i h ϕ i (Fig. 2A). Its A { h } = h (cid:63) h can be visualized in Fig. 2B. Byaveraging auto-correlations, instead, we neglect second order cross-terms of the PSF which introduce long-correlations in thefused image. For comparison, the corresponding A -PSF is shown in Fig. 2C. As a consequence, by solving for o ρ , we achievean effective PSF that is sharper than h . Interestingly, this is an implicit property that comes along with the average of multipleviews of A , thus the knowledge of the PSF is not required at all. However, for comparison, we show the effective point-spreadfunction achieved h eff = A − { H } in Fig. 2D. For a detailed discussion, see the Supplement Materials. We decided to tacklethe scheme I) by using the Schultz-Snyder (SS) iterations : o t + = o t (cid:104)(cid:16) χ µ o t (cid:63) o t (cid:17) ∗ (cid:101) o t + (cid:16) χ µ o t (cid:63) o t (cid:17) (cid:63) o t (cid:105) . (4)For the scheme II), instead, we implement the Anchor-Update (AU) protocol that was developed ad-hoc for this purpose: o t + = o t (cid:104)(cid:16) χ µ o t ∗ K t (cid:17) ∗ (cid:102) K t (cid:105) , updating: K t = o t (cid:63) H . (5)Both are ﬁxed-point iterative Bayesian methods, having the number of iterations as the only parameter to set. In the presentcase, we set a high number of 5 · iterations for both, since these methods are extremely stable and can withstand long runs.This is also a drawback, since these algorithms suffer from slow convergence-rate because each update t + t .To delve into the proposed method, let us consider the volumetric acquisition taken with an LSFM setup of a clearedmouse popliteal lymph node. We are interested in reconstructing the three-dimensional vasculature, which was stained witha ﬂuorescent label. The stack o ϕµ constitutes a single volumetric view of the specimen and contains the camera detectionsof the sample scanned through the light sheet. We use 12 volumes by rotating the sample in steps of 30 ◦ . In Fig. 1A-B, werender o ϕµ acquired at 0 ◦ and 90 ◦ . In a normal multi-view reconstruction algorithm, every dataset has to be aligned against thereference view (we assume this to be at ϕ = ◦ ). A consolidated strategy (accurate at pixel level) is to locate the maximumof the cross-correlation between the reference and the view, translating it back accordingly. However, the researcher may belooking for sub-pixel accuracy, which would require to upsample the volume accordingly to the resolution that he wants toreach . This makes the size of the problem rapidly explode, leaving the user with an up-sampled estimation (with respectthe original measurement) that needs to be down-sampled for the formation of the ﬁnal image. Here instead, we produce amulti-view reconstruction that is accurate at sub-pixel level and directly formed at the original resolution. We do not calculateany volume translation, we simply process the reconstruction altogether starting from its auto-correlation χ µ .We analyze two experimental situations. In Fig 3 and 4, we report the results obtained on two regions of the specimen. Theﬁrst contains the whole specimen and corresponds to a volume of 512 voxels, with size of ( µ m ) . The second volumetakes a region of interest of 256 × ×

128 voxels, with size of 330 µ m × µ m × µ m . Convolutions and correlations areimplemented via Fast-Fourier Transform (FFT) spectral decomposition. The GPU implementation is essential to perform suchreconstructions, since the method relies on intensive usage of 3D-FFT. We implemented the code in Python by using the CuPylibrary, which provides a ﬂexible CUDA framework for matrix operations. The problems were tackled using a single nVidiaTitan RTX, equipped with 4608 CUDA cores and 24 gigabytes of RAM. For the ﬁrst volume, each step is accomplished in Figure 2.

Point-spread-functions analysis. A) PSF that blurs o µ . B) Auto-correlation of h . C) PSF H that blurs the average χ µ . D) Corresponding PSF in the object domain, sharper than h . igure 3. Comparison of the methods used to reconstruct a mouse popliteal lymph node vasculature. The quality of the ﬁrstrow improves in the central and in the bottom rows. Results of aligned mean shown in a transverse (A), longitudinal (B) andlateral (C) direction. The shown data are color-encoded maximum intensity projection (MIP) along each spatial coordinate.In all these MIPs, the color indicates the depth at which the corresponding feature is located. The small letters indicate thecropped volume . The cropped regions located within the whole specimen are framed with white boxes. The scale bar is100 µ m . A,a) Transverse view of the volume o µ averaged and aligned by cross-correlation (Rendered in Fig.1, viewed from thetop). B,b) Longitudinal (or side) view projection. C,c) Lateral (or front) view. D,d) Transverse, E,e) longitudinal, and F,f)lateral projections of the volume o ρ reconstructed with SS. G,g) Transverse, H,h) longitudinal, and I,i) lateral projections of thevolume o deconvolved with AU.0 .

48 seconds, while for the second one it needs 0 .

05 seconds. We choose the reference view at angle ϕ = ◦ for the initialguess at t =

0. The results obtained for the reconstructions of the whole specimen were rendered in the previous Fig. 1G,where the top-half is the result of SS and the bottom-half is the result of AU. To compare the different results, we show themaximum intensity projection (MIP) along each spatial coordinate. The top row of Fig. 3 shows the ground-truth reconstruction,obtained by averaging the views previously aligned by locating the peak of their cross-correlation. The second row of Fig.3, instead, shows the results of SS iterations. Since the reconstruction is formed from an inherently aligned auto-correlation,the features of the specimen are ﬁnely resolved with respect to the ground truth. Compared to the o µ , the reconstructed o ρ iscrisp, with ﬁne features better isolated from an intensity background. This is due to the sharper PSF h eff implied by the usageof the auto-correlation. The third row of Fig. 3 displays the results obtained with AU, deconvolving H from the estimatedauto-correlation. The ﬁnal effect is a deblurring of the reconstruction with respect to the SS. The validity of the method canbe further assessed by examining a tomographic slice taken through the middle of the full-resolution volume. In Fig. 4A, weshow the ground truth result of the aligned and averaged volume. We have chosen a detailed region which displays a bifurcatedblood vessel, and a smaller circular opening located at the bottom. Fig. 4B slices exactly the same plane of the volume o ρ after the inversion of χ µ via SS iterations. If we compare this with the ground-truth, we observe a clear improvement of thereconstruction quality. Having correctly reinterpreted sub-pixel misalignment and with a neat PSF, the image is rich in detailsand well contrasted, where the ground truth appears fuzzier. On the other hand, Fig. 4C shows the reconstruction of the samevolume by using AU. Here, it is possible to appreciate the deblurring effect that leaves us with a highly-resolved reconstruction.To assess the qualitative verdict of our analysis, we examine the smaller detail of the vessel located at the bottom of the panel C.The region of interest is displayed in panel E as a reference in the case of AU reconstruction. We draw a line proﬁle in themiddle of it and we plot the intensities for each of the reconstruction considered in Fig. 4D. The ground truth almost confusesthe walls of the small blood vessel, whereas SS resolves this detail. The opening within the blood vessel becomes even more igure 4. Tomographic slice of the cropped volume. A) Aligned mean (ground truth). B) Reconstruction using SS. C)Reconstruction using AU. D) Proﬁle plot along dashed line in panel E) for the three cases. E) Detail of the small opening forAU.evident in case we use simultaneous deconvolution with AU, given that the deconvolution by the PSF let us resolve sharperdetails. A thorough image analysis is presented in the Supplement Materials document accompanying this manuscript.

Discussion

It is worth stressing that our auto-correlation method goes beyond the deconvolution approach. We are exploring a new pathin alignment-free image formation, studying its advantage in terms of PSF. With this work, we presented an approach to theproblem of shift-invariant reconstructions in volumetric multi-view tomography. Rather than relying on alignment and fusionpipelines, we proposed a conceptually simpler approach that promotes the reconstruction into the shift-invariant A -space.We made use of multiple views of the specimen with the sole goal of reﬁning the estimation of the auto-correlation of theobject, since we consider it as the ideal quantity for the formation of inherently aligned reconstructions. By releasing the userfrom this task, we could direct our attention on better ways to estimate the auto-correlation. In particular, this may open thepath for the corrections of higher-order transformations such as those introduced by inaccuracies of the rotation stage. Forexample, two-axes angular tilts are easier to be seen in the shift-invariant space rather than in the object space, since we nolonger worry about the object positioning. Furthermore, we have proven that the solution of the A − can be accompaniedwith deconvolution . Concatenating two inverse problems, in fact, can be hazardous since remaining artifacts from the ﬁrstinversion may condition the behaviour of the following method. In volumetric tomography, however, we can always startfrom a good guess because we have independent object observations which are not too far from the ideal reconstruction. Theinitialization with one of these permits us to ﬁnalize the reconstruction even if we do not comply with appropriate frequencypadding. In fact, the auto-correlation of a discretely n -sampled signal is deﬁned on a translation-space that is 2 n − managed to reduce by two orders of magnitude the iterationsneeded. This was done by tuning the forward and backward projection operators, similarly to what we have respectively at thedenominator and numerator in our Eq. 5. Reducing the number of iterations is a key aspect that we plan to investigate further inparticular for the analysis of big volumes, eventually reducing the GPU processing-time from a few hours to a fraction of it. ethods Correlation/Convolution Theory

In the following section, we deﬁne the quantities used in the manuscript. We make use of the bold xxx = ( x , y , z ) to indicatea vector in the direct space xxx ∈ R . Reconstructions are ﬁnalized in the direct xxx -space. Correlations and convolutions arequantities deﬁned by shifts in R , which we deﬁne with the vector ξξξ = ( ξ , η , ζ ) . Also ξξξ ∈ R and we refer to as the shift-space.We recall that the Fourier transform F of a function h ( xxx ) is deﬁned as: F { h } ≡ (cid:90) ∞ − ∞ h ( xxx ) e − i π xxx · kkk dxxx = ˆ h ( kkk ) , (6)where kkk = ( k x , k y , k z ) is the wave-vector in frequency space. Cross-correlation

Conceptually, the cross-correlation is deﬁned as a reference function o multiplied by another function h that shifts by ξξξ as in: o (cid:63) h = X { o ; h } = (cid:90) o ( xxx ) h ( ξξξ + xxx ) dxxx . (7)The cross-correlation theorem states that o (cid:63) f can be written as a product in the Fourier domain: F { o (cid:63) h } = F { o } · F { h } . (8)For the implementation of our method, we make intensive use of this property and the following Fourier theorems. Auto-correlation

The auto-correlation is deﬁned as a quantity shifted and multiplied by itself as in: o (cid:63) o = A { o } = (cid:90) o ( xxx ) o ( ξξξ + xxx ) dxxx ≡ χ ( ξξξ ) . (9)The Wiener-Khinchin (power-spectrum) theorem states: F { o (cid:63) o } = (cid:107) F { o }(cid:107) . (10)The auto-correlation of any real signal is an even function, thus χ ( ξξξ ) = χ ( − ξξξ ) , and it is always centered around its maximumlocated at ξξξ = ( , , ) ≡ Convolution

The convolution is deﬁned by shifting and multiplying two functions o and h as in: o ∗ h = C { o ; h } = (cid:90) o ( xxx ) h ( ξξξ − xxx ) dxxx . (11)The convolution theorem states that: F { o ∗ h } = F { o } · F { h } . (12) Image pre-processing

Each raw dataset was subtracted with a corresponding average background value, rotated to the same angular orientation of theﬁrst dataset acquired at ϕ =

0. The PSF of the system was assumed to be Gaussian, elongated along the direction of scanning.For each of these stacks, we computed the corresponding auto-correlation sequence. We took the absolute value of the averageauto-correlation to avoid any presence of unwanted negative values. These are determined by the background-subtraction andeventually by rounding errors due to FFT computation.

Multi-view registration and fusion

As ground truth comparison, we aligned the views against each other by ﬁnding the location of the maximum of the cross-correlation between X { o i µ ; o j µ } , for i (cid:54) = j . We deﬁned the displacement vector mmm i with respect the central coordinate ξξξ = ϕ i = = ◦ as reference and we translated each ϕ i by the vector − mmm i deﬁned in this way. Then we computed the averageof the registered stacks to form o µ . earranging the auto-correlation The measurement o µ : o µ = o ∗ h + ε . (13)We neglect the additive noise by assuming ε =

0. Using the commutation properties of the convolution and correlation, wehave that: χ µ ≡ A { o µ } = o µ (cid:63) o µ = ( o ∗ h ) (cid:63) ( o ∗ h ) (14) = ( o (cid:63) o ) ∗ ( h (cid:63) h ) = o ∗ ( o (cid:63) ( h (cid:63) h )) (15) = χ ∗ H = o ∗ K . (16)Here we have called χ = o (cid:63) o , H = h (cid:63) h and K = o (cid:63) H . In the main text, we use only the last two equations. Experimental details

For the test performed in our letter, we use a cleared mouse popliteal lymph node having the vasculature stained with the AlexaFluor 488 dye. The sample was embedded in agarose, then cleared and imaged in Benzyl-Alcohol Benzyl-Benzoate (BABB).The ﬂuorescence is excited with a light-sheet perpendicular to the camera detection at λ exc = nm , imaged onto the samplewith an 2.5x/0.07 N PLAN (air) objective lens. With a band-pass ﬁlter at 525 / nm , we imaged the emitted ﬂuorescence usinga 5x/0.12 N Plan EPI (air) objective lens. The sample was scanned through the light sheet along a z-axis, perpendicularly to thecamera detection in steps of 4 . µ m . The specimen was provided by Prof. J. Stein at the University of Bern and imaged byJim Swoger at the Center for Genomic Regulation (CRG), Barcelona. undings H2020 Marie Skłodowska-Curie Actions (HI-PHRET project, 799230); H2020 Laserlab Europe V (871124).

Acknowledgment

Sample provided by Prof. J. Stein at the University of Bern and imaged by Jim Swoger at the Center for Genomic Regulation(CRG), Barcelona. The authors further thank Dr. Gianmaria Calisesi for inspiring discussions.

Author contributions

You must include a statement that speciﬁes the individual contributions of each co-author. For example: "A.P.M. ‘contributed’Y and Z; B.T.R. ‘contributed’ Y,” etc. See our authorship policies for more details.

Competing interests

The authors declare no competing interests.

Materials & Correspondence

Correspondence to Daniele Ancora. eferences Sun, Y., Agostini, N. B., Dong, S. & Kaeli, D. Summarizing cpu and gpu design trends with product data. arXiv preprintarXiv:1911.11313 (2019). Leiserson, C. E. et al.

There’s plenty of room at the top: What will drive computer performance after moore’s law?

Science (2020). Despres, P. & Jia, X. A review of gpu-based medical image reconstruction.

Phys. Medica , 76–92 (2017). Sharpe, J. et al.

Optical projection tomography as a tool for 3d microscopy and gene expression studies.

Science ,541–545 (2002). Verveer, P. J. et al.

High-resolution three-dimensional imaging of large specimens with light sheet–based microscopy.

Nat.methods , 311–313 (2007). Krzic, U., Gunther, S., Saunders, T. E., Streichan, S. J. & Hufnagel, L. Multiview light-sheet microscope for rapid in totoimaging.

Nat. methods , 730–733 (2012). Weber, M. & Huisken, J. Omnidirectional microscopy.

Nat. methods , 656 (2012). Swoger, J., Verveer, P., Greger, K., Huisken, J. & Stelzer, E. H. Multi-view image fusion improves resolution inthree-dimensional microscopy.

Opt. express , 8029–8042 (2007). Preibisch, S. et al.

Efﬁcient bayesian-based multiview deconvolution.

Nat. methods , 645 (2014). Ancora, D. et al.

Phase-retrieved tomography enables mesoscopic imaging of opaque tumor spheroids.

Sci. reports ,11854 (2017). Ancora, D. et al.

Optical projection tomography via phase retrieval algorithms.

Methods , 81–89 (2018).

Shechtman, Y. et al.

Phase retrieval with application to optical imaging: a contemporary overview.

IEEE signal processingmagazine , 87–109 (2015). Schindelin, J. et al.

Fiji: an open-source platform for biological-image analysis.

Nat. methods , 676–682 (2012). Schulz, T. J. & Snyder, D. L. Image recovery from correlations.

JOSA A , 1266–1272 (1992). Ancora, D. & Bassi, A. Deconvolved image restoration from auto-correlations.

IEEE Transactions on Image Process. ,1332–1341 (2020). Guizar-Sicairos, M., Thurman, S. T. & Fienup, J. R. Efﬁcient subpixel image registration algorithms.

Opt. letters ,156–158 (2008). Guo, M. et al.

Rapid image deconvolution and multiview fusion for optical microscopy.

Nat. Biotechnol. , 1337–1346(2020)., 1337–1346(2020).