[PDF] Multi-modal 3D Shape Reconstruction Under Calibration Uncertainty using Parametric Level Set Methods

Abstract

We consider the problem of 3D shape reconstruction from multi-modal data, given uncertain calibration parameters. Typically, 3D data modalities can be in diverse forms such as sparse point sets, volumetric slices, 2D photos and so on. To jointly process these data modalities, we exploit a parametric level set method that utilizes ellipsoidal radial basis functions. This method not only allows us to analytically and compactly represent the object, it also confers on us the ability to overcome calibration related noise that originates from inaccurate acquisition parameters. This essentially implicit regularization leads to a highly robust and scalable reconstruction, surpassing other traditional methods. In our results we first demonstrate the ability of the method to compactly represent complex objects. We then show that our reconstruction method is robust both to a small number of measurements and to noise in the acquisition parameters. Finally, we demonstrate our reconstruction abilities from diverse modalities such as volume slices obtained from liquid displacement (similar to CTscans and XRays), and visual measurements obtained from shape silhouettes.

Full PDF

MMulti-modal 3D Shape Reconstruction Under Calibration Uncertainty usingParametric Level Set Methods

Moshe Eliasof ∗ , Andrei Sharf ∗ , and Eran Treister ∗ Abstract.

We consider the problem of 3D shape reconstruction from multi-modal data, given uncertain cali-bration parameters. Typically, 3D data modalities can come in diverse forms such as sparse pointsets, volumetric slices, 2D photos and so on. To jointly process these data modalities, we exploita parametric level set method that utilizes ellipsoidal radial basis functions. This method not onlyallows us to analytically and compactly represent the object, it also confers on us the ability toovercome calibration-related noise that originates from inaccurate acquisition parameters. This es-sentially implicit regularization leads to a highly robust and scalable reconstruction, surpassing othertraditional methods. In our results we ﬁrst demonstrate the ability of the method to compactly rep-resent complex objects. We then show that our reconstruction method is robust both to a smallnumber of measurements and to noise in the acquisition parameters. Finally, we demonstrate ourreconstruction abilities from diverse modalities such as volume slices obtained from liquid displace-ment (similar to CT scans and XRays), and visual measurements obtained from shape silhouettesas well as point-clouds.

Key words.

3D Shape Reconstruction, Parametric Level Sets, Dip Transform, Joint Reconstruction, Shapefrom Silhouettes, Point Clouds, Compactly Supported Radial Basis Functions.

AMS subject classiﬁcations.

1. Introduction.

The reconstruction of 3D objects is an important task in many disci-plines like computer graphics [1, 4, 6, 52, 54], computer vision [27, 16, 26], geophysics [32],computational biology [41], civil engineering, medical applications, architecture, and archae-ology, among others. Typically, we have incomplete data measurements of a physical object,and wish to reconstruct the full and continuous object that corresponds to these measure-ments. In addition to being incomplete, the measurements are usually noisy, and therefore,the aim of the reconstruction process is to ﬁnd a model that optimally ﬁts the data. This canbe done by solving a computational optimization problem, the aim of which is to ﬁt simulateddata to existing measurements. These problems are usually ill-posed and non-convex, andsolving them with existing methods is often challenging. In many cases—especially when theavailable data set is small—we need to include some prior information in the reconstructionproblem to guide the optimization toward a more plausible solution.In this work, we focus on problems of 3D shape reconstruction from multi-modal data.That is, measurements of objects that originate from several domains, such as volumetricslices [1, 8, 24, 13, 25, 30, 55], multiview images and object silhouettes [27, 16, 26], and point-clouds [7, 35]. Our aim is to deﬁne a robust unifying framework that can jointly processsuch data to obtain a best ﬁtting shape model. In the shape reconstruction problem, we areessentially searching for a piecewise constant binary function in some domain, which indicateswhether there is an object or background at each location. A common approach to such ∗ Computer Science Department, Ben-Gurion University of the Negev, Beer Sheva, Israel. ([email protected],[email protected], erant@cs,bgu.ac.il) a r X i v : . [ c s . G R ] D ec ELIASOF M., SHARF A., AND TREISTER E. piecewise constant reconstructions exploits level set methods [39, 40], which have been usedfor shape reconstruction [50] and image segmentation [47]. A particular version of thesemethods is the parametric level set (PaLS) method [2, 3], an approach that has been usedfor inverse problems, particularly for medical [28, 18] and geophysical [23, 31] applications.The PaLS method suggests that the level set function is approached analytically rather thanby using a computational grid. That is, the shape is analytically parameterized by a linearcombination of radial basis functions (RBFs) with limited support, e.g., Gaussian functions.The beneﬁt of using RBFs stems from their ability to compactly yet accurately represent theshape and to adapt it as part of the optimization process. Furthermore, we use compactlysupported RBFs. This has a tremendous computational beneﬁt when computing a shape fromits parameters, leading to sparse Jacobian matrices, and an eﬃcient optimization process forreconstructing high-resolution objects. This method acts as an implicit regularization towardsa reconstruction of an dense object.In the context of graphical applications, RBFs were previously used for surface recon-struction. The work of [14] suggests using RBFs for surface reconstruction from a pointcloud, where the centers and radii of the basis functions are assumed to be known, and theoptimization is performed only for the linear coeﬃcients of the RBFs. A later work [33] pro-posed a multi-scale reconstruction that is obtained with RBFs to capture both the global andlocal structures of the object, after which the reconstructed shape is obtained by interpolat-ing between the scales. However, such approaches require a large number of basis functionsfor the reconstruction, in the tens of thousands, conferring a high computational cost on themethod. A recent work addressed this issue by devising an eﬃcient GPU implementation ofRBF interpolation [17]. As we will show later, our method requires many fewer RBFs (only afew hundred parameters) to represent whole objects, and not just the surface. A recent work[15] proposed employing a deep RBF neural network to encode the spatial distribution ofpoint-clouds for classiﬁcation. This approach yields competitive classiﬁcation accuracy (i.e.,similar to standard networks [35]), while reducing both the number of learned parameters andthe computational costs.In this work we leverage the PaLS method for shape reconstruction, in which the objectis analytically represented as a composition of RBFs, and is made piece-wise constant byapplying a smoothed step-like function. The parameters of this representation, which is acollection of spheres, include the centers, radii, and linear coeﬃcients of the RBFs. To thebest of our knowledge, we are the ﬁrst group to use PaLS for graphical applications. Tothat end, our ﬁrst contribution to this subject features an enriched PaLS representation thatcomprises a collection of ellipsoids instead of spheres. This is needed since the conventionalPaLS method in 3D often results in blob-like artifacts caused by the spherical basis functions.This can be seen in the work of [28], which performed 3D medical reconstructions usingPaLS. While it is arguable whether such artifacts are acceptable in a medical scenario, theyare deﬁnitely unacceptable in problems where the reconstructed object should be as close aspossible to the original (e.g., reconstruction of a piece of art). To avoid undesired artifacts,therefore, we extend the PaLS representation to be ellipsoidal, so that the PaLS method cancompactly represent ﬂat and long objects. As we demonstrate later, our framework is capableof accurately representing complexly shaped objects with sharp elements by using severalhundred parameters only. Our approach thus reduces complex reconstruction problems of

HAPE RECONSTRUCTION UNDER CALIBRATION UNCERTAINTY 3 potentially several million parameters using standard volumetric grid values, to problems of afew thousands parameters at most using PaLS to represent the object. The markedly smallernumber of parameters using PaLS makes the optimization process eﬃcient and better-posed.As a result, the parametric representation implicitly regularizes the reconstruction and enablesus to reconstruct shapes from small data sets. Another, albeit smaller beneﬁt of the compactPaLS representation is the savings in storage—we only need to store the PaLS parametersrather than the volumetric grid values of the object.An additional signiﬁcant contribution made by our work relates to handling uncertainty inthe calibration (or alignment) parameters. For each measurement, data acquisition methodsinclude certain parameters, e.g., the angle and location relative to the camera at which animage was photographed. The accuracy with which these parameters are known (if they areknown at all) varies, generating uncertainty that essentially creates non-trivial noise in thesimulated data. Inevitable in real life scenarios, such noise is diﬃcult to model statisticallyand may hamper the reconstruction if not addressed properly. This is especially evidentin cases where the measured data is small. This calibration problem is evident in manyreconstruction problems of interest. One may approximate the calibration parameters inaddition to the reconstruction, but the process is not straightforward for some problems.For example, in tomography problems there are cases where it is theoretically possible toestimate the calibration (or alignment) parameters and the shape together [5]. Indeed, [21]and [34] apply a procedure that alternates between updating the alignment parameters andthe tomographic reconstruction. The work of [51] optimizes for both the shape and thealignment parameters using a Quasi-Newton method, and the derivatives with respect tothe alignment parameters are numerically approximated by ﬁnite diﬀerences. Similarly, thework of [11] tackles the problem with the Levenberg-Marquardt algorithm using a splinerepresentation of the shape to treat the derivatives of the alignment parameters. Similarly,[46] uses a bi-cubic interpolation of the shape and a variety of ﬁrst order methods for the jointreconstruction of the shape and alignment parameters. A similar situation is also evidentin shape reconstruction from point clouds, which also requires a registration in cases wheremultiple point clouds are given as data. There, the alignment parameters are usually estimatedby the Iterative Closest Point (ICP) methods separately from the reconstruction—see [9, 37,43, 7, 45, 42] and references therein. While the current methods are considered to be quiteaccurate, there may often be small alignment errors, especially if the overlap between thepoint clouds is small, or if multiple misaligned clouds are considered. In such cases, havingadditional data sources of the same object can help correcting the alignment. Our solutionto this issue is rooted in the PaLS method whose compact analytic representation allowsus to apply transformations or deformations to the object. Thus, we can also include theacquisition parameters in the optimization process as part of the multi-modal reconstructionproblem, allowing the reconstruction to adapt to the real acquisition parameters, and handlethis uncertainty.We demonstrate both the issue of the accurate object representation, and the handlingof uncertainty in the acquisition parameters by performing rigorous experiments using ourmethod for reconstruction from visual and non visual measurements. For the non visual data,we use the acquisition method of “dip transform”, where the data are obtained by repeatedlydipping the object in liquid, such that each time the object is dipped, it is oriented at a diﬀerent

ELIASOF M., SHARF A., AND TREISTER E. angle relative to the dipping axis [1]. Using dip transform, though the data acquisition is asimple task, it is tedious to execute, as the data generated from each dip are typically small.To complement this small data, we also experiment with joint shape reconstruction from boththe dipping data and visual data, the latter of which comprised silhouette images which arerelatively easy to obtain. The visual part of the reconstruction is applied via a modiﬁed versionof shape from silhouettes (SfS) proposed in [16, 26]. In addition, we perform experiments ofshape reconstruction from point clouds.Our paper is organized as follows: In Section 2, we provide the mathematical formulationsof the problems that we investigate. Then, in Section 3, we present the PaLS method, afterwhich we address calibration uncertainty in Section 4. Next in Section 5, we discuss the useof two modalities to perform shape reconstruction using the PaLS method. Finally, in Section6 we present our empirical results.

2. Problem Formulation.

The typical shape reconstruction problem that we consider hereis formulated as a binary piecewise constant optimization problem of the form(2.1) arg min u ∈{ , } n × n × n F ( u ) = 12 n ex (cid:88) j =1 (cid:13)(cid:13)(cid:13) d j ( u ) − d obs j (cid:13)(cid:13)(cid:13) + λR ( u ) , where n ex denotes the number of experiments (e.g., photos), d obs j is the data measurementthat corresponds to experiment j , u is the binary vector representing the object, and d j ( u )is the simulation of the data measurement j according to a given object u . We assume that d j ( u true ) ≈ d obs j , that is, the simulated data for the true object u true is approximately equalto the observed data d obs j . In the ﬁrst term of F we wish to measure and minimize thesimulation error d ( u ) − d obs in a least squares sense, which also means that we assume thatthe simulation error is independent and identically Gaussian distributed soutand ı.i.d.. Sincethe reconstruction problem is often ill-posed, we usually incorporate in it prior informationin the form of the regularization term R ( u ) and its associated parameter λ >

0, the latter ofwhich balances between the data ﬁdelity term and the regularization [48].The type of regularization R ( u ) used can be a simple Tikhonov regularization (as in [1]),or it can be a total variation regularization [36], the latter of which promotes smoothness inthe reconstructed image but is also able to tolerate edges. To solve the problem (2.1), a varietyof optimization algorithms can be used, most of which are gradient-based iterative methodssuch as gradient descent or Gauss-Newton. Such algorithms are suitable only in cases where u is continuous and real-valued. When using these algorithms, therefore, the problem (2.1) issolved under the assumption that u ∈ R n × n × n , and u is artiﬁcially promoted to be binaryby using penalty terms, smoothing regularization and other iterative procedures. This is thecase, for example, in [26, 27], which deal with reconstructions of shapes from silhouettes andmulti-view data. Generally, assigning u to be binary can be a challenge, and may requiresome heuristics. We demonstrate our framework on the problems below. A promising recent to-mographic acquisition and shape reconstruction technique is the “dip transform” [1]. Designedto reconstruct a shape from liquid displacement measurements that correspond to volumetricslices of the shape, dip transform is similar to other tomographic modalities like X-ray scans,

HAPE RECONSTRUCTION UNDER CALIBRATION UNCERTAINTY 5

Figure 1: The data acquisition process in the dip transform. An object is repeatedly dipped inwater in diﬀerent orientations. The changes in water level eﬀected by each dip are measuredto obtain a signature that describes the object’s volume.for example, and, in particular, it is not based on visual or surface measurements. Also closelyrelated is the work of [13], involving reconstruction from 2D slices. The dip transform exploitsArchimedes law to reconstruct the volume of the object by repeatedly dipping it in a con-tainer full of liquid, such that each time the object is dipped, it is oriented at a diﬀerent anglerelative to a given axis. Each dip along an axis creates a trace of liquid volume displacementthat represents the volumes of thin slices of the shape along the dipping axis (see Fig. 1).In essence, this technique performs tomographic measurements of the object’s slices. Suchmeasurements enable us to reconstruct hidden elements or transparent shapes, which cannotbe measured by rays of light.Using our notation, the dip transform reconstruction problem is deﬁned as:(2.2) arg min u  n dips (cid:88) j =1 (cid:13)(cid:13)(cid:13) SP j u − d obs j (cid:13)(cid:13)(cid:13) + λR ( u )  , u ∈ { , } n × n × n . where P j is a rotation matrix, rotating the object u at horizontal and vertical angles θ j and φ j , respectively, and essentially is a permutation matrix. S is the “summation over slices”matrix, which generates the water displacement measurements given the rotated u , and R is the regularization term. Given the angles ( θ j , φ j ) and ignoring the binary reconstructionrequirement, this is a linear inverse problem.A signiﬁcant drawback to this approach is that it necessitates a large number of measure-ments, as each dip only generates a one-dimensional trace, though we need to reconstruct athree dimensional object. In addition, the mechanical and manual acquisition method begetshigh uncertainty in the acquisition parameters. In actuality, the angles and centering of theobject during dipping are only roughly known. In the next sections, we will show how ourmethod resolves both of these problems. Another channel of data inputused in this paper comprises visual (or surface) measurements. We consider two data modal-

ELIASOF M., SHARF A., AND TREISTER E.

Figure 2: Visual Hull obtained via Shape from Silhouettes, colored in red. The object iscolored in blue.ities. One is the SfS model [29, 27, 16, 26], whose input includes photographs of the objecttaken from several diﬀerent angles. These photographs are then binarized to be shadow-like,and subsequently used to estimate the volumetric shape of the object. Another data modalityis point clouds, which are commonly used for object scanning and reconstruction [7] usingof-the-shelf sensors. Point clouds consist of sets of points which lie on the (visible) surface ofthe object. The reconstruction from both modalities may be deﬁned similarly to (2.1) as(2.3) arg min u (cid:88) j loss j ( u ) + λR ( u )  , u ∈ { , } n × n × n , where loss j ( u ) combines model principles (either SfS or point cloud) for the j -th experiment,and R is again a regularization term. Both modalities are not perfect and have limitations.For example, in SfS we cannot satisfactorily reconstruct concavities in the shape, and indeed,in most cases the hidden areas of the shape, like holes or hollows, are artiﬁcially ﬁlled duringthe reconstruction process. In point clouds, certain areas of the surface are often scanned inpoor resolution, leaving inaccuracies or holes in the reconstructed surface [1]. Nevertheless,both surface models have the advantages of simplicity and ease of the acquisition. Each of the data modalities mentioned above has its draw-backs. Therefore, in this work we use multiple modalities to improve the quality of thereconstruction given the available data. For example, we use the SfS to complement the datawe acquire via the dip transform, and demonstrate the joint reconstruction of the shape withinour framework. To obtain this, we deﬁne the joint problem:(2.4) arg min u  n dip (cid:88) j =1 loss dip j ( u ) + γ n sfs (cid:88) l =1 loss sfs l ( u ) + λR ( u )  , where loss dip corresponds to the objective in Eq. (2.2), and γ > γ is usually chosen such that the two losses are comparable. This combination HAPE RECONSTRUCTION UNDER CALIBRATION UNCERTAINTY 7 of SfS with dip transform enables us to achieve a high quality reconstruction using signiﬁcantlyfewer dipping measurements compared to using the dip transform exclusively. In this case,however, we have to neglect the assumption of a full model that is typical for SfS, since thedip measurements are able to capture an object’s holes and concavities, which are missed bySfS. Therefore, we propose an adapted SfS model that lacks such assumptions (more detailsin Section 5). In addition, in Section 6 we demonstrate the joint surface reconstruction fromSfS and point clouds, to overcome registration problems in point clouds.

3. Parametric Level Sets using Radial Basis Functions.

The PaLS approach [2] wasoriginally proposed to solve inverse problems involving a reconstruction of a constant valuedbody and its background. In our case, in (2.1) we have a background of zero, and we focuson retrieving the constant-valued object. The choice of the basis functions has a signiﬁcantimpact on the dimensionality and expressiveness of the PaLS representation. Herein, wechoose radial basis functions [2], which have been shown to represent 3D models well [14]. Tothis end, the PaLS representation is deﬁned by:(3.1) u ( (cid:126)x, { α i , β i , (cid:126)ξ i } n RBF i =1 ) = σ (cid:32)(cid:88) i α i ψ (cid:18)(cid:13)(cid:13)(cid:13) β i (cid:16) (cid:126)x − (cid:126)ξ i (cid:17)(cid:13)(cid:13)(cid:13) † (cid:19)(cid:33) where σ : R → [0 ,

1] is a Heaviside function used to binarize u . There are a few functionsthat can be used for this purpose, like the atan() function used in [2]. We leverage a piecewisepolynomial Heaviside function as shown in Fig. 3a to improve the condition number of theHessian and increase the sparsity of the Jacobian [22]. The function ψ : R + → [0 ,

1] is oneof the Wendland RBF functions that appear in Fig. 3b. For precise deﬁnitions of ψ and σ ,refer to the Appendix in Section 8.1. The pseudo-norm (cid:107) (cid:126)v (cid:107) † = (cid:113) (cid:107) (cid:126)v (cid:107) + (cid:15) is used to preventdivision by zero when computing the derivatives for our model.The RBF together with the (cid:96) distance as an operand leads to a sphere in space. Thus, therepresentation (3.1) is a linear combination of spheres, each one with a radius of approximately β i , centered around (cid:126)ξ i . Each basis function has ﬁve unknown parameters. We denote:(3.2) m = { α i , β i , (cid:126)ξ i } n RBF i =1 ∈ R n RBF to be the vector of the unknown RBF parameters in the optimization process, and u ( m ) tobe the discrete version of u ( (cid:126)x, m ) which is equivalent to (3.1).Next, the problem in Eq. (2.1) is replaced with its PaLS version:(3.3) arg min m ˆ F ( m ) = n ex (cid:88) j =1 (cid:13)(cid:13)(cid:13) d j ( u ( m )) − d obsj (cid:13)(cid:13)(cid:13) + λR ( m ) , m ∈ R n RBF . To solve such a reconstruction problem using a gradient-based method, we need to computethe discrete u ( m ) and its derivatives with respect to the parameters. That is, we need theJacobian J = ∂ u ∂ m , which we deﬁne in Appendix 8.2 in detail. Since our RBFs are compactlysupported, the contribution of each basis function is limited by its support, leading to a sparseJacobian. In addition, each of the derivatives in the Jacobian has a term involving σ (cid:48) , whichis mostly zero except for the object boundaries, hence the Jacobian matrix J is very sparse, ELIASOF M., SHARF A., AND TREISTER E. (a) (b)

Figure 3: The components of the PaLS representation: (a) A smooth vs. piecewise polynomialheaviside function. Top row shows the functions, bottom row shows their derivatives. (b)Wendland’s radial basis functions of orders 1,2 and 3.enabling high resolution reconstructions. While any gradient-based method is suitable to solvea problem like (3.3), in this work we use the Gauss-Newton (GN) method, which gives thefollowing iteration step:(3.4) m ( k +1) = m ( k ) − µ ( k ) ( J T J + λ ∇ R ) − ∇ m ˆ F , where µ ( k ) is obtained by a standard Armijo line-search procedure. For R , in this case, we usea simple iterated Tikhonov regularization R ( m ) = (cid:107) m − m ( k ) (cid:107) . This regularization is lesssensitive to the choice of λ than standard Tikhonov regularization, which does not depend onthe iteration k [20]. Since n RBF —the number of basis functions—is typically small (severalhundreds), it is computationally feasible to solve it by using an exact GN, i.e., invert theHessian J T J + λ ∇ R at each iteration. Here, λ has to be chosen substantial enough so thatthe Hessian matrix is numerically inevitable and the optimization is stable. Other suitableoptions to solve this problem include the non-linear conjugate gradients or LBFGS methods.However, these methods will not be able to fully exploit the low dimension and sparsity ofthe Hessian, and hence are typically slower in this case.We start the solution process with a small number of RBFs located at random locationsaround the center of the grid. Then, at each iteration we add a ﬁxed number of new RBFsto the solution and compute the compact representation of the object on-the-ﬂy. The newlyadded RBFs are located using the gradient of the misﬁt function w.r.t u and multiplied by σ (cid:48) ( u ). A non-zero σ (cid:48) ( u ) indicates the areas on the surface of the object where the sensitivitiesof the PaLS representation are visible (these are the values where 0 < u < HAPE RECONSTRUCTION UNDER CALIBRATION UNCERTAINTY 9 cells from each other in order not to add basis functions that are too close to each other. Weuse this heuristic insofar as it is designed to use the least number of parameters possible toreconstruct the object, it guides us to locations in the reconstructed object where its volumeis either missing or in excess. The new RBFs are added with the coeﬃcient α i = 0, andthus, initially they do not inﬂuence the objective of the modeling, resulting in a monotoneoptimization process. At last, we threshold the solution to make it strictly binary. Algorithm3.1 summarizes the process. Algorithm 3.1

3D Shape Reconstruction using PaLS

Input :

An objective function F ( m ). Init :

Randomly choose p basis functions around the center of the grid. Params: p , p : The number of the initial and incremental RBFs (default: 20, 5). it GN : The number of inner Gauss-Newton iterations (default: 5). δ : a binarization threshold (default: 0.7). for k=1,2,... do • Compute u = u ( m ( k ) ), and ∇ u F : the gradient of the objective with respect to u . • Deﬁne the vector s = | σ (cid:48) ( u ) ∇ F ( u ) | . ( σ : the heaviside function). • Add p new RBFs to m ( k ) , centered at points that correspond to large s . • Set up the iterated Tikhonov regularization. • Apply it GN Gauss-Newton iterations for minimizing ˆ F ( m ): m ( k +1) = arg min m ˆ F ( m ) , end Binarize u ( m ) by thresholding: u = (cid:26) if u > δ otherwise . Using the standard RBF representation in (3.1)restricts us to a spherical basis that results in limited expressiveness and spherical eﬀects inthe reconstructed objects, as can be observed in [28]. Clearly, this outcome is undesirable incomputer graphics applications, especially when objects with ﬂat or sharp elements such asarms, legs or horns, are involved. To render the basis functions more expressive, we suggestthat they be enriched to enable them to represent ellipsoids in space rather than spheres.Such enrichment is done by replacing the parameters β i in (3.1) with symmetric and positivedeﬁnite (SPD) matrices B i ∈ R × . Our ellipsoidal PaLS representation reads:(3.5) u ( (cid:126)x, m ) = σ (cid:32)(cid:88) i α i ψ (cid:18)(cid:13)(cid:13)(cid:13)(cid:16) (cid:126)x − (cid:126)ξ i (cid:17)(cid:13)(cid:13)(cid:13) † B i (cid:19)(cid:33) , where the weighted pseudo norm is given by (cid:107) (cid:126)v (cid:107) † B i = (cid:113) (cid:107) (cid:126)v (cid:107) B i + (cid:15) , and similar to (3.2), the ellipsoidal parameter vector is(3.6) m = { α i , B i , (cid:126)ξ i } n RBF i =1 ∈ R n RBF . (a) Input (b) PaLS with spherical RBFs (c) PaLS with ellipsoidal RBFs Figure 4: Comparison between reconstructions with spherical vs. ellipsoidal PaLS. Bothreconstructions use 200 parameters, that is, 40 RBFs for (b) and 20 RBFs for (c).Our enriched representation denotes a collection of ellipsoids, each with 10 unknowns. Fora description of how the derivatives are computed, see the Appendix in Section 8.2. In thiscase, we also require a regularization to ensure that the matrices B i remain SPD. To thisend, in addition to the Tikhonov regularization on m , we use a standard log barrier function( − (cid:80) i log(det( B i ))) to keep the determinants of B i away from zero. This regularization iseﬀective, as we do not expect B i to be too close to singular. In other cases where B i areexpected to be close to singular, the more sophisticated alternative approach of optimizationover manifolds [12] may be more suitable. To demonstrate the advantages of the ellipsoidalover the spherical RBFs, we reconstruct a model of a foot that has both detailed and smoothelements (Fig. 4). The ﬁgure clearly shows that the proposed ellipsoidal PaLS captures greaterdetail than the PaLS with basic RBFs, even though the two reconstruction algorithms usedthe the same number of parameters.

4. Robustness to Calibration Uncertainty and Scalability.

Object reconstruction algo-rithms typically consume data of the object captured in several orientations, e.g., pictures ofthe object taken at diﬀerent angles [16]. A major drawback of many algorithms is the assump-tion that the orientations at which the data were acquired are accurately known and that thenoise in the measurements corresponds to a simple probability distribution, e.g., Gaussian.This assumption may be practical in some cases—especially when the data set is large—butit may not be ideal in real-world scenarios with small data. The absence of accurately knownacquisition parameters creates noise in the measurements that is diﬃcult to model or predict.Such noise can also be introduced due to a human factor in the data acquisition process, e.g.,hands shaking while capturing the object, or due to mechanical inaccuracies in the machinethat rotate the object for data collection as in [1]. This often leads to poor reconstruction,especially in cases where the data set is small. To solve these problems, we use the analyticPaLS representation of the object, which allows us to analytically manipulate the object’s

HAPE RECONSTRUCTION UNDER CALIBRATION UNCERTAINTY 11 position parameters, and we estimate the acquisition parameters as part of the optimizationprocess for the data ﬁt . This is possible because we can diﬀerentiate the PaLS representationof u with respect to the parameters of the data acquisition.Consider, for example, the dip transform problem in (2.2). The PaLS formulation of thisproblem (analogous to that of the general problem in (3.3)), is(4.1) arg min m  n dips (cid:88) j =1 (cid:13)(cid:13)(cid:13) SP j u ( m ) − d obsj (cid:13)(cid:13)(cid:13) + λR ( m )  . Here P j is a permutation matrix designed to capture the object from diﬀerent angles andlocations that correspond to the j -th measurements of the shape, and S is a projection oper-ator that simulates the data after the object has been situated according to the acquisitionparameters. The operator SP j in (2.2) is linear. The matrices SP j , however, may requirelarge amounts of storage or be expensive to apply. More importantly, such ﬁxed operatorsintroduce non trivial noise if the acquisition parameters are not accurately known. Therefore,in addition to m , we also search for the acquisition parameters encapsulated in P j and deﬁnethe rotation and translation operator P j analytically.To deﬁne the rotation and translation in P j analytically, we ﬁrst deﬁne the linear trans-formation of our coordinates T j ( (cid:126)x ) = Q ( θ j , φ j )( (cid:126)x − (cid:126)x mid ) + (cid:126)b j + (cid:126)x mid . (4.2)Here Q ( θ j , φ j ) is a 3 × θ j and polar angle φ j . (cid:126)b j ∈ R is a translation vector, and (cid:126)x mid is the center of the domain that is used as the centerof the rotation. This is the transformation that guides the deﬁnition of P j in (2.2). Next, wedeﬁne the analytically translated and rotated shape u via backward warping according to thetransformation (4.2): u ( T − j ( (cid:126)x ) , m ). The backward warping is used to avoid discretizationproblems in the rotation. Note, that the rotated u is the analytical deﬁnition of the discreteoperator P j , which, up to discretization errors on the mesh, satisfy(4.3) u ( T − j ( (cid:126)x ) , m ) ≈ P j u ( m ) . Obviously, the rotated shape can also be represented by PaLS by using what we refer to as“rotated parameters”(4.4) rot j ( m ) = rot ( m , θ j , φ j , (cid:126)b j ) , where rot j is the function for rotating the parameters m in the angles θ j , φ j and translationvector (cid:126)b j , which together correspond to the j -th measurements of the shape. We deﬁne rot j below for both versions of the PaLS. To deﬁne an inverse problem with analyticallyrotated parameters (instead of using P j ), we can simply replace the term P u ( m ) in (4.1) with u ( rot j ( m )). As in (4.1), this will lead to a reconstruction with ﬁxed acquisition parameters,which we incorporate into the inversion, described in the following section. Next, we explicitlydeﬁne rot j ( m ) for each of the considered formulations, and the deﬁnition of their associatedJacobians appears in the Appendix in Section 8.3. Rotation for Spherical RBFs.

In the simpler version of the PaLS, we plug (4.3) into (3.1)and obtain u ( T − ( (cid:126)x ) , m ) = σ (cid:32)(cid:88) i α i ψ (cid:18)(cid:13)(cid:13)(cid:13) β i (cid:16) T − ( (cid:126)x ) − (cid:126)ξ i (cid:17)(cid:13)(cid:13)(cid:13) † (cid:19)(cid:33) (4.5) = σ (cid:32)(cid:88) i α i ψ (cid:18)(cid:13)(cid:13)(cid:13) β i (cid:16) (cid:126)x − T j ( (cid:126)ξ i ) (cid:17)(cid:13)(cid:13)(cid:13) † (cid:19)(cid:33) (4.6)The last equality holds because Q is an orthogonal matrix that does not change the (cid:96) normof a vector. Hence, rot j ( m ) only rotates the center of each basis function i and is deﬁned by(4.7) rot j (cid:16) α i , β i , (cid:126)ξ i (cid:17) = rot (cid:16) α i , β i , (cid:126)ξ i , θ j , φ j , (cid:126)b j (cid:17) = (cid:16) α i , β i , T j ( (cid:126)ξ i ) (cid:17) Rotation for the Ellipsoidal RBFs.

The rotation with respect to the ellipsoidal RBFs isslightly more complicated than in the spherical case, and is given by u ( T − ( (cid:126)x ) , m ) = σ (cid:32)(cid:88) i α i ψ (cid:18)(cid:13)(cid:13)(cid:13)(cid:16) T − ( (cid:126)x ) − (cid:126)ξ i (cid:17)(cid:13)(cid:13)(cid:13) † B i (cid:19)(cid:33) (4.8) = σ (cid:32)(cid:88) i α i ψ (cid:18)(cid:13)(cid:13)(cid:13)(cid:16) ( (cid:126)x − (cid:126)x mid − (cid:126)b j ) + T j ( (cid:126)ξ i ) (cid:17)(cid:13)(cid:13)(cid:13) † Q − Tj B i Q − j (cid:19)(cid:33) . (4.9)Here, for the i -th basis function deﬁned by ( α i , B i , (cid:126)ξ i ), we denote its rotation and translationaccording to the j -th acquisition parameters as(4.10) rot j (cid:16) α i , B i , (cid:126)ξ i (cid:17) = (cid:16) α i , Q − Tj B i Q − j , T j ( (cid:126)ξ i ) (cid:17) = ( α i , Q j B i Q Tj , T j ( (cid:126)ξ i )) , where the last equality holds since Q is orthogonal. So far the acquisitionparameters were kept ﬁxed, and we did not address the non-trivial noise that originates frominaccuracies in those parameters. Adequately handling such noise is important to producean accurate reconstruction. Therefore, as part of the reconstruction we also manipulate thePaLS representation parameters of u with respect to the parameters of the data acquisition { ( θ j , φ j ,(cid:126)b j ) } n ex j =1 .To include the data acquisition parameters in the reconstruction, we ﬁrst extend our(ellipsoidal) inversion parameters:(4.11) m ext = (cid:104) ( α i , B i , ξ i ) n RBF i =1 , ( θ j , φ j ,(cid:126)b j ) n ex j =1 (cid:105) ∈ R n RBF +5 n ex . The treatment with respect to the spherical RBFs is similar. To have the parameters of the i -th basis function rotated by the predicted j -th data acquisition parameters, we use the function rot () in (4.10), but now let the acquisition parameters change throughout the optimizationprocess. By combining Eqs. (4.1)-(4.4) together, we obtain our ﬁnal inverse problem(4.12) arg min m ext  n ex (cid:88) j =1 (cid:13)(cid:13)(cid:13) S u ( rot ( m , ( θ j , φ j , b j ))) − d obsj (cid:13)(cid:13)(cid:13) + λR ( m ext )  . HAPE RECONSTRUCTION UNDER CALIBRATION UNCERTAINTY 13

This is a version of (4.1), which applies the rotation of the object by rotating its RBF pa-rameters instead of using the matrices P j , and includes the data acquisition parameters asunknowns for the optimization. Similar to the previous cases, we use a Tikhonov regularizationfor the parameters in m ext .To solve problem (4.12), we calculate the sensitivities of the rotated m with respect toup-to-date predictions of ( θ j , φ j ,(cid:126)b j ) and ( α i , B i , (cid:126)ξ i ). To that end, we deﬁne the Jacobian of(4.10) with respect to all of its parameters:(4.13) J rot = ∂ ( α i , Q j B i Q Tj , T j ( (cid:126)ξ i )) ∂ ( θ j , φ j ,(cid:126)b j , α i , B i , (cid:126)ξ i ) . This Jacobian is deﬁned for all pairs of basis functions and acquisition parameters, and usingthe chain rule it is multiplies with the PaLS Jacobian ∂ u ∂ m to capture the sensitivity of u withrespect to all the parameters. All the deﬁnitions of J rot are found in Appendix 8.4.

5. Joint Reconstruction using PaLS under Calibration Uncertainty.

In sections 2.1-2.2, we present two models that are used to reconstruct a three-dimensional object. One ofthe models, the dip transform, includes information on the volume of the object, but it isalso characterized by a complex data acquisition mechanism. The second model, SfS, canonly provide information about the object’s surface (as it is visual), but the data are cheapand easy to acquire. Here we demonstrate how, in our framework, these two models cancomplement one another. Speciﬁcally, using the SfS model together with the dip transformcan signiﬁcantly reduce the number of dips required to achieve good volume reconstruction,thus promoting the dip transform to be a more practical technique. The main problem withcurrent SfS models, however, is that they artiﬁcially assume that a shape is full underneath itsvisible surface [29, 27]. Since SfS cannot “see” areas on the shape that are hidden from directview, this assumption can potentially prevent us from obtaining an accurate assessment of anobject’s volume, while such hidden parts may be recovered well by the dip transform. Thus,to complement the dip transform we modify the SfS model to generate its data based only onthe object’s surface. To this end, we suggest to use a Softmax function with SfS model, whichensures that the SfS tracks the rays and generates the silhouettes based on a projection of theobject’s surface only. This procedure is summarized is Section 5.1.For the joint reconstruction, we combine the data misﬁt functions of the two modalities.Refering again to the example of Eq. (2.4), we obtain:(5.1) arg min m joint  n dip (cid:88) j =1 loss dip j ( Q dip m joint ) + γ n sfs (cid:88) l =1 loss sfs l ( Q sfs m joint ) + λR ( m joint )  , where m joint is the joint vector of parameters for the two problems, and therefore, it containsthe set of PaLS parameters and separate sets of dip and SfS data acquisition parameters:(5.2) m joint = (cid:104) ( α i , B i , ξ i ) n RBF i =1 , ( θ j , φ j ,(cid:126)b j ) n dip j =1 , ( θ l , φ l ,(cid:126)b l ) n sfs l =1 (cid:105) . The matrices Q dip and Q sfs extract the relevant parameters from m joint for each of their corre-sponding modalities, and γ > that by using this approach we could, theoretically, combine more than two input channelsin our problem. This joint reconstruction is easily set up by using the jInv inversion package[38], which we use for our computations. Joint multi modal inversions using this package werealso recently applied in a geophysical context [44, 19]. The second input source is a modiﬁed version ofthe SfS algorithm that was suggested in [27]. Each pixel in a picture taken by the camerais computed by a ray extending from the camera to that pixel, possibly hitting the object.We denote the constructing ray of the ( i, j )-th pixel by r i,j , and its value by d ij ( r ij is theset of indices that constitute the ray). Ideally, we would have a binary 3D object and a 2Dsilhouette, and if the ray r ij hits the object’s edge or surface, the model should output 1.That is, if we observe two adjacent voxels in the direction of the ray, wherein the value ofthe ﬁrst is 0 and that of the other is 1, it indicates that we hit the surface of the object andshould output 1. For a real-life silhouette simulation, the part of the object that lies beyondthe surface does not inﬂuence the result produced by the model.In practice, we need to work with a smooth transition from 0 to 1, and hence, any voxel thatis near the boundary of the object registers a gradual change as it moves from the backgroundto the object itself. Similarly to the ideal case, we wish to recognize this boundary layer andto output the maximum value in that layer for the silhouette. Additionally, we would like thesilhouette value to depend on all the voxels in the boundary layer, an element that is of criticalimportance to the optimization process in the PaLS framework. To achieve such dependence,we deﬁne a voting process according to the ray’s ﬁrst increasing sequence of values u , whichindicates the rays encountered the object’s boundary.Our voting process for computing d ij entails performing a softmax of the values of u overthe ﬁrst boundary layer in the ray r ij (increasing values of u ). That is:(5.3) d ij ( u ) = (cid:88) k ∈ Ω( r ij ) u [ r ij [ k ]] exp ( η u [ r ij [ k ]]) (cid:80) t ∈ Ω( r ij ) exp ( η u [ r ij [ t ]]) , where Ω( r ij ) is the subset of indices that correspond to increasing values in u in the ray r ij .The behavior of the softmax function is controlled by setting η > η = 50 in our tests). Wenote that insofar, since the process in (5.3) is designed to be realistic and is deﬁned only by theshape’s boundaries, the model cannot determine the invisible inner parts of the shape. Theseparts are left to regularization only. Alternatively, as in our case, one can complement thismodel with a tomographic model like the dip transform to ﬁll the missing inner information. Suppose we aregiven a set of points C = { (cid:126)x i } Ni =1 that lie on the surface of the object. Because our PaLSframework provides a smooth transition from 0 to 1 using the Heaviside function in Fig. 3a,our objective is to ﬁnd the PaLS parameters m , s.t.(5.4) ∀ (cid:126)x i ∈ C : u ( (cid:126)x i , m ) ≈ δ, where δ is our binarization threshold parameter in Alg. 3.1. In order to consistently deﬁnethe inner and outer sides of the shape for all the points, we use the common technique of[14], as follows. Using the points in C we estimate surface normals { (cid:126)n i } Ni =1 , and deﬁne two HAPE RECONSTRUCTION UNDER CALIBRATION UNCERTAINTY 15 additional sets: C fwd = { (cid:126)x i + (cid:15)(cid:126)n i } Ni =1 , and C bwd = { (cid:126)x i − (cid:15)(cid:126)n i } Ni =1 . Then, in our loss function wetry to ﬁt (in a least squares sense) u ( (cid:126)x i , m ) = 0 for (cid:126)x i ∈ C fwd , and u ( (cid:126)x i , m ) = 1 for (cid:126)x i ∈ C bwd ,in addition to (5.4).Using the approach above, our loss depends only on the points deﬁned in the three givensets, and there is no need to deﬁne a volumetric grid as in the other problems. Hence, we useour PaLS model with a collection of points, in a mesh-free manner. That is, the parameters m are estimated using a mesh free minimization of the loss over the points in the clouds,and the shape can be constructed later on an arbitrarily ﬁne mesh. The value of (cid:15) is chosenaccording to the width of the transition from 0 to 1 in the Heaviside function. In addition, tohave a fast implementation, the procedure that computes u ( m ) has to eﬃciently determinewhich points in the clouds are relevant for each basis function. We obtain this using a nearestneighbor data structure implemented in the Julia package NearestNeighbors.jl , which isavailable on GitHub.

6. Results.

To validate our proposed framework for 3D shape reconstruction, we con-ducted several experiments. The ﬁrst experiment aims to demonstrate that even though thePaLS framework contains a representation using smooth RBFs, it can also be used to representsharp objects, which are typically diﬃcult to deﬁne using basis functions. In the subsequentexperiments, we exploit the dip transform model for tomographic data acquisition and com-pare the reconstructions obtained with our framework to those obtained with the originalreconstruction method in [1]. In our third batch of experiments, we demonstrate the robust-ness of the reconstruction algorithm to uncertainty in the acquisition parameters. Followingthat, we show that multi-modal shape reconstruction is possible with our framework via thecombination of the modiﬁed SfS model described in Section 5.1 with the dip transform model.This experiment not only shows that we are able to perform a joint shape reconstruction, italso demonstrates how the large number of experiments required for the dip transform canbe substantially reduced. Finally, we show the reconstruction of an object from two givenmisaligned and non-overlapping point-clouds, and show that their combination with SFS canlead to a correct reconstruction.All the shape reconstructions are obtained using Gauss-Newton in Algorithm 3.1 withiterated Tikhonov and log-determinant barrier regularization. Our framework is implementedin the Julia language [10], and we use the inversion package jInv [38] to perform the com-putations, which are parallelelized over the experiments. We use MeshLab as a ﬁnal step topost-process the volumetric meshes we obtained. The post-processing steps include Laplacian-smoothing and voxel sub-sampling to avoid artifacts, which can be caused by over-samplingthe continuous representation obtained using our framework. Our code for reproducing theresults in this work is available online at: https://github.com/BGUCompSci/ShapeReconstructionPaLS.jl

The outcome of this experiment serves as a proofof concept for the modeling capabilities of the ellipsoidal PaLS. Given an object u deﬁned on amesh (of size 120 × × u ( m ) and assess the quality with which the PaLSrepresentation models the object (based on a moderate number of RBFs). Since our goal isto enable objects to be modeled with the lowest possible number of parameters, we initializethe reconstruction with 5 RBFs and add only one RBF at each iteration. The results of this (a) Input object (b) Smoothed Input (c) 78 RBFs (d) 120 RBFs Figure 5: Sharp object representation. (a) The volumetric input as it is consumed by ourframework. (b) Smoothed version of the volumetric input. (c) and (d) Reconstructions using78 and 120 RBFs, respectively.experiment can be seen in Fig. 5. We intentionally chose an input object with both ﬂat areasand sharp angles to demonstrate the expressiveness of our model. The results show that ourmodel can faithfully reconstruct both the sharp and the ﬂat elements, tasks that are consideredparticularly challenging in shape reconstruction. Furthermore, despite the relative complexityof the shape, its reconstruction is done using a small number of parameters. As expected andcan be seen in Fig. 5, increasing the number of basis functions (while maintaining a relativelylow number of parameters) improves the reconstruction.

Shape Reconstruction Setting and Default Parameters.

In Sections 6.2-6.4 we present a fewshape reconstruction experiments. To make the experiments realistic, all the data measure-ments are generated from a binary object using a 240 × ×

240 grid. The data are thendown-sampled, and we perform the reconstruction on a 120 × ×

120 grid. We also includeuncertainty in the acquisition parameters (detailed later) and Gaussian white noise of variance σ = 2 V , where V is the volume of a voxel. For the reconstruction, we use algorithm 3.1 andits default parameters. That is, in all the experiments, we start with 20 randomly locatedRBFs ( p = 20), and at each iteration, we add 5 basis functions to improve the reconstruction( p = 5). We perform 40 iterations that result in 220 radial basis functions, and a typicalreduction of 3 order of magnitude in the data term. The value of the regularization λ is ini-tially chosen to be 10 − and we reduce it by a factor of 0 . , cube. Each of the initialRBFs is initialized as sphere with a radius of 1 (i.e., B i = I × ), located randomly around thecenter of the grid. Its coeﬃcient α i is initialized with a small random number. When addingRBFs during the optimization process, we set them with a smaller radius of , and zero theircoeﬃcient α i , such that we obtain monotonically non-increasing optimization routine. Thelocation of the added RBFs is determined via the gradient of the objective function w.r.t u ,such that new basis functions are added at grid points where the magnitude of the gradient HAPE RECONSTRUCTION UNDER CALIBRATION UNCERTAINTY 17

Known calibration Uncertain calibrationPaLS TV PaLS TVFigure 6: Comparison between dip transform reconstruction similar to [1] and our method.Noisy data consists of 2 degrees and 4% translation noise. Top: reconstruction from 1000 dips(large data). Bottom: reconstruction from 400 dips (small data). All the experiments alsoinclude i.i.d. Gaussian white noise.is largest to improve the representation of the object where it lacks the most.

In this batch of experiments, we demonstratethat our framework can signiﬁcantly enhance object reconstruction when there is not enoughdata. We compare the original work [1] using TV regularization and our RBF-based diptransforms for diﬀerent numbers of dips (small vs. large data) and diﬀerent degrees of noisein the acquisition parameters. The results are summarized in Fig. 6. First, to accuratelyreconstruct the object, the number of dips required in our framework (400) is signiﬁcantlylower compared to the original method, which requires at least 1000 data measurements toproduce a good reconstruction (this is aligned with the results in [1]). Secondly, and perhapsmore importantly, our method is able to overcome uncertainty in the acquisition parameters,which, in the case of small data, completely ruins the TV reconstruction. In contrast to thetremendous impact that the calibration uncertainty has on reconstruction when using theconventional method, when using our method, its eﬀect is small to insigniﬁcant.

In this section, we demonstrate the robust-ness of our reconstruction method to calibration uncertainty as discussed in Section 4.1. Westart with an experiment of a reconstruction assuming all the acquisition parameters areknown. Then, we gradually increase the calibration uncertainty level until it degrades theresults to the extent that the reconstruction no longer resembles the object. Fig. 7 shows thatour framework is capable of handling a signiﬁcant amount of calibration noise—more than 10degrees of noise in the object’s rotation and 17 percent in its translation. To assess the eﬀectof the data size on the ﬁnal reconstruction and robustness to noise, we conduct the experi- a b cFigure 7: Robustness to calibration uncertainty in the PaLS reconstruction. First row: 200dips. Second row: 600 dips. (a) Known calibration. (b) 4 degrees noise and 7% translationnoise (c) 10 degrees noise and 17% translation noise.Figure 8: Data misﬁt histories of the reconstruction with 200 and 600 dips (shown in Fig. 7),with various noise levels in the acquisition parameters .ment for low (200) and high (600) numbers of measurements. As expected, as the number ofmeasurements increases, the reconstructions become more robust to calibration noise.Here we also show the decay of the loss function as we increase the number of RBFs toreconstruct the object (Fig. 8). Note that the loss values do not include the regularizationterm, and hence they are not necessarily monotonically non-increasing. Our optimizationmethod guarantees a monotonically non-increasing solution for the objective in Eq. (4.12).

HAPE RECONSTRUCTION UNDER CALIBRATION UNCERTAINTY 19

Figure 9: Silhouettes of the object from diﬀerent orientations obtained using our modiﬁed SfSapproach described in 5.1.a b c dFigure 10: Joint reconstruction from tomographic and visual data. (a) Input object. Recon-structions based on (b) 800 dips, (c) 200 dips, and (d) 200 dips and 16 silhouettes.

Using the formulation in Eq. (5.1), we are ableto use data from two diﬀerent models to perform joint shape reconstruction. The ﬁrst modelthat we use is the dip transform as in the previous experiments. The second input source is amodiﬁed version of the SfS algorithm that was suggested in Section 5.1. Figure 10 shows thatthe use of the dip transform technique alone necessitates a large number of measurements (800)to obtain a good reconstruction, and achieves poor quality when fewer measurements (200)are used. The addition of the visual source signiﬁcantly improves the reconstruction whilereducing the number of the required dips. Thus, we conclude that the joint inversion algorithmis superior in terms of reconstruction accuracy. In addition, perhaps more importantly its useenables a considerable reduction in the number of data measurements needed, a favorableoutcome that could promote such experiments which are typically expensive, diﬃcult and/ordangerous to conduct (e.g., CT scans) when using conventional techniques.

In this experiment we demonstrate ourmethod for shape reconstruction from point clouds, as well as registration of point clouds.Often, two or more point clouds are given, possibly with some overlapping regions, and thegoal is to align them such that they can be viewed as one point cloud for performing tasks likeshape reconstruction. To demonstrate the registration and reconstruction from point clouds (a) (b) (c) (d)

Figure 11: Reconstruction from two point clouds. (a) The aligned point clouds. (b) Themisaligned point clouds used as input. (c) A reconstruction from the misaligned point cloudsonly. (d) A joint reconstruction from the misaligned point clouds together with SfS.we generate two non-overlapping point clouds from the shape in Fig. 5a—see Fig. 11a. Thenwe signiﬁcantly move one of the point clouds as shown in Fig. 11b, and try to reconstructthe shape. Since there is no overlap in the point distribution, methods like ICP often struggleto ﬁnd the correct alignment. Fig. 11c shows the a reconstruction using our mesh-freeimplementation. From an optimization perspective, this reconstruction is good even thoughit is misaligned—the data term in this experiment was reduced by two orders of magnitudein the optimization process. The drop-like artifacts that appear in the reconstruction arenot visible in the misﬁt term involving the point cloud. We note that our method is ableto recover the right alignment for some cases of initialization and choice of parameters, butthat is not guaranteed by a low data term, and is not robust. To guarantee the correctreconstruction following a successful optimization, we add another data modality, SfS, whichincludes silhouettes of the complete shape. Fig 11d shows a joint reconstruction from boththe point clouds and 8 ﬁgures of silhouettes. This time, the point clouds are aligned by theoptimization since it has the SfS to guide it towards the right shape, and recover the alignmentparameters.

7. Conclusions and Future Work.

In this paper, we proposed a general framework forrobust and eﬃcient 3D shape reconstruction using parametric level set methods with ellip-soidal radial basis functions. Using a compact PaLS representation, our reconstruction processestimates a relatively low number of parameters, and therefore, it requires a correspondinglysmall number of data measurements. Our framework has the ability to accurately estimateacquisition parameters as part of the reconstruction, and it is robust to noise that originatesfrom inaccuracies in these parameters. Our framework also supports multi-modal reconstruc-tion, and thus, it can be adapted to incorporate several models in a single shape estimation.The proposed method is capable of reducing the number of required measurements, which cangenerally be beneﬁcial for models where experiments are expensive, diﬃcult or dangerous toconduct (e.g., MRI scans).The compact analytical PaLS representation can be used to enhance various applications.For example, it is a natural approach to apply reconstruction from tomographic measurements

HAPE RECONSTRUCTION UNDER CALIBRATION UNCERTAINTY 21 under deformation [53], where the deformations can be treated analytically using the PaLSrepresentations, and the number of measurements can be reduced. The PaLS can also beconsidered as a regularization inside a deep neural network architecture, where the insightsfrom this work can be incorporated to reduce the computational cost for 3D shape analysisand image segmentation.

8. Appendix.8.1. The Wendland’s and Heaviside Functions.

The Wendland’s compactly supportedradial basis functions [49] constitute a family of polynomial RBFs ψ i whose range is [0 ,

1] andzero outside [0 ,

1] (for r > ψ i ( r ) = 0). A common choice to be used in R and R is one ofthe following: ψ ( r ) = (1 − r ) ∈ C (8.1) ψ ( r ) = (1 − r ) (4 r + 1) ∈ C (8.2) ψ ( r ) = (1 − r )

6+ 13 (35 r + 4 r + 3) ∈ C (8.3) ψ ( r ) = (1 − r ) (32 r + 25 r + 8 r + 1) ∈ C (8.4)where () + denotes the max ( x,

0) function. In this work, we use ψ because it is the simplestdiﬀerentiable function and we did not see any change in our experiments when using functionsof higher order.The Heaviside function σ smoothly transitions between 0 and 1. As shown in Fig. 3a andmentioned in [22], we wish to make the derivative of this function (the delta function) as ﬂatas possible to render our Gauss-Newton Hessian better conditioned. We deﬁne our Heavisidefunction to be a smoothed linear transition from 0 to 1 using a piecewise polynomial function.More precisely, it is deﬁned by a transition width δ and a smoothness width (cid:15) as follows:(8.5) σ δ,(cid:15) ( x ) =  x < − δ − (cid:15) δ(cid:15) ( x + δ + (cid:15) ) − δ − (cid:15) ≤ x < − δ + (cid:15) δ ( x + δ ) − δ + (cid:15) ≤ x < δ − (cid:15) − δ(cid:15) ( δ + (cid:15) − x ) + 1 δ − (cid:15) ≤ x < δ + (cid:15) δ + (cid:15) < x This function and its ﬁrst derivative are continuous, and it is very similar to the functionproposed in [22], using a sin function to deﬁne the transition smoothness. We prefer apolynomial approximation because of computation considerations. As default parametersthroughout all the experiments in this paper, we choose δ = 0 . , (cid:15) = 0 .

01, and use the letter σ to denote the function. Here we show the derivatives of theellipsoidal PaLS function, and the derivatives of the standard PaLS in (3.1) are deﬁned in [2].The i -th basis function in the sum of (3.5) is deﬁned by α i , B i , (cid:126)ξ i . For each RBF (containing10 parameters), we deﬁne its radius by:(8.6) r i ( (cid:126)x ) = (cid:13)(cid:13)(cid:13) (cid:126)x − (cid:126)ξ i (cid:13)(cid:13)(cid:13) † B i . Dropping the index i , the matrix B is deﬁned by 6 parameters due to symmetry: B =  B B B B B B B B B  . (8.7)We deﬁne the operator P =   (8.8)to be the operator that takes the vector [ B , ..., B ] (cid:62) and by multiplication transfers it into avector of size 9 that corresponds to the symmetric matrix B in (8.7). With this notation, thederivatives of r i ( (cid:126)x ) are computed by: ∂r i ∂(cid:126)ξ i = − B i (cid:126)z i r ( (cid:126)x ) ∈ R (8.9) ∂r i ∂B i = 1 r ( (cid:126)x ) P (cid:62) vec ( (cid:126)z i (cid:126)z (cid:62) i ) ∈ R (8.10)where (cid:126)z i = (cid:126)x − (cid:126)ξ i , and the operator vec is the column-stack operator transferring a 3 × u , as the additionalterms are just scalars: ∂u∂α i ( (cid:126)x ) = σ (cid:48) ( u ( (cid:126)x )) ψ ( r i ( (cid:126)x ))(8.11) ∂u∂(cid:126)ξ i ( (cid:126)x ) = α i σ (cid:48) ( u ( (cid:126)x )) ψ (cid:48) ( r i ( (cid:126)x )) ∂r i ∂(cid:126)ξ i (8.12) ∂u∂B i ( (cid:126)x ) = α i σ (cid:48) ( u ( (cid:126)x )) ψ (cid:48) ( r i ( (cid:126)x )) ∂r i ∂B i (8.13) As noted before, the radial basis functionscan be rotated analytically. Thus, instead of rotating u ( m ) we can simply rotate each of thebasis functions according to its rotation parameters. For the spherical RBFs, we have therotation function in (4.7) for rotating the i -th basis function at angles θ j , φ j and translationvector (cid:126)b j , which corresponds to the j-th measurements of the shape. The Jacobian of (4.7)with respect to ( α i , β i , (cid:126)ξ i ) is equal to(8.14) ∂rot j ( α i , β i , (cid:126)ξ i ) ∂ ( α i , β i , (cid:126)ξ i ) =  (cid:126) T (cid:126) T (cid:126) (cid:126) Q j  ∈ R × . HAPE RECONSTRUCTION UNDER CALIBRATION UNCERTAINTY 23

For the ellipsoidal RBFs, the rotation function is deﬁned in (4.10) and its Jacobian isdeﬁned by ∂rot j ( α i , B i , (cid:126)ξ i ) ∂ ( α i , B i , (cid:126)ξ i ) =  × × × tril ( Q j ⊗ Q j P ) × × × × Q j  ∈ R × , (8.15)where P is the matrix deﬁned in (8.8) and tril ( A ) is an operator that chooses the 6 out of9 rows of A corresponding to the indices of the lower-triangular elements of a 3 × tril ( Q j ⊗ Q j P ) ∈ R × . To incorporate the rotationparameters in the optimization process, we need to compute the derivatives of u according tothem. This way, we are able to overcome uncertainty of the calibration in the data-acquisitionprocess. For spherical RBFs, we have to take the derivatives of (4.7) with respect to theadditional 5 parameters in ( θ j , φ j ,(cid:126)b j ). We obtain: ∂rot ( α i , β i , (cid:126)ξ i , θ j , φ j ,(cid:126)b j ) ∂θ j = [0 , , ∂Q j ∂θ j ( (cid:126)x − (cid:126)x mid )] T = [0 , , (cid:126)z θ j ] T ∈ R (8.16) ∂rot ( α i , β i , (cid:126)ξ i , θ j , φ j ,(cid:126)b j ) ∂φ j = [0 , , ∂Q j ∂φ j ( (cid:126)x − (cid:126)x mid )] T = [0 , , (cid:126)z φ j ] T ∈ R (8.17) ∂rot ( α i , β i , (cid:126)ξ i , θ j , φ j ,(cid:126)b j ) ∂(cid:126)b j = (cid:20) × I × (cid:21) ∈ R × (8.18)where(8.19) (cid:126)z θ j = ∂Q j ∂θ j ( (cid:126)x − (cid:126)x mid ) , and (cid:126)z φ j = ∂Q j ∂φ j ( (cid:126)x − (cid:126)x mid ) . Placing all the partial derivatives together in the Jacobian we obtain:(8.20) ∂rot ( α i , β i , (cid:126)ξ i , θ j , φ j ,(cid:126)b j ) ∂ ( α i , β i , (cid:126)ξ i , θ j , φ j ,(cid:126)b j ) = (cid:20) I × × × × × × Q j (cid:126)z θ j (cid:126)z φ j I × (cid:21) × For ellipsoidal RBFs, the derivatives of (4.10) include: ∂rot ( α i , B i , (cid:126)ξ i , θ j , φ j ,(cid:126)b j ) ∂ ( θ j , φ j ) =  (cid:126)T θ j (cid:126)T φ j (cid:126)z θ j (cid:126)z φ j  ∈ R × , (8.21)where (cid:126)T θ j = tril (cid:32) Q j B i ∂Q Tj ∂θ j + ∂Q j ∂θ j B i Q Tj (cid:33) ∈ R (8.22) (cid:126)T φ j = tril (cid:32) Q j B i ∂Q Tj ∂φ j + ∂Q j ∂φ j B i Q Tj (cid:33) ∈ R , (8.23) and the derivative with respect to (cid:126)b j is similar to (8.18). Altogether, we obtain the Jacobian:(8.24) ∂rot ( α i , B i , (cid:126)ξ i , θ j , φ j ,(cid:126)b j ) ∂ ( α i , B i , (cid:126)ξ i , θ j , φ j ,(cid:126)b j ) =  × × × (cid:126) tril ( Q j ⊗ Q j P ) × × (cid:126)T θ j (cid:126)T φ j × (cid:126) × Q j (cid:126)z θ j (cid:126)z φ j I ×  × . REFERENCES [1]

K. Aberman, O. Katzir, Q. Zhou, Z. Luo, A. Sharf, C. Greif, B. Chen, and D. Cohen-Or , Diptransform for 3d shape reconstruction , ACM Transactions on Graphics (TOG), 36 (2017), p. 79.[2]

A. Aghasi, M. Kilmer, and E. L. Miller , Parametric level set methods for inverse problems , SIAMJournal on Imaging Sciences, 4 (2011), pp. 618–650.[3]

A. Aghasi and J. Romberg , Sparse shape reconstruction , SIAM Journal on Imaging Sciences, 6 (2013),pp. 2075–2108.[4]

H. Avron, A. Sharf, C. Greif, and D. Cohen-Or , (cid:96) -sparse reconstruction of sharp point set surfaces ,ACM Transactions on Graphics (TOG), 29 (2010), p. 135.[5] S. Basu and Y. Bresler , Uniqueness of tomography with unknown view angles , IEEE Transactions onImage Processing, 9 (2000), pp. 1094–1106.[6]

M. Berger, A. Tagliasacchi, L. Seversky, P. Alliez, J. Levine, A. Sharf, and C. Silva , State ofthe art in surface reconstruction from point clouds , EUROGRAPHICS star reports, 1 (2014), pp. 161–185.[7]

M. Berger, A. Tagliasacchi, L. M. Seversky, P. Alliez, G. Guennebaud, J. A. Levine, A. Sharf,and C. T. Silva , A survey of surface reconstruction from point clouds , in Computer Graphics Forum,vol. 36, Wiley Online Library, 2017, pp. 301–329.[8]

A. Bermano, A. Vaxman, and C. Gotsman , Online reconstruction of 3d objects from arbitrary cross-sections , ACM Transactions on Graphics (TOG), 30 (2011), p. 113.[9]

P. J. Besl and N. D. McKay , Method for registration of 3-d shapes , in Sensor fusion IV: controlparadigms and data structures, vol. 1611, International Society for Optics and Photonics, 1992,pp. 586–606.[10]

J. Bezanson, A. Edelman, S. Karpinski, and V. B. Shah , Julia: A fresh approach to numericalcomputing , SIAM Review, 59 (2017), pp. 65–98.[11]

F. Bleichrodt, J. Sijbers, J. de Beenhouwer, and K. J. Batenburg , An alignment method for fanbeam tomography , in Proceedings of 1st International Conference on Tomography of Materials andStructures, 2013, pp. 103–106.[12]

N. Boumal, B. Mishra, P.-A. Absil, and R. Sepulchre , Manopt, a matlab toolbox for optimizationon manifolds , The Journal of Machine Learning Research, 15 (2014), pp. 1455–1459.[13]

E. Bretin, F. Dayrens, and S. Masnou , Volume reconstruction from slices , SIAM Journal on ImagingSciences, 10 (2017), pp. 2326–2358.[14]

J. C. Carr, R. K. Beatson, J. B. Cherrie, T. J. Mitchell, W. R. Fright, B. C. McCallum,and T. R. Evans , Reconstruction and representation of 3d objects with radial basis functions , inProceedings of the 28th annual conference on Computer graphics and interactive techniques, ACM,2001, pp. 67–76.[15]

W. Chen, X. Han, G. Li, C. Chen, J. Xing, Y. Zhao, and H. Li , Deep rbfnet: Point cloud featurelearning using radial basis functions , arXiv preprint arXiv:1812.04302, (2018).[16]

D. Cremers and K. Kolev , Multiview stereo and silhouette consistency via convex functionals overconvex domains , IEEE Trans. on Pattern Analysis and Machine Intelligence, 33 (2011), pp. 1161–1174.[17]

S. Cuomo, A. Galletti, G. Giunta, and A. Starace , Surface reconstruction from scattered point viarbf interpolation on gpu , in Computer Science and Information Systems (FedCSIS), 2013 FederatedConference on, IEEE, 2013, pp. 433–440.[18]

E. De Sturler, S. Gugercin, M. E. Kilmer, S. Chaturantabut, C. Beattie, and M. O’Connell , Nonlinear parametric inversion using interpolatory model reduction , SIAM Journal on Scientiﬁc Com-

HAPE RECONSTRUCTION UNDER CALIBRATION UNCERTAINTY 25 puting, 37 (2015), pp. B495–B517.[19]

S. W. Fung and L. Ruthotto , An uncertainty-weighted asynchronous admm method for parallel pdeparameter estimation , arXiv preprint arXiv:1806.00192, (2018).[20]

E. Haber , Computational methods in geophysical electromagnetics , vol. 1, SIAM, 2014.[21]

L. Houben and M. B. Sadan , Reﬁnement procedure for the image alignment in high-resolution electrontomography , Ultramicroscopy, 111 (2011), pp. 1512–1520.[22]

A. Kadu, T. van Leeuwen, and K. J. Batenburg , A parametric level-set method for partially discretetomography , in International Conference on Discrete Geometry for Computer Imagery, Springer, 2017,pp. 122–134.[23]

A. Kadu, T. van Leeuwen, and W. A. Mulder , Salt reconstruction in full-waveform inversion with aparametric level-set method , IEEE Trans. on Computational Imaging, 3 (2017), pp. 305–315.[24]

B. Kainz, M. Steinberger, W. Wein, M. Kuklisova-Murgasova, C. Malamateniou, K. Ker-audren, T. Torsney-Weir, M. Rutherford, P. Aljabar, J. V. Hajnal, et al. , Fast volumereconstruction from motion corrupted stacks of 2d slices , IEEE transactions on medical imaging, 34(2015), pp. 1901–1913.[25]

J. Kim and C.-O. Lee , Three-dimensional volume reconstruction using two-dimensional parallel slices ,SIAM Journal on Imaging Sciences, 12 (2019), pp. 1–27.[26]

K. Kolev, T. Brox, and D. Cremers , Fast joint estimation of silhouettes and dense 3d geometry frommultiple images , IEEE Trans. on Pattern Analysis and Machine Intelligence, 34 (2012), pp. 493–505.[27]

K. Kolev, M. Klodt, T. Brox, and D. Cremers , Continuous global optimization in multiview 3dreconstruction , International Journal of Computer Vision, 84 (2009), pp. 80–96.[28]

F. Larusson, P. G. Anderson, E. Rosenberg, M. E. Kilmer, A. Sassaroli, S. Fantini, and E. L.Miller , Parametric estimation of 3d tubular structures for diﬀuse optical tomography , Biomedicaloptics express, 4 (2013), pp. 271–286.[29]

A. Laurentini , The visual hull concept for silhouette-based image understanding , IEEE Transactions onpattern analysis and machine intelligence, 16 (1994), pp. 150–162.[30]

L. Liu, C. Bajaj, J. O. Deasy, D. A. Low, and T. Ju , Surface reconstruction from non-parallel curvenetworks , Computer Graphics Forum, 27 (2008), pp. 155–163.[31]

M. S. McMillan, C. Schwarzbach, E. Haber, and D. W. Oldenburg ,

3d parametric hybrid inversionof time-domain airborne electromagnetic data , Geophysics, 80 (2015), pp. K25–K36.[32]

L. Metivier, R. Brossier, S. Operto, and J. Virieux , Full waveform inversion and the truncatednewton method , SIAM Review, 59 (2017), pp. 153–195.[33]

Y. Ohtake, A. Belyaev, and H.-P. Seidel , A multi-scale approach to 3d scattered data interpolationwith compactly supported basis functions , in Shape Modeling International, 2003, IEEE, 2003, pp. 153–161.[34]

D. Y. Parkinson, C. Knoechel, C. Yang, C. A. Larabell, and M. A. Le Gros , Automatic alignmentand reconstruction of images for soft x-ray tomography , Journal of structural biology, 177 (2012),pp. 259–266.[35]

C. R. Qi, L. Yi, H. Su, and L. J. Guibas , Pointnet++: Deep hierarchical feature learning on point setsin a metric space , in Advances in Neural Information Processing Systems, 2017, pp. 5099–5108.[36]

L. I. Rudin, S. Osher, and E. Fatemi , Nonlinear total variation based noise removal algorithms , PhysicaD: Nonlinear Phenomena, 60 (1992), pp. 259–268.[37]

S. Rusinkiewicz and M. Levoy , Eﬃcient variants of the icp algorithm. , in 3dim, vol. 1, 2001, pp. 145–152.[38]

L. Ruthotto, E. Treister, and E. Haber , jInv – a ﬂexible Julia package for PDE parameter estimation ,SIAM J. Sci. Comput., 39 (2017), pp. S702–S722.[39] F. Santosa , A level-set approach for inverse problems involving obstacles fadil santosa , ESAIM: Control,Optimisation and Calculus of Variations, 1 (1996), pp. 17–33.[40]

J. A. Sethian , Level set methods and fast marching methods: evolving interfaces in computational ge-ometry, ﬂuid mechanics, computer vision, and materials science , vol. 3, Cambridge university press,1999.[41]

A. Singer and Y. Shkolnisky , Three-dimensional structure determination from common lines in cryo-em by eigenvectors and semideﬁnite programming , SIAM journal on imaging sciences, 4 (2011),pp. 543–572.

HAPE RECONSTRUCTION UNDER CALIBRATION UNCERTAINTY [42]

R. Y. Takimoto, M. d. S. G. Tsuzuki, R. Vogelaar, T. de Castro Martins, A. K. Sato, Y. Iwao,T. Gotoh, and S. Kagei ,

3d reconstruction and multiple point cloud registration using a low precisionrgb-d sensor , Mechatronics, 35 (2016), pp. 11–22.[43]

G. K. Tam, Z.-Q. Cheng, Y.-K. Lai, F. C. Langbein, Y. Liu, D. Marshall, R. R. Martin, X.-F.Sun, and P. L. Rosin , Registration of 3d point clouds and meshes: a survey from rigid to nonrigid ,IEEE transactions on visualization and computer graphics, 19 (2012), pp. 1199–1217.[44]

E. Treister and E. Haber , Full waveform inversion guided by travel time tomography , SIAM J. Sci.Comput., 39 (2017), pp. S587–S609.[45]

C.-Y. Tsai and C.-H. Huang , Indoor scene point cloud registration algorithm based on rgb-d cameracalibration , Sensors, 17 (2017), p. 1874.[46]

T. van Leeuwen, S. Maretzke, and K. J. Batenburg , Automatic alignment for three-dimensionaltomographic reconstruction , Inverse Problems, 34 (2018), p. 024004.[47]

L. A. Vese and T. F. Chan , A multiphase level set framework for image segmentation using the mumfordand shah model , International journal of computer vision, 50 (2002), pp. 271–293.[48]

C. R. Vogel , Computational methods for inverse problems , vol. 23, SIAM, Philadelphia, 2002.[49]

H. Wendland , Piecewise polynomial, positive deﬁnite and compactly supported radial functions of minimaldegree , Advances in computational Mathematics, 4 (1995), pp. 389–396.[50]

R. T. Whitaker , A level-set approach to 3d reconstruction from range data , International journal ofcomputer vision, 29 (1998), pp. 203–231.[51]

C. Yang, E. G. Ng, and P. A. Penczek , Uniﬁed 3-d structure and projection orientation reﬁnementusing quasi-newton algorithm , Journal of structural biology, 149 (2005), pp. 53–64.[52]

K. Yin, H. Huang, P. Long, A. Gaissinski, M. Gong, and A. Sharf , Full 3d plant reconstructionvia intrusive acquisition , Computer Graphics Forum, 35 (2016), pp. 272–284.[53]

G. Zang, R. Idouchi, R. Tao, G. Lubineau, P. Wonka, and W. Heidrich , Space-time tomographyfor continuously deforming objects , ACM Transactions on Graphics (TOG), 37 (2018), p. 100.[54]

Q. Zheng, X. Fan, M. Gong, A. Sharf, O. Deussen, and H. Huang ,

4d reconstruction of bloomingﬂowers , Computer Graphics Forum, 36 (2017), pp. 405–417.[55]