Manifold Repairing, Reconstruction and Denoising from Scattered Data in High-Dimension
MManifold Repairing, Reconstruction and Denoising fromScattered Data in High-Dimension
Shira Faigenbaum-Golovin , ∗ David Levin School of Mathematical Sciences, Tel Aviv University, Israel ∗ Corresponding author, E-mail address: [email protected]
February 4, 2021
Abstract
We consider a problem of great practical interest: the repairing and recovery of alow-dimensional manifold embedded in high-dimensional space from noisy scattered data.Suppose that we observe a point cloud sampled from the low-dimensional manifold, withnoise, and let us assume that there are holes in the data. Can we recover missing infor-mation inside the holes? While in low-dimension the problem was extensively studied,manifold repairing in high dimension is still an open problem. We introduce a new ap-proach, called Repairing Manifold Locally Optimal Projection (R-MLOP), that expandsthe MLOP method [14], to cope with manifold repairing in low and high-dimensionalcases. The proposed method can deal with multiple holes in a manifold. We prove thevalidity of the proposed method, and demonstrate the effectiveness of our approach byconsidering different manifold topologies, for single and multiple holes repairing, in lowand high dimensions. keywords:
Manifold learning, Manifold denoising, Manifold reconstruction, Manifold repair-ing, High dimension, Dimensional reduction
MSC classification:
Imagine that we observe partial data sampled, with noise, from a manifold in high dimensionalspace. Is it possible to reconstruct this manifold exactly or at least accurately? For example,in the case of acquiring oil and gas well data, where each drilling operation is expensive, high-resolution sampling can be extremely pricey. In general, everybody would agree that recoveringdata from an incomplete sample is impossible. In this paper, we show that if the unknowndata is known to lie on a manifold in high dimension, then it is possible to reconstruct it andapproximate the samples that we have not seen.Before we turn to discuss the problem for high-dimensional data, we first consider the simpler,yet challenging questions of surface reconstruction and recovering missing information. Thechallenge of plain surface reconstruction gained a lot of attention in the recent years, withmany methods proposed to address it (see, the papers of [1, 5, 9, 21, 24]). There are many1 a r X i v : . [ m a t h . NA ] F e b hallenges when dealing with this problem, among them the need for preservation of sharp fea-tures, reconstruction from uniform/non-uniform data sampling, undersampling, fast-changingcurvature [12], adjacent surface segments, and the presence of noise. For example, usuallyreconstruction methods smooth out the existing corners and sharp features present on the sur-face. Several papers raised the sharp features preservation challenge and suggested methods toaddress it (e.g., [18, 35]).Missing information in the data, i.e., the presence of holes, is an additional challenge insurface reconstruction. During surface acquisition, sampling the entire geometry is not alwayspossible due to physical obstacles, occlusions, scanning angle issues, scanning costs, or to simplythe fact that the available data is incomplete. The problem of hole filling can be formulatedas follows: Given data sampled from a surface with holes, the task is to generate new pointsso as to reconstruct the data inside the hole. The problem of filling holes and recovering lostinformation in low dimensions has been dealt with by many researchers (for a concise survey,see [15]). The problem itself is usually treated as a two-step challenge, namely, locating the hole,and then recovering the missing data. The solution for both steps can be roughly classofoedaccording to the way in which the input data are given: as a point-cloud or as a mesh.The hole identification problem is further challenged by the real-life scenarios since usuallyinformation regarding surface geometry is unavailable. Subsequently, recognizing the areas thatneed to be amended and the ones that should be omitted (since this is the true geometry ofthe surface), becomes an ill-defined and non-trivial task. In case some information regardingthe connectivity of the data is available, mesh representation comes in handy. Methods likethose presented in [17, 19, 36] utilize this information for the hole identification by recognizinghollow edges, extract boundary via k -nearest neighbors search, or labeling boundaries basedon the triangulation structure. On the other hand, if the data are given as a point cloud,other methods are applicable. In [8] the authors suggested locating boundaries via sparse areaheuristics, in [25] an enhanced k -nearest neighborhood is used, while in [4] a set of criteria arederived for automatic hole detection (the criteria rely on symmetric k - (cid:15) neighborhoods, theirangle criterion and the deviation of a point and its neighborhood average, as well as the shapecriterion, based on eigenvalues).Once a hole is located, reconstructing the information inside the hole can be achieved byvarious methods. In the case where the surface is represented as a mesh, the paper [36] usesa triangle growth procedure based on the boundary edge angle, in [17] the hole is filled underconstraint on the boundary, in [19,28] holes are amended in a piecewise manner, i.e., by splittingholes into smaller parts and fixing latter, while in [2] a mesh repairing technique is proposed.When the surface is given as a point cloud, the authors of [8] suggested fitting an algebraicsurface patch to the neighborhood of the boundary; this avoids parametrization and allowsone to deal with holes with complicated topology. Other solutions rely on the minimal surfacesolutions (Plateau’s problem [16, 29]), diffusion [11], or an MLS-based point-cloud resamplingtechnique [25, 31]. In addition, some methods address the surface repairing problem as theproblem of reconstructing the surface from boundary conditions [26, 32].The problem of completing missing information arises not only in geometrical settings, butalso in other fields, for example, in matrix computations [7], or in image processing. In thelatter case, the problem is known as image inpainting, and various approaches were proposedit [6, 10, 27, 34]. Of special interest is the elegant solution suggested in [6], where the hole edgeinformation is propagated inside the hole using the gradient of the Laplacian in the directionof the edge. Thanks to the geometrical nature of the solution, it can be adopted to the surfacecase. 2he existing solutions for surface reconstruction have a hard time coping with challengesraised by real-life applications, especially in high dimension. First, they rely on some priorknowledge regarding the surface (e.g., a triangulation). In addition, these solutions do not dealwith noise or outliers, which are usually present in the data. Moreover, since they were tailoredfor the surface repairing challenge, most of the known methods cannot be extended to thehigh-dimensional case (due to practical reasons connected with inefficiency of meshes in highdimensions and to the curse of dimensionality). Thus, manifold repairing in high dimensionsis still an open problem.In this paper, we will address the problem of hole repairing in low and high dimensions. Weintroduce a new approach, which extends the Manifold Locally Optimal Projection (MLOP)[14], to cope with manifold repairing in low and high-dimensional cases. The MLOP methodis a multidimensional extension of the Locally Optimal Projection algorithm [24]. The MLOPbypasses the curse of dimensionality and avoids the need for dimension reduction. It is basedon a non-convex optimization problem, which leverages a generalization of the outlier-robust L -median to higher dimensions while generating noise-free quasi-uniformly distributed pointsreconstructing the unknown low-dimensional manifold.The vanilla MLOP does not repair the manifold, since by its design it maintains the proxim-ity to the original points. For example, see the MLOP method reconstruction of the Stanfordbunny [23], in Figure 1 (A-B), where one can see that the MLOP method maintained the scat-tered points geometry, and the existing hole was not amended. To overcome this, we suggestenhancing the MLOP method to address data repairing problem by adding another force ofattraction which will push the boundary points towards their convex hull (see Figure 1 (C)). Itshould be noted that since the hole identification problem is ill-posed, and the best way to fixthe missing information is by the user manually identifying the holes that need to be amended.In what follows we first assume that the location of the hole is known, and later propose amethod for hole identification. (A) (B) (C) Figure 1: Amending the scattered data for the Stanford bunny. (A) The initial scattered dataof size 1K sampled from the bunny model (the initial P -points in green, the initial Q -points inred). (B) The picture produced by the plain MLOP, which results in a quasi-uniform samplingof the bunny: the hole is not amended. (C) The picture produced by the R-MLOP algorithm. The Locally Optimal Projection (LOP) method was introduced in [24] to approximate two-dimensional surfaces in R from point-set data. The procedure does not require the estimation3f local normals and planes, or parametric representations. Its main advantage is that itperforms well in the case of noisy samples. In [14] the LOP mechanism was generalized todevise the so-called Manifold Locally Optimal Projection (MLOP) method. This method wasfurther enhanced to deal with approximation of functions in presence of noise in [13]. Here, wegive a concise overview of the MLOP method and its key properties.First, we introduce the h - ρ condition, defined for scattered-data approximation of functions(which is an adaptation of the condition in [20] for low-dimensional data), to handle finitediscrete data on manifolds. Definition 1. h - ρ sets of fill distance h and density ≤ ρ with respect to a manifold M .Let M be a manifold in R n and consider sets of data points sampled from M . We say thatsuch a set P = { P j } Ji =1 is an h - ρ set if:1. h is the fill distance, i.e., h = median p j ∈ P min p j ∈ P \{ p i } (cid:107) p i − p j (cid:107) .2. { P ∩ ¯ B ( y, kh ) } ≤ ρk n , k ≥ , y ∈ R n .Here Y denotes the number of elements in a set Y and ¯ B ( x, r ) denotes the closed ball ofradius r centered at x . Note that the last condition regarding the point separation δ defined in [20], which statesthat there exists δ > (cid:107) p i − p j (cid:107) ≥ δ, ≤ i ≤ j ≤ J , is redundant in the case offinite data. We also note that the vanilla definition of the fill distance uses the supremum supin its expression; here we use the median in order to deal with the presence of outliers.The setting of the high-dimensional reconstruction problem is the following: Let M be amanifold in R n of unknown intrinsic dimension d (cid:28) n . There is given a noisy point-cloud P = { p j } Jj =1 ⊂ R n situated near the manifold M such that P is a h - ρ set. We wish to find anew point-set Q = { q i } Ii =1 ⊂ R n which will serve as a noise-free approximation of M . We seeka solution in the form of a new, quasi-uniformly distributed point-set Q that will replace thegiven data P and provide a noise-free approximation of M . This is achieved by leveraging thewell-studied weighted L -median [30] used in the LOP algorithm and requiring a quasi-uniformdistribution of points q i ∈ Q . These ideas are encoded by the cost function G ( Q ) = E ( P, Q ) + Λ E ( Q ) = (cid:88) q i ∈ Q (cid:88) p j ∈ P (cid:107) q i − p j (cid:107) H (cid:15) w i,j + (cid:88) q i ∈ Q λ i (cid:88) q i (cid:48) ∈ Q \{ q i } η ( (cid:107) q i − q (cid:48) i (cid:107) ) (cid:98) w i,i (cid:48) , (1)where the weights w i,j are rapidly decreasing smooth functions. The MLOP implementationuses w i,j = exp (cid:8) − (cid:107) q i − p j (cid:107) /h (cid:9) and (cid:98) w i,i (cid:48) = exp (cid:8) − (cid:107) q i − q (cid:48) i (cid:107) /h (cid:9) . The L -norm used in [24]is replaced by the “norm” (cid:107) · (cid:107) H (cid:15) introduced in [22] as (cid:107) v (cid:107) H (cid:15) = √ v + (cid:15) , where (cid:15) > (cid:15) = 0 . (cid:107) · (cid:107) H (cid:15) instead of (cid:107) · (cid:107) hasthe advantage that one works with a smooth cost function and outliers can be removed. Inaddition, h and h are the support size parameters of w i,j and (cid:98) w i,i (cid:48) which guarantee a sufficientamount of P or Q points for the reconstruction (for more details, see the subsection “OptimalNeighborhood Selection” in [14]). Also, η ( r ) is a decreasing function such that η (0) = ∞ ; inour case we take η ( r ) = r . Finally, { λ i } Ii =1 are constant balancing parameters.In order to solve the problem with the cost function (1), we look for a point-set Q that minimizes G ( Q ). The solution Q is found via the gradient descent iterations q ( k +1) i (cid:48) = q ( k ) i (cid:48) − γ k ∇ G ( q ( k ) i (cid:48) ) , i (cid:48) = 1 , . . . , I , (2)where the initial guess { q (0) i } Ii − = Q (0) consists of points sampled from P .4he gradient of G is ∇ G ( q ( k ) i (cid:48) ) = J (cid:88) j =1 (cid:0) q ( k ) i (cid:48) − p j (cid:1) α i (cid:48) j − λ i (cid:48) I (cid:88) i =1 i (cid:54) = i (cid:48) (cid:0) q ( k ) i (cid:48) − q ( k ) i (cid:1) β i (cid:48) i , (3)with the coefficients α i (cid:48) j and β i (cid:48) j given by the formulas α i (cid:48) j = w i,j (cid:107) q i − p j (cid:107) H (cid:15) (cid:18) − h (cid:107) q i − p j (cid:107) H (cid:15) (cid:19) (4)and β i (cid:48) i = (cid:98) w i,i (cid:48) (cid:107) q i − q i (cid:48) (cid:107) (cid:18)(cid:12)(cid:12)(cid:12)(cid:12) ∂η ( (cid:107) q i − q i (cid:48) (cid:107) ) ∂r (cid:12)(cid:12)(cid:12)(cid:12) + 2 η ( (cid:107) q i − q i (cid:48) (cid:107) ) h (cid:107) q i − q i (cid:48) (cid:107) (cid:19) , (5)for i = 1 , ..., I , i (cid:54) = i (cid:48) . In order to balance the two terms in ∇ G ( q ( k ) i (cid:48) ), the factors λ i (cid:48) areinitialized in the first iteration as λ i (cid:48) = − (cid:13)(cid:13)(cid:13)(cid:13) J (cid:80) j =1 (cid:0) q ( k ) i (cid:48) − p j (cid:1) α i (cid:48) j (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) I (cid:80) i =1 (cid:0) q ( k ) i (cid:48) − q ( k ) i (cid:1) β i (cid:48) i (cid:13)(cid:13)(cid:13)(cid:13) . (6)Balancing the contribution of the two terms is important in order to maintain equal influenceof the attraction and repulsion forces in G ( Q ). The step size in the direction of the gradient γ k is calculated as indicated in [3]: γ k = (cid:104)(cid:52) q ( k ) i (cid:48) , (cid:52) G ( k ) i (cid:48) (cid:105)(cid:104)(cid:52) G ( k ) i (cid:48) , (cid:52) G ( k ) i (cid:48) (cid:105) , (7)where (cid:52) q ( k ) i (cid:48) = q ( k ) i (cid:48) − q ( k − i (cid:48) and (cid:52) G ( k ) i (cid:48) = ∇ G ( k ) i (cid:48) − ∇ G ( k − i (cid:48) . Remark . The reasoning in terms of Euclidean distances, which is the cornerstone of theMLOP method, works well in low dimensions, e.g., for the reconstruction of surfaces in 3D,but breaks down in high dimensions once noise is present. To deal with this issue, a dimensionreduction is performed via random linear sketching [33]. It should be emphasized that thedimension reduction procedure is utilized solely for the calculation of norms, and the manifoldreconstruction is performed in the high-dimensional space. Thus, given a point x ∈ R n , weproject it to a lower dimension m (cid:28) n using a random matrix, S , with certain properties.Subsequently, the norm of (cid:107) S t x (cid:107) will approximate (cid:107) x (cid:107) . The construction of S is carried out inthe following steps:1. Sample G ∈ R J × m with G ∼ N (0 , B ∈ R n × m as B := P t G .3. Calculate the QR decomposition of B as B = SR , and use S as the dimension reductionmatrix. Preliminaries — Optimal Neighborhood Selection
The support sizes h and h of the functions w i,j = exp (cid:8) − (cid:107) q i − p j (cid:107) /h (cid:9) and (cid:98) w i,i (cid:48) =exp (cid:8) − (cid:107) q i − q (cid:48) i (cid:107) /h (cid:9) , respectively, are closely related to the fill distance of the P -points andthe Q - points. Due to the importance of optimal selection of these parameters, we quote hereseveral definitions and results from [14]. Unlike the standard definition of the notion of filldistance in scattered-data function approximation [20], we introduce5 efinition 2. The fill distance of the set P is h = median p i ∈ P min p j ∈ P \{ p i } (cid:107) p i − p j (cid:107) . (8) Note that the vanilla definition of fill distance uses the supremum in the definition (insteadof the median ). However, as mentioned above, in our case we replace the supremum with themedian so as to deal with the presence of outliers.
Definition 3.
Given two point-clouds, P = { p j } Jj =1 ⊂ R n and Q = { q i } Ii =1 ⊂ R n , situated neara manifold M in R n , such that their sizes obey the constraint I ≤ J , denote ν = (cid:4) JI (cid:5) . Then wesay that the radius that guarantees approximately ν points from P in the support of each point q i is (cid:98) h = c h , with c given by (cid:98) h = c h , with c = argmin { c : B ch ( q i ) ∩ P ) ≥ ν, ∀ q i ∈ Q } . (9) Remark . Let σ be the variance of the Gaussian w ( r ) = exp {− r /σ } . For the normaldistribution, four standard deviations away from the mean account for 99 .
99% of the set. Inour case, by the definition of w i,k , since h is the square root of the variance, 4 σ = 4 h √ = 2 √ h covers 99 .
99% of the support size of w i,k .The following theorem, proved in [14], indicates how the parameters h and h should beselected. Theorem 2.3.
Let M be a d -dimensional manifold in R n . Suppose given two point-clouds, P = { p j } Jj =1 ⊂ R n and Q = { q i } Ii =1 ⊂ R n , situated near M , such that their sizes obey theconstraint I ≤ J , and let ν = (cid:4) JI (cid:5) . Let w i,j be the locally supported weight function given by w i,j = exp {−(cid:107) q i − p j (cid:107) /h } . Then a neighborhood size of h = 2 √ (cid:98) h guarantees . d ν pointsin the support of w i,j , where (cid:98) h = c h , with c given by (9) . Theoretical Analysis of the MLOP Method
For the sake of completeness, we mention here several important results regarding the conver-gence of the MLOP method, its order of approximation, rate of convergence and complexity(for more details, see [14]).
Theorem 2.4 (Convergence to a stationary point) . Let M be a in R n of unknown intrinsicdimension d . Suppose that the scattered data points P = { p j } Jj =1 were sampled near the manifold M , h and h are set as in Theorem , and the h - ρ set condition is satisfied with respect to M . Let the points Q (0) = { q (0) i } Ii =1 be sampled from P . Then the gradient descent iterations (6) converge almost surely to a local minimizer Q ∗ . Theorem 2.5 (Order of approximation) . Let P = { p j } Jj =1 be a set of points that are sampled ( without noise ) from a d –dimensional C manifold M , and satisfy the h - ρ condition. Then fora fixed ρ and a finite support of size h of the weight functions w i,j , the set Q defined by the MLOP algorithm has an order of approximation O ( h ) to M . Definition 4.
A differentiable function f ( · ) is called L -smooth if for any x , x (cid:107)∇ f ( x ) − ∇ f ( x ) (cid:107) ≤ L (cid:107) x − x (cid:107) . Theorem 2.6 (Rate of convergence) . Suppose the point-set P = { p j } Jj =1 is sampled near a d -dimensional manifold in R n and the assumptions of Theorem are satisfied. Suppose the ost function G defined in (1) is an L -smooth function. For (cid:15) > , let Q ∗ be a local fixed-point solution of the gradient descent iterations, with step size γ = (cid:15) − . Set the terminationcondition as (cid:107)∇ f ( x ) (cid:107) ≤ (cid:15) . Then Q ∗ is an (cid:15) -first-order stationary point that will be reachedafter k = L ( G ( Q (0) )– G ( Q ∗ )) (cid:15) − iterations, where L = l and l < ∞ is a bounded parameter. Theorem 2.7 (Complexity) . Given a point-set P = { p j } Jj =1 sampled near a d -dimensionalmanifold M ∈ R n , let Q = { q i } Ii =1 be a set of points that will provide the desired manifoldreconstruction. Then the complexity of the MLOP algorithm is O ( nmJ + kI ( nm (cid:98) I + (cid:98) J )) , wherethe number of iterations k is bounded as in Theorem , m (cid:28) n is the smaller dimensionto which we reduce the dimension of the data, and (cid:98) I and (cid:98) J are the numbers of points in thesupport of the weight functions (cid:98) w i,i (cid:48) , w i,j that belong to the Q -set and P -set, respectively. Thus,the approximation is linear in the ambient dimension n , and does not depend on the intrinsicdimension d . The settings for the reconstruction and repairing problem is the following: Given a noisy point-cloud P = { p j } Jj =1 ⊂ R n situated near a manifold M ⊂ R n , of unknown intrinsic dimension d and with incomplete data in a ball B ( c, r ). We look for a solution in the form of a new point-set Q = { q i } Ii =1 ⊂ R n that will replace the given data P , be quasi-uniformly distributed, providea noise-free approximation of M , and will reconstruct the missing information in the givenlocation. This is achieved by leveraging the well-studied weighted L -median [30], requiring aquasi-uniform distribution of points q i ∈ Q and propagating the boundary information insidethe hole.We propose here the Repairing Manifold Locally Optimal Projection (R-MLOP) method,which enhances the MLOP approach (described in Section 2.1) and is inspired by a method forlocal approximation of functions. In the latter approach, the value of a function at a point x is estimated as a weighted average of the values of the function at nearby points x i : f ( x ) = (cid:80) i ∈ I w i f ( x i ) (cid:80) i ∈ I w i , where w i = exp (cid:8) − (cid:107) x i − x (cid:107) /h (cid:9) and h is the fill distance of the points { x i } i ∈ I .The proposed R-MLOP method introduces a new term E in equation (1), which plays therole of a force of attraction, pulling points of the boundary of the hole towards their convexhull. We define this repairing E term as E = (cid:88) i (cid:48) ∈ I ¯ τ i (cid:48) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) q i (cid:48) (cid:88) i ∈ I (cid:98) w i,i (cid:48) − (cid:88) i ∈ I, i (cid:54) = i (cid:48) (cid:98) w i,i (cid:48) q i (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) , (10)where (cid:98) w i,i (cid:48) = exp (cid:8) − (cid:107) q i − q (cid:48) i (cid:107) /h (cid:9) , ¯ τ i are balancing terms (see below), and the h is theexpected representative distance of the Q -points, as introduced in Definition 3.Note that in matrix notation the repairing term E can be rewritten as the norm of the graphLaplacian L : E = (cid:88) i (cid:48) ∈ I ¯ τ i (cid:48) (cid:107) L ( Q − Q i (cid:48) ) (cid:107) , (11)where L = D − W , D i (cid:48) i (cid:48) = (cid:80) i (cid:98) w i,i (cid:48) , W is a matrix with entries (cid:98) w i,i (cid:48) , (cid:98) w i,i (cid:48) = exp (cid:8) −(cid:107) q i − q (cid:48) i (cid:107) /h (cid:9) , Q is the matrix with the rows { q i } i ∈ I , while Q i (cid:48) is the matrix with the rows { q i (cid:48) } .7e now define the manifold reconstruction and repairing algorithm as the minimization of thenon-convex function G ( Q ) = c E ( P, Q, T ) + c E ( Q ) + c E ( Q, T ) , (12)where E ( P, Q, T ) = (cid:88) q i ∈ Q (1 − ¯ τ i ) (cid:88) p j ∈ P (cid:107) q i − p j (cid:107) H (cid:15) w i,j (13) E ( Q ) = (cid:88) q i ∈ Q (cid:88) q i (cid:48) ∈ Q \{ q i } η ( (cid:107) q i − q (cid:48) i (cid:107) ) (cid:98) w i,i (cid:48) , (14)and E is given in (10).The constants c , c and c replace the λ i terms in (1) and balance the three terms in (12).They are calculated as c k = median i ∈ I ( (cid:107) ∂E k ∂q i (cid:107) ), for k = 1 , , T is a vector of weights ¯ τ i ∈ [0 , q i , that balance between the forces of attraction to P and those of attraction towardsthe convex hull of the neighboring Q -points. It is calculated only in the first iteration, and isbased on the distance of the points q i to the location of the hole. Specifically, near a hole thevalue of ¯ τ i is chosen to be close to 1 and thus E gains more weight, while in regions where norepairing is required ¯ τ i is small and E becomes dominant.Given that the hole is located in a ball B ( c, r ) (we will discuss how to approximate theparameters r and c later) the weight ¯ τ i ∈ [0 ,
1] associated to a point q i is calculated by the rule¯ τ i = τ i − min( τ i )max( τ i ) − min( τ i ) , i ∈ I, (15)where τ i = exp (cid:8) −(cid:107) q i − c (cid:107) /r (cid:9) . (16)The solution to the minimization problem (12) is found via the gradient descent algorithm: q ( k +1) i (cid:48) = q ( k ) i (cid:48) − γ k ∇ G ( q ( k ) i (cid:48) ) , (17)where the points q i (0) ’s are randomly sampled from P , and the coefficients γ k are given byformula (7).The gradient of the R-MLOP cost function G has the expression ∇ G ( q ( k ) i (cid:48) ) = (1 − ¯ τ i (cid:48) ) c J (cid:88) j =1 (cid:16) q ( k ) i (cid:48) − p j (cid:17) α i (cid:48) j − c I (cid:88) i =1 i (cid:54) = i (cid:48) (cid:16) q ( k ) i (cid:48) − q ( k ) i (cid:17) β i (cid:48) i + ¯ τ i (cid:48) c ∂E ∂q i (cid:48) , (18)where α i (cid:48) j and β i (cid:48) i are given the formulas (4) and (5), respectively, and the gradient of E isgiven by ∂E ∂q i (cid:48) = 2 (cid:88) i ∈ Ii (cid:54) = i (cid:48) ( q i (cid:48) − q i ) (cid:98) w i,i (cid:48) (cid:88) i ∈ Ii (cid:54) = i (cid:48) (cid:18) − h (cid:107) q i (cid:48) − q i (cid:107) (cid:19) (cid:98) w i,i (cid:48) . (19)8 .1 Multiple Hole Repair In the case of multiple hole repair, it is necessary to modify the T weights in equation (12) soas to include all the information regarding the holes. Let B ( c k , r k ) be a set of balls, each withincomplete data which need to be amended. Then, the normalized weights ¯ τ ki ∈ [0 ,
1] for a hole k are calculated as ¯ τ ki = τ ki − min( τ ki )max( τ ki ) − min( τ ki ) , i ∈ I, (20)where τ ki = exp (cid:8) −(cid:107) q i − c k (cid:107) /r k (cid:9) . (21)We then redefine the normalized weights ¯ τ i ∈ [0 ,
1] used in (12) for the present case of multiplehole repairing as ¯ τ i = τ i − min( τ i )max( τ i ) − min( τ i ) , (22)where τ i = K (cid:89) k =1 τ ki . (23)This definition of the weights that form T incorporates the information from all the holes in asingle coefficient attached to the points q i . In this section, we validitate the proposed method. Specifically, given noisy data with a hole,we investigate whether the proposed method amends the hole, and prove that it creates a noise-free reconstruction of the manifold which is of order O ( C h + C r ). In addition, we showthat the order of complexity does not change with the new extension, compared to the vanillaMLOP. Last, we propose a method for hole identification that uses the intrinsic properties ofthe MLOP mechanism. Definition 5.
Let (cid:15) > be a small number, and let M be a d -dimensional manifold with ahole H at a location l ∈ R n and with the size bounded by r ( where r is the smallest radius ofa ball that contains the hole ) . Suppose that the scattered data points P = { p j } Jj =1 ⊂ R n weresampled near the manifold M . Let Q = { q i } Ii =1 be the sought-for noise-free approximation of M . Let ¯ τ i be defined as in (15) . Then, the (cid:15) -neighborhood of the boundary of the hole is theset of points q i such that ¯ τ i > − (cid:15) ( note that ¯ τ i ∈ [0 , . Theorem 4.1 (Order of Approximation) . Let M be a d -dimensional manifold in R n , where d is an unknown intrinsic dimension. Let H be a convex hole in M bounded by a ball B ( c, r ) .Suppose that the scattered data points { p j } Jj =1 were sampled from the manifold M with noise, h is as in Definition , and the h - ρ conditions are satisfied with respect to M . Also, let Q (0) = { q (0) i } Ii =1 be a set of initial points set sampled from P . Then the order of approximationof R-MLOP to M is less than C h + C r , where the constant C depends on the curvatureof the manifold outside the hole, and C depends on the curvature inside the hole.Proof. In regions far from the boundary of the hole, the ¯ τ i (cid:48) ’s, defined in (15), are small andtherefore the function (12) to be optimized consists of only the first two terms, E and E .9ccording to Theorem 2.5, the order of approximation in these regions is O ( h ), where h =max { h , h } and h and h are defined in Definition 3 with respect to the points-sets P and Q .In regions included in the (cid:15) -neighborhood of the boundary of the hole, the representativedistance between the points is bounded by 2 r (we assume that h ≤ r , otherwise there is nohole). In these regions, there are points q i ∈ Q which lie at a distance O ( r ) from points in theset P . Therefore, the overall order of approximation at a new point is a combination of thetwo, namely ≤ C h + C r , with C , and C constants. Theorem 4.2.
Let M be a d -dimensional manifold in R n , where d is an unknown intrinsicdimension. Let H be a convex hole in M bounded by B ( c, r ) . Suppose that the scattered datapoints P = { p j } Jj =1 were sampled near the manifold M , h is selected as in Definition , andthe h - ρ condition is satisfied with respect to M . Also let Q (0) = { q (0) i } Ii =1 be initial set of pointssampled from P . Then the gradient descent iterations for minimizing (12) produce a quasi-uniformly distributed point-set Q that approximates and reconstructs the manifold, as well asrecover missing information inside the hole.Proof. The definition of the R-MLOP method (12) together with (13) and (10) implies thatin regions far from the hole, where (cid:98) τ i are small, the term E does not play a significant role.Therefore, in such regions the R-MLOP algorithm behaves like MLOP and, by Theorem 2.4,it converges and reconstructs there the manifold.In an (cid:15) -neighborhood of the boundary of H , where the ¯ τ i ’s are close to 1, the target function(12) consists of only the last two terms. The term E is responsible for the uniform distributionof the points Q , and therefore we will consider here only the contribution of the term E tohole repairing.Let us analyze the role of the term E , and prove that it indeed amends the hole. To thisend, we show that at each iteration, the points in the (cid:15) -neighborhood of the boundary of thehole H move towards the center of H . Let (cid:15) > q ( k ) i (cid:48) be a point inthe (cid:15) -neighborhood of the boundary of H . We would like to prove that as a result of a gradientdescent step the point q ( k +1) i (cid:48) = q ( k ) i (cid:48) − γ i ∂E ∂q i (cid:48) moves towards the center of the hole H .The gradient in (18) can be rewritten as ∂E ∂q i (cid:48) = 2 b i (cid:48) (cid:88) i ∈ I, i (cid:54) = i (cid:48) ( q ( k ) i (cid:48) − q ( k ) i ) (cid:98) w i,i (cid:48) , (24)where b i (cid:48) = (cid:80) i ∈ I, i (cid:54) = i (cid:48) (cid:16) − h (cid:107) q ( k ) i (cid:48) − q ( k ) i (cid:107) (cid:17) (cid:98) w i,i (cid:48) . In what follows we determine the direction of (cid:80) i ∈ I, i (cid:54) = i (cid:48) ( q ( k ) i (cid:48) − q ( k ) i ) (cid:98) w i,i (cid:48) and the sign of b i (cid:48) . Inthe proof, we rely on a uniform distribution of the points p j . In order to guarantee this, wefirst run the vanilla MLOP, which results in a quasi-uniform sampling of the manifold.1. The weighed sum (cid:126)a = (cid:80) i ∈ I i (cid:54) = i (cid:48) ( q ( k ) i (cid:48) − q ( k ) i ) (cid:98) w i,i (cid:48) is a vector pointing towards thecenter of the hole. Indeed, the term (cid:80) i ∈ I, i (cid:54) = i (cid:48) ( q ( k ) i (cid:48) − q ( k ) i ) (cid:98) w i,i (cid:48) is a weighted sum ofvectors, each directed from q ( k ) i to q ( k ) i (cid:48) (see Figure 2 left, vectors marked in black). Werewrite this sum according to the direction of the vector (cid:126)r i (cid:48) ,i = q ( k ) i (cid:48) − q ( k ) i , i.e., towards10he center of the hole or not, using a dot product. Let (cid:126)v i (cid:48) = c − q ( k ) i (cid:48) (cid:107) c − q ( k ) i (cid:48) (cid:107) be the unit vectororiginating at a point q ( k ) i (cid:48) and pointing in the direction of the center of the hole. Then (cid:126)a can be decomposed into a sum of vectors in the same direction as (cid:126)v i (cid:48) , and a sum ofvectors pointing in a different direction: (cid:126)a = (cid:88) (cid:126)ri (cid:48) ,i (cid:107) (cid:126)ri (cid:48) ,i (cid:107) · (cid:126)v i (cid:48) ≥ (cid:126)r i (cid:48) ,i (cid:98) w i,i (cid:48) + (cid:88) (cid:126)ri (cid:48) ,i (cid:107) (cid:126)ri (cid:48) ,i (cid:107) · (cid:126)v i (cid:48) < (cid:126)r i (cid:48) ,i (cid:98) w i,i (cid:48) . The hyperplane x · (cid:126)v i (cid:48) = 0 separates between the vectors (cid:126)r i (cid:48) ,i that point in the directionof the center of the hole, from those that are pointing in different directions. Assuminga uniform distribution of the points, since there are no points within the hole, there aremore points that satisfy the inequality (cid:126)r i (cid:48) ,i (cid:107) (cid:126)r i (cid:48) ,i (cid:107) · (cid:126)v i (cid:48) ≥ (cid:126)a points towards the center of the hole.2. The term b i (cid:48) satisfies the inequality b i (cid:48) < . Let us define b ( r ) = (cid:16) − h r (cid:17) exp {− r /h } ,where r = (cid:107) q i (cid:48) − q i (cid:107) . We analyze the behavior of the function b ( r ), and plot it in Figure 2(right). In view of Remark 2.2, the term exp {− r /h } vanishes for r > √ h . Further-more, note that b ( r ) ≥ r ∈ [0 , h √ ], b ( r ) < r ∈ ( h √ , √ h ], and b ( r ) reaches itsextrema at r = 0 and r = ± √ h √ . Using these observations, we rewrite the definition of b i (cid:48) as b i (cid:48) = (cid:88) ≤ r ≤ h √ b ( r ) − (cid:88) h √ 3. Then, relying on the proportion consistencyassumption, we estimate N in terms of the ratio V V = N N , where the volume of a ballof radius h √ in R d is V = π d/ ( h √ ) d /c ( d ), and the volume of a ball of radius R is V = π d/ R d / Γ( d ) where Γ is the Euler gamma function. Thus, N = (cid:0) (1+ √ h √ (cid:1) d N .Similarly, N = √ d N , and N = (cid:0) (4+ √ h √ (cid:1) d N . Finally, after calculating N (cid:48) and N (cid:48) ,and substituting the results in (26), we find that the second term is larger than c = c N ,where c > 1. Consequently, the second term in (25) is dominant and b i (cid:48) < (cid:80) i ∈ Ii (cid:54) = i (cid:48) ( q i (cid:48) − q i ) (cid:98) w i,i (cid:48) is directed towards the hole, b is negative,and so ∂E ∂q i (cid:48) points outside the hole. In addition, our numerical investigation showed thatadopting a small step size of 0 . γ k , where γ k is given by formula 7, results in a positive γ k during the gradient descent iterations for points in the (cid:15) -neighborhood of the boundary ofthe hole. It follow that the point q ( k ) i (cid:48) in (17) moves towards the hole and recovers missinginformation.Figure 2: Illustration of the two stages in the validation of the R-MLOP method. Left: Analysisof the direction of movement of a point q ( k ) i (cid:48) (in red) after an amendment step (the magentapoint) on a actual case example. The forces by means of which the neighboring points q i act onthe current point are in shown in black. The weighted sum in the direction (cid:126)a is shown in blue,while the gradient of the term E is shown in red. Thus, the point q ( k ) i (cid:48) moves in a directionopposite to that of the gradient, towards the center of the hole. Right: A plot of the function b ( r ), along with important key points which aid in bounding b i (cid:48) < The R-MLOP algorithm is based on the MLOP algorithm, with the addition of the term E .As specified in Theorem 2.7, the complexity of the MLOP algorithm is O ( nmJ + kI ( nm (cid:98) I + (cid:98) J )),where n is the dimension of the data and m is the dimension to which one reduces the dimensionof the data for the procedure of calculating the norm ( m (cid:28) n ). Thus, we need to evaluate thecomplexity of the term E . The gradient of this term involves the differences q i (cid:48) − q i betweena given point q i (cid:48) and points q i ∈ Q . This calculation is performed for each iteration. Thus, thetotal complexity of the R-MLOP algorithm is O ( nmJ + kI ( nm (cid:98) I + (cid:98) I + (cid:98) J )). Corollary 4.3. Given a set of points P = { p j } Jj =1 sampled near a d -dimensional manifold M ∈ R n , let Q = { q i } Ii =1 be a set of points which will provide the desired the manifold reconstructionand repairing. Then the complexity of the R-MLOP algorithm is O ( nmJ + kI (( nm + 1) (cid:98) I + (cid:98) J )) ,where the number of iterations k is bounded as in Theorem , m is the smaller dimension towhich one reduces the dimension of the data, m (cid:28) n , and (cid:98) I and (cid:98) J are the numbers of pointsfrom the sets Q and P , respectively, in the support of the weight function w i,j . Therefore, he approximation is linear in the ambient dimension n , and does not depend on the intrinsicdimension d . Hole identification is an ill-posed problem, since usually the geometry of the manifold is un-known, so it is hard to know whether a hole does actually exist, or the manifold was sampledpoorly. Therefore, the best scenario for fixing the missing information is when one can manuallyidentify the holes that need to be amended. However, in real-life scenarios, this information isnot always available, and thus estimating the location and radii of holes is required.In this subsection, we propose a method that relies on point density considerations toidentify the boundary of a hole. It is reasonable to assume that near the boundary of a holethe density of the points drops. However, low density does not characterize only regions close toa hole, but also can stem from non-uniform sampling. In addition, for an open manifold, low-density values can also characterize the boundary of the manifold. As described in Section 1,the problem of hole identification was addressed with various methods, with the most prominentone being, for the point-cloud case, the method based on k -nearest neighbors considerations,which is closely related to density of points. However, the challenge of dealing with manifoldboundaries was not taken into account.In what follows we propose a procedure for identifying the boundary of a hole in a manifold,while taking into account the challenges mentioned above. Given scatter data P = { p j } Jj =1 sampled from a manifold M that satisfy the h - ρ condition with respect to M , we proposeaddressing the hole identification task in several phases, each dealing with another challengecaused by low density:1. Quasi-uniform sampling of the manifold using the vanilla MLOP method.2. Identification of the boundary of the manifold.3. Identification of the boundary of the hole.4. Estimation of the location of the hole and its volume.First, we apply the MLOP algorithm as a pre-step in order to overcome the case of non-uniformsampling of the P -point data. Next, we classify the quasi-uniform point-set Q produced by theMLOP algorithm to be either: a) the boundary of the manifold; b) the boundary of the hole;c) neither of them. Let h , be the fill distance of the Q -points (see Definition 2), and let ρ be the density of Q , which satisfies the inequality (1), i.e., { Q ∩ ¯ B ( y, kh , ) } ≤ ρ k n , k ≥ , y ∈ Q . Note that on the manifold boundary of the manifold the number of points doesnot grow at the same rate as k . Thus, in order to identify the boundary of the manifold oneneeds to analyze the change in the number of points for kh for k = 1 , 2, and select the points q i such that their change in { Q ∩ ¯ B ( y, kh , ) } is insignificant. Last, the points on the boundaryof the hole are identified as the ones with a low number of points, that do not belong to themanifold boundary. Algorithm 1 summarizes this process.13 lgorithm 1 Estimating the Location of the Hole Input: P = { p j } Jj =1 ⊂ R n Output: Hole parameters r , c Create a quasi-uniform point-set Q via vanilla MLOP. Identify points on the manifold boundary by calculating the number of points for h , 2 h ,and use the ratio a i = { Q ∩ ¯ B ( y, h , ) } { Q ∩ ¯ B ( y,h , ) } , y ∈ Q to identify the boundary points q i such that a i < median i { a i } . Identify the points on the boundary of the hole, such that the number of points at thispoints is low (and they does not belong to the manifold boundary. Clean outliers in the set of boundary points. Approximate the location of the hole and its volume. Estimate the radius r of the hole ashalf of the maximum of the distance between the boundary points, and the center c of thehole as the center of mass of the boundary points. In this subsection, we demonstrate the efficiency of our method on 3D surfaces as well as onmanifolds in higher dimension, for single and multiple hole repairing. The numerical setup wasusually built by sampling known manifold data, and artificially creating holes in it. It shouldbe noted that in all of the examples below we relied on the fact that the location of the holeand its radius were known. Data Repairing in Low-Dimensional Space We start by demonstrating our data completion method on 3D scenes of the bunny and thedragon taken from the Stanford Scanning Repository [23]. We loaded the two published models,and randomly sampled 1000 points which served as the initial P points. Next, a hole wasartificially created by removing points from a chosen location (the hole in the bunny was in itsneck, while the hole in the dragon was in its head), which resulted in about 950 points. Later,a subset of 350 points was sampled to construct the Q set. In addition, we slightly increasedthe density of the Q points near the boundary of the hole. An illustration of the initial settingsfor the repairing algorithm is provided in Figure 1 (A), Figure 3 (A).In our numerical experiment, we compared the result of applying the vanilla MLOP with theresult of R-MLOP. First, we applied the plain MLOP algorithm on our data, which resulted ina quasi-uniform sampling (Figure 1 (B)). As expected, the reconstruction maintained its prox-imity to the P points and did not recover the missing information. Next, using the informationon the hole, we calculated the proximity coefficient T in (15) (its values are presented in Figure3 (B)). Finally, we executed the repairing algorithm described above. The amended result after30 iterations can be seen in Figure 1 (C), Figure 3, (C). One can see that the existing holeswere repaired successfully with uniform sampling.14 A) (B) (C) Figure 3: Amending the scattered data for the Stanford dragon. (A) The initial scattered datasampled from the dragon model, with a hole artificially created in its head (the initial P -pointsin green, the initial Q -points in red). (B) The weights T of the hole, which are closer to 1 nearthe hole location. (C) The result produced by the R-MLOP algorithm. Multiple Holes Repair In the next example, we applied the multiple holes repair methodology to the Stanford dragon,which was sampled with P and Q as described above. Two holes locations were selected, onein the dragon head and the other in its tail, and points around them were removed (see Figure4, (A)). Subsequently, we calculated the enhanced proximity weights T ; they are presented inFigure 4, (B). The result produced by the surface amending algorithm is presented in Figure4, (C). One can see that the missing information on the two holes was successfully recoveredwith quasi-uniform sampling. (A) (B) (C) Figure 4: Amending the scattered data for the Stanford dragon. (A) The initial scattered datasampled from the dragon model, with two holes artificially created, one in its head and anotherin its tail marked with arrows (the initial P -points in green, the initial Q -points in red). (B)The weights of the hole T which are closer to 1 near the hole location. (C) The result producedby the R-MLOP algorithm. Manifold Repairing in High-Dimensional Space In this subsection, we demonstrate the data completion method on several examples of a low-dimensional manifold embedded in a high-dimensional space. Specifically, we embedded atwo-dimensional and a six-dimensional cylindrical structure in R , as well as a cone structure15ith multiple holes, also embedded in R . In all of cases we artificially created a hole inthe data around a certain point (Figure 5, Figure 6 and Figure 7, (A)). We then applied theR-MLOP method for data completion. The details of the data creation are described below.First, we embedded a two-dimensional cylindrical structure in a 60-dimensional linear space.We sampled the structure using the parametrization p = tv + R √ u ) v + sin( u ) v ) , where v = [1 , , , , , . . . , v = [0 , , − , , , . . . , , v = [1 , , , − , , . . . , 0] ( v , v , v ∈ R ), t ∈ [0 , 2] and u ∈ [0 . π, . π ]. Using this representation, 790 uniformly distributed (inparameter space) points were sampled with uniformly distributed noise (i.e., U ( − . , . Q set was constructed by randomly sampling 230 points (Figure 5 (A)). Next, weapplied the plain MLOP algorithm on our data, which resulted in a quasi-uniform sampling(Figure 5 (B))). As expected, the reconstruction maintained its proximity to the P points, anddid not recover the missing information. Finally, we executed the R-MLOP algorithm, andafter 70 iterations the manifold was amended with quasi-uniformly sampling (Figure 5 (C)). (A) (B) (C) Figure 5: Cylindrical structure embedded into a 60-dimensional space. The first three co-ordinates of the point-set are shown. (A) Scattered data with uniformly distributed noise U ( − . 1; 0 . 1) (green), and the initial point-set Q (0) (red), with an artificial hole created. (B)The point-set generated by the MLOP algorithm. (C) The result produced by the R-MLOPalgorithm. Six-dimensional cylindrical structure Next, we tested our method on higher-dimensional manifolds by utilizing an n -sphere to gen-erate an ( n + 1)-dimensional cylinder (in the example of the two-dimensional cylinder, we useda circle to generate the structure). Specifically, we utilized a five-dimensional sphere to builda six-dimensional manifold, using the parameterization x = R cos( u ) , x = R sin( u ) cos( u ) , . . . , x = R sin( u ) sin( u ) · · · sin( u ) sin( u ) . We then embedded the sampled data in a 60-dimensional space by the parametrization p = tv + R [ x , x , x , x , x , x , , . . . , , (27)where R = 1 . t ∈ [0 , u i ∈ [0 . π, . π ], and v ∈ R is a vector with 1’s in positions1 , ..., d + 1 and 0 in the remaining positions. We randomly sampled the P -points from thesix-dimensional cylindrical structure and artificially created a hole in a known location and of16nown size. We embedded the sampled data in a 60-dimensional space. This process resultedin 1180 P -points. Next, uniformly distributed noise U ( − . 2; 0 . 2) was added, and then a subsetof 460 points was sampled to construct the Q set. To avoid trying to visualize a six-dimensionalmanifold, we plot here the cross-section of the cylindrical structure in three dimensions.Our numerical experiment included several executions. First, we applied the plain MLOPalgorithm on our data, which resulted in a quasi-uniform sampling (Figure 6 (B)). As expected,the reconstruction maintained its proximity to the P points, and did not recover the missinginformation. Next, using the hole information we calculated the proximity coefficient T in (15)(the resulting values are presented in Figure 6 (C)). Finally, we executed the repairing algorithmdescribed above. The amending result after 100 iterations is shown in Figure 6 (D). As one cansee, the holes were repaired successfully with the high-dimensional cylindrical structure withuniform sampling. (A) (B)(C) (D) Figure 6: Amending the scattered data from a six-dimensional cylindrical structure embeddedin a 60-dimensional space. The cross-section of the cylindrical structure by a hyperplane inwhich the first four coordinates are greater then − . P -points in green, the initial Q -points inred). (B) The result after applying the plain MLOP, which produced a quasi-uniform samplingof the six-dimensional cylindrical structure: the hole is not amended. (C) The weights of thehole T which are closer to 1 near the hole location. (D) The result produced by the R-MLOPalgorithm. 17 ultiple Holes Repair Last, we demonstrate the ability of the R-MLOP algorithm to cope with a geometric structureof different dimensions at different locations and with multiple holes. Here we combined a3-dimensional manifold, namely, a cone structure, with a one-dimensional manifold, namely, aline segment. This object was embedded in a 60-dimensional linear space. We used the coneparameterization p = tv + e − R √ u ) v + sin( u ) v ) , where v = [1 , , , , , . . . , , v = [0 , , − , , , . . . , , v = [1 , , , − , , . . . , v , v , v ) ∈ R , t ∈ [0 , R ∈ [0 , . u ∈ [0 . π, . π ]. We sampled 850 points from the structurewith added uniformly distributed noise of magnitude 0 . 2. The initial set Q (0) of size 140 wasrandomly sampled (Figure 7 (A)). Next, we applied the plain MLOP algorithm on our data,which produced a quasi-uniform sampling (Figure 7 (B))). As expected, the reconstructionmaintained its proximity to the P -points, and did not recover the missing information. Subse-quently, we calculated the coefficient T using equation (15) (its values range from 0 to 1 andcorrespond to the yellow and blue color in Figure 7 (C)). Finally, we executed the R-MLOPalgorithm, and after 200 iterations the manifold was amended with a quasi-uniform sampling(Figure 7 (D)). (A) (B)(C) (D) Figure 7: Cone structure embedded into a 60-dimensional space. The first three coordinatesof the point-set are shown. (A) Scattered data with uniformly distributed noise U ( − . 1; 0 . Q (0) (red), with an artificial hole created. (B) The point-setgenerated by the MLOP algorithm. (C) The weights of the hole T which are closer to 1 nearthe hole location. (D) The result produced by the R-MLOP algorithm.18 Discussion and Conclusions This paper reviewed and developed some new results about manifold repairing in high-dimensionunder noisy conditions. Given a noisy point-cloud situated near a manifold of unknown intrin-sic dimension, with holes, we aim at finding a noise-free reconstruction of the manifold thatwill amend the holes and complete the missing information. While in low-dimensional casethe problem was extensively studied (in noise-free conditions), in high dimension the problemis still open. We propose a solution that extends the Manifold Locally Optimal Projection(MLOP) method [14] by introducing an additional term that fills-in the missing data. Thevanilla MLOP does not repair the manifold, since by design it maintains the proximity to theoriginal points. To overcome this, we suggest enhancing the MLOP method to address datarepairing problems by adding another force of attraction which will pull the boundary pointstowards their convex hull. The basic idea is to propagate information from the boundary ofthe hole into the hole itself, and complete missing information. In the paper, we show thatthe order of approximation is O ( C h + C r ), where h is the representative distance betweenthe points and r is the radius of the ball that bounds the hole, and prove the validity of theproposed methodology. We demonstrate the efficiency of our method on 3D surfaces as well ason manifolds in higher dimensions, for single and multiple hole repairing.A possible future direction would be to investigate the manifold repairing problem in thecase where we observe a few sample signals. For example, suppose we observe a scatter data P = { p i } Jj =1 which is incomplete in some entries. Then is it possible to accurately guess theentries that we have not seen? This question is tightly related to the matrix competitionfield [7]. Acknowledgments We would like to thank Dr. Barak Sober for valuable discussions and comments. This studywas supported by a generous donation from Mr. Jacques Chahine, made through the FrenchFriends of Tel Aviv University, and was partially supported by ISF grant 2062/18. References [1] Alexa, M., Behr, J., Cohen-Or, D., Fleishman, S., Levin, D., Silva, C.T.: Computing andrendering point set surfaces. IEEE Transactions on Visualization and Computer Graphics (1), 3–15 (2003)[2] Attene, M., Campen, M., Kobbelt, L.: Polygon mesh repairing: An application perspec-tive. ACM Computing Surveys (CSUR) (2), 15 (2013)[3] Barzilai, J., Borwein, J.M.: Two-point step size gradient methods. IMA Journal of Nu-merical Analysis (1), 141–148 (1988)[4] Bendels, G.H., Schnabel, R., Klein, R.: Detecting holes in point set surfaces. The Journalof WSCG (2006)[5] Berger, M., Tagliasacchi, A., Seversky, L.M., Alliez, P., Guennebaud, G., Levine, J.A.,Sharf, A., Silva, C.T.: A survey of surface reconstruction from point clouds. In: ComputerGraphics Forum, vol. 36, pp. 301–329 (2017)196] Bertalmio, M., Sapiro, G., Caselles, V., Ballester, C.: Image inpainting. In: Proceedingsof the 27th Annual Conference on Computer Graphics and Interactive Techniques, pp.417–424 (2000)[7] Candes, E.J., Plan, Y.: Matrix completion with noise. Proceedings of the IEEE (6),925–936 (2010)[8] Chalmoviansk`y, P., J¨uttler, B.: Filling holes in point clouds. In: Mathematics of Surfaces,pp. 196–212. Springer (2003)[9] Cohen-Or, D., Levin, D., Remez, O.: Progressive compression of arbitrary triangularmeshes. Proceedings of Visualization ‘99, IEEE (1999)[10] Criminisi, A., P´erez, P., Toyama, K.: Region filling and object removal by exemplar-basedimage inpainting. IEEE Transactions on image processing (9), 1200–1212 (2004)[11] Davis, J., Marschner, S.R., Garr, M., Levoy, M.: Filling holes in complex surfaces us-ing volumetric diffusion. In: Proceedings. First International Symposium on 3D DataProcessing Visualization and Transmission, pp. 428–441. IEEE (2002)[12] Faigenbaum-Golovin, S., Levin, D.: Anisotropic moving least squares[13] Faigenbaum-Golovin, S., Levin, D.: Approximation of functions on manifolds in highdimension from noisy scattered data. arXiv preprint arXiv:2012.13804 (2020)[14] Faigenbaum-Golovin, S., Levin, D.: Manifold reconstruction and denoising from scattereddata in high dimension via a generalization of l (1), 93–103 (2018)[16] Harrison, J.: Soap film solutions to Plateau’s problem. Journal of Geometric Analysis (1), 271–297 (2014)[17] He, G., Cheng, X.: A virtual restoration strategy of 3d scanned objects. In: Advances inComputer Science, Intelligent System and Environment, pp. 621–627. Springer (2011)[18] Huang, H., Wu, S., Gong, M., Cohen-Or, D., Ascher, U., Zhang, H.R.: Edge-aware pointset resampling. ACM Transactions on Graphics (TOG) (1), 9 (2013)[19] Jun, Y.: A piecewise hole filling algorithm in reverse engineering. Computer-Aided Design (2), 263–270 (2005)[20] Levin, D.: The approximation power of moving least-squares. Mathematics of Computa-tion (224), 1517–1531 (1998)[21] Levin, D.: Mesh-independent surface interpolation. In: Geometric Modeling for ScientificVisualization, pp. 37–49. Springer (2004)[22] Levin, D.: Between moving least-squares and moving least- (cid:96) . BIT Numerical Mathematics (2005)[24] Lipman, Y., Cohen-Or, D., Levin, D., Tal-Ezer, H.: Parameterization-free projection forgeometry reconstruction. In: ACM Transactions on Graphics (TOG), vol. 26, p. 22. ACM(2007)[25] Moenning, C., Dodgson, N.A.: Intrinsic point cloud simplification. Proc. 14th GraphiCon , 23 (2004)[26] Ostrov, D.N.: Boundary conditions and fast algorithms for surface reconstructions fromsynthetic aperture radar data. IEEE transactions on geoscience and remote sensing (1),335–346 (1999)[27] Richard, M.M.O.B.B., Chang, M.Y.S.: Fast digital image inpainting. In: Proceedings ofthe International Conference on Visualization, Imaging and Image Processing (VIIP 2001),Marbella, Spain, pp. 106–107 (2001)[28] Tang, J., Wang, Y., Zhao, Y., Hao, W., Ning, X., Lv, K.: A repair method of point cloudwith big hole. In: 2017 International Conference on Virtual Reality and Visualization(ICVRV), pp. 79–84. IEEE (2017)[29] Thi, D.T., Fomenko, A.T., Primrose, E., Silver, B.: Minimal Surfaces, Stratified Multi-varifolds, and the Plateau Problem, vol. 84. Transactions of mathematical Monographs,American Mathematical Society (1991)[30] Vardi, Y., Zhang, C.H.: The multivariate l1-median and associated data depth. Proceed-ings of the National Academy of Sciences (4), 1423–1426 (2000)[31] Wang, J., Oliveira, M.M.: Filling holes on locally smooth surfaces reconstructed frompoint clouds. Image and Vision Computing (1), 103–113 (2007)[32] Wang, T.H., Krishnamurti, R., Shimada, K.: Restructuring surface tessellation with ir-regular boundary conditions. Frontiers of Architectural Research (4), 337–347 (2014)[33] Woodruff, D.P., et al.: Sketching as a tool for numerical linear algebra. Foundations andTrends in Theoretical Computer Science (1–2), 1–157 (2014)[34] Xu, Z., Sun, J.: Image inpainting by patch propagation using patch sparsity. IEEETransactions on Image Processing (5), 1153–1165 (2010)[35] Yadav, S.K., Reitebuch, U., Skrodzki, M., Zimmermann, E., Polthier, K.: Constraint-based point set denoising using normal voting tensor and restricted quadratic error metrics.Computers & Graphics , 234–243 (2018)[36] Zhou, D., Jiang, C., Dong, J., LIU, R.: Algorithm of detecting and filling small holes intriangular mesh surface. Computer Aided Drafting, Design and Manufacturing4