Large-Scale Evaluation of Shape-Aware Neighborhood Weights and Neighborhood Sizes
Martin Skrodzki a,∗, Eric Zimmermann b

a ICERM, Brown University, Providence, RI, USA and RIKEN iTHEMS, Wako, Saitama, Japan
b Institute of Mathematics, Freie Universität Berlin, Berlin, Germany
Keywords: Point Set, Neighborhoods, Features, Sigmoid

Abstract
In this paper, we define and evaluate a weighting scheme for neighborhoods in point sets. Our weighting takes the shape of the geometry, i.e. the normal information, into account. This causes the obtained neighborhoods to be more reliable in the sense that connectivity also depends on the orientation of the point set. We utilize a sigmoid to define the weights based on the normal variation. For an evaluation of the weighting scheme, we turn to a Shannon entropy model for feature separation and rigorously prove its non-degeneracy for our family of weights. Based on this model, we evaluate our weighting terms on a large scale of both clean and real-world models. This evaluation provides results regarding the choice of optimal parameters within our weighting scheme. Furthermore, the large-scale evaluation also reveals that neighborhood sizes should not be fixed globally when processing models. This is in contrast to current general practice in the field of geometry processing.
1. Introduction
Point sets arise naturally in many kinds of 3D acquisition processes, like e.g. 3D laser-scanning. As early as 1985, they have been recognized as fundamental shape representations in computer graphics, see [15]. Ever since, they have been used in diverse applications, e.g. in face recognition [4], traffic accident analysis [6], or archaeology [14].

Despite their versatility and their advantages, like easy acquisition and low storage costs, point sets have a significant downside to them when compared with mesh representations: They are not equipped with connectivity information. This is mostly due to the acquisition process. Consider for example a manually guided scanning device. The operator will scan those areas of the real-world objects with very sharp features multiple times. Consequently, occlusion is prevented and the whole geometry is captured. Even though each scan can provide connectivity information on the respectively acquired points, the complete point set obtained via registration of the individual scans (see e.g. [2]) does not provide global connectivity information in general. Thus, a notion of neighborhoods has to be defined and computed for each point.

⋆ This material is based upon work supported by the National Science Foundation under Grant No. DMS-1439786 and the Alfred P. Sloan Foundation award G-2019-11406 while the author was in residence at the Institute for Computational and Experimental Research in Mathematics in Providence, RI, during the Illustrating Mathematics program. Furthermore, this research was supported by the DFG collaborative research cluster SFB Transregio 109, "Discretization in Geometry and Dynamics", as well as by the German National Academic Foundation.
∗ Corresponding author: [email protected] (M. Skrodzki); [email protected] (E. Zimmermann); https://ms-math-computer.science (M. Skrodzki); https://userpage.fu-berlin.de/ezimmermann/ (E. Zimmermann)
Many definitions of neighborhoods, combinatorial or geometric, with global or local parameters, have been proposed and discussed (see Section 2). Furthermore, the concept of weighting neighboring points is not new. For example, the pure selection of a neighborhood causes an equal treatment of all neighbors. Aside from this, isotropic weighting is one common way, evaluating Euclidean distances via a Gaussian weighting function. This provides closer points with higher influence (see e.g. [1]). Additionally, other point set information can be incorporated, like density or distribution (see e.g. [20] or [25]). The inclusion of normal deviation in the area of anisotropic weighting has also been considered and discussed before (see [29, 24]).

The research work presented here aims at investigating anisotropic weighting terms in a broad framework (Section 3) which includes usual weighting choices such as equal weights or sharp cut-off weights¹. Our evaluation is processed via a Shannon entropy model (Section 4), which is based on the work of [8, 28]. Furthermore, we aim at evaluating the weighting scheme on a large scale. This is to prevent over-interpretation of findings obtained from a very small set of models. Overall, the contributions of this work are:

• Definition of a shape-aware neighborhood weighting utilizing sigmoid function weights based on normal variation;
• Presentation of an evaluation model as well as proof of its non-degenerate cases and dependency on the sigmoid parameters;
• Large-scale experimental evaluation of the proposed neighborhood weighting concept;
• Discussion of the results with respect to both neighborhood weighting and neighborhood sizes.

¹ We consider the case of cut-off weights if, starting from a given deviation, all points with greater or equal deviation are attributed weight 0.
2. Related Work
Neighborhoods are very important in point set processing, as almost all algorithmic approaches rely on them. A common choice is to use heuristics to determine sufficient notions like the size of a combinatorial or metric neighborhood. In some applications, neighborhoods arise as byproducts, for instance in segmentation, where one could consider segments to impose a neighborhood relation on the points they cover. However, we aim at a more general framework for the determination and weighting of neighborhoods. In the following, we recall works discussing heuristic neighborhood definitions. Several works have advanced from simple heuristics and derive more involved notions for better fitting neighborhood definitions in different contexts. These are mainly obtained from error functionals, which we will also discuss.
Most works consider either a combinatorial k-nearest neighborhood 𝒩_k(·) or a metric ball B_r(·) inducing a neighborhood. Both of these notions have parameters to be tuned, namely the number of neighbors k or the radius r of the neighborhood. Several works have been presented introducing heuristics to find appropriate values for k or r in different scenarios. The authors of [1] for instance use a global radius and change it to affect the running time of their algorithm. In [21], the authors fix a combinatorial number k of neighbors to be sought. Then, for each point p_i from the considered point set P, these k neighbors are found, which fixes a radius r_i to the farthest of them. Finally, the neighbors within radius r_i/3 are used. Therefore, their approach resembles the geometric neighborhood in a local manner.

The method used in [22] is more involved. The authors recognize that both a too large or too small radius r lead to problems and thus aim for a local adaption like [21]. A local density estimate δ_i around each point p_i ∈ P is computed from the smallest ball centered at p_i containing 𝒩_k(p_i), where k is found experimentally to be best chosen from {6, …, 20} ⊂ ℕ. Given the radius r_i of this ball, the local density is set to be δ_i = k/r_i. In a second step, a smooth density function δ is interpolated from the local density estimates δ_i, hence this weighting involves the incorporation of density information into the weight assignment.

In the context of surface reconstruction, the authors of [9] discuss several choices for neighborhoods and corresponding weights. While two of the three presented methods simply use geometric neighborhoods, the third method takes a different approach. Namely, the authors collect all neighbors of p_i in a "large" ball ([9, page 7]) around p_i. Then, they fit a plane to this preliminary neighborhood and project all neighbors and p_i onto this plane. On the projections, a Delaunay triangulation is built and the induced neighborhood of the triangulation is used in the following computations, which localizes their approach and respects different point distributions.

A completely different route is taken by [5]. The authors first calculate features of a point set based on differently sized neighborhoods. Then, they use a training procedure to find the combination of neighborhood sizes that provides the best separation of different feature classes.

The inclusion of normal deviation and hence anisotropic weighting into neighborhood concepts is part of the work [29]. The approach of the authors is to use a weighted principal component analysis, which fits our evaluation model. However, they rely on a global neighborhood size and assign sharp cut-off weights, while we allow for changing neighborhood sizes and smooth weighting terms.

While the approaches presented above are based on heuristics, some works try to deduce an optimal k for the k-nearest neighborhoods based on error functions. For instance, the authors of [17] work in the context of the MLS framework (see [1, 12, 13, 26]) for function approximation. The authors perform an extensive error analysis to quantify the approximation error both independent of and depending on the given data. Finally, they obtain an error functional. This is then evaluated for different neighborhood sizes k. The neighborhood size k yielding the smallest error is finally chosen to be used in the actual MLS approximation.

In contrast, the authors of [19] deduce an error bound on the normal estimation obtained from different neighborhood sizes.
Utilizing this error functional, they obtain the best suited neighborhood size for normal computation. The work of [17] heavily depends on the MLS framework in which the error analysis is deduced, while the work of [19] depends on the framework of normal computation.

The authors of [28] take a more general approach in the context of segmentation of 3D point sets. They also use the concept of combinatorial neighborhoods, going back to results of [16, 8]. In order to choose an optimal value for k, the authors turn to the covariance matrix, which is symmetric and positive-semi-definite. Thus, the matrix has three non-negative eigenvalues. Following an idea of [10], in the work of [22], the authors consider a surface variation as a measure to grow a neighborhood around each point p_i. The same quantity is used by [3]. However, the authors of [3] do not grow a neighborhood, but choose a size k for it according to a consistent curvature level. The authors of [28] do not stop at this information, but proceed to consider three more quantities derived from the eigenvalues of the covariance matrix reflecting point set features, see [8, 28]. Afterwards, following the concept of entropy by Shannon [23], they evaluate combinatorial and geometric neighborhood sizes via two error measures (see Section 4 for a detailed discussion).
3. Sigmoid Weights
Consider points P = {p_i | i ∈ [n]}, n ∈ ℕ, with corresponding oriented unit-length normals n_i ∈ 𝕊² and neighborhoods 𝒩_i ⊂ [n] for every i ∈ [n]. For a given weighting function

    φ : [0, 1] → [0, 1],   (1)
we obtain the following weights

    w_ij = φ((⟨n_i, n_j⟩ + 1)/2)   for i ∈ [n], j ∈ 𝒩_i.   (2)

Note that the argument of φ includes the deviation of the normals measured by the Euclidean scalar product. The term ⟨n_i, n_j⟩ ranges from −1 to 1, because we assume normals of unit length. By shifting the scalar product and normalizing, the argument of φ is in the range [0, 1]. Note that by the symmetry of the scalar product the weights are symmetric, i.e. w_ij = w_ji.

The weighting function φ shall assign non-negative weights between 0 and 1. These weights should correspond to the similarity of the corresponding normals, i.e. a small normal variation should result in weights close to 1, while a high normal variation should yield weights close to 0.

Our choice for the weighting function is a sigmoid. A sigmoid function is visually characterized by its shape of an "S"-curve, see Figure 1. We will consider a family of sigmoid functions that provide different interpolations between 0 and 1. The family is based on the trigonometric cosine function. It is related to the sigmoid used in [18], however, we fix the image of the function to be 0 or 1 respectively outside of [0, 1].

Definition 1 (Cosine-Sigmoid). Consider a given threshold a ∈ [0, 1) ⊂ ℝ and a given incline b ∈ ℝ_{≥1} ∪ {∞} and let a′ = (1 − a)b^{−1} + a, i.e. a′ ∈ (a, 1]. Then, we define the sigmoid weighting function sig^cos_{a,b} as

    sig^cos_{a,b} : ℝ → [0, 1],
    x ↦ 0                                         for x ∈ (−∞, a),
        −(1/2)·cos(bπ(x − a)/(1 − a)) + 1/2       for x ∈ [a, a′),
        1                                         for x ∈ [a′, +∞).   (3)

Note that sig^cos_{a,b} : [0, 1] → [0, 1] is surjective for all a ∈ [0, 1), b ∈ ℝ_{≥1}, and that sig^cos_{a,∞} : [0, 1] → {0, 1} is surjective for all a ∈ [0, 1). The threshold parameter a ∈ [0, 1) translates the curve along the x-axis and controls where the cosine curve starts. Furthermore, the incline parameter b influences the slope of the cosine curve, where b = 1 results in a soft increase, while higher values of b cause increasingly steeper slopes until the curve simulates a sharp cut-off at b = ∞. An illustration of Equation (3) for different pairs of parameters (a, b) is given in Figure 1. Observe that we obtain equal weights w_ij ≡ 1 by parameters a = 0, b = ∞ and mimic a sharp cut-off at a ∈ [0, 1) by setting b = ∞. These observations relate our weights to the uniform weights used in [28] and to the sharp cut-off of [29], respectively.
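To make the construction concrete, the following is a minimal NumPy sketch of the cosine-sigmoid from Equation (3) and the weights of Equation (2); the function and variable names are ours and not part of the original work.

```python
import numpy as np

def sig_cos(x, a, b):
    """Cosine-sigmoid sig^cos_{a,b} of Equation (3) with threshold a in [0, 1)
    and incline b >= 1 (b = np.inf mimics a sharp cut-off at a)."""
    x = np.asarray(x, dtype=float)
    if np.isinf(b):
        return (x >= a).astype(float)          # 0 below a, 1 from a on
    a_prime = (1.0 - a) / b + a                # end of the cosine ramp
    ramp = -0.5 * np.cos(b * np.pi * (x - a) / (1.0 - a)) + 0.5
    return np.where(x < a, 0.0, np.where(x >= a_prime, 1.0, ramp))

def normal_weight(n_i, n_j, a, b):
    """Shape-aware weight w_ij of Equation (2) for oriented unit normals."""
    x = (np.dot(n_i, n_j) + 1.0) / 2.0         # maps <n_i, n_j> from [-1, 1] to [0, 1]
    return sig_cos(x, a, b)
```

For instance, identical normals give the argument x = 1 and thus a weight of 1 for any parameter choice, while a = 0 together with b = ∞ reproduces the equal weights of [28].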
4. Evaluation Model
Having presented the set of neighborhood weights in Equation (2) and the corresponding weighting function in Equation (3) in the previous section, we will now describe the mathematical background of our evaluation process. For this, we turn to the information measures originally introduced by Shannon [23]. Specifically, we will use a variation of the quantities derived in [8, 28], as we will present in Section 4.1. First, we will establish the necessary notation and preliminary results.

Figure 1: Plots of the sigmoid sig^cos_{a,b}(x) for three parameter choices.
Consider the covariance matrices C_i ∈ ℝ^{3×3} given by

    C_i := Σ_{j∈𝒩_i} w_ij (p_j − p̄_i)(p_j − p̄_i)^T,   (4)

with i ∈ [n], where p̄_i = (1/|𝒩_i|) Σ_{j∈𝒩_i} p_j is the barycenter of the neighborhood of p_i and v^T denotes the transpose of a vector v ∈ ℝ³. The weights w_ij are chosen according to Equation (2). The covariance matrix C_i is symmetric and positive-semi-definite. Thus, it has three non-negative eigenvalues, which in the following we will denote by

    λ_i^1 ≥ λ_i^2 ≥ λ_i^3 ≥ 0.   (5)

Depending on the neighborhood 𝒩_i and the assigned weights w_ij, we can prove the following theorem about the covariance matrix C_i.

Theorem 1 (Non-degenerate Covariance Matrix). Let P = {p_i | i ∈ [n]} be a set of points, fix a point p_i ∈ P and its neighborhood 𝒩_i ⊆ [n], and consider the sigmoid function sig^cos_{a,b} from Equation (3) as well as the covariance matrix C_i given in Equation (4). Assume there are ℓ₁, ℓ₂ ∈ 𝒩_i, ℓ₁ ≠ ℓ₂, with p_{ℓ₁} ≠ p_{ℓ₂} and n_{ℓ₁} ≠ −n_{ℓ₂}. Then there exists some a ∈ [0, 1), such that the sum of all eigenvalues of C_i is strictly positive independent of the choice of b ∈ ℝ_{≥1} ∪ {∞}.

Proof. First, we make the following two observations:

i) The weights w_ij = sig^cos_{a,b}((⟨n_i, n_j⟩ + 1)·2^{−1}) are non-negative for all j ∈ 𝒩_i. This follows directly from the definition of the function in Equation (3).

ii) The matrix C_ij := (p_j − p̄_i)(p_j − p̄_i)^T is positive semi-definite for all j ∈ 𝒩_i. This follows as, with x = p_j − p̄_i, we have v^T C_ij v = v^T x x^T v = (v^T x)² ≥ 0 for all v ∈ ℝ³.
Hence, C_ij provides non-negative eigenvalues.

Now, let {λ_i^ℓ}_{ℓ=1}^{3} denote the eigenvalues of C_i as introduced above. Assume that Σ_{ℓ=1}^{3} λ_i^ℓ = 0. Then

    Σ_{ℓ=1}^{3} λ_i^ℓ  =(⋆1)  Tr(C_i)  =(⋆2)  Σ_{j∈𝒩_i} w_ij Tr(C_ij)  =(⋆3)  Σ_{j∈𝒩_i} w_ij Σ_{ℓ=1}^{3} λ_ij^ℓ,   (6)

where Tr(A) denotes the trace of a square matrix A and λ_ij^ℓ, ℓ = 1, 2, 3, are the eigenvalues of C_ij. Equations (⋆1) and (⋆3) hold because of the relation between the trace and the eigenvalues, and (⋆2) is justified by the linearity of the trace. From the observations i) and ii) above, we know that both w_ij and λ_ij^ℓ are non-negative for all j ∈ 𝒩_i as well as for all ℓ ∈ {1, 2, 3}. Hence, the sum (6) is 0, if and only if all summands are.

We fix an arbitrary summand C_ij and assume that w_ij = 0. Setting x := (⟨n_i, n_j⟩ + 1)·2^{−1}, we deduce directly that w_ij = sig^cos_{a,b}(x) = 0. Independent of the choice of b, by the reasoning of Appendix A, we obtain that x ≤ a. For n_i ≠ −n_j, we have x > 0. Therefore, choosing some a_j ∈ [0, x) results in weights sig^cos_{a_j,b}(x) > 0 independent of b. Finally, by setting a′ = min{a_j | j ∈ 𝒩_i}, we obtain a new set of weights w′_ij := sig^cos_{a′,b}((⟨n_i, n_j⟩ + 1)·2^{−1}) > 0. Since we assumed that there is at least one pair of distinct points p_{ℓ₁} ≠ p_{ℓ₂} with normals n_{ℓ₁} ≠ −n_{ℓ₂}, we have at least one of the summands C_{iℓ₁}, C_{iℓ₂} to be non-zero and with it

    C′_i := Σ_{j∈𝒩_i} w′_ij (p_j − p̄_i)(p_j − p̄_i)^T ≠ 0.

Therefore, we constructed a parameter a′ such that the corresponding covariance matrix provides a strictly positive sum of eigenvalues independent of the choice of b ∈ ℝ_{≥1} ∪ {∞}. □

Given the assumptions of Theorem 1, we can assume that C_i ≠ 0 ∈ ℝ^{3×3}. Therefore, we can derive certain quantities from the eigenvalues of the covariance matrix. In our context, we will consider the linearity L_λ, planarity P_λ, and scattering S_λ. These are given by

    L_λi = (λ_i^1 − λ_i^2)/λ_i^1,   P_λi = (λ_i^2 − λ_i^3)/λ_i^1,   S_λi = λ_i^3/λ_i^1   (7)

and represent 1D, 2D, and 3D features in the point set, respectively. See [8] for a derivation and a detailed explanation of these quantities. As C_i ≠ 0, we have λ_i^1 ≠ 0, therefore the quantities in Equation (7) are well-defined. Furthermore, because of the ordering of the eigenvalues given in Equation (5), we have L_λi, P_λi, S_λi ∈ [0, 1].

Figure 2: Plot of the summand −x ln(x) from Equations (8) and (10) for x ∈ [0, 1], as all arguments L_λi, P_λi, S_λi, and λ_i^ℓ/λ_i^Σ, ℓ ∈ {1, 2, 3}, are taken from [0, 1].
Hence, as L_λi + P_λi + S_λi = 1, each of these three quantities can be interpreted as the probability of the considered point to be part of an intrinsic 1D, 2D, or 3D part of the geometry. The authors of [8, 28] consider the first measure

    E_i^dim = −L_λi ln(L_λi) − P_λi ln(P_λi) − S_λi ln(S_λi).   (8)

See Figure 2 for a plot of each summand of the equation. Note that while lim_{x→0} ln(x) = −∞, it is lim_{x→0} x ln(x) = 0, see Appendix B for a detailed discussion. Practically, the error measure E_i^dim assesses to what extent the neighborhood 𝒩_i indicates a corner, an edge point, or a planar point of the geometry. In particular, the extreme cases

    (λ_i^1, λ_i^2, λ_i^3) ∈ {(ρ, 0, 0), (ρ, ρ, 0), (ρ, ρ, ρ) | ρ ∈ ℝ_{>0}}   (9)

all obtain E_i^dim = 0.

The second measure is a more general solution for optimal selection of neighborhood sizes. For this, recall that the eigenvalues correspond to the size of the principal components spanning a 3D covariance ellipsoid, see [20]. We denote their sum by λ_i^Σ = Σ_{ℓ=1}^{3} λ_i^ℓ. Then, by normalizing the eigenvalues with λ_i^Σ and recalling the non-negativity of all eigenvalues, we once more obtain

    λ_i^1/λ_i^Σ, λ_i^2/λ_i^Σ, λ_i^3/λ_i^Σ ∈ [0, 1],   λ_i^1/λ_i^Σ + λ_i^2/λ_i^Σ + λ_i^3/λ_i^Σ = 1.

Therefore, these quantities can also be interpreted as probabilities for p_i being a corner or part of an edge or planar area, respectively. Furthermore, as we assume λ_i^1 > 0, these terms are well-defined. By considering the entropy of the eigenvalues, i.e. the eigenentropy [28], we obtain the second measure

    E_i^λ = −(λ_i^1/λ_i^Σ) ln(λ_i^1/λ_i^Σ) − (λ_i^2/λ_i^Σ) ln(λ_i^2/λ_i^Σ) − (λ_i^3/λ_i^Σ) ln(λ_i^3/λ_i^Σ).   (10)
Figure 3: (a) Equal weights (a = 0, b = ∞), (b) sharp cut-off (b = ∞), (c) optimal (a*, b*) from Equation (11).
The effect of the different parameters on the fandisk model, showing error measure E_i^dim from Equation (8) for each point of the model, from low (blue) to high error (orange). Note how the optimal weights from Equation (11) have drastically reduced error in comparison to both equal weights (used by [28]) and sharp cut-off weights (used by [29]).

Note that while the arguments are slightly different, the summands in this measure behave once more like the plot in Figure 2. However, in terms of the different extremal cases for eigenvalues given in Equation (9), this measure only attains 0 for λ_i^1 > 0 and λ_i^2 = λ_i^3 = 0 and not for the other two. Therefore, it shows a general preference for linear structures over planar or volumetric structures in the data.

We will use the two measures (8) and (10) in our quantitative experiments in Section 5. However, the above discussion depends on the assumptions provided in Theorem 1. In the following, we will discuss cases in which these assumptions are not satisfied.

In practical applications, the assumptions of Theorem 1 are not always satisfied. Note here that the error values E_i^dim and E_i^λ are evaluated on a single point p_i of the point set P. The following reasons can hinder the correct evaluation:

i) If the point set contains multiple duplicates of a point, more than the sought-for number of neighbors k, all points in the reported neighborhood collapse into a single point equal to the barycenter of the neighborhood. Thus, the summands C_ij all become 0.

ii) If a point p_i has a flipped normal in comparison to all its neighboring points p_j, the argument x in the weight equation w_ij = sig^cos_{a,b}(x) becomes 0 and therefore all weights degenerate to 0. This happens in particular for very small or thin geometries as well as for faulty normal fields.

iii) Even if the assumptions of Theorem 1 are satisfied, it only states the existence of a suitable parameter a ∈ [0, 1). Therefore, choosing parameter a too large can cause all weights in the covariance matrix (4) to degenerate to 0.

In the following evaluation, we prevent case i) by requiring the point sets to only contain distinct points. Furthermore, we orient the normal field to prevent case ii). Concerning a too large parameter a, we report a failure in the computation of the error values for the point set P if λ_i^Σ = 0 for at least one point p_i ∈ P. By including the choice a = 0 for the parameters, we ensure that each model has at least one correctly evaluated pair of error values E^dim and E^λ.
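For reference, the two error measures can be computed per point as in the following sketch; it is our own minimal NumPy illustration of Equations (4), (7), (8), and (10), with precomputed weights passed in, and not the authors' implementation.

```python
import numpy as np

def entropy_measures(neighbor_points, weights):
    """E_dim (Eq. (8)) and E_lambda (Eq. (10)) for one neighborhood N_i.

    neighbor_points: (m, 3) coordinates of the points indexed by N_i,
    weights:         (m,)  non-negative weights w_ij from Equation (2).
    Returns (E_dim, E_lambda), or (None, None) if C_i degenerates.
    """
    p_bar = neighbor_points.mean(axis=0)                    # barycenter of the neighborhood
    d = neighbor_points - p_bar
    C = np.einsum('n,nj,nk->jk', weights, d, d)             # weighted covariance, Eq. (4)
    lam = np.clip(np.linalg.eigvalsh(C)[::-1], 0.0, None)   # lambda_1 >= lambda_2 >= lambda_3 >= 0
    if lam[0] <= 0.0:
        return None, None                                   # degenerate case, cf. Section 4.2
    L = (lam[0] - lam[1]) / lam[0]                          # linearity   (Eq. (7))
    P = (lam[1] - lam[2]) / lam[0]                          # planarity
    S = lam[2] / lam[0]                                     # scattering
    xlnx = lambda t: t * np.log(t) if t > 0.0 else 0.0      # x ln x, extended by 0 at x = 0 (Appendix B)
    e_dim = -(xlnx(L) + xlnx(P) + xlnx(S))                  # Eq. (8)
    e_lam = -sum(xlnx(f) for f in lam / lam.sum())          # Eq. (10), eigenentropy
    return e_dim, e_lam
```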
5. Evaluation Results
In this section, we present our quantitative evaluation of the weights presented in Equation (2). For the evaluation, we utilize the error measures E^dim and E^λ as defined in Equations (8) and (10), respectively. Our clean models are taken from a data set described in [11]. The authors provide ten thousand clean and manifold surface meshes, which are obtained by exporting only the boundary of the tetrahedral meshes used in [11]. From these, we randomly select a subset of 1,000 meshes with uniform probability. Furthermore, we use meshed models from the real-world object scans provided by [7]. For both repositories, we use the mesh information and its manifold property to obtain oriented face normals. From these, we compute vertex normals and then use these and the vertices as point sets for our experiments. For each such point set P, we consider the parameter set 𝔓 ⊂ [0, 1) × (ℝ_{≥1} ∪ {∞}), consisting of five threshold values a ranging from 0 to 0.9 combined with the incline values b ∈ {1, 2, 4, ∞}. We use the combinatorial neighborhood notion², so that for every pair (a, b) and every point p_i ∈ P, we calculate its E_i^dim and E_i^λ value over the range of k, taken from 𝔎 := {6, …, 20}. We assume this range for k, as it reflects typical, heuristic choices for neighborhood sizes in the area of point set processing, see the works discussed in Section 2, in particular [22].

² For a point p_i ∈ P, we consider the index i as well as the indices of the k nearest neighbors to p_i within P as neighborhood 𝒩_i, i.e. |𝒩_i| = k + 1.
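The combinatorial neighborhoods can, for instance, be gathered with a k-d tree; the snippet below is one possible realization using SciPy (our own assumption, not a dependency stated in the paper) and returns, per point, the index i together with its k nearest neighbors, so that |𝒩_i| = k + 1 as described in the footnote above.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_neighborhoods(points, k):
    """Return an (n, k+1) index array; row i holds N_i = {i} plus the
    k nearest neighbors of p_i (points are assumed to be pairwise distinct)."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k + 1)   # the nearest point to p_i is p_i itself
    return idx
```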
Figure 4: Histograms of preferred sigmoid parameters (a*, b*) (Eq. (11)) with respect to minimal average error values for left: E^dim (Eq. (8)) and right: E^λ (Eq. (10)) over the range 𝔎, applied to 1,000 geometries randomly selected from the data set used in [11].
Figure 5: Histograms of preferred sigmoid parameters (a*, b*) (Eq. (11)) with respect to minimal average error values for left: E^dim (Eq. (8)) and right: E^λ (Eq. (10)) over the range 𝔎, applied to the scanned geometries taken from [7].

For each point set P, we obtain the optimal parameter pair (a*, b*) as

    (a*, b*)_dim = argmin_{(a,b)∈𝔓} (1/|P|) Σ_{i=1}^{|P|} min_{k∈𝔎} E_i^dim,
    (a*, b*)_λ   = argmin_{(a,b)∈𝔓} (1/|P|) Σ_{i=1}^{|P|} min_{k∈𝔎} E_i^λ.   (11)

Following the discussion from Section 4.2, we set E_i^dim = ∞ if there is some point p_i ∈ P for which the covariance matrix C_i degenerates for all k ∈ 𝔎 given the current parameters (a, b) ∈ 𝔓. We proceed accordingly for E_i^λ. That is, a parameter choice (a, b) ∈ 𝔓 cannot be attained as optimal parameter pair if there is at least one point that cannot be interpreted meaningfully. Furthermore, for the optimal parameters (a*, b*) and each point p_i, we store the utilized neighborhood sizes argmin_{k∈𝔎} E_i^dim and argmin_{k∈𝔎} E_i^λ, respectively. See Figure 3 for an illustration of the error measure E^dim on the fandisk geometry as well as for a comparison of different parameter choices (a, b). In the following, we report and interpret our findings.

We analyze the total amount of (a, b) choices for both model repository selections. Here, we count all point sets with their respective optimal parameter pair (a*, b*). The corresponding four global histograms for both model repositories and both error measures are given in Figures 4 and 5. In summary, both error measures act almost similarly on the two data sets, i.e. in the comparison between clean and real-world models.

On the large scale of 1,000 point sets (Figure 4), we observe that, on average, a large choice for parameter a and a small choice for parameter b are preferred. This can be interpreted to say that it is desirable to take only normals into account that exhibit a small deviation. Also, this suggests to assign weights with a slow ascent, caused via a low choice of parameter b. This experiment shows the potential of softly increasing weights assigned to smaller normal deviations only, and it contrasts the assignment of equal weights (a = 0, b = ∞, [28]) or a sharp cut-off (b = ∞, [29]), as both are rarely chosen as optimal choices regarding the two error measures. A localized, i.e. model-dependent, discussion about the possibility to increase a and b for better results is given in the upcoming section.
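The selection rule of Equation (11) amounts to a small grid search. The following sketch illustrates it under the assumption of a user-supplied callable point_errors(a, b, k) returning one error value per point, with np.nan marking degenerate covariance matrices; the helper name and interface are hypothetical and only serve to illustrate the procedure.

```python
import numpy as np

def select_optimal_parameters(point_errors, param_grid, k_range):
    """Grid search for (a*, b*) following Equation (11).

    point_errors(a, b, k): per-point errors (E_dim or E_lambda) as a 1D array,
                           with np.nan where the covariance matrix degenerates.
    param_grid:            iterable of (a, b) pairs, i.e. the parameter set above.
    k_range:               iterable of neighborhood sizes, e.g. range(6, 21).
    """
    best_pair, best_score = None, np.inf
    for a, b in param_grid:
        errors = np.stack([point_errors(a, b, k) for k in k_range])  # (|K|, n)
        per_point = np.nanmin(errors, axis=0)        # best k for every point
        if np.isnan(per_point).any():
            continue                                  # some point fails for all k: discard (a, b)
        score = per_point.mean()                      # average of per-point minima
        if score < best_score:
            best_pair, best_score = (a, b), score
    return best_pair, best_score
```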
Table 1
Distribution of (a*, b*) choices into the three cases of (a) an attained maximum (a = 0.9, b = ∞), (b) a possible increase of the parameter without failure (a+, b+), and (c) impossibility of increasing the parameter because it would cause a failure (¬a+, ¬b+).

                     a = 0.9    a+    ¬a+    b = ∞     b+    ¬b+    total
Clean [11]   E^dim      248     20    732       32    968      0    1,000
Clean [11]   E^λ        266      0    734        2    998      0    1,000
Scanned [7]  E^dim
Scanned [7]  E^λ
In terms of scanned real-world models (Figure 5), we analyzed the point sets taken from [7]. In comparison to the clean models, we do observe a different behavior. Namely, small values for a are favored, while the values for b separate into the smallest and largest bin possible. The latter mostly matches the observation made above for the clean models. We interpret the parameter a to reflect the noise components caused by the acquisition process. Therefore, even a mid-range choice for a causes several points p_i ∈ P to have degenerate covariance matrices C_i. As discussed above, these parameter pairs (a, b) are then neglected from the choice as optimal parameters. See also the following section for a more detailed discussion of this.

In conclusion, we see that weight determination generally favors a soft increase of the weight function, i.e. choosing b rather small. The value a, however, depends on the geometry. Clean models mostly attain smaller error values for larger values of a, whereas real-world models require smaller values of a to obtain non-degenerate covariance matrices. Both model repositories have in common that they almost never report equal weights as preferred weight assignment. Hence, the equal weighting scheme of [28] is inferior to the family of weights presented here. Sharp cut-off weights are only chosen as optimal weighting by a subset of the real-world scans. As [29] used sharp cut-off weights in the context of denoising, our results hint that this weight set might be beneficial in the presence of noise. However, for about three fourths of the scanned models, our weighting family still chooses weights superior to the cut-off weights used by [29].

In this section, we will discuss the (a*, b*) choices presented in the previous section from a local, i.e. point-set-dependent, perspective. The respective results are presented in Table 1. There, the first two rows correspond to the clean and the last two rows to the scanned real-world models. The columns present information about the amount of point sets accepting the maximal value a = 0.9, allowing or forbidding an increase of a, accepting the maximal value b = ∞, and allowing or forbidding an increase of b.

For example, the column labeled a = 0.9 reports the number of all geometries reporting this value as best choice. The column labeled a+ gives the number of those point sets where larger values for a would have been possible but were not attained. Finally, the column labeled ¬a+ provides the number of geometries where an increase of the parameter a would result in a failure, i.e. in at least one degenerate covariance matrix C_i, see Section 4.2. Observe that we cover all possible cases. Hence, the three columns sum up to the number of considered geometries, given in the last column. The second set of three columns presents the corresponding values for parameter b.

Having all values in one chart, we directly observe the behavior assessed for parameter a in the previous section. There, we stated that especially in the case of clean models, an as-large-as-possible value for a is favorable over smaller values for a. Indeed, Table 1 confirms this statement, as in the case of E^dim, only 20 clean models and none of the scanned models allow for an increase of parameter a (cf. column a+). For the error measure E^λ, the result is even more striking.
For this measure, none of the clean models and only one scanned model exhibit the case in which parameter a could be increased without causing a degenerate covariance matrix C_i for some point of the geometry. This justifies the small values for a attained in the real-world scenarios presented in Figure 5 when compared to the values of a attained in the clean scenarios in Figure 4. Semantically, this opts for including just enough neighbors in the computation to make it feasible, but focusing on those that are as similar as possible with regard to the normal field.

The reported numbers on the parameter b also support the observation drawn before: From Theorem 1, we know that an increase of parameter b cannot cause a failure, i.e. a degenerate covariance matrix. And indeed, we observe all-zero entries in the column ¬b+, which experimentally validates Theorem 1. Furthermore, the optimal choices for b rarely assume larger values up to ∞, as seen in Figures 4 and 5. These highest values are attained for 3.2% (with E^dim) and 0.2% (with E^λ) of the clean models, but for a considerably larger share of the scanned models with both error measures. In particular in comparison with Figure 5, these results point to a qualitative difference in parameter choice. Obviously, scanned real-world models favor either a smooth and light increase or a sharp cut-off, which justifies the weighting choice of [29], although it only proves to be effective in about one fourth of all models from the data set used here.

Summarizing the global and local analysis of the parameter choices (a*, b*), we draw the following conclusions:
• The utilized error measures favor weight determination with as-large-as-possible values for parameter a and mostly small values for parameter b. That is, only points with as-similar-as-possible normals are considered, but out of these, all are allowed to influence the computation.
• Equal weights (a = 0, b = ∞), as used by [28], are never chosen as optimal parameters (except for one real-world model under measure E^λ).
• Sharp cut-off weights, as widely used in the literature, e.g. in [29], rarely attained minimal error measures for clean models, but proved to be effective in about a fourth of the real-world cases.

Figure 6: Histogram of preferred neighborhood sizes k with respect to minimal error values E^dim and E^λ for the corresponding optimal sigmoid parameters (a*, b*), applied to 1,000 geometries taken from [11] and the scanned models taken from [7]. To ensure comparability over the two different data repositories, we normalize by the total number of points and report the percentage of points choosing the respective neighborhood size.

As stated in the beginning of Section 5, for each point in the utilized point sets, we store the neighborhood size k ∈ 𝔎 which leads to the optimal choice of parameters (a*, b*) according to Equation (11). In Figure 6, we present a histogram plotting this data, i.e. for each neighborhood size k ∈ 𝔎, we show what percentage of points use this k when contributing to the optimal parameters (a*, b*).

Note that the plots for both E^dim and E^λ on the clean models [11] as well as the plot for E^λ on the scanned models [7] as given in Figure 6 are qualitatively similar. All favor an as-small-as-possible neighborhood size k over larger neighborhoods. However, when using the error measure E^dim as defined in Equation (8) and evaluating it on the scanned real-world models taken from [7], the histogram indicates a different behavior, see Figure 6. Still, the smallest neighborhood size k = 6 collects the highest number of points. But the other neighborhood sizes exhibit a less uniform distribution (most notably at k = 11), while in the other cases the histogram rather resembles a hyperbola. A similar behavior is observed for the nine scanned models from [27], cf. Figure 8. This observation justifies the usability of E^λ over E^dim (with its clear geometric meaning of minima) for scanned models, to mimic the similar behavior of neighborhood size selection as reported for the clean models.

For the clean models, taken from [11], as well as for the real-world models from [7], we computed the average neighborhood sizes and the corresponding standard deviations for both measures E^dim and E^λ; taken over all points, the standard deviations obtained for the two measures are comparable. These findings suggest that variable neighborhood sizes yield smaller error values in the two functionals. In order to further investigate this hypothesis, in the following section, we once more turn to a local, i.e. point-set-dependent, perspective.

Local k Analysis
We will now consider the standard deviation of the neighborhood size taken over a single model for E^dim and E^λ. To better understand and investigate the hypothesis formulated above, i.e. the statement that a variable neighborhood size contributes to lower error measures, we also include nine models from [27] in this analysis.³

In order to interpret the neighborhood sizes, we consider a box-whisker plot over all standard deviations within the respective models in Figure 7. That is to say, the box indicates the median of the standard deviations of neighborhood sizes for the indicated model repository and error measure. While, taken over all points of all models, the standard deviation of the neighborhood size according to E^dim and E^λ is comparable, as reported above, when considering the standard deviation of the individual models, we find a slightly more diverse behavior. In particular, most approaches in the literature use and are evaluated on a setting with a fixed neighborhood size k. In our analysis, this would correspond to a standard deviation around 0, indicating no or small changes to the neighborhood size within a geometry.

³ These are: Armadillo, Asian Dragon, Buddha, Bunny, David Head, Dragon, Drill, Lucy, Tyrannosaur, see [27].
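The quantities plotted in Figures 6 and 7 can be obtained directly from the stored per-point errors; a small sketch (again with a hypothetical per_point_errors mapping, not taken from the paper) could look as follows.

```python
import numpy as np

def neighborhood_size_statistics(per_point_errors, k_range):
    """per_point_errors[k]: array of per-point error values (E_dim or E_lambda)
    for neighborhood size k at the optimal parameters (a*, b*).
    Returns the per-point optimal sizes and their standard deviation."""
    ks = np.array(list(k_range))
    errors = np.stack([per_point_errors[k] for k in ks])   # (|K|, n_points)
    best_k = ks[np.argmin(errors, axis=0)]                  # argmin_k E_i per point
    return best_k, float(best_k.std())                      # histogram data and spread
```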
Figure 7: Box-whisker plot for the standard deviations obtained by the different models. Each model contributes its own standard deviation as a data point for the diagram. Therefore, the two leftmost columns represent 1,000 data points each (from [11]), the two center columns represent the scanned models (from [7]), and the two rightmost columns represent nine data points each (from [27]).

However, it is obvious from Figure 7 that all standard deviations are located well away from 0. Thus, varying neighborhood sizes clearly correspond to smaller error measures.

This observation is further supported when considering an analog to Figure 6, but for the nine models chosen from [27] and separated for the respective models. There is no clear preference for any neighborhood size k ∈ 𝔎 in either of the two error measures, see Figure 8. A very interesting case occurs for the "Bunny" model. When considering the error value E^dim, the optimal neighborhood size qualitatively follows a Gaussian distribution around a mean of k = 16. However, when considering E^λ, it once more mimics a hyperbolic behavior. Another noteworthy model is the "Drill" model. For error measure E^dim, it is roughly uniformly distributed except for three pronounced peaks. In the case of E^λ, there is a qualitative Gaussian bump centered at k = 15, with notable exceptions. This further supports the statement that varying, model-dependent neighborhood sizes are crucial in order to minimize the error measures.

In summary, from the global and local analysis of the obtained neighborhood sizes k, we draw the following conclusions:

• All standard deviations lie well above 0, i.e. both considered error measures favor variable neighborhood sizes over constant-size neighborhoods.
• This behavior is more pronounced for scanned models ([27] and [7]) than for clean models ([11]).
• Both error measures favor smaller neighborhood sizes for clean models; however, for scanned models, this behavior is only preserved by E^λ with its normalization.
6. Conclusion
In this article, we investigated a family of weights (Eq. (2)) for point set processing. These weights are based on normal similarity. The family includes common choices such as equal weights or sharp cut-off weights at a given threshold. Furthermore, we presented an evaluation model for neighborhood weights based on two Shannon entropy error measures (Eqs. (8) and (10)). We have performed a large-scale evaluation of our weight family on three data sets. The first set consisted of 1,000 clean surface meshes from the work of [11]. The second set consisted of real-world scans taken from [7], while the third set contributed nine real-world scans taken from [27].

A statistical analysis revealed that the optimal weight parameters should lead to a neglect of non-similar normals, yet include mid-range normal points with a low weight. Specifically, equal weights, as used in the literature discussed in Section 2 and in particular in [28], do not obtain minimal error values. Furthermore, sharp cut-off weights as used e.g. by [29] do perform well on certain scanned models, but are also generally inferior to more flexible weighting terms. Finally, it became obvious in the evaluation that neighborhood sizes have to be variable over a point set, as only these variable sizes attain minimal error values.

While this article addresses a variety of possible weighting choices and neighborhood sizes, to cover the most widely used versions from the literature, several aspects are left as future work. Further research consists of running the large-scale analysis on a broader range of neighborhood sizes, comparable to [28]. From a theoretical point of view, it remains to be better understood how the two error measures E^dim and E^λ differ. Finally, all tests were run on point clouds obtained from meshed geometries. Thus, more tests need to be run on a large set of real-world point sets to further validate the findings presented in this article.
Figure 8:
Histogram of preferred neighborhood sizes k with respect to minimal error values (a) E^dim and (b) E^λ for the corresponding optimal sigmoid parameters (a*, b*), applied to the nine geometries taken from [27] and separated into the individual models.

A. Zeroes of the Sigmoid Function
Here, we will determine the zeroes of the sigmoid function sig^cos_{a,b} as defined in Equation (3). Note that sig^cos_{a,b}(x) = 0 for all x ∈ (−∞, a) and sig^cos_{a,b}(x) ≠ 0 for all x ∈ [a′, +∞). Thus, it remains to be determined for which x ∈ [a, a′) we have

    sig^cos_{a,b}(x) = 0  ⇔  −(1/2)·cos(bπ(x − a)/(1 − a)) + 1/2 = 0  ⇔  cos(bπ(x − a)/(1 − a)) = 1.

The latter is true if and only if bπ(x − a)(1 − a)^{−1} = 2ℓπ with ℓ ∈ ℤ, i.e. the argument in cos(·) is an even multiple of π, which yields

    x = 2ℓ(1 − a)/b + a   for some ℓ ∈ ℤ.

We proceed with a case distinction for ℓ ∈ ℤ, but first we recall that a′ = (1 − a)/b + a.

a) If ℓ = 0, it follows directly that x = a.
b) If ℓ > 0, we have x = 2ℓ(1 − a)/b + a ≥ 2(1 − a)/b + a = 2a′ − a ≥⋆ a′, but this indicates sig^cos_{a,b}(x) = 1.
c) If ℓ < 0, we have x = 2ℓ(1 − a)/b + a ≤ −2(1 − a)/b + a = −2a′ + 3a ≤⋆ a,

where ⋆ holds as a ≤ a′. Consequently, the only additional case for sig^cos_{a,b}(x) = 0 aside from x ∈ (−∞, a) is x = a.
B. Limit Results
In Equations (8) and (10), we deal with energy terms of the form x ln(x). We want to reason about their limit and show that

    f(x) = x ln(x) → 0   as x ↘ 0.

To do so, we rewrite it as

    f(x) = g(x)/h(x) = ln(x)/x^{−1}.

The limits lim_{x↘0} g(x) = −∞ and lim_{x↘0} h(x) = +∞ as well as

    lim_{x↘0} g′(x)/h′(x) = lim_{x↘0} (1/x)/(−x^{−2}) = lim_{x↘0} (−x) = 0

allow us to apply L'Hospital's rule, such that we obtain

    lim_{x↘0} f(x) = lim_{x↘0} g′(x)/h′(x) = 0.

CRediT authorship contribution statement
Martin Skrodzki:
Conceptualization of this study, Methodology, Software, Evaluation.
Eric Zimmermann:
Conceptualization of this study, Methodology, Software, Evaluation.
References

[1] Alexa, M., Behr, J., Cohen-Or, D., Fleishman, S., Levin, D., Silva, C.T., 2001. Point Set Surfaces, in: VIS'01: Proceedings of the conference on Visualization '01, IEEE Computer Society, pp. 21–28.
[2] Bellekens, B., Spruyt, V., Berkvens, R., Weyn, M., 2014. A survey of rigid 3D point cloud registration algorithms, in: AMBIENT 2014: the Fourth International Conference on Ambient Computing, Applications, Services and Technologies, August 24-28, 2014, Rome, Italy, pp. 8–13.
[3] Belton, D., Lichti, D.D., 2006. Classification and segmentation of terrestrial laser scanner point clouds using local variance information. The International Archives of the Photogrammetry, Remote Sensing, and Spatial Information Sciences 36, 44–49.
[4] Boehnen, C., Flynn, P., 2005. Accuracy of 3D scanning technologies in a face scanning scenario, in: IEEE Fifth International Conference on 3D Digital Imaging and Modeling, pp. 310–317.
[5] Brodu, N., Lague, D., 2012. 3D terrestrial lidar data classification of complex natural scenes using a multi-scale dimensionality criterion: Applications in geomorphology. ISPRS Journal of Photogrammetry and Remote Sensing 68, 121–134.
[6] Buck, U., Naether, S., Braun, M., Bolliger, S., Friederich, H., Jackowski, C., Aghayev, E., Christe, A., Vock, P., Dirnhofer, R., et al., 2007. Application of 3D documentation and geometric reconstruction methods in traffic accident analysis with high resolution surface scanning, radiological MSCT/MRI scanning and real data based animation. Forensic Science International 170, 20–28.
[7] Choi, S., Zhou, Q.Y., Miller, S., Koltun, V., 2016. A large dataset of object scans. arXiv:1602.02481.
[8] Demantké, J., Mallet, C., David, N., Vallet, B., 2011. Dimensionality based scale selection in 3D lidar point clouds. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XXXVIII-5/W12, 97–102.
[9] Floater, M.S., Reimers, M., 2001. Meshless parametrization and surface reconstruction. Computer Aided Geometric Design 18, 77–92.
[10] Hoppe, H., DeRose, T., Duchamp, T., McDonald, J., Stuetzle, W., 1992. Surface Reconstruction from Unorganized Points, in: Proceedings of the 19th annual conference on Computer graphics and interactive techniques, ACM, pp. 71–78.
[11] Hu, Y., Zhou, Q., Gao, X., Jacobson, A., Zorin, D., Panozzo, D., 2018. Tetrahedral meshing in the wild. ACM Trans. Graph. 37, 60–1.
[12] Levin, D., 1998. The approximation power of moving least-squares. Mathematics of Computation 67, 1517–1531.
[13] Levin, D., 2004. Mesh-independent Surface Interpolation, in: Brunnett, G., Hamann, B., Müller, H., Linsen, L. (Eds.), Geometric modeling for scientific visualization. Springer, pp. 37–49.
[14] Levoy, M., Pulli, K., Curless, B., Rusinkiewicz, S., Koller, D., Pereira, L., Ginzton, M., Anderson, S., Davis, J., Ginsberg, J., et al., 2000. The Digital Michelangelo Project: 3D Scanning of Large Statues, in: Proceedings of the 27th annual conference on Computer graphics and interactive techniques, pp. 131–144.
[15] Levoy, M., Whitted, T., 1985. The Use of Points as a Display Primitive. Technical Report. University of North Carolina.
[16] Linsen, L., Prautzsch, H., 2001. Local Versus Global Triangulations, in: Proceedings of EUROGRAPHICS, pp. 257–263.
[17] Lipman, Y., Cohen-Or, D., Levin, D., 2006. Error Bounds and Optimal Neighborhoods for MLS Approximation, in: Proceedings of the fourth Eurographics symposium on Geometry processing, Eurographics Association, pp. 71–80.
[18] Marler, M.R., Gehrman, P., Martin, J.L., Ancoli-Israel, S., 2006. The sigmoidally transformed cosine curve: a mathematical model for circadian rhythms with symmetric non-sinusoidal shapes. Statistics in Medicine 25, 3893–3904.
[19] Mitra, N.J., Nguyen, A., Guibas, L., 2004. Estimating Surface Normals in Noisy Point Cloud Data. International Journal of Computational Geometry & Applications 14, 261–276.
[20] Park, M.K., Lee, S.J., Lee, K.H., 2012. Multi-scale tensor voting for feature extraction from unstructured point clouds. Graphical Models 74, 197–208.
[21] Pauly, M., Gross, M., Kobbelt, L., 2002. Efficient Simplification of Point-Sampled Surfaces, in: Proceedings of the conference on Visualization '02, IEEE Computer Society, pp. 163–170.
[22] Pauly, M., Keiser, R., Kobbelt, L., Gross, M., 2003. Shape Modeling with Point-Sampled Geometry. ACM Transactions on Graphics 22(3), 641–650.
[23] Shannon, C.E., 1948. A Mathematical Theory of Communication. The Bell System Technical Journal 27, 379–423.
[24] Skrodzki, M., 2019. Neighborhood Data Structures, Manifold Properties, and Processing of Point Set Surfaces. Ph.D. thesis. Freie Universität Berlin, Berlin, Germany.
[25] Skrodzki, M., Jansen, J., Polthier, K., 2018. Directional density measure to intrinsically estimate and counteract non-uniformity in point clouds. Computer Aided Geometric Design 64, 73–89. doi:10.1016/j.cagd.2018.03.011.
[26] Sober, B., Levin, D., 2016. Manifolds' projective approximation using the moving least-squares (MMLS). CoRR abs/1606.07104. arXiv:1606.07104.
[27] Stanford Scanning Repository. The Stanford 3D Scanning Repository. https://graphics.stanford.edu/data/3Dscanrep/. Accessed: 2020-02-10.
[28] Weinmann, M., Jutzi, B., Mallet, C., 2014. Semantic 3D scene interpretation: a framework combining optimal neighborhood size selection with relevant features. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences II-3, 181–188.
[29] Yadav, S.K., Reitebuch, U., Skrodzki, M., Zimmermann, E., Polthier, K., 2018. Constraint-based point set denoising using normal voting tensor and restricted quadratic error metrics. Computers & Graphics 74, 234–243. doi:10.1016/j.cag.2018.05.014.
Martin Skrodzki has studied Mathematics and Computer Science at TU Dortmund University (Dortmund, Germany), Texas A&M International University (Laredo, TX, USA), and Freie Universität Berlin (Berlin, Germany). He obtained his Dr. rer. nat. in 2019 at Freie Universität Berlin under the supervision of Prof. Polthier (Freie Universität Berlin, Germany) and Prof. Levin (Tel Aviv University, Israel). He has been a postdoctoral scholar at the Institute for Computational and Experimental Research in Mathematics (ICERM, Brown University, Providence, RI, USA) and in the RIKEN Interdisciplinary Theoretical and Mathematical Sciences Program (RIKEN, Wako, Saitama, Japan).

Eric Zimmermann has studied Mathematics at Freie Universität Berlin (Berlin, Germany). He currently is a doctoral candidate in the group "Mathematical Geometry Processing" under the guidance of Prof. Polthier (Freie Universität Berlin, Germany), and he is part of the project C05 of the collaborative research cluster SFB Transregio 109, "Discretization in Geometry and Dynamics", dealing with computational and structural aspects of point sets.