[PDF] Stochastic Properties of Minimal Arc Distance and Cosine Similarity between a Random Point and Prespecified Sites on Sphere

Abstract

In applications such as wireless communication, it is important to study the statistical properties of L_{2}, the minimal arc distance between a random point (e.g., a cellphone user) uniformly distributed on a sphere to a set of pre-defined seeds (e.g., wireless towers) on that sphere. In this study, we first derive the distribution (CDF) and density (PDF) functions of the arc distance between a selected vertex of a spherical triangle to a random point uniformly distributed within this triangle. Next, using computational techniques based on spherical Voronoi diagram and triangular partition of Voronoi cells, we derive moments of L_{2} and \cos L_{2}. These results are verified by extensive Monte Carlo simulations.

Full PDF

SStochastic Properties of Minimal Arc Distance andCosine Similarity between a Random Point andPrespeciﬁed Sites on Sphere

Hongjun Li a , Jiatong Sui b , Shengpeng Mu a , Xing Qiu b, ∗ a College of Science, Beijing Forestry University, Beijing, China b Department of Biostatistics and Computational Biology, University of Rochester, NewYork, U.S.A.

Abstract

In applications such as wireless communication, it is important to study thestatistical properties of L , the minimal arc distance between a random point(e.g., a cellphone user) uniformly distributed on a sphere to a set of pre-deﬁnedseeds (e.g., wireless towers) on that sphere. In this study, we ﬁrst derive thedistribution (CDF) and density (PDF) functions of the arc distance between aselected vertex of a spherical triangle to a random point uniformly distributedwithin this triangle. Next, using computational techniques based on sphericalVoronoi diagram and triangular partition of Voronoi cells, we derive moments of L and cos L . These results are veriﬁed by extensive Monte Carlo simulations. Keywords:

Arc distance, cosine similarity, distribution function, moments,spherical geometry, Voronoi diagram

1. Introduction

A Voronoi diagram, also known as Thiessen polygon or Dirichlet tessellation,is a powerful tool in conducting research related to the partition of a space [1].Given a set of pre-speciﬁed points (called seeds or sites), a Voronoi diagram ∗ Correspondence should be send to: Xing Qiu.

Email addresses: [email protected] (Hongjun Li), [email protected] (Jiatong Sui),

[email protected] (ShengpengMu), [email protected] (Xing Qiu)

Preprint submitted to Computational Statistics & Data Analysis February 3, 2021 a r X i v : . [ s t a t . C O ] F e b ivides the space into regions (called Voronoi cells) according to the nearest-neighbor rule, such that each region contains exactly one seed and the interiorpoints of the regions are closer to this seed than any others [2, 3].Voronoi diagrams can be constructed eﬃciently [4], and it has wide ap-plications in various disciplines, including forestry, chemistry, transportation,and statistics [5, 6, 7, 8]. In particular, it has been used by mathematiciansand statisticians to model stochastic foam geometries [9], estimating Shannonentropy of multidimensional probability [10], approximating resampling-baseddistributions [11], etc.In this study, we focus on the following application of Voronoi diagram inwireless communications. As we know, wireless signal strength depends criti-cally on the distance between a user’s device (e.g., cell phone) and the nearest base station. In order to study the distribution and moments of this minimaldistance, we need to decompose the whole region covered by wireless signalinto Voronoi cells [12, 13] of which the seeds are base stations. Because aVoronoi cell is a convex polygon composed of several triangles intersecting atthe seed, it suﬃces to study the stochastic properties of the distance betweena selected vertex (the seed) of a triangle and a random interior point. Severalresearch teams, including us, have study this problem on R thoroughly underthe assumption that the random point follows a uniform distribution on thetriangle [14, 15, 16, 13]. In particular, we derived the closed-form cumulativedistribution function (CDF), probability density function (PDF), and the ﬁrsttwo moments of this random distance [13].There are various generalizations of Voronoi diagram that serve better forcertain applications [17], including higher order diagrams, weighted Voronoidiagrams, and the case in which the seeds are replaced by line segments [18, 12,19, 20]. Another large class of generalizations are Voronoi diagrams deﬁned onnon-Euclidean geometric spaces, e.g., Laguerre planes or spheres [21, 22, 23].The latter case (spherical Voronoi diagram, illustrated in Figure 1) is especiallyimportant in practice, not least because the Earth is approximated spherical.Again take the wireless communication problem as an example, if the base2tations to be modeled occupy a relatively large area, using the spherical Voronoidiagram deﬁned on S would be more accurate than the planar Voronoi diagram.There are similar problems in large-scale forest ecology, vehicle distributionmanagement, and biomolecular structure modeling, that can beneﬁt from usingthe spherical rather than planar Voronoi diagram.The distance between random points on S have been studied by a few re-searchers, including the generation and test of uniformity on the sphere [24],minimization of the summation of Euclidean distances of n random points [25],and random spherical packing [26, 27]. However, no prior studies have system-atically studied the stochastic properties of L ( ∆ P BC ) , the random sphericaldistance ( a.k.a. arc distance) between a given vertex P to a random point Q which is uniformly distributed on the spherical triangle ∆ P BC .In this study, we ﬁrst study the stochastic properties of L ( ∆ P BC ) in de-tail, and extend these results to the case in which the spherical triangle ∆ P BC is replaced by an arbitrary convex spherical polygon. Next, we use these resultsand techniques based on spherical Voronoi diagram to derive the distributionof L ( P , . . . , P m ) , which is the minimal arc distance between a uniformly dis-tributed random point on S to { P , . . . , P m } , a collection of pre-speciﬁed siteson the same sphere.This manuscript is organized as follows. In Section 2 we provide usefulpreliminary results about spherical triangles. Detailed derivations of the CDFand PDF of L ( ∆ P BC ) , as well as their generalizations to convex sphericalpolygons (dubbed as L ( Γ , P ) , where Γ is a convex spherical polygon and P isan arbitrary internal seed), are given in Sections 3.1 and 3.2. The CDF and PDFof L ( P , . . . , P m ) is derived in Section 3.3. The close-form expression of ﬁrstfour moments of L ( ∆ P BC ) is provided in Section 4. To further demonstratethe utility of the CDF and PDF derived in Section 3, we derive the exact formulafor several moments of cos ( L ( ∆ P BC )) in Appendix B. These quantities areuseful in spherical cluster analysis and data mining [28, 29, 30, 31].3 a) (b)Figure 1: An illustration of spherical Voronoi diagram and spherical triangles. (a) Given a setof seeds, the spherical Voronoi diagram is a partition of a sphere into convex spherical polygons,such that the points within each polygon has the shortest spherical (angular) distance to theinterior seed than all exterior seeds. A combination of triangulation and mid-point bisectorinspector is employed to form these honeycomb-like, asymmetric, mesh-like shapes. (b) Dueto the non-Euclidean nature of S , a “straight line” (geodesic) on S is a great circle, and thespherical triangle is a triangular shaped area enclosed by three great circles. The sphericaldistance between two points such as A and A is measured by the arc-length of the greatcircle passing through these two points. Spherical geometry and trigonometry is fundamentallydiﬀerent from planar geometry/trigonometry. For instance, the sum of the interior angles ofa spherical triangle exceeds 180 degrees [32].

2. Useful geometric properties of spherical triangles

Throughout this manuscript, we denote the sphere in R centered at theorigin (denoted by O ) with radius R by S R ∶= { x ∈ R ∶ ∣ x ∣ = R } . Let A and B be two distinct and non-opposing points on S R , e.g., A ≠ ± B , they deﬁne agreat circle, i.e., geodesic on S R . We denote the smallest arc of a great circlewith endpoints A and B by L AB ; ∠ AOB measured in radians by ϕ AB . Wealso denote the area of the spherical cap by A cap and the half-of-top angle [16]of this spherical cap by θ (see Fig. 2(a)). The area of a spherical sector, asshown in Fig. 2(b), is denoted by A fan , its top angle is denoted by α . Formaldeﬁnitions of these concepts can be found in Appendix A.4 a) (b) X Z O Y

P C B ϕ PB α P α B α C h M (c)Figure 2: The three sub-ﬁgures illustrate the cap, fan and triangle on a sphere. (a) A sphericalcap is the region of a sphere cut oﬀ by a plane. If the plane passes through the center of thesphere, the cap is a called a hemisphere. We could use a second plane to cut the cap, whichwill result in a spherical segment. (b) A spherical fan on the sphere is part of a sphericalcap generated by further partitions, using great circles passing through the apex or the poleof spherical cap. (c) A triangle from a sphere can be formed using 3 points on a sphere,connected by the great circles. In this paper, we consider the inner triangles instead of theouter triangles. The sum of the angles of a inner spherical triangle is between π and 3 π timesthe radius. Area of spherical triangle.

Let α P , α B , α C be the three interior angles of triangle∆ P BC , as shown in Fig. 2(c). It is well known that the area of ∆

P BC is A ∆ = R ( α P + α B + α C − π ) = R E (1)Here α P , α B , and α C represent angles between great circles. The sphericalangle ∠ AP B = α can be calculated with the following Equation α = arccos ((—→ OA × —→ OP ) ⋅ (—→ OB × —→ OP ) ) (2)Here —→ v represents the unit direction of vector v ∈ R , namely, —→ v ∶= —→ v /∣—→ v ∣ .The calculation for α B and α C are similar. E is called the spherical excess ,0 ≤ E ≤ π , with E = Altitude of spherical triangle.

An important concept used in our derivation isthe altitude of a spherical triangle, which is a generalization of the altitude ofa planar triangle [13]. For spherical triangle ∆

P BC , the length h BC (or h ) of5he altitude to base edge ̂ BC is the length of arc ̂ P M , as shown in Fig. 2(c),which can be calculated as h = R ∠ P OM = R arcsin (—→ n ⋅ —→ OP ) (3)where —→ n is the unit normal direction of the plane deﬁned by points O, B and C : —→ n ∶= (—→ OB × —→ OC ) ; h is the shortest distance from vertex P to the greatestcircle deﬁned by B and C . In addition, the intersection M can be calculated bythe following equation ——→ OM = R (—→ OP + t —→ n ) , (4)where t ∶= −(—→ n ⋅ —→ OP ) .

3. Distribution of the Minimal Arc Distance

In this section, we will derive the distribution of L ( P , . . . , P m ) , the min-imal arc distance between a random point Q to a pre-speciﬁed set of sites { P , . . . , P m } on a sphere. L ( ∆ P BC ) Recall that ∆

P BC is an arbitrary spherical triangle, and L ( ∆ P BC ) rep-resents the arc distance between P to a random point Q uniformly distributedin ∆ P BC . Without loss of generality, we may choose B and C such that that L P B ≤ P C . In [16], Ahmadi and Pan classiﬁed every spherical triangle ∆

P BC into two cases based on whether the altitude from P to the arc BC is inside(e.g., Fig.3(a)) or outside (e.g., Fig.3(b)) of the triangle.We ﬁnd that these two cases based on the altitude can be uniﬁed into asingle graph (Fig.3(c)) as follows: based on the altitude, a reﬂection of arc P B about the altitude is drawn. The outside and the inside arcs are denoted by ̂ P B and ̂ P B , respectively. Then for the inside-altitude case, B coincides with B ,while for the outside-altitude case, B coincides with B . The above procedureis illustrated in Fig. 3(c).The altitude length h can be calculated with Equation (3). F L ( r ) , theCDF of L can be represented as F L ( r ) = S ( r )/ Ω, where Ω is the area of the6

X Z O Y

P C B ϕ PB α P α B α C h M (a) X Z O Y

P C B ϕ PB α P α B α C h M (b) X Z O Y P C B ϕ PB α α C h M B (c)Figure 3: ∆ P BC can be classiﬁed into two classes based on the location of the altitude

P M :(a) The inside-altitude case,

P M ⊂ ∆ P BC , (b) the outside-altitude case,

P M ⊊ ∆ P BC . Byreﬂecting arc

P B about the altitude

P M , we can unify these two cases, which is illustratedin (c). spherical triangle ∆

P BC and S ( r ) ∶= ∣ B r ( P ) ⋂ ∆ P BC ∣ [15], where B r ( P ) is aspherical cap centered at P with radius r . By construction, B r ( P ) ⋂ ∆ P BC contains those points in ∆

P BC with arc distance to P less or equal to r . Thefollowing four cases need to be considered. Case 1.

When 0 < r ≤ h , B r ( P ) ⋂ ∆ P BC is a sector, centered at P , with twocut points on ̂ P B and ̂ P C , as shown in Fig.4(a). Using Equation (A.3), itsarea is S ( r ) = α P ( − cos r ) R , (5)where α P is ∠ BP C . Case 2.

When h < r ≤ L P B , B r ( P ) ⋂ ∆ P BC consists of a spherical triangle,∆

P D D (mid), and two sectors, (both centered at P , one has cut points on ̂ P B and ̂ P D , the other cuts on ̂ P D and ̂ P C ), as shown in Fig.4(b). WithEquations (A.3) and (1), we have S ( r ) = { α ( − cos r ) + η [( π / + α r + β r − π ) + ( α − α r )( − cos r )]} R . (6) Here α r = ∠ M P D , β r = ∠ P D M , and their values depend on r . α =∠ CP B and α = ∠ M P B do not depend on r . η ∶= L BB / L MB , thus η = B = B , η = B = B . 7 X Z O Y P C B ϕ PB α C h M B r (a) X Z O Y P C B α C M B r h D D (b) X Z O Y P C B α C h M B r D (c) X Z O Y P C B h M B r (d)Figure 4: The CDF/PDF of r ( ∆ P BC ) depends on how the spherical fan originated from P with radius r intersects with ∆ P BC . By construction, every point P in this fan satisﬁes L PQ ⩽ r , therefore P ( r ⩽ r ) is the ratio between the area of the intersection and the areaof the triangle. This intersection can be classiﬁed as follows: (a) case 1: 0 < r ≤ h , theintersection is a sector. (b) case 2: h < r ≤ L PB , it contains a spherical triangle and twosectors. (c) case 3: L PB < r ≤ L PC , it contains a spherical triangle and a sector. (d) case 4: r > L PC , it is a spherical triangle. α r and β r can be represented in one integrative Equation (6) for both inside-8ltitude and outside-altitude cases α r = cos − [(——→ OM × —→ OP ) ⋅ (——→ OD × —→ OP ) ] (7) = arccos ( tan h tan r ) . (8) β r = cos − [(—→ OP × ——→ OD ) ⋅ (——→ OM × ——→ OD ) ] (9) = arcsin ( sin h sin r ) . (10) Case 3.

When L P B < r ≤ L P C , B r ( P ) ⋂ ∆ P BC includes a spherical triangle∆

P BD and a sector centered at P with cut points on ̂ P D and ̂ P C , as shownin Fig.4(c). With Equations (A.3) and (1), we have S ( r ) = [( α P − ω r + α B + β r − π ) + ω r ( − cos r )] R , (11)where α B = ∠ P BC . ω r = ∠ D P C , depending on the distance r , can be calcu-lated as ω r = cos − [(—→ OC × —→ OP ) ⋅ (——→ OD × —→ OP ) ] = ∠ M P C − α r . (12) Case 4.

When r > L P C , B r ( P ) ⋂ ∆ P BC is the spherical triangle ∆

P BC asshown in Fig.4(d). Thus we have S ( r ) = Ω . (13)After simpliﬁcations, S ( r ) can be represented as S ( r ) = ⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩ , r ≤ α P g r R , < r ≤ h ( α g r + η [( α r + β r − π / ) + ( α − α r ) g r ]) R , h < r ≤ L PB (( α P + α B + β r − π ) − ω r cos r ) R , L PB < r ≤ L PC Ω , r > L PC (14) where g r = − cos r . Apparently r ≥ r ≤ r can be extended to R .Again, the CDF of L ( ∆ P BC ) is F L ( r ) = S ( r ) Ω ; hence its PDF is f L ( r ) ∶= S ′ ( r ) Ω f L ( r ) = ⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩ , r ≤ r > L PC α P sin rR , < r ≤ h { α sin r + η [ α ′ r cos r + β ′ r + ( α − α r ) sin r ]} R , h < r ≤ L PB [ β ′ r − ω ′ r cos r + ω r sin r ] R , L PB < r ≤ L PC (15) Here α ′ r , β ′ r , and ω ′ r are the derivatives of α r , β r , and ω ) r w.r.t. r ,respectively. When h ≤ r ≤ L P C , α ′ r = ∂α ′ r ∂r = tan h sin r cos r √ tan r − tan h , (16) β ′ r = ∂β ′ r ∂r = − sin h cos r sin r √ sin r − sin h , (17) ω ′ r = − α ′ r . (18) In real world applications, researchers may be interested in the distributionof arc distance between a pre-speciﬁed internal point P to a point Q that isuniformly distributed in a spherical polygon Γ. If Γ is convex, it can be decom-posed into a combination of spherical triangles pivoting at P as shown in Fig. 5,so we can extend the results we obtained in the previous section to this case.To avoid confusion, we use L instead of L to represent the random arcdistance L P Q . The CDF of L can be calculated based on the formulas insubsection 3.1.A convex spherical polygon Γ with n edges can be segmented as n sphericaltriangles pivoting at an internal P . An illustration is given in Fig. 5, where Γ isa spherical heptagon with vertices B , B , . . . , B that is decomposed into sevenspherical triangles: ∆ P B B , . . . , ∆ P B B .Based on construction, for L ( Γ , P ) deﬁned for a spherical polygon Γ with n vertices B , B , . . . , B n and an internal point P , its CDF and PDF are F L ( r ) = P ( L P Q ⩽ r ) = ∣ Γ ∣ n ∑ i = S i ( r ) , f L ( r ) = ∣ Γ ∣ n ∑ i = S ′ i ( r ) . (19)10 igure 5: A convex spherical polygon Γ can be decomposed into a set of adjacent sphericaltriangles pivoting at a pre-speciﬁed internal point P . In Fig.5, the spherical cap centered at P represents a subset of Γ in which every point Q satisﬁes L PQ ⩽ r , where r is the arc radiusof this spherical cap. Apparently, this spherical cap is the disjoint union of spherical fanspertain to the member triangles. Consequently, the statistical properties of L = L PQ , thearc distance between P and a random point Q in Γ, can be derived from the properties of r ,which is the random arc distance between Q to a pre-selected vertex P of a spherical triangle. where S i ( r ) = ∣ B r ( P ) ⋂ ∆ P B i B i + ∣ , for i = , , . . . , n −

1, and S n ( r ) =∣ B r ( P ) ⋂ ∆ P B n B ∣ . These values, as well as their derivatives, can be com-puted by Equations (14) and (15). Of note, S i ( r ) = ∣ ∆ P B n B ∣ ⋅ P ( L P Q ⩽ r ∣ Q ∈ ∆ P B n B ) , therefore F L ( r ) = ∑ ni = ∣ ∆ P B n B ∣∣ Γ ∣ ⋅ P ( L P Q ⩽ r ∣ Q ∈ ∆ P B n B ) , whichis a weighted summation of conditional probabilities. L ( P , . . . , P m ) Let { P , P , . . . , P m } be a pre-speciﬁed set of points on a sphere and Q be a random point that is uniformly distributed on this sphere. Let L ∶= mj = L P j Q be the minimal arc length between Q and { P , P , . . . , P m } . Toderive the distribution of L , we ﬁrst partition the sphere into { Γ j } mj = , thespherical Voronoi diagram with seeds { P , P , . . . , P m } . By construction, eachcell Γ j in this diagram is a spherical polygon such that every point in Γ j is closerto P j (the seed of Γ j ) than any other seed in terms of arc distance. Consequently,if Q ∈ Γ j , L = L P j Q . This property is illustrated in Fig.6. Figure 6: A spherical Voronoi diagram is a partition of sphere that consists of adjacent anddisjoint spherical polygons ( { Γ , . . . , Γ m } ) based on pre-speciﬁed seeds { P , . . . , P m } . For eachpoint Q ∈ Γ j , its arc distance to P j (the seed of Γ j ) is the shortest among all seeds. Circles inthe ﬁgure represent points that have the same arc distance to a given seed. The intersectionsof these circles can be used to determine the boundaries between adjacent cells. Once the Voronoi diagram is constructed, the distribution of L can bederived in this way. First, we use Equation (19) to obtain the distributionof L conditional on Q ∈ Γ j . As a reminder, it depends on partition Γ j intodisjoint spherical triangles ∆ P j B i,j B i + ,j , for i = , , . . . , n j . The unconditionaldistribution of L can then be obtained by taking the weighted summation of12hese conditional distributions over all spherical triangles. The ﬁnal result ispresented as follows. F L ( r ) = m ∑ j = ∣ Γ j ∣ πR ⋅ F L ( r ∣ Q ∈ Γ j ) = πR m ∑ j = n j ∑ i = S i,j ( r ) .f L ( r ) = F ′ L ( r ) = πR m ∑ j = n j ∑ i = S ′ i,j ( r ) . (20)Here S i,j ( r ) = ∣ B r ( P ) ⋂ ∆ P j B i,j B i + ,j ∣ is the intersection of a spherical ball andthe spherical triangle ∆ P j B i,j B i + ,j ⊂ Γ j . It is easy to see that S i,j ( r ) = ∣ ∆ P j B i,j B i + ,j ∣ ⋅ P ( L P j Q ⩽ r ∣ Q ∈ ∆ P j B i,j B i + ,j ) , n j ∑ i = S i,j ( r ) = ∣ Γ j ∣ ⋅ F L ( r ∣ Q ∈ Γ j ) .

4. Moments

With the density functions derived in the previous section (Equations (15),(19), and (20)), we are now ready to compute the k th moments of L , L , and L .For a given spherical triangle ∆ P BC , we have E ( L k ( ∆ P BC )) = ∫ ∞−∞ r k f L ( r ) dr ∶= R Ω ( Term + Term + Term ) . Term = ∫ h r k ⋅ α P sin r dr . Term = ∫ L PB h r k ( α sin r + η ( α ′ r cos r + β ′ r + ( α − α r ) sin r )) dr . Term = ∫ L PC h PB r k ⋅ ( β ′ r + ω ′ r cos r − ω r sin r ) dr . (21) Here α ′ r and β ′ r are sophisticated functions of r deﬁned by Equations (16) and(17). To the best of our knowledge, Equation (21) does not permit an analyticsolution. In practice, it can be calculated with a suitable numerical integra-tion method. In our implementation, this integral is computed by Gaussianquadrature [33].As we mentioned in Section 3.2, moments of L for a given convex sphericalpolygon Γ can be computed by ﬁrst divide Γ into disjoint spherical triangles∆ P B i B i + (see Figure 5), evaluate the moments of the corresponding L in13ach triangle (by Equation (21)), and then take the weighted summation ofthese conditional moments. In summary, we have E ( L k ( Γ )) = ∣ Γ ∣ n ∑ i = ∣ ∆ P B i B i + ∣ ⋅ E ( L k ( ∆ P B i B i + )) . (22)Likewise, the moments of L for a given set of seeds { P , . . . , P m } can beobtained by ﬁrst partition the sphere into m Voronoi cells Γ j , and then takethe weighted summation of the conditional moments of the corresponding L within each Γ j . E ( L k ({ P , . . . , P m })) = πR m ∑ j = ∣ Γ j ∣ ⋅ E ( L k ( Γ j ))= πR m ∑ j = n ∑ i = ∣ ∆ P B i,j B i + ,j ∣ ⋅ E ( L k ( ∆ P j B i,j B i + ,j )) . (23) Using similar techniques, we are able to derive several moments of cosinesimilarities cos ( L ka ) , for a = , , k = , , ,

6. These results are providedin Appendix Appendix B.

5. Numerical experiment

We conduct three sets of Monte Carlo experiments to verify the theoreticalresults derived in Section 4.In each simulation, we ﬁrst generate m =

100 random seeds { P , . . . , P m } ona unit sphere based on the algorithm deﬁned in [34] which is numerically stableand more accurate than many popular spherical sampling method at generatingpoints near the north and south poles. Technical details about these threesimulations are provided as follows. • Sim1: All m =

100 seeds are generated based on the uniform distributionon the unit sphere. This is one of the most commonly used model inapplications such as wireless communications. • Sim2: First, we sample 100 seeds on the sphere uniformly. For the ﬁrst75 of them, we reﬂect those in the lower hemisphere through the xy -planeto ensure that all of them are in the upper hemisphere. It can be shown14asily that these processed seeds are uniformly distributed on the upperhemisphere. Similarly, we use reﬂection to ensure the rest 25 are uniformlydistributed on the lower hemisphere. We created this scenarios to mimican unbalanced case in which the sites are denser in a subset of the sphere. • Sim3: Using the acceptance-rejection method, we generate 50 seeds thatare uniformly distributed on the North end region, and another 50 seedsuniformly distributed on the South end region. Here the North end regionis deﬁned as {( x, y, z ) ∈ S ∣ z ⩾ . } , which is the spherical cap near thenorth pole with height h = .

9. The South end region is deﬁned in asimilar way: {( x, y, z ) ∈ S ∣ z ⩽ − . } . We use this scenario to illustratean extremely unbalanced case in which all seeds are concentrated on tworelatively small and disjoint regions of the sphere. (a) Sim1 (b) Sim2 (c) Sim3Figure 7: Three distributions of seeds used in simulations. (a) In Sim1, all m =

100 seeds areuniformly generated on the surface of the sphere. (b) In Sim2, the majority of the seeds (75of them) are uniformly distributed on the upper hemisphere of the sphere, while the rest (25of them) are uniformly distributed on the lower hemisphere. (c) In Sim3, half of the sites areuniformly generated on the North end region, while others are uniformly generated on theSouth end region.

In each case, once the seeds are generated, we uniformly sample 10, 000points ( Q s) on the sphere and compute L ∶= min mj = L P j Q , the shortest arcdistance between Q and the nearest seed. We then compute the empiricalmoments of L and cos L and compare them with their theoretical counterpartsderived in Section 4. 15o evaluate the discrepancy between the theoretical and empirical results,we use the Absolute Relative Error (ARE) deﬁne as followsARE ( m ) = ∣ ˆ m − mm ∣ . (24)Here ˆ m is one of the sample moments reported in Table 1, and m is its corre-sponding oracle moment computed by our theoretical derivations. To furtherminimize the impact of random variations in those Monte Carlo experiments,all experiments are repeated for 100 times, and the mean, maximum, and stan-dard deviation of the ARE reported in Table 1 are computed from these 100repetitions, each consists of 10,000 randomly sampled Q on the unit sphere.From Table 1, we ﬁnd that the theoretical and empirical moments of L andcos ( L ) match well in most of the cases. Speciﬁcally, the mean AREs for lowerorder moments of L across diﬀerent scenarios stay lower than 0.01; for higherorder moments the mean AREs are all lower than 0.02. The max AREs formoments of L are a little higher, but most of them are still lower than 0.05,except the 4th order moments of L in Sim2 (0.064).Estimated moments of cosine similarity have somewhat lower but acceptableaccuracy. With a few exceptions, most of E cos k L have mean and max ARElower than 0.05. We believe the inﬂated ARE of these moments is largely dueto the fact that the theoretical formulae of E cos k L involves dozens of terms(e.g., Equation (B.16)), so the total numerical error, which is the sum of errorsin all terms, becomes larger than that of E ( L k ) .Of note, the standard deviations of AREs are relatively low across all sce-narios and moments: for most of the cases, STD ( ARE ) ⩽ .

01. It implies thatthe reported accuracy is stable across 100 repetitions.

6. Conclusion & Discussion

In this paper, we ﬁrst derive the CDF and PDF of L ( ∆ P BC ) , the arcdistance between P , a vertex of the spherical triangle ∆ P BC , to a uniformlydistributed random interior point Q . Using a technique based on polygon de-composition, we are able to extend these results to cases in which the spherical16 easurement E ( L ) E ( L ) E ( L ) E ( L ) E ( cos L ) E ( cos L ) E ( cos L ) Sim1 Proposed Method 0.176 0.039 0.010 0.003 0.962 0.926 0.893Mean MC 0.175 0.039 0.010 0.003 0.944 0.960 0.845Mean ARE 0.005 0.009 0.013 0.017 0.019 0.037 0.054Max ARE 0.014 0.026 0.038 0.050 0.027 0.050 0.071STD ARE 0.003 0.006 0.009 0.012 0.002 0.004 0.006Sim2 Proposed Method 0.194 0.050 0.016 0.006 0.952 0.908 0.870Mean MC 0.195 0.050 0.016 0.006 0.980 0.958 0.938Mean ARE 0.005 0.009 0.014 0.020 0.029 0.055 0.078Max ARE 0.020 0.035 0.045 0.064 0.043 0.077 0.106STD ARE 0.004 0.007 0.010 0.015 0.004 0.007 0.010Sim3 Proposed Method 0.380 0.208 0.133 0.091 0.820 0.704 0.625Mean MC 0.378 0.206 0.132 0.090 0.807 0.684 0.601Mean ARE 0.006 0.008 0.009 0.011 0.016 0.028 0.038Max ARE 0.013 0.017 0.018 0.019 0.024 0.041 0.055STD ARE 0.005 0.007 0.007 0.008 0.005 0.008 0.009

Table 1: Simulation results show that our proposed theoretical results matches well with theempirical estimation for moments of L and cos L in all three scenarios. The mean AREsfor lower order moments of L across diﬀerent scenarios stay lower than 0.01; for higher ordermoments the mean AREs are all lower than 0.02. The max AREs for moments of L area little higher, but most of them are still lower than 0.05, except the 4th order moments of L in Sim2 (0.064). The estimation for cos L is less accurate, but still have low mean andmax ARE for lower order moments. Most of the estimation for cos L have mean and maxARE lower than 0.05. In addition, the accuracy of the estimations is stable across diﬀerentscenarios, with standard deviation of ARE ≤ L ( Γ ) ). Using methodsdeveloped for spherical Voronoi diagrams and integration techniques, we areable to compute arbitrary moments of L ({ P , . . . , P m }) , the shortest arc dis-tance from a uniformly distributed random point Q to a set of pre-deﬁned seeds { P , . . . , P m } on a sphere. Finally, we compute moments for L and cos L ,which are both widely used in various applications.We conduct Monte Carlo simulations to validate our theoretical results. Byand large, the theoretical and the empirical results are consistent. However,higher order moments for both L and cos L have larger AREs than the lowermoments. This is not beyond our expectation, since larger samples are requiredin order to obtain estimates of similar quality for higher order moments [35].Note that the assumption of convexity of spherical polygon in deriving theCDF and PDF of L is made out of convenience, not necessity. We believe theCDF/PDF of L P Q , for Q uniformly distributed in an arbitrary spherical triangle∆ of which P ∉ ∆, can be derived based on techniques we deployed in this study.Since it is always possible to divide a non-convex spherical polygon into disjointspherical triangles (not necessarily pivoting at a pre-speciﬁed interior point P if it is not convex), we can aggregate the CDF/PDF with exterior point P to obtain the desired formula for a non-convex polygon. Furthermore, everysmooth geometric object O on S can be approximated by spherical polygonstherefore a disjoint union of spherical triangles ∆ j . Therefore, the distributionof the arc distance between an arbitrary pre-speciﬁed point P ∈ S and Q uniformly distributed on O can be approximated by our method. How to achievethe best accuracy of the said approximation with a ﬁxed number of sphericaltriangles/polygons, therefore attain the best trade-oﬀ between computationalcost and accuracy, is an interesting research topic that warrants a future study.We think our work has tangible practical impact in many applications. Thestatistical distribution of the minimal distance between random points and apre-speciﬁed seeds (sites) is used in wireless communications, forest ecology,etc. Most existing studies in this ﬁeld are based on the Voronoi diagram in R . For large-scale studies, it may be necessary to model the domain as a18phere because the curvature of earth may become non-negligible. Our resultshas potential to signiﬁcantly improve the accuracy of the results under thesescenarios.Last but not least, many intermediate results (e.g., the distribution of L and L ) and mathematical techniques (e.g., the partition of the spherical trianglesand their classiﬁcations) we developed in this study may help other researchersstudy similar problems in the future. References [1] J. M. Augenbaum, C. S. Peskin, On the construction of the voronoi mesh ona sphere, Journal of Computational Physics 59 (2) (1985) 177–192 (1985).[2] H. Edelsbrunner, R. Seidel, Voronoi diagrams and arrangements, Discrete& Computational Geometry 1 (1) (1986) 25–44 (1986).[3] F. Aurenhammer, Voronoi diagrams—a survey of a fundamental geometricdata structure, ACM Computing Surveys (CSUR) 23 (3) (1991) 345–405(1991).[4] M. de Berg, M. Van Kreveld, O. Schwarzkopf, M. Overmars, Computationalgeometry algorithms and applications, 2nd Edition, Springer, Germany,2000 (2000).[5] O. Persson, Distance methods: the use of distance measurements in theestimation of seedling density and open space frequency, Studia forestaliasuecica (15) (1964) 1–68 (1964).[6] N. Christoﬁdes, S. Eilon, Expected distances in distribution problems,Journal of the Operational Research Society 20 (4) (1969) 437–443 (1969).[7] I. I. Vaisman, Statistical and computational geometry of biomolecularstructure, in: Handbook of Computational Statistics, Springer, 2012, pp.1095–1112 (2012). 198] G. Zhao, K. Xuan, W. Rahayu, D. Taniar, M. Safar, M. Gavrilova, B. Srini-vasan, Voronoi-based continuous nearest neighbor search in mobile naviga-tion, Industrial Electronics, IEEE Transactions on 58 (6) (2011) 2247–2257(2011).[9] J. K¨oll, S. Hallstr¨om, Generation of periodic stochastic foam models fornumerical analysis, Journal of cellular plastics 50 (1) (2014) 37–54 (2014).[10] J. Lin, Divergence measures based on the shannon entropy, IEEE Transac-tions on Information theory 37 (1) (1991) 145–151 (1991).[11] A. Villagran, G. Huerta, M. Vannucci, C. S. Jackson, A. Nosedal, Non-parametric sampling approximation via voronoi tessellations, Communica-tions in Statistics - Simulation and Computation 45 (2) (2016) 717–736(2016).[12] A. Okabe, B. Boots, K. Sugihara, S. N. Chiu, Spatial tessellations: conceptsand applications of Voronoi diagrams, Vol. 501, John Wiley & Sons, 2009(2009).[13] H. Li, X. Qiu, Moments of distance from a vertex to a uniformly distributedrandom point within arbitrary triangles, Mathematical Problems in Engi-neering 2016 (2016).[14] F. Aurenhammer, H. Edelsbrunner, An optimal algorithm for constructingthe weighted voronoi diagram in the plane, Pattern Recognition 17 (2)(1984) 251–257 (1984).[15] R. Pure, S. Durrani, Computing exact closed-form distance distributionsin arbitrarily shaped polygons with arbitrary reference point, The Mathe-matica Journal (2015).URL dx.doi.org/doi:10.3888/tmj.17-6 [16] M. Ahmadi, J. Pan, Random distances associated with arbitrary triangles:A recursive approach with an arbitrary reference point, UVicSpace (2014).URL http://hdl.handle.net/1828/5134 ppendix A. Deﬁnitions of useful geometric concepts of sphericaltriangles L AB = Rϕ AB (A.1)The area of the spherical cap, as shown in Fig. 2(a), is A cap = πR ( − cos θ ) (A.2)The area of the spherical sector, as shown in Fig. 2(b), is A fan = αR ( − cos θ ) (A.3) Appendix B. Moments of Cosine Similarity

With the PDF of L provided in Equation (15), we can calculate the math-ematical expectation of cos L as follows. E ( cos ( r )) = ∫ ∞−∞ cos ( r ) f ( r ) dr = (∫ h cos ( r ) ⋅ α P sin rdr + ∫ L PB h cos ( r ) ⋅ { α sin r + η [ α ′ r cos r + β ′ r + ( α − α r ) sin r ]} dr + ∫ L PC h PB cos ( r ) ⋅ [ β ′ r + ω ′ r cos r − ω r sin r ] dr )= ( Term + Term + Term ) . (B.1)The ﬁrst two terms in the above equation are derived as follows.Term = ∫ h cos ( r ) ⋅ α P sin rdr = − α P ⋅ cos ( r )∣ h = − α P ⋅ ( cos ( h ) − ) = α P ( − cos ( h )) . (B.2)23erm = ∫ L PB h cos ( r ) ⋅ { α sin r + η [ α ′ r cos r + β ′ r + ( α − α r ) sin r ]} dr = ∫ L PB h cos ( r ) ⋅ α sin rdr + ηI η = α [ cos ( h ) − cos ( h P B )] + ηI η . (B.3)Here I η = ∫ L PB h cos ( r )[ α ′ r cos r + β ′ r + ( α − α r ) sin r ] dr = ∫ L PB h ( cos rα ′ r − cos r sin rα r ) dr + ∫ L PB h cos rβ ′ r dr + ∫ L PB h α cos r sin rdr = I η, + I η, + I η, , (B.4)where I η, = ∫ L PB h ( cos rα ′ r − cos r sin rα r ) dr = α r cos r ∣ L PB h + ∫ L PB h cos rα ′ r dr = [ arccos tan h tan L PB cos L PB − ] + ∫ L PB h tan h cos r sin r √ tan r − tan h dr =

13 arccos tan h tan L PB cos L PB + ⋅ I η, p . (B.5) I η, p = ∫ L PB h tan h cos r sin r √ tan r − tan h dr = tan h ( ∫ L PB h r √ tan r − tan h dr − ∫ L PB h sin r √ tan r − tan h dr )= −

12 arctan 2 tan h √ tan r − tan h √ tan r + r − tan r tan h − h ∣ L PB h − √ tan r − tan h tan h ( + tan h )√ + tan r ∣ L PB h = −

12 arctan 2 tan h √ tan L PB − tan h √ tan L PB + L PB − tan L PB tan h − h − tan h √ tan L PB − tan h ( + tan h )√ + tan L PB . (B.6) Note that the following two technical results are needed in deriving the above24ntegral. Let u = tan r , a = tan h , we have ∫ r √ tan r − tan h dr = ∫ u √ u − a √ + u du = √− a a ⎡⎢⎢⎢⎢⎣ ln √ u + − √− a − √ u − a − ln (√− a )+ √ u + − √− a − √ u − a − a (√ u + − )√− a − √ u − a + √− a (√ u + − ) (√− a − √ u − a ) ⎤⎥⎥⎥⎥⎦= i a ln u − u a − a + i ⋅ a √ u − a √ u + u ( a + ) = − a arctan 2 a √ u − a √ u + u − u a − a + Const , Const ∶= ⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩− i a ln ( a + ) , u − u a − a > , − i a ln ( a + ) − π a , u − u a − a < . (B.7) Note that only the ﬁrst term of the results in Equation (B.7) depends on variable u ∶= tan r , so the choice of Const is irrelevant in computing the correspondingdeﬁnite integral used in Equation (B.6).Again let u = tan r , b = tan h , we have ∫ sin r √ tan r − tan h dr = ∫ √ u − b ( + u ) / du = √ u − b ( b + )√ u + = √ tan r − tan h ( + tan h )√ + tan r . (B.8)The rest two terms, I η, , I η, are provided as follows. I η, = ∫ L PB h cos rβ ′ r dr = − ∫ L PB h sin h cos r sin r √ sin r − sin h dr = −( arctan √ sin r − sin h sin h − sin h √ sin r − sin h )∣ L PB h = sin h √ sin L P B − sin h − arctan √ sin L P B − sin h sin h . (B.9) I η, = ∫ L PB h α cos r sin rdr = − α cos r ∣ L PB h = α ( cos h − cos L P B ) . (B.10)25n summary, we have I η = I η, + I η, + I η, =

13 arccos tan h tan L PB cos L PB −

13 arctan 2 tan h √ tan L PB − tan h √ tan L PB + L PB − tan L PB tan h − h − h √ tan L PB − tan h ( + tan h )√ + tan L PB + sin h √ sin L PB − sin h − arctan √ sin L PB − sin h sin h + α ( cos h − cos L PB ) . (B.11) Term is derived as follows. Term = ∫ L PC L PB cos ( r ) ⋅ [ β ′ r + ω ′ r cos r − ω r sin r ] dr = − ∫ L PC L PB sin h cos r sin r √ sin r − sin h dr + ω r cos r ∣ L PC L PB − ∫ L PC L PB ω ′ r cos rdr = sin h √ sin L PC − sin h − arctan √ sin L PC − sin h sin h − sin h √ sin L PB − sin h + arctan √ sin L PB − sin h sin h + [ cos L PB arccos tan h tan L PB − cos L PC arccos tan h tan L PC ]+ ∫ L PC L PB cos rα ′ r dr = sin h √ sin L PC − sin h − arctan √ sin L PC − sin h sin h − sin h √ sin L PB − sin h + arctan √ sin L PB − sin h sin h +

13 cos L PB arccos tan h tan L PB −

13 cos L PC arccos tan h tan L PC +

23 Term . (B.12) Using Equation (B.6), we haveTerm = ∫ L PC L PB cos rα ′ r dr = ∫ L PC L PB tan h cos r sin r √ tan r − tan h dr =

12 arctan 2 tan h √ tan L P B − tan h √ tan L P B + L P B − tan L P B tan h − h −

12 arctan 2 tan h √ tan L P C − tan h √ tan L P C + L P C − tan L P C tan h − h − tan h √ tan L P C − tan h ( + tan h )√ + tan L P C + tan h √ tan L P B − tan h ( + tan h )√ + tan L P B . (B.13)26n summary, we have E ( cos ( r )) = ( Term + Term + Term )= α P ( − cos ( h )) + α [ cos ( h ) − cos ( h PB )] + ηI η + sin h √ sin L PC − sin h − arctan √ sin L PC − sin h sin h − sin h √ sin L PB − sin h + arctan √ sin L PB − sin h sin h +

13 cos L PB arccos tan h tan L PB −

13 cos L PC arccos tan h tan L PC +

23 Term = α P ( − cos ( h )) + α [ cos ( h ) − cos ( h PB )]+ (

13 arccos tan h tan L PB cos L PB −

13 arctan 2 tan h √ tan L PB − tan h √ tan L PB + L PB − tan L PB tan h − h − h √ tan L PB − tan h ( + tan h )√ + tan L PB + sin h √ sin L PB − sin h − arctan √ sin L PB − sin h sin h + α ( cos h − cos L PB ))+ sin h √ sin L PC − sin h − arctan √ sin L PC − sin h sin h − sin h √ sin L PB − sin h + arctan √ sin L PB − sin h sin h +

13 cos L PB arccos tan h tan L PB −

13 cos L PC arccos tan h tan L PC +

13 arctan 2 tan h √ tan L PB − tan h √ tan L PB + L PB − tan L PB tan h − h −

13 arctan 2 tan h √ tan L PC − tan h √ tan L PC + L PC − tan L PC tan h − h − h √ tan L PC − tan h ( + tan h )√ + tan L PC + h √ tan L PB − tan h ( + tan h )√ + tan L PB . (B.14) r and cos r as follows. E ( cos L ) = ∫ +∞−∞ cos r ⋅ f ( r ) dr = ( ∫ h cos r ⋅ αα p sin rdr + ∫ LPBh cos r ⋅ { α sin r + η [ α ′ cos r + β ′ + ( α − α r ) sin r ]} dr + ∫ LPCLPB cos r ⋅ [ β ′ + ω ′ cos r − ω r sin r ] dr )= ⎛⎝ (∠ MP C cos ( L PC ) − (∠ MP C + α η + α ) cos ( L PB ) + ( α η − α p + α ) cos h + α p )+ η −

65 arctan ⎛⎝ √ cos h − cos ( L PB ) sin h ⎞⎠ +

65 arctan ⎛⎝ √ cos h − cos ( L PC ) sin h ⎞⎠+ η −

615 sin h ( sin ( L PB ) − h − ) √ cos h − cos ( L PB )+

25 sin h ( sin ( L PC ) − h − ) √ cos h − cos ( L PC )+ ( η − ) arcsin ( sin h ∣ sin ( L PB )∣ ) + arcsin ( sin h ∣ sin ( L PC )∣ ) − η arcsin ( sin h ∣ sin h ∣ )− ( cos ( L PC ) arccos ( tan h tan ( L PC ) ) − ( + η ) cos ( L PB ) arccos ( tan h tan ( L PB ) ))− η −

13 sin h √( sin ( L PB ) − sin h ) ( sin ( L PB ) + sin h ) ( sin ( L PB ) + h − )−

13 sin h √( sin ( L PC ) − sin h ) ( sin ( L PC ) + sin h ) ( sin ( L PC ) + h − ) ⎞⎠ . (B.15) ( c o s r ) = ∫ + ∞ − ∞ c o s r ⋅ f ( r ) d r = Ω ( ∫ h c o s r ⋅ αα p s i n r d r + ∫ L P B h c o s r ⋅ { α s i n r + η [ α ′ c o s r + β ′ + ( α − α r ) s i n r ] } d r + ∫ L P C L P B c o s r ⋅ [ β ′ + ω ′ c o s r − ω r s i n r ] d r ) = Ω ⎛ ⎝ ( c o s h ( α η − α p + α ) − c o s ( L P B )( α η − ∠ M P C + α ) + ∠ M P C c o s ( L P C ) + α p ) + η − r c t a n ⎛ ⎝ √ c o s h − c o s ( L P B ) s i n h ⎞ ⎠ + r c t a n ⎛ ⎝ √ c o s h − c o s ( L P C ) s i n h ⎞ ⎠ − η − s i n h √ c o s h − c o s ( L P B ) ( s i n ( L P B ) ( s i n ( L P B ) − c o s h − ) + c o s h + c o s h + ) − s i n h √ c o s h − c o s ( L P C ) ( s i n ( L P C ) ( s i n ( L P C ) − c o s h − ) + c o s h + c o s h + ) + ( η − ) a r c s i n ( s i n h ∣ s i n ( L P B ) ∣ ) + a r c s i n ( s i n h ∣ s i n ( L P C ) ∣ ) − η a r c s i n ( s i n h ∣ s i n h ∣ ) + ( ( + η ) c o s ( L P B ) a r cc o s ( t a n h t a n ( L P B ) ) − c o s ( L P C ) a r cc o s ( t a n h t a n ( L P C ) )) + η − s i n h √ ( s i n ( L P B ) − s i n h )( s i n ( L P B ) + ( s i n h − ) s i n ( L P B ) + s i n h − s i n h + ) + s i n h √ ( s i n ( L P C ) − s i n h )( s i n ( L P C ) + ( s i n h − ) s i n ( L P C ) + s i n h − s i n h + ) ⎞ ⎠ . ( B . ))