Stochastic Properties of Minimal Arc Distance and Cosine Similarity between a Random Point and Prespecified Sites on Sphere
SStochastic Properties of Minimal Arc Distance andCosine Similarity between a Random Point andPrespecified Sites on Sphere
Hongjun Li a , Jiatong Sui b , Shengpeng Mu a , Xing Qiu b, ∗ a College of Science, Beijing Forestry University, Beijing, China b Department of Biostatistics and Computational Biology, University of Rochester, NewYork, U.S.A.
Abstract
In applications such as wireless communication, it is important to study thestatistical properties of L , the minimal arc distance between a random point(e.g., a cellphone user) uniformly distributed on a sphere to a set of pre-definedseeds (e.g., wireless towers) on that sphere. In this study, we first derive thedistribution (CDF) and density (PDF) functions of the arc distance between aselected vertex of a spherical triangle to a random point uniformly distributedwithin this triangle. Next, using computational techniques based on sphericalVoronoi diagram and triangular partition of Voronoi cells, we derive moments of L and cos L . These results are verified by extensive Monte Carlo simulations. Keywords:
Arc distance, cosine similarity, distribution function, moments,spherical geometry, Voronoi diagram
1. Introduction
A Voronoi diagram, also known as Thiessen polygon or Dirichlet tessellation,is a powerful tool in conducting research related to the partition of a space [1].Given a set of pre-specified points (called seeds or sites), a Voronoi diagram ∗ Correspondence should be send to: Xing Qiu.
Email addresses: [email protected] (Hongjun Li), [email protected] (Jiatong Sui),
[email protected] (ShengpengMu), [email protected] (Xing Qiu)
Preprint submitted to Computational Statistics & Data Analysis February 3, 2021 a r X i v : . [ s t a t . C O ] F e b ivides the space into regions (called Voronoi cells) according to the nearest-neighbor rule, such that each region contains exactly one seed and the interiorpoints of the regions are closer to this seed than any others [2, 3].Voronoi diagrams can be constructed efficiently [4], and it has wide ap-plications in various disciplines, including forestry, chemistry, transportation,and statistics [5, 6, 7, 8]. In particular, it has been used by mathematiciansand statisticians to model stochastic foam geometries [9], estimating Shannonentropy of multidimensional probability [10], approximating resampling-baseddistributions [11], etc.In this study, we focus on the following application of Voronoi diagram inwireless communications. As we know, wireless signal strength depends criti-cally on the distance between a user’s device (e.g., cell phone) and the nearest base station. In order to study the distribution and moments of this minimaldistance, we need to decompose the whole region covered by wireless signalinto Voronoi cells [12, 13] of which the seeds are base stations. Because aVoronoi cell is a convex polygon composed of several triangles intersecting atthe seed, it suffices to study the stochastic properties of the distance betweena selected vertex (the seed) of a triangle and a random interior point. Severalresearch teams, including us, have study this problem on R thoroughly underthe assumption that the random point follows a uniform distribution on thetriangle [14, 15, 16, 13]. In particular, we derived the closed-form cumulativedistribution function (CDF), probability density function (PDF), and the firsttwo moments of this random distance [13].There are various generalizations of Voronoi diagram that serve better forcertain applications [17], including higher order diagrams, weighted Voronoidiagrams, and the case in which the seeds are replaced by line segments [18, 12,19, 20]. Another large class of generalizations are Voronoi diagrams defined onnon-Euclidean geometric spaces, e.g., Laguerre planes or spheres [21, 22, 23].The latter case (spherical Voronoi diagram, illustrated in Figure 1) is especiallyimportant in practice, not least because the Earth is approximated spherical.Again take the wireless communication problem as an example, if the base2tations to be modeled occupy a relatively large area, using the spherical Voronoidiagram defined on S would be more accurate than the planar Voronoi diagram.There are similar problems in large-scale forest ecology, vehicle distributionmanagement, and biomolecular structure modeling, that can benefit from usingthe spherical rather than planar Voronoi diagram.The distance between random points on S have been studied by a few re-searchers, including the generation and test of uniformity on the sphere [24],minimization of the summation of Euclidean distances of n random points [25],and random spherical packing [26, 27]. However, no prior studies have system-atically studied the stochastic properties of L ( ∆ P BC ) , the random sphericaldistance ( a.k.a. arc distance) between a given vertex P to a random point Q which is uniformly distributed on the spherical triangle ∆ P BC .In this study, we first study the stochastic properties of L ( ∆ P BC ) in de-tail, and extend these results to the case in which the spherical triangle ∆ P BC is replaced by an arbitrary convex spherical polygon. Next, we use these resultsand techniques based on spherical Voronoi diagram to derive the distributionof L ( P , . . . , P m ) , which is the minimal arc distance between a uniformly dis-tributed random point on S to { P , . . . , P m } , a collection of pre-specified siteson the same sphere.This manuscript is organized as follows. In Section 2 we provide usefulpreliminary results about spherical triangles. Detailed derivations of the CDFand PDF of L ( ∆ P BC ) , as well as their generalizations to convex sphericalpolygons (dubbed as L ( Γ , P ) , where Γ is a convex spherical polygon and P isan arbitrary internal seed), are given in Sections 3.1 and 3.2. The CDF and PDFof L ( P , . . . , P m ) is derived in Section 3.3. The close-form expression of firstfour moments of L ( ∆ P BC ) is provided in Section 4. To further demonstratethe utility of the CDF and PDF derived in Section 3, we derive the exact formulafor several moments of cos ( L ( ∆ P BC )) in Appendix B. These quantities areuseful in spherical cluster analysis and data mining [28, 29, 30, 31].3 a) (b)Figure 1: An illustration of spherical Voronoi diagram and spherical triangles. (a) Given a setof seeds, the spherical Voronoi diagram is a partition of a sphere into convex spherical polygons,such that the points within each polygon has the shortest spherical (angular) distance to theinterior seed than all exterior seeds. A combination of triangulation and mid-point bisectorinspector is employed to form these honeycomb-like, asymmetric, mesh-like shapes. (b) Dueto the non-Euclidean nature of S , a “straight line” (geodesic) on S is a great circle, and thespherical triangle is a triangular shaped area enclosed by three great circles. The sphericaldistance between two points such as A and A is measured by the arc-length of the greatcircle passing through these two points. Spherical geometry and trigonometry is fundamentallydifferent from planar geometry/trigonometry. For instance, the sum of the interior angles ofa spherical triangle exceeds 180 degrees [32].
2. Useful geometric properties of spherical triangles
Throughout this manuscript, we denote the sphere in R centered at theorigin (denoted by O ) with radius R by S R ∶= { x ∈ R ∶ ∣ x ∣ = R } . Let A and B be two distinct and non-opposing points on S R , e.g., A ≠ ± B , they define agreat circle, i.e., geodesic on S R . We denote the smallest arc of a great circlewith endpoints A and B by L AB ; ∠ AOB measured in radians by ϕ AB . Wealso denote the area of the spherical cap by A cap and the half-of-top angle [16]of this spherical cap by θ (see Fig. 2(a)). The area of a spherical sector, asshown in Fig. 2(b), is denoted by A fan , its top angle is denoted by α . Formaldefinitions of these concepts can be found in Appendix A.4 a) (b) X Z O Y
P C B ϕ PB α P α B α C h M (c)Figure 2: The three sub-figures illustrate the cap, fan and triangle on a sphere. (a) A sphericalcap is the region of a sphere cut off by a plane. If the plane passes through the center of thesphere, the cap is a called a hemisphere. We could use a second plane to cut the cap, whichwill result in a spherical segment. (b) A spherical fan on the sphere is part of a sphericalcap generated by further partitions, using great circles passing through the apex or the poleof spherical cap. (c) A triangle from a sphere can be formed using 3 points on a sphere,connected by the great circles. In this paper, we consider the inner triangles instead of theouter triangles. The sum of the angles of a inner spherical triangle is between π and 3 π timesthe radius. Area of spherical triangle.
Let α P , α B , α C be the three interior angles of triangle∆ P BC , as shown in Fig. 2(c). It is well known that the area of ∆
P BC is A ∆ = R ( α P + α B + α C − π ) = R E (1)Here α P , α B , and α C represent angles between great circles. The sphericalangle ∠ AP B = α can be calculated with the following Equation α = arccos ((—→ OA × —→ OP ) ⋅ (—→ OB × —→ OP ) ) (2)Here —→ v represents the unit direction of vector v ∈ R , namely, —→ v ∶= —→ v /∣—→ v ∣ .The calculation for α B and α C are similar. E is called the spherical excess ,0 ≤ E ≤ π , with E = Altitude of spherical triangle.
An important concept used in our derivation isthe altitude of a spherical triangle, which is a generalization of the altitude ofa planar triangle [13]. For spherical triangle ∆
P BC , the length h BC (or h ) of5he altitude to base edge ̂ BC is the length of arc ̂ P M , as shown in Fig. 2(c),which can be calculated as h = R ∠ P OM = R arcsin (—→ n ⋅ —→ OP ) (3)where —→ n is the unit normal direction of the plane defined by points O, B and C : —→ n ∶= (—→ OB × —→ OC ) ; h is the shortest distance from vertex P to the greatestcircle defined by B and C . In addition, the intersection M can be calculated bythe following equation ——→ OM = R (—→ OP + t —→ n ) , (4)where t ∶= −(—→ n ⋅ —→ OP ) .
3. Distribution of the Minimal Arc Distance
In this section, we will derive the distribution of L ( P , . . . , P m ) , the min-imal arc distance between a random point Q to a pre-specified set of sites { P , . . . , P m } on a sphere. L ( ∆ P BC ) Recall that ∆
P BC is an arbitrary spherical triangle, and L ( ∆ P BC ) rep-resents the arc distance between P to a random point Q uniformly distributedin ∆ P BC . Without loss of generality, we may choose B and C such that that L P B ≤ P C . In [16], Ahmadi and Pan classified every spherical triangle ∆
P BC into two cases based on whether the altitude from P to the arc BC is inside(e.g., Fig.3(a)) or outside (e.g., Fig.3(b)) of the triangle.We find that these two cases based on the altitude can be unified into asingle graph (Fig.3(c)) as follows: based on the altitude, a reflection of arc P B about the altitude is drawn. The outside and the inside arcs are denoted by ̂ P B and ̂ P B , respectively. Then for the inside-altitude case, B coincides with B ,while for the outside-altitude case, B coincides with B . The above procedureis illustrated in Fig. 3(c).The altitude length h can be calculated with Equation (3). F L ( r ) , theCDF of L can be represented as F L ( r ) = S ( r )/ Ω, where Ω is the area of the6
X Z O Y
P C B ϕ PB α P α B α C h M (a) X Z O Y
P C B ϕ PB α P α B α C h M (b) X Z O Y P C B ϕ PB α α C h M B (c)Figure 3: ∆ P BC can be classified into two classes based on the location of the altitude
P M :(a) The inside-altitude case,
P M ⊂ ∆ P BC , (b) the outside-altitude case,
P M ⊊ ∆ P BC . Byreflecting arc
P B about the altitude
P M , we can unify these two cases, which is illustratedin (c). spherical triangle ∆
P BC and S ( r ) ∶= ∣ B r ( P ) ⋂ ∆ P BC ∣ [15], where B r ( P ) is aspherical cap centered at P with radius r . By construction, B r ( P ) ⋂ ∆ P BC contains those points in ∆
P BC with arc distance to P less or equal to r . Thefollowing four cases need to be considered. Case 1.
When 0 < r ≤ h , B r ( P ) ⋂ ∆ P BC is a sector, centered at P , with twocut points on ̂ P B and ̂ P C , as shown in Fig.4(a). Using Equation (A.3), itsarea is S ( r ) = α P ( − cos r ) R , (5)where α P is ∠ BP C . Case 2.
When h < r ≤ L P B , B r ( P ) ⋂ ∆ P BC consists of a spherical triangle,∆
P D D (mid), and two sectors, (both centered at P , one has cut points on ̂ P B and ̂ P D , the other cuts on ̂ P D and ̂ P C ), as shown in Fig.4(b). WithEquations (A.3) and (1), we have S ( r ) = { α ( − cos r ) + η [( π / + α r + β r − π ) + ( α − α r )( − cos r )]} R . (6) Here α r = ∠ M P D , β r = ∠ P D M , and their values depend on r . α =∠ CP B and α = ∠ M P B do not depend on r . η ∶= L BB / L MB , thus η = B = B , η = B = B . 7 X Z O Y P C B ϕ PB α C h M B r (a) X Z O Y P C B α C M B r h D D (b) X Z O Y P C B α C h M B r D (c) X Z O Y P C B h M B r (d)Figure 4: The CDF/PDF of r ( ∆ P BC ) depends on how the spherical fan originated from P with radius r intersects with ∆ P BC . By construction, every point P in this fan satisfies L PQ ⩽ r , therefore P ( r ⩽ r ) is the ratio between the area of the intersection and the areaof the triangle. This intersection can be classified as follows: (a) case 1: 0 < r ≤ h , theintersection is a sector. (b) case 2: h < r ≤ L PB , it contains a spherical triangle and twosectors. (c) case 3: L PB < r ≤ L PC , it contains a spherical triangle and a sector. (d) case 4: r > L PC , it is a spherical triangle. α r and β r can be represented in one integrative Equation (6) for both inside-8ltitude and outside-altitude cases α r = cos − [(——→ OM × —→ OP ) ⋅ (——→ OD × —→ OP ) ] (7) = arccos ( tan h tan r ) . (8) β r = cos − [(—→ OP × ——→ OD ) ⋅ (——→ OM × ——→ OD ) ] (9) = arcsin ( sin h sin r ) . (10) Case 3.
When L P B < r ≤ L P C , B r ( P ) ⋂ ∆ P BC includes a spherical triangle∆
P BD and a sector centered at P with cut points on ̂ P D and ̂ P C , as shownin Fig.4(c). With Equations (A.3) and (1), we have S ( r ) = [( α P − ω r + α B + β r − π ) + ω r ( − cos r )] R , (11)where α B = ∠ P BC . ω r = ∠ D P C , depending on the distance r , can be calcu-lated as ω r = cos − [(—→ OC × —→ OP ) ⋅ (——→ OD × —→ OP ) ] = ∠ M P C − α r . (12) Case 4.
When r > L P C , B r ( P ) ⋂ ∆ P BC is the spherical triangle ∆
P BC asshown in Fig.4(d). Thus we have S ( r ) = Ω . (13)After simplifications, S ( r ) can be represented as S ( r ) = ⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩ , r ≤ α P g r R , < r ≤ h ( α g r + η [( α r + β r − π / ) + ( α − α r ) g r ]) R , h < r ≤ L PB (( α P + α B + β r − π ) − ω r cos r ) R , L PB < r ≤ L PC Ω , r > L PC (14) where g r = − cos r . Apparently r ≥ r ≤ r can be extended to R .Again, the CDF of L ( ∆ P BC ) is F L ( r ) = S ( r ) Ω ; hence its PDF is f L ( r ) ∶= S ′ ( r ) Ω f L ( r ) = ⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩ , r ≤ r > L PC α P sin rR , < r ≤ h { α sin r + η [ α ′ r cos r + β ′ r + ( α − α r ) sin r ]} R , h < r ≤ L PB [ β ′ r − ω ′ r cos r + ω r sin r ] R , L PB < r ≤ L PC (15) Here α ′ r , β ′ r , and ω ′ r are the derivatives of α r , β r , and ω ) r w.r.t. r ,respectively. When h ≤ r ≤ L P C , α ′ r = ∂α ′ r ∂r = tan h sin r cos r √ tan r − tan h , (16) β ′ r = ∂β ′ r ∂r = − sin h cos r sin r √ sin r − sin h , (17) ω ′ r = − α ′ r . (18) In real world applications, researchers may be interested in the distributionof arc distance between a pre-specified internal point P to a point Q that isuniformly distributed in a spherical polygon Γ. If Γ is convex, it can be decom-posed into a combination of spherical triangles pivoting at P as shown in Fig. 5,so we can extend the results we obtained in the previous section to this case.To avoid confusion, we use L instead of L to represent the random arcdistance L P Q . The CDF of L can be calculated based on the formulas insubsection 3.1.A convex spherical polygon Γ with n edges can be segmented as n sphericaltriangles pivoting at an internal P . An illustration is given in Fig. 5, where Γ isa spherical heptagon with vertices B , B , . . . , B that is decomposed into sevenspherical triangles: ∆ P B B , . . . , ∆ P B B .Based on construction, for L ( Γ , P ) defined for a spherical polygon Γ with n vertices B , B , . . . , B n and an internal point P , its CDF and PDF are F L ( r ) = P ( L P Q ⩽ r ) = ∣ Γ ∣ n ∑ i = S i ( r ) , f L ( r ) = ∣ Γ ∣ n ∑ i = S ′ i ( r ) . (19)10 igure 5: A convex spherical polygon Γ can be decomposed into a set of adjacent sphericaltriangles pivoting at a pre-specified internal point P . In Fig.5, the spherical cap centered at P represents a subset of Γ in which every point Q satisfies L PQ ⩽ r , where r is the arc radiusof this spherical cap. Apparently, this spherical cap is the disjoint union of spherical fanspertain to the member triangles. Consequently, the statistical properties of L = L PQ , thearc distance between P and a random point Q in Γ, can be derived from the properties of r ,which is the random arc distance between Q to a pre-selected vertex P of a spherical triangle. where S i ( r ) = ∣ B r ( P ) ⋂ ∆ P B i B i + ∣ , for i = , , . . . , n −
1, and S n ( r ) =∣ B r ( P ) ⋂ ∆ P B n B ∣ . These values, as well as their derivatives, can be com-puted by Equations (14) and (15). Of note, S i ( r ) = ∣ ∆ P B n B ∣ ⋅ P ( L P Q ⩽ r ∣ Q ∈ ∆ P B n B ) , therefore F L ( r ) = ∑ ni = ∣ ∆ P B n B ∣∣ Γ ∣ ⋅ P ( L P Q ⩽ r ∣ Q ∈ ∆ P B n B ) , whichis a weighted summation of conditional probabilities. L ( P , . . . , P m ) Let { P , P , . . . , P m } be a pre-specified set of points on a sphere and Q be a random point that is uniformly distributed on this sphere. Let L ∶= mj = L P j Q be the minimal arc length between Q and { P , P , . . . , P m } . Toderive the distribution of L , we first partition the sphere into { Γ j } mj = , thespherical Voronoi diagram with seeds { P , P , . . . , P m } . By construction, eachcell Γ j in this diagram is a spherical polygon such that every point in Γ j is closerto P j (the seed of Γ j ) than any other seed in terms of arc distance. Consequently,if Q ∈ Γ j , L = L P j Q . This property is illustrated in Fig.6. Figure 6: A spherical Voronoi diagram is a partition of sphere that consists of adjacent anddisjoint spherical polygons ( { Γ , . . . , Γ m } ) based on pre-specified seeds { P , . . . , P m } . For eachpoint Q ∈ Γ j , its arc distance to P j (the seed of Γ j ) is the shortest among all seeds. Circles inthe figure represent points that have the same arc distance to a given seed. The intersectionsof these circles can be used to determine the boundaries between adjacent cells. Once the Voronoi diagram is constructed, the distribution of L can bederived in this way. First, we use Equation (19) to obtain the distributionof L conditional on Q ∈ Γ j . As a reminder, it depends on partition Γ j intodisjoint spherical triangles ∆ P j B i,j B i + ,j , for i = , , . . . , n j . The unconditionaldistribution of L can then be obtained by taking the weighted summation of12hese conditional distributions over all spherical triangles. The final result ispresented as follows. F L ( r ) = m ∑ j = ∣ Γ j ∣ πR ⋅ F L ( r ∣ Q ∈ Γ j ) = πR m ∑ j = n j ∑ i = S i,j ( r ) .f L ( r ) = F ′ L ( r ) = πR m ∑ j = n j ∑ i = S ′ i,j ( r ) . (20)Here S i,j ( r ) = ∣ B r ( P ) ⋂ ∆ P j B i,j B i + ,j ∣ is the intersection of a spherical ball andthe spherical triangle ∆ P j B i,j B i + ,j ⊂ Γ j . It is easy to see that S i,j ( r ) = ∣ ∆ P j B i,j B i + ,j ∣ ⋅ P ( L P j Q ⩽ r ∣ Q ∈ ∆ P j B i,j B i + ,j ) , n j ∑ i = S i,j ( r ) = ∣ Γ j ∣ ⋅ F L ( r ∣ Q ∈ Γ j ) .
4. Moments
With the density functions derived in the previous section (Equations (15),(19), and (20)), we are now ready to compute the k th moments of L , L , and L .For a given spherical triangle ∆ P BC , we have E ( L k ( ∆ P BC )) = ∫ ∞−∞ r k f L ( r ) dr ∶= R Ω ( Term + Term + Term ) . Term = ∫ h r k ⋅ α P sin r dr . Term = ∫ L PB h r k ( α sin r + η ( α ′ r cos r + β ′ r + ( α − α r ) sin r )) dr . Term = ∫ L PC h PB r k ⋅ ( β ′ r + ω ′ r cos r − ω r sin r ) dr . (21) Here α ′ r and β ′ r are sophisticated functions of r defined by Equations (16) and(17). To the best of our knowledge, Equation (21) does not permit an analyticsolution. In practice, it can be calculated with a suitable numerical integra-tion method. In our implementation, this integral is computed by Gaussianquadrature [33].As we mentioned in Section 3.2, moments of L for a given convex sphericalpolygon Γ can be computed by first divide Γ into disjoint spherical triangles∆ P B i B i + (see Figure 5), evaluate the moments of the corresponding L in13ach triangle (by Equation (21)), and then take the weighted summation ofthese conditional moments. In summary, we have E ( L k ( Γ )) = ∣ Γ ∣ n ∑ i = ∣ ∆ P B i B i + ∣ ⋅ E ( L k ( ∆ P B i B i + )) . (22)Likewise, the moments of L for a given set of seeds { P , . . . , P m } can beobtained by first partition the sphere into m Voronoi cells Γ j , and then takethe weighted summation of the conditional moments of the corresponding L within each Γ j . E ( L k ({ P , . . . , P m })) = πR m ∑ j = ∣ Γ j ∣ ⋅ E ( L k ( Γ j ))= πR m ∑ j = n ∑ i = ∣ ∆ P B i,j B i + ,j ∣ ⋅ E ( L k ( ∆ P j B i,j B i + ,j )) . (23) Using similar techniques, we are able to derive several moments of cosinesimilarities cos ( L ka ) , for a = , , k = , , ,
6. These results are providedin Appendix Appendix B.
5. Numerical experiment
We conduct three sets of Monte Carlo experiments to verify the theoreticalresults derived in Section 4.In each simulation, we first generate m =
100 random seeds { P , . . . , P m } ona unit sphere based on the algorithm defined in [34] which is numerically stableand more accurate than many popular spherical sampling method at generatingpoints near the north and south poles. Technical details about these threesimulations are provided as follows. • Sim1: All m =
100 seeds are generated based on the uniform distributionon the unit sphere. This is one of the most commonly used model inapplications such as wireless communications. • Sim2: First, we sample 100 seeds on the sphere uniformly. For the first75 of them, we reflect those in the lower hemisphere through the xy -planeto ensure that all of them are in the upper hemisphere. It can be shown14asily that these processed seeds are uniformly distributed on the upperhemisphere. Similarly, we use reflection to ensure the rest 25 are uniformlydistributed on the lower hemisphere. We created this scenarios to mimican unbalanced case in which the sites are denser in a subset of the sphere. • Sim3: Using the acceptance-rejection method, we generate 50 seeds thatare uniformly distributed on the North end region, and another 50 seedsuniformly distributed on the South end region. Here the North end regionis defined as {( x, y, z ) ∈ S ∣ z ⩾ . } , which is the spherical cap near thenorth pole with height h = .
9. The South end region is defined in asimilar way: {( x, y, z ) ∈ S ∣ z ⩽ − . } . We use this scenario to illustratean extremely unbalanced case in which all seeds are concentrated on tworelatively small and disjoint regions of the sphere. (a) Sim1 (b) Sim2 (c) Sim3Figure 7: Three distributions of seeds used in simulations. (a) In Sim1, all m =
100 seeds areuniformly generated on the surface of the sphere. (b) In Sim2, the majority of the seeds (75of them) are uniformly distributed on the upper hemisphere of the sphere, while the rest (25of them) are uniformly distributed on the lower hemisphere. (c) In Sim3, half of the sites areuniformly generated on the North end region, while others are uniformly generated on theSouth end region.
In each case, once the seeds are generated, we uniformly sample 10, 000points ( Q s) on the sphere and compute L ∶= min mj = L P j Q , the shortest arcdistance between Q and the nearest seed. We then compute the empiricalmoments of L and cos L and compare them with their theoretical counterpartsderived in Section 4. 15o evaluate the discrepancy between the theoretical and empirical results,we use the Absolute Relative Error (ARE) define as followsARE ( m ) = ∣ ˆ m − mm ∣ . (24)Here ˆ m is one of the sample moments reported in Table 1, and m is its corre-sponding oracle moment computed by our theoretical derivations. To furtherminimize the impact of random variations in those Monte Carlo experiments,all experiments are repeated for 100 times, and the mean, maximum, and stan-dard deviation of the ARE reported in Table 1 are computed from these 100repetitions, each consists of 10,000 randomly sampled Q on the unit sphere.From Table 1, we find that the theoretical and empirical moments of L andcos ( L ) match well in most of the cases. Specifically, the mean AREs for lowerorder moments of L across different scenarios stay lower than 0.01; for higherorder moments the mean AREs are all lower than 0.02. The max AREs formoments of L are a little higher, but most of them are still lower than 0.05,except the 4th order moments of L in Sim2 (0.064).Estimated moments of cosine similarity have somewhat lower but acceptableaccuracy. With a few exceptions, most of E cos k L have mean and max ARElower than 0.05. We believe the inflated ARE of these moments is largely dueto the fact that the theoretical formulae of E cos k L involves dozens of terms(e.g., Equation (B.16)), so the total numerical error, which is the sum of errorsin all terms, becomes larger than that of E ( L k ) .Of note, the standard deviations of AREs are relatively low across all sce-narios and moments: for most of the cases, STD ( ARE ) ⩽ .
01. It implies thatthe reported accuracy is stable across 100 repetitions.
6. Conclusion & Discussion
In this paper, we first derive the CDF and PDF of L ( ∆ P BC ) , the arcdistance between P , a vertex of the spherical triangle ∆ P BC , to a uniformlydistributed random interior point Q . Using a technique based on polygon de-composition, we are able to extend these results to cases in which the spherical16 easurement E ( L ) E ( L ) E ( L ) E ( L ) E ( cos L ) E ( cos L ) E ( cos L ) Sim1 Proposed Method 0.176 0.039 0.010 0.003 0.962 0.926 0.893Mean MC 0.175 0.039 0.010 0.003 0.944 0.960 0.845Mean ARE 0.005 0.009 0.013 0.017 0.019 0.037 0.054Max ARE 0.014 0.026 0.038 0.050 0.027 0.050 0.071STD ARE 0.003 0.006 0.009 0.012 0.002 0.004 0.006Sim2 Proposed Method 0.194 0.050 0.016 0.006 0.952 0.908 0.870Mean MC 0.195 0.050 0.016 0.006 0.980 0.958 0.938Mean ARE 0.005 0.009 0.014 0.020 0.029 0.055 0.078Max ARE 0.020 0.035 0.045 0.064 0.043 0.077 0.106STD ARE 0.004 0.007 0.010 0.015 0.004 0.007 0.010Sim3 Proposed Method 0.380 0.208 0.133 0.091 0.820 0.704 0.625Mean MC 0.378 0.206 0.132 0.090 0.807 0.684 0.601Mean ARE 0.006 0.008 0.009 0.011 0.016 0.028 0.038Max ARE 0.013 0.017 0.018 0.019 0.024 0.041 0.055STD ARE 0.005 0.007 0.007 0.008 0.005 0.008 0.009
Table 1: Simulation results show that our proposed theoretical results matches well with theempirical estimation for moments of L and cos L in all three scenarios. The mean AREsfor lower order moments of L across different scenarios stay lower than 0.01; for higher ordermoments the mean AREs are all lower than 0.02. The max AREs for moments of L area little higher, but most of them are still lower than 0.05, except the 4th order moments of L in Sim2 (0.064). The estimation for cos L is less accurate, but still have low mean andmax ARE for lower order moments. Most of the estimation for cos L have mean and maxARE lower than 0.05. In addition, the accuracy of the estimations is stable across differentscenarios, with standard deviation of ARE ≤ L ( Γ ) ). Using methodsdeveloped for spherical Voronoi diagrams and integration techniques, we areable to compute arbitrary moments of L ({ P , . . . , P m }) , the shortest arc dis-tance from a uniformly distributed random point Q to a set of pre-defined seeds { P , . . . , P m } on a sphere. Finally, we compute moments for L and cos L ,which are both widely used in various applications.We conduct Monte Carlo simulations to validate our theoretical results. Byand large, the theoretical and the empirical results are consistent. However,higher order moments for both L and cos L have larger AREs than the lowermoments. This is not beyond our expectation, since larger samples are requiredin order to obtain estimates of similar quality for higher order moments [35].Note that the assumption of convexity of spherical polygon in deriving theCDF and PDF of L is made out of convenience, not necessity. We believe theCDF/PDF of L P Q , for Q uniformly distributed in an arbitrary spherical triangle∆ of which P ∉ ∆, can be derived based on techniques we deployed in this study.Since it is always possible to divide a non-convex spherical polygon into disjointspherical triangles (not necessarily pivoting at a pre-specified interior point P if it is not convex), we can aggregate the CDF/PDF with exterior point P to obtain the desired formula for a non-convex polygon. Furthermore, everysmooth geometric object O on S can be approximated by spherical polygonstherefore a disjoint union of spherical triangles ∆ j . Therefore, the distributionof the arc distance between an arbitrary pre-specified point P ∈ S and Q uniformly distributed on O can be approximated by our method. How to achievethe best accuracy of the said approximation with a fixed number of sphericaltriangles/polygons, therefore attain the best trade-off between computationalcost and accuracy, is an interesting research topic that warrants a future study.We think our work has tangible practical impact in many applications. Thestatistical distribution of the minimal distance between random points and apre-specified seeds (sites) is used in wireless communications, forest ecology,etc. Most existing studies in this field are based on the Voronoi diagram in R . For large-scale studies, it may be necessary to model the domain as a18phere because the curvature of earth may become non-negligible. Our resultshas potential to significantly improve the accuracy of the results under thesescenarios.Last but not least, many intermediate results (e.g., the distribution of L and L ) and mathematical techniques (e.g., the partition of the spherical trianglesand their classifications) we developed in this study may help other researchersstudy similar problems in the future. References [1] J. M. Augenbaum, C. S. Peskin, On the construction of the voronoi mesh ona sphere, Journal of Computational Physics 59 (2) (1985) 177–192 (1985).[2] H. Edelsbrunner, R. Seidel, Voronoi diagrams and arrangements, Discrete& Computational Geometry 1 (1) (1986) 25–44 (1986).[3] F. Aurenhammer, Voronoi diagrams—a survey of a fundamental geometricdata structure, ACM Computing Surveys (CSUR) 23 (3) (1991) 345–405(1991).[4] M. de Berg, M. Van Kreveld, O. Schwarzkopf, M. Overmars, Computationalgeometry algorithms and applications, 2nd Edition, Springer, Germany,2000 (2000).[5] O. Persson, Distance methods: the use of distance measurements in theestimation of seedling density and open space frequency, Studia forestaliasuecica (15) (1964) 1–68 (1964).[6] N. Christofides, S. Eilon, Expected distances in distribution problems,Journal of the Operational Research Society 20 (4) (1969) 437–443 (1969).[7] I. I. Vaisman, Statistical and computational geometry of biomolecularstructure, in: Handbook of Computational Statistics, Springer, 2012, pp.1095–1112 (2012). 198] G. Zhao, K. Xuan, W. Rahayu, D. Taniar, M. Safar, M. Gavrilova, B. Srini-vasan, Voronoi-based continuous nearest neighbor search in mobile naviga-tion, Industrial Electronics, IEEE Transactions on 58 (6) (2011) 2247–2257(2011).[9] J. K¨oll, S. Hallstr¨om, Generation of periodic stochastic foam models fornumerical analysis, Journal of cellular plastics 50 (1) (2014) 37–54 (2014).[10] J. Lin, Divergence measures based on the shannon entropy, IEEE Transac-tions on Information theory 37 (1) (1991) 145–151 (1991).[11] A. Villagran, G. Huerta, M. Vannucci, C. S. Jackson, A. Nosedal, Non-parametric sampling approximation via voronoi tessellations, Communica-tions in Statistics - Simulation and Computation 45 (2) (2016) 717–736(2016).[12] A. Okabe, B. Boots, K. Sugihara, S. N. Chiu, Spatial tessellations: conceptsand applications of Voronoi diagrams, Vol. 501, John Wiley & Sons, 2009(2009).[13] H. Li, X. Qiu, Moments of distance from a vertex to a uniformly distributedrandom point within arbitrary triangles, Mathematical Problems in Engi-neering 2016 (2016).[14] F. Aurenhammer, H. Edelsbrunner, An optimal algorithm for constructingthe weighted voronoi diagram in the plane, Pattern Recognition 17 (2)(1984) 251–257 (1984).[15] R. Pure, S. Durrani, Computing exact closed-form distance distributionsin arbitrarily shaped polygons with arbitrary reference point, The Mathe-matica Journal (2015).URL dx.doi.org/doi:10.3888/tmj.17-6 [16] M. Ahmadi, J. Pan, Random distances associated with arbitrary triangles:A recursive approach with an arbitrary reference point, UVicSpace (2014).URL http://hdl.handle.net/1828/5134 ppendix A. Definitions of useful geometric concepts of sphericaltriangles L AB = Rϕ AB (A.1)The area of the spherical cap, as shown in Fig. 2(a), is A cap = πR ( − cos θ ) (A.2)The area of the spherical sector, as shown in Fig. 2(b), is A fan = αR ( − cos θ ) (A.3) Appendix B. Moments of Cosine Similarity
With the PDF of L provided in Equation (15), we can calculate the math-ematical expectation of cos L as follows. E ( cos ( r )) = ∫ ∞−∞ cos ( r ) f ( r ) dr = (∫ h cos ( r ) ⋅ α P sin rdr + ∫ L PB h cos ( r ) ⋅ { α sin r + η [ α ′ r cos r + β ′ r + ( α − α r ) sin r ]} dr + ∫ L PC h PB cos ( r ) ⋅ [ β ′ r + ω ′ r cos r − ω r sin r ] dr )= ( Term + Term + Term ) . (B.1)The first two terms in the above equation are derived as follows.Term = ∫ h cos ( r ) ⋅ α P sin rdr = − α P ⋅ cos ( r )∣ h = − α P ⋅ ( cos ( h ) − ) = α P ( − cos ( h )) . (B.2)23erm = ∫ L PB h cos ( r ) ⋅ { α sin r + η [ α ′ r cos r + β ′ r + ( α − α r ) sin r ]} dr = ∫ L PB h cos ( r ) ⋅ α sin rdr + ηI η = α [ cos ( h ) − cos ( h P B )] + ηI η . (B.3)Here I η = ∫ L PB h cos ( r )[ α ′ r cos r + β ′ r + ( α − α r ) sin r ] dr = ∫ L PB h ( cos rα ′ r − cos r sin rα r ) dr + ∫ L PB h cos rβ ′ r dr + ∫ L PB h α cos r sin rdr = I η, + I η, + I η, , (B.4)where I η, = ∫ L PB h ( cos rα ′ r − cos r sin rα r ) dr = α r cos r ∣ L PB h + ∫ L PB h cos rα ′ r dr = [ arccos tan h tan L PB cos L PB − ] + ∫ L PB h tan h cos r sin r √ tan r − tan h dr =
13 arccos tan h tan L PB cos L PB + ⋅ I η, p . (B.5) I η, p = ∫ L PB h tan h cos r sin r √ tan r − tan h dr = tan h ( ∫ L PB h r √ tan r − tan h dr − ∫ L PB h sin r √ tan r − tan h dr )= −
12 arctan 2 tan h √ tan r − tan h √ tan r + r − tan r tan h − h ∣ L PB h − √ tan r − tan h tan h ( + tan h )√ + tan r ∣ L PB h = −
12 arctan 2 tan h √ tan L PB − tan h √ tan L PB + L PB − tan L PB tan h − h − tan h √ tan L PB − tan h ( + tan h )√ + tan L PB . (B.6) Note that the following two technical results are needed in deriving the above24ntegral. Let u = tan r , a = tan h , we have ∫ r √ tan r − tan h dr = ∫ u √ u − a √ + u du = √− a a ⎡⎢⎢⎢⎢⎣ ln √ u + − √− a − √ u − a − ln (√− a )+ √ u + − √− a − √ u − a − a (√ u + − )√− a − √ u − a + √− a (√ u + − ) (√− a − √ u − a ) ⎤⎥⎥⎥⎥⎦= i a ln u − u a − a + i ⋅ a √ u − a √ u + u ( a + ) = − a arctan 2 a √ u − a √ u + u − u a − a + Const , Const ∶= ⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩− i a ln ( a + ) , u − u a − a > , − i a ln ( a + ) − π a , u − u a − a < . (B.7) Note that only the first term of the results in Equation (B.7) depends on variable u ∶= tan r , so the choice of Const is irrelevant in computing the correspondingdefinite integral used in Equation (B.6).Again let u = tan r , b = tan h , we have ∫ sin r √ tan r − tan h dr = ∫ √ u − b ( + u ) / du = √ u − b ( b + )√ u + = √ tan r − tan h ( + tan h )√ + tan r . (B.8)The rest two terms, I η, , I η, are provided as follows. I η, = ∫ L PB h cos rβ ′ r dr = − ∫ L PB h sin h cos r sin r √ sin r − sin h dr = −( arctan √ sin r − sin h sin h − sin h √ sin r − sin h )∣ L PB h = sin h √ sin L P B − sin h − arctan √ sin L P B − sin h sin h . (B.9) I η, = ∫ L PB h α cos r sin rdr = − α cos r ∣ L PB h = α ( cos h − cos L P B ) . (B.10)25n summary, we have I η = I η, + I η, + I η, =
13 arccos tan h tan L PB cos L PB −
13 arctan 2 tan h √ tan L PB − tan h √ tan L PB + L PB − tan L PB tan h − h − h √ tan L PB − tan h ( + tan h )√ + tan L PB + sin h √ sin L PB − sin h − arctan √ sin L PB − sin h sin h + α ( cos h − cos L PB ) . (B.11) Term is derived as follows. Term = ∫ L PC L PB cos ( r ) ⋅ [ β ′ r + ω ′ r cos r − ω r sin r ] dr = − ∫ L PC L PB sin h cos r sin r √ sin r − sin h dr + ω r cos r ∣ L PC L PB − ∫ L PC L PB ω ′ r cos rdr = sin h √ sin L PC − sin h − arctan √ sin L PC − sin h sin h − sin h √ sin L PB − sin h + arctan √ sin L PB − sin h sin h + [ cos L PB arccos tan h tan L PB − cos L PC arccos tan h tan L PC ]+ ∫ L PC L PB cos rα ′ r dr = sin h √ sin L PC − sin h − arctan √ sin L PC − sin h sin h − sin h √ sin L PB − sin h + arctan √ sin L PB − sin h sin h +
13 cos L PB arccos tan h tan L PB −
13 cos L PC arccos tan h tan L PC +
23 Term . (B.12) Using Equation (B.6), we haveTerm = ∫ L PC L PB cos rα ′ r dr = ∫ L PC L PB tan h cos r sin r √ tan r − tan h dr =
12 arctan 2 tan h √ tan L P B − tan h √ tan L P B + L P B − tan L P B tan h − h −
12 arctan 2 tan h √ tan L P C − tan h √ tan L P C + L P C − tan L P C tan h − h − tan h √ tan L P C − tan h ( + tan h )√ + tan L P C + tan h √ tan L P B − tan h ( + tan h )√ + tan L P B . (B.13)26n summary, we have E ( cos ( r )) = ( Term + Term + Term )= α P ( − cos ( h )) + α [ cos ( h ) − cos ( h PB )] + ηI η + sin h √ sin L PC − sin h − arctan √ sin L PC − sin h sin h − sin h √ sin L PB − sin h + arctan √ sin L PB − sin h sin h +
13 cos L PB arccos tan h tan L PB −
13 cos L PC arccos tan h tan L PC +
23 Term = α P ( − cos ( h )) + α [ cos ( h ) − cos ( h PB )]+ (
13 arccos tan h tan L PB cos L PB −
13 arctan 2 tan h √ tan L PB − tan h √ tan L PB + L PB − tan L PB tan h − h − h √ tan L PB − tan h ( + tan h )√ + tan L PB + sin h √ sin L PB − sin h − arctan √ sin L PB − sin h sin h + α ( cos h − cos L PB ))+ sin h √ sin L PC − sin h − arctan √ sin L PC − sin h sin h − sin h √ sin L PB − sin h + arctan √ sin L PB − sin h sin h +
13 cos L PB arccos tan h tan L PB −
13 cos L PC arccos tan h tan L PC +
13 arctan 2 tan h √ tan L PB − tan h √ tan L PB + L PB − tan L PB tan h − h −
13 arctan 2 tan h √ tan L PC − tan h √ tan L PC + L PC − tan L PC tan h − h − h √ tan L PC − tan h ( + tan h )√ + tan L PC + h √ tan L PB − tan h ( + tan h )√ + tan L PB . (B.14) r and cos r as follows. E ( cos L ) = ∫ +∞−∞ cos r ⋅ f ( r ) dr = ( ∫ h cos r ⋅ αα p sin rdr + ∫ LPBh cos r ⋅ { α sin r + η [ α ′ cos r + β ′ + ( α − α r ) sin r ]} dr + ∫ LPCLPB cos r ⋅ [ β ′ + ω ′ cos r − ω r sin r ] dr )= ⎛⎝ (∠ MP C cos ( L PC ) − (∠ MP C + α η + α ) cos ( L PB ) + ( α η − α p + α ) cos h + α p )+ η −
65 arctan ⎛⎝ √ cos h − cos ( L PB ) sin h ⎞⎠ +
65 arctan ⎛⎝ √ cos h − cos ( L PC ) sin h ⎞⎠+ η −
615 sin h ( sin ( L PB ) − h − ) √ cos h − cos ( L PB )+
25 sin h ( sin ( L PC ) − h − ) √ cos h − cos ( L PC )+ ( η − ) arcsin ( sin h ∣ sin ( L PB )∣ ) + arcsin ( sin h ∣ sin ( L PC )∣ ) − η arcsin ( sin h ∣ sin h ∣ )− ( cos ( L PC ) arccos ( tan h tan ( L PC ) ) − ( + η ) cos ( L PB ) arccos ( tan h tan ( L PB ) ))− η −
13 sin h √( sin ( L PB ) − sin h ) ( sin ( L PB ) + sin h ) ( sin ( L PB ) + h − )−
13 sin h √( sin ( L PC ) − sin h ) ( sin ( L PC ) + sin h ) ( sin ( L PC ) + h − ) ⎞⎠ . (B.15) ( c o s r ) = ∫ + ∞ − ∞ c o s r ⋅ f ( r ) d r = Ω ( ∫ h c o s r ⋅ αα p s i n r d r + ∫ L P B h c o s r ⋅ { α s i n r + η [ α ′ c o s r + β ′ + ( α − α r ) s i n r ] } d r + ∫ L P C L P B c o s r ⋅ [ β ′ + ω ′ c o s r − ω r s i n r ] d r ) = Ω ⎛ ⎝ ( c o s h ( α η − α p + α ) − c o s ( L P B )( α η − ∠ M P C + α ) + ∠ M P C c o s ( L P C ) + α p ) + η − r c t a n ⎛ ⎝ √ c o s h − c o s ( L P B ) s i n h ⎞ ⎠ + r c t a n ⎛ ⎝ √ c o s h − c o s ( L P C ) s i n h ⎞ ⎠ − η − s i n h √ c o s h − c o s ( L P B ) ( s i n ( L P B ) ( s i n ( L P B ) − c o s h − ) + c o s h + c o s h + ) − s i n h √ c o s h − c o s ( L P C ) ( s i n ( L P C ) ( s i n ( L P C ) − c o s h − ) + c o s h + c o s h + ) + ( η − ) a r c s i n ( s i n h ∣ s i n ( L P B ) ∣ ) + a r c s i n ( s i n h ∣ s i n ( L P C ) ∣ ) − η a r c s i n ( s i n h ∣ s i n h ∣ ) + ( ( + η ) c o s ( L P B ) a r cc o s ( t a n h t a n ( L P B ) ) − c o s ( L P C ) a r cc o s ( t a n h t a n ( L P C ) )) + η − s i n h √ ( s i n ( L P B ) − s i n h )( s i n ( L P B ) + ( s i n h − ) s i n ( L P B ) + s i n h − s i n h + ) + s i n h √ ( s i n ( L P C ) − s i n h )( s i n ( L P C ) + ( s i n h − ) s i n ( L P C ) + s i n h − s i n h + ) ⎞ ⎠ . ( B . ))