On the Precision to Sort Line-Quadric Intersections
CCCCG 2016, Vancouver, British Columbia, August 3–5, 2016
On the Precision to Sort Line-Quadric Intersections
Michael Deakin Jack Snoeyink ∗ Abstract
To support exactly tracking a neutron moving along agiven line segment through a CAD model with quadricsurfaces, this paper considers the arithmetic precisionrequired to compute the order of intersection points oftwo quadrics along the line segment. When the ordersof all but one pair of intersections are known, we showthat a resultant can resolve the order of the remainingpair using only half the precision that may be requiredto eliminate radicals by repeated squaring. We comparethe time and accuracy of our technique with convertingto extended precision to calculate roots.
In this work, we are concerned with ordering the pointsof line-quadric intersections in 3 dimensions, where theinputs are representable exactly using w -bit fixed-pointnumbers. We will actually use floating point in stor-age and computation, but our guarantees will be forwell-scaled inputs, which are easiest described as fixed-point. A representable point q or representable vector v is a 3-tuple of representable numbers ( x, y, z ). The linesegment from point q to q + v is defined parametricallyfor t ∈ [0 ,
1] as ‘ ( t ) = q + tv ; note that there may be norepresentable points on line ‘ except its endpoints (andeven q + v may not be representable, if the additioncarries to w + 1 bits.)A quadratic is an implicit surface defined by its 10representable coefficients, Q ( x, y, z ) = q xx x + q xy xy + q xz xz + q x x + . . . + q zz z + q z z + q c = 0 . For more accuracy, we can allow more precision forthe linear and quadratic coefficients, since we will need3 w bits to exactly multiply out the quadratic terms, orwe can use a representable symmetric 3 × M , arepresentable vector v , and a 3 w -bit constant R to givea different set of quadrics ˜ Q ( p ) = ( p − v ) T M ( p − v ) = R that is closed under representable translations of v .Whichever definition of quadrics is chosen, the param-eter values for line-quadric intersections are the rootsof Q ( ‘ ( t )) = 0, which can be expressed as a quadratic at + 2 bt + c = 0 whose coefficients can have at most ∗ School of Computer Science, University of North Carolina atChapel Hill, [email protected] , [email protected] w + 4 bits. (Four carry bits suffice to sum the 3 w -bitproducts; w = 16 allows exact coefficient computationas IEEE 754 doubles; w = 33 as pairs of doubles.)These definitions are motivated by a problem fromDavid Griesheimer, of Bettis Labs: rather than track-ing a particle through quadric surfaces in a CAD model,would it be more robust to compute the intervals of in-tersections with a segment? We compare three meth-ods to order line-quadric intersections. Our methods,particularly the third, are developed and tested for thecase where only one pair of roots has a difference that ispotentially overwhelmed by the rounding errors in thecomputation. We comment at the end how to handlepairs of quadric surfaces that have more than one pairof ambiguous roots. This section outlines three methods—ApproximateComparison, Repeated Squaring, and Resultant—tosort the intersections with two quadrics, Q and Q ,with a given line ‘ ( t ), or equivalently, the roots of twoquadratics, a t + 2 b t + c = 0 and a t + 2 b t + c =0. For each, we evaluate correctness, precision, andfloating-point arithmetic operations (FLOPs) required. The approximate comparison method computes, for i ∈ { , } , the roots r ± i = ( b i ± p b i − a i c i ) /a i ap-proximately by computing each operation in IEEE 754double precision or in extended precision. Actually, toavoid subtractive cancellation, we calculate one of thetwo roots as r − sign b i i = − c i / ( b i + (sign b i ) p b i − a i c i ).The order of any two chosen approximate roots can becalculated exactly as sign( r ± − r ± ).The rounding of floating point arithmetic means thateven with representable input, the correct order is notguaranteed unless we establish a gap or separation the-orem (which are also established using resultants [1, 2])and compute with sufficient precision. Determining thisprecision is a longstanding open problem [ ? ]. Withouta guarantee, this method requires very little computa-tion. Computing both roots takes 12 FLOPs, with onemore to compute the sign of the difference. Moreover,the roots can be reused in a scene of many quadrics.We also use extended precision, where the multipli-cations and addition in the discriminants are calculated a r X i v : . [ c s . C G ] M a y th Canadian Conference on Computational Geometry, 2016 with 6 w bits, square root and addition at 12 w bits, anddivisions at 24 w bits. To actually perform the compar-ison, one final subtraction is required at 24 times theinitial precision – 1 FLOP, with an initialization cost of10 FLOPs per quadric intersecting the line. The repeated squaring method computes sign( r ± − r ± )by algebraic manipulations to eliminate division andsquare root operations, leaving multiplications and ad-ditions whose precision requirements can be bounded.It uses, for x = 0, the property that sign( y ) =sign( x ) sign( x · y ). Divisions can be removed directly,since sign( r ± − r ± ) = sign( a a ) sign( a a ( r ± − r ± )).One square root can be eliminated by multiplyingby r ± − r ∓ , giving sign( a a ) sign( a a ( r ± − r ∓ )) · sign( a a ( r ± − r ± )( r ± − r ∓ )). When simplified, thefinal sign is computed from a b − a a c + 2 a a c − a a b b ± p ( a a b − a b ) ( b − a c ).The expression under the radical is correctly com-puted with 8 × the input precision; the remaining ex-pression can be evaluated to a little more than 4 × inputprecision in floating point, or can be evaluated in fixedpoint in 8 × input precision by isolating the radical andsquaring one last time.This method not only requires high precision, but alsoa large number of FLOPs. Computing the unambigu-ous sign of the difference of the roots requires 15 FLOPstotal, and correctly computing the final sign requiresanother 24 FLOPs. Unfortunately, many of the com-puted terms require coefficients from both polynomials;only the discriminants, squares, and products can pre-computed, which reduces the number of FLOPs by 14.This brings us to 25 FLOPs per comparison, with aninitialization cost of 14 FLOPs per quadric.Note that this method uses our assumption that weknow sign( r ± − r ∓ ) when computing sign( r ± − r ± ), butwe can learn this from a lower precision test against − b /a , since r − ≤ − b /a ≤ r +2 . This method was previously described in [ ? ], but a de-scription is included here for completeness.The resultant method computes the order of twointersections from the resultant for their polynomi-als, which can be written as the determinant of theirSylvester Matrix [4, Section 3.5]. The general SylvesterMatrix for polynomials P ( t ) = p m t m + · · · + p and Q ( t ) = q n t n + · · · + q is defined as in Equation1. res ( P, Q ) = p m . . . p p m . . . p
0. . . . . .0 0 p m . . . p q n . . . q q n . . . q
0. . . . . .0 q n . . . q (1)The resultant is also the product of the differences of P ’s roots, a , . . . , a n , and Q ’s roots, b , . . . , b m , as inEquation 2. [4, Section 6.4] res ( P, Q ) = p nm q mn m Y i =1 n Y j =1 ( a i − b j ) (2)The two expressions for the resultant provide us withanother method of computing the sign of one of the dif-ferences of the two roots. Under our assumption thatwe know the order of all pairs or roots except, say, a and b , we can compute sign( a − b ) from the determi-nant and known signs, as in Equation 3 at the top of thenext page. The signs need not be multiplied; we simplycount the negatives. With quadratics, m = n = 2, sothe signs of the leading p and q will be positive andcan be ignored.The determinant can be computed with half the pre-cision and fewer floating point operations than repeatedsquaring to correctly compute the sign of the differencesof roots of the polynomials.Computing a general 4 × × We experimentally evaluated the resultant method andthe approximate computation method with both ma-chine precision and extended precision. RepeatedSquaring is dominated by the other methods so was nottested.We created two types of test scenes that had touchingsurfaces so that random lines might have some chance(albeit small) to give incorrect orders under approxima-tion, and count the number of disagreements. We evalu-ated time per comparison for each method on computerswith different processors. Finally, by varying the num-ber of surfaces in the second type of scene, we could use
CCG 2016, Vancouver, British Columbia, August 3–5, 2016 sign( a − b ) = sign( res ( P, Q )) sign( p nm ) sign( q mn ) m Y i =2 n Y j =2 [sign( a i − b j ) sign( a − b j ) sign( a i − b )] (3)∆ = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) a b c a b c a b c a b c (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = a c + c a + b a c + b a c − b c a b − a b b c − a c a c α i = a i , γ i = c i , δ i = a i b i , (cid:15) i = a i c i , ζ i = b i c i , D i = b i − (cid:15) i , i ∈ , α γ + γ α + D (cid:15) + (cid:15) D − ζ δ − δ ζ (5)linear regression to determine the contribution to run-ning time from per quadric and per comparison terms. All methods were implemented in C++, and were testedby computing the line-quadric intersection orders alongrandom lines in scenes of quadric surfaces. The cre-ation of these lines and quadric surfaces is describedin the next subsection. Machine precision tests wereperformed in IEEE 754, with quadratic coefficients anddiscriminants stored as single precision floats, with allmachine precision computations performed as floats.MPFR[3] was used to support arbitrary precision inboth the approximate comparison and the resultantmethods. The approximate comparison method used24 × the precision of a float. The resultant comparisonmethod also used 24 × the precision of a float, to accountfor the range of exponents in the inputs.The first step of the evaluation for a line ‘ and quadric Q was to determined if there was a real intersectionby evaluating the discriminant of the quadratic p ( t ) = Q ( ‘ ( t )). This evaluation was done in machine precision,so there is a small chance that near tangent intersectionsmay have been missed due to numeric error in calculat-ing the discriminant. (In our application, missing neartangent intersections was allowed, but getting orderswrong had been known to trap particles into repeatedlytrying to cross the same pair of surfaces, which tends toworry a physicist.)If the intersections are deemed to exist, the secondstep is to compute the roots at machine precision. Theseroots are needed to determine if the order of a pairof intersections is ambiguous or not. Finally, the stlsort algorithm is used to sort the intersections. Thefull process was timed in nanoseconds with the POSIXclock_gettime function.The comparison function used for sorting came fromthe method being evaluated. The machine precision ap-proximate comparison just returns the difference of the previously computed roots. In the increased precisionapproximation and the resultant method, the differenceof the roots is compared against a threshold. If the dif-ference was smaller than a threshold of 2 − , the moreaccurate method provided is used to determine the or-der, and an appropriate value is returned. This occurredinfrequently for a random line, and is only expected tooccur a few times for every 100k lines.We ran tests on two computers with different speedsand operating systems; we name them by their operat-ing systems. Arch was a Core i3 M370 processor with 2 cores, a 3MB cache, and 4 GB of DDR3 memory clocked at 1GHz. It ran an up-to-date installation of Arch Linux,kernel version 4.4, and GCC 6.0 was used to compilethe code. For the tests, the performance manager wasset to keep the CPU clock at 2.4 GHz, and the processwas run with a nice value of − Gentoo was a Core 2 Duo E6550 processor with twocores, a 4 MB cache, and 8 GB of DDR2 memory clockedat 667 MHz. It ran an up-to-date installation of GentooLinux, kernel version 4.1 and GCC 4.9 was used to com-pile the code. For the tests, the performance managerwas set to keep the CPU clock at 2.3 GHz, with a nicevalue of − We created two types of test scenes: a single scene ofPacked Spheres and a set of scenes of Nested Spheres.The test scenes consisted of quadric surfaces stored asIEEE754 single precision floating point numbers. Wepreferred spheres and ellipsoids, since any intersectingline would intersect twice, possibly with a repeated root.Sorting isolated single roots is easier, since, for example,the intersection with a plane requires less precision. The8 th Canadian Conference on Computational Geometry, 2016 quadric surfaces were constructed from the unit cubethat has one corner at the origin and the opposite cornerat (1 . , . , . .
05 units, and are spacedabout 0 .
05 units from each other. The initial sphere iscentered at the origin, and one of the axes of the latticeis aligned with the y axis of the coordinate frame. Thecoefficients of the spheres are scaled so that the coeffi-cients of the squared terms were all 1 .
0. This causedthe exponent range for the non-zero coefficients of thespheres to be between − − . , − . , − .
0) and (1 . , . , . n = 10 spheres, the others had n = 100 i , for 1 ≤ i ≤
10. The first sphere was cen-tered at x = 0 . , y = 0 . , z = 0 . R = 0 . R n = 2 − units. Thus, R i = R i − − ( R − R n ) /n .The x position of successive spheres increased linearlyto fix the minimum distance at (cid:15) = 2 − units. Thus, x i = x i − +( R − R n ) /n − (cid:15) . The exponent range for thenon-zero coefficients of the spheres was chosen to be be-tween − p i from a uni-form distribution over the unit cube. The directionswere set as (1 . , . , . − p i , where (1 . , . , .
5) is apoint very close to the points of minimum distance forthe sets of spheres. This made it very probable thatincreased precision would be required to correctly com-pute the order of intersections. To ensure that we areable to compute the order of intersections exactly withthe resultant method, the exponents of the non-zeroterms were constrained between -20 and 0.
The time that it takes to compute the order of inter-sections between a given line and a scene of quadricsurfaces is expected to be linear in both the number of Figure 1: Test Scenes of 1331 Packed Spheres and 10Nested Spheres, which is smallest of a family of eleven.Random lines in Packed Spheres have some chance ofbeing near sphere contacts. Random lines in NestedSpheres are unlikely to, unless they are biased to passby the near tangency.
CCG 2016, Vancouver, British Columbia, August 3–5, 2016 quadric surfaces and the number of accurate compar-isons made. Because performing accurate comparisonsis so much more expensive than normal comparisons,we expect there to be a clear linear relation betweenthe number of accurate comparisons performed and thetime it takes to perform the sorting.The number of quadrics, on the other hand, can sig-nificantly affect the number of intersections in the listto be sorted, especially in antagonistic scenes. However,most of the time spent sorting will be accounted forby the time spent making accurate comparisons, whichwe have already accounted for. Thus, the remainingtime will instead come from computing the approximateroots, which is linear.To analyze the Packed Spheres timing data, we usedleast squares to fit a line to the number of comparisonsmade and the timing data. A constant term was alsocomputed for the time taken computing the approxi-mate roots.To analyze the set of Nested Spheres scenes, we ag-gregated the test results for the scenes so that we coulduse least squares to fit a plane to the number of com-parisons made, the number of quadric surfaces, and thetiming data. A constant term was also computed tocatch any hidden initialization costs, though we expectthis to contain mostly noise.
The results of the experiments are shown in Table 1.The first thing to notice is that increasing the preci-sion of a computation is not enough to guarantee thatthe result will be computed correctly. Despite increas-ing the precision of the computations to 24 × the ini-tial precision, the increased precision approximationstill fails for 1044 / / th test.In addition to guaranteeing correctness, the resul-tant method also performed well against the generic in-creased precision method. For the set of Nested Spheresscenes, it cost slightly more to compute the order of in-tersections on a time per quadric basis. The approx-imate computation with increased precision can cacheintermediate values more effectively, reducing its cost.The resultant method performed extremely well onthe time per comparison basis, as it actually beat the in-creased precision method by more than it lost out on inthe time per quadric basis in the Nested Spheres scenes,and the Packed Spheres scene on the Gentoo machine.After removing the time per quadric basis in the Figure 2: Evaluation Time for a Line (ms) vs. the Num-ber of Comparisons; Sorting the intersections of 1k linesin each of the Nested Spheres test scenes on the Gentoomachine; Red Dot’s (above the bars): Approxi-mation Method at × the Input Precision ; BlueBars (beneath the dots): Resultant Method . Theleast squares coefficient for the time per quadrics hasbeen subtracted out to better show the actual fit. Thelines show the respective least squares fits without thequadric term.tests with the Nested Spheres scenes, the constant termappears somewhat nonsensical. From previous experi-ments, we have concluded that this is mostly noise, sug-gesting that we obtained most of the useful informationfrom the measured times. This suggests the time perquadric is the main contributor to the constant time inthe tests with the Packed Spheres scene as we expected.Figure 2 shows a plot of the results from one of thetests. It appears to confirm our expectation that thetime required is linearly coorelated with the number ofprecision increases.
In this paper we showed how the resultant method canguarantee the correct order of line-quadric intersectionsat a similar cost to using an increased precision approxi-mation method. We have also shown that naively usingincreased precision to improve accuracy is not enoughto eliminate errors, and that one must take into accountthe operations being used and the ranges of the input.We have assumed that we know the order of all rootsexcept one pair. Even if one’s application does not pro-vide this information, for quadratic equations it is rela-tively easy to obtain using lower precision than it takes8 th Canadian Conference on Computational Geometry, 2016
Table 1: Analysis of the timing of the Approximate Comparison and Resultant Comparison. Timing data for 11klines was analyzed for the Packed Spheres scene to find the coefficients of the best fitting lines. Timing data for 1klines was analyzed for each of the set of 11 Nested Spheres scenes to find the coefficients of the best fitting planes.The dimensions are the number of quadric surfaces and the number of increased precision comparisons made.Scene Machine Method Errors ms/Quadric ms/Comp Const ms P Residual (ms )Nested Arch Approximate 8272 0.00425 0.000361 -0.0693 149.084Spheres Increased Prec. 1044 0.00554 0.105 -0.567 87655.1Resultant — 0.00670 0.100 -0.746 80544.6Gentoo Approximate 8244 0.00379 0.000313 -0.0519 34.4705Increased Prec. 1042 0.00484 0.146 -0.110 11944.9Resultant — 0.00584 0.141 0.00485 19872.5Packed Arch Approximate 0 – 0.00738 4.54 21.7059Spheres Increased Prec. 0 – 0.126 4.49 22.4387Resultant — – 0.130 4.51 23.5822Gentoo Approximate 0 – 0.00180 4.37 3.75176Increased Prec. 0 – 0.156 4.37 3.76225Resultant — – 0.155 4.41 3.83604to compare roots. The zero of the derivative x i = − b i /a i separates r − i and r + i by value of the discriminant. If x = x then comparing squared discriminants tells usall we need to know about root orders. When, wlog, x < x , we use the signs of both quadratics at x and x to bound roots to intervals, and can again comparesquared discriminants to reveal the order for all but onepair. We thank David Griesheimer for discussions on thisproblem, and both NSF and Bettis Labs for their sup-port of this research.
References [1] W Dale Brownawell and Chee K Yap. Lower boundsfor zero-dimensional projections. In
Proc. Int’l Sympon Symbolic and Algebraic Computation , pages 79–86. ACM, 2009.[2] Ioannis Z Emiris, Bernard Mourrain, and Elias PTsigaridas. The DMM bound: Multivariate (ag-gregate) separation bounds. In
Proc. Int’l Sympon Symbolic and Algebraic Computation , pages243–250. ACM, 2010. http://arxiv.org/abs/1005.5610 .[3] Laurent Fousse, Guillaume Hanrot, Vincent Lefèvre,Patrick Pélissier, and Paul Zimmermann. Mpfr: Amultiple-precision binary floating-point library withcorrect rounding.
ACM Trans. Math. Softw. , 33(2),June 2007.[4] Chee-Keng Yap.