[PDF] volesti: Volume Approximation and Sampling for Convex Polytopes in R

Abstract

Sampling from high dimensional distributions and volume approximation of convex bodies are fundamental operations that appear in optimization, finance, engineering and machine learning. In this paper we present volesti, a C++ package with an R interface that provides efficient, scalable algorithms for volume estimation, uniform and Gaussian sampling from convex polytopes. volesti scales to hundreds of dimensions, handles efficiently three different types of polyhedra and provides non existing sampling routines to R. We demonstrate the power of volesti by solving several challenging problems using the R language.

Full PDF

vvolesti : Volume Approximation and Sampling forConvex Polytopes in R Apostolos Chalkis

National & Kapodistrian University of Athens, Greece

Vissarion Fisikopoulos

National & Kapodistrian University of Athens, Greece

Abstract

Sampling from high dimensional distributions and volume approximation of convexbodies are fundamental operations that appear in optimization, ﬁnance, engineering andmachine learning. In this paper we present volesti , a

C++ package with an R interfacethat provides eﬃcient, scalable algorithms for volume estimation, uniform and Gaussiansampling from convex polytopes. volesti scales to hundreds of dimensions, handles eﬃ-ciently three diﬀerent types of polyhedra and provides non existing sampling routines to R . We demonstrate the power of volesti by solving several challenging problems using the R language. Keywords : volume approximation, sampling, polytopes, integration, ﬁnancial crisis detection,zonotope approximation, counting linear extensions, R , C++ .

1. Introduction

High-dimensional sampling and integration are fundamental problems with many applicationsin science and engineering (Iyengar 1988; Somerville 1998; Genz and Bretz 2009; Schellen-berger and Palsson 2009; Venzke, Molzahn, and Chatzivasileiadis 2019). Nevertheless, thoseproblems are computationally hard for general dimension. For example computing the vol-ume of convex polytopes (a special case of integration) is a hard problem (Dyer and Frieze1988). Even worse, deterministic approximation of volume computation is only possible un-der exponential in the dimension errors for convex bodies in the oracle model (Elekes 1986).Therefore, a great eﬀort has been devoted to randomized approximation algorithms that es-timate the volume by eﬃciently sampling random points from the convex body. Startingwith the celebrated result by Dyer, Frieze, and Kannan (1991) with complexity O ∗ ( d ) ora-cle calls , around three decades of algorithmic improvements reduced the exponent to 3 forwell-rounded convex bodies (Cousins and Vempala 2015). Recently, even faster algorithms In the oracle model, a black box routine called oracle gives access to the convex body either by answeringwhether a query points lays in or out of it or by computing the intersection points of a trajectory and thebody. Here, O ∗ ( · ) suppresses polylogarithmic factors and dependence on error parameters and d is the dimension. a r X i v : . [ s t a t . C O ] J u l volesti : Volume Approximation and Sampling in R have been designed and analyzed for the special case of convex polytopes (Chen, Dwivedi,Wainwright, and Yu 2018; Lee and Vempala 2018; Mangoubi and Vishnoi 2019).All the above mentioned algorithms rely on sampling and enjoy great theoretical guaranteesbut they cannot be applied eﬃciently to real life computations. For example, the asymptoticanalysis by Lovász and Vempala (2006) hides some large constants in the complexity andin (Lee and Vempala 2018) the step of the random walk used for sampling is too small to bean eﬃcient choice in practice. Therefore, practical algorithms have been designed by relaxingthe theoretical guarantees and applying new algorithmic and statistical techniques to performeﬃciently while on the same time meet the requirements for high accuracy results. The ﬁrstpractical algorithm that scaled to high dimensions was by Emiris and Fisikopoulos (2014)(Sequence of Balls or SoB algorithm), implemented in C++ , highlighted the importance ofCoordinate Direction Hit and Run walk. An asymptotically faster practical algorithm fol-lowed by Cousins and Vempala (2016) (Cooling Gaussians or CG algorithm), implementedin

MATLAB . However, both algorithms can handle only H-polytopes (i.e., sets of linear in-equalities) in high-dimensions. A more recent algorithm by Chalkis, Emiris, and Fisikopoulos(2019) (Cooling Bodies or CB algorithm) is designed to scale in high dimensions for otherpolytope representations as well such as V-polytopes (i.e., convex hulls of sets of points) andZ-polytopes (i.e. Minkowski sums of segments). In this paper, high dimensions typically meanorder of few hundreds, which is today’s limit in practical volume approximation.Geometric random walks is an undoubted active area in the intersection of convex geometryand statistics. The Markov chain method by Metropolis, Rosenbluth, Rosenbluth, Teller, andTeller (1953) was the ﬁrst method to sample from high dimensional distributions and has beenused extensively for numerical problems in statistical mechanics. Several research results werebased on this method before Hastings (1970) provide a generalization of Metropolis method.Since then, a great amount of eﬀort devoted to high dimensional sampling with Markov chainalgorithms. Let us stand out a few important and famous algorithms in the sequel. Geman andGeman (1984) introduce an algorithm known as Gibbs sampler is commonly used in statisticalinference. Hamiltonian Monte Carlo (HMC) (Duane, Kennedy, Pendleton, and Roweth 1987)resulted to a remarkable empirical success over the years but only recently Mangoubi andSmith (2017); Lee, Song, and Vempala (2018); Lee, Shen, and Tian (2020) presented the ﬁrsttheoretical results on the mixing of the walk. For two inspirational surveys on Hamiltoniandynamics we suggest those of Neal (2011) and Betancourt (2017). Smith (1984) introducedHit and Run algorithm, while provided simple, widely used criteria to prove the convergenceto the uniform distribution. Later, Kaufman and Smith (1998) discussed eﬃcient directionchoices for Hit and Run and provided a variation of the Hit and Run walk that enjoyed a lot ofattention across sciences. For example it is widely used in biology to study constraint-basedmodels of metabolic networks (Megchelenbrink, Huynen, and Marchiori 2014; Saa and Nielsen2016). Another famous family of algorithms is based on Langevin Monte Carlo (LMC) methodand its variants (Dalalyan 2017). Recently, Shen and Lee (2019) presented an algorithm basedon underdamped Langevin diﬀusion for sampling from log-concave distributions and provedsub-linear dependence on the input dimension for the mixing time.An important special case of sampling is from a truncated to a convex body probabilitydistribution. For this case, the random walks need some access to the body given by anoracle. Hit and Run requires a so called boundary oracle for the body—that is, compute theintersection points of a line and the body’s boundary. The truncated version of MetropolisHastings is called Ball walk and requires a membership oracle–that answers whether a point postolos Chalkis, Vissarion Fisikopoulos R software, there are several CRAN packages that provide sampling from multi-variate distributions. There are two main categories, truncated and non-truncated distribu-tions. For the ﬁrst category an important case is the truncated Gaussian distribution. For thiscase there is tmg implementing exact HMC with boundary reﬂections as well as multinomineq,restrictedMVN, lineqGPR, tmvmixnorm implementing variations of the Gibbs sampler. Thelatter can also be used to sample from the t-distribution. To generate uniformly distributedpoints from a convex polytope there exist hitandrun . For the non-truncated case, packages

HybridMC, rhmc provide implementations for HMC without truncation on the target space,while there are two more packages, i.e. mcmc and

MHadaptive , to sample from any distribu-tion using the Metropolis Hastings algorithm. For volume computation in R there exist onlyone package, namely geometry , which provides a deterministic algorithm for volume compu-tation of the convex hull of a set of points; it is based on the C++ package qhull (Barber,Dobkin, and Huhdanpaa 1996).In this paper, we present volesti , a C++ library with an R interface, that contains a varietyof high-dimensional sampling and volume computation algorithms. In particular, it includeseﬃcient implementations of all three practical volume algorithms noted above—that is SoB,CG and CB. Moreover, volesti provides eﬃcient implementations for the following randomwalks (deﬁned in Section 2.2):• Random-Directions Hit and Run (RDHR) (Smith 1984)• Coordinate-Directions Hit and Run (CDHR) (Smith 1984)• Ball Walk (BaW) (Hastings 1970)• Billiard Walk (BiW) (Polyak and Gryazina 2014)All the methods above can be used to sample from multivariate uniform or spherical Gaussiandistributions (centered at any point), except BiW which, by deﬁnition, can be employed onlyfor uniform sampling. The novelty of volesti is highlighted by the following points:(a) is the ﬁrst R package for eﬃcient high dimensional volume estimation,(b) is the ﬁrst open source software that handles eﬃciently three diﬀerent types of polyhedrain high dimensions,(c) provides non existing in R sampling algorithms from uniform and Gaussian distributions,(d) solves so far intractable problems using R (Section 4). https://github.com/GeomScale/volume_approximation/tree/v1.1.1 volesti : Volume Approximation and Sampling in R Let us now illustrate the power of volesti with few examples of volume estimation and sam-pling. The scripts of the examples in this paper use some R libraries either for comparisonwith volesti or for plotting. To run all the scripts successfully the reader should enable allthose libraries by the following R command: library(volesti, hitandrun, geometry, SimplicialCubature, ggplot2, plotly,latex2exp, rgl) Let as start with a simple example by computing the volume of the 10-dimensional cube[ − , , which is know to be 1024. P = gen_cube(10, 'H')cat(volume(P, seed = 5))1022.408

Since this is a randomized algorithm it makes sense to compute some statistics for the outputvalues using R . Now we compute the volume of a 40-dimensional cube. P = gen_cube(40, 'H')cat(P$volume)1.099512e+12volumes <- list()for (i in 1:20) {volumes[[i]] <- volume(P, settings = list("error" = 0.2))}options(digits=10)summary(as.numeric(volumes))Min. 1st Qu. Median Mean 3rd Qu. Max.7.998443e+11 9.505622e+11 1.029577e+12 1.035344e+12 1.116768e+12 1.303105e+12

By changing the error to 0 .

02 we can obtain more accurate results.

Min. 1st Qu. Median Mean 3rd Qu. Max.1.004643e+12 1.032829e+12 1.072707e+12 1.086475e+12 1.110526e+12 1.225569e+12

To understand the need of randomized computation in high dimensions implemented in volesti we compare with the state-of-the-art volume computation in R today i.e., geometry thatimplements a deterministic algorithm. In dimension 15 using geometry is more eﬃcient than volesti while starting in dimension 20 geometry terminates with error, hence volesti is the onlyway to estimate the volume. The following example illustrates this behavior using random15- and 20-dimensional polytopes with 30 and 40 vertices each. postolos Chalkis, Vissarion Fisikopoulos P = gen_rand_vpoly(15, 30, generator = 'cube', seed = 1729)time1 = system.time({geom_values = geometry::convhulln(P$V, options = 'FA')})time2 = system.time({vol2 = volume(P, settings = list("algorithm" = "CB","random_walk" = "BiW"),seed = 5) })cat(time1[3], time2[3])2.482 7.879cat(geom_values$vol, vol2, abs(geom_values$vol-vol2)/geom_values$vol)6.613415e-05 6.426176e-05 0.02831212P = gen_rand_vpoly(20, 40, generator = 'cube', seed = 1729)time1 = system.time({geom_values = geometry::convhulln(P$V, options = 'FA')})QH6082 qhull error (qh_memalloc): insufficient memory to allocate 1329542232bytestime2 = system.time({ vol2 = volume(P, seed = 5) })cat(time2[3], vol2)14.524 2.68443e-07

The R package hitandrun (van Valkenhoef and Tervonen 2019) implements RDHR for uniformsampling from convex polytopes. The following R script compares the running time of hitan-drun with volesti on the 100-dimensional hypercube [ − , . In terms of convergence totarget distribution, hitandrun behaves similarly with the RDHR from volesti (see Figure 4). d = 100P = gen_cube(d, 'H')constr = list("constr" = P$A, "dir" = rep("<=", 2 * d), "rhs" = P$b)time1 = system.time({ points1 = t(hitandrun::hitandrun(constr = constr,n.samples = 1000,thin = 1)) })time2 = system.time({ points2 = sample_points(P, random_walk =list("walk" = "RDHR","walk_length" = 1),n = 1000, seed = 5) })cat(time1[3], time2[3])47.483 0.010 For more detailed comparison with hitandrun see https://github.com/GeomScale/volume_approximation/wiki/volesti-vs-hitandrun volesti : Volume Approximation and Sampling in R Figure 1: Examples of three diﬀerent polytope representations. From left to right: an H-polytope, a V-polytope and a Z-polytope (a sum of four segments).The case of sampling from the truncated Gaussian distribution is more involved. There is anumber of R packages described above that implement various sampling algorithms. volesti adds two new ones, namely RHDR and CDHR, to this list, with the latter being the mosteﬃcient per random walk’s step. However, since there are not known bounds for the mixingtimes of most of those walks it is diﬃcult to compare them. An analytical as well as anempirical comparison would be of great interest but beyond the limits of this paper.In Section 4 we demonstrate how our software can be useful for several applications. Startingwith computational ﬁnance we exploit volesti ’s functionality and the framework given byCalès, Chalkis, Emiris, and Fisikopoulos (2018) to perform complex computations on real-world open data for ﬁnancial crisis detection. Volume approximation of Z-polytopes canbe used to evaluate Z-polytope over-approximation methods in mechanical engineering. Thisevaluation process involves a volume computation and thus, due to the curse of dimensionality,was limited to low dimensions only (typically less than 15) before the use of volesti . Section 4.3presents how volesti algorithms can be used for high dimensional Monte Carlo integration ofa multivariate function over a convex polytope. We are not aware of any other R packageto perform such computations in high dimensions e.g. more than 15. Moreover, volumecan be used to solve hard problems in combinatorics such as counting the number of linearextensions of an acyclic digraph (Section 4.4). Last but not least, motivated by a problemin bio-geography we use our package to approximate the volume of the intersection of twoV-polytopes (Section 4.5). Paper organization.

We continue our presentation with Section 2 that provides all thenecessary deﬁnitions of objects and algorithmic tools that are available in volesti , thus pro-viding the technical background needed to understand the package description in Section 3.Section 4 illustrates how volesti can be applied in various applications ranging from ﬁnanceto combinatorics and engineering.

2. Algorithms and polytopes

Convex polytopes are a special case of convex bodies with special interest in many scientiﬁcﬁelds and applications. For example, in optimization the feasible region of a linear programis a polytope and in ﬁnance the space of portfolios is usually expressed by a polytope (i.e. thesimplex). postolos Chalkis, Vissarion Fisikopoulos P := { x | Ax ≤ b } ⊆ R d where A ∈ R m × d and b ∈ R m , and we say that P is given in H-representation. Each row a Ti ∈ R d of matrix A corresponds to a normal vector that deﬁnes a halfspace a Ti x ≤ b i , i = [ m ]. Theintersection of those halfspaces deﬁnes the polytope P and the hyperplanes a Ti x = b i , i = [ m ]are called facets of P .A V-polytope is given by a matrix V ∈ R d × n which contains n points column-wise, and wesay that P is given in V-representation. The points of P that cannot be written as convexcombinations of other points of P are called vertices. The polytope P is deﬁned as the convexhull of those vertices, i.e. the smallest convex set that contains them. Equivalently, a V-polytope can be seen as the linear map of the canonical simplex ∆ n − := { x ∈ R n | x i ≥ , P ni =1 x i = 1 } according to matrix V , i.e., P := { x ∈ R d | ∃ y ∈ ∆ n − : x = V y } A Z-polytope (or zonotope) is given by a matrix G ∈ R d × k , which contains k segments column-wise, which are called generators. In this case, P is deﬁned as the Minkowski sum of thosesegments and we say that it is given in Z-representation. We call order of a Z-polytope theratio between the number of segments over the dimension. Equivalently, P can be expressedas the linear map of the hypercube [ − , k with matrix G , i.e. P := { x ∈ R d | ∃ y ∈ [ − , k : x = Gy } . Thus, a Z-polytope is a centrally symmetric convex body, as a linear map of an other centrallysymmetric convex body.Examples of an H-polytope, a V-polytope and a Z-polytope in two dimensions are illustrated inFigure 1. For an excellent introduction to polytope theory we recommend the book of Ziegler(1995).

We deﬁne here more formally the four geometric random walks implemented in volesti , namely,Hit and Run (two variations, RDHR and CDHR), Ball walk (Baw) and Billiard walk (BiW).They are illustrated in Figure 2 for two dimensions. All walks, except BiW, can be used tosample approximately from any distribution truncated in P , while BiW can be used only togenerate approximate uniformly distributed points in P .In general if f : R n → R + is a non-negative integrable function then it deﬁnes a measure π f on any measurable subset A of R d , π f ( A ) = R A f ( x ) dx R R d f ( x ) dx Let ‘ be a line in R d and let π ‘,f be the restriction of π to ‘ , π ‘,f ( P ) = R p + tu ∈ P f ( p + tu ) dt R ‘ f ( x ) dx where p is a point on ‘ and u a unit vector parallel to ‘ . The following pseudocode describesone step of Hit and Run. volesti : Volume Approximation and Sampling in R (cid:96) pq (cid:96) p q B p q p q Figure 2: Examples of random walks. From left to right: RDHR, CDHR, BaW, BiW; p isthe point at current step and q the new point computed. Dotted lines depict previous steps. Hit and Run ( P, p, f ): Polytope P ⊂ R d , point p ∈ P , f : R d → R +

1. Pick a line ‘ through p .2. return a random point on the chord ‘ ∩ P chosen from the distribution π ‘,f .It is easy to prove that π f is the stationary distribution of the random walk. When the line ‘ in line (1.) of the pseudocode is chosen uniformly at random from all possible lines passingthrough p then the walk is called Random-Directions Hit and Run (Smith 1984). To pick arandom direction through point p ∈ R d we could sample d numbers g , . . . , g d from N (0 , u = ( g , . . . , g d ) / qP g i is uniformly distributed on the surface of the d -dimensional unit ball. A special case is called Coordinate-Directions Hit and Run (Smith1984) where we pick ‘ uniformly at random from the set of d lines that passing through p andare parallel to the coordinate axes.The Ball walk needs, additionally to Hit and Run, a radius δ as input. In particular, Ballwalk is Metropolis Hastings (Hastings 1970) when the target distribution is truncated. Thefollowing pseudocode describes one step of Ball walk with a Metropolis ﬁlter. Ball Walk ( P, p, δ, f ): Polytope P ⊂ R d , point p ∈ P , radius δ , f : R d → R +

1. Pick a uniform random point x from the ball of radius δ centered at p return x with probability min (cid:26) , f ( x ) f ( p ) (cid:27) ; return p with the remaining probability.Again it is easy to prove that π f is the stationary distribution. If f ( x ) = e −|| x − x || / σ thenfor both Hit and Run and Ball walk, the target distribution is the multidimensional sphericalGaussian with variance σ and its mode at x and if it is the indicator function of P then thetarget distribution is the uniform distribution.Billiard walk, is a random walk for sampling from the uniform distribution (Polyak andGryazina 2014). It tries to emulate the movement of a gas particle during the physicalphenomena of ﬁlling uniformly a vessel. The following pseudocode implements one step ofBilliard walk, where h· , ·i is the inner product between two vectors, || · || is the ‘ norm and | · | is the length of a segment. postolos Chalkis, Vissarion Fisikopoulos Billiard walk ( P, p, τ, R ): Polytope P ⊂ R d , current point of the random walk p ∈ P ,parameter τ ∈ R + , upper bound on the number of reﬂections R ∈ N

1. Set the length of the trajectory L ← − τ ln η , η ∼ U (0 , n ← p ← p ;Pick a uniformly distributed direction on the unit sphere, v ;2. Update n ← n + 1; If n > R return p ;3. Set ‘ ← { p + tv, ≤ t ≤ L } ;4. If ∂P ∩ ‘ = ∅ return p + Lv ;5. Update p ← ∂P ∩ ‘ ; Let s be the inner normal vector of the tangent plane on p ,s.t. || s || = 1; Update L ← L − | P ∩ ‘ | , v ← v − h v, s i s ; goto p ∈ P the more points we generate theless correlated with p will be. The number of random walk steps to get an uncorrelated point,that is a point approximately drawn from π f , is called mixing time. We call cost per stepthe number of operations performed to generate a point. Hence, the total cost to generate arandom point is the mixing time multiplied by the cost per step.random walk mixing time cost/step cost/stepH-polytope V- & Z-polytopeRDHR (Lovász and Vempala 2006) O ∗ ( d ) O ( md ) 2 LPsCDHR (Smith 1984) ? O ( m ) 2 LPsBaW (Lee and Vempala 2017) O ∗ ( d . ) O ( md ) 1 LPBiW (Polyak and Gryazina 2014) ? O (( d + R ) m ) R LPsTable 1: Overview of the random walks implemented in volesti . LP for linear program and R for the number of reﬂections per point in BiW.Table 1 displays known complexities for mixing time and cost per step. For the mixing timeof RDHR we assume that P is well rounded, i.e. B d ⊆ P ⊆ C √ dB d , where B d is the unit balland C a constant. In general if rB d ⊆ P ⊆ RB d then RDHR mixing time is O ∗ ( d ( R/r ) ).For the mixing time of Ball walk in Table 1 we assume that P is in isotropic position and theradius of the ball is δ = Θ(1 / √ d ) (Lee and Vempala 2017). There are no theoretical boundson mixing time for CDHR and BiW. Polyak and Gryazina (2014) experimentally show thatBiW converges faster than RDHR when τ ≈ diam ( P ), i.e. the diameter of P . CDHR isthe main paradigm for sampling in practice from H-polytopes, e.g. in volume computation(Emiris and Fisikopoulos 2014) and biology (Haraldsdòttir et al. et al. volesti enable us to empirically evaluatetheir mixing time using R (e.g., Figure 4).Apart from sampling from the interior of convex polytopes, there are methods for samplingfrom their boundary. They are based on RDHR/CDHR and are called BRDHR/BCDHR0 volesti : Volume Approximation and Sampling in R respectively in this paper. They performs as RDHR/CDHR but store only the extreme/boundary points. There is no theoretical guarantee that these methods converge to theuniform distribution, as the Shake-and-Bake algorithm (Dieker and Vempala 2015), but thereis some experimental evidence that they both do so (Schneebeli 2015). Both BRDHR andBCDHR are implemented in volesti . As mentioned before, volume computation is a hard problem, so given a polytope P we haveto employ randomized algorithms to approximate vol( P ) within some target relative error (cid:15) and high probability. The keys to success of those algorithms are the Multiphase Monte Carlo(MMC) technique and sampling from multivariate distributions with geometric random walks(Section 2.2).In particular, we deﬁne a sequence of functions { f , . . . , f q } , f i : R d → R . Then vol( P ) isgiven by the following telescopic product:vol( P ) = Z P dx = Z P f q ( x ) dx R P f q − ( x ) dx R P f q ( x ) dx · · · R P dx R P f ( x ) dx (1)Then, we need to:• Fix the sequence such that q is as small as possible.• Select f i such that each integral ratio can be eﬃciently computed.• Compute R P f q ( x ) dx .For a long time researchers, e.g., Lovász, Kannan, and Simonovits (1997), set f i to be indicatorfunctions of concentric balls intersecting P (Figure 3). It follows that R P f i ( x ) dx = vol( B i ∩ P )and the sequence of convex bodies P = P ⊇ · · · ⊇ P q , P i = B i ∩ P forms a telescopic productof ratios of volumes, while for vol( P q ) there is a closed formula. Assuming rB d ⊂ P ⊂ RB d ,then q = O ( d lg R/r ). The trick now is that we do not have to compute the exact value ofeach ratio r i = vol( P i ) / vol( P i +1 ), but we can use sampling-rejection to estimate it withinsome target relative error (cid:15) i . If r i is bounded then O (1 /(cid:15) i ) uniformly distributed points in P i +1 suﬃces. Another crucial aspect is the sandwiching ratio R/r of P which has to be assmall as possible. This was tackled by a rounding algorithm, that is bringing P to nearlyisotropic position (Lovász et al. P i suchthat 0 . ≤ vol( P i ) / vol( P i +1 ) ≤

1. They also introduce a practical method for rounding P toreduce the sandwiching ratio by skipping the transformation to nearly isotropic position.Lovász and Vempala (2006) were the ﬁrst to set f i to general functions and thus asymp-totically reduce q , i.e., the length of the sequence of f i . In the CG algorithm, each f i is aspherical multidimensional Gaussian distribution. In particular, they are based on (Cousinsand Vempala 2015) with an annealing schedule (Lovász and Vempala 2006) to ﬁx the sequenceof Gaussians. The sequence is parameterized by the variances σ < · · · < σ q . They computean inscribed ball (ideally the largest one) of P and use the number of facets to set the varianceof the ﬁrst Gaussian so that P r [ x / ∈ P ] ≤ (cid:15), x ∼ N (0 , σ I ). postolos Chalkis, Vissarion Fisikopoulos et al. (2019). Here, r = 0 . r + δ = 0 . d ≤ d > (cid:88) CB (Chalkis et al. (cid:88) (cid:88)

Table 2: Performance of the volume estimation algorithms in volesti . The most eﬃcientalgorithm for each scenario is checked.Motivated by polytopes where the boundary is unknown (e.g., V- and Z-polytopes) CB algo-rithm use a simulated annealing technique in order to minimize q by deﬁning the sequence of P i s.t. r i = vol( P i ) / vol( P i +1 ) ∈ [ r, r + δ ] with high probability, where r, δ > < r + δ < O ∗ (1) BiW points per ratio suﬃces. Last, the simulated annealingallow to generalize the MMC in Figure 3, by using any body C in MMC, instead of a ball.This is crucial when P is a Z-polytope as they can compute bodies that ﬁts better to the inputpolytope than the ball. Thus, skipping the rounding step and reduce further the number ofbodies in MMC than the rounding step could do. For example, in Figure 3 (middle) thealgorithm requires 5 balls, but in the right plot it computes an enclosing body and thus asingle rejection step suﬃces.Table 2 sums up the performance of the three practical methods implemented in volesti . TheCB algorithm is the most eﬃcient choice for H-polytopes in less than 200 dimensions andfor V- and Z-polytopes in any dimension. For the rest of the cases the user should chooseCG algorithm. Preliminary tests show that SoB algorithm has the largest probability toapproximate vol( P ) within a target relative error (cid:15) but it is unclear whether this is the ruleand further benchmarks are needed.

3. Package

Our package volesti combines the eﬃciency of

C++ and the popularity and usability of R .Currently, the package consists of around 10 K lines of C++ and R code. The package is using2 volesti : Volume Approximation and Sampling in R the eigen library (Guennebaud, Jacob et al. lpsolve library (Berkelaar,Eikland, and Notebaert 2004) for solving linear programs and boost random library (Maurerand Watanabe 2017) (part of Boost C++ libraries) for random numbers and distributions.All the code development is performed on https://github.com/GeomScale/volume_approximation

The package is available in Comprehensive R Archive Network (CRAN) and is regularlyupdated with new features and bug-ﬁxes (Fisikopoulos and Chalkis 2019). We employ con-tinuous integration for maintainable and testing the

C++ part of volesti . Additionally, thereis a separate test suite for the functions of the R package.There is a detailed documentation of all the exposed R classes and functions publicly available.We maintain a contribution tutorial to help users and researchers who want to contribute tothe development or propose a bug-ﬁx. The package is shipped under the LGPL-3 license tobe open to all the scientiﬁc communities.We use

Rcpp (Eddelbuettel and François 2011) to interface

C++ with R . In particular, wecreate one Rcpp function for each procedure (such as sampling, volume estimation etc.) andwe export it as an R function. To export C++ classes that represent convex polytopes weuse

Rcpp modules (Eddelbuettel and Fran¸cois 2017). Currently, the package provides 10functions, 4 exposed classes for convex polytopes and 8 polytope generators, explained indetail in the following subsections.

Our package volesti comes with 4 classes to handle diﬀerent representations of polytopes. Ta-ble 3 demonstrates the exposed R classes. The name of the classes are the names of polytoperepresentations as deﬁned in Section 2.1. There is one more class called VpolytopeIntersection that represents the intersection of two V-polytopes. Each polytope class has a few variablemembers that describe a speciﬁc polytope, demonstrated in Table 3. The matrices and thevectors in Table 3 correspond to those of Section 2.1 which deﬁne a polytope of a particularrepresentation. The integer variable type implies the representation: is for H-polytopes, for V-polytopes, for Z-polytopes and for the intersection of two V-polytopes. The numerical variable volume corresponds to the volume of the polytopes if it is known. The volesti ’s generators of standard, well studied polytopes assign the value of the exact volumeto this variable.The set of polytopes’ generators can be used to create and test a variety of diﬀerent inputdata. All the random generators could take a generator seed as input.• gen_cube(dimension = integer, representation = string) generates a cube- d : { x = ( x , . . . , x d ) | x i ≤ , x i ≥ − , x i ∈ R for all i = 1 , . . . , d } . The exact volume isequal to 2 d .• gen_cross(dimension = integer, representation = string) generates a cross- d :cross polytope, the dual of cube, i.e. conv( {− e i , e i , i = 1 , . . . , d } ). The exact volume isequal to 2 d /d !. https://circleci.com/gh/GeomScale/volume_approximation https://github.com/GeomScale/volume_approximation/blob/develop/CONTRIBUTING.md postolos Chalkis, Vissarion Fisikopoulos Hpolytope Hpolytope$new(A,b) matrix A ∈ R m × d vector b ∈ R m integer typenumerical volumeVpolytope Vpolytope$new(V) matrix V ∈ R n × d integer typenumerical volumeZonotope Zonotope$new(G) matrix G ∈ R k × d integer typenumerical volumeVpolytopeIntersection VpolytopeIntersection$new(V1,V2) matrix V ∈ R n × d matrix V ∈ R n × d integer typenumerical volume Table 3: Overview of the polytopes’ classes in volesti .• gen_simplex(dimension = integer, representation = string) generates a ∆- d :the d -dimensional simplex conv( { e i , for i = 1 , . . . , d } ). The exact volume is equal to1 /d !.• gen_rand_vpoly(dimension = integer, nvertices = integer, generator = string,seed = integer) . When generator = "cube" generates a V-polytope with n non re-dundant vertices, uniformly distributed in the hypercube [ − , d . When generator ="sphere" (default value) generates a V-polytope with n vertices uniformly distributedon the d -dimensional unit sphere.• gen_rand_zonotope(dimension = integer, nsegments = string, generator = string,seed = integer) . When generator = "uniform" (default value) the length of each d -dimensional segment is picked uniformly from the interval [0 , generator= "gaussian" the length of each d -dimensional segment is picked from N (50 , (50 / )truncated to [0 , generator = "exponential" the length of each d -dimensionalsegment is picked from Exp (1 /

30) truncated to [0 , gen_skinny_cube(dimension = integer) generates the skinny cube [ − , d − × [ − , · d .• gen_prod_simplex(dimension = integer) generates the Cartesian product of twounit simplices in R d ; it is available only in H-representation. The exact volume isequal to 1 /d ! .• gen_rand_hpoly(dimension = integer, nfacets = integer, seed = integer) gen-erates a random H-polytope. We choose m hyperplane tangent to the hypersphere of4 volesti : Volume Approximation and Sampling in R radius 10 centered at the origin and for each one we construct a halfspace that containsthe center of the hypersphere.Note that for all the random Z-polytope generators we pick the direction of each one of the k segments to be a uniformly distributed vector on the unit sphere. When the exact volume ofthe polytope is known the generator sets the value of the class variable volume equal to thatvalue; otherwise the default value is NaN . For the ﬁrst three generators the user can choosethe representation of the generated polytope by setting representation either to

Hpolytope or Vpolytope by giving "H" or "V" as input respectively. The rest of the generators comewith a ﬁxed output representation. In this subsection we present all the functions provided by volesti to sample from convexpolytopes. Moreover we demonstrate how these functions can be used for empirical study ofmixing time, or plotting 2d polygons.

Sampling via random walks

An important aspect of volesti is approximate sampling from convex bodies with uniformor spherical Gaussian target distribution using the 4 geometric random walks mentioned inSection 2.2. The sampling function is, sample_points(P = Rcpp_class, n = integer, distribution = List,random_walk = List, seed = integer) where distribution , random_walk and seed can be omitted and the default values are goingto be used. The default target distribution is the uniform distribution. However, if Gaussian isselected by the user, the default variance is 1 and the default mode is the center of the inscribedball rB d ⊆ P (see Section 2.3). The default random walk for the uniform distribution is BiWwith walk length equal to 5 for all the representations. For the Gaussian distribution thedefault random walk is CDHR for H-polytopes and RDHR for all the other representationswith walk length equal to b

10 + d/ c for both random walks. The default starting pointfor all the random walks is also the center of rB d . The default number of points nburns toburn before start sampling is 0, the default maximum length L of the BiW trajectory is 2 dr and the default radius for the BaW is δ = 4 r/ √ d when the target distribution is the uniformdistribution, otherwise δ = 4 r/ p max { , / σ } d when the latter is the spherical Gaussiandistribution. The function is parameterized by:1. A convex polytope P in any of the 4 representations described in Section 3.1.2. The number of points n , to sample from P .3. The List distribution is used to set the target distribution with elements:(i) "density" = {"uniform", "gaussian"} , (ii) "variance" = numeric and (iii) "mode" = vector . postolos Chalkis, Vissarion Fisikopoulos − , . We mapthe sample back to [ − , and then we project them on R by keeping the ﬁrst threecoordinates. Each row corresponds to a diﬀerent walk: BaW, CDHR, RDHR, BiW. Eachcolumn to a diﬀerent walk length: {1, 50, 100, 150, 200}. That is, the sub-ﬁgure in third rowand forth column corresponds to RDHR with 150 walk length.4. The List random_walk to choose and parameterize the random walk, with elements:(i) "walk"= {"CDHR", "RDHR", "BiW", "BaW", "BRDHR", "BCDHR"} ,(ii) "walk_length"= integer , (iii) "starting_point" = vector ,(iv) "nburns" = integer , (v) "BaW_rad" = numeric and (vi) "L" = numeric .5. A random seed .Using volesti and R we can empirically study the mixing time of the geometric random walksimplemented in volesti . To this end, we sample uniformly from a random rotation of the200-dimensional hypercube [ − , . Then we map the points back to [ − , using theinverse transformation and then we project all the sample points on R , or equivalently onthe 3D cube [ − , , by keeping the ﬁrst three coordinates. We plot the results in Figure 4.The following script performs those computations (plot scripts in the Appendix A.1). Thefunction rotate_polytope() rotates randomly a polytope and returns the rotated polytopeand the matrix of the linear transformation; for more details see Section 3.6. d = 200 volesti : Volume Approximation and Sampling in R num_of_points = 1000P = gen_cube(d, 'H')retList = rotate_polytope(P, seed = 5)T = retList$TP = retList$Pfor (i in c(1, seq(from = 50, to = 200, by = 50))){points1 = t(T) %*% sample_points(P, n = num_of_points,random_walk = list("walk" = "BaW","walk_length" = i),seed = 5)points2 = t(T) %*% sample_points(P, n = num_of_points,random_walk = list("walk" = "CDHR","walk_length" = i),seed = 5)points3 = t(T) %*% sample_points(P, n = num_of_points,random_walk = list("walk" = "RDHR","walk_length" = i),seed = 5)points4 = t(T) %*% sample_points(P, n = num_of_points,random_walk = list("walk" = "BiW","walk_length" = i),seed = 5)} We note that, in general, perfect uniform sampling in the rotated polytope would resultto perfect uniformly distributed points in the 3D cube [ − , . Hence, Figure 4 shows anadvantage of BiW in mixing time for this scenario compared to the other walks—it mixesrelatively well even with one step (i.e. walk length). Notice also that the mixing of bothCDHR and RDHR seem similar while it is sightly better than the mixing of BaW. Direct sampling volesti also provides direct uniform sampling from special bodies. By direct sampling wemean that we do not employ random walks. Typically direct sampling is more eﬃcient andmore accurate than sampling using random walks. The function that oﬀers this option is, direct_sampling(body = List, n = integer, seed = integer)

There are no default values for the input variables of this function, except from seed . It isparameterized by,1. The

List body to request exact uniform sampling from special well known convex bod-ies through the following elements:(i) "type" a string that declares the type of the body for the exact sampling with ’unit_simplex’ for the unit simplex, i.e. ∆ d := { x ∈ R d | x i ≥ , P di =1 x i ≤ } , postolos Chalkis, Vissarion Fisikopoulos ’canonical_simplex’ for the canonical simplex, i.e. ∆ d − := { x ∈ R d | x i ≥ , P di =1 x i = 1 } , or ’hypersphere’ for the boundary of a hypersphere centered atthe origin, or ’ball’ for the interior of a hypersphere centered at the origin,(ii) "dimension" and "radius" (only for hypersphere and ball).2. The number of points n , to sample from P .3. A random seed .The cost per uniformly distributed point for a ball/hypersphere as well as for the unit andthe canonical simplex is O ( d ) (Rubinstein and Melamed 1998). Notice that the random walkwith the smallest cost per step is CDHR, which is also O ( d ) for those bodies. However, onemay need several walk steps (at least polynomial in d following the theoretical bounds; seeSection 1) to generate an almost uniformly distributed point.Direct uniform sampling from some well known families of convex bodies such as simplicesor hypersphere are useful fundamental operations in many randomized algorithms in convexoptimization (Dabbene, Shcherbakov, and Polyak 2010) or computational biology (Herrmann,Dyson, Vass, Johnson, and Schwartz 2019).Generating uniformly distributed points on the boundary of a hypersphere is equivalent togenerate a direction uniformly as discussed in Section 2.2. Uniform sampling from the inte-rior of a hypersphere is used by CB algorithm. Uniform sampling from ∆ d − is useful forapplications in ﬁnance (Pouchkarev 2005; Guegan, Calès, and Billio 2011; Banerjee and Hung2011), as it is equivalent to generate uniformly distributed portfolios in a stock market of d assets (see Section 4). Figure 5 illustrates the samples obtained by the following R script(plot scripts in the Appendix A.2). N = 2000points1 = direct_sampling(body = list("type" = "canonical_simplex","dimension" = 3),n = N, seed = 5)points2 = direct_sampling(body = list("type" = "ball","dimension" = 2),n = N, seed = 5)points3 = direct_sampling(body = list("type" = "hypersphere","dimension" = 2),n = N, seed = 5)

Plotting a polygon via sampling

The plots of polygons in this paper are drawn by uniform sampling from the boundary ofthe polygons with sample_points() . The following R script illustrates how volesti supportssampling from the boundary of the intersection of two V-polytopes illustrated in Figure 6(plot scripts in the Appendix A.3). P1 = gen_rand_vpoly(2, 7, generator = "sphere", seed = 13)P2 = gen_rand_vpoly(2, 5, generator = "cube", seed = 151)P3 = VpolytopeIntersection$new(V1 = P1$V, V2 = P2$V) volesti : Volume Approximation and Sampling in R Figure 5: Uniformly distributed points from the 3D canonical simplex and from the interiorand the boundary of the 2D unit ball. points1 = sample_points(P1, random_walk = list("walk" = "BRDHR"), n = 10000,seed = 5)points2 = sample_points(P2, random_walk = list("walk" = "BRDHR"), n = 10000,seed = 5)points3 = sample_points(P3, random_walk = list("walk" = "BRDHR"), n = 10000,seed = 5)

A signiﬁcant function for volume estimation in volesti is volume() . The user can select any ofthe three randomized algorithms that are implemented in volesti (Section 2.3). The polytopecan be given in any of the four representations described in Section 3.1. volume(P = Rcpp_class, settings = List, rounding = boolean, seed = integer) The input variables settings , rounding , seed are optional and when omitted the defaultvalues are going to be used. The default algorithm is selected as suggested by Table 2. Thedefault parameters of each algorithm are those suggested by Emiris and Fisikopoulos (2014);Cousins and Vempala (2016); Chalkis et al. (2019). The default random walk is CDHR forH-polytopes and BiW for V- and Z-polytopes. The default walk length is 1 for CB andCG algorithms and b

10 + d/ c for SoB. The default enclosed ball in P that the algorithmcomputes, depending the representation of P , is given later in this subsection. The defaultbody that CB algorithm uses in MMC is ball, except the case of P being a Z-polytope withorder <

5, where the H-polytope discussed in Section 1 is used. The default value for theerror parameter is 0.1 for both CB and CG and 1 for SoB algorithm. postolos Chalkis, Vissarion Fisikopoulos P , P . Right, uniformboundary sampling from the intersection of P ∩ P .The function volume() is parameterized by:1. The List settings to set the algorithm to use and the parameters of the selectedalgorithm: i) "algorithm" = {"CB", "CG", "SOB"} , ii) "error" = numeric ,iii) "random_walk" = {"CDHR", "RDHR", "BiW", "BaW"} ,iv) "walk_length" = integer , v) "win_len" = integer (the length of the slidingwindow for either CB or CG algorithm), vi) "hpoly" = boolean (a ﬂag to use H-polytopes in MMC when the input polytope is a Z-polytope and the algorithm of choiceis CB)2. The boolean input variable rounding to request rounding before volume computation.3. A random seed .When P is an H-polytope volesti computes the largest inscribed ball, but for V- and Z-polytopes computes non optimal balls as the problem becomes computationally harder. Inparticular, when P is a zonotope, checking whether a ball B ⊆ P is in co-NP, but it is notknown whether it is co-NP-complete (Cerny 2012). When P is a V-polytope, given p ∈ P thecomputation of the largest inscribed ball centered at p is NP-hard (Murty 2009). For thosecases volesti computes suboptimal inscribed balls that work well in practice as default choicesfor the sampling and volume estimation algorithms implemented in volesti . The defaultcomputations are the following:• H-polytopes:

We compute the Chebychev ball (the largest inscribed ball) by solvinga linear program (Boyd and Vandenberghe 2004).•

V- and Z-polytopes:

We compute the maximal r s.t.: re i ∈ P for all i = 1 , . . . , d ,then the ball centered at the origin with radius r/ √ d is an inscribed ball of P .0 volesti : Volume Approximation and Sampling in R • Intersection of two V-polytopes P , P : Let V ∈ R d × n , V ∈ R d × n be thematrices that contain the n vertices of P and the n vertices of P respectively. Then P ∩ P = ∅ if and only if the following linear program is feasible, h V − V i  x ... x n + n  = , x i ≥ , ∀ i ∈ [ n + n ] , n X i =1 x i = 1 , n + n X j = n +1 x j = 1 . Then we employ this linear program to compute d + 1 vertices of P ∩ P : For theﬁrst vertex we pick a random direction in R n + n as a linear objective function and theoptimal solution would be a vertex of P ∩ P . For the i -th vertex, where 1 < i ≤ d + 1,we consider the computed vertices v , . . . , v i − as vectors and we pick an objectivefunction from the orthogonal subspace generated by these vectors. Last, we computethe largest inscribed ball of the corresponding simplex using the algorithm providedby Murty (2009). For exact volume computation volesti provides support for speciﬁc convex bodies through thefollowing function, exact_vol(P = Rcpp_class, seed = integer)

When the input is a Z-polytope, exact_vol() computes and returns the exact volume bysumming absolute values of determinants as proposed by Gover and Krikorian (2010). For theother representations the function returns the member variable volume when it has not beenset to

NaN . For well known polytopes, e.g. cubes, that have been generated by volesti ’s randomgenerators this variable has been assigned to the exact volume of the polytope. Otherwise thefunction returns an exception. The following example demonstrates how exact_vol() works.

P = gen_cube(100, 'H')Z = gen_rand_zonotope(5, 10, seed = 20)rP = gen_rand_hpoly(10, 60, seed = 11)exact_vol1 = exact_vol(P)exact_vol2 = exact_vol(Z)cat(exact_vol1, exact_vol2)1.267651e+30 155854541519exact_vol3 = exact_vol(rP)Error in exact_vol(rP) : Volume unknown!Called from: exact_vol(rP) postolos Chalkis, Vissarion Fisikopoulos H := { x ∈ R d | a T x ≤ z } , a ∈ R d , z ∈ R , volesti provides the exactcomputation of vol(∆ d − ∩ H ) / vol(∆ d − ) which is also equal to vol(∆ d ∩ H ) / vol(∆ d ), where∆ d := { x ∈ R d | x i ≥ , P di =1 x i ≤ } . This ratio of volumes is strongly related with the crosssectional score of a portfolio—in terms of return—in a stock market of d assets and thereforeof special interest (see Section 4). The function frustum_of_simplex() implements thealgorithm by Varsi (1973), which performs O ( d ) operations to compute that volume ratio. frustum_of_simplex(a = numeric vector, z0 = numeric) To compute the frustum of an arbitrary simplex, one has i) to apply to H the linear trans-formation that maps the simplex to ∆ d , ii) to apply Varsi’s algorithm and iii) to computethe exact volume of the simplex by calculating a determinant. The following R script demon-strates the eﬃciency of the algorithm in thousands dimensions. The sampled vector a deﬁnesthe direction of a hyperplane and the sampled vector x is used to compute the scalar z0 whichﬁnally deﬁnes a halfspace H that intersects the ∆ d − . Notice that a few milliseconds suﬃcesto compute the volume of a frustum of ∆ d − when d = 5 000. d = 5000a = t(direct_sampling(n = 1, body = list("type" = "hypersphere","dimension" = d),seed = 50))x = direct_sampling(n = 1, body = list("type" = "canonical_simplex","dimension" = d),seed = 50)z0 = a%*%xtim = system.time({ volume = frustum_of_simplex(a, z0) })cat(tim[3], volume)0.057 0.01729134 A critical complexity issue for all the volume estimation algorithms implemented in volesti is to bring a skinny polytope to well-rounded or isotropic position. To achieve this goal inpractice, volesti follows the method proposed by Emiris and Fisikopoulos (2014). For H- andZ-polytopes the method generates 10 d uniformly distributed points in P and computes anapproximation of the minimum volume enclosing ellipsoid of that pointset using Khachiyan’salgorithm (Todd and Yildirim 2007) and the implementation by Nikolic (2015). Then it mapsthe ellipsoid to the unit ball and applies the same linear transformation to P . When P is aV-polytope volesti computes the same ellipsoid, but now the pointset consists of the verticesof P . This rounding preprocessing can be also computed by the function, round_polytope(P = Rcpp_class, seed = integer) volesti : Volume Approximation and Sampling in R Figure 7: Left: the projected points of a skinny cube. Right: the projected points of therounded skinny cube.It returns a

List that contains the rounded polytope, the matrix of the linear transformation,the vector/point which shifts input polytope P and the determinant of the linear map to beused for other volume computations. The following R script shows how useful rounding canbe for volume computation and results to Figure 7 (plot scripts in the Appendix A.4). d = 10P = gen_skinny_cube(d)points1 = sample_points(P, random_walk = list("walk" = "CDHR"), n = 1000,seed =5)P_rounded = round_polytope(P)$Ppoints2 = sample_points(P_rounded, random_walk = list("walk" = "CDHR"),n = 1000, seed = 5)cat(P$volume, volume(P, seed = 50), volume(P, rounding = TRUE, seed = 50))102400 76324.25 99695.16 Notice that the accuracy is much better when we apply rounding as a preprocessing stepbefore volume computation. The main reason behind this, is the better mixing time of allthe random walks implemented in volesti for rounded bodies than for skinny ones.When P is a Z-polytope the rounding step is computationally too costly for some cases,especially when the order is high. The use of the centrally symmetric H-polytope in MMC,reduces signiﬁcantly the number of phases and the total running time. The following scriptshow how one might select this option by declaring as TRUE the ﬂag "hpoly" . Z = gen_rand_zonotope(30, 35, generator = "uniform", seed = 250)time1 = system.time({ vol1 = volume(Z, settings = list("hpoly" = FALSE),seed = 5) })time2 = system.time({ vol2 = volume(Z, settings = list("hpoly" = TRUE), postolos Chalkis, Vissarion Fisikopoulos P := [ − , × [ − , seed = 5) })cat(time1[3], time2[3])501.224 69.792cat(vol1, vol2)1.434612e+57 1.133094e+57 Rounding is also particularly useful for sampling from skinny polytopes. One could round askinny polytope P , sample from the rounded polytope and apply the inverse linear map toobtain a sample in P . The next R script demonstrates how useful this property can be evenfor 2D sampling and results to Figure 8 (plot scripts in the Appendix A.4). d = 2P = gen_skinny_cube(d)points1 = sample_points(P, random_walk =list("walk" = "BiW", "walk_length" = 1,"starting_point" = c(50,0), "L" = 2*sqrt(d)),n = 100, seed = 5)ret_list = round_polytope(P, seed = 5)P_rounded = ret_list$PT = ret_list$Tshift = ret_list$shiftpoints2 = (T %*% sample_points(P_rounded,random_walk = list("walk" = "BiW","walk_length" = 1),n = 100, seed = 5)) +kronecker(matrix(1, 1, 100), matrix(shift, ncol = 1)) Notice that the sample points in the left plot of Figure 8 are concentrated around the startingpoint of the random walk. Furthermore, the sampling from the rounded polytope (right plotof Figure 8) results to much better convergence to the uniform distribution.

The package provides more functions for the preprocessing of the input polytope.The function inner_ball() takes as input a convex polytope and computes the inscribedball of P as discussed in Section 3.3. The following script results to the left plot of Figure 9(plot scripts in the Appendix A.5).4 volesti : Volume Approximation and Sampling in R Figure 9: Left a H-polytope and the largest inscribed ball. Right the same H-polytope andwith blue color a random rotation.

P1 = gen_rand_hpoly(2, 6, seed = 729)points1 = sample_points(P1, random_walk = list("walk" = "BRDHR"), n = 10000,seed = 5)r = inner_ball(P1)[3]points2 = direct_sampling(body = list("type" = "hypersphere","dimension" = 2, "radius" = r), n = 10000,seed = 5)

The function rotate_polytope() can be used to rotate a convex polytope P . rotate_polytope(P = Rcpp_class, T = matrix, seed = integer) It takes as input a polytope and a matrix T of a linear map and then apply the map on P .There is also the option to set a ﬁxed seed for the linear map random generator. If it is givenonly a polytope then it generates a random linear map. It returns the rotated polytope andthe linear transformation applied on input polytope P . The following script results to theright plot of Figure 9. P1 = gen_rand_hpoly(2, 6, seed = 729)points1 = sample_points(P1, random_walk = list("walk" = "BRDHR"), n = 10000,seed = 5)P2 = rotate_polytope(P1, seed = 12496)$Ppoints2 = sample_points(P2, random_walk = list("walk" = "BRDHR"), n = 10000,seed = 5) postolos Chalkis, Vissarion Fisikopoulos

4. Applications

In this section we demonstrate volesti ’s potential to solve challenging problems. More specif-ically, we provide detailed use-cases for applications in ﬁnance, mechanical engineering, mul-tivariate integration, biology, and combinatorics.

In this subsection we present how one could employ volesti to detect ﬁnancial crises in bigstock markets. For all the examples in the sequel we use a set of 52 popular exchangetraded funds (ETFs) and the US central bank (FED) rate of return publicly available from https://stanford.edu/class/ee103/portfolio.html . The following script is used to loadthe data.

MatReturns = read.table("https://stanford.edu/class/ee103/data/returns.txt",sep = ",")MatReturns = MatReturns[-c(1, 2), ]dates = as.character(MatReturns$V1)MatReturns = as.matrix(MatReturns[ ,-c(1, 54)])MatReturns = matrix(as.numeric(MatReturns), nrow = dim(MatReturns )[1],ncol = dim(MatReturns )[2], byrow = FALSE)nassets = dim(MatReturns)[2]

For a speciﬁc set of assets in a stock market, portfolios are characterized by their returnand their risk which is the variance of the portfolios’ returns (volatility). Financial marketsexhibit 3 types of behavior (Billio, Getmansky, and Pelizzon 2012). In normal times, portfoliosare characterized by slightly positive returns and a moderate volatility, in up-market times(typically bubbles) by high returns and low volatility, and during ﬁnancial crises by stronglynegative returns and high volatility. These features motivate researchers to describe thetime-varying dependency between portfolios’ return and volatility. volesti relies on the workof Calès et al. (2018) and copula representation. A copula is an approximation of the bivariatejoint distribution of portfolios’ return/volatility. First, consider the set of portfolios in a stockmarket of d assets being the canonical simplex ∆ d − . A vector of assets’ returns R ∈ R d deﬁnesa family of hyperplanes R · x = const. For a speciﬁc return (constant) R the correspondinghyperplane intersecting ∆ d − deﬁnes the set of portfolios with return R . The portfoliosabove this hyperplane have larger return than R and those below smaller. volesti providesfast computations for the proportion of portfolios that lie in R · x ≤ R , x ∈ ∆ d − , which alsocalled cross sectional score of portfolio, given a vector of assets’ returns (Guegan et al. CRAN package

Performance-Analytics (Peterson and Carl 2020). However, they suﬀer from estimation errors as shownby Lo (2003), which prevent any performance comparison to be signiﬁcant. The cross sec-tional score of a portfolio by Pouchkarev (2005); Banerjee and Hung (2011) is computed withuniform sampling from ∆ d − and a rejection step. However, Calès et al. (2018) notice that6 volesti : Volume Approximation and Sampling in R Varsi’s algorithm, which is provided by function frustum_of_simplex() , performs robustcomputations of such volumes for d in the thousands in just a few milliseconds. Interestingly,the following R script let us know that in 03 / / . d − for approximate computation andVarsi’s algorithm for exact computation of the score. N = 500000R = MatReturns[which(dates %in% "2009-03-13"), ]R0 = 0.002tim1 = system.time({points = direct_sampling(n = N, seed = 5,body = list("type" ="canonical_simplex","dimension" = nassets))vals = R %*% pointspoints_in = 0for (i in 1:N) {if (vals[i] < R0){points_in = points_in + 1}}approximate_score = points_in / N})tim2 = system.time({ exact_score = frustum_of_simplex(R, R0) })cat(approximate_score, tim1[3], "\n", exact_score, tim2[3])0.469328 1.440.4773961 0

The volatility of portfolios is represented by a family of concentric ellipsoids, centered at theorigin, which matrix Σ ∈ R d × d is the covariance matrix of the distribution of the assets’returns. For a constant v ∈ R d , the set of points x T Σ x = v , x ∈ ∆ d − is the set ofportfolios with volatility v . The points above or below correspond to portfolios with largeror smaller volatility respectively. When M parallel hyperplanes and M concentric ellipsoidsintersect with ∆ d − they deﬁne a grid of M × M bodies. The copulas that are derived fromthe computation of all the proportions of portfolios that lie in each body in the grid is anapproximation of the joint distribution between portfolios’ return and volatility. In particular,we use compound returns which are computed on W consecutive rows of assets’ returns andfor the volatility we compute the covariance matrix of those returns. The function copula(r1 = numeric vector, r2 = numeric vector, sigma = matrix,m = integer, n = integer, seed = integer) can be used to compute such copulas. It takes as input (i) the vectors r1, r2 that denote afamily of parallel hyperplanes each; when both are given by the user then the computed copulais related to the problem of momentum eﬀect in stock markets (for more details are given by postolos Chalkis, Vissarion Fisikopoulos / / − / / I = 0 . / / − / / I = 5 . x axis is for return and y axis is for volatility. The plot in the middle showsthe mass of interest to characterize the market state.Calès et al. (2018)), (ii) the matrix sigma that denotes a family of concentric ellipsoids, (iv)the integer m that denotes the number of objects for each family and (v) the integer n for thenumber of uniformly distributed points to generate in ∆ d − . The method counts the numberof points in each body of the grid to obtain a copula.The following script results to Figure 10 by setting the starting and the stopping date for theleft and the right plot respectively (code of the additional function get_compound_returns() in Appendix A.6). We computed two copulas; one in normal times (left plot) and a second incrisis period (right plot), by using the compound return and an estimation of the covariancematrix of W = 60 consecutive days of assets’ returns. Notice that the mass of a copulaconcentrates in a speciﬁc diagonal depending on the market state. row1 = which(dates %in% "2008-12-18")row2 = which(dates %in% "2009-03-13")cR = get_compound_return(MatReturns[row1:row2, ])mass = copula(r1 = cR, sigma = cov(MatReturns[row1:row2, ]),m = 100, n = 1e+06, seed = 5)plot_ly(z = ~mass) %>% add_surface(showscale=FALSE) In particular, we can derive information by considering the mass on the two main diagonalsof such a copula as the plot in the middle in Figure 10 illustrates. We deﬁne an indicator asthe ratio between two masses (red / blue). If the indicator is ≥ compute_indicators(returns = matrix, win_len = integer, m = integer,n = integer, nwarning = integer, ncrisis = integer,seed = integer) The function compute_indicators() takes as input (i) a set of assets’ returns as a matrix,(ii) the length, win_len (or W), of the sliding window, (iii) the number of objects, m , for eachfamily of hyperplanes or ellipsoids, (iv) the number of points n to sample. The input variable8 volesti : Volume Approximation and Sampling in R Figure 11: The values of the indicators from 2007-01-04 until 2010-01-04. We mark with redthe crisis periods, with orange the warning periods and with blue the normal periods that volesti identiﬁes. nwarning implies the number of consecutive indicators larger than 1 to declare a warningand the variable ncrisis the number of consecutive indicators larger than 1 to declare acrisis. The function computes the copulas of all the sets of W consecutive days and returnsthe corresponding indicators and the states of the market during the given time period.The following R script takes as input the daily returns of all the 52 assets from 01 / / / / ≥ row1 = which(dates %in% "2007-01-04")row2 = which(dates %in% "2010-01-04")market_analysis = compute_indicators(returns = MatReturns[row1:row2, ],win_len = 60, m = 100, n = 1e+06,nwarning = 30, ncrisis = 60, seed = 5)I = market_analysis$indicatorsmarket_states = market_analysis$market_states We compare the results with the database for ﬁnancial crises in European countries proposedin (Duca, Koban, Basten, Bengtsson, Klaus, Kusmierczyk, Lang, Detken, and Peltonen 2017).The only listed crisis for this period is the sub-prime crisis (from December 2007 to June 2009).Notice that Figure 11 (plot scripts in the Appendix A.7) successfully points out 4 crisis eventsin that period (2 crisis and 2 warning periods) and detects sub-prime crisis as a W-shape crisis. postolos Chalkis, Vissarion Fisikopoulos P , astight as possible, with a second Z-polytope P red of smaller order, while vol( P red ) is given by,an easy to compute, closed formula. A good measure for the quality of the approximation isthe following ratio of ﬁtness, ρ = (cid:18) vol( P red )vol( P ) (cid:19) /d . (2)This involves a volume computation problem. In (Kopetzki et al. d > volesti is the ﬁrst software that provides eﬃcient volume estimation for Z-polytopessuch that it is possible to evaluate an over approximation of a high dimensional Z-polytope P or a Z-polytope of very high order in lower dimensions. Moreover, the function, zonotope_approximation(Z = Rcpp_class, fit_ratio = boolean,settings = List, seed = integer) provides both an over-approximation P red of a given zonotope P using PCA method and anevaluation of P red estimating the ratio of ﬁtness, ρ . The List settings can be used to setthe parameters of the CB algorithm as in Section 3.3, while the boolean fit_ratio can beused to request the computation of ρ . In the following pseudocode of PCA method the IH ( · )is the interval hull by Kühn (1998). PCA (Z-polytope P with generators’ matrix G ∈ R d × k ) X = [ G | − G ] T U SV T = SVD( X T X ) Return: G red = U · IH ( U T G )Notice that G red = U · IH ( U T G ) deﬁnes a box and generates P red and thus vol( P red ) is givenby a closed formula and its H-representation can be easily derived. The following R script,generates a random 2D zonotope and computes the over-approximation with PCA method;it results to Figure 12 (plot scripts in the Appendix A.8). Z = gen_rand_zonotope(2, 8, generator = "uniform", seed = 1729)points1 = sample_points(Z, random_walk = list("walk" = "BRDHR"), n = 10000)retList = zonotope_approximation(Z = Z, fit_ratio = TRUE, seed = 5)P = retList$Ppoints2 = sample_points(P, random_walk = list("walk" = "BRDHR"), n = 10000,seed = 5)cat(retList$fit_ratio)1.116799539 volesti : Volume Approximation and Sampling in R Figure 12: With blue color a 2D Z-polytope. With grey color the over-approximation of P computed with PCA method.The next R script displays an example in d = 30 for which exact volume computation is notpossible. In particular, computing ρ using exact volume computation would take years in anordinary PC. Inevitably, Kopetzki et al. (2017) report values of ρ with almost 100% error asthey replace, in Equation (2), the volume of P with the volume of the over-approximationcomputed by other methods (e.g. BOX method). Z = gen_rand_zonotope(20, 500, generator = "uniform", seed = 127)retList = zonotope_approximation(Z = Z, fit_ratio = TRUE, seed = 5)cat(retList$fit_ratio)2.454694

Computing the integral of a function over a convex set (i.e. convex polytope) is a hard fun-damental problem with numerous applications. The only R packages that support such com-putations are SimplicialCubature by Nolan and Genz (2016) and cubature by Narasimhan,Koller, Johnson, Hahn, Bouvier, Kiêu, and Gaure (2019). The ﬁrst computes multivariate in-tegrals over simplices and the second over hypercubes. cubature is a R wrapper of C packages cuba by Hahn (2018) and cubature by Johnson (2018).Hence, to compute an integral of a function over a convex polytope P in R one should computethe Delaunay triangulation with package geometry and then use the package SimplicialCuba-ture to sum the values of all the integrals over the simplices computed by the triangulation.On the other hand volesti can be used to approximate the value of such an integral by asimple MCMC integration method, which employs the vol( P ) and a uniform sample from P .In particular, let I = Z P f ( x ) dx. (3) postolos Chalkis, Vissarion Fisikopoulos N uniformly distributed points x , . . . , x N from P and, I ≈ vol( P ) 1 N N X i =1 f ( x i ) . (4)The following R script generates a V-polytope for d = 5 , , ,

20 and deﬁnes a function f .Then computes the exact value of I and in the sequel approximates that value employing volesti . The pattern is similar to volume computation, for d = 5 ,

10 the exact computationis faster than the approximate, for d = 15 volesti is 13 times faster and for d = 20 the exactapproach halts while volesti returns an estimation in less than a minute. for (d in seq(from = 5, to = 20, by = 5)) {P = gen_rand_vpoly(d, 2 * d, seed = 127)tim1 = system.time({triang = geometry::delaunayn(P$V)f = function(x) { sum(x^2) + (2 * x[1]^2 + x[2] + x[3]) }I1 = 0for (i in 1:dim(triang)[1]) {I1 = I1 + SimplicialCubature::adaptIntegrateSimplex(f,t(P$V[triang[i,], ]))$integral}})tim2 = system.time({num_of_points = 5000points = sample_points(P, random_walk = list("walk" = "BiW","walk_length" = 1),n = num_of_points, seed = 5)int = 0for (i in 1:num_of_points){int = int + f(points[, i])}V = volume(P, settings = list("error" = 0.05), seed = 5)I2 = (int * V) / num_of_points})cat(d, I1, I2, abs(I1 - I2) / I1, tim1[3], tim2[3], "\n")}5 0.02738404 0.02597928 0.05129854 0.41 3.09510 3.224286e-06 2.935246e-06 0.08964482 2.945 12.0115 4.504834e-11 4.574285e-11 0.01541695 471.479 33.25620 error 9.947623e-17 Inf - 64.058 Let G = ( V, E ) be an acyclic digraph with V = [ n ] := { , , . . . , n } . One might want toconsider G as a representation of the partially ordered set (poset) V : i > j if and only if2 volesti : Volume Approximation and Sampling in R Figure 13: An acyclic directed graph with 5 nodes, 4 edges and 9 linear extensions.there is a directed path from node i to node j . A permutation π of [ n ] is called a linearextension of G (or the associated poset V ) if π − ( i ) > π − ( j ) for every edge i → j ∈ E .Let P LE ( G ) be the polytope in R n deﬁned by P LE ( G ) = { x ∈ R n | ≥ x i ≥ i = 1 , , . . . , n } , and x i ≥ x j for all directed edges i → j ∈ E .It is well known (Stanley 1986) that the number of linear extensions of G equals the normalizedvolume of P LE ( G ) i.e., LE G = vol( P LE ( G )) n !The graph in Figure 13 has 9 linear extensions . This number can be estimated using volesti as in the following script. A = matrix(c(-1,0,1,0,0,0,-1,1,0,0,0,-1,0,1,0,0,0,0,-1,1,1,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,1,-1,0,0,0,0,0,-1,0,0,0,0,0,-1,0,0,0,0,0,-1,0,0,0,0,0,-1),ncol = 5, nrow = 14, byrow = TRUE)b = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1 , 1, 1, 1)P_LE = Hpolytope$new(A, b)cat(volume(P_LE, settings = list("error" = 0.2), seed = 5) * factorial(5))8.456771

As the number of nodes (i.e., the dimension of P LE ( G )) grows the problem becomes in-tractable for exact methods while volesti provides an eﬃcient alternative. There is a recent Example taken from https://inf.ethz.ch/personal/fukudak/lect/pclect/notes2016/expoly_order.pdf postolos Chalkis, Vissarion Fisikopoulos

Challenging volume computations appear in biogeography research and in particular in bio-diversity analysis of bird species Barnagaud, Kissling, Tsirogiannis, Fisikopoulos, Villeger,Sekercioglu, and Svenning (2017). This computation involves volume calculations for inter-sections of V-polytopes. Our package, volesti , provides an R class to represent such convexbodies and an option to sample from or to estimate their volume. While up to dimension 5 geometry is clearly faster than volesti in volume computation, starting in dimension 6 volesti computes faster estimates and this is expected to be the rule as the dimension grows. d = 6P1 = gen_rand_vpoly(d, 14, seed=7)P2 = gen_rand_vpoly(d, 15, seed=4)VP = VpolytopeIntersection$new(P1$V, P2$V)time1 = system.time({ exact_volume = geometry::intersectn(P1$V, P2$V)$ch$vol })time2 = system.time({ approximate_volume = volume(VP, seed = 50) })cat(exact_volume, approximate_volume, time1[3], time2[3])0.005736105 0.005579085 13.711 3.089

5. Conclusion

The R package, volesti , is a key to guarantee that our software is accessible from scientiﬁc orbusiness communities that are not familiar with programming in C++ and need a friendlyenvironment to use all these statistical and geometrical tools we provide.

Computational details

The results in this paper were obtained using R volesti volesti packages are stats methods volesti packages, Rcpp BH RcppEigen testthat volesti and for plots this paper uses, geometry hitandrun

SimplicialCubature ggplot2 plotly latex2exp rgl

CRAN at http://CRAN.R-project.org .All experiments were performed on a PC with Intel® Pentium(R) CPU G4400 @ 3.30GHz × and .4 volesti : Volume Approximation and Sampling in R Acknowledgments

The main part of the work has been done while A.C. was supported by Google Summer ofCode R -project for statisticalcomputing and the R community. References

Afshar HM, Domke J (2015). “Reﬂection, Refraction, and Hamiltonian Monte Carlo.” In

Proceedings of the 28th International Conference on Neural Information Processing Systems- Volume 2 , NIPS’15, pp. 3007–3015. MIT Press, Cambridge, MA, USA. URL https://dl.acm.org/doi/10.5555/2969442.2969575 .Althoﬀ M, Dolan JM (2014). “Online Veriﬁcation of Automated Road Vehicles Using Reach-ability Analysis.”

IEEE Transactions on Robotics , (4), 903–918. ISSN 1552-3098. doi:10.1109/TRO.2014.2312453 .Banerjee A, Hung CH (2011). “Informed Momentum Trading Versus Uninformed "Naive"Investors Strategies.” Journal of Banking & Finance , (11), 3077–3089. ISSN 0378-4266. doi:https://doi.org/10.1016/j.jbankfin.2011.04.005 .Barber CB, Dobkin DP, Huhdanpaa H (1996). “The Quickhull Algorithm for Convex Hulls.” ACM Transactions on Mathematical Software , (4), 469–483. ISSN 0098-3500. doi:10.1145/235815.235821 .Barnagaud J, Kissling W, Tsirogiannis C, Fisikopoulos V, Villeger S, Sekercioglu C, SvenningJ (2017). “Biogeographic, Environmental and Anthropogenic Determinants of Global Pat-terns in Functional and Taxonomic Turnover in Birds.” Global Ecology and Biogeography , , 1190–1200. doi:https://doi.org/10.1111/geb.12629 .Berkelaar M, Eikland K, Notebaert P (2004). “ lp_solve http://lpsolve.sourceforge.net/5.5/ .Betancourt M (2017). “A Conceptual Introduction to Hamiltonian Monte Carlo.” https://arxiv.org/pdf/1701.02434.pdf .Billio M, Getmansky M, Pelizzon L (2012). “Dynamic Risk Exposures in Hedge Funds.” Computational Statistics & Data Analysis , (11), 3517–3532. ISSN 0167-9473. doi:https://doi.org/10.1016/j.csda.2010.08.015 .Boyd S, Vandenberghe L (2004). Convex Optimization . Cambridge University Press, NewYork, NY, USA.Calès L, Chalkis A, Emiris I, Fisikopoulos V (2018). “Practical Volume Computation of Struc-tured Convex Bodies, and an Application to Modeling Portfolio Dependencies and FinancialCrises.” In B Speckmann, CD Tóth (eds.), https://summerofcode.withgoogle.com postolos Chalkis, Vissarion Fisikopoulos Geometry (SoCG 2018) , volume 99 of

LIPIcs , pp. 19:1–19:15. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany. doi:10.4230/LIPIcs.SoCG.2018.19 .Cerny M (2012). “Goﬃn’s Algorithm for Zonotopes.”

Kybernetika , (5), 890–906. URL http://eudml.org/doc/251418 .Chalkis A, Emiris IZ, Fisikopoulos V (2019). “Practical Volume Estimation by a new An-nealing Schedule for Cooling Convex Bodies.” URL http://arxiv.org/abs/1905.05494 .Chen Y, Dwivedi R, Wainwright M, Yu B (2018). “Fast MCMC Sampling Algorithms onPolytopes.” Journal of Machine Learning Research , (55), 1–86. URL http://jmlr.org/papers/v19/18-158.html .Cousins B, Vempala S (2015). “Bypassing KLS: Gaussian Cooling and an O ∗ ( n ) VolumeAlgorithm.” In The Annual ACM Symposium on Theory of Computing , pp. 539–548. doi:10.1145/2746539.2746563 .Cousins B, Vempala S (2016). “A Practical Volume Algorithm.”

Mathematical ProgrammingComputation , (2). doi:10.1007/s12532-015-0097-z .Dabbene F, Shcherbakov PS, Polyak BT (2010). “A Randomized Cutting Plane Method withProbabilistic Geometric Convergence.” SIAM Journal on Optimization , (6), 3185–3207.ISSN 1052-6234. doi:10.1137/080742506 .Dalalyan AS (2017). “Theoretical Guarantees for Approximate Sampling from a Smooth andLog-Concave Density.” Journal of the Royal Statistical Society: Series B , , 651–676. doi:10.1111/rssb.12183 . URL http://arxiv.org/pdf/1412.7392v3.pdf .Dieker AB, Vempala SS (2015). “Stochastic Billiards for Sampling from the Boundary of aConvex Set.” Mathematics of Operations Research , (4), 888–901. doi:10.1287/moor.2014.0701 .Duane S, Kennedy A, Pendleton BJ, Roweth D (1987). “Hybrid Monte Carlo.” Physics LettersB , (2), 216 – 222. ISSN 0370-2693. doi:https://doi.org/10.1016/0370-2693(87)91197-X .Duca M, Koban A, Basten M, Bengtsson E, Klaus B, Kusmierczyk P, Lang J, Detken C,Peltonen T (2017). “A new Database for Financial Crises in European Countries.” Oc-casional Paper Series 194 , European Central Bank. URL https://ideas.repec.org/p/ecb/ecbops/2017194.html .Dyer M, Frieze A (1988). “On the Complexity of Computing the Volume of a Polyhedron.”

SIAM Journal on Computing , (5), 967–974. doi:10.1137/0217060 .Dyer M, Frieze A, Kannan R (1991). “A Random Polynomial-Time Algorithm for Ap-proximating the Volume of Convex Bodies.” Journal of the ACM , (1), 1–17. URL http://doi.acm.org/10.1145/102782.102783 .Eddelbuettel D, François R (2011). “ Rcpp : Seamless R and C++

Integration.”

Journalof Statistical Software , (8), 1–18. doi:10.18637/jss.v040.i08 . URL .6 volesti : Volume Approximation and Sampling in R Eddelbuettel D, Fran¸cois R (2017). “Exposing

C++

Functions and Classes With

Rcpp

Mod-ules.”

Rcpp version 0.12.10 as of March 17, 2017 .Elekes G (1986). “A Geometric Inequality and the Complexity of Computing Volume.”

Discrete and Computational Geometry , (4), 289–292. ISSN 1432-0444. doi:10.1007/BF02187701 .Emiris I, Fisikopoulos V (2014). “Practical Polytope Volume Approximation.” ACM Trans-actions of Mathematical Software, 2018 , (4), 38:1–38:21. ISSN 0098-3500. doi:10.1145/3194656 . Conf. version: Proc. Symp. Comp. Geometry, 2014.Fisikopoulos V, Chalkis A (2019). volesti : Volume Approximation and Sampling of ConvexPolytopes in R . R package version 1.1.0, URL https://CRAN.R-project.org/package=volesti .Geman S, Geman D (1984). “Stochastic Relaxation, Gibbs Distributions, and the BayesianRestoration of Images.” IEEE Transactions on Pattern Analysis and Machine Intelligence , PAMI-6 (6), 721–741. ISSN 1939-3539. doi:10.1109/TPAMI.1984.4767596 .Genz A, Bretz F (2009).

Computation of Multivariate Normal and t Probabilities . 1st edition.Springer Publishing Company, Incorporated. ISBN 364201688X, 9783642016882.Gover E, Krikorian N (2010). “Determinants and the Volumes of Parallelotopes and Zono-topes.”

Linear Algebra and its Applications , (1), 28 – 40. ISSN 0024-3795. doi:https://doi.org/10.1016/j.laa.2010.01.031 .Guegan D, Calès L, Billio M (2011). “A Cross-Sectional Score for the Relative Perfor-mance of an Allocation.” Université Paris 1 Panthéon-Sorbonne (Post-Print and Work-ing Papers) halshs-00646070 , HAL. URL https://ideas.repec.org/p/hal/cesptp/halshs-00646070.html .Guennebaud G, Jacob B, et al. (2010).

Eigen v3 . URL http://eigen.tuxfamily.org .Hahn T (2018). Cuba - A Library for Multidimensional Numerical Integration . Max PlanckInstitute for Physics, Munich. URL .Haraldsdòttir H, Cousins B, Thiele I, Fleming R, Vempala S (2017). “CHRR: CoordinateHit-and-Run with Rounding for Uniform Sampling of Constraint-Based Models.”

Bioinfor-matics , (11), 1741–1743. doi:10.1093/bioinformatics/btx052 .Hastings WK (1970). “Monte Carlo Sampling Methods Using Markov Chains and TheirApplications.” Biometrika , (1), 97–109. ISSN 00063444. URL .Herrmann H, Dyson B, Vass L, Johnson G, Schwartz J (2019). “Flux Sampling is a PowerfulTool to Study Metabolism Under Changing Environmental Conditions.” Systems Biologyand Applications , (32). doi:10.1038/s41540-019-0109-0 .Iyengar S (1988). “Evaluation of Normal Probabilities of Symmetric Regions.” SIAM Journalon Scientiﬁc and Statistical Computing , (3), 418–423. doi:https://doi.org/10.1137/0909028 . postolos Chalkis, Vissarion Fisikopoulos SSRNScholarly Paper ID 244153 , Social Science Research Network, Rochester, NY. URL https://papers.ssrn.com/abstract=244153 .Johnson SG (2018).

Cubature - Multi-Dimensional Adaptive Integration (Cubature) in C .Massachusetts Institute of Technology. URL https://github.com/stevengj/cubature .Kaufman DE, Smith RL (1998). “Direction Choice for Accelerated Convergence in Hit-and-Run Sampling.” Operations Research , (1), 84–95. doi:10.1287/opre.46.1.84 .Kopetzki A, Schürmann B, Althoﬀ M (2017). “Methods for Order Reduction of Zonotopes.”In IEEE Conference on Decision and Control , pp. 5626–5633. doi:10.1109/CDC.2017.8264508 .Kühn W (1998). “Rigorously Computed Orbits of Dynamical Systems Without the WrappingEﬀect.”

Computing , , 47–67. doi:https://doi.org/10.1007/BF02684450 .Laddha A, Lee YT, Vempala S (2019). “Strong Self-Concordance and Sampling.” https://arxiv.org/abs/1911.05656 .Lee Y, Vempala S (2017). “Eldan’s Stochastic Localization and the KLS Hyperplane Con-jecture: An Improved Lower Bound for Expansion.” In ,pp. 998–1007. doi:10.1109/FOCS.2017.96 .Lee Y, Vempala S (2018). “Convergence Rate of Riemannian Hamiltonian Monte Carlo andFaster Polytope Volume Computation.” In Proceedings of the 50th Annual ACM SIGACTSymposium on Theory of Computing , STOC 2018, pp. 1115–1121. ISBN 978-1-4503-5559-9. doi:10.1145/3188745.3188774 .Lee YT, Shen R, Tian K (2020). “Logsmooth Gradient Concentration and Tighter Runtimesfor Metropolized Hamiltonian Monte Carlo.” .Lee YT, Song Z, Vempala SS (2018). “Algorithmic Theory of ODEs and Sampling fromWell-conditioned Logconcave Densities.” https://arxiv.org/abs/1812.06243 .Lo A (2003). “The Statistics of Sharpe Ratios.”

Financial Analysts Journal , . doi:10.2469/faj.v58.n4.2453 .Lovász L, Kannan R, Simonovits M (1997). “Random Walks and an O ∗ ( n ) Volume Algorithmfor Convex Bodies.” Random Structures and Algorithms , , 1–50. doi:10.1002/(SICI)1098-2418(199708)11:1<1::AID-RSA1>3.0.CO;2-X .Lovász L, Vempala S (2006). “Hit-and-Run from a Corner.” SIAM Journal on Computing , (4), 985–1005. ISSN 0097-5397. doi:10.1137/S009753970544727X .Lovász L, Vempala S (2006). “Simulated Annealing in Convex Bodies and an O ∗ ( n ) VolumeAlgorithm.” Journal of Computer and System Sciences , , 392–417. doi:https://doi.org/10.1016/j.jcss.2005.08.004 .Mangoubi O, Smith A (2017). “Rapid Mixing of Hamiltonian Monte Carlo on Strongly Log-Concave Distributions.” .8 volesti : Volume Approximation and Sampling in R Mangoubi O, Vishnoi NK (2019). “Faster Polytope Rounding, Sampling, and Volume Compu-tation via a Sub-Linear Ball Walk.” In , pp. 1338–1357. doi:doi:10.1109/FOCS.2019.00082 .Maurer J, Watanabe S (2017). “Boost Random Number Library.” Software. URL .Megchelenbrink W, Huynen M, Marchiori E (2014). “optGpSampler: An Improved Toolfor Uniformly Sampling the Solution-Space of Genome-Scale Metabolic Networks.”

PLOSONE , (2), 1–8. doi:10.1371/journal.pone.0086587 .Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953). “Equation ofState Calculations by Fast Computing Machines.” The Journal of Chemical Physics , (6),1087–1092. doi:10.1063/1.1699114 .Murty K (2009). “Ball Centers of Special Polytopes.” Dept Industrial & Operations Engineer-ing, U. Michigan . URL .Narasimhan B, Koller M, Johnson SG, Hahn T, Bouvier A, Kiêu K, Gaure S (2019). cubature :Adaptive Multivariate Integration over Hypercubes . R package version 2.0.4, URL https://CRAN.R-project.org/package=cubature .Neal RM (2011). MCMC Using Hamiltonian Dynamics , chapter 5. CRC Press. doi:10.1201/b10905-7 .Nikolic B (2015).

BNMin1 : A Minimisation Library . C++ package version 1.11, URL .Nolan JP, Genz A (2016).

SimplicialCubature : Integration of Functions overSimplices . R package version 1.2, URL https://CRAN.R-project.org/package=SimplicialCubature .Pereira A, Althoﬀ M (2015). “Safety Control of Robots Under Computed Torque ControlUsing Reachable Sets.” In , pp. 331–338. ISSN 1050-4729. doi:10.1109/ICRA.2015.7139020 .Peterson BG, Carl P (2020). PerformanceAnalytics : Econometric Tools for Performance andRisk Analysis . R package version 2.0.4, URL https://CRAN.R-project.org/package=PerformanceAnalytics .Polyak B, Gryazina E (2014). “Billiard Walk - a new Sampling Algorithm for Control andOptimization.” IFAC Proceedings Volumes , (3), 6123 – 6128. ISSN 1474-6670. doi:https://doi.org/10.3182/20140824-6-ZA-1003.02312 . 19th IFAC World Congress.Pouchkarev I (2005). Performance Evaluation of Constrained Portfolios . Ph.D. thesis, Eras-mus Research Institute of Management, The Netherlands.Rubinstein R, Melamed B (1998).

Modern Simulation and Modeling . Wiley, New York.Saa PA, Nielsen LK (2016). “ll-ACHRB: A Scalable Algorithm for Sampling the FeasibleSolution Space of Metabolic Networks.”

Bioinformatics , (15), 2330–2337. ISSN 1367-4803. doi:10.1093/bioinformatics/btw132 . postolos Chalkis, Vissarion Fisikopoulos The Journal of biological Chemistry ,

284 9 , 5457–61. doi:10.1074/jbc.R800048200 .Schneebeli C (2015). “Randomized Algorithms to Sample from the Boundary of ConvexPolytopes.” Department of Mathematics, ETH Zurich. Bachelor’s Thesis.Sharpe WF (1966). “Mutual Fund Performance.”

The Journal of Business , (1), 119–138.ISSN 0021-9398. URL .Shen R, Lee YT (2019). “The Randomized Midpoint Method for Log-Concave Sampling.” https://arxiv.org/abs/1909.05503 .Smith RL (1984). “Eﬃcient Monte Carlo Procedures for Generating Points Uniformly Dis-tributed Over Bounded Regions.” Operations Research , (6), 1296–1308. ISSN 0030364X,15265463. URL .Somerville P (1998). “Numerical Computation of Multivariate Normal and Multivariate-tProbabilities over Convex Regions.” Journal of Computational and Graphical Statistics , (4), 529–544. doi:10.1080/10618600.1998.10474793 .Stanley RP (1986). “Two Poset Polytopes.” Discrete & Computational Geometry , (1), 9–23.ISSN 1432-0444. doi:10.1007/BF02187680 .Talvitie T, Kangas K, Niinimäki T, Koivisto M (2018). “Counting Linear Extensions inPractice: MCMC Versus Exponential Monte Carlo.” In AAAI Conference on ArtiﬁcialIntelligence . URL https://aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16957 .Todd MJ, Yildirim EA (2007). “On Khachiyan’s Algorithm for the Computation of Minimum-Volume Enclosing Ellipsoids.”

Discrete Applied Mathematics , (13), 1731 – 1744. ISSN0166-218X. doi:https://doi.org/10.1016/j.dam.2007.02.013 .Treynor JL (2015). “How to Rate Management of Investment Funds.” In Treynor on In-stitutional Investing , pp. 69–87. John Wiley & Sons, Ltd. ISBN 978-1-119-19667-9. doi:10.1002/9781119196679.ch10 .van Valkenhoef G, Tervonen T (2019). hitandrun : "Hit and Run" and "Shake and Bake" forSampling Uniformly from Convex Shapes . R package version 0.5-5, URL https://CRAN.R-project.org/package=hitandrun .Varsi G (1973). “The Multidimensional Content of the Frustum of the Simplex.” Paciﬁc Jour-nal of Mathematics , (1), 303–314. URL https://projecteuclid.org:443/euclid.pjm/1102946623 .Venzke A, Molzahn D, Chatzivasileiadis S (2019). “Eﬃcient Creation of Datasets for Data-Driven Power System Applications.” URL https://arxiv.org/abs/1910.01794 .Ziegler G (1995). Lectures on Polytopes . Springer-Verlag, New York. URL https://doi.org/10.1007/978-1-4613-8431-1 .0 volesti : Volume Approximation and Sampling in R A. R scripts for plotting

A.1. Random walks on hypercubes

The following script was used to generate Figure 4. The generated sample points are storedin ﬁles parameterized by walk length i . For example, BaW_2.png stores the points for Ballwalk with walk length 2. open3d()plot3d(x=points1[1,], y=points1[2,], z=points1[3,], xlim = c(-1,1),ylim = c(-1,1), zlim = c(-1,1), col = "cyan", size=2, main = "",sub = "", ann = FALSE, axes = FALSE, xlab="", ylab="", zlab="")box3d()rgl.snapshot( filename=paste0("BaW_",i,".png"), fmt = "png", top = TRUE )rgl.close()open3d()plot3d(x=points2[1,], y=points2[2,], z=points2[3,], xlim = c(-1,1),ylim = c(-1,1), zlim = c(-1,1), col = "black", size=2, main = "",sub = "", ann = FALSE, axes = FALSE, xlab="", ylab="", zlab="")box3d()rgl.snapshot( filename=paste0("CDHR_",i,".png"), fmt = "png", top = TRUE )rgl.close()open3d()plot3d(x=points3[1,], y=points3[2,], z=points3[3,], xlim = c(-1,1),ylim = c(-1,1), zlim = c(-1,1), col = "red", size=2, main = "",sub = "", ann = FALSE, axes = FALSE, xlab="", ylab="", zlab="")box3d()rgl.snapshot( filename=paste0("RDHR_",i,".png"), fmt = "png", top = TRUE )rgl.close()open3d()plot3d(x=points4[1,], y=points4[2,], z=points4[3,], xlim = c(-1,1),ylim = c(-1,1), zlim = c(-1,1), col = "blue", size=2, main = "",sub = "", ann = FALSE, axes = FALSE, xlab="", ylab="", zlab="")box3d()rgl.snapshot( filename=paste0("BiW_",i,".png"), fmt = "png", top = TRUE )rgl.close()

A.2. Direct sampling

The following script can be used to generate Figure 5. postolos Chalkis, Vissarion Fisikopoulos y = c(points1[2,],points1[2,]),z = c(points1[3,],points1[3,]))plot_ly(p.data, x = ~x, y = ~y, z = ~z,marker=list(size=1),colors = c(' A.3. Intersections of polytopes

The following script can be used to generate Figure 6.

A.4. Rounding

The following scripts can be used to generate Figures 7, 8. volesti : Volume Approximation and Sampling in R A.5. Rotation and inscribed ball

The following script can be used to generate Figure 9.

A.6. Copola characterization and compound returns postolos Chalkis, Vissarion Fisikopoulos R function is used in the script of Section 4.1 which generates Figure 10. get_compound_return <- function(MatReturns) {nassets = dim(MatReturns)[2]compRet = rep(1,nassets)for (j in 1:nassets) {for (k in 1:dim(MatReturns)[1]) {compRet[j] = compRet[j] * (1 + MatReturns[k, j])}compRet[j] = compRet[j] - 1}return(compRet)} A.7. Values of indicators

The following script can be used to generate Figure 11. n = length(I)crisis_per = rep(NaN, n)crisis_per[which(market_states %in% "crisis")] =I[which(market_states %in% "crisis")]normal_per = rep(NaN, n)normal_per[which(market_states %in% "normal")] =I[which(market_states %in% "normal")]warning_per = rep(NaN, n)warning_per[which(market_states %in% "warning")] =I[which(market_states %in% "warning")]indi.data <- data.frame(x=rep(1:707,3), y=c(normal_per, warning_per,crisis_per), period = rep(market_states,3))ggplot(indi.data, aes(x=x, y=y))+ geom_line(aes(color=period)) +geom_point(aes(color=period)) + labs(x ="Dates", y="Indicator") +scale_y_continuous(breaks = 0:12)+scale_x_continuous(breaks=seq(from=1, to =707, by=50),labels=c(dates[seq(from=1, to =707, by=50)])) +scale_color_manual(values=c("red", "blue", "orange")) +theme(legend.position="top",text = element_text(size=20),axis.text.x = element_text(size=10), axis.text.y =element_text(size=20))

A.8. Zonotope approximation