Using Householder Matrices to Establish Mixing Test Critical Values
Aaron Carl [email protected]
Department of Mathematics, University of Central Florida, 4000 Central Florida Blvd, P.O. Box 161364, Orlando, FL 32816-1364, (407) 823-0538, (407) 823-6253 (fax)
Abstract:
A measure-preserving dynamical system can be approximated by a Markov shift with a bistochastic matrix. This leads to using empirical stochastic matrices to measure and estimate properties of stirring protocols. Specifically, the second largest eigenvalue can be used to statistically decide if a stirring protocol is weak-mixing, ergodic, or nonergodic. Such hypothesis tests require appropriate probability distributions. In this paper, we propose using Monte Carlo empirical probability distributions from unistochastic matrices to establish critical values. These unistochastic matrices arise from randomly constructed Householder matrices.
AMS 2000 subject classifications:
Primary 37A25, 62P30; secondary 37A05.
Keywords and phrases: weak-mixing, measure-preserving, stirring protocol.
imsart-generic ver. 2011/11/15 file: UsingHouseholder.tex date: November 2, 2018. arXiv [math.DS]
1. Introduction
If a dynamical system has a probability measure and the system is measure-preserving, then partitioning the domain into $n$ states of equal measure leads to a Markov shift whose measure is defined by a bistochastic matrix and the length $n$ row vector
\[ \left( \tfrac{1}{n} \;\; \ldots \;\; \tfrac{1}{n} \right). \tag{1.1} \]
This partition approximation is called Ulam's method [21]. After partitioning the space, data point movement from one iteration of the function provides a stochastic matrix that approximates the bistochastic matrix from Ulam's method. Hence a Markov shift with an empirical stochastic matrix and the $1/n$ row vector approximates the dynamical system (see [17, Chapter 1, Chapter 9] for the procedure and convergence rate).

We may model a stirring protocol's effect on a compression-resistant fluid with a measure-preserving dynamical system. In this paper we are interested in discrete iterations of a stirring protocol where the fluid at the beginning is the same fluid at the end. We 'look' at the fluid before and after stirring, but not during.

Properties of an empirical Markov shift can measure and evaluate properties of a measure-preserving dynamical system [8, 10, 13]. The second largest eigenvalue of an empirical stochastic matrix arising from Ulam's method may be used to statistically decide if a measure-preserving dynamical system is weak-mixing, ergodic, or nonergodic [6, 7, 9, 14, 17]. To statistically test if the dynamical system is ergodic, we need to have some knowledge of
\[ P(|\hat{\lambda}_2 - 1| > k : \lambda_2 = 1); \tag{1.2} \]
to statistically test if the dynamical system is weak-mixing, we need to have some knowledge of
\[ P(|\hat{\lambda}_2| > k : |\lambda_2| = 1). \tag{1.3} \]
We use $\lambda_2$ to denote the second largest eigenvalue of a bistochastic matrix arising from Ulam's method, and $\hat{\lambda}_2$ denotes the second largest eigenvalue of a corresponding empirical stochastic matrix (bistochastic and unistochastic matrices will be defined shortly).
The utility of $\hat{\lambda}_2$ as a test statistic arises directly from the relationship between stochastic matrix eigenvalues and the ergodic and mixing properties of Markov shifts. Since the closed unit disk contains all eigenvalues of stochastic matrices, there are no reasonable probability distributions of $\hat{\lambda}_2$ with 1 as the mean or median of either $\hat{\lambda}_2$ or $|\hat{\lambda}_2|$. So for hypothesis testing, we should use a probability distribution that has significant mass near $\hat{\lambda}_2 = 1$ or $|\hat{\lambda}_2| = 1$.

In this paper, we show that it is reasonable to approximate the conditional probability distributions with Monte Carlo probability distributions when the equal-measure partition sets are small. These Monte Carlo probability distributions are constructed using randomly generated Householder matrices.

Stirring protocols of compression resistant fluids, such as chocolate and water, provide examples of nearly measure-preserving dynamical systems. The need for confidence in the mixing of food items and in the mixing of pharmaceuticals highlights the utility of such probability distributions.

We propose the following construction: use randomly generated, nonzero, independent, identically distributed real numbers to generate Householder matrices; take products of permutation matrices with Householder matrices; then square the magnitude of the products' entries to get unistochastic matrices. From these unistochastic matrices, construct a Monte Carlo approximation of a desired probability distribution. From the Monte Carlo probability distribution, establish the critical value for rejecting the null hypothesis.
The primary focus of this paper is to establish a method for determining hypothesis test critical values. Deciding which specific probability distribution to use in a hypothesis test depends on properties of the dynamical system; we will only show that the presented Monte Carlo methods are reasonable and leave probability distribution selection for the future.

There are several ways to use Monte Carlo methods to generate bistochastic matrices; unfortunately, many techniques lead to empirical probability distributions where the central tendency of $\hat{\lambda}_2$ is close to zero [17, Chapter 12]. Such distributions provide little utility for a weak-mixing, ergodic, or nonergodic hypothesis test. The methods presented here lead to Householder matrices that, in a Frobenius norm sense, are likely to be close to the identity matrix. Squaring the magnitude of entries from these unitary matrices gives unistochastic matrices. If we want a unistochastic matrix close to a particular permutation matrix, we may multiply the Householder matrix by the desired permutation matrix. The main advantage of this method is that it provides probability distributions based on observed unistochastic matrices.
2. Bistochastic Matrices

Definition 2.1. An $n \times n$ bistochastic matrix is a stochastic matrix whose transpose is also a stochastic matrix.

By the Birkhoff-von Neumann theorem, the set of $n \times n$ bistochastic matrices forms a convex set with permutation matrices as extreme points. We refer to this set as Birkhoff's polytope [2, 3]. Bistochastic matrices are also referred to as doubly stochastic.

Definition 2.2. An $n \times n$ bistochastic matrix is called unistochastic if each entry is equal to the squared magnitude of the corresponding entry of some unitary matrix.

The set of $n \times n$ unistochastic matrices forms a proper subset of Birkhoff's polytope [2, page 307, section 1]. Since the set of unistochastic matrices is a proper subset, the proposed method should only be used when the Ulam's method bistochastic matrix is approximately unistochastic.

Definition 2.3. An $n \times n$ Householder matrix is of the form
\[ H = I - 2\vec{v}\vec{v}^{*} \tag{2.1} \]
where $\vec{v}$ is a unit vector.

Every Householder matrix is a unitary matrix [11, Chapter 5]. Since permutation matrices are unitary and the set of $n \times n$ unitary matrices is closed under multiplication, taking the squared magnitude of entries from a Householder matrix-permutation matrix product results in a unistochastic matrix.
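The construction in Definition 2.3 can be checked numerically. The following is a minimal sketch in plain Python, restricted to real vectors; the helper names `householder`, `square_entries`, and `is_bistochastic` and the sample vector are illustrative assumptions, not part of the original paper.

```python
import math

def householder(v):
    """Return H = I - 2 w w^T, where w is the real vector v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    w = [x / norm for x in v]
    n = len(w)
    return [[(1.0 if i == j else 0.0) - 2.0 * w[i] * w[j] for j in range(n)]
            for i in range(n)]

def square_entries(a):
    """Square each entry of a real matrix (the squared magnitude in the real case)."""
    return [[x * x for x in row] for row in a]

def is_bistochastic(m, tol=1e-12):
    """Check for nonnegative entries with every row and column summing to one."""
    n = len(m)
    rows_ok = all(abs(sum(m[i]) - 1.0) < tol for i in range(n))
    cols_ok = all(abs(sum(m[i][j] for i in range(n)) - 1.0) < tol for j in range(n))
    nonneg = all(x >= 0.0 for row in m for x in row)
    return rows_ok and cols_ok and nonneg

# Hypothetical example vector; it is normalized to (1/3, 2/3, 2/3) inside householder().
H = householder([1.0, 2.0, 2.0])
M = square_entries(H)
print(is_bistochastic(M))
```

Because the rows and columns of an orthogonal matrix have unit length, the squared entries automatically sum to one along each row and column, which is what the check confirms.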
3. Modeling Dynamical Systems
Consider running a stirring protocol on a compression resistant fluid. Let's model this with a measure-preserving dynamical system $(D, \mathcal{B}, \mu, f)$, where
1. $D$ represents the compression resistant fluid,
2. $\mathcal{B}$ is the Borel $\sigma$-algebra,
3. $\mu$ is rescaled Lebesgue measure so that $\mu(D) = 1$,
4. $f : D \to D$ models fluid movement during stirring.

The Monte Carlo method we will outline uses $n \times n$ unistochastic matrices arising from Householder matrix-permutation matrix products. These unistochastic matrices are close to the permutation matrices when $n$ is large. It is reasonable to use the described Monte Carlo distribution for hypothesis testing when the dynamical system has the following property: for any $A, B \in \mathcal{B}$ where $P(A \cap B) = 0$, if $f$ is perturbed so that
\[ P(f(x) \in A \mid x \in B) \text{ increases (decreases)}, \tag{3.1} \]
then for all Borel sets $C \subseteq B^{c}$,
\[ P(f(x) \in A \mid x \in C) \text{ decreases (increases) proportionally.} \tag{3.2} \]
The $1/n$ row vector and unistochastic matrices arising from Householder matrix-permutation matrix products provide Markov shifts that reflect Ulam's method with an equal measure partition applied to such dynamical systems. This is not saying that all such dynamical systems lead to unistochastic matrices, but that the Householder constructed unistochastic matrices reflect these properties.

Let $\vec{v}$ be a real unit vector and $H$ the corresponding Householder matrix,
\[ \vec{v} = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}, \qquad H = I - 2\vec{v}\vec{v}^{t}. \tag{3.3} \]
Then squaring the entries of $H$ gives a unistochastic matrix $M$,
\[ m_{ij} = \begin{cases} (1 - 2v_i^2)^2 & \text{if } i = j, \\ 4 v_i^2 v_j^2 & \text{if } i \neq j. \end{cases} \tag{3.4} \]
If $D_1, D_2, \ldots, D_n$ are our equal measure partition sets, and $M$ arose from Ulam's method, then the entries of $M$ provide conditional probabilities,
\[ m_{ij} = P(f(x) \in D_j \mid x \in D_i). \tag{3.5} \]
Increasing (decreasing) $v_i$ leads to nearly proportional decreases (increases) in
\[ m_{ij} = P(f(x) \in D_j \mid x \in D_i) \text{ when } i \neq j. \tag{3.6} \]
Because of these observations, we propose using unistochastic Ulam matrices arising from squaring the entries of real Householder matrix-permutation matrix products to model weak-mixing stirring protocols of such dynamical systems.
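As a small worked instance of the entry formula for a real Householder matrix, take the illustrative unit vector $\vec{v} = (2/3, 2/3, 1/3)^{t}$. This vector is chosen here only for convenient arithmetic and does not appear in the original.

```latex
\[
\vec{v} = \begin{pmatrix} 2/3 \\ 2/3 \\ 1/3 \end{pmatrix}, \qquad
H = I - 2\vec{v}\vec{v}^{t} = \frac{1}{9}
\begin{pmatrix} 1 & -8 & -4 \\ -8 & 1 & -4 \\ -4 & -4 & 7 \end{pmatrix}, \qquad
M = \frac{1}{81}
\begin{pmatrix} 1 & 64 & 16 \\ 64 & 1 & 16 \\ 16 & 16 & 49 \end{pmatrix}.
\]
```

Every row and column of $M$ sums to one, and, for example, $m_{12} = 4 v_1^2 v_2^2 = 4 \cdot \tfrac{4}{9} \cdot \tfrac{4}{9} = \tfrac{64}{81}$ while $m_{33} = (1 - 2 v_3^2)^2 = \tfrac{49}{81}$.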
4. Mixing Hypothesis Test Procedure
In this section we will discuss and outline our procedure for testing a compression-resistant fluid stirring protocol.

There are many techniques to measure a stirring protocol's ability to mix, such as decay of correlations [5], Fourier analysis, Artin braid patterns [1, 20], chaotic advection [18, 19], and other topological methods [4, 12]. Unfortunately, it is typical for different protocols to lend themselves to different analytical methods. Thus comparing mixing quality between mechanically dissimilar stirring protocols is difficult. The primary advantage of the Ulam method approximation is that it can be used to evaluate any incompressible fluid stirring pattern. This allows one to compare and evaluate the mixing of stirring protocols by comparing and evaluating eigenvalues. The main disadvantage is that the method is statistical and does not prove the results. Another significant advantage of our method is that it requires only one iteration of stirring, in contrast to other techniques that call for iterated experiments.

Since our method only approximates the dynamical system, we make no inferences regarding strong-mixing when we conclude that the stirring protocol is weak-mixing. If the protocol is not ergodic, then it is not weak-mixing. If the protocol is not weak-mixing, then it is not strong-mixing.

Our test hypotheses are
1. $H_o$: $(D, \mathcal{B}, \mu, f)$ is not ergodic (and hence not weak-mixing).
2. $H_{a1}$: $(D, \mathcal{B}, \mu, f)$ is ergodic but not weak-mixing.
3. $H_{a2}$: $(D, \mathcal{B}, \mu, f)$ is weak-mixing (and hence ergodic).

We partition the fluid into connected, equal volume regions, and use these partition sets to generate a new $\sigma$-algebra contained in $\mathcal{B}$. If our data strongly indicate that the stirring protocol is weak-mixing or ergodic over the generated $\sigma$-algebra, we will conclude the same about the original dynamical system. If the stirring protocol is nonmixing or nonergodic over the generated $\sigma$-algebra, then the original dynamical system is nonmixing or nonergodic.
The procedure evaluates stirring over a smaller $\sigma$-algebra, thus the test is inherently more reliable for detecting if a protocol is nonmixing or nonergodic.

The null hypothesis is that the stirring protocol is nonergodic. It is better to reject a protocol that mixes well than to produce poorly mixed product. With this null hypothesis, the repercussions of a testing error are as follows:
1. Type I Error (rejecting a true null hypothesis): produce a product that is insufficiently mixed
2. Type II Error (failing to reject a false null hypothesis): discard a desirable stirring protocol for a different stirring method

The stirring protocol's purpose determines the tolerable risks of error and the number of partition states. If poorly mixed fluid could result in minor consequences or mixing on a small scale is inapt, then the number of partition regions may be relatively small. If poorly mixed fluid could result in severe consequences, then the number of partition regions must be large and partition volume small. For example, poorly mixed batter from a kitchen could result in unpalatable food; poorly mixed pharmaceuticals with a low LD50 could lead to overdose and death. A mixing test for a kitchen could use a relatively coarse partition, while a pharmaceutical company would use a fine partition.

If we know an upper bound for the stirring protocol's entropy, call it $h$, then Froyland's entropy estimate and expected values show that the number of states should be greater than $e^{h}$ [8].

Data point movement from one iteration of stirring leads to our empirical stochastic matrix, $\hat{P}$. The fraction of points that start in region $i$ and end in region $j$ gives us $\hat{p}_{ij}$. We model the entries of $\hat{P}$ as nonindependent binomial random variables, whose probabilities come from the Ulam stochastic matrix. Our test statistic is $\hat{\lambda}_2 = \lambda_2(\hat{P})$. Some dynamical systems have measure zero sets with atypical properties.
In an attempt to avoid such difficulties, we randomly select data points rather than select points from a grid.

If the data points are independent, uniform, and randomly distributed within each region, then the empirical matrix will converge to a bistochastic matrix in a Frobenius norm sense (the proof of this follows from extending a standard Monte Carlo argument [16]). We approximate the stirring protocol with a one-sided Markov shift, $\left( \left( \tfrac{1}{n}, \ldots, \tfrac{1}{n} \right), \hat{P} \right)$. The pair will not define a Markov shift if
\[ \left( \tfrac{1}{n}, \ldots, \tfrac{1}{n} \right) \hat{P} \neq \left( \tfrac{1}{n}, \ldots, \tfrac{1}{n} \right). \tag{4.1} \]
The $1/n$ row vector is a stationary distribution for any bistochastic matrix.

Since Ulam's method partitions our fluid into $n$ equal volume regions, it is reasonable to use the $1/n$ vector as the stationary distribution. If we do not use equal volume partitions, the stationary distribution will be the probability vector corresponding to the rescaled volume of each region. Many of the results regarding convergence, expected values, convergence rates, etc. depend on equal measure partitions, so we should use equal volume regions if appropriate [17].

Mixing Hypothesis Test Procedure:
1. Set the type I error significance levels for both alternative hypotheses, $\alpha_1$ and $\alpha_2$.
2. Set $n$ to be the number of partition regions.
3. Decide which conditional probability distribution(s) for $|\lambda_2(P)| = 1$ and $\lambda_2(P) = 1$ to use.
4. Establish the critical values for $H_{a1}$ and $H_{a2}$, $c_1$ and $c_2$. The purpose of this paper is to propose using Householder matrix-permutation matrix products to estimate $c_1$ and $c_2$.
5. Partition the fluid into $n$ connected equal volume regions, $D_1, D_2, \ldots, D_n$.
6. Randomly select data points in each partition region. These points should be independent and uniformly distributed.
7. Run the stirring protocol one time.
8. Use data point movement between regions to construct an empirical stochastic matrix, $\hat{P}$.
9. Determine the hypothesis test result. The test statistic is $\lambda_2(\hat{P})$; compare $|\lambda_2(\hat{P}) - 1|$ to $c_1$; compare $|\lambda_2(\hat{P})|$ to $c_2$.
10. Use Froyland's entropy estimate to estimate the dynamical system's entropy,
\[ -\frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{n} \hat{p}_{ij} \log \hat{p}_{ij} \tag{4.2} \]
(we define $0 \log 0$ to equal 0).
11. If the null hypothesis is rejected in favor of weak-mixing, let the rate at which
\[ \binom{N}{n-1} \left( \lambda_2(\hat{P}) \right)^{N-n+1} \to 0 \text{ as } N \to \infty \tag{4.3} \]
be our estimate of the rate of mixing.
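Steps 8 and 10 of the procedure can be sketched in a few lines of plain Python. This is a hedged illustration only: the function names `empirical_matrix` and `entropy_estimate`, the zero-based region labels, and the toy data are assumptions made for the example, and the entropy function simply evaluates $-\frac{1}{n}\sum_{i}\sum_{j} \hat{p}_{ij} \log \hat{p}_{ij}$ with $0 \log 0$ taken as 0.

```python
import math

def empirical_matrix(starts, ends, n):
    """Row-normalized counts: entry (i, j) is the fraction of points that
    start in region i and end in region j (regions are labeled 0..n-1)."""
    counts = [[0] * n for _ in range(n)]
    for i, j in zip(starts, ends):
        counts[i][j] += 1
    p_hat = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("every region needs at least one data point")
        p_hat.append([c / total for c in row])
    return p_hat

def entropy_estimate(p_hat):
    """Evaluate -(1/n) * sum_ij p_ij log p_ij, skipping zero entries (0 log 0 = 0)."""
    n = len(p_hat)
    total = sum(p * math.log(p) for row in p_hat for p in row if p > 0.0)
    return -total / n

# Toy data (hypothetical): four points, two regions, one stirring iteration.
starts = [0, 0, 1, 1]   # region of each point before stirring
ends = [0, 1, 1, 0]     # region of each point after stirring
P_hat = empirical_matrix(starts, ends, 2)
print(P_hat, entropy_estimate(P_hat))
```

In this toy case each region sends half of its points to each region, so every entry of the empirical matrix is 1/2 and the estimate equals $\log 2$; a pure permutation of regions would give an estimate of 0.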
5. Constructing the Monte Carlo Matrices
Let
\[ n \in \{2, 3, 4, 5, 6, \ldots\} \tag{5.1} \]
be the number of equal measure states that we partition the measure-preserving dynamical system into while using Ulam's method (see [17, Chapter 1] for the procedure). Let
\[ \{u_1, u_2, \ldots, u_n\} \tag{5.2} \]
be real independent, identically distributed random variables such that $u_i \neq 0$ almost surely, and
\[ E\left( \frac{1}{u_i^{4}} \right),\; E(u_i^{4}),\; E\left( \frac{1}{u_i^{8}} \right),\; E(u_i^{8}) < \infty. \tag{5.3} \]
Let
\[ \vec{u} = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}, \qquad \vec{v} = \frac{\vec{u}}{|\vec{u}|}. \tag{5.4} \]
We may use a unit vector to construct a Householder matrix. Let $H = (h_{ij})$ be the Householder matrix corresponding to $\vec{v}$,
\[ H = I - 2\vec{v}\vec{v}^{T}. \tag{5.5} \]
The entries of $H$ are
\[ h_{ij} = \begin{cases} 1 - \dfrac{2 u_i^2}{u_1^2 + u_2^2 + \ldots + u_n^2} & \text{if } i = j, \\[2ex] -\dfrac{2 u_i u_j}{u_1^2 + u_2^2 + \ldots + u_n^2} & \text{if } i \neq j. \end{cases} \tag{5.6} \]
Let $Q$ be a permutation matrix that we want our random unistochastic matrix to be proximal to. Set $U = QH$. Now, let $M = (m_{ij})$ be the matrix defined by
\[ m_{ij} = |u_{ij}|^2, \tag{5.7} \]
where $u_{ij}$ denotes the $(i, j)$ entry of $U$. Since $H$ is a Householder matrix and $Q$ is a permutation matrix, $U$ is a unitary matrix. It follows that $M$ is unistochastic.

Notice that when $Q = I$,
\[ m_{ij} = \begin{cases} \left( 1 - \dfrac{2 u_i^2}{u_1^2 + u_2^2 + \ldots + u_n^2} \right)^{2} & \text{if } i = j, \\[2ex] \dfrac{4 u_i^2 u_j^2}{(u_1^2 + u_2^2 + \ldots + u_n^2)^2} & \text{if } i \neq j. \end{cases} \tag{5.8} \]
Since $u_i \neq 0$ almost surely for all $i$, all entries of $M$ are positive almost surely; by the Perron-Frobenius theorem, all but one of $M$'s eigenvalues are of magnitude strictly less than one [15, Chapter 8]. It follows that any Markov shift defined by $M$ with stationary distribution
\[ \left( \tfrac{1}{n} \;\; \ldots \;\; \tfrac{1}{n} \right) \tag{5.9} \]
will be strong-mixing (for Markov shifts, weak-mixing is equivalent to strong-mixing). Our hypothesis test for weak-mixing (ergodic) requires a probability distribution over $[0, 1]$ (the unit disk) with significant mass near 1. In the next two sections, we will see that the expected values of $M$'s eigenvalues converge to one as $n$ goes to infinity.

How does our dynamical system relate to $n$? Generally speaking, finer partitions are more apt to detect nonmixing (nonergodicity). If we are confident in mixing, we will use a coarse partition to reduce effort; if our confidence in mixing is poor, we will use a fine partition.

By the Birkhoff-von Neumann theorem, bistochastic matrices are convex combinations of permutation matrices [2, 3]. So our unistochastic matrices will tend to be near the 'corners' of the set of bistochastic matrices. For any statistic from unistochastic matrices we are interested in, we may use such matrices to generate a Monte Carlo empirical probability distribution.
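The construction of this section can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions: the function name `random_unistochastic`, the encoding of $Q$ as a list `perm` (row $i$ of $Q$ has its one in column `perm[i]`), and the choice of a uniform distribution on $[1, 2]$ for the $u_i$ are all hypothetical and chosen only so that the $u_i$ are nonzero with finite moments.

```python
import random

def random_unistochastic(perm, draw, rng):
    """Build M with m_ij = (QH)_ij^2, where H = I - 2 v v^T, v = u / |u|,
    and Q is the permutation matrix with a one at position (i, perm[i])."""
    n = len(perm)
    u = [draw(rng) for _ in range(n)]
    s = sum(x * x for x in u)  # |u|^2, nonzero almost surely
    h = [[(1.0 if i == j else 0.0) - 2.0 * u[i] * u[j] / s for j in range(n)]
         for i in range(n)]
    # Left-multiplying by Q permutes rows: row i of QH is row perm[i] of H.
    return [[h[perm[i]][j] ** 2 for j in range(n)] for i in range(n)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
M = random_unistochastic([1, 2, 0, 3], lambda r: r.uniform(1.0, 2.0), rng)
for row in M:
    print(["%.4f" % x for x in row])
```

Whatever values are drawn, each row and column of the result sums to one, since the rows and columns of $QH$ are unit vectors.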
6. Establishing Critical Values
In this section, we will outline the procedure we propose for establishing critical values for a weak-mixing, ergodic, nonergodic hypothesis test.

Our test hypotheses are
1. $H_o$: $(D, \mathcal{B}, \mu, f)$ is not ergodic (and hence not weak-mixing).
2. $H_{a1}$: $(D, \mathcal{B}, \mu, f)$ is ergodic but not weak-mixing.
3. $H_{a2}$: $(D, \mathcal{B}, \mu, f)$ is weak-mixing (and hence ergodic).

After partitioning the space into $n$ equal measure connected subsets, Ulam's method approximates the dynamical system with a Markov shift. We will approximate the bistochastic matrix defining the Markov shift's measure, $P$, with an empirical stochastic matrix, $\hat{P}$. So we approximate
\[ (D, \mathcal{B}, \mu, f) \text{ with } \left( \left( \tfrac{1}{n}, \ldots, \tfrac{1}{n} \right), \hat{P} \right). \tag{6.1} \]

Remark 6.1.
The pair
\[ \left( \left( \tfrac{1}{n}, \ldots, \tfrac{1}{n} \right), \hat{P} \right) \tag{6.2} \]
will not define a Markov shift if
\[ \left( \tfrac{1}{n}, \ldots, \tfrac{1}{n} \right) \hat{P} \neq \left( \tfrac{1}{n}, \ldots, \tfrac{1}{n} \right), \tag{6.3} \]
but if our data points are uniform random variables within each state, then
\[ E(\| P - \hat{P} \|_F) \to 0 \tag{6.4} \]
as the minimum number of points in a state goes towards infinity. It follows that for each eigenvalue
\[ |\lambda_i(P) - \lambda_i(\hat{P})| \to 0 \text{ in probability } \forall i \tag{6.5} \]
in the Hausdorff topology when our data points are uniform random variables within each state and the minimum number of points in a state goes towards infinity [17, Chapter 8].

Our test statistic is the second largest eigenvalue of $\hat{P}$. Let $\alpha_1, \alpha_2 \in (0, 1)$ be the alpha values for the hypothesis test; let $c_1, c_2 \in (0, 1)$ be the corresponding critical values, $c_1 \leq 1 - c_2$,
\[ P(|\lambda_2(\hat{P}) - 1| \geq c_1 : \lambda_2(P) = 1) < \alpha_1, \tag{6.6} \]
\[ P(|\lambda_2(\hat{P})| \leq c_2 : |\lambda_2(P)| = 1) < \alpha_2. \tag{6.7} \]
Our goal is to use Householder matrices to estimate $c_1$ and $c_2$. The probability distribution used to establish $c_1, c_2$ should reflect properties of a class of dynamical systems containing our stirring protocol.

Argand diagram of mixing hypothesis test criteria:
1. If $\lambda_2(\hat{P})$ is in the region containing 1 (yellow), fail to reject the null hypothesis; conclude that the dynamical system is nonergodic.
2. If $\lambda_2(\hat{P})$ is in the outer region away from 1 (green), reject the null hypothesis in favor of the first alternative hypothesis; conclude that the dynamical system is ergodic, but not weak-mixing.
3. If $\lambda_2(\hat{P})$ is in the center region (red), reject the null hypothesis in favor of the second alternative hypothesis; conclude that the dynamical system is weak-mixing (and hence ergodic).

[Figure: Argand diagram of the unit disk showing the three decision regions determined by $c_1$ and $c_2$.]

Establishing Critical Values for the Test:
1. Partition $D$ into $n$ equal measure connected subsets. If an upper bound of the dynamical system's entropy is known, call the upper bound $h$, set $n$ greater than $e^{h}$ [8].
2. Select a random variable with which to construct unit vectors. Let $u_1, u_2, \ldots, u_n$ be independent, identically distributed random variables, $\vec{v}_i = \vec{u} / \|\vec{u}\|$.
3. Set $N \in \mathbb{N}$ so that our empirical probability distributions will be sufficiently accurate.
4. Select permutation matrices $\{Q_i\}_{i=1}^{N}$ near which we want the probability distribution to have significant mass.
5. Randomly generate $N$ Householder matrices, $H_i = I - 2\vec{v}_i \vec{v}_i^{T}$. Then square the entries of $Q_i H_i$ to get the matrix $M_i$.
6. Use $\{\lambda_2(M_i)\}_{i=1}^{N}$ to approximate $P(|\lambda_2(\hat{P}) - 1| \geq k : \lambda_2(P) = 1)$. Use the approximation to estimate $c_1$.
7. Use $\{|\lambda_2(M_i)|\}_{i=1}^{N}$ to approximate $P(|\lambda_2(\hat{P})| < k : |\lambda_2(P)| = 1)$. Use the approximation to estimate $c_2$.
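The steps above can be sketched for the smallest case, $n = 2$, where no eigenvalue solver is needed: a $2 \times 2$ stochastic matrix always has eigenvalue 1, so its other eigenvalue is the trace minus one. Everything specific in this sketch is an assumption made for illustration: the function names, the standard normal draws for the $u_i$, the use of $Q = I$ for the ergodic statistic and a uniformly random choice of $Q$ for the weak-mixing statistic, and the simple order-statistic quantile rule used to read off the critical values.

```python
import random

def lambda2_two_state(u1, u2, swap):
    """Second eigenvalue of the 2x2 matrix built from u = (u1, u2).
    A 2x2 stochastic matrix has eigenvalues 1 and (trace - 1)."""
    w = u1 * u1 / (u1 * u1 + u2 * u2)          # v_1^2
    diag = (1.0 - 2.0 * w) ** 2                # squared diagonal entry of H
    off = 4.0 * w * (1.0 - w)                  # squared off-diagonal entry of H
    trace = 2.0 * off if swap else 2.0 * diag  # a row swap moves 'off' onto the diagonal
    return trace - 1.0

def empirical_quantile(samples, q):
    """Return the order statistic at position floor(q * N), clamped to the sample range."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, max(0, int(q * len(ordered))))
    return ordered[index]

def critical_values(num_draws, alpha1, alpha2, draw, rng):
    """Approximate c1 and c2 from Monte Carlo samples of the second eigenvalue."""
    ergodic_stats, mixing_stats = [], []
    for _ in range(num_draws):
        lam = lambda2_two_state(draw(rng), draw(rng), swap=False)
        ergodic_stats.append(abs(lam - 1.0))
        lam = lambda2_two_state(draw(rng), draw(rng), swap=rng.random() < 0.5)
        mixing_stats.append(abs(lam))
    c1 = empirical_quantile(ergodic_stats, 1.0 - alpha1)  # upper tail of |lambda - 1|
    c2 = empirical_quantile(mixing_stats, alpha2)         # lower tail of |lambda|
    return c1, c2

rng = random.Random(1)
c1, c2 = critical_values(10000, 0.05, 0.05, lambda r: r.gauss(0.0, 1.0), rng)
print(c1, c2)
```

For larger $n$ the same loop applies, but the second largest eigenvalue of each $M_i$ would have to come from a numerical eigenvalue routine, and the resulting estimates would still need to be checked against the constraint $c_1 \leq 1 - c_2$.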
7. Matrix Convergence
In this section, we will show that $M_i$ from the previous section will converge to the permutation matrix $Q_i$ as $n$ increases. Since permutation matrix eigenvalues are on the unit circle, as a random variable, it is likely that the second largest eigenvalue from one of our unistochastic matrices will be near magnitude one. Because of the likely proximity to one, it is reasonable to use a probability distribution from such an eigenvalue to establish critical values for our weak-mixing, ergodic, nonergodic hypothesis test.

Our proofs take advantage of the Frobenius norm. After using Jensen's inequality to remove the square root from consideration, finding an expected value upper bound is similar to finding a second moment. Since a permutation matrix acting on a matrix does not change the magnitude of the entries, without loss of generality we will prove the results when $Q$ is the identity matrix and focus on $M$'s convergence to the identity matrix.

Proposition 7.1. If $M$ is a matrix constructed in section 5 with $Q = I$, $n \in \{3, 4, 5, \ldots\}$, $\{u_i\}_{i=1}^{n}$ are independent and identically distributed, $u_i \neq 0$ a.s. and
\[ E(u_i^{4}),\; E(u_i^{8}),\; E\left( \frac{1}{u_i^{4}} \right),\; E\left( \frac{1}{u_i^{8}} \right) < \infty, \tag{7.1} \]
then $E(\| M - I \|_F) \to 0$ as $n \to \infty$. Moreover,
\begin{align}
E(\| M - I \|_F^2) \leq{} & \frac{16 n}{(n-1)^2} E(u_i^{-4}) E(u_i^{4}) \tag{7.2} \\
& + \frac{16 n}{(n-1)^4} E(u_i^{-8}) E(u_i^{8}) \tag{7.3} \\
& + \frac{16 n (n-1)}{(n-2)^4} E(u_i^{-8}) \left( E(u_i^{4}) \right)^2. \tag{7.4}
\end{align}

Proof.
First, by Jensen's inequality
\[ E(\| M - I \|_F) \leq \sqrt{E(\| M - I \|_F^2)}. \tag{7.6} \]
So it is sufficient to show the second part of the proposition. Throughout the proof, write $S = u_1^2 + u_2^2 + \ldots + u_n^2$. Let's look at the entries of $M - I$; by computation we see that
\[ (M - I)_{ij} = \begin{cases} -\dfrac{4 u_i^2}{S} \times \left( 1 - \dfrac{u_i^2}{S} \right) & \text{if } i = j, \\[2ex] \dfrac{4 u_i^2 u_j^2}{S^2} & \text{if } i \neq j. \end{cases} \tag{7.7} \]
It follows that
\[ (M - I)_{ij}^2 = \begin{cases} \dfrac{16 u_i^4}{S^2} - \dfrac{32 u_i^6}{S^3} + \dfrac{16 u_i^8}{S^4} & \text{if } i = j, \\[2ex] \dfrac{16 u_i^4 u_j^4}{S^4} & \text{if } i \neq j. \end{cases} \tag{7.8} \]
If we expand the addends and remove the negative terms, it follows that almost surely
\begin{align}
\| M - I \|_F^2 <{} & \sum_{i=1}^{n} \frac{16 u_i^4}{S^2} \tag{7.9} \\
& + \sum_{i=1}^{n} \frac{16 u_i^8}{S^4} \tag{7.10} \\
& + \sum_{i \neq j} \frac{16 u_i^4 u_j^4}{S^4}. \tag{7.11}
\end{align}
Almost surely, all of the terms in the denominators are positive; if we subtract terms from the denominators, we get an upper bound on the fractions. Thus almost surely
\begin{align}
\| M - I \|_F^2 <{} & \sum_{i=1}^{n} \frac{16 u_i^4}{(S - u_i^2)^2} \tag{7.12} \\
& + \sum_{i=1}^{n} \frac{16 u_i^8}{(S - u_i^2)^4} \tag{7.13} \\
& + \sum_{i \neq j} \frac{16 u_i^4 u_j^4}{(S - u_i^2 - u_j^2)^4}. \tag{7.14}
\end{align}
Notice that the numerator and denominator in each fraction in the upper bound are independent.

The subtraction of terms in denominators removed some positive terms, so the denominators are sums of positive terms. Therefore, we may use the harmonic-arithmetic means inequality. Writing $T = u_1^{-2} + u_2^{-2} + \ldots + u_n^{-2}$,
\begin{align}
\| M - I \|_F^2 <{} & 16 \sum_{i=1}^{n} \frac{1}{(n-1)^4} \left( T - u_i^{-2} \right)^2 u_i^4 \tag{7.15} \\
& + 16 \sum_{i=1}^{n} \frac{1}{(n-1)^8} \left( T - u_i^{-2} \right)^4 u_i^8 \tag{7.16} \\
& + 16 \sum_{i \neq j} \frac{1}{(n-2)^8} \left( T - u_i^{-2} - u_j^{-2} \right)^4 u_i^4 u_j^4. \tag{7.17}
\end{align}
Now let's take expected values; since the $u_i$'s are independent,
\begin{align}
\frac{E(\| M - I \|_F^2)}{16} \leq{} & \sum_{i=1}^{n} \frac{E\left( (T - u_i^{-2})^2 \right) E(u_i^4)}{(n-1)^4} \tag{7.18} \\
& + \sum_{i=1}^{n} \frac{E\left( (T - u_i^{-2})^4 \right) E(u_i^8)}{(n-1)^8} \tag{7.19} \\
& + \sum_{i \neq j} \frac{E\left( (T - u_i^{-2} - u_j^{-2})^4 \right) E(u_i^4) E(u_j^4)}{(n-2)^8}. \tag{7.20}
\end{align}
Next we use Minkowski's inequality and the fact that the $u_i$'s are identically distributed,
\begin{align}
E(\| M - I \|_F^2) \leq{} & \frac{16 n}{(n-1)^2} E(u_i^{-4}) E(u_i^{4}) \tag{7.21} \\
& + \frac{16 n}{(n-1)^4} E(u_i^{-8}) E(u_i^{8}) \tag{7.22} \\
& + \frac{16 n (n-1)}{(n-2)^4} E(u_i^{-8}) \left( E(u_i^{4}) \right)^2. \tag{7.23}
\end{align}
Since $E(u_i^{-8})$, $E(u_i^{-4})$, $E(u_i^{4})$, and $E(u_i^{8})$ are all finite, it follows that
\[ E(\| M - I \|_F^2) \to 0 \text{ as } n \to \infty. \]
Hence by Jensen's inequality,
\[ E(\| M - I \|_F) \to 0 \text{ as } n \to \infty. \]

Since the second largest eigenvalue gives a test statistic to decide if a measure-preserving dynamical system is weak-mixing, ergodic, or nonergodic, we need a conditional probability distribution of eigenvalues to conduct hypothesis tests. To statistically test if a measure-preserving dynamical system is weak-mixing, we could randomly select permutation matrices
\[ \{Q_k\}_{k=1}^{N} \tag{7.27} \]
and generate $\{M_k\}_{k=1}^{N}$ with our Householder method, then use the empirical probability distribution from
\[ \{|\lambda_2(M_k)|\}_{k=1}^{N} \tag{7.28} \]
to establish the critical value for the weak-mixing hypothesis test.

To statistically test if a measure-preserving dynamical system is ergodic, we could randomly select permutation matrices
\[ \{Q_k : \text{the multiplicity of } \lambda = 1 \text{ is at least two}\}_{k=1}^{N} \tag{7.29} \]
and use Householder matrices to generate $\{M_k\}_{k=1}^{N}$, then use the empirical probability distribution from
\[ \{\lambda_2(M_k)\}_{k=1}^{N} \tag{7.30} \]
to establish a critical value for the ergodic hypothesis test.
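The convergence to the identity can be observed directly in a deterministic special case. The sketch below, in plain Python, computes $\| M - I \|_F$ for $Q = I$ from a given vector of $u_i$ values; the function name is an illustrative assumption. For the equal-entry vector $u_i = 1$ a direct calculation gives $\| M - I \|_F = 4\sqrt{n-1}/n$, which the loop prints alongside the computed value. This is only a sanity check on one non-random input, not a verification of the expected value bound.

```python
import math

def frobenius_distance_to_identity(u):
    """Return ||M - I||_F for M built from H = I - 2 v v^T with v = u / |u| and Q = I."""
    s = sum(x * x for x in u)
    w = [x * x / s for x in u]                       # w_i = v_i^2
    total = 0.0
    for i, wi in enumerate(w):
        for j, wj in enumerate(w):
            if i == j:
                entry = (1.0 - 2.0 * wi) ** 2 - 1.0  # diagonal entry of M - I
            else:
                entry = 4.0 * wi * wj                # off-diagonal entry of M - I
            total += entry * entry
    return math.sqrt(total)

# Equal entries: compare with the closed form 4 * sqrt(n - 1) / n.
for n in (4, 16, 64, 256):
    print(n, frobenius_distance_to_identity([1.0] * n), 4.0 * math.sqrt(n - 1) / n)
```

The distance shrinks on the order of $1/\sqrt{n}$ in this case, consistent with the matrices concentrating near the identity as the partition is refined.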
8. Using Specific Random Variables
In this section, we find more precise upper bounds for specific random variables. These upper bounds give better estimates of convergence rate than the results in the previous section. The first two proofs in this section start out the same way as the first proof in the previous section, then the arguments take advantage of the distribution properties.

Let's find a more precise upper bound when the $u_i$'s are independent standard normal random variables. The proof is similar to the first convergence proof; the difference is that we take advantage of the relationship between normal random variables and $\chi^2$-distributions.

Proposition 8.1.
If the $u_i$'s in the construction of a unistochastic matrix are independent standard normal random variables and $n \in \{11, 12, 13, \ldots\}$, then
\begin{align}
E(\| M - I \|_F^2) <{} & \frac{1680\, n}{(n-3)(n-5)(n-7)(n-9)} \tag{8.1} \\
& + \frac{48\, n}{(n-3)(n-5)} \tag{8.2} \\
& + \frac{144\, n (n-1)}{(n-4)(n-6)(n-8)(n-10)}. \tag{8.3}
\end{align}

Proof.
From the previous proof, we know that almost surely, with $S = u_1^2 + u_2^2 + \ldots + u_n^2$,
\begin{align}
\| M - I \|_F^2 <{} & \sum_{i=1}^{n} \frac{16 u_i^4}{(S - u_i^2)^2} \tag{8.4} \\
& + \sum_{i=1}^{n} \frac{16 u_i^8}{(S - u_i^2)^4} \tag{8.5} \\
& + \sum_{i \neq j} \frac{16 u_i^4 u_j^4}{(S - u_i^2 - u_j^2)^4}. \tag{8.6}
\end{align}
Since the $u_i$'s are independent standard normal random variables, we may replace the sums of squares in the denominators with $\chi^2$-random variables when we take expected values,
\begin{align}
E(\| M - I \|_F^2) <{} & \sum_{i=1}^{n} 16\, E\left( \frac{1}{\gamma_{n-1}^{2}} \right) E(u_i^4) \tag{8.7} \\
& + \sum_{i=1}^{n} 16\, E\left( \frac{1}{\gamma_{n-1}^{4}} \right) E(u_i^8) \tag{8.8} \\
& + \sum_{i \neq j} 16\, E\left( \frac{1}{\gamma_{n-2}^{4}} \right) E(u_i^4) E(u_j^4). \tag{8.9}
\end{align}
We use $\gamma_j$ to denote a $\chi^2$-random variable with $j$ degrees of freedom. If we evaluate the expected values, using $E(u_i^4) = 3$, $E(u_i^8) = 105$, and the inverse moments of the $\chi^2$-distribution, it follows that
\begin{align}
E(\| M - I \|_F^2) <{} & \frac{1680\, n}{(n-3)(n-5)(n-7)(n-9)} \tag{8.10} \\
& + \frac{48\, n}{(n-3)(n-5)} \tag{8.11} \\
& + \frac{144\, n (n-1)}{(n-4)(n-6)(n-8)(n-10)}. \tag{8.12}
\end{align}

Now let's consider gamma random variables. A finite sum of independent gamma random variables with the same scale parameter is a new gamma random variable with the same scale parameter, but the shape parameter is the sum of the addend shape parameters. In the next proof, we look at independent and identically distributed gamma random variables.
Proposition 8.2.
If the $u_i$'s in the construction of our unistochastic matrix are independent $\Gamma(\alpha, \beta)$ random variables and $\frac{8}{\alpha} + 2 < n$, then
\begin{align}
E(\| M - I \|_F^2) <{} & \frac{16 n^3 \prod_{i=0}^{3} (\alpha + i)}{\prod_{i=1}^{4} [(n-1)\alpha - i]} \tag{8.13} \\
& + \frac{16 n^5 \prod_{i=0}^{7} (\alpha + i)}{\prod_{i=1}^{8} [(n-1)\alpha - i]} \tag{8.14} \\
& + \frac{16 n^5 (n-1) \left( \prod_{i=0}^{3} (\alpha + i) \right)^2}{\prod_{i=1}^{8} [(n-2)\alpha - i]}. \tag{8.15}
\end{align}

Proof.
Previously we showed that almost surely, with $S = u_1^2 + u_2^2 + \ldots + u_n^2$,
\begin{align}
\| M - I \|_F^2 <{} & \sum_{i=1}^{n} \frac{16 u_i^4}{(S - u_i^2)^2} \tag{8.16} \\
& + \sum_{i=1}^{n} \frac{16 u_i^8}{(S - u_i^2)^4} \tag{8.17} \\
& + \sum_{i \neq j} \frac{16 u_i^4 u_j^4}{(S - u_i^2 - u_j^2)^4}. \tag{8.18}
\end{align}
Using the Cauchy-Schwarz inequality and the fact that $0 < u_i$ almost surely for all $i$, we get
\begin{align}
\| M - I \|_F^2 <{} & \sum_{i=1}^{n} \frac{16 n^2 u_i^4}{(u_1 + u_2 + \ldots + u_n - u_i)^4} \tag{8.19} \\
& + \sum_{i=1}^{n} \frac{16 n^4 u_i^8}{(u_1 + u_2 + \ldots + u_n - u_i)^8} \tag{8.20} \\
& + \sum_{i \neq j} \frac{16 n^4 u_i^4 u_j^4}{(u_1 + u_2 + \ldots + u_n - u_i - u_j)^8}. \tag{8.21}
\end{align}
If we take expected values, and take advantage of the independent and identically distributed $u_i$'s, $E(\| M - I \|_F^2)$ [...]

[...] Assume true for $n$. Using the fact that interchanging any two rows or any two columns of a real matrix changes the sign of the determinant, we see that
\[ \det(D_{n+1}) = \alpha \det(D_n) - \beta n \det(S_n) \quad \text{and} \tag{8.33} \]
\[ \det(S_{n+1}) = \beta \det(D_n) - \beta n \det(S_n). \tag{8.34} \]
Using the induction hypothesis, these equations become
\[ \det(D_{n+1}) = \alpha \left( (\alpha - \beta)^{n-1} (\alpha + (n-1)\beta) \right) - \beta n \left( (\alpha - \beta)^{n-1} \beta \right) \quad \text{and} \tag{8.35} \]
\[ \det(S_{n+1}) = \beta \left( (\alpha - \beta)^{n-1} (\alpha + (n-1)\beta) \right) - \beta n \left( (\alpha - \beta)^{n-1} \beta \right). \tag{8.36} \]
Factoring out the $(\alpha - \beta)$ terms gives us the results.

Proposition 8.4. If $M$ is the $n \times n$ matrix
\[ M = \left( \frac{n-4}{n} \right) I + \begin{pmatrix} \frac{4}{n^2} & \ldots & \frac{4}{n^2} \\ \vdots & \ddots & \vdots \\ \frac{4}{n^2} & \ldots & \frac{4}{n^2} \end{pmatrix}, \tag{8.37} \]
then $\det(M) = \left( \frac{n-4}{n} \right)^{n-1}$, $\operatorname{trace}(M) = \frac{(n-2)^2}{n}$, and the Jordan canonical form of $M$ is the diagonal matrix with entries $\left( 1, \frac{n-4}{n}, \ldots, \frac{n-4}{n} \right)$.

Proof. The trace of $M$ follows from the definition. The determinant follows from the previous lemma by setting $\alpha = \frac{(n-2)^2}{n^2}$ and $\beta = \frac{4}{n^2}$.

Now, the matrix is a symmetric real matrix; hence it is diagonalizable.
The eigenvalues follow from the previous lemma by setting $\alpha = \frac{(n-2)^2}{n^2} - \lambda$ and $\beta = \frac{4}{n^2}$ to get the characteristic polynomial of $M$.

Since $M$ is diagonalizable, if the Markov shift $((1/n, 1/n, \ldots, 1/n), M)$ approximates a measure-preserving dynamical system that is mixing, the estimate of the mixing rate is the rate at which

$$\left( \frac{n-4}{n} \right)^{N} \to 0 \text{ as } N \to \infty, \tag{8.38}$$

instead of the estimate given in [17, Chapter 4],

$$\binom{N}{n-1} \left( \frac{n-4}{n} \right)^{N-n+1} \to 0 \text{ as } N \to \infty.$$

9. Two Region Partitions

There are few instances of interest where one would use our method with two partition regions. We look at this special case as an example to help develop understanding. A potential application is equal ratio mixing of items with minimal consequences of poor mixing, such as combining blends of coffee. When combining two equal volumes of coffee, poor mixing would result in inconsistent taste. Only the most serious baristas would say that inconsistent cup-of-Joe flavor is worse than poorly mixed pharmaceuticals.

Say that our unit vector is

$$\vec{v} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}. \tag{9.1}$$

Since the vector has norm one, we may write its Householder matrix as

$$H = \begin{bmatrix} 1 - 2v_1^2 & \pm 2 v_1 \sqrt{1 - v_1^2} \\ \pm 2 v_1 \sqrt{1 - v_1^2} & 2v_1^2 - 1 \end{bmatrix}. \tag{9.2}$$

There are two possible doubly-stochastic matrices arising from our method,

$$\begin{bmatrix} (1 - 2v_1^2)^2 & 4v_1^2(1 - v_1^2) \\ 4v_1^2(1 - v_1^2) & (1 - 2v_1^2)^2 \end{bmatrix}, \text{ and } \begin{bmatrix} 4v_1^2(1 - v_1^2) & (1 - 2v_1^2)^2 \\ (1 - 2v_1^2)^2 & 4v_1^2(1 - v_1^2) \end{bmatrix}, \tag{9.3}$$

whose characteristic polynomials are

$$(1 - \lambda)(8v_1^4 - 8v_1^2 + 1 - \lambda), \text{ and } (1 - \lambda)(-8v_1^4 + 8v_1^2 - 1 - \lambda). \tag{9.4}$$

The second largest eigenvalue depends on $v_1$. Let's graph the relationship.

[Figure: Graphs of the relationship between $v_1$ and $\lambda$ when $n = 2$.]

When $n = 2$, the relationship between $v_1$ and the second largest eigenvalue defines a function from $[-1, 1]$ to $[-1, 1]$. [...]

$$P(|\widehat{\lambda} - 1| > k : \lambda = 1) \text{ and } P(|\widehat{\lambda}| > k : |\lambda| = 1). \tag{9.5}$$

If $v_1$ is a beta random variable with parameters $\alpha$ and $\beta$, then

\begin{align}
E(\lambda) &= \pm \frac{8\alpha(\alpha+1)(\alpha+2)(\alpha+3)}{(\alpha+\beta)(\alpha+\beta+1)(\alpha+\beta+2)(\alpha+\beta+3)} \tag{9.6}\\
&\quad \mp \frac{8\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)} \pm 1, \tag{9.7}
\end{align}

where the sign of the addends depends on the permutation matrix used.

References

[1] G. Band and P. Boyland, The Burau estimate for the entropy of a braid, Algebr. Geom. Topol., 7 (2007), pp. 1345–1378.
[2] I. Bengtsson, Å. Ericsson, M. Kuś, W. Tadej, and K. Życzkowski, Birkhoff’s polytope and unistochastic matrices, N = 3 and N = 4, Comm. Math. Phys., 259 (2005), pp. 307–324.
[3] G. Birkhoff, Three observations on linear algebra, Univ. Nac. Tucumán. Revista A., 5 (1946), pp. 147–151.
[4] P. Boyland and J. Harrington, The entropy efficiency of point-push mapping classes on the punctured disk, Arxiv preprint arXiv:1103.1829, (2011).
[5] G. Casati, G. Comparin, and I. Guarneri, Decay of correlations in certain hyperbolic systems, Phys. Rev. A, 26 (1982), pp. 717–719.
[6] M. Dellnitz, G. Froyland, and S. Sertl, On the isolated spectrum of the Perron-Frobenius operator, Nonlinearity, 13 (2000), pp. 1171–1188.
[7] J. Ding and A. Zhou, Finite approximations of Frobenius-Perron operators. A solution of Ulam’s conjecture to multi-dimensional transformations, Phys. D, 92 (1996), pp. 61–68.
[8] G. Froyland, Using Ulam’s method to calculate entropy and other dynamical invariants, Nonlinearity, 12 (1999), pp. 79–101.
[9] G. Froyland, On Ulam approximation of the isolated spectrum and eigenfunctions of hyperbolic maps, Discrete Contin. Dyn. Syst., 17 (2007), pp. 671–689 (electronic).
[10] G. Froyland and K. Aihara, Ulam formulae for random and forced systems, Thinking, 1, p. 1.
[11] G. H. Golub and C. F. Van Loan, Matrix computations, Johns Hopkins Studies in the Mathematical Sciences, Johns Hopkins University Press, Baltimore, MD, third ed., 1996.
[12] J.
Harrington, Topological efficiency of stirring with obstacles, PhD thesis, University of Florida, 2011.
[13] F. Y. Hunt, Unique ergodicity and the approximation of attractors and their invariant measures using Ulam’s method, Nonlinearity, 11 (1998), pp. 307–317.
[14] T. Y. Li, Finite approximation for the Frobenius-Perron operator. A solution to Ulam’s conjecture, J. Approximation Theory, 17 (1976), pp. 177–186.
[15] C. Meyer, Matrix analysis and applied linear algebra, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2000. With 1 CD-ROM (Windows, Macintosh and UNIX) and a solutions manual (iv+171 pp.).
[16] C. Robert and G. Casella, Introducing Monte Carlo Methods with R, Use R!, Springer, 2009.
[17] A. C. Smith, Using Ulam’s method to test for mixing, ProQuest LLC, Ann Arbor, MI, 2010. Thesis (Ph.D.)–University of Florida.
[18] M. A. Stremler, Fluid mixing, chaotic advection, and microarray analysis, in Analysis and control of mixing with an application to micro and macro flow processes, vol. 510 of CISM Courses and Lectures, SpringerWienNewYork, Vienna, 2009, pp. 323–337.
[19] M. A. Stremler, F. R. Haselton, and H. Aref, Designing for chaos: applications of chaotic advection at the microscale, Philos. Trans. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci., 362 (2004), pp. 1019–1036.
[20] J.-L. Thiffeault and M. D. Finn, Topology, braids and mixing in fluids, Philos. Trans. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci., 364 (2006), pp. 3251–3266.
[21] S. M. Ulam, Problems in modern mathematics, Science Editions John Wiley & Sons, Inc., New York, 1964.
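Supplementary sketch for Section 9. The two-region formulas can be checked numerically with a few lines of Python. The sketch below is not part of the original computations; it assumes the identity permutation (the upper signs in (9.4) and (9.6)-(9.7)), treats the first coordinate of the unit vector as the beta random variable, and uses hypothetical function names chosen only for illustration. It relies on the fact that a 2 × 2 doubly-stochastic matrix has eigenvalues 1 and trace − 1.

```python
import math
import random


def householder_unistochastic_2x2(v1):
    """Entrywise square of H = I - 2 v v^T for the unit vector (v1, sqrt(1 - v1^2))."""
    v2 = math.sqrt(1.0 - v1 * v1)
    h = [
        [1.0 - 2.0 * v1 * v1, -2.0 * v1 * v2],
        [-2.0 * v1 * v2, 1.0 - 2.0 * v2 * v2],
    ]
    # Squaring each entry of an orthogonal matrix gives a doubly-stochastic matrix.
    return [[h[i][j] ** 2 for j in range(2)] for i in range(2)]


def second_eigenvalue_2x2(m):
    """For a 2x2 doubly-stochastic matrix the eigenvalues are 1 and trace - 1."""
    return m[0][0] + m[1][1] - 1.0


def expected_lambda(alpha, beta):
    """E(8 v1^4 - 8 v1^2 + 1) for v1 ~ Beta(alpha, beta), via raw beta moments."""
    s = alpha + beta
    m2 = alpha * (alpha + 1) / (s * (s + 1))               # E(v1^2)
    m4 = m2 * (alpha + 2) * (alpha + 3) / ((s + 2) * (s + 3))  # E(v1^4)
    return 8.0 * m4 - 8.0 * m2 + 1.0


def monte_carlo_lambda(alpha, beta, draws=100_000, seed=0):
    """Monte Carlo average of the second eigenvalue over beta-distributed v1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        v1 = rng.betavariate(alpha, beta)
        total += second_eigenvalue_2x2(householder_unistochastic_2x2(v1))
    return total / draws
```

For example, `expected_lambda(2, 2)` evaluates to −9/35, and `monte_carlo_lambda(2, 2)` should agree with it up to sampling error. Pure standard-library code is used here only to keep the sketch self-contained; for larger n one would need a general eigenvalue routine.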