[PDF] A Numerical Approach to Solving Nonlinear Differential Equations on a Grid with Potential Applicability to Computational Fluid Dynamics

Abstract

A finite element method for solving nonlinear differential equations on a grid, with potential applicability to computational fluid dynamics (CFD), is developed and tested. The current method facilitates the computation of solutions of a high polynomial degree on a grid. A high polynomial degree is achieved by interpolating both the value, and the value of the derivatives up to a given order, of continuously distributed unknown variables. The two-dimensional lid-driven cavity, a common benchmark problem for CFD methods, is used as a test case. It is shown that increasing the polynomial degree has some advantages, compared to increasing the number of grid-points, when solving the given benchmark problem using the current method. The current method yields results which agree well with previously published results for this test case.

Full PDF

AA Numerical Approach to Solving Nonlinear DiﬀerentialEquations on a Grid with Potential Applicability toComputational Fluid Dynamics

Jesper Tveit ∗ Department of Physics and Technology, University of Bergen, NorwayBKK Production, Kokstad, Norway

November 7, 2018

Abstract

A ﬁnite element method for solving nonlinear diﬀerential equations on a grid, with po-tential applicability to computational ﬂuid dynamics (CFD), is developed and tested.The current method facilitates the computation of solutions of a high polynomial degreeon a grid. A high polynomial degree is achieved by interpolating both the value, and thevalue of the derivatives up to a given order, of continuously distributed unknown variables.The two-dimensional lid-driven cavity, a common benchmark problem for CFD methods,is used as a test case. It is shown that increasing the polynomial degree has some advan-tages, compared to increasing the number of grid-points, when solving the given benchmarkproblem using the current method. The current method yields results which agree well withpreviously published results for this test case.

Through development and testing in the well known case of lid-driven cavity ﬂow (see Figure 1for details) the current method is shown to have potential applicability to CFD. Steady statesolutions of the Navier-Stokes equations for incompressible two-dimensional ﬂow are computedfor this test case with Reynolds number Re ≤ × .Obtaining a steady-state ﬂow solution in a two-dimensional lid-driven cavity becomes increas-ingly challenging as the Reynolds number increases. Computing a steady-state ﬂow solution istherefore useful as a bench-mark for the quality of numerical schemes, although the solutiondoes not necessarily describe a physical ﬂuid. In order to compute a solution, which accuratelyrepresents details at high Reynolds numbers, the most successful approaches have been usingvery high grid resolutions [Erturk et al., 2005, Wahba, 2012]. The current approach obtainscomparable results with much lower grid resolutions.The current solutions have up to 9’th order (polynomial degree) of spatial accuracy and thehighest grid resolution is 135 by 135 grid points. Several other high order solutions to the lid-driven cavity have previously been presented. Barragy and Carey [1996] present solutions forthe lid-driven cavity up to Re = 12500 (as well as an under-resolved solution for Re = 16500) ∗ email: [email protected] a r X i v : . [ phy s i c s . c o m p - ph ] S e p ith spatial accuracies from 6’th to 8’th order. Other works which present high order solutionsinclude Schreiber and Keller [1982] (8’th order), and Nishida and Satofuka [1992] (10’th order).Wahba [2012] present reliable steady state solutions of comparably high Reynolds numbers(Re ≤ × ) but this is the ﬁrst time reliable high order (above fourth, in polynomialdegree) steady state solutions for Reynolds number, Re ≥ Square brackets will be used to identify components in matrices. A component in a two-dimensional matrix, A , will thus be referred to as A [ r, c ], where r is the row index and c isthe column index. Indices in an R × C matrix are deﬁned to go from 0 to R − C − C = 1, then the matrix may be referred to as a column vector and if R = 1,then the matrix may be referred to as a row-vector. A matrix of matrices will be equivalent toa four dimensional matrix, Q , where Q [ r, c ][ τ, ν ] ≡ Q [ r, c, τ, ν ]. If a matrix, A , is square andnonsingular, its inverse will be written as A − .For brevity, the evaluation of a derivative of a function, Φ( ζ ), at a certain point, ζ , will inunambiguous cases be written as shown on the right-hand side of Eq.(1): ∂ Φ( ζ ) ∂ζ (cid:12)(cid:12)(cid:12)(cid:12) ζ = ζ ≡ ∂ Φ( ζ ) ∂ζ (1)For functions of two variables, the ﬁrst and the second argument will be referred to as the x -,and the y - component (or variable), respectively.Mapping of indices from double index form to single index form will, unless otherwise stated,be on the form a = b + cD , where D is a positive integer and b, c ∈ { , . . . , D − } and a ∈{ , . . . , D − } . The indices, b and c , may also a single index form of other index tuples, inwhich case the mapping will recursively follow the given form.2 .2 Order of Continuity Consider a discretization of a function, f ( x, y ), on a uniform two-dimensional grid. Let thematrices x and y be composed of the x and y components, respectively, of the position of thegrid points. Let the value of the function, f ( x, y ), and its derivatives up to, and including, the(Ω − F , with components given by Eq.(2): ∂ α + β f ( x [ k, l ] , y [ k, l ]) ∂x α ∂y β ≡ F [ k, l, α, β ] (2)where α ∈ { , . . . , Ω − } , β ∈ { , . . . , Ω − } , and the indices, k and l , identify the grid point. Thediscretization is then by deﬁnition continuous and has continuous derivatives up to (Ω − x [ k, l ] , y [ k, l ]). This is referred to as C Ω − continuity. For the sake of simplicity the grid will be oriented and scaled such that the location of each gridpoint is uniquely determined by its indices, k and l , as shown in Eq.(3):( x [ k, l ] , y [ k, l ]) ≡ ( k, l ) (3)deﬁning a uniform square grid. Let the functions, b m,n ( x, y ), be a set of polynomial basis functions where the value of m is thepolynomial degree of the ﬁrst variable, x , and the value of n is the polynomial degree of thesecond variable, y .Let the matrix of column vectors, f , the row vector-function, b ( x, y ), and the matrix, B , bedeﬁned as shown in Eqs.(4-6), respectively: f [ k, l ] [ τ, ≡ F [ k + i, l + j, α, β ] (4) b [0 , m + N n ]( x, y ) ≡ b m,n ( x, y ) (5) B [ τ, m + N n ] ≡ ∂ α + β b m,n ( i, j ) ∂x α ∂y β (6)where τ = α + β Ω + Ω ( i + 2 j ), i ∈ { , } , j ∈ { , } , m ∈ { , . . . , N − } , n ∈ { , . . . , N − } and N = 2Ω. It follows that each column vector in the matrix, f , then has (2Ω) = N components,that the matrix, B , is a N × N square matrix and that the row vector-function, b ( x, y ), has N components.The four neighboring points: ( x [ k, l ] , y [ k, l ]), ( x [ k + 1 , l ] , y [ k + 1 , l ]), ( x [ k, l + 1] , y [ k, l + 1])and ( x [ k + 1 , l + 1] , y [ k + 1 , l + 1]), surround a square region which will be referred to as the k, l ’th grid-cell .Within the k, l ’th grid-cell, the function, f ( x, y ), may be approximated by a weighted sum ofthe polynomial basis functions, b m,n ( x, y ), written in matrix form in Eq.(7): f ( x, y ) = b ( x (cid:48) , y (cid:48) ) B − f [ k, l ] + O ( x (cid:48) N + y (cid:48) N ) ≈ b ( x (cid:48) , y (cid:48) ) B − f [ k, l ] (7)where x (cid:48) = x − x [ k, l ] and y (cid:48) = y − y [ k, l ]. 3rom the deﬁnitions, Eqs.(4-6), it is clear that the approximation, b ( x (cid:48) , y (cid:48) ) B − f [ k, l ], matchesup exactly with the discretization, F , at the four grid points, i.e.: F [ k + i, l + j, α, β ] = ∂ α + β b ( i, j ) ∂x α ∂y β B − f [ k, l ] (8)where i ∈ { , } and j ∈ { , } as previously. The idea of approximating a function by sampling both its value and the value of its derivatives isknown as Hermite interpolation. The approximation, b ( x (cid:48) , y (cid:48) ) B − f [ k, l ], of the function, f ( x, y ),is a two-dimensional generalization of a Hermite spline, equivalent to recursively interpolating aset of Hermite splines. The condition number , cond( B ), deﬁned in Eq.(9), gives an estimate of the relative numericalaccuracy of the matrix product, B − f [ k, l ].cond ( B ) ≡ σ max ( B ) σ min ( B ) (9)In Eq.(9), σ max ( B ) is the largest singular value of B , and σ min ( B ) is the smallest singular valueof B . When numerically solving a linear system, f = Bc , using ﬂoating point numbers withmachine precision, (cid:15) m , an error of order O ( (cid:15) m cond ( B )) should be expected. The reader mayrefer to Trefethen and Bau [1997, pg. 95] for a more detailed explanation of the condition numberand numerical accuracy of linear equation systems.The condition number, cond( B ), depends on the choice of basis functions, b m,n ( x, y ), and onthe order of continuity (in other words, the value of Ω).For the computations in this paper, the basis functions, b m,n ( x, y ), are deﬁned in terms ofthe Bernstein polynomials, B λ, Λ ( x ), given in Eq.(10): B λ, Λ ( x ) = (cid:18) Λ λ (cid:19) x λ (1 − x ) Λ − λ , λ ∈ { , . . . , Λ } (10)as b m,n ( x, y ) ≡ B m, Ω − ( x ) B n, Ω − ( y ) (11)Table 1 shows that the condition number of the matrix, B , increases exponentially with thevalue of Ω. The machine precision is a limiting factor for the computation of the functionapproximation, Eq.(7). For higher orders of continuity, it may be considered an ill-conditionedsystem. As a result one should not expect the coeﬃcients of the function approximation, Eq.(7),to be accurate down to machine epsilon. For this reason, the main results presented in this paperhave been computed using 64 bit ﬂoating point numbers ( double in C++ syntax) rather thanthe more common 32 bit ﬂoat. The corresponding ISO C standard deﬁnition of machine epsilonis 2 − ≈ . × − , referred to in this paper as (cid:15) .4 cond ( B ) (cid:15) cond ( B )Ω with b m,n ( x, y ) with x m y n . × − . × − . × − . × − . × − . × − . × − . × − . × − . B , is deﬁned inEq.(6) and the condition number, cond ( B ), is deﬁned by Eq.(9). This is the estimated precisionof the numerical computation of the quantity B − f using 64 bit ﬂoating point numbers withmachine precision, (cid:15) ≈ . × − , (ISO C standard). The ﬁrst column shows diﬀerent valuesof Ω, corresponding to C Ω − continuity. The second column shows, (cid:15) cond ( B ), constructed withthe basis functions, b m,n ( x, y ), as given by Eq.(11). The third column shows what the expectedprecision would be if the monomial basis functions, x m y n , of equal degree were used to constructthe matrix B instead of b m,n ( x, y ). The grid has L × L grid-points at positions deﬁned in Eq.(3). The indices, k and l , then havevalues ranging from 0 to L − L −

1. Thecomputational domain is thus the two dimensional interval [0 , L − × [0 , L −

1] (see Figure 1).The Navier–Stokes Equations for steady state incompressible ﬂow in two dimensions, wherethe physical variables have been scaled with appropriate scales, read0 = u ∂u∂x + v ∂u∂y + ∂p∂x − (cid:48) (cid:18) ∂ u∂x + ∂ u∂y (cid:19) (12a)0 = u ∂v∂x + v ∂v∂y + ∂p∂y − (cid:48) (cid:18) ∂ v∂x + ∂ v∂y (cid:19) (12b)0 = ∂u∂x + ∂v∂y (12c)where u and v are the x – and y –components of the ﬂow velocity, respectively, p is the pressure andRe (cid:48) = Re / ( L − , × [0 , L −

1, so that Eqs.(12a-12c) remainmathematically equivalent to the equations solved in the references.The variables u , v and p will be referred to as the ﬂow-variables and are functions of the twovariables x and y . Additionally, the pair ( u, v ) will be referred to as the ﬂow-velocity.The pressure-velocity form of the Navier–Stokes equations (Eqs.(12a-12c)) is usually trans-formed into an equivalent vorticity-streamfunction form which, in the two dimensional case, hasone less ﬂow variable to deal with. Most published papers dealing with the lid-driven cavity intwo dimensions use the vorticity-streamfunction form. In the current work, however, we solvefor the pressure and velocity directly. 5 .2 Boundary Conditions The no-slip Dirichlet boundary condition is imposed on the ﬂow velocity, ( u ( x, y ) , v ( x, y )). Figure1 shows the details of the boundary of the computational domain of the grid. This test case isknown as the lid-driven cavity for two dimensions. No boundary values are required for thepressure during the iterative solution process. The origin of the pressure is implicitly determinedby the initial condition for the iteration (see Section 5).(0 ,

0) ( L − , , L −

1) ( L − , L − y → x → u = 1 , v = 0 u = 0 , v = 0 u = , v = u = , v = Figure 1: This ﬁgure showsthe computational domain of thegrid and its boundary condi-tions. The ﬂow velocity, u ( x, y )and v ( x, y ), along the edges isgiven. The lower edge, where y = 0, is a ”lid” which drivesthe ﬂow by sliding horizontally(positive x - direction) at a con-stant speed equal to one. Atthe three remaining edges, bothcomponents of the ﬂow velocityare zero. This system is referredto as a lid-driven cavity. The ﬂow-variables, velocity and pressure, are discretized as shown in Section 2, with U , V , and P being the discrete counterparts of u ( x, y ), v ( x, y ) and p ( x, y ), respectively. In each grid-cell,the ﬂow variables are approximated by u ( x, y ) = b ( x (cid:48) , y (cid:48) ) B − u [ k, l ] + O ( x (cid:48) N + y (cid:48) N ) (13a) v ( x, y ) = b ( x (cid:48) , y (cid:48) ) B − v [ k, l ] + O ( x (cid:48) N + y (cid:48) N ) (13b) p ( x, y ) = b ( x (cid:48) , y (cid:48) ) B − p [ k, l ] + O ( x (cid:48) N + y (cid:48) N ) (13c)where u , v and p are deﬁned in terms of U , V , and P in the same way as f was deﬁned in termsof F in Section 2.Let the row vector-functions, c α,β ( x, y ) and s ( x, y ), and the matrix of row vector-functions, m ( x, y ), be deﬁned as shown in Eq.(14), Eq.(15) and Eq.(16): c α,β ( x, y ) ≡ ∂ α + β b ( x, y ) ∂x α ∂y β B − (14) s ( x, y ) ≡ − L − c , ( x, y ) + c , ( x, y )) (15) m [ k, l ] ( x, y ) ≡ ( c , ( x, y ) u [ k, l ]) c , ( x, y ) + ( c , ( x, y ) v [ k, l ]) c , ( x, y ) + s ( x, y ) (16)6he Navier–Stokes equations, Eqs.(12a-12c), are then approximated, within a grid-cell as0 = m [ k, l ] u [ k, l ] + c , p [ k, l ] + O ( x N − + y N − ) (17a)0 = m [ k, l ] v [ k, l ] + c , p [ k, l ] + O ( x N − + y N − ) (17b)0 = c , u [ k, l ] + c , v [ k, l ] + O ( x N − + y N − ) (17c)As indicated in Eqs.(17a-17c), the formal polynomial order of accuracy is reduced due to thediﬀerentation with respect to x and y . From now on, the indication of polynomial order ofaccuracy, O ( . . . ), will be omitted.Eqs.(17a-17c), can be written in the form of a single matrix equation, as shown in Eq.(18): = E [ k, l ]( x, y ) z [ k, l ] (18)where the 3 × N matrix-functions, E [ k, l ]( x, y ), and the 3 N × z [ k, l ], aredeﬁned by Eq.(19) and Eq.(20), respectively, as: E [ k, l ]( x, y ) ≡  m [ k, l ]( x, y ) , ( x, y ) [ k, l ]( x, y ) c , ( x, y ) c , ( x, y ) c , ( x, y )  (19)and z [ k, l ] ≡  u [ k, l ] v [ k, l ] p [ k, l ]  (20) In accordance with the deﬁnition of the grid structure, Eq.(3), and the boundary conditions forthe lid-driven cavity (Figure 1), a grid component of the ﬂow velocity, U [ k, l, α, β ] or V [ k, l, α, β ],will be deﬁned to be a constant boundary component if ( l = 0 ∨ l = L − ∧ β = 0 (boundaryparallel to the x -axis) or if ( k = 0 ∨ k = L − ∧ α = 0 (boundary parallel to the y -axis). Notethat derivatives, parallel to the boundary, at the boundary are also deﬁned as boundary compo-nents. The values of the boundary components are all zero, except in the cases given by Eq.(21)and Figure 2. The components which are not deﬁned to be boundary components will be referredto as internal components or as internal ﬂow components . U [ k, , ,

0] = (cid:40) , for 0 < k < L − , for k ∈ { , L − } (21) In this subsection (4.1), the grid-cell indices, [ k, l ], will be omitted occasionally for the sake ofbrevity. Unless otherwise stated, the equations and deﬁnitions will be understood to correspondto a single, arbitrary grid-cell. 7 x →← y ] b ( x , y ) B − u [ , ] / Ω = Ω = Ω = Ω =

11 0 Figure 2: This ﬁgure shows how the bound-ary value of the x -component of the veloc-ity, u ( x, y ), is approximated near the lowerleft corner ( x = y = 0) of the boundaryfor diﬀerent values of Ω. The graphs on theright hand side of the ﬁgure are for y = 0out to the nearest grid point along the x -axis. The graphs on the left hand side of theﬁgure are for x = 0 out the to the nearestgrid point along the y -axis. By using a stan-dard least squares algorithm (see for exam-ple Howard [2000, pg. 437]), the components U [0 , , { , . . . , Ω − } ,

0] are set to producethe best ﬁt to the unit boundary conditionand the components U [0 , , , { , . . . , Ω − } ]are set to produce the best ﬁt to the zeroboundary condition. Note that the deriva-tive of u with respect to x at the corner, U [0 , , , v with respect to y at the corner, V [0 , , , x -axis).The error-squared, R , for each grid-cell will be deﬁned, using the approximated Navier-Stokes equations on matrix form (Eq.(18)), as R ≡ (cid:90) (cid:90) z T E T ( x, y ) E ( x, y ) z d x d y (22)A linear approximation of the derivatives of the error-squared, R [ k, l ], with respect to thecomponents of the column vector, z , is ∂R ∂ z ≈  (cid:90) (cid:90) E T ( x, y ) E ( x, y ) d x d y  z (23)The derivative, given in Eq.(23), is an approximation because the matrix, E , is taken to beconstant (while, in fact, it depends on the ﬂow velocity through its dependence on the matrix, m [ k, l ]).It is possible to compute the integral, Eq.(23), analytically. But, by using numerical inte-gration, the following procedure is more ﬂexible (with future modiﬁcations and extensions inmind).Let the points, ( x s , y s ), for s ∈ { , . . . , S − } form a uniform set of sample positions, deﬁnedin Eq.(24): ( x s , y s ) ≡ (cid:18) s x S , s y S (cid:19) (24)8here s = s x + s y S and s x , s y ∈ { , . . . , S − } . From Eq.(24) it is clear that 0 < x s < < y s <

1. The approximation of the derivatives of the error-squared, R [ k, l ], with respect tothe components of the column vector, z [ k, l ], where the integral has been replaced by a sum overthe samples, ( x s , y s ), read: ∂R ∂ z ≈  S S − (cid:88) s =0 E T ( x s , y s ) E ( x s , y s )  z (25) Let the ( L − × ( L − × N × N matrix, R , be deﬁned by Eq.(26):1 S S − (cid:88) s =0 E [ k, l ] T ( x s , y s ) E [ k, l ]( x s , y s ) ≡ R [ k, l ] (26)with the matrix, E , as deﬁned in Eq.(19). The deﬁnition, Eq.(26), implies that the the 3 N × N matrices, R [ k, l ], are symmetric. The grid-cell system, Eq.(25), may be written as shown inEq.(27): ∂R [ k, l ] ∂ z ≈ R [ k, l ] z [ k, l ] (27)To ﬁnd an approximate minimum of the error-squared, R [ k, l ], for all the grid-cells, we formulatethe equation system given in Eq.(28) from which a solution for the internal ﬂow components isimplied. R [ k, l ] z [ k, l ] = (28)Recall that the matrices, u , v and p (and thus also z ), are deﬁned in terms of the three four-dimensional matrices U , V , and P . From the deﬁnition given in Eq.(4) it is clear that there isan overlap between some of the components in the vectors, u [ k, l ], v [ k, l ] and p [ k, l ], for diﬀerentvalues of the cell indices, k and l (for example, the components u [ k, l ][ α + β Ω + 3Ω ,

0] and u [ k + 1 , l + 1][ α + β Ω , α, β ∈ { , . . . , Ω − } , correspond to the same components in thematrix, U , and are then by deﬁnition equal). The system given in Eq.(28) can thus not be solvedindependently for each grid-cell but must be solved for the entire grid.In order to employ eﬃcient techniques for solving linear systems, it is convenient to formulatethe set of systems for each grid cell, Eq.(28), into a single system for the entire grid, given inEq.(29): Πw = t (29)where Π is a square, symmetric matrix and w and t are column vectors. This is done bydeﬁning a one to one index mapping from all the internal ﬂow components to the components inthe column vector, w . Coeﬃcients of the components in the grid cell systems (i.e. componentsin the symmetric matrix, R [ k, l ] in Eq.(28)), are added to Π if they correspond to internalcomponents. If a component of z [ k, l ] is a boundary component, then it is multiplied with thecorresponding row in R [ k, l ] and subtracted, forming the right hand vector, t , in Eq.(29). Thisprocedure is shown in detail by Algorithm 1.It is convenient to arrange components so that the matrix, Π , gets a narrow band structure,allowing more eﬃcient computations on the system. For computations presented in this paperthe order is arranged by ﬁrst sorting the internal ﬂow components according to their location inthe grid (indices k, l ), then by what type of ﬂow component ( x -velocity, y -velocity or pressure),then by the order of the derivative (indices α, β ).Note that Algorithm 1 shows the entire matrix, Π , being assembled. In the implementationof this algorithm the sub-diagonal elements are not stored since the matrix is symmetric.9 lgorithm 1 This pseudo-code shows the details of how the set of grid-cell systems (Eq.(28))is reformulated into the system given by Eq.(29). The variables, k , l , m , n , L and N are allintegers following their previous deﬁnitions from Section 2. The temporary variables r and c arealso integers and contain the row and column indices for the matrix Π . Internal ﬂow componentsare ordered, ﬁrst by the grid point location, then by what type of ﬂow component, and then bythe order of the derivative, into a contiguous list. The method, index( . . . ), returns the positionin this list if its arguments correspond to a an internal component. Otherwise, a negative numberis returned. Π ← , t ← ≤ k < L, ≤ l < L dofor ≤ m < N do r ← index( k, l, m ) if ≤ r thenfor ≤ n < N do c ← index( k, l, n ) if ≤ c thenΠ [ r, c ] ← Π [ r, c ] + R [ k, l ][ m, n ] elset [ r ] ← t [ r ] − R [ k, l ][ m, n ] z [ m, end ifend forend ifend forend for The Navier-Stokes equations on matrix form, Eq.(18), are solved by an iteration over severalstages. Initially the internal ﬂow components are either set to zero (velocity) and one (pressure)or corresponding to a solution for a lower Reynolds number, forming an initial approximatematrix, Π , and an approximate solution, w , of the system given in Eq.(29). At the ( κ − w κ − .A new approximate solution, w (cid:48) κ − , is found using the linear conjugate gradient iteration [seefor example Trefethen and Bau, 1997, chap. 38]. The linear conjugate gradient iteration isterminated when the relative improvement factor, ˆ r κ − , deﬁned in Eq.(30), of the solution of thelinear system reaches a predetermined value, ˆ r κ − ≤ ˆ ω (cid:28) (cid:13)(cid:13) Π κ − w (cid:48) κ − − t κ − (cid:13)(cid:13) (cid:107) Π κ − w κ − − t κ − (cid:107) ≡ ˆ r κ − (30)The approximation, w (cid:48) κ − , is used to deﬁne a search direction, ∆ w κ − , as shown in Eq.(31):∆ w κ − ≡ w (cid:48) κ − − w κ − (31)The search direction is deﬁned to have corresponding grid-cell components given by Eq.(32):∆ z κ − [ k, l ] = z (cid:48) κ − [ k, l ] − z κ − [ k, l ] (32)10here the mapping from z (cid:48) κ − to w (cid:48) κ − is the same as used in Algorithm 1 for internal components.If a component, z (cid:48) κ − [ k, l ][ τ, z κ − [ k, l ][ τ, z κ − [ k, l ][ τ,

0] = 0 for boundary components.The updated ﬂow components, z κ [ k, l ], are given by Eq.(33):  u κ [ k, l ] v κ [ k, l ] p κ [ k, l ]  =  u κ − [ k, l ] v κ − [ k, l ] p κ − [ k, l ]  +  θ u ∆ u κ − [ k, l ] θ v ∆ v κ − [ k, l ] θ p ∆ p κ − [ k, l ]  (33)where θ u , θ v and θ p are three parameters to be determined in each iterative step and z κ [ k, l ] ≡  u κ [ k, l ] v κ [ k, l ] p κ [ k, l ]  , z κ − [ k, l ] ≡  u κ − [ k, l ] v κ − [ k, l ] p κ − [ k, l ]  , ∆ z κ − [ k, l ] ≡  ∆ u κ − [ k, l ]∆ v κ − [ k, l ]∆ p κ − [ k, l ]  (34)in accordance with the deﬁnition given in Eq.(20). Consider the integrand of the grid-cell residual squared (Eq.(22)) at the κ ’th stage: z Tκ [ k, l ] E Tκ [ k, l ]( x, y ) E κ [ k, l ]( x, y ) z κ [ k, l ] ≡ ρ κ [ k, l ]( θ u , θ v , θ p , x, y ) (35)According to Eq.(33), the column vector, z κ , depends linearly on the parameters θ u , θ v and θ p ,and the matrix, E κ ( x, y ), depends linearly on the parameters θ u and θ v . Eq.(35) can thus bewritten as a fourth degree polynomial of θ u , θ v and θ p , shown in Eq.(36) ρ κ [ k, l ]( θ u , θ v , θ p , x, y ) = c + c θ u + c θ u + c θ u + c θ u + c θ v + c θ u θ v + c θ u θ v + c θ u θ v + c θ v + c θ u θ v + c θ u θ v + c θ v + c θ u θ v + (36) c θ v + c θ p + c θ u θ p + c θ u θ p + c θ v θ p + c θ u θ v θ p + c θ v θ p + c θ p The coeﬃcients, c ... [ k, l ]( x, y ), in Eq.(36) are determined from the deﬁnition of E [ k, l ]( x, y ) (seeEq.(19)) and m [ k, l ]( x, y ) (see Eq.(16)) through basic algebraic operations by substituting u κ − + θ u ∆ u κ − for u , v κ − + θ v ∆ v κ − for v and p κ − + θ p ∆ p κ − for p (the grid cell indices [ k, l ] andfunction arguments ( x, y ) for the coeﬃcients, c ... [ k, l ]( x, y ), are omitted in Eq.(36) for the sakeof brevity). Eq.(36) is integrated numerically over the entire grid by the sum given in Eq.(37): P ( θ u , θ v , θ p ) ≡ W L − (cid:88) k =0 l =0 S − (cid:88) s =0 ρ κ [ k, l ]( θ u , θ v , θ p , x s , y s ) (37)11here W = S ( L − , yielding P ( θ u , θ v , θ p ) = C + C θ u + C θ u + C θ u + C θ u + C θ v + C θ u θ v + C θ u θ v + C θ u θ v + C θ v + C θ u θ v + C θ u θ v + C θ v + C θ u θ v + (38) C θ v + C θ p + C θ u θ p + C θ u θ p + C θ v θ p + C θ u θ v θ p + C θ v θ p + C θ p The function, P ( θ u , θ v , θ p ), is then minimized with respect to the parameters θ u , θ v , θ p . Theminimization of Eq.(38) is not a computationally expensive step since the function, P ( θ u , θ v , θ p ),is a fourth degree polynomial depending on only three variables. For the purposes of this paper,the nonlinear conjugate gradient iteration was suﬃcient. A ﬁxed number of iterations (50) wasused and the Fletcher-Reeves method determined the line search direction (the reader may referto Shewchuk [1994] for details concerning the nonlinear conjugate gradient iteration).With the parameters θ u , θ v , θ p determined, the ﬂow components are updated as shown inEq.(33) and the iteration may be repeated until desired accuracy is reached or until errors, dueto limited ﬂoating point precision or due to the approximate nature of the discretization, preventsfurther improvement. Solutions were computed for Reynolds numbers, Re ∈ { , , , , , , } .The data from all the computations is too extensive to be displayed in detail in this paper butis available from the author upon request. In Subsections 6.1-6.3 details of a selection of thecomputed solutions are discussed. The x -component of the velocity, u ( x, y ), through the geometric center of the cavity, from thecenter of the ”lid”, ( x = ( L − / , y = 0), to the opposing side, ( x = ( L − / , y = L − velocity proﬁle from now on.For Reynolds number, Re = 100, the well known results from Ghia et al. [1982] are used asa comparison (Figure 3). For Reynolds number, Re = 20000, the current results are compared with the very ﬁne-grid solutions from Erturk et al. [2005] and Wahba [2012] (Figure 4). Addition-ally, the interesting features for various high Reynolds numbers from Re = 5000 to Re = 40000is compared with each other (Figure 5).Figure 3, which shows the velocity proﬁle for Reynolds number, Re = 100, conﬁrms that thecurrent results agree with established results [Ghia et al., 1982] for this Reynolds number.Figure 4, which shows the velocity proﬁle for Reynolds number, Re = 20000, shows a smalldeviance from the reference solutions by Erturk et al. [2005] and Wahba [2012]. In this case thecurrent solution tends to agree with the references where they coincide and tends to lie betweenthe references where they do not coincide. Reynolds number, Re = 20000, was the highest Reynolds number for which multiple reference results wereavailable. u −→ Ghia et al.Current 1 y → Figure 3: This ﬁgure showsthe computed x -componentof the velocity on a verti-cal line through the geomet-ric center of the grid forReynolds number, Re = 100.The line shows current re-sults, b ( x (cid:48) , y (cid:48) ) B − u [ k, l ] for x = ( L − /

2, 0 ≤ y ≤ L −

1, with L = 5 and Ω = 4where x (cid:48) = x − x [ k, l ] and y (cid:48) = y − y [ k, l ]. The dot-ted circles show results pre-sented by Ghia et al. [1982]as a comparison.Figure 5 shows how the upper and lower parts of the velocity proﬁle evolve as the Reynoldsnumber increases. The upper part shows a systematic trend where minimum drops while shiftingincreasingly closer to the edge. The lower part shows a similar trend for Re = 5000 to Re = 20000.However from Re = 20000 to Re = 40000, the local minimum move toward the edge at a muchsmaller rate while increasing in magnitude at a greater rate.Figure 7 shows the velocity proﬁles for Reynolds number, Re = 1000, obtained with increasingvalues of Ω and decreasing values of L , compared with the results given by Ghia et al. [1982],Erturk et al. [2005]. The value of L was the lowest value which did not give any signiﬁcantdeviance from solutions obtained using a higher resolution. These results illustrate how anincrease of the order of continuity allows for a lower grid resolution while still achieving resultsof similar accuracy. Also note that the amount of data contained in the grid (proportional to L Ω in a two dimensional grid) decreases with increasing values of Ω. Figure 7 shows visualizations of computed ﬂow conﬁgurations. Due to the high polynomialdegree of the solution, ﬂow features below grid resolution are resolved. This can be seen inthe close-up plot in Subﬁgure 7(b). At higher Reynolds numbers the required grid resolutionis higher compared to the scale of the main vortices of the ﬂow. However, secondary, tertiaryand quaternary vortices all split up in several sub-vortices at high Reynolds numbers. It seemsreasonable to assume that the higher resolution requirement is connected to this phenomenon.

The computation times were achieved on a standard desktop computer (quad core Xeon W3565CPU at 3.2 GHz with 12 GB RAM). The computation times are not comparable to what mightbe achieved on a high end system utilizing parallel computing, but might have some use forinternal comparison. Up to four separate computations were run simultaneously (each singlethreaded) each utilizing 23-25 percent of the CPU capacity.Figure 8 shows examples of accuracy versus computation time for diﬀerent Reynolds numbers,13 u −→ WahbaErturk et al.Current 1 y → (a) The complete velocity proﬁle . . . − . − . − . u −→ y → (b) The upper and lower range of the velocity proﬁle Figure 4: These ﬁgures show the computed x -component of the velocity on a vertical line throughthe geometric center of the grid for Reynolds number, Re = 20000. The line shows current results, b ( x (cid:48) , y (cid:48) ) B − u [ k, l ] for x = ( L − /

2, 0 ≤ y ≤ L −

1, with L = 100 and Ω = 5 where x (cid:48) = x − x [ k, l ]and y (cid:48) = y − y [ k, l ]. The dotted circles show results presented by Erturk et al. [2005] and Wahba[2012] as a comparison. Subﬁgure 4(a) shows the plot for the entire y -range while subﬁgure 4(b)shows a larger view of the upper and lower y -range.grid resolutions and order of continuity. The linear conjugate gradient iteration (see Subsection5.1) accounted for most of the computation time (typically about 95 %). For higher Reynoldsnumbers, a higher grid resolution was required to achieve convergence. For Reynolds number, Re > Re = 40000, the computation was run for approximately 72 hourswith L = 135 and Ω = 5.The error, √ P , plotted with empty squares in Figure 8, shows a rapid initial convergencefollowed by a much slower rate of convergence. It is clear, however, when comparing withthe reference ﬁgures from Erturk et al. [2005], plotted with solid squares in Figure 8, that thecomputed solutions still undergo changes during the ﬁnal iterations. This may be explained bythe fact that, while the quantity √ P only measures how well the solution conforms to the givendiﬀerential equations (independently) at each point in the computational domain, the quantity RM S ref depends on the value of the solution at speciﬁc points which may be aﬀected by theaccumulation of small errors elsewhere in the computational domain. Additionally, some areas ofthe computational domain (e.g. near the lower corners where the velocity is discontinuous) maysuﬀer from large errors compared to the rest of the grid, dominating the value of the quantity √ P in the later stages of the iteration. The residual of the solutions computed by the references[Erturk et al., 2005, Wahba, 2012] converge to a smaller factor than the error of the currentsolutions. However, the error between grid points is not taken into consideration by Erturk et al.[2005], Wahba [2012], whereas in the current work the error is computed over a large number ofsub-grid sample points. 14 −→ Re = 40000Re = 30000Re = 20000Re = 10000Re = 5000 y → L − . . . − . − . L − L − L − Figure 5: This ﬁgure shows thecomputed x -component of thevelocity on a vertical line throughthe geometric center of thegrid for Reynolds number, Re ∈{ , , , , } .Only the upper ( y near L − y near 0) part of thecomputational domain is plotted.The diﬀerent lines shows currentresults, b ( x (cid:48) , y (cid:48) ) B − u [ k, l ] for x = ( L − /

2, 0 ≤ y ≤ L − L ∈ { , , , , } and Ω = 5 where x (cid:48) = x − x [ k, l ]and y (cid:48) = y − y [ k, l ].It is clear that both an increase of the grid resolution, L , and the order of continuity, Ω − Π (see Eq.(29)), and thus the required computation timeand the required amount of memory. It should also be noted, however, that increasing the orderof continuity (i.e. increasing Ω) also increases the number of nonzero sub/super-diagonals of thelinear system, Π , which also increases memory requirements and computation time. Despite thisdisadvantage for higher orders of continuity, it was found that using higher values of Ω was moreeﬃcient at computing solutions for high Reynolds numbers (Re (cid:39) L . The value of Ω is limited by theﬂoating point errors, as shown in Subsection 2.6. For this reason, a maximum value of 5 waschosen (i.e. Ω ≤ C )and a polynomial accuracy of order 9. An increase in spatial order (polynomial degree) of the grid ( p -reﬁnement) has advantages com-pared to increasing the grid resolution ( h -reﬁnement) in some cases when using the currentmethod, as shown by the high Re solutions for the lid-driven cavity. These solutions, computedon an ordinary desktop computer, are among the highest Reynolds numbers at which steadystate solutions for the lid-driven cavity have been published, even though obvious optimizations(e.g. mesh grading or parallel computation) were not used.Unlike pseudo-time ﬁnite diﬀerencing approaches, the current method for arriving at a steadystate solution does not yield periodic solutions as artifacts. Instability may appears if the gridresolution is insuﬃcient, but it is chaotic, and does not resemble a periodic ﬂow conﬁguration.It is clear from physical evidence that, for the high Reynolds numbers (Re (cid:39) = 70, Ω = 2 L Ω = 19600 L = 35, Ω = 3 L Ω = 11025 L = 11, Ω = 4 L Ω = 1936 L = 8, Ω = 5 L Ω = 1600 y = L − y = 0 Ghia et al.Erturk et al.CurrentFigure 6: This ﬁgure shows a comparison of the computed x -component of the velocity on avertical line through the geometric center of the grid for ( L, Ω) ∈ { (70 , , (35 , , (11 , , (8 , } with Reynolds number, Re = 1000. At the top ( y = L −

1) of the ﬁgure the x -componentof the velocity is zero and at the bottom ( y = 0) it is one (positive x -direction). The dottedcircles show results presented by Erturk et al. [2005] and Ghia et al. [1982].explained by the diﬀerent nature and magnitude of these inaccuracies. If a periodic behavior,observed when solving the two dimensional system, was exclusively due to the mathematicalqualities of the of the system (i.e. due to Poincar´e–Andronov–Hopf bifurcation), it is reasonableto assume that this behavior would have occurred at similar Reynolds numbers even thoughdiﬀerent numerical schemes were used.The grids with the highest order of continuity, Ω = 5 (equivalent to a polynomial degree of9, see Subsection 2.2-2.4), were the most eﬃcient for computing steady state solutions for highReynolds number ﬂows, but the numerical accuracy imposed limitations on further increase ofthe order of continuity. An improvement, for example by using increased ﬂoating point precisionor by ﬁnding basis functions with better numerical properties, is clearly possible.It is clear from the mathematical framework (see Section 2) that the current method canbe generalized to higher dimensions. Further, linear terms (e.g. time derivative, for unsteadyﬂows) may also be added to the governing equations in matrix form (see Subsection 3.3) with-out fundamentally changing the properties of the method. With the current method, and otherﬁnite-element based methods, one obtains coupled sets of equations depending on informationin a grid. The computational cost required to solve these systems tend to grow exponentiallywith the number of grid points. However, the computational cost of the numerical integration,which deﬁnes the equation set for the current method (see Subsection 4.1) grows linearly withthe number of grid points. This is an advantage because, instead of adapting the grid to complexgeometry or to diﬀerent ﬂuid phases, with the current method it is possible to select diﬀerentgoverning equations independently at diﬀerent sample points. One can also increase the densityof sample points in some areas if necessary (assuming appropriate weighting is applied). Inter-16ction with objects smaller than the grid scale can thus be incorporated. An interface betweenimmiscible ﬂuid phases can be incorporated in the same way. The latter will be demonstratedwith three-dimensional unsteady ﬂow in a forthcoming paper. Acknowledgments

This work was supported by BKK Production and The Research Council of Norway under TheIndustrial Ph.D Scheme.Alex Hoﬀmann , Jan Vaagen , Laszlo Csernai and Arne Sm˚abrekke are acknowledged forproductive discussions and valuable suggestions. Bibliography

E. Barragy and G. F. Carey. Stream function-vorticity driven cavity solutions using p ﬁniteelements.

Computers and Fluids , 26, 1996.E. Erturk, T. C. Corke, and G¨ok¸c¨ol. Numercal solutions of 2-d steady incompressible drivencavity ﬂow at high reynolds numbers.

International Journal for Numerical Methods in Fluids ,48, 2005.U. Ghia, K. N. Ghia, and T. C. Shin. High-re solutions for incompressible ﬂow using the navier-stokes equations and a multigrid method.

Journal of Computational Physics , 48, 1982.Anton Howard.

Elementary Linear Algebra . Wiley, New York, 2000.H. Nishida and N. Satofuka. Higher-order solutions of square driven cavity ﬂow using a variable-order multigrid method.

International Journal for Numerical Methods in Engineering , 34,1992.R. Schreiber and H. B. Keller. Driven cavity ﬂows by eﬃcient numerical techniques.

Journal ofComputational Physics , 49, 1982.Jonathan R. Shewchuk. An introduction to the conjugate gradient method without the agonizingpain, 1994.H. Takewaki, A. Nishiguri, and T. Yabe. Cubic interpolated pseudo-particle method (CIP) forsolving hyperbolic-type equations.

Journal of Computational Physics , 61, 1984.Lloyd N. Trefethen and David Bau, III.

Numerical Linear Algebra . Siam, Philadelphia, 1997.E. M. Wahba. Steady ﬂow simulations inside a driven cavity up to reynolds number 35,000.

Computers and Fluids , 66, 2012. Professor, University of Bergen, MAE Professor, University of Bergen, MAE Professor, University of Bergen, MAE Department Manager, BKK Production → x → (a) L = 20, Ω = 2, Re = 100 y → x → (b) L = 20, Ω = 2, Re = 100 yx y → x → (c) L = 135, Ω = 5, Re = 40000 y → x → (d) L = 135, Ω = 5, Re = 40000 Figure 7: Subﬁgures 7(a) and 7(b) show the computed ﬂow conﬁguration for Reynolds number,Re = 100, with L = 20 and Ω = 2. Subﬁgure 7(a) shows the entire computational domain, whileSubﬁgure 7(b) shows details at the upper left corner (commonly referred to as the tertiary vortex).Subﬁgures 7(c) and 7(d) show the computed ﬂow conﬁguration for Reynolds number, Re = 40000,with L = 135 and Ω = 5. Subﬁgure 7(c) shows the entire computational domain, while Subﬁgure7(d) shows details at the lower left corner (commonly referred to as the quaternary vortex,however in this case it is split up into multiple sub-vortices). The color indicates the magnitudeof the velocity, (cid:107) ( u, v ) (cid:107) , where orange is for (cid:107) ( u, v ) (cid:107) = 1, green is for (cid:107) ( u, v ) (cid:107) = 1 / (cid:107) ( u, v ) (cid:107) = 0 (and interpolated between these colors for the intermediate values).Contour lines of the velocity magnitude are drawn in white with a contour interval of 1 /

20 inSubﬁgure 7(a), 1 / /

20 in Subﬁgure 7(c) and 1 /

200 in Subﬁgure 7(d). Thearrows are of constant length in each subﬁgure and are drawn in a Lagrangian coordinate system,deﬁned by the two orthogonal unit vectors ˆ x l = ( u, v ) / (cid:107) ( u, v ) (cid:107) and ˆ y l = ( v, − u ) / (cid:107) ( u, v ) (cid:107) ,pointing in the positive direction along the unit vector, ˆ x l . The grid resolution is indicated bydots along the edge of the grid. 18 M S ref √ P R M S r e f − → √ P − → − − − (a) L = 20, Ω = 3, Re = 1000 RM S ref √ P R M S r e f − → √ P − → − − − (b) L = 11, Ω = 4, Re = 1000 RM S ref √ P R M S r e f − → √ P − → − − − (c) L = 200, Ω = 2, Re = 5000 RM S ref √ P R M S r e f − → √ P − → − − − (d) L = 40, Ω = 5, Re = 5000 Figure 8: These ﬁgures show the error (empty squares) and the deviance (solid squares) froma reference solution [Erturk et al., 2005] on a logarithmic scale for four separate computations,run through several iterations until comparable accuracy was reached. The error, √ P , (plottedwith empty squares) is the root–mean–square grid-cell error (numerically integrated over theentire grid). The quantity, P , is given in Eq.(37) as a function of the parameters θ u , θ v and θ p which are determined as explained in Subsection 5.2. The quantity, RM S ref , (plotted withsolid squares) is the root–mean–square deviance of the computed solution as compared with theﬁgures given by Erturk et al. [2005] for the x -component of the velocity along a vertical linethrough the geometric center of the cavity. Subﬁgures 8(a) and 8(b) shows these quantities over25 iterations using L = 20 , Ω = 3 and L = 11 , Ω = 4, respectively, and with Reynolds number,Re = 1000, where an approximate solution for Reynolds number, Re = 400, was used to initializethe ﬂow components. Subﬁgures 8(c) and 8(d) shows these quantities over 30 and 50 iterationsusing L = 200 , Ω = 2 and L = 40 ,,