Refined isogeometric analysis for generalized Hermitian eigenproblems
Ali Hashemian a,∗, David Pardo b,a,c, Victor M. Calo d,e,f
a BCAM – Basque Center for Applied Mathematics, Bilbao, Basque Country, Spain
b University of the Basque Country UPV/EHU, Bilbao, Basque Country, Spain
c Ikerbasque – Basque Foundation for Sciences, Bilbao, Basque Country, Spain
d Western Australian School of Mines, Curtin University, Perth, Australia
e Commonwealth Scientific and Industrial Research Organisation (CSIRO), Perth, Australia
f Curtin Institute for Computation, Curtin University, Perth, Australia
Abstract
We use refined isogeometric analysis (rIGA) to solve generalized Hermitian eigenproblems (Ku = λMu). The rIGA framework conserves the desirable properties of maximum-continuity isogeometric analysis (IGA) discretizations while reducing the computational cost of the solution by partitioning the computational domain with zero-continuity basis functions. As a result, rIGA enriches the approximation space and decreases the interconnection between degrees of freedom. We compare the computational costs of rIGA versus those of IGA when employing a Lanczos eigensolver with a shift-and-invert spectral transformation. When all eigenpairs within a given interval [λ_s, λ_e] are of interest, we select several shifts σ_k ∈ [λ_s, λ_e] using a spectrum slicing technique. For each shift σ_k, the cost of the factorization of the spectral transformation matrix K − σ_k M drives the total computational cost of the eigensolution. Several multiplications of the operator matrix (K − σ_k M)^{−1} M by vectors follow this factorization. Let p be the polynomial degree of the basis functions and assume that IGA has maximum continuity of p − 1, while rIGA introduces C^0 separators to minimize the factorization cost. For this setup, our theoretical estimates predict computational savings to compute a fixed number of eigenpairs of up to O(p^2) in the asymptotic regime, that is, for large problem sizes. Yet, our numerical tests show that, for moderately-sized eigenproblems, the total computational cost reduction is O(p). Nevertheless, rIGA improves the accuracy of every eigenpair of the first N eigenvalues and eigenfunctions. Here, we allow N to be as large as the total number of eigenmodes of the original maximum-continuity IGA discretization.

Keywords:
Refined isogeometric analysis; generalized Hermitian eigenproblem; Lanczos eigensolver; spectral transformation; shift-and-invert approach.
1. Introduction
Hughes et al. [1] introduced isogeometric analysis (IGA), a widely-used numerical technique for the solution approximation of partial differential equations (PDEs). IGA has delivered useful solutions to many scientific and engineering problems (see, e.g., [2–9]). IGA uses spline basis functions, which are standard in computer-aided design (CAD), as shape functions of finite element analysis (FEA). These functions can have high continuity (up to C^{p−1}, where p is the polynomial order of the spline bases) across the element interfaces. Compared to the minimal interconnection of elements in traditional FEA, maximum-continuity IGA discretizations strengthen the interconnection between elements. This increased interconnectivity degrades the performance of sparse direct solvers (see, e.g., [10]). Garcia et al. [11] introduced the refined isogeometric analysis (rIGA) to ameliorate this performance degradation and to exploit the recursive partitioning capability of multifrontal direct solvers [12]. The rIGA framework preserves some of the desirable properties of maximum-continuity IGA discretizations while partitioning the computational domain into macroelement blocks that are weakly interconnected by C^0 separators.
∗ Corresponding author
Email address: [email protected] (Ali Hashemian)
As a result, the matrix factorization (e.g., LU or Cholesky) step has a lower computational cost. The performance of the rIGA framework with preconditioned conjugate gradient solvers, as well as its applicability to mechanical and electromagnetic problems, is also studied in [13, 14].

The application of IGA in eigenanalysis is a well-studied topic in the literature (see, e.g., [15–18]). Improving the efficiency of the system integration and the accuracy of the spectral approximation of the IGA approach is also of interest (see, e.g., [19–26]). Herein, we investigate the beneficial effect of using refined isogeometric analysis in eigenproblems. We compare the computational cost and accuracy of the resulting eigenpairs that both refined and maximum-continuity IGA produce. We first review some numerical aspects of eigenanalysis to perform a detailed comparison of the methods and their results.

Eigenanalysis is a computationally expensive proposition, especially when seeking a large number of eigenpairs (i.e., eigenvalues and eigenvectors) on a multidimensional domain. Frequently-used Krylov eigensolvers such as Lanczos and Arnoldi project onto Krylov subspaces. The convergence rate of these iterative algorithms degrades when computing a large interval of eigenvalues, particularly when the eigenmodes are not well separated. Eigenvalue clustering and repetition are common in multidimensional PDEs. Let us consider the discrete system Ku = λMu as a generalized Hermitian eigenproblem (GHEP), where the term generalized distinguishes it from the standard Hermitian eigenproblem Au = λu. A well-established recommendation in the literature (e.g., [27–31]) is to first perform a spectral transformation (ST), and then solve the shifted eigenproblem (K − σM)u = (λ − σ)Mu. This spectral shift results in fast convergence when calculating eigenvalues near the shift σ. A more efficient way to approximate eigenpairs is to solve a "shift-and-invert" spectrally transformed eigenproblem (K − σM)^{−1}Mu = θu, with θ = 1/(λ − σ). This approach preserves the separation of eigenvalues near σ to reach an accurate eigensolution. In many practical cases, we seek all eigenpairs λ_i and u_i within a given (large) interval λ_i ∈ [λ_s, λ_e], where either λ_s or λ_e can be infinite. Hence, we select several shifts σ_k ∈ [λ_s, λ_e] to preserve the convergence rate for eigenvalues far from σ. We employ a "spectrum slicing technique" (see, e.g., [32]) to dynamically select the σ_k and find all eigenpairs with the true multiplicities, without incurring unnecessary computational effort. Section 4 provides more algorithmic details of the eigenanalysis.

The factorization of the ST matrix K − σ_k M for each σ_k is a major component of the computational effort that the eigenanalysis requires, especially when dealing with moderate to large algebraic systems. Once we compute this factorization, the computation of the Krylov subspace requires several multiplications of the operator matrix (K − σ_k M)^{−1}M by vectors. These multiplications consist of two steps, namely the forward/backward eliminations with the respective factorized forms of the ST matrices, and the products of the matrix M by vectors. Let N be the total number of degrees of freedom in the system. Using maximum-continuity IGA, the computational cost of factorization is O(N^{1.5} p^3) and O(N^2 p^3) for 2D and 3D systems, respectively.
The cost of performing forward/backward eliminations is O(N p^2) and O(N^{4/3} p^2) for 2D and 3D systems, respectively, while it is O(N p^2) and O(N p^3) for multiplying the matrix M by vectors in the 2D and 3D cases, respectively (cf. [10, 33]).

In Section 5, we show that rIGA computes a fixed number of eigenpairs faster than maximum-continuity IGA. When using multifrontal direct solvers, rIGA reduces the factorization cost by up to O(p^2) for large domains. Also, the cost of the forward/backward eliminations decreases by O(p), since the factorized form of the ST matrix has fewer nonzero entries in rIGA than in its IGA counterpart (see [11]). Nevertheless, the matrix multiplication of M by vectors is slightly more expensive for rIGA, as M has more nonzero entries than its smooth counterpart. There are other costs, such as vector–vector products, which grow when using rIGA. However, their contribution to the total cost is irrelevant compared to those of the factorization and the matrix–vector multiplications mentioned above.

In practice, the numerical tests show that the total computational cost of the eigensolution decreases by a factor of O(p) when employing the rIGA discretization, while our theoretical analysis shows that an improvement of O(p^2) is asymptotically possible, that is, for sufficiently large problems, where the matrix factorization governs the solution cost. Additionally, rIGA approximates better the first N eigenvalues and eigenfunctions, where N can be as large as the total number of degrees of freedom of the smooth IGA discretization. The improved accuracy is a consequence of the continuity reduction of the basis functions, which enriches the Galerkin space and modifies the approximation properties of the smooth IGA approach.

The organization of the remainder of this paper is as follows. Section 2 defines the problem. Section 3 briefly revisits the notation and definitions of the smooth (maximum-continuity) IGA and rIGA frameworks. Section 4 describes the eigensolution algorithm for finding the eigenpairs, while Section 5 derives theoretical cost estimates of the eigensolution under IGA and rIGA discretizations. We provide implementation details in Section 6, numerical cost evaluations in Section 7, and accuracy assessments in Section 8. Finally, Section 9 draws conclusions.
2. Preliminaries
We consider the eigensolution of the Laplace operator in the unit square (unit cube in 3D):

Find λ ∈ ℝ⁺ and u(x) ∈ H_0^1(Ω), satisfying
∇²u + λu = 0,  u(x)|_{∂Ω} = 0,  x ∈ Ω := (0, 1)^d,  (1)
where d = 2, 3 is the spatial dimension. The exact eigenvalues λ^e and eigenfunctions u^e in 2D and 3D are
2D: λ^e_{ij} = π²(i² + j²),  u^e_{ij} = 2 sin(iπx) sin(jπy),
3D: λ^e_{ijk} = π²(i² + j² + k²),  u^e_{ijk} = 2√2 sin(iπx) sin(jπy) sin(kπz).  (2)
Considering an arbitrary test function v(x) ∈ H_0^1(Ω), a weak form of Eq. (1) is
∫_Ω ∇u · ∇v dΩ = λ ∫_Ω u v dΩ.  (3)
Using the symmetric bilinear forms
a(u, v) := ∫_Ω ∇u · ∇v dΩ,  (u, v) := ∫_Ω u v dΩ,  (4)
we write the weak formulation as
a(u, v) = λ (u, v).  (5)
Introducing a Galerkin discretization of the continuous eigenproblem leads to the following discrete form (the superscript h refers to numerically computed eigenpairs) [34]:
a(u^h, v^h) = λ^h (u^h, v^h).  (6)

Considering the exact eigenpairs of Eq. (2) and their numerical counterparts obtained by Eq. (6), we assess the accuracy of the spectral approximation through the eigenvalue error and the eigenfunction L² and energy error norms. Using the Pythagorean eigenvalue error theorem (see [34]), for the i-th discrete eigenmode, the eigenvalue error and the eigenfunction errors in the L² and energy norms satisfy
(λ^h_i − λ^e_i) + λ^e_i ‖u^h_i − u^e_i‖²_{L²} = ‖u^h_i − u^e_i‖²_E,  (7)
where
‖u^h_i − u^e_i‖²_{L²} = (u^h_i − u^e_i, u^h_i − u^e_i),  ‖u^h_i − u^e_i‖²_E = a(u^h_i − u^e_i, u^h_i − u^e_i).  (8)
Based on the above theorem, we define, for the i-th eigenmode of our spectral approximation, the normalized eigenvalue error (EVerr) and the eigenfunction L² and energy norm errors (EFerr_{L²} and EFerr_E, respectively) as
EVerr := (λ^h_i − λ^e_i)/λ^e_i,  EFerr_{L²} := ‖u^h_i − u^e_i‖²_{L²},  EFerr_E := ‖u^h_i − u^e_i‖²_E / λ^e_i = EVerr + EFerr_{L²}.  (9)
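To make the exact spectrum of Eq. (2) and the normalized eigenvalue error of Eq. (9) concrete, the following minimal Python sketch (our own illustration, not part of the original implementation) builds the exact 2D eigenvalues and evaluates EVerr for a set of approximate eigenvalues; the array lam_h is a placeholder for the discrete eigenvalues that an eigensolver would return.

```python
import numpy as np

def exact_eigenvalues_2d(kmax):
    """Exact eigenvalues pi^2 (i^2 + j^2) of Eq. (2), sorted ascending."""
    i, j = np.meshgrid(np.arange(1, kmax + 1), np.arange(1, kmax + 1))
    return np.sort((np.pi**2 * (i**2 + j**2)).ravel())

lam_e = exact_eigenvalues_2d(50)
# placeholder discrete eigenvalues (in practice these come from Eq. (13))
lam_h = lam_e * (1.0 + 1e-6 * np.arange(lam_e.size))
ev_err = (lam_h - lam_e) / lam_e          # EVerr of Eq. (9)
print(lam_e[:4])                          # pi^2 * [2, 5, 5, 8]
print(ev_err[:4])
```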
3. Refined isogeometric analysis
We first review some basic concepts of maximum-continuity IGA discretizations. For the sake of simplicity, we assume that the computational grid consists of a tensor-product mesh in [0, 1]^d with the same number of equally-spaced elements in each direction. For descriptions of how to use tensor-product discretizations relying on mapped geometries, refer to [1, 35–37].

We discretize our computational domain with a uniform mesh of n_e^d elements, n_e being the number of elements in each direction (see Fig. 1 for the 2D case). The approximate solution field u^h(x) is described by the B-spline representation as
2D: u^h(x, y) = Σ_{i=0}^{n−1} Σ_{j=0}^{n−1} B_{i,p}(x) B_{j,p}(y) U_{ij},
3D: u^h(x, y, z) = Σ_{i=0}^{n−1} Σ_{j=0}^{n−1} Σ_{k=0}^{n−1} B_{i,p}(x) B_{j,p}(y) B_{k,p}(z) U_{ijk},  (10)
where U is the d-th-order tensor of control variables (i.e., degrees of freedom), p is the polynomial degree of the basis functions (equal in all directions), and n = n_e + p is the number of control variables in each direction (with maximum continuity C^{p−1} of the basis functions). The parameter space is then characterized by the knot sequence Ξ with single interior multiplicities, given for the x direction by
Ξ = [0, 0, ..., 0 (p+1 times), x_{p+1}, x_{p+2}, ..., x_{n−1}, 1, 1, ..., 1 (p+1 times)],  (11)
and the B-spline basis functions B_{i,p} are expressed by the Cox–de Boor recursion formula [38] as
B_{i,0}(x) = 1 if x_i ≤ x < x_{i+1}, and 0 otherwise,
B_{i,p}(x) = (x − x_i)/(x_{i+p} − x_i) B_{i,p−1}(x) + (x_{i+p+1} − x)/(x_{i+p+1} − x_{i+1}) B_{i+1,p−1}(x).  (12)
Applying the weak form given by Eq. (6) to the above discretization, the generalized Hermitian eigenproblem (GHEP) is
K u^h = λ^h M u^h,  (13)
where K and M are the real symmetric (or Hermitian) stiffness and mass matrices, respectively.

Figure 1: A uniform bicubic 8 × 8 tensor-product discretization (left) and the solution surface u^h(x, y) (right) with the net of control variables U_{ij} (red) and the projection of elements on the solution surface (blue). Control variables coincide with the solution surface on the domain boundaries, but not necessarily at interior points.
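As an illustration of Eqs. (11)–(12), the sketch below (a hypothetical helper of our own, not code from the paper or PetIGA) builds the open knot vector for a uniform mesh and evaluates the B-spline basis with the Cox–de Boor recursion; the partition-of-unity check is a quick sanity test.

```python
import numpy as np

def bspline_basis(i, p, xi, x):
    """Cox-de Boor recursion of Eq. (12), with 0/0 conventions handled."""
    if p == 0:
        return 1.0 if xi[i] <= x < xi[i + 1] else 0.0
    left = right = 0.0
    if xi[i + p] > xi[i]:
        left = (x - xi[i]) / (xi[i + p] - xi[i]) * bspline_basis(i, p - 1, xi, x)
    if xi[i + p + 1] > xi[i + 1]:
        right = (xi[i + p + 1] - x) / (xi[i + p + 1] - xi[i + 1]) * bspline_basis(i + 1, p - 1, xi, x)
    return left + right

ne, p = 8, 3
# open knot vector of Eq. (11): p+1 repeated end knots, single interior knots
xi = np.concatenate([np.zeros(p), np.linspace(0.0, 1.0, ne + 1), np.ones(p)])
n = len(xi) - p - 1                       # n = ne + p basis functions
x = 0.37
values = [bspline_basis(i, p, xi, x) for i in range(n)]
print(sum(values))                        # partition of unity: ~1.0
```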
Remark 1. The eigenvectors u^h in Eq. (13) are the vectorial form of the tensor of control variables U. Hence, we obtain the tensor form by U = Tensor(u^h) [39], and compute the numerical eigenfunction u^h for each eigenmode by Eq. (10).

3.1. Mass and stiffness matrices

We define the 1D stiffness and mass matrices in the x, y, and z directions as
K^x_{ij} := ∫_x B'_{i,p}(x) B'_{j,p}(x) dx,  K^y_{ij} := ∫_y B'_{i,p}(y) B'_{j,p}(y) dy,  K^z_{ij} := ∫_z B'_{i,p}(z) B'_{j,p}(z) dz,
M^x_{ij} := ∫_x B_{i,p}(x) B_{j,p}(x) dx,  M^y_{ij} := ∫_y B_{i,p}(y) B_{j,p}(y) dy,  M^z_{ij} := ∫_z B_{i,p}(z) B_{j,p}(z) dz.  (14)
We build the 2D and 3D system matrices with the following formulae (see, e.g., [39]):
2D: K = M^y ⊗ K^x + K^y ⊗ M^x,  M = M^y ⊗ M^x,
3D: K = M^z ⊗ M^y ⊗ K^x + M^z ⊗ K^y ⊗ M^x + K^z ⊗ M^y ⊗ M^x,  M = M^z ⊗ M^y ⊗ M^x,  (15)
where ⊗ denotes the Kronecker product.

3.2. The rIGA partitioning scheme

In refined IGA, we improve the approximation space to reduce the computational cost of the solution as well as to better approximate the solution. That is, rIGA reduces the continuity of certain basis functions to reduce the interconnection between degrees of freedom of the mesh [11]. By increasing the multiplicity of some existing knots up to the degree of the B-spline bases in the h-refinement sense, the continuity and support size of the affected basis functions decrease without adding new elements. The resulting zero-continuity basis functions partition the computational space into interconnected blocks separated by C^0 hyperplanes. This connectivity reduction significantly reduces the solver cost by reducing the cost of the matrix factorization as well as of the forward/backward elimination (see [11, 14]). The knot-insertion steps add new control variables and, therefore, enrich the Galerkin space, modifying the spectral approximation properties of the IGA approach. Fig. 2 describes three symmetric partitioning levels with respective blocksizes of 4, 2, and 1 for the bicubic 8 × 8 mesh (i.e., 2, 4, and 8 C^0 blocks in each direction).
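The tensor-product assembly of Eq. (15) can be sketched as follows. The tridiagonal matrices below are stand-ins for the 1D B-spline stiffness and mass matrices of Eq. (14), used only so the example runs without a full IGA assembly routine; the Kronecker structure itself is exactly that of Eq. (15).

```python
import numpy as np
from scipy.sparse import kron, csr_matrix

def demo_1d(n):
    """Tridiagonal stand-ins for the 1D matrices K^x, M^x of Eq. (14)."""
    h = 1.0 / (n + 1)
    K1 = (np.diag(np.full(n, 2.0)) - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / h
    M1 = (np.diag(np.full(n, 4.0)) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)) * h / 6.0
    return csr_matrix(K1), csr_matrix(M1)

Kx, Mx = demo_1d(20)
Ky, My = demo_1d(20)
K2 = kron(My, Kx) + kron(Ky, Mx)   # 2D stiffness matrix, Eq. (15)
M2 = kron(My, Mx)                  # 2D mass matrix, Eq. (15)
print(K2.shape, M2.nnz)
```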
Figure 2: Different partitioning levels (ℓ = 1, 2, 3) and the respective C^0 separators (in red, purple, and black) for the bicubic 8 × 8 mesh, with blocksizes of 4, 2, and 1 elements.
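The partitioning of Fig. 2 can be described purely in terms of knot multiplicities. The following sketch (an illustrative helper of our own, not PetIGA code, written under the uniform-mesh assumption n_e = 2^s used below) raises the multiplicity of the knots at the 2^ℓ − 1 block interfaces to p, which produces the C^0 separators, and reports the number of basis functions per direction; subtracting the two Dirichlet boundary functions reproduces the per-direction counts implied by Eq. (16) later in this section.

```python
import numpy as np

def open_knot_vector(ne, p, level=0):
    """Open knot vector of Eq. (11); for level > 0, the knots at the block
    interfaces (blocksize ne / 2**level) get multiplicity p, i.e., C^0 there."""
    breaks = np.linspace(0.0, 1.0, ne + 1)
    mult = np.ones(ne + 1, dtype=int)
    mult[0] = mult[-1] = p + 1                 # clamped (open) ends
    if level > 0:
        block = ne // 2**level
        mult[block:-1:block] = p               # C^0 separators
    return np.repeat(breaks, mult)

ne, p, s = 8, 3, 3
for level in range(s + 1):
    xi = open_knot_vector(ne, p, level)
    n_basis = len(xi) - p - 1                  # basis functions per direction
    print(level, n_basis)                      # n_basis - 2 Dirichlet DOFs per direction
```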
Remark 2. For simplicity, we assume the mesh size in each direction is a power of two (i.e., n_e = 2^s). This assumption allows us to split the mesh symmetrically and obtain 2^ℓ blocks (i.e., macroelements) in each direction with blocksize 2^{s−ℓ}, where ℓ = 0, 1, ..., s is the partitioning level. Here, ℓ = 0 recovers the maximum-continuity IGA discretization with C^{p−1} continuity everywhere, while ℓ = s is equivalent to FEA with C^0 continuity at all element interfaces (knot lines).

Fig. 3 depicts the matrix patterns of a 1D domain under different discretizations with n_e = 8 and p = 3. The figure shows the strong interconnectivity between degrees of freedom for maximum-continuity IGA. The figure also shows that rIGA partitioning weakens this connectivity, accelerating the solution and reducing the memory required to solve. For the maximum partitioning level (ℓ = s), the pattern coincides with that of the C^0 FEA discretization.

Figure 3: Matrix patterns of a cubic eight-element domain in 1D under different discretizations (green squares delimit elemental matrices): (a) maximum-continuity C^{p−1} IGA, (b) rIGA with blocksize = 4, (c) rIGA with blocksize = 2, and (d) rIGA with blocksize = 1, which is equal to the cubic FEA discretization. Uniform high continuity implies strong interconnection between degrees of freedom (red dots) in (a). The interconnection weakens for the interior elements of each macroelement under the rIGA discretizations in (b) and (c). Magenta dots denote the C^0 interconnection between elements.

Increasing the multiplicity of an existing knot up to p (i.e., the degree of the bases) adds p − 1 new control variables in that direction.
Hence, the ℓ-th partitioning level (ℓ > 0) adds [2^ℓ − 1](p − 1) control variables in each direction. Consequently, the total number of degrees of freedom to discretize Eq. (1) is
IGA: N = (n_e + p − 2)^d,  rIGA: N = [n_e + 2^ℓ(p − 1) − 1]^d.  (16)

4. Solving generalized Hermitian eigenproblems

The eigenproblem (13) defines a large sparse system of matrices with eigenvalues that may have arbitrary multiplicity. Numerically, we seek to compute the eigenpairs λ^h_i and u^h_i within the given interval λ^h_i ∈ [λ_s, λ_e], where either λ_s or λ_e can be infinite. The conversion of a generalized eigenproblem to a standard one is fraught. The transformation can factor M (or K) into its Cholesky decomposition M = LL^T (or K = LL^T) and solve L^{−1}KL^{−T}w = λ^h w (or L^{−1}ML^{−T}w = w/λ^h), where u^h = L^{−T}w (the superscript T refers to the transpose or conjugate transpose of a real symmetric or a Hermitian matrix, respectively). Either transformation may numerically fail. This fragility of the computation process can have many causes. For example, K could be semidefinite, the eigenvalues may be insufficiently separated, or L may be poorly conditioned. Any of these failings affects the extraction of eigenvectors from u^h = L^{−T}w in the backward elimination [29]. Thus, eigensolvers generally perform a spectral transformation (ST) and solve the following shifted problem,
(K − σM) u^h = (λ^h − σ) M u^h,  (17)
to obtain an accurate approximation of the eigenpairs (see, e.g., [28–31]). Then, most eigensolvers use the shift-and-invert eigenproblem by solving the following system:
(K − σM)^{−1} M u^h = θ u^h,  (18)
where θ = 1/(λ^h − σ). The resulting operator matrix (K − σM)^{−1}M is not symmetric, but it is self-adjoint with respect to M. Since the matrices K and M have different null spaces, the ST matrix K − σM is non-singular unless σ is an eigenvalue. Hence, Eq. (18) resolves the (potential) issue of dealing with semidefinite system matrices. Another main advantage of using the shift-and-invert problem is that it transforms the eigenvalues λ^h_i nearest the shift σ into the largest and well-separated eigenvalues θ_i of the reciprocal eigenproblem of Eq. (18) (see Fig. 4). A well-selected shift enables the eigensolver to compute many eigenpairs in a single iteration.

Figure 4: The shift-and-invert spectral transformation: λ^h values near the shift σ are well separated on the θ axis.

In practical cases where the requested eigenvalue interval is large, we select additional shifts to prevent deterioration of the convergence rate of the eigensolution when the desired eigenvalues are far from σ. An example is to find all critical speeds of a turbine shaft in a given working interval. For this purpose, we select the σ_k using a spectrum slicing technique (see, e.g., [32]). This method finds the requested eigenpairs with the true multiplicities while minimizing the computational effort (see Section 4.2).

Apart from slicing the spectrum, Krylov eigensolvers also incorporate effective restarting mechanisms. Restarting prevents an increase of the computational work needed for each shift in systems with significant numbers of degrees of freedom. Well-known restarting techniques are Sorensen's implicit restart [40], employed in the context of Lanczos methods, the thick-restart Lanczos method of Wu and Simon [41], and its unsymmetric equivalent for the Arnoldi case, referred to as the Krylov–Schur method of Stewart [42, 43].
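For readers who want to experiment with the shift-and-invert transformation of Eq. (18) without a full IGA stack, the following SciPy sketch applies it to a toy generalized problem. The finite-difference Laplacian and identity mass matrix are placeholders for the K and M of this paper; eigsh factorizes K − σM internally and returns the eigenvalues of the original pencil closest to the shift.

```python
import numpy as np
from scipy.sparse import diags, identity, kron
from scipy.sparse.linalg import eigsh

# toy operators standing in for the IGA matrices K and M
n = 100
K1 = diags([-1, 2, -1], [-1, 0, 1], shape=(n, n)) * (n + 1) ** 2
I1 = identity(n)
K = kron(K1, I1) + kron(I1, K1)        # 2D Laplacian
M = identity(n * n, format="csr")

sigma = 500.0                           # shift inside the interval of interest
# shift-and-invert: eigenvalues of K u = lambda M u nearest sigma
vals, vecs = eigsh(K, k=20, M=M, sigma=sigma, which="LM")
print(np.sort(vals)[:5])
```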
Herein, we focus on the implementation of the Lanczos method for solving generalized Hermitian eigenproblems. For each individual shift σ_k, given the factorization (K − σ_k M) = LL^T, we define the operator matrix H := L^{−T} L^{−1} M, so that the eigenproblem of Eq. (18) becomes H u^h = θ u^h. The m-step Lanczos decomposition consists of reducing the N × N matrix H to a symmetric tridiagonal matrix T_m (m ≪ N) as follows (see, e.g., [30, 44]):
H V_m = V_m T_m + β_{m+1} v_{m+1} e_m^T,  (19)
where e_m is the m-th coordinate vector and the term β_{m+1} v_{m+1} e_m^T is the residual of the m-step Lanczos decomposition. In the above equation, T_m is the symmetric tridiagonal matrix
T_m = tridiag(α_1, ..., α_m; β_2, ..., β_m),  (20)
with diagonal entries α_1, ..., α_m and off-diagonal entries β_2, ..., β_m, and V_m := [v_1, v_2, ..., v_m] is the matrix of Lanczos vectors. Assuming β_1 = 0 and that v_1 is an initial generalized unit vector of length N, i.e., ‖v_1‖_M = (v_1^T M v_1)^{1/2} = 1, the components of T_m and the vectors v_{j+1} (j = 1, 2, ..., m) are obtained by the following recurrence formulae:
α_j = v_j^T M H v_j,
β_{j+1} = ‖H v_j − α_j v_j − β_j v_{j−1}‖_M,
v_{j+1} = (H v_j − α_j v_j − β_j v_{j−1}) / β_{j+1}.  (21)
In this way, the Lanczos vector v_{m+1} is M-orthogonal to the columns of V_m in the Gram–Schmidt sense, resulting in V_m^T M v_{m+1} = 0. Hence, premultiplying Eq. (19) by V_m^T M and noting that V_m is an M-orthogonal matrix (i.e., V_m^T M V_m = I) leads to
V_m^T M H V_m = T_m.  (22)
The above equation reveals that T_m is the M-orthogonal projection of H onto the m-th order Krylov subspace K_m(H, v_1). Therefore, if ω_j and w_j are the eigenpairs of T_m (commonly referred to as Ritz values and Ritz vectors), the Rayleigh–Ritz approximation of the eigenpairs of H can be computed as
θ_j = ω_j,  u^h_j = V_m w_j.  (23)
The eigenvalues of the original GHEP of Eq. (13) are then recovered for each shift σ_k as λ^h_j = σ_k + 1/θ_j. Since T_m is a symmetric tridiagonal matrix, there exist multiple methods for computing its eigenpairs (see, e.g., [45]).
Remark 3. By employing an LDL^T Cholesky factorization of the ST matrix, i.e., (K − σ_k M) = LDL^T, where D is a diagonal matrix, and based on Sylvester's law of inertia [46], the number of eigenvalues smaller than σ_k is equal to the number of negative eigenvalues of D. Therefore, defining ν_k := ν(K − σ_k M) as the number of eigenvalues smaller than σ_k, for two consecutive shifts σ_k and σ_{k+1} the interval [σ_k, σ_{k+1}] contains ν_{k+1} − ν_k eigenvalues, including their multiplicities. This rule drives the spectrum slicing technique when determining the required number of shifts.

For each shift, only m Lanczos steps are carried out, followed by a restarting algorithm, until all eigenpairs corresponding to the interval [σ_k, σ_{k+1}] are computed. After computing V_m, a new Lanczos process starts, which benefits from the previously obtained spectral approximation. The thick-restart approach presented in [41] is an effective restarting technique for Hermitian eigenproblems. When restarting, the eigensolver keeps an appropriate number of Lanczos vectors, say c < m. The Lanczos recurrence of Eq. (21) continues after restarting with the following initial values:
r_{c+1} = H v_{c+1} − α_{c+1} v_{c+1} − Σ_{i=1}^{c} β_i v_i,
β_{c+2} = ‖r_{c+1}‖_M,
v_{c+2} = r_{c+1} / β_{c+2},  (24)
where the orthogonalization is with respect to all previously stored Lanczos vectors. Other points to consider are the deflation and spectrum-recycling algorithms. The former maintains the orthogonality of eigenvectors associated with a cluster of eigenvalues obtained from different shifts. The latter transforms some Lanczos bases from a previous shift to the current one in case they generate the same Krylov subspace. More details about the above-mentioned algorithms can be found in, e.g., [30, 32, 41, 47].
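A compact Python sketch of the M-orthogonal shift-and-invert Lanczos recurrence of Eqs. (19)–(23) is given below. It is illustrative only: a sparse LU factorization stands in for the LDL^T Cholesky factorization used in the paper, full reorthogonalization replaces the restarting, deflation, and spectrum-recycling machinery discussed above (and no breakdown handling is included), and the function name and interface are our own.

```python
import numpy as np
import scipy.sparse.linalg as spla

def si_lanczos(K, M, sigma, m, seed=0):
    """m-step M-orthogonal Lanczos on H = (K - sigma*M)^{-1} M, Eqs. (19)-(23)."""
    n = K.shape[0]
    lu = spla.splu((K - sigma * M).tocsc())      # factorization of the ST matrix
    H = lambda x: lu.solve(M @ x)                # one mat-vec + one f/b elimination
    v = np.random.default_rng(seed).standard_normal(n)
    v /= np.sqrt(v @ (M @ v))                    # ||v_1||_M = 1
    V, alpha, beta = [v], [], []
    v_prev, b = np.zeros(n), 0.0
    for _ in range(m):
        w = H(V[-1])
        a = V[-1] @ (M @ w)                      # alpha_j = v_j^T M H v_j
        r = w - a * V[-1] - b * v_prev
        for u in V:                              # full reorthogonalization (M-inner product)
            r -= (u @ (M @ r)) * u
        b = np.sqrt(r @ (M @ r))                 # beta_{j+1} = ||r||_M
        alpha.append(a); beta.append(b)
        v_prev = V[-1]
        V.append(r / b)
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    theta, W = np.linalg.eigh(T)                 # Ritz values/vectors of T_m
    lam = sigma + 1.0 / theta                    # lambda = sigma + 1/theta
    U = np.column_stack(V[:-1]) @ W              # Ritz vectors of the GHEP
    return lam, U
```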
5. Cost estimation of the eigensolution
Estimating the cost of an eigensolution is challenging because it involves several numerical algorithms. In addition, different eigensolvers, e.g., SLEPc [48], ANASAZI [49], and ARPACK [50], have their own methodologies for the eigensolution. Here, we focus on the formulation of the Lanczos decomposition of Section 4.2, for which we estimate the computational cost (measured in time) based on the most expensive operations.

According to Eq. (20), constructing T_m involves m computations of α_j and m − 1 computations of β_j. Eq. (21) shows that computing each α_j requires one forward/backward elimination with the Cholesky factors of the ST matrix, two multiplications of the mass matrix M by the respective vectors, and one vector product. In the following, we refer to these operations as "f/b elimination", "mat–vec", and "vec–vec", respectively. On the other hand, the computation of β_j only needs one mat–vec followed by one vec–vec for the M-norm calculation. Additionally, extracting the eigenvectors by Eq. (23) requires multiplying the N × m Lanczos matrix V_m by the Ritz vectors.

To determine the cost of matrix factorization and f/b elimination, we follow the theoretical estimates of 2D and 3D systems in terms of floating point operations (FLOPs) described in [10, 11]. We have
IGA: FLOPs_fact = O(N^{(d+1)/2} p^3),  rIGA: FLOPs_fact = 2^{dℓ} O((2^{−dℓ}N)^{(d+1)/2} p^3) + O(N^{(d+1)/2}),  (25)
IGA: FLOPs_f/b = O(N^{(d+1)/3} p^2),  rIGA: FLOPs_f/b = 2^{dℓ} O((2^{−dℓ}N)^{(d+1)/3} p^2) + O(N^{(d+1)/3}).  (26)
Garcia et al. [11] show that the factorization and f/b elimination costs of large systems reduce by up to O(p^2) and O(p), respectively, when using rIGA with a blocksize of 16 elements. The cost of vec–vec products is proportional to the length of the vectors, N, and the cost of mat–vecs is proportional to the number of nonzeros of the mass matrix, N_nz(M). In particular, the number of nonzeros of either the mass or the stiffness matrix is related to the sum of interactions that each basis function has with all other basis functions [33]. As a result, referring to the matrix layouts of the 1D systems in Fig. 3 and the tensor-product property described in Lemma 3.1.1, one obtains the number of nonzeros of M as
IGA: N_nz(M) = [n_e(2p + 1) + p^2]^d,  rIGA: N_nz(M) = 2^{dℓ}[2^{−ℓ}n_e(2p + 1) + p^2 − 1]^d.  (27)
Therefore, the cost of mat–vecs with the IGA discretization is close to O(N p^d) when N is sufficiently large, while that of the rIGA discretizations is slightly higher. Considering the optimal blocksize of 16 elements, the ratio N_nz(M_rIGA)/N_nz(M_IGA) remains only slightly above one (raised to the power d) for the polynomial degrees p and mesh sizes n_e considered in this work. The degradation incurred by rIGA, however, is not comparable to the improvements we obtain in the factorization and f/b elimination steps by using rIGA.

For large systems, we identify the three most expensive operations, in order, as the Cholesky factorizations of the ST matrix, the f/b eliminations, and the mat–vecs. The costs of vec–vec products and of extracting eigenvectors are significantly lower than that of mat–vecs, since N ≪ N_nz(M), and, henceforth, we exclude them from our cost estimates. In addition, other numerical procedures in the eigenanalysis (e.g., system integration) are assumed to be of a lower order compared to the most expensive operations.
Hence, the total cost of the eigensolution is governed by the number of factorizations (one per shift), N_fact, the number of f/b eliminations, N_f/b, and the number of mat–vecs, N_m–v. Table 1 expresses these numbers in terms of the number of shifts, N_shift, and the total number of iterations, N_iter, carried out by the eigensolver. The table also compares how an rIGA discretization improves or degrades the performance of each operation with respect to that of an IGA discretization. To build this table, we assume the following:
• IGA and rIGA discretizations use the same number of shifts. This is derived from Remark 3 and confirmed by numerical results (see Section 7).
• IGA and rIGA discretizations require the same number of iterations. We show this numerically in Section 7 for a sufficiently large number of eigenpairs.
• The number of Lanczos steps m has the same average per shift through all iterations under both IGA and rIGA discretizations. The numerical results of Section 7 confirm this assumption.
• The number of degrees of freedom is sufficiently large, so the cost improvements due to the use of rIGA described in [11] hold.

Table 1: Main numerical operations required by the Lanczos eigensolution algorithm and the improvement/degradation we obtain by using rIGA versus IGA discretizations. We express the number of times we call each operation in terms of the number of shifts, N_shift, and the total number of iterations, N_iter.

Numerical operation | Matrix factorization | F/b elimination | Matrix–vector product
Number of times the operation is called | N_fact = N_shift | N_f/b ≈ m N_iter | N_m–v ≈ 3m N_iter
Improvement/degradation of performing one operation in rIGA | Improving by O(p^2) | Improving by O(p) | Degrading by N_nz(M_rIGA)/N_nz(M_IGA)

Referring to Eqs. (25)–(27), we express the number of FLOPs of the algorithms described in Table 1 as O(N^a p^b), where the factors a and b vary for different operations and space dimensions, as displayed in Table 2. Herein, we are interested in measuring the computational time. Since time and FLOPs are correlated for the type of operations considered here (as shown in, e.g., [11]), we estimate the time T required to perform each operation as
T ≈ A N^a p^b,  (28)
and, in logarithmic form, as
log T ≈ log A + a log N + b log p.  (29)

Table 2: Constants a and b in Eq. (28) for the theoretical estimation of the computational time of the most expensive numerical operations of the Lanczos eigensolution algorithm. Constant a is equal for both IGA and rIGA discretizations.

Constant | Discretization method | Matrix factorization | F/b elimination | Mat–vec product*
a | IGA and rIGA | (d+1)/2 | (d+1)/3 | 1
b | IGA | 3 | 2 | d
b | rIGA | 1 | 1 | d
* The time of mat–vecs under rIGA is slightly larger than under IGA, by the ratio N_nz(M_rIGA)/N_nz(M_IGA) discussed above.
Remark 4. When seeking a sufficiently large number of eigenpairs, N, we assume that the computational time grows linearly with respect to N. The numerical results of Section 7 confirm this assumption.
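To illustrate how the estimates of Eqs. (16), (27), and (28) combine, the short script below evaluates the cost model for one 2D configuration. The exponents follow Table 2 as reconstructed above, the prefactor A is arbitrary, and only the IGA-to-rIGA ratios are meaningful; it is a sketch of the scaling argument, not a timing prediction.

```python
import numpy as np

def nnz_mass(ne, p, d, level):
    """Nonzeros of the mass matrix, Eq. (27); level = 0 is maximum-continuity IGA."""
    if level == 0:
        return (ne * (2 * p + 1) + p**2) ** d
    return 2 ** (d * level) * (ne / 2**level * (2 * p + 1) + p**2 - 1) ** d

def time_model(N, p, a, b, A=1.0):
    return A * N**a * p**b                        # T ~ A N^a p^b, Eq. (28)

d, p, ne, level = 2, 4, 1024, 6                   # blocksize 2^(10-6) = 16 elements
N = (ne + 2**level * (p - 1) - 1) ** d            # rIGA degrees of freedom, Eq. (16)
fact_iga = time_model(N, p, (d + 1) / 2, 3)       # b = 3 for IGA factorization
fact_riga = time_model(N, p, (d + 1) / 2, 1)      # b = 1 for rIGA factorization
print("factorization speed-up ~ p^2:", fact_iga / fact_riga, p**2)
print("mass-matrix nnz ratio rIGA/IGA:", nnz_mass(ne, p, d, level) / nnz_mass(ne, p, d, 0))
```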
6. Implementation details
We discretize the model problem using PetIGA [36], a high-performance isogeometric analysis implementation based on PETSc (the portable, extensible toolkit for scientific computation) [51]. PetIGA has been utilized in many scientific and engineering applications (see, e.g., [6, 10, 11, 13, 14, 33, 52–54]). It allows us to investigate both IGA and rIGA discretizations on test cases with different numbers of elements in 2D and 3D, different polynomial degrees of the B-spline spaces, and different partitioning levels of the mesh.

We also use SLEPc, the scalable library for eigenvalue problem computations [48], to perform the eigenanalysis, allowing us to apply the shift-and-invert spectral transformation. SLEPc, which has been used to solve different eigenproblems (see, e.g., [32, 55–59]), employs the Krylov–Schur algorithm by default, which is equivalent to the thick-restart Lanczos algorithm in the case of generalized Hermitian eigenproblems. SLEPc computes almost the same number of eigenpairs for each shift, allowing us to efficiently estimate the computational costs.

We use the multifrontal direct solver MUMPS [60] to construct the LDL^T Cholesky factors, compute the inertia required for shift selection, and perform the forward/backward eliminations. We employ the sequential version of MUMPS, which runs on a single thread (core). We also use the automatic orderings provided by METIS [61]. For each test case, since all ST matrices have the same sparsity pattern, we only perform one symbolic factorization, followed by a number of numerical factorizations that depends on the number of required shifts. We executed all tests on a workstation equipped with an Intel Xeon Gold 6230 CPU at 2.10 GHz with 256 GB of RAM.
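A configuration along the lines described above can be expressed with slepc4py as in the following sketch. It is an assumed, minimal setup rather than the authors' actual driver: method and option names follow the SLEPc/PETSc Python bindings as we understand them and may vary between versions, and K and M are taken to be already-assembled PETSc matrices.

```python
from petsc4py import PETSc   # noqa: F401  (initializes PETSc)
from slepc4py import SLEPc

def solve_interval(K, M, lam_s, lam_e):
    """All eigenpairs of K u = lambda M u in [lam_s, lam_e] via spectrum slicing."""
    eps = SLEPc.EPS().create()
    eps.setOperators(K, M)
    eps.setProblemType(SLEPc.EPS.ProblemType.GHEP)
    eps.setType(SLEPc.EPS.Type.KRYLOVSCHUR)       # thick-restart Lanczos for GHEPs
    eps.setWhichEigenpairs(SLEPc.EPS.Which.ALL)   # all eigenvalues in an interval
    eps.setInterval(lam_s, lam_e)                 # spectrum slicing
    st = eps.getST()
    st.setType(SLEPc.ST.Type.SINVERT)             # shift-and-invert transformation
    ksp = st.getKSP()
    ksp.setType('preonly')
    pc = ksp.getPC()
    pc.setType('cholesky')
    pc.setFactorSolverType('mumps')               # LDL^T factors and inertia via MUMPS
    eps.setFromOptions()
    eps.solve()
    return [eps.getEigenvalue(i) for i in range(eps.getConverged())]
```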
7. Numerical results
We report the computational times (in seconds) required for finding the eigenpairs of the Laplace operator described in Section 2. We test different mesh sizes with n_e = 2^s elements in each direction, different partitioning levels of the rIGA discretization, namely ℓ = 0, 1, 2, ..., s (FEA), and different polynomial degrees p of the B-spline bases. For 2D problems, we consider uniform meshes with s = 8, 9, 10, 11 and degrees p = 2, 3, 4, 5. For the 3D case, we test on s = 5, 6, 7. We seek the first N eigenpairs of the PDE system with different partitioning levels ℓ, where N can be as large as the number of eigenmodes of the IGA discretization (N_IGA). We report the elapsed time for finding all N eigenpairs, T_N, the average required time per eigenmode, T_av := T_N/N, and a normalized time obtained by dividing T_av by the total number of degrees of freedom.

Before proceeding with the time performance of the IGA and rIGA discretizations in eigenanalysis, we show in Fig. 5a that the number of Lanczos steps m becomes constant for a sufficiently large N, independently of the mesh size and dimension. This confirms the assumptions of Section 5. Furthermore, Fig. 5b demonstrates that the number of shifts and the number of iterations increase linearly with N, indicating that T_N grows proportionally with N (see Remark 4). The figure also shows that 3D systems need more iterations than 2D systems for finding the same number of eigenpairs. These observations allow us to predict the expected times for large systems by solving only a small portion of the spectrum.

Figure 5: (a) Independence of the average number of Lanczos steps, m ≈ N_f/b/N_iter ≈ N_m–v/(3 N_iter), from the number of computed eigenpairs when N is large. (b) Linear increase of the numbers of shifts and iterations with the number of computed eigenpairs. We test three different systems in terms of space dimension, domain size, polynomial degree, and partitioning level.

The partitioning level ℓ of rIGA affects the computational time of the eigenanalysis. To see this, we focus on the most expensive numerical operations described in Section 5.1. We monitor the total elapsed time for different blocksizes considering the partitioning scheme presented in Section 3.2. There is an optimal blocksize 2^{s−ℓ} at which the factorization and f/b elimination times are minimum; however, the time of mat–vec products always increases with ℓ. Fig. 6 shows the computational times of finding the first N eigenpairs (i.e., T_N) as a function of the blocksize. The rIGA factorization time reaches a minimum at a blocksize of 16 elements in almost all cases, which coincides with the optimal blocksize for the f/b elimination time. Hence, we obtain the maximum savings for the total elapsed time (considering all numerical operations) by employing macroelements of size 16.

When a matrix is sufficiently large, its Cholesky factorization is more expensive than the f/b elimination and the mat–vec multiplication. However, Krylov eigensolvers perform multiple f/b eliminations and mat–vecs per Cholesky factorization. In our 2D case, N_f/b ≈ 150 N_fact, with roughly three mat–vecs per f/b elimination (see Table 1 and Fig. 5). This brings the costs of these two operations into a range comparable with the matrix factorization. To compare results, Table 3 reports the number of executions of each operation for finding the first N eigenpairs. We also report the average computational times per eigenvalue obtained by dividing the time of each numerical procedure by N (i.e., T_av = T_N/N).
Results indicate an improvement in the cost of matrix factorization close to O(p^2) for large problems, and of almost O(p) for f/b eliminations. We observe a slight degradation in the cost of mat–vec multiplications due to the increase of nonzero terms of the mass matrix under rIGA. In summary, the total observed time saving for the entire eigensolution is up to O(p) for large domains.

Figure 6: The total times and those of the most expensive numerical operations for finding the first N eigenpairs of 2D systems with different mesh sizes of n_e = 2^s (s = 8, 9, 10, 11) and degrees p = 2, 3, 4, 5. We test rIGA discretizations with different blocksizes (2^{s−ℓ}) obtained using different levels of partitioning ℓ = 1, 2, ..., s.

The theoretical estimates of Section 5 predict asymptotic improvements of up to O(p^2). To observe them, we would need larger computational resources. On the other hand, for small problems (e.g., 2D systems with n_e ≤ 256 and low polynomial degrees), the f/b elimination dominates the total cost. Fig. 7 shows the average time per eigenvalue, T_av = T_N/N. It also describes the contribution of each of the most expensive operations to the total time. Fig. 8 demonstrates the same results after normalizing the time per eigenvalue by the number of degrees of freedom. Both figures confirm that matrix factorization is the most decisive numerical operation of the eigenanalysis for large problems. rIGA with a blocksize of 16 reduces the cost of the matrix factorization by a factor of up to O(p^2). It also improves the cost of f/b elimination, which is the governing cost when dealing with small problems. On the other hand, the cost of mat–vec multiplication slightly increases when using rIGA, which is not as decisive as the improvements of the other two operations.

Table 3: The average computational times per eigenvalue, T_av = T_N/N, and the number of executions of the most expensive operations (matrix factorization, f/b elimination, and mat–vec product) for the 2D test cases under IGA and rIGA discretizations.

Fig. 9 depicts the times required to compute the first N eigenpairs of the Laplace operator in 3D versus different blocksizes of the rIGA discretizations. Similar to the 2D test cases, the factorization time of rIGA reaches a minimum at a blocksize of 16 elements, except for n_e =
32, where, referring to Fig. 9a, the optimal blocksize for matrix factorization is 8 elements (see the same inference in [11] regarding the optimal blocksizes of small and large domains). However, the maximum computational saving for the total elapsed time, considering all numerical operations of the eigenanalysis, is achieved by employing an rIGA discretization with a blocksize of 16 elements in each direction. Table 4 reports the number of times we perform each operation when finding the first N eigenpairs, as well as the average computational times per eigenvalue, T_av = T_N/N. In 3D systems, we observe improvements of O(p^2) in the matrix factorization and O(p) in the total eigenanalysis when using rIGA with n_e ≥ 64. We find the first N eigenpairs with a 128³ mesh and cubic bases in 168 hours using IGA and 32 hours using rIGA (see Fig. 9c). However, we expect an improvement of O(p^2) for sufficiently large problems.

Figs. 10 and 11 show the average time per computed eigenvalue, T_av, and the normalized time (T_av divided by the number of degrees of freedom) for the 3D cases. As in the 2D case, the f/b elimination dominates the total cost for small problems, whereas for large problems the matrix factorization is the most expensive procedure. However, to find a fixed number of eigenpairs, 3D problems employ more iterations than 2D ones (see Fig. 5b), resulting in more f/b eliminations and mat–vecs for each spectral transform (i.e., shift). In our case, the number of f/b eliminations per factorization is close to 200 in the 3D eigenproblems (see Table 4), again with roughly three mat–vecs per f/b elimination.

Figure 7: Average time per eigenvalue, T_av = T_N/N, versus the number of elements in each direction, n_e. 2D eigenproblems under maximum-continuity IGA and optimal rIGA discretizations with a blocksize of 16 elements in each direction (p = 2, 3, 4, 5).

Figure 8: Normalized average time per eigenvalue (T_av divided by the number of degrees of freedom) versus the number of elements in each direction, n_e. 2D eigenproblems under maximum-continuity IGA and optimal rIGA discretizations with a blocksize of 16 elements in each direction.

Figure 9: The total computational times and those of the most expensive numerical operations for finding the first N eigenpairs of 3D systems with different mesh sizes of n_e = 2^s (s = 5, 6, 7) and degrees p = 2, 3, 4, 5. We test rIGA discretizations with different blocksizes (2^{s−ℓ}) obtained using different levels of partitioning ℓ = 1, 2, ..., s.
8. Accuracy assessment of eigensolution using rIGA
This section investigates the effect of employing rIGA discretizations on the accuracy of the eigenanalysis. We utilize the eigenvalue error, EVerr, and the eigenfunction L² and energy norm errors, EFerr_{L²} and EFerr_E, respectively, as expressed by Eq. (9). The knot-insertion steps of the rIGA approach add new control variables and, therefore, enrich the Galerkin space, modifying the spectral approximation properties of the IGA approach. In order to investigate how rIGA affects the accuracy of eigenpairs throughout the entire spectrum, we introduce a 1D eigenproblem with n_e = 32 and p = 3. Fig. 12 depicts the eigenvalue and L² eigenfunction errors of maximum-continuity IGA versus those obtained by different partitioning levels of the rIGA approach. The abscissa of this figure shows the eigenmode numbers i normalized with respect to N, the total number of eigenmodes of the IGA discretization. Since the rIGA-discretized system has more discrete eigenmodes, the spectra plots extend to i/N > 1. The eigenvalue errors are computed by comparing the approximated values, λ^h, with the exact ones, λ^e = i²π², while the approximated eigenfunctions, u^h, are compared with u^e = √2 sin(iπx). In terms of eigenvalues (see Fig. 12a), rIGA discretizations of lower blocksize reach a lower error at the same mode number in the range i/N ≤ 1, improving the outlier behavior of the original IGA-discretized system. This is justified because, by decreasing the blocksize, the eigenvalue errors converge to the acoustic branch of the FEA spectrum, and we achieve a better approximation (see, e.g., [19]). However, for i/N > 1, we observe larger errors for the additional (outlier) modes. We obtain similar conclusions in terms of eigenvector errors (see Fig. 12b). Since the eigenanalysis in the 3D case is computationally highly demanding, we restrict the following multidimensional accuracy assessment to 2D systems.

We now consider the accuracy of the eigenanalysis for 2D systems discretized with n_e = 64 and 128 elements in each direction and polynomial degrees p = 2, 3, 4, 5. The exact eigenvalues and eigenfunctions are expressed by Eq. (2). We use Remark 1 to compare the approximate eigenpairs λ^h_i and u^h_i with the tensor-represented exact ones λ^e_{ij} and u^e_{ij}. Note that, for any off-diagonal combination of i and j (i.e., i ≠ j), u^e_{ij} and u^e_{ji} are generally referred
32, a higher improvement in matrix factorization is achievable with blocksize of 8 elements. However, the overalleigenanalysis is better improved with blocksize of 16 elements.Discretization Factorization F / b elimination Mat–vec product TotalMesh p Method N fact T av , fact (sec) Improvedby N f / b T av , f / b (sec) Improvedby N m–v T av , m–v (sec) Degradedby T av (sec) Improvedby32 n e (a) p = − − − T i m e p e r e i g e nv a l u e , T a v ( s ec ) No. of elements ineach direction, n e (b) p = − − − IGA, factorizationIGA, f / b eliminationIGA, mat–vec productIGA, total timerIGA, factorizationrIGA, f / b eliminationrIGA, mat–vec productrIGA, total time No. of elements ineach direction, n e (c) p = − − No. of elements ineach direction, n e (d) p = − − Figure 10: Average computational time of each numerical operation per eigenvalue, T av = T N / N , versus the number of elements in each direction, n e . 3D eigenproblems under maximum-continuity IGA and optimal rIGA discretizations with blocksize of 16 elements in each direction (for degrees p = n e = to as degenerate eigenfunctions (see, e.g., [62]). Since degenerate eigenfunctions are not unique, we restrict oureigenfunction accuracy assessments only to diagonal eigenfunctions. Figs. 13 and 14 describe the eigenvalue error aswell as eigenfunction L and energy norm errors of the above-mentioned systems. In both cases, we find the first18 o. of elements ineach direction, n e (a) p = − − − − − IGA, factorizationIGA, f / b eliminationIGA, mat–vec productIGA, total timerIGA, factorizationrIGA, f / b eliminationrIGA, mat–vec productrIGA, total time N o r m a li ze d ti m e , (cid:101) T ( s ec ) No. of elements ineach direction, n e (b) p = − − − − − No. of elements ineach direction, n e (c) p = − − − − − No. of elements ineach direction, n e (d) p = − − − − − Figure 11: Normalized average time per eigenvalue, (cid:101) T = T N / N N , versus the number of elements in each direction, n e . 3D eigenproblems undermaximum-continuity IGA and optimal rIGA discretizations with blocksize of 16 elements in each direction (for degrees p = n e = . . . i / N E V e rr blocksize = (IGA)blocksize = blocksize = blocksize = blocksize = blocksize = (FEA) (a) Eigenvalue error . . . . . i / N E F e rr L blocksize = (IGA)blocksize = blocksize = blocksize = blocksize = blocksize = (FEA) (b) Eigenfunction L error Figure 12: Eigenvalue and L eigenfunction errors of the 1D eigenproblem discretized by n e =
32 with cubic bases: maximum-continuity IGAversus rIGA of di ff erent blocksizes. N = N IGA eigenpairs of the problem. As for the 1D systems, rIGA discretizations improve the accuracy of eigenpairsapproximation compared to the maximum-continuity IGA when i / N ≤
1. (although we may see higher errors in theoutliers for i / N > ff erences are more noticeable for higher p .
9. Conclusions
This paper proposes the use of refined isogeometric analysis (rIGA) discretizations to solve generalized Hermitian eigenproblems (GHEP). We compare the computational time of rIGA versus that of maximum-continuity IGA when computing a fixed number of eigenpairs using a Lanczos eigensolver with a shift-and-invert spectral transformation. When the number of requested eigenpairs is sufficiently large, the most expensive operation is the matrix factorization, followed by the matrix–vector operations. We compute the Cholesky factorization O(p^2) times faster with an rIGA discretization than with an IGA one. As a result, in the asymptotic regime, we theoretically reach an improvement of O(p^2) in the total computational time of the eigenanalysis. In our computations, with meshes of up to 2048 elements per direction in 2D and 128 per direction in 3D, the savings associated with the implementation of rIGA are limited to O(p). This occurs because we need to consider larger matrices to arrive at the asymptotic limit. In a 2D mesh with 2048² elements and p = 5, we need an average of 141 seconds to compute each eigenpair when using IGA, and only 26 seconds for rIGA. In a 3D mesh with 128³ elements and p = 3, rIGA reduces the average required time per eigenmode from 590 seconds to 112 seconds.

For smaller problems (and lower polynomial degrees), the forward/backward elimination is the most expensive numerical operation. This operation, which is called almost 150 and 200 times per matrix factorization in our 2D and 3D cases, respectively, is theoretically up to O(p) faster with rIGA than with IGA for sufficiently large problems. For small problems, the improvement hardly reaches the expected rates. As a result, we suggest using the maximum-continuity IGA discretization for small problems.

Finally, we obtain better accuracy in the spectral approximation using rIGA discretizations when computing the first N eigenpairs, N being the size of the IGA-discretized system. This occurs because the continuity reduction of the basis functions enriches the Galerkin space and modifies the approximation properties of the IGA approach. However, rIGA eigenvalues beyond N show important inaccuracies. This may be detrimental when solving nonlinear problems.

Figure 13: Eigenvalue error, EVerr, and eigenfunction L² and energy norm errors, EFerr_{L²} and EFerr_E, of the 2D eigenproblem discretized by n_e = 64 elements in each direction and polynomial degrees p = 2, 3, 4, 5, obtained by maximum-continuity IGA and rIGA of different blocksizes.

Figure 14: Eigenvalue error, EVerr, and eigenfunction L² and energy norm errors, EFerr_{L²} and EFerr_E, of the 2D eigenproblem discretized by n_e = 128 elements in each direction and polynomial degrees p = 2, 3, 4, 5, obtained by maximum-continuity IGA and rIGA of different blocksizes.

Acknowledgment
Acknowledgment

This work has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement No. 777778 (MATHROCKS); the European POCTEFA 2014-2020 Project PIXIL (EFA362/19) by the European Regional Development Fund (ERDF) through the Interreg V-A Spain-France-Andorra program; the Project of the Spanish Ministry of Science and Innovation with reference PID2019-108111RB-I00 (FEDER/AEI); the BCAM “Severo Ochoa” accreditation of excellence (SEV-2017-0718) and the Basque Government through the BERC 2018-2021 program; the two Elkartek projects ArgIA (KK-2019-00068) and MATHEO (KK-2019-00085); and the grant “Artificial Intelligence in BCAM”, number EXP. 2019/[…]. The authors also acknowledge […] (UPV/EHU) for providing HPC resources that have contributed to the research results reported in this paper.
References

[1] T. J. R. Hughes, J. A. Cottrell, Y. Bazilevs, Isogeometric analysis: CAD, finite elements, NURBS, exact geometry and mesh refinement, Computer Methods in Applied Mechanics and Engineering 194 (2005) 4135–4195. doi:10.1016/j.cma.2004.10.008.
[2] Y. Bazilevs, V. M. Calo, Y. Zhang, T. J. R. Hughes, Isogeometric fluid–structure interaction analysis with applications to arterial blood flow, Computational Mechanics 38 (2006) 310–322. doi:10.1007/s00466-006-0084-3.
[3] H. Gómez, V. M. Calo, Y. Bazilevs, T. J. R. Hughes, Isogeometric analysis of the Cahn–Hilliard phase-field model, Computer Methods in Applied Mechanics and Engineering 197 (2008) 4333–4352. doi:10.1016/j.cma.2008.05.003.
[4] A. Buffa, G. Sangalli, R. Vázquez, Isogeometric analysis in electromagnetics: B-splines approximation, Computer Methods in Applied Mechanics and Engineering 199 (2010) 1143–1152. doi:10.1016/j.cma.2009.12.002.
[5] F. Auricchio, F. Calabrò, T. J. R. Hughes, A. Reali, G. Sangalli, A simple algorithm for obtaining nearly optimal quadrature rules for NURBS-based isogeometric analysis, Computer Methods in Applied Mechanics and Engineering 249-252 (2012) 15–27. doi:10.1016/j.cma.2012.04.014.
[6] L. F. R. Espath, A. F. Sarmiento, P. Vignal, B. O. N. Varga, A. M. A. Côrtes, L. Dalcin, V. M. Calo, Energy exchange analysis in droplet dynamics via the Navier–Stokes–Cahn–Hilliard model, Journal of Fluid Mechanics 797 (2016) 389–430. doi:10.1017/jfm.2016.277.
[7] H. Casquero, L. Liu, Y. Zhang, A. Reali, J. Kiendl, H. Gomez, Arbitrary-degree T-splines for isogeometric analysis of fully nonlinear Kirchhoff–Love shells, Computer-Aided Design 82 (2017) 140–153. doi:10.1016/j.cad.2016.08.009.
[8] R. N. Simpson, Z. Liu, R. Vázquez, J. A. Evans, An isogeometric boundary element method for electromagnetic scattering with compatible B-spline discretizations, Journal of Computational Physics 362 (2018) 264–289. doi:10.1016/j.jcp.2018.01.025.
[9] V. Puzyrev, M. Łoś, G. Gurgul, V. Calo, W. Dzwinel, M. Paszyński, Parallel splitting solvers for the isogeometric analysis of the Cahn–Hilliard equation, Computer Methods in Biomechanics and Biomedical Engineering 22 (2019) 1269–1281. doi:10.1080/10255842.2019.1661388.
[10] N. Collier, D. Pardo, L. Dalcin, M. Paszynski, V. M. Calo, The cost of continuity: A study of the performance of isogeometric finite elements using direct solvers, Computer Methods in Applied Mechanics and Engineering 213-216 (2012) 353–361. doi:10.1016/j.cma.2011.11.002.
[11] D. Garcia, D. Pardo, L. Dalcin, M. Paszyński, N. Collier, V. M. Calo, The value of continuity: Refined isogeometric analysis and fast direct solvers, Computer Methods in Applied Mechanics and Engineering 316 (2017) 586–605. doi:10.1016/j.cma.2016.08.017.
[12] I. S. Duff, J. K. Reid, The multifrontal solution of indefinite sparse symmetric linear equations, ACM Transactions on Mathematical Software 9 (1983) 302–325. doi:10.1145/356044.356047.
[13] D. Garcia, D. Pardo, L. Dalcin, V. M. Calo, Refined isogeometric analysis for a preconditioned conjugate gradient solver, Computer Methods in Applied Mechanics and Engineering 335 (2018) 490–509. doi:10.1016/j.cma.2018.02.006.
[14] D. Garcia, D. Pardo, V. M. Calo, Refined isogeometric analysis for fluid mechanics and electromagnetics, Computer Methods in Applied Mechanics and Engineering 356 (2019) 598–628. doi:10.1016/j.cma.2019.06.011.
[15] J. A. Cottrell, A. Reali, Y. Bazilevs, T. J. R. Hughes, Isogeometric analysis of structural vibrations, Computer Methods in Applied Mechanics and Engineering 195 (2006) 5257–5296. doi:10.1016/j.cma.2005.09.027.
[16] T. J. R. Hughes, A. Reali, G. Sangalli, Duality and unified analysis of discrete approximations in structural dynamics and wave propagation: Comparison of p-method finite elements with k-method NURBS, Computer Methods in Applied Mechanics and Engineering 197 (2008) 4104–4124. doi:10.1016/j.cma.2008.04.006.
[17] T. J. R. Hughes, J. A. Evans, A. Reali, Finite element and NURBS approximations of eigenvalue, boundary-value, and initial-value problems, Computer Methods in Applied Mechanics and Engineering 272 (2014) 290–320. doi:10.1016/j.cma.2013.11.012.
[18] M. Mazza, C. Manni, A. Ratnani, S. Serra-Capizzano, H. Speleers, Isogeometric analysis for 2D and 3D curl–div problems: Spectral symbols and fast iterative solvers, Computer Methods in Applied Mechanics and Engineering 344 (2019) 970–997. doi:10.1016/j.cma.2018.10.008.
[19] V. Puzyrev, Q. Deng, V. Calo, Dispersion-optimized quadrature rules for isogeometric analysis: Modified inner products, their dispersion properties, and optimally blended schemes, Computer Methods in Applied Mechanics and Engineering 320 (2017) 421–443. doi:10.1016/j.cma.2017.03.029.
[20] Q. Deng, V. Calo, Dispersion-minimized mass for isogeometric analysis, Computer Methods in Applied Mechanics and Engineering 341 (2018) 71–92. doi:10.1016/j.cma.2018.06.016.
[21] S. F. Hosseini, A. Hashemian, A. Reali, On the application of curve reparameterization in isogeometric vibration analysis of free-form curved beams, Computers & Structures 209 (2018) 117–129. doi:10.1016/j.compstruc.2018.08.009.
[22] Q. Deng, M. Bartoň, V. Puzyrev, V. Calo, Dispersion-minimizing quadrature rules for C^1 quadratic isogeometric analysis, Computer Methods in Applied Mechanics and Engineering 328 (2018) 554–564. doi:10.1016/j.cma.2017.09.025.
[23] V. Calo, Q. Deng, V. Puzyrev, Dispersion optimized quadratures for isogeometric analysis, Journal of Computational and Applied Mathematics 355 (2019) 283–300. doi:10.1016/j.cam.2019.01.025.
[24] Q. Deng, V. Puzyrev, V. Calo, Optimal spectral approximation of 2n-order differential operators by mixed isogeometric analysis, Computer Methods in Applied Mechanics and Engineering 343 (2019) 297–313. doi:10.1016/j.cma.2018.08.042.
[25] Q. Deng, V. Puzyrev, V. Calo, Isogeometric spectral approximation for elliptic differential operators, Journal of Computational Science 36 (2019) 100879. doi:10.1016/j.jocs.2018.05.009.
[26] M. Bartoň, V. Puzyrev, Q. Deng, V. Calo, Efficient mass and stiffness matrix assembly via weighted Gaussian quadrature rules for B-splines, Journal of Computational and Applied Mathematics 371 (2020) 112626. doi:10.1016/j.cam.2019.112626.
[27] T. Ericsson, A. Ruhe, The spectral transformation Lanczos method for the numerical solution of large sparse generalized symmetric eigenvalue problems, Mathematics of Computation 35 (1980) 1251–1268. doi:10.2307/2006390.
[28] B. Nour-Omid, B. N. Parlett, T. Ericsson, P. S. Jensen, How to implement the spectral transformation, Mathematics of Computation 48 (1987) 663–673. doi:10.1090/S0025-5718-1987-0878698-5.
[29] R. G. Grimes, J. G. Lewis, H. D. Simon, A shifted block Lanczos algorithm for solving sparse symmetric generalized eigenproblems, SIAM Journal on Matrix Analysis and Applications 15 (1994) 228–272. doi:10.1137/S0895479888151111.
[30] J. Demmel, J. Dongarra, A. Ruhe, H. van der Vorst, Templates for the Solution of Algebraic Eigenvalue Problems, Society for Industrial and Applied Mathematics, Philadelphia, PA, 2000.
[31] F. Xue, H. C. Elman, Fast inexact subspace iteration for generalized eigenvalue problems with spectral transformation, Linear Algebra and its Applications 435 (2011) 601–622. doi:10.1016/j.laa.2010.06.021.
[32] C. Campos, J. E. Roman, Strategies for spectrum slicing based on restarted Lanczos methods, Numerical Algorithms 60 (2012) 279–295. doi:10.1007/s11075-012-9564-z.
[33] N. Collier, L. Dalcin, D. Pardo, V. M. Calo, The cost of continuity: Performance of iterative solvers on isogeometric finite elements, SIAM Journal on Scientific Computing 35 (2013) A767–A784. doi:10.1137/120881038.
[34] G. Strang, G. J. Fix, An Analysis of the Finite Element Method, Prentice-Hall Series in Automatic Computation, Prentice-Hall, Englewood Cliffs, NJ, 1973.
[35] J. A. Cottrell, T. J. R. Hughes, Y. Bazilevs, Isogeometric Analysis: Toward Integration of CAD and FEA, John Wiley & Sons, Ltd, New York, NY, 2009.
[36] L. Dalcin, N. Collier, P. Vignal, A. M. A. Côrtes, V. M. Calo, PetIGA: A framework for high-performance isogeometric analysis, Computer Methods in Applied Mechanics and Engineering 308 (2016) 151–181. doi:10.1016/j.cma.2016.05.011.
[37] A. Sarmiento, A. Côrtes, D. Garcia, L. Dalcin, N. Collier, V. Calo, PetIGA-MF: A multi-field high-performance toolbox for structure-preserving B-splines spaces, Journal of Computational Science 18 (2017) 117–131. doi:10.1016/j.jocs.2016.09.010.
[38] L. Piegl, W. Tiller, The NURBS Book, 2nd Edition, Springer-Verlag, New York, NY, 1997.
[39] L. Gao, V. M. Calo, Preconditioners based on the alternating-direction-implicit algorithm for the 2D steady-state diffusion equation with orthotropic heterogeneous coefficients, Journal of Computational and Applied Mathematics 273 (2015) 274–295. doi:10.1016/j.cam.2014.06.021.
[40] D. C. Sorensen, Implicit application of polynomial filters in a k-step Arnoldi method, SIAM Journal on Matrix Analysis and Applications 13 (1992) 357–385. doi:10.1137/0613025.
[41] K. Wu, H. Simon, Thick-restart Lanczos method for large symmetric eigenvalue problems, SIAM Journal on Matrix Analysis and Applications 22 (2000) 602–616. doi:10.1137/S0895479898334605.
[42] G. W. Stewart, A Krylov–Schur algorithm for large eigenproblems, SIAM Journal on Matrix Analysis and Applications 23 (2002) 601–614. doi:10.1137/S0895479800371529.
[43] G. W. Stewart, Addendum to “A Krylov–Schur algorithm for large eigenproblems”, SIAM Journal on Matrix Analysis and Applications 24 (2002) 599–601. doi:10.1137/S0895479802403150.
[44] G. W. Stewart, Matrix Algorithms, Volume II: Eigensystems, Society for Industrial and Applied Mathematics, Philadelphia, PA, 2001.
[45] E. S. Coakley, V. Rokhlin, A fast divide-and-conquer algorithm for computing the spectra of real symmetric tridiagonal matrices, Applied and Computational Harmonic Analysis 34 (2013) 379–414. doi:10.1016/j.acha.2012.06.003.
[46] B. N. Parlett, The Symmetric Eigenvalue Problem, Prentice-Hall, Inc., Upper Saddle River, NJ, 1998.
[47] K. H. A. Olsson, A. Ruhe, Rational Krylov for eigenvalue computation and model order reduction, BIT Numerical Mathematics 46 (2006) 99–111. doi:10.1007/s10543-006-0085-9.
[48] V. Hernandez, J. E. Roman, V. Vidal, SLEPc: A scalable and flexible toolkit for the solution of eigenvalue problems, ACM Transactions on Mathematical Software 31 (2005) 351–362. doi:10.1145/1089014.1089019.
[49] V. Hernández, J. E. Román, A. Tomás, V. Vidal, A survey of software for sparse eigenvalue problems, Universitat Politècnica de València, SLEPc Technical Report STR-6 (2009).
[50] R. B. Lehoucq, D. C. Sorensen, C. Yang, ARPACK Users’ Guide: Solution of Large-Scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1998.
[51] S. Balay, W. D. Gropp, L. C. McInnes, B. F. Smith, Efficient management of parallelism in object oriented numerical software libraries, in: E. Arge, A. M. Bruaset, H. P. Langtangen (Eds.), Modern Software Tools in Scientific Computing, Birkhäuser Press, 1997, pp. 163–202.
[52] P. Vignal, L. Dalcin, D. L. Brown, N. Collier, V. M. Calo, An energy-stable convex splitting for the phase-field crystal equation, Computers & Structures 158 (2015) 355–368. doi:10.1016/j.compstruc.2015.05.029.
[53] A. M. A. Côrtes, A. L. G. A. Coutinho, L. Dalcin, V. M. Calo, Performance evaluation of block-diagonal preconditioners for the divergence-conforming B-spline discretization of the Stokes system, Journal of Computational Science 11 (2015) 123–136. doi:10.1016/j.jocs.2015.01.005.
[54] L. F. R. Espath, A. F. Sarmiento, L. Dalcin, V. M. Calo, On the thermodynamics of the Swift–Hohenberg theory, Continuum Mechanics and Thermodynamics 29 (2017) 1335–1345. doi:10.1007/s00161-017-0581-y.
[55] E. Romero, J. E. Roman, A parallel implementation of Davidson methods for large-scale eigenvalue problems in SLEPc, ACM Transactions on Mathematical Software 40 (2014) 1–29. doi:10.1145/2543696.
[56] C. Campos, J. E. Roman, Parallel Krylov solvers for the polynomial eigenvalue problem in SLEPc, SIAM Journal on Scientific Computing 38 (2016) S385–S411. doi:10.1137/15m1022458.
[57] B. J. Faber, M. J. Pueschel, P. W. Terry, C. C. Hegna, J. E. Roman, Stellarator microinstabilities and turbulence at low magnetic shear, Journal of Plasma Physics 84 (2018) 905840503. doi:10.1017/s0022377818001022.
[58] M. Keçeli, F. Corsetti, C. Campos, J. E. Roman, H. Zhang, Á. Vázquez-Mayagoitia, P. Zapol, A. F. Wagner, SIESTA-SIPs: Massively parallel spectrum-slicing eigensolver for an ab initio molecular dynamics package, Journal of Computational Chemistry 39 (2018) 1806–1814. doi:10.1002/jcc.25350.
[59] J. C. Araujo C., C. Campos, C. Engström, J. E. Roman, Computation of scattering resonances in absorptive and dispersive media with applications to metal-dielectric nano-structures, Journal of Computational Physics 407 (2020) 109220. doi:10.1016/j.jcp.2019.109220.
[60] P. R. Amestoy, I. S. Duff, J.-Y. L’Excellent, J. Koster, A fully asynchronous multifrontal solver using distributed dynamic scheduling, SIAM Journal on Matrix Analysis and Applications 23 (2001) 15–41. doi:10.1137/s0895479899358194.
[61] G. Karypis, V. Kumar, A fast and high quality multilevel scheme for partitioning irregular graphs, SIAM Journal on Scientific Computing 20 (1998) 359–392. doi:10.1137/s1064827595287997.
[62] H. J. Korsch, On the nodal behaviour of eigenfunctions, Physics Letters A 97 (1983) 77–80. doi:10.1016/0375-9601(83)90514-5.