A Product to Sum Approach for Matrix Filling in a Hierarchical Finite-Element Method
11 Abstract − This paper presents a product to sum approach for a fast and efficient matrix filling in a hierarchical finite-element method (FEM). Due to the existence of a coupling factor arising from the material and Jacobian inhomogeneities in curved inhomogeneous elements, the calculation of the FEM matrix elements should be carried out through full multidimensional integrations. This reduces the efficiency of the higher order FEM solvers especially when the coupling factor varies rapidly inside the elements. In the product to sum approach, every product of the basis and weighting polynomials is replaced with a sum of appropriate polynomials. This reduces the number of required multidimensional integrals significantly and converts the integration into a summation. Therefore, the method will be efficient if the number of summation terms in the product to sum conversion is as small as possible.
Index terms − Product to sum, finite-element method, matrix filling. I. INTRODUCTION
The finite-element method (FEM) is a robust and accurate numerical method for solving vector partial differential equations (PDEs) appearing in a variety of research areas especially in the area of electromagnetic field computations [1]-[3]. When the FEM is applied on curved elements containing inhomogeneous materials, the FEM matrix elements should be calculated through full multidimensional integrations, which are costly and time-consuming. The coupling factor in the matrix element integrals is due to the material inhomogeneity or Jacobian inhomogeneity or both. In the case of straight triangular and tetrahedral elements with homogeneous materials, the coupling factor is reduced to a constant and is removed from the integral sign. Therefore, a universal matrix approach can be applied for a fast matrix filling [4]. However, in general even in a straight quadrilateral or hexahedral element with a homogeneous material, the coupling factor is not a constant and hence the universal matrix approach is not applicable. The generalized universal matrix approach [5] and a universal array approach [6] have also been proposed and can be used for fast matrix filling in a curved inhomogeneous element. These methods are based on a polynomial approximation of the coupling factor and conversion of the integral into a sum over universal integrals that can be calculated and stored beforehand. However, these methods are more efficient than direct numerical integrations if the polynomial order of the field expansion inside the element is chosen larger than the half of the order of the coupling factor approximation. The efficiency of these methods increases as the difference between the polynomial orders of the field and coupling factor is increased. Separable Jacobian methods have also been proposed for fast matrix filling in the 2-D curved homogeneous elements [7]-[9]. These methods are based on the fact that in the ring-type elements the coupling factor is separable [7], [8]. On the other hand, an arbitrary curved polygonal element can be divided into special curved triangles for which the coupling factor is separable [9]. Therefore, 2-D integrals of the FEM matrix elements are reduced to 1-D ones that can be calculated very quickly. In this paper, we present a product to sum approach for fast matrix filling in a hierarchical curved inhomogeneous quadrilateral or hexahedral element. The method is based on the fact that the product of two polynomials can be represented as a sum of appropriate polynomials. Therefore, the multidimensional integrals are reduced to summations over another type of integrals. In this procedure, the number of multidimensional integrals is reduced significantly. However, the efficiency of the method depends on the number of terms in the equivalent polynomial representations of the products.
The author is with the Sum Institute for Computational Physics, Tabriz, Iran (e-mail: ehskhodapanah@ yahoo.com).
A Product to Sum Approach for Matrix Filling in a Hierarchical Finite-Element Method
Ehsan Khodapanah
II. G ENERAL REPRESENTATION OF THE PRODUCT TO SUM APPROACH
We consider a second order vector PDE in a 3-D space. The unknown vector field, governed by the vector PDE, is denoted by (cid:1775) . In a hierarchical version of the FEM, different components of the vector field (cid:1775) in a curved hexahedral element can be expanded in terms of the following polynomial basis functions (cid:1827) (cid:3082)(cid:3040)(cid:3041)(cid:3043) = (cid:1842) (cid:3040)(cid:3082) ((cid:1873))(cid:1843) (cid:3041)(cid:3082) ((cid:1874))(cid:1844) (cid:3043)(cid:3082) ((cid:1875)) (cid:1749)(cid:1750)(cid:1748)(cid:1750)(cid:1747)(cid:1865) = 0,1, ⋯ (cid:1840) (cid:3048)(cid:3082) − 1(cid:1866) = 0,1, ⋯ (cid:1840) (cid:3049)(cid:3082) − 1(cid:1868) = 0,1, ⋯ (cid:1840) (cid:3050)(cid:3082) − 1(cid:2011) = (cid:2011) (cid:2869) , (cid:2011) (cid:2870) , (cid:2011) (cid:2871) (1) where (cid:2011) (cid:2869) , (cid:2011) (cid:2870) , and (cid:2011) (cid:2871) stand for the scalar components of the vector field, (cid:3419)(cid:1842) (cid:3040)(cid:3082) ((cid:1873))(cid:3423) (cid:3040)(cid:2880)(cid:2868)(cid:3015) (cid:3296)(cid:3330) (cid:2879)(cid:2869) , (cid:3419)(cid:1843) (cid:3041)(cid:3082) ((cid:1874))(cid:3423) (cid:3040)(cid:2880)(cid:2868)(cid:3015) (cid:3297)(cid:3330) (cid:2879)(cid:2869) , and (cid:3419)(cid:1844) (cid:3043)(cid:3082) ((cid:1875))(cid:3423) (cid:3040)(cid:2880)(cid:2868)(cid:3015) (cid:3298)(cid:3330) (cid:2879)(cid:2869) are three sets of appropriate orthogonal polynomials in one dimension and ((cid:1873), (cid:1874), (cid:1875)) are the coordinates of the reference element. By applying a standard Galerkin method to discretize the vector PDE, we arrive at the following general expressions for the matrix elements (cid:1839) (cid:3082) (cid:3284) (cid:3082) (cid:3285) (cid:3040) (cid:3117) (cid:3040) (cid:3118) (cid:3041) (cid:3117) (cid:3041) (cid:3118) (cid:3043) (cid:3117) (cid:3043) (cid:3118) = (cid:3533) (cid:3512) (cid:3429)(cid:1842) (cid:3040) (cid:3117) (cid:3082) (cid:3284) (cid:4593) (cid:1842) (cid:3040) (cid:3117) (cid:3082) (cid:3284) (cid:3433) (cid:4686)(cid:1842) (cid:3040) (cid:3118) (cid:3082) (cid:3285) (cid:4593) (cid:1842) (cid:3040) (cid:3118) (cid:3082) (cid:3285) (cid:4687) (cid:3429)(cid:1843) (cid:3041) (cid:3117) (cid:3082) (cid:3284) (cid:4593) (cid:1843) (cid:3041) (cid:3118) (cid:3082) (cid:3284) (cid:3433) (cid:4686)(cid:1843) (cid:3041) (cid:3118) (cid:3082) (cid:3285) (cid:4593) (cid:1843) (cid:3041) (cid:3118) (cid:3082) (cid:3285) (cid:4687) (cid:3429)(cid:1844) (cid:3043) (cid:3117) (cid:3082) (cid:3284) (cid:4593) (cid:1844) (cid:3043) (cid:3117) (cid:3082) (cid:3284) (cid:3433) (cid:4686)(cid:1844) (cid:3043) (cid:3118) (cid:3082) (cid:3285) (cid:4593) (cid:1844) (cid:3043) (cid:3118) (cid:3082) (cid:3285) (cid:4687) (cid:2009) (cid:3046)(cid:3082) (cid:3284) (cid:3082) (cid:3285) ((cid:1873), (cid:1874), (cid:1875))(cid:1856)(cid:1873)(cid:1856)(cid:1874)(cid:1856)(cid:1875) (cid:2869)(cid:2879)(cid:2869)(cid:3015) (cid:3294) (cid:3046)(cid:2880)(cid:2869) (2) where (cid:1861), (cid:1862) = 1,2,3 , (cid:2009) (cid:3046)(cid:3082) (cid:3284) (cid:3082) (cid:3285) is the coupling factor, originating from the material inhomogeneity and the Jacobian components, and one element in every column vector in (2) should be chosen depending upon (cid:1871) , (cid:2011) (cid:3036) , and (cid:2011) (cid:3037) . In general (cid:2009) (cid:3046)(cid:3082) (cid:3284) (cid:3082) (cid:3285) is not separable and hence a full 3-D integration should be performed to obtain every 3-D integral inside the sum in the right-hand-side (RHS) of (2). This means that for every submatrix in the RHS of (2) that has a contribution to (cid:4674)(cid:1839) (cid:3082) (cid:3284) (cid:3082) (cid:3285) (cid:4675) , (cid:1840) (cid:3048)(cid:3082) (cid:3284) (cid:1840) (cid:3048)(cid:3082) (cid:3285) (cid:1840) (cid:3049)(cid:3082) (cid:3284) (cid:1840) (cid:3049)(cid:3082) (cid:3285) (cid:1840) (cid:3050)(cid:3082) (cid:3284) (cid:1840) (cid:3050)(cid:3082) (cid:3285) full 3-D integrals should be evaluated. In the product, to sum approach we replace every product of two polynomials of the same variable with a sum over a set of appropriate polynomials. For example (cid:3429)(cid:1842) (cid:3040) (cid:3117) (cid:3082) (cid:3284) (cid:4593) ((cid:1873))(cid:1842) (cid:3040) (cid:3117) (cid:3082) (cid:3284) ((cid:1873)) (cid:3433) (cid:3046) (cid:4686)(cid:1842) (cid:3040) (cid:3118) (cid:3082) (cid:3285) (cid:4593) ((cid:1873))(cid:1842) (cid:3040) (cid:3118) (cid:3082) (cid:3285) ((cid:1873)) (cid:4687) (cid:3046) = (cid:3533) (cid:1853) (cid:3038) (cid:3117) (cid:3046)(cid:3082) (cid:3284) (cid:3082) (cid:3285) (cid:1842) (cid:3038) (cid:3117) (cid:3046)(cid:3082) (cid:3284) (cid:3082) (cid:3285) ((cid:1873)) (cid:3012) (cid:3117) (cid:3038) (cid:3117) (cid:2880)(cid:2868) (3) where (cid:1837) (cid:2869) = (cid:1865) (cid:2869) + (cid:1865) (cid:2870) or (cid:1865) (cid:2869) + (cid:1865) (cid:2870) − 1 or (cid:1865) (cid:2869) + (cid:1865) (cid:2870) − 2 depending on whether the polynomial or its derivative is considered and (cid:1842) (cid:3038) (cid:3117) (cid:3046)(cid:3082) (cid:3284) (cid:3082) (cid:3285) may be chosen one among (cid:1842) (cid:3038) (cid:3117) (cid:3082) (cid:3117) , (cid:1842) (cid:3038) (cid:3117) (cid:3082) (cid:3118) , and (cid:1842) (cid:3038) (cid:3117) (cid:3082) (cid:3119) or even another polynomial if appropriate. By substituting the product to sum formulas (3) into (2), we obtain the following expressions for the elements of (cid:4674)(cid:1839) (cid:3082) (cid:3284) (cid:3082) (cid:3285) (cid:4675) (cid:1839) (cid:3082) (cid:3284) (cid:3082) (cid:3285) (cid:3040) (cid:3117) (cid:3040) (cid:3118) (cid:3041) (cid:3117) (cid:3041) (cid:3118) (cid:3043) (cid:3117) (cid:3043) (cid:3118) = (cid:3533) (cid:3533) (cid:1853) (cid:3038) (cid:3117) (cid:3038) (cid:3118) (cid:3038) (cid:3119) (cid:3046)(cid:3082) (cid:3284) (cid:3082) (cid:3285) (cid:3512) (cid:1842) (cid:3038) (cid:3117) (cid:3046)(cid:3082) (cid:3284) (cid:3082) (cid:3285) ((cid:1873))(cid:1843) (cid:3038) (cid:3118) (cid:3046)(cid:3082) (cid:3284) (cid:3082) (cid:3285) ((cid:1874))(cid:1844) (cid:3038) (cid:3119) (cid:3046)(cid:3082) (cid:3284) (cid:3082) (cid:3285) ((cid:1875))(cid:2009) (cid:3046)(cid:3082) (cid:3284) (cid:3082) (cid:3285) ((cid:1873), (cid:1874), (cid:1875))(cid:1856)(cid:1873)(cid:1856)(cid:1874)(cid:1856)(cid:1875) (cid:2869)(cid:2879)(cid:2869)(cid:3038) (cid:3117) ,(cid:3038) (cid:3118) ,(cid:3038) (cid:3119) (cid:3015) (cid:3294) (cid:3046)(cid:2880)(cid:2869) (4) Now it is sufficient to calculate only (cid:3435)(cid:1840) (cid:3048)(cid:3082) (cid:3284) + (cid:1840) (cid:3048)(cid:3082) (cid:3285) (cid:3439)(cid:3435)(cid:1840) (cid:3049)(cid:3082) (cid:3284) + (cid:1840) (cid:3049)(cid:3082) (cid:3285) (cid:3439)(cid:3435)(cid:1840) (cid:3050)(cid:3082) (cid:3284) + (cid:1840) (cid:3050)(cid:3082) (cid:3285) (cid:3439) (cid:4674)(cid:1839) (cid:3082) (cid:3284) (cid:3082) (cid:3285) (cid:4675) . However, the product to sum approach is efficient if total number of terms in the triple sum in (4) is relatively small compared to the total number of sample points in the direct 3-D numerical integration of (4). In other words, the efficiency of the method is directly proportional to the sparsity of the vector containing the coefficients of the polynomials in the RHS of the product to sum formulas in (3). Therefore, the key question is that are there any polynomials that lead to a sparse RHS in the product to sum formulas? The answer is yes; one such choice is the Chebyshev polynomials that are used in the next section.
III. A SPECIFIC EXAMPLE
In order to clarify the method outlined in section II and for the ease of computer implementation, we consider a two-dimensional curl-curl equation in the area of electromagnetic field computation as a specific example. Indeed we solve the following vector PDE ∇ × 1(cid:2020) (cid:3045) ∇ × (cid:1779) − (cid:1863) (cid:2868)(cid:2870) (cid:2013) (cid:3045) (cid:1779) = 0 (5)
The domain of the problem is a fourth-order curved quadrilateral domain partially filled with a continuously-varying inhomogeneous material (Fig. 1). As in the standard FEM, we first divide the domain of Fig. 1 into a number of smaller curved quadrilateral elements. Inside each element, the vector field components are expanded in terms of the following basis functions (cid:3422)(cid:1831) (cid:3048) = (cid:1847) (cid:3040) ((cid:1873)) (cid:1846) (cid:3041) ((cid:1874)) (cid:4676)(cid:1865) = 0,1, ⋯ , (cid:1839) − 1(cid:1866) = 0,1, ⋯ , (cid:1840) (cid:1831) (cid:3049) = (cid:1846) (cid:3040) ((cid:1873)) (cid:1847) (cid:3041) ((cid:1874)) (cid:4676)(cid:1865) = 0,1, ⋯ , (cid:1839) (cid:1866) = 0,1, ⋯ , (cid:1840) − 1 (6)
After applying a standard Galerkin method to (5), we obtain the following general expressions for the stiffness and mass matrices (cid:1845) (cid:3047)(cid:3029) = (cid:3509) 1(cid:2020) (cid:3045) ∇ × (cid:1779) (cid:3047) ∙ ∇ × (cid:1779) (cid:3029) (cid:1856)(cid:1876)(cid:1856)(cid:1877)(cid:1839) (cid:3047)(cid:3029) = (cid:3509) (cid:2013) (cid:3045) (cid:1779) (cid:3047) ∙ (cid:1779) (cid:3029) (cid:1856)(cid:1876)(cid:1856)(cid:1877) (7) where (cid:1872) and (cid:1854) stand for the testing and basis functions, respectively, and (cid:4670)(cid:1845)(cid:4671) and (cid:4670)(cid:1839)(cid:4671) are the stiffness and mass matrices, respectively. Substitution of (6) into (7) leads to the following explicit expressions for the stiffness and mass submatrices (cid:1845) (cid:3048)(cid:3048)(cid:3040) (cid:3117) (cid:3040) (cid:3118) (cid:3041) (cid:3117) (cid:3041) (cid:3118) = (cid:1866) (cid:2869) (cid:1866) (cid:2870) (cid:3509) (cid:1847) (cid:3040) (cid:3117) ((cid:1873))(cid:1847) (cid:3040) (cid:3118) ((cid:1873))(cid:1847) (cid:3041) (cid:3117) (cid:2879)(cid:2869) ((cid:1874))(cid:1847) (cid:3041) (cid:3118) (cid:2879)(cid:2869) ((cid:1874)) (cid:2869)(cid:2879)(cid:2869) (cid:1856)(cid:1873)(cid:1856)(cid:1874)(cid:2020) (cid:3045)
J (8) (cid:1845) (cid:3048)(cid:3049)(cid:3040) (cid:3117) (cid:3040) (cid:3118) (cid:3041) (cid:3117) (cid:3041) (cid:3118) = −(cid:1866) (cid:2869) (cid:1865) (cid:2870) (cid:3509) (cid:1847) (cid:3040) (cid:3117) ((cid:1873))(cid:1847) (cid:3040) (cid:3118) (cid:2879)(cid:2869) ((cid:1873))(cid:1847) (cid:3041) (cid:3117) (cid:2879)(cid:2869) ((cid:1874))(cid:1847) (cid:3041) (cid:3118) ((cid:1874)) (cid:2869)(cid:2879)(cid:2869) (cid:1856)(cid:1873)(cid:1856)(cid:1874)(cid:2020) (cid:3045)
J (9) (cid:1845) (cid:3049)(cid:3048)(cid:3040) (cid:3117) (cid:3040) (cid:3118) (cid:3041) (cid:3117) (cid:3041) (cid:3118) = −(cid:1865) (cid:2869) (cid:1866) (cid:2870) (cid:3509) (cid:1847) (cid:3040) (cid:3117) (cid:2879)(cid:2869) ((cid:1873))(cid:1847) (cid:3040) (cid:3118) ((cid:1873))(cid:1847) (cid:3041) (cid:3117) ((cid:1874))(cid:1847) (cid:3041) (cid:3118) (cid:2879)(cid:2869) ((cid:1874)) (cid:2869)(cid:2879)(cid:2869) (cid:1856)(cid:1873)(cid:1856)(cid:1874)(cid:2020) (cid:3045)
J (10) (cid:1845) (cid:3049)(cid:3049)(cid:3040) (cid:3117) (cid:3040) (cid:3118) (cid:3041) (cid:3117) (cid:3041) (cid:3118) = (cid:1865) (cid:2869) (cid:1865) (cid:2870) (cid:3509) (cid:1847) (cid:3040) (cid:3117) (cid:2879)(cid:2869) ((cid:1873))(cid:1847) (cid:3040) (cid:3118) (cid:2879)(cid:2869) ((cid:1873))(cid:1847) (cid:3041) (cid:3117) ((cid:1874))(cid:1847) (cid:3041) (cid:3118) ((cid:1874)) (cid:2869)(cid:2879)(cid:2869) (cid:1856)(cid:1873)(cid:1856)(cid:1874)(cid:2020) (cid:3045)
J (11) (cid:1839) (cid:3048)(cid:3048)(cid:3040) (cid:3117) (cid:3040) (cid:3118) (cid:3041) (cid:3117) (cid:3041) (cid:3118) = (cid:3509) (cid:1847) (cid:3040) (cid:3117) ((cid:1873))(cid:1847) (cid:3040) (cid:3118) ((cid:1873))(cid:1846) (cid:3041) (cid:3117) ((cid:1874))(cid:1846) (cid:3041) (cid:3118) ((cid:1874)) (cid:2013) (cid:3045) ((cid:1876) (cid:3049)(cid:2870) + (cid:1877) (cid:3049)(cid:2870) )J (cid:1856)(cid:1873)(cid:1856)(cid:1874) (cid:2869)(cid:2879)(cid:2869) (12) (cid:1839) (cid:3048)(cid:3049)(cid:3040) (cid:3117) (cid:3040) (cid:3118) (cid:3041) (cid:3117) (cid:3041) (cid:3118) = − (cid:3509) (cid:1847) (cid:3040) (cid:3117) ((cid:1873))(cid:1846) (cid:3040) (cid:3118) ((cid:1873))(cid:1846) (cid:3041) (cid:3117) ((cid:1874))(cid:1847) (cid:3041) (cid:3118) ((cid:1874)) (cid:2013) (cid:3045) ((cid:1876) (cid:3048) (cid:1876) (cid:3049) + (cid:1877) (cid:3048) (cid:1877) (cid:3049) )J (cid:1856)(cid:1873)(cid:1856)(cid:1874) (cid:2869)(cid:2879)(cid:2869) (13)
Fig. 1. Geometry of the 2-D problem considered in section III. f((cid:1876)) = −0.2((cid:1876) (cid:2870) − 1) (cid:2870) + 1 , g((cid:1877)) = 0.2((cid:1877) (cid:2870) − 1) (cid:2870) + 1 , (cid:2020) (cid:3045) = 1 , and (cid:2013) (cid:3045) = 2exp((cid:1876) + (cid:1877) + 2) . Fig. 2. The finite-element mesh for the domain of Fig. 1. (cid:1839) (cid:3049)(cid:3048)(cid:3040) (cid:3117) (cid:3040) (cid:3118) (cid:3041) (cid:3117) (cid:3041) (cid:3118) = − (cid:3509) (cid:1846) (cid:3040) (cid:3117) ((cid:1873))(cid:1847) (cid:3040) (cid:3118) ((cid:1873))(cid:1847) (cid:3041) (cid:3117) ((cid:1874))(cid:1846) (cid:3041) (cid:3118) ((cid:1874)) (cid:2013) (cid:3045) ((cid:1876) (cid:3048) (cid:1876) (cid:3049) + (cid:1877) (cid:3048) (cid:1877) (cid:3049) )J (cid:1856)(cid:1873)(cid:1856)(cid:1874) (cid:2869)(cid:2879)(cid:2869) (14) (cid:1839) (cid:3049)(cid:3049)(cid:3040) (cid:3117) (cid:3040) (cid:3118) (cid:3041) (cid:3117) (cid:3041) (cid:3118) = (cid:3509) (cid:1846) (cid:3040) (cid:3117) ((cid:1873))(cid:1846) (cid:3040) (cid:3118) ((cid:1873))(cid:1847) (cid:3041) (cid:3117) ((cid:1874))(cid:1847) (cid:3041) (cid:3118) ((cid:1874)) (cid:2013) (cid:3045) ((cid:1876) (cid:3048)(cid:2870) + (cid:1877) (cid:3048)(cid:2870) )J (cid:1856)(cid:1873)(cid:1856)(cid:1874) (cid:2869)(cid:2879)(cid:2869) (15)
As can be realized from (8)-(15) every submatrix element is obtained through a 2-D integration in the reference element. As mentioned in the foregoing section, in the product to sum approach, we replace every product term containing the product of two chebyshev polynomials of a same variable with a sum of two appropriate polynomials by applying the following formulas (cid:1749)(cid:1750)(cid:1748)(cid:1750)(cid:1747)(cid:1847) (cid:3040) (cid:3117) ((cid:1873))(cid:1847) (cid:3040) (cid:3118) ((cid:1873)) = (cid:1846) |(cid:3040) (cid:3117) (cid:2879)(cid:3040) (cid:3118) |(cid:3041)(cid:3046) ((cid:1873)) − (cid:1846) (cid:3040) (cid:3117) (cid:2878)(cid:3040) (cid:3118) (cid:2878)(cid:2870)(cid:3041)(cid:3046) ((cid:1873))(cid:1846) (cid:3040) (cid:3117) ((cid:1873))(cid:1846) (cid:3040) (cid:3118) ((cid:1873)) = (cid:1846) |(cid:3040) (cid:3117) (cid:2879)(cid:3040) (cid:3118) | ((cid:1873)) + (cid:1846) (cid:3040) (cid:3117) (cid:2878)(cid:3040) (cid:3118) ((cid:1873))(cid:1847) (cid:3040) (cid:3117) ((cid:1873))(cid:1846) (cid:3040) (cid:3118) ((cid:1873)) = (cid:1847) (cid:3040) (cid:3117) (cid:2879)(cid:3040) (cid:3118) ((cid:1873)) + (cid:1847) (cid:3040) (cid:3117) (cid:2878)(cid:3040) (cid:3118) ((cid:1873)) (16) where the nonsingular Chebyshev polynomials are defined as (cid:1846) (cid:3041)(cid:3041)(cid:3046) ((cid:1873)) = (cid:1749)(cid:1750)(cid:1748)(cid:1750)(cid:1747)(cid:1846) (cid:3041) ((cid:1873)) − 12(1 − (cid:1873) (cid:2870) ) ∶ (cid:1866) is even(cid:1846) (cid:3041) ((cid:1873)) − (cid:1873)2(1 − (cid:1873) (cid:2870) ) ∶ (cid:1866) is odd (17) For example, for the (cid:4670)(cid:1839) (cid:3048)(cid:3048) (cid:4671) elements we have
Fig. 3. Convergence pattern for the problem of Fig. 1.
Table I computational times for the product to sum approach and the direct numerical integration approach in solving the problem of Fig. 1. Reduction factor in the matrix filling time Direct numerical integration Product to sum approach Predicted value (section IV) From the left columns Total computational time (s) Matrix filling time (s) Total computational time (s) Matrix filling time (s) (cid:1839) = (cid:1840) = 3 (cid:1839) = (cid:1840) = 4 (cid:1839) = (cid:1840) = 6 (cid:1839) = (cid:1840) = 8 (cid:1839) = (cid:1840) = 10 (cid:1839) = (cid:1840) = 12 (cid:1839) = (cid:1840) = 14 (cid:1839) = (cid:1840) = 16 (cid:1839) = (cid:1840) = 18 (cid:1839) (cid:3048)(cid:3048)(cid:3040) (cid:3117) (cid:3040) (cid:3118) (cid:3041) (cid:3117) (cid:3041) (cid:3118) = (cid:3509)(cid:3427)(cid:1846) |(cid:3040) (cid:3117) (cid:2879)(cid:3040) (cid:3118) |(cid:3041)(cid:3046) ((cid:1873))(cid:1846) |(cid:3041) (cid:3117) (cid:2879)(cid:3041) (cid:3118) | ((cid:1874)) + (cid:1846) |(cid:3040) (cid:3117) (cid:2879)(cid:3040) (cid:3118) |(cid:3041)(cid:3046) ((cid:1873))(cid:1846) (cid:3041) (cid:3117) (cid:2878)(cid:3041) (cid:3118) ((cid:1874)) − (cid:1846) (cid:3040) (cid:3117) (cid:2878)(cid:3040) (cid:3118) (cid:2878)(cid:2870)(cid:3041)(cid:3046) ((cid:1873))(cid:1846) |(cid:3041) (cid:3117) (cid:2879)(cid:3041) (cid:3118) | ((cid:1874)) (cid:2869)(cid:2879)(cid:2869) − (cid:1846) (cid:3040) (cid:3117) (cid:2878)(cid:3040) (cid:3118) (cid:2878)(cid:2870)(cid:3041)(cid:3046) ((cid:1873))(cid:1846) (cid:3041) (cid:3117) (cid:2878)(cid:3041) (cid:3118) ((cid:1874))(cid:3431) (cid:2013) (cid:3045) ((cid:1876) (cid:3049)(cid:2870) + (cid:1877) (cid:3049)(cid:2870) )2J (cid:1856)(cid:1873)(cid:1856)(cid:1874) (18) (18) shows that a four-term sum can be used to calculate the (cid:4670)(cid:1839) (cid:3048)(cid:3048) (cid:4671) elements instead of a direct numerical integration containing a large number of sums and products. However, the simple formula in (18) requires calculate the following 2-D integrals as a starting point (cid:3509) (cid:1846) (cid:3040)(cid:3041)(cid:3046) ((cid:1873))(cid:1846) (cid:3041) ((cid:1874)) (cid:2869)(cid:2879)(cid:2869) (cid:2013) (cid:3045) ((cid:1876) (cid:3049)(cid:2870) + (cid:1877) (cid:3049)(cid:2870) )2J (cid:1856)(cid:1873)(cid:1856)(cid:1874) (19) Total number of 2-D integrals in (19) is approximately where (cid:1830) ≈ (cid:1839)(cid:1840) , while total number of 2-D integrals in the direct numerical calculation of (cid:4670)(cid:1839) (cid:3048)(cid:3048) (cid:4671) is approximately (cid:1830) (cid:2870) /4 (taking the four-fold symmetry of (cid:4670)(cid:1839) (cid:3048)(cid:3048) (cid:4671) into account). Hence, the reduction factor for the number of 2-D integrals is (cid:1830)/16 . IV. O PERATION COUNT ESTIMATION FOR THE PROPOSED METHOD
This section, we would like to give a relatively more accurate number for the arithmetic operation count of the product to sum approach applied to the problem of section III. Before proceeding, we make some assumptions to simplify the analysis. We assume that the curved elements in the domain decomposition are isoparametric and the geometrical mapping into the reference element is Lagrangian. These assumptions allow to find an upper limit for the number of operations required for the calculation of the Jacobian components. We also assume that the total number of sampling points for a direct 2-D integration is , which is a lower limit for an accurate numerical integration when the Jacobian and material inhomogeneities vary rapidly. Total number of multiply-add operations for the calculation of the (cid:1876) , (cid:1877) , and four Jacobian components is (cid:2870) . Total number of multiply-add operations for the calculation of the 2-D integrals as starting points is (cid:2870) . Total number of add operations due to the product to sum formulas (e.g., (18)) is (cid:2870) . Total number of multiply operations for the calculation of the stiffness submatrices is (cid:2870) . Finally, there is a need for a rearrangement of the Chebyshev polynomials of the first kind in (6) to ensure the curl-conformity of the basis functions. This final step requires (cid:2870) add operations. Therefore, the total number of multiply-add operations in the product to sum formula is approximately (cid:2870) . On the other hand, the total number of multiply-add operations in the direct numerical calculations is (cid:2870) = 3.5(cid:1830) (cid:2871) (neglecting the Jacobian calculations and the multiply operations in the stiffness submatrices and the rearrangements). The above analysis shows that the number of arithmetic operations is reduced by a factor of (cid:1830)/15 in the proposed method compared with the direct numerical integration. V. N UMERICAL RESULTS
In this section, the results of the method of section IV are presented. The finite-element mesh for the domain of Fig. 1 containing -fourth-order curved quadrilateral elements is shown in Fig. 2. All the calculations of this section have been performed in a laptop computer equipped with a Core2Due CPU of 2.5GHz clock and 2-GBs of RAM under a 32 bit operating system. The convergence pattern of the method of section IV is shown in Fig. 3, which is completely the same for both the product to sum approach and the direct numerical integration approach. This shows that the product to sum approach in matrix filling does not have any effect on the accuracy and the convergence of the FEM solver. The computational times for both methods are listed in Table I. As can be seen from the table, the matrix filling time in the product to sum approach is relatively small compared with the direct numerical integration approach especially when the order of the bases increases. It is also clear from the table that the matrix filling time is a small portion of the total computational time in the product to sum approach, which is in contrast with the direct numerical integration approach. VI. C ONCLUSION
A product to sum approach has been proposed for a fast and efficient matrix filling in a high order hierarchical curved inhomogeneous FEM. In the conversion of a product of polynomials to a sum, the ideal choice is the ordinary monomials. However, the ordinary monomials suffer from the ill-conditioned finite-element matrices. The Chebyshev polynomials stand at the second rank. These polynomials were used to check the method in solving a 2-D curl-curl equation and the improvement in the efficiency of the method was verified. R EFERENCES [1]
J. M. Jin,
The Finite Element Method in Electromagnetics,
New York: John Wiley & Sons, 2002. [2]
J. L. Volakis, A. Chatterjee, and L. C. Kempel,
Finite Element Method for Electromagnetics , Piscataway, NJ: IEEE Press, 1998. [3]
P. P. Silvester and R. L. Ferrari,
Finite Elements for Electrical Engineers , 3 rd ed. Cambridge, U.K.: Cambridge Univ. Press, 1996. [4] P. P. Silvester, "Universal finite-element matrices for tetrahedral,"
Int. J. Num. Met. Eng., vol. 18, pp. 1055-1061, 1982. [5]
J. P. Webb, "Universal matrices for high order finite elements in nonlinear magnetic field problems,"
IEEE Trans. Magn., vol. 33, Sep. 1997. [6]
D. Ansari Oghol Beig, J. Wang, Z. Peng, and J. F. Lee, "A universal array approach for finite elements with continuously inhomogeneous material properties and curved boundaries,"
IEEE Trans. Antennas Propag., vol. 60, no. 10, Oct. 2012. [7]
E. Khodapanah and S. Nikmehr, "A higher order analysis of a class of inhomogeneously filled conducting waveguides,"
Progress In Electromagnetics Research, PIER, vol. 118, 223-241, 2011. [8]
E. Khodapanah, "Numerical separation of vector wave equation in a 2-D doubly-connected domain,"
IEEE Trans. Microw. Theory Techn., vol. 62, no. 11, pp. 2551-2562, Nov. 2014. [9] arXiv:1603.03791 [physics.comp-ph]arXiv:1603.03791 [physics.comp-ph]