Adaptive and anisotropic finite element approximation : Theory and algorithms
Université Pierre et Marie Curie – Paris 6
Laboratoire Jacques-Louis Lions – UMR 7598

Original title: Approximation adaptative et anisotrope par éléments finis : Théorie et algorithmes

Doctoral thesis, presented and publicly defended on 6 December 2010 for the degree of Doctor of the Université Pierre et Marie Curie – Paris 6
Specialty: Applied Mathematics
by Jean-Marie Mirebeau

Jury
Referees: Weiming Cao, Gabriel Peyré
Examiners: Albert Cohen (thesis advisor), Jean-Daniel Boissonnat, Bruno Després, Ronald DeVore, Frédéric Hecht, Yves Meyer

École Doctorale de Sciences Mathématiques de Paris Centre
UFR 929 - Mathématiques
arXiv preprint [math.NA]

Acknowledgments
First and foremost, I would like to thank Albert Cohen for these three years spent under his supervision. In particular, I am grateful to him for having offered me exciting research topics from the very beginning, and then for having gradually stimulated and trained my independence through his precise and critical, yet also kind and constructive, mathematical eye. I warmly thank him for his great availability and his constant attention, which pushed me to give the best of myself. Being his student was an exceptional honor and stroke of luck.

I sincerely thank Weiming Cao and Gabriel Peyré for agreeing to be the referees of this thesis. The precision of their comments and the enthusiasm of some of their remarks are an unequalled encouragement to me. Beyond this role as referee, I am grateful to Weiming Cao for having appreciated and encouraged my contribution to his field of research very early on, notably by inviting me to a minisymposium; and to Gabriel Peyré for his energy and enthusiasm in mathematics, and for many instructive discussions.

I am very honored that Jean-Daniel Boissonnat, Bruno Després, Ronald DeVore, Frédéric Hecht and Yves Meyer show their interest in my thesis by taking part in its jury. I also thank Jean-Daniel Boissonnat and Frédéric Hecht for our discussions, which taught me a great deal about the practical generation of meshes; I am likewise grateful to Frédéric Hecht for helping me integrate into FreeFem++ some of the tools developed in this thesis. Many thanks to Ronald DeVore for accepting me as a collaborator very early on and for his many pieces of advice, on mathematical as well as human matters.

I sincerely thank Nira Dyn, who invited me to discuss with her, Albert Cohen and Shai Dekel for a week during which many ideas emerged.
I greatly enjoyed collaborating with Yuliya Babenko, whom I thank for her enthusiasm and efficiency. I am very grateful to Suzanne Brenner and Li-Yen Sung for their invitation to Louisiana and the extreme kindness of their welcome.

I wish to thank the whole Laboratoire Jacques-Louis Lions, researchers, computer staff, secretaries and doctoral students, who give this workplace a very pleasant atmosphere. Among the PhD students, I would like to thank in particular: Evelyne for the climbing outings, shared with Frank, Frédéric and my wife; Mathieu, who introduced me to the Gollum symphony; Alexis, Marianne and Pierre, as well as the newcomers Marie, Malek and Ange, with whom I had the pleasure of sharing office 3D23.

I would like to thank my friends, and among them my lifelong ones, Emmanuel, Romain, Valentin.

I would like to thank my family for their unconditional support. I thank my parents and my brother for their curiosity, sometimes incredulous, about my subjects of study.

I dedicate this thesis to my wife Jennifer, without whom none of this would make sense, and to my son Nathanaël for the happiness he brings me every day.
To Jennifer and Nathanaël

Summary

Introduction
Introduction (English Version)

Part I Optimal mesh adaptation for finite elements of arbitrary order
Chapter 1 Sharp asymptotics of the $L^p$ approximation error on rectangles
Chapter 2 Sharp asymptotics of the $L^p$ interpolation error
Chapter 3 Sharp asymptotics of the $W^{1,p}$ interpolation error
  […] $L_{m,p}$ and local error estimates
  3.3 Proof of Theorems 3.1.1 and 3.1.2
    3.3.1 Proof of the lower estimate (3.7)
    3.3.2 Proof of the upper estimates (3.4) and (3.8)
  3.4 Optimal metrics for linear and quadratic elements
    3.4.1 Optimal metrics
    3.4.2 The case of linear and quadratic elements
    3.4.3 Limiting the anisotropy in mesh adaptation
    3.4.4 Numerical results
    3.4.5 Quality of a triangulation generated from a metric
  3.5 Polynomial equivalents of the shape function
  3.6 Extension to higher dimension
    3.6.1 Generalisation of the shape function and of the measure of sliverness
    3.6.2 Generalisation of the error estimates
    3.6.3 Optimal metrics and algebraic expressions of the shape function
  3.7 Final remarks and conclusion
  3.8 Appendix
    3.8.1 Proof of Lemma 3.3.2
    3.8.2 Proof of Theorem 3.5.2
    3.8.3 Proof of Proposition 3.6.1

Part II Anisotropic smoothness classes
Chapter 4 From finite element approximation to image models

Part III Mesh adaptation and Riemannian metrics
Chapter 5 Are Riemannian metrics equivalent to simplicial meshes?
  […] $\mathbb{R}^d, d_H$)
  5.3 Metrics having an eigenspace of dimension $d -$ […]
Chapter 6 Approximation theory based on metrics
    […] $L^p$ error
    6.4.2 The $W^{1,p}$ error, when the measure of sliverness is uniformly bounded
    6.4.3 The $W^{1,p}$ error, on a general mesh
  6.5 Asymptotic approximation and explicit metrics
    6.5.1 Explicit metrics
    6.5.2 Geometric convexity
    6.5.3 A well posed variant of the shape function
    6.5.4 Construction of the metric
    6.5.5 The shape function is equivalent to a continuous function

Part IV Hierarchical refinement algorithms
Chapter 7 Adaptive and anisotropic multiresolution analysis
  7.1 Introduction
  7.2 An adaptive and anisotropic multiresolution framework
    7.2.1 The refinement procedure
    7.2.2 Adaptive tree-based triangulations
    7.2.3 Anisotropic wavelets
  7.3 Convergence analysis
    7.3.1 A convergence criterion
    7.3.2 A case of non-convergence
    7.3.3 A modified refinement rule
  7.4 Numerical illustrations
    7.4.1 Quadratic functions
    7.4.2 Sharp transition
    7.4.3 Numerical images
  7.5 Conclusions and perspectives
Chapter 8 Greedy bisection generates optimally adapted triangulations
Chapter 9 Variants of the greedy bisection algorithm

Bibliography

Introduction
Although this may seem a paradox, all exact science is dominated by the idea of approximation. (Bertrand Russell, logician and Nobel laureate)
This thesis is devoted to the problem of approximating functions by piecewise polynomial finite elements on triangulations, and on more general meshes. We are particularly interested in the situation where the mesh is built in adaptation to the approximated function. Such a mesh may therefore contain elements whose size, aspect ratio and orientation vary strongly.

Approximation by piecewise polynomial functions is a procedure that occurs in numerous applications. In some of them, such as the compression of terrain data, images or surfaces, the approximated function $f$ may be known exactly. In other applications, such as data denoising, statistical learning or the finite element discretization of Partial Differential Equations (PDEs), the approximated function is only partially known, or even completely unknown initially. In all these applications, one usually distinguishes between uniform and adaptive approximation. In the uniform framework, the domain of definition of the function is decomposed into a partition made of elements of comparable size and shape, whereas these attributes may vary strongly in the adaptive framework. In the latter case the partition can be adapted to the local properties of the function $f$, with the aim of optimizing the trade-off between the accuracy and the complexity of the approximation.

From the point of view of approximation theory, this trade-off between accuracy and complexity is generally tied to the smoothness of the function: one typically expects higher convergence rates for smoother functions.
Functions arising in concrete applications may however exhibit heterogeneous smoothness properties, in the sense that they are smooth in certain regions, which are separated by localized discontinuities. Two typical examples, illustrated in Figure 1 and Figure 2, are (i) the edges of objects in functions that represent images, and (ii) the shocks in solutions of nonlinear hyperbolic PDEs. Numerical methods intended for image processing, for denoising or compression […]

[Figures 1 and 2: images and captions lost in extraction.]

[…] of $f$ that correspond to the largest coefficients. A second step in adaptivity is to notice that a higher resolution of the partition is required in the direction orthogonal to the discontinuity curve than in the tangential direction, and to exploit this property by using an anisotropic partition of the domain. In two dimensions these partitions are typically built from triangles of high aspect ratio aligned with the discontinuities, as illustrated in Figure 1 (bottom right) and Figure 2.

In the context of the numerical simulation of PDEs, adaptivity also means that the computational mesh is not fixed a priori, but dynamically updated during the simulation as the exact solution is revealed. From a numerical point of view these methods require more complex algorithms and data structures than their non-adaptive counterparts. From a theoretical point of view the analysis of these algorithms is difficult, when it is possible at all. In fact, it has been rigorously established that adaptive methods improve the convergence rate of approximate solutions towards the exact solution only for a small number of PDE systems, and only in the isotropic mesh setting. We refer to the survey [77] for an overview of these results in the case of elliptic equations. We should mention that these difficulties are amplified when anisotropic finite elements are used. Several software packages such as [20, 93, 94] nevertheless successfully use anisotropic meshes for the numerical simulation of PDEs, as illustrated for instance in Figure 2. From a numerical point of view, the improvement brought by these methods seems obvious in comparison with non-adaptive or isotropic adaptive methods.
However, many aspects of the theoretical analysis of these methods remain open questions.

This thesis studies the problem of anisotropic mesh adaptation for the approximation of a known function. This can be regarded as a preliminary step towards the analysis of anisotropic mesh adaptation in the simulation of PDEs, but other applications can benefit from it, such as the processing of terrain data, surfaces or images.

Given a triangulation $\mathcal{T}$ of a bounded polygonal domain $\Omega \subset \mathbb{R}^2$ and a fixed integer $k \ge 1$, we denote by $V_k(\mathcal{T})$ the space of finite elements of degree $k$ on $\mathcal{T}$. The space $V_k(\mathcal{T})$ consists of all functions that coincide on each triangle $T \in \mathcal{T}$ with a polynomial of total degree at most $k$:
$$V_k(\mathcal{T}) := \{ g \,;\ g_{|T} \in \mathbb{P}_k,\ T \in \mathcal{T} \}.$$
The dimension of $V_k(\mathcal{T})$ is of order $O(k^2\,\#(\mathcal{T}))$. Given a function $f : \Omega \to \mathbb{R}$ and a triangulation $\mathcal{T}$ of $\Omega$, the best approximation error of $f$ in $V_k(\mathcal{T})$ is defined by
$$e_{\mathcal{T}}(f)_X := \inf_{g \in V_k(\mathcal{T})} \|f - g\|_X. \tag{1}$$
The letter $X$ denotes the norm or semi-norm in which the approximation error $\|f - g\|_X$ is measured. In this thesis we restrict our attention to the $L^p$ norm and to the $W^{1,p}$ semi-norm, where $1 \le p \le \infty$. They are defined as follows:
$$\|h\|_{L^p(\Omega)} := \left(\int_\Omega |h|^p\right)^{1/p} \quad\text{and}\quad |h|_{W^{1,p}(\Omega)} := \left(\int_\Omega |\nabla h|^p\right)^{1/p},$$
with the usual modification when $p = \infty$. Note that the global continuity of $g$ must be imposed in the above definition of $V_k(\mathcal{T})$ when the $W^{1,p}$ semi-norm is used.
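To make these definitions concrete, the following Python sketch (an illustration added here, not taken from the thesis) counts the degrees of freedom of $V_k(\mathcal{T})$ when no continuity is imposed across triangles, and evaluates a quadrature approximation of the $L^p$ norm. The function names `dim_Pk`, `dim_Vk` and `lp_norm`, and the representation of a function by sample values and quadrature weights, are assumptions made for the example.

```python
import math


def dim_Pk(k: int) -> int:
    """Dimension of P_k, the bivariate polynomials of total degree <= k."""
    return (k + 1) * (k + 2) // 2


def dim_Vk(k: int, num_triangles: int) -> int:
    """Dimension of V_k(T) without inter-element continuity:
    one independent copy of P_k per triangle, hence O(k^2 #(T))."""
    return num_triangles * dim_Pk(k)


def lp_norm(values, weights, p: float) -> float:
    """Quadrature approximation of ||h||_{L^p}: (sum_i w_i |h_i|^p)^(1/p).
    For p = inf the usual modification (maximum of |h_i|) is used."""
    if math.isinf(p):
        return max(abs(v) for v in values)
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)
```

For instance, `dim_Vk(2, 10)` counts six coefficients on each of ten triangles. Imposing global continuity, as required for the $W^{1,p}$ semi-norm, would reduce this count, but the order $O(k^2\,\#(\mathcal{T}))$ stays the same.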
The best approximation $g \in V_k(\mathcal{T})$ of $f$ can be computed exactly in the case of the $L^2$ norm or of the $W^{1,2}$ (or $H^1$) semi-norm: $g$ is the orthogonal projection of $f$ onto $V_k(\mathcal{T})$ with respect to the scalar product associated with the norm of interest. In the case $p \neq 2$ of non-Hilbertian norms the best approximation of $f$ is generally hard to compute, but "satisfactory" approximations can be obtained by various methods. If the function is continuous, one can use Lagrange interpolation, while a quasi-interpolation operator is preferred for non-smooth functions, see Chapter 6. More generally, if $P_{\mathcal{T}}$ is an arbitrary projection operator from the space $X$ onto $V_k(\mathcal{T})$, it is easy to see that for every $f \in X$
$$\|f - P_{\mathcal{T}} f\|_X \le C\, e_{\mathcal{T}}(f)_X, \tag{2}$$
where $C := 1 + \|P_{\mathcal{T}}\|_{X \to X}$. Indeed, since $P_{\mathcal{T}} g = g$ for every $g \in V_k(\mathcal{T})$, we have $f - P_{\mathcal{T}} f = (f - g) - P_{\mathcal{T}}(f - g)$, and taking the infimum over $g$ gives (2). The problem of approximating a function $f$ on a given triangulation $\mathcal{T}$, by finite elements of degree $k$, is therefore largely solved.

In the context of adaptive approximation, the triangulation $\mathcal{T}$ of the domain $\Omega$ is not fixed, but can be chosen freely in adaptation to $f$ (by contrast, we always assume in this thesis that the integer $k$ is fixed, although arbitrary). This naturally leads to the goal of characterizing and constructing an optimal mesh for a given function $f$. Given a norm $X$ of interest and a function $f$ to approximate, we formulate the problem of optimal mesh adaptation as the minimization of the approximation error among all triangulations of given cardinality. We therefore define the best adaptive approximation error as follows:
$$e_N(f)_X := \inf_{\#(\mathcal{T}) \le N} e_{\mathcal{T}}(f)_X = \inf_{\#(\mathcal{T}) \le N}\ \inf_{g \in V_k(\mathcal{T})} \|f - g\|_X. \tag{3}$$
In contrast with the problem of best finite element approximation on a fixed mesh, adaptive and anisotropic approximation is not yet well understood. In particular, (i) how does the optimal mesh depend on the function $f$, and (ii) how does the optimal error $e_N(f)_X$ decay as $N$ increases? These problems are well understood in the isotropic framework, where the optimization is restricted to triangulations whose triangles uniformly satisfy a constraint on their aspect ratio
$$\operatorname{diam}(T)^2 \le C\,|T|,$$
where $\operatorname{diam}(T)$ and $|T|$ denote the diameter and the area of $T$ respectively, and $C >$ […] $V_k(\mathcal{T})$, and that a near-optimal solution can therefore be obtained by applying to $f$ a stable projection operator as indicated in (2). By contrast, the optimization problem (3) is posed on the union of the spaces $V_k(\mathcal{T})$ over all triangulations $\mathcal{T}$ satisfying $\#(\mathcal{T}) \le N$. It is therefore a nonlinear approximation problem. Other examples of this type of problem are best $N$-term approximation in a dictionary of functions, or approximation by rational functions. We refer the reader to [42] for a survey on nonlinear approximation.

The objective of this thesis is to better understand the optimal mesh adaptation problem posed on the whole class of potentially anisotropic triangulations. The four parts of this thesis are devoted respectively to the four questions below:

I. How does the approximation error $e_N(f)_X$ behave in the asymptotic regime where the number $N$ of triangles tends to infinity, when $f$ is a sufficiently smooth function? In this context we establish a mathematical characterization of the optimal mesh, as well as precise upper and lower estimates of $e_N(f)_X$ in terms of $N$ and of quantities that depend nonlinearly on the derivatives of $f$.

II. Which classes of functions govern the rate of decay of $e_N(f)_X$ as $N$ increases, and are in this sense naturally tied to the problem of optimal mesh adaptation? We have in mind in particular the so-called cartoon functions, which by definition are smooth except along a family of discontinuity curves. This is a popular image model, illustrated for instance in Figure 1. We shall see that this model fits naturally into a richer class of functions, which corresponds to a given rate of decay of $e_N(f)_X$.

III. Can the optimization problem (3), which is posed over triangulations $\mathcal{T}$ of given cardinality $N$, be replaced by an equivalent but more tractable problem? Triangulations are indeed combinatorial and discrete objects, described by their vertices and edges, which is inconvenient when solving optimization problems of the form (3). We study the correspondence between certain classes of triangulations and of Riemannian metrics, which by contrast are continuous objects. This allows us to reformulate the original optimization problem as a more easily solvable problem posed on the set of Riemannian metrics.

IV. Is it possible to construct a quasi-optimal sequence of triangulations $(\mathcal{T}_N)_{N \ge N_0}$, where $\#(\mathcal{T}_N) = N$, using a hierarchical refinement procedure? The hierarchy property guarantees the inclusion of the finite element spaces associated with the triangulations: $V_k(\mathcal{T}_N) \subset V_k(\mathcal{T}_{N+1})$. It is required in applications such as progressive coding or online data processing (i.e. processing data as they are transmitted). We propose a simple and explicit algorithm which gives a positive answer to this question under certain conditions.
Before going into the details of the contents of the thesis, we give in this section a quick overview of the state of the art on the two problems of studying $e_N(f)_X$ as $N$ increases and of constructing a near-optimal triangulation. For this purpose we need to introduce some notation. We assume in the sequel that $f \in C^0(\overline{\Omega})$, where $\Omega \subset \mathbb{R}^2$ is a bounded polygonal domain and $\overline{\Omega}$ denotes the closure of $\Omega$. For each triangulation $\mathcal{T}$ of $\Omega$ we denote by $I^k_{\mathcal{T}}$ the usual Lagrange interpolation operator onto the finite elements of degree $k$ on $\mathcal{T}$: on each triangle $T \in \mathcal{T}$ the interpolant $I^k_{\mathcal{T}} f$ is the unique element of $\mathbb{P}_k$ which coincides with $f$ at the points whose barycentric coordinates belong to $\{0, \tfrac{1}{k}, \cdots, \tfrac{k-1}{k}, 1\}$. When $k = 1$ these points are simply the three vertices of $T$. If the triangulation $\mathcal{T}$ is conforming (each edge of each triangle either lies on the boundary of $\Omega$ or coincides with the whole edge of another triangle), then $I^k_{\mathcal{T}} f$ is continuous. To be consistent with the rest of this thesis we define $m := k + 1 \ge 2$, and we therefore have
$$e_{\mathcal{T}}(f)_{L^p} \le \|f - I^{m-1}_{\mathcal{T}} f\|_{L^p(\Omega)} \quad\text{and}\quad e_{\mathcal{T}}(f)_{W^{1,p}} \le \|\nabla f - \nabla I^{m-1}_{\mathcal{T}} f\|_{L^p(\Omega)}.$$
Any estimate on the interpolation error thus gives an upper estimate on the best approximation error $e_{\mathcal{T}}(f)_X$. Moreover, if $f$ is sufficiently smooth and $\mathcal{T}$ sufficiently fine, then these quantities are generally comparable.

One of the founding results in adaptive and anisotropic finite element approximation concerns piecewise affine finite elements ($m = 2$) when the error is measured in the $L^p$ norm, see also [27] or Chapter 2. This result can be stated as follows: for any bounded polygonal domain $\Omega \subset \mathbb{R}^2$, any $1 \le p < \infty$ and any function $f \in C^2(\overline{\Omega})$, there exists a sequence $(\mathcal{T}_N)_{N \ge N_0}$ of triangulations of $\Omega$ satisfying $\#(\mathcal{T}_N) \le N$, and such that
$$\limsup_{N \to \infty} N\, \|f - I^1_{\mathcal{T}_N} f\|_{L^p(\Omega)} \le C \left\| \sqrt{|\det(d^2 f)|} \right\|_{L^\tau(\Omega)}, \tag{4}$$
where the exponent $\tau \in (0, \infty)$ is defined by
$$\frac{1}{\tau} := 1 + \frac{1}{p},$$
and $C$ is a universal constant ($C$ is independent of $p$, $\Omega$ and $f$). We recall that the upper limit of a sequence $(u_N)_{N \ge N_0}$ is defined by
$$\limsup_{N \to \infty} u_N := \lim_{N \to \infty} \sup_{n \ge N} u_n, \tag{5}$$
and is in general strictly smaller than the supremum $\sup_{N \ge N_0} u_N$. Finding a relevant upper bound for $\sup_{N \ge N_0} \|f - I^1_{\mathcal{T}_N} f\|_{L^p(\Omega)}$ remains an open problem today when optimally adapted anisotropic triangulations are used. The result (4) extends to the exponent $p = \infty$, and to simplicial meshes of domains $\Omega$ of higher dimension, but the meshes $\mathcal{T}_N$ may fail to be conforming.

The result (4) reveals that the accuracy of the approximation of $f$ is governed by the quantity $\sqrt{|\det(d^2 f)|}$, which depends nonlinearly on the Hessian matrix $d^2 f$.
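The right-hand side of (4) can be evaluated numerically for a given smooth function. The sketch below is only an illustration under stated assumptions: the domain is taken to be the unit square, the integral is approximated by a midpoint rule, and the helper names `tau` and `anisotropic_constant`, as well as the test function $f(x, y) = x^2 + 100\,y^2$, are hypothetical choices that do not come from the thesis.

```python
import math


def tau(p: float) -> float:
    """Exponent defined by 1/tau = 1 + 1/p (tau = 1 when p = inf)."""
    return 1.0 if math.isinf(p) else 1.0 / (1.0 + 1.0 / p)


def anisotropic_constant(hess, p: float, n: int = 200) -> float:
    """Midpoint-rule approximation of || sqrt|det d^2 f| ||_{L^tau}
    on the unit square (assumed domain).

    hess(x, y) must return the Hessian entries (fxx, fxy, fyy).
    """
    t = tau(p)
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            fxx, fxy, fyy = hess((i + 0.5) * h, (j + 0.5) * h)
            total += math.sqrt(abs(fxx * fyy - fxy * fxy)) ** t * h * h
    return total ** (1.0 / t)


# Hypothetical example: f(x, y) = x^2 + 100 y^2, constant Hessian diag(2, 200).
A = anisotropic_constant(lambda x, y: (2.0, 0.0, 200.0), p=2.0)
```

For this particular $f$ the integrand is the constant $\sqrt{2 \cdot 200} = 20$ and the domain has unit area, so `A` is close to 20 whatever the value of $p$. Note that $\tau < 1$ for finite $p$, so the $L^\tau$ quantity is only a quasi-norm.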
The nonlinear dependence on the Hessian in (4) is closely tied to the fact that we allow triangles of potentially highly anisotropic shapes. The error estimate (4) is obtained by producing sufficiently fine triangulations that combine the following two heuristic properties:

a) Equidistribution of errors: the contribution $\|f - I^1_T f\|_{L^p(T)}$ of each triangle $T \in \mathcal{T}_N$ to the global interpolation error $\|f - I^1_{\mathcal{T}_N} f\|_{L^p(\Omega)}$ is of the same order. This condition translates into a local constraint on the area of the triangles, which is dictated by the local behavior of $f$ and in particular by $\det(d^2 f(z))$.

b) Optimal shape of triangles: the aspect ratio and the orientation of a triangle $T \in \mathcal{T}_N$ are dictated by the ratio of the eigenvalues and by the direction of the eigenvectors of the Hessian matrix $d^2 f(z)$ for $z \in T$.

[Figure 3: […] the regions $\Omega_i$ are covered by periodic tilings, which are then glued together.]

The simplest method to construct a sequence $(\mathcal{T}_N)_{N \ge N_0}$ of triangulations satisfying (4) is to use a "local patches" strategy, which can be described intuitively as follows. In a first step the domain $\Omega$ is cut into regions $\Omega_i$, $1 \le i \le k$, small enough that the Hessian matrix $d^2 f(z)$ varies little on each $\Omega_i$ around a mean value $M_i$. In other words, $f$ is well approximated by a quadratic polynomial on each $\Omega_i$. Each region $\Omega_i$ is then tiled by a uniform triangulation $\mathcal{T}^i_N$ whose cells have size, aspect ratio and orientation dictated by $M_i$.
The triangulations $\mathcal{T}^i_N$ of the regions $\Omega_i$ are then glued together in a conforming way to form a triangulation $\mathcal{T}_N$ of $\Omega$, at the cost of a few additional triangles at the interfaces between the $\Omega_i$.

This construction, illustrated in Figure 3, is sufficient to establish the asymptotic result (4), but not for practical applications, since it only becomes efficient for a large number of triangles. The following approach, based on Riemannian metrics, is often preferred in applications. To simplify the exposition we assume that the Hessian matrix $M(z) := d^2 f(z)$ is positive definite at each point $z \in \Omega$, and we define
$$H(z) := \lambda\, (\det M(z))^{-\frac{1}{2p+2}}\, M(z), \tag{6}$$
where $\lambda > 0$ […]. The map $H$, which continuously associates to each point $z \in \Omega$ a symmetric positive definite matrix $H(z) \in S_2^+$, is called a Riemannian metric. Note that for each $z \in \Omega$, the matrix $H(z)$ defines an ellipse
$$E_z := \{ u \in \mathbb{R}^2 \,;\ u^{\mathrm{T}} H(z)\, u \le 1 \}.$$
As illustrated in Figure 4, left part, a metric encodes at each point $z \in \Omega$ information on area, aspect ratio and orientation, in the form of a symmetric positive definite matrix $H(z)$ or equivalently of an ellipse $E_z$. Several mesh generation algorithms, such as [93, 94], are able to produce a triangulation adapted to the metric $z \mapsto H(z)$, in the sense that for each $z \in \Omega$ the triangle $T \in \mathcal{T}$ containing $z$ has a shape "similar" to the ellipse $E_z$, as illustrated in Figure 4, […]

[Figure 4: image and caption lost in extraction.]

[…] each $T \in \mathcal{T}$ and each $z \in T$, one has
$$b_T + c_1 E_z \subset T \subset b_T + c_2 E_z, \tag{7}$$
where $0 < c_1 < c_2$ are fixed constants and $b_T$ denotes the barycenter of $T$, which also means that $T$ is "close" to being an equilateral triangle in the metric $H(z)$. If $\mathcal{T}$ is such a triangulation, and if $\lambda$ is sufficiently large, a heuristic argument (which will be recalled in Chapter 2) shows that
$$\#(\mathcal{T})\, \|f - I^1_{\mathcal{T}} f\|_{L^p(\Omega)} \le C \left\| \sqrt{|\det(d^2 f)|} \right\|_{L^\tau(\Omega)},$$
where the constant $C$ depends on $c_1$ and $c_2$.

By varying the parameter $\lambda$ we obtain different triangulations $\mathcal{T}_\lambda$, of cardinality proportional to $\lambda$, which leads to the error estimate (4). Let us mention that constructing the mesh $\mathcal{T}$ from the metric $H$ is not straightforward. Moreover, there is no rigorous proof that the similarity condition (7) is satisfied by the most common mesh generation algorithms, with the notable exception of [66] and [15].

The approximation bound (4) is optimal if one restricts to triangulations that satisfy a technical condition defined as follows. We say that a sequence $(\mathcal{T}_N)_{N \ge N_0}$ of triangulations of a polygonal domain $\Omega \subset \mathbb{R}^2$ is admissible if $\#(\mathcal{T}_N) \le N$ and if
$$\sup_{N \ge N_0} \left( \sqrt{N} \max_{T \in \mathcal{T}_N} \operatorname{diam}(T) \right) < \infty. \tag{8}$$
For any admissible sequence $(\mathcal{T}_N)_{N \ge N_0}$ of triangulations of $\Omega$, one can establish the lower bound
$$\liminf_{N \to \infty} N\, \|f - I^1_{\mathcal{T}_N} f\|_{L^p(\Omega)} \ge c \left\| \sqrt{|\det(d^2 f)|} \right\|_{L^\tau(\Omega)}, \tag{9}$$
where the constant $c > 0$ […] $\varepsilon > 0$ […] $(\mathcal{T}_N)_{N \ge N_0}$ which satisfies the upper bound (4) up to the constant $\varepsilon$ added to the right-hand side.

Results similar to (4) and (9) can be developed for isotropic meshes, in which the size of the triangles may vary but not their shape, in the sense that the degeneracy measure $\rho(T) := \operatorname{diam}(T)^2 / |T|$ is uniformly bounded by a constant $\rho_0$, see for example [35].
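As an illustration of formula (6), the following sketch builds the matrix $H(z)$ from a given positive definite Hessian and recovers the semi-axes of the ellipse $E_z$. It is a minimal example under stated assumptions, not the construction used by any mesh generator: the names `metric_from_hessian` and `ellipse_semi_axes` are hypothetical, $2 \times 2$ symmetric matrices are stored as nested tuples, and the sample Hessian $\operatorname{diag}(2, 200)$ is an arbitrary choice.

```python
import math


def metric_from_hessian(M, p: float, lam: float):
    """H = lam * det(M)^(-1/(2p+2)) * M, as in formula (6).

    M = ((a, b), (b, c)) is assumed symmetric positive definite.
    """
    (a, b), (_, c) = M
    det = a * c - b * b
    if a <= 0.0 or det <= 0.0:
        raise ValueError("M must be symmetric positive definite")
    s = lam * det ** (-1.0 / (2.0 * p + 2.0))
    return ((s * a, s * b), (s * b, s * c))


def ellipse_semi_axes(H):
    """Semi-axes (short, long) of the ellipse {u : u^T H u <= 1}.

    They equal 1/sqrt(mu) for the two eigenvalues mu of H.
    """
    (a, b), (_, c) = H
    mean = 0.5 * (a + c)
    rad = math.hypot(0.5 * (a - c), b)
    mu_min, mu_max = mean - rad, mean + rad
    return 1.0 / math.sqrt(mu_max), 1.0 / math.sqrt(mu_min)


# Hypothetical Hessian diag(2, 200): f varies slowly along x, quickly along y.
H = metric_from_hessian(((2.0, 0.0), (0.0, 200.0)), p=2.0, lam=1.0)
short_axis, long_axis = ellipse_semi_axes(H)
```

For this sample Hessian the ellipse has aspect ratio $\sqrt{200/2} = 10$, and its long axis points in the direction of the small eigenvalue, that is, the direction in which $f$ varies slowly. The scalar factor in (6) changes the area of the ellipse but not its aspect ratio or orientation.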
For isotropic meshes, the estimate (4) must be replaced by
$$\limsup_{N \to \infty} N\, \|f - I^1_{\mathcal{T}_N} f\|_{L^p(\Omega)} \le C\, \|d^2 f\|_{L^\tau(\Omega)}, \tag{10}$$
with the same value of $\tau$, and the estimate (9) has a similar counterpart. The constants $C$ and $c$ appearing in these estimates now depend on the bound $\rho_0$ on the degeneracy measure. Thus the nonlinear quantity $\sqrt{|\det(d^2 f)|}$ is replaced by the linear term $d^2 f$ in the $L^\tau$ norm, and these results are now very similar to those of best $N$-term wavelet approximation [30].

In terms of the eigenvalues $\lambda_1(z), \lambda_2(z)$ of the symmetric matrix $d^2 f(z)$, we therefore replace the geometric mean $\sqrt{|\lambda_1(z) \lambda_2(z)|}$ by $\max\{|\lambda_1(z)|, |\lambda_2(z)|\}$, which can be significantly larger when these eigenvalues have different orders of magnitude. This is typically the case if the approximated function $f$ has strongly anisotropic features, and we can therefore expect a substantial improvement of the approximation properties when anisotropic meshes are used for such functions.

The result (4) gives a precise account of the improvement that anisotropic triangulations can bring in comparison with isotropic triangulations, but unfortunately only in a restricted framework, which motivated our work:

I. The original result only applies to the linear interpolation error measured in the $L^p$ norm, whereas finite elements of higher degree and the Sobolev norms $W^{1,p}$ are also relevant. In particular the $W^{1,2}$ (or $H^1$) norm appears very naturally in the context of elliptic PDEs.

II. The approximated function $f$ must be $C^2$, whereas the most interesting applications of adaptive approximation involve non-smooth or even discontinuous functions. What meaning can be given to the quantity $\|\sqrt{|\det(d^2 f)|}\|_{L^\tau(\Omega)}$ when $f$ is a discontinuous function?

III. The Riemannian metric $z \mapsto H(z)$ is used as an intermediate object for mesh generation in numerical applications. This approach however lacks a precise equivalence result between these continuous objects and the various classes of anisotropic triangulations. Under which conditions can one associate to a metric $z \mapsto H(z)$ a triangulation $\mathcal{T}$ that is adapted to it in the sense of (7)?

IV. The previously mentioned anisotropic mesh generation algorithms are not hierarchical, in the sense that improved accuracy is not achieved by local refinement but by the global re-generation of a new mesh. Can one propose a hierarchical refinement algorithm that achieves the optimal approximation error (4)?

We must also mention two fundamental problems that are discussed in this thesis, but which have remained open and will be the subject of future work. First, the error estimate (4) only gives asymptotic information, as the number of triangles tends to $\infty$, whereas an error estimate valid for all values of $N$ is highly desirable. Second, the extension of this result to functions defined on domains of dimension higher than two is hindered by a difficult problem of computational geometry: the production of conforming anisotropic meshes in dimension 3 or higher. Numerical methods such as [94] tackle this problem, but it is not solved from a theoretical point of view.
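The gap between the anisotropic quantity $\sqrt{|\lambda_1 \lambda_2|}$ and the isotropic quantity $\max\{|\lambda_1|, |\lambda_2|\}$ discussed above can be checked on a small example. The sketch below is an added illustration, not part of the thesis: the function names are hypothetical, and the sample Hessian, whose eigenvalues are 2 and 200 along the two diagonal directions of the plane, is an arbitrary choice.

```python
import math


def eigenvalues_sym2(a: float, b: float, c: float):
    """Eigenvalues (smallest first) of the symmetric matrix [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    rad = math.hypot(0.5 * (a - c), b)
    return mean - rad, mean + rad


def anisotropic_density(a: float, b: float, c: float) -> float:
    """Geometric mean sqrt(|l1 * l2|) = sqrt(|det|), the integrand in (4)."""
    l1, l2 = eigenvalues_sym2(a, b, c)
    return math.sqrt(abs(l1 * l2))


def isotropic_density(a: float, b: float, c: float) -> float:
    """max(|l1|, |l2|), the quantity that replaces the geometric mean
    when only isotropic meshes are allowed."""
    l1, l2 = eigenvalues_sym2(a, b, c)
    return max(abs(l1), abs(l2))


# Hypothetical Hessian [[101, 99], [99, 101]] with eigenvalues 2 and 200.
hessian = (101.0, 99.0, 101.0)
gain = isotropic_density(*hessian) / anisotropic_density(*hessian)
```

Here the isotropic quantity is 200 and the anisotropic one is $\sqrt{400} = 20$, so `gain` is 10. The ratio equals $\sqrt{|\lambda_2 / \lambda_1|}$ when $|\lambda_2| \ge |\lambda_1| > 0$, so it grows with the anisotropy of the Hessian and equals 1 when the two eigenvalues have the same modulus.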
The thesis consists of four parts, which aim to solve the four key problems, numbered I to IV, that we encountered in earlier results on adaptive and anisotropic finite element approximation. These four parts are essentially self-contained and can therefore be read independently. The chapters within each part are best read in order, with the exception of Chapter 1, which may be skipped by the reader who wishes to get more quickly to the heart of the subject.

Chapters 1, 2, 3, 4, 7, 8, as well as the third part of Chapter 9, are respectively derived from the articles [9], [72], [73], [37], [34], [36] and [35]. The notation used in these articles has been unified for overall clarity. Chapters 5 and 6 are the subject of articles in preparation.
Part I. Finite elements of arbitrary degree, and Sobolev norms
In this part we generalize the asymptotic error estimate (4) to finite elements of arbitrary degree and to the Sobolev semi-norms $W^{1,p}$ as measures of the error. This analysis leads us to introduce key concepts for optimal mesh adaptation.

To begin with, we consider in Chapter 1 partitions of a rectangular domain into rectangles aligned with the coordinate axes, as illustrated in Figure 5, instead of triangles of arbitrary directions. Such partitions are relevant when the coordinate axes play a privileged role, so that the anisotropic features of the function $f$ are aligned with the coordinate axes. We obtain an optimal asymptotic error estimate in this context. Our results apply to piecewise polynomials of arbitrary degree, in any dimension $d > 1$, when the approximation error is measured in the $L^p$ norm. We do not consider the $W^{1,p}$ norms here because these piecewise polynomial approximations are generally discontinuous.

The main advantage of this setting is that the technical details required for the construction of an anisotropic partition of the domain, as well as the error analysis, are simplified by the presence of privileged directions. We take advantage of this simple context to introduce and study a key concept called the shape function, which governs the local approximation error after an optimal adaptation of the elements of the partition to the local properties of the approximated function. This shape function is also defined and used in Chapters 2 and 3 for anisotropic triangulations. We give below its precise definition in that setting.

Chapter 2 is devoted to triangular finite elements of arbitrary degree $m - 1$, $m \ge 2$, when the error is measured in the $L^p$ norm. To present our results, we need to introduce some notation. We denote by $\mathbb{H}_m$ the vector space of homogeneous polynomials of degree $m$:
\[ \mathbb{H}_m := \operatorname{Span}\{ x^k y^l \, ; \ k + l = m \}. \]
A key ingredient of our approach is the shape function $K_{m,p} : \mathbb{H}_m \to \mathbb{R}_+$, where $1 \le p \le \infty$ is a given exponent. This function is defined by optimizing the $L^p$ interpolation error among the triangles of area 1 of all possible shapes: for all $\pi \in \mathbb{H}_m$,
\[ K_{m,p}(\pi) := \inf_{|T| = 1} \| \pi - I^{m-1}_T \pi \|_{L^p(T)}, \tag{11} \]
where $I^{m-1}_T$ denotes the local interpolation operator on $T$. Our main result is the following asymptotic error estimate.

Theorem 1.
Let $\Omega \subset \mathbb{R}^2$ be a bounded polygonal domain, let $f \in C^m(\Omega)$ and let $1 \le p < \infty$. There exists a sequence $(\mathcal{T}_N)_{N \ge N_0}$, $\#(\mathcal{T}_N) \le N$, of conforming triangulations of $\Omega$ such that
\[ \limsup_{N \to \infty} N^{m/2} \, \| f - I^{m-1}_{\mathcal{T}_N} f \|_{L^p(\Omega)} \le \left\| K_{m,p}\!\left( \frac{d^m f}{m!} \right) \right\|_{L^\tau(\Omega)}, \tag{12} \]
where the exponent $\tau \in (0, \infty)$ is defined by $\frac{1}{\tau} := \frac{m}{2} + \frac{1}{p}$.

In the error estimate (12), we identify the collection $d^m f(z)$ of the derivatives of order $m$ of $f$ at the point $z$ with the homogeneous polynomial that corresponds to it in the Taylor expansion of $f$ at $z$. In mathematical form,
\[ \frac{d^m f(z)}{m!} \sim \sum_{k + l = m} \frac{\partial^m f}{\partial^k x \, \partial^l y}(z) \, \frac{x^k}{k!} \frac{y^l}{l!}. \]
Theorem 1 extends the known result (4) to finite elements of arbitrary degree $m - 1$. The approximation error for $f$ is determined by a non-linear expression of the derivatives of $f$: the shape function $K_{m,p}(d^m f(z))$ is the "generalization" to higher order derivatives of the determinant $\sqrt{|\det(d^2 f(z))|}$ appearing in (4).
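For $m = 2$ and $m = 3$, this introduction states that the shape function is proportional to $\sqrt{|\det \pi|}$ and to $\sqrt[4]{|\operatorname{disc} \pi|}$ respectively. The following minimal sketch evaluates these two algebraic quantities for a polynomial given by its coefficients, normalized as $ax^2 + 2bxy + cy^2$ and $ax^3 + 3bx^2y + 3cxy^2 + dy^3$. The function names are ours and purely illustrative, and the positive constants $c_{2,p}$, $c_{3,p}$ are deliberately left out, so the code returns the shape function only up to a constant factor.

```python
from fractions import Fraction


def det_quadratic(a, b, c):
    """Determinant a*c - b^2 of the binary quadratic a x^2 + 2 b x y + c y^2."""
    return a * c - b * b


def disc_cubic(a, b, c, d):
    """Discriminant of the binary cubic a x^3 + 3 b x^2 y + 3 c x y^2 + d y^3."""
    return 4 * (a * c - b * b) * (b * d - c * c) - (a * d - b * c) ** 2


def shape_invariant(coeffs):
    """sqrt|det| for 3 coefficients (m = 2), |disc|^(1/4) for 4 coefficients (m = 3).

    This is the shape function up to a positive multiplicative constant,
    which is not computed here.
    """
    if len(coeffs) == 3:
        return float(abs(det_quadratic(*coeffs))) ** 0.5
    if len(coeffs) == 4:
        return float(abs(disc_cubic(*coeffs))) ** 0.25
    raise ValueError("only m = 2 (3 coefficients) and m = 3 (4 coefficients) are handled")


# x^2 + y^2 is non-degenerate, x^2 is a squared linear form.
print(shape_invariant((1, 0, 1)), shape_invariant((1, 0, 0)))
# x^2 y has a double linear factor (a = 0, 3b = 1); x^3 - x y^2 has three distinct ones.
print(disc_cubic(0, Fraction(1, 3), 0, 0), disc_cubic(1, 0, Fraction(-1, 3), 0))
```

The degenerate inputs $x^2$ and $x^2 y$ give a vanishing invariant, which matches the characterization of the zeros of the shape function by a linear factor of multiplicity $s > m/2$.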
These results motivate an in-depth study of the shape function $K_{m,p}$. We have shown that $K_{2,p}$, which corresponds to the case $m = 2$ of piecewise affine approximation, is proportional to the square root of the determinant of the quadratic polynomial $\pi = ax^2 + 2bxy + cy^2 \in \mathbb{H}_2$:
\[ K_{2,p}(\pi) = c_{2,p} \sqrt{|\det \pi|} = c_{2,p} \sqrt{|ac - b^2|}, \]
where the constant $c_{2,p} > 0$ depends only on the sign of $\det \pi$. We thus recover the earlier result (4). In the case $m = 3$ of piecewise quadratic approximation, we show that the shape function $K_{3,p}$ is the fourth root of the discriminant of the homogeneous cubic polynomial $\pi = ax^3 + 3bx^2y + 3cxy^2 + dy^3 \in \mathbb{H}_3$:
\[ K_{3,p}(\pi) = c_{3,p} \sqrt[4]{|\operatorname{disc} \pi|} = c_{3,p} \sqrt[4]{|4(ac - b^2)(bd - c^2) - (ad - bc)^2|}, \]
where the constant $c_{3,p} > 0$ depends only on the sign of $\operatorname{disc} \pi$. For larger values of $m$, $m \ge 4$, we have not obtained an explicit expression of the shape function, but an explicit quantity that is uniformly equivalent to it. This equivalent has the following form: there exist a polynomial $Q_m(a_0, \cdots, a_m)$ of the $m + 1$ variables $a_0, \cdots, a_m$, and a constant $C_m \ge 1$, such that for all $\pi = a_0 x^m + a_1 x^{m-1} y + \cdots + a_m y^m \in \mathbb{H}_m$ we have, denoting $r_m := \deg Q_m$,
\[ C_m^{-1} \sqrt[r_m]{Q_m(a_0, \cdots, a_m)} \le K_{m,p}(\pi) \le C_m \sqrt[r_m]{Q_m(a_0, \cdots, a_m)}. \tag{13} \]
The polynomial $Q_m$ is obtained using the theory of invariant polynomials developed by Hilbert in [61]. We also characterize the zeros of the shape function, and therefore the possible cases of "super-convergence": $K_{m,p}(\pi) = 0$ if and only if $\pi$ has a linear factor $ax + by$ of multiplicity $s > m/2$, in other words if the homogeneous polynomial $\pi$ is sufficiently degenerate.

The proof of Theorem 1 is based on the "local patch strategy" evoked previously: one first considers a "macro-triangulation" $\mathcal{R}$ of the initial domain $\Omega$, which is sufficiently fine for the derivatives of order $m$ of $f$ to vary little on each triangle $R \in \mathcal{R}$ around a mean value $\pi_R \in \mathbb{H}_m$. To each polynomial $\pi_R$, $R \in \mathcal{R}$, one then associates a triangle $T_R$ that minimizes, or almost minimizes, the optimization problem defining $K_{m,p}(\pi_R)$. One then tiles each "macro-triangle" $R \in \mathcal{R}$ periodically, using the triangle $T_R$ suitably rescaled and its symmetric image with respect to the origin. Finally, as illustrated in Figure 3, the triangulation $\mathcal{T}_N$ is obtained by gluing together the periodic tilings defined on each $R \in \mathcal{R}$, using a few additional triangles at the interfaces in order to obtain a globally conforming mesh.

The following theorem establishes that the asymptotic estimate (12) is optimal, if one restricts attention to the admissible sequences of triangulations defined by the condition (8). This theorem shows moreover that the admissibility condition is not too restrictive.

Theorem 2.
Let $\Omega \subset \mathbb{R}^2$ be a bounded polygonal domain, let $f \in C^m(\Omega)$ and let $1 \le p \le \infty$. Let $(\mathcal{T}_N)_{N \ge N_0}$, $\#(\mathcal{T}_N) \le N$, be an admissible sequence of triangulations of $\Omega$. Then
\[ \liminf_{N \to \infty} N^{m/2} \, \| f - I^{m-1}_{\mathcal{T}_N} f \|_{L^p(\Omega)} \ge \left\| K_{m,p}\!\left( \frac{d^m f}{m!} \right) \right\|_{L^\tau(\Omega)}, \tag{14} \]
where $\frac{1}{\tau} := \frac{m}{2} + \frac{1}{p}$. Moreover, for each $\varepsilon > 0$ there exists an admissible sequence $(\mathcal{T}^\varepsilon_N)_{N \ge N_0}$ of triangulations of $\Omega$, $\#(\mathcal{T}^\varepsilon_N) \le N$, such that
\[ \limsup_{N \to \infty} N^{m/2} \, \| f - I^{m-1}_{\mathcal{T}^\varepsilon_N} f \|_{L^p(\Omega)} \le \left\| K_{m,p}\!\left( \frac{d^m f}{m!} \right) \right\|_{L^\tau(\Omega)} + \varepsilon. \tag{15} \]

Chapter 3 is devoted to the versions of Theorems 1 and 2 in which the interpolation error is measured in the Sobolev semi-norm $W^{1,p}$, $1 \le p < \infty$. These estimates involve the analogue $L_{m,p}$ of the shape function $K_{m,p}$, which is defined as follows: for all $\pi \in \mathbb{H}_m$,
\[ L_{m,p}(\pi) := \inf_{|T| = 1} \| \nabla ( \pi - I^{m-1}_T \pi ) \|_{L^p(T)}. \]
We again give explicit equivalents of $L_{m,p}(\pi)$, defined in the same way as in (13) by an algebraic expression in the coefficients of $\pi \in \mathbb{H}_m$. Our results for the $W^{1,p}$ semi-norms are therefore extremely similar to the results obtained for the $L^p$ norms.

The adaptation of the proofs is however not straightforward, because of the following phenomenon: the presence of strongly obtuse triangles (with an angle close to $\pi$, here the number) in a mesh can cause oscillations of the gradient of the interpolant of a function, as illustrated in Figure 6. This phenomenon deteriorates the interpolation error in the $W^{1,p}$ semi-norm, but not in the $L^p$ norm.
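The effect of a strongly obtuse triangle on the gradient can be observed on a single element. The sketch below is our own illustration (the function name and the choice of test function are assumptions, not taken from the thesis): it computes the gradient of the affine interpolant of $f(x, y) = x^2$ on the triangle with vertices $(-1, 0)$, $(1, 0)$, $(0, \varepsilon)$, whose angle at $(0, \varepsilon)$ tends to $\pi$ as $\varepsilon \to 0$.

```python
def interpolant_gradient(tri, f):
    """Gradient (gx, gy) of the affine function matching f at the three vertices of tri."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    f0, f1, f2 = f(x0, y0), f(x1, y1), f(x2, y2)
    # Solve the 2x2 system
    #   (x1 - x0) gx + (y1 - y0) gy = f1 - f0
    #   (x2 - x0) gx + (y2 - y0) gy = f2 - f0
    # by Cramer's rule.
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    gx = ((f1 - f0) * (y2 - y0) - (f2 - f0) * (y1 - y0)) / det
    gy = ((x1 - x0) * (f2 - f0) - (x2 - x0) * (f1 - f0)) / det
    return gx, gy


def flat_triangle(eps):
    """Triangle with a nearly flat angle at the apex (0, eps)."""
    return [(-1.0, 0.0), (1.0, 0.0), (0.0, eps)]


for eps in (0.1, 0.01, 0.001):
    gx, gy = interpolant_gradient(flat_triangle(eps), lambda x, y: x * x)
    print(eps, gx, gy)
```

On this triangle the interpolant is $1 - y/\varepsilon$, so its vertical derivative is $-1/\varepsilon$ while $\partial_y f = 0$: the gradient error blows up as the triangle flattens, whereas the pointwise error $|f - I_T f|$ stays bounded by 1.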
These "flat" triangles must therefore be carefully avoided. In summary, long and thin triangles can be desirable, but they must not be too strongly obtuse.

Before continuing the description of this thesis, we remind the reader that the three chapters composing Part I are self-contained and can therefore be read independently.

Part II. Anisotropic approximation classes and image models
In this part, which consists of the single Chapter 4, we discuss the extension of our approximation results to non-smooth functions.

There are various ways of measuring the smoothness of functions defined on a domain $\Omega \subset \mathbb{R}^2$, most often by means of an appropriate smoothness space and an associated norm. Classical examples are the Sobolev and Besov spaces. These spaces are often used to describe the smoothness of solutions of PDEs. From a numerical point of view, they precisely characterize the rate at which a function $f$ can be approximated by simpler functions such as Fourier series, finite elements (on isotropic triangulations), spline functions or wavelets.

The anisotropic adaptive approximation result (4) and its generalization in Theorem 1 indicate that the quality of the approximation of a function $f$ by finite elements on anisotropic triangulations is governed by a non-linear quantity of its derivatives, at least from an asymptotic point of view. In the case of finite elements of degree 1, and of approximation in the $L^2$ norm, the relevant quantity is the following:
\[ A(f) := \left\| \sqrt{|\det(d^2 f)|} \right\|_{L^{2/3}(\Omega)}. \]
The functional $A$ differs strongly from the Sobolev, Hölder or Besov norms, because it is strongly non-linear: $A$ does not satisfy the triangle inequality, nor any quasi-triangle inequality. In other words, for each constant $C$ there exist $f, g \in C^2(\Omega)$ such that
\[ A(f + g) > C \, ( A(f) + A(g) ). \tag{16} \]
The absence of a triangle inequality prevents the use of the classical techniques of linear analysis to define a smoothness space attached to the functional $A$.
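A short worked example (ours, not taken from the thesis) makes the failure of any quasi-triangle inequality concrete. It assumes the definition $A(f) = \| \sqrt{|\det(d^2 f)|} \|_{L^{2/3}(\Omega)}$ and takes $f(x, y) := x^2$, $g(x, y) := y^2$ on a bounded domain $\Omega$ of area $|\Omega| > 0$:

\[
d^2 f = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
d^2 g = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}, \qquad
d^2 (f + g) = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix},
\]
\[
A(f) = A(g) = 0, \qquad
A(f + g) = \| 2 \|_{L^{2/3}(\Omega)} = 2 \, |\Omega|^{3/2} > 0 = C \, ( A(f) + A(g) ) \quad \text{for every constant } C .
\]

Each of $f$ and $g$ varies in a single direction, so its Hessian determinant vanishes identically, while their sum is isotropic and has a non-zero determinant everywhere.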
The extension of the approximation result (4) to functions that are not $C^2$ is therefore not straightforward.

The functions that arise in concrete applications, for instance in image processing or as solutions of hyperbolic PDEs, often exhibit areas of smoothness separated by localized discontinuities. A simple mathematical model for this type of behavior is given by the class of cartoon functions, which are smooth except along a family of curves, themselves smooth, across which they are discontinuous. A heuristic analysis presented in Chapter 4 suggests that for any cartoon function $f$ defined on a bounded polygonal domain $\Omega$, there exists a sequence $(\mathcal{T}_N)_{N \ge N_0}$ of anisotropic triangulations of $\Omega$, $\#(\mathcal{T}_N) \le N$, such that
\[ \sup_{N \ge N_0} N \, \| f - I_{\mathcal{T}_N} f \|_{L^2(\Omega)} < \infty. \tag{17} \]
As illustrated in Figure 1, the triangulations $(\mathcal{T}_N)_{N \ge N_0}$ are made of strongly anisotropic triangles aligned with the discontinuities of $f$, and of large triangles in the regions where $f$ is smooth. The approximation result (17) raises the hope that there exists a precise asymptotic error estimate for the anisotropic approximation of cartoon functions, extending the result (4) known when $f$ is $C^2$, namely
\[ \limsup_{N \to \infty} N \, \| f - I_{\mathcal{T}_N} f \|_{L^2(\Omega)} \le C A(f). \tag{18} \]
So far we have only completed part of this program: the extension of the functional $A$ to cartoon functions. More precisely, consider a radial function $\varphi \in C^\infty(\mathbb{R}^2)$, with compact support and unit integral. Define
\[ \varphi_\delta(z) := \frac{1}{\delta^2} \, \varphi\!\left( \frac{z}{\delta} \right) \]
for each $\delta > 0$, and denote by $f_\delta := f * \varphi_\delta$ the convolution of $f$ with $\varphi_\delta$.

[Figure 7: the functional $A$ on different types of images.]
We prove that if $f$ is a cartoon function, then $A(f_\delta)$ converges as $\delta \to 0$, and
\[
\lim_{\delta \to 0} A(f_\delta)^{2/3}
= \left\| \sqrt{|\det(d^2 f)|} \right\|^{2/3}_{L^{2/3}(\Omega \setminus \Gamma)} + C(\varphi) \left\| \sqrt{|\kappa|} \, \gamma \right\|^{2/3}_{L^{2/3}(\Gamma)}
= \int_{\Omega \setminus \Gamma} \sqrt[3]{|\det(d^2 f(z))|} \, dz + C(\varphi) \int_\Gamma |\kappa(s)|^{1/3} |\gamma(s)|^{2/3} \, ds.
\]
Here $\Gamma$ denotes the family of curves along which the cartoon function $f$ is discontinuous, $\gamma(s)$ the jump of $f$ at a point $s \in \Gamma$, and $\kappa(s)$ the curvature of $\Gamma$ at $s$. The constant $C(\varphi)$ is strictly positive and has an explicit expression in terms of $\varphi$.

The extension of the non-linear functional $A$ to cartoon functions sheds light on a link, explored in Chapter 4, that presents $A$ as the "second-order" counterpart of the total variation semi-norm TV, a measure of smoothness defined in terms of the first-order derivatives of $f$, which is also finite when $f$ is a cartoon function. Total variation plays a central role in image processing and in the analysis of transport equations, two fields in which piecewise smooth functions appear naturally. When $f$ is a cartoon function, its total variation is given by the following formula:
\[ \mathrm{TV}(f) = \int_{\Omega \setminus \Gamma} |\nabla f(z)| \, dz + \int_\Gamma |\gamma(s)| \, ds. \]
We compare the behavior of the quantities $A(f)$ and $\mathrm{TV}(f)$ for different families of cartoon functions $f$. The quantities $\mathrm{TV}(f)$ and $A(f)^{2/3}$ turn out to be equivalent when $f$ is the smoothly oscillating function $f(z) := \cos(\omega |z|)$, illustrated in Figure 7 (i), where $\omega$ is a large parameter. In contrast, discontinuities are penalized differently by these two functionals.
For a "staircase" function, as illustrated in Figure 7 (ii), the total variation TV remains bounded whereas $A$ tends to infinity as the number of steps grows, because of the jump term $|\gamma(s)|^{2/3}$ in the integral over the set $\Gamma$ of discontinuities. Furthermore, because of the curvature term $|\kappa(s)|$, the functional $A$ is much larger than TV for the characteristic functions of sets with a complex or oscillating boundary, as illustrated in Figure 7.

The functional $A$ can therefore be regarded as a quantitative image model: a monochrome image, described by its luminosity $f : [0,1]^2 \to [0,1]$, is considered plausible when $A(f)$ is sufficiently small. Building on the approach introduced in [70], we propose an image denoising algorithm using a Bayesian prior based on this model. In its current version, this algorithm is not satisfactory in terms of convergence speed, and for this reason we only present numerical illustrations in a simplified one-dimensional setting.

Finally, we study the extension to cartoon functions of the other non-linear quantities that appear in finite element approximation on anisotropic triangulations, such as the norm $\| K_{m,p}(d^m f) \|_{L^\tau}$ of the shape function for $m \ge 2$, or the analogue of this quantity when $f$ is a function of more than two variables.

Part III. Anisotropic mesh generation and Riemannian metrics
Triangulations are discrete objects of a combinatorial nature: they can be described by a family of vertices and of edges joining them. This description is fruitful for proving algebraic results such as Euler's formula, or for computer processing. In contrast, as explained previously, many approaches to anisotropic mesh adaptation [15, 20, 66] rely on a continuous object equivalent to triangulations, namely a Riemannian metric $z \mapsto H(z)$. In other words, this is a continuous function $H$ from $\Omega$ into the set $S_2^+$ of symmetric positive definite matrices. Once this metric has been suitably designed, a mesh generation algorithm is in charge of building a triangulation that corresponds to it in the sense of (7). The objective of Part III is to formulate precise equivalence results between certain classes of triangulations and of Riemannian metrics. This equivalence translates the geometric constraints satisfied by the triangulations into smoothness properties of the Riemannian metrics that are equivalent to them.

To state our results we need to introduce some notation. We associate to each triangle $T$ its barycenter $z_T \in \mathbb{R}^2$ and the symmetric positive definite matrix $\mathcal{H}_T \in S_2^+$ such that the ellipse $\mathcal{E}_T$ defined by
\[ \mathcal{E}_T := \{ z \in \mathbb{R}^2 \, ; \ (z - z_T)^{\mathrm{T}} \mathcal{H}_T (z - z_T) \le 1 \} \]
is the ellipse of minimal area containing $T$. The point $z_T$ therefore indicates the position of $T$, while the matrix $\mathcal{H}_T \in S_2^+$ describes its area, its aspect ratio and its orientation.

We denote by $\mathbf{T}$ the family of all conforming triangulations of the infinite domain $\mathbb{R}^2$. The choice of considering infinite triangulations is guided by simplicity, and future work will be devoted to triangulations of bounded polygonal domains. We denote by $\mathbf{H} := C^0(\mathbb{R}^2, S_2^+)$ the family of Riemannian metrics on $\mathbb{R}^2$.
A metric $H \in \mathbf{H}$ continuously associates to each point $z \in \mathbb{R}^2$ a symmetric positive definite matrix $H(z) \in S_2^+$. Let $C \ge 1$. We say that a triangulation $\mathcal{T} \in \mathbf{T}$ is $C$-equivalent to a metric $H \in \mathbf{H}$ if for all $T \in \mathcal{T}$ and all $z \in T$ we have, in the sense of symmetric matrices,
\[ C^{-1} H(z) \le \mathcal{H}_T \le C \, H(z). \]
We say that a family $\mathbf{T}_* \subset \mathbf{T}$ of triangulations is equivalent to a family $\mathbf{H}_* \subset \mathbf{H}$ of metrics if there exists a uniform constant $C \ge 1$ such that:

- For each triangulation $\mathcal{T} \in \mathbf{T}_*$ there exists a metric $H \in \mathbf{H}_*$ such that $\mathcal{T}$ and $H$ are $C$-equivalent.
- For each metric $H \in \mathbf{H}_*$ there exists a triangulation $\mathcal{T} \in \mathbf{T}_*$ such that $\mathcal{T}$ and $H$ are $C$-equivalent.

We consider three relevant families of triangulations of $\mathbb{R}^2$,
\[ \mathbf{T}_{i,C} \subset \mathbf{T}_{a,C} \subset \mathbf{T}_{g,C}, \]
which depend in a minor way on a parameter $C \ge 1$. The family $\mathbf{T}_{g,C}$ of graded triangulations is defined by the following condition, which imposes a minimum of consistency between the shapes of neighboring triangles. A triangulation $\mathcal{T}$ belongs to $\mathbf{T}_{g,C}$ if for all $T, T' \in \mathcal{T}$ we have
\[ T \cap T' \neq \emptyset \ \Rightarrow \ C^{-1} \mathcal{H}_T \le \mathcal{H}_{T'} \le C \, \mathcal{H}_T. \]
A triangulation $\mathcal{T}$ belongs to the class $\mathbf{T}_{i,C}$ of isotropic triangulations if it is graded, $\mathcal{T} \in \mathbf{T}_{g,C}$, and if the elements of $\mathcal{T}$ are sufficiently close to the equilateral triangle, which is expressed by the following condition: for all $T \in \mathcal{T}$,
\[ \| \mathcal{H}_T \| \, \| \mathcal{H}_T^{-1} \| \le C. \]
The class $\mathbf{T}_{a,C}$ of quasi-acute triangulations is defined by a slightly more technical condition on the maximal angles of the triangles, see Chapter 5. As explained in the description of Chapter 3 above, avoiding too strongly obtuse angles is necessary to guarantee the stability of the gradient when the interpolation operator is applied. One of our key results is the reformulation of this condition as a smoothness assumption on the equivalent Riemannian metric. From a practical point of view, this condition is unfortunately not guaranteed by existing mesh generation programs. The constraints satisfied by these families of triangulations of $\mathbb{R}^2$ are illustrated, for triangulations of bounded domains, in Figure 8.

The results of Chapter 5 establish that when $C$ is sufficiently large, these three classes of triangulations are equivalent to three families of metrics, respectively
\[ \mathbf{H}_i \subset \mathbf{H}_a \subset \mathbf{H}_g, \]
which are defined by precise smoothness conditions on the function $z \mapsto H(z)$.
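The matrix attached to a triangle can be computed in closed form. The sketch below is our own illustration, with hypothetical function names. It relies on the classical fact that the minimal-area ellipse containing a triangle is its Steiner circumellipse, centered at the barycenter and passing through the three vertices, which gives $\mathcal{H}_T = \big( \tfrac{2}{3} \sum_i (v_i - z_T)(v_i - z_T)^{\mathrm{T}} \big)^{-1}$. It then evaluates the product $\| \mathcal{H}_T \| \, \| \mathcal{H}_T^{-1} \|$ that appears in the isotropy condition, which for a symmetric positive definite matrix is the ratio of its extreme eigenvalues.

```python
import math


def triangle_metric(tri):
    """Return (z_T, H_T) for a triangle given as three (x, y) vertices.

    z_T is the barycenter and H_T = (h11, h12, h22) is the symmetric matrix such
    that {z : (z - z_T)^T H_T (z - z_T) <= 1} is the Steiner circumellipse.
    """
    zx = sum(v[0] for v in tri) / 3.0
    zy = sum(v[1] for v in tri) / 3.0
    # M = (2/3) * sum of d d^T over the vertex offsets d = v - z_T.
    m11 = m12 = m22 = 0.0
    for x, y in tri:
        dx, dy = x - zx, y - zy
        m11 += dx * dx
        m12 += dx * dy
        m22 += dy * dy
    m11, m12, m22 = (2.0 / 3.0 * m for m in (m11, m12, m22))
    det = m11 * m22 - m12 * m12
    # H_T is the inverse of the symmetric 2x2 matrix M.
    return (zx, zy), (m22 / det, -m12 / det, m11 / det)


def quad_form(h, dx, dy):
    """Value of (dx, dy)^T H (dx, dy) for H = (h11, h12, h22)."""
    h11, h12, h22 = h
    return h11 * dx * dx + 2.0 * h12 * dx * dy + h22 * dy * dy


def isotropy_ratio(h):
    """||H|| * ||H^{-1}|| in the spectral norm: largest over smallest eigenvalue."""
    h11, h12, h22 = h
    mean = 0.5 * (h11 + h22)
    radius = math.hypot(0.5 * (h11 - h22), h12)
    return (mean + radius) / (mean - radius)


equilateral = [(1.0, 0.0), (-0.5, math.sqrt(3) / 2), (-0.5, -math.sqrt(3) / 2)]
thin = [(0.0, 0.0), (4.0, 0.0), (0.0, 1.0)]
for tri in (equilateral, thin):
    z, h = triangle_metric(tri)
    print(z, h, isotropy_ratio(h))
```

The ratio equals 1 for the equilateral triangle and grows with the anisotropy of the triangle, which is what the isotropy condition bounds.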
In particular the family $\mathbf{H}_i$ consists of the metrics $H$ that are proportional to the identity,
\[ H(z) = \frac{\mathrm{Id}}{s(z)^2}, \]
and such that the proportionality factor $s$ satisfies one of the following conditions which, surprisingly, are equivalent:

- Euclidean Lipschitz property: $|s(z) - s(z')| \le |z - z'|$ for all $z, z' \in \mathbb{R}^2$.
- Riemannian Lipschitz property: $|\ln s(z) - \ln s(z')| \le d_H(z, z')$ for all $z, z' \in \mathbb{R}^2$.

Recall that the Riemannian distance $d_H(z, z')$ measures the length, in the sense of the metric $H$, of the shortest path joining $z$ to $z'$:
\[ d_H(z, z') := \inf_{\substack{\gamma(0) = z \\ \gamma(1) = z'}} \int_0^1 \| \gamma'(t) \|_{H(\gamma(t))} \, dt, \]
where $\| u \|_M := \sqrt{u^{\mathrm{T}} M u}$ and where the infimum is taken among all paths $\gamma \in C^1([0,1], \mathbb{R}^d)$ joining $z$ to $z'$.

The two Lipschitz properties presented above extend naturally to general Riemannian metrics, but are then no longer equivalent. For $H \in \mathbf{H}$, we set $S(z) := H(z)^{-1/2}$, and we introduce the two distinct properties
\[ d_+(H(z), H(z')) \le |z - z'| \quad \text{for all } z, z' \in \mathbb{R}^2, \tag{19} \]
where $d_+(H(z), H(z')) := \| S(z) - S(z') \|$, and
\[ d_\times(H(z), H(z')) \le d_H(z, z') \quad \text{for all } z, z' \in \mathbb{R}^2, \tag{20} \]
where $d_\times(H(z), H(z')) := \ln \max\{ \| S(z) S(z')^{-1} \|, \| S(z') S(z)^{-1} \| \}$. Note that $d_+$ and $d_\times$ are distances on $S_2^+$.

The main result of Chapter 5 characterizes the families of metrics $\mathbf{H}_g$ and $\mathbf{H}_a$ (associated with the families $\mathbf{T}_{g,C}$ of graded triangulations and $\mathbf{T}_{a,C}$ of quasi-acute triangulations) in the form of the Lipschitz properties above.

Theorem 3.
If the constant $C$ is sufficiently large, then the family $\mathbf{T}_{g,C}$ of graded triangulations is equivalent to the family $\mathbf{H}_g$ of metrics that satisfy (20), and the family $\mathbf{T}_{a,C}$ of quasi-acute triangulations is equivalent to the family $\mathbf{H}_a$ of metrics that satisfy simultaneously (20) and (19).

We give in Chapter 6 applications of these results in the context of approximation theory and of constrained mesh generation. Let us describe the latter example. For each closed set $E \subset \mathbb{R}^d$ and each triangulation $\mathcal{T} \in \mathbf{T}$, we denote by $V_{\mathcal{T}}(E)$ the neighborhood of $E$ in the triangulation $\mathcal{T}$, which is defined as follows:
\[ V_{\mathcal{T}}(E) := \bigcup_{\substack{T \in \mathcal{T} \\ T \cap E \neq \emptyset}} T. \]
We say that a triangulation $\mathcal{T} \in \mathbf{T}$ separates two closed and disjoint sets $X, Y \subset \mathbb{R}^2$ if $V_{\mathcal{T}}(X) \cap V_{\mathcal{T}}(Y) = \emptyset$. The analogous property for metrics is the following: a metric $H \in \mathbf{H}$ separates $X$ and $Y$ if $d_H(x, y) \ge 1$ for all $x \in X$ and $y \in Y$. We show that these two properties are rigorously equivalent, and we use this reformulation to compute the smallest number of triangles (up to periodicity) required to separate (periodic) sets using a (periodic) isotropic, quasi-acute or graded triangulation. As illustrated in Figure 8, imposing more constraints on the triangulation typically increases the number of triangles required to perform the same task.

[Figure 8: triangulations along the unit circle at scale $\delta$, built from isotropic triangles (top left), quasi-acute triangles (top right), and triangles satisfying, or not, the grading condition (bottom left and bottom right).]

The remainder of this chapter is devoted to the control of the finite element approximation error of a function on a triangulation $\mathcal{T}$, in the $L^p$ or $W^{1,p}$ norm, by a quantity $e_H(f)_p$, $e^a_H(\nabla f)_p$ or $e^g_H(\nabla f)_p$ attached to a metric $H$ equivalent to $\mathcal{T}$. This analysis reveals, in the case of Sobolev norms, the particular role played by the angle and smoothness conditions that define the triangulations and metrics belonging to $\mathbf{T}_a$ and $\mathbf{H}_a$ respectively. Finally we extend to metrics the asymptotic error estimates developed for triangulations in Chapters 2 and 3.

Part IV. Hierarchical refinement algorithms
The last part of this thesis is devoted to the study of an algorithm proposed by Cohen, Dyn and Hecht, which produces hierarchical sequences $(\mathcal{T}_N)_{N \ge N_0}$ of (non-conforming) anisotropic triangulations adapted to a given function $f$. Given a triangulation $\mathcal{T}$ of a domain $\Omega$ and a function $f \in L^p(\Omega)$, this algorithm creates in one step a triangulation $\mathcal{T}'$ of $\Omega$, of cardinality $\#(\mathcal{T}') = \#(\mathcal{T}) + 1$, as follows:

1. (Greedy selection of the triangle to refine) Select a triangle $T \in \mathcal{T}$ whose local approximation error is maximal:
\[ T := \operatorname*{argmax}_{T' \in \mathcal{T}} \| f - A_{T'} f \|_{L^p(T')}, \]
where $A_T : L^p(T) \to \mathbb{P}_{m-1}$ is a projection operator, for instance the $L^2(T)$-orthogonal projection onto $\mathbb{P}_{m-1}$.

2. (Choice of a bisection) An edge $e \in \{a, b, c\}$ of $T$ is chosen by minimizing a given decision function $e \mapsto d_T(e, f)$ among the three edges. The triangle is cut along the segment joining the midpoint of the chosen edge to the opposite vertex, which creates the sub-triangles $T_e^1$ and $T_e^2$. The new triangulation is therefore
\[ \mathcal{T}' := \mathcal{T} - \{T\} + \{T_e^1, T_e^2\}. \]

Starting from a triangulation $\mathcal{T}_{N_0}$ of cardinality $N_0$ of the domain $\Omega$, the algorithm produces step by step a sequence $(\mathcal{T}_N)_{N \ge N_0}$, see Figure 9, of triangulations "adapted" to a given function $f \in L^p(\Omega)$. The properties of these triangulations depend strongly on the approximated function $f$ and on the choice of the decision function $d_T(e, f)$, which drives the creation of anisotropy. In contrast, the projection operator $A_T$ plays a rather minor role. A typical choice of the decision function $e \mapsto d_T(e, f)$ is the local error after bisection; in other words, the algorithm chooses the bisection that reduces the approximation error as much as possible.

We describe this algorithm in more detail in Chapter 7, and we establish its convergence in the sense that the piecewise polynomial approximations $A_{\mathcal{T}_N} f$ defined by
\[ A_{\mathcal{T}_N} f(z) = A_T f(z), \quad z \in T, \ T \in \mathcal{T}_N, \]
converge to $f$ in $L^p$ as $N \to \infty$ for any $f \in L^p$, under certain assumptions on the decision function $e \mapsto d_T(e, f)$. We also discuss the possibility of using the multi-scale hierarchical structure to define multi-resolution approximations, wavelets and a CART-type algorithm.
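The two steps above can be sketched in a few lines. The code below is a simplified illustration under assumptions of our own, not the implementation studied in the thesis: it uses piecewise constant approximation ($m = 1$, $p = 2$), takes for $A_T$ the mean value of $f$ on $T$, approximates integrals by an equal-weight rule at the centroids of $n^2$ congruent sub-triangles, and uses the squared local error after bisection as decision function. All function names are hypothetical.

```python
def area(tri):
    (ax, ay), (bx, by), (cx, cy) = tri
    return 0.5 * abs((bx - ax) * (cy - ay) - (cx - ax) * (by - ay))


def sample_points(tri, n=6):
    """Centroids of the n*n congruent sub-triangles of tri (equal-weight nodes)."""
    (ax, ay), (bx, by), (cx, cy) = tri
    pts = []
    for i in range(n):
        for j in range(n - i):
            shifts = (1.0, 2.0) if i + j <= n - 2 else (1.0,)  # upright, then inverted
            for s in shifts:
                u, v = (i + s / 3.0) / n, (j + s / 3.0) / n
                pts.append((ax + u * (bx - ax) + v * (cx - ax),
                            ay + u * (by - ay) + v * (cy - ay)))
    return pts


def local_error_sq(tri, f, n=6):
    """Approximate squared L2 error of the best constant approximation of f on tri."""
    vals = [f(x, y) for x, y in sample_points(tri, n)]
    mean = sum(vals) / len(vals)
    return area(tri) * sum((v - mean) ** 2 for v in vals) / len(vals)


def bisect(tri, k):
    """Cut tri from vertex k to the midpoint of the opposite edge."""
    v, p, q = tri[k], tri[(k + 1) % 3], tri[(k + 2) % 3]
    mid = ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)
    return (v, p, mid), (v, mid, q)


def refine_once(triangles, f):
    # Step 1: greedy selection of the triangle with the largest local error.
    idx = max(range(len(triangles)), key=lambda i: local_error_sq(triangles[i], f))
    # Step 2: among the three bisections, keep the one with the smallest error.
    best = min((bisect(triangles[idx], k) for k in range(3)),
               key=lambda pair: sum(local_error_sq(t, f) for t in pair))
    return triangles[:idx] + list(best) + triangles[idx + 1:]


def greedy_refine(triangles, f, n_target):
    """Apply refine_once until the triangulation has n_target triangles."""
    triangles = list(triangles)
    while len(triangles) < n_target:
        triangles = refine_once(triangles, f)
    return triangles


# Unit square split into two triangles, and a function with a jump along x = 0.3.
start = [((0.0, 0.0), (1.0, 0.0), (1.0, 1.0)), ((0.0, 0.0), (1.0, 1.0), (0.0, 1.0))]
step = lambda x, y: 1.0 if x > 0.3 else 0.0
mesh = greedy_refine(start, step, 20)
print(len(mesh), sum(area(t) for t in mesh))
```

Each call to `refine_once` replaces one triangle by two, so the triangulations are nested and generally non-conforming, as in the description above. Recomputing every local error at each step keeps the sketch short; a practical version would cache them in a priority queue.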
We illustrate the anisotropic adaptation produced by the algorithm on several types of functions and images that exhibit sharp transitions along curved lines.

[Figure 10: Triangulation produced by the hierarchical refinement algorithm, and interpolation, for a function with a sharp variation along a sinusoidal curve.]

We carry out a more in-depth analysis of the convergence of the algorithm in Chapter 8, in the case $m = 2$ of piecewise linear approximation. Our main result shows that when $f$ is $C^2$ and strictly convex, for a particular choice of the decision function based on the $L^1$ interpolation error, the sequence $(\mathcal{T}_N)_{N \ge N_0}$ of triangulations generated by this algorithm satisfies the asymptotically optimal convergence estimate
\[ \limsup_{N \to \infty} N \, \| f - A_{\mathcal{T}_N} f \|_{L^p(\Omega)} \le C \, \| \sqrt{|\det(d^2 f)|} \|_{L^\tau(\Omega)}, \]
where $\frac{1}{\tau} := 1 + \frac{1}{p}$. The key observation leading to this result is that when $f$ is a convex quadratic polynomial, the minimization of the decision function selects the longest edge in the metric associated with the homogeneous quadratic part of this polynomial. This property makes it possible to show that most of the triangles generated by the algorithm tend to adopt an optimal aspect ratio. The good behavior of the algorithm is also observed for general non-convex functions, as illustrated in Figure 10. However, proving the optimal convergence result above remains an open problem in this setting.

We study in Chapter 9 variants of the hierarchical refinement algorithm presented above. We first consider a decision function based on the local $L^2$ projection error, which is particularly well suited to numerical implementation because it can be evaluated in reduced computing time.

We then focus on the behavior of the algorithm when it is applied to cartoon functions. The original algorithm does not satisfy the best possible convergence estimate for such functions.
We show that the optimal convergence rate could be restored by replacing the bisection from the midpoint of an edge to the opposite vertex with other choices of geometric splits.

Finally we consider another variant of the algorithm, based on rectangles aligned with the coordinate axes instead of triangles of arbitrary direction, in the spirit of the rectangular partitions studied in Chapter 1. This simplification leads to a result that guarantees the best possible convergence estimate for all $C^1$ functions, in the case of piecewise constant approximations.

Introduction (English Version)

Although this may seem a paradox, all exact science is dominated by the idea of approximation. (Bertrand Russell, logician and Nobel Prize laureate)
This thesis is devoted to the problem of approximating functions by piecewise polynomial finite elements over triangulations and more general meshes. We are particularly concerned with the setting where the mesh is adaptively designed depending on the function to be approximated. This mesh may therefore include elements of strongly varying size, aspect ratio and orientation.

Approximation by piecewise polynomial functions is a procedure that occurs in numerous applications. In some of them, such as terrain data simplification and surface or image compression, the function $f$ to be approximated might be fully known. In other applications, such as denoising, statistical learning or the finite element discretization of PDEs, it might be only partially known or fully unknown. In all these applications, one usually makes the distinction between uniform and adaptive approximation. In the uniform case, the domain of interest is decomposed into a partition where all elements have comparable shape and size, while these attributes are allowed to vary strongly in the adaptive case. The partition may therefore be adapted to the local properties of $f$, with the objective of optimizing the trade-off between accuracy and complexity of the approximation.

From an approximation theoretic point of view, the trade-off between accuracy and complexity is usually tied to the smoothness properties of the function: typically one expects higher convergence rates for smoother functions. Functions arising in concrete applications may however have inhomogeneous smoothness properties, in the sense that they exhibit areas of smoothness separated by localized discontinuities. Two typical instances displayed in Figure 1 and Figure 2 are (i) edges in functions representing images, and (ii) shock profiles in the solutions to non-linear hyperbolic PDEs.
Numerical procedures for image processing, such as image denoising or compression, or for the simulation of PDEs, greatly benefit from economical and faithful approximations of such functions. A piecewise polynomial approximation on a uniform partition of the domain is generally not sufficient for these purposes. A first step toward adaptivity is to vary the size of the elements forming the partition according to the local smoothness properties of the function. A very

[Figure 1]

[…] anisotropic partition of the domain. In the two-dimensional case, such partitions are typically built from triangles of high aspect ratio aligned with the discontinuities, as displayed in Figure 1 (bottom right) and Figure 2.

In the context of numerical PDEs, adaptivity also refers to the fact that the computational mesh is not fixed in advance, but is instead dynamically updated based on the information on the exact solution gained through the solution process. From a numerical point of view, such methods require more complex algorithms and more intricate data structures than their non-adaptive counterparts. From a theoretical point of view, the analysis of these adaptive algorithms, when it is possible, is generally involved. As a matter of fact, the improvement brought by adaptivity in terms of convergence rate is rigorously established only for a few systems of PDEs, and in the case of isotropic finite element meshes. We refer to the survey paper [77] for a complete overview of these aspects in the case of elliptic equations. These difficulties are exacerbated when anisotropic elements are used. Several mesh generation software packages, such as [20, 93, 94], nevertheless successfully use anisotropic adaptivity for the numerical simulation of PDEs, as displayed for instance in Figure 2. From a numerical point of view, the improvement brought by these methods seems obvious compared to non-adaptive or adaptive isotropic methods. However, many aspects of the theoretical analysis of aniso-
[Figure 2]

[…] known function, which may be regarded as a preliminary step for the analysis of anisotropic mesh adaptation for the numerical simulation of PDEs, but may also serve in other applications such as terrain data, surface and image processing.

Given a triangulation $\mathcal{T}$ of a bounded polygonal domain $\Omega \subset \mathbb{R}^2$ and a fixed integer $k \geq 1$, we denote by $V_k(\mathcal{T})$ the space of finite elements of degree $k$ on $\mathcal{T}$. The space $V_k(\mathcal{T})$ consists of all functions which coincide on each triangle $T \in \mathcal{T}$ with a polynomial of total degree at most $k$:
$$V_k(\mathcal{T}) := \{ g \,;\ g|_T \in \mathbb{P}_k,\ T \in \mathcal{T} \}.$$
The dimension of $V_k(\mathcal{T})$ is of the order $O(k^2\, \#(\mathcal{T}))$. Given a function $f : \Omega \to \mathbb{R}$ and a triangulation $\mathcal{T}$ of $\Omega$, the best approximation error of $f$ in $V_k(\mathcal{T})$ is defined by
$$e_{\mathcal{T}}(f)_X := \inf_{g \in V_k(\mathcal{T})} \|f - g\|_X. \tag{1}$$
The letter $X$ indicates the norm or semi-norm in which the approximation error $\|f - g\|_X$ is measured. In this thesis we restrict our attention to the $L^p$ norm and the $W^{1,p}$ semi-norm, where $1 \leq p \leq \infty$. They are defined as follows:
$$\|h\|_{L^p(\Omega)} := \left( \int_\Omega |h|^p \right)^{1/p} \quad \text{and} \quad |h|_{W^{1,p}(\Omega)} := \left( \int_\Omega |\nabla h|^p \right)^{1/p},$$
with the standard modification when $p = \infty$. Note that when using the $W^{1,p}$ semi-norm, we need to impose global continuity of $g$ in the above definition of $V_k(\mathcal{T})$.

The best approximation $g \in V_k(\mathcal{T})$ of $f$ can be exactly computed in the case of the $L^2$ norm or the $W^{1,2}$ (or $H^1$) semi-norm: $g$ is the orthogonal projection of $f$ onto $V_k(\mathcal{T})$ with respect to the scalar product associated to the norm of interest. In the case $p \neq 2$ of non-Hilbertian norms, the best approximation of $f$ is generally hard to compute, but "satisfactory" approximations can be obtained by different methods. If the function is smooth (at least continuous), one may use the standard Lagrange interpolant, while for non-smooth functions a quasi-interpolation operator is preferred, see Chapter 6. More generally, if $P_{\mathcal{T}}$ is any continuous projection operator from the space $X$ to $V_k(\mathcal{T})$, it is easily seen that for any $f \in X$, one has
$$\|f - P_{\mathcal{T}} f\|_X \leq C\, e_{\mathcal{T}}(f)_X, \tag{2}$$
where $C := 1 + \|P_{\mathcal{T}}\|_{X \to X}$.
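The inequality (2) follows from a short standard argument. The sketch below assumes, as seems implicit in the text, that $P_{\mathcal{T}}$ is linear and reproduces the elements of $V_k(\mathcal{T})$, that is $P_{\mathcal{T}} g = g$ for all $g \in V_k(\mathcal{T})$:

```latex
\begin{align*}
f - P_{\mathcal{T}} f &= (f - g) - P_{\mathcal{T}}(f - g)
  && \text{for any } g \in V_k(\mathcal{T}), \text{ since } P_{\mathcal{T}} g = g, \\
\|f - P_{\mathcal{T}} f\|_X &\leq \bigl(1 + \|P_{\mathcal{T}}\|_{X \to X}\bigr)\, \|f - g\|_X
  && \text{by the triangle inequality.}
\end{align*}
```

Taking the infimum over $g \in V_k(\mathcal{T})$ on the right-hand side then gives (2).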
The problem of approximating a function $f$ on a given triangulation $\mathcal{T}$, using finite elements of degree $k$, is thus solved in good part.

In the context of adaptive approximation, the triangulation $\mathcal{T}$ of the domain $\Omega$ is not fixed, but can be freely chosen depending on the function $f$ (in contrast, we always assume in this thesis that the integer $k$ is fixed, although arbitrary). This naturally raises the objective of characterizing and constructing an optimal mesh for a given function $f$. Given a norm $X$ of interest and a function $f$ to be approximated, we formulate the problem of optimal mesh adaptation as minimizing the approximation error over all triangulations of prescribed cardinality. We therefore define the adaptive best approximation error by
$$e_N(f)_X := \inf_{\#(\mathcal{T}) \leq N} e_{\mathcal{T}}(f)_X = \inf_{\#(\mathcal{T}) \leq N}\ \inf_{g \in V_k(\mathcal{T})} \|f - g\|_X. \tag{3}$$
In contrast to the procedure of best finite element approximation on a fixed mesh, adaptive and anisotropic approximation is not yet well understood. In particular: (i) how does the optimal mesh depend on the function $f$, and (ii) how does the optimal error $e_N(f)_X$ decay as $N$ grows? These problems are well understood in the isotropic setting, for which the optimization is restricted to triangulations in which all triangles satisfy a uniform shape constraint $\operatorname{diam}(T)^2 \leq C |T|$, where $\operatorname{diam}(T)$ and $|T|$ stand for the diameter and area of $T$ respectively, and $C > 0$ […] $V_k(\mathcal{T})$, and that a near best solution may therefore be obtained by applying to $f$ a stable projection operator, as expressed by (2). In contrast, the optimization problem (3) is posed on the union of the spaces $V_k(\mathcal{T})$ for all triangulations $\mathcal{T}$ satisfying $\#(\mathcal{T}) \leq N$, which is certainly not a linear space. This problem is therefore an instance of nonlinear approximation. Other instances include best $N$-term approximation in a dictionary of functions, or best approximation by rational functions.
We refer to [42] for a survey on nonlinear approximation.

The purpose of this thesis is to better understand optimal mesh adaptation posed on the full class of potentially anisotropic triangulations. The four parts of the thesis are respectively devoted to the four questions below:

I. How does the approximation error $e_N(f)_X$ behave in the asymptotic regime where the number of triangles $N$ grows to $+\infty$, when $f$ is a smooth function? In that context, we establish a mathematical characterization of the optimal mesh, as well as sharp estimates of $e_N(f)_X$ from above and below in terms of $N$ and quantities that depend nonlinearly on the derivatives of $f$.

II. Which classes of functions govern the rate of decay of $e_N(f)_X$ as $N$ grows, and are in that sense naturally tied to the problem of optimal mesh adaptation? In particular, we have in mind the model of the so-called cartoon functions, which by definition are smooth except along a collection of smooth curves of discontinuity. This is a popular model in the image processing community (see Figure 1 for an instance of a cartoon image). We shall see that such a model naturally fits in a richer function class corresponding to a given rate of decay of $e_N(f)_X$.

III. Could the optimization problem (3), posed on triangulations $\mathcal{T}$ of a given cardinality $N$, be replaced by an equivalent and more tractable problem? Triangulations are indeed discrete combinatorial objects, described in terms of points and edges, which is not convenient when solving optimization problems of the form (3). We study the correspondence between certain classes of triangulations and of riemannian metrics, which in contrast are continuous objects. This allows us to reformulate and solve the original optimization problem as a more tractable problem posed on the set of riemannian metrics.

IV. Is it possible to produce a near-optimal sequence of triangulations $(\mathcal{T}_N)_{N \geq N_0}$ with $\#(\mathcal{T}_N) = N$, using a hierarchical refinement procedure?
The property of hierarchy guarantees the inclusion of the associated finite element spaces $V_k(\mathcal{T}_N) \subset V_k(\mathcal{T}_{N+1})$. It is required in applications such as progressive encoding or online data processing. We provide a simple and explicit algorithm which gives a positive answer to this question under some conditions.

Before detailing the content of the thesis, we give in this section a short overview of the state of the art on the two problems of the study of $e_N(f)_X$ as $N$ grows, and of the construction of a near-optimal triangulation. For that purpose we need to introduce some notation.

We assume in the following that $f \in C^0(\overline{\Omega})$, where $\Omega \subset \mathbb{R}^2$ is a bounded polygonal domain and $\overline{\Omega}$ denotes the closure of $\Omega$. For each triangulation $\mathcal{T}$ of $\Omega$ we denote by $I^k_{\mathcal{T}}$ the standard interpolation operator on Lagrange finite elements of degree $k$ on $\mathcal{T}$: on each triangle $T \in \mathcal{T}$ the interpolant $I^k_{\mathcal{T}} f$ is the unique element of $\mathbb{P}_k$ which agrees with $f$ at the points with barycentric coordinates in $\{0, \frac{1}{k}, \cdots, \frac{k-1}{k}, 1\}$. In the case $k = 1$, these points are simply the three vertices of $T$. If the triangulation $\mathcal{T}$ is conforming (each edge of a triangle is either on the boundary of $\Omega$ or coincides with the entire edge of another triangle), then $I^k_{\mathcal{T}} f$ is continuous. In order to be consistent with the rest of this thesis we define $m := k + 1 \geq 2$, and we thus have
$$e_{\mathcal{T}}(f)_{L^p} \leq \|f - I^{m-1}_{\mathcal{T}} f\|_{L^p(\Omega)} \quad \text{and} \quad e_{\mathcal{T}}(f)_{W^{1,p}} \leq \|\nabla f - \nabla I^{m-1}_{\mathcal{T}} f\|_{L^p(\Omega)}.$$
Any estimate on the interpolation error thus automatically yields an upper estimate on the best approximation error $e_{\mathcal{T}}(f)_X$. Furthermore, if $f$ is sufficiently smooth and if $\mathcal{T}$ is sufficiently fine, then these quantities are generally comparable.

One of the founding results of adaptive anisotropic finite element approximation deals with the case of piecewise linear elements ($m = 2$) with the error measured in the $L^p$ norm, see [27] or Chapter 2. This result may be stated as follows: for any bounded polygonal domain $\Omega \subset \mathbb{R}^2$, for any $1 \leq p < \infty$, and for any function $f \in C^2(\overline{\Omega})$ there exists a sequence $(\mathcal{T}_N)_{N \geq N_0}$ of triangulations of $\Omega$, satisfying $\#(\mathcal{T}_N) \leq N$, and such that
$$\limsup_{N \to \infty} N \|f - I^1_{\mathcal{T}_N} f\|_{L^p(\Omega)} \leq C \left\| \sqrt{|\det(d^2 f)|} \right\|_{L^\tau(\Omega)}, \tag{4}$$
where the exponent $\tau \in (0, \infty)$ is defined by
$$\frac{1}{\tau} := 1 + \frac{1}{p},$$
and $C$ is a universal constant ($C$ is independent of $p$, $\Omega$ and $f$). We recall that the upper limit of a sequence $(u_N)_{N \geq N_0}$ is defined by
$$\limsup_{N \to \infty} u_N := \lim_{N \to \infty} \sup_{n \geq N} u_n, \tag{5}$$
and is in general strictly smaller than the supremum $\sup_{N \geq N_0} u_N$. It is still an open question to find an appropriate upper bound for $\sup_{N \geq N_0} \|f - I^1_{\mathcal{T}_N} f\|_{L^p(\Omega)}$ when optimally adapted anisotropic partitions are used. The result (4) can be extended to the exponent $p = \infty$, and to simplicial partitions of domains $\Omega$ of higher dimension, but the meshes $\mathcal{T}_N$ may then not be conforming.

The result (4) reveals that the accuracy of the approximation to $f$ is governed by the quantity $\sqrt{|\det(d^2 f)|}$, which depends non-linearly on the hessian matrix $d^2 f$. This non-linear dependency is closely tied to the fact that we allow triangles of potentially highly anisotropic shape.
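As a rough numerical illustration of the right-hand side of (4), the sketch below computes the exponent $\tau$ and approximates $\|\sqrt{|\det(d^2 f)|}\|_{L^\tau(\Omega)}$ by a midpoint rule. It is only a minimal sketch under stated assumptions: the domain is taken to be the unit square, the Hessian is supplied analytically by the caller, and the function names are hypothetical (they do not come from the thesis).

```python
import math
from typing import Callable, Tuple

# Hessian entries (f_xx, f_xy, f_yy) at a point; this representation is an assumption.
Hessian = Tuple[float, float, float]


def tau_exponent(p: float) -> float:
    """Exponent tau defined by 1/tau = 1 + 1/p (p may be math.inf)."""
    inv_p = 0.0 if math.isinf(p) else 1.0 / p
    return 1.0 / (1.0 + inv_p)


def sqrt_abs_det(h: Hessian) -> float:
    """Pointwise quantity sqrt(|det d^2 f|) appearing in estimate (4)."""
    fxx, fxy, fyy = h
    return math.sqrt(abs(fxx * fyy - fxy * fxy))


def anisotropic_quantity(
    hessian: Callable[[float, float], Hessian], p: float, n: int = 100
) -> float:
    """Midpoint-rule approximation of || sqrt|det d^2 f| ||_{L^tau} on [0, 1]^2."""
    tau = tau_exponent(p)
    cell = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * cell, (j + 0.5) * cell
            total += sqrt_abs_det(hessian(x, y)) ** tau
    return (total * cell * cell) ** (1.0 / tau)


def example_hessian(x: float, y: float) -> Hessian:
    """Hessian of the hypothetical test function f(x, y) = x**2 + 100 * y**2."""
    return (2.0, 0.0, 200.0)


value = anisotropic_quantity(example_hessian, p=2.0)
```

For this quadratic test function the integrand is constant, so the quadrature is exact up to rounding. The quantity involves the geometric mean of the two eigenvalues of the Hessian rather than the larger one.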
The estimate (4) is obtained by producing sufficiently fine triangulations which combine the following two heuristic properties:

a) Error equidistribution: the contribution $\|f - I^1_T f\|_{L^p(T)}$ of each triangle $T \in \mathcal{T}_N$ to the global interpolation error $\|f - I^1_{\mathcal{T}_N} f\|_{L^p(\Omega)}$ is approximately the same. This condition can be translated into a local constraint on the area of the triangles, which is dictated by the local behavior of $f$ and in particular by $\det(d^2 f(z))$.

b) Optimal shape of the triangles: the aspect ratio and the orientation of a triangle $T \in \mathcal{T}_N$ are dictated by the ratio of the eigenvalues and by the eigenvectors of the hessian matrix $d^2 f(z)$ for $z \in T$.

The simplest method for producing a sequence $(\mathcal{T}_N)_{N \geq N_0}$ of triangulations satisfying (4) is to use a "local patching strategy" that may be intuitively described as follows. In a first step the domain $\Omega$ is split into regions $\Omega_i$, $1 \leq i \leq k$, sufficiently small so that the hessian matrix $d^2 f(z)$ varies little on each $\Omega_i$ around an average value $M_i$. In other words, $f$ is well approximated by a quadratic polynomial on each $\Omega_i$. Each region $\Omega_i$ is then tiled using a uniform triangulation $\mathcal{T}^i_N$ whose cells have an area, aspect ratio and orientation based on $M_i$, and the triangulations $\mathcal{T}^i_N$ of $\Omega_i$ are glued together into a triangulation $\mathcal{T}_N$ of $\Omega$, at the price of a few additional triangles at the interfaces between the $\Omega_i$.

This construction, which is illustrated in Figure 3, is sufficient for the purpose of proving the asymptotic result (4) but not for practical applications, since it only becomes efficient for a large number of triangles. The following approach, based on riemannian metrics, is often preferred in applications.

[Figure 3: … $\Omega_i$ are covered by periodic tilings, which are then glued together.]
We assume for simplicity that the hessian matrix $M(z) := d^2 f(z)$ is positive definite for each $z \in \Omega$, and we define
$$H(z) := \lambda\, (\det M(z))^{-\frac{1}{2p+2}}\, M(z) \tag{6}$$
where $\lambda > 0$ is a parameter. The map $H$ is called a riemannian metric; it continuously associates to each point $z \in \Omega$ a symmetric positive definite matrix $H(z)$. Note that for each $z \in \Omega$ the matrix $H(z)$ defines an ellipse
$$E_z := \{ u \in \mathbb{R}^2 \,;\ u^{\mathrm{T}} H(z)\, u \leq 1 \}.$$
As illustrated on the left part of Figure 4, a metric encodes at each point $z \in \Omega$ information on area, aspect ratio and orientation, in the form of a symmetric positive definite matrix $H(z)$, or equivalently of an ellipse $E_z$. Several mesh generation algorithms, such as [93, 94], are able to produce a triangulation adapted to a given metric $z \mapsto H(z)$, in the sense that for each $z \in \Omega$ the triangle $T \in \mathcal{T}$ containing $z$ has a shape "similar" to the ellipse $E_z$, as illustrated on the right part of Figure 4. In mathematical terms, this means that one has for each triangle $T$ and $z \in T$,
$$b_T + c_1 E_z \subset T \subset b_T + c_2 E_z, \tag{7}$$
where $0 < c_1 < c_2$ are fixed constants and $b_T$ is the barycenter of $T$, or equivalently that $T$ is "close" to an equilateral triangle of unit area in the metric $H(z)$. If $\mathcal{T}$ is such a triangulation and if $\lambda$ is sufficiently large, a heuristic argument (which will be recalled in Chapter 2) shows that
$$\#(\mathcal{T})\, \|f - I^1_{\mathcal{T}} f\|_{L^p(\Omega)} \leq C \left\| \sqrt{|\det(d^2 f)|} \right\|_{L^\tau(\Omega)},$$
where the constant $C$ depends on $c_1$ and $c_2$.

Varying the parameter $\lambda$ we obtain different triangulations $\mathcal{T}_\lambda$, of cardinality proportional to $\lambda$, which leads to the error estimate (4). The production of the mesh $\mathcal{T}$ from the metric $H$ is not straightforward. In addition, there exists no rigorous proof that the similarity condition (7) is achieved for most mesh generation algorithms, with the notable exception of [66] and [15].
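The definition (6) and the ellipse $E_z$ can be made concrete with a few lines of code. The following is a minimal sketch, assuming a symmetric 2x2 matrix is stored as a triple of its entries; the helper names are hypothetical and the example Hessian is chosen only for illustration.

```python
import math
from typing import Tuple

# Symmetric 2x2 matrix stored as (m11, m12, m22); this layout is an assumption.
Sym2 = Tuple[float, float, float]


def det2(m: Sym2) -> float:
    """Determinant of a symmetric 2x2 matrix."""
    return m[0] * m[2] - m[1] * m[1]


def metric_from_hessian(m: Sym2, p: float, lam: float) -> Sym2:
    """Metric H = lam * det(M)^(-1/(2p+2)) * M of (6), for positive definite M."""
    d = det2(m)
    if d <= 0.0 or m[0] <= 0.0:
        raise ValueError("the Hessian must be positive definite")
    scale = lam * d ** (-1.0 / (2.0 * p + 2.0))
    return (scale * m[0], scale * m[1], scale * m[2])


def eigenvalues(m: Sym2) -> Tuple[float, float]:
    """Eigenvalues (smallest, largest) of a symmetric 2x2 matrix."""
    mean = 0.5 * (m[0] + m[2])
    radius = math.hypot(0.5 * (m[0] - m[2]), m[1])
    return (mean - radius, mean + radius)


def ellipse_semi_axes(h: Sym2) -> Tuple[float, float]:
    """Semi-axes (short, long) of the ellipse {u : u^T H u <= 1}."""
    low, high = eigenvalues(h)
    return (1.0 / math.sqrt(high), 1.0 / math.sqrt(low))


# Hypothetical example: Hessian diag(1, 100), error measured in L^2, lambda = 1.
M = (1.0, 0.0, 100.0)
H = metric_from_hessian(M, p=2.0, lam=1.0)
short_axis, long_axis = ellipse_semi_axes(H)
aspect_ratio = long_axis / short_axis
ellipse_area = math.pi / math.sqrt(det2(H))
```

The scalar factor in (6) changes the area of the ellipse but not its aspect ratio or orientation, which are those of $M(z)$ itself. A triangle satisfying (7) is therefore elongated in the direction where $f$ has the smallest second derivative.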
[Figure 4]

[…] $(\mathcal{T}_N)_{N \geq N_0}$ of triangulations of a polygonal domain $\Omega \subset \mathbb{R}^2$ is admissible if $\#(\mathcal{T}_N) \leq N$ and if
$$\sup_{N \geq N_0} \left( \sqrt{N} \max_{T \in \mathcal{T}_N} \operatorname{diam}(T) \right) < \infty. \tag{8}$$
For any admissible sequence $(\mathcal{T}_N)_{N \geq N_0}$ of triangulations of $\Omega$, one can establish the lower bound
$$\liminf_{N \to \infty} N \|f - I^1_{\mathcal{T}_N} f\|_{L^p(\Omega)} \geq c \left\| \sqrt{|\det(d^2 f)|} \right\|_{L^\tau(\Omega)}, \tag{9}$$
where the constant $c > 0$ […]. For any $\varepsilon > 0$, there exists an admissible sequence $(\mathcal{T}_N)_{N \geq N_0}$ which satisfies the upper estimate (4) up to the additive constant $\varepsilon$ in the right-hand side.

Similar results to (4) and (9) can be developed for isotropic meshes, in which the triangles may vary in size but are constrained to be isotropic, in the sense that the measure of degeneracy $\rho(T) := \operatorname{diam}(T)^2 / |T|$ is uniformly bounded by a constant $\rho_0$, see for instance [35]. In that case the counterpart to (4) has the form
$$\limsup_{N \to \infty} N \|f - I^1_{\mathcal{T}_N} f\|_{L^p(\Omega)} \leq C\, \|d^2 f\|_{L^\tau(\Omega)}, \tag{10}$$
for the same value of $\tau$, and similarly for the counterpart to (9), with constants $c$ and $C$ that depend on the bound $\rho_0$ on the measure of degeneracy. Therefore the nonlinear quantity $\sqrt{|\det(d^2 f)|}$ is replaced by the linear quantity $d^2 f$ in the $L^\tau$ norm, and these results are now very similar to those of best $N$-term wavelet approximation [30]. In terms of the eigenvalues $\lambda_1(z), \lambda_2(z)$ of the symmetric matrix $d^2 f(z)$, we thus replace the geometric mean $\sqrt{|\lambda_1(z) \lambda_2(z)|}$ by $\max\{|\lambda_1(z)|, |\lambda_2(z)|\}$, which may be significantly larger when these eigenvalues have different orders of magnitude.
This is typically the case if the approximated function $f$ exhibits strongly anisotropic features, and therefore we may expect a significant improvement when using anisotropic meshes for such functions.

The result (4) gives a sharp account of the improvement that anisotropic triangulations can provide compared to isotropic triangulations, however in a rather restrictive setting, which has motivated our work:

I. The original result only applies to piecewise linear interpolation error measured in the $L^p$ norm, while higher order finite elements and Sobolev $W^{1,p}$ norms are equally relevant. In particular the $W^{1,2}$ (or $H^1$) norm appears very naturally in the context of second order elliptic problems. How should the elements be optimally adapted, and how should the resulting error bound (4) be modified in this more general context?

II. The approximated function $f$ needs to be $C^2$, while the most interesting applications of adaptive approximation involve non-smooth and even discontinuous functions, such as those appearing in Figures 1 and 2. Can we define in some sense a quantity such as $\|\sqrt{|\det(d^2 f)|}\|_{L^\tau(\Omega)}$ when $f$ is a discontinuous function?

III. In numerical applications, the riemannian metric $z \mapsto H(z)$ is used as an intermediate tool for mesh generation, but this approach lacks a precise equivalence result between these continuous objects and the different classes of anisotropic triangulations. Under which conditions on the metric $z \mapsto H(z)$ can we associate a triangulation $\mathcal{T}$ which agrees with this metric in the sense of (7)?

IV. The already mentioned anisotropic mesh generation algorithms are not hierarchical, in the sense that better accuracy is not achieved by local refinement but by a global re-design of the mesh.
Can we propose a hierarchical mesh refinement algorithm that meets the optimal approximation bound (4)?

We also need to mention two fundamental issues which are discussed in this thesis, but have remained open problems and will be the subject of future research. First, the error estimate (4) only gives asymptotic information, as the number of triangles $N$ tends to $\infty$, while an error estimate valid for all values of $N$ is highly desirable. Second, the extension of this result to functions defined on domains of dimension higher than two is hindered by a difficult problem of computational geometry: the production of anisotropic and conforming meshes in dimension 3 or higher. This problem is addressed by some numerical methods such as [94], but is not solved from a theoretical point of view.

This thesis is composed of four parts which attempt to solve four key issues, numbered I to IV, encountered in earlier results on adaptive and anisotropic finite element approximation. These four parts are mostly self-contained and can be read independently. The chapters that constitute each part should preferably be read sequentially, with the exception of Chapter 1, which may be skipped by the reader who wishes to go more directly to the heart of the matter. Chapters 1, 2, 3, 4, 7, 8, as well as the third part of Chapter 9, come from the papers [9], [72], [73], [37], [34], [36] and [35], respectively. Notations have been unified for the purpose of the thesis. Chapters 5 and 6 are the object of future papers.
Part I. Finite elements of arbitrary degree, and Sobolev norms
In this part, we discuss generalizations of the asymptotic estimate (4) to finite elements of arbitrary degree, and to the Sobolev $W^{1,p}$ semi-norms as measure of error. Through this
[Figure 5]

[…] $f$ are likely to be aligned with the coordinate axes. In this setting, we obtain an optimal asymptotic error estimate. Our results apply to piecewise polynomials of arbitrary degree, when the approximation error is measured in the $L^p$ norm, and in any dimension $d > 1$. Here, we do not consider $W^{1,p}$ norms, since the approximants are generally discontinuous.

The biggest advantage of this setting is that the mathematical technicalities required for the construction of an anisotropic partition of a domain, as well as the error analysis, are simplified by the presence of preferred directions. We thus take advantage of this simple setting to introduce and study a key concept named the shape function, which governs the local approximation error after optimal adaptation of the elements to the local properties of the approximated function. The shape function is also defined and used for anisotropic triangulations, which are the object of Chapters 2 and 3. We give below its precise definition in this setting.

Chapter 2 is devoted to triangular finite elements of arbitrary degree $m - 1$, $m \geq 2$, with $L^p$ as error norm. In order to present our results we need to introduce some notation. We denote by $\mathbb{H}_m$ the space of homogeneous polynomials of degree $m$:
$$\mathbb{H}_m := \operatorname{Span}\{ x^k y^l \,;\ k + l = m \}.$$
A key ingredient of our approach is the shape function $K_{m,p} : \mathbb{H}_m \to \mathbb{R}_+$, where $1 \leq p \leq \infty$ is an exponent. This function is defined by an optimization of the local $L^p$ interpolation error among the collection of triangles of unit area and of all possible shapes: for all $\pi \in \mathbb{H}_m$,
$$K_{m,p}(\pi) := \inf_{|T| = 1} \|\pi - I^{m-1}_T \pi\|_{L^p(T)}, \tag{11}$$
where $I^{m-1}_T$ is the local interpolation operator on $T$. Our main result is the following asymptotic error estimate.

Theorem 1.
Let $\Omega \subset \mathbb{R}^2$ be a bounded polygonal domain, let $f \in C^m(\overline{\Omega})$ and let $1 \leq p < \infty$. There exists a sequence $(\mathcal{T}_N)_{N \geq N_0}$, $\#(\mathcal{T}_N) \leq N$, of conforming triangulations of $\Omega$ such that
$$\limsup_{N \to \infty} N^{\frac{m}{2}} \|f - I^{m-1}_{\mathcal{T}_N} f\|_{L^p(\Omega)} \leq \left\| K_{m,p}\!\left( \frac{d^m f}{m!} \right) \right\|_{L^\tau(\Omega)} \tag{12}$$
where the exponent $\tau \in (0, \infty)$ is defined by $\frac{1}{\tau} := \frac{m}{2} + \frac{1}{p}$.

In the error estimate (12), we identify the collection $d^m f(z)$ of derivatives of order $m$ of $f$ at the point $z \in \Omega$ with the corresponding homogeneous polynomial in the Taylor expansion of $f$ at $z$. In mathematical form,
$$\frac{d^m f(z)}{m!} \sim \sum_{k + l = m} \frac{\partial^m f}{\partial x^k \partial y^l}(z)\, \frac{x^k}{k!} \frac{y^l}{l!}.$$
Theorem 1 extends the known result (4) to finite elements of arbitrary degree $m - 1$. […] $f$ is similarly determined by a non-linear quantity of its derivatives: the shape function $K_{m,p}(d^m f(z))$ is the "generalization" of the determinant $\sqrt{|\det(d^2 f(z))|}$ appearing in (4) to derivatives of arbitrary order.

This remark raises the need for an in-depth study of the shape function $K_{m,p}$. For $m = 2$ we have established that $K_{2,p}(\pi)$ is proportional to the square root of the determinant of the quadratic polynomial $\pi = a x^2 + 2 b x y + c y^2 \in \mathbb{H}_2$:
$$K_{2,p}(\pi) = c_{2,p} \sqrt{|\det \pi|} = c_{2,p} \sqrt{|ac - b^2|},$$
where the constant $c_{2,p} > 0$ […] $\pi$. We thus recover the earlier result (4). For $m = 3$, we establish that the shape function $K_{3,p}$ is the fourth root of the discriminant of the homogeneous cubic polynomial $\pi = a x^3 + 3 b x^2 y + 3 c x y^2 + d y^3 \in \mathbb{H}_3$:
$$K_{3,p}(\pi) = c_{3,p} \sqrt[4]{|\operatorname{disc} \pi|} = c_{3,p} \sqrt[4]{|4 (ac - b^2)(bd - c^2) - (ad - bc)^2|},$$
where the constant $c_{3,p} > 0$ […] $\pi$. For larger values $m \geq 4$ […] $K_{m,p}$, but an explicit quantity uniformly equivalent to this function.
This equivalent has the following form: there exists a polynomial $Q_m(a_0, \cdots, a_m)$ of the $m + 1$ variables $a_0, \cdots, a_m$ and a constant $C_m \geq 1$ such that for all $\pi = a_0 x^m + a_1 x^{m-1} y + \cdots + a_m y^m \in \mathbb{H}_m$ one has, with $r_m := \deg Q_m$,
$$C_m^{-1} \sqrt[r_m]{Q_m(a_0, \cdots, a_m)} \leq K_{m,p}(\pi) \leq C_m \sqrt[r_m]{Q_m(a_0, \cdots, a_m)}. \tag{13}$$
The polynomial $Q_m$ is obtained using the theory of invariant polynomials developed by Hilbert in [61]. We also characterize the zeros of the shape function, and thus the possible cases of "super-convergence": $K_{m,p}(\pi) = 0$ if and only if $\pi$ has a linear factor $ax + by$ of multiplicity $s > m/2$, in other words if the homogeneous polynomial $\pi$ is sufficiently degenerate.

The proof of Theorem 1 is based on a "local patching strategy", a two-scale mesh generation procedure which proceeds as follows. We consider a first "macro-triangulation" $\mathcal{R}$ of the polygonal domain $\Omega$ which is sufficiently fine, in such a way that the derivatives of order $m$ of $f$ vary little on each triangle $R \in \mathcal{R}$ around an average value denoted by $\pi_R \in \mathbb{H}_m$. To each polynomial $\pi_R$, $R \in \mathcal{R}$, we then associate a triangle $T_R$ which is a minimizer, or a near minimizer, of the optimization problem defining $K_{m,p}(\pi_R)$. We then tile each $R \in \mathcal{R}$ using the triangle $T_R$, properly scaled, and its reflection through the origin. Finally, as illustrated in Figure 3, the triangulation $\mathcal{T}_N$ is obtained by gluing together the periodic tilings on each $R \in \mathcal{R}$, with a few additional triangles in order to obtain a globally conforming mesh.

The next theorem establishes that the asymptotic error estimate (12) is sharp, at least if we restrict our attention to admissible sequences of triangulations, a condition defined in (8). The approximation result (15) establishes furthermore that the admissibility condition is not too restrictive.

Theorem 2.
Let $\Omega \subset \mathbb{R}^2$ be a bounded polygonal domain, let $f \in C^m(\overline{\Omega})$ and let $1 \leq p \leq \infty$. Let $(\mathcal{T}_N)_{N \geq N_0}$, $\#(\mathcal{T}_N) \leq N$, be an admissible sequence of triangulations of $\Omega$. Then
$$\liminf_{N \to \infty} N^{\frac{m}{2}} \|f - I^{m-1}_{\mathcal{T}_N} f\|_{L^p(\Omega)} \geq \left\| K_{m,p}\!\left( \frac{d^m f}{m!} \right) \right\|_{L^\tau(\Omega)}, \tag{14}$$
where $\frac{1}{\tau} := \frac{m}{2} + \frac{1}{p}$. Furthermore, for any $\varepsilon > 0$ there exists an admissible sequence $(\mathcal{T}^\varepsilon_N)_{N \geq N_0}$ of triangulations of $\Omega$, satisfying $\#(\mathcal{T}^\varepsilon_N) \leq N$, and such that
$$\limsup_{N \to \infty} N^{\frac{m}{2}} \|f - I^{m-1}_{\mathcal{T}^\varepsilon_N} f\|_{L^p(\Omega)} \leq \left\| K_{m,p}\!\left( \frac{d^m f}{m!} \right) \right\|_{L^\tau(\Omega)} + \varepsilon. \tag{15}$$

Chapter 3 is devoted to the counterparts of Theorems 1 and 2 when the interpolation error is measured in the Sobolev $W^{1,p}$ semi-norms, $1 \leq p < \infty$. These estimates involve the counterpart $L_{m,p} : \mathbb{H}_m \to \mathbb{R}_+$ of the shape function $K_{m,p}$, which is defined as follows: for all $\pi \in \mathbb{H}_m$,
$$L_{m,p}(\pi) := \inf_{|T| = 1} \|\nabla(\pi - I^{m-1}_T \pi)\|_{L^p(T)}.$$
We also provide some explicit equivalents of $L_{m,p}(\pi)$, defined similarly to (13) by an algebraic expression in the coefficients of $\pi \in \mathbb{H}_m$. Our results for the $W^{1,p}$ semi-norm are thus extremely similar to the results obtained for the $L^p$ norm.

The adaptation of the proofs, however, is not straightforward, due to the following new phenomenon: the presence of strongly obtuse triangles in a mesh (with one of their angles close to $\pi$) may cause the gradient of the interpolated function to oscillate, as illustrated in Figure 6. This phenomenon deteriorates the interpolation error in the $W^{1,p}$ semi-norm, but not in the $L^p$ norm. These "flat" triangles therefore need to be carefully avoided. In summary, long and thin triangles may be desirable, but they should not be strongly obtuse.

Before turning to the description of the rest of this thesis, we remind the reader that the three chapters composing Part I are self-contained and can be read independently.

[Figure 6]
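The closed forms given in Part I for the shape function when $m = 2$ and $m = 3$ are easy to evaluate. The sketch below computes them up to the multiplicative constants $c_{2,p}$ and $c_{3,p}$, which are not computed here; the function names are hypothetical and the coefficient conventions are those of the text ($\pi = a x^2 + 2 b x y + c y^2$ and $\pi = a x^3 + 3 b x^2 y + 3 c x y^2 + d y^3$).

```python
import math


def det_quadratic(a: float, b: float, c: float) -> float:
    """Determinant a*c - b^2 of pi = a x^2 + 2 b x y + c y^2."""
    return a * c - b * b


def disc_cubic(a: float, b: float, c: float, d: float) -> float:
    """Discriminant 4(ac - b^2)(bd - c^2) - (ad - bc)^2 of the cubic form."""
    return 4.0 * (a * c - b * b) * (b * d - c * c) - (a * d - b * c) ** 2


def k2_up_to_constant(a: float, b: float, c: float) -> float:
    """sqrt(|det pi|): the shape function for m = 2, without the factor c_{2,p}."""
    return math.sqrt(abs(det_quadratic(a, b, c)))


def k3_up_to_constant(a: float, b: float, c: float, d: float) -> float:
    """|disc pi|^(1/4): the shape function for m = 3, without the factor c_{3,p}."""
    return abs(disc_cubic(a, b, c, d)) ** 0.25


# Degenerate cases: x^2 and x^3 have a linear factor of multiplicity s > m/2.
k2_square = k2_up_to_constant(1.0, 0.0, 0.0)           # pi = x^2
k3_cube = k3_up_to_constant(1.0, 0.0, 0.0, 0.0)        # pi = x^3
k3_double = k3_up_to_constant(0.0, 1.0 / 3.0, 0.0, 0.0)  # pi = x^2 y

# Non-degenerate cases.
k2_round = k2_up_to_constant(1.0, 0.0, 1.0)            # pi = x^2 + y^2
k3_sum = k3_up_to_constant(1.0, 0.0, 0.0, 1.0)         # pi = x^3 + y^3
```

The three degenerate examples evaluate to zero, in line with the characterization of the zeros of $K_{m,p}$ by a linear factor of multiplicity $s > m/2$.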
Part II. Anisotropic smoothness classes and image models
In this part, which consists solely of Chapter 4, we discuss the extension of our approximation results to non-smooth functions.

There exist various ways of measuring the smoothness of functions on a domain $\Omega \subset \mathbb{R}^2$, generally through the definition of an appropriate smoothness space with an associated norm. Classical instances are Sobolev and Besov spaces. Such spaces are commonly used to describe the regularity of solutions to partial differential equations. From a numerical perspective they are also useful in order to sharply characterize the rate at which a function $f$ may be approximated by simpler functions such as Fourier series, finite elements (on isotropic triangulations), splines or wavelets.

The result of adaptive anisotropic approximation (4) and its generalization, Theorem 1, establish that the quality of the approximation of a function $f$ by finite elements on anisotropic triangulations is governed by a non-linear quantity of the derivatives of $f$, at least from an asymptotic point of view. In the case of finite elements of degree 1, and of approximation in the $L^2$ norm, the relevant quantity is the following:
$$A(f) := \left\| \sqrt{|\det(d^2 f)|} \right\|_{L^{2/3}(\Omega)}.$$
The functional $A$ strongly differs from the Sobolev, Hölder or Besov norms because of its nonlinear behavior: $A$ does not satisfy the triangle inequality, or any quasi-triangle inequality. In other words, for any constant $C$ there exist $f, g \in C^2(\overline{\Omega})$ such that
$$A(f + g) > C \left( A(f) + A(g) \right). \tag{16}$$
The lack of a triangle inequality forbids the use of the classical techniques of linear analysis to define a smoothness space attached to the functional $A$. The extension of the approximation result (4) to functions which are not $C^2$ is therefore not straightforward.

Functions arising in concrete applications, for instance in image processing or in the solution of hyperbolic PDEs, often exhibit areas of smoothness separated by local discontinuities.
A mathematical model for this type of behavior is the collection of cartoon functions, which are smooth except for a jump discontinuity across a collection of smooth curves. A heuristic analysis presented in Chapter 4 suggests that for any cartoon function $f$ defined on a bounded polygonal domain $\Omega$, there exists a sequence $(\mathcal{T}_N)_{N \geq N_0}$ of anisotropic triangulations of $\Omega$, satisfying $\#(\mathcal{T}_N) \leq N$, and such that
$$\sup_{N \geq N_0} N \|f - I^1_{\mathcal{T}_N} f\|_{L^2(\Omega)} < \infty. \tag{17}$$
As illustrated in Figure 1, the triangulations $(\mathcal{T}_N)_{N \geq N_0}$ combine highly anisotropic triangles aligned with the discontinuities of $f$, and large triangles in the regions where $f$ is smooth. The approximation result (17) raises the hope that a precise and quantitative asymptotic error estimate for the adaptive anisotropic approximation of cartoon functions extends the known result
$$\limsup_{N \to \infty} N \|f - I^1_{\mathcal{T}_N} f\|_{L^2(\Omega)} \leq C A(f), \tag{18}$$
which holds if $f$ is $C^2$.

So far we have only fulfilled one part of this program: the extension of the functional $A$ to cartoon functions. More precisely, consider a radial and compactly supported function $\varphi \in C^\infty(\mathbb{R}^2)$ of unit integral. For any $\delta > 0$ we define $\varphi_\delta(z) := \delta^{-2} \varphi\left( \frac{z}{\delta} \right)$ and we denote by $f_\delta := f * \varphi_\delta$ the convolution of $f$ with $\varphi_\delta$. We establish in Chapter 4 that if $f$ is a cartoon function, then $A(f_\delta)$ converges as $\delta \to 0$, and
$$\lim_{\delta \to 0} A(f_\delta)^{2/3} = \left\| \sqrt{|\det(d^2 f)|} \right\|^{2/3}_{L^{2/3}(\Omega \setminus \Gamma)} + C(\varphi) \left\| \sqrt{|\kappa|}\, \gamma \right\|^{2/3}_{L^{2/3}(\Gamma)} = \int_{\Omega \setminus \Gamma} \sqrt[3]{|\det(d^2 f(z))|}\, dz + C(\varphi) \int_\Gamma |\kappa(s)|^{1/3} |\gamma(s)|^{2/3}\, ds.$$
Here, we have denoted by $\Gamma$ the collection of curves where the cartoon function $f$ is discontinuous, by $\gamma(s)$ the jump of $f$ at the point $s \in \Gamma$, and by $\kappa(s)$ the curvature of $\Gamma$ at $s$.
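To get a feel for the jump term $\int_\Gamma |\kappa(s)|^{1/3} |\gamma(s)|^{2/3}\, ds$ in the formula above, the following sketch evaluates it for piecewise constant functions whose discontinuity set is a union of circles. This is a hypothetical example that does not come from the thesis: on a circle of radius $r$ the curvature is $1/r$, the jump is assumed constant along each circle, and the constant $C(\varphi)$ is left out.

```python
import math
from typing import Iterable, List, Tuple

# A circular discontinuity curve: (radius, jump of f across the circle).
Circle = Tuple[float, float]


def jump_term(circles: Iterable[Circle]) -> float:
    """Integral over Gamma of |kappa|^(1/3) * |gamma|^(2/3) ds for circles."""
    total = 0.0
    for radius, jump in circles:
        curvature = 1.0 / radius
        length = 2.0 * math.pi * radius
        total += curvature ** (1.0 / 3.0) * abs(jump) ** (2.0 / 3.0) * length
    return total


def staircase(n: int) -> List[Circle]:
    """n concentric circles with radii in (0.5, 1], each carrying a jump 1/n."""
    return [(0.5 + 0.5 * (k + 1) / n, 1.0 / n) for k in range(n)]


single_disc = jump_term([(1.0, 1.0)])   # indicator function of the unit disc
coarse = jump_term(staircase(8))        # total jump 1 split into 8 steps
fine = jump_term(staircase(64))         # total jump 1 split into 64 steps
```

Splitting a fixed total jump into $n$ smaller steps makes this term grow roughly like $n^{1/3}$, because each jump enters with the exponent $2/3$ rather than $1$.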
The constant $C(\varphi)$ is positive and explicit in terms of $\varphi$.

The extension of the nonlinear functional $A$ to cartoon functions brings to light a link, explored in § […] $A$ as a "second order" counterpart to the total variation semi-norm TV, a measure of smoothness defined in terms of the first derivatives which is also finite for all cartoon functions. The total variation plays a central role in image processing and in the analysis of transport equations, two domains in which piecewise smooth functions naturally appear. For any cartoon function $f$, the total variation $\mathrm{TV}(f)$ is given by the following formula:
$$\mathrm{TV}(f) = \int_{\Omega \setminus \Gamma} |\nabla f(z)|\, dz + \int_\Gamma |\gamma(s)|\, ds.$$

[Figure 7: … $A$ on different types of images.]

We compare the behavior of the two quantities $A(f)^{2/3}$ and $\mathrm{TV}(f)$ for different families of cartoon functions $f$. It appears that these quantities are equivalent when $f$ is a smoothly oscillating function $f(z) := \cos(\omega |z|)$ with large $\omega$, see Figure 7 (i). In contrast, the discontinuities are penalized in a different manner by these two functionals. For a staircase-like function, as illustrated in Figure 7 (ii), the total variation remains bounded while $A$ tends to infinity as the number of steps grows, due to the contribution $|\gamma(s)|^{2/3}$ of the jump term. Furthermore, due to the curvature term $|\kappa(s)|^{1/3}$, $A$ is much larger than TV for characteristic functions of sets with a complex or oscillating boundary, as illustrated in Figure 7.

The functional $A$ can thus be thought of as a new quantitative image model: a monochromatic image, described by a brightness function $f : [0,1]^2 \to [0,1]$, […] $A(f)$ is sufficiently small. Generalizing the work [70], we discuss an image recovery procedure using this model as a bayesian prior.
At the present time, this algorithm is not satisfactory because it takes a very long time to converge, and for this reason we present some numerical illustrations in a simplified one-dimensional setting.

Finally we discuss the extension to cartoon functions of other non-linear quantities appearing in anisotropic finite element approximation, such as the norm $\|K_{m,p}(d^mf)\|_{L^\tau}$ of the shape function for $m\ge2$, or the counterpart to this quantity for functions $f$ of more than two variables.

Part III. Mesh generation and riemannian metrics
Triangulations are discrete objects of combinatorial nature: they can be described by the collection of their vertices and of the edges between them. This description is fruitful for the demonstration of algebraic results such as the Euler formula, or for computer processing. In contrast, as explained earlier, many approaches towards anisotropic mesh adaptation [15, 20, 66] are based on a continuous equivalent object, namely a riemannian metric $z\mapsto H(z)$, in other words a continuous function $H$ from $\Omega$ to the set $S_2^+$ of symmetric positive definite matrices. Once this metric has been properly designed, it is the task of the mesh generation algorithm to design a triangulation that agrees with this metric according to (7). The purpose of Part III is to formulate precise equivalence results between some classes of triangulations and of riemannian metrics. This equivalence translates some geometrical constraints satisfied by the triangulations into regularity properties of the equivalent riemannian metrics.

In order to state our results we need to introduce some notations. We associate to each triangle $T$ its barycenter $z_T\in\mathbb R^2$ and the symmetric positive definite matrix $\mathcal H_T\in S_2^+$ such that the ellipse $\mathcal E_T$ defined by
$$\mathcal E_T:=\{z\in\mathbb R^2\,;\ (z-z_T)^{\mathrm T}\,\mathcal H_T\,(z-z_T)\le1\}$$
is the ellipse of minimal area containing $T$. The point $z_T$ thus encodes the position of $T$, while the matrix $\mathcal H_T\in S_2^+$ encodes the area, the aspect ratio and the orientation of the triangle $T$.

We denote by $\mathbf T$ the collection of conforming triangulations of the infinite domain $\mathbb R^2$. The choice to consider infinite triangulations is guided by simplicity, and further work will be devoted to triangulations of bounded polygonal domains. We denote by $\mathbf H:=C^0(\mathbb R^2,S_2^+)$ the collection of riemannian metrics on $\mathbb R^2$. A metric $H\in\mathbf H$ continuously associates to each point $z\in\mathbb R^2$ a symmetric positive definite matrix $H(z)\in S_2^+$. For $C\ge1$, we say that a triangulation $\mathcal T\in\mathbf T$ is $C$-equivalent to a metric $H\in\mathbf H$ if for all $T\in\mathcal T$ and for all $z\in T$ we have
$$C^{-2}H(z)\le\mathcal H_T\le C^2H(z),$$
where these inequalities are meant in the sense of symmetric positive matrices. We say that a class of triangulations $\mathbf T_*\subset\mathbf T$ is equivalent to a class of metrics $\mathbf H_*\subset\mathbf H$ if there exists a uniform constant $C\ge1$ such that:
– For any triangulation $\mathcal T\in\mathbf T_*$ there exists a metric $H\in\mathbf H_*$ such that $\mathcal T$ and $H$ are $C$-equivalent.
– For any metric $H\in\mathbf H_*$ there exists a triangulation $\mathcal T\in\mathbf T_*$ such that $\mathcal T$ and $H$ are $C$-equivalent.

We consider three particularly relevant classes of triangulations of $\mathbb R^2$,
$$\mathbf T_{i,C}\subset\mathbf T_{a,C}\subset\mathbf T_{g,C},$$
which have a minor dependency on a parameter $C\ge1$. The class $\mathbf T_{g,C}$ of graded triangulations is defined by the following condition, which imposes a minimum of consistency among the shapes of neighbouring triangles. A triangulation $\mathcal T$ belongs to $\mathbf T_{g,C}$ if for all $T,T'\in\mathcal T$ one has
$$T\cap T'\ne\emptyset\ \Rightarrow\ C^{-2}\mathcal H_T\le\mathcal H_{T'}\le C^2\mathcal H_T.$$
A triangulation $\mathcal T$ belongs to the class $\mathbf T_{i,C}$ of isotropic triangulations if $\mathcal T$ is graded, $\mathcal T\in\mathbf T_{g,C}$, and if the elements of $\mathcal T$ are sufficiently close to the equilateral triangle, as expressed by the following condition: for all $T\in\mathcal T$,
$$\|\mathcal H_T\|\,\|\mathcal H_T^{-1}\|\le C^2.$$
The class $\mathbf T_{a,C}$ of quasi-acute triangulations is defined by a slightly more technical condition on the maximal angles of the triangles, see Chapter 5. As explained in the earlier description of Chapter 3, avoiding flat angles is needed to guarantee the stability of the gradient when applying the interpolation operator. One of our key results is the reformulation of this condition as a regularity assumption on the equivalent riemannian metric. From a practical point of view, this condition is unfortunately not guaranteed by existing anisotropic mesh generation software. The constraints satisfied by these classes of triangulations of $\mathbb R^2$ are illustrated by some bounded triangulations on Figure 8.

The results of Chapter 5 establish that when $C$ is sufficiently large, these three classes of triangulations are equivalent to three classes of metrics, respectively
$$\mathbf H_i\subset\mathbf H_a\subset\mathbf H_g,$$
that are defined by precise smoothness conditions on the function $z\mapsto H(z)$. In particular, the class $\mathbf H_i$ consists of those metrics $H$ which are proportional to the identity, $H(z)=\mathrm{Id}/s(z)^2$, and such that the proportionality factor $s$ satisfies one of the following surprisingly equivalent properties:
• Euclidean Lipschitz Property: $|s(z)-s(z')|\le|z-z'|$ for all $z,z'\in\mathbb R^2$.
• Riemannian Lipschitz Property: $|\ln s(z)-\ln s(z')|\le d_H(z,z')$ for all $z,z'\in\mathbb R^2$.

We recall that the riemannian distance $d_H(z,z')$ measures the length, distorted by the metric $H$, of the smallest path joining $z$ and $z'$:
$$d_H(z,z'):=\inf_{\substack{\gamma(0)=z\\ \gamma(1)=z'}}\int_0^1\|\gamma'(t)\|_{H(\gamma(t))}\,dt,$$
where $\|u\|_M:=\sqrt{u^{\mathrm T}Mu}$ and where the infimum is taken among all paths $\gamma\in C^1([0,1],\mathbb R^2)$ joining $z$ and $z'$.

The two Lipschitz properties presented above have natural extensions to general anisotropic riemannian metrics, which are no longer equivalent. Considering $H\in\mathbf H$ and defining $S(z):=H(z)^{-\frac12}$ for all $z\in\mathbb R^2$, these Lipschitz properties are namely
$$d_+(H(z),H(z'))\le|z-z'|\quad\text{for all }z,z'\in\mathbb R^2, \tag{19}$$
where $d_+(H(z),H(z')):=\|S(z)-S(z')\|$, and
$$d_\times(H(z),H(z'))\le d_H(z,z')\quad\text{for all }z,z'\in\mathbb R^2, \tag{20}$$
where $d_\times(H(z),H(z')):=\ln\max\{\|S(z)S(z')^{-1}\|,\ \|S(z')S(z)^{-1}\|\}$. Note that $d_+$ and $d_\times$ are distances on $S_2^+$.

The main result of Chapter 5 characterizes the classes of metrics $\mathbf H_g$ and $\mathbf H_a$ (associated to the classes of graded triangulations $\mathbf T_{g,C}$ and of quasi-acute triangulations $\mathbf T_{a,C}$) in terms of the above Lipschitz properties.

Theorem 3.
If the constant $C$ is sufficiently large, then the class $\mathbf T_{g,C}$ of graded triangulations is equivalent to the class $\mathbf H_g$ of metrics satisfying (20), and the class $\mathbf T_{a,C}$ of quasi-acute triangulations is equivalent to the class $\mathbf H_a$ of metrics satisfying simultaneously (20) and (19).
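The two distances $d_+$ and $d_\times$ entering (19) and (20) are easy to evaluate for $2\times2$ matrices. The following is a minimal sketch, not taken from the thesis: the function names are hypothetical, matrices are plain nested lists, and $\|\cdot\|$ is assumed to be the spectral norm. In the isotropic case $H=\mathrm{Id}/s^2$ it recovers $|s-s'|$ and $|\ln s-\ln s'|$.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sqrt_spd2(m):
    """Square root of a symmetric positive definite 2x2 matrix (closed form)."""
    (a, b), (_, c) = m
    s = math.sqrt(a * c - b * b)       # sqrt(det M)
    t = math.sqrt(a + c + 2.0 * s)     # sqrt(trace M + 2 sqrt(det M))
    return [[(a + s) / t, b / t], [b / t, (c + s) / t]]

def spectral_norm2(m):
    """Largest singular value of a 2x2 matrix (assumed choice of norm)."""
    (a, b), (c, d) = m
    p = a * a + b * b + c * c + d * d  # trace of M^T M
    q = (a * d - b * c) ** 2           # determinant of M^T M
    return math.sqrt((p + math.sqrt(max(p * p - 4.0 * q, 0.0))) / 2.0)

def S(h):
    """S := H^{-1/2}."""
    return sqrt_spd2(inv2(h))

def d_plus(h1, h2):
    s1, s2 = S(h1), S(h2)
    diff = [[s1[i][j] - s2[i][j] for j in range(2)] for i in range(2)]
    return spectral_norm2(diff)

def d_times(h1, h2):
    s1, s2 = S(h1), S(h2)
    return math.log(max(spectral_norm2(mul2(s1, inv2(s2))),
                        spectral_norm2(mul2(s2, inv2(s1)))))

# Isotropic metrics H = Id / s^2 with s = 1 and s' = 2.
H1 = [[1.0, 0.0], [0.0, 1.0]]
H2 = [[0.25, 0.0], [0.0, 0.25]]
print(d_plus(H1, H2), d_times(H1, H2))
```

The closed-form square root is specific to dimension two; a general implementation would rely on an eigenvalue decomposition instead.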
Figure 8 – […] $\delta$ along the unit circle using $O(\delta^{-[\ldots]})$ isotropic triangles (top left), $O(\delta^{-[\ldots]}|\ln\delta|)$ “quasi-acute” triangles (top right), $O(\delta^{-[\ldots]})$ triangles satisfying the grading condition or not (bottom left and bottom right).

We give in Chapter 6 some applications of these results in the contexts of approximation theory and of constrained mesh generation. Let us describe the latter example. For any closed set $E\subset\mathbb R^2$ and any mesh $\mathcal T\in\mathbf T$ we denote by $V_{\mathcal T}(E)$ the neighborhood of $E$ in the triangulation $\mathcal T$, which is defined as follows:
$$V_{\mathcal T}(E):=\bigcup_{\substack{T\in\mathcal T\\ T\cap E\ne\emptyset}}T.$$
We say that a mesh $\mathcal T$ separates two disjoint closed sets $X$ and $Y$ if $V_{\mathcal T}(X)\cap V_{\mathcal T}(Y)=\emptyset$. The counterpart for metrics of this property is the following: a metric $H\in\mathbf H$ separates $X$ and $Y$ if $d_H(x,y)\ge1$ for all $x\in X$ and $y\in Y$. We show that these two properties are rigorously equivalent, and we use this reformulation to compute the smallest number of triangles (up to periodicity) required for the separation of some (periodic) sets using a (periodic) isotropic, quasi-acute or graded triangulation. As illustrated on Figure 8, imposing more constraints on the triangulation typically raises the number of triangles needed to achieve the same task.

The rest of this chapter is devoted to the control of the finite element approximation error of a function $f$ on a triangulation $\mathcal T$, in $L^p$ norm or $W^{1,p}$ semi-norm, by a quantity $e_H(f)_p$, $e_H^a(\nabla f)_p$ or $e_H^g(\nabla f)_p$ attached to a metric $H$ equivalent to $\mathcal T$. This analysis brings to light, in the case of Sobolev norms, the role played by the angle or regularity conditions which define the elements of $\mathbf T_a$ or $\mathbf H_a$ respectively. Finally, we present a counterpart for metrics of the asymptotic approximation results developed for triangulations in Chapters 2 and 3.

Part IV. Hierarchical refinement algorithms
The last part of this thesis is devoted to the study of an algorithm proposed by Cohen, Dyn and Hecht which produces hierarchical sequences $(\mathcal T_N)_{N\ge N_0}$ of (non-conforming) anisotropic triangulations […] $f$. Given a triangulation $\mathcal T$ of a domain $\Omega$ and a function $f\in L^p(\Omega)$, this algorithm creates in one step a triangulation $\mathcal T'$ of $\Omega$, of cardinality $\#(\mathcal T')=\#(\mathcal T)+1$, proceeding as follows:

1. (Greedy triangle selection) A triangle $T\in\mathcal T$ which has a maximal contribution to the error is selected,
$$T:=\operatorname*{argmax}_{T'\in\mathcal T}\|f-\mathcal A_{T'}f\|_{L^p(T')},$$
where $\mathcal A_T:L^p(T)\to\mathbb P_{m-1}$ is a projection operator, for instance the $L^2(T)$ orthogonal projection onto $\mathbb P_{m-1}$.
2. (Decision of a bisection) An edge $e\in\{a,b,c\}$ of $T$ is selected by minimizing a given decision function $e\mapsto d_T(e,f)$ among the three edges. The triangle is refined by joining the mid-point of this edge to the opposite vertex, creating the sub-triangles $T_e^1$ and $T_e^2$. The new triangulation is thus
$$\mathcal T':=\mathcal T-\{T\}+\{T_e^1,T_e^2\}.$$

Starting from a triangulation $\mathcal T_{N_0}$ of cardinality $N_0$ of the domain $\Omega$, the algorithm produces step after step a sequence $(\mathcal T_N)_{N\ge N_0}$, see Figure 9, of triangulations “adapted” to a given function $f\in L^p(\Omega)$. The properties of these triangulations strongly depend on the approximated function $f$ and on the choice of the decision function $d_T(e,f)$, which governs the creation of anisotropy. In contrast, the projection operator $\mathcal A_T$ plays a rather minor role.
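The two steps above can be sketched in a few lines. This is only an illustration under simplifying assumptions that are not part of the thesis: piecewise constant approximation ($m=1$), error measured in $L^2$, integrals replaced by a crude equal-weight quadrature, and a decision function equal to the local error after bisection. All function names are hypothetical.

```python
import math

def area(tri):
    (x0, y0), (x1, y1), (x2, y2) = tri
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2.0

def quadrature_nodes(tri, n=4):
    """Centroids of the n*n congruent sub-triangles of tri (equal weights)."""
    a, b, c = tri
    pts = []
    for i in range(n):
        for j in range(n - i):
            # One upward sub-triangle, plus one downward when it fits.
            shifts = (1.0, 2.0) if i + j <= n - 2 else (1.0,)
            for s in shifts:
                u, v = (i + s / 3.0) / n, (j + s / 3.0) / n
                pts.append((a[0] + u * (b[0] - a[0]) + v * (c[0] - a[0]),
                            a[1] + u * (b[1] - a[1]) + v * (c[1] - a[1])))
    return pts

def local_error(f, tri):
    """Approximate L2(T) distance between f and its mean value on T."""
    vals = [f(x, y) for x, y in quadrature_nodes(tri)]
    mean = sum(vals) / len(vals)
    return math.sqrt(area(tri) * sum((v - mean) ** 2 for v in vals) / len(vals))

def bisect(tri, k):
    """Join vertex k to the mid-point of the opposite edge."""
    p, q, r = tri[k], tri[(k + 1) % 3], tri[(k + 2) % 3]
    mid = ((q[0] + r[0]) / 2.0, (q[1] + r[1]) / 2.0)
    return (p, q, mid), (p, mid, r)

def refine_once(f, mesh):
    """One step: greedy triangle selection, then decision of a bisection."""
    worst = max(range(len(mesh)), key=lambda i: local_error(f, mesh[i]))
    tri = mesh[worst]

    def decision(k):
        # Assumed decision function: L2 error after bisecting the edge opposite k.
        t1, t2 = bisect(tri, k)
        return math.hypot(local_error(f, t1), local_error(f, t2))

    best = min(range(3), key=decision)
    return mesh[:worst] + mesh[worst + 1:] + list(bisect(tri, best))

# Unit square split into two triangles, and a sharp transition along x = 1/2.
mesh = [((0.0, 0.0), (1.0, 0.0), (1.0, 1.0)), ((0.0, 0.0), (1.0, 1.0), (0.0, 1.0))]
f = lambda x, y: math.tanh(20.0 * (x - 0.5))
for _ in range(10):
    mesh = refine_once(f, mesh)
print(len(mesh))
```

Each step adds exactly one triangle and leaves the covered area unchanged; a more faithful implementation would use exact or higher-order quadrature and store the refinement tree.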
A typical choice for the decision function $e\mapsto d_T(e,f)$ is the local error after bisection of the edge $e$, namely the algorithm selects the bisection that most reduces the error.

We present this algorithm in more detail in Chapter 7, and we establish its convergence in the sense that the (discontinuous) piecewise polynomial approximation $\mathcal A_{\mathcal T_N}f$ defined by
$$\mathcal A_{\mathcal T_N}f(z)=\mathcal A_Tf(z),\quad z\in T,\ T\in\mathcal T_N,$$
converges towards $f$ in $L^p$ as $N\to+\infty$ for any $f\in L^p$, under some assumptions on the decision function $e\mapsto d_T(e,f)$. We also discuss the possibility of using the multiscale hierarchical structure to define multiresolution approximation, wavelets and CART algorithms. We illustrate the anisotropic adaptation of the algorithm on several types of functions and images presenting sharp transitions along curved edges.

Figure 10 – Triangulation produced by the hierarchical refinement algorithm, and interpolation, for a function exhibiting a sharp transition close to a sinusoidal curve.

In Chapter 8 we make a deeper convergence analysis of the algorithm in the case $m=2$ of piecewise linear elements. Our main result states that when $f$ is a strictly convex $C^2$ function, then for a particular choice of the decision function based on the $L^1$ local interpolation error, the sequence $(\mathcal T_N)_{N\ge N_0}$ of triangulations generated by this algorithm satisfies the optimal asymptotic convergence estimate
$$\limsup_{N\to\infty}N\,\|f-\mathcal A_{\mathcal T_N}f\|_{L^p(\Omega)}\le C\left\|\sqrt{|\det(d^2f)|}\right\|_{L^\tau(\Omega)},$$
where $\frac1\tau:=1+\frac1p$. The key observation leading to this result is that when $f$ is a convex quadratic polynomial, the minimization of the decision function selects the longest edge in the metric associated to the homogeneous quadratic part of this polynomial. This is used to prove that the triangles generated by the algorithm tend in majority to adopt an optimal aspect ratio. This good behaviour of the algorithm is also observed on more general non-convex $C^2$ functions, as illustrated on Figure 10. However, proving the above optimal convergence bound remains an open problem in this general setting.

Finally we study in Chapter 9 some variants of the hierarchical refinement algorithm presented above. We first consider a decision function based on the $L^2$ local projection error, which is particularly well suited to numerical implementation since it can be evaluated in small computing time.

We then focus on the behaviour of the algorithm when applied to cartoon functions. The original algorithm does not satisfy the best possible convergence estimate for such functions.
We show that the optimal convergence rate may be recovered when replacing the bisection from the mid-point of the edge towards the opposite vertex by a more general geometric splitting procedure.

Finally we consider another variant of the algorithm, which is based on rectangles aligned with the coordinate axes instead of triangles of arbitrary direction, in the spirit of the rectangular partitions studied in Chapter 1. This simplification leads to a result which guarantees the best possible convergence estimate for all $C^1$ functions, in the case of piecewise constant approximations.

Part I. Optimal mesh adaptation for finite elements of arbitrary order

Chapter 1. Sharp asymptotics of the $L^p$ approximation error on rectangular partitions
The purpose of this chapter is to study the adaptive anisotropic approximation of a function by interpolating splines defined over block partitions in $\mathbb R^d$. We use the word block as a synonym for “$d$-dimensional rectangle”. Our analysis applies to an arbitrary projection operator in arbitrary dimension. We then apply the obtained estimates to several different interpolating schemes most commonly used in practice.

Our approach is to introduce the “shape function” which reflects the interaction of the approximation procedure with polynomials. Throughout the chapter we shall study the asymptotic behavior of the approximation error and, whenever possible, the explicit form of the shape function, which plays a major role in finding the constants in the formulae for exact asymptotics.

Let us first introduce the definitions that will be necessary to state the main problem and the results of this chapter.

We consider a fixed integer $d\ge1$ and we denote by $x=(x_1,\cdots,x_d)$ the elements of $\mathbb R^d$. A block $R$ is a subset of $\mathbb R^d$ of the form
$$R=\prod_{1\le i\le d}[a_i,b_i],$$
where $a_i<b_i$ for all $1\le i\le d$. For any block $R\subset\mathbb R^d$, by $L^p(R)$, $1\le p\le\infty$, we denote the space of measurable functions $f:R\to\mathbb R$ for which the value
$$\|f\|_p=\|f\|_{L^p(R)}:=\begin{cases}\left(\displaystyle\int_R|f(x)|^p\,dx\right)^{\frac1p},&\text{if }1\le p<\infty,\\[2mm]\operatorname{esssup}\{|f(x)|\,;\ x\in R\},&\text{if }p=\infty,\end{cases}$$
is finite. We also consider the space $C^0(R)$ of continuous functions on $R$ equipped with the uniform norm $\|\cdot\|_{L^\infty(R)}$. We shall make frequent use of the canonical block $I^d$, where $I$ is the interval
$$I:=\left[-\frac12,\frac12\right].$$
Next we define the space $V:=C^0(I^d)$ and the norm $\|\cdot\|_V:=\|\cdot\|_{L^\infty(I^d)}$. Throughout this chapter we consider a linear and bounded (hence, continuous) operator $\mathrm I:V\to V$.
This implies that there exists a constant $C_{\mathrm I}$ such that
$$\|\mathrm Iu\|_V\le C_{\mathrm I}\|u\|_V\quad\text{for all }u\in V. \tag{1.1}$$
We assume furthermore that $\mathrm I$ is a projector, which means that it satisfies
$$\mathrm I\circ\mathrm I=\mathrm I. \tag{1.2}$$
Let $R$ be an arbitrary block. It is easy to show that there exists a unique $x_0\in\mathbb R^d$ and a unique diagonal matrix $D$ with positive diagonal coefficients such that the transformation
$$\phi(x):=x_0+Dx\quad\text{satisfies}\quad\phi(I^d)=R. \tag{1.3}$$
The volume of $R$, denoted by $|R|$, is equal to $\det(D)$. For any function $f\in C^0(R)$ we then define
$$\mathrm I_Rf:=\mathrm I(f\circ\phi)\circ\phi^{-1}. \tag{1.4}$$
Note that
$$\|f-\mathrm I_Rf\|_{L^p(R)}=(\det D)^{\frac1p}\,\|f\circ\phi-\mathrm I(f\circ\phi)\|_{L^p(I^d)}. \tag{1.5}$$
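As an illustration of (1.3) and (1.4), the sketch below builds the affine map $\phi$ of a block and transports an operator from the canonical block to that block. It is a toy example under stated assumptions: functions are Python callables on lists of coordinates, the names are hypothetical, and the operator used for the demonstration (replacing a function by its value at the center of $I^d$) is merely one simple linear, bounded projector, not one of the schemes studied in this chapter.

```python
import math

def block_to_affine(a, b):
    """Return (x0, diag) such that phi(x) = x0 + D x maps [-1/2, 1/2]^d onto prod [a_i, b_i]."""
    x0 = [(ai + bi) / 2.0 for ai, bi in zip(a, b)]
    diag = [bi - ai for ai, bi in zip(a, b)]
    return x0, diag

def phi(x0, diag, x):
    return [c + s * xi for c, s, xi in zip(x0, diag, x)]

def center_value(g, d):
    """Toy projector on C^0(I^d): replace g by the constant g(0, ..., 0)."""
    value = g([0.0] * d)
    return lambda x: value

def transport(op, a, b):
    """Build I_R from an operator on the canonical block, following (1.4)."""
    x0, diag = block_to_affine(a, b)
    d = len(a)

    def op_R(f):
        g = op(lambda x: f(phi(x0, diag, x)), d)                         # I(f o phi)
        return lambda y: g([(yi - c) / s for yi, c, s in zip(y, x0, diag)])  # ... o phi^{-1}

    return op_R

a, b = [1.0, 0.0], [3.0, 4.0]
x0, diag = block_to_affine(a, b)
volume = math.prod(diag)                      # |R| = det(D)
I_R = transport(center_value, a, b)
approx = I_R(lambda y: y[0] + 10.0 * y[1])    # constant equal to f at the center of R
print(x0, diag, volume, approx([1.5, 3.0]))
```

Any operator with the same calling convention can be passed to `transport`, which is the point of definition (1.4): the operator is specified once on $I^d$ and carried to every block.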
A block partition $\mathcal R$ of a block $R_0$ is a finite collection of blocks whose union covers $R_0$ and whose pairwise intersections have zero Lebesgue measure. If $\mathcal R$ is a block partition of a block $R_0$ and if $f\in C^0(R_0)$, by $\mathrm I_{\mathcal R}f\in L^\infty(R_0)$ we denote the (possibly discontinuous) function which coincides with $\mathrm I_Rf$ on the interior of each block $R\in\mathcal R$.

Main Question.
The purpose of this chapter is to understand the asymptotic behavior of the quantity
$$\|f-\mathrm I_{\mathcal R_N}f\|_{L^p(R_0)}$$
for each given function $f$ on $R_0$ from some class of smoothness, where $(\mathcal R_N)_{N\ge1}$ is a sequence of block partitions of $R_0$ that are optimally adapted to $f$.

Note that the exact value of this error can be explicitly computed only in trivial cases. Therefore, the natural question is to study the asymptotic behavior of the error, i.e. its behavior as the number of elements of the partition $\mathcal R_N$ tends to infinity.

Most of our results hold under only the continuity assumption (1.1) on the operator $\mathrm I$, the projection axiom (1.2), and the definition of $\mathrm I_R$ given by (1.4). Our analysis therefore applies to various projection operators $\mathrm I$, such as the $L^2$-orthogonal projection on a space of polynomials, or the spline interpolating schemes described in §1.1.4.

The main problem formulated above is interesting for functions of arbitrary smoothness as well as for various classes of splines (for instance, for splines of higher order, interpolating splines, best approximating splines, etc.). In the univariate case general questions of this type have been investigated by many authors. The results are more or less complete and have numerous applications (see, for example, [69]).

Fewer results are known in the multivariate case. Most of them are for the case of approximation by splines on triangulations (for a review of existing results see, for instance, [4, 18, 35, 59] and Chapter 2). However, in applications where preferred directions exist, box partitions are sometimes more convenient and efficient.

The first result on the error of interpolation on rectangular partitions by bivariate splines linear in each variable (or bilinear) is due to D’Azevedo [38], who obtained local (on a single rectangle) error estimates.
In [7] Babenko obtained the exact asymptotics for the error (in $L^1$, $L^2$, and $L^\infty$ norms) of interpolation of $C^2(I^d)$ functions by bilinear splines.

In [8] Babenko generalized the result to interpolation and quasi-interpolation of a function $f\in C^2(I^d)$ with arbitrary signature (number of positive and negative second-order partial derivatives), fixed throughout the domain. However, the norm used to measure the error of approximation was uniform.

In this chapter we use a different, more abstract, approach which allows us to obtain the exact asymptotics of the error in a more general framework, which can be applied to many particular interpolation schemes by an appropriate choice of the interpolation operator. In general, the constant in the asymptotics is implicit. However, imposing additional assumptions on the interpolation operator allows us to compute the constant explicitly.

The chapter is organized as follows. Section 1.1.5 contains the statements of the main approximation results. The closer study of the shape function, as well as its explicit formulas under some restrictions, can be found in Section 1.2. The proofs of the theorems about the asymptotic behavior of the error are contained in Section 1.3.
In order to obtain the asymptotic error estimates we need to study the interaction of the projection operator $\mathrm I$ with polynomials.

The notation $\alpha$ always refers to a $d$-vector of non-negative integers
$$\alpha=(\alpha_1,\cdots,\alpha_d)\in\mathbb Z_+^d.$$
For each $\alpha$ we define the following quantities
$$|\alpha|:=\sum_{1\le i\le d}\alpha_i,\qquad\alpha!:=\prod_{1\le i\le d}\alpha_i!,\qquad\max(\alpha):=\max_{1\le i\le d}\alpha_i.$$
We also define the monomial
$$X^\alpha:=\prod_{1\le i\le d}X_i^{\alpha_i},$$
where the variable is $X=(X_1,\dots,X_d)\in\mathbb R^d$. Finally, for each integer $k\ge0$ we define the following three polynomial spaces
$$\begin{aligned}\mathbb P_k&:=\operatorname{Vect}\{X^\alpha\,;\ |\alpha|\le k\},\\ \mathbb P_k^*&:=\operatorname{Vect}\{X^\alpha\,;\ \max(\alpha)\le k\text{ and }|\alpha|\le k+1\},\\ \mathbb P_k^{**}&:=\operatorname{Vect}\{X^\alpha\,;\ \max(\alpha)\le k\}.\end{aligned} \tag{1.6}$$
Note that clearly $\dim(\mathbb P_k^{**})=(k+1)^d$. In addition, a classical combinatorial argument shows that
$$\dim\mathbb P_k=\binom{k+d}{d}\quad\text{and}\quad\dim\mathbb P_k^*=\dim\mathbb P_{k+1}-d=\binom{k+d+1}{d}-d.$$
By $V_{\mathrm I}$ we denote the image of $\mathrm I$, which is a subspace of $V=C^0(I^d)$. Since $\mathrm I$ is a projector (1.2), we have
$$V_{\mathrm I}=\{\mathrm I(f)\,;\ f\in V\}=\{f\in V\,;\ f=\mathrm I(f)\}. \tag{1.7}$$
From this point on, the integer $k$ is fixed and defined as follows
$$k=k(\mathrm I):=\max\{k'\ge0\,;\ \mathbb P_{k'}\subset V_{\mathrm I}\}. \tag{1.8}$$
Hence, the operator $\mathrm I$ reproduces polynomials of total degree less than or equal to $k$. (If $k=\infty$ then we obtain, using the density of polynomials in $V$ and the continuity of $\mathrm I$, that $\mathrm I(f)=f$ for all $f\in V$. We exclude this case from now on.)

In what follows, by $m$ we denote the integer defined by
$$m=m(\mathrm I):=k+1, \tag{1.9}$$
where $k=k(\mathrm I)$ is defined in (1.8). By $\mathbb H_m$ we denote the space of homogeneous polynomials of degree $m$,
$$\mathbb H_m:=\operatorname{Vect}\{X^\alpha\,;\ |\alpha|=m\}.$$
We now introduce a function $K_{\mathrm I}$ on $\mathbb H_m$, further referred to as the “shape function”.

Definition 1.1.1 (Shape Function). For all $\pi\in\mathbb H_m$
$$K_{\mathrm I}(\pi):=\inf_{|R|=1}\|\pi-\mathrm I_R\pi\|_{L^p(R)}, \tag{1.10}$$
where the infimum is taken over all blocks $R$ of unit $d$-dimensional volume.
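The dimension formulas stated after (1.6) can be checked by brute-force enumeration of the multi-indices spanning each space. The snippet below is a small sanity check written for this purpose; the function name and the string labels for the three spaces are hypothetical.

```python
from itertools import product
from math import comb

def exponents(d, k, space):
    """Multi-indices alpha spanning P_k ("P"), P*_k ("P*") or P**_k ("P**") in d variables."""
    keep = {
        "P": lambda alpha: sum(alpha) <= k,        # total degree at most k
        "P*": lambda alpha: sum(alpha) <= k + 1,   # max(alpha) <= k and |alpha| <= k + 1
        "P**": lambda alpha: True,                 # max(alpha) <= k only
    }[space]
    # Every admissible alpha has all entries in {0, ..., k}.
    return [alpha for alpha in product(range(k + 1), repeat=d) if keep(alpha)]

for d in (1, 2, 3):
    for k in range(5):
        n_p = len(exponents(d, k, "P"))
        n_star = len(exponents(d, k, "P*"))
        n_2star = len(exponents(d, k, "P**"))
        print(d, k, n_p, n_star, n_2star)
```

The count for $\mathbb P_k^*$ differs from that of $\mathbb P_{k+1}$ exactly by the $d$ pure powers $X_i^{k+1}$, which are the only monomials of total degree at most $k+1$ with an exponent exceeding $k$.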
The shape function $K$ plays a major role in our asymptotic error estimates developed in the next subsection. Hence, we dedicate §1.2 to its close study. Note that for any $A>0$, since $\pi$ is homogeneous of degree $m$,
$$\inf_{|R|=A}\|\pi-\mathrm I_R\pi\|_{L^p(R)}=A^{\frac md+\frac1p}K_{\mathrm I}(\pi). \tag{1.11}$$
The optimization (1.10) among blocks can be rephrased into an optimization among diagonal matrices. Indeed, for any block $R$ there exists a unique $x_0\in\mathbb R^d$ and a unique diagonal matrix $D$ with positive coefficients such that $R=\phi(I^d)$ with $\phi(x)=x_0+Dx$. Furthermore, the homogeneous component of degree $m$ is the same in both $\pi\circ\phi$ and $\pi\circ D$, hence $\pi\circ\phi-\pi\circ D\in\mathbb P_k$ (recall that $m=k+1$) and therefore this polynomial is reproduced by the projection operator $\mathrm I$. Using the linearity of $\mathrm I$, we obtain
$$\pi\circ\phi-\mathrm I(\pi\circ\phi)=\pi\circ D-\mathrm I(\pi\circ D).$$
Combining this with (1.5), and observing that $\det D=|R|$, we obtain that
$$K_{\mathrm I}(\pi)=\inf_{\substack{\det D=1\\ D\ge0}}\|\pi\circ D-\mathrm I(\pi\circ D)\|_{L^p(I^d)}, \tag{1.12}$$
where the infimum is taken over the set of diagonal matrices with non-negative entries and unit determinant.

In this section we define several possible choices for the projection operator $\mathrm I$ which are consistent with (1.8) and, in our opinion, are most useful for practical purposes. However, many other possibilities could be considered.
Definition 1.1.2 ($L^2(I^d)$ orthogonal projection). We may define $\mathrm I(f)$ as the $L^2(I^d)$ orthogonal projection of $f$ onto one of the spaces of polynomials $\mathbb P_k$, $\mathbb P_k^*$ or $\mathbb P_k^{**}$ defined in (1.6).

If the projection operator $\mathrm I$ is chosen as in Definition 1.1.2, then a simple change of variables shows that for any block $R$, the operator $\mathrm I_R$ defined by (1.4) is the $L^2(R)$ orthogonal projection onto the same space of polynomials.

To introduce several possible interpolation schemes for which we obtain the estimates using our approach, we consider a set $U_k\subset I$ of cardinality $\#(U_k)=k+1$ (special cases are given below). For any $u=(u_1,\cdots,u_d)\in U_k^d$ we define an element of $\mathbb P_k^{**}$ as follows
$$\mu_u(X):=\prod_{1\le i\le d}\ \prod_{\substack{v\in U_k\\ v\ne u_i}}\frac{X_i-v}{u_i-v}\in\mathbb P_k^{**}.$$
Clearly, $\mu_u(u)=\mu_u(u_1,\cdots,u_d)=1$ and $\mu_u(v)=\mu_u(v_1,\cdots,v_d)=0$ if $v=(v_1,\cdots,v_d)\in U_k^d$ and $v\ne u$. It follows that the elements of $B:=(\mu_u)_{u\in U_k^d}$ are linearly independent. Since $\#(B)=\#(U_k^d)=(k+1)^d=\dim(\mathbb P_k^{**})$, $B$ is a basis of $\mathbb P_k^{**}$. Therefore, any element $\mu\in\mathbb P_k^{**}$ can be written in the form
$$\mu(X)=\sum_{u\in U_k^d}\lambda_u\,\mu_u(X).$$
It follows that, for any given $f\in C^0(I^d)$, there exists a unique element $\mu\in\mathbb P_k^{**}$ such that $\mu(u)=f(u)$ for all $u\in U_k^d$. We define $\mathrm If:=\mu$, namely
$$(\mathrm If)(X):=\sum_{u\in U_k^d}f(u)\,\mu_u(X)\in\mathbb P_k^{**}.$$
We may take $U_k$ to be the set of $k+1$ equi-spaced points on $I$,
$$U_k=\left\{-\frac12+\frac nk\,;\ 0\le n\le k\right\}. \tag{1.13}$$
We obtain a different, but equally relevant, operator $\mathrm I$ by choosing $U_k$ to be the set of Tchebychev points on $I$,
$$U_k=\left\{\frac12\cos\left(\frac{n\pi}{k}\right);\ 0\le n\le k\right\}. \tag{1.14}$$
Different interpolation procedures can be used to construct $\mathrm I$. Another convenient interpolation scheme is to take
$$\mathrm I(f)\in\mathbb P_k^*\quad\text{and}\quad\mathrm I(f)=f\text{ on a subset of }U_k^d.$$
This subset contains $\dim\mathbb P_k^*$ points, which are convenient to choose first on the boundary of $I^d$ and then (if needed) at some interior lattice points. Note that since $\dim\mathbb P_k^*<\#(U_k^d)=(k+1)^d$, it is always possible to construct such an operator.

If the projection operator $\mathrm I$ is chosen as described above, then for any block $R$ and any $f\in C^0(R)$, $\mathrm I_R(f)$ is the unique element of the respective space of polynomials which coincides with $f$ at the image $\phi(p)$ of the points $p$ mentioned in the definition of $\mathrm I$, by the transformation $\phi$ described in (1.3).

In order to obtain the approximation results we often impose a slight technical restriction (which can be removed, see for instance [4]) on sequences of block partitions, which is defined as follows.
Definition 1.1.3 (Admissibility). We say that a sequence $(\mathcal R_N)_{N\ge1}$ of block partitions of a block $R_0$ is admissible if $\#(\mathcal R_N)\le N$ for all $N\ge1$, and
$$\sup_{N\ge1}\left(N^{\frac1d}\sup_{R\in\mathcal R_N}\operatorname{diam}(R)\right)<\infty. \tag{1.15}$$
We recall that the approximation error is measured in $L^p$ norm, where the exponent $p$ is fixed and $1\le p\le\infty$. We define $\tau\in(0,\infty)$ by
$$\frac1\tau:=\frac md+\frac1p. \tag{1.16}$$
In the following estimates we identify $d^mf(x)$ with an element of $\mathbb H_m$ according to
$$\frac{d^mf(x)}{m!}\sim\sum_{|\alpha|=m}\frac{\partial^mf(x)}{\partial x^\alpha}\frac{X^\alpha}{\alpha!}. \tag{1.17}$$
We now state the asymptotically sharp lower bound for the approximation error of a function $f$ on an admissible sequence of block partitions.

Theorem 1.1.4.
Let $R_0$ be a block and let $f\in C^m(R_0)$. For any admissible sequence of block partitions $(\mathcal R_N)_{N\ge1}$ of $R_0$,
$$\liminf_{N\to\infty}N^{\frac md}\,\|f-\mathrm I_{\mathcal R_N}f\|_{L^p(R_0)}\ge\left\|K_{\mathrm I}\left(\frac{d^mf}{m!}\right)\right\|_{L^\tau(R_0)}.$$
The next theorem provides an upper bound for the projection error of a function $f$ when an optimal sequence of block partitions is used. It confirms the sharpness of the previous theorem.

Theorem 1.1.5.
Let $R_0$ be a block and let $f\in C^m(R_0)$. Then there exists a (perhaps non-admissible) sequence $(\mathcal R_N)_{N\ge1}$, $\#(\mathcal R_N)\le N$, of block partitions of $R_0$ satisfying
$$\limsup_{N\to\infty}N^{\frac md}\,\|f-\mathrm I_{\mathcal R_N}f\|_{L^p(R_0)}\le\left\|K_{\mathrm I}\left(\frac{d^mf}{m!}\right)\right\|_{L^\tau(R_0)}. \tag{1.18}$$
Furthermore, for all $\varepsilon>0$ there exists an admissible sequence $(\mathcal R_N^\varepsilon)_{N\ge1}$ of block partitions of $R_0$ satisfying
$$\limsup_{N\to\infty}N^{\frac md}\,\|f-\mathrm I_{\mathcal R_N^\varepsilon}f\|_{L^p(R_0)}\le\left\|K_{\mathrm I}\left(\frac{d^mf}{m!}\right)\right\|_{L^\tau(R_0)}+\varepsilon. \tag{1.19}$$
An important feature of these estimates is the “lim sup”. Recall that the upper limit of a sequence $(u_N)_{N\ge N_0}$ is defined by
$$\limsup_{N\to\infty}u_N:=\lim_{N\to\infty}\sup_{n\ge N}u_n,$$
and is in general strictly smaller than the supremum $\sup_{N\ge N_0}u_N$. It is still an open question to find an appropriate upper estimate of $\sup_{N\ge N_0}N^{\frac md}\|f-\mathrm I_{\mathcal R_N}f\|_{L^p(R_0)}$ when optimally adapted block partitions are used.

In order to have more control of the quality of approximation on various parts of the domain we introduce a positive weight function $\Omega\in C^0(R_0)$. For $1\le p\le\infty$ and for any $u\in L^p(R_0)$ we define, as usual,
$$\|u\|_{L^p(R_0,\Omega)}:=\|u\,\Omega\|_{L^p(R_0)}. \tag{1.20}$$

Remark 1.1.6.
Theorems 1.1.4, 1.1.5 and 1.1.7 below also hold when the norm $\|\cdot\|_{L^p(R_0)}$ (resp. $\|\cdot\|_{L^\tau(R_0)}$) is replaced with the weighted norm $\|\cdot\|_{L^p(R_0,\Omega)}$ (resp. $\|\cdot\|_{L^\tau(R_0,\Omega)}$) defined in (1.20).

In the following section we shall use some restrictive hypotheses on the interpolation operator in order to obtain an explicit formula for the shape function. In particular, Propositions 1.2.7, 1.2.8, and equation (1.21) show that, under some assumptions, there exists a constant $C=C(\mathrm I)>0$ such that
$$\frac1C\,K_{\mathrm I}\left(\frac{d^mf}{m!}\right)\le\sqrt[d]{\left|\prod_{1\le i\le d}\frac{\partial^mf}{\partial x_i^m}\right|}\le C\,K_{\mathrm I}\left(\frac{d^mf}{m!}\right).$$
These restrictive hypotheses also allow us to slightly improve the estimate (1.19) as follows.
Theorem 1.1.7.
If the hypotheses of Proposition 1.2.7 or 1.2.8 hold, and if $K_{\mathrm I}\left(\frac{d^mf}{m!}\right)>0$ everywhere on $R_0$, then there exists an admissible sequence of block partitions $(\mathcal R_N)_{N\ge1}$ of $R_0$ which satisfies the optimal estimate (1.18).

The proofs of Theorems 1.1.4, 1.1.5 and 1.1.7 are given in §1.3.

1.2. Study of the shape function

In this section we perform a close study of the shape function $K_{\mathrm I}$, since it plays a major role in our asymptotic error estimates. In the first subsection § […] $K_{\mathrm I}$ under such general assumptions. Recall that in § […] § […] § […] $K_{\mathrm I}$.

The shape function $K$ obeys the following important invariance property with respect to diagonal changes of coordinates.

Proposition 1.2.1.
For all $\pi\in\mathbb H_m$ and all diagonal matrices $D$ with non-negative coefficients
$$K_{\mathrm I}(\pi\circ D)=(\det D)^{\frac md}K_{\mathrm I}(\pi).$$
Proof:
We first assume that the diagonal matrix $D$ has positive diagonal coefficients. Let $D'$ be a diagonal matrix with positive diagonal coefficients and which satisfies $\det D'=1$. Let also $\pi\in\mathbb H_m$. Then, since $\pi$ is homogeneous of degree $m$,
$$\pi\circ(DD')=\pi\circ\left((\det D)^{\frac1d}\tilde D\right)=(\det D)^{\frac md}\,\pi\circ\tilde D,$$
where $\tilde D:=(\det D)^{-\frac1d}DD'$ satisfies $\det\tilde D=\det D'=1$ and is uniquely determined by $D'$. According to (1.12) we therefore have
$$\begin{aligned}K_{\mathrm I}(\pi\circ D)&=\inf_{\substack{\det D'=1\\ D'\ge0}}\|\pi\circ(DD')-\mathrm I(\pi\circ(DD'))\|_{L^p(I^d)}\\&=(\det D)^{\frac md}\inf_{\substack{\det\tilde D=1\\ \tilde D\ge0}}\|\pi\circ\tilde D-\mathrm I(\pi\circ\tilde D)\|_{L^p(I^d)}\\&=(\det D)^{\frac md}K_{\mathrm I}(\pi),\end{aligned}$$
which concludes the proof in the case where $D$ has positive diagonal coefficients.

Let us now assume that $D$ is a diagonal matrix with non-negative diagonal coefficients and such that $\det(D)=0$. Clearly there exists a sequence $(D_n)_{n\ge0}$ of diagonal matrices with positive coefficients, such that $\det D_n=1$ for all $n\ge0$ and $DD_n\to0$ as $n\to\infty$. Therefore $\pi\circ(DD_n)\to0$, which implies that $K(\pi\circ D)=0$ and concludes the proof of this proposition. $\diamond$

The next proposition shows that the exponent $p$ used for measuring the approximation error plays a rather minor role. By $K_p$ we denote the shape function associated with the exponent $p$.

Proposition 1.2.2.
There exists a constant $c>0$ such that for all $1\le p_1\le p_2\le\infty$ we have on $\mathbb H_m$
$$cK_\infty\le K_{p_1}\le K_{p_2}\le K_\infty.$$
Proof:
For any function $f\in V=C^0(I^d)$ and for any $1\le p_1\le p_2\le\infty$, by a standard convexity argument we obtain that
$$\|f\|_{L^1(I^d)}\le\|f\|_{L^{p_1}(I^d)}\le\|f\|_{L^{p_2}(I^d)}\le\|f\|_{L^\infty(I^d)}.$$
Using (1.12), it follows that
$$K_1\le K_{p_1}\le K_{p_2}\le K_\infty$$
on $\mathbb H_m$. Furthermore, the following semi-norms on $\mathbb H_m$,
$$|\pi|_1:=\|\pi-\mathrm I\pi\|_{L^1(I^d)}\quad\text{and}\quad|\pi|_\infty:=\|\pi-\mathrm I\pi\|_{L^\infty(I^d)},$$
vanish precisely on the same subspace of $\mathbb H_m$, namely $V_{\mathrm I}\cap\mathbb H_m=\{\pi\in\mathbb H_m\,;\ \pi=\mathrm I\pi\}$. Since $\mathbb H_m$ has finite dimension, it follows that they are equivalent. Hence, there exists a constant $c>0$ such that $c|\cdot|_\infty\le|\cdot|_1$ on $\mathbb H_m$. Using (1.12), it follows that $cK_\infty\le K_1$, which concludes the proof. $\diamond$

The examples of projection operators presented in §1.1.4 […] $K_{\mathrm I}$. These properties are defined below and called $H_\pm$, $H_\sigma$, $H_*$ or $H_{**}$. They are satisfied when the operator $\mathrm I$ is the interpolation at equispaced points (see (1.13)), at Tchebychev points (see (1.14)), and usually on the most interesting sets of other points. They are also satisfied when $\mathrm I$ is the $L^2(I^d)$ orthogonal projection onto $\mathbb P_k^*$ or $\mathbb P_k^{**}$ (Definition 1.1.2).

The first property reflects the fact that a coordinate $x_i$ on $I^d$ can be changed to $-x_i$, independently of the projection process.

Definition 1.2.3 ($H_\pm$ hypothesis). We say that the interpolation operator $\mathrm I$ satisfies the $H_\pm$ hypothesis if for any diagonal matrix $D$ with entries in $\pm1$ we have for all $f\in V$
$$\mathrm I(f\circ D)=\mathrm I(f)\circ D.$$
The next property implies that the different coordinates $x_1,\cdots,x_d$ on $I^d$ play symmetrical roles with respect to the projection operator.

Definition 1.2.4 ($H_\sigma$ hypothesis). If $M_\sigma$ is a permutation matrix, i.e. $(M_\sigma)_{ij}:=\delta_{i\sigma(j)}$ for some permutation $\sigma$ of $\{1,\cdots,d\}$, then for all $f\in V$
$$\mathrm I(f\circ M_\sigma)=\mathrm I(f)\circ M_\sigma.$$
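The hypotheses $H_\pm$ and $H_\sigma$ can be checked numerically for the tensor-product Lagrange interpolation at the points (1.13) or (1.14). The sketch below is an illustration only: the function names are hypothetical, the dimension is taken to be $d=2$ in the example, and equality is tested up to rounding error at a single sample point.

```python
import math
from itertools import product

def equispaced(k):
    """The k + 1 equi-spaced points (1.13) on [-1/2, 1/2] (requires k >= 1)."""
    return [-0.5 + n / k for n in range(k + 1)]

def chebyshev(k):
    """The k + 1 Tchebychev points (1.14) on [-1/2, 1/2] (requires k >= 1)."""
    return [0.5 * math.cos(n * math.pi / k) for n in range(k + 1)]

def lagrange_1d(nodes, u, x):
    """Univariate Lagrange basis polynomial attached to the node u, evaluated at x."""
    return math.prod((x - v) / (u - v) for v in nodes if v != u)

def interpolate(f, nodes, d):
    """Return I f = sum over u of f(u) mu_u, the tensor-product interpolant on nodes^d."""
    grid = list(product(nodes, repeat=d))
    values = [f(u) for u in grid]

    def interpolant(x):
        return sum(val * math.prod(lagrange_1d(nodes, ui, xi) for ui, xi in zip(u, x))
                   for u, val in zip(grid, values))

    return interpolant

f = lambda x: math.exp(x[0]) * math.sin(3.0 * x[1] + 1.0)
nodes = equispaced(3)
x = (0.3, -0.2)

# H_pm with D = diag(-1, 1): compare I(f o D)(x) and I(f)(D x).
lhs_pm = interpolate(lambda y: f((-y[0], y[1])), nodes, 2)(x)
rhs_pm = interpolate(f, nodes, 2)((-x[0], x[1]))

# H_sigma with the permutation exchanging the two coordinates.
lhs_sigma = interpolate(lambda y: f((y[1], y[0])), nodes, 2)(x)
rhs_sigma = interpolate(f, nodes, 2)((x[1], x[0]))

print(lhs_pm - rhs_pm, lhs_sigma - rhs_sigma)
```

Both identities rely on the node set being symmetric with respect to the origin and on the same nodes being used in every coordinate; replacing `equispaced` by `chebyshev` gives the same conclusion.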
According to (1.8), the projection operator $\mathrm{I}$ reproduces the space of polynomials $\mathbb{P}_k$. However, in many situations the space $V_{\mathrm{I}}$ of functions reproduced by $\mathrm{I}$ is larger than $\mathbb{P}_k$. In particular $V_{\mathrm{I}} = \mathbb{P}_k^{**}$ when $\mathrm{I}$ is the interpolation on equispaced or Chebyshev points, and $V_{\mathrm{I}} = \mathbb{P}_k$ (resp. $\mathbb{P}_k^*$, $\mathbb{P}_k^{**}$) when $\mathrm{I}$ is the $L^2(I^d)$ orthogonal projection onto $\mathbb{P}_k$ (resp. $\mathbb{P}_k^*$, $\mathbb{P}_k^{**}$).

It is particularly useful to know whether the projection operator $\mathrm{I}$ reproduces the elements of $\mathbb{P}_k^*$, and we therefore give a name to this property. Note that it clearly does not hold for the $L^2(I^d)$ orthogonal projection onto $\mathbb{P}_k$.

Definition 1.2.5 ($H_*$ hypothesis). The following inclusion holds: $\mathbb{P}_k^* \subset V_{\mathrm{I}}$.

Conversely, it is useful to know that some polynomials, and in particular pure powers $x_i^m$, are not reproduced by $\mathrm{I}$.

Definition 1.2.6 ($H_{**}$ hypothesis). If $\sum_{1 \le i \le d} \lambda_i x_i^m \in V_{\mathrm{I}}$ then $(\lambda_1, \cdots, \lambda_d) = (0, \cdots, 0)$.

This condition obviously holds if $\mathrm{I}(f) \in \mathbb{P}_k^{**}$ (polynomials of degree $\le k$ in each variable) for all $f$. Hence, it holds for all the examples of projection operators given in the previous subsection § […].

In this section we provide the explicit expression of the shape function $K$ when some of the hypotheses $H_\pm$, $H_\sigma$, $H_*$ or $H_{**}$ hold. Let $\pi \in \mathbb{H}_m$ and let $\lambda_i$ be the coefficient of $X_i^m$ in $\pi$, for all $1 \le i \le d$. We define
$$K_*(\pi) := \sqrt[d]{\prod_{1 \le i \le d} |\lambda_i|} \quad\text{and}\quad s(\pi) := \#\{1 \le i \le d \,;\, \lambda_i > 0\}.$$
If $\frac{d^m f(x)}{m!}$ is identified by (1.17) to an element of $\mathbb{H}_m$, then one has
$$K_*\!\left(\frac{d^m f(x)}{m!}\right) = \frac{1}{m!} \sqrt[d]{\left|\prod_{1 \le i \le d} \frac{\partial^m f}{\partial x_i^m}(x)\right|}. \tag{1.21}$$

Proposition 1.2.7.
If $m$ is odd and if $H_\pm$, $H_\sigma$ and $H_*$ hold, then
$$K_p(\pi) = C(p)\, K_*(\pi), \quad\text{where}\quad C(p) := \Big\| \sum_{1 \le i \le d} X_i^m - \mathrm{I}\Big(\sum_{1 \le i \le d} X_i^m\Big) \Big\|_{L^p(I^d)} > 0.$$

Proposition 1.2.8. If $m$ is even and if $H_\sigma$, $H_*$ and $H_{**}$ hold, then
$$K_p(\pi) = C(p, s(\pi))\, K_*(\pi).$$
Furthermore,
$$C(p, 0) = C(p, d) = \Big\| \sum_{1 \le i \le d} X_i^m - \mathrm{I}\Big(\sum_{1 \le i \le d} X_i^m\Big) \Big\|_{L^p(I^d)} > 0. \tag{1.22}$$
Other constants $C(p, s)$ are positive and obey $C(p, s) = C(p, d - s)$.

Next we turn to the proofs of Propositions 1.2.7 and 1.2.8.
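Both propositions reduce the shape function to two quantities read off from the pure-power coefficients of $\pi$: the geometric mean $K_*(\pi)$ and the count $s(\pi)$ of positive coefficients. A minimal sketch of that computation follows; the function name `pure_power_data` and the sample coefficients are assumptions made for illustration, and the constants $C(p)$ and $C(p, s)$ are not computed here.

```python
import math

def pure_power_data(lams):
    """Return (K_star, s) for the pure-power coefficients lams = (lambda_1, ..., lambda_d).

    K_star is the d-th root of the product of the |lambda_i|,
    s is the number of strictly positive lambda_i.
    """
    d = len(lams)
    k_star = math.prod(abs(lam) for lam in lams) ** (1.0 / d)
    s = sum(1 for lam in lams if lam > 0)
    return k_star, s

# Hypothetical example with d = 2: pi = 2 X_1^m - 8 X_2^m + (mixed terms).
k_star, s = pure_power_data([2.0, -8.0])

# A vanishing coefficient forces K_star = 0, matching the degenerate case of the proof.
k_star_degenerate, _ = pure_power_data([1.0, 0.0, 3.0])
```

Note that the mixed terms of $\pi$ play no role: under $H_*$ they are reproduced by $\mathrm{I}$ and drop out of the error.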
Proof of Proposition 1.2.7
Let $\pi \in \mathbb{H}_m$ and let $\lambda_i$ be the coefficient of $X_i^m$ in $\pi$. Denote by
$$\pi_* := \sum_{1 \le i \le d} \lambda_i X_i^m,$$
so that $\pi - \pi_* \in \mathbb{P}_k^*$ and, more generally, $\pi \circ D - \pi_* \circ D \in \mathbb{P}_k^*$ for any diagonal matrix $D$. The hypothesis $H_*$ states that the projection operator $\mathrm{I}$ reproduces the elements of $\mathbb{P}_k^*$, and therefore
$$\pi \circ D - \mathrm{I}(\pi \circ D) = \pi_* \circ D - \mathrm{I}(\pi_* \circ D).$$
Hence, $K_{\mathrm{I}}(\pi) = K_{\mathrm{I}}(\pi_*)$ according to (1.12). If there exists $i_0$, $1 \le i_0 \le d$, such that $\lambda_{i_0} = 0$, then we denote by $D$ the diagonal matrix of entries $D_{ii} = 1$ if $i \ne i_0$ and $0$ if $i = i_0$. Applying Proposition 1.2.1 we find
$$K_{\mathrm{I}}(\pi) = K_{\mathrm{I}}(\pi_*) = K_{\mathrm{I}}(\pi_* \circ D) = (\det D)^{\frac{m}{d}} K_{\mathrm{I}}(\pi_*) = 0,$$
which concludes the proof. We now assume that all the coefficients $\lambda_i$, $1 \le i \le d$, are different from $0$, and we denote by $\varepsilon_i$ the sign of $\lambda_i$. Applying Proposition 1.2.1 to the diagonal matrix $D$ of entries $|\lambda_i|^{\frac{1}{m}}$ we find that
$$K_{\mathrm{I}}(\pi) = K_{\mathrm{I}}(\pi_*) = (\det D)^{\frac{m}{d}} K_{\mathrm{I}}(\pi_* \circ D^{-1}) = K_*(\pi)\, K_{\mathrm{I}}\Big(\sum_{1 \le i \le d} \varepsilon_i X_i^m\Big).$$
Using the $H_\pm$ hypothesis with the diagonal matrix $D$ of entries $D_{ii} = \varepsilon_i$, and recalling that $m$ is odd, we find that
$$K_{\mathrm{I}}\Big(\sum_{1 \le i \le d} \varepsilon_i X_i^m\Big) = K_{\mathrm{I}}\Big(\sum_{1 \le i \le d} X_i^m\Big).$$
We now define the functions
$$g_i := X_i^m - \mathrm{I}(X_i^m) \quad\text{for } 1 \le i \le d.$$
It follows from (1.12) that
$$K_{\mathrm{I}}\Big(\sum_{1 \le i \le d} X_i^m\Big) = \inf_{\prod_{1 \le i \le d} a_i = 1} \Big\| \sum_{1 \le i \le d} a_i g_i \Big\|_{L^p(I^d)},$$
where the infimum is taken over all $d$-vectors of positive reals of product $1$. Let us consider such a $d$-vector $(a_1, \cdots, a_d)$, and a permutation $\sigma$ of the set $\{1, \cdots, d\}$. The $H_\sigma$ hypothesis implies that the quantity
$$\Big\| \sum_{1 \le i \le d} a_{\sigma(i)} g_i \Big\|_{L^p(I^d)}$$
is independent of $\sigma$. Hence, summing over all permutations, we obtain using the triangle inequality
$$\Big\| \sum_{1 \le i \le d} a_i g_i \Big\|_{L^p(I^d)} = \frac{1}{d!} \sum_\sigma \Big\| \sum_{1 \le i \le d} a_{\sigma(i)} g_i \Big\|_{L^p(I^d)} \ge \frac{1}{d!} \Big\| \sum_{1 \le i \le d} \Big(\sum_\sigma a_{\sigma(i)}\Big) g_i \Big\|_{L^p(I^d)} = \frac{1}{d} \Big\| \sum_{1 \le i \le d} g_i \Big\|_{L^p(I^d)} \sum_{1 \le i \le d} a_i. \tag{1.23}$$
The right-hand side is minimal when $a_1 = \cdots = a_d = 1$, which shows that
$$\Big\| \sum_{1 \le i \le d} a_i g_i \Big\|_{L^p(I^d)} \ge \Big\| \sum_{1 \le i \le d} g_i \Big\|_{L^p(I^d)} = C(p),$$
with equality when $a_i = 1$ for all $i$. Note as a corollary that
$$K_{\mathrm{I}}(\pi_\varepsilon) = \|\pi_\varepsilon - \mathrm{I}(\pi_\varepsilon)\|_{L^p(I^d)} = C(p) \quad\text{where}\quad \pi_\varepsilon = \sum_{1 \le i \le d} \varepsilon_i X_i^m. \tag{1.24}$$
It remains to prove that $C(p) > 0$. Using the hypothesis $H_\pm$, we find that for all $\mu_i \in \{\pm 1\}$ we have
$$\Big\| \sum_{1 \le i \le d} \mu_i g_i \Big\|_{L^p(I^d)} = C(p).$$
In particular, for any $1 \le i_0 \le d$ one has
$$2 \|g_{i_0}\|_{L^p(I^d)} \le \Big\| \sum_{1 \le i \le d} g_i \Big\|_{L^p(I^d)} + \Big\| 2 g_{i_0} - \sum_{1 \le i \le d} g_i \Big\|_{L^p(I^d)} \le 2\, C(p).$$
If $C(p) = 0$, it follows that $g_{i_0} = 0$ and therefore that $X_{i_0}^m = \mathrm{I}(X_{i_0}^m)$, for any $1 \le i_0 \le d$. Using the assumption $H_*$, we find that the projection operator $\mathrm{I}$ reproduces all the polynomials of degree $m = k + 1$, which contradicts the definition (1.8) of the integer $k$. $\diamond$

Proof of Proposition 1.2.8
We define $\lambda_i$, $\pi_*$ and $\varepsilon_i \in \{\pm 1\}$ as before and we find, using similar reasoning, that
$$K_{\mathrm{I}}(\pi) = K_*(\pi)\, K_{\mathrm{I}}\Big(\sum_{1 \le i \le d} \varepsilon_i X_i^m\Big).$$
For $0 \le s \le d$ we define
$$C(p, s) := K_{\mathrm{I}}\Big(\sum_{1 \le i \le s} X_i^m - \sum_{s+1 \le i \le d} X_i^m\Big).$$
From the hypothesis $H_\sigma$ it follows that $K_{\mathrm{I}}(\pi) = K_*(\pi)\, C(p, s(\pi))$. Using again $H_\sigma$ and the fact that $K_{\mathrm{I}}(\pi) = K_{\mathrm{I}}(-\pi)$ for all $\pi \in \mathbb{H}_m$, we find that
$$C(p, s) = K_{\mathrm{I}}\Big(\sum_{1 \le i \le s} X_i^m - \sum_{s+1 \le i \le d} X_i^m\Big) = K_{\mathrm{I}}\Big(-\Big(\sum_{1 \le i \le d-s} X_i^m - \sum_{d-s+1 \le i \le d} X_i^m\Big)\Big) = C(p, d - s).$$
We define $g_i := X_i^m - \mathrm{I}(X_i^m)$, as in the proof of Proposition 1.2.7. We obtain the expression for $C(p, 0)$ by summing over all permutations as in (1.23):
$$C(p, 0) = \Big\| \sum_{1 \le i \le d} g_i \Big\|_{L^p(I^d)}.$$
This concludes the proof of the first part of Proposition 1.2.8. We now prove that $C(p, s) > 0$ for all $1 \le p \le \infty$ and all $s \in \{0, \cdots, d\}$. To this end we define the following quantity on $\mathbb{R}^d$:
$$\|a\|_K := \Big\| \sum_{1 \le i \le d} a_i g_i \Big\|_{L^p(I^d)}.$$
Note that $\|a\|_K = 0$ if and only if
$$\sum_{1 \le i \le d} a_i X_i^m = \sum_{1 \le i \le d} a_i\, \mathrm{I}(X_i^m),$$
and the hypothesis $H_{**}$ precisely states that this equality occurs if and only if $a_i = 0$ for all $1 \le i \le d$. Hence, $\|\cdot\|_K$ is a norm on $\mathbb{R}^d$. Furthermore, let
$$E_s := \Big\{ a \in \mathbb{R}_+^s \times \mathbb{R}_-^{d-s} \,;\, \prod_{1 \le i \le d} |a_i| = 1 \Big\}.$$
Then
$$C(p, s) = \inf_{a \in E_s} \|a\|_K.$$
Since $E_s$ is a closed subset of $\mathbb{R}^d$ which does not contain the origin, this infimum is attained. It follows that $C(p, s) > 0$, and that there exists a rectangle $R_\varepsilon$ of unit volume such that
$$K_{\mathrm{I}}(\pi_\varepsilon) = \|\pi_\varepsilon - \mathrm{I}_{R_\varepsilon} \pi_\varepsilon\|_{L^p(R_\varepsilon)} = C(p, s(\pi_\varepsilon)) \quad\text{where}\quad \pi_\varepsilon = \sum_{1 \le i \le d} \varepsilon_i X_i^m, \tag{1.25}$$
which concludes the proof of this proposition. $\diamond$

1.3 Proof of the approximation results

In this section, let the block $R_0$, the integer $m$, the function $f \in C^m(R_0)$ and the exponent $p$ be fixed. We conduct our proofs for $1 \le p < \infty$ and provide comments on how to adjust our arguments for the case $p = \infty$.

For each $x \in R_0$ we denote by $\mu_x \in \mathbb{P}_m$ the $m$-th degree Taylor polynomial of $f$ at the point $x$,
$$\mu_x = \mu_x(X) := \sum_{|\alpha| \le m} \frac{\partial^{|\alpha|} f}{\partial x^\alpha}(x)\, \frac{(X - x)^\alpha}{\alpha!}, \tag{1.26}$$
and we define $\pi_x \in \mathbb{H}_m$ to be the homogeneous component of degree $m$ in $\mu_x$,
$$\pi_x = \pi_x(X) := \sum_{|\alpha| = m} \frac{\partial^m f}{\partial x^\alpha}(x)\, \frac{X^\alpha}{\alpha!}. \tag{1.27}$$
Since $\pi_x$ and $\mu_x$ are polynomials of degree $m$, their $m$-th derivative is constant, and clearly $d^m \pi_x = d^m \mu_x = d^m f(x)$. In particular, for any $x \in R_0$ the polynomial $\mu_x - \pi_x$ belongs to $\mathbb{P}_k$ (recall that $k = m - 1$) and is therefore reproduced by the projection operator $\mathrm{I}$. It follows that for any $x \in R_0$ and any block $R$
$$\pi_x - \mathrm{I}_R\, \pi_x = \mu_x - \mathrm{I}_R\, \mu_x. \tag{1.28}$$
In addition, we introduce a measure $\rho$ of the degeneracy of a block $R$:
$$\rho(R) := \frac{\operatorname{diam}(R)^d}{|R|}.$$
Given any function $g \in C^m(R)$ and any $x \in R$ we can define, similarly to (1.27), a polynomial $\hat\pi_x \in \mathbb{H}_m$ associated to $g$ at $x$. We then define
$$\|d^m g\|_{L^\infty(R)} := \sup_{x \in R} \Big( \sup_{|u| = 1} |\hat\pi_x(u)| \Big). \tag{1.29}$$

Proposition 1.3.1.
There exists a constant $C = C(m, d) > 0$ such that for any block $R$ and any function $g \in C^m(R)$
$$\|g - \mathrm{I}_R\, g\|_{L^p(R)} \le C\, |R|^{\frac{1}{\tau}}\, \rho(R)^{\frac{m}{d}}\, \|d^m g\|_{L^\infty(R)}. \tag{1.30}$$
Proof:
Let $x_0 \in R$ and let $g_0$ be the Taylor polynomial of $g$ of degree $m - 1$ at $x_0$, which is defined as follows:
$$g_0(X) := \sum_{|\alpha| \le m-1} \frac{\partial^{|\alpha|} g}{\partial x^\alpha}(x_0)\, \frac{(X - x_0)^\alpha}{\alpha!}.$$
Let $x \in R$ and let $x(t) = x_0 + t(x - x_0)$. We have
$$g(x) = g_0(x) + \int_{t=0}^{1} d^m g_{x(t)}(x - x_0)\, \frac{(1-t)^m}{m!}\, dt.$$
Hence,
$$|g(x) - g_0(x)| \le \int_{t=0}^{1} \|d^m g\|_{L^\infty(R)}\, |x - x_0|^m\, \frac{(1-t)^m}{m!}\, dt \le \frac{1}{(m+1)!}\, \|d^m g\|_{L^\infty(R)} \operatorname{diam}(R)^m. \tag{1.31}$$
Since $g_0$ is a polynomial of degree at most $m - 1$, we have $g_0 = \mathrm{I}_R\, g_0$. Hence,
$$\|g - \mathrm{I}_R\, g\|_{L^p(R)} \le |R|^{\frac{1}{p}} \|g - \mathrm{I}_R\, g\|_{L^\infty(R)} = |R|^{\frac{1}{p}} \|(g - g_0) - \mathrm{I}_R (g - g_0)\|_{L^\infty(R)} \le (1 + C_{\mathrm{I}})\, |R|^{\frac{1}{p}} \|g - g_0\|_{L^\infty(R)},$$
where $C_{\mathrm{I}}$ is the operator norm of $\mathrm{I} : V \to V$. Combining this estimate with (1.31) and the identity $|R|^{\frac{1}{p}} \operatorname{diam}(R)^m = |R|^{\frac{1}{\tau}} \rho(R)^{\frac{m}{d}}$, we obtain (1.30), which concludes the proof. $\diamond$

The following lemma allows us to bound the interpolation error of $f$ on the block $R$ from below.

Lemma 1.3.2.
For any block $R \subset R_0$ and $x \in R$ we have
$$\|f - \mathrm{I}_R f\|_{L^p(R)} \ge |R|^{\frac{1}{\tau}} \Big( K_{\mathrm{I}}(\pi_x) - \omega(\operatorname{diam} R)\, \rho(R)^{\frac{m}{d}} \Big),$$
where the function $\omega$ is positive, depends only on $f$ and $m$, and satisfies $\omega(\delta) \to 0$ as $\delta \to 0$.
Proof:
Let $h := f - \mu_x$, where $\mu_x$ is defined in (1.26). Using (1.28) and (1.11), we obtain
$$\|f - \mathrm{I}_R f\|_{L^p(R)} \ge \|\pi_x - \mathrm{I}_R\, \pi_x\|_{L^p(R)} - \|h - \mathrm{I}_R h\|_{L^p(R)} \ge |R|^{\frac{1}{\tau}} K_{\mathrm{I}}(\pi_x) - \|h - \mathrm{I}_R h\|_{L^p(R)},$$
and according to Proposition 1.3.1 we have
$$\|h - \mathrm{I}_R h\|_{L^p(R)} \le C\, |R|^{\frac{1}{\tau}} \rho(R)^{\frac{m}{d}} \|d^m h\|_{L^\infty(R)}.$$
Observe that
$$\|d^m h\|_{L^\infty(R)} = \|d^m f - d^m \pi_x\|_{L^\infty(R)} = \|d^m f - d^m f(x)\|_{L^\infty(R)}.$$
We introduce the modulus of continuity $\omega_*$ of the $m$-th derivatives of $f$:
$$\omega_*(r) := \sup_{\substack{x_1, x_2 \in R_0 \\ |x_1 - x_2| \le r}} \|d^m f(x_1) - d^m f(x_2)\| = \sup_{\substack{x_1, x_2 \in R_0 \\ |x_1 - x_2| \le r}} \Big( \sup_{|u| \le 1} |\pi_{x_1}(u) - \pi_{x_2}(u)| \Big). \tag{1.32}$$
By setting $\omega = C \omega_*$ we conclude the proof of this lemma. $\diamond$

Let $(\mathcal{R}_N)_{N \ge 1}$ be an admissible sequence of block partitions. For all $N \ge 1$, $R \in \mathcal{R}_N$ and $x \in \operatorname{int}(R)$, we define
$$\phi_N(x) := |R| \quad\text{and}\quad \psi_N(x) := \Big( K_{\mathrm{I}}(\pi_x) - \omega(\operatorname{diam}(R))\, \rho(R)^{\frac{m}{d}} \Big)_+,$$
where $\lambda_+ := \max\{\lambda, 0\}$. We now apply Hölder's inequality $\int_{R_0} f_1 f_2 \le \|f_1\|_{L^{p_1}(R_0)} \|f_2\|_{L^{p_2}(R_0)}$ with the functions
$$f_1 = \phi_N^{\frac{m\tau}{d}} \psi_N^\tau \quad\text{and}\quad f_2 = \phi_N^{-\frac{m\tau}{d}}$$
and the exponents $p_1 = \frac{p}{\tau}$ and $p_2 = \frac{d}{m\tau}$. Note that $\frac{1}{p_1} + \frac{1}{p_2} = \tau \big( \frac{1}{p} + \frac{m}{d} \big) = 1$. Hence,
$$\int_{R_0} \psi_N^\tau \le \Big( \int_{R_0} \phi_N^{\frac{mp}{d}} \psi_N^p \Big)^{\frac{\tau}{p}} \Big( \int_{R_0} \phi_N^{-1} \Big)^{\frac{m\tau}{d}}. \tag{1.33}$$
Note that $\int_{R_0} \phi_N^{-1} = \#(\mathcal{R}_N) \le N$. Furthermore, if $R \in \mathcal{R}_N$ and $x \in \operatorname{int}(R)$ then according to Lemma 1.3.2
$$\phi_N(x)^{\frac{m}{d}} \psi_N(x) = |R|^{\frac{1}{\tau} - \frac{1}{p}} \psi_N(x) \le |R|^{-\frac{1}{p}} \|f - \mathrm{I}_R f\|_{L^p(R)}.$$
Hence,
$$\Big[ \int_{R_0} \phi_N^{\frac{mp}{d}} \psi_N^p \Big]^{\frac{1}{p}} \le \Big[ \sum_{R \in \mathcal{R}_N} \frac{1}{|R|} \int_R \|f - \mathrm{I}_R f\|^p_{L^p(R)} \Big]^{\frac{1}{p}} = \|f - \mathrm{I}_{\mathcal{R}_N} f\|_{L^p(R_0)}. \tag{1.34}$$
Inequality (1.33) therefore leads to
$$\|\psi_N\|_{L^\tau(R_0)} \le \|f - \mathrm{I}_{\mathcal{R}_N} f\|_{L^p(R_0)}\, N^{\frac{m}{d}}. \tag{1.35}$$
Since the sequence $(\mathcal{R}_N)_{N \ge 1}$ is admissible, there exists a constant $C_A > 0$ such that for all $N$ and all $R \in \mathcal{R}_N$ we have $\operatorname{diam}(R) \le C_A N^{-\frac{1}{d}}$. We introduce a subset $\mathcal{R}'_N \subset \mathcal{R}_N$ which collects the most degenerate blocks,
$$\mathcal{R}'_N = \big\{ R \in \mathcal{R}_N \,;\, \rho(R) \ge \omega(C_A N^{-\frac{1}{d}})^{-\frac{1}{m}} \big\},$$
where $\omega$ is the function defined in Lemma 1.3.2. By $R'_N$ we denote the portion of $R_0$ covered by $\mathcal{R}'_N$. For all $x \in R_0 \setminus R'_N$ we obtain
$$\psi_N(x) \ge K_{\mathrm{I}}(\pi_x) - \omega(C_A N^{-\frac{1}{d}})^{1 - \frac{1}{d}}.$$
We define $\varepsilon_N := \omega(C_A N^{-\frac{1}{d}})^{1 - \frac{1}{d}}$ and we notice that $\varepsilon_N \to 0$ as $N \to \infty$. Hence,
$$\|\psi_N\|^\tau_{L^\tau(R_0)} \ge \big\| (K_{\mathrm{I}}(\pi_x) - \varepsilon_N)_+ \big\|^\tau_{L^\tau(R_0 \setminus R'_N)} \ge \big\| (K_{\mathrm{I}}(\pi_x) - \varepsilon_N)_+ \big\|^\tau_{L^\tau(R_0)} - C^\tau |R'_N|,$$
where $C := \max_{x \in R_0} K_{\mathrm{I}}(\pi_x)$. The last expression involves a slight abuse of notation, since $(K_{\mathrm{I}}(\pi_x) - \varepsilon_N)_+$ stands for the function $x \in R_0 \mapsto (K_{\mathrm{I}}(\pi_x) - \varepsilon_N)_+ \in \mathbb{R}$. Next we observe that $|R'_N| \to 0$ as $N \to +\infty$: indeed for all $R \in \mathcal{R}'_N$ we have
$$|R| = \operatorname{diam}(R)^d \rho(R)^{-1} \le C_A^d N^{-1} \omega(C_A N^{-\frac{1}{d}})^{\frac{1}{m}}.$$
Since $\#(\mathcal{R}'_N) \le N$, we obtain $|R'_N| \le C_A^d\, \omega(C_A N^{-\frac{1}{d}})^{\frac{1}{m}}$, and the right-hand side tends to $0$ as $N \to \infty$. We thus obtain
$$\liminf_{N \to \infty} \|\psi_N\|_{L^\tau(R_0)} \ge \lim_{N \to \infty} \big\| (K_{\mathrm{I}}(\pi_x) - \varepsilon_N)_+ \big\|_{L^\tau(R_0)} = \|K_{\mathrm{I}}(\pi_x)\|_{L^\tau(R_0)}.$$
Combining this result with (1.35), we conclude the proof of the announced estimate.

Note that this proof also works with the exponent $p = \infty$ by changing
$$\Big( \int_{R_0} \phi_N^{\frac{mp}{d}} \psi_N^p \Big)^{\frac{\tau}{p}} \quad\text{into}\quad \|\phi_N^{\frac{m}{d}} \psi_N\|^\tau_{L^\infty(R_0)}$$
in (1.33) and performing the standard modification in (1.34).

Remark 1.3.3.
As announced in Remark 1.1.6, this proof can be adapted to the weighted norm $\|\cdot\|_{L^p(R_0, \Omega)}$ associated to a positive weight function $\Omega \in C^0(R_0)$ and defined in (1.20). For that purpose let $r_N := \sup\{\operatorname{diam}(R) \,;\, R \in \mathcal{R}_N\}$ and let
$$\Omega_N(x) := \inf_{\substack{x' \in R_0 \\ |x - x'| \le r_N}} \Omega(x').$$
The sequence of functions $\Omega_N$ increases with $N$ and tends uniformly to $\Omega$ as $N \to \infty$. If $R \in \mathcal{R}_N$ and $x \in R$, then
$$\|f - \mathrm{I}_R f\|_{L^p(R, \Omega)} \ge \Omega_N(x)\, \|f - \mathrm{I}_R f\|_{L^p(R)}.$$
The main change in the proof is that the function $\psi_N$ should be replaced with $\psi'_N := \Omega_N \psi_N$. Other details are left to the reader. $\diamond$

The proof of Theorems 1.1.5 (and 1.1.7) is based on the actual construction of an asymptotically optimal sequence of block partitions. To that end we introduce the notion of a local block specification.
Definition 1.3.4 (local block specification). A local block specification on a block $R_0$ is a (possibly discontinuous) map $x \mapsto R(x)$ which associates to each point $x \in R_0$ a block $R(x)$, and such that
– The volume $|R(x)|$ is a positive continuous function of the variable $x \in R_0$.
– The diameter is bounded: $\sup\{\operatorname{diam}(R(x)) \,;\, x \in R_0\} < \infty$.

The following lemma provides block partitions of $R_0$ adapted, in a certain sense, to a given local block specification.

Lemma 1.3.5.
Let $R_0$ be a block in $\mathbb{R}^d$ and let $x \mapsto R(x)$ be a local block specification on $R_0$. Then there exists a sequence $(\mathcal{P}_n)_{n \ge 1}$ of block partitions of $R_0$, $\mathcal{P}_n = \mathcal{P}_n^1 \cup \mathcal{P}_n^2$, satisfying the following properties.
– (The number of blocks in $\mathcal{P}_n$ is asymptotically controlled)
$$\lim_{n \to \infty} \frac{\#(\mathcal{P}_n)}{n^{2d}} = \int_{R_0} |R(x)|^{-1}\, dx. \tag{1.36}$$
– (The elements of $\mathcal{P}_n^1$ follow the block specifications) For each $R \in \mathcal{P}_n^1$ there exists $y \in R_0$ such that
$$R \text{ is a translate of } n^{-2} R(y), \quad\text{and}\quad |x - y| \le \frac{\operatorname{diam}(R_0)}{n} \text{ for all } x \in R. \tag{1.37}$$
– (The elements of $\mathcal{P}_n^2$ have a small diameter)
$$\lim_{n \to \infty} \Big( n^2 \sup_{R \in \mathcal{P}_n^2} \operatorname{diam}(R) \Big) = 0. \tag{1.38}$$
Proof:
See Appendix. $\diamond$
We recall that the block $R_0$, the exponent $p$ and the function $f \in C^m(R_0)$ are fixed, and that at each point $x \in R_0$ the polynomial $\pi_x \in \mathbb{H}_m$ is defined by (1.27). The sequence of block partitions described in the previous lemma is now used to obtain an asymptotic error estimate.

Lemma 1.3.6.
Let $x \mapsto R(x)$ be a local block specification such that for all $x \in R_0$
$$\|\pi_x - \mathrm{I}_{R(x)}(\pi_x)\|_{L^p(R(x))} \le 1. \tag{1.39}$$
Let $(\mathcal{P}_n)_{n \ge 1}$ be a sequence of block partitions satisfying the properties of Lemma 1.3.5, and let, for all sufficiently large $N$,
$$n(N) := \max\{n \ge 1 \,;\, \#(\mathcal{P}_n) \le N\}.$$
Then $\mathcal{R}_N := \mathcal{P}_{n(N)}$ is an admissible sequence of block partitions and
$$\limsup_{N \to \infty} N^{\frac{m}{d}} \|f - \mathrm{I}_{\mathcal{R}_N} f\|_{L^p(R_0)} \le \Big( \int_{R_0} |R(x)|^{-1}\, dx \Big)^{\frac{1}{\tau}}. \tag{1.40}$$
Proof:
Let $n \ge 1$ and $R \in \mathcal{P}_n$. If $R \in \mathcal{P}_n^1$ then let $y \in R_0$ be as in (1.37). Using Proposition 1.3.1 and (1.11) we find
$$\begin{aligned}
\|f - \mathrm{I}_R f\|_{L^p(R)} &\le \|\pi_y - \mathrm{I}_R\, \pi_y\|_{L^p(R)} + \|(f - \pi_y) - \mathrm{I}_R (f - \pi_y)\|_{L^p(R)} \\
&\le n^{-\frac{2d}{\tau}} \|\pi_y - \mathrm{I}_{R(y)}\, \pi_y\|_{L^p(R(y))} + C |R|^{\frac{1}{p}} \operatorname{diam}(R)^m \|d^m f - d^m \pi_y\|_{L^\infty(R)} \\
&\le n^{-\frac{2d}{\tau}} + C n^{-\frac{2d}{\tau}} |R(y)|^{\frac{1}{p}} \operatorname{diam}(R(y))^m \|d^m f - d^m f(y)\|_{L^\infty(R)} \\
&\le n^{-\frac{2d}{\tau}} \big( 1 + C' \omega_*(n^{-1} \operatorname{diam}(R_0)) \big),
\end{aligned}$$
where we defined $C' := C \sup_{y \in R_0} |R(y)|^{\frac{1}{p}} \operatorname{diam}(R(y))^m$, which is finite by Definition 1.3.4. We denoted by $\omega_*$ the modulus of continuity of the $m$-th derivatives of $f$, which is defined in (1.32). We now define for all $n \ge 1$
$$\delta_n := n^2 \sup_{R \in \mathcal{P}_n^2} \operatorname{diam}(R).$$
According to (1.38) one has $\delta_n \to 0$ as $n \to \infty$. If $R \in \mathcal{P}_n^2$, then $\operatorname{diam}(R) \le n^{-2} \delta_n$ and therefore $|R| \le \operatorname{diam}(R)^d \le n^{-2d} \delta_n^d$. Using again (1.30), and recalling that $\frac{1}{\tau} = \frac{m}{d} + \frac{1}{p}$, we find
$$\|f - \mathrm{I}_R f\|_{L^p(R)} \le C |R|^{\frac{1}{p}} \operatorname{diam}(R)^m \|d^m f\|_{L^\infty(R_0)} \le C'' n^{-\frac{2d}{\tau}} \delta_n^{\frac{d}{\tau}},$$
where $C'' = C \|d^m f\|_{L^\infty(R_0)}$. From the previous observations it follows that
$$\|f - \mathrm{I}_{\mathcal{P}_n} f\|_{L^p(R_0)} \le \#(\mathcal{P}_n)^{\frac{1}{p}} \max_{R \in \mathcal{P}_n} \|f - \mathrm{I}_R f\|_{L^p(R)} \le \#(\mathcal{P}_n)^{\frac{1}{p}} n^{-\frac{2d}{\tau}} \max\big\{ 1 + C' \omega_*(n^{-1} \operatorname{diam}(R_0)),\ C'' \delta_n^{\frac{d}{\tau}} \big\}.$$
Hence,
$$\limsup_{n \to \infty} \#(\mathcal{P}_n)^{-\frac{1}{p}} n^{\frac{2d}{\tau}} \|f - \mathrm{I}_{\mathcal{P}_n} f\|_{L^p(R_0)} \le 1.$$
Combining the last equation with (1.36), we obtain
$$\limsup_{n \to \infty} \#(\mathcal{P}_n)^{\frac{m}{d}} \|f - \mathrm{I}_{\mathcal{P}_n} f\|_{L^p(R_0)} \le \Big( \int_{R_0} |R(x)|^{-1}\, dx \Big)^{\frac{1}{\tau}}.$$
The sequence of block partitions $\mathcal{R}_N := \mathcal{P}_{n(N)}$ clearly satisfies $\#(\mathcal{R}_N)/N \to 1$ as $N \to \infty$ and therefore leads to the announced equation (1.40).
Furthermore, it follows from the boundedness of $\operatorname{diam}(R(x))$ on $R_0$ and the properties of $\mathcal{P}_n$ described in Lemma 1.3.5 that
$$\sup_{n \ge 1} \Big( \#(\mathcal{P}_n)^{\frac{1}{d}} \sup_{R \in \mathcal{P}_n} \operatorname{diam}(R) \Big) < \infty,$$
which implies that $\mathcal{R}_N$ is an admissible sequence of partitions. $\diamond$

We now choose adequate local block specifications in order to obtain the estimates announced in Theorems 1.1.5 and 1.1.7. For any $M \ge \operatorname{diam}(I^d) = \sqrt{d}$ we define the modified shape function
$$K_M(\pi) := \inf_{|R| = 1,\ \operatorname{diam}(R) \le M} \|\pi - \mathrm{I}_R\, \pi\|_{L^p(R)}, \tag{1.41}$$
where the infimum is taken over blocks of unit volume and diameter at most $M$. It follows from a compactness argument that this infimum is attained and that $K_M$ is a continuous function on $\mathbb{H}_m$. Furthermore, for any fixed $\pi \in \mathbb{H}_m$, $M \mapsto K_M(\pi)$ is a decreasing function of $M$ which tends to $K_{\mathrm{I}}(\pi)$ as $M \to \infty$.

For each $x \in R_0$ we denote by $R^*_M(x)$ a block which realises the infimum in $K_M(\pi_x)$. Hence,
$$|R^*_M(x)| = 1, \quad \operatorname{diam}(R^*_M(x)) \le M, \quad\text{and}\quad K_M(\pi_x) = \|\pi_x - \mathrm{I}_{R^*_M(x)}\, \pi_x\|_{L^p(R^*_M(x))}.$$
We define a local block specification on $R_0$ as follows:
$$R_M(x) := (K_M(\pi_x) + M^{-1})^{-\frac{\tau}{d}} R^*_M(x). \tag{1.42}$$
Using a change of variables and the homogeneity of $\pi_x$, as in (1.11), we observe that
$$\|\pi_x - \mathrm{I}_{R_M(x)}\, \pi_x\|_{L^p(R_M(x))} = K_M(\pi_x) (K_M(\pi_x) + M^{-1})^{-1} \le 1.$$
Hence, according to Lemma 1.3.6, there exists a sequence $(\mathcal{R}^M_N)_{N \ge 1}$ of block partitions of $R_0$ such that
$$\limsup_{N \to \infty} N^{\frac{m}{d}} \|f - \mathrm{I}_{\mathcal{R}^M_N} f\|_{L^p(R_0)} \le \|K_M(\pi_x) + M^{-1}\|_{L^\tau(R_0)}.$$
Using our previous observations on the function $K_M$, we see that
$$\lim_{M \to \infty} \|K_M(\pi_x) + M^{-1}\|_{L^\tau(R_0)} = \|K_{\mathrm{I}}(\pi_x)\|_{L^\tau(R_0)}.$$
Hence, given $\varepsilon > 0$ we can choose $M(\varepsilon)$ large enough in such a way that
$$\|K_{M(\varepsilon)}(\pi_x) + M(\varepsilon)^{-1}\|_{L^\tau(R_0)} \le \|K_{\mathrm{I}}(\pi_x)\|_{L^\tau(R_0)} + \varepsilon,$$
which concludes the proof of the estimate (1.19) of Theorem 1.1.5. For each $N$ let $M(N)$ be such that
$$N^{\frac{m}{d}} \|f - \mathrm{I}_{\mathcal{R}^{M(N)}_N} f\|_{L^p(R_0)} \le \|K_{M(N)}(\pi_x) + M(N)^{-1}\|_{L^\tau(R_0)} + M(N)^{-1}$$
and $M(N) \to \infty$ as $N \to \infty$. Then the (perhaps non admissible) sequence of block partitions $\mathcal{R}_N := \mathcal{R}^{M(N)}_N$ satisfies (1.18), which concludes the proof of Theorem 1.1.5. $\diamond$

We now turn to the proof of Theorem 1.1.7, which follows the same scheme for the most part. There exist $d$ functions $\lambda_1(x), \cdots, \lambda_d(x) \in C^0(R_0)$, and a function $x \mapsto \pi_*(x) \in \mathbb{P}_k^*$ such that for all $x \in R_0$ we have
$$\pi_x = \sum_{1 \le i \le d} \lambda_i(x) X_i^m + \pi_*(x).$$
The hypotheses of Theorem 1.1.7 state that $K_{\mathrm{I}}\big(\frac{d^m f(x)}{m!}\big) = K_{\mathrm{I}}(\pi_x) \ne 0$ for all $x \in R_0$. It follows from Propositions 1.2.7 and 1.2.8 that the product $\lambda_1(x) \cdots \lambda_d(x)$ is nonzero for all $x \in R_0$. We denote by $\varepsilon_i \in \{\pm 1\}$ the sign of $\lambda_i$, which is therefore constant over the block $R_0$, and we define
$$\pi_\varepsilon := \sum_{1 \le i \le d} \varepsilon_i X_i^m.$$
The proofs of Propositions 1.2.7 and 1.2.8 show that there exists a block $R_\varepsilon$, satisfying $|R_\varepsilon| = 1$, and such that $K_{\mathrm{I}}(\pi_\varepsilon) = \|\pi_\varepsilon - \mathrm{I}_{R_\varepsilon}\, \pi_\varepsilon\|_{L^p(R_\varepsilon)}$. By $D(x)$ we denote the diagonal matrix of entries $|\lambda_1(x)|, \cdots, |\lambda_d(x)|$, and we define
$$\phi_x := (\det D(x))^{\frac{1}{md}} D(x)^{-\frac{1}{m}}.$$
Clearly $\det \phi_x = 1$ and
$$\pi_x \circ \phi_x = (\det D(x))^{\frac{1}{d}} \pi_\varepsilon + \pi_*(x) \circ \phi_x,$$
and $\pi_*(x) \circ \phi_x \in \mathbb{P}_k^*$. Hence using (1.5) we obtain
$$\begin{aligned}
\|\pi_x - \mathrm{I}_{\phi_x(R_\varepsilon)}\, \pi_x\|_{L^p(\phi_x(R_\varepsilon))} &= \|\pi_x \circ \phi_x - \mathrm{I}_{R_\varepsilon}(\pi_x \circ \phi_x)\|_{L^p(R_\varepsilon)} \\
&= (\det D(x))^{\frac{1}{d}} \|\pi_\varepsilon - \mathrm{I}_{R_\varepsilon}\, \pi_\varepsilon\|_{L^p(R_\varepsilon)} \\
&= (\det D(x))^{\frac{1}{d}} K_{\mathrm{I}}(\pi_\varepsilon) \\
&= K_{\mathrm{I}}(\pi_x).
\end{aligned}$$
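The normalising map $\phi_x$ is diagonal, so its two stated properties can be verified on its diagonal entries alone. The following sketch assumes hypothetical coefficient values $\lambda_i(x)$ at one point $x$ and a hypothetical helper name `normalising_map`; it checks that $\det \phi_x = 1$ and that the pure-power coefficients of $\pi_x \circ \phi_x$ equal $(\det D(x))^{1/d} \varepsilon_i$.

```python
import math

def normalising_map(lams, m):
    """Diagonal entries of phi_x = (det D)^(1/(m d)) * D^(-1/m), with D = diag(|lambda_i|)."""
    d = len(lams)
    det_d = math.prod(abs(lam) for lam in lams)
    scale = det_d ** (1.0 / (m * d))
    return [scale * abs(lam) ** (-1.0 / m) for lam in lams]

# Hypothetical pure-power coefficients lambda_i(x) at one point x, with d = 3 and m = 3.
lams, m = [4.0, -0.25, 2.0], 3
phi = normalising_map(lams, m)

# Determinant of the diagonal map phi_x.
det_phi = math.prod(phi)

# Composing X_i -> phi_i X_i multiplies the coefficient of X_i^m by phi_i^m.
coeffs = [lam * p ** m for lam, p in zip(lams, phi)]

# Expected common modulus (det D)^(1/d) and expected signs epsilon_i.
modulus = math.prod(abs(lam) for lam in lams) ** (1.0 / len(lams))
signs = [1.0 if lam > 0 else -1.0 for lam in lams]
```

Only the pure-power part is tracked here; the remainder $\pi_*(x) \circ \phi_x$ stays in $\mathbb{P}_k^*$ because a diagonal change of variables maps each monomial to a multiple of itself.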
We then define the local block specification
$$R(x) := K_{\mathrm{I}}(\pi_x)^{-\frac{\tau}{d}} \phi_x(R_\varepsilon), \tag{1.43}$$
in such a way that $\|\pi_x - \mathrm{I}_{R(x)}\, \pi_x\|_{L^p(R(x))} = 1$ for all $x \in R_0$, using the homogeneity of $\pi_x$ and an isotropic change of variables. The admissible sequence $(\mathcal{R}_N)_{N \ge 1}$ of block partitions constructed in Lemma 1.3.6 then satisfies the optimal upper estimate (1.18), which concludes the proof of Theorem 1.1.7. $\diamond$

Remark 1.3.7 (Adaptation to weighted norms). Lemma 1.3.6 also holds if (1.39) is replaced with
$$\Omega(x)\, \|\pi_x - \mathrm{I}_{R(x)}(\pi_x)\|_{L^p(R(x))} \le 1$$
and if the $L^p(R_0)$ norm is replaced with the weighted $L^p(R_0, \Omega)$ norm in (1.40). Replacing the block $R_M(x)$ defined in (1.42) with $R'_M(x) := \Omega(x)^{-\frac{\tau}{d}} R_M(x)$, one can easily obtain the extension of Theorem 1.1.5 to weighted norms. Similarly, replacing $R(x)$ defined in (1.43) with $R'(x) := \Omega(x)^{-\frac{\tau}{d}} R(x)$, one obtains the extension of Theorem 1.1.7 to weighted norms.

1.4 Appendix: Proof of Lemma 1.3.5

By $\mathcal{Q}_n$ we denote the standard partition of $R_0 \subset \mathbb{R}^d$ into $n^d$ identical blocks of diameter $n^{-1} \operatorname{diam}(R_0)$, illustrated on the left in Figure 1.1. For each $Q \in \mathcal{Q}_n$ we denote by $x_Q$ the barycenter of $Q$ and we consider the tiling $\mathcal{T}_Q$ of $\mathbb{R}^d$ formed with the block $n^{-2} R(x_Q)$ and its translates. We define $\mathcal{P}_n^1(Q)$ and $\mathcal{P}_n^1$ as follows:
$$\mathcal{P}_n^1(Q) := \{R \in \mathcal{T}_Q \,;\, R \subset Q\} \quad\text{and}\quad \mathcal{P}_n^1 := \bigcup_{Q \in \mathcal{Q}_n} \mathcal{P}_n^1(Q).$$

Figure 1.1: (Left) the partition $\mathcal{Q}_n$ of $R_0$. (Right) the set of blocks $\mathcal{P}_n^1$ in green and the set of blocks $\mathcal{P}_n^*$ in red.

Comparing the areas, we obtain
$$\#(\mathcal{P}_n^1) = \sum_{Q \in \mathcal{Q}_n} \#(\mathcal{P}_n^1(Q)) \le \sum_{Q \in \mathcal{Q}_n} \frac{|Q|}{|n^{-2} R(x_Q)|} = n^{2d} \sum_{Q \in \mathcal{Q}_n} |Q|\, |R(x_Q)|^{-1}.$$
From this point, using the continuity of $x \mapsto |R(x)|$, one easily shows that $\frac{\#(\mathcal{P}_n^1)}{n^{2d}} \to \int_{R_0} |R(x)|^{-1}\, dx$ as $n \to \infty$. Furthermore, the property (1.37) clearly holds.
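The upper bound on $\#(\mathcal{P}_n^1)$, once divided by $n^{2d}$, is a midpoint Riemann sum of $|R(x)|^{-1}$ over the cells $Q \in \mathcal{Q}_n$, which is why continuity of the volume suffices. The sketch below illustrates this under assumptions that are not in the source: $R_0 = [0,1]^d$, and a hypothetical volume function $|R(x)| = 1 + x_1$, whose reciprocal integrates to $\ln 2$.

```python
import itertools
import math

def normalised_count_bound(volume, n, d):
    """Return sum over the n^d cells Q of [0,1]^d of |Q| / volume(x_Q).

    x_Q is the barycenter of the cell Q, and volume(x) plays the role of |R(x)|.
    This is the upper bound on #(P_n^1) divided by n^(2d).
    """
    cell_volume = 1.0 / n ** d
    total = 0.0
    for index in itertools.product(range(n), repeat=d):
        barycenter = [(i + 0.5) / n for i in index]
        total += cell_volume / volume(barycenter)
    return total

# Hypothetical continuous, positive volume function on the unit square.
volume = lambda x: 1.0 + x[0]

coarse = normalised_count_bound(volume, 5, 2)
fine = normalised_count_bound(volume, 50, 2)
limit = math.log(2.0)   # exact value of the integral of 1 / (1 + x_1) over [0,1]^2
```

The finer partition lands closer to the integral, in line with the limit (1.36); the blocks cut by cell boundaries are handled separately in the rest of the appendix.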
In order to construct $\mathcal{P}_n^2$, we first define two sets of blocks $\mathcal{P}_n^*(Q)$ and $\mathcal{P}_n^*$ as follows:
$$\mathcal{P}_n^*(Q) := \{R \cap Q \,;\, R \in \mathcal{T}_Q \text{ and } \operatorname{int}(R) \cap \partial Q \ne \emptyset\} \quad\text{and}\quad \mathcal{P}_n^* := \bigcup_{Q \in \mathcal{Q}_n} \mathcal{P}_n^*(Q).$$
Comparing the surface of $\partial Q$ with the dimensions of $R(x_Q)$, we find that $\#(\mathcal{P}_n^*(Q)) \le C n^{d-1}$, where $C$ is independent of $n$ and of $Q \in \mathcal{Q}_n$. Therefore, $\#(\mathcal{P}_n^*) \le C n^{2d-1}$. The set of blocks $\mathcal{P}_n^2$ is then obtained by subdividing each block of $\mathcal{P}_n^*$ into $o(n)$ (for instance, $\lfloor \ln(n) \rfloor^d$) identical sub-blocks, in such a way that $\#(\mathcal{P}_n^2)$ is $o(n^{2d})$ and that the requirement (1.38) is met.

Chapter 2

Sharp asymptotics of the $L^p$ interpolation error on optimal triangulations
In finite element approximation, a usual distinction is between uniform and adaptive methods. In the latter, the elements defining the mesh may vary strongly in size and shape for a better adaptation to the local features of the approximated function $f$. This naturally raises the objective of characterizing and constructing an optimal mesh for a given function $f$.

Note that depending on the context, the function $f$ may be fully known to us, either through an explicit formula or a discrete sampling; or observed through noisy measurements; or implicitly defined as the solution of a given partial differential equation.

In this chapter, we assume that $f$ is a function defined on a polygonal bounded domain $\Omega \subset \mathbb{R}^2$. For a given conforming triangulation $\mathcal{T}$ of $\Omega$ and an arbitrary but fixed integer $m > 1$, we denote by $\mathrm{I}^{m-1}_{\mathcal{T}}$ the standard interpolation operator on the space of Lagrange finite elements of degree $m - 1$ associated to $\mathcal{T}$. Given a norm $X$ of interest and a number $N > 0$, the objective of finding the optimal mesh for $f$ can be formulated as solving the optimization problem
$$\min_{\#(\mathcal{T}) \le N} \|f - \mathrm{I}^{m-1}_{\mathcal{T}} f\|_X,$$
where the minimum is taken over all conforming triangulations of cardinality at most $N$. We denote by $\mathcal{T}_N$ the minimizer of the above problem.

Our first objective is to establish sharp asymptotic error estimates that precisely describe the behavior of $\|f - \mathrm{I}^{m-1}_{\mathcal{T}_N} f\|_X$ as $N \to +\infty$. Estimates of that type were obtained in [4, 27] in the particular case of linear finite elements ($m - 1 = 1$) and $X = L^p$. They have the form
$$\limsup_{N \to +\infty} \Big( N \min_{\#(\mathcal{T}) \le N} \|f - \mathrm{I}^{1}_{\mathcal{T}} f\|_{L^p} \Big) \le C \Big\| \sqrt{|\det(d^2 f)|} \Big\|_{L^\tau}, \quad \frac{1}{\tau} := \frac{1}{p} + 1, \tag{2.1}$$
which reveals that the convergence rate is governed by the quantity $\sqrt{|\det(d^2 f)|}$, which depends nonlinearly on the Hessian $d^2 f$. This is heavily tied to the fact that we allow triangles with possibly highly anisotropic shape. In the present work, the polynomial degree $m - 1$ is arbitrary, and the corresponding quantity depends nonlinearly on the $m$-th order derivative $d^m f$.

Our second objective is to propose simple and practical ways of designing meshes which behave similarly to the optimal one, in the sense that they satisfy the sharp error estimate up to a fixed multiplicative constant.

We denote by $\mathbb{H}_m$ the space of homogeneous polynomials of degree $m$:
$$\mathbb{H}_m := \operatorname{Span}\{x^k y^l \,;\, k + l = m\}.$$
For any triangle $T$, we denote by $\mathrm{I}^{m-1}_T$ the local interpolation operator acting from $C^0(T)$ onto the space $\mathbb{P}_{m-1}$ of polynomials of total degree $m - 1$. The image of $v \in C^0(T)$ by this operator is defined by the conditions
$$\mathrm{I}^{m-1}_T v(\gamma) = v(\gamma)$$
for all points $\gamma \in T$ with barycentric coordinates in the set $\{0, \frac{1}{m-1}, \frac{2}{m-1}, \cdots, 1\}$. We denote by
$$e_{m,T}(v)_p := \|v - \mathrm{I}^{m-1}_T v\|_{L^p(T)}$$
the local interpolation error, measured in $L^p(T)$. We also denote by
$$e_{m,\mathcal{T}}(v)_p := \|v - \mathrm{I}^{m-1}_{\mathcal{T}} v\|_{L^p} = \Big( \sum_{T \in \mathcal{T}} e_{m,T}(v)_p^p \Big)^{\frac{1}{p}}$$
the global interpolation error for a given triangulation $\mathcal{T}$, with the standard modification if $p = \infty$.

A key ingredient in this chapter is a function defined by a shape optimization problem: for any fixed $1 \le p \le \infty$ and for any $\pi \in \mathbb{H}_m$, we define
$$K_{m,p}(\pi) := \inf_{|T| = 1} e_{m,T}(\pi)_p. \tag{2.2}$$
Here, the infimum is taken over all triangles of area $|T| = 1$. Note that from the homogeneity of $\pi$, we find that
$$\inf_{|T| = A} e_{m,T}(\pi)_p = K_{m,p}(\pi)\, A^{\frac{m}{2} + \frac{1}{p}}. \tag{2.3}$$
This optimization problem thus gives the shape of the triangles of a given area which is best adapted to the polynomial $\pi$ in the sense of minimizing the interpolation error measured in $L^p$. We refer to $K_{m,p}$ as the shape function. We discuss in § […]

Theorem 2.1.1.
For any bounded polygonal domain $\Omega \subset \mathbb{R}^2$ and any function $f \in C^m(\Omega)$, there exists a sequence of triangulations $(\mathcal{T}_N)_{N \ge N_0}$, conforming if $p < \infty$, with $\#(\mathcal{T}_N) \le N$, such that
$$\limsup_{N \to \infty} N^{\frac{m}{2}} e_{m,\mathcal{T}_N}(f)_p \le \Big\| K_{m,p}\Big( \frac{d^m f}{m!} \Big) \Big\|_{L^\tau(\Omega)}, \quad \frac{1}{\tau} := \frac{m}{2} + \frac{1}{p}. \tag{2.4}$$

An important feature of this estimate is the "lim sup" asymptotic operator. Recall that the upper limit of a sequence $(u_N)_{N \ge N_0}$ is defined by
$$\limsup_{N \to \infty} u_N := \lim_{N \to \infty} \sup_{n \ge N} u_n$$
and is in general strictly smaller than the supremum $\sup_{N \ge N_0} u_N$. It is still an open question to find an appropriate upper estimate for $\sup_{N \ge N_0} N^{\frac{m}{2}} e_{m,\mathcal{T}_N}(f)_p$ when optimally adapted anisotropic triangulations are used.

In the estimate (2.4), the $m$-th derivative $d^m f(z)$ is identified to a homogeneous polynomial in $\mathbb{H}_m$:
$$\frac{d^m f(z)}{m!} \sim \sum_{k + l = m} \frac{\partial^m f}{\partial x^k \partial y^l}(z)\, \frac{x^k}{k!} \frac{y^l}{l!}. \tag{2.5}$$
In order to illustrate the sharpness of (2.4), we introduce a slight restriction on sequences of triangulations, following an idea in [4]: a sequence $(\mathcal{T}_N)_{N \ge N_0}$ of triangulations, such that $\#(\mathcal{T}_N) \le N$, is said to be admissible if
$$\sup_{N \ge N_0} \Big( N^{\frac{1}{2}} \sup_{T \in \mathcal{T}_N} \operatorname{diam}(T) \Big) < \infty.$$
In other words
$$\sup_{T \in \mathcal{T}_N} \operatorname{diam}(T) \le C_A N^{-1/2} \tag{2.6}$$
for some $C_A > 0$ independent of $N$. The following theorem shows that the estimate (2.4) cannot be improved when we restrict our attention to admissible sequences. It also shows that this class is reasonably large, in the sense that (2.4) is ensured to hold up to a small perturbation.

Theorem 2.1.2.
Let $\Omega \subset \mathbb{R}^2$ be a bounded polygonal domain, and $f \in C^m(\Omega)$. Set $\frac{1}{\tau} := \frac{m}{2} + \frac{1}{p}$. For all admissible sequences of triangulations $(\mathcal{T}_N)_{N \ge N_0}$, conforming or not, one has

$$\liminf_{N \to \infty} N^{m/2}\, e_{m,\mathcal{T}_N}(f)_p \ge \left\| K_{m,p}\!\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau(\Omega)}.$$

For all $\varepsilon > 0$, there exists an admissible sequence of triangulations $(\mathcal{T}^\varepsilon_N)_{N \ge N_0}$, conforming if $p < \infty$, such that

$$\limsup_{N \to \infty} N^{m/2}\, e_{m,\mathcal{T}^\varepsilon_N}(f)_p \le \left\| K_{m,p}\!\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau(\Omega)} + \varepsilon.$$

Note that the sequences $(\mathcal{T}^\varepsilon_N)_{N \ge N_0}$ satisfy the admissibility condition (2.6) with a constant $C_A(\varepsilon)$ which may explode as $\varepsilon \to 0$. The proofs of both theorems are given in the section on optimal estimates. They reveal two principles that characterize optimal triangulations: (i) the triangulation should equidistribute the local approximation error $e_{m,T}(f)_p$ between each triangle, and (ii) the aspect ratio of a triangle $T$ should be isotropic with respect to a distorted metric induced by the local value of $d^m f$ on $T$ (and therefore anisotropic in the sense of the Euclidean metric). Roughly speaking, the quantity $\left\| K_{m,p}\!\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau(T)}$ controls the local interpolation $L^p$-error estimate on a triangle $T$ once this triangle is optimized with respect to the local properties of $f$. This type of estimate differs from those obtained in [2], which hold for any $T$, optimized or not, and involve the partial derivatives of $f$ in a local coordinate system which is adapted to the shape of $T$.

The proof of the upper estimates in Theorem 2.1.2 involves the construction of an optimal mesh based on a patching strategy similar to [4]. However, inspection of the proof reveals that this construction becomes effective only when the number of triangles $N$ becomes very large. Therefore it may not be useful in practical applications.

A more practical approach consists in deriving the above mentioned distorted metric from the exact or approximate data of $d^m f$ using the following procedure. To any $\pi \in \mathbb{H}_m$, we associate a symmetric positive definite matrix $h_\pi \in S_2^+$. If $z \in \Omega$, and $d^m f(z)$ is close to $\pi$, then the triangle $T$ containing $z$ should be isotropic in the metric $h_\pi$. The global metric is given at each point $z$ by

$$h(z) = s(\pi_z)\, h_{\pi_z}, \qquad \pi_z = d^m f(z),$$

where $s(\pi_z)$ is a scalar factor which depends on the desired accuracy of the finite element approximation. Once this metric has been properly identified, fast algorithms such as in [19, 92, 94] can be used to design a near-optimal mesh based on it. Recently in [15, 66], several algorithms have been rigorously proven to terminate and produce good quality meshes, see also Chapter 5.
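The requirement that a triangle be "isotropic in the metric $h_\pi$" can be made concrete with a small sketch. It is an illustration only, not the construction used in this chapter: the helper names and the sample matrix `h_example` are assumptions introduced here. The sketch maps an equilateral triangle through $h^{-1/2}$ (computed with the closed-form square root of a $2 \times 2$ symmetric positive definite matrix) and measures the three edges in the metric $h$.

```python
import math

# Symmetric 2x2 matrices [[a, b], [b, c]] are stored as tuples (a, b, c).

def sqrt_spd_2x2(a, b, c):
    """Square root of an SPD matrix: (M + s I) / t with s = sqrt(det M),
    t = sqrt(trace M + 2 s); the identity follows from Cayley-Hamilton."""
    s = math.sqrt(a * c - b * b)
    t = math.sqrt(a + c + 2.0 * s)
    return ((a + s) / t, b / t, (c + s) / t)

def inv_sym_2x2(a, b, c):
    """Inverse of an invertible symmetric 2x2 matrix."""
    det = a * c - b * b
    return (c / det, -b / det, a / det)

def apply_sym(m, v):
    """Matrix-vector product for a symmetric 2x2 matrix."""
    a, b, c = m
    return (a * v[0] + b * v[1], b * v[0] + c * v[1])

def metric_length(h, v):
    """Length of the vector v in the metric h, i.e. sqrt(<h v, v>)."""
    w = apply_sym(h, v)
    return math.sqrt(v[0] * w[0] + v[1] * w[1])

def isotropic_triangle(h):
    """Image under h^{-1/2} of an equilateral triangle inscribed in the unit circle."""
    root_inv = inv_sym_2x2(*sqrt_spd_2x2(*h))
    equilateral = [(math.cos(2.0 * math.pi * k / 3.0),
                    math.sin(2.0 * math.pi * k / 3.0)) for k in range(3)]
    return [apply_sym(root_inv, v) for v in equilateral]

def edge_lengths_in_metric(h, tri):
    """The three edge lengths of the triangle tri, measured in the metric h."""
    edges = [(tri[(k + 1) % 3][0] - tri[k][0],
              tri[(k + 1) % 3][1] - tri[k][1]) for k in range(3)]
    return [metric_length(h, e) for e in edges]

# Hypothetical strongly anisotropic metric (a > 0 and det = 100 > 0, hence SPD).
h_example = (100.0, 30.0, 10.0)
triangle = isotropic_triangle(h_example)
lengths = edge_lengths_in_metric(h_example, triangle)
```

In this sketch the three entries of `lengths` coincide (they equal the Euclidean edge length of the reference equilateral triangle), while the Euclidean shape of `triangle` is elongated: this is the sense in which a triangle is isotropic for the distorted metric and anisotropic for the Euclidean one.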
Computing the map

$$\pi \in \mathbb{H}_m \mapsto h_\pi \in S_2^+ \tag{2.7}$$

is therefore of key use in applications. This problem is well understood in the case of linear elements ($m = 2$): the matrix $h_\pi$ is then defined as the absolute value (in the sense of symmetric matrices) of the matrix associated to the quadratic form $\pi$. In contrast, the exact form of this map in the case $m \ge$ […] $m = 3$, which corresponds to quadratic elements. These strategies have been implemented in an open-source Mathematica code [95]. In a similar manner, we address the algebraic computation of the shape function $K_{m,p}(\pi)$ from the coefficients of $\pi \in \mathbb{H}_m$, when $m \ge$ […] linear ($m = 2$) and quadratic ($m = 3$) elements. In this case, it is possible to obtain explicit formulas for $K_{m,p}(\pi)$ from the coefficients of $\pi$. In the case $m = 2$, this formula is of the form

$$K_{2,p}(ax^2 + 2bxy + cy^2) = \sigma \sqrt{|b^2 - ac|},$$

where the constant $\sigma$ only depends on $p$ and the sign of $b^2 - ac$, and we therefore recover the known estimate (2.1) from Theorem 2.1.1. The formula for $m = 3$ involves the discriminant of the third degree polynomial $d^3 f$. Our analysis also leads to an algebraic computation of the map (2.7). We want to mention that a different strategy for the construction of the distorted metric and the derivation of the error estimate for a finite element of arbitrary order was proposed in [23]. In this approach, the distorted metric is obtained at a point $z \in \Omega$ by finding the largest ellipse contained in a level set of the polynomial associated to $d^m f(z)$ by (2.5). This optimization problem has connections with the one that defines the shape function in (2.2), as we shall explain in § […] $m = 3$ has the advantage of avoiding the use of numerical optimization, the metric being directly derived from the coefficients of $d^m f$.

In § […] $m > 3$. In this case, explicit formulas for $K_{m,p}(\pi)$ seem out of reach. However, we can introduce explicit functions $K_m(\pi)$ which are polynomials in the coefficients of $\pi$, and are equivalent to $K_{m,p}(\pi)$, leading therefore to similar asymptotic error estimates up to multiplicative constants. At the current stage, we did not obtain a simple solution to the algebraic computation of the map (2.7) in the case $m > 3$. The derivation of $K_m$ is based on the theory of invariant polynomials due to Hilbert. Let us mention that this theory was also recently applied in [79] to image processing tasks such as affine invariant edge detection and denoising.

We finally discuss in § […] $m = 2$.

In this section, we establish several properties of the function $K_{m,p}$ which will be of key use subsequently. We assume that $m \ge 2$ and $p \in [1, \infty]$. We equip the finite dimensional vector space $\mathbb{H}_m$ with a norm $\|\cdot\|$ defined as follows:

$$\|\pi\| := \sup_{x^2 + y^2 \le 1} |\pi(x, y)|. \tag{2.8}$$

Our first result shows that the function $K_{m,p}$ vanishes on a set of polynomials which has a simple algebraic characterization.

Proposition 2.2.1.
We denote by $s_m := \lfloor m/2 \rfloor + 1$ the smallest integer strictly larger than $m/2$. The vanishing set of $K_{m,p}$ is the set of polynomials which have a generalized root of multiplicity at least $s_m$:

$$K_{m,p}(\pi) = 0 \iff \pi(x, y) = (\alpha x + \beta y)^{s_m}\, \tilde\pi, \quad \text{for some } \alpha, \beta \in \mathbb{R} \text{ and } \tilde\pi \in \mathbb{H}_{m - s_m}.$$

Proof:
We denote by $T_{\rm eq}$ a fixed equilateral triangle of unit area, centered at $0$. We first assume that $\pi(x, y) = (\alpha x + \beta y)^{s_m} \tilde\pi$. Then there exists a rotation $R \in \mathcal{O}_2$ and $\hat\pi \in \mathbb{H}_{m - s_m}$ such that

$$\pi \circ R(x, y) = x^{s_m} \hat\pi(x, y) = x^{s_m} \left( \sum_{i=0}^{m - s_m} a_i x^i y^{m - s_m - i} \right).$$

Therefore, denoting by $\varphi_\varepsilon$ the linear transform $\varphi_\varepsilon(x, y) = R\left(\varepsilon x, \frac{y}{\varepsilon}\right)$, we obtain

$$\pi \circ \varphi_\varepsilon = (\varepsilon x)^{s_m} \left( \sum_{i=0}^{m - s_m} a_i (\varepsilon x)^i (y/\varepsilon)^{m - s_m - i} \right) = \varepsilon^{2 s_m - m}\, x^{s_m} \left( \sum_{i=0}^{m - s_m} \varepsilon^{2i} a_i x^i y^{m - s_m - i} \right),$$

hence $\pi \circ \varphi_\varepsilon \to 0$ as $\varepsilon \to 0$, because $2 s_m > m$. Since $|\det \varphi_\varepsilon| = 1$, the triangles $\varphi_\varepsilon(T_{\rm eq})$ have unit area. Consequently,

$$e_{m, \varphi_\varepsilon(T_{\rm eq})}(\pi)_p = e_{m, T_{\rm eq}}(\pi \circ \varphi_\varepsilon)_p \to 0 \quad \text{as } \varepsilon \to 0,$$

and therefore $K_{m,p}(\pi) = 0$.

Conversely, let $\pi \in \mathbb{H}_m \setminus \{0\}$ be such that $K_{m,p}(\pi) = 0$. Then there exists a sequence $(T_n)_{n \ge 0}$ of triangles with unit area such that $e_{m,T_n}(\pi)_p \to 0$. We remark that the interpolation error $e_T(\pi)_p$ of $\pi \in \mathbb{H}_m$ is invariant by a translation $\tau_h : z \mapsto z + h$ of the triangle $T$. Indeed, $\pi - \pi \circ \tau_h \in \mathbb{P}_{m-1}$, so that

$$\|\pi - \mathrm{I}^{m-1}_{\tau_h(T)} \pi\|_{L^p(\tau_h(T))} = \|\pi \circ \tau_h - \mathrm{I}^{m-1}_T (\pi \circ \tau_h)\|_{L^p(T)} = \|\pi - \mathrm{I}^{m-1}_T \pi\|_{L^p(T)}. \tag{2.9}$$

Hence we may assume that the barycenter of $T_n$ is $0$ and write $T_n = \varphi_n(T_{\rm eq})$ for some linear transform $\varphi_n$ with $\det \varphi_n = 1$. Since $e_{m, T_{\rm eq}}(\cdot)_p$ is a norm on $\mathbb{H}_m$, it follows that $\pi \circ \varphi_n \to 0$. Now $\varphi_n$ has a singular value decomposition

$$\varphi_n = U_n \circ D_n \circ V_n, \quad \text{where } U_n, V_n \in \mathcal{O}_2 \text{ and } D_n = \begin{pmatrix} \varepsilon_n & 0 \\ 0 & 1/\varepsilon_n \end{pmatrix}, \quad 0 < \varepsilon_n \le 1.$$

Moreover, $\|\pi \circ V\| = \|\pi\|$ for any $\pi \in \mathbb{H}_m$ and $V \in \mathcal{O}_2$. Therefore,

$$\|\pi \circ U_n \circ D_n\| = \|\pi \circ U_n \circ D_n \circ V_n \circ V_n^{-1}\| = \|\pi \circ \varphi_n\| \to 0.$$

Denoting by $a_{i,n}$ the coefficient of $x^i y^{m-i}$ in $\pi \circ U_n$, we find that $a_{i,n}\, \varepsilon_n^{2i - m}$ tends to $0$ as $n \to +\infty$. In the case where $i < s_m$, one has $2i - m \le 0$, and this implies that $a_{i,n}$ tends to $0$ as $n \to +\infty$. By compactness of $\mathcal{O}_2$ we may assume, up to a subsequence extraction, that $U_n$ converges to some $U \in \mathcal{O}_2$. Denoting by $a_i$ the coefficient of $x^i y^{m-i}$ in $\pi \circ U$, we thus find that $a_i = 0$ if $i < s_m$. This implies that $\pi \circ U(x, y) = x^{s_m} \hat\pi(x, y)$, which concludes the proof. $\diamond$

Remark 2.2.2.
In the simple case $m = 2$, we infer from Proposition 2.2.1 that $K_{2,p}(\pi) = 0$ if and only if $\pi$ is of the form $\pi(x, y) = \lambda x^2$ up to a rotation, for some $\lambda \in \mathbb{R}$, and is therefore a one-dimensional function. For such a function, the optimal triangle $T$ degenerates to a segment in the $y$ direction, i.e., optimal triangles of a fixed area tend to be infinitely long in one direction. This situation also holds when $m > 2$. Indeed, we see in the second part of the proof of Proposition 2.2.1 that if $\pi$ is a nontrivial polynomial such that $K_{m,p}(\pi) = 0$, then $\varepsilon_n$ must tend to $0$ as $n \to +\infty$. This shows that $T_n = \varphi_n(T_{\rm eq})$ tends to be infinitely flat in the direction $U e_y$ with $e_y = (0, 1)$. However, $K_{m,p}(\pi) = 0$ no longer means that $\pi$ is a polynomial of one variable.

Our next result shows that the function $K_{m,p}$ is homogeneous and obeys an invariance property with respect to linear change of variables.

Proposition 2.2.3.
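For all $\pi \in \mathbb{H}_m$, $\lambda \in \mathbb{R}$, and $\varphi \in \mathcal{L}(\mathbb{R}^2)$,

$$K_{m,p}(\lambda \pi) = |\lambda|\, K_{m,p}(\pi), \tag{2.10}$$

$$K_{m,p}(\pi \circ \varphi) = |\det \varphi|^{m/2}\, K_{m,p}(\pi). \tag{2.11}$$

Proof: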
The homogeneity property (2.10) is a direct consequence of the definition of $K_{m,p}$. In order to prove the invariance property (2.11), we assume in a first part that $\det \varphi \ne 0$, and we define

$$\tilde T := \frac{\varphi(T)}{\sqrt{|\det \varphi|}} \quad \text{and} \quad \tilde\pi(z) := \pi\big(\sqrt{|\det \varphi|}\, z\big) = |\det \varphi|^{m/2}\, \pi(z).$$

We now remark that the local interpolant $\mathrm{I}^{m-1}_T$ commutes with linear change of variables in the sense that, when $\varphi$ is an invertible linear transform,

$$\mathrm{I}^{m-1}_T (v \circ \varphi) = (\mathrm{I}^{m-1}_{\varphi(T)} v) \circ \varphi \tag{2.12}$$

for all continuous functions $v$ and triangles $T$. Using this commutation formula we obtain

$$e_{m,T}(\pi \circ \varphi)_p = |\det \varphi|^{-1/p}\, e_{m, \varphi(T)}(\pi)_p = e_{m, \tilde T}(\tilde\pi)_p = |\det \varphi|^{m/2}\, e_{m, \tilde T}(\pi)_p.$$

Since the map $T \mapsto \tilde T$ is a bijection of the set of triangles onto itself, leaving the area invariant, we obtain the relation (2.11) when $\varphi$ is invertible. When $\det \varphi = 0$, the polynomial $\pi \circ \varphi$ can be written $(\alpha x + \beta y)^m$, so that $K_{m,p}(\pi \circ \varphi) = 0$ by Proposition 2.2.1. $\diamond$

The functions $K_{m,p}$ are not necessarily continuous, but the following properties will be sufficient for our purposes.

Proposition 2.2.4.
The function $K_{m,p}$ is upper semi-continuous in general and continuous if $m = 2$ or $m$ is odd. Moreover, the following property holds:

$$\text{If } \pi_n \to \pi \text{ and } K_{m,p}(\pi_n) \to 0, \text{ then } K_{m,p}(\pi) = 0. \tag{2.13}$$

Proof:
The upper semi-continuity property comes from the fact that the infimum of a family of upper semi-continuous functions is an upper semi-continuous function. We apply this fact to the functions $\pi \mapsto e_{m,T}(\pi)_p$ indexed by triangles, which are obviously continuous.

For any polynomial $\pi \in \mathbb{H}_2$, $\pi = ax^2 + 2bxy + cy^2$, we define $\det \pi = ac - b^2$. It will be shown in the section on linear and quadratic elements that $K_{2,p}(\pi) = \sigma_p \sqrt{|\det \pi|}$, where $\sigma_p$ only depends on the sign of $\det \pi$. This clearly implies the continuity of $K_{2,p}$.

We next turn to the proof of the continuity of $K_{m,p}$ for odd $m$. Consider a polynomial $\pi \in \mathbb{H}_m$. If $K_{m,p}(\pi) = 0$, then the upper semi-continuity of $K_{m,p}$, combined with its nonnegativity, implies that it is continuous at $\pi$. Otherwise, assume that $K_{m,p}(\pi) > 0$. Consider a sequence $\pi_n \in \mathbb{H}_m$ converging to $\pi$ and a sequence $\varphi_n$ of linear transformations satisfying $\det \varphi_n = 1$, and such that

$$\lim_{n \to +\infty} e_{\varphi_n(T_{\rm eq})}(\pi_n) = \liminf_{\pi^* \to \pi} K_{m,p}(\pi^*) := \lim_{r \to 0}\ \inf_{\|\pi^* - \pi\| \le r} K_{m,p}(\pi^*).$$

If the sequence $\varphi_n$ admits a converging subsequence $\varphi_{n_k} \to \varphi$, it follows that

$$K_{m,p}(\pi) \le e_{\varphi(T_{\rm eq})}(\pi) = \lim_{k \to +\infty} e_{\varphi_{n_k}(T_{\rm eq})}(\pi_{n_k}) = \liminf_{\pi^* \to \pi} K_{m,p}(\pi^*).$$

This asserts that $K_{m,p}$ is lower semi-continuous at $\pi$, and therefore continuous at $\pi$ since we already know that $K_{m,p}$ is upper semi-continuous.

If $\varphi_n$ does not admit any converging subsequence, then we invoke the Singular Value Decomposition (SVD) $\varphi_n = U_n \circ D_n \circ V_n$, where $U_n, V_n \in \mathcal{O}_2$ and $D_n = \operatorname{diag}(\varepsilon_n, 1/\varepsilon_n)$, where $0 < \varepsilon_n \le 1$. (Here and below, we use the shorthand $\operatorname{diag}(a, b)$ to denote the diagonal matrix with entries $a$ and $b$.) The compactness of $\mathcal{O}_2$ implies that $U_n$ admits a converging subsequence $U_{n_k} \to U$. In particular, $\pi_{n_k} \circ U_{n_k}$ converges to $\pi \circ U$. Therefore, denoting by $a_{i,n}$ the coefficient of $x^i y^{m-i}$ in $\pi_n \circ U_n$, the subsequence $a_{i,n_k}$ converges to the coefficient $a_i$ of $x^i y^{m-i}$ in $\pi \circ U$. Observe also that $\varepsilon_n \to 0$, since otherwise a converging subsequence could be extracted from $\varphi_n$. Since $e_{\varphi_n(T_{\rm eq})}(\pi_n) = e_{T_{\rm eq}}(\pi_n \circ \varphi_n)$, the sequence of polynomials $\pi_n \circ \varphi_n$ is uniformly bounded and so is the sequence $\pi_n \circ U_n \circ D_n$. Therefore, the sequences $(a_{i,n}\, \varepsilon_n^{2i - m})_{n \ge 0}$ are uniformly bounded. It follows that $a_i = 0$ when $i < m/2$. Since $m$ is odd, this implies that $\pi \circ U(x, y) = x^{s_m} \tilde\pi(x, y)$, and Proposition 2.2.1 implies that $K_{m,p}(\pi) = 0$, which contradicts the hypothesis $K_{m,p}(\pi) > 0$.

We finally prove (2.13). Assume that $\pi_n \to \pi$ and $K_{m,p}(\pi_n) \to 0$. Then there exists a sequence of triangles $T_n = \varphi_n(T_{\rm eq})$ with $\det \varphi_n = 1$ such that $e_{m,T_n}(\pi_n)_p \to 0$. Reasoning in a similar way as in the proof of Proposition 2.2.1, we first obtain that $\pi_n \circ \varphi_n \to 0$, and we then invoke the SVD of $\varphi_n$ to build a converging sequence of orthogonal matrices $U_n \to U$ and a sequence $0 < \varepsilon_n \le 1$ such that, if $a_{i,n}$ is the coefficient of $x^i y^{m-i}$ in $\pi_n \circ U_n$, we have $a_{i,n}\, \varepsilon_n^{2i - m} \to 0$. When $i < s_m$, it follows that $a_{i,n} \to 0$, and therefore $\pi \circ U(x, y) = x^{s_m} \hat\pi(x, y)$. The result follows from Proposition 2.2.1. $\diamond$

We finally make a connection between the shape function and the approach developed in [23]. For all $\pi \in \mathbb{H}_m$, we denote by $\Lambda_\pi$ the level set of $|\pi|$ for the value $1$:

$$\Lambda_\pi = \{ (x, y) \in \mathbb{R}^2,\ |\pi(x, y)| \le 1 \}. \tag{2.14}$$

We now define

$$K^{\mathcal{E}}_m(\pi) = \left( \sup_{E \in \mathcal{E},\ E \subset \Lambda_\pi} \frac{|E|}{\boldsymbol{\pi}} \right)^{-m/2}, \tag{2.15}$$

where the supremum is taken over the set $\mathcal{E}$ of all ellipses centered at $0$. We use a bold font to denote the numerical constant $\boldsymbol{\pi} = 3.14159\ldots$ The optimization among ellipses defining $K^{\mathcal{E}}_m$ can be rephrased as an optimization on the cone $S_2^+$ of $2 \times 2$ symmetric positive definite matrices:

$$K^{\mathcal{E}}_m(\pi) = \inf \left\{ (\det M)^{m/4} \ ;\ M \in S_2^+ \text{ and } \forall z \in \mathbb{R}^2,\ \langle M z, z \rangle \ge |\pi(z)|^{2/m} \right\}. \tag{2.16}$$

The minimizing ellipse $E^*$ is then given by $\{ \langle M z, z \rangle \le 1 \}$. The optimization problem described in (2.16) is quadratic in dimension 2 and subject to (infinitely many) linear constraints. This apparent simplicity is counterbalanced by the fact that it is nonconvex. In particular, it does not have unique solutions and may also have no solution. We construct in Chapter 6 a variant $K^{(\alpha)}$ of $K^{\mathcal{E}}$ which is defined by a well posed optimization problem, for which the optimal matrix is unique and depends continuously on the parameter $\pi$.

Proposition 2.2.5.
On $\mathbb{H}_m$, one has the equivalence

$$c\, K^{\mathcal{E}}_m \le K_{m,p} \le C\, K^{\mathcal{E}}_m,$$

with constants $0 < c \le C$ independent of $p$.

Proof:
We consider a fixed triangle $T^*$ of unit area, for instance an equilateral triangle. For each exponent $1 \le p \le \infty$ we define a norm $\|\cdot\|_p$ on $\mathbb{H}_m$ as follows:

$$\|\pi\|_p := \|\pi - \mathrm{I}^{m-1}_{T^*} \pi\|_{L^p(T^*)},$$

in such a way that

$$K_{m,p}(\pi) = \inf_{|\det \varphi| = 1} \|\pi \circ \varphi\|_p, \tag{2.17}$$

where the infimum is taken among the collection of linear changes of variables $\varphi$ satisfying $|\det \varphi| = 1$. Using (2.16) and the homogeneity of $\pi$ we obtain a similar expression for the shape function $K^{\mathcal{E}}_m$ based on ellipses:

$$K^{\mathcal{E}}_m(\pi) = \inf_{|\det \varphi| = 1} \|\pi \circ \varphi\|. \tag{2.18}$$

Indeed, if $\varphi \in GL_2$ satisfies $|\det \varphi| = 1$, then $M := \|\pi \circ \varphi\|^{2/m} (\varphi^{-1})^{\mathrm T} \varphi^{-1}$ satisfies $(\det M)^{m/4} = \|\pi \circ \varphi\|$ and $\langle M z, z \rangle \ge |\pi(z)|^{2/m}$. Conversely, to any $M \in S_2^+$ we associate $\varphi = M^{-1/2} (\det M)^{1/4}$, which satisfies $\det \varphi = 1$.

Since the vector space $\mathbb{H}_m$ has finite dimension, there exist $C \ge c > 0$ such that for any $\pi \in \mathbb{H}_m$ and any exponent $1 \le p \le \infty$ one has

$$c \|\pi\| \le \|\pi\|_1 \le \|\pi\|_p \le \|\pi\|_\infty \le C \|\pi\|.$$

Combining this with (2.17) and (2.18) we obtain the announced result. $\diamond$
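The characterization (2.18) can be explored numerically on a simple case. The sketch below is an illustration under stated assumptions, not part of the proof: the polynomial `pi_example` $= x^2 + 4y^2$, the grid of values of $\varepsilon$, and all function names are choices made here. It evaluates the norm (2.8) by sampling the unit circle (by homogeneity the supremum over the disc is attained there) and minimizes $\|\pi \circ \varphi\|$ over the diagonal maps $\varphi = \operatorname{diag}(\varepsilon, 1/\varepsilon)$ only. Restricting to diagonal maps gives in general only an upper bound for the infimum in (2.18); for this particular $\pi$ it reaches the value $\sqrt{\det \pi} = 2$ that (2.16) gives for a positive definite quadratic form.

```python
import math

def sup_norm(poly, samples=360):
    """Approximate the norm (2.8) of a homogeneous polynomial by sampling the unit circle."""
    return max(
        abs(poly(math.cos(2.0 * math.pi * k / samples),
                 math.sin(2.0 * math.pi * k / samples)))
        for k in range(samples)
    )

def compose_diag(poly, eps):
    """Return poly composed with diag(eps, 1/eps), a map of determinant 1."""
    return lambda x, y: poly(eps * x, y / eps)

def pi_example(x, y):
    """Hypothetical test polynomial x^2 + 4 y^2 (positive definite, determinant 4)."""
    return x * x + 4.0 * y * y

# Scan eps over [0.5, 2.5]; the exact minimizer is eps = sqrt(2).
eps_grid = [0.5 + 0.001 * k for k in range(2001)]
values = [sup_norm(compose_diag(pi_example, eps)) for eps in eps_grid]
best_value = min(values)
best_eps = eps_grid[values.index(best_value)]
```

Here $\pi \circ \varphi = \varepsilon^2 x^2 + 4 y^2 / \varepsilon^2$ has norm $\max\{\varepsilon^2, 4/\varepsilon^2\}$, so `best_value` is close to $2$ and `best_eps` close to $\sqrt 2$, up to the resolution of the grid.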
Remark 2.2.6.
Since $K_{m,p}$ and $K^{\mathcal{E}}_m$ are equivalent, they vanish on the same set, and therefore Proposition 2.2.1 is also valid for $K^{\mathcal{E}}_m$. It is also easy to see that $K^{\mathcal{E}}_m$ satisfies the homogeneity and invariance properties stated for $K_{m,p}$ in (2.10) and (2.11), as well as the continuity properties stated in Proposition 2.2.4.

Remark 2.2.7.
The continuity of the functions $K_{m,p}$ and $K^{\mathcal{E}}_m$ can be established when $m$ is odd or equal to $2$, as shown by Proposition 2.2.4, but seems to fail otherwise. In particular, direct computation shows that $K^{\mathcal{E}}_4(x^2 y^2 - \varepsilon y^4)$ is independent of $\varepsilon > 0$ and strictly smaller than $K^{\mathcal{E}}_4(x^2 y^2)$. Therefore, $K^{\mathcal{E}}_4$ is upper semi-continuous but discontinuous at the point $x^2 y^2 \in \mathbb{H}_4$.

This section is devoted to the proofs of our main theorems, starting with the lower estimate of Theorem 2.1.2, and continuing with the upper estimates involved in both Theorem 2.1.1 and 2.1.2.

Throughout this section, for the sake of notational simplicity, we fix the parameters $m$ and $p$ and use the shorthand

$$K = K_{m,p} \quad \text{and} \quad e_T(\pi) = e_{m,T}(\pi)_p.$$

For each point $z \in \Omega$ we define

$$\pi_z := \frac{d^m f_z}{m!} \in \mathbb{H}_m,$$

where $f \in C^m(\Omega)$ is the function in the statement of the theorems. We denote by

$$\omega(r) := \sup_{\|z - z'\| \le r} \|\pi_z - \pi_{z'}\|$$

the modulus of continuity of $z \mapsto \pi_z$ with the norm $\|\cdot\|$ defined by (2.8). Note that $\omega(r) \to 0$ as $r \to 0$.

In this proof we will use an estimate from below of the local interpolation error.
Proposition 2.3.1.
Assume that $1 \le p < \infty$. There exists a constant $C > 0$, depending on $f$ and $\Omega$, such that for all triangles $T \subset \Omega$ and $z \in T$,

$$e_T(f)^p \ge K^p(\pi_z)\, |T|^{\frac{mp}{2} + 1} - C\, (\operatorname{diam} T)^{mp}\, |T|\, \omega(\operatorname{diam} T). \tag{2.19}$$

Proof:
Denoting by $\mu_z \in \mathbb{P}_m$ the Taylor development of $f$ at the point $z$ up to degree $m$, we obtain

$$f(z + u) - \mu_z(z + u) = m \int_{t=0}^{1} \big( \pi_{z + tu}(u) - \pi_z(u) \big) (1 - t)^{m-1}\, dt,$$

and therefore

$$\|f - \mu_z\|_{L^\infty(T)} \le C_0\, \operatorname{diam}(T)^m\, \omega(\operatorname{diam}(T)),$$

where $C_0$ is a fixed constant. By construction $\pi_z$ is the homogeneous part of $\mu_z$ of degree $m$, and therefore $\mu_z - \pi_z \in \mathbb{P}_{m-1}$. It follows that for any triangle $T$, we have

$$\mu_z - \mathrm{I}^{m-1}_T \mu_z = \pi_z - \mathrm{I}^{m-1}_T \pi_z. \tag{2.20}$$

We therefore obtain

$$\begin{aligned}
|e_T(f) - e_T(\pi_z)| &\le \|(f - \mathrm{I}^{m-1}_T f) - (\pi_z - \mathrm{I}^{m-1}_T \pi_z)\|_{L^p(T)} \\
&\le |T|^{1/p}\, \|(f - \mathrm{I}^{m-1}_T f) - (\mu_z - \mathrm{I}^{m-1}_T \mu_z)\|_{L^\infty(T)} \\
&= |T|^{1/p}\, \|(I - \mathrm{I}^{m-1}_T)(f - \mu_z)\|_{L^\infty(T)} \\
&\le C_1 |T|^{1/p}\, \|f - \mu_z\|_{L^\infty(T)} \\
&\le C_0 C_1 |T|^{1/p}\, \operatorname{diam}(T)^m\, \omega(\operatorname{diam}(T)),
\end{aligned}$$

where $C_1$ is the norm of the operator $I - \mathrm{I}^{m-1}_T : C^0(T) \to C^0(T)$ in $L^\infty(T)$ norm, which is independent of $T$.

From (2.3) we know that $e_T(\pi_z) \ge |T|^{\frac{m}{2} + \frac{1}{p}} K(\pi_z)$, and therefore

$$e_T(f) \ge K(\pi_z)\, |T|^{\frac{m}{2} + \frac{1}{p}} - C_0 C_1 |T|^{1/p}\, \operatorname{diam}(T)^m\, \omega(\operatorname{diam}(T)).$$

We now remark that for all $p \in [1, \infty)$ the function $r \mapsto r^p$ is convex, and therefore if $a, b, c$ are positive numbers, and $a \ge b - c$, then $a^p \ge \max\{0, b - c\}^p \ge b^p - p\, c\, b^{p-1}$. Applying this to our last inequality we obtain

$$e_T(f)^p \ge K^p(\pi_z)\, |T|^{\frac{mp}{2} + 1} - p\, C_0 C_1 (K(\pi_z))^{p-1}\, |T|^{(p-1)\left(\frac{m}{2} + \frac{1}{p}\right) + \frac{1}{p}}\, \operatorname{diam}(T)^m\, \omega(\operatorname{diam} T).$$

Since $|T|^{(p-1)\left(\frac{m}{2} + \frac{1}{p}\right) + \frac{1}{p}} = |T|^{\frac{(p-1)m}{2}}\, |T| \le (\operatorname{diam} T)^{m(p-1)}\, |T|$, this leads to

$$e_T(f)^p \ge K^p(\pi_z)\, |T|^{\frac{mp}{2} + 1} - C\, (\operatorname{diam} T)^{mp}\, |T|\, \omega(\operatorname{diam} T),$$

where $C := p\, C_0 C_1 \left( \sup_{z \in \Omega} K(\pi_z) \right)^{p-1}$. $\diamond$

We now turn to the proof of the lower estimate in Theorem 2.1.2 in the case where $p < \infty$. Consider a sequence $(\mathcal{T}_N)_{N \ge N_0}$ of triangulations which is admissible in the sense of equation (2.6).
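As a concrete illustration of the admissibility condition (2.6), the following sketch compares two hypothetical families of triangulations of the unit square; neither is taken from the text, and the function names are introduced here for illustration. In the first, the square is cut into $n \times n$ cells, each split along a diagonal; in the second, into $n \times n^2$ cells, so that the triangles become increasingly stretched. In both cases $N$ is taken equal to the number of triangles and the quantity $N^{1/2} \sup_T \operatorname{diam}(T)$ is evaluated.

```python
import math

def uniform_ratio(n):
    """N^(1/2) * max diameter for n x n square cells, each split along a diagonal."""
    count = 2 * n * n                      # number of triangles
    diameter = math.sqrt(2.0) / n          # diagonal of a cell of side 1/n
    return math.sqrt(count) * diameter

def stretched_ratio(n):
    """Same quantity for n x n^2 rectangular cells of size (1/n) x (1/n^2)."""
    count = 2 * n * n * n                  # number of triangles
    diameter = math.sqrt(1.0 / n**2 + 1.0 / n**4)  # diagonal of a cell
    return math.sqrt(count) * diameter

sizes = (1, 2, 4, 8, 16, 32, 64)
uniform_ratios = [uniform_ratio(n) for n in sizes]
stretched_ratios = [stretched_ratio(n) for n in sizes]
```

In this sketch the uniform family keeps the ratio equal to $2$, so (2.6) holds with $C_A = 2$, whereas for the stretched family the ratio equals $\sqrt{2n + 2/n}$ and grows without bound: condition (2.6) limits how fast the anisotropy may increase with $N$.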
By admissibility, there exists a constant $C_A$ such that

$$\operatorname{diam} T \le C_A N^{-1/2}, \qquad N \ge N_0,\ T \in \mathcal{T}_N.$$

For $T \in \mathcal{T}_N$, we combine this estimate with (2.19), which gives

$$e_T(f)^p \ge K^p(\pi_z)\, |T|^{\frac{mp}{2} + 1} - (C_A N^{-1/2})^{mp}\, |T|\, C\, \omega(C_A N^{-1/2}).$$

Averaging over $z \in T$, we obtain

$$e_T(f)^p \ge \int_T K^p(\pi_z)\, |T|^{\frac{mp}{2}}\, dz - |T|\, N^{-\frac{mp}{2}}\, C_A^{mp}\, C\, \omega(C_A N^{-1/2}).$$

Summing on all $T \in \mathcal{T}_N$, and denoting by $T^N_z$ the triangle in $\mathcal{T}_N$ containing the point $z \in \Omega$, we obtain the estimate

$$e_{\mathcal{T}_N}(f)^p \ge \int_\Omega K^p(\pi_z)\, |T^N_z|^{\frac{mp}{2}}\, dz - N^{-\frac{mp}{2}}\, \varepsilon(N), \tag{2.21}$$

where $\varepsilon(N) := |\Omega|\, C_A^{mp}\, C\, \omega(C_A N^{-1/2}) \to 0$ as $N \to +\infty$. The function $z \mapsto |T^N_z|$ is linked with the number of triangles in the following way:

$$\int_\Omega \frac{dz}{|T^N_z|} = \sum_{T \in \mathcal{T}_N} \int_T \frac{1}{|T|} = \#(\mathcal{T}_N) \le N.$$

On the other hand, with $\frac{1}{\tau} = \frac{m}{2} + \frac{1}{p}$, we have by Hölder's inequality,

$$\int_\Omega K^\tau(\pi_z)\, dz \le \left( \int_\Omega K^p(\pi_z)\, |T^N_z|^{\frac{mp}{2}}\, dz \right)^{\tau/p} \left( \int_\Omega |T^N_z|^{-1}\, dz \right)^{1 - \tau/p}. \tag{2.22}$$

Combining the above, we obtain a lower bound for the integral term in (2.21) which is independent of $\mathcal{T}_N$:

$$\int_\Omega K^p(\pi_z)\, |T^N_z|^{\frac{mp}{2}}\, dz \ge \left( \int_\Omega K^\tau(\pi_z)\, dz \right)^{p/\tau} N^{-mp/2}.$$

Inserting this lower bound into (2.21) we obtain

$$e_{\mathcal{T}_N}(f)^p \ge \left[ \left( \int_\Omega K^\tau(\pi_z)\, dz \right)^{p/\tau} - \varepsilon(N) \right] N^{-mp/2}.$$

This allows us to conclude

$$\liminf_{N \to +\infty} N^{m/2}\, e_{\mathcal{T}_N}(f) \ge \left( \int_\Omega K^\tau(\pi_z)\, dz \right)^{1/\tau}, \tag{2.23}$$

which is the desired estimate. The case $p = \infty$ follows the same ideas. Adapting Proposition 2.3.1, one proves that

$$e_T(f) \ge K(\pi_z)\, |T|^{m/2} - C\, (\operatorname{diam} T)^m\, \omega(\operatorname{diam} T),$$

and therefore

$$e_{\mathcal{T}_N}(f) \ge \left\| K(\pi_z)\, |T^N_z|^{m/2} \right\|_{L^\infty(\Omega)} - N^{-m/2}\, \varepsilon(N), \tag{2.24}$$

where $\varepsilon(N) := C_A^m\, C\, \omega(C_A N^{-1/2}) \to 0$ as $N \to +\infty$.
The Hölder inequality now reads:

$$\int_\Omega K(\pi_z)^{2/m}\, dz \le \left\| K(\pi_z)^{2/m}\, |T^N_z| \right\|_{L^\infty(\Omega)} \left\| \frac{1}{|T^N_z|} \right\|_{L^1(\Omega)},$$

equivalently,

$$\left\| K(\pi_z)\, |T^N_z|^{m/2} \right\|_{L^\infty(\Omega)} \ge \left( \int_\Omega K(\pi_z)^{2/m}\, dz \right)^{m/2} N^{-m/2}.$$

Combining this with (2.24), we obtain the desired estimate (2.23) with $p = \infty$ and $\tau = 2/m$.

Remark 2.3.2.
This proof reveals the two principles which characterize the optimal triangulations. Indeed, the lower estimate (2.23) becomes an equality only when both inequalities in (2.19) and (2.22) are equalities. The first condition, equality in (2.19), is met when each triangle $T$ has an optimal shape, in the sense that $e_T(\pi_z) = K(\pi_z)\, |T|^{\frac{m}{2} + \frac{1}{p}}$ for some $z \in T$. The second condition, equality in (2.22), is met when the ratio between $K^p(\pi_z)\, |T^N_z|^{\frac{mp}{2}}$ and $|T^N_z|^{-1}$ is constant, or equivalently $K(\pi_z)\, |T|^{\frac{m}{2} + \frac{1}{p}}$ is independent of the triangle $T$. Combined with the first condition, this means that the error $e_T(f)_p$ is equidistributed over the triangles, up to the perturbation by $(\operatorname{diam} T)^{mp}\, |T|\, \omega(\operatorname{diam} T)$, which becomes negligible as $N$ grows.

We first remark that the upper estimate in Theorem 2.1.2 implies the upper estimate in Theorem 2.1.1 by a subsequence extraction argument: if the upper estimate in Theorem 2.1.2 holds, then for all $\varepsilon > 0$ there exists a sequence $(\mathcal{T}^\varepsilon_N)_{N > N_0}$, $\#(\mathcal{T}^\varepsilon_N) \le N$, such that

$$\limsup_{N \to +\infty} \left( N^{m/2}\, e_{\mathcal{T}^\varepsilon_N}(f) \right) \le \left\| K\!\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau} + \varepsilon,$$

with $\frac{1}{\tau} = \frac{1}{p} + \frac{m}{2}$. We then choose a sequence $(\varepsilon_N)_{N \ge N_0}$ such that

$$N^{m/2}\, e_{\mathcal{T}^{\varepsilon_N}_N}(f) \le \left\| K\!\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau} + 2 \varepsilon_N$$

for all $N \ge N_0$, and $\varepsilon_N \to 0$ as $N \to \infty$. Defining $\mathcal{T}_N := \mathcal{T}^{\varepsilon_N}_N$ we thus obtain

$$\limsup_{N \to +\infty} \left( N^{m/2}\, e_{\mathcal{T}_N}(f) \right) \le \left\| K\!\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau},$$

which concludes the proof of Theorem 2.1.1. We are thus left with proving the upper estimate in Theorem 2.1.2.

We begin by fixing a (large) number $M > 0$. We shall take the limit $M \to \infty$ in the very last step of our proof. We define

$$\mathbf{T}_M = \{ T \text{ triangle} \ ;\ |T| = 1,\ \operatorname{bary}(T) = 0 \text{ and } \operatorname{diam}(T) \le M \},$$

the set of triangles centered at the origin, of unit area and diameter smaller than $M$. This set is compact with respect to the Hausdorff distance. This allows us to define a "tempered" version of $K = K_{m,p}$ that we denote by $K_M$: for all $\pi \in \mathbb{H}_m$,

$$K_M(\pi) := \inf_{T \in \mathbf{T}_M} e_T(\pi).$$

Since $\mathbf{T}_M$ is compact, the above infimum is attained on at least one triangle, that we denote by $T_M(\pi)$. Note that the map $\pi \mapsto T_M(\pi)$ need not be continuous. It is clear that $K_M(\pi)$ decreases as $M$ grows. Note also that the restriction to triangles $T$ centered at $0$ is artificial, since the error is invariant by translation as noticed in (2.9). Therefore, $K_M(\pi)$ converges to $K(\pi)$ as $M \to +\infty$. Since $\mathbf{T}_M$ is compact, the map $\pi \mapsto \max_{T \in \mathbf{T}_M} e_T(\pi)$ defines a norm on $\mathbb{H}_m$, and is therefore bounded by $C_M \|\pi\|$ for some $C_M > 0$. One easily sees that the functions $\pi \mapsto e_T(\pi)$ are uniformly $C_M$-Lipschitz for all $T \in \mathbf{T}_M$, and so is $K_M$.

We now use this new function $K_M$ to obtain a local upper error estimate that is closely related to the local lower estimate in Proposition 2.3.1.

Proposition 2.3.3.
For $z_1 \in \Omega$, let $T$ be a triangle which is obtained from $T_M(\pi_{z_1})$ by rescaling and translation ($T = t\, T_M(\pi_{z_1}) + z_0$). Then for any $z_2 \in T$,

$$e_T(f) \le \Big( K_M(\pi_{z_2}) + B_M\, \omega\big( \max\{ |z_1 - z_2|,\ \operatorname{diam}(T) \} \big) \Big)\, |T|^{\frac{m}{2} + \frac{1}{p}}, \tag{2.25}$$

where $B_M > 0$ is a constant which depends on $M$.

Proof:
For all $z_1, z_2 \in \Omega$, we have

$$\begin{aligned}
e_{T_M(\pi_{z_1})}(\pi_{z_2}) &\le e_{T_M(\pi_{z_1})}(\pi_{z_1}) + C_M \|\pi_{z_1} - \pi_{z_2}\| \\
&= K_M(\pi_{z_1}) + C_M \|\pi_{z_1} - \pi_{z_2}\| \\
&\le K_M(\pi_{z_2}) + 2 C_M \|\pi_{z_1} - \pi_{z_2}\| \\
&\le K_M(\pi_{z_2}) + 2 C_M\, \omega(|z_1 - z_2|).
\end{aligned}$$

Therefore, if $T$ is of the form $T = t\, T_M(\pi_{z_1}) + z_0$, we obtain by a change of variable that

$$e_T(\pi_{z_2}) \le \Big( K_M(\pi_{z_2}) + 2 C_M\, \omega(|z_1 - z_2|) \Big)\, |T|^{\frac{m}{2} + \frac{1}{p}}.$$

Let $\mu_{z_2} \in \mathbb{P}_m$ be the Taylor polynomial of $f$ at the point $z_2$ up to degree $m$. Using (2.20) we obtain

$$\begin{aligned}
e_T(f) &\le e_T(\mu_{z_2}) + e_T(f - \mu_{z_2}) \\
&= e_T(\pi_{z_2}) + e_T(f - \mu_{z_2}) \\
&\le \Big( K_M(\pi_{z_2}) + 2 C_M\, \omega(|z_1 - z_2|) \Big)\, |T|^{\frac{m}{2} + \frac{1}{p}} + e_T(f - \mu_{z_2}).
\end{aligned}$$

As in the proof of Proposition 2.3.1, one has

$$e_T(f - \mu_{z_2}) \le C\, |T|^{\frac{1}{p}}\, \operatorname{diam}(T)^m\, \omega(\operatorname{diam} T),$$

with a constant $C$ independent of $T$, and thus

$$e_T(f) \le \Big( K_M(\pi_{z_2}) + 2 C_M\, \omega(|z_1 - z_2|) \Big)\, |T|^{\frac{m}{2} + \frac{1}{p}} + C\, |T|^{\frac{1}{p}}\, \operatorname{diam}(T)^m\, \omega(\operatorname{diam} T).$$

Since $T$ is the scaled version of a triangle in $\mathbf{T}_M$, it obeys $\operatorname{diam}(T) \le M |T|^{1/2}$. Therefore,

$$e_T(f) \le \Big( K_M(\pi_{z_2}) + (2 C_M + C M^m)\, \omega\big( \max\{ |z_1 - z_2|,\ \operatorname{diam}(T) \} \big) \Big)\, |T|^{\frac{m}{2} + \frac{1}{p}},$$

which is the desired inequality with $B_M := 2 C_M + C M^m$. $\diamond$

For some $r > 0$, we consider a fixed triangulation $\mathcal{R}$ of $\Omega$ satisfying

$$r \ge \sup_{R \in \mathcal{R}} \operatorname{diam}(R).$$

Our strategy to build a triangulation that satisfies the optimal upper estimate is to use the triangles $R$ as macro-elements, in the sense that each of them will be tiled by a locally optimal uniform triangulation. This strategy was already used in [4].

For all $R \in \mathcal{R}$ we consider the triangle

$$T_R := \big( K_M(\pi_{b_R}) + 2 B_M\, \omega(r) \big)^{-\tau/2}\, T_M(\pi_{b_R}),$$

which is a scaled version of $T_M(\pi_{b_R})$, where $b_R$ is the barycenter of $R$. We use this triangle to build a periodic tiling $\mathcal{P}_R$ of the plane: there exists $c \in \mathbb{R}^2$ such that $T_R \cup T'_R$ forms a parallelogram of side vectors $a$ and $b$, with $T'_R = c - T_R$. We then define

$$\mathcal{P}_R := \{ T_R + ma + nb \ ;\ m, n \in \mathbb{Z} \} \cup \{ T'_R + ma + nb \ ;\ m, n \in \mathbb{Z} \}. \tag{2.26}$$

Observe that for all $\pi \in \mathbb{H}_m$ and all triangles $T, T'$ such that $T' = -T$, one has $e_T(\pi) = e_{T'}(\pi)$, since $\pi$ is either an even polynomial when $m$ is an even integer, or an odd polynomial when $m$ is odd. Since we already know that $e_T(\pi)$ is invariant by translation of $T$, we find that the local error $e_T(\pi)$ is constant on all $T \in \mathcal{P}_R$.

We now define as follows a family of triangulations $\mathcal{T}_s$ of the domain $\Omega$, for $s > 0$. For all $R \in \mathcal{R}$, we consider the elements $T \cap R$ for $T \in s\mathcal{P}_R$, where $s\mathcal{P}_R$ denotes the triangulation $\mathcal{P}_R$ scaled by the factor $s$. Clearly, $\{ T \cap R,\ T \in s\mathcal{P}_R,\ R \in \mathcal{R} \text{ and } \operatorname{int}(T \cap R) \ne \emptyset \}$ constitutes a partition of $\Omega$, up to a set of zero Lebesgue measure. In this partition, we distinguish the interior elements

$$\mathcal{T}^{\rm reg}_s := \{ T \in s\mathcal{P}_R \ ;\ T \subset \operatorname{int}(R),\ R \in \mathcal{R} \},$$

which define pieces of a conforming triangulation, and the boundary elements $T \cap R$ for $T \in s\mathcal{P}_R$ such that $\operatorname{int}(T) \cap \partial R \ne \emptyset$. These last elements might not be triangular, nor conformal with the elements on the other side. Note that for $s > 0$ small enough, each $R \in \mathcal{R}$ contains at least one triangle in $\mathcal{T}^{\rm reg}_s$, and therefore the boundary elements constitute a layer around the edges of $\mathcal{R}$.

Figure 2.1: a. An edge of a macro-element $R$ separating two uniformly paved regions ($T_R$ is thick, $\mathcal{P}_R$ is dashed). b. Additional edges (dashed) are added near the interface in order to preserve conformity. c. The sets of triangles $\mathcal{T}^{\rm reg}_s$ (gray) and $\mathcal{T}^{\rm bd}_s$ (white).

In order to obtain a conforming triangulation, we proceed as follows: for each boundary element $T \cap R$, we consider the points on its boundary which are either its vertices or those of a neighboring element. We then build the Delaunay triangulation of these points, which is a triangulation of $T \cap R$ since it is a convex set. We denote by $\mathcal{T}^{\rm bd}_s$ the set of all triangles obtained by this procedure, which is illustrated in Figure 2.1.

Our conforming triangulation is given by

$$\mathcal{T}_s = \mathcal{T}^{\rm reg}_s \cup \mathcal{T}^{\rm bd}_s.$$

As $s \to 0$, clearly,

$$\#(\mathcal{T}^{\rm bd}_s) \le C_{\rm bd}\, s^{-1} \quad \text{and} \quad \sum_{T \in \mathcal{T}^{\rm bd}_s} |T| \le C_{\rm bd}\, s$$

for some constant $C_{\rm bd}$ which depends on the macro-triangulation $\mathcal{R}$. We do not need to estimate $C_{\rm bd}$ since $\mathcal{R}$ is fixed and the contribution due to $C_{\rm bd}$ in the following estimates is negligible as $s \to 0$. We therefore obtain that the number of triangles in $\mathcal{T}^{\rm bd}_s$ is dominated by the number of triangles in $\mathcal{T}^{\rm reg}_s$. More precisely, we have the equivalence

$$\#(\mathcal{T}_s) \sim \#(\mathcal{T}^{\rm reg}_s) \sim \sum_{R \in \mathcal{R}} \frac{|R|}{s^2 |T_R|} = s^{-2} \sum_{R \in \mathcal{R}} |R|\, \big( K_M(\pi_{b_R}) + 2 B_M\, \omega(r) \big)^\tau, \tag{2.27}$$

in the sense that the ratio between the above quantities tends to $1$ as $s \to 0$. The right-hand side in (2.27) can be estimated through an integral:

$$\begin{aligned}
s^2\, \#(\mathcal{T}^{\rm reg}_s) &\le \sum_{R \in \mathcal{R}} |R|\, \big( K_M(\pi_{b_R}) + 2 B_M\, \omega(r) \big)^\tau \\
&= \sum_{R \in \mathcal{R}} \int_R \big( K_M(\pi_{b_R}) + 2 B_M\, \omega(r) \big)^\tau\, dz \\
&\le \sum_{R \in \mathcal{R}} \int_R \big( K_M(\pi_z) + C_M \|\pi_z - \pi_{b_R}\| + 2 B_M\, \omega(r) \big)^\tau\, dz \\
&\le \int_\Omega \big( K_M(\pi_z) + (2 B_M + C_M)\, \omega(r) \big)^\tau\, dz.
\end{aligned}$$

Therefore, since $C_M \le B_M$,

$$\#(\mathcal{T}_s) \le s^{-2} \left( \int_\Omega \big( K_M(\pi_z) + 3 B_M\, \omega(r) \big)^\tau\, dz + C_{\rm bd}\, s \right). \tag{2.28}$$

Observe that the construction of $\mathcal{T}_s$ gives a bound on the diameter of its elements:

$$\sup_{T \in \mathcal{T}_s} \operatorname{diam}(T) \le s\, C_a, \qquad C_a := \max_{R \in \mathcal{R}} \operatorname{diam}(T_R).$$

Combining this with (2.27), we obtain that

$$\sup_{T \in \mathcal{T}_s} \operatorname{diam}(T) \le C_A\, \#(\mathcal{T}_s)^{-1/2}$$

for all $s > 0$, which is analogous to the admissibility condition (2.6).

We now estimate the global interpolation error $\|f - \mathrm{I}^{m-1}_{\mathcal{T}_s} f\|_{L^p} := \left( \sum_{T \in \mathcal{T}_s} e_T(f)^p \right)^{1/p}$, assuming first that $1 \le p < \infty$. We first estimate the contribution of $\mathcal{T}^{\rm bd}_s$, which will eventually be negligible. Denoting by $\nu_z \in \mathbb{P}_{m-1}$ the Taylor polynomial of $f$ up to degree $m - 1$ at the point $z$, we remark that

$$\|f - \mathrm{I}^{m-1}_T f\|_{L^\infty(T)} = \|(I - \mathrm{I}^{m-1}_T)(f - \nu_{b_T})\|_{L^\infty(T)} \le C_1 \|f - \nu_{b_T}\|_{L^\infty(T)} \le C_1 C_2\, \operatorname{diam}(T)^m,$$

where $C_1$ is the norm of $I - \mathrm{I}^{m-1}_T$ in $L^\infty(T)$, which is independent of $T$, and $C_2$ only depends on the $L^\infty$ norm of $d^m f$. Remarking that

$$e_T(f) = \|f - \mathrm{I}^{m-1}_T f\|_{L^p(T)} \le |T|^{\frac{1}{p}}\, \|f - \mathrm{I}^{m-1}_T f\|_{L^\infty(T)},$$

we obtain an upper bound for the contribution of $\mathcal{T}^{\rm bd}_s$ to the error:

$$\begin{aligned}
\sum_{T \in \mathcal{T}^{\rm bd}_s} e_T(f)^p &\le C_1^p C_2^p \sum_{T \in \mathcal{T}^{\rm bd}_s} |T|\, \operatorname{diam}(T)^{mp} \\
&\le C_1^p C_2^p \sum_{T \in \mathcal{T}^{\rm bd}_s} |T| \sup_{T \in \mathcal{T}^{\rm bd}_s} \operatorname{diam}(T)^{mp} \\
&\le C_1^p C_2^p\, C_{\rm bd}\, s \sup_{T \in \mathcal{T}^{\rm bd}_s} \operatorname{diam}(T)^{mp} \\
&\le C^*_{\rm bd}\, s^{mp + 1},
\end{aligned}$$

with $C^*_{\rm bd} = C_1^p C_2^p\, C_a^{mp}\, C_{\rm bd}$. We next turn to the contribution of $\mathcal{T}^{\rm reg}_s$ to the error.
If T ∈ T reg s , T ⊂ R ∈ R , we consider any point z = z ∈ T and define z = b R the barycenterof R . With such choices, the estimate (2.25) reads e T ( f ) ≤ ( K M ( π z ) + B M ω (max { r, C A s } )) | T | m + p . We now assume that s is chosen small enough such that C A s ≤ r . Geometrically, thiscondition ensures that the “micro-triangles” constituting T s actually have a smaller dia-meter than the “macro-triangles” constituting R . This implies e T ( f ) p ≤ ( K M ( π z ) + B M ω ( r )) p | T | mp +1 . (2.29)8 Chapter 2. Sharp asymptotics of the L p interpolation error Given a triangle T ∈ T reg s , T ⊂ R ∈ R , and a point z ∈ T , one has | T | = s ( K M ( π b R ) + 2 B M ω ( r )) − τ ≤ s ( K M ( π z ) − C M (cid:107) π z − π b R (cid:107) + 2 B M ω ( r )) − τ ≤ s ( K M ( π z ) + (2 B M − C M ) ω ( r )) − τ . Observing that B M ≥ C M , and that p − τ mp = τ , we insert the above inequality into theestimate (2.29), which yields e T ( f ) p ≤ s mp ( K M ( π z ) + B M ω ( r )) τ | T | . Averaging on z ∈ T , we obtain e T ( f ) p ≤ s mp (cid:90) T ( K M ( π z ) + B M ω ( r )) τ dz. Adding up contributions from all triangles in T s , we find e T s ( f ) p = (cid:88) T ∈T reg s e T ( f ) p + (cid:88) T ∈T bd s e T ( f ) p ≤ s mp (cid:90) Ω ( K M ( π z ) + B M ω ( r )) τ dz + C ∗ bd s mp +1 . Combining this with the estimate (2.28) we obtain e T s T s ) m ≤ (cid:18)(cid:90) Ω ( K M ( π z ) + B M ω ( r )) τ dz + C ∗ bd s (cid:19) p (cid:18)(cid:90) Ω ( K M ( π z ) + 3 B M ω ( r )) τ dz + C bd s (cid:19) m , and therefore, since τ = m + p ,lim sup s → (cid:16) T s ) m e T s (cid:17) ≤ (cid:18)(cid:90) Ω ( K M ( π z ) + 3 B M ω ( r )) τ dz (cid:19) τ . It is now time to observe that for any fixed M ,lim r → (cid:90) Ω ( K M ( π z ) + 3 B M ω ( r )) τ dz = (cid:90) Ω K τM ( π z ) dz, and that, since K M ( π ) converges decreasingly to K ( π ) := K m,p ( π ) for any π ∈ IH m lim M → + ∞ (cid:90) Ω K τM ( π z ) dz = (cid:90) Ω K τ ( π z ) dz. Therefore, for all ε >
0, we can choose $M$ sufficiently large and $r$ sufficiently small, such that
$$\limsup_{s\to 0}\ \#(\mathcal{T}_s)^{\frac m2}\, e_{\mathcal{T}_s}(f) \le \left(\int_\Omega K^{\tau}(\pi_z)\,dz\right)^{\frac1\tau} + \varepsilon.$$
This gives us the previously mentioned statement of Theorem 2.1.2, by defining
$$s_N := \min\{s > 0\ ;\ \#(\mathcal{T}_s) \le N\},$$
and $\mathcal{T}_N = \mathcal{T}_{s_N}$.

The adaptation of the above proof to the case $p = \infty$ is not straightforward, because the contribution of $\mathcal{T}_s^{\rm bd}$ to the error is no longer negligible with respect to the contribution of $\mathcal{T}_s^{\rm reg}$. For this reason, one needs to modify the construction of $\mathcal{T}_s^{\rm bd}$. Here, we provide a simple construction for which the resulting triangulation $\mathcal{T}_s$ is nonconforming, as we do not know how to produce a satisfactory conforming triangulation.

More precisely, we define $\mathcal{T}_s^{\rm reg}$ in a similar way as for $p < \infty$, and add to the construction of $\mathcal{T}_s^{\rm bd}$ a post-processing step in which each triangle is split into $4^j$ similar triangles according to the midpoint rule. Here we take for $j$ the smallest integer which is larger than $-\frac14\log_2 s$. With such an additional splitting, we thus have
$$\max_{T\in\mathcal{T}_s^{\rm bd}} \operatorname{diam}(T) \le s^{1/4}\max_{R\in\mathcal{R}} \operatorname{diam}(sT_R) = C_a\,s^{5/4}.$$
The contribution of $\mathcal{T}_s^{\rm bd}$ to the $L^\infty$ interpolation error is bounded by
$$e_{\mathcal{T}_s^{\rm bd}}(f) \le C_0 C_1 \max_{T\in\mathcal{T}_s^{\rm bd}} \operatorname{diam}(T)^m \le C^*_{\rm bd}\,s^{5m/4},$$
with $C^*_{\rm bd} := C_0 C_1 C_a^m$. We also have $\#(\mathcal{T}_s^{\rm bd}) \le C_{\rm bd}\,s^{-3/2}$, which remains negligible compared to $s^{-2}$. We therefore obtain
$$\#(\mathcal{T}_s) \le s^{-2}\left(\int_\Omega \big(K_M(\pi_z) + 3B_M\omega(r)\big)^{\frac 2m} dz + C_{\rm bd}\,s^{1/2}\right). \qquad (2.30)$$
Moreover, if $T \in \mathcal{T}_s^{\rm reg}$ and $T \subset R \in \mathcal{R}$, we have according to the estimate (2.25),
$$e_T(f) \le \big(K_M(\pi_{b_R}) + B_M\,\omega(\max\{r, C_A s\})\big)\,|T|^{\frac m2}.$$
By construction $|T| = s^2\big(K_M(\pi_{b_R}) + 2B_M\omega(r)\big)^{-2/m}$. This implies $e_T(f) \le s^m$ when $C_A s \le r$. Therefore,
$$e_{\mathcal{T}_s}(f) = \max\{e_{\mathcal{T}_s^{\rm reg}}(f),\ e_{\mathcal{T}_s^{\rm bd}}(f)\} \le s^m \max\{1,\ C^*_{\rm bd}\,s^{m/4}\}.$$
Combining this estimate with (2.30) yields
$$\limsup_{s\to 0}\ \#(\mathcal{T}_s)^{\frac m2}\, e_{\mathcal{T}_s}(f) \le \left(\int_\Omega \big(K_M(\pi_z) + 3B_M\omega(r)\big)^{\frac 2m} dz\right)^{\frac m2},$$
and we conclude the proof in a similar way as for $p < \infty$.

2.4 The shape function and the optimal metric for linear and quadratic elements

This section is devoted to linear ($m = 2$) and quadratic ($m = 3$) elements, which are the most commonly used in practice. In these two cases, we are able to derive an exact expression for $K_{m,p}(\pi)$ in terms of the coefficients of $\pi$. Our analysis also gives us access to the distorted metric which characterizes the optimal mesh. While the results concerning linear elements have strong similarities with those of [4], those concerning quadratic elements are to our knowledge the first of this kind, although [24] analyzes a similar setting.

In order to give the exact expression of $K_{m,p}$, we define the determinant of a homogeneous quadratic polynomial by
$$\det(ax^2 + 2bxy + cy^2) = ac - b^2,$$
and the discriminant of a homogeneous cubic polynomial by
$$\operatorname{disc}(ax^3 + bx^2y + cxy^2 + dy^3) = b^2c^2 - 4ac^3 - 4b^3d + 18abcd - 27a^2d^2.$$
The functions $\det$ on $\mathbb{H}_2$ and $\operatorname{disc}$ on $\mathbb{H}_3$ are homogeneous in the sense that
$$\det(\lambda\pi) = \lambda^2\det\pi, \qquad \operatorname{disc}(\lambda\pi) = \lambda^4\operatorname{disc}\pi. \qquad (2.31)$$
Moreover, it is well known that they obey an invariance property with respect to linear changes of coordinates $\phi$:
$$\det(\pi\circ\phi) = (\det\phi)^2\det\pi, \qquad \operatorname{disc}(\pi\circ\phi) = (\det\phi)^6\operatorname{disc}\pi. \qquad (2.32)$$
Our main result relates $K_{m,p}$ to these quantities.

Theorem 2.4.1.
We have for all $\pi \in \mathbb{H}_2$,
$$K_{2,p}(\pi) = \sigma_p(\det\pi)\,\sqrt{|\det\pi|},$$
and for all $\pi \in \mathbb{H}_3$,
$$K_{3,p}(\pi) = \sigma^*_p(\operatorname{disc}\pi)\,\sqrt[4]{|\operatorname{disc}\pi|},$$
where $\sigma_p(t)$ and $\sigma^*_p(t)$ are constants that only depend on the sign of $t$.

The proof of Theorem 2.4.1 relies on the possibility of mapping an arbitrary polynomial $\pi \in \mathbb{H}_2$ such that $\det(\pi) \ne 0$, or $\pi \in \mathbb{H}_3$ such that $\operatorname{disc}(\pi) \ne 0$, onto two fixed polynomials $\pi_-$ or $\pi_+$ by a linear change of variable and a sign change.

In the case of $\mathbb{H}_2$, it is well known that we can choose $\pi_- = x^2 - y^2$ and $\pi_+ = x^2 + y^2$. More precisely, to all $\pi \in \mathbb{H}_2$, we associate a symmetric matrix $Q_\pi$ such that $\pi(z) = \langle Q_\pi z, z\rangle$. This matrix can be diagonalized according to
$$Q_\pi = U^T \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} U, \qquad U \in O_2,\ \lambda_1, \lambda_2 \in \mathbb{R}.$$
Defining
$$\phi_\pi := U^T \begin{pmatrix} |\lambda_1|^{-1/2} & 0 \\ 0 & |\lambda_2|^{-1/2} \end{pmatrix} \quad\text{and}\quad \lambda_\pi = \operatorname{sign}(\lambda_1) \in \{-1, 1\},$$
it is readily seen that
$$\lambda_\pi\,\pi\circ\phi_\pi = \begin{cases} x^2 + y^2 & \text{if } \det\pi > 0,\\ x^2 - y^2 & \text{if } \det\pi < 0.\end{cases}$$
In the case of $\mathbb{H}_3$, a similar result holds, as shown by the following lemma:

Lemma 2.4.2. Let $\pi \in \mathbb{H}_3$ be such that $\operatorname{disc}\pi \ne 0$. There exists a linear transform $\phi_\pi$ such that
$$\pi\circ\phi_\pi = \begin{cases} x(x^2 - 3y^2) & \text{if } \operatorname{disc}\pi > 0,\\ x(x^2 + 3y^2) & \text{if } \operatorname{disc}\pi < 0.\end{cases} \qquad (2.33)$$

Proof:
Let us first assume that $\pi$ is not divisible by $y$, so that it can be factorized as
$$\pi = \lambda(x - r_1y)(x - r_2y)(x - r_3y),$$
with $\lambda \in \mathbb{R}$ and $r_i \in \mathbb{C}$. If $\operatorname{disc}\pi > 0$, then the $r_i$ are real and we may assume $r_1 < r_2 < r_3$. Then defining
$$\phi_\pi = \lambda\,(2\operatorname{disc}\pi)^{-1/3} \begin{pmatrix} r_1(r_2 + r_3) - 2r_2r_3 & (r_2 - r_3)\,r_1\sqrt3 \\ 2r_1 - (r_2 + r_3) & (r_2 - r_3)\sqrt3 \end{pmatrix},$$
an elementary computation shows that $\pi\circ\phi_\pi = x(x^2 - 3y^2)$. If $\operatorname{disc}\pi < 0$, then we may assume that $r_1$ is real and that $r_2$ and $r_3$ are complex conjugates with $\operatorname{Im}(r_2) > 0$. Then defining
$$\phi_\pi = \lambda\,(2\operatorname{disc}\pi)^{-1/3} \begin{pmatrix} r_1(r_2 + r_3) - 2r_2r_3 & i(r_2 - r_3)\,r_1\sqrt3 \\ 2r_1 - (r_2 + r_3) & i(r_2 - r_3)\sqrt3 \end{pmatrix},$$
where $(2\operatorname{disc}\pi)^{-1/3}$ denotes the real cube root, an elementary computation shows that $\pi\circ\phi_\pi = x(x^2 + 3y^2)$. Moreover, it is easily checked that $\phi_\pi$ has real entries and is therefore a change of variable in $\mathbb{R}^2$.

In the case where $\pi$ is divisible by $y$, there exists a rotation $U \in O_2$ such that $\tilde\pi := \pi\circ U$ is not divisible by $y$. By the invariance property (2.32) we know that $\operatorname{disc}\pi = \operatorname{disc}\tilde\pi$. Thus, we reach the same conclusion with the choice $\phi_\pi := U\circ\phi_{\tilde\pi}$. $\diamond$

Proof of Theorem 2.4.1
For all $\pi \in \mathbb{H}_2$ such that $\det\pi \ne 0$, and for all changes of variable $\phi$ and all $\lambda \ne 0$, we may combine the properties of the determinant in (2.31) and (2.32) with those of the shape function established in Proposition 2.2.3. This gives us
$$\frac{K_{2,p}(\pi)}{\sqrt{|\det\pi|}} = \frac{K_{2,p}(\lambda\,\pi\circ\phi)}{\sqrt{|\det(\lambda\,\pi\circ\phi)|}}.$$
Applying this with $\phi = \phi_\pi$ and $\lambda = \lambda_\pi$, we therefore obtain
$$K_{2,p}(\pi) = \sqrt{|\det\pi|}\ \begin{cases} K_{2,p}(x^2 + y^2) & \text{if } \det\pi > 0,\\ K_{2,p}(x^2 - y^2) & \text{if } \det\pi < 0.\end{cases}$$
This gives the desired result with $\sigma_p(t) = K_{2,p}(x^2 + y^2)$ for $t > 0$ and $\sigma_p(t) = K_{2,p}(x^2 - y^2)$ for $t < 0$. In the case where $\det\pi = 0$, $\pi$ is of the form $\pi(x, y) = \lambda(\alpha x + \beta y)^2$, and we conclude by Proposition 2.2.1 that $K_{2,p}(\pi) = 0$.

For all $\pi \in \mathbb{H}_3$ such that $\operatorname{disc}\pi \ne 0$, a similar reasoning yields
$$K_{3,p}(\pi) = \sqrt[4]{|\operatorname{disc}\pi|}\ 108^{-1/4}\ \begin{cases} K_{3,p}(x(x^2 - 3y^2)) & \text{if } \operatorname{disc}\pi > 0,\\ K_{3,p}(x(x^2 + 3y^2)) & \text{if } \operatorname{disc}\pi < 0,\end{cases}$$
where the constant 108 comes from the fact that $\operatorname{disc}(x(x^2 - 3y^2)) = -\operatorname{disc}(x(x^2 + 3y^2)) = 108$. This gives the desired result with $\sigma^*_p(t) = 108^{-1/4}K_{3,p}(x(x^2 - 3y^2))$ for $t > 0$ and $\sigma^*_p(t) = 108^{-1/4}K_{3,p}(x(x^2 + 3y^2))$ for $t < 0$. In the case where $\operatorname{disc}\pi = 0$, $\pi$ is of the form $\pi(x, y) = (\alpha x + \beta y)^2(\gamma x + \delta y)$, and we conclude by Proposition 2.2.1 that $K_{3,p}(\pi) = 0$. $\diamond$

Remark 2.4.3.
We do not know any simple analytical expression for the constants involved in $\sigma_p$ and $\sigma^*_p$, but these can be found by numerical optimization. These constants are known for some special values of $p$ in the case $m = 2$, see for example [4].

Practical mesh generation techniques such as in [15, 19, 66, 92, 94] are based on the data of a Riemannian metric, by which we mean a field $h$ of symmetric positive definite matrices
$$x \in \Omega \mapsto h(x) \in S_2^+.$$
Typically, the mesh generator takes the metric $h$ as an input and hopefully returns a triangulation $\mathcal{T}_h$ adapted to it, in the sense that all triangles are close to equilateral of unit side length with respect to this metric. Recently, it has been rigorously proved in [15, 85], see also Chapter 5, that some algorithms produce bidimensional meshes obeying these constraints under certain conditions. This must be contrasted with algorithms based on heuristics, such as [92] in two dimensions and [94] in three dimensions, which have been available for some time and offer good performance [20] but no theoretical guarantees. See [56] for a review of these mesh generation techniques.

For a given function $f$ to be approximated, the field of metrics given as input should be such that the local errors are equidistributed and the aspect ratios are optimal for the generated triangulation. Assuming that the error is measured in $X = L^p$ and that we are using finite elements of degree $m - 1$, we can construct this metric as follows, provided that some estimate of $\pi_z = \frac{d^mf(z)}{m!}$ is available for all points $z \in \Omega$. An ellipse $E_z$ such that $|E_z|$ is equal or close to
$$\sup_{E\in\mathcal{E},\ E\subset\Lambda_{\pi_z}} |E| \qquad (2.34)$$
is computed, where $\Lambda_{\pi_z}$ is defined as in (2.14). We denote by $h_{\pi_z} \in S_2^+$ the associated symmetric positive definite matrix such that
$$E_z = \{(x, y)\ ;\ (x, y)^T h_{\pi_z} (x, y) \le 1\}.$$
Given a target value $\nu > 0$ for the $L^p$ error on each triangle, we then define the metric by rescaling $h_{\pi_z}$ according to
$$h(z) = \frac{1}{\alpha_z^2}\,h_{\pi_z} \quad\text{where}\quad \alpha_z := \nu^{\frac{p}{mp+2}}\,|E_z|^{-\frac{1}{mp+2}}.$$
With such a rescaling, any triangle $T$ designed by the mesh generator should be comparable to the ellipse $z + \alpha_zE_z$ centered around the barycenter $z$ of $T$, in the sense that
$$z + c_1\alpha_zE_z \subset T \subset z + c_2\alpha_zE_z \qquad (2.35)$$
for two fixed constants $0 < c_1 \le c_2$ independent of $T$ (recall that for any ellipse $E$ there always exists a triangle $T$ such that $E \subset T \subset 2E$).

Such a triangulation heuristically fulfills the desired properties of optimal aspect ratio and error equidistribution when the mesh is sufficiently refined. Indeed, we then have
$$e_{m,T}(f)_p \approx e_{m,T}(\pi_z)_p = \|\pi_z - I^{m-1}_T\pi_z\|_{L^p(T)} \sim |T|^{\frac1p}\,\|\pi_z - I^{m-1}_T\pi_z\|_{L^\infty(T)} \sim |T|^{\frac1p}\,\|\pi_z\|_{L^\infty(T)}$$
$$\sim |\alpha_zE_z|^{\frac1p}\,\|\pi_z\|_{L^\infty(\alpha_zE_z)} = \alpha_z^{m+\frac2p}\,|E_z|^{\frac1p}\,\|\pi_z\|_{L^\infty(E_z)} = \nu,$$
where we have used the fact that $\pi_z \in \mathbb{H}_m$ is homogeneous of degree $m$, and that $\|\pi_z\|_{L^\infty(E_z)} = 1$ since the maximal ellipse $E_z$ touches the boundary of $\Lambda_{\pi_z}$.

Leaving aside these heuristics on error estimation and mesh generation, we focus on the main computational issue in the design of the metric $h(z)$, namely the solution to the problem (2.34): to any given $\pi \in \mathbb{H}_m$, we want to associate $h_\pi \in S_2^+$ such that the ellipse $E_\pi$ defined by $h_\pi$ has area equal or close to $\sup_{E\in\mathcal{E},\ E\subset\Lambda_\pi}|E|$.

When $m = 2$, the computation of the optimal matrix $h_\pi$ can be done by elementary algebraic means. In fact, as will be recalled below, $h_\pi$ is simply the absolute value (in the sense of symmetric matrices) of the symmetric matrix $[\pi]$ associated to the quadratic form $\pi$. These facts are well known and used in mesh generation algorithms for $\mathbb{P}_1$ elements.

When $m \ge 3$, no algebraic derivation of $h_\pi$ from $\pi$ has been proposed up to now, and current approaches instead consist in numerically solving the optimization problem (2.16), see [23]. Since these computations have to be done extremely frequently in the mesh adaptation process, a simpler algebraic procedure is highly valuable. In this section, we propose a simple and algebraic method in the case $m = 3$, corresponding to quadratic elements. For purposes of comparison, the results already known in the case $m = 2$ are recalled.

[Figure 2.2: the set $\Lambda_\pi$ for $\pi = x(x^2 - 3y^2)$ or $\pi = x(x^2 + 3y^2)$.]

Proposition 2.4.4.
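1. Let $\pi \in \mathbb{H}_2$ be such that $\det(\pi) \ne 0$, and consider its associated $2\times2$ matrix, which can be written as
$$[\pi] = U^T \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} U, \qquad U \in O_2.$$
Then an ellipse of maximal volume inscribed in $\Lambda_\pi$ is defined by the matrix
$$h_\pi = U^T \begin{pmatrix} |\lambda_1| & 0 \\ 0 & |\lambda_2| \end{pmatrix} U.$$
2. Let $\pi \in \mathbb{H}_3$ be such that $\operatorname{disc}\pi > 0$, and let $\phi_\pi$ be a matrix satisfying (2.33). Define
$$h_\pi = (\phi_\pi^{-1})^T\phi_\pi^{-1}. \qquad (2.36)$$
Then $h_\pi$ defines an ellipse of maximal volume inscribed in $\Lambda_\pi$. Moreover, $\det h_\pi = \frac{2^{-2/3}}{3}(\operatorname{disc}\pi)^{1/3}$.
3. Let $\pi \in \mathbb{H}_3$ be such that $\operatorname{disc}\pi < 0$, and let $\phi_\pi$ be a matrix satisfying (2.33). Define
$$h_\pi = 2^{1/3}(\phi_\pi^{-1})^T\phi_\pi^{-1}.$$
Then $h_\pi$ defines an ellipse of maximal volume inscribed in $\Lambda_\pi$. Moreover, $\det h_\pi = \frac13|\operatorname{disc}\pi|^{1/3}$.

Proof: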
Clearly, if the matrix $h_\pi$ defines an ellipse of maximal volume in the set $\Lambda_\pi$, then for any linear change of coordinates $\phi$, the matrix $(\phi^{-1})^Th_\pi\phi^{-1}$ defines an ellipse of maximal volume in the set $\Lambda_{\pi\circ\phi^{-1}}$. When $\pi \in \mathbb{H}_2$, we know that $\lambda_\pi\,\pi\circ\phi_\pi = x^2 + y^2$ when $\det\pi > 0$, and $x^2 - y^2$ when $\det\pi < 0$, where $|\lambda_\pi| = 1$. When $\pi \in \mathbb{H}_3$, we know from Lemma 2.4.2 that $\pi\circ\phi_\pi = x(x^2 - 3y^2)$ when $\operatorname{disc}\pi > 0$, and $x(x^2 + 3y^2)$ when $\operatorname{disc}\pi < 0$. It therefore suffices to show that if $\pi \in \{x^2 + y^2,\ x^2 - y^2,\ x(x^2 - 3y^2)\}$, then $h_\pi = \mathrm{Id}$, which means that the disc of radius 1 is an ellipse of maximal volume inscribed in $\Lambda_\pi$, while when $\pi = x(x^2 + 3y^2)$ we have $h_\pi = 2^{1/3}\,\mathrm{Id}$.

The case $\pi = x^2 + y^2$ is trivial. We next concentrate on the case $\pi = x(x^2 + 3y^2)$, the treatment of the two other cases being very similar. Let $E$ be an ellipse included in $\Lambda_\pi$, with $\pi = x(x^2 + 3y^2)$. Analyzing the variations of the function $\theta \mapsto \pi(\cos\theta, \sin\theta)$, it is not hard to see that we can rotate $E$ into another ellipse $E'$, also satisfying the inclusion $E' \subset \Lambda_\pi$, and whose principal axes are $\{x = 0\}$ and $\{y = 0\}$. We therefore only need to consider ellipses of the form $kx^2 + hy^2 \le 1$. For a given value of $h$, we denote by $k(h)$ the minimal value of $k$ for which this ellipse is included in $\Lambda_\pi$. Clearly, the boundary of the ellipse, defined by $k(h)x^2 + hy^2 = 1$, must be tangent to the curve defined by $\pi(x, y) = 1$ at some point $(x, y)$. This translates into the following system of equations:
$$\begin{cases} \pi(x, y) = 1,\\ kx^2 + hy^2 = 1,\\ hy\,\partial_x\pi(x, y) - kx\,\partial_y\pi(x, y) = 0.\end{cases} \qquad (2.37)$$
Eliminating the variables $x$ and $y$ from this system, as well as negative or complex-valued solutions, we find that $k(h) = \frac{4 + h^3}{3h^2}$ when $h \in (0, 2)$, and $k(h) = k(2) = 1$ when $h \ge 2$. The minimum of the determinant $hk(h) = \frac13\left(\frac4h + h^2\right)$ is attained for $h = 2^{1/3}$. Observing that $k(2^{1/3}) = 2^{1/3}$, we obtain, as previously stated, $h_\pi = 2^{1/3}\,\mathrm{Id}$, and the ellipse of largest area included in $\Lambda_\pi$ is the disc of equation $2^{1/3}(x^2 + y^2) \le 1$, as illustrated on Figure 2.2.b.

The same reasoning applies to the other cases. For $\pi = x^2 - y^2$ we obtain $k(h) = \frac1h$, $h \in (0, \infty)$. In this case the determinant $hk(h)$ is independent of $h$, and we simply choose $h = 1 = k(1)$. For $\pi = x(x^2 - 3y^2)$ we obtain $k(h) = \frac{4 - h^3}{3h^2}$ when $h \in (0, 1]$, and $k(h) = k(1) = 1$ when $h > 1$. The maximal volume is attained when $h = 1$, corresponding to the unit disc, as illustrated on Figure 2.2.a. $\diamond$

Remark 2.4.5.
When $\pi \in \mathbb{H}_3$ and $\operatorname{disc}\pi > 0$, a surprising simplification happens: the matrix (2.36) has entries which are symmetric functions of the roots $r_1, r_2, r_3$. Using the relation between the roots and the coefficients of a polynomial, we find the following expression: if $\pi = ax^3 + 3bx^2y + 3cxy^2 + dy^3$, then
$$h_\pi = 3\cdot2^{-1/3}(\operatorname{disc}\pi)^{-1/3} \begin{pmatrix} 2(b^2 - ac) & bc - ad \\ bc - ad & 2(c^2 - bd) \end{pmatrix},$$
where the scalar factor is the one for which $h_\pi = \mathrm{Id}$ when $\pi = x(x^2 - 3y^2)$. This yields a direct expression of the matrix as a function of the coefficients. Unfortunately, there is no such expression when $\operatorname{disc}\pi < 0$.

At first sight, Proposition 2.4.4 might seem to be a complete solution to the problem of building an appropriate metric for mesh generation. However, some difficulties arise at points $z \in \Omega$ where $\det\pi_z = 0$ or $\operatorname{disc}\pi_z = 0$. If $\pi \in \mathbb{H}_2\setminus\{0\}$ and $\det\pi = 0$, then up to a linear change of coordinates and a change of sign, we can assume that $\pi = x^2$. The minimization problem clearly yields the degenerate matrix $h_\pi = \operatorname{diag}(1, 0)$. If $\pi \in \mathbb{H}_3\setminus\{0\}$ and $\operatorname{disc}\pi = 0$, then up to a linear change of coordinates either $\pi = x^3$ or $\pi = x^2y$. In the first case the minimization problem again gives $h_\pi = \operatorname{diag}(1, 0)$, while in the second case a minimizing sequence is given by $h_\pi = \operatorname{diag}(\varepsilon^{-1}, \varepsilon^2)$ with $\varepsilon \to 0$. The minimization process therefore gives a matrix which is not only degenerate, but also unbounded.
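The closed form of Remark 2.4.5 and the degenerate behaviour just described can be illustrated by a minimal Python sketch. The function names (`disc_cubic`, `metric_cubic`, `eval_cubic`) are hypothetical and not part of any existing library, and the scalar factor $3\cdot2^{-1/3}(\operatorname{disc}\pi)^{-1/3}$ is an assumption fixed by requiring $h_\pi = \mathrm{Id}$ for $\pi = x(x^2 - 3y^2)$, as stated in the proof of Proposition 2.4.4.

```python
import math

def disc_cubic(A, B, C, D):
    """Discriminant of A x^3 + B x^2 y + C x y^2 + D y^3."""
    return (B * B * C * C - 4 * A * C ** 3 - 4 * B ** 3 * D
            + 18 * A * B * C * D - 27 * A * A * D * D)

def eval_cubic(a, b, c, d, x, y):
    """Value of pi = a x^3 + 3b x^2 y + 3c x y^2 + d y^3 at (x, y)."""
    return a * x ** 3 + 3 * b * x * x * y + 3 * c * x * y * y + d * y ** 3

def metric_cubic(a, b, c, d):
    """Matrix h_pi for pi = a x^3 + 3b x^2 y + 3c x y^2 + d y^3, disc(pi) > 0.

    Assumption: normalization chosen so that x(x^2 - 3y^2) maps to the identity.
    """
    disc = disc_cubic(a, 3 * b, 3 * c, d)
    if disc <= 0:
        # disc = 0 is the degenerate case discussed in the text;
        # disc < 0 has no such closed form in the coefficients.
        raise ValueError("closed form only valid when disc(pi) > 0")
    scale = 3.0 * 2.0 ** (-1.0 / 3.0) * disc ** (-1.0 / 3.0)
    h11 = scale * 2 * (b * b - a * c)
    h12 = scale * (b * c - a * d)
    h22 = scale * 2 * (c * c - b * d)
    return ((h11, h12), (h12, h22))

if __name__ == "__main__":
    print(metric_cubic(1, 0, -1, 0))   # pi = x^3 - 3 x y^2
    print(metric_cubic(8, 0, -2, 0))   # pi = 8 x^3 - 6 x y^2
```

For $\pi = x^2y$ (coefficients `(0, 1/3, 0, 0)`) the discriminant vanishes and the sketch raises an error instead of returning a matrix, which mirrors the fact that no bounded positive definite minimizer exists in that case.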
[Figure 2.3: the set $\Lambda_\pi$ (solid) and the ellipses $E_{\pi,\alpha}$ (dashed) for various values of $\alpha > 0$, with $\pi \in \mathbb{H}_2$.]

These degenerate cases appear generically and constitute a problem for mesh generation, since they mean that the adapted triangles are not well defined. Current anisotropic mesh generation algorithms for linear elements often solve this problem by fixing a small parameter $\delta > 0$ and working with the modified matrix $\tilde h_\pi := h_\pi + \delta\,\mathrm{Id}$, which cannot degenerate. However, this procedure cannot be extended to quadratic elements, since $h_{x^2y}$ is both degenerate and unbounded.

In the theoretical construction of an optimal mesh which was discussed in §2.3, a similar difficulty was handled by the introduction, for $M > 0$, of the modified shape function $K_M(\pi)$ and of the triangle $T_M(\pi)$ of minimal interpolation error among the triangles of diameter smaller than $M$. We follow a similar idea here, looking for the ellipse of largest area included in $\Lambda_\pi$ with constrained diameter. This provides matrices which are both positive definite and bounded, and which vary continuously with respect to the data $\pi \in \mathbb{H}_3$. A similar construction is available for arbitrary degree $m$ and dimension $d$; however, the expression of the matrix in terms of the polynomial $\pi$ is less explicit in that general context. The constrained problem, depending on $\alpha > 0$, is the following:
$$\sup\{|E|\ ;\ E \in \mathcal{E},\ E \subset \Lambda_\pi \text{ and } \operatorname{diam}E \le 2\alpha^{-1/2}\}, \qquad (2.38)$$
or equivalently,
$$\inf\{\det H\ ;\ H \in S_2^+ \text{ s.t. } \langle Hz, z\rangle \ge |\pi(z)|^{2/m},\ z \in \mathbb{R}^2, \text{ and } H \ge \alpha\,\mathrm{Id}\}. \qquad (2.39)$$
We denote by $E_{\pi,\alpha}$ and $h_{\pi,\alpha}$ the solutions to (2.38) and (2.39) respectively. In the remainder of this section, we show that this solution can also be computed by a simple algebraic procedure, avoiding any kind of numerical optimization. In the case where $\pi \in \mathbb{H}_2$, it can easily be checked that
$$h_{\pi,\alpha} = U^T \begin{pmatrix} \max\{|\lambda_1|, \alpha\} & 0 \\ 0 & \max\{|\lambda_2|, \alpha\} \end{pmatrix} U, \qquad (2.40)$$
as illustrated in Figure 2.3.

When $\pi \in \mathbb{H}_3$, the problem is more technical, and the matrix $h_{\pi,\alpha}$ takes different forms depending on the value of $\alpha$ and the sign of $\operatorname{disc}\pi$. In order to describe these different regimes, we introduce three real numbers $0 \le \beta_\pi \le \alpha_\pi \le \mu_\pi$ and a matrix $U_\pi \in O_2$, which are defined as follows. We first define $\mu_\pi$ by
$$\mu_\pi^{-1/2} := \min\{\|z\|\ ;\ |\pi(z)| = 1\},$$
the radius of the largest disc $D_\pi$ inscribed in $\Lambda_\pi$. For $z_\pi$ such that $|\pi(z_\pi)| = 1$ and $\|z_\pi\| = \mu_\pi^{-1/2}$, we define $U_\pi$ as the rotation which maps $z_\pi$ to the vector $(\|z_\pi\|, 0)$. We then define $\alpha_\pi$ by
$$2\alpha_\pi^{-1/2} := \max\{\operatorname{diam}(E)\ ;\ E \in \mathcal{E},\ D_\pi \subset E \subset \Lambda_\pi\},$$
the diameter of the largest ellipse inscribed in $\Lambda_\pi$ and containing the disc $D_\pi$. In the case where $\pi$ is of the form $(ax + by)^3$, this ellipse is infinitely long, and we set $\alpha_\pi = 0$. We finally define $\beta_\pi$ by
$$2\beta_\pi^{-1/2} := \operatorname{diam}(E_\pi),$$
where $E_\pi$ is the optimal ellipse described in Proposition 2.4.4. In the case where $\operatorname{disc}\pi = 0$, the "optimal ellipse" is infinitely long, and we set $\beta_\pi = 0$. It is readily seen that $0 \le \beta_\pi \le \alpha_\pi \le \mu_\pi$.

All these quantities can be algebraically computed from the coefficients of $\pi$ by solving equations of degree at most 4, as well as the other quantities involved in the description of the optimal $h_{\pi,\alpha}$ and $E_{\pi,\alpha}$ in the following result.

Proposition 2.4.6.
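For $\pi \in \mathbb{H}_3$ and $\alpha > 0$, the matrix $h_{\pi,\alpha}$ and ellipse $E_{\pi,\alpha}$ are described as follows:
1. If $\alpha \ge \mu_\pi$, then $h_{\pi,\alpha} = \alpha\,\mathrm{Id}$ and $E_{\pi,\alpha}$ is the disc of radius $\alpha^{-1/2}$.
2. If $\alpha_\pi \le \alpha \le \mu_\pi$, then
$$h_{\pi,\alpha} = U_\pi^T \begin{pmatrix} \mu_\pi & 0 \\ 0 & \alpha \end{pmatrix} U_\pi, \qquad (2.41)$$
and $E_{\pi,\alpha}$ is the ellipse of diameter $2\alpha^{-1/2}$ which is inscribed in $\Lambda_\pi$ and contains $D_\pi$. It is tangent to $\partial\Lambda_\pi$ at the two points $z_\pi$ and $-z_\pi$.
3. If $\beta_\pi \le \alpha \le \alpha_\pi$, then $E_{\pi,\alpha}$ is tangent to $\partial\Lambda_\pi$ at four points and has diameter $2\alpha^{-1/2}$. There are at most three such ellipses, and $E_{\pi,\alpha}$ is the one of largest area. The matrix $h_{\pi,\alpha}$ has a form which depends on the sign of $\operatorname{disc}\pi$.
(i) If $\operatorname{disc}\pi < 0$, then
$$h_{\pi,\alpha} = (\phi_\pi^{-1})^T \begin{pmatrix} \frac{4 + \lambda_\alpha^3}{3\lambda_\alpha^2} & 0 \\ 0 & \lambda_\alpha \end{pmatrix} \phi_\pi^{-1},$$
where $\phi_\pi$ is the matrix defined in Proposition 2.4.4 and $\lambda_\alpha$ is determined by $\det(h_{\pi,\alpha} - \alpha\,\mathrm{Id}) = 0$.
(ii) If $\operatorname{disc}\pi > 0$, then
$$h_{\pi,\alpha} = (\phi_\pi^{-1})^T V^T \begin{pmatrix} \frac{4 - \lambda_\alpha^3}{3\lambda_\alpha^2} & 0 \\ 0 & \lambda_\alpha \end{pmatrix} V \phi_\pi^{-1},$$
where $\phi_\pi$ and $\lambda_\alpha$ are given as in the case $\operatorname{disc}\pi < 0$, and where $V$ is chosen between the three rotations by 0, 60 or 120 degrees so as to maximize $|E_{\pi,\alpha}|$.
(iii) If $\operatorname{disc}\pi = 0$ and $\alpha_\pi > 0$, then there exists a linear change of coordinates $\phi$ such that $\pi\circ\phi = x^2y$ and we have
$$h_{\pi,\alpha} = (\phi^{-1})^T \begin{pmatrix} \lambda_\alpha & 0 \\ 0 & \frac{4}{27\lambda_\alpha^2} \end{pmatrix} \phi^{-1},$$
where $\lambda_\alpha$ is determined by $\det(h_{\pi,\alpha} - \alpha\,\mathrm{Id}) = 0$.
4. If $\alpha \le \beta_\pi$, then $h_{\pi,\alpha} = h_\pi$, and $E_{\pi,\alpha} = E_\pi$ is the solution of the unconstrained problem.

[Figure 2.4: the set $\Lambda_\pi$ (solid), the disc $E_{\pi,\mu_\pi} = D_\pi$ (solid), the ellipse $E_{\pi,\alpha_\pi}$ (solid), and the ellipses $E_{\pi,\alpha}$ (dashed) for various values of $\alpha$, with $\pi \in \mathbb{H}_3$ and $\alpha \in (\alpha_\pi, \infty)$. Left: $\operatorname{disc}\pi < 0$. Right: $\operatorname{disc}\pi > 0$.]

[Figure 2.5: the set $\Lambda_\pi$ (solid), the ellipse $E_{\pi,\alpha_\pi}$ (solid), the ellipse $E_{\pi,\beta_\pi} = E_\pi$ (solid), and the ellipses $E_{\pi,\alpha}$ (dashed) for various values of $\alpha$, with $\pi \in \mathbb{H}_3$ and $\alpha \in (\beta_\pi, \alpha_\pi)$. Left: $\operatorname{disc}\pi < 0$. Right: $\operatorname{disc}\pi > 0$.]

Proof: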
See Appendix. $\diamond$
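For linear elements, the closed forms of Proposition 2.4.4 (item 1) and of (2.40) only require the spectral decomposition of a symmetric $2\times2$ matrix. The following Python sketch is one possible way to evaluate them; the helper names `spectral_map` and `metric_quadratic` are hypothetical, and the quadratic form is assumed to be given through its matrix $[\pi] = ((a, b), (b, c))$, so that $\pi(x, y) = ax^2 + 2bxy + cy^2$.

```python
import math

def spectral_map(Q, f):
    """Apply f to the eigenvalues of the symmetric 2x2 matrix Q = ((a, b), (b, c))."""
    (a, b), (_, c) = Q
    mean = 0.5 * (a + c)
    rad = math.hypot(0.5 * (a - c), b)
    if rad == 0.0:
        # Q is a multiple of the identity: both eigenvalues equal `mean`.
        v = f(mean)
        return ((v, 0.0), (0.0, v))
    l1, l2 = mean + rad, mean - rad
    # Orthogonal projector onto the eigenspace of l1 is (Q - l2 Id) / (l1 - l2).
    p11, p12, p22 = (a - l2) / (2 * rad), b / (2 * rad), (c - l2) / (2 * rad)
    f1, f2 = f(l1), f(l2)
    return ((f1 * p11 + f2 * (1 - p11), (f1 - f2) * p12),
            ((f1 - f2) * p12, f1 * p22 + f2 * (1 - p22)))

def metric_quadratic(Q, alpha=0.0):
    """h_{pi,alpha} of (2.40); alpha = 0 gives the unconstrained h_pi = |[pi]|."""
    return spectral_map(Q, lambda lam: max(abs(lam), alpha))

if __name__ == "__main__":
    print(metric_quadratic(((1.0, 0.0), (0.0, 0.0))))        # pi = x^2, degenerate
    print(metric_quadratic(((1.0, 0.0), (0.0, 0.0)), 0.25))  # bounded below by alpha
```

With `alpha = 0` and $\pi = x^2$ the result is the degenerate matrix $\operatorname{diag}(1, 0)$ discussed earlier, while any `alpha > 0` returns a positive definite matrix.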
Figure 2.4 illustrates the ellipses $E_{\pi,\alpha}$, $\alpha \in (\alpha_\pi, \infty)$, when $\operatorname{disc}\pi > 0$ (2.4.a) and $\operatorname{disc}\pi < 0$ (2.4.b). Figure 2.5 illustrates the ellipses $E_{\pi,\alpha}$, $\alpha \in (\beta_\pi, \alpha_\pi)$, when $\operatorname{disc}\pi > 0$ (2.5.a) and $\operatorname{disc}\pi < 0$ (2.5.b). Note that when $\alpha \ge \alpha_\pi$, the principal axes of $E_{\pi,\alpha}$ are independent of $\alpha$, since $U_\pi$ is a rotation that only depends on $\pi$, while these axes generally vary when $\beta_\pi \le \alpha \le \alpha_\pi$, since the matrix $\phi_\pi$ is not a rotation.

Remark 2.4.7.
For interpolation by cubic or higher-degree polynomials ($m \ge 4$), an additional difficulty arises that can be summarized as follows: one should be careful not to "overfit" the polynomial $\pi$ with the matrix $h_\pi$. An approach based on exactly solving the optimization problem (2.34) might indeed lead to a metric $h(z)$ with unjustified strong variations with respect to $z$ and/or bad conditioning, and jeopardize the mesh generation process. As an example, consider the one-parameter family of polynomials
$$\pi_t = x^2y^2 + ty^4 \in \mathbb{H}_4, \qquad t \in [-1, 1].$$
It can be checked that when $t > 0$, the supremum $S_+ = \sup_{E\in\mathcal{E},\ E\subset\Lambda_{\pi_t}}|E|$ is finite and independent of $t$, but not attained, and that any sequence $E_n \subset \Lambda_{\pi_t}$ of ellipses such that $\lim_{n\to\infty}|E_n| = S_+$ becomes infinitely elongated in the $x$ direction as $n \to \infty$. For $t < 0$, the supremum $S_- = \sup_{E\in\mathcal{E},\ E\subset\Lambda_{\pi_t}}|E|$ is independent of $t$ and attained for the optimal ellipse of equation
$$|t|^{-1/2}\,\frac{\sqrt2 - 1}{2}\,x^2 + |t|^{1/2}\,y^2 \le 1.$$
This ellipse becomes infinitely elongated in the $y$ direction as $t \to 0$. This example shows the instability of the optimal matrix $h_\pi$ with respect to small perturbations of $\pi$. However, for all values of $t \in [-1, 1]$, these extremely elongated ellipses could be discarded in favor, for example, of the unit disc $D = \{x^2 + y^2 \le 1\}$, which obviously satisfies $D \subset \Lambda_{\pi_t}$ and is a near-optimal choice in the sense that
$$2|D| = S_+ \le S_- = |D|\,\sqrt{2(\sqrt2 + 1)}.$$

2.5 Polynomial equivalents of the shape function in higher degree

In degrees $m \ge 4$, we could not find analytical expressions of $K_{m,p}$ or $K^{\mathcal{E}}_m$ and do not expect them to exist. However, equivalent quantities with analytical expressions are available, under the same general form as in Theorem 2.4.1: the root of a polynomial in the coefficients of the polynomial $\pi \in \mathbb{H}_m$. This result improves on the analysis of [25], where a similar setting is studied.

In the following, we say that a function $R$ is a polynomial on $\mathbb{H}_m$ if there exists a polynomial $P$ of $m + 1$ variables such that for all $(a_0, \cdots, a_m) \in \mathbb{R}^{m+1}$,
$$R\left(\sum_{i=0}^m a_ix^iy^{m-i}\right) := P(a_0, \cdots, a_m),$$
and we define $\deg R := \deg P$.

The object of this section is to prove the following theorem:

Theorem 2.5.1.
For all degree $m \ge 2$, there exists a polynomial $\mathbf{K}_m$ on $\mathbb{H}_m$, and a constant $C_m > 0$, such that for all $\pi \in \mathbb{H}_m$ and all $1 \le p \le \infty$,
$$\frac{1}{C_m}\sqrt[r_m]{\mathbf{K}_m(\pi)} \le K_{m,p}(\pi) \le C_m\sqrt[r_m]{\mathbf{K}_m(\pi)},$$
where $r_m = \deg\mathbf{K}_m$.

Since for fixed $m$ all functions $K_{m,p}$, $1 \le p \le \infty$, are equivalent on $\mathbb{H}_m$, there is no need to keep track of the exponent $p$ in this section, and we use below the notation $K_m = K_{m,\infty}$. In this section, the reader should not confuse the functions $K_m$ and $\mathbf{K}_m$, nor the polynomials $Q_d$ and $\mathbf{Q}_d$ below, whose notations differ only by their typeface.

Theorem 2.5.1 is a generalization of Theorem 2.4.1, and the polynomial $\mathbf{K}_m$ involved should be seen as a generalization of the determinant on $\mathbb{H}_2$ and of the discriminant on $\mathbb{H}_3$. Let us immediately stress that the polynomial $\mathbf{K}_m$ is not unique. In particular, we shall propose two constructions that lead to different $\mathbf{K}_m$ with different degrees $r_m$. Our first construction is simple and intuitive, but leads to a polynomial of degree $r_m$ that grows quickly with $m$. Our second construction uses the tools of invariant theory to provide a polynomial of much smaller degree, which might be more useful in practice.

We first recall that there is a strong connection between the roots of a polynomial in $\mathbb{H}_2$ or $\mathbb{H}_3$ and its determinant or discriminant:
$$\det\Big(\lambda\prod_{1\le i\le2}(x - r_iy)\Big) = -\frac14\lambda^2(r_1 - r_2)^2, \qquad \operatorname{disc}\Big(\lambda\prod_{1\le i\le3}(x - r_iy)\Big) = \lambda^4(r_1 - r_2)^2(r_2 - r_3)^2(r_3 - r_1)^2.$$
We now fix an integer $m > 3$. Observing that these expressions are a "cyclic" product of the squares of differences of roots, we define
$$S(\lambda, r_1, \cdots, r_m) := \lambda^4(r_1 - r_2)^2\cdots(r_{m-1} - r_m)^2(r_m - r_1)^2.$$
Since $m > 3$, this quantity is no longer invariant under reordering of the $r_i$. For any positive integer $d$, we introduce the symmetrized version of the $d$-th powers of the cyclic product
$$Q_d(\lambda, r_1, \cdots, r_m) := \sum_{\sigma\in\Sigma_m} S(\lambda, r_{\sigma_1}, \cdots, r_{\sigma_m})^d,$$
where $\Sigma_m$ is the set of all permutations of $\{1, \cdots, m\}$.

Proposition 2.5.2.
For all $d > 0$ there exists a homogeneous polynomial $\mathbf{Q}_d$ of degree $4d$ on $\mathbb{H}_m$, with integer coefficients, and such that:
$$\text{if } \pi = \lambda\prod_{i=1}^m(x - r_iy), \text{ then } \mathbf{Q}_d(\pi) = Q_d(\lambda, r_1, \cdots, r_m).$$
In addition, $\mathbf{Q}_d$ obeys the invariance property
$$\mathbf{Q}_d(\pi\circ\phi) = (\det\phi)^{2md}\,\mathbf{Q}_d(\pi). \qquad (2.42)$$

Proof:
We denote by $\sigma_i$ the elementary symmetric functions in the $r_i$, in such a way that
$$\prod_{i=1}^m(x - r_iy) = x^m - \sigma_1x^{m-1}y + \sigma_2x^{m-2}y^2 - \cdots + (-1)^m\sigma_my^m.$$
It is well known that any symmetric polynomial in the $r_i$ can be reformulated as a polynomial in the $\sigma_i$. Hence for any $d$ there exists a polynomial $\tilde Q_d$ such that
$$Q_d(1, r_1, \cdots, r_m) = \tilde Q_d(\sigma_1, \cdots, \sigma_m).$$
In addition, it is known that the total degree of $\tilde Q_d$ is the partial degree of $Q_d$ in the variable $r_1$, in our case $4d$, and that $\tilde Q_d$ has integer coefficients since $Q_d$ has.

Given a polynomial $\pi \in \mathbb{H}_m$ not divisible by $y$, we write it under the two equivalent forms:
$$\pi = a_0x^m + a_1x^{m-1}y + \cdots + a_my^m = \lambda\prod_{i=1}^m(x - r_iy).$$
Clearly $a_0 = \lambda$ and $\sigma_i = (-1)^i\frac{a_i}{a_0}$. It follows that
$$Q_d(\lambda, r_1, \cdots, r_m) = \lambda^{4d}\,\tilde Q_d(\sigma_1, \cdots, \sigma_m) = a_0^{4d}\,\tilde Q_d\left(\frac{-a_1}{a_0}, \cdots, \frac{(-1)^ma_m}{a_0}\right).$$
Since $\deg\tilde Q_d = 4d$, the negative powers of $a_0$ due to the denominators are cleared by the factor $a_0^{4d}$, and the right-hand side is thus a polynomial in the coefficients $a_0, \cdots, a_m$ that we denote by $\mathbf{Q}_d(\pi)$.

We now prove the invariance of $\mathbf{Q}_d$ with respect to linear changes of coordinates; this proof is adapted from [61]. By continuity of $\mathbf{Q}_d$, it suffices to prove this invariance property for pairs $(\pi, \phi)$ such that $\phi$ is an invertible linear change of coordinates and neither $\pi$ nor $\pi\circ\phi^{-1}$ is divisible by $y$.

Under this assumption, we observe that if $\pi = \lambda\prod_{i=1}^m(x - r_iy)$ and $\phi = \begin{pmatrix}\alpha & \beta \\ \gamma & \delta\end{pmatrix}$, then $\pi\circ\phi^{-1} = \tilde\lambda\prod_{i=1}^m(x - \tilde r_iy)$, where
$$\tilde\lambda = \lambda\,(\det\phi)^{-m}\prod_{i=1}^m(\gamma r_i + \delta) \quad\text{and}\quad \tilde r_i = \frac{\alpha r_i + \beta}{\gamma r_i + \delta}.$$
Observing that
$$\tilde r_i - \tilde r_j = \frac{\det\phi}{(\gamma r_i + \delta)(\gamma r_j + \delta)}\,(r_i - r_j),$$
it follows that
$$S(\tilde\lambda, \tilde r_1, \cdots, \tilde r_m) = (\det\phi)^{-2m}\,S(\lambda, r_1, \cdots, r_m).$$
The invariance property (2.42) follows readily. $\diamond$
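The key identity of this proof, $S(\tilde\lambda, \tilde r_1, \cdots, \tilde r_m) = (\det\phi)^{-2m}S(\lambda, r_1, \cdots, r_m)$, and the resulting invariance of $Q_d$ can be checked numerically on an example. The Python sketch below works directly with the factorized form $\pi = \lambda\prod(x - r_iy)$; the function names (`cyclic_product`, `Q`, `pull_back`) are hypothetical, and the brute-force sum over all $m!$ permutations is only practical for small $m$.

```python
from itertools import permutations

def cyclic_product(lam, roots):
    """S(lam, r_1, ..., r_m) = lam^4 (r_1-r_2)^2 ... (r_{m-1}-r_m)^2 (r_m-r_1)^2."""
    m = len(roots)
    s = lam ** 4
    for i in range(m):
        s *= (roots[i] - roots[(i + 1) % m]) ** 2
    return s

def Q(d, lam, roots):
    """Q_d(lam, r_1, ..., r_m): sum of S^d over all orderings of the roots."""
    return sum(cyclic_product(lam, p) ** d for p in permutations(roots))

def pull_back(lam, roots, phi):
    """Leading coefficient and roots of pi o phi^{-1}, phi = ((alpha, beta), (gamma, delta)).

    Assumes gamma * r + delta != 0 for every root r (pi o phi^{-1} not divisible by y).
    """
    (al, be), (ga, de) = phi
    det = al * de - be * ga
    new_lam = lam * det ** (-len(roots))
    for r in roots:
        new_lam *= ga * r + de
    new_roots = [(al * r + be) / (ga * r + de) for r in roots]
    return new_lam, new_roots, det

if __name__ == "__main__":
    lam, roots = 1.5, [0.0, 1.0, -1.0, 2.0]          # a quartic, m = 4
    phi = ((2.0, 1.0), (0.5, 1.0))
    new_lam, new_roots, det = pull_back(lam, roots, phi)
    for d in (1, 2):
        lhs = Q(d, new_lam, new_roots)
        rhs = det ** (-2 * len(roots) * d) * Q(d, lam, roots)
        print(d, lhs, rhs)   # the two values should agree up to rounding
```

For $m = 3$ the cyclic product is already symmetric, so that $Q_1 = 6\operatorname{disc}\pi$; for instance $Q_1 = 648$ for $\pi = x(x^2 - 3y^2)$, whose discriminant is 108.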
We now define $r_m = 2\operatorname{lcm}\{\deg\mathbf{Q}_d\ ;\ 1 \le d \le m!\}$, where $\operatorname{lcm}\{a_1, \cdots, a_k\}$ stands for the lowest common multiple of $\{a_1, \cdots, a_k\}$, and we consider the following polynomial on $\mathbb{H}_m$:
$$\mathbf{K}_m := \sum_{d=1}^{m!}\mathbf{Q}_d^{\frac{r_m}{\deg\mathbf{Q}_d}}.$$
Clearly, $\mathbf{K}_m$ has degree $r_m$ and obeys the invariance property $\mathbf{K}_m(\pi\circ\phi) = (\det\phi)^{\frac{r_mm}{2}}\,\mathbf{K}_m(\pi)$.

Lemma 2.5.3.
Let $\pi \in \mathbb{H}_m$. If $\mathbf{K}_m(\pi) = 0$, then $K_m(\pi) = 0$.

Proof:
We assume that $K_m(\pi) \ne 0$ and intend to prove that $\mathbf{K}_m(\pi) \ne 0$. Without loss of generality, we may assume that $y$ does not divide $\pi$, since $K_m(\pi\circ U) = K_m(\pi)$ and $\mathbf{K}_m(\pi\circ U) = \mathbf{K}_m(\pi)$ for any rotation $U$. We thus write $\pi = \lambda\prod_{i=1}^m(x - r_iy)$, where $r_i \in \mathbb{C}$. Since $K_m(\pi) \ne 0$, we know from Proposition 2.2.1 that there is no group of $s_m := \lfloor\frac m2\rfloor + 1$ equal roots $r_i$.

We now define a permutation $\sigma^* \in \Sigma_m$ such that $r_{\sigma^*(i)} \ne r_{\sigma^*(i+1)}$ for $1 \le i \le m - 1$ and $r_{\sigma^*(m)} \ne r_{\sigma^*(1)}$. In the case where $m = 2m'$ is even and $m'$ of the $r_i$ are equal, any permutation $\sigma^*$ such that $r_{\sigma^*(1)} = r_{\sigma^*(3)} = \cdots = r_{\sigma^*(2m'-1)}$ satisfies this condition. In all other cases, let us assume that the $r_i$ are sorted by equality: if $i < j < k$ and $r_i = r_k$, then $r_i = r_j = r_k$. If $m = 2m'$ is even, we set $\sigma^*(2i - 1) = i$ and $\sigma^*(2i) = m' + i$, $1 \le i \le m'$. If $m = 2m' + 1$ is odd, we set $\sigma^*(2i) = i$, $1 \le i \le m'$, and $\sigma^*(2i - 1) = m' + i$, $1 \le i \le m' + 1$. For example, $\sigma^* = (4\ 1\ 5\ 2\ 6\ 3\ 7)$ when $m = 7$ and $\sigma^* = (1\ 5\ 2\ 6\ 3\ 7\ 4\ 8)$ when $m = 8$. With such a construction, we find that $|\sigma^*(i) - \sigma^*(i + 1)| \ge m'$ if $m$ is odd and $|\sigma^*(i) - \sigma^*(i + 1)| \ge m' - 1$ if $m$ is even, for all $1 \le i \le m$, where we have set $\sigma^*(m + 1) := \sigma^*(1)$. Since the roots are sorted by equality, two equal roots at consecutive positions would produce a group of at least $|\sigma^*(i) - \sigma^*(i+1)| + 1$ equal roots, which is excluded. Hence $\sigma^*$ satisfies the required condition, and therefore $S(\lambda, r_{\sigma^*(1)}, \cdots, r_{\sigma^*(m)}) \ne 0$.

It is well known that if $k$ complex numbers $\alpha_1, \cdots, \alpha_k \in \mathbb{C}$ are such that $\alpha_1^d + \cdots + \alpha_k^d = 0$ for all $1 \le d \le k$, then $\alpha_1 = \cdots = \alpha_k = 0$. Applying this property to the $m!$ complex numbers $S(\lambda, r_{\sigma(1)}, \cdots, r_{\sigma(m)})$, $\sigma \in \Sigma_m$, and noticing that the term corresponding to $\sigma^*$ is nonzero, we see that there exists $1 \le d \le m!$ such that $\mathbf{Q}_d(\pi) = Q_d(\lambda, r_1, \cdots, r_m) \ne 0$. Since $\mathbf{Q}_d$ has real coefficients, the numbers $\mathbf{Q}_d(\pi)$ are real. Since the exponent $r_m/\deg\mathbf{Q}_d$ is even, it follows that $\mathbf{K}_m(\pi) > 0$, which concludes the proof of this lemma. $\diamond$
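The interleaving permutation $\sigma^*$ used in this proof is easy to generate and to check on examples. The Python sketch below builds $\sigma^*$ exactly as defined above (with 1-based values) and computes the smallest cyclic gap $|\sigma^*(i) - \sigma^*(i+1)|$; the function names `sigma_star`, `min_cyclic_gap` and `rearranged` are hypothetical.

```python
def sigma_star(m):
    """Return [sigma*(1), ..., sigma*(m)] for the interleaving permutation of the proof."""
    mp = m // 2                      # m' in the text
    sigma = [0] * m
    if m % 2 == 0:
        for i in range(1, mp + 1):
            sigma[2 * i - 2] = i         # sigma*(2i-1) = i
            sigma[2 * i - 1] = mp + i    # sigma*(2i)   = m' + i
    else:
        for i in range(1, mp + 1):
            sigma[2 * i - 1] = i         # sigma*(2i)   = i
        for i in range(1, mp + 2):
            sigma[2 * i - 2] = mp + i    # sigma*(2i-1) = m' + i
    return sigma

def min_cyclic_gap(sigma):
    """Smallest |sigma(i) - sigma(i+1)|, with sigma(m+1) := sigma(1)."""
    m = len(sigma)
    return min(abs(sigma[i] - sigma[(i + 1) % m]) for i in range(m))

def rearranged(sorted_roots):
    """Roots (sorted by equality) listed in the order prescribed by sigma*."""
    return [sorted_roots[s - 1] for s in sigma_star(len(sorted_roots))]

if __name__ == "__main__":
    print(sigma_star(7), min_cyclic_gap(sigma_star(7)))
    print(sigma_star(8), min_cyclic_gap(sigma_star(8)))
    print(rearranged([1, 1, 1, 2, 2, 3, 3]))
```

In the last example the largest group of equal roots has size 3, which is smaller than $s_7 = 4$, and the rearranged list has no two equal cyclic neighbours, so the corresponding cyclic product $S$ is nonzero.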
The following proposition, when applied to the function $K_{\rm eq} = \sqrt[r_m]{\mathbf{K}_m}$, concludes the proof of Theorem 2.5.1:

Proposition 2.5.4. Let $m \ge 2$, and let $K_{\rm eq} : \mathbb{H}_m \to \mathbb{R}_+$ be a continuous function obeying the following properties:
1. Invariance property: $K_{\rm eq}(\pi\circ\phi) = |\det\phi|^{\frac m2}\,K_{\rm eq}(\pi)$.
2. Vanishing property: for all $\pi \in \mathbb{H}_m$, if $K_{\rm eq}(\pi) = 0$, then $K_m(\pi) = 0$.
Then there exists a constant $C > 0$ such that $\frac1CK_{\rm eq} \le K_m \le CK_{\rm eq}$ on $\mathbb{H}_m$.

Proof:
We first remark that K eq is homogeneous in a similar way as K m : if λ ≥
0, thenapplying the invariance property to φ = λ m Id yields K eq ( π ◦ ( λ m Id)) = K eq ( λπ ) and | det φ | m = λ . Hence K eq ( λπ ) = λK eq ( π ).Our next remark is that a converse of the vanishing property holds : if K m ( π ) = 0,then there exists a sequence φ n of linear changes of coordinates, det φ n = 1, such that π ◦ φ n → n → ∞ . Hence K eq ( π ) = K eq ( π ◦ φ n ) → K eq (0). Furthermore, K eq (0) = 0by homogeneity. Hence K eq ( π ) = 0.We define the set NF m := { π ∈ IH m ; K m ( π ) = 0 } . We also define a set A m ⊂ IH m by a property “opposite” to the property defining NF m . A polynomial π ∈ IH m belongs to A m if and only if (cid:107) π (cid:107) ≤ (cid:107) π ◦ φ (cid:107) for all φ such that det φ = 1 . .5. Polynomial equivalents of the shape function in higher degree NF m and A m are closed by construction, and clearly NF m ∩ A m = { } . We nowdenote by K m the lower semi continuous envelope of K m , which is defined by K m ( π ) = lim r → inf (cid:107) π (cid:48) − π (cid:107)≤ r K m ( π (cid:48) )the lower semi-continuous envelope of K m . If K m ( π ) = 0, then there exists a convergingsequence π n → π such that K m ( π n ) →
0. According to Proposition 2.2.4, it follows that K m ( π ) = 0 and hence π ∈ NF m . Therefore, the lower semi-continuous function K m andthe continuous function K eq are bounded below by a positive constant on the compact set { π ∈ A m , (cid:107) π (cid:107) = 1 } . Since in addition K eq is continuous and K m is upper semi-continuous,we find that the constant C = sup π ∈ A m , (cid:107) π (cid:107) =1 max (cid:26) K eq ( π ) K m ( π ) , K m ( π ) K eq ( π ) (cid:27) is finite. By homogeneity of K m and K eq , we infer that on A m ,1 C K eq ≤ K m ≤ K m ≤ CK eq . (2.43)Now, for any π ∈ IH m , we consider ˆ π of minimal norm in the closure of the set { π ◦ φ ; det φ = 1 } . By construction, we have ˆ π ∈ A m , and there exists a sequence φ n ,det φ n = 1 such that π ◦ φ n → ˆ π as n → ∞ . If ˆ π = 0, then K m ( π ) = K eq ( π ) = 0.Otherwise, we observe that K m (ˆ π ) ≤ K m ( π ) ≤ K m (ˆ π ) and K eq (ˆ π ) = K eq ( π ) , where we used the fact that K m , K m , and K eq are respectively lower semi-continuous,upper semi-continuous, and continuous on IH m . Combining this with inequality (2.43)concludes the proof. (cid:5) A natural question is to find the polynomial of smallest degree satisfying Theorem2.5.1. This leads us to the theory of invariant polynomials introduced by Hilbert [61](we also refer to [52] for a survey on this subject). A polynomial R on IH m is said to beinvariant if µ = m deg R is a positive integer and for all π ∈ IH m and any linear change ofcoordinates φ , one has R ( π ◦ φ ) = (det φ ) µ R ( π ) . (2.44)We have seen for instance that K m and Q d are “invariant polynomials” on IH m .Nearly all the literature on invariant polynomials is concerned with the case of complexcoefficients, both for the polynomials and the changes of variables. It is known in particular[52] that for all m ≥
3, there exist $m-2$ invariant polynomials $R_1,\cdots,R_{m-2}$ on $\mathbb{H}_m$ such that for any $\pi$ (complex coefficients are allowed) and any other invariant polynomial $R$ on $\mathbb{H}_m$:
$$\text{If } R_1(\pi) = \cdots = R_{m-2}(\pi) = 0, \text{ then } R(\pi) = 0. \qquad (2.45)$$
A list of such polynomials with minimal degree is known explicitly at least for small values of $m$. Setting $r = 2\,\mathrm{lcm}(\deg R_i)$ and
$$K_{eq} := \sqrt[r]{\sum_{i=1}^{m-2} R_i^{\,r/\deg R_i}},$$
we see that $K_{eq}(\pi) = 0$ implies $\mathbf{K}_m(\pi) = 0$, and hence $K_m(\pi) = 0$. According to Proposition 2.5.4, we have constructed a new, possibly simpler, equivalent of $K_m$.

For example, when $m = 2$ the list $(R_i)$ is reduced to the polynomial $\det$, and for $m = 3$ to the polynomial $\mathrm{disc}$. For $m = 4$, given $\pi = ax^4 + 4bx^3y + 6cx^2y^2 + 4dxy^3 + ey^4$, the list consists of the two polynomials
$$I = ae - 4bd + 3c^2, \qquad J = \begin{vmatrix} a & b & c\\ b & c & d\\ c & d & e \end{vmatrix};$$
therefore $K_4(\pi)$ is equivalent to the quantity $\sqrt[6]{|I(\pi)|^3 + J(\pi)^2}$. As $m$ increases, these polynomials unfortunately become more and more complicated, and their number $m-2$ increases: for $m = 5$ the list consists of three polynomials of degrees 4, 8,
12, while for $m = 6$ it consists of 4 polynomials of degrees 2, 4, 6, 10.

2.6 Extension to higher dimension

The function $K_{m,p}$ can be generalized to higher dimension $d > 2$. We denote by $\mathbb{H}_{m,d}$ the set of homogeneous polynomials of degree $m$ in $d$ variables. For any $d$-dimensional simplex $T$, we define the interpolation operator $I^{m-1}_T$ acting from $C^0(T)$ onto the space $\mathbb{P}_{m-1,d}$ of polynomials of total degree $m-1$ in $d$ variables. This operator is defined by the conditions $I^{m-1}_T v(\gamma) = v(\gamma)$ for all points $\gamma\in T$ with barycentric coordinates in the set $\{0, \frac{1}{m-1}, \frac{2}{m-1}, \cdots, 1\}$. As in the two-dimensional case, we define for $\pi\in\mathbb{H}_{m,d}$
$$K_{m,p,d}(\pi) := \inf_{|T|=1} \|\pi - I^{m-1}_T\pi\|_p,$$
where the infimum is taken over all $d$-dimensional simplices $T$ of volume 1. The variant $K^{\mathcal{E}}_m$ introduced in (2.15) also generalizes to higher dimension, and was introduced by Weiming Cao in [23]. Denoting by $\mathcal{E}_d$ the set of $d$-dimensional ellipsoids, we define
$$K^{\mathcal{E}}_{m,d}(\pi) = \left(\sup_{E\in\mathcal{E}_d,\ E\subset\Lambda_\pi} |E|/|B|\right)^{-\frac{m}{d}} = \inf\left\{(\det M)^{\frac{m}{2d}} \,;\, M\in S_d^+ \text{ and } \forall z\in\mathbb{R}^d,\ \langle Mz, z\rangle \ge |\pi(z)|^{2/m}\right\},$$
where $\Lambda_\pi := \{z\in\mathbb{R}^d \,;\, |\pi(z)|\le 1\}$, $B := \{z\in\mathbb{R}^d \,;\, |z|\le 1\}$ is the unit Euclidean ball, and where $S_d^+$ denotes the set of symmetric positive definite $d\times d$ matrices. Similarly to Proposition 2.2.5, it is not hard to show that the functions $K_{m,p,d}$ and $K^{\mathcal{E}}_{m,d}$ are equivalent: there exist constants $0 < c\le C$ depending only on $m, d$, such that
$$c\,K^{\mathcal{E}}_{m,d} \le K_{m,p,d} \le C\,K^{\mathcal{E}}_{m,d}.$$
Let $(\mathcal{T}_N)_{N\ge N_0}$ be a sequence of simplicial meshes (triangles if $d = 2$, tetrahedra if $d = 3$, ...) of a bounded $d$-dimensional polygonal open set $\Omega$. Generalizing (2.6), we say that $(\mathcal{T}_N)_{N\ge N_0}$ is admissible if $\#(\mathcal{T}_N)\le N$ and if there exists a constant $C_A$ satisfying
$$\sup_{T\in\mathcal{T}_N} \operatorname{diam}(T) \le C_A N^{-1/d}.$$
The lower estimate in Theorem 2.1.2 can be generalized, with straightforward adaptations in the proof.
If $f\in C^m(\Omega)$ and $(\mathcal{T}_N)_{N\ge N_0}$ is an admissible sequence of simplicial meshes, then
$$\liminf_{N\to\infty} N^{\frac{m}{d}}\, e_{m,\mathcal{T}_N}(f)_p \ge \left\| K_{m,p,d}\left(\frac{d^m f}{m!}\right)\right\|_{L^\tau(\Omega)}, \qquad (2.46)$$
where $\frac{1}{\tau} := \frac{m}{d} + \frac{1}{p}$.

The upper estimate in Theorem 2.1.2 however does not generalize. The reason is that we used in its proof a tiling of the plane consisting of translates of a single triangle and of its symmetric with respect to the origin. This construction is no longer possible in higher dimension: for example, it is well known that one cannot tile the space $\mathbb{R}^3$ with equilateral tetrahedra.

The generalization of the second part of Theorem 2.1.2 is therefore the following. For all $m$ and $d$, there exists a constant $C = C(m,d) > 0$ such that for any bounded polygonal open set $\Omega\subset\mathbb{R}^d$ and $f\in C^m(\Omega)$ the following holds: for all $\varepsilon >$
0, there exists an admissible sequence $(\mathcal{T}_N)_{N\ge N_0}$ of simplicial meshes of $\Omega$, possibly non conforming, such that
$$\limsup_{N\to\infty} N^{\frac{m}{d}}\, e_{m,\mathcal{T}_N}(f)_p \le C \left\| K_{m,p,d}\left(\frac{d^m f}{m!}\right)\right\|_{L^\tau(\Omega)} + \varepsilon.$$
The "tightness" of Theorem 2.1.2 is partially lost due to the constant $C$. This upper bound is not new and can be found in [23]. In the proof of the bidimensional theorem, we define by (2.26) a tiling $\mathcal{P}_R$ of the plane made of a triangle $T_R$, some of its translates, and their symmetrics with respect to the origin. In dimension $d$, the tiling $\mathcal{P}_R$ cannot be constructed by the same procedure. The idea of the proof is to first consider a fixed tiling $\mathcal{P}$ of the space made of simplices of bounded diameter and of volume bounded below by a positive constant, as well as a reference equilateral simplex $T_{eq}$ of volume 1. We then set $\mathcal{P}_R = \phi(\mathcal{P})$, where $\phi$ is a linear change of coordinates such that $T_R = \phi(T_{eq})$. This procedure can be applied in any dimension and yields all subsequent estimates "up to a multiplicative constant," which concludes the proof.

Since this upper bound is not tight anymore, and since the functions $K_{m,p,d}$ are all equivalent to $K^{\mathcal{E}}_{m,d}$ as $p$ varies (with equivalence constants independent of $p$), there is no real need to keep track of the exponent $p$. We therefore denote by $K_{m,d}$ the function $K_{m,\infty,d}$.

For practical as well as theoretical purposes, it is desirable to have an efficient way to compute the shape function $K_{m,d}$, and an efficient algorithm to produce adapted triangulations. The case $m = 2$, which corresponds to piecewise linear elements, has been extensively studied, see for instance [4, 27]. In that case there exist constants $0 < c < C$, depending only on $d$, such that for all $\pi\in\mathbb{H}_{2,d}$,
$$c\,\sqrt[d]{|\det\pi|} \le K_{2,d}(\pi) \le C\,\sqrt[d]{|\det\pi|},$$
where $\det\pi$ denotes the determinant of the symmetric matrix associated to $\pi$. Furthermore, similarly to Proposition 2.4.4, the optimal metric for mesh refinement is given by the absolute value of the matrix of second derivatives, see [4, 27], which is constructed in a similar way as in dimension $d = 2$: with $U$ and $D = \mathrm{diag}(\lambda_1,\cdots,\lambda_d)$ the orthogonal and diagonal matrices such that $[\pi] = U^T D U$, and with $|D| := \mathrm{diag}(|\lambda_1|,\cdots,|\lambda_d|)$, we set $h_\pi = U^T |D| U$. It can be shown that the matrix $h_\pi$ defines an ellipsoid of maximal volume included in the set $\Lambda_\pi$. The case $m = 2$ can therefore be regarded as solved.

For values $(m,d)$ both larger than 2, the question of computing the shape function as well as the optimal metric is much more difficult, but we have partial answers, in particular for quadratic elements in dimension 3. The answer depends on the following compatibility condition between the degree $m$ and the dimension $d$.

Definition 2.6.1.
We call the pair of integers $m\ge 2$ and $d\ge 2$ "compatible" if and only if the following holds: for all $\pi\in\mathbb{H}_{m,d}$ such that there exists a sequence $(\phi_n)_{n\ge 0}$ of $d\times d$ matrices with complex coefficients, satisfying $\det\phi_n = 1$ and $\lim_{n\to\infty}\pi\circ\phi_n = 0$, there also exists a sequence $(\psi_n)_{n\ge 0}$ of $d\times d$ matrices with real coefficients, satisfying $\det\psi_n = 1$ and $\lim_{n\to\infty}\pi\circ\psi_n = 0$.

Following Hilbert [61], we say that a polynomial $Q$ of degree $r$ defined on $\mathbb{H}_{m,d}$ is invariant if $\mu = \frac{mr}{d}$ is a positive integer and if for all $\pi\in\mathbb{H}_{m,d}$ and all linear changes of coordinates $\phi$,
$$Q(\pi\circ\phi) = (\det\phi)^{\mu} Q(\pi). \qquad (2.47)$$
This is a generalization of (2.44). We denote by $\mathbb{I}_{m,d}$ the set of invariant polynomials on $\mathbb{H}_{m,d}$. It is easy to see that if $\pi\in\mathbb{H}_{m,d}$ is such that $K_{m,d}(\pi) = 0$, then $Q(\pi) = 0$ for all $Q\in\mathbb{I}_{m,d}$. Indeed, as seen in the proof of Proposition 2.2.1, if $K_{m,d}(\pi) = 0$, then there exists a sequence $\phi_n$ such that $\det\phi_n = 1$ and $\pi\circ\phi_n\to 0$. Therefore, (2.47) implies that $Q(\pi) = Q(\pi\circ\phi_n)\to Q(0) = 0$, hence $Q(\pi) = 0$. The following lemma shows that the compatibility condition for the pair $(m,d)$ is equivalent to a converse of this property.

Lemma 2.6.2. The pair $(m,d)$ is compatible if and only if for all $\pi\in\mathbb{H}_{m,d}$,
$$K_{m,d}(\pi) = 0 \quad\text{if and only if}\quad Q(\pi) = 0 \text{ for all } Q\in\mathbb{I}_{m,d}.$$

Proof:
We first assume that the pair $(m,d)$ is not compatible. Then there exists a polynomial $\pi\in\mathbb{H}_{m,d}$ such that there exists a sequence $\phi_n$, $\det\phi_n = 1$, of matrices with complex coefficients such that $\pi\circ\phi_n\to 0$, but there exists no such sequence with real coefficients. This last property indicates that $K_{m,d}(\pi) > 0$. On the other hand, let $Q\in\mathbb{I}_{m,d}$ be an invariant polynomial, and set $\mu = \frac{m\deg Q}{d}$. The identity
$$Q(\pi\circ\phi) = (\det\phi)^{\mu} Q(\pi)$$
is valid for all $\phi$ with real coefficients and is a polynomial identity in the coefficients of $\phi$. Therefore, it remains valid if $\phi$ has complex coefficients. It follows that $Q(\pi) = Q(\pi\circ\phi_n)$ for all $n$, and therefore $Q(\pi) = 0$, which concludes the proof in the case where the pair $(m,d)$ is not compatible.

We now assume that the pair $(m,d)$ is compatible. Following Hilbert [61], we say that a polynomial $\pi\in\mathbb{H}_{m,d}$ is a null form if and only if there exists a sequence of matrices $\phi_n$ with complex coefficients such that $\det\phi_n = 1$ and $\pi\circ\phi_n\to 0$. We denote by $NF_{m,d}$ the set of such polynomials. Since the pair $(m,d)$ is compatible, $\pi\in NF_{m,d}$ if and only if there exists a sequence $\phi_n$ of matrices with real coefficients such that $\det\phi_n = 1$ and $\pi\circ\phi_n\to 0$. Hence, we find that
$$NF_{m,d} = \{\pi\in\mathbb{H}_{m,d} \,;\, K_{m,d}(\pi) = 0\}.$$
Denoting by $\mathbb{I}^{\mathbb{C}}_{m,d}$ the set of invariant polynomials on $\mathbb{H}_{m,d}$ with complex coefficients, a difficult theorem of [61] states that
$$NF_{m,d} = \{\pi\in\mathbb{H}_{m,d} \,;\, Q(\pi) = 0 \text{ for all } Q\in\mathbb{I}^{\mathbb{C}}_{m,d}\}.$$
It is not difficult to check that if $Q = Q_1 + iQ_2$ where $Q_1$ and $Q_2$ have real coefficients, then (2.47) holds for $Q$ if and only if it holds for both $Q_1$ and $Q_2$, i.e., $Q_1$ and $Q_2$ are also invariant polynomials. Hence, denoting by $\mathbb{I}_{m,d}$ the set of invariant polynomials on $\mathbb{H}_{m,d}$ with real coefficients, we have obtained that
$$NF_{m,d} = \{\pi\in\mathbb{H}_{m,d} \,;\, Q(\pi) = 0 \text{ for all } Q\in\mathbb{I}_{m,d}\},$$
which concludes the proof. $\diamond$

Theorem 2.6.3.
If the pair $(m,d)$ is compatible, then there exists a polynomial $\mathbf{K}$ on $\mathbb{H}_{m,d}$ (we set $r = \deg\mathbf{K}$) and a constant $C > 0$ such that for all $\pi\in\mathbb{H}_{m,d}$,
$$C^{-1}\sqrt[r]{\mathbf{K}(\pi)} \le K_{m,d}(\pi) \le C\sqrt[r]{\mathbf{K}(\pi)}. \qquad (2.48)$$
Furthermore, any polynomial $\mathbf{K}$ satisfying the above equivalence needs to be an invariant polynomial on $\mathbb{H}_{m,d}$. If the pair $(m,d)$ is not compatible, then there does not exist such a polynomial $\mathbf{K}$.

Proof:
The proof of the invariance of any polynomial $\mathbf{K}$ satisfying (2.48), and of the nonexistence property when the pair $(m,d)$ is not compatible, is reported in the appendix.

Assume that the pair $(m,d)$ is compatible. We follow a reasoning very similar to the one used in dimension two. According to Lemma 2.6.2 and its proof,
$$NF_{m,d} = \{\pi\in\mathbb{H}_{m,d} \,;\, K_{m,d}(\pi) = 0\} = \{\pi\in\mathbb{H}_{m,d} \,;\, Q(\pi) = 0,\ Q\in\mathbb{I}_{m,d}\}.$$
The ring of polynomials over a field is known to be Noetherian. This implies that there exists a finite family $Q_1,\cdots,Q_s\in\mathbb{I}_{m,d}$ of invariant polynomials on $\mathbb{H}_{m,d}$ such that any invariant polynomial is of the form $\sum P_i Q_i$ where the $P_i$ are polynomials on $\mathbb{H}_{m,d}$. We therefore obtain
$$NF_{m,d} = \{\pi\in\mathbb{H}_{m,d} \,;\, Q_1(\pi) = \cdots = Q_s(\pi) = 0\},$$
which is a generalization of (2.45), however with no clear bound on $s$.

We now fix such a set of polynomials, set $r := 2\,\mathrm{lcm}_{1\le i\le s}\deg Q_i$, and define
$$\mathbf{K} = \sum_{i=1}^{s} Q_i^{\,r/\deg Q_i} \quad\text{and}\quad K_{eq} := \sqrt[r]{\mathbf{K}}.$$
Clearly, $\mathbf{K}$ is an invariant polynomial on $\mathbb{H}_{m,d}$, and $NF_{m,d} = \{\pi\in\mathbb{H}_{m,d} \,;\, \mathbf{K}(\pi) = 0\}$. Hence the function $K_{eq}$ is continuous on $\mathbb{H}_{m,d}$, obeys the invariance property $K_{eq}(\pi\circ\phi) = |\det\phi|^{m/d} K_{eq}(\pi)$, and for all $\pi\in\mathbb{H}_{m,d}$, $K_{eq}(\pi) = 0$ implies $\mathbf{K}(\pi) = 0$ and therefore $K_{m,d}(\pi) = 0$. We recognize here the hypotheses of Proposition 2.5.4, except that the dimension $d$ has changed. Inspection of the proof of Proposition 2.5.4 shows that we use only once the fact that $d = 2$, when we refer to Proposition 2.2.4 and state that if $(\pi_n)\in\mathbb{H}_m$, $\pi_n\to\pi$ and $K_m(\pi_n)\to 0$, then $K_m(\pi) = 0$. This property also applies to $K_{m,d}$ when the pair $(m,d)$ is compatible. Assume that $(\pi_n)\in\mathbb{H}_{m,d}$, $\pi_n\to\pi$ and that $K_{m,d}(\pi_n)\to 0$. Then there exists a sequence of linear changes of coordinates $\phi_n$, $\det\phi_n = 1$, such that $\pi_n\circ\phi_n\to 0$. Therefore
$$\mathbf{K}(\pi) = \lim_{n\to\infty}\mathbf{K}(\pi_n) = \lim_{n\to\infty}\mathbf{K}(\pi_n\circ\phi_n) = 0.$$
It follows that $\pi\in NF_{m,d}$, and therefore $K_{m,d}(\pi) = 0$. Since the rest of the proof of Proposition 2.5.4 never uses that $d = 2$, this concludes the proof of equivalence (2.48). $\diamond$

Hence there exists a "simple" equivalent of $K_{m,d}$ for all compatible pairs $(m,d)$, while equivalents of $K_{m,d}$ for incompatible pairs need to be more sophisticated, or at least different from the root of a polynomial. This theorem leaves open several questions. The first one is to identify the list of compatible pairs $(m,d)$. It is easily shown that the pairs $(m,2)$, $m\ge 2$, and $(2,d)$, $d\ge 2$, are compatible. The pair $(3,3)$ is also compatible, which corresponds to approximation by quadratic elements in dimension 3. There exist two generators $S$ and $T$ of $\mathbb{I}_{3,3}$, whose expressions are given in [83] and which have respectively degree 4 and 6.

Corollary 2.6.4. $\sqrt[12]{|S|^3 + T^2}$ is equivalent to $K_{3,3}$ on $\mathbb{H}_{3,3}$.

Proof:
The invariants $S$ and $T$ obey the invariance properties $S(\pi\circ\phi) = (\det\phi)^4 S(\pi)$ and $T(\pi\circ\phi) = (\det\phi)^6 T(\pi)$. We intend to show that if $\pi\in\mathbb{H}_{3,3}$ and $S(\pi) = T(\pi) = 0$, then $K_{3,3}(\pi) = 0$. Let us first admit this property and see how to conclude the proof of this corollary. According to Lemma 2.6.2 the pair $(3,3)$ is compatible. The function $K_{eq} := \sqrt[12]{|S|^3 + T^2}$ is continuous on $\mathbb{H}_{3,3}$, obeys the invariance property $K_{eq}(\pi\circ\phi) = |\det\phi|\,K_{eq}(\pi)$, and is such that $K_{eq}(\pi) = 0$ implies $K_{3,3}(\pi) = 0$. We have seen in the proof of Theorem 2.6.3 that these properties imply the desired equivalence of $K_{eq}$ and $K_{3,3}$.

We now show that $S(\pi) = T(\pi) = 0$ implies $K_{3,3}(\pi) = 0$. A polynomial $\pi\in\mathbb{H}_{3,3}$ can be of two types. Either it is reducible, meaning that there exist $\pi_1\in\mathbb{H}_{1,3}$ (linear) and $\pi_2\in\mathbb{H}_{2,3}$ (quadratic) such that $\pi = \pi_1\pi_2$, or it is irreducible. In the latter case, according to [60], there exists a linear change of coordinates $\phi$ and two reals $a, b$ such that
$$\pi\circ\phi = y^2 z - (x^3 + 3axz^2 + bz^3).$$
A direct computation from the expressions given in [83] shows that $S(\pi\circ\phi)$ and $T(\pi\circ\phi)$ are nonzero constant multiples of $a$ and $b$ respectively. If $S(\pi) = T(\pi) = 0$, then $S(\pi\circ\phi) = T(\pi\circ\phi) = 0$, hence $a = b = 0$ and $\pi\circ\phi = y^2 z - x^3$. Therefore for all $\lambda\ne 0$,
$$\pi\circ\phi(\lambda x, \lambda^2 y, \lambda^{-3} z) = \lambda y^2 z - \lambda^3 x^3,$$
which tends to 0 as $\lambda\to 0$. We have thus exhibited a sequence $\phi_n$, $\det\phi_n = 1$, such that $\pi\circ\phi_n\to 0$, and therefore $K_{3,3}(\pi) = 0$.

If $\pi$ is reducible, then $\pi = \pi_1\pi_2$ where $\pi_1$ is linear and $\pi_2$ is quadratic. Choosing a linear change of coordinates $\phi$ such that $\pi_1\circ\phi = z$, we obtain
$$\pi\circ\phi = 3z(ax^2 + 2bxy + cy^2) + z^2(ux + vy + wz)$$
for some constants $a, b, c, u, v, w$. Again, a direct computation from the expressions given in [83] shows that $S(\pi\circ\phi) = -(ac - b^2)^2$ (and $T(\pi\circ\phi) = 8(ac - b^2)^3$). Therefore, if $S(\pi) = T(\pi) = 0$ then the quadratic function $ax^2 + 2bxy + cy^2$ of the pair of variables $(x,y)$ is degenerate. Hence there exists a linear change of coordinates $\psi$, altering only the variables $x, y$, and reals $\mu, u', v'$ such that
$$\pi\circ\phi\circ\psi = \mu z x^2 + z^2(u'x + v'y + wz).$$
It follows that $\pi\circ\phi\circ\psi(x, \lambda^{-1}y, \lambda z)$ tends to 0 as $\lambda\to 0$. Again, this implies that $K_{3,3}(\pi) = 0$, and concludes the proof of this corollary. $\diamond$

We could not find any example of an incompatible pair $(m,d)$, which leads us to formulate the conjecture that all pairs $(m,d)$ are compatible (hence providing "simple" equivalents of $K_{m,d}$ in full generality). Another even more difficult problem is to derive a polynomial $\mathbf{K}$ of minimal degree for all pairs $(m,d)$ which are compatible and of interest.

Last but not least, efficient algorithms are needed to compute metrics, from which effective triangulations are built that yield the optimal estimates. A possibility is to follow the approach proposed in [23], i.e., solve numerically the optimization problem
$$\inf\left\{\det M \,;\, M\in S_d^+ \text{ and } \forall z\in\mathbb{R}^d,\ \langle Mz, z\rangle \ge |\pi(z)|^{2/m}\right\},$$
which amounts to minimizing a degree $d$ polynomial under an infinite set of linear constraints. When $d > 2$, this minimization problem is not quadratic, which makes it rather delicate. Furthermore, numerical instabilities similar to those described in Remark 2.4.7 can be expected to appear.
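For the case $m = 2$, which the text regards as solved, the metric $h_\pi = U^T |D| U$ can be checked numerically against the constraint of the optimization problem above (with $2/m = 1$). The following is only a minimal sketch under stated assumptions, not the implementation used in this thesis: the helper name `abs_metric` and the sample matrix `H` are hypothetical, and the diagonalization relies on NumPy's `numpy.linalg.eigh`.

```python
import numpy as np

def abs_metric(H):
    """Return |H| = V diag(|lam|) V^T for a symmetric matrix H (hypothetical helper)."""
    H = np.asarray(H, dtype=float)
    lam, V = np.linalg.eigh(H)          # H = V diag(lam) V^T with V orthogonal
    return (V * np.abs(lam)) @ V.T      # scale the columns of V by |lam|, then recombine

# Quadratic form pi(z) = <H z, z> with an indefinite matrix H (d = 3), chosen arbitrarily.
H = np.array([[1.0, 2.0, 0.0],
              [2.0, -3.0, 1.0],
              [0.0, 1.0, 0.5]])
M = abs_metric(H)

# Sampled check of the constraint <M z, z> >= |pi(z)| on random points z.
rng = np.random.default_rng(0)
Z = rng.standard_normal((1000, 3))
lhs = np.einsum("ni,ij,nj->n", Z, M, Z)
rhs = np.abs(np.einsum("ni,ij,nj->n", Z, H, Z))
feasible = bool(np.all(lhs >= rhs - 1e-9))

# det |H| = |det H|, in line with K_{2,d} being comparable to |det pi|^(1/d).
det_matches = bool(np.isclose(np.linalg.det(M), abs(np.linalg.det(H))))
```

The sampled test only illustrates feasibility; it does not prove that the ellipsoid has maximal volume, and for $m > 2$ no such closed form is claimed here.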
In this chapter, we have introduced asymptotic estimates for the finite element in-terpolation error measured in the L p norm when the mesh is optimally adapted to theinterpolated function. These estimates are asymptotically sharp for functions of two va-riables, see Theorem 2.1.2, and precise up to a fixed multiplicative constant in higherdimension, as described in § K m,p (or K m,d,p if d > Chapter 2. Sharp asymptotics of the L p interpolation error shows, and has equivalents of a simple form in a number of other cases, see Theorems2.5.1 and 2.6.3.All our results are stated and proved for sufficiently smooth functions. One of ourfuture objectives is to extend these results to larger classes of functions, and in particularto functions exhibiting discontinuities along curves. This means that we need to givea proper meaning to the nonlinear quantity K m,p (cid:0) d m fm ! (cid:1) for nonsmooth functions. Thisquestion is addressed in Chapter 4.This chapter also features a constructive algorithm (similar to [4]), that produces trian-gulations obeying our sharp estimates, and is described in § § m and dimension d , although, as we pointed out in Proposition 2.4.7,this might be a rather delicate matter.We finally remark that in many applications, one seeks for error estimates in the Sobo-lev norms W ,p (or W m,p ) rather than in the L p norms. Finding the optimal triangulationfor such norms requires a new error analysis, see Chapter 3. For instance, in the survey [85]on piecewise linear approximation, it is observed that the metric h π = | d f | (evoked inEquation (2.40)) should be replaced with h π = ( d f ) for best adaptation in H norm. Inother words, the principal axes of the positive definite matrix h π remain the same, but itsconditioning is squared. We consider a fixed polynomial π ∈ IH , set a parameter α >
0, and look for an ellipse E π,α of maximal volume included in the set α − / D ∩ Λ π . Since this set is compact, astandard argument shows that there exists at least one such ellipse.If α ≥ µ π , then α − / D ⊂ Λ π , and therefore α − / D ∩ Λ π = α − / D . It follows that E π,α = α − / D , which proves part 1.In the following, we denote by E (cid:48) α the ellipse defined by the matrix (2.41). Note thatany ellipse containing D π and included in Λ π must be tangent to ∂ Λ π at the point z π ,and hence of the form E (cid:48) δ for some δ >
0. Clearly, E (cid:48) δ ⊂ E (cid:48) µ if and only if δ ≥ µ .Therefore, E (cid:48) α ⊂ Λ π if and only if α ≥ α π . Let E be an arbitrary ellipse, D the largestdisc contained in E , and D the smallest disc containing E . Then it is not hard to checkthat | E | = (cid:112) | D || D | . For any α satisfying α π ≤ α ≤ µ π , the ellipse E (cid:48) α is such that D = D π , which is the largest centered disc contained in Λ π , and D = α − / D , which .8. Appendix α − / on the diameter of E π,α . It follows that E (cid:48) α is an ellipseof maximal volume included in α − / D ∩ Λ π , and this concludes the proof of part 2.Part 4 is trivial ; hence we concentrate on part 3 and assume that β π ≤ α ≤ α π .An elementary observation is that E π,α must be “blocked with respect to rotations.”Indeed, assume for contradiction that R θ ( E π,α ) ⊂ Λ π for θ ∈ [0 , ε ] or [ − ε, R θ the rotation of angle θ . Observing that the set ∪ θ ∈ [0 ,ε ] R θ ( E π,α ) contains anellipse of larger area than E π,α and of the same diameter, we obtain a contradiction.In the following, we say that an ellipse E is quadri-tangent to Λ π when there are atleast four points of tangency between ∂E and ∂ Λ π (a tangency point being counted twiceif the radii of curvature of ∂E and ∂ Λ π coincide at this point).The fact that E π,α is “blocked with respect to rotations” implies that it is either quadri-tangent to Λ π or tangent to ∂ Λ π at the extremities of its small axis. In the latter case,the extremities of the small axis must clearly be the points z π and − z π , the closest pointsof ∂ Λ π to the origin. It follows that E π,α belongs to the family E (cid:48) δ , δ ≥ α π describedabove, and therefore is equal to E (cid:48) α π since α ≤ α π . But E (cid:48) α π is quadri-tangent to Λ π , sinceotherwise we would have E (cid:48) α π − ε ⊂ Λ π for some ε > E π,α is quadri-tangent to Λ π when β π ≤ α ≤ α π . 
Thisproperty is invariant by under linear change of coordinates : if an ellipse E is quadri-tangent to Λ π ◦ φ , then φ ( E ) is quadri-tangent to Λ π . Furthermore, if E is defined by asymmetric positive definite matrix H , then φ ( E ) is defined by ( φ − ) T Hφ − . This remarkleads us to the problem of identifying the family of ellipses quadri-tangent to ∂ Λ π when π is among the four reference polynomials x ( x − y ) , x ( x + 3 y ) , x y , and x . In thecase of x there is no quadri-tangent ellipse and we have α π = 0 ; therefore part 3 of thetheorem is irrelevant. In the three other cases, which respectively correspond to part 3(i), (ii), and (iii), the quadri-tangent ellipses are easily identified using the symmetries ofthese polynomials and the system of equations (2.37).The ellipses quadri-tangent to x ( x + 3 y ) are defined by matrices of the form H λ =diag( λ, λ λ ), where 0 < λ ≤
2. Note that det H λ is decreasing on (0 , ] and increasingon [2 , π with disc π <
0, the optimization problem (2.39) therefore becomesmin λ { det H λ ; ( φ − π ) T H λ φ − π ≥ α Id } . If the constraint is met for λ = 2 / , we obtain E π,α = E π and therefore α ≤ β π . Otherwise,using the monotonicity of λ (cid:55)→ det H λ on each side of its minimum 2 we see that thematrix H λ − αφ T π φ π must be singular. Taking the determinant, we obtain an equation ofdegree 4 from which λ can be computed, and this concludes the proof of part 3 (i).The ellipses quadri-tangent to x ( x − y ) are defined by H λ,V = V T diag( λ, − λ λ ) V ,where 0 < λ ≤ V is a rotation by 0, 60, or 120 degrees. Since det H λ,V is a decreasingfunction of λ on (0 , π suchthat disc π >
0. This concludes the proof of part 3 (ii).Finally, the ellipses quadri-tangent to xy are defined by H λ = diag( λ, λ ), λ > λ (cid:55)→ det H λ is a decreasing function of λ with lower bound 0 as λ → ∞ ,and the same reasoning applies again, hence concluding the proof of part 3 (iii).12 Chapter 2. Sharp asymptotics of the L p interpolation error Let K be a polynomial on IH m,d which satisfies the inequalities (2.48). We observe that K takes nonnegative values on IH m,d , and we assume in a first time that µ := mrd is aninteger. We derive from (2.48) and from the invariance of the shape function K m,d withrespect to changes of variables the inequalities C − r (det φ ) µ K ( π ) ≤ K ( π ◦ φ ) ≤ C r (det φ ) µ K ( π ) , (2.49)for all π ∈ IH m,d and all φ ∈ M d (the space of d × d real matrices), where C is the constantappearing in inequalities (2.48). We regard the function Q ( π, φ ) = K ( π ◦ φ ) as a polynomialon the vector space V = IH m,d × M d and we observe that it vanishes on the hypersurface V det := { ( π, φ ) ∈ V ; det φ = 0 } . Since φ (cid:55)→ det( φ ) is an irreducible polynomial, as shownin [14], it follows that Q ( π, φ ) = (det φ ) Q ( π, φ ) for some polynomial Q on V . Injectingthis expression in (2.49) we obtain that Q ( π, φ ) also vanishes on the hypersurface V det if µ >
1, and the argument can be repeated. By induction we eventually obtain a polynomial $\hat{\mathbf{K}}$ on $V$ such that $\mathbf{K}(\pi\circ\phi) = (\det\phi)^{\mu}\,\hat{\mathbf{K}}(\pi,\phi)$. It follows from inequality (2.49) that for all $(\pi,\phi)\in V$,
$$C^{-2r}\,\mathbf{K}(\pi) \le \hat{\mathbf{K}}(\pi,\phi) \le C^{2r}\,\mathbf{K}(\pi).$$
This implies that $\hat{\mathbf{K}}(\pi,\phi)$ does not depend on $\phi$. Otherwise, since it is a polynomial, we could find $\pi_0\in\mathbb{H}_{m,d}$ and a sequence $\phi_n\in M_d$ such that $|\hat{\mathbf{K}}(\pi_0,\phi_n)|\to\infty$. Therefore,
$$\mathbf{K}(\pi\circ\phi) = (\det\phi)^{\mu}\,\hat{\mathbf{K}}(\pi,\phi) = (\det\phi)^{\mu}\,\hat{\mathbf{K}}(\pi,\mathrm{Id}) = (\det\phi)^{\mu}\,\mathbf{K}(\pi).$$
This establishes the invariance property of $\mathbf{K}$ under the hypothesis that $\mu$ is an integer.

Let us assume for contradiction that $\mu$ is not an integer. Then applying the previous reasoning to $\mathbf{K}^d$ we obtain that $\mathbf{K}(\pi\circ\phi)/\mathbf{K}(\pi) = \alpha(\pi,\phi)\,|\det\phi|^{\mu}$ for all $\pi\in\mathbb{H}_{m,d}$ and $\phi\in M_d$, where $\alpha : \mathbb{H}_{m,d}\times M_d\to\{-1,1\}$. Since $\phi\mapsto\det\phi$ is an irreducible polynomial we obtain that $\mu$ is an integer and as before that $\mathbf{K}(\pi\circ\phi) = (\det\phi)^{\mu}\,\mathbf{K}(\pi)$, which concludes the proof of the invariance property.

Let $(m,d)$ be an incompatible pair. We know from Lemma 2.6.2 that there exists $\pi_0\in\mathbb{H}_{m,d}$ such that $K_{m,d}(\pi_0) > 0$ and $Q(\pi_0) = 0$ for any invariant polynomial $Q\in\mathbb{I}_{m,d}$. Since any polynomial $\mathbf{K}$ satisfying (2.48) must be invariant, it would vanish at $\pi_0$, which contradicts (2.48). Therefore there exists no polynomial $\mathbf{K}$ satisfying (2.48).

Chapter 3

Sharp asymptotics of the $W^{1,p}$ interpolation error on optimal triangulations

3.1 Introduction

We consider in this section the problem of finding the optimal mesh for the interpolation of a function by finite elements of a given degree, when the error is measured in the Sobolev $W^{1,p}$ norm.
Our purpose is to establish sharp asymptotic estimates, of the same type as the following in the case of the $L^p$ norm,
$$\limsup_{N\to+\infty}\Big(N \min_{\#(\mathcal{T})\le N} \|f - I_{\mathcal{T}} f\|_{L^p}\Big) \le C\,\Big\|\sqrt{|\det(d^2 f)|}\Big\|_{L^\tau}, \qquad \frac{1}{\tau} = \frac{1}{p} + 1, \qquad (3.1)$$
which holds for a $C^2$ smooth function $f$ defined on a bounded polygonal domain $\Omega$, see [4, 27] and Chapter 2. We refer to the introduction of this thesis and to Chapter 2 for a more detailed overview of existing results and of our motivations. The convergence estimate (3.1) is extended to arbitrary approximation order in Chapter 2, where the quantity governing the convergence rate for finite elements of arbitrary degree $m-1$ depends on the $m$-th order derivative $d^m f$.

The purpose of the present chapter is to investigate this problem when the $L^p$ norm is replaced by the $W^{1,p}$ semi-norm, which plays a critical role in PDE analysis and which is defined as follows:
$$|f|_{W^{1,p}(\Omega)} := \|\nabla f\|_{L^p(\Omega)} = \left(\int_\Omega |\nabla f|^p\right)^{1/p}.$$
Our second objective is to propose simple and practical ways of designing meshes which behave similarly to the optimal one, in the sense that they satisfy the sharp error estimate up to a fixed multiplicative constant.
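To make the semi-norm concrete, the sketch below evaluates $|\pi - I_T\pi|_{W^{1,2}(T)}$ for a quadratic polynomial $\pi(z) = \langle Qz, z\rangle$ and its linear interpolant at the vertices of a triangle $T$. It is an illustration under stated assumptions, not code from this thesis: the function name `w12_seminorm_p1_error` and the two test triangles are hypothetical choices. Since the gradient of the error is affine on $T$, the edge-midpoint quadrature rule integrates its squared norm exactly. The two triangles have the same base and height, hence the same area, but one of them has an angle close to $\boldsymbol{\pi}$.

```python
import numpy as np

def w12_seminorm_p1_error(Q, tri):
    """|pi - I_T pi|_{W^{1,2}(T)} for pi(z) = z^T Q z (Q symmetric) and vertex interpolation.

    Hypothetical helper: `tri` is a sequence of three 2D vertices.
    """
    Q = np.asarray(Q, dtype=float)
    tri = np.asarray(tri, dtype=float)
    v0, v1, v2 = tri
    vals = np.array([v @ Q @ v for v in tri])            # pi at the three vertices
    E = np.array([v1 - v0, v2 - v0])                     # edge vectors as rows
    g_interp = np.linalg.solve(E, vals[1:] - vals[0])    # constant gradient of I_T pi
    area = 0.5 * abs(np.linalg.det(E))
    mids = [(v0 + v1) / 2, (v1 + v2) / 2, (v2 + v0) / 2]
    # grad pi(z) = 2 Q z; the midpoint rule is exact for quadratic integrands.
    sq = sum(np.sum((2.0 * Q @ mpt - g_interp) ** 2) for mpt in mids)
    return float(np.sqrt(area / 3.0 * sq))

Q = np.array([[1.0, 0.0], [0.0, 0.0]])               # pi(x, y) = x^2
eps = 0.1
obtuse = [(-1.0, 0.0), (1.0, 0.0), (0.0, eps)]       # apex angle close to pi
right = [(-1.0, 0.0), (1.0, 0.0), (-1.0, eps)]       # largest angle equal to pi/2
err_obtuse = w12_seminorm_p1_error(Q, obtuse)
err_right = w12_seminorm_p1_error(Q, right)
```

With these choices the gradient error on the obtuse triangle is several times larger than on the right triangle, although both triangles have the same area and the same longest edge.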
We denote by $\mathbb{P}_m$ the space of polynomials of total degree less than or equal to $m$ and by $\mathbb{H}_m$ the space of homogeneous polynomials of total degree $m$,
$$\mathbb{P}_m := \mathrm{Span}\{x^k y^l \,;\, k + l\le m\} \quad\text{and}\quad \mathbb{H}_m := \mathrm{Span}\{x^k y^l \,;\, k + l = m\}.$$
For any triangle $T$, we denote by $I^m_T$ the local interpolation operator acting from $C^0(T)$ onto $\mathbb{P}_m$. For any continuous function $\nu\in C^0(T)$, the interpolating polynomial $I^m_T\nu\in\mathbb{P}_m$ is defined by the conditions
$$I^m_T\nu(\gamma) = \nu(\gamma),$$
for all points $\gamma\in T$ with barycentric coordinates in the set $\{0, \frac{1}{m}, \frac{2}{m}, \cdots, 1\}$. This interpolation operator is invariant by translation, hence for any polynomial $\pi\in\mathbb{H}_m$, triangle $T$ and offset $z$ we have
$$|\pi - I^{m-1}_T\pi|_{W^{1,p}(T)} = |\pi - I^{m-1}_{z+T}\pi|_{W^{1,p}(z+T)}. \qquad (3.2)$$
If $\mathcal{T}$ is a triangulation of a domain $\Omega$, then $I^m_{\mathcal{T}}$ refers to the interpolation operator which coincides with $I^m_T$ on each triangle $T\in\mathcal{T}$.

Our results involve a shape function $L_{m,p}$, which is defined by a shape optimization problem: for any fixed $1\le p\le\infty$ and for any $\pi\in\mathbb{H}_m$, we define
$$L_{m,p}(\pi) := \inf_{|T|=1} |\pi - I^{m-1}_T\pi|_{W^{1,p}(T)}. \qquad (3.3)$$
Here, the infimum is taken over all triangles of area $|T| = 1$. From the homogeneity of $\pi$, it is easily checked that
$$\inf_{|T|=A} |\pi - I^{m-1}_T\pi|_{W^{1,p}(T)} = L_{m,p}(\pi)\,A^{\frac{m-1}{2}+\frac{1}{p}}.$$
The solution to this optimization problem thus describes the shape of the triangles of a given area which are best adapted to the polynomial $\pi$ in the sense of minimizing the interpolation error measured in $W^{1,p}$.

The function $L_{m,p}$ is the natural generalisation of the function $K_{m,p}$ introduced in Chapter 2 for the study of optimal anisotropic triangulations in the sense of the $L^p$ interpolation error
$$K_{m,p}(\pi) := \inf_{|T|=1} \|\pi - I^{m-1}_T\pi\|_{L^p(T)}.$$
Our asymptotic error estimate for the optimal triangulation is given by the following theorem.
Theorem 3.1.1. For any bounded polygonal domain $\Omega\subset\mathbb{R}^2$, any function $f\in C^m(\Omega)$ and any $1\le p<\infty$, there exists a sequence $(\mathcal{T}_N)_{N\ge N_0}$, $\#(\mathcal{T}_N)\le N$, of triangulations of $\Omega$ such that
$$\limsup_{N\to\infty} N^{\frac{m-1}{2}}\,|f - I^{m-1}_{\mathcal{T}_N} f|_{W^{1,p}(\Omega)} \le \left\| L_{m,p}\left(\frac{d^m f}{m!}\right)\right\|_{L^\tau(\Omega)}, \quad\text{where } \frac{1}{\tau} := \frac{m-1}{2} + \frac{1}{p}. \qquad (3.4)$$
In the above convergence estimate, the number $N_0$ is independent of $f$ and refers to the minimal cardinality of a conforming triangulation of $\Omega$. The $m$-th derivative $d^m f(z)$ at each point $z$ is identified to a homogeneous polynomial $\pi_z\in\mathbb{H}_m$:
$$\frac{d^m f(z)}{m!} \sim \pi_z = \sum_{k+l=m} \frac{\partial^m f}{\partial x^k\partial y^l}(z)\,\frac{x^k}{k!}\,\frac{y^l}{l!}. \qquad (3.5)$$
An important feature of this estimate is the "lim sup". Recall that the upper limit of a sequence $(u_N)_{N\ge N_0}$ is defined by
$$\limsup_{N\to\infty} u_N := \lim_{N\to\infty}\ \sup_{n\ge N} u_n,$$
and is in general strictly smaller than the supremum $\sup_{N\ge N_0} u_N$. It is still an open question to find an appropriate upper estimate of $\sup_{N\ge N_0} N^{\frac{m-1}{2}}\,|f - I^{m-1}_{\mathcal{T}_N} f|_{W^{1,p}(\Omega)}$ when optimally adapted anisotropic triangulations are used.

In order to illustrate the sharpness of (3.4), we introduce a slight restriction on sequences of triangulations, following an idea in [4]: a sequence $(\mathcal{T}_N)_{N\ge N_0}$ of triangulations is said to be admissible if $\#(\mathcal{T}_N)\le N$ and $\sup_{N\ge N_0}\big(N^{\frac{1}{2}}\sup_{T\in\mathcal{T}_N}\operatorname{diam}(T)\big) < \infty$. In other words, if
$$\sup_{T\in\mathcal{T}_N}\operatorname{diam}(T) \le C_A N^{-\frac{1}{2}} \qquad (3.6)$$
for some constant $C_A > 0$ independent of $N$. The following theorem shows that the estimate (3.4) cannot be improved when we restrict our attention to admissible sequences. It also shows that this class is reasonably large in the sense that (3.4) is ensured to hold up to small perturbation.

Theorem 3.1.2.
Let $\Omega\subset\mathbb{R}^2$ be a bounded polygonal domain, let $f\in C^m(\Omega)$ and $1\le p<\infty$. We define $\frac{1}{\tau} := \frac{m-1}{2} + \frac{1}{p}$. For any admissible sequence $(\mathcal{T}_N)_{N\ge N_0}$ of triangulations of $\Omega$, one has
$$\liminf_{N\to\infty} N^{\frac{m-1}{2}}\,|f - I^{m-1}_{\mathcal{T}_N} f|_{W^{1,p}(\Omega)} \ge \left\| L_{m,p}\left(\frac{d^m f}{m!}\right)\right\|_{L^\tau(\Omega)}. \qquad (3.7)$$
Furthermore, for all $\varepsilon > 0$ there exists an admissible sequence $(\mathcal{T}^\varepsilon_N)_{N\ge N_0}$ of triangulations of $\Omega$ such that
$$\limsup_{N\to\infty} N^{\frac{m-1}{2}}\,|f - I^{m-1}_{\mathcal{T}^\varepsilon_N} f|_{W^{1,p}(\Omega)} \le \left\| L_{m,p}\left(\frac{d^m f}{m!}\right)\right\|_{L^\tau(\Omega)} + \varepsilon. \qquad (3.8)$$
Note that the sequences $(\mathcal{T}^\varepsilon_N)_{N\ge N_0}$ satisfy the admissibility condition (3.6) with a constant $C_A(\varepsilon)$ which may grow to $+\infty$ as $\varepsilon\to$
0. The proofs of these two theorems are given in §3.3. Theorem 3.1.1 can be deduced from the upper estimate (3.8) by considering a sequence of the form $\mathcal{T}^{\varepsilon_N}_N$ with $\varepsilon_N\to 0$ as $N\to+\infty$. The proof of the upper estimate in Theorem 3.1.2 involves the construction of an optimal mesh based on a patching strategy adapted from the one encountered in [4]. However, inspection of the proof reveals that this construction only becomes effective as the number of triangles $N$ becomes very large. Therefore it may not be useful in practical applications.

Remark 3.1.3.
It can easily be shown that if ( T N ) N ≥ N is an admissible sequence oftriangulations and f ∈ C m (Ω) , then (cid:107) f − I m T N f (cid:107) L p (Ω) decays with the rate N − m/ which isfaster than the decay rate obtained for the W ,p error. Therefore, our convergence estimatesare also valid in the W ,p norm (cid:107) f (cid:107) W ,p (Ω) := (cid:16) (cid:107) f (cid:107) pL p (Ω) + | f | pW ,p (Ω) (cid:17) /p . We show in § d m f ,(iii) the largest angle of the triangles should be bounded away from π = 3 . . . . (iv) the triangulation T should be sufficiently refined in order to adapt to the localfeatures of f . .1. Introduction W ,p error estimates (but not for L p error estimates).Roughly speaking, two triangles having the same optimized aspect ratio imposed by (ii)may greatly differ in term of their largest angle, and the most acute triangle should bepreferred when the interpolation error is measured in W ,p rather than L p . The influenceof large angles in mesh adaptation has already been studied in [10,62,85], see also Chapter6. The heuristic guideline is that large angles should be avoided in general, since they leadto oscillations of the gradient of the interpolant. On the contrary, extremely thin trianglesand very small angles can be necessary for optimal mesh adaptation.A practical approach for mesh generation is discussed in § d m f at each point x ∈ Ω. Werestrict in this section to the case of linear and quadratic finite elements, and we providesimple mesh generation procedures and numerical results. To any π ∈ IH m , m ∈ { , } , weassociate a symmetric positive definite matrix M m ( π ) ∈ S +2 (strictly speaking, this matrixis degenerated if π is univariate, a detail that needs to be taken care of in the numericalimplementation). If z ∈ Ω and d m f ( z ) is close to π , then the triangle T containing z should be isotropic in the metric M m ( π ). 
The requirements (i) and (ii) above, which arerespectively linked to the size and shape of the triangles, can then be summarized througha global metric on Ω given by h ( z ) = s ( π z ) M m ( π z ) , π z = d m f ( z ) m ! , (3.9)where s ( π z ) is a scalar factor which depends on the desired accuracy of the finite elementapproximation. Once this metric has been properly identified, fast algorithms such asin [19, 93, 94] can be used in order to design a near-optimal mesh based on it. Recentlyit has been rigorously proved in [15, 66], that several algorithms terminate and producegood quality meshes, under certain conditions. Although we are not aware that the angleconstraint (iii) is guaranteed in such algorithms, it seems to hold in practice. Computingthe map π ∈ IH m (cid:55)→ M m ( π ) ∈ S +2 , is therefore of key use in applications ( S +2 refers to the set of 2 × m = 2) :the matrix M m ( π ) is then defined as the square of the matrix associated to the quadraticform π . We give a simple expression of M m ( π ) for piecewise quadratic finite elements ( m =3). The optimality of this construction is proved theoretically, and numerical experimentsconfirm its adequacy. An open source implementation for the mesh generator FreeFem++[93] is provided at [95].The shape function L m,p does not always have a simple analytic expression from thecoefficients of π . For this reason we introduce in § π (cid:55)→ L m ( π ) whichare defined as the root of a polynomial in the coefficients of π , and are equivalent to L m,p ,leading therefore to similar asymptotic error estimates up to multiplicative constants.We finally discuss in § d > Chapter 3. Sharp asymptotics of the W ,p interpolation error Notations
Throughout this chapter, we define $L_m := L_{m,\infty}$, where $L_{m,p}$ is defined at Equation (3.3). We prove further in Lemma 3.2.7 that for all $1 \le p \le \infty$
$$c\,L_m \le L_{m,p} \le L_m \quad \text{on } \mathbb{H}_m, \tag{3.10}$$
where the constant $c > 0$ depends only on $m$. For any compact set $E \subset \mathbb{R}^d$ of non-zero Lebesgue measure, we denote by $\operatorname{bary}(E)$ its barycenter. For any pair of vectors $u, v \in \mathbb{R}^d$, we denote by $\langle u, v\rangle$ their inner product, and by $|u| := \sqrt{\langle u, u\rangle}$ the Euclidean norm of $u$. When $g \in L^p(E, \mathbb{R}^d)$ is a vector valued function, we denote by $\|g\|_{L^p(E)}$ the $L^p$ norm of $x \mapsto |g(x)|$ on $E$.

We denote by $M_d(\mathbb{R})$ the set of all $d \times d$ real matrices, equipped with the norm $\|A\| := \max_{|u| \le 1} |Au|$. We denote by $GL_d \subset M_d(\mathbb{R})$ the linear group of invertible matrices and by $SL_d \subset GL_d$ the special linear group of matrices of determinant 1:
$$GL_d := \{A \in M_d(\mathbb{R}) \,;\, \det A \ne 0\} \quad\text{and}\quad SL_d := \{A \in M_d(\mathbb{R}) \,;\, \det A = 1\}.$$
For $A \in GL_d$, we denote by
$$\kappa(A) := \|A\|\,\|A^{-1}\| \tag{3.11}$$
its condition number. We denote by $S_d \subset M_d(\mathbb{R})$ the subset of symmetric matrices, by $S_d^{\oplus} \subset S_d$ the subset of non-negative symmetric matrices and by $S_d^{+}$ the subset of positive definite symmetric matrices. For any two symmetric matrices $S, S' \in S_d$, we write $S \le S'$ if and only if $S' - S \in S_d^{\oplus}$.

We equip the spaces $\mathbb{P}_m$ and $\mathbb{H}_m$ with the norm
$$\|\pi\| := \max_{|u| \le 1} |\pi(u)|. \tag{3.12}$$
Note that the Greek letter $\pi$ always refers to a homogeneous polynomial $\pi \in \mathbb{H}_m$, while the large and bold notation $\boldsymbol{\pi}$ refers to the mathematical constant $\boldsymbol{\pi} = 3.14159\ldots$

Recall that if $g$ is a $C^m$ function, we identify $d^m g(x)$ with a polynomial in $\mathbb{H}_m$, by (3.5). We then denote
$$\|d^m g\|_{L^\infty(E)} := \max_{z \in E} \|d^m g(z)\| \tag{3.13}$$
with $\|\cdot\|$ the previously defined norm on $\mathbb{H}_m$.
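The condition number (3.11) and the polynomial norm (3.12) introduced above are easy to evaluate numerically in dimension two. The following Python sketch is an illustration only and is not taken from the thesis: the function names, the ordering of the polynomial coefficients and the sampling resolution are assumptions. It relies on the closed form of the singular values of a $2 \times 2$ matrix, and on the fact that, for a homogeneous polynomial of degree $m \ge 1$, the maximum in (3.12) is attained on the unit circle.

```python
import math

def condition_number(a, b, c, d):
    """kappa(A) = ||A|| * ||A^{-1}|| for A = [[a, b], [c, d]], as in (3.11)."""
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("A must be invertible")
    frob2 = a * a + b * b + c * c + d * d      # s1^2 + s2^2 (singular values)
    disc = math.sqrt(max(frob2 * frob2 - 4.0 * det * det, 0.0))
    s_max2 = 0.5 * (frob2 + disc)              # largest singular value, squared
    return s_max2 / abs(det)                   # s1 / s2 = s1^2 / (s1 * s2)

def poly_norm(coeffs, samples=3600):
    """Approximate ||pi|| = max_{|u| <= 1} |pi(u)|, as in (3.12).

    Assumed (hypothetical) convention:
    pi(x, y) = sum_k coeffs[k] * x**(m - k) * y**k, with m = len(coeffs) - 1 >= 1.
    The maximum is sampled on the unit circle, so the result is a lower
    approximation of the true norm.
    """
    m = len(coeffs) - 1
    best = 0.0
    for i in range(samples):
        t = 2.0 * math.pi * i / samples
        x, y = math.cos(t), math.sin(t)
        value = sum(ck * x ** (m - k) * y ** k for k, ck in enumerate(coeffs))
        best = max(best, abs(value))
    return best
```

For instance, `condition_number(2.0, 0.0, 0.0, 0.5)` evaluates the condition number of $\operatorname{diag}(2, 1/2)$, and `poly_norm([0.0, 1.0, 0.0])` approximates the norm of $\pi(x, y) = xy$.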
L m,p and local error estimates In this section, we study the function L m,p and obtain local W ,p error estimatesfor functions of two variables. These estimates naturally give rise to a heuristic method .2. The shape function L m,p and local error estimates f , in other wordstriangulations satisfying the estimate (3.4) up to a fixed multiplicative constant, and itis put into practice in § § S ( T ) of a triangle T . Given two triangles T, T (cid:48) , there are precisely 6 affine transformations Ψ such that Ψ( T ) = T (cid:48) . Each of theseaffine transformations Ψ defines a linear transformation ψ and we set d ( T, T (cid:48) ) := ln (inf { κ ( ψ ) ; Ψ( T ) = T (cid:48) } ) , (3.14)where κ ( ψ ) is the condition number defined in (3.11). Clearly d ( T, T (cid:48) ) ≥ d ( T, T (cid:48) ) = d ( T (cid:48) , T ) and d ( T, T (cid:48)(cid:48) ) ≤ d ( T, T (cid:48) ) + d ( T (cid:48) , T (cid:48)(cid:48) ). Furthermore d ( T, T (cid:48) ) = 0 if and only if T canbe transformed into T (cid:48) through a translation, a rotation and a dilatation. Therefore d ( · , · )defines a distance between shapes of triangles. The heuristic guideline of the papers [10,62]and Chapter 6 is that obtuse shapes should be avoided when possible in the design ofFinite Element meshes for the approximation of a function in the W ,p norm. We thereforeintroduce the set A of all acute triangles and we define the measure of sliverness S ( T ) ofa triangle T as follows S ( T ) := exp d ( T, A ) = inf { κ ( ψ ) ; Ψ( T ) ∈ A } . (3.15)This quantity reflects the distance from T to the set of acute triangles : in particular S ( T ) = 1 if and only if T ∈ A , and S ( T ) > Proposition 3.2.1.
For any triangle $T$ with largest angle $\theta$, one has $S(T) = \max\{1, \tan(\theta/2)\}$. Proof:
The result of this proposition is trivial if the triangle $T$ is acute, and we therefore assume that $T$ is obtuse. We can assume without loss of generality that the vertices of $T$ are $0$, $\alpha u$ and $\beta v$, where $\alpha, \beta > 0$, $u, v \in \mathbb{R}^2$, $|u| = |v| = 1$ and $\langle u, v\rangle = \cos\theta$. Note that $|u - v| = 2\sin(\theta/2)$ and $|u + v| = 2\cos(\theta/2)$. Let $\Psi$ be an affine transformation such that $\Psi(T) \in \mathcal{A}$, and let $\psi$ be the associated linear transform. Since $\Psi(T)$ is acute we have $\langle \psi(u), \psi(v)\rangle \ge 0$ and therefore $|\psi(u) - \psi(v)| \le |\psi(u) + \psi(v)|$. It follows that
$$\kappa(\psi) = \|\psi\|\,\|\psi^{-1}\| \ge \frac{|u - v|}{|u + v|} \times \frac{|\psi(u) + \psi(v)|}{|\psi(u) - \psi(v)|} \ge \frac{2\sin(\theta/2)}{2\cos(\theta/2)} = \tan\frac{\theta}{2}.$$
Therefore $S(T) \ge \tan(\theta/2)$. Furthermore, let $\psi$ be defined by $\psi(u) = (0, 1)$ and $\psi(v) = (1, 0)$. Then $\psi(T)$ has one of its angles equal to $\boldsymbol{\pi}/2$, hence $\psi(T) \in \mathcal{A}$. Moreover $\kappa(\psi) = \tan(\theta/2)$ and therefore $S(T) \le \tan(\theta/2)$. This concludes the proof of this proposition. $\diamond$

The previous proposition implies in particular that $S(T)$ is equivalent to the quantity, expressed in terms of the largest angle $\theta$ of $T$, which is used in [10,62]. The following lemma shows that the interpolation process is stable with respect to the $L^\infty$ norm of the gradient if the measure of sliverness $S(T)$ is controlled. Let us mention that a slightly different formulation of this result was already proved in [62], yet not exactly adapted to our purposes.

Figure 3.1: the Lagrange interpolation points on $T_0$ are aligned vertically.

Lemma 3.2.2.
There exists a constant $C = C(m)$ such that for any triangle $T$ and any $f \in W^{1,\infty}(T)$ one has
$$\|\nabla I^{m-1}_T f\|_{L^\infty(T)} \le C\,S(T)\,\|\nabla f\|_{L^\infty(T)}. \tag{3.16}$$
Proof:
Let $T_0$ be the triangle of vertices $(0,0)$, $(1,0)$ and $(0,1)$, and let $g \in W^{1,\infty}(T_0)$. We define $\tilde g(x, y) := g(x, 0)$ and $h(x, y) := g(x, y) - g(x, 0)$, so that $g = \tilde g + h$. Since $\tilde g$ does not depend on $y$ and since the Lagrange interpolation points on $T_0$ are aligned vertically, as illustrated on Figure 3.1, the Lagrange interpolant $I^{m-1}_{T_0} \tilde g$ does not depend on $y$ either. Furthermore, for all $(x, y) \in T_0$, we have $|h(x, y)| = \left|\int_{s=0}^{y} \frac{\partial g}{\partial y}(x, s)\,ds\right| \le \left\|\frac{\partial g}{\partial y}\right\|_{L^\infty(T_0)}$. Hence
$$\left\|\frac{\partial I^{m-1}_{T_0} g}{\partial y}\right\|_{L^\infty(T_0)} = \left\|\frac{\partial I^{m-1}_{T_0} h}{\partial y}\right\|_{L^\infty(T_0)} \le C_0 \|I^{m-1}_{T_0} h\|_{L^\infty(T_0)} \le C_0 C_1 \|h\|_{L^\infty(T_0)} \le C_0 C_1 \left\|\frac{\partial g}{\partial y}\right\|_{L^\infty(T_0)},$$
where the constants $C_0$ and $C_1$ are the $L^\infty(T_0)$ norms of the operators $g \mapsto \frac{\partial g}{\partial y}$ restricted to $\mathbb{P}_{m-1}$ and $g \mapsto I^{m-1}_{T_0} g$ respectively.

Let $e$ be an edge vector of $T$. There exists an affine change of coordinates $\Psi_e$, with linear part $\psi_e$, such that $T_0 = \Psi_e(T)$ and $\psi_e(e) = e_0$, where $e_0 = (0, 1)$ is the vertical edge vector of $T_0$. Noticing that
$$\langle e, \nabla I^{m-1}_T (g \circ \Psi_e)\rangle = \langle e, \nabla((I^{m-1}_{T_0} g) \circ \Psi_e)\rangle = \langle e_0, (\nabla I^{m-1}_{T_0} g) \circ \Psi_e\rangle = \frac{\partial I^{m-1}_{T_0} g}{\partial y} \circ \Psi_e,$$
we obtain
$$\|\langle e, \nabla I^{m-1}_T (g \circ \Psi_e)\rangle\|_{L^\infty(T)} = \left\|\frac{\partial I^{m-1}_{T_0} g}{\partial y}\right\|_{L^\infty(T_0)} \le C_0 C_1 \left\|\frac{\partial g}{\partial y}\right\|_{L^\infty(T_0)} = C_0 C_1 \|\langle e, \nabla(g \circ \Psi_e)\rangle\|_{L^\infty(T)}. \tag{3.17}$$
Applying this to $g = f \circ \Psi_e^{-1}$ we obtain that
$$\|\langle e, \nabla I^{m-1}_T f\rangle\|_{L^\infty(T)} \le C_0 C_1 \|\langle e, \nabla f\rangle\|_{L^\infty(T)}, \tag{3.18}$$
for all edge vectors $e \in \{a, b, c\}$ of $T$. We next define a norm on $\mathbb{R}^2$ as follows:
$$|v|_T := |a|^{-1} |\langle a, v\rangle| + |b|^{-1} |\langle b, v\rangle| + |c|^{-1} |\langle c, v\rangle|.$$
It follows from inequality (3.18), applied to each of the three edge vectors, that
$$\|\,|\nabla I^{m-1}_T f|_T\,\|_{L^\infty(T)} \le 3 C_0 C_1 \|\,|\nabla f|_T\,\|_{L^\infty(T)}. \tag{3.19}$$
We next observe that if $\theta$ denotes the maximal angle of $T$, then
$$\cos(\theta/2)\,|v| \le |v|_T \le 3|v|,$$
where $|\cdot|$ is the Euclidean norm: the upper inequality is trivial and the lower one is implied by the fact that at least one of the edge vectors makes an angle less than $\theta/2$ with $v$. Combining this with (3.19), we obtain
$$\|\nabla I^{m-1}_T f\|_{L^\infty(T)} \le \frac{9 C_0 C_1}{\cos(\theta/2)} \|\nabla f\|_{L^\infty(T)}.$$
Since $\theta \ge \boldsymbol{\pi}/3$ we have $\frac{1}{\cos(\theta/2)} \le 2\tan(\theta/2) \le 2 S(T)$ according to Proposition 3.2.1, which concludes the proof with $C = 18 C_0 C_1$. $\diamond$

Remark 3.2.3.
The following example in the simple case of piecewise linear approximation illustrates the sharpness of inequality (3.16). Let $T$ be a triangle having an obtuse angle $\theta$ at a vertex $v$, and edges neighbouring $v$ of lengths $l$ and $l'$. Let $f(z) := |z - v|^2$. A simple computation shows that
$$\|\nabla I^1_T f\|_{L^\infty(T)} = \frac{\operatorname{diam} T}{\sin\theta} \quad\text{and}\quad \|\nabla f\|_{L^\infty(T)} = 2\max(l, l').$$
It follows that $\|\nabla I^1_T f\|_{L^\infty(T)} = \lambda(T)\,S(T)\,\|\nabla f\|_{L^\infty(T)}$, with
$$\lambda(T) := \frac{\operatorname{diam}(T)}{2\sin(\theta)\,S(T)\max\{l, l'\}} = \frac{\operatorname{diam}(T)}{4\sin^2(\theta/2)\max\{l, l'\}} \in [1/4, 1],$$
which shows the sharpness of Lemma 3.2.2 in this context.

We now introduce for each polynomial $\pi \in \mathbb{H}_m$ a special set $\mathcal{A}_\pi \subset M_2(\mathbb{R})$ of linear maps:
$$\mathcal{A}_\pi := \{A \in M_2(\mathbb{R}) \,;\, |\nabla\pi(z)| \le |Az|^{m-1} \text{ for all } z \in \mathbb{R}^2\}. \tag{3.20}$$
This set has a geometrical interpretation: since $\nabla\pi$ is homogeneous of degree $m - 1$, we find that $A \in \mathcal{A}_\pi$ if and only if the ellipse $\{z \in \mathbb{R}^2 \,;\, |Az| \le 1\}$ is included in the set $\{z \in \mathbb{R}^2 \,;\, |\nabla\pi(z)| \le 1\}$, which is bounded by the algebraic curve $\{|\nabla\pi(z)| = 1\}$. If $T$ is a triangle that contains the origin and if $A \in \mathcal{A}_\pi$, we observe that
$$\|\nabla\pi\|_{L^\infty(T)} \le \operatorname{diam}(A(T))^{m-1}. \tag{3.21}$$
We define
$$\gamma_m(\pi) := \inf\{|\det A| \,;\, A \in \mathcal{A}_\pi\},$$
so that $\boldsymbol{\pi}/\gamma_m(\pi)$ is the maximal area of an ellipse contained in $\{z \in \mathbb{R}^2 \,;\, |\nabla\pi(z)| \le 1\}$.

Remark 3.2.4.
Similar concepts have been introduced in [23] for the purpose of studying the $L^p$ interpolation error of anisotropic finite elements, therefore with $|\pi(z)|$ in place of $|\nabla\pi(z)|$.

The following result shows that a certain power of $\gamma_m$ is equivalent to the shape function $L_m$. For any domain $\Omega \subset \mathbb{R}^d$ and any $f, g \in L^p(\Omega)$ we use the shorthand
$$\|(f, g)\|_{L^p(\Omega)} := \left(\int_\Omega |(f(z), g(z))|^p\,dz\right)^{\frac1p} = \left(\int_\Omega (f(z)^2 + g(z)^2)^{p/2}\,dz\right)^{\frac1p}. \tag{3.22}$$
Lemma 3.2.5.
There exists a constant $C = C(m)$ such that for all $\pi \in \mathbb{H}_m$
$$C^{-1} L_m(\pi) \le \gamma_m(\pi)^{\frac{m-1}{2}} \le C\,L_m(\pi). \tag{3.23}$$
Proof:
We first prove the left part of (3.23). Let $\pi \in \mathbb{H}_m$ and $A \in \mathcal{A}_\pi$ such that $A$ is invertible. The matrix $A$ admits a singular value decomposition $A = UDV$, where $U, V$ are unitary and $D = \operatorname{diag}(\lambda_1, \lambda_2)$ with $\lambda_i > 0$ such that $\lambda_i^2$ are the eigenvalues of $A^{\mathrm T} A$. Let $T_0$ be the triangle of vertices $(0,0)$, $(0, \sqrt2)$ and $(\sqrt2, 0)$, and define
$$T := \sqrt{|\det A|}\; V^{\mathrm T} D^{-1}(T_0),$$
which satisfies $|T| = |T_0| = 1$ and has an angle of $\boldsymbol{\pi}/2$, hence $S(T) = 1$. Denoting by $C_0$ the constant in Lemma 3.2.2 and using (3.21), we obtain
$$(1 + C_0)^{-1} \|\nabla\pi - \nabla I^{m-1}_T \pi\|_{L^\infty(T)} \le \|\nabla\pi\|_{L^\infty(T)} \le \operatorname{diam}(A(T))^{m-1} = |\det A|^{\frac{m-1}{2}} \operatorname{diam}(T_0)^{m-1}.$$
Taking the infimum over all invertible $A \in \mathcal{A}_\pi$ and remarking that this set is dense in $\mathcal{A}_\pi$, we conclude the proof of the left part of (3.23).

For the right part, we define for all $q_1, q_2 \in \mathbb{H}_{m-1}$ and any triangle $T$,
$$\|(q_1, q_2)\|_T := \inf_{r_1, r_2 \in \mathbb{P}_{m-2}} \|(q_1, q_2) - (r_1, r_2)\|_{L^\infty(T)}. \tag{3.24}$$
We denote by $T_{\mathrm{eq}}$ an equilateral triangle centered at the origin and of area 1. Since the functions $\|\cdot\|_{T_{\mathrm{eq}}}$ and $\|\cdot\|_{L^\infty(T_{\mathrm{eq}})}$ are norms on $\mathbb{H}_{m-1} \times \mathbb{H}_{m-1}$, there exists a constant $C_*$ such that $\|\cdot\|_{L^\infty(T_{\mathrm{eq}})} \le C_* \|\cdot\|_{T_{\mathrm{eq}}}$. Let $T$ be a triangle satisfying $|T| = 1$ and $\operatorname{bary}(T) = 0$. There exists $\phi \in SL_2$ such that $T = \phi(T_{\mathrm{eq}})$. We then obtain
$$\|(q_1, q_2)\|_{L^\infty(T)} = \|(q_1 \circ \phi, q_2 \circ \phi)\|_{L^\infty(T_{\mathrm{eq}})} \le C_* \|(q_1 \circ \phi, q_2 \circ \phi)\|_{T_{\mathrm{eq}}} = C_* \|(q_1, q_2)\|_T.$$
We now choose a polynomial $\pi \in \mathbb{H}_m$ and we set $(q_1, q_2) := \nabla\pi$. It follows from the previous equation and (3.24) that
$$\|\nabla\pi\|_{L^\infty(T)} \le C_* \|\nabla\pi\|_T \le C_* \|\nabla\pi - \nabla I^{m-1}_T \pi\|_{L^\infty(T)}.$$
We define a linear map $A \in GL_2$ associated to $\pi$ and $T = \phi(T_{\mathrm{eq}})$ as follows:
$$A := \|\nabla\pi\|_{L^\infty(T)}^{\frac{1}{m-1}}\, \lambda^{-1} \phi^{-1},$$
where $\lambda = 3^{-3/4}$ is the minimal distance from $0$ to $\partial T_{\mathrm{eq}}$. Then for all $z \in \partial T$ we have $\phi^{-1}(z) \in \partial T_{\mathrm{eq}}$ and hence $|\phi^{-1}(z)| \ge \lambda$. Therefore,
$$|A(z)|^{m-1} = \|\nabla\pi\|_{L^\infty(T)} \left(\lambda^{-1} |\phi^{-1}(z)|\right)^{m-1} \ge \|\nabla\pi\|_{L^\infty(T)} \ge |\nabla\pi(z)|.$$
By homogeneity, we thus find that for all $z \in \mathbb{R}^2$, $|\nabla\pi(z)| \le |A(z)|^{m-1}$, which means that $A \in \mathcal{A}_\pi$. Furthermore, since $\det\phi = 1$, we have
$$|\det A|^{\frac{m-1}{2}} = \lambda^{-(m-1)} \|\nabla\pi\|_{L^\infty(T)} \le C_* \lambda^{-(m-1)} \|\nabla\pi - \nabla I^{m-1}_T \pi\|_{L^\infty(T)}.$$
Hence taking the infimum over all triangles $T$ satisfying $|T| = 1$ and $\operatorname{bary}(T) = 0$ we obtain
$$\gamma_m(\pi)^{\frac{m-1}{2}} \le C \inf_{\substack{|T| = 1 \\ \operatorname{bary}(T) = 0}} \|\nabla\pi - \nabla I^{m-1}_T \pi\|_{L^\infty(T)} \tag{3.25}$$
with $C = C_* \lambda^{-(m-1)}$. Using the invariance of the interpolation error under translation, as expressed by (3.2), we find that the right hand side of (3.25) is $C\,L_m(\pi)$, which concludes the proof. $\diamond$

We next introduce a measure of the isotropy of a triangle $T$ with respect to the Euclidean metric:
$$\rho(T) := \frac{\operatorname{diam}(T)^2}{|T|}. \tag{3.26}$$
If $T$ is an obtuse triangle, an elementary computation shows that $4 S(T) \le \rho(T)$. Indeed, if the largest angle of $T$ is $\theta \ge \boldsymbol{\pi}/2$, and if the edges neighbouring the angle $\theta$ have lengths $l_1, l_2$, we obtain using $l_1^2 + l_2^2 \ge 2 l_1 l_2$ that
$$\rho(T) = \frac{l_1^2 + l_2^2 - 2 l_1 l_2 \cos\theta}{\frac12 l_1 l_2 \sin\theta} \ge 4\,\frac{1 - \cos\theta}{\sin\theta} = 4\tan\frac{\theta}{2} = 4 S(T).$$
Since the minimal value of $\rho$ is $4/\sqrt3$ and since $S(T) = 1$ for acute triangles, we obtain that for any triangle $T$
$$\rho(T) \ge \frac{4}{\sqrt3}\,S(T). \tag{3.27}$$
The functions $S$ and $\rho$ have a different behavior: $\rho(T)$ increases as $T$ becomes thinner, while $S(T)$ increases only if an angle of $T$ approaches $\boldsymbol{\pi}$.

In the remainder of this chapter, we frequently distort the measure of isotropy $\rho$ by a linear transform. If $A \in GL_2$, then $\rho(A(T))$ reflects the isotropy of $T$ measured in the metric $A^{\mathrm T} A$. In particular $\rho(A(T))$ is minimal, i.e. equal to $4/\sqrt3$, if and only if the ellipse $E$ containing $T$ and of minimal area is of the form $E = \{z \in \mathbb{R}^2 \,;\, |A(z - \operatorname{bary}(T))| \le r\}$ for some $r > 0$.

Theorem 3.2.6.
There exists a constant C = C ( m ) such that for all π ∈ IH m , all A ∈ A π and any triangle T , we have | π − I m − T π | W ,p ( T ) ≤ C | T | τ S ( T ) ρ ( A ( T )) m − | det A | m − (3.28) where τ := m − + p . Furthermore for any triangle T and any g ∈ C m ( T ) , we have | g − I m − T g | W ,p ( T ) ≤ C | T | τ S ( T ) ρ ( T ) m − (cid:107) d m g (cid:107) L ∞ ( T ) , (3.29) where (cid:107) d m g (cid:107) L ∞ ( T ) is defined by (3.13). Before proving this result, we make some observations on its consequences. Combiningthe two estimates contained in this theorem, we obtain a mixed anisotropic-isotropicestimate, that can be used as a guideline for producing triangulations adapted to a function f ∈ C m (Ω). Let T be a triangle, let f ∈ C m ( T ), π ∈ IH m and A ∈ A π . Then | f − I m − T f | W ,p ( T ) ≤ | π − I m − T π | W ,p ( T ) + | ( f − π ) − I m − T ( f − π ) | W ,p ( T ) , ≤ C | T | τ S ( T ) (cid:16) ρ ( A ( T )) m − | det A | m − + ρ ( T ) m − (cid:107) d m f − d m π (cid:107) L ∞ ( T ) (cid:17) , (3.30)where C = C ( m ). Note that the left term in the parenthesis is an “anisotropic” contributionto the error, while the right term is an “isotropic” contribution.Let ε > ≤ p < ∞ . We now explain how the requirements (i), (ii), (iii) and (iv)heuristically exposed in the introduction can be mathematically stated, and show thatthe estimate T ) m − | f − I m − T f | W ,p (Ω) ≤ C (cid:107) L m ( π z ) + ε (cid:107) L τ (Ω) , is met when the triangulation satisfies these requirements. Consider a polygonal andbounded domain Ω, a function f ∈ C m (Ω) and a triangulation T . For each z ∈ Ω, wedenote by T z ∈ T the triangle containing z and define π z = d m f ( z ) m ! ∈ IH m . The adaptationof T with respect to f for the W ,p semi-norm, can be measured by the smallest constant C T ≥ .2. The shape function L m,p and local error estimates δ > z ∈ Ω, C − T δ ≤ | T z | τ ( L m ( π z ) + ε ) ≤ C T δ. 
(3.31)(ii) (Optimized shapes) For all z ∈ Ω, there exists A z ∈ A π z , such that ρ ( A z ( T z )) ≤ C T and | det A z | m − ≤ C L ( L m ( π z ) + ε ) , (3.32)where C L is the constant that appears in Lemma (3.2.5). According to this lemma,such an A z always exists for any ε > l p ( T ) norm of S is bounded as follows (cid:32) T ) (cid:88) T ∈T S ( T ) p (cid:33) p ≤ C T . (3.33)This condition is less stringent than asking that S ( T ) ≤ C T for all T ∈ T , and turnsout to be sufficient for proving the optimal error estimate.(iv) (Sufficient refinement) The mesh T is sufficiently fine in such way that the localinterpolation error estimate(3.30) is controlled by the “anisotropic” component. Moreprecisely, for all z ∈ Ω, ρ ( T z ) m − (cid:107) d m f − d m f ( z ) (cid:107) L ∞ ( T z ) ≤ C T ( L m ( π z ) + ε ) . (3.34)This condition is ensured by sufficient refinement of the triangulation due to thefollowing observation : If T (cid:48) z is the image of T z by a homothetic size reduction around z , then ρ ( T (cid:48) z ) = ρ ( T z ) while (cid:107) d m f − d m f ( z ) (cid:107) L ∞ ( T (cid:48) z ) tends to zero due to the continuityof d m f .We now produce a global error estimate from these four assumptions. For a given z ∈ Ω we inject successively π = π z , (3.32), (3.34) and (3.31) into the estimate (3.30) andobtain | f − I m − T z f | W ,p ( T z ) ≤ C | T z | τ S ( T z ) (cid:16) ρ ( A z ( T z )) m − | det A z | m − + ρ ( T z ) m − (cid:107) d m f − d m π z (cid:107) L ∞ ( T ) (cid:17) ≤ C | T z | τ S ( T z )( C m − T C L ( L m ( π z ) + ε ) + ρ ( T z ) m − (cid:107) d m f − d m f ( z ) (cid:107) L ∞ ( T ) ) ≤ C | T z | τ S ( T z )( C m − T C L + C T )( L m ( π z ) + ε ) ≤ C δS ( T z )where C = C ( m, C T , C L ). Using (3.33) we obtain | f − I m − T f | pW ,p (Ω) = (cid:88) T ∈T | f − I m − T f | pW ,p ( T ) ≤ C p δ p (cid:88) T ∈T S ( T ) p ≤ ( C C T ) p δ p T ) . 
(3.35)On the other hand, the left side of inequality (3.31) provides an upper estimate of δ asfollows. C − τ T δ τ T ) = C − τ T δ τ (cid:90) Ω dz | T z | ≤ (cid:90) Ω ( L m ( π z ) + ε ) τ dz = (cid:107) L m ( π z ) + ε (cid:107) τL τ (Ω) . (3.36)26 Chapter 3. Sharp asymptotics of the W ,p interpolation error Combining (3.35) with (3.36) we eliminate the variable δ and obtain T ) m − | f − I m − T f | W ,p (Ω) ≤ C (cid:107) L m ( π z ) + ε (cid:107) L τ (Ω) , (3.37)where C = C ( m, C T ). Hence the optimal asymptotic estimate (3.4) is satisfied up toa multiplicative constant depending only on the degree m − C T . Note however that the properties (3.31), (3.32), (3.33)and (3.34) required for C T may lead to a very pessimistic constant C = C ( m, C T , C L ) ininequality (3.37). Finer estimates and weaker conditions on the mesh T can be obtainedfrom (3.30).In the context of the H = W , semi-norm and of piecewise linear and quadraticelements we present numerical results in § T using three quantities σ ( T ), ρ ( T ) and S ( T ) that are related to the conditions(i), (ii) and (iii) respectively. We also discuss in § T of the mesh T in terms of Riemannianmetrics, a more convenient form for mesh generation.The construction of a mesh which satisfies both the requirements (3.32) of optimizedshapes and (3.33) of bounded measure of sliverness is a difficult problem. The constructionpresented in this chapter, for the proof of Theorems 3.1.1 and 3.1.2, is based on a localpatching strategy. A small portion of the triangulations, which can be neglected as thecardinality tends to infinity, does not satisfy conditions. Another approach to this meshgeneration problem is presented in Chapter 5, where the requirements (3.33) is replacedas follows : the measure of sliverness needs to be uniformly bounded on a refinement of T .The role of the measure of sliverness in the W ,p approximation error is again discussedin Chapter 6. Proof of Theorem 3.2.6 :
Let $T$ be a triangle and let $h \in C^1(T)$. Using Lemma 3.2.2, we obtain
$$|h - I^{m-1}_T h|_{W^{1,p}(T)} = \|\nabla h - \nabla I^{m-1}_T h\|_{L^p(T)} \le |T|^{\frac1p} \|\nabla h - \nabla I^{m-1}_T h\|_{L^\infty(T)} \le |T|^{\frac1p} (1 + C S(T)) \|\nabla h\|_{L^\infty(T)}. \tag{3.38}$$
Replacing $h$ with $\pi$ in inequality (3.38) and combining it with (3.21), we obtain that if $T$ contains the origin, then for all $A \in \mathcal{A}_\pi$
$$|\pi - I^{m-1}_T \pi|_{W^{1,p}(T)} \le |T|^{1/p} (1 + C S(T)) \operatorname{diam}(A(T))^{m-1}.$$
The left and right quantities in the above inequality are invariant by translation of $T$ and therefore this inequality remains valid for any $T$. Combining it with the identity
$$\operatorname{diam}(A(T))^2 = |T|\,|\det A|\,\rho(A(T)),$$
and with $1 + C S(T) \le (1 + C) S(T)$, this leads to the first inequality (3.28) of Theorem 3.2.6.

For the second inequality, we take $g \in C^m(T)$ and $z_0 = (x_0, y_0) \in T$. We now take for $h$ the remainder of the Taylor development of $g$ at $z_0$,
$$h(x, y) := g(x, y) - \sum_{k + l \le m - 1} \frac{\partial^{k+l} g}{\partial x^k \partial y^l}(z_0)\, \frac{(x - x_0)^k}{k!} \frac{(y - y_0)^l}{l!}.$$
Then $h \in C^m(T)$ and
$$h(z_0) = dh(z_0) = \cdots = d^{m-1} h(z_0) = 0. \tag{3.39}$$
It follows that
$$\|\nabla h\|_{L^\infty(T)} \le C_0 (\operatorname{diam} T)^{m-1} \|d^m h\|_{L^\infty(T)} = C_0 |T|^{\frac{m-1}{2}} \rho(T)^{\frac{m-1}{2}} \|d^m h\|_{L^\infty(T)}, \tag{3.40}$$
where $C_0 = C_0(m)$. Combining (3.38) and (3.40), we obtain
$$|h - I^{m-1}_T h|_{W^{1,p}(T)} \le C_1 |T|^{\frac1\tau} S(T)\, \rho(T)^{\frac{m-1}{2}} \|d^m h\|_{L^\infty(T)}, \tag{3.41}$$
where $C_1 = C_1(m)$. We now observe that $g - h \in \mathbb{P}_{m-1}$, hence $d^m g = d^m h$ and $h - I^{m-1}_T h = g - I^{m-1}_T g$. Injecting this into the last equation we conclude the proof of inequality (3.29) and of Theorem 3.2.6. $\diamond$

As a conclusion to this section we prove inequality (3.10), which links the functions $L_{m,p}$ and $L_m = L_{m,\infty}$.

Lemma 3.2.7.
There exists a constant $c = c(m) > 0$ such that for all $1 \le p_1 \le p_2 \le \infty$,
$$c\,L_m \le L_{m,p_1} \le L_{m,p_2} \le L_m \quad\text{on } \mathbb{H}_m. \tag{3.42}$$
Proof:
Let $T_{\mathrm{eq}}$ be an equilateral triangle of area 1. Since all norms are equivalent on the finite dimensional space $\mathbb{P}_{m-1}$, there exists a constant $c = c(m) > 0$ such that for all $(q_1, q_2) \in \mathbb{P}_{m-1} \times \mathbb{P}_{m-1}$,
$$c\,\|(q_1, q_2)\|_{L^\infty(T_{\mathrm{eq}})} \le \|(q_1, q_2)\|_{L^1(T_{\mathrm{eq}})}. \tag{3.43}$$
Furthermore, since $T_{\mathrm{eq}}$ has area 1, we have
$$\|(q_1, q_2)\|_{L^{p_1}(T_{\mathrm{eq}})} \le \|(q_1, q_2)\|_{L^{p_2}(T_{\mathrm{eq}})}, \tag{3.44}$$
for all $1 \le p_1 \le p_2 \le \infty$. If $T$ is a triangle satisfying $|T| = 1$, there exists an affine change of coordinates $\Psi$ such that $T = \Psi(T_{\mathrm{eq}})$ and we have $\|(q_1, q_2)\|_{L^p(T)} = \|(q_1 \circ \Psi, q_2 \circ \Psi)\|_{L^p(T_{\mathrm{eq}})}$ for all $(q_1, q_2) \in \mathbb{P}_{m-1} \times \mathbb{P}_{m-1}$ and $1 \le p \le \infty$. Combining this invariance property with inequalities (3.43) and (3.44) we obtain
$$c\,\|(q_1, q_2)\|_{L^\infty(T)} \le \|(q_1, q_2)\|_{L^{p_1}(T)} \le \|(q_1, q_2)\|_{L^{p_2}(T)} \le \|(q_1, q_2)\|_{L^\infty(T)}. \tag{3.45}$$
We now choose a polynomial $\pi \in \mathbb{H}_m$, we set $(q_1, q_2) := \nabla\pi - \nabla I^{m-1}_T \pi$, and we take the infimum of (3.45) among all triangles $T$ of area 1. This leads to the announced inequality (3.42), which concludes the proof. $\diamond$

3.3 Proof of Theorems 3.1.1 and 3.1.2

The polygonal domain $\Omega$, the integer $m$, the function $f \in C^m(\Omega)$ and the exponent $1 \le p < \infty$ are fixed in this section, which is devoted to the proof of the lower estimate (3.7) and the upper estimates (3.4) and (3.8) which are stated in Theorems 3.1.1 and 3.1.2.

We denote by $\mu_z$ the Taylor polynomial of $f$ of degree $m$ at the point $z = (x_0, y_0) \in \Omega$:
$$\mu_z(x, y) := \sum_{k + l \le m} \frac{\partial^{k+l} f(z)}{\partial x^k \partial y^l}\, \frac{(x - x_0)^k}{k!} \frac{(y - y_0)^l}{l!}.$$
Note that $\pi_z$ is the homogeneous component of degree $m$ in $\mu_z$. Therefore $d^m \pi_z = d^m \mu_z = d^m f(z)$ for any $z \in \Omega$, and for any triangle $T$
$$\pi_z - I^{m-1}_T \pi_z = \mu_z - I^{m-1}_T \mu_z. \tag{3.46}$$
The following lemma allows us to bound from below the interpolation error of $f$ on a triangle $T$.

Lemma 3.3.1.
Let $\frac1\tau := \frac{m-1}{2} + \frac1p$. For any triangle $T \subset \Omega$ and $z \in T$ we have
$$|f - I^{m-1}_T f|_{W^{1,p}(T)} \ge |T|^{\frac1\tau} \left(L_{m,p}(\pi_z) - \omega(\operatorname{diam} T)\, \rho(T)^{\frac{m-1}{2}} S(T)\right),$$
where the function $\omega$ is positive, depends only on $f$ and $m$, and satisfies $\omega(\delta) \to 0$ as $\delta \to 0$. Proof:
Let $h := f - \mu_z$. Using Equation (3.46) we obtain
$$|f - I^{m-1}_T f|_{W^{1,p}(T)} \ge |\pi_z - I^{m-1}_T \pi_z|_{W^{1,p}(T)} - |h - I^{m-1}_T h|_{W^{1,p}(T)} \ge |T|^{\frac1\tau} L_{m,p}(\pi_z) - |h - I^{m-1}_T h|_{W^{1,p}(T)},$$
and we have seen in Theorem 3.2.6 that
$$|h - I^{m-1}_T h|_{W^{1,p}(T)} \le C |T|^{\frac1\tau} S(T)\, \rho(T)^{\frac{m-1}{2}} \|d^m h\|_{L^\infty(T)}$$
for some constant $C > 0$ depending only on $m$. We then remark that
$$\|d^m h\|_{L^\infty(T)} = \|d^m f - d^m \pi_z\|_{L^\infty(T)} = \|d^m f - d^m f(z)\|_{L^\infty(T)}.$$
Therefore, defining
$$\omega(\delta) := C \sup_{z, z' \in \Omega \,;\, |z - z'| \le \delta} \|d^m f(z) - d^m f(z')\|,$$
we conclude the proof of this lemma. $\diamond$

For the lower estimate (3.7), consider an admissible sequence of triangulations $(\mathcal{T}_N)_{N \ge N_0}$. For all $N \ge N_0$, $T \in \mathcal{T}_N$ and $z \in T$, we define
$$\phi_N(z) := |T| \quad\text{and}\quad \psi_N(z) := \left(L_{m,p}(\pi_z) - \omega(\operatorname{diam}(T))\, \rho(T)^{\frac{m-1}{2}} S(T)\right)_+,$$
where $\lambda_+ := \max\{\lambda, 0\}$. Hölder's inequality gives, with $\frac1\tau := \frac{m-1}{2} + \frac1p$,
$$\int_\Omega \psi_N^\tau \le \left(\int_\Omega \phi_N^{\frac{(m-1)p}{2}} \psi_N^p\right)^{\frac\tau p} \left(\int_\Omega \phi_N^{-1}\right)^{\frac{(m-1)\tau}{2}}. \tag{3.47}$$
Note that $\int_\Omega \phi_N^{-1} = \#(\mathcal{T}_N) \le N$. Furthermore if $T \in \mathcal{T}_N$ and $z \in T$ then according to Lemma 3.3.1
$$\phi_N(z)^{\frac{(m-1)p}{2}} \psi_N(z)^p = |T|^{\frac p\tau - 1} \psi_N(z)^p \le \frac{1}{|T|} |f - I^{m-1}_T f|^p_{W^{1,p}(T)},$$
hence
$$\int_\Omega \phi_N^{\frac{(m-1)p}{2}} \psi_N^p \le \sum_{T \in \mathcal{T}_N} \frac{1}{|T|} \int_T |f - I^{m-1}_T f|^p_{W^{1,p}(T)}\,dz = |f - I^{m-1}_{\mathcal{T}_N} f|^p_{W^{1,p}(\Omega)}.$$
Inequality (3.47) therefore leads to
$$\|\psi_N\|_{L^\tau(\Omega)} \le |f - I^{m-1}_{\mathcal{T}_N} f|_{W^{1,p}(\Omega)}\, N^{\frac{m-1}{2}}. \tag{3.48}$$
Since the sequence $(\mathcal{T}_N)_{N \ge N_0}$ is admissible, there exists a constant $C_A > 0$ such that for all $N$ and all $T \in \mathcal{T}_N$ we have $\operatorname{diam}(T) \le C_A N^{-\frac12}$. We introduce a subset $\mathcal{T}'_N \subset \mathcal{T}_N$ which gathers the most degenerate triangles
$$\mathcal{T}'_N = \{T \in \mathcal{T}_N \,;\, \rho(T) \ge \omega(C_A N^{-\frac12})^{-\frac{1}{m+1}}\},$$
where $\omega$ is the function from Lemma 3.3.1. We denote by $\Omega'_N$ the portion of $\Omega$ covered by $\mathcal{T}'_N$.
For all $z \in \Omega \setminus \Omega'_N$, recalling from (3.27) that $\rho \ge S$, we obtain
$$\psi_N(z) \ge L_{m,p}(\pi_z) - \sqrt{\omega(C_A N^{-\frac12})}.$$
Hence
$$\|\psi_N\|^\tau_{L^\tau(\Omega)} \ge \left\|\left(L_{m,p}(\pi_z) - \sqrt{\omega(C_A N^{-\frac12})}\right)_+\right\|^\tau_{L^\tau(\Omega \setminus \Omega'_N)} \ge \left\|\left(L_{m,p}(\pi_z) - \sqrt{\omega(C_A N^{-\frac12})}\right)_+\right\|^\tau_{L^\tau(\Omega)} - C^\tau |\Omega'_N|,$$
where $C := \max_{z \in \Omega} L_{m,p}(\pi_z)$. We next observe that $|\Omega'_N| \to 0$ as $N \to +\infty$: indeed for all $T \in \mathcal{T}'_N$ we have
$$|T| = \operatorname{diam}(T)^2 \rho(T)^{-1} \le C_A^2 N^{-1} \omega(C_A N^{-\frac12})^{\frac{1}{m+1}}.$$
Since $\#(\mathcal{T}'_N) \le N$, we obtain $|\Omega'_N| \le C_A^2\, \omega(C_A N^{-\frac12})^{\frac{1}{m+1}}$, which tends to $0$ as $N \to \infty$. We thus obtain
$$\liminf_{N \to \infty} \|\psi_N\|_{L^\tau(\Omega)} \ge \lim_{N \to \infty} \left\|\left(L_{m,p}(\pi_z) - \sqrt{\omega(C_A N^{-\frac12})}\right)_+\right\|_{L^\tau(\Omega)} = \|L_{m,p}(\pi_z)\|_{L^\tau(\Omega)}.$$
Combining this result with (3.48) we conclude the proof of the announced estimate (3.7).

The proof of these upper estimates is based on an explicit construction of the triangulations $\mathcal{T}_N$, which is adapted from the construction in [4]. Roughly speaking, the idea of this construction is to produce a first mesh $\mathcal{R}$ of the domain $\Omega$, composed of elements sufficiently small so that $f$ can be regarded as a polynomial $\pi_R$ on each triangle $R \in \mathcal{R}$. Each element $R \in \mathcal{R}$ is then tiled with small triangles optimally adapted to $\pi_R$, and some technical manipulations are done in order to preserve the conformity at the interfaces of the elements of $\mathcal{R}$. The main difference with the construction first proposed in [4], and used later in Chapter 2, is that the measure of sliverness $S$ of the generated triangles should be kept under control.

Let $T$ be a triangle with vertices $(z_0, z_1, z_2)$. We define the symmetrized triangle $\tilde T$ of vertices $(z_1, z_2, z_1 + z_2 - z_0)$ so that $T \cup \tilde T$ is a parallelogram. We define a tiling $\mathcal{P}_T$ of the plane $\mathbb{R}^2$ as follows:
$$\mathcal{P}_T := \{\alpha(z_1 - z_0) + \beta(z_2 - z_0) + T' \,;\, \alpha, \beta \in \mathbb{Z},\ T' \in \{T, \tilde T\}\}. \tag{3.49}$$
A homogeneous polynomial $\pi \in \mathbb{H}_m$ is either even or odd (depending on the parity of $m$). Combining this observation with the translation invariance (3.2) we obtain that $|\pi - I^{m-1}_{T'} \pi|_{W^{1,p}(T')}$ is constant among all triangles $T' \in \mathcal{P}_T$. We also define
$$\mathcal{P}_{T,n} := \frac1n \mathcal{P}_T, \tag{3.50}$$
the tiling obtained by rescaling $\mathcal{P}_T$ by a factor $1/n$. We use this rescaled tiling in order to subdivide an arbitrary triangle $R$, up to a few additional triangles located near the boundary of $R$, as expressed by the following lemma.

Lemma 3.3.2.
Let $R$ and $T$ be two triangles. There exists a family $(\mathcal{P}_{T,n}(R))_{n \ge 1}$ of conforming triangulations of $R$ such that the following holds.
1. Nearly all the elements of $\mathcal{P}_{T,n}(R)$ belong to $\mathcal{P}_{T,n}$, which is defined by (3.50), in the sense that
$$\lim_{n \to \infty} \frac{\#(\mathcal{P}^1_{T,n}(R))}{n^2} = \frac{|R|}{|T|} \quad\text{and}\quad \lim_{n \to \infty} \frac{\#(\mathcal{P}^2_{T,n}(R))}{n^2} = 0, \tag{3.51}$$
where
$$\mathcal{P}^1_{T,n}(R) := \mathcal{P}_{T,n}(R) \cap \mathcal{P}_{T,n} \quad\text{and}\quad \mathcal{P}^2_{T,n}(R) := \mathcal{P}_{T,n}(R) \setminus \mathcal{P}_{T,n}. \tag{3.52}$$
2. The vertices of $\mathcal{P}_{T,n}(R)$ on the boundary of $R$ are exactly those of the form $\frac kn a + (1 - \frac kn) b$, where $0 \le k \le n$ and $a, b$ are vertices of $R$.
3. There exist constants $C_1 = C_1(R, T)$ and $C_2 = C_2(R, T)$ such that
$$\sup_{n \ge 1} \left(n \max_{T' \in \mathcal{P}_{T,n}(R)} \operatorname{diam}(T')\right) \le C_1 \quad\text{and}\quad \sup_{n \ge 1} \max_{T' \in \mathcal{P}_{T,n}(R)} S(T') \le C_2. \tag{3.53}$$
Proof:
See appendix. $\diamond$
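The rescaled tiling (3.50), on which Lemma 3.3.2 relies, can be sketched numerically. The Python code below is an illustration and not the construction of the appendix: it only generates the periodic tiles of $\mathcal{P}_{T,n}$ in a finite window, ignores the boundary layer of $R$, and uses hypothetical function names. It assumes that the symmetrized triangle is the reflection of $T$ through the midpoint of the edge $z_1 z_2$, and it evaluates the measure of sliverness through the formula $S(T) = \max\{1, \tan(\theta/2)\}$ of Proposition 3.2.1.

```python
import math

def area(tri):
    """Area of a triangle given as three (x, y) vertices."""
    (ax, ay), (bx, by), (cx, cy) = tri
    return abs((bx - ax) * (cy - ay) - (by - ay) * (cx - ax)) / 2.0

def diameter(tri):
    """Length of the longest edge."""
    return max(math.dist(tri[i], tri[j]) for i in range(3) for j in range(i))

def largest_angle(tri):
    """Largest interior angle, in radians."""
    angles = []
    for i in range(3):
        p, q, r = tri[i], tri[(i + 1) % 3], tri[(i + 2) % 3]
        ux, uy = q[0] - p[0], q[1] - p[1]
        vx, vy = r[0] - p[0], r[1] - p[1]
        angles.append(math.atan2(abs(ux * vy - uy * vx), ux * vx + uy * vy))
    return max(angles)

def sliverness(tri):
    """S(T) = max(1, tan(theta / 2)), theta being the largest angle."""
    return max(1.0, math.tan(largest_angle(tri) / 2.0))

def rescaled_tiling(tri, n, extent):
    """Tiles of (1/n) * P_T for lattice indices alpha, beta in [-extent, extent)."""
    z0, z1, z2 = tri
    e1 = (z1[0] - z0[0], z1[1] - z0[1])
    e2 = (z2[0] - z0[0], z2[1] - z0[1])
    # Symmetrized triangle: T and sym together form a parallelogram.
    sym = (z1, z2, (z1[0] + z2[0] - z0[0], z1[1] + z2[1] - z0[1]))
    tiles = []
    for alpha in range(-extent, extent):
        for beta in range(-extent, extent):
            sx = alpha * e1[0] + beta * e2[0]
            sy = alpha * e1[1] + beta * e2[1]
            for base in (tri, sym):
                tiles.append(tuple(((x + sx) / n, (y + sy) / n) for x, y in base))
    return tiles
```

All tiles are translates of $\frac1n T$ or $\frac1n \tilde T$, so they share the same area $|T|/n^2$, the same rescaled diameter and the same measure of sliverness, which is consistent with the uniform bounds (3.53) away from the boundary of $R$.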
For any $M > 0$, we define the compact set of triangles
$$\mathbb{T}_M := \{T \,;\, |T| = 1,\ \operatorname{diam}(T) \le M \text{ and } \operatorname{bary}(T) = 0\}.$$
Note that for all $T \in \mathbb{T}_M$, $\rho(T) \le M^2$. We also define the function
$$L_M(\pi) := \min_{T \in \mathbb{T}_M} |\pi - I^{m-1}_T \pi|_{W^{1,p}(T)}. \tag{3.54}$$
Since $\mathbb{T}_M$ is compact for the Hausdorff distance between sets and since $T \mapsto |\pi - I^{m-1}_T \pi|_{W^{1,p}(T)}$ is continuous with respect to this distance on the set of all triangles, we find that this minimum is indeed attained and that $L_M$ is continuous. We also observe that $M \mapsto L_M(\pi)$ is a decreasing function of $M$ and that
$$\lim_{M \to \infty} L_M(\pi) = \inf_{|T| = 1,\ \operatorname{bary}(T) = 0} |\pi - I^{m-1}_T \pi|_{W^{1,p}(T)} = \inf_{|T| = 1} |\pi - I^{m-1}_T \pi|_{W^{1,p}(T)} = L_{m,p}(\pi),$$
where we have used the invariance under translation of the interpolation error (3.2) for the second equality.

We fix the constant $M > 0$. Let $\pi \in \mathbb{H}_m$ and let $T \subset \Omega$ be homothetic to a triangle achieving the minimum in the definition of $L_M(\pi)$. Then,
$$\begin{aligned}
|f - I^{m-1}_T f|_{W^{1,p}(T)} &\le |\pi - I^{m-1}_T \pi|_{W^{1,p}(T)} + |(f - \pi) - I^{m-1}_T (f - \pi)|_{W^{1,p}(T)} \\
&\le |T|^{\frac1\tau} L_M(\pi) + C |T|^{\frac1\tau} \rho(T)^{\frac{m-1}{2}} S(T)\, \|d^m f - d^m \pi\|_{L^\infty(T)} \\
&\le |T|^{\frac1\tau} \left(L_M(\pi) + C M^{m+1} \|d^m f - d^m \pi\|_{L^\infty(T)}\right),
\end{aligned} \tag{3.55}$$
where we have used inequality (3.29) in the second line, and in the third line the fact that $\rho(T)^{\frac{m-1}{2}} S(T) \le \rho(T)^{\frac{m+1}{2}} \le M^{m+1}$, since $S \le \rho$ and $T$ is homothetic to an element of $\mathbb{T}_M$.

Let $\delta > 0$. Since $d^m f$ is continuous, we can choose a sufficiently fine mesh $\mathcal{R} = \mathcal{R}(M, \delta)$ of $\Omega$ in such a way that
$$C M^{m+1} \|d^m f(x) - d^m f(y)\| \le \delta \quad\text{for all } R \in \mathcal{R} \text{ and } x, y \in R. \tag{3.56}$$
For any triangle $R \in \mathcal{R}$ we define
$$z_R := \operatorname*{argmin}_{z \in R} L_M(\pi_z) \quad\text{and}\quad \pi_R := \pi_{z_R}. \tag{3.57}$$
We also define
$$T_R := (L_M(\pi_R) + \delta)^{-\frac\tau2}\, T_*, \tag{3.58}$$
where $T_* \in \mathbb{T}_M$ achieves the minimum in the definition of $L_M(\pi_R)$.
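The role of the scaling in (3.58) can be checked on a small numerical example. The Python sketch below is an illustration with hypothetical function names; it assumes the convention $\frac1\tau = \frac{m-1}{2} + \frac1p$ and that $T_*$ has area 1, so that $T_R$ has area $(L_M(\pi_R) + \delta)^{-\tau}$. A tile obtained by shrinking $T_R$ by a factor $1/n$ then carries the local bound $|T|^{1/\tau}(L_M(\pi_R) + \delta) = n^{-2/\tau}$, independently of the value of $L_M(\pi_R) + \delta$.

```python
def tau(m, p):
    """Exponent tau defined by 1/tau = (m - 1)/2 + 1/p, for finite p >= 1."""
    return 1.0 / ((m - 1) / 2.0 + 1.0 / p)

def reference_area(l_value, delta, m, p):
    """Area of T_R = (L + delta)^(-tau/2) * T_*, assuming |T_*| = 1."""
    scale = (l_value + delta) ** (-tau(m, p) / 2.0)  # linear scaling factor
    return scale * scale                             # area scales quadratically

def tile_error_bound(l_value, delta, m, p, n):
    """|T|^(1/tau) * (L + delta) for a tile T of area |T_R| / n^2."""
    tile_area = reference_area(l_value, delta, m, p) / (n * n)
    return tile_area ** (1.0 / tau(m, p)) * (l_value + delta)
```

For example, with quadratic elements ($m = 3$) and $p = 2$ one has $\tau = 2/3$, and `tile_error_bound(l_value, delta, 3, 2, 2)` is close to $2^{-3}$ for any positive `l_value + delta`.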
We denote by P n ( R ) = P T R ,n ( R ) the triangulation of Lemma 3.3.2 built from the two triangles R and T R , andsimilarly P n ( R ) = P T R ,n ( R ) and P n ( R ) = P T R ,n ( R ). We define for all n the global meshof Ω T M,δn = (cid:91) R ∈R P n ( R ) , which coincides with P n ( R ) on each R ∈ R . Since all the meshes P n ( R ) are conforming,and since P n ( R ) has by construction n + 1 equispaced vertices on each edge of R , themesh T M,δn is also conforming. According to Equations (3.51) and (3.57), we havelim n →∞ (cid:0) T M,δn (cid:1) n = (cid:88) R ∈R (cid:18) lim n →∞ P n ( R )) n (cid:19) = (cid:88) R ∈R | R | ( L M ( π R ) + δ ) τ ≤ (cid:90) Ω ( L M ( π z ) + δ ) τ dz. (3.59)For T ∈ P n ( R ), we combine (3.55), (3.56) and (3.58) to obtain | f − I m − T f | W ,p ( T ) ≤ n − τ for all T ∈ P n ( R ) . (3.60)For T ∈ P n ( R ), we invoke the isotropic estimate (3.29) to obtain | f − I m − T f | W ,p ( T ) ≤ C | T | p S ( T ) diam( T ) m − (cid:107) d m f (cid:107) L ∞ (Ω) ≤ CS ( T ) diam( T ) τ (cid:107) d m f (cid:107) L ∞ (Ω) (3.61)where C is the constant from (3.29). Using the third item in Lemma 3.3.2, we find thatthat there exists constants C = C ( M, δ ) and C = C ( M, δ ) such thatsup n ≥ (cid:18) n max T ∈T M,δn diam( T ) (cid:19) ≤ C and sup n ≥ (cid:18) max T ∈T M,δn S ( T ) (cid:19) ≤ C , (3.62)so that, combining with (3.61), we have for all T ∈ P n ( R ) | f − I m − T f | pW ,p ( T ) ≤ C n − τ , (3.63)with C = C ( M, δ ). Combining (3.60) and (3.63), and using the first item in Lemma3.3.2, we obtain | f − I m − T M,δn f | pW ,p (Ω) = (cid:88) T ∈T M,δn | f − I m − T f | pW ,p ( T ) ≤ (cid:88) R ∈R (cid:16) P n ( R )) n − pτ + P n ( R )) C p n − pτ (cid:17) .4. Optimal metrics for linear and quadratic elements n →∞ n pτ − | f − I m − T M,δn f | pW ,p (Ω) ≤ (cid:88) R ∈R lim n →∞ P n ( R )) + P n ( R )) C p n = (cid:88) R ∈R | R | ( L M ( π R ) + δ ) τ ≤ (cid:90) Ω ( L M ( π z ) + δ ) τ dz. 
Combining this with (3.59) we obtainlim sup n →∞ T M,δn ) m − (cid:12)(cid:12)(cid:12) f − I m − T M,δn f (cid:12)(cid:12)(cid:12) W ,p (Ω) ≤ (cid:107) L M ( π z ) + δ (cid:107) L τ (Ω) . (3.64)Let ε >
0. Sincelim M →∞ lim δ → (cid:107) L M ( π z ) + δ (cid:107) L τ (Ω) = lim M →∞ (cid:107) L M ( π z ) (cid:107) L τ (Ω) = (cid:107) L m,p ( π z ) (cid:107) L τ (Ω) we can choose adequately M and δ in such way that (cid:107) L M ( π z )+ δ (cid:107) L τ (Ω) ≤ (cid:107) L m,p ( π z ) (cid:107) L τ (Ω) + ε . Let n = n ( N, M, δ ) be the largest integer such that T M,δn ) ≤ N , we define T εN := T M,δn . so that N − T εN ) → N → ∞ . If follows from (3.59) and (3.62) that the sequence oftriangulations ( T εN ) is admissible, and inequality (3.64) giveslim sup N →∞ N m − | f − I m − T εN f | W ,p (Ω) ≤ (cid:107) L m,p ( π z ) (cid:107) L τ (Ω) + ε. which is the upper estimate (3.8) announced. Last we choose for all N large enough ε ( N ) > N m − | f − I m − T ε ( N ) N f | W ,p (Ω) ≤ (cid:107) L m,p ( π z ) (cid:107) L τ (Ω) + 2 ε ( N ) . and such that ε ( N ) → N → ∞ . The sequence of triangulations T N := T ε ( N ) N fulfillsthe estimate (3.4) which concludes the proof. The proof of the upper estimate (3.8) exposed in the previous section involves theconstruction of meshes T εN by tiling each element R of the “coarse” triangulation R usingthe finer mesh P n ( R ). In practice, such a construction may require a very large number oftriangles in order to match the optimal error estimate. More commonly used strategies formesh generation are based on the prescription of a non-euclidean metric depending on f for which each triangle should be isotropic. In this section, we explain how to design suchmetric in order to derive near-optimal error estimates and we give analytic expressions inthe particular case of IP and IP finite elements.34 Chapter 3. Sharp asymptotics of the W ,p interpolation error As a first step, we express the requirements (i), (ii) and (iv) of mesh adaptation in termsof metrics. We therefore use the following notations : we consider a polygonal domain Ω,an integer m ≥
2, an exponent 1 ≤ p < ∞ , and a function f ∈ C m (Ω) to be approximatedin the W ,p semi-norm by IP m − finite element interpolation on a triangulation of Ω. Wealso consider two real numbers ε > δ > π ∈ IH m A (cid:48) π := { M ∈ S +2 ; |∇ π ( z ) | ≤ ( z T M z ) m − for all z ∈ R } = { A T A ; A ∈ A π } , (3.65)and we consider a continuous field M : Ω → S +2 of symmetric positive definite matricessatisfying M ( z ) ∈ A (cid:48) π z for all z ∈ Ω and C − ( L m ( π z ) + ε ) ≤ (det M ( z )) m − ≤ C ( L m ( π z ) + ε ) , (3.66)where C is an absolute constant. The existence of such a field is established in fullgenerality in Chapter 6. We explain in the sequel of this section a practical constructionin the case of piecewise linear and bilinear elements. We then define a field of symmetricpositive definite matrices h on Ω by h ( z ) := δ − τ (det M ( z )) − τ p M ( z ) (3.67)where τ := m − + p . Such a field h is called a Riemannian metric. Under some assumptionson the metric h and on the domain Ω, which are discussed in [15, 66] and Chapter 5 forthe infinite domain R d , it is possible to produce a triangulation T of Ω satisfying for all T ∈ T and z ∈ T C − ≤ | T | (cid:112) det h ( z ) ≤ C and ρ (cid:16)(cid:112) h ( z )( T ) (cid:17) ≤ C (3.68)where the constant C ≥ h . (In the second inequality the square root is meant in the sense of symmetric positivematrices). Examples of such mesh generators are [19, 93, 94]. We shall not discuss in thischapter the conditions under which such a mesh can be generated. 
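The scaling in (3.67) can be checked numerically. The sketch below assumes the reading $1/\tau = (m-1)/2 + 1/p$ and $h = \delta^{-\tau} (\det \mathcal{M})^{-\tau/(2p)} \mathcal{M}$, and verifies the identity $(\det h)^{1/(2\tau)} = \delta^{-1} (\det \mathcal{M})^{(m-1)/4}$ used right after (3.68). The function names are illustrative and do not belong to any mesh generation library.

```python
import math

def tau(m, p):
    """Exponent tau, assumed to be defined by 1/tau = (m - 1)/2 + 1/p."""
    return 1.0 / ((m - 1) / 2.0 + 1.0 / p)

def det2(M):
    """Determinant of a 2x2 matrix given as a tuple of rows."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def metric_from_M(M, m, p, delta):
    """Metric h = delta^(-tau) * det(M)^(-tau/(2p)) * M, our reading of (3.67).

    M must be symmetric positive definite, so that det(M) > 0.
    """
    t = tau(m, p)
    scale = delta ** (-t) * det2(M) ** (-t / (2.0 * p))
    return tuple(tuple(scale * x for x in row) for row in M)

def target_area(h):
    """Area suggested by |T| * sqrt(det h) ~ 1, the first condition in (3.68)."""
    return 1.0 / math.sqrt(det2(h))
```

With this scaling, dividing $\delta$ by two multiplies the target area of every triangle by $2^{-\tau}$, which is how $\delta$ controls the global refinement level.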
Let us only mentionthat, if one ignores a few outliers at the corners of Ω, these conditions hold if δ is smallenough.Note that (det h ( z )) τ = δ − (det M ( z )) m − , therefore if (3.68) holds we find that forall T ∈ T and z ∈ T , ( C C ) − δ ≤ | T | τ ( L m ( π z ) + ε ) ≤ C C δ, hence condition (i) of equilibrated errors, as stated in (3.31), holds provided C T ≥ C C .Furthermore for all z ∈ Ω let us define A z := (cid:112) M ( z ) and note that A z ∈ A π z anddet A z = (cid:112) det M ( z ). Using (3.66) and (3.68) we find that condition (ii) of optimalshapes, as stated in (3.32), holds provided C T ≥ C . Condition (iv) holds when the mesh T is sufficiently refined, which is the case if δ is small enough. .4. Optimal metrics for linear and quadratic elements M : Ω → S +2 satisfying M ( z ) ∈ A (cid:48) π z , (3.66) and such that M ( z ) is positive definite, state of the art mesh generators allow us to build triangulations T that match the conditions (i), (ii) and (iv). In order to prove the near-optimal estimate T ) m − | f − I m − T | W ,p (Ω) ≤ C (cid:107) L m ( π z ) + ε (cid:107) L τ (Ω) , it is also necessary that the generated meshes satisfy condition (iii) of bounded measureof sliverness, as stated in (3.33). Unfortunately, the author has not heard of theoreticalresults that would guarantee this condition when a mesh is built by such algorithms, apartfrom Theorems 5.1.14 and 6.1.2 which only apply to the infinite domain R or the periodicdomain ( R / ZZ) respectively. We discuss in § S ( T ) whenusing the mesh generation software [93].For m ∈ { , } , which correspond to IP and IP elements, we give in the sequel asimple expression of a continuous map M m : IH m → S ⊕ satisfying M m ( π ) ∈ A (cid:48) π for all π ∈ IH m and K − L m ( π ) ≤ (det M m ( π )) m − ≤ KL m ( π ) (3.69)for some absolute constant K ≥
1. It is not hard to build from $\mathcal{M}_m(\pi_z)$ a matrix $\mathcal{M}(z)$ satisfying (3.66). For practical uses one usually takes
$$\mathcal{M}(z) := \mathcal{M}_m(\pi_z) + \mu \operatorname{Id},$$
where the constant $\mu \geq$ [...] $L^p$ norm. This approach does not seem to adapt well to the $W^{1,p}$ norm: the main problem arises again from condition (iii) of bounded measure of sliverness, and from the lack of conformity of the triangulations generated by this procedure.

We now give analytic expressions of matrix fields $\mathcal{M}_2$ and $\mathcal{M}_3$ satisfying (3.69), which correspond to linear and quadratic elements. In the simplest and already well established case of $\mathrm{IP}_1$ elements, a more detailed analysis can be found in [85]. In contrast, the results for quadratic elements are new.

For any homogeneous quadratic polynomial $\pi \in \mathrm{IH}_2$, $\pi = ax^2 + 2bxy + cy^2$, we define the symmetric matrix
$$[\pi] = \begin{pmatrix} a & b \\ b & c \end{pmatrix}.$$
We define
$$\mathcal{M}_2(\pi) := 4[\pi]^2 = 4 \begin{pmatrix} a & b \\ b & c \end{pmatrix}^2 = 4 \begin{pmatrix} a^2 + b^2 & ab + bc \\ ab + bc & b^2 + c^2 \end{pmatrix}. \tag{3.70}$$
For all $z \in \mathbb{R}^2$ one has $\nabla \pi(z) = 2[\pi] z$, and therefore $|\nabla \pi(z)|^2 = z^{\mathrm T} \mathcal{M}_2(\pi) z$. It follows that $\mathcal{M}_2(\pi) \in \mathcal{A}'_\pi$ and
$$\det \mathcal{M}_2(\pi) = \inf \{ \det M \,;\, M \in \mathcal{A}'_\pi \},$$
which implies (3.69) according to Lemma 3.2.5.

For any $\pi \in \mathrm{IH}_3$ we define
$$\mathcal{M}_3(\pi) := \sqrt{[\partial_x \pi]^2 + [\partial_y \pi]^2}. \tag{3.71}$$
If $\pi = ax^3 + 3bx^2y + 3cxy^2 + dy^3$ then, in terms of the coefficients of $\pi$,
$$\mathcal{M}_3(\pi) = 3 \sqrt{ \begin{pmatrix} a & b \\ b & c \end{pmatrix}^2 + \begin{pmatrix} b & c \\ c & d \end{pmatrix}^2 } = 3 \sqrt{ \begin{pmatrix} a^2 + 2b^2 + c^2 & ab + 2bc + cd \\ ab + 2bc + cd & b^2 + 2c^2 + d^2 \end{pmatrix} }.$$
In the sense of symmetric matrices, we have
$$\mathcal{M}_3(\pi) = \sqrt{[\partial_x \pi]^2 + [\partial_y \pi]^2} \geq \sqrt{[\partial_x \pi]^2} = \big| [\partial_x \pi] \big|,$$
where we used the fact that the square root $\sqrt{\cdot} : S_2^\oplus \to S_2^\oplus$ is increasing; the same bound holds with $\partial_y \pi$ in place of $\partial_x \pi$. It follows that
$$|\nabla \pi(z)|^2 = |\partial_x \pi(z)|^2 + |\partial_y \pi(z)|^2 \leq 2 \big( z^{\mathrm T} \mathcal{M}_3(\pi) z \big)^2,$$
hence $\sqrt{2}\, \mathcal{M}_3(\pi) \in \mathcal{A}'_\pi$. Note that
$$\det \mathcal{M}_3(\pi) = 9 \sqrt{ (a^2 + 2b^2 + c^2)(b^2 + 2c^2 + d^2) - (ab + 2bc + cd)^2 }. \tag{3.72}$$
It remains to establish (3.69).
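The closed forms (3.70) to (3.72) are easy to evaluate. The following sketch computes $\mathcal{M}_2(\pi)$ and $\mathcal{M}_3(\pi)$ from the coefficients of $\pi$, using the standard closed form for the square root of a $2 \times 2$ symmetric positive semi-definite matrix. The helper names are illustrative assumptions, not part of FreeFem++ or of any other software cited here.

```python
import math

def sqrt_spd_2x2(S):
    """Square root of a non-zero symmetric positive semi-definite 2x2 matrix.

    Uses sqrt(S) = (S + s*Id) / t with s = sqrt(det S), t = sqrt(tr S + 2 s).
    """
    (p, q), (_, r) = S
    s = math.sqrt(max(p * r - q * q, 0.0))
    t = math.sqrt(p + r + 2.0 * s)
    return ((p + s) / t, q / t), (q / t, (r + s) / t)

def M2(a, b, c):
    """M_2(pi) = 4 [pi]^2 for pi = a x^2 + 2 b x y + c y^2, cf. (3.70)."""
    off = 4.0 * (a * b + b * c)
    return (4.0 * (a * a + b * b), off), (off, 4.0 * (b * b + c * c))

def M3(a, b, c, d):
    """M_3(pi) for pi = a x^3 + 3 b x^2 y + 3 c x y^2 + d y^3, cf. (3.71)."""
    off = a * b + 2 * b * c + c * d
    S = ((a * a + 2 * b * b + c * c, off), (off, b * b + 2 * c * c + d * d))
    R = sqrt_spd_2x2(S)
    return tuple(tuple(3.0 * x for x in row) for row in R)

def grad_cubic(a, b, c, d, x, y):
    """Gradient of pi = a x^3 + 3 b x^2 y + 3 c x y^2 + d y^3 at (x, y)."""
    return (3.0 * (a * x * x + 2 * b * x * y + c * y * y),
            3.0 * (b * x * x + 2 * c * x * y + d * y * y))

def quad_form(M, x, y):
    """Value of z^T M z for a symmetric 2x2 matrix M and z = (x, y)."""
    return M[0][0] * x * x + 2.0 * M[0][1] * x * y + M[1][1] * y * y
```

Sampling $z$ on the unit circle then gives a quick sanity check of the admissibility condition $|\nabla \pi(z)|^2 \leq 2 (z^{\mathrm T} \mathcal{M}_3(\pi) z)^2$ for a given cubic.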
This point is postponed to Section 3.5, which is devoted to polynomial equivalents of $L_m$. Let us finally mention the work [65], in which approximate solutions to the optimization problem $\inf \{ \det M \,;\, M \in \mathcal{A}'_\pi \}$ are obtained through numerical optimization. This approach works for general $m$ but is harder to use than the algebraic expressions of $\mathcal{M}_2(\pi)$ and $\mathcal{M}_3(\pi)$ given here.

The measure of non degeneracy of a triangle and of its image by a linear transform can be linked by the following result.
Proposition 3.4.1. There exists an absolute constant $c > 0$ such that for any triangle $T$ and any $A \in GL_2$,
$$\frac{c}{\rho(A(T))} \leq \frac{\rho(T)}{\|A\| \|A^{-1}\|} \leq \rho(A(T)). \tag{3.73}$$

Proof: We use in this proof the identity $|\det B| = \|B\| \|B^{-1}\|^{-1}$, which holds for all $B \in GL_2$. Let $T'$ be a triangle and let $A' \in GL_2$; then
$$\rho(A'(T')) = \frac{\operatorname{diam}(A'(T'))^2}{|A'(T')|} \leq \frac{\|A'\|^2}{|\det A'|} \, \frac{\operatorname{diam}(T')^2}{|T'|} = \|A'\| \|A'^{-1}\| \rho(T').$$
With the particular choice $A' = A^{-1}$ and $T' = A(T)$ we obtain the right side of (3.73).

Let $T_{eq}$ be an equilateral triangle of area 1, and let $\mu$ be the diameter of the largest ball included in $T_{eq}$. Up to a translation we can assume that there exists $B \in GL_2$ such that $T = B(T_{eq})$. We then have
$$\operatorname{diam}(T) \operatorname{diam}(A(T)) = \operatorname{diam}(B(T_{eq})) \operatorname{diam}(AB(T_{eq})) \geq \mu^2 \|B\| \|AB\| \geq \mu^2 \|B\| \|A\| \|B^{-1}\|^{-1} = \mu^2 \|A\| |\det B|.$$
Hence, since $|T| = |B(T_{eq})| = |\det B|$,
$$\rho(T) \rho(A(T)) = \frac{\operatorname{diam}(T)^2 \operatorname{diam}(A(T))^2}{|T| \, |A(T)|} \geq \frac{(\mu^2 \|A\| |\det B|)^2}{|\det B|^2 \, |\det A|} = \mu^4 \|A\| \|A^{-1}\|,$$
which establishes the left part of (3.73) with $c = \mu^4 = \frac{16}{27}$. $\diamond$

A consequence of the above proposition is that if $\mathcal{T}$ is a mesh adapted to a metric $h$ in the sense of (3.68), then for all $T \in \mathcal{T}$ and $z \in T$ we have
$$c\, C^{-1} \sqrt{\|h(z)\| \|h(z)^{-1}\|} \leq \rho(T) \leq C \sqrt{\|h(z)\| \|h(z)^{-1}\|}.$$
The measure of non-degeneracy $\rho(T)$ is thus large when $h(z)$ is ill conditioned.
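Inequality (3.73) can be tested on random triangles and random invertible matrices. The sketch below assumes $\rho(T) = \operatorname{diam}(T)^2 / |T|$, as in the proof above, and the value $c = 16/27$, which is $\mu^4$ for the equilateral triangle of unit area; all function names are illustrative.

```python
import math

C_PROP_341 = 16.0 / 27.0  # mu^4 for the equilateral triangle of unit area

def area(T):
    """Area of the triangle T = (p0, p1, p2), each vertex a pair (x, y)."""
    (x0, y0), (x1, y1), (x2, y2) = T
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2.0

def rho(T):
    """Measure of non-degeneracy diam(T)^2 / |T|."""
    (x0, y0), (x1, y1), (x2, y2) = T
    diam2 = max((x1 - x0) ** 2 + (y1 - y0) ** 2,
                (x2 - x1) ** 2 + (y2 - y1) ** 2,
                (x0 - x2) ** 2 + (y0 - y2) ** 2)
    return diam2 / area(T)

def cond2(A):
    """||A|| ||A^{-1}|| = sigma_max / sigma_min for an invertible 2x2 matrix."""
    (a, b), (c, d) = A
    f = a * a + b * b + c * c + d * d          # sigma_1^2 + sigma_2^2
    det = abs(a * d - b * c)                   # sigma_1 * sigma_2
    disc = math.sqrt(max(f * f - 4.0 * det * det, 0.0))
    return (f + disc) / (2.0 * det)

def apply(A, T):
    """Image of the triangle T under the linear map A."""
    (a, b), (c, d) = A
    return [(a * x + b * y, c * x + d * y) for (x, y) in T]

def check_prop_341(T, A, rel=1e-9):
    """True if c / rho(A(T)) <= rho(T) / cond2(A) <= rho(A(T)), up to rounding."""
    r_img = rho(apply(A, T))
    middle = rho(T) / cond2(A)
    return C_PROP_341 / r_img <= middle * (1.0 + rel) and middle <= r_img * (1.0 + rel)
```

Both bounds are attained for suitable pairs $(T, A)$; for instance the right inequality is an equality when $A$ is the identity.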
Althoughthis property is desirable in order to adapt to highly anisotropic features of the function f to be approximated, excessive degeneracy can cause mesh generation problems, which arediscussed in § M and M in order to control the value of ρ ( T ).According to (3.67) we have (cid:107) h ( z ) (cid:107)(cid:107) h − ( z ) (cid:107) = (cid:107)M ( z ) (cid:107)(cid:107)M ( z ) − (cid:107) , and thus ρ ( T ) ≤ C (cid:112) (cid:107)M ( z ) (cid:107)(cid:107)M ( z ) − (cid:107) . This leads us to define for all α ≥ A (cid:48) π,α := { M ∈ A (cid:48) π ; (cid:107) M (cid:107)(cid:107) M − (cid:107) ≤ α } . Let M ∈ S +2 , let R be a rotation and let λ ≥ µ ≥ M in suchway that M = R T diag( λ, µ ) R. We define for any α ≥ M ( α ) := R T (cid:18) λ
00 max( λα − , µ ) (cid:19) R. (3.74)38 Chapter 3. Sharp asymptotics of the W ,p interpolation error Clearly M ( α ) ≥ M , and if M (cid:54) = 0 then (cid:107) M ( α ) (cid:107)(cid:107) (cid:0) M ( α ) (cid:1) − (cid:107) ≤ α . Hence for all M ∈ A (cid:48) π we have M ( α ) ∈ A (cid:48) π,α .In the case of piecewise linear elements we therefore have M ( α )2 ( π ) ∈ A (cid:48) π,α for all π ∈ IH , and one easily shows that det M ( α )2 ( π ) = inf { det M ; M ∈ A (cid:48) π,α } . This suggeststhat constructing M ( z ) from M ( α )2 ( π z ) instead of M ( π z ) leads to a near-optimal meshadaptation to the function f , under the constraint ρ ( T ) ≤ C α for all triangles T inthe triangulation. The following proposition implies the same in the case of piecewisequadratic finite elements. Proposition 3.4.2.
Let π ∈ IH and α ≥ . Then √ M ( α )3 ( π ) ∈ A (cid:48) π,α and det M ( α )3 ( π ) ≤ K inf { det M ; M ∈ A (cid:48) π,α } (3.75) where the constant K is independent of π and α . Proof:
We already know that √ M ( α )3 ( π ) ∈ A (cid:48) π,α . If M ( α )3 ( π ) = M ( π ) then (3.75) holdsas a consequence of (3.69) and Lemma 3.2.5. We therefore assume in the following that M ( α )3 ( π ) (cid:54) = M ( π ). Let λ ∗ ( π ) := (cid:107)∇ π (cid:107) L ∞ ( D ) = sup | z |≤ |∇ π ( z ) | , where D = { z ∈ R ; | z | ≤ } is the unit disc of R . The largest ball inscribed in { z ∈ R ; |∇ π ( z ) | ≤ } is λ ∗ ( π ) − D . Let M ∈ A (cid:48) π,α and let λ ≥ λ > { z ∈ R ; z T M z ≤ } contains the ball λ − D , hence λ ≥ λ ∗ ( π ).Furthermore λ ≥ α − λ , hencedet M = λ λ ≥ α − λ ≥ α − λ ∗ ( π ) . (3.76)Let λ ( π ) be the largest eigenvalue of M ( π ), and assume that π = ax +3 bx y +3 cxy + dy .We obtain from (3.71) that λ ( π ) ≤ (cid:112) Tr M ( π ) = √ a + 3 b + 3 c + d . Since the norms (cid:107)∇ π (cid:107) L ∞ ( D ) and √ a + 3 b + 3 c + d are equivalent on the vector spaceIH , there exists a constant C > π ∈ IH such that λ ( π ) ≤ C λ ∗ ( π ).Since M ( α )3 ( π ) (cid:54) = M ( π ), the eigeinvalues of M ( α )3 ( π ) are λ ( π ) and α − λ ( π ). Hencedet M ( α )3 ( π ) = α − λ ( π ) ≤ C α − λ ∗ ( π ) . Combining this with (3.76) we conclude the proof, with K = C . (cid:5) Let us finally mention that, although they are derived from the coefficients of π , themaps π (cid:55)→ M m ( π ) and π (cid:55)→ M ( α ) m ( π ) for m ∈ { , } are invariant under rotation, andtherefore not tied to the chosen system of coordinate ( x, y ), as expressed by the followingresult. .4. Optimal metrics for linear and quadratic elements Proposition 3.4.3.
For any $m \in \{2, 3\}$, any $\pi \in \mathrm{IH}_m$ and any unitary matrix $U \in O_2$, one has
$$\mathcal{M}_m(\pi \circ U) = U^{\mathrm T} \mathcal{M}_m(\pi) U.$$
Furthermore, for any $\alpha \geq 1$ one has $\mathcal{M}^{(\alpha)}_m(\pi \circ U) = U^{\mathrm T} \mathcal{M}^{(\alpha)}_m(\pi) U$.

Proof: We only prove the invariance under unitary transformation of $\mathcal{M}_3$, since the proof for $\mathcal{M}_2$ is elementary, as well as the result for $\mathcal{M}^{(\alpha)}_m$. Let $\pi \in \mathrm{IH}_3$, let $D_x = [\partial_x \pi]$ and $D_y = [\partial_y \pi]$. Let
$$U = \begin{pmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{pmatrix}$$
be unitary; then
$$[\partial_x (\pi \circ U)] = u_{11} U^{\mathrm T} D_x U + u_{21} U^{\mathrm T} D_y U \quad \text{and} \quad [\partial_y (\pi \circ U)] = u_{12} U^{\mathrm T} D_x U + u_{22} U^{\mathrm T} D_y U.$$
Hence
$$[\partial_x (\pi \circ U)]^2 + [\partial_y (\pi \circ U)]^2 = (u_{11}^2 + u_{12}^2) U^{\mathrm T} D_x^2 U + (u_{11} u_{21} + u_{12} u_{22}) U^{\mathrm T} (D_x D_y + D_y D_x) U + (u_{21}^2 + u_{22}^2) U^{\mathrm T} D_y^2 U,$$
which equals $U^{\mathrm T} D_x^2 U + U^{\mathrm T} D_y^2 U$ since $U$ is unitary. Eventually
$$\mathcal{M}_3(\pi \circ U) = \sqrt{U^{\mathrm T} D_x^2 U + U^{\mathrm T} D_y^2 U} = U^{\mathrm T} \sqrt{D_x^2 + D_y^2}\, U = U^{\mathrm T} \mathcal{M}_3(\pi) U,$$
which concludes the proof. $\diamond$

The envisioned applications for the theory developed in this chapter are mainly in the field of partial differential equations that exhibit "shocks" and strongly anisotropic features, in particular conservation laws and fluid dynamics. We therefore test the quality of our meshes on a synthetic function that mimics the typical behavior of functions encountered in these contexts. For all δ >
0, our test function f δ : [ − , → R is definedas follows f δ ( x, y ) := tanh (cid:18) x − sin(5 y ) δ (cid:19) + x + xy . In all numerical results, we choose δ := 0 .
1. This function f δ , although smooth, exhibitsa “smoothed jump” of height 2 along to the curve defined by the equation 2 x = sin(5 y ),on a layer of width δ . On the rest of the domain, f δ is dominated by the polynomial part x + xy . The level lines and a 3D plot of f δ are presented on the two rightmost picturesof Figure 3.2.Our purpose is to produce four triangulations T H , IP , T H , IP , T L , IP and T L , IP contai-ning 2000 triangles each and which, for this cardinality, produce respectilvely the smallestpossible interpolation errors (cid:107)∇ f − ∇ I T f (cid:107) , (cid:107)∇ f − ∇ I T f (cid:107) , (cid:107) f − I T f (cid:107) and (cid:107) f − I T f (cid:107) .It is clearly out of reach to find the triangulations leading exactly to the smallest error.40 Chapter 3. Sharp asymptotics of the W ,p interpolation error ! ! ! ! sin ! y " ! x !" Figure f δ , δ = 0 . T H , IP and T H , IP based on the metrics h H , IP ( z ) = λ (det M (100)2 ( π z )) − M (100)2 ( π z ) where π z := d f δ ( z )2 ,h H , IP ( z ) = λ (det M ( π z )) − M ( π z ) where π z := d f δ ( z )6 , (3.77)where the positive constants λ , λ are adjusted in such way that the meshes generatedhave 2000 elements. Mesh generation was performed by the open source program Free-FEM++ [93] and results are illustrated on Figure 3.3. Note that we have used M (100)2 (defined as in (3.74)) instead of M which would lead to a different triangulation T ∗ H , IP ,also displayed on Figure 3.3, and associated to the metric h ∗ H , IP ( z ) := λ ∗ (det M ( π z )) − M ( π z ) where π z := d f δ ( z )2 , with again λ ∗ adjusted to obtain 2000 elements. 
The use of M (100)2 in place of M isjustified by mesh generation issues which are discussed in the next subsection.Similarly, and following the study developed in Chapter 2, we have generated T L , IP and T L , IP from the metrics h L , IP ( z ) = µ (det N ( π z )) − N ( π z ) where π z := d f δ ( z )2 ,h L , IP ( z ) = µ (det N ( π z )) − N ( π z ) where π z := d f δ ( z )6 , where again µ , µ are positive constants adjusted in order to generate a mesh with 2000elements. Here N ( π ) := (cid:112) M ( π ) and N ( π ) := argmin { det M ; M ∈ S +2 and | π ( z ) | ≤ ( z T M z ) for all z ∈ R } . We have obtained the following results, which confirm that the use of the metric adapted toa given norm and interpolation degree produces the triangulation that yields the smallestinterpolation error in this case (at least among these four triangulations). T = 2000 T H , IP T H , IP T L , IP T L , IP | f δ − I T f δ | H .
35 1 .
47 1 .
43 1 . | f δ − I T f δ | H .
66 1 .
17 1 .
89 1 . (cid:107) f δ − I T f δ (cid:107) L .
54 2 .
73 0 .
759 1 . (cid:107) f δ − I T f δ (cid:107) L .
64 6 .
61 4 .
73 3 .
17 (3.78) .4. Optimal metrics for linear and quadratic elements
Figure T ∗ H , IP , T H , IP and T H , IP adapted to f δ (500 triangles only). Given a metric h : Ω → S +2 , there does not always exists a triangulation T adaptedto h , i.e. satisfying (3.68) for some constant C ≥ h satisfies some constraints which are analyzed in [66], see also Chapter 5.Instead of analysing the metric h prior to the process of mesh generation, we choose herethe simpler option of evaluating a posteriori the quality of a triangulation T .Since we are interested in the H = W , semi norm we define following (3.33) S ( T ) := (cid:32) T (cid:88) T ∈T S ( T ) (cid:33) . For all T ∈ T we define h T := h (bary( T )) ∈ S +2 . We also define the sets E := (cid:110) ln (cid:16) | T | (cid:112) det h T (cid:17) ; T ∈ T (cid:111) and F := (cid:110) ρ (cid:16)(cid:112) h T ( T ) (cid:17) ; T ∈ T (cid:111) . According to (3.68), the quality of T is reflected by the quantitiesexp(max E − min E ) and max F. However these quantities give a rather pessimistic account of the adaptation of T to h ,and heuristically we find it more fruitful to consider averages. We therefore define ρ ( T , h ) := 1 T ) (cid:88) T ∈T ρ (cid:16)(cid:112) h T ( T ) (cid:17) . and σ ( T , h ) := exp (cid:32) E ) (cid:88) e ∈ E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) e − E ) (cid:88) e ∈ E e (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:33) . The following table shows that the quantities S ( T ), ρ ( T , h ) and σ ( T , h ) are abnormallylarge for the triangulation T ∗ H , IP generated from the metric h ∗ H , IP but reasonable for thetriangulations T H , IP and T H , IP generated from the metrics (3.77). T = 2000 T ∗ H , IP T H , IP T H , IP S ( T ) 14 . .
14 4 . ρ ( T , h ) 10 . .
02 4 . σ ( T , h ) 2 .
39 2 .
25 1 . Chapter 3. Sharp asymptotics of the W ,p interpolation error In practice T ∗ H , IP led to a poor interpolation error, contrary to T H , IP . We believethat the poor quality of T ∗ H , IP is due to the excessively wild behavior of the metric h ∗ H , IP and not to a deficiency of the excellent mesh generator BAMG [93]. The optimal error estimates established in Theorem 3.1.1 involve the quantity L m,p ( d m fm ! ).The function π (cid:55)→ L m,p ( π ) is obtained by solving an optimization problem, and it doesnot have an explicit analytic expression in terms of the coefficients of π ∈ IH m . In thissection, we introduce quantities which are equivalent to L m ( π ), and therefore to L m,p ( π )for all 1 ≤ p ≤ ∞ , and which can be written in analytic form in terms of the coefficientsof π ∈ IH m .Given a pair of non negative functions Q and R on IH m we write Q ∼ R if and onlyif there exists a constant C > C − Q ≤ R ≤ CQ uniformly on IH m . Wesometimes slightly abuse notations and write Q ( π ) ∼ R ( π ). We say that a function Q isa polynomial on IH m if there exists a polynomial P of m + 1 real variables such that forall a , · · · , a m ∈ R , Q (cid:32) m (cid:88) i =0 a i x i y m − i (cid:33) = P ( a , · · · , a m ) . We define deg Q := deg P , and we say that Q is homogeneous if P is homogeneous. Forall m ≥
2, we shall build a homogeneous polynomial $Q$ on $\mathrm{IH}_m$ such that
$$L_m \sim \sqrt[r]{|Q|} \quad \text{with } r := \deg Q, \tag{3.79}$$
where the constants in the equivalence only depend on $m$.

We first introduce for all $\pi \in \mathrm{IH}_m$ the set
$$\mathcal{B}_\pi := \{ B \in M_2(\mathbb{R}) \,;\, |\pi(z)| \leq |Bz|^m \text{ for all } z \in \mathbb{R}^2 \},$$
and the function
$$K^E_m(\pi) := \inf \{ |\det B|^{\frac{m}{2}} \,;\, B \in \mathcal{B}_\pi \}.$$
According to Lemma 3.2.5 we have for any $m \geq 2$
$$L_m(\pi) \sim \sqrt{K^E_{2m-2}(|\nabla \pi|^2)}, \tag{3.80}$$
where $|\nabla \pi|^2 = (\partial_x \pi)^2 + (\partial_y \pi)^2 \in \mathrm{IH}_{2m-2}$. The function $K^E_m$ is extensively studied in Chapter 2. In particular we know that
$$K^E_2(\pi) \sim \sqrt{|\det [\pi]|}, \tag{3.81}$$
and
$$K^E_3(\pi) \sim \sqrt[4]{|\operatorname{disc}(\pi)|}, \tag{3.82}$$
where $\operatorname{disc}(\pi)$ denotes the discriminant of a polynomial $\pi \in \mathrm{IH}_3$, namely
$$\operatorname{disc}(ax^3 + bx^2y + cxy^2 + dy^3) = b^2c^2 - 4ac^3 - 4b^3d + 18abcd - 27a^2d^2.$$
More generally, for all $m \geq 2$, the function $K^E_m$ has an equivalent of the form $\sqrt[r]{|Q|}$, where $Q$ is a homogeneous polynomial of degree $r$ on $\mathrm{IH}_m$. Combining this result with (3.80) we obtain the main result of this section.

Proposition 3.5.1. Let $m \geq 2$ and let $Q$ be a homogeneous polynomial on $\mathrm{IH}_{2m-2}$ such that $K^E_{2m-2} \sim \sqrt[r]{|Q|}$, where $r = \deg Q$. Let $Q_*$ be the polynomial on $\mathrm{IH}_m$ defined by
$$Q_*(\pi) := Q(|\nabla \pi|^2).$$
Then $L_m \sim \sqrt[2r]{Q_*}$ on $\mathrm{IH}_m$.

Let $\pi \in \mathrm{IH}_2$ and let us observe that $|\nabla \pi(z)|^2 = |2[\pi] z|^2 = 4 z^{\mathrm T} [\pi]^2 z$. Using (3.81) we therefore obtain
$$L_2(\pi) \sim \sqrt{K^E_2(|\nabla \pi|^2)} \sim \sqrt{\sqrt{\det(4[\pi]^2)}} = 2 \sqrt{|\det [\pi]|}. \tag{3.83}$$
The construction suggested by Proposition 3.5.1 uses an equivalent of $K^E_{2m-2}$ to produce an equivalent to $L_m$. Unfortunately, as $m$ increases, the practical construction of $Q$ such that $\sqrt[r]{|Q|}$ is equivalent to $K^E_m$ becomes more involved and the degree $r$ quickly rises. In the following theorem, we build an equivalent to $L_m$ from an equivalent of $K^E_{m-1}$ instead of $K^E_{2m-2}$, which is therefore simpler.

Theorem 3.5.2. Let $m \geq 3$ and let $Q$ be a homogeneous polynomial on $\mathrm{IH}_{m-1}$ such that $K^E_{m-1} \sim \sqrt[r]{|Q|}$, where $r = \deg Q$. Let $(Q_k)_{0 \leq k \leq r}$ be the homogeneous polynomials of degree $r$ on $\mathrm{IH}_{m-1} \times \mathrm{IH}_{m-1}$ such that for all $u, v \in \mathbb{R}$ and all $\pi_1, \pi_2 \in \mathrm{IH}_{m-1}$ we have
$$Q(u \pi_1 + v \pi_2) = \sum_{0 \leq k \leq r} \binom{r}{k} u^k v^{r-k} Q_k(\pi_1, \pi_2), \tag{3.84}$$
where $\binom{r}{k} := \frac{r!}{k!(r-k)!}$. Let $Q_*$ be the polynomial defined for all $\pi \in \mathrm{IH}_m$ by
$$Q_*(\pi) := \sum_{0 \leq k \leq r} \binom{r}{k} Q_k(\partial_x \pi, \partial_y \pi)^2;$$
then $L_m \sim \sqrt[2r]{Q_*}$ on $\mathrm{IH}_m$.

Proof: See Appendix. $\diamond$
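The construction of Theorem 3.5.2 can be made concrete in the simplest case $r = 2$, where $Q$ is the determinant of the matrix of a quadratic form, as in (3.81). The sketch below computes the polarized forms $Q_0, Q_1, Q_2$ of (3.84) and the resulting $Q_*$ for a cubic $\pi = ax^3 + 3bx^2y + 3cxy^2 + dy^3$. As a simplifying assumption it works with $\partial_x \pi / 3$ and $\partial_y \pi / 3$, which only changes $Q_*$ by the constant factor 81; the function names are illustrative.

```python
def det_sym(q):
    """Q = det[pi] for pi = a x^2 + 2 b x y + c y^2, given as q = (a, b, c)."""
    a, b, c = q
    return a * c - b * b

def polar_det(q1, q2):
    """Forms (Q_0, Q_1, Q_2) such that, as in (3.84) with r = 2,
    det[u q1 + v q2] = v^2 Q_0 + 2 u v Q_1 + u^2 Q_2."""
    a, b, c = q1
    a2, b2, c2 = q2
    q_1 = (a * c2 + a2 * c - 2.0 * b * b2) / 2.0
    return det_sym(q2), q_1, det_sym(q1)

def q_star_cubic(a, b, c, d):
    """Q_* = Q_0^2 + 2 Q_1^2 + Q_2^2 evaluated on (d_x pi / 3, d_y pi / 3)."""
    q0, q1, q2 = polar_det((a, b, c), (b, c, d))
    return q0 * q0 + 2.0 * q1 * q1 + q2 * q2

def det_m3_squared_over_81(a, b, c, d):
    """(det M_3(pi))^2 / 81 according to (3.72)."""
    return ((a * a + 2 * b * b + c * c) * (b * b + 2 * c * c + d * d)
            - (a * b + 2 * b * c + c * d) ** 2)
```

With these definitions $2\,Q_*(\pi)$ coincides with the expression under the square root in (3.72), which is the link between $L_3$ and $\det \mathcal{M}_3$. The value $Q_*$ vanishes for the pure cube $\pi = x^3$ and is positive for $\pi = x^2 y$.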
Using this construction and (3.81) we obtain an equivalent of $L_3$ as follows. Let $\pi_1 = ax^2 + 2bxy + cy^2$ and $\pi_2 = a'x^2 + 2b'xy + c'y^2$ be two elements of $\mathrm{IH}_2$. We obtain
$$\det([u\pi_1 + v\pi_2]) = (ua + va')(uc + vc') - (ub + vb')^2 = u^2 (ac - b^2) + uv (ac' + a'c - 2bb') + v^2 (a'c' - b'^2).$$
Applying the construction of Theorem 3.5.2 to $\pi = ax^3 + 3bx^2y + 3cxy^2 + dy^3 \in \mathrm{IH}_3$ we obtain
$$L_3(\pi) \sim \sqrt[4]{(ac - b^2)^2 + (ad - bc)^2/2 + (bd - c^2)^2}. \tag{3.85}$$
Observing that
$$2 \left[ (ac - b^2)^2 + (ad - bc)^2/2 + (bd - c^2)^2 \right] = (a^2 + 2b^2 + c^2)(b^2 + 2c^2 + d^2) - (ab + 2bc + cd)^2,$$
and using equation (3.72), we obtain that $L_3(\pi) \sim \sqrt{\det \mathcal{M}_3(\pi)}$. This point was announced in Section 3.4; it shows that the map $\mathcal{M}_3$ defined in (3.71) can be used for optimal mesh adaptation for quadratic finite elements.

Using (3.82) and the construction of Theorem 3.5.2, we also obtain an equivalent of $L_4(\pi)$. For $\pi = ax^4 + 4bx^3y + 6cx^2y^2 + 4dxy^3 + ey^4$,
$$\begin{aligned} L_4(\pi)^8 \sim{} & (3b^2c^2 - 4ac^3 - 4b^3d + 6abcd - a^2d^2)^2 \\ & + (2bc^3 - 6ac^2d + 4abd^2 - 4b^3e + 6abce - 2a^2de)^2/4 \\ & + (3c^4 - 6bc^2d + 8b^2d^2 - 6acd^2 - 6b^2ce + 6ac^2e + 2abde - a^2e^2)^2/6 \\ & + (2c^3d - 4ad^3 - 6bc^2e + 4b^2de + 6acde - 2abe^2)^2/4 \\ & + (3c^2d^2 - 4bd^3 - 4c^3e + 6bcde - b^2e^2)^2. \end{aligned}$$
The following proposition identifies the polynomials $\pi \in \mathrm{IH}_m$ for which $L_m(\pi) = 0$, and therefore the values of $d^m f$ for which anisotropic mesh adaptation may lead to super-convergence.

Proposition 3.5.3.
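Let $m \geq 2$ and let $t_m := \left\lfloor \frac{m+3}{2} \right\rfloor$. Then for all $\pi \in \mathrm{IH}_m$,
$$L_m(\pi) = 0 \ \text{ if and only if } \ \pi = (\alpha x + \beta y)^{t_m} \tilde{\pi} \ \text{ for some } \alpha, \beta \in \mathbb{R} \text{ and } \tilde{\pi} \in \mathrm{IH}_{m - t_m}. \tag{3.86}$$

Proof: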
According to (3.80), L m ( π ) = 0 if and only if K E m − ( |∇ π | ) = 0. On the otherhand, it is proved in Chapter 2 that for any π ∗ ∈ IH m − one has K E m − ( π ∗ ) = 0 if andonly if π ∗ has a linear factor of multiplicity m . Therefore L m ( π ) = 0 if and only |∇ π | isa multiple of l m , where l is of the form l = αx + βy .Let us first assume that |∇ π | = ( ∂ x π ) + ( ∂ y π ) has such a form. Since they are non-negative, ( ∂ x π ) and ( ∂ y π ) are both multiples of l m . Therefore ∂ x π and ∂ y π are multiplesof l s where s is an integer such that 2 s ≥ m , hence s ≥ t m −
1. We therefore have ∂ x π = l s π and ∂ y π = l s π where π , π ∈ IH m − − s Recalling that l = αx + βy we obtain0 = ∂ yx π − ∂ xy π = l s ( ∂ y π − ∂ x π ) + sl s − ( βπ − απ ) , hence βπ − απ is a multiple of l . Since π is homogenous of degree m it obeys the Euleridentity mπ ( z ) = (cid:104) z, ∇ π ( z ) (cid:105) for all z ∈ R . Assuming without loss of generality that α (cid:54) = 0, we therefore obtain mπ ( x, y ) = l s ( xπ + yπ ) = l s (cid:16) ( αx + βy ) π α + yα ( απ − βπ ) (cid:17) which shows that π is a multiple of l s +1 , hence of l t m .Conversely if π is a multiple of l t m then ∂ x π and ∂ y π are both multiples of l t m − . Since2( t m − ≥ m the polynomial |∇ π | is a multiple of l m which concludes the proof. (cid:5) .6. Extension to higher dimension This section partially extends the results exposed in the previous sections to functionsof d variables. We give in § L m,p and of themeasure of sliverness S .Subsection § d -dimensional error estimate in Theorem 3.6.6 which generalises Theorem 3.2.6. We thenestablish an asymptotic lower error estimate in Theorem 3.6.7 which generalises Theorem3.1.2. We give sufficient conditions under which the interpolation on a d -dimensionalmesh T achieves this optimal lower bound up to a multiplicative constant. However dueto technical issues linked to the measure of sliverness S we were not able to constructsuch meshes, and we therefore state the upper bound as a conjecture.We discuss in subsection § § We extend in this section the tools used in our analysis of optimally adapted triangu-lations to arbitrary dimension d . We begin with the spaces of polynomials. LetIH m,d := Span { x α · · · x α d d ; | α | = m } and IP m,d := Span { x α · · · x α d d ; | α | ≤ m } , where α = ( α , · · · , α d ) denotes a d-plet of non-negative integers, | α | := α + · · · + α d . 
Forany simplex T the Lagrange interpolation operator I mT : C ( T ) → IP m,d is defined by impo-sing f ( γ ) = I mT f ( γ ) for all points γ with barycentric coordinates in the set { , m , m , · · · , } with respect to the vertices of T . For all π ∈ IH m,d we define L m,d,p ( π ) := inf | T | =1 (cid:107)∇ π − ∇ I m − T π (cid:107) L p ( T ) , where the infimum is taken on the set of d -dimensional simplices of unit volume. Similarlyto (3.10) the functions L m,d,p , 1 ≤ p ≤ ∞ , are uniformly equivalent on IH m,d . We define L m,d := L m,d, ∞ .The distance defined at (3.14) between triangles extends easily to simplices. Given two d -dimensional simplices T , T (cid:48) there are precisely ( d + 1)! affine transformations Ψ suchthat Ψ( T ) = T (cid:48) . For each such Ψ, we denote by ψ its linear part and we define d ( T, T (cid:48) ) := ln (cid:16) inf { κ ( ψ ) ; Ψ( T ) = T (cid:48) } (cid:17) . We say that a d -dimensional simplex T is acute if the exterior normals n, n (cid:48) to any twodistinct faces F, F (cid:48) of T have a negative scalar product (cid:104) n, n (cid:48) (cid:105) . In other words if all pairsof faces of T form acute dihedral angles. We denote the set of acute simplices by A andwe generalise the measure of sliverness to arbitrary dimension d as follows S ( T ) := exp d ( T, A ) = inf { κ ( ψ ) ; Ψ( T ) ∈ A } . (3.87)46 Chapter 3. Sharp asymptotics of the W ,p interpolation error Similarly to (3.15), the quantity S ( T ) reflects the distance from a simplex T to the setof acute simplexes A . The definition (3.87) of S ( T ) raises a legitimate question : howto produce an affine transformation Ψ such that Ψ( T ) has acute angles, and κ ( ψ ) iscomparable to S ( T ) ? This question is answered by the following proposition.For any d -dimensional simplex T with vertices ( v i ) ≤ i ≤ d , we define the symmetricmatrix M T := (cid:88) ≤ i For any simplex T , the simplex M − T ( T ) is acute and S ( T ) ≤ κ ( (cid:112) M T ) ≤ α d S ( T ) . 
(3.90) Proof: See appendix. (cid:5) Remark 3.6.2. In the paper [62] an alternative measure of sliverness S (cid:48) ( T ) of a simplex T is introduced, and defined as S (cid:48) ( T ) := (cid:18) inf | u | =1 max i For all m ≥ and all d ≥ there exists a constant C = C ( m, d ) suchthat for any d -dimensional simplex T and any f ∈ W , ∞ ( T ) , one has (cid:107)∇ I mT f (cid:107) L ∞ ( T ) ≤ CS ( T ) (cid:107)∇ f (cid:107) L ∞ ( T ) . (3.91) .6. Extension to higher dimension Proof: The proof this lemma is extremely similar to the proof of Lemma (3.2.2). Let T be the simplex which vertices are the origin and the canonical basis of R d . For the samereason as in Lemma 3.2.2, if a function ˜ g ( x , x , · · · , x d ) ∈ C ( T ) does not depend onthe coordinate x d , then I m − T ˜ g does not depend on x d either. Using the same reasonningas in Lemma 3.2.2 we obtain that there exists a constant C = C ( m, d ) such that for all g ∈ W , ∞ ( T ) (cid:13)(cid:13)(cid:13)(cid:13) ∂ I mT g∂x d (cid:13)(cid:13)(cid:13)(cid:13) L ∞ ( T ) ≤ C (cid:13)(cid:13)(cid:13)(cid:13) ∂g∂x d (cid:13)(cid:13)(cid:13)(cid:13) L ∞ ( T ) . Again similarly to the proof of Lemma 3.2.2 we obtain using a change of variables thatfor any simplex T , any f ∈ W , ∞ ( T ) and any edge vector u of T (cid:107)(cid:104) u, ∇ I mT f (cid:105)(cid:107) L ∞ ( T ) ≤ C (cid:107)(cid:104) u, ∇ f (cid:105)(cid:107) L ∞ ( T ) . We use the notations of Proposition 3.6.1 and we define a norm | v | T on R d by | v | T := v T M T v = (cid:88) ≤ i 1) .Note that ˆ S ( T ) has an analytic expression in terms of the coordinates of T : the squareroot of the ratio of two polynomials in the positions of the vertices of T . Remark 3.6.4. We illustrate the sharpness of inequality (3.91) in a simple example. Let x, y, z be the coordinates on R and let π := x ∈ IH , . Let T λ be the tetrahedron ofvertices ( − λ, , , ( λ, , , ( λ, , and (0 , , . 
Simple computations show that (cid:107)∇ I T λ π (cid:107) L ∞ ( T λ ) = λ , (cid:107)∇ π (cid:107) L ∞ ( T λ ) = 2 λ and lim λ →∞ ˆ S ( T λ ) λ = (cid:114) . Let T (cid:48) λ be defined by replacing the vertex ( − λ, , of T λ with (0 , , . Then (cid:107)∇ I T λ π (cid:107) L ∞ ( T (cid:48) λ ) = λ, (cid:107)∇ π (cid:107) L ∞ ( T (cid:48) λ ) = 2 λ and lim λ →∞ ˆ S ( T (cid:48) λ ) = 32 . Hence the simplices T λ and T (cid:48) λ have very different interpolation properties for large λ ,although they have a similar aspect ratio. They are representatives of “bad” and “good”anisotropy respectively. The tetrahedrons T and T (cid:48) are illustrated on the left of Figure4, bottom and top respectively. For any d -dimensional simplex T , we define its measure of non degeneracy by ρ ( T ) := diam( T ) d | T | . Let T ∗ be a fixed d -dimensional acute simplex, for instance the reference equilateral sim-plex. For any d -dimensional simplex T let ψ ∈ GL d and z ∈ R d be such that T = z + ψ ( T ∗ ).Since T ∗ is acute, we obtain a generalization of (3.27) S ( T ) ≤ κ ( ψ ) ≤ (cid:107) ψ (cid:107) d | det ψ | − ≤ diam( T ) d µ ( T ∗ ) d | T ∗ || T | = C ( d ) ρ ( T ) . where µ ( T ∗ ) is the diameter of the largest ball incribed in T ∗ , and where we have used theinequality | det( ψ − ) | ≥ (cid:107) ψ − (cid:107)(cid:107) ψ (cid:107) − ( d − . This last inequality can be derived by using thesingular value decomposition ψ = U diag( λ , · · · , λ d ) V with 0 < λ ≤ · · · ≤ λ d and notingthat (cid:107) ψ (cid:107) = λ d and (cid:107) ψ − (cid:107) = λ − . .6. Extension to higher dimension π ∈ IH m,d , A π := { A ∈ M d ( R ) ; |∇ π ( z ) | ≤ | Az | m − for all z ∈ R d } . Geometrically, one has A ∈ A π if and only if the ellipsoid { z ∈ R d ; | Az | ≤ } is includedin the algebraic set { z ∈ R d ; |∇ π ( z ) | ≤ } . This leads us to the generalisation of Lemma3.2.5. Lemma 3.6.5. 
For all $m \ge 2$ and all $d \ge 2$ there exists a constant $C = C(m,d)$ such that for all $\pi \in \mathbb{H}_{m,d}$, we have
$$C^{-1} L_{m,d}(\pi) \le \inf\{|\det A|^{\frac{m-1}{d}} \,;\, A \in \mathcal{A}_\pi\} \le C L_{m,d}(\pi).$$
Proof: The proof of this lemma is completely similar to the proof of its bidimensional version, Lemma 3.2.5. The only point that needs to be properly generalized is the following: given a matrix $A \in GL_d$, how to construct an acute simplex $T = T(A)$ such that $\rho(A(T))$ is bounded independently of $A$?

The following construction is not the simplest but will be useful in our subsequent analysis. Let $A = UDV$ be the singular value decomposition of $A$, where $U, V$ are orthogonal matrices and $D$ is a diagonal matrix with positive diagonal entries $(\lambda_i)_{1 \le i \le d}$. We define the Kuhn simplex $T_0$,
$$T_0 := \{x \in [0,1]^d \,;\, x_1 \ge x_2 \ge \cdots \ge x_d\},$$
and $T := V^T D^{-1} T_0$. Then
$$\rho(A(T)) = \rho(U(T_0)) = \rho(T_0) = d!\, d^{d/2},$$
which is independent of $A$. We now show that $T$ is an acute simplex.

Let $(e_1, \cdots, e_d)$ be the canonical basis of $\mathbb{R}^d$, and let by convention $e_0 = e_{d+1} = 0$. For $0 \le i \le d$, an easy computation shows that the exterior normal to the face $F_i$ of $T_0$, opposite to the vertex $v_i = \sum_{1 \le k \le i} e_k$, is
$$n_i = \frac{e_{i+1} - e_i}{\|e_{i+1} - e_i\|}.$$
It follows that the exterior normal $n'_i$ to the face $D^{-1}(F_i)$ of the simplex $D^{-1}(T_0)$ is
$$n'_i = \frac{D(n_i)}{|D(n_i)|} = \frac{\lambda_{i+1} e_{i+1} - \lambda_i e_i}{|\lambda_{i+1} e_{i+1} - \lambda_i e_i|}.$$
Hence $\langle n'_i, n'_j \rangle = 0$ if $|i - j| > 1$, and $\langle n'_i, n'_{i+1} \rangle < 0$ for all $0 \le i \le d-1$. It follows that the simplex $D^{-1}(T_0)$ is acute, and therefore $T = V^T D^{-1}(T_0)$ is also acute since $V$ is orthogonal. $\diamond$

We present in this section the generalisation to higher dimension of our anisotropic error estimates. We prove a local error estimate in Theorem 3.6.6 and an asymptotic lower estimate in Theorem 3.6.7.
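The acuteness argument in the proof of Lemma 3.6.5 can be checked numerically. The following is a minimal sketch, assuming NumPy and the convention that a simplex is stored as a $(d+1) \times d$ array of vertex rows; the function names (`outward_normals`, `is_acute`, `kuhn_simplex`, `transformed_kuhn_simplex`) are hypothetical helpers introduced for illustration, not part of any software cited in this chapter. Exterior normals are obtained as the negated gradients of the barycentric coordinates.

```python
import numpy as np

def outward_normals(vertices):
    """Unit exterior normals of a simplex; row i is the normal of the face opposite vertex i."""
    d = vertices.shape[1]
    # Rows of B are [v_i, 1]; column i of inv(B) holds the affine coefficients
    # of the i-th barycentric coordinate, whose gradient points towards v_i.
    B = np.hstack([vertices, np.ones((d + 1, 1))])
    grads = np.linalg.inv(B)[:d, :].T
    normals = -grads
    return normals / np.linalg.norm(normals, axis=1, keepdims=True)

def is_acute(vertices, tol=1e-9):
    """True if all pairs of distinct exterior normals have non-positive scalar product."""
    n = outward_normals(vertices)
    gram = n @ n.T
    off_diagonal = gram[~np.eye(len(n), dtype=bool)]
    return bool(np.all(off_diagonal <= tol))

def kuhn_simplex(d):
    """Vertices v_i = e_1 + ... + e_i, 0 <= i <= d, of {x in [0,1]^d : x_1 >= ... >= x_d}."""
    return np.tril(np.ones((d + 1, d)), k=-1)

def transformed_kuhn_simplex(singular_values, V):
    """Vertices of V^T D^{-1} T_0, with D = diag(singular_values) and V orthogonal."""
    D_inv = np.diag(1.0 / np.asarray(singular_values))
    # Vertices are rows, so the map x -> V^T D^{-1} x acts as rows @ D^{-1} @ V.
    return kuhn_simplex(len(singular_values)) @ D_inv @ V

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 3
    lam = rng.uniform(0.1, 10.0, size=d)
    V, _ = np.linalg.qr(rng.standard_normal((d, d)))
    print("acute:", is_acute(transformed_kuhn_simplex(lam, V)))
```

The tolerance `tol` absorbs rounding in the scalar products that vanish exactly in the proof (the pairs with $|i-j| > 1$).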
We also point out in Conjecture 3.6.8 a technical point which, if proved, would lead to the optimal asymptotic upper estimates (3.98) and (3.99).

Theorem 3.6.6. For all $m \ge 2$ and $d \ge 2$ there exists a constant $C = C(m,d)$ such that for all $\pi \in \mathbb{H}_{m,d}$, all $A \in \mathcal{A}_\pi$ and any simplex $T$ we have
$$|\pi - I^{m-1}_T \pi|_{W^{1,p}(T)} \le C |T|^{\frac{1}{\tau}} S(T)\, \rho(A(T))^{\frac{m-1}{d}} |\det A|^{\frac{m-1}{d}}, \qquad (3.94)$$
where $\frac{1}{\tau} := \frac{m-1}{d} + \frac{1}{p}$. Furthermore for any $g \in C^m(T)$ we have
$$|g - I^{m-1}_T g|_{W^{1,p}(T)} \le C |T|^{\frac{1}{\tau}} S(T)\, \rho(T)^{\frac{m-1}{d}} \|d^m g\|_{L^\infty(T)}.$$
Proof: It is a straightforward generalization of the proof of Theorem 3.2.6. $\diamond$

Combining these two estimates, we can obtain a mixed estimate similar to (3.30), with the new value of $\tau$ and the generalised $S$ and $\rho$. For all $m \ge 2$ and $d \ge 2$ there exists a constant $C = C(m,d)$ such that for any simplex $T$, any $f \in C^m(T)$, any $\pi \in \mathbb{H}_{m,d}$ and any $A \in \mathcal{A}_\pi$
$$|f - I^{m-1}_T f|_{W^{1,p}(T)} \le C |T|^{\frac{1}{\tau}} S(T) \Big( \rho(A(T))^{\frac{m-1}{d}} |\det A|^{\frac{m-1}{d}} + \rho(T)^{\frac{m-1}{d}} \|d^m f - d^m \pi\|_{L^\infty(T)} \Big). \qquad (3.95)$$
This leads us to a straightforward generalisation of the points (i) to (iv) exposed in (3.31). Similarly to the bidimensional case (3.37), if a triangulation $\mathcal{T}$ meets these requirements, then it satisfies the error estimate
$$\#(\mathcal{T})^{\frac{m-1}{d}} |f - I^{m-1}_{\mathcal{T}} f|_{W^{1,p}(\Omega)} \le C \|L_{m,d}(\pi_z) + \varepsilon\|_{L^\tau(\Omega)}. \qquad (3.96)$$
Generalizing (3.6), we say that a sequence $(\mathcal{T}_N)_{N \ge N_0}$ of simplicial meshes of a $d$-dimensional polygonal domain is admissible if $\#(\mathcal{T}_N) \le N$ and if there exists a constant $C_A > 0$ such that
$$\sup_{T \in \mathcal{T}_N} \operatorname{diam}(T) \le C_A N^{-\frac{1}{d}}.$$
Similarly to (3.7), it can be shown that (3.96) cannot be improved for an admissible sequence of triangulations, in the following asymptotical sense.

Theorem 3.6.7. Let $(\mathcal{T}_N)_{N \ge N_0}$ be an admissible sequence of triangulations of a domain $\Omega$, let $f \in C^m(\Omega)$ and $1 \le p < \infty$.
Then
$$\liminf_{N \to \infty} N^{\frac{m-1}{d}} |f - I^{m-1}_{\mathcal{T}_N} f|_{W^{1,p}(\Omega)} \ge \left\| L_{m,d,p}\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau(\Omega)} \qquad (3.97)$$
where $\frac{1}{\tau} := \frac{m-1}{d} + \frac{1}{p}$.

Proof: It is identical to the proof of the bidimensional estimate (3.7), presented earlier in this chapter. $\diamond$

In contrast, the upper estimates (3.4) and (3.8) do not generalize easily to higher dimension. A first problem is that the bidimensional mesh $\mathcal{P}_T$ defined in Equation (3.49) has no equivalent in higher dimension, in the sense that we cannot exactly tile the space by […]

For any permutation $\sigma \in \Sigma_d$ of $\{1, \cdots, d\}$ we define
$$T_\sigma := \{x \in [0,1]^d \,;\, x_{\sigma(1)} \ge \cdots \ge x_{\sigma(d)}\}.$$
Let $A \in GL_d(\mathbb{R})$, and let $A = UDV$ be the singular value decomposition of $A$, where $U$ and $V$ are unitary and $D$ is diagonal. We define
$$\mathcal{P}_A := \{V^T D^{-1}(T_\sigma + z) \,;\, \sigma \in \Sigma_d,\ z \in \mathbb{Z}^d\},$$
which is a tiling of $\mathbb{R}^d$ built of acute simplices $T$ satisfying $\rho(A(T)) = d!\, d^{d/2}$ (these properties are established in the proof of Lemma 3.6.5). Using such a tiling, we would like to build partitions $\mathcal{P}_{A,n}(R)$ of any $d$-dimensional simplex $R$, with properties similar to those expressed in Lemma 3.3.2 for the triangulations $\mathcal{P}_{T,n}(R)$. At the present stage we do not know how to properly adapt the construction of $\mathcal{P}_{A,n}(R)$ near the boundary of $R$ in order to respect the condition on the measure of sliverness. The following conjecture, if established, would serve as a generalisation of Lemma 3.3.2.

Conjecture 3.6.8. Let $R$ be a $d$-dimensional simplex, and let $A \in GL_d(\mathbb{R})$. There exists a sequence $(\mathcal{P}_{A,n}(R))_n$ of conformal triangulations of $R$ such that
– Nearly all the elements of $\mathcal{P}_{A,n}(R)$ belong to $\mathcal{P}_{A,n} := \frac{1}{n} \mathcal{P}_A$, in the sense that
$$\lim_{n \to \infty} \frac{\#(\mathcal{P}^1_{A,n}(R))}{n^d} = d!\, |R|\, |\det A| \quad \text{and} \quad \lim_{n \to \infty} \frac{\#(\mathcal{P}^2_{A,n}(R))}{n^d} = 0,$$
where $\mathcal{P}^1_{A,n}(R) := \mathcal{P}_{A,n}(R) \cap \mathcal{P}_{A,n}$ and $\mathcal{P}^2_{A,n}(R) := \mathcal{P}_{A,n}(R) \setminus \mathcal{P}_{A,n}$.
– The restriction of $\mathcal{P}_{A,n}(R)$ to a face $F$ of $R$ is its standard periodic tiling with $n^{d-1}$ elements.
– The sequence $(\mathcal{P}_{A,n}(R))_n$ satisfies
$$\sup_n \Big( n \max_{T \in \mathcal{P}_{A,n}(R)} \operatorname{diam}(T) \Big) < \infty \quad \text{and} \quad \sup_n \max_{T \in \mathcal{P}_{A,n}(R)} S(T) < \infty.$$

The validity of this conjecture would imply the following result, using the same proof as for the estimates (3.4) and (3.8) established earlier in this chapter.

Conjecture 3.6.9. For all $m \ge 2$ there exists a constant $C = C(m,d)$ such that the following holds. Let $\Omega \subset \mathbb{R}^d$ be a polygonal domain, let $f \in C^m(\Omega)$ and $1 \le p < \infty$. Then there exists a sequence $(\mathcal{T}_N)_{N \ge N_0}$ of simplicial meshes of $\Omega$ such that $\#(\mathcal{T}_N) \le N$ and
$$\limsup_{N \to \infty} N^{\frac{m-1}{d}} |f - I^{m-1}_{\mathcal{T}_N} f|_{W^{1,p}(\Omega)} \le C \left\| L_{m,d}\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau(\Omega)} \qquad (3.98)$$
where $\frac{1}{\tau} := \frac{m-1}{d} + \frac{1}{p}$. Furthermore, for all $\varepsilon > 0$, there exists an admissible sequence of simplicial meshes $(\mathcal{T}^\varepsilon_N)_{N \ge N_0}$ of $\Omega$ such that $\#(\mathcal{T}^\varepsilon_N) \le N$ and
$$\limsup_{N \to \infty} N^{\frac{m-1}{d}} |f - I^{m-1}_{\mathcal{T}^\varepsilon_N} f|_{W^{1,p}(\Omega)} \le C \left\| L_{m,d}\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau(\Omega)} + \varepsilon. \qquad (3.99)$$

The theory of anisotropic mesh generation in dimension three or higher is still in its infancy. However, efficient software already exists, such as [94] for tetrahedral mesh generation in domains of $\mathbb{R}^3$. A description of such an algorithm can be found in [16], as well as some applications to computational mechanics.
Such software takes as input a field $h(z)$ of symmetric positive definite matrices and attempts to create a mesh satisfying (3.68). Defining the set of symmetric matrices $\mathcal{A}'_\pi$ in a similar way as in the two dimensional case (3.65), let us consider a continuous function $M^{(\varepsilon)}_{m,d} : \mathbb{H}_{m,d} \to S^+_d$ such that $M^{(\varepsilon)}_{m,d}(\pi) \in \mathcal{A}'_\pi$ and
$$\det M^{(\varepsilon)}_{m,d}(\pi) \le K \big( \inf\{\det M \,;\, M \in \mathcal{A}'_\pi\} + \varepsilon \big), \qquad (3.100)$$
where $\varepsilon > 0$ and $K$ is an absolute constant independent of $\varepsilon$. The existence of such a function is established in Chapter 6. Let $M(z) := M^{(\varepsilon)}_{m,d}(d^m f(z))$, let $\delta > 0$ and
$$h(z) := \delta^{-\tau} (\det M(z))^{-\frac{\tau}{dp}} M(z). \qquad (3.101)$$
A heuristic analysis similar to the one developed earlier in this chapter suggests that this metric leads to a triangulation $\mathcal{T}$ of $\Omega$ optimally adapted for approximating $f$ with $\mathbb{P}_{m-1}$ elements in the $W^{1,p}$ semi-norm. This justifies the search for functions $M_{m,d}$ satisfying (3.100).

The form of $M_{2,d}$, which corresponds to piecewise linear finite elements, is already established, see for instance [85], but we recall it for completeness. The same analysis as in the two dimensional case shows that $M_{2,d}(\pi) := 4[\pi]^2$ satisfies $M_{2,d}(\pi) \in \mathcal{A}'_\pi$ and $\det M_{2,d}(\pi) = \inf\{\det M \,;\, M \in \mathcal{A}'_\pi\}$. As a byproduct we obtain from Lemma 3.6.5 that there exists a constant $C = C(d)$ such that for all $\pi \in \mathbb{H}_{2,d}$
$$C^{-1} \sqrt[d]{|\det[\pi]|} \le L_{2,d}(\pi) \le C \sqrt[d]{|\det[\pi]|}.$$
For piecewise quadratic elements, we generalise (3.71) and define
$$M^*_{3,d}(\pi) := \sqrt{[\partial_{x_1} \pi]^2 + \cdots + [\partial_{x_d} \pi]^2}.$$
Then $\sqrt{d}\, M^*_{3,d}(\pi) \in \mathcal{A}'_\pi$, but we have found that for each $K > 0$ there exists $\pi \in \mathbb{H}_{3,d}$ such that $\det M^*_{3,d}(\pi) > K \inf\{\det M \,;\, M \in \mathcal{A}'_\pi\}$. The map $M^*_{3,d}$ may still be used for mesh adaptation through the formula (3.101), but this metric may not be optimal in the areas where $\pi_z = \frac{d^3 f(z)}{6}$ is such that $\det M^*_{3,d}(\pi_z)$ is not well controlled by $\inf\{\det M \,;\, M \in \mathcal{A}'_{\pi_z}\}$.

3.7. Final remarks and conclusion

In this chapter, we have introduced asymptotic estimates for the finite element interpolation error measured in the $W^{1,p}$ semi-norm, when the mesh is optimally adapted to a function of two variables and the degree of interpolation $m - 1$ […] $L^p$ interpolation error, and leads to asymptotically sharp error estimates, exposed in Theorems 3.1.1 and 3.1.2. These estimates involve a shape function $L_{m,p}$ which generalises the determinant that appears in estimates for piecewise linear interpolation. The shape function has equivalents of polynomial form for all values of $m$, as established in Theorems 3.5.1 and 3.5.2. Up to a fixed multiplicative constant, our estimates can therefore be written in analytic form in terms of the derivatives of the function to be approximated.

In the case of piecewise linear and piecewise quadratic finite elements, we have presented […] the measure of sliverness $S(T)$ of a simplex, defined in (3.87), which has a geometrical interpretation as the distance from $T$ to the set of acute simplices. This measure accurately distinguishes between good anisotropy, which leads to optimal error estimates, and bad anisotropy, which leads to oscillation of the gradient of the interpolated function. Equivalent quantities can be found in [10, 62], but had not been used in the context of optimal mesh adaptation.

Let $R_n$ be the homothetic contraction of $R$ by the factor $1 - n^{-1}$ and with the same barycenter. We define a partition $\mathcal{P}'_n$ of $R_n$ into convex polygons as follows:
$$\mathcal{P}'_n := \{R_n \cap T' \,;\, T' \in \mathcal{P}_{T,n}\}.$$
The triangles $R$, $R_n$, and the partition $\mathcal{P}'_n$ are illustrated on Figure 3.5. Note that the normals to the faces of the polygons building the partition $\mathcal{P}'_n$ belong to a family of only 6 elements $(n_i)_{1 \le i \le 6}$: the normals to the faces of $R$, and the normals to the faces of $T$.
Hence only $6 \times$ […] $\mathcal{P}'_n$, and we denote the largest of these by $\alpha < \pi$.

We now partition into triangles each convex polygon $C \in \mathcal{P}'_n$ using the Delaunay triangulation of its vertices. Note that the angles of the triangles partitioning a convex polygon $C$ are smaller than the maximal angle of $C$, hence than $\alpha$. We denote by $\mathcal{P}''_n$ the resulting triangulation of $R_n$, as illustrated on the left of Figure 3.6.

Figure 3.5: Left: the triangles $R$ and $R_n$. Right: the partition $\mathcal{P}'_n$ of $R_n$.

We denote by $E_n$ the collection of $n$ equidistributed points on each edge of $R$, described in item 2 of Lemma 3.3.2. We denote by $E'_n$ the set of vertices of the triangles in $\mathcal{P}''_n$ that fall on $\partial R_n$. For each point $p \in E_n$, we draw an edge between $p$ and the point of $E'_n$ which is the closest to $p$. This produces a partition of $R \setminus R_n$ into triangles and convex quadrilaterals. Eventually we partition each of these polygons $C$ into triangles using the Delaunay triangulation of the point set $C \cap (E_n \cup E'_n)$, which produces a triangulation $\widetilde{\mathcal{P}}_n$ of $R \setminus R_n$ illustrated on the right of Figure 3.6. The triangles $T' \in \widetilde{\mathcal{P}}_n$ obey
$$\operatorname{diam}(T') \le (2 \operatorname{diam} R + \operatorname{diam} T)\, n^{-1} = C n^{-1}.$$
Furthermore let $L$ be the length of the edge of $T'$ included in $\partial R \cup \partial R_n$, and let $H$ be the height of the triangle $T'$ such that $LH = 2|T'|$. Then
$$H \ge \min\{|z - z'| \,;\, z \in \partial R,\ z' \in \partial R_n\} = c n^{-1},$$
where $c > 0$ is independent of $n$. Let $L'$ be another edge of $T'$, and let $\theta$ be the angle of $T'$ between the edges $L$ and $L'$. Then
$$2|T'| = L L' \sin\theta = L H,$$
hence $\sin\theta \ge \frac{H}{\operatorname{diam}(T')} \ge \frac{c}{C}$, and therefore $\arcsin(\frac{c}{C}) \le \theta \le \pi - \arcsin(\frac{c}{C})$. It follows that all the angles of $T'$ are smaller than $\pi - \arcsin(\frac{c}{C})$.
We eventually define P T,n ( R ) := P (cid:48)(cid:48) n ∪ (cid:101) P n Figure P (cid:48)(cid:48) n of R n . Right : the partition P n = P T,n ( R ) = P (cid:48)(cid:48) n ∪ (cid:101) P n of R . .8. Appendix P T,n ( R ) is bounded by the constant β ( R, T ) = max { α, π − arcsin( cC ) } < π which is independent of n . Hencesup n ≥ sup T ∈P T,n ( R ) S ( T ) ≤ tan (cid:18) β ( R, T )2 (cid:19) < ∞ The other properties of P T,n ( R ) mentionned in 3.3.2 are easily checked. Let m ≥ s m := (cid:98) m (cid:99) + 1. We have proved in chapter § π ∈ IH m the three following properties are equivalent K E m ( π ) = 0 , There exists α, β ∈ R and ˜ π ∈ IH m − s m such that π = ( αx + βy ) s m ˜ π, There exists a sequence ( φ n ) n ≥ , φ n ∈ SL such that π ◦ φ n → . (3.102)We also proved in chapter § 2, Theorem 2.6.3, the following invariance property : Let Q be a polynomial on IH m such that K E m ∼ r (cid:112) | Q | where r = deg Q . Then Q ( π ◦ φ ) = (det φ ) rm Q ( π ) for all π ∈ IH m and φ ∈ M ( R ) . (3.103)It immediately follows that the polynomials ( Q k ) ≤ k ≤ r , defined in (3.84), satisfy for all π , π ∈ IH m and φ ∈ M ( R ) Q k ( π ◦ φ, π ◦ φ ) = (det φ ) rm Q k ( π , π ) for all π , π ∈ IH m and all φ ∈ M ( R ) . (3.104)We define two functions on IH m × IH m K ∗ ( π , π ) := r (cid:115) (cid:88) ≤ k ≤ r Q k ( π , π ) and K ( π , π ) := r (cid:113) ˜ Q ( π + π ) , where ˜ Q is such that K E m ∼ ˜ r (cid:113) ˜ Q , ˜ r := deg ˜ Q . We show below that K ∼ K ∗ on IH m × IH m .This result combined with (3.80) concludes the proof of Theorem 3.5.2. (Note that m isreplaced with m + 1 in the statement of this theorem.) Using (3.104) and remarking theinvariance property ˜ Q ( π ◦ φ ) = (det φ ) ˜ rm Q ( π ), for the same reasons as (3.103), we obtainfor all π , π ∈ IH m and all φ ∈ M ( R ) , (cid:26) K ( π ◦ φ, π ◦ φ ) = | det φ | m K ( π , π ) ,K ∗ ( π ◦ φ, π ◦ φ ) = | det φ | m K ∗ ( π , π ) . 
(3.105)If K ( π , π ) = 0, then π + π ∈ IH m has a linear factor of multiplicity s m = m + 1according to (3.102), and therefore π and π have a common linear factor of multiplicity s m .If K ∗ ( π , π ) = 0, then for all k , 0 ≤ k ≤ r , we have Q k ( π , π ) = 0. Using (3.84) weobtain that for all u, v ∈ R we have K m ( uπ + vπ ) = 0. It follows from (3.102) that forall u, v ∈ R the polynomial uπ + vπ ∈ IH m has a linear factor of multiplicity s m , hencethat π and π have a common linear factor of multiplicity s m .56 Chapter 3. Sharp asymptotics of the W ,p interpolation error Hence the following properties are equivalent K ( π , π ) = 0 ,K ∗ ( π , π ) = 0 , There exists α, β ∈ R and ˜ π , ˜ π ∈ IH m − s m such that π = ( αx + βy ) s m ˜ π , and π = ( αx + βy ) s m ˜ π . (3.106)Using (3.102) we find that these properties are also equivalent to K E m ( π + π ) = 0 , There exists a sequence ( φ n ) n ≥ , φ n ∈ SL , such that ( π ◦ φ n ) + ( π ◦ φ n ) → , There exists a sequence ( φ n ) n ≥ , φ n ∈ SL , such that π ◦ φ n → π ◦ φ n → . (3.107)We now define the norm (cid:107) ( π , π ) (cid:107) := sup | u |≤ | ( π ( u ) , π ( u )) | on IH m × IH m and F := { ( π , π ) ∈ IH m × IH m ; (cid:107) ( π , π ) (cid:107) = 1 and (cid:107) ( π ◦ φ, π ◦ φ ) (cid:107) ≥ φ ∈ SL } . F is compact subset of IH m × IH m and K as well as K ∗ do not vanish on F according to(3.106) and (3.107). Since these functions are continuous, there exists a constant C > C − K ≤ K ∗ ≤ C K on F . (3.108)Let π , π ∈ IH m . If there exists a sequence ( φ n ) n ≥ , φ n ∈ SL , such that π ◦ φ n → π ◦ φ n → 0, then K ( π , π ) = K ∗ ( π , π ) = 0. Otherwise, consider a sequence ( φ n ) n ≥ , φ n ∈ SL such that lim n →∞ (cid:107) ( π ◦ φ n , π ◦ φ n ) (cid:107) = inf φ ∈ SL (cid:107) ( π ◦ φ, π ◦ φ ) (cid:107) . By compactness there exists a pair (˜ π , ˜ π ) ∈ IH m × IH m and a subsequence ( φ n k ) k ≥ suchthat ( π ◦ φ n k , π ◦ φ n k ) → (˜ π , ˜ π ) . 
One easily checks that (˜ π , ˜ π ) (cid:107) (˜ π , ˜ π ) (cid:107) ∈ F . Using (3.105) we obtain K ( π , π ) K ∗ ( π , π ) = lim n →∞ K ( π ◦ φ n , π ◦ φ n ) K ∗ ( π ◦ φ n , π ◦ φ n ) = K (˜ π , ˜ π ) K ∗ (˜ π , ˜ π )Using (3.108) and the homogeneity of K and K ∗ , we obtain that C − K ≤ K ∗ ≤ C K onIH m × IH m which concludes the proof. We denote by T eq a d -dimensional equilateral simplex such that bary( T eq ) = 0, wherebary denotes the barycenter, and such that its vertices q i , 0 ≤ i ≤ d , belong to the unitsphere, i.e. | q i | = 1. Since the vertices of T eq play symmetrical roles there exists a constant ξ ∈ R such thatFor all 0 ≤ i ≤ d, ≤ j ≤ d, ≤ k ≤ d, one has (cid:104) q i − q j , q k (cid:105) = ξ ( δ ik − δ jk ) , (3.109) .8. Appendix δ is the Kronecker symbol : δ ij = 1 if i = j , and 0 otherwise. Using the relation q + · · · + q d = 0 we obtain ξd = (cid:80) dj =0 (cid:104) q − q j , q (cid:105) = d + 1 hence ξ = 1 + d . Note also thatthe unit exterior normal to the face of T eq opposite to the vertex q i is − q i .We recall the following property : if A ∈ GL d and if n is the exterior normal to a face F of a simplex T , then the exterior normal to the face A ( F ) of A ( T ) is n (cid:48) = ( A − ) T n | ( A − ) T n | . (3.110)We first establish that for any simplex T , the simplex M − T ( T ) is acute. Without lossof generality we can assume that bary( T ) = 0, hence there exists A ∈ GL d such that T = A ( T eq ). Since the vertices of T are v i = Aq i for 0 ≤ i ≤ d , we obtain from definition(3.88) that M T = A (cid:32) (cid:88) ≤ i 0. For all 0 ≤ a < b ≤ d , we therefore obtain using (3.109) ν a ν b (cid:104) n a , n b (cid:105) = (cid:68) M T ( A − ) T q a , M T ( A − ) T q b (cid:69) = q T a A − M T ( A − ) T q b = q T a (cid:32) (cid:88) ≤ i For any acute simplex T ac one has M T ac ≥ Id . Proof: Without loss of generality we can assume that bary( T ac ) = 0, hence there exists A ∈ GL d such that T ac = A ( T eq ). 
The vertices of $T_{ac}$ are $c_i = A q_i$ for $0 \le i \le d$, and the exterior normal to the face of $T_{ac}$ opposite $c_i$ is
$$m_i = -\mu_i (A^{-1})^T q_i,$$
where $\mu_i > 0$. We define for all $0 \le i < j \le d$
$$\lambda_{ij} := -\frac{|c_i - c_j|^2 \langle m_i, m_j \rangle}{\xi^2 \mu_i \mu_j}.$$
Since $T_{ac}$ is acute we have $\langle m_i, m_j \rangle \le 0$ and therefore $\lambda_{ij} \ge 0$. We now introduce the symmetric matrix $M := \sum_{0 \le i < j \le d}$ […]

Chapter 4. From finite element approximation to image models

4.1. Introduction

There exist various ways of measuring the smoothness of functions on a domain $\Omega \subset \mathbb{R}^d$, generally through the definition of an appropriate smoothness space. Classical instances are Sobolev, Hölder and Besov spaces. Such spaces are of common use when describing the regularity of solutions to partial differential equations. From a numerical perspective, they are also useful in order to sharply characterize at which rate a function $f$ may be approximated by simpler functions such as Fourier series, finite elements, splines or wavelets (see [30, 42, 45] for surveys on such results).

Functions arising in concrete applications may have inhomogeneous smoothness properties, in the sense that they exhibit areas of smoothness separated by localized discontinuities. Two typical instances are (i) edges in functions representing real images and (ii) shock profiles in solutions to non-linear hyperbolic PDEs. The smoothness space that is best tailored to take such features into account is the space $BV(\Omega)$ of bounded variation functions. This space consists of those $f$ in $L^1(\Omega)$ such that $\nabla f$ is a bounded measure, i.e. such that their total variation
$$\mathrm{TV}(f) = |f|_{BV} := \sup\left\{ \int_\Omega f \operatorname{div}(\varphi) \,;\, \varphi \in \mathcal{D}(\Omega)^d,\ \|\varphi\|_{L^\infty} \le 1 \right\}$$
is finite. Functions of bounded variation are allowed to have jump discontinuities along hypersurfaces of finite measure.
In particular, the characteristic function of a smooth subdomain $D \subset \Omega$ has finite total variation equal to the $(d-1)$-dimensional Hausdorff measure of its boundary:
$$|\chi_D|_{BV} = \mathcal{H}^{d-1}(\partial D). \qquad (4.1)$$
It is well known that $BV$ is a regularity space for certain hyperbolic conservation laws [57, 68], in the sense that the total variation of their solutions remains finite for all time $t > 0$. This space has also played an important role in image processing since the seminal paper [54]. Here, a small total variation is used as a prior to describe the mathematical properties of "plausible images", when trying to restore an unknown image $f$ from an observation $h = Tf + e$, where $T$ is a known operator and $e$ a measurement noise of norm $\|e\|_{L^2} \le \varepsilon$. The restored image is then defined as the solution to the minimization problem
$$\min_{g \in BV} \{ |g|_{BV} \,;\, \|Tg - h\|_{L^2} \le \varepsilon \}. \qquad (4.2)$$
From the point of view of approximation theory, it was shown in [32, 33] that the space $BV$ is almost characterized by expansions in wavelet bases. For example, in dimension $d = 2$, if $f = \sum d_\lambda \psi_\lambda$ is an expansion in a tensor-product $L^2$-orthonormal wavelet basis, one has
$$(d_\lambda) \in \ell^1 \ \Rightarrow\ f \in BV \ \Rightarrow\ (d_\lambda) \in w\ell^1,$$
where $w\ell^1$ is the space of weakly summable sequences. The fact that the wavelet coefficients of a $BV$ function are weakly summable implies the convergence estimate
$$\|f - f_N\|_{L^2} \le C N^{-1/2} |f|_{BV}, \qquad (4.3)$$
where $f_N$ is the nonlinear approximation of $f$ obtained by retaining the $N$ largest coefficients in its wavelet expansion. Such approximation results have been further used in order to justify the performance of compression or denoising algorithms based on wavelet thresholding [31, 46, 47].

In recent years, it has been observed that the space $BV$ (and more generally classical smoothness spaces) does not provide a fully satisfactory description of piecewise smooth functions arising in the above mentioned applications.
Indeed, formula (4.1) reveals that the total variation only takes into account the size of the sets of discontinuities and not their geometric smoothness. In image processing, this means that the set of bounded variation images does not make the distinction between smooth and non-smooth edges as long as they have finite length.

The fact that edges have some geometric smoothness can be exploited in order to study approximation procedures which outperform wavelet thresholding in terms of convergence rate. […] if $f = \chi_D$ where $D$ is a bidimensional domain with smooth boundary, one can find a sequence of triangulations $\mathcal{T}_N$ with $N$ triangles such that the convergence estimate
$$\|f - I_{\mathcal{T}_N} f\|_{L^2} \le C N^{-1} \qquad (4.4)$$
holds, where $I_{\mathcal{T}}$ denotes the piecewise linear interpolation operator on a triangulation $\mathcal{T}$. Other methods are based on thresholding a decomposition of the function in bases or frames which differ from classical wavelets, see e.g. [3, 21, 80]. These methods also yield improvements over (4.3) similar to (4.4). The common feature in all these approaches is that they achieve anisotropic refinement near the edges. For example, in order to obtain the estimate (4.4), the triangulation $\mathcal{T}_N$ should include a thin layer of triangles which approximates the boundary $\partial D$. These triangles typically have size $N^{-2}$ in the normal direction to $\partial D$ and $N^{-1}$ in the tangential direction, and are therefore highly anisotropic.

Intuitively, these methods are well adapted to functions which have anisotropic smoothness properties, in the sense that their local variation is significantly stronger in one direction. Such properties are not well described by classical smoothness spaces such as $BV$, and a natural question to ask is therefore:

What type of smoothness properties govern the convergence rate of anisotropic refinement methods and how can one quantify these properties?
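The scaling behind the thin anisotropic layer can be illustrated on a simpler proxy. The sketch below is not the interpolation operator of (4.4); it assumes $D$ is the unit disk and replaces it by the inscribed regular polygon with $N$ edges, whose edges have length of order $N^{-1}$ and stay within distance of order $N^{-2}$ of $\partial D$. The $L^2$ distance between the two characteristic functions is the square root of the area between the circle and the polygon, and it decays like $N^{-1}$. The function name is a hypothetical helper.

```python
import math

def l2_error_inscribed_polygon(n_edges):
    """L2 norm of chi_D - chi_P, with D the unit disk and P the inscribed regular n-gon."""
    disk_area = math.pi
    # Area of the regular n-gon inscribed in the unit circle.
    polygon_area = 0.5 * n_edges * math.sin(2.0 * math.pi / n_edges)
    return math.sqrt(disk_area - polygon_area)

if __name__ == "__main__":
    for n in (16, 32, 64, 128):
        err = l2_error_inscribed_polygon(n)
        # The product n * err stabilises, which reflects the N^{-1} decay.
        print(n, err, n * err)
```

A Taylor expansion of the sine shows that the area difference behaves like $\frac{2\pi^3}{3 N^2}$, so the product $N \times \text{error}$ tends to $\sqrt{2\pi^3/3}$.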
The goal of this chapter is to answer this question, by proposing and studying measures of smoothness which are suggested by recent results on anisotropic finite element approximation [4, 27] and Chapter 2. Before going further, let us mention several existing approaches which have been developed for describing and quantifying anisotropic smoothness, and explain their limitations.

1. The so-called mixed smoothness classes have been introduced and studied in order to describe functions which have a different order of smoothness in each coordinate, see e.g. [76, 88]. These spaces are therefore not adapted to our present goal, since the anisotropic smoothness that we want to describe may have preferred directions that are not aligned with the coordinate axes and that may vary from one point to another (for example an image with a curved edge).

2. Anisotropic smoothness spaces with more general and locally varying directions have been investigated in [64]. Yet, in such spaces the amount of smoothness in different directions at each point is still fixed in advance and therefore again not adapted to our goal, since this amount may differ from one function to another (for example two images with edges located at different positions).

3. A class of functions which is often used to study the convergence properties of anisotropic approximation methods is the family of $C^m - C^n$ cartoon images, i.e. functions which are $C^m$ smooth on a finite number of subdomains $(\Omega_i)_{i=1,\cdots,k}$ separated by a union of discontinuity curves $(\Gamma_j)_{j=1,\cdots,l}$ that are $C^n$ smooth. The defects of this class are revealed when searching for a simple expression that quantifies the amount of smoothness in this sense. A natural choice is to take the supremum of all $C^m(\Omega_i)$ norms of $f$ and $C^n$ norms of the normal parametrization of $\Gamma_j$. We then observe that this quantity is unstable, in the sense that it becomes extremely large for blurry images obtained by convolving a cartoon image by a mollifier $\varphi_\delta = \frac{1}{\delta^2} \varphi(\frac{\cdot}{\delta})$ as $\delta \to 0$. In addition, this quantity does not control the number of subdomains in the partition.

4. A recent approach proposed in [43] defines anisotropic smoothness through the geometric smoothness properties of the level sets of the function $f$. In this approach the measure of smoothness is not simple to compute directly from $f$, since it involves each of its level sets and a smoothness measure of their local parametrization.

The results of [4, 27] and Chapter 2 describe the $L^p$-error of piecewise linear interpolation by an optimally adapted triangulation of at most $N$ elements, when $f$ is a $C^2$ function of two variables. This error is defined as
$$\sigma_N(f)_p := \inf_{\#(\mathcal{T}) \le N} \|f - I_{\mathcal{T}} f\|_{L^p}.$$
It is shown in [4] for $p = \infty$ and in Chapter 2 for all $1 \le p \le \infty$ that
$$\limsup_{N \to +\infty} N \sigma_N(f)_p \le C A_p(f), \qquad (4.5)$$
where $C$ is an absolute constant and
$$A_p(f) := \left\| \sqrt{|\det(d^2 f)|} \right\|_{L^\tau}, \quad \frac{1}{\tau} := 1 + \frac{1}{p}. \qquad (4.6)$$
Moreover, this estimate is known to be optimal, in the sense that $\liminf_{N \to +\infty} N \sigma_N(f)_p \ge c A_p(f)$ also holds, under some mild restriction on the class of triangulations in which one selects the optimal one. These results are extended in Chapter 2 to the case of higher order finite elements and space dimension $d > 2$, for which one can identify similar measures $f \mapsto A(f)$ governing the convergence estimate. Such quantities thus constitute natural candidates to measure anisotropic smoothness properties.
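The quantity $A_p(f)$ of (4.6) can be approximated on a grid. The sketch below is an illustration only, under several assumptions that are not in the text: the domain is the unit square, $f$ is sampled on a uniform grid with spacing `h`, second derivatives are obtained by applying `numpy.gradient` twice, and the integral is replaced by a crude average over grid points. The function name `anisotropic_smoothness` is hypothetical.

```python
import numpy as np

def anisotropic_smoothness(f_vals, h, p):
    """Approximate A_p(f) = || sqrt|det(d^2 f)| ||_{L^tau}, 1/tau = 1 + 1/p, on the unit square.

    f_vals[i, j] is assumed to hold f(x_i, y_j) on a uniform grid of spacing h.
    """
    tau = 1.0 / (1.0 + 1.0 / p)
    # First derivatives, then second derivatives, by finite differences.
    fx, fy = np.gradient(f_vals, h, edge_order=2)
    fxx, fxy = np.gradient(fx, h, edge_order=2)
    _, fyy = np.gradient(fy, h, edge_order=2)
    det_hessian = fxx * fyy - fxy ** 2
    # |det|^(tau/2) is the integrand of the L^tau quasi-norm of sqrt|det|.
    integrand = np.abs(det_hessian) ** (tau / 2.0)
    integral = integrand.mean()  # crude quadrature on a domain of area 1
    return integral ** (1.0 / tau)

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 65)
    X, Y = np.meshgrid(x, x, indexing="ij")
    h = x[1] - x[0]
    print("isotropic bowl  :", anisotropic_smoothness(X ** 2 + Y ** 2, h, p=2.0))
    print("univariate wave :", anisotropic_smoothness(np.sin(3.0 * X), h, p=2.0))
```

For $f = x^2 + y^2$ the Hessian is $2\,\mathrm{Id}$, so the exact value is $2$ for every $p$. For a function of $x$ alone the determinant vanishes identically, and the computed value is zero up to rounding in the finite differences.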
Note that $A_p(f)$ is not a semi-norm due to the presence of the determinant in (4.6), and in particular the quasi-triangle inequality $A_p(f + g) \le C(A_p(f) + A_p(g))$ does not hold even with $C > 1$. […] $A_p(f)$ defined by (4.6).

Since $A_p(f)$ is not a norm, we cannot associate a linear smoothness space to it by a standard completion process. We are thus facing a difficulty in extending the definition of $A_p(f)$ to functions which are not $C^2$-smooth, and in particular to cartoon images such as in item 3 above. Since we know from (4.4) that for such cartoon images the $L^2$ error of adaptive piecewise linear interpolation decays like $N^{-1}$, we would expect that the quantity
$$A_2(f) = \left\| \sqrt{|\det(d^2 f)|} \right\|_{L^{2/3}},$$
corresponding to the case $p = 2$, can be properly defined for piecewise smooth functions. We address this difficulty in §4.3: if $f$ is a cartoon image, we introduce its regularized version
$$f_\delta := f * \varphi_\delta, \qquad (4.7)$$
where $\varphi_\delta = \frac{1}{\delta^2} \varphi(\frac{\cdot}{\delta})$ is a standard mollifier. Our main result is the following: for any cartoon image $f$ of $C^2 - C^2$ type, the quantity $A_2(f_\delta)$ remains uniformly bounded as $\delta \to 0$ and one has
$$\lim_{\delta \to 0} A_2(f_\delta)^{2/3} = \sum_{i=1}^k \int_{\Omega_i} \left| \sqrt{|\det(d^2 f)|} \right|^{2/3} + C(\varphi) \sum_{j=1}^l \int_{\Gamma_j} |[f](s)|^{2/3} |\kappa(s)|^{1/3} \, ds, \qquad (4.8)$$
where $[f](s)$ and $\kappa(s)$ respectively denote the jump of $f$ and the curvature of $\Gamma_j$ at the point $s$, and where $C(\varphi)$ is a constant that only depends on the choice of the mollifier. This constant can be shown to be uniformly bounded from below for the class of radially decreasing mollifiers. This result reveals that $A_2(f)$ is stable under regularization of cartoon images (in contrast to the measure of smoothness described in item 3 above). We also discuss the behaviour of $A_p$ when $p \ne 2$.

These results lead us, in a subsequent section, to compare the quantity $A_2(f)$ and the total variation $\mathrm{TV}(f)$.
We also make some remarks on the existing links between the limit expression in (4.8) and classical results on adaptive approximation of curves, as well as with operators of affine-invariant image processing which also involve the power $1/3$ of the curvature. […] $A_2(f)$ with the total variation $\mathrm{TV}(f)$ as a model for plausible images.

One could be tempted to use the quantity $A_2(f)$ in place of the total variation $\mathrm{TV}(f)$ as a prior in image processing, and in particular to replace $|g|_{BV}$ by $A_2(g)$ in a restoration procedure such as (4.2). […] $A_2(f)$ in the framework of a Bayesian least-square estimator, as proposed in [70] in the case of the total variation. At the present stage, the algorithm implementing this approach did not give satisfactory results due to its very slow convergence. For this reason, we only present some numerical illustration of this approach in a simplified one-dimensional setting.

Eventually we describe in § […]

4.2. Anisotropic finite element approximation

A standard estimate in finite element approximation states that if $f \in W^{2,p}(\Omega)$ then
$$\|f - I_{\mathcal{T}_h} f\|_{L^p} \le C h^2 \|d^2 f\|_{L^p},$$
where $\mathcal{T}_h$ is a triangulation of mesh size $h := \max_{T \in \mathcal{T}_h} \operatorname{diam}(T)$. If we restrict our attention to a family of quasi-uniform triangulations, $h$ is linked with the complexity $N := \#(\mathcal{T}_h)$ according to
$$C_1 h^{-2} \le N \le C_2 h^{-2}.$$
Therefore, denoting by $\sigma^{\mathrm{unif}}_N(f)_{L^p}$ the $L^p$ approximation error by quasi-uniform triangulations of cardinality $N$, we can re-express the above estimate as
$$\sigma^{\mathrm{unif}}_N(f)_{L^p} \le C N^{-1} \|d^2 f\|_{L^p}. \qquad (4.9)$$
In order to explain how this estimate can be improved when using adaptive partitions, we first give some heuristic arguments which are based on the assumption that on each triangle $T$ the relative variation of $d^2 f$ is small, so that it can be considered as constant over $T$, which means that $f$ coincides with a quadratic function $q_T$ on each $T$.
Denoting by $I_T$ the local interpolation operator on a triangle $T$ and by $e_T(f)_p := \|f - I_T f\|_{L^p(T)}$ the local $L^p$ error, we thus have according to this heuristic
$$\|f - I_{\mathcal T} f\|_{L^p} = \Big( \sum_{T \in \mathcal T} e_T(f)_p^p \Big)^{1/p} = \Big( \sum_{T \in \mathcal T} e_T(q_T)_p^p \Big)^{1/p}.$$
We are thus led to study the local interpolation error $e_T(q)_p$ when $q \in \mathbb{P}_2$ is a quadratic polynomial. Denoting by $\mathbf{q}$ the homogeneous part of $q$, we remark that
$$e_T(q)_p = e_T(\mathbf{q})_p.$$
We optimize the shape of $T$ with respect to the quadratic form $\mathbf{q}$ by introducing a function $K_p$ defined on the space of quadratic forms by
$$K_p(\mathbf{q}) := \inf_{|T| = 1} e_T(\mathbf{q})_p,$$
where the infimum is taken among all triangles of area 1. It is easily seen that $e_T(\mathbf{q})_p$ is invariant by translation of $T$, and so is therefore the minimizing triangle if it exists. By homogeneity, it is also easily seen that
$$\inf_{|T| = a} e_T(\mathbf{q})_p = a^{1/\tau} K_p(\mathbf{q}), \qquad \frac{1}{\tau} = \frac{1}{p} + 1,$$
and that the minimizing triangle of area $a$ is obtained by rescaling the minimizing triangle of area 1 if it exists. Finally, it is easily seen that if $\phi$ is an invertible linear transform, then
$$K_p(\mathbf{q} \circ \phi) = |\det(\phi)|\, K_p(\mathbf{q}),$$
and that the minimizing triangle of area $|\det(\phi)|^{-1}$ for $\mathbf{q} \circ \phi$ is obtained by application of $\phi^{-1}$ to the minimizing triangle of area 1 for $\mathbf{q}$ if it exists. If $\det(\mathbf{q}) \neq 0$, there exists a $\phi$ such that $\mathbf{q} \circ \phi$ is either $x^2 + y^2$ or $x^2 - y^2$ up to a sign change, and we have $|\det(\mathbf{q})| = |\det(\phi)|^{-2}$. It follows that $K_p(\mathbf{q})$ has the simple form
$$K_p(\mathbf{q}) = \sigma\, |\det(\mathbf{q})|^{1/2}, \tag{4.10}$$
where $\sigma$ is a constant equal to $K_p(x^2 + y^2)$ if $\det(\mathbf{q}) > 0$ and to $K_p(x^2 - y^2)$ if $\det(\mathbf{q}) < 0$. The formula remains valid when $\det(\mathbf{q}) = 0$, in which case $K_p(\mathbf{q}) = 0$.
If the triangulation $\mathcal T$ is such that all its triangles $T$ have optimized shape in the above sense with respect to the quadratic form $\mathbf{q}_T$ associated with $q_T$, we thus have for any triangle $T \in \mathcal T$
$$e_T(f)_p = e_T(q_T)_p = |T|^{1/\tau} K_p(\mathbf{q}_T) = \Big\| K_p\Big( \frac{d^2 f}{2} \Big) \Big\|_{L^\tau(T)},$$
since we have assumed $\frac{d^2 f}{2} = \mathbf{q}_T$ on $T$. In order to optimize the trade-off between the global error and the complexity $N = \#(\mathcal T)$, we apply the principle of error equidistribution: the triangles $T$ have area such that all errors $e_T(q_T)_p$ are equal, i.e. $e_T(q_T)_p = \eta$ for some $\eta > 0$ independent of $T$. It follows that
$$N \eta^\tau \le \Big\| K_p\Big( \frac{d^2 f}{2} \Big) \Big\|^\tau_{L^\tau(\Omega)},$$
and therefore
$$\sigma_N(f)_p \le \|f - I_{\mathcal T} f\|_{L^p} \le N^{1/p}\, \eta \le \Big\| K_p\Big( \frac{d^2 f}{2} \Big) \Big\|_{L^\tau(\Omega)} N^{-1},$$
which according to (4.10) implies
$$\sigma_N(f)_p \le C N^{-1} A_p(f), \tag{4.11}$$
with $A_p$ defined as in (4.6).

The estimate (4.11) is too optimistic to be correct: if $f$ is a univariate function then $A_p(f) = 0$ while $\sigma_N(f)_p$ may not vanish. In a rigorous derivation such as in [4] and Chapter 2, one observes that if $f \in C^2$, the replacement of $d^2 f$ by a constant over $T$ induces an error which becomes negligible only when the triangles are sufficiently small. A correct statement is therefore that for any $\varepsilon > 0$ there exists $N_0 = N_0(f, \varepsilon)$ such that
$$\sigma_N(f)_p \le N^{-1} \Big( \Big\| K_p\Big( \frac{d^2 f}{2} \Big) \Big\|_{L^\tau(\Omega)} + \varepsilon \Big), \tag{4.12}$$
for all $N \ge N_0$, i.e.
$$\limsup_{N \to +\infty} N \sigma_N(f)_p \le \Big\| K_p\Big( \frac{d^2 f}{2} \Big) \Big\|_{L^\tau(\Omega)}, \tag{4.13}$$
which according to (4.10) implies (4.5).
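To make the quantity $A_p(f)$ of (4.6) concrete, the following Python sketch evaluates it on a uniform grid of the unit square with centred finite differences. The function name `anisotropic_smoothness`, the grid layout and the quadrature (a plain Riemann sum over interior nodes) are assumptions made for illustration, not the discretization used in [4] or Chapter 2. It reproduces the remark above: a function of one variable has vanishing $A_p(f)$ although it is not affine.

```python
import numpy as np


def anisotropic_smoothness(u, h, p=2.0):
    """Approximate A_p(f) = || sqrt|det(d^2 f)| ||_{L^tau}, with 1/tau = 1 + 1/p,
    from samples u[i, j] = f(i*h, j*h). Illustrative discretization only."""
    tau = 1.0 / (1.0 + 1.0 / p)
    c = u[1:-1, 1:-1]
    # Centred second differences at interior nodes.
    uxx = (u[2:, 1:-1] - 2.0 * c + u[:-2, 1:-1]) / h**2
    uyy = (u[1:-1, 2:] - 2.0 * c + u[1:-1, :-2]) / h**2
    uxy = (u[2:, 2:] + u[:-2, :-2] - u[2:, :-2] - u[:-2, 2:]) / (4.0 * h**2)
    det = uxx * uyy - uxy**2
    integrand = np.sqrt(np.abs(det)) ** tau
    # Riemann sum over interior nodes, then the 1/tau power.
    return float((integrand.sum() * h**2) ** (1.0 / tau))


n = 101
h = 1.0 / (n - 1)
x, y = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n), indexing="ij")

bowl = anisotropic_smoothness(x**2 + y**2, h)       # det(d^2 f) = 4 everywhere
ridge = anisotropic_smoothness(np.sin(3.0 * x), h)  # univariate: det(d^2 f) = 0
print(bowl, ridge)
```

For the paraboloid the value is close to 2 (up to the boundary layer that the interior sum leaves out), while for the univariate function it vanishes up to rounding.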
4.3 Piecewise smooth functions and images

As already observed, the quantities $A_p(f)$ are well defined for functions $f \in C^2$, but we expect that they should in some sense also be well defined for functions representing $C^2$-$C^2$ "cartoon images" when $p \le 2$. We first give a precise definition of such functions.

Definition 4.3.1. A cartoon function on an open set $\Omega$ is a function almost everywhere of the form
$$f = \sum_{1 \le i \le k} f_i\, \chi_{\Omega_i},$$
where the $\Omega_i$ are disjoint open sets with piecewise $C^2$ boundary, no cusps (i.e. satisfying an interior and exterior cone condition), and such that $\overline{\Omega} = \cup_{i=1}^{k} \overline{\Omega_i}$, and where the function $f_i$ is $C^2$ on $\overline{\Omega_i}$ for each $1 \le i \le k$.

Let us consider a fixed cartoon function $f$ on an open polygonal domain $\Omega$ (i.e. $\Omega$ is such that $\overline{\Omega}$ is a closed polygon) associated with a decomposition $(\Omega_i)_{1 \le i \le k}$. We define
$$\Gamma := \bigcup_{1 \le i \le k} \partial\Omega_i, \tag{4.14}$$
the union of the boundaries of the $\Omega_i$. The above definition implies that $\Gamma$ is the disjoint union of a finite set of points $\mathcal P$ and a finite number of open curves $(\Gamma_i)_{1 \le i \le l}$:
$$\Gamma = \Big( \bigcup_{1 \le i \le l} \Gamma_i \Big) \cup \mathcal P.$$
Furthermore, for all $1 \le i < j \le l$, we may impose that $\overline{\Gamma_i} \cap \overline{\Gamma_j} \subset \mathcal P$ (this may be ensured by a splitting of some of the $\Gamma_i$ if necessary).

We now consider the piecewise linear interpolation $I_{\mathcal T_N} f$ of $f$ on a triangulation $\mathcal T_N$ of cardinality $N$. We distinguish two types of elements of $\mathcal T_N$. A triangle $T \in \mathcal T_N$ is called "regular" if $T \cap \Gamma = \emptyset$, and we denote the set of such triangles by $\mathcal T^r_N$. Other triangles are called "edgy" and their set is denoted by $\mathcal T^e_N$. We can thus split $\Omega$ according to
$$\Omega := \big( \cup_{T \in \mathcal T^r_N} T \big) \cup \big( \cup_{T \in \mathcal T^e_N} T \big) = \Omega^r_N \cup \Omega^e_N.$$
We split accordingly the $L^p$ interpolation error into
$$\|f - I_{\mathcal T_N} f\|^p_{L^p(\Omega)} = \int_{\Omega^r_N} |f - I_{\mathcal T_N} f|^p + \int_{\Omega^e_N} |f - I_{\mathcal T_N} f|^p.$$
We may use $O(N)$ triangles in $\mathcal T^e_N$ and $\mathcal T^r_N$ (for example $N/2$ in each set). Since $f$ has discontinuities along $\Gamma$, the $L^\infty$ interpolation error on $\Omega^e_N$ does not tend to zero, and $\mathcal T^e_N$ should be chosen so that $\Omega^e_N$ has the aspect of a thin layer around $\Gamma$. Since $\Gamma$ is a finite union of $C^2$ curves, we can build this layer of width $O(N^{-2})$ and therefore of global area $|\Omega^e_N| \le C N^{-2}$, by choosing long and thin triangles in $\mathcal T^e_N$. On the other hand, since $f$ is uniformly $C^2$ on $\Omega^r_N$, we may choose all triangles in $\mathcal T^r_N$ of regular shape and diameter $h_T \le C N^{-1/2}$. Hence we obtain the following heuristic error estimate, for a well designed anisotropic triangulation:
$$\begin{aligned}
\|f - I_{\mathcal T_N} f\|_{L^p(\Omega)} &= \Big( \|f - I_{\mathcal T_N} f\|^p_{L^p(\Omega^r_N)} + \|f - I_{\mathcal T_N} f\|^p_{L^p(\Omega^e_N)} \Big)^{1/p} \\
&\le \Big( \|f - I_{\mathcal T_N} f\|^p_{L^\infty(\Omega^r_N)} |\Omega^r_N| + \|f - I_{\mathcal T_N} f\|^p_{L^\infty(\Omega^e_N)} |\Omega^e_N| \Big)^{1/p} \\
&\le C \big( N^{-p} + N^{-2} \big)^{1/p},
\end{aligned}$$
and therefore
$$\|f - I_{\mathcal T_N} f\|_{L^p(\Omega)} \le C N^{-\min\{1,\, 2/p\}}, \tag{4.15}$$
where the constant $C$ depends on $\|d^2 f\|_{L^\infty(\Omega \setminus \Gamma)}$, $\|f\|_{L^\infty(\Omega)}$ and on the number, length and maximal curvature of the $C^2$ curves which constitute $\Gamma$.

Observe in particular that the error is dominated by the edge term $\|f - I_{\mathcal T_N} f\|_{L^p(\Omega^e_N)}$ when $p > 2$ and by the smooth term $\|f - I_{\mathcal T_N} f\|_{L^p(\Omega^r_N)}$ when $p < 2$. For the critical value $p = 2$ the two terms have the same order.

For $p \le 2$, we obtain the approximation rate $N^{-1}$, which suggests that approximation results such as (4.5) should also apply to cartoon functions and that the quantity $A_p(f)$ should be finite. We would therefore like to bridge the gap between anisotropic approximation of cartoon functions and smooth functions. For this purpose, we first need to give a proper meaning to $A_p(f)$ when $f$ is a cartoon function.
This is not straightforward, due to the fact that the product of two distributions has no meaning in general. Therefore, we cannot define $\det(d^2 f)$ in a distributional sense when the coefficients of $d^2 f$ are distributions without sufficient smoothness. Our approach will rather be based on regularisation. This is additionally justified by the fact that sharp curves of discontinuity are a mathematical idealisation. In real world applications, such as photography, several physical limitations (depth of field, optical blurring) impose a certain level of blur on the edges.

In the following, we consider a fixed radial nonnegative function $\varphi$ of unit integral and supported in the unit ball, and we define for all $\delta > 0$ and $f$ defined on $\Omega$,
$$\varphi_\delta(z) := \frac{1}{\delta^2}\, \varphi\Big( \frac{z}{\delta} \Big) \quad \text{and} \quad f_\delta = f * \varphi_\delta. \tag{4.16}$$
Our main result gives a meaning to $A_p(f)$ based on this regularization. If $f$ is a cartoon function on a set $\Omega$, and if $x \in \Gamma \setminus \mathcal P$, we denote by $[f](x)$ the jump of $f$ at this point. We also denote by $\mathbf t(x)$ and $\mathbf n(x)$ the unit tangent and normal vectors to $\Gamma$ at $x$, oriented in such a way that $\det(\mathbf t, \mathbf n) = +1$, and by $\kappa(x)$ the curvature at $x$, which is defined by the relation
$$\partial_{\mathbf t(x)} \mathbf t(x) = \kappa(x)\, \mathbf n(x).$$
For $p \in [1, \infty]$ and $\tau$ defined by $\frac{1}{\tau} := 1 + \frac{1}{p}$, we introduce the two quantities
$$S_p(f) := \big\| \sqrt{|\det(d^2 f)|} \big\|_{L^\tau(\Omega \setminus \Gamma)} = A_p(f_{|\Omega \setminus \Gamma}), \qquad E_p(f) := \big\| \sqrt{|\kappa|}\, [f] \big\|_{L^\tau(\Gamma)},$$
which respectively measure the "smooth part" and the "edge part" of $f$. We also introduce the constant
$$C_{p,\varphi} := \big\| \sqrt{|\Phi \Phi'|} \big\|_{L^\tau(\mathbb R)}, \quad \text{where } \Phi(x) := \int_{y \in \mathbb R} \varphi(x, y)\, dy. \tag{4.17}$$
Note that $f_\delta$ is only properly defined on the set
$$\Omega^\delta := \{ z \in \Omega \,;\ B(z, \delta) \subset \Omega \},$$
and therefore we define $A_p(f_\delta)$ as the $L^\tau$ norm of $\sqrt{|\det(d^2 f_\delta)|}$ on this set.

Theorem 4.3.2.
For all cartoon functions $f$, the quantity $A_p(f_\delta)$ behaves as follows:
– If $p < 2$, then
$$\lim_{\delta \to 0} A_p(f_\delta) = S_p(f).$$
– If $p = 2$, then $\tau = \frac{2}{3}$ and
$$\lim_{\delta \to 0} A_2(f_\delta) = \Big( S_2(f)^\tau + E_2(f)^\tau C^\tau_{2,\varphi} \Big)^{1/\tau}. \tag{4.18}$$
– If $p > 2$, then $A_p(f_\delta) \to \infty$ according to
$$\lim_{\delta \to 0} \delta^{\frac{1}{2} - \frac{1}{p}} A_p(f_\delta) = E_p(f)\, C_{p,\varphi}. \tag{4.19}$$

Remark 4.3.3. This theorem reveals that as $\delta \to 0$, the contribution of the neighbourhood of $\Gamma$ to $A_p(f_\delta)$ is negligible when $p < 2$ and dominant when $p > 2$, which was already remarked in the heuristic computation leading to (4.15).

Remark 4.3.4. It seems to be possible to eliminate the "no cusps" condition in the definition of cartoon functions, while still retaining the validity of this theorem. It also seems possible to take the more natural choice $\varphi(z) = \frac{1}{\pi} e^{-\|z\|^2}$, which is not compactly supported. However, both require higher technicality in the proof, which we avoid here.

Before attacking the proof of Theorem 4.3.2, we show below that the constant $C_{p,\varphi}$ involved in the result for $p \ge 2$ is bounded from below, uniformly over the class of radially decreasing mollifiers.

Proposition 4.3.5. Let $\varphi$ be a radial and positive function supported on the unit ball such that $\int \varphi = 1$ and that $\varphi(x)$ decreases as $|x|$ increases. For any $p \ge 2$ we have
$$C_{p,\varphi} \ge \frac{2}{\pi} \Big( \frac{4}{\tau + 2} \Big)^{1/\tau},$$
and this lower bound is optimal. There is no such bound if $p < 2$, but note that Theorem 4.3.2 does not involve $C_{p,\varphi}$ for $p < 2$.

Proof: Let $D$ be the unit disc of $\mathbb R^2$. We define a non-smooth mollifier $\psi$ and a function $\Psi$ as follows:
$$\psi := \frac{\chi_D}{\pi} \quad \text{and} \quad \Psi(x) := \int_{\mathbb R} \psi(x, y)\, dy.$$
One easily obtains that
$$\Psi(x) = \frac{2}{\pi} \sqrt{1 - x^2}\ \chi_{[-1,1]}(x) \quad \text{and} \quad \Psi'(x) = \frac{-2x}{\pi \sqrt{1 - x^2}}\ \chi_{[-1,1]}(x), \quad \text{hence} \quad \Psi(x) \Psi'(x) = \frac{-4x}{\pi^2}\, \chi_{[-1,1]}(x).$$
For all $\delta > 0$ we define $\psi_\delta := \delta^{-2} \psi(\delta^{-1} \cdot)$ and $\Psi_\delta(x) := \int_{\mathbb R} \psi_\delta(x, y)\, dy$. Similarly we obtain
$$\Psi_\delta(x) \Psi'_\delta(x) = \frac{-4x}{\pi^2}\, \chi_{[-\delta, \delta]}(x)\, \delta^{-4}.$$
Therefore
$$C_{p,\psi_\delta} = \Big\| \sqrt{\Psi_\delta(x)\, |\Psi'_\delta(x)|} \Big\|_{L^\tau(\mathbb R)} = \frac{2}{\pi \delta^2} \Big( \int_{-\delta}^{\delta} |x|^{\tau/2}\, dx \Big)^{1/\tau} = \frac{2}{\pi \delta^2} \Big( \frac{2\, \delta^{\tau/2 + 1}}{\tau/2 + 1} \Big)^{1/\tau} = \frac{2}{\pi} \Big( \frac{4}{\tau + 2} \Big)^{1/\tau} \delta^{\frac{1}{p} - \frac{1}{2}}.$$
Note that if $p \ge 2$ and $\delta \in (0, 1]$ then
$$C_{p,\psi_\delta} \ge C_{p,\psi} = \frac{2}{\pi} \Big( \frac{4}{\tau + 2} \Big)^{1/\tau}. \tag{4.20}$$
The mollifier $\varphi$ of interest is radially decreasing, has unit integral and is supported on the unit ball. It follows that there exists a positive measure $\mu$ of unit mass on $(0, 1]$ such that
$$\varphi = \int_0^1 \psi_\delta\, d\mu(\delta).$$
Hence $\Phi(x) := \int_{\mathbb R} \varphi(x, y)\, dy = \int_0^1 \Psi_\delta(x)\, d\mu(\delta)$ for any $x \in \mathbb R$. Since $s \mapsto s^\tau$ is concave on $\mathbb R_+$ when $0 < \tau \le 1$, we obtain
$$\Phi(x)^\tau = \Big( \int_0^1 \Psi_\delta(x)\, d\mu(\delta) \Big)^\tau \ge \int_0^1 \Psi_\delta(x)^\tau\, d\mu(\delta).$$
Similarly, since the sign of $\Psi'_\delta(x)$ is independent of $\delta$,
$$|\Phi'(x)|^\tau = \Big( \int_0^1 |\Psi'_\delta(x)|\, d\mu(\delta) \Big)^\tau \ge \int_0^1 |\Psi'_\delta(x)|^\tau\, d\mu(\delta).$$
Applying the Cauchy-Schwarz inequality we obtain
$$\sqrt{\Phi(x)\, |\Phi'(x)|}^{\,\tau} \ge \sqrt{ \Big( \int_0^1 \Psi_\delta(x)^\tau\, d\mu(\delta) \Big) \Big( \int_0^1 |\Psi'_\delta(x)|^\tau\, d\mu(\delta) \Big) } \ge \int_0^1 \sqrt{\Psi_\delta(x)\, |\Psi'_\delta(x)|}^{\,\tau}\, d\mu(\delta).$$
Finally, using the previous inequality and (4.20), we obtain
$$C^\tau_{p,\varphi} = \int_{\mathbb R} \sqrt{\Phi\, |\Phi'|}^{\,\tau} \ge \int_0^1 \Big( \int_{\mathbb R} \sqrt{\Psi_\delta\, |\Psi'_\delta|}^{\,\tau} \Big)\, d\mu(\delta) = \int_0^1 C^\tau_{p,\psi_\delta}\, d\mu(\delta) \ge C^\tau_{p,\psi},$$
which concludes the proof of this proposition. $\diamond$

The rest of this section is devoted to the proof of Theorem 4.3.2. Since it is rather involved, we split its presentation into several main steps.

Step 1: decomposition of $A_p(f_\delta)$. Using the notation $K(M) := \sqrt{|\det M|}$, we can write
$$A_p(f_\delta)^\tau = \int_{\Omega^\delta} K(d^2 f_\delta)^\tau.$$
(4.21)
We decompose this quantity based on a partition of $\Omega^\delta$ into three subsets
$$\Omega^\delta = \Omega_\delta \cup \Gamma_\delta \cup \mathcal P_\delta.$$
The first set $\Omega_\delta$ corresponds to the smooth part:
$$\Omega_\delta := \bigcup_{1 \le i \le k} \Omega_{i,\delta}, \quad \text{where } \Omega_{i,\delta} := \{ z \in \Omega_i \,;\ d(z, \Omega \setminus \Omega_i) > \delta \}.$$
Note that $\Omega_\delta$ is strictly contained in $\Omega^\delta$. The second set corresponds to the edge part: we first define
$$\Gamma^0_\delta := \bigcup_{1 \le j \le l} \Gamma^0_{j,\delta}, \quad \text{where } \Gamma^0_{j,\delta} := \{ z \in \Gamma_j \,;\ d(z, \Gamma \setminus \Gamma_j) > \delta \},$$
and then set
$$\Gamma_\delta := \bigcup_{1 \le j \le l} \Gamma_{j,\delta}, \quad \text{where } \Gamma_{j,\delta} := \{ z \in \Omega \,;\ d(z, \Gamma) < \delta \text{ and } \pi_\Gamma(z) \in \Gamma^0_{j,\delta} \},$$
where $\pi_\Gamma(z)$ denotes the point of $\Gamma$ which is the closest to $z$. The third set corresponds to the corner part:
$$\mathcal P_\delta := \Omega^\delta \setminus (\Omega_\delta \cup \Gamma_\delta).$$
The measures of the sets $\Gamma_\delta$ and $\mathcal P_\delta$ tend to 0 as $\delta \to 0$, while $|\Omega_\delta|$ tends to $|\Omega|$. More precisely, we have
$$|\Gamma_\delta| \le C \delta \quad \text{and} \quad |\mathcal P_\delta| \le C \delta^2,$$
where the last estimate exploits the "no cusps" property of the cartoon function. We analyze separately the contributions of these three sets to (4.21).

Step 2: contribution of the smooth part $\Omega_\delta$. The contribution of $\Omega_\delta$ to the integral (4.21) is easily measured. Indeed, let us define
$$Q_\delta(z) := \begin{cases} K(d^2 f_\delta(z))^\tau & \text{if } z \in \Omega_\delta, \\ 0 & \text{otherwise.} \end{cases}$$
As $\delta \to 0$, we have the pointwise convergence $Q_\delta(z) \to K(d^2 f(z))^\tau$ on $\Omega \setminus \Gamma$. Since the $\delta$-neighbourhood of $\Omega_\delta$ is included in $\Omega \setminus \Gamma$, we have
$$\|d^2 (f * \varphi_\delta)\|_{L^\infty(\Omega_\delta)} = \|(d^2 f) * \varphi_\delta\|_{L^\infty(\Omega_\delta)} \le \|d^2 f\|_{L^\infty(\Omega \setminus \Gamma)} \|\varphi_\delta\|_{L^1} = \|d^2 f\|_{L^\infty(\Omega \setminus \Gamma)}.$$
Since $K(M) = \sqrt{|\det M|} \le \|M\|$, we thus have $K(d^2 f_\delta) \le \|d^2 f\|_{L^\infty(\Omega \setminus \Gamma)}$ on $\Omega_\delta$, and we conclude by dominated convergence that
$$\lim_{\delta \to 0} \int_{\Omega_\delta} K(d^2 f_\delta)^\tau = \lim_{\delta \to 0} \int_{\Omega \setminus \Gamma} Q_\delta = \int_{\Omega \setminus \Gamma} K(d^2 f)^\tau.$$

Step 3: contribution of the corner part $\mathcal P_\delta$. We only need a rough upper estimate of the contribution of $\mathcal P_\delta$ to the integral (4.21).
We observe that
$$\|d^2 (f * \varphi_\delta)\|_{L^\infty(\Omega^\delta)} = \|f * (d^2 \varphi_\delta)\|_{L^\infty(\Omega^\delta)} \le \|f\|_{L^\infty(\Omega)} \|d^2 \varphi_\delta\|_{L^1(\mathbb R^2)} = \frac{M}{\delta^2},$$
where $M := \|f\|_{L^\infty(\Omega)} \|d^2 \varphi\|_{L^1(\mathbb R^2)}$. It follows that
$$\int_{\mathcal P_\delta} K(d^2 f_\delta)^\tau \le |\mathcal P_\delta| \Big( \frac{M}{\delta^2} \Big)^\tau \le C \delta^{2 - 2\tau}.$$
If $\tau < 1$, this quantity tends to 0 and is therefore negligible compared to the contribution of the smooth part. If $\tau = 1$, which corresponds to $p = \infty$, our further analysis shows that the contribution of the edge part tends to $+\infty$, and therefore the contribution of the corner part is always negligible.

Step 4: contribution of the edge part $\Gamma_\delta$. This step is the main difficulty of the proof. We make key use of an asymptotic analysis of $f_\delta$ on $\Gamma_\delta$, which relates its second derivatives to the jump $[f]$ and the curvature $\kappa$ as $\delta \to 0$. We first define, for all $\delta > 0$, the map
$$U_\delta : (\Gamma \setminus \mathcal P) \times [-1, 1] \to \Omega, \qquad (x, u) \mapsto x + \delta u\, \mathbf n(x).$$
We notice that according to our definitions, for $\delta$ small enough, the map $U_\delta$ induces a diffeomorphism between $\Gamma^0_\delta \times [-1, 1]$ and $\Gamma_\delta$, such that
$$\pi_\Gamma(U_\delta(x, u)) = x \quad \text{and} \quad d(U_\delta(x, u), \Gamma) = |U_\delta(x, u) - x| = \delta |u|.$$
We establish asymptotic estimates on the second derivatives of $f_\delta$, which have the following form:
$$\Big| \partial_{\mathbf n, \mathbf n} f_\delta(z) - \frac{1}{\delta^2} [f](x)\, \Phi'(u) \Big| \le \frac{C}{\delta}, \tag{4.22}$$
$$|\partial_{\mathbf n, \mathbf t} f_\delta(z)| \le \frac{C}{\delta}, \tag{4.23}$$
$$\Big| \partial_{\mathbf t, \mathbf t} f_\delta(z) + \frac{1}{\delta} [f](x)\, \kappa(x)\, \Phi(u) \Big| \le \frac{\omega(\delta)}{\delta}, \tag{4.24}$$
where $\lim_{\delta \to 0} \omega(\delta) = 0$ and with the notation $z = U_\delta(x, u)$. The constant $C$ and the function $\omega$ depend only on $f$. The proof of these estimates is given in the appendix.
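The profile $\Phi$ of (4.17), which drives the estimates (4.22)-(4.24), is explicit for simple mollifiers. As a hedged numerical check, the sketch below takes the normalised indicator of the unit disc (not a smooth mollifier, so outside the assumptions made on $\varphi$; this choice and the function names are illustrative assumptions). For it, $\sqrt{|\Phi \Phi'|} = \frac{2}{\pi} \sqrt{|x|}$ on $[-1, 1]$, and direct integration gives $C_{p,\varphi} = \frac{2}{\pi} \big( \frac{4}{\tau + 2} \big)^{1/\tau}$, which the quadrature is compared against.

```python
import math

import numpy as np


def tau_of(p):
    """Exponent tau defined by 1/tau = 1 + 1/p (p may be math.inf)."""
    return 1.0 / (1.0 + 1.0 / p)


def disc_constant_quadrature(p, n=400001):
    """Trapezoidal approximation of || sqrt|Phi Phi'| ||_{L^tau(R)} for the
    disc mollifier, where sqrt|Phi Phi'| = (2/pi) sqrt|x| on [-1, 1]."""
    tau = tau_of(p)
    x = np.linspace(-1.0, 1.0, n)
    y = ((2.0 / math.pi) * np.sqrt(np.abs(x))) ** tau
    integral = float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)
    return integral ** (1.0 / tau)


def disc_constant_closed_form(p):
    """Closed form (2/pi) * (4 / (tau + 2))**(1/tau)."""
    tau = tau_of(p)
    return (2.0 / math.pi) * (4.0 / (tau + 2.0)) ** (1.0 / tau)


for p in (1.0, 2.0, 4.0, math.inf):
    print(p, disc_constant_quadrature(p), disc_constant_closed_form(p))
```

The two columns agree to the accuracy of the quadrature, which is limited by the square-root singularity of the integrand at the origin.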
As an immediate consequence, we obtain an asymptotic estimate of $K(d^2 f_\delta) = \sqrt{|\det(d^2 f_\delta)|}$ of the form
$$\Big| \delta^{3/2} K(d^2 f_\delta(z)) - \sqrt{|\kappa(x)|}\ |[f](x)|\ \sqrt{|\Phi(u) \Phi'(u)|} \Big| \le \tilde\omega(\delta), \tag{4.25}$$
where $\lim_{\delta \to 0} \tilde\omega(\delta) = 0$, and the function $\tilde\omega$ depends only on $f$. Using the notations
$$g_\delta(z) := \delta^{3/2} K(d^2 f_\delta(z)), \qquad \lambda(x) := \sqrt{|\kappa(x)|}\ |[f](x)|, \qquad \mu(u) := \sqrt{|\Phi(u) \Phi'(u)|},$$
we thus have
$$|g_\delta(z) - \lambda(x) \mu(u)| \le \tilde\omega(\delta), \tag{4.26}$$
for all $x \in \Gamma^0_\delta$, $u \in [-1, 1]$ and $\delta > 0$, where $z = U_\delta(x, u)$. We claim that for any continuous functions $(g_\delta, \lambda, \mu)$ satisfying (4.26), we have for any $\tau > 0$
$$\lim_{\delta \to 0} \delta^{-1} \int_{\Gamma_\delta} g_\delta^\tau = \int_\Gamma \lambda(x)^\tau\, dx \int_{-1}^{1} \mu(u)^\tau\, du, \tag{4.27}$$
which is in our case equivalent to the estimate
$$\lim_{\delta \to 0} \delta^{\frac{3\tau}{2} - 1} \int_{\Gamma_\delta} K(d^2 f_\delta)^\tau = \int_\Gamma |[f]|^\tau |\kappa|^{\tau/2} \int_{\mathbb R} |\Phi \Phi'|^{\tau/2}. \tag{4.28}$$
In order to prove (4.27), we may assume without loss of generality that $\tau = 1$, up to replacing $(g_\delta, \lambda, \mu)$ by $(g_\delta^\tau, \lambda^\tau, \mu^\tau)$. We first express the Jacobian matrix of $U_\delta$ using the bases $B_0 = ((\mathbf t(x), 0), (0, 1))$ and $B_1 = (\mathbf t(x), \mathbf n(x))$ for the tangent spaces of $\Gamma \times [-1, 1]$ and $\Omega$ respectively:
$$[dU_\delta(x, u)]_{B_0, B_1} = \begin{pmatrix} 1 - \delta u \kappa(x) & 0 \\ 0 & \delta \end{pmatrix},$$
and therefore $|\det([dU_\delta(x, u)]_{B_0, B_1})| = \delta - \delta^2 u \kappa(x)$. Since $B_0$ and $B_1$ are orthonormal bases, this quantity is the Jacobian of $U_\delta$ at $(x, u)$, and therefore
$$\int_{\Gamma_\delta} g_\delta = \delta \int_{\Gamma^0_\delta \times [-1, 1]} g_\delta(x + \delta u\, \mathbf n(x))\, (1 - \delta u \kappa(x))\, dx\, du.$$
Combining with (4.26), and using dominated convergence, we obtain (4.27).

Step 5: summation of the different contributions.
Summing up the contributions of $\Omega_\delta$, $\mathcal P_\delta$ and $\Gamma_\delta$, we reach the estimate
$$\begin{aligned}
\int_{\Omega^\delta} K(d^2 f_\delta)^\tau &= \int_{\Omega_\delta} K(d^2 f_\delta)^\tau + \int_{\mathcal P_\delta} K(d^2 f_\delta)^\tau + \int_{\Gamma_\delta} K(d^2 f_\delta)^\tau \\
&= \Big( \int_{\Omega \setminus \Gamma} K(d^2 f)^\tau + \varepsilon_1(\delta) \Big) + B(\delta)\, \delta^{2 - 2\tau} + \delta^{1 - \frac{3\tau}{2}} \Big( \int_\Gamma |[f]|^\tau |\kappa|^{\tau/2} \int_{\mathbb R} |\Phi \Phi'|^{\tau/2} + \varepsilon_2(\delta) \Big) \\
&= \big( S_p(f)^\tau + \varepsilon_1(\delta) \big) + B(\delta)\, \delta^{2 - 2\tau} + \delta^{1 - \frac{3\tau}{2}} \big( E_p(f)^\tau C^\tau_{p,\varphi} + \varepsilon_2(\delta) \big),
\end{aligned}$$
where $\lim_{\delta \to 0} \varepsilon_1(\delta) = \lim_{\delta \to 0} \varepsilon_2(\delta) = 0$ and $B(\delta)$ is uniformly bounded. This concludes the proof of Theorem 4.3.2.

4.4 Relation with other works

Theorem 4.3.2 allows us to extend the definition of $A_2(f)$ when $f$ is a cartoon function, according to
$$A_2(f) := \Big( S_2(f)^{2/3} + E_2(f)^{2/3} C^{2/3}_{2,\varphi} \Big)^{3/2}. \tag{4.29}$$
We first compare this additive form with the total variation $\mathrm{TV}(f)$. If $f$ is a cartoon function, its total variation has the additive form
$$\mathrm{TV}(f) := \|\nabla f\|_{L^1(\Omega \setminus \Gamma)} + \|[f]\|_{L^1(\Gamma)}. \tag{4.30}$$
Both (4.29) and (4.30) include a "smooth term" and an "edge term". It is interesting to compare the edge term of $A_2(f)$, which is given by
$$E_2(f) = \big\| \sqrt{|\kappa|}\, [f] \big\|_{L^{2/3}(\Gamma)},$$
up to the multiplicative constant $C_{2,\varphi}$, with the one of $\mathrm{TV}(f)$, which is simply the integral of the jump
$$J(f) := \|[f]\|_{L^1(\Gamma)}.$$
Both terms are 1-homogeneous with respect to the value of the jump of the function $f$. In particular, if the value of this jump is 1 (for example when $f$ is the characteristic function of a set of boundary $\Gamma$), we have
$$E_2(f) = \Big( \int_\Gamma |\kappa|^{1/3} \Big)^{3/2}, \tag{4.31}$$
while $J(f)$ coincides with the length of $\Gamma$. In summary, $A_2(f)$ takes into account the smoothness of edges, through their curvature $\kappa$, while $\mathrm{TV}(f)$ only takes into account their length.

Let us now investigate more closely the measure of smoothness of edges which is incorporated in $A_2(f)$.
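As a concrete instance of (4.31), the sketch below evaluates the edge term $E_2(f) = (\int_\Gamma |\kappa|^{1/3})^{3/2}$ and the jump term $J(f) = |\Gamma|$ when $f$ is the characteristic function of an ellipse. The parametrisation, the helper name `edge_terms` and the periodic rectangle rule are assumptions made for illustration. For semi-axes $a, b$ the integrand $|\kappa|^{1/3}\, ds$ reduces to $(ab)^{1/3}\, dt$, so that $E_2(f) = (2\pi)^{3/2} \sqrt{ab}$ depends only on the enclosed area, whereas the length grows with the eccentricity.

```python
import numpy as np


def edge_terms(a, b, n=20000):
    """Return (E_2, length) for the indicator of the ellipse with semi-axes a, b,
    parametrised by gamma(t) = (a cos t, b sin t). Illustrative sketch."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    dt = 2.0 * np.pi / n
    dx, dy = -a * np.sin(t), b * np.cos(t)      # gamma'(t)
    ddx, ddy = -a * np.cos(t), -b * np.sin(t)   # gamma''(t)
    speed = np.hypot(dx, dy)
    kappa = (dx * ddy - dy * ddx) / speed**3    # signed curvature
    length = float(np.sum(speed) * dt)
    e2 = float(np.sum(np.abs(kappa) ** (1.0 / 3.0) * speed) * dt) ** 1.5
    return e2, length


e2_circle, len_circle = edge_terms(1.0, 1.0)
e2_thin, len_thin = edge_terms(4.0, 0.25)   # same area, strongly elongated
print(e2_circle, len_circle)
print(e2_thin, len_thin)
```

Both shapes enclose the same area and give the same edge term, while the jump term of the elongated ellipse is noticeably larger.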
According to (4.31), this smoothness is meant in the sense that the arc length parametrizations of the curves that constitute $\Gamma$ admit second order derivatives in $L^{1/3}$. In the following, we show that this particular measure of smoothness is naturally related to some known results in two different areas: adaptive approximation of curves and affine-invariant image processing.

We first revisit the derivation of the heuristic estimate (4.15) for the error between a cartoon function and its linear interpolation on an optimally adapted triangulation. In this computation, the contribution of the "edgy triangles" was estimated by the area of the layer $\Omega^e_N$ according to
$$\|f - I_{\mathcal T_N} f\|_{L^p(\Omega^e_N)} \le \|f\|_{L^\infty} |\Omega^e_N|^{1/p}.$$
Then we invoke the fact that $\Gamma$ is a finite union of $C^2$ curves $\Gamma_j$ in order to build a layer of global area $|\Omega^e_N| \le C N^{-2}$, which results in the case $p = 2$ in a contribution to the $L^p$ error of the order $O(N^{-1})$. The area of the layer $\Omega^e_N$ is indeed of the same order as the area between the edge $\Gamma$ and its approximation by a polygonal line with $O(N)$ segments.

Each of the curves $\Gamma_j$ can be identified with the graph of a $C^2$ function in a suitable orthogonal coordinate system. If $\gamma$ is one of these functions, the area between $\Gamma$ and its polygonal approximation can thus be locally measured by the $L^1$ error between the one-dimensional function $\gamma$ and a piecewise linear approximation of this function. Since $\gamma$ is $C^2$, it is obvious that it can be approximated by a piecewise linear function on $O(N)$ intervals with accuracy $O(N^{-2})$ in the $L^\infty$ norm and therefore in the $L^1$ norm. However, we may ask whether such a rate could be achieved under weaker conditions on the smoothness of $\gamma$.
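The $O(N^{-2})$ rate in $L^1$ can be observed numerically. The sketch below interpolates a smooth test function (here $\gamma(x) = e^{3x}$ on $[0, 1]$, an arbitrary choice) by a piecewise linear function on $N$ intervals, first with uniform knots and then with knots that equidistribute $\int |\gamma''|^{1/3}$, the classical adaptive choice for the $L^1$ error of linear interpolation. All names are illustrative assumptions, and the $L^1$ error is itself estimated by a trapezoidal rule on a fine grid.

```python
import numpy as np


def l1_error(gamma, knots, n_fine=20001):
    """L^1([0,1]) distance between gamma and its piecewise linear interpolant."""
    xs = np.linspace(0.0, 1.0, n_fine)
    err = np.abs(gamma(xs) - np.interp(xs, knots, gamma(knots)))
    return float(np.sum((err[1:] + err[:-1]) * np.diff(xs)) / 2.0)


def uniform_knots(n):
    return np.linspace(0.0, 1.0, n + 1)


def adapted_knots(gamma_second, n, n_fine=20001):
    """Knots equidistributing the integral of |gamma''|^(1/3) over n intervals."""
    xs = np.linspace(0.0, 1.0, n_fine)
    density = np.abs(gamma_second(xs)) ** (1.0 / 3.0)
    steps = (density[1:] + density[:-1]) * np.diff(xs) / 2.0
    cumulative = np.concatenate(([0.0], np.cumsum(steps)))
    targets = np.linspace(0.0, cumulative[-1], n + 1)
    knots = np.interp(targets, cumulative, xs)   # invert the cumulative density
    knots[0], knots[-1] = 0.0, 1.0
    return knots


def gamma(x):
    return np.exp(3.0 * x)


def gamma_second(x):
    return 9.0 * np.exp(3.0 * x)


err_16 = l1_error(gamma, uniform_knots(16))
err_32 = l1_error(gamma, uniform_knots(32))
err_32_adapted = l1_error(gamma, adapted_knots(gamma_second, 32))
print(err_16 / err_32, err_32, err_32_adapted)
```

Doubling $N$ divides the uniform error by roughly four, and the adapted knots reduce the error further for the same number of intervals.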
The answer to this question is a chapter of nonlinear approximation theory which identifies the exact conditions for a function $\gamma$ to be approximated at a certain rate by piecewise polynomial functions on adaptive one-dimensional partitions. We refer to [42] for a detailed treatment and only state the result which is of interest to us. We say that a function $\gamma$ defined on a bounded interval $I$ belongs to the approximation space $\mathcal A^s(L^p)$ if and only if there exists a sequence $(p_N)_{N > 0}$ of functions, where each $p_N$ is piecewise affine on a partition of $I$ by $N$ intervals, such that
$$\|\gamma - p_N\|_{L^p} \le C N^{-s}.$$
For $0 < s \le 2$, it is known that $\gamma \in \mathcal A^s(L^p)$ provided that $\gamma \in B^s_{\tau,\tau}(I)$ with $\frac{1}{\tau} := \frac{1}{p} + s$, where $B^s_{\tau,\tau}(I)$ is the standard Besov space that roughly describes those functions having $s$ derivatives in $L^\tau$. In the case $s = 2$ and $p = 1$ which is of interest to us, we find $\tau = \frac{1}{3}$ and therefore $\gamma$ should belong to the Besov space $B^2_{1/3,1/3}(I)$. Note that in our definition of cartoon functions, we assume much more than $B^2_{1/3,1/3}$ smoothness on $\gamma$, and it is not clear to us if Theorem 4.3.2 can be derived under this minimal smoothness assumption. However, it is striking to see that the quantity $E_2(f)$ that is revealed by Theorem 4.3.2 precisely measures the second derivative of the arc-length parametrization of $\Gamma$ in the $L^{1/3}$ norm, up to the multiplicative weight $|[f]|^{2/3}$. Let us also mention that Besov spaces have been used in [43] in order to describe the smoothness of functions through the regularity of their level sets. Note that edges and level sets are two distinct concepts, which coincide in the case of piecewise constant cartoon functions.

The quantity $|\kappa|^{1/3}$ is also encountered in mathematical image processing, for the design of simple smoothing semi-groups that respect affine invariance with respect to the image.
Since these semi-groups should also have the property of contrast invariance, they can be defined through curve evolution operators acting on the level sets of the image. The simplest curve evolution operator that respects affine invariance is given by the equation
$$\frac{d\Gamma}{dt} = -|\kappa|^{1/3}\, \mathbf n,$$
where $\mathbf n$ is the outer normal, see e.g. [22]. Here the value $1/3$ of the exponent is essential, and its appearance in $E_2(f)$ suggests that some affine invariance property also holds for this quantity as well as for $A_2(f)$. We first notice that if $f$ is a compactly supported $C^2$ function of two variables and $T$ is a bijective affine transformation, then with $\tilde f$ such that
$$f = \tilde f \circ T,$$
we have the property
$$d^2 f(z) = L^{\mathrm T}\, d^2 \tilde f(Tz)\, L,$$
where $L$ is the linear part of $T$ and $L^{\mathrm T}$ its transpose, so that
$$\sqrt{|\det(d^2 f(z))|} = |\det L|\, \sqrt{|\det(d^2 \tilde f(Tz))|}.$$
By change of variable, we thus find that
$$A_p(\tilde f) = |\det L|^{1/\tau - 1} A_p(f) = |\det L|^{1/p} A_p(f). \tag{4.32}$$
A similar invariance property can be derived for the interpolation error $\sigma_N(f)_p = \|f - I_{\mathcal T_N} f\|_{L^p}$, where $\mathcal T_N$ is a triangulation which is optimally adapted to $f$ in the sense of minimizing the linear interpolation error in the $L^p$ norm among all triangulations of cardinality $N$. We indeed remark that an optimal triangulation for $\tilde f$ is then given by applying $T$ to all elements of $\mathcal T_N$. For such a triangulation $\tilde{\mathcal T}_N := T(\mathcal T_N)$, one has the commutation formula
$$I_{\mathcal T_N} f = (I_{\tilde{\mathcal T}_N} \tilde f) \circ T,$$
and therefore we obtain by a change of variable that
$$\sigma_N(\tilde f)_p = \|\tilde f - I_{\tilde{\mathcal T}_N} \tilde f\|_{L^p} = |\det L|^{1/p} \|f - I_{\mathcal T_N} f\|_{L^p} = |\det L|^{1/p} \sigma_N(f)_p. \tag{4.33}$$
Let us finally show that if $f$ is a cartoon function, then $E_2(f)$ satisfies a similar invariance property corresponding to $p = 2$, namely
$$E_2(\tilde f) = |\det L|^{1/2} E_2(f).$$
(4.34)
Note that this cannot be derived by arguing that $A_2(f)$ satisfies this invariance property when $f$ and $\tilde f$ are smooth, since we lose the affine invariance property as we introduce the convolution by $\varphi_\delta$: we do not have
$$(f \circ T) * \varphi_\delta = (f * \varphi_\delta) \circ T$$
unless $T$ is a rotation or a translation.

Let $\Gamma_j$ be one of the $C^2$ pieces of $\Gamma$ and $\gamma_j : [0, B_j] \to \Omega$ a regular parametrisation of $\Gamma_j$. The curvature of $\Gamma$ on $\Gamma_j$ at the point $\gamma_j(t)$ is therefore given by
$$\kappa(\gamma_j(t)) = \frac{\det(\gamma'_j(t), \gamma''_j(t))}{\|\gamma'_j(t)\|^3}. \tag{4.35}$$
Since $f = \tilde f \circ T$, the discontinuity curves of $\tilde f$ are the images of those of $f$ by $T$:
$$\tilde\Gamma_j = T(\Gamma_j).$$
The curvature of $\tilde\Gamma_j$ at the point $T(\gamma_j(t))$ is therefore given by
$$\tilde\kappa(T(\gamma_j(t))) = \frac{\det(L \gamma'_j(t), L \gamma''_j(t))}{\|L \gamma'_j(t)\|^3} = \frac{\det(L)\, \det(\gamma'_j(t), \gamma''_j(t))}{\|L \gamma'_j(t)\|^3}.$$
This leads us to the relation
$$|\det(L)|^{1/3}\, |\kappa(\gamma_j(t))|^{1/3}\, \|\gamma'_j(t)\| = |\tilde\kappa(T(\gamma_j(t)))|^{1/3}\, \|L \gamma'_j(t)\|, \tag{4.36}$$
and therefore
$$\begin{aligned}
\int_{\tilde\Gamma_j} |[\tilde f]|^{2/3} |\tilde\kappa|^{1/3} &= \int_0^{B_j} |[\tilde f](T(\gamma_j(t)))|^{2/3}\, |\tilde\kappa(T(\gamma_j(t)))|^{1/3}\, \|L \gamma'_j(t)\|\, dt \\
&= |\det L|^{1/3} \int_0^{B_j} |[f](\gamma_j(t))|^{2/3}\, |\kappa(\gamma_j(t))|^{1/3}\, \|\gamma'_j(t)\|\, dt \\
&= |\det L|^{1/3} \int_{\Gamma_j} |[f]|^{2/3} |\kappa|^{1/3}.
\end{aligned}$$
Summing over all $j = 1, \cdots, l$ and raising to the power $3/2$, we obtain (4.34).

4.5 Numerical tests

We first validate our previous results by numerical tests applied to a simple cartoon image: the Shepp-Logan phantom. We use a $256 \times 256$ pixel version of this image, with a slight modification which is motivated further below. This image is iteratively smoothed by the numerical scheme
$$u^{n+1}_{i,j} = \frac{u^n_{i,j}}{2} + \frac{u^n_{i+1,j} + u^n_{i-1,j} + u^n_{i,j+1} + u^n_{i,j-1}}{8}.$$
(4.37)
This scheme is an explicit discretization of the heat equation. Formally, as $n$ grows, $u^n$ is a discretization of $u * \varphi_{\lambda\sqrt{n}}$ with
$$\varphi_\delta(z) := \frac{1}{\pi \delta^2}\, e^{-\frac{\|z\|^2}{\delta^2}}, \tag{4.38}$$
where $u$ stands for the continuous image. The determinant of the hessian is discretised by the following 9-points formula:
$$d^n_{i,j} := (u^n_{i,j+1} - 2u^n_{i,j} + u^n_{i,j-1})(u^n_{i+1,j} - 2u^n_{i,j} + u^n_{i-1,j}) - \frac{(u^n_{i+1,j+1} + u^n_{i-1,j-1} - u^n_{i+1,j-1} - u^n_{i-1,j+1})^2}{16}. \tag{4.39}$$
For each value of $n$, we then compute the $\ell^\tau$ norm of the array $\big( \sqrt{|d^n_{i,j}|} \big)$ for $\tau \in [\frac{1}{2}, 1]$, which corresponds to $p \in [1, \infty]$ with $\frac{1}{\tau} := 1 + \frac{1}{p}$. This norm is thus a discretization of the quantity
$$\Big\| \sqrt{|\det(d^2 (u * \varphi_{\lambda\sqrt{n}}))|} \Big\|_{L^\tau}.$$
For each value of $n$ we obtain a function $\tau \in [\frac{1}{2}, 1] \mapsto D_n(\tau) \in \mathbb R_+$.

As $n$ grows, three consecutive but potentially overlapping phases appear in the behaviour of the functions $D_n$, which are illustrated on Figure 4.2.

[Figure 4.1]

[Figure 4.2: the functions $D_n(\tau)$ for $n \le 50$ (left), $50 \le n \le 100$ (center) and $100 \le n \le 200$ (right).]

1. For small $n$, the 9-points discretisation is not a good approximation of the determinant of the hessian, due to the fact that the pixel discretization is too coarse compared to the smoothing width. During this phase, the functions $D_n$ decay rapidly for all values of $\tau$.
2. For some range of $n$, the edges have been smoothed by the action of (4.37), but the parameter $\lambda\sqrt{n}$ in (4.38) remains rather small. Our previous analysis applies and we observe that $D_n(2/3)$ is (approximately) constant, while $D_n(\tau)$ increases for $\tau < 2/3$ and decreases for $\tau > 2/3$.
3. For large $n$, the details of the picture fade and begin to disappear. The picture begins to resemble a constant picture.
Therefore the functions $D_n$ decay for all values of $\tau$, and eventually tend to 0.

In our numerical experiments, we used the well known Shepp-Logan phantom with a slight modification as shown on Figure 4.1: we have removed the thin layer around the head, which represents the skull, because it disappears too quickly under the smoothing procedure and causes phases 1 and 3 to overlap, masking phase 2. For more complicated functions $f$, such as most photographic pictures, phases 1 and 3 also tend to overlap for similar reasons. Indeed, these pictures often have details at the pixel scale, including electronic noise due to the sensor. Since these details disappear early, phase 3 begins immediately, and therefore phase 2 cannot be observed.

Our next discussion aims at comparing the quantity $A_2(f)$ with the total variation $\mathrm{TV}(f)$ on numerical images. In particular, we want to compare how these quantities measure the geometric complexity of images. Here, we consider non-discretized images
$$z = (x, y) \in [0, 1]^2 \mapsto f(z),$$
and we compare the numerical behaviour of $A_2(f)$ and $\mathrm{TV}(f)$ for some relevant cases. Let us recall that for any $g \in C^2(\mathbb R_+)$, if $f$ is the radial function $f(z) = g(|z|)$, one has
$$\det\big( d^2 f(z) \big) = \frac{1}{|z|}\, g'(|z|)\, g''(|z|).$$
As a result we find that the two functionals $\mathrm{TV}$ and $A_2^{2/3}$ behave similarly on the oscillatory images $f_\omega(z) := \cos(\omega |z|)$ illustrated on Figure 4.3 (left), as $\omega \to +\infty$:
$$\mathrm{TV}(f_\omega) \simeq \omega \quad \text{and} \quad A_2(f_\omega) \simeq \omega^{3/2}. \tag{4.40}$$
We next consider the cartoon images defined by $g_\omega(z) = \frac{\lfloor \omega |z| \rfloor}{\omega}$, which have the form of a circular staircase with approximately $\omega$ steps of height $1/\omega$, as illustrated on Figure 4.3 (center).
In this case, we may give a meaning to $A_2(g_\omega)$ based on (4.29), which since $g_\omega$ is piecewise constant gives
$$A_2(g_\omega)^{2/3} = C \int_\Gamma |[g_\omega]|^{2/3}\, |\kappa|^{1/3},$$
where $\Gamma$ is the union of the circular curves of discontinuity. From this we find that as $\omega \to +\infty$,
$$\mathrm{TV}(g_\omega) \simeq 1 \quad \text{and} \quad A_2(g_\omega) \simeq \sqrt{\omega}. \tag{4.41}$$
This shows that in contrast to $\mathrm{TV}$, the functional $A_2$ penalizes images which have the appearance of a staircase.

Finally we consider the cartoon functions $h_\omega(z) = \chi_{S_\omega}$, where the set $S_\omega$ is defined by
S_ω := { z = (r cos θ, r sin θ) ; 0 ≤ θ ≤ π […], and 0 ≤ r ≤ […] ω […] cos(ωθ) },
see Figure 4.3 (right). These functions therefore have only one step, but its geometry becomes more and more oscillatory as $\omega$ grows. More precisely, its length remains finite, but its curvature behaves like $\omega$. In turn, we obtain that as $\omega \to +\infty$,
$$\mathrm{TV}(h_\omega) \simeq 1 \quad \text{and} \quad A_2(h_\omega) \simeq \sqrt{\omega}. \tag{4.42}$$
This shows that in contrast to $\mathrm{TV}$, the functional $A_2$ penalizes the fact that the curve of discontinuity $\partial S_\omega$ strongly oscillates.

These examples suggest that the functional $A_2$ gives a more relevant account of the complexity of cartoon images than the total variation $\mathrm{TV}$. In particular, staircasing effects and oscillatory curves of discontinuity are penalized.

[Figure 4.3: the three families $f_\omega$, $g_\omega$ and $h_\omega$ of images, and behaviour of the functionals $\mathrm{TV}$ and $A_2$: $\mathrm{TV} \simeq A_2^{2/3} \simeq \omega$ (left); $\mathrm{TV} \simeq 1$, $A_2 \simeq \sqrt{\omega}$ (center); $\mathrm{TV} \simeq 1$, $A_2 \simeq \sqrt{\omega}$ (right).]

4.6 Applications to image restoration

The previous discussion suggests that we could use $A_2$ as an alternative to $\mathrm{TV}$ as a prior in image restoration. In particular, we may try to replace $|g|_{BV}$ by $A_2(g)$ in (4.2), and therefore consider the minimization problem
$$\min_g \{ A_2(g) \,;\ \|Tg - h\|_{L^2} \le \varepsilon \},$$
or its formulation using a Lagrange multiplier
$$\min_g\ \|Tg - h\|^2_{L^2} + t\, A_2(g)^{2/3}.$$
(4.43)

Our first observation is that, in contrast to (4.2), these problems are non-convex. Moreover, it is easily seen that even in the very simple case of image denoising, corresponding to $T = \mathrm{Id}$, the above problems are ill-posed, in the following sense.

Proposition 4.6.1. For any $f \in L^2([0,1]^2)$, there exists a sequence $(f_n)_{n \ge 0}$ of $C^\infty$ functions such that $\|f - f_n\|_{L^2} \to 0$ and $A_2(f_n) \to 0$. Consequently the infima of $A_2(g)$ over those $g \in C^\infty$ such that $\|g - h\|_{L^2} \le \varepsilon$, and of $\|g - h\|_{L^2}^2 + t A_2(g)^{2/3}$ over all $g \in C^\infty$, are both equal to 0, and are not attained in general.

Proof: If $\mathcal{T}$ is a triangulation of the domain $[0,1]^2$, and if $g$ is piecewise affine on $\mathcal{T}$, we may consider its regularized version $g_\delta := g * \varphi_\delta$. From Theorem 4.3.2, we find that $A_2(g_\delta) \to 0$ as $\delta \to 0$ (the Hessian of $g$ vanishes inside each triangle, and the edges of $\mathcal{T}$ are straight, hence have zero curvature). The proof follows by observing that piecewise affine functions on triangulations are dense in $L^2$. $\diamond$

Note that the above result is formulated in the setting of non-discretized images, and does not exactly hold for numerical images if we define $A_2$ using the nine-point formula (4.39) for the discretization of the determinant of the Hessian. However, we expect that the ill-posedness in the continuous setting is reflected by a bad behaviour of the solution of the discrete optimization problem.

In summary, a straightforward generalization of the minimization approach (4.2) with $A_2$ in place of $\mathrm{TV}$ is doomed to fail. In order to circumvent this difficulty, we propose a different strategy based on a restoration algorithm introduced in [70] and [71]. This algorithm is based on a Bayesian framework that we briefly recall.

For simplicity, we focus from this point on the discrete setting, in which images are discretized on an $N \times N$ grid where $N \ge 1$. We denote by
\[ \mathcal{F} := \mathbb{R}^{N \times N} \]
the collection of all discrete images.
Bayesian restoration algorithms are based on a prior probability distribution on $\mathcal{F}$ describing how certain images are more "plausible" than others. One way to build such priors is through a functional
\[ J : \mathcal{F} \to \mathbb{R}_+ \]
which reflects the plausibility of an image: typically the value of $J$ is large for complex images, and zero only for a few very simple images. The functional $J$ is typically obtained as the discretization of a measure of smoothness, such as the total variation $\mathrm{TV}$ or the quantity $A_2^{2/3}$ (discretized with the nine-point formula (4.39)). Provided that the normalizing constant
\[ Z_J := \int_{\mathcal{F}} \exp(-J(F))\, dF \]
is finite, our Bayesian prior for plausible images is the probability distribution
\[ \frac{e^{-J(F)}}{Z_J}\, dF, \qquad (4.44) \]
where $dF$ stands for the Lebesgue measure on $\mathcal{F}$.

We use the simplistic, although popular, model of additive Gaussian noise, in which random draws from $\mathcal{F}$ with respect to the probability
\[ \frac{e^{-\alpha \|\Omega\|^2}}{Z_\alpha}\, d\Omega, \qquad (4.45) \]
where $\alpha > 0$ and $Z_\alpha$ is a normalizing factor, represent typical noise.

We regard a corrupted image $G$ as a random variable defined as the sum
\[ G = F + \Omega \]
of a random variable $F$ of distribution (4.44), the original image, and a random variable $\Omega$ of distribution (4.45), the corruption by additive Gaussian noise.

The conditional probability density $p(G|F)$ of the corrupted image $G$ with respect to the original image $F$ is
\[ p(G|F) = \frac{e^{-\alpha \|\Omega\|^2}}{Z_\alpha} = \frac{e^{-\alpha \|G - F\|^2}}{Z_\alpha}. \]
The conditional probability density $p(F|G)$ of the original image $F$ with respect to the corrupted one $G$ is obtained by the Bayes rule of conditional probabilities,
\[ p(F|G)\, p(G) = p(G|F)\, p(F). \]
Replacing $p(G|F)$ and $p(F)$ with their explicit expressions, we obtain
\[ p(F|G) = \frac{1}{p(G)}\, p(G|F)\, p(F) = \frac{1}{p(G)} \left( \frac{1}{Z_\alpha} e^{-\alpha \|G - F\|^2} \right) \left( \frac{1}{Z_J} e^{-J(F)} \right) = \frac{1}{p(G)\, Z_\alpha Z_J}\, e^{-\alpha \|G - F\|^2 - J(F)}. \]
We define
\[ \sigma(F, G) = \exp\big(-\alpha \|G - F\|^2 - J(F)\big). \]
For any fixed $G$, the explicit function $F \in \mathcal{F} \mapsto \sigma(F, G) \in \mathbb{R}_+^*$ is therefore proportional to the conditional probability density $F \mapsto p(F|G)$.

In order to recover the original image $F$ from the corrupted one $G$, a first approach, called the maximum a posteriori (MAP), consists in maximizing the conditional probability density:
\[ F^*_J := \operatorname*{argmax}_{F \in \mathcal{F}} p(F|G) = \operatorname*{argmax}_{F \in \mathcal{F}} \sigma(F, G) = \operatorname*{argmin}_{F \in \mathcal{F}} \alpha \|F - G\|^2 + J(F). \]
We thus recover the optimization procedure (4.43) in the case $T = \mathrm{Id}$ of image denoising, and of the functional $J = A_2^{2/3}$. As observed in Proposition 4.6.1, this approach is doomed to fail for the functional $A_2^{2/3}$. For the total variation functional $J = \mathrm{TV}$, the MAP approach gives good results, but is also known to produce visual artifacts: the restored image is exactly constant on large regions, delimited by sharp discontinuities which do not correspond to a feature of the original image. The heuristic reason for this problem is that the maximum of a probability density is generally not a good representative of a random draw from this probability, as discussed in [70].

A second approach, called the minimum mean square error (MMSE), consists in finding $F$ which minimizes the empirical quadratic risk $\|F - F'\|^2$ with respect to a random image $F'$ distributed according to the conditional probability density $p(F'|G)$:
\[ F_J := \operatorname*{argmin}_{F \in \mathcal{F}} \int_{\mathcal{F}} \|F - F'\|^2\, p(F'|G)\, dF' = \operatorname*{argmin}_{F \in \mathcal{F}} \int_{\mathcal{F}} \|F - F'\|^2\, \sigma(F', G)\, dF' = \frac{1}{Z} \int_{\mathcal{F}} F\, \sigma(F, G)\, dF, \]
where $Z = \int_{\mathcal{F}} \sigma(F, G)\, dF$ is a normalizing factor. The last equality expresses the fact that the quadratic risk is minimized by the mean of the distribution.
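The difference between the MAP and MMSE estimators can already be seen for an image reduced to a single pixel. The following minimal sketch assumes the illustrative prior $J(F) = \beta |F|$ (a one-pixel stand-in for a TV-like prior, not a functional used in this chapter); the function names and the parameter values are hypothetical. In this case the MAP estimator is the classical soft-thresholding of $G$, while the MMSE estimator is computed as a posterior mean by a simple Riemann sum.

```python
import math


def map_estimate(g, alpha, beta):
    """MAP estimator for one pixel: minimizes alpha*(F - g)**2 + beta*|F|.

    The minimizer is the soft-thresholding of g at level beta / (2*alpha).
    """
    threshold = beta / (2.0 * alpha)
    return math.copysign(max(abs(g) - threshold, 0.0), g)


def mmse_estimate(g, alpha, beta, half_width=10.0, n=20001):
    """MMSE estimator for one pixel: mean of the density proportional to
    sigma(F, g) = exp(-alpha*(g - F)**2 - beta*|F|), approximated by a
    Riemann sum on a uniform grid of [-half_width, half_width]."""
    step = 2.0 * half_width / (n - 1)
    numerator = 0.0
    denominator = 0.0
    for k in range(n):
        f = -half_width + k * step
        weight = math.exp(-alpha * (g - f) ** 2 - beta * abs(f))
        numerator += f * weight
        denominator += weight
    return numerator / denominator


# Illustrative values: the observation lies below the MAP threshold 0.5.
g_observed = 0.2
f_map = map_estimate(g_observed, alpha=1.0, beta=1.0)
f_mmse = mmse_estimate(g_observed, alpha=1.0, beta=1.0)
print(f_map, f_mmse)
```

With these values the MAP estimate is exactly zero, whereas the MMSE estimate is a strictly positive value smaller than the observation. This is a scalar analogue of the "exactly constant regions" produced by the MAP approach and avoided by the MMSE approach.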
This method is studied in depth in [70]in the case J = TV where it is shown that it has the advantage of suppressing the visualartifacts observed in the MAP approach, and we describe in the next section its adaptationto the case J = A / . The computation of the estimator F J is a numerical challenge, since it involves anintegration on the space F = R N × N which has dimension at least 512 × (cid:39) F k ) k ≥ of images in F which is recurrentwith respect to the probability measure σ ( F, G ) dF/Z , where G is the denoised image and Z is a normalizing factor. Then, almost surely,lim K →∞ K (cid:88) ≤ k ≤ K − F k = 1 Z (cid:90) F F σ ( F, G ) dF. (4.46)Two successive images F k and F k +1 generated by the algorithm proposed in [70] only differby the value of a single pixel. The cost of a step of the algorithm, generating a new image F k , comes mainly from the computation of the ratio σ ( U, G ) σ ( V, G ) = exp (cid:16) (cid:107) V − G (cid:107) − (cid:107) U − G (cid:107) + J ( V ) − J ( U ) (cid:17) , (4.47)where U and V are two elements of F which differ at a single position ( i, j ).This algorithm applies to any continuous functional J ∈ C ( F , R + ), hence in particularto non-convex functionals such as J = A / . For numerical applications, in order to havea good speed of convergence, the ratio (4.47) needs to be extremely cheap to compute interms of computer time. The discretisation of the total variation TV or of the functional A / , are given for an image F by respectively1 N (cid:88) i,j | F i,j − F i +1 ,j | + | F i,j − F i,j +1 | and N − / (cid:88) i,j | d i,j | where d i,j is given by the formula (4.39). The ratio (4.47) is cheap to compute numerically,as required by the algorithm, since it has an explicit algebraic expression which only .6. 
Applications to image restoration U, V ∈ F at the positions ( i + k, j + l ) wheremax {| k | , | l |} ≤ 2, where ( i, j ) denotes the single position at which these images differ.From a theoretical point of view, the averaging procedure (4.46) converges at thespeed O ( K − ). Unfortunately the constant in front of the convergence rate plays a keyrole in practice, and we did not manage to “reach the convergence” for realistic 512 × J = A / . This issue might be solved by a parallelization ofthe algorithm, or a modification of the algorithm in which a group of neighboring pixelsis modified at each step instead of a single pixel. For the time being, we thus only presentone-dimensional results. We present numerical results in one dimension only, using some counterparts of thetotal variation TV and of the functional A / in that context. We define for each function f ∈ C ([0 , D ( f ) := (cid:90) [0 , (cid:112) | f (cid:48)(cid:48) | . (4.48)The next proposition shows that D ( f ) can be given a meaning when f has localizeddiscontinuites, similarly to the functional A / for cartoon images. Proposition 4.6.2. Let f : [0 , → R be piecewise C , with a finite set E of discontinuitypoints in ]0 , . Let ϕ be a C mollifier supported in [ − , , satisfying (cid:82) R ϕ = 1 and ϕ ( x ) = ϕ ( − x ) for all x ∈ R . For all δ > let ϕ δ := δ ϕ ( · δ ) and let f δ := f ∗ ϕ δ . Then f δ ∈ C ([ δ, − δ ]) and satisfies lim δ → (cid:90) [ δ, − δ ] (cid:113) | f (cid:48)(cid:48) δ | = (cid:90) ]0 , \ E (cid:112) | f (cid:48)(cid:48) | + C ( ϕ ) (cid:88) e ∈ E (cid:112) | [ f ]( e ) | (4.49) where C ( ϕ ) := (cid:82) R (cid:112) | ϕ (cid:48) | . Proof: We denote by 0 < e < e < · · · < e n − < E and we define e = 0and e n = 1. 
The function $f$ can be written as the sum
\[ f = J + A + S, \]
where $J$ (the Jump part) is piecewise constant, and $A$ is continuous and piecewise Affine, with respect to the partition $(\,]e_i, e_{i+1}[\,)_{0 \le i \le n-1}$ of the interval $[0,1]$. The remainder $S$ is $C^1$ on $]0,1[$, and $S''$ is uniformly bounded. Then, on the interval $[\delta, 1-\delta]$,
\[ f_\delta = f * \varphi_\delta = (J + A + S) * \varphi_\delta = J_\delta + A_\delta + S_\delta. \]
We now assume that the parameter $\delta$ satisfies
\[ 0 < \delta < \frac{1}{2} \min_{0 \le i \le n-1} (e_{i+1} - e_i). \]
For any $0 \le i \le n-1$ we have, on the interval $[e_i + \delta, e_{i+1} - \delta]$,
\[ J_\delta = J, \quad A_\delta = A \quad \text{and} \quad S_\delta = f_\delta - A - J. \]
Hence $J''_\delta = A''_\delta = 0$, and $S''_\delta = f''_\delta$ converges uniformly to $f''$ as $\delta \to 0$, on this interval. Therefore
\[ \lim_{\delta \to 0} \sum_{0 \le i \le n-1} \int_{e_i + \delta}^{e_{i+1} - \delta} \sqrt{|f''_\delta|} = \int_{]0,1[ \setminus E} \sqrt{|f''|}. \qquad (4.50) \]
Furthermore, for any $1 \le i \le n-1$ and $|x| \le 1$,
\[ J''_\delta(e_i + \delta x) = \frac{1}{\delta^2}\, [f](e_i)\, \varphi'(x) \quad \text{and} \quad A''_\delta(e_i + \delta x) = \frac{1}{\delta}\, [f'](e_i)\, \varphi(x), \]
where
\[ [f](e_i) := \lim_{\varepsilon \to 0^+} f(e_i + \varepsilon) - f(e_i - \varepsilon) \quad \text{and} \quad [f'](e_i) := \lim_{\varepsilon \to 0^+} f'(e_i + \varepsilon) - f'(e_i - \varepsilon). \]
On the other hand, $S''_\delta(e_i + \delta x)$ is uniformly bounded independently of $x$ and $\delta$. Hence the contribution of $J''_\delta$ is dominant on the intervals $[e_i - \delta, e_i + \delta]$, and the change of variable $y = e_i + \delta x$ gives
\[ \lim_{\delta \to 0} \sum_{1 \le i \le n-1} \int_{e_i - \delta}^{e_i + \delta} \sqrt{|f''_\delta|} = \sum_{1 \le i \le n-1} \sqrt{|[f](e_i)|} \int_{-1}^{1} \sqrt{|\varphi'|}. \qquad (4.51) \]
Combining (4.50) and (4.51) we conclude the proof of this proposition. $\diamond$

[Figure 4.4: behavior of the functionals $\mathrm{TV}$ and $D$ on different types of functions.]

For the simple oscillating function $f_\omega(x) := \cos(\omega x)$, illustrated in Figure 4.4 (left), the two functionals $\mathrm{TV}$ and $D$ behave similarly as $\omega \to \infty$:
$\mathrm{TV}(f_\omega) \simeq \omega$ and $D(f_\omega) \simeq \omega$.
(4.52)

On the contrary, for the function $g_\omega(x) := \lfloor \omega x \rfloor / \omega$, illustrated in Figure 4.4 (right), we obtain as $\omega \to \infty$
\[ \mathrm{TV}(g_\omega) \simeq 1 \quad \text{and} \quad D(g_\omega) \simeq \sqrt{\omega}, \qquad (4.53) \]
where $D(g_\omega)$ is understood in the sense of (4.49). Hence the functionals $\mathrm{TV}$ and $D$ treat discontinuities very differently, and the latter penalizes functions which have the appearance of a staircase.

We denote by $\mathcal{F} := \mathbb{R}^N$ the collection of discretized functions. For $F \in \mathcal{F}$, we denote by $\mathrm{TV}(F)$ and $D(F)$ the discretizations of the functionals $\mathrm{TV}$ and $D$ using finite differences, as follows:
\[ \mathrm{TV}(F) := \sum_{1 \le i \le N-1} |F_i - F_{i+1}| \qquad (4.54) \]
and
\[ D(F) := \sum_{2 \le i \le N-1} \sqrt{|F_{i+1} - 2F_i + F_{i-1}|}. \qquad (4.55) \]

We now turn to numerical results. Given a one-dimensional discretized function $F \in \mathcal{F} := \mathbb{R}^N$, we compute, using the algorithm described above, the following three regularizations of $F$:
\[ F^*_{\mathrm{TV}} := \operatorname*{argmin}_{F' \in \mathbb{R}^N} \alpha_0 \|F - F'\|^2 + \mathrm{TV}(F'), \]
\[ F_{\mathrm{TV}} := \frac{1}{Z_1} \int_{\mathbb{R}^N} F' \exp\big(-\alpha_1 \|F - F'\|^2 - \beta_1 \mathrm{TV}(F')\big)\, dF', \]
\[ F_D := \frac{1}{Z_2} \int_{\mathbb{R}^N} F' \exp\big(-\alpha_2 \|F - F'\|^2 - \beta_2 D(F')\big)\, dF', \]
where $Z_1$ and $Z_2$ are normalizing coefficients. The parameters $\alpha_0, \alpha_1, \alpha_2$ and $\beta_1, \beta_2$ can be freely chosen.

We have not attempted to denoise data and to compare the PSNR of the restorations obtained by the different methods. Indeed, in order to be fair, such a comparison should involve an in-depth analysis of the role of the parameters $\alpha_0, \alpha_1, \alpha_2$ and $\beta_1, \beta_2$, which we did not have the time to carry out. Note also that $F^*_{\mathrm{TV}}$ involves a single parameter, while two parameters are needed for $F_{\mathrm{TV}}$ and $F_D$. We refer to the thesis [70] for the comparison of $F^*_{\mathrm{TV}}$ and $F_{\mathrm{TV}}$ from the point of view of the PSNR.

Instead we regard the three procedures as regularization methods, which we apply to different test functions.
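The discrete functionals (4.54) and (4.55), and an MMSE-type regularization, can be sketched as follows. This is a minimal illustration under stated assumptions: the sampler is a generic single-site Metropolis chain with a uniform random-walk proposal, which is not necessarily the exact scheme of [70]; the average is taken once per sweep rather than after every single-pixel step; indices are 0-based; and the helper names (`local_tv`, `local_d`, `mmse_regularize`) and all parameter values are hypothetical. The local helpers reflect the remark that the acceptance ratio only involves the few terms of the functional which contain the modified entry.

```python
import math
import random


def tv(F):
    """Discrete total variation (4.54): sum of |F[i] - F[i+1]|."""
    return sum(abs(F[i] - F[i + 1]) for i in range(len(F) - 1))


def d_functional(F):
    """Discrete functional D (4.55): square roots of |second differences|."""
    return sum(math.sqrt(abs(F[i + 1] - 2.0 * F[i] + F[i - 1]))
               for i in range(1, len(F) - 1))


def local_tv(F, i):
    """Terms of tv(F) that involve the entry F[i]."""
    left = abs(F[i] - F[i - 1]) if i > 0 else 0.0
    right = abs(F[i + 1] - F[i]) if i < len(F) - 1 else 0.0
    return left + right


def local_d(F, i):
    """Terms of d_functional(F) that involve F[i] (centers i-1, i, i+1)."""
    lo, hi = max(1, i - 1), min(len(F) - 2, i + 1)
    return sum(math.sqrt(abs(F[j + 1] - 2.0 * F[j] + F[j - 1]))
               for j in range(lo, hi + 1))


def mmse_regularize(data, local_j, alpha, beta, sweeps=200, step=0.1, seed=0):
    """Average of a single-site Metropolis chain whose target density is
    proportional to exp(-alpha * ||data - F||^2 - beta * J(F))."""
    rng = random.Random(seed)
    n = len(data)
    state = list(data)
    total = [0.0] * n
    for _ in range(sweeps):
        for _ in range(n):
            i = rng.randrange(n)
            old = state[i]
            e_old = alpha * (old - data[i]) ** 2 + beta * local_j(state, i)
            state[i] = old + rng.uniform(-step, step)
            e_new = alpha * (state[i] - data[i]) ** 2 + beta * local_j(state, i)
            # Accept with probability min(1, sigma(new) / sigma(old)).
            if rng.random() >= math.exp(min(0.0, e_old - e_new)):
                state[i] = old  # proposal rejected
        for j in range(n):
            total[j] += state[j]
    return [t / sweeps for t in total]


# A staircase with 3 jumps of height 1/4: TV stays small, D counts each jump
# twice with weight sqrt(1/4).
staircase = [(i // 10) / 4 for i in range(40)]
heaviside = [0.0] * 10 + [1.0] * 10
smoothed = mmse_regularize(heaviside, local_d, alpha=20.0, beta=5.0)
print(tv(staircase), d_functional(staircase))
```

Passing `local_tv` instead of `local_d` gives the analogous sketch for $F_{\mathrm{TV}}$. The number of sweeps used here is far too small to claim convergence of the average; it only keeps the example fast.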
We focus on the qualitative differences between the regularizations $F^*_{\mathrm{TV}}$, $F_{\mathrm{TV}}$ and $F_D$ of different functions $F$, and we discuss how these differences reflect the different Bayesian priors implicit in the three methods.

Our first experiment, see Figure 4.5, is the regularization of a random walk. As remarked in [70], $F^*_{\mathrm{TV}}$ is constant over large intervals and also has several sharp discontinuities. These two types of features were not present in the original function $F$ and are undesirable. They are avoided in the two other regularizations $F_{\mathrm{TV}}$ and $F_D$.

Our second experiment, see Figure 4.6, is the regularization of a Heaviside function. Up to numerical artifacts, the functions $F$ and $F^*_{\mathrm{TV}}$ are identical, which is precisely what is wanted in this situation. The discontinuity disappears in $F_{\mathrm{TV}}$ and $F_D$, and is replaced with a smooth but sharp and well localized transition. Numerical experiments on real (bidimensional) images in [70] show that generally, and with appropriate parameters, the regularization $F_{\mathrm{TV}}$ of an image $F$ does not cause a perceptible blurring of the edges which appear in the original image.

[Figure 4.5: Regularization of a random walk. (Top, Left) $F$, (Top, Right) $F^*_{\mathrm{TV}}$, (Bottom, Left) $F_{\mathrm{TV}}$, (Bottom, Right) $F_D$.]

[Figure 4.6: Regularization of a Heaviside function. (Top, Left) $F$, (Top, Right) $F^*_{\mathrm{TV}}$, (Bottom, Left) $F_{\mathrm{TV}}$, (Bottom, Right) $F_D$.]

[Figure 4.7: Regularization of a chirp. (Top, Left) $F$, (Top, Right) $F^*_{\mathrm{TV}}$, (Bottom, Left) $F_{\mathrm{TV}}$, (Bottom, Right) $F_D$.]

The Heaviside example brings to light a serious flaw of the regularization method $F_D$: it does not obey the maximum principle.
Indeed, it is clear on Figure 4.6 that F ∗ TV and F TV take theirvalues in the interval [min F, max F ], but F D does not.Our third and last experiment, see Figure 4.7 is the regularization of a “Churp” : anoscillating function of increasing frequency, here sin( x ) where x ∈ [0 , F ∗ TV again introduces un-desirable visual artefacts : F ∗ TV is exactly constant on some regions, and ∇ F ∗ TV has sharpdiscontinuities which do not correspond to a feature of the original function. In contrastthe MMSE regularizations F TV and F D do not exhibit these artefacts. As anticipated, theslow oscillations of the original function F are preserved, while the fast oscillations areattenuated. It seems qualitatively that fast oscillations are more strongly attenuated in F D than in F TV . The author could not prove this last property, but a heuristical analysissuggests that sinuosidal oscillations of large frequency ω (cid:29) ω − in F TV and ω − in F D .92 Chapter 4. From finite element approximation to image models The results on approximation by anisotropic bidimensional piecewise linear finite ele-ments that we have exposed in § m − ⊂ IR d by simplices. Here, thelocal error is defined as e m,T ( f ) p := (cid:107) f − I m − T f (cid:107) L p ( T ) , where I m − T denotes the local interpolation operator on IP m − for a d -dimensional simplex T . This operator is defined by the conditionI m − T v ( γ ) = v ( γ ) , for all points γ ∈ T with barycentric coordinates in the set { , m − , m − , · · · , } . Wedenote by IH m ⊂ IP m the collection of homogeneous polynomials of degree m , and wedefine for any q ∈ IH m the quantity K m,p ( q ) := inf | T | =1 e m,T ( q ) p . We refer to K m,p as the shape function . For piecewise linear elements in dimension two,i.e. m = d = 2, we have observed that K p = K ,p has the special form given by (4.10)which justifies the introduction of the quantity A p ( f ). 
In a similar way, it can easily beproved that for piecewise linear elements in higher dimension, i.e. m = 2 and d > 2, onehas c | det( q ) | /d ≤ K ,p ( q ) ≤ c | det( q ) | /d . For piecewise quadratic elements in dimension two, i.e. m = 3 and d = 2, it is proved inChapter 2 that c | disc( q ) | / ≤ K ,p ( q ) ≤ c | disc( q ) | / . for any homogeneous polynomial q ∈ IH , wheredisc( ax + bx y + cxy + dy ) := b c − ac − b d + 18 abcd − a d . For other values of m and d , equivalent expressions of K m,p ( q ) in terms of polynomials inthe coefficients of q are available but of less simple form, see Chapter 2.Defining the finite element interpolation error by an optimally adapted partition σ N ( f ) p := inf T ) ≤ N (cid:107) f − I m − T f (cid:107) L p , where I m − T is the global interpolation operator for the simplicial partition (possibly nonconforming) T , the following generalization of (4.13) is proved in Chapter 2 :lim sup N → + ∞ N md σ N ( f ) p ≤ C d (cid:13)(cid:13)(cid:13)(cid:13) K m,p (cid:16) d m fm ! (cid:17)(cid:13)(cid:13)(cid:13)(cid:13) L τ (Ω) , τ = 1 p + md . (4.56) .7. Extension to higher dimensions and higher order elements C d is equal to 1 when d = 2 but larger than 1 when d > f is a C m function of d variables, it is therefore natural to consider the quantity A m,p ( f ) := (cid:107) K m,p ( d m f ) (cid:107) L τ (Ω) , τ = 1 p + md , (4.57)as a possible way to measuring anisotropic smoothness. For d = 2 and piecewise linearelements, we have seen in § A ,p ( f ) is equivalent to the quantity A p ( f ).Similarly to A p we are interested in the possible extension of A m,p to cartoon functions.We first introduce a generalisation of the notion of cartoon functions to higher piecewisesmoothness m and dimension d . Definition 4.7.1. Let m ≥ and d ≥ be two integers. Let Ω ⊂ R d be an open set. 
Wesay that a function f defined on Ω is a C m cartoon function if it is almost everywhere ofthe form f = (cid:88) ≤ i ≤ k f i χ Ω i , where the Ω i are disjoint open sets with piecewise C boundary, no cusps (i.e. satisfyingan interior and exterior cone condition), and such that Ω = ∪ ki =1 Ω i . Additionally, for each ≤ i ≤ k , the function f i is assumed to be C m on Ω i . Let us consider a fixed cartoon function f on a polyhedral domain Ω ⊂ R d (i.e. Ω issuch that Ω is a closed polyhedron), and a decomposition (Ω i ) ≤ i ≤ k of Ω as in definition4.7.1. As before we define Γ := (cid:83) ≤ i ≤ k ∂ Ω i , the union of the boundaries of the Ω i . Ourassumptions on the sets (Ω i ) ≤ i ≤ k imply that Γ is the union of a finite number of openhypersurfaces (Γ j ) ≤ j ≤ l , and of a set P of dimension d − § f N of piecewise linear approximations of f onsimplicial partitions T N of cardinality N . We distinguish two types of elements of T N . Asimplex T ∈ T N is called “regular” if T ∩ Γ = ∅ , and we denote the set of these simplicesby T rN . Other simplices are called “edgy” and their set is denoted by T eN . We can againsplit Ω according to Ω := ( ∪ T ∈T rN T ) ∪ ( ∪ T ∈T eN T ) = Ω rN ∪ Ω eN . Heuristically, if the partitions T N are built with approximation error minimisation in mind,the number of elements should be balanced between T rN and T eN . The partition T rN tends tocover most of the surface of Ω, with simplices of diameter ≤ CN − d , and L ∞ approximationerror | f − f N | ≤ CN − md (since we use IP m − elements). On the other hand, since f hasdiscontinuities along Γ, the L ∞ approximation error on T eN does not tend to zero, and T eN should thus be chosen so as to produce a thin layer around Γ. Let h be the typicaldiameter of an element of T eN . 
Since the Γ j has bounded curvature, this layer can be madeof width O ( h ) and therefore the layer around Γ has volume bounded by h H d − (Γ) up toa fixed multiplicative constant, where H d − (Γ) is the d − h needed tocover Γ is bounded by h − d H d − (Γ) up to a fixed multiplicative constant. Eventually, wefind that the layer around Γ has volume bounded by CN − d − .94 Chapter 4. From finite element approximation to image models Hence we have the following heuristic error estimate, for a well designed anisotropicpartition : (cid:107) f − f N (cid:107) L p (Ω) ≤ (cid:107) f − f N (cid:107) L p (Ω rN ) + (cid:107) f − f N (cid:107) L p (Ω eN ) ≤ (cid:107) f − f N (cid:107) L ∞ (Ω rN ) | Ω rN | p + (cid:107) f − f N (cid:107) L ∞ (Ω eN ) | Ω eN | p ≤ C ( N − md + N − p ( d − )This leads us to define a critical exponent p c = p c ( m, d ) := 2 dm ( d − . If one measures the error in L p norm with p > p c ( m, d ), then the contribution of theedge neighbourhood Ω eN dominates, while if p < p c ( m, d ) it is negligible compared tothe contribution of the smooth region Ω rN . For the critical exponent p = p c ( m, d ) the twoterms have the same order, which makes the situation more interesting. Note in particularthat p c (2 , 2) = 2, which is consistent with our previous analysis.For p ≤ p c ( m, d ), we obtain the approximation rate N − m/d which suggests that ap-proximation results such as (4.56) should also apply to cartoon functions and that thequantity A m,p ( f ) should be finite for such functions. We again need to use a regularizationapproach, for the same reasons as in § d , we consider a radialnonnegative function ϕ of unit integral and supported in the unit ball of R d , and we definefor δ > ϕ δ ( z ) := 1 δ d ϕ (cid:16) zδ (cid:17) and f δ = f ∗ ϕ δ . (4.58)In order to define the quantities of involved in our conjecture, we need to introducethe second fundamental form of an hypersurface. At any point x ∈ Γ \ P we denote by n ( x ) the unit normal to Γ. 
Note that since $\Gamma$ is piecewise $C^2$, the map $x \mapsto \mathbf{n}(x)$ is $C^1$ on $\Gamma \setminus P$. We define $T_x\Gamma := \mathbf{n}(x)^\perp$, the tangent space to $\Gamma$ at $x$. In a neighbourhood of $x \in \Gamma \setminus P$, the hypersurface $\Gamma$ admits a parametrization of the form
\[ u \in T_x\Gamma \mapsto x + u + \lambda(u)\, \mathbf{n}(x) \in \Gamma \setminus P, \]
where $\lambda$ is a scalar valued $C^2$ function. By definition, the second fundamental form of $\Gamma$ at the point $x$ is the quadratic form $\mathrm{II}_x$ associated to $d^2\lambda(0)$, which is defined on $T_x\Gamma \times T_x\Gamma$. Alternatively, for all $u, v \in T_x\Gamma$ we have $\mathrm{II}_x(u, v) := -\langle \partial_u \mathbf{n}, v \rangle$. The Gauss curvature $\kappa(x)$ is the determinant of $\mathrm{II}_x$ in any orthonormal basis of $T_x\Gamma$,
\[ \kappa(x) := \det \mathrm{II}_x. \]
For example, in two space dimensions the tangent space $T_x\Gamma$ is one dimensional, and we simply have $\mathrm{II}_x(u, v) = \kappa(x) \langle u, v \rangle$. We also denote by $\sigma(x) \in \{0, \cdots, d-1\}$ the signature of the quadratic form $\mathrm{II}_x$, which is defined as the number of its positive eigenvalues.

With $\tau$ such that $\frac{1}{\tau} := \frac{m}{d} + \frac{1}{p}$, we define
\[ S_p(f) := \|K_{m,p}(d^m f)\|_{L^\tau(\Omega \setminus \Gamma)} = A_{m,p}(f_{|\Omega \setminus \Gamma}). \]
We conjecture the following generalization of Theorem 4.3.2.

Conjecture 4.7.2. There exist $d$ positive constants $C(k)$, $k \in \{0, \cdots, d-1\}$, that depend on $\varphi, p, m, d$, such that, with
\[ E_p(f) := \big\| C(\sigma)\, |\kappa|^{\frac{m}{2d}}\, [f] \big\|_{L^\tau(\Gamma)} = \left( \int_\Gamma \Big| C(\sigma(x))\, |\kappa(x)|^{\frac{m}{2d}}\, [f](x) \Big|^\tau dx \right)^{\frac{1}{\tau}}, \]
we have
– If $p < p_c$ then $\lim_{\delta \to 0} A_{m,p}(f_\delta) = S_p(f)$.
– If $p = p_c$ then $\lim_{\delta \to 0} A_{m,p}(f_\delta) = \big( S_p(f)^\tau + E_p(f)^\tau \big)^{1/\tau}$.
– If $p > p_c$ then $\lim_{\delta \to 0} \delta^{\frac{1}{p_c} - \frac{1}{p}} A_{m,p}(f_\delta) = E_p(f)$.
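The three regimes of the conjecture are determined by the position of $p$ with respect to the critical exponent $p_c(m, d) = \frac{2d}{m(d-1)}$. The following small sketch tabulates $p_c$, the exponent $\tau$, and the power of $\delta$ appearing in the third regime, using exact rational arithmetic; the function names are hypothetical and only illustrate the bookkeeping of exponents, not the conjecture itself.

```python
from fractions import Fraction


def critical_exponent(m, d):
    """p_c(m, d) = 2d / (m (d - 1)), defined for m >= 1 and d >= 2."""
    if m < 1 or d < 2:
        raise ValueError("need m >= 1 and d >= 2")
    return Fraction(2 * d, m * (d - 1))


def tau(m, d, p):
    """Exponent tau defined by 1/tau = m/d + 1/p."""
    return 1 / (Fraction(m, d) + 1 / Fraction(p))


def regime(m, d, p):
    """Return the regime of the conjecture and the power of delta by
    which A_{m,p}(f_delta) is rescaled in the corresponding limit."""
    p = Fraction(p)
    pc = critical_exponent(m, d)
    if p < pc:
        return "smooth part dominates", Fraction(0)
    if p == pc:
        return "critical", Fraction(0)
    return "edges dominate", 1 / pc - 1 / p


for m, d in [(2, 2), (3, 2), (2, 3)]:
    print(m, d, critical_exponent(m, d))
```

For piecewise linear elements in two dimensions this recovers $p_c(2,2) = 2$ and $\tau = 2/3$ at $p = 2$, in agreement with the bidimensional analysis.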
In the remainder of this section, we give some arguments that justify this conjecture.Given a cartoon function f , we define the sets Ω δ , Γ δ , Γ δ and P δ similarly to § (cid:90) Ω K ( d m f δ ) τ = (cid:90) Ω δ K ( d m f δ ) τ + (cid:90) P δ K ( d m f δ ) τ + (cid:90) Γ δ K ( d m f δ ) τ . (4.59)As in the proof of Theorem 4.3.2 the contribution of P δ can be proved to be negligiblecompared with those of Ω δ and Γ δ as δ → 0. The contribution of Ω δ satisfieslim δ → (cid:90) Ω δ K ( d m f δ ) τ = (cid:90) Ω \ Γ K ( d m f ) τ . The main difficulty lies again in the contribution of Γ δ . Let us define τ c by1 τ c := md + 1 p c . The contribution of Γ δ can be computed if one can establish an estimate generalizing(4.25) according to (cid:12)(cid:12)(cid:12) δ τc K m,p ( d m f δ ( z )) − | [ f ]( x ) || κ ( x ) | m d Φ m,d,σ ( x ) ( u ) (cid:12)(cid:12)(cid:12) ≤ ω ( δ ) (4.60)where ω ( δ ) → δ → x ∈ Γ δ , u ∈ [ − , z = x + δu n ( x ), and where the functionΦ m,d,k : [ − , → R only depends on m, d, k and ϕ . If (4.60) holds, we then easily derivethat lim δ → δ ττc − (cid:90) Γ δ K ( d m f δ ) τ = (cid:90) Γ C ( σ ) | κ | τm d | [ f ] | τ , with C ( k ) := (cid:82) − | Φ m,d,k ( u ) | τ du , which leads to the proof of the conjecture.We do not have a general proof of (4.60) for any m , p and d . In the following, we justifyits validity in two particular cases for which the explicit expression of K m,p is known tous : piecewise quadratic in two space dimensions ( d = 2 and m = 3) and piecewise linearin any dimension ( m = 2).96 Chapter 4. From finite element approximation to image models Piecewise quadratic elements in two dimensions. For all δ > x ∈ Γ δ and u ∈ [ − , π x,δ,u ∈ IH be the homogeneous cubic polynomial on R corresponding to d f δ ( x + δu n ( x )). 
Let also π x,u ∈ IH be the homogeneous cubic polynomial on R definedby π x,u ( λ n ( x ) + µ t ( x )) = − λ (Φ (cid:48)(cid:48) ( u ) λ − (cid:48) ( u ) κ ( x ) µ ) (4.61)for all ( λ, µ ) ∈ R , where Φ is defined by (4.17). For all x ∈ Γ, we denote by M x,δ the(symmetric) linear map defined by M x,δ n ( x ) = δ n ( x ) and M x,δ t ( x ) = √ δ t ( x ) . Then, using a reasoning similar to the one used in the appendix of this chapter, it can beproved that (cid:107) π x,δ,u ◦ M x,δ − [ f ]( x ) π x,u (cid:107) ≤ ω ( δ ) . (4.62)where lim δ → ω ( δ ) = 0 and the function ω depends only on f . Furthermore, it is provedin Chapter 2 that for all q ∈ IH K ,p ( q ) = C (cid:112) | disc q | where the positive constant C depends on p and the sign of disc q . Combining this ex-pression with (4.62) proves (4.60) and thus the conjecture in the case m = 3 and d = 2. Piecewise linear elements in any dimension. We use the second fundamental formof the discontinuity set Γ in order to evaluate d m f δ on Γ δ . Characteristic functions are oneof the simplest types of cartoon functions. In that case, it is possible to establish a simplerelation between the second fundamental form of the edge set and the second derivativesof f in a distributional sense : if Ω ⊂ R d is a bounded domain with smooth boundary Γand inward normal n , we then have for all C test function ψ and u, v ∈ R d − (cid:90) Ω ∂ u,v ψ = (cid:90) Γ (cid:104) u, n (cid:105)(cid:104) v, n (cid:105) ( ∂ n ψ − Tr(II) ψ ) + II (cid:48) ( u, v ) ψ, (4.63)where II (cid:48) x ( u, v ) is the second fundamental form II x applied to the orthogonal projectionof u and v on T x Γ. The proof of this formula (that generalizes the simpler bidimensionalcase (4.70) which is proved in the appendix) is given further below. For all x ∈ Γ, wedenote by M x,δ the (symmetric) linear map defined by M x,δ n ( x ) = δ n ( x ) and M x,δ t = √ δ t for all t ∈ T x Γ. 
For all δ > x ∈ Γ δ and u ∈ [ − , π x,δ,u ∈ IH be the homogeneousquadratic polynomial on R d corresponding to d f δ ( x + δu n ( x )). Let also π x,u ∈ IH be thehomogeneous quadratic polynomial on R d defined by π x,u ( λ n ( x ) + t ) = Φ (cid:48) ( u ) λ − Φ( u )II x ( t , t )for all λ ∈ R and t ∈ T x Γ, where Φ( x ) := (cid:82) R d − ϕ ( x, y ) dy . Then, using (4.63) and areasoning analogous to the one presented in the appendix, it can be proved that (cid:107) π x,δ,u ◦ M x,δ − [ f ]( x ) π x,u (cid:107) ≤ ω ( δ ) . (4.64) .7. Extension to higher dimensions and higher order elements δ → ω ( δ ) = 0 and the function ω depends only on f . Furthermore, it is provedin Chapter 2 that K ,p ( q ) = C d (cid:112) | det q | where the positive constant C depends on d, p and the signature of q ∈ IH . Combiningthis expression with (4.64) proves the estimate (4.60) and thus the conjecture in the case m = 2 in any dimension d > Proof of (4.63) : Let proj Γ be the orthogonal projection onto Γ, and for all x ∈ Γlet proj x be the orthogonal projection onto T x Γ. We consider a vector u ∈ R d and wedefine U : Γ → Γ by U( x ) := proj Γ ( x + u ). If (cid:107) u (cid:107)(cid:107) II (cid:107) L ∞ (Γ) < 1, then U is smooth and itsdifferential d x U : T x Γ → T x (cid:48) Γ, where x (cid:48) = U( x ) , is given by the following formula d x U = (Id −(cid:104) u, n ( x (cid:48) ) (cid:105) II x (cid:48) ) − proj x (cid:48) The determinant of d x U (more precisely the determinant of the matrix of d x U in directorthogonal bases of T x Γ and T x (cid:48) Γ) isdet( d x U) = det(Id −(cid:104) u, n ( x (cid:48) ) (cid:105) II x (cid:48) ) − (cid:104) n ( x ) , n ( x (cid:48) ) (cid:105) = 1 + (cid:104) u, n ( x (cid:48) ) (cid:105) Tr(II x (cid:48) ) + (cid:107) u (cid:107) ω ( u, x ) . where ω ( u, x ) tends uniformly to 0 as u → 0. 
Furthermore, it is easy to show that | ψ ( x + u ) − ψ ( x (cid:48) ) − (cid:104) u, n ( x ) (cid:105) ∂ n ( x (cid:48) ) ψ ( x (cid:48) ) | ≤ C (cid:107) u (cid:107) , and (cid:107) n ( x (cid:48) ) − n ( x ) − II x (cid:48) (proj x (cid:48) ( u )) (cid:107) ≤ (cid:107) u (cid:107) ω ( u ), where C and ω are independent of x ∈ Γand ω ( u ) → u → x ( r, s ) := −(cid:104) ∂ r n ( x ) , s (cid:105) for all r, s ∈ T x Γ isidentified here to the differential of n ). Combining these results, we obtain (cid:90) Γ ψ ( x + u ) (cid:104) n ( x ) , v (cid:105) dx = (cid:90) Γ ψ ( x + u ) (cid:104) n ( x ) , v (cid:105) det( d x (cid:48) U u ) − dx (cid:48) = (cid:90) Γ (cid:104) n ( x (cid:48) ) , v (cid:105) ψ ( x (cid:48) ) dx (cid:48) + (cid:90) Γ (cid:104) n ( x (cid:48) ) , v (cid:105)(cid:104) n ( x (cid:48) ) , u (cid:105) ( ∂ n ( x (cid:48) ) ψ − Tr(II (cid:48) x ) ψ ( x (cid:48) )) + (cid:104) v, II (cid:48) x (proj x (cid:48) ( u )) (cid:105) ψ ( x (cid:48) ) dx (cid:48) + (cid:107) u (cid:107) ω ( u ) . where ω ( u ) → u → 0. We conclude the proof of (4.63) using the formula − (cid:90) Ω ∂ u,v ψ = lim h → h − (cid:90) Γ ( ψ ( x + hu ) − ψ ( x )) (cid:104) n ( x ) , v (cid:105) dx. (cid:5) Chapter 4. From finite element approximation to image models Remark 4.7.3. Similarly to the results presented in § κ : if T is an affine transformation of R d with linear part L , and if f = ˜ f ◦ T , ˜Γ = T (Γ) and ˜ κ is the Gauss curvature of ˜Γ , then one has for any s ≥ , (det L ) d − d +1 (cid:90) Γ | C ( σ )[ f ] | s | κ | d +1 = (cid:90) ˜Γ | C (˜ σ )[ ˜ f ] | s | ˜ κ | d +1 . It follows from this observation that when p = p c , the contribution of the edges is affineinvariant in the sense that E p c ( ˜ f ) = (det L ) d − d +1 E p c ( f ) . Since one also has A m,p ( ˜ f ) = (det L ) d − d +1 A m,p ( f ) this comforts the conjecture. 
Let us mention that the quantity $|\kappa|^{\frac{1}{d+1}}$ has been used in [78] in order to define surface smoothing operators that are invariant under affine changes of coordinates.

In this chapter we have investigated the quantity $A_p(f)$, which governs the rate of approximation by anisotropic $\mathbb{P}_1$ finite elements, as a way to describe the anisotropic smoothness of functions. This quantity is not a semi-norm, due to the presence of the non-linear quantity $\det(d^2 f)$, and cannot be defined in a straightforward manner for general distributions. We have nevertheless shown that this quantity can be defined for cartoon images with geometrically smooth edges when $p \le 2$. A theoretical issue that remains open is to give a satisfactory meaning to the full class of functions for which this quantity is finite.

From a more applied perspective, it could be interesting to investigate the role of $A_p(f)$ in problems where anisotropic features naturally arise:

1. Approximation of PDEs: in the case of one-dimensional hyperbolic conservation laws, it was proved in [44] that despite the appearance of discontinuities the solution has high order smoothness in Besov spaces that govern the rate of adaptive approximation by piecewise polynomials. A natural question is to ask whether similar results hold in higher dimension, which corresponds to understanding if $A_p(f)$ remains bounded despite the appearance of shocks.

2. Image processing: as illustrated earlier in this chapter, the quantity $A_p(f)$ can easily be discretized and defined for pixelized images. It is therefore tempting to use $A_2(f)$ in a similar way as the total variation in (4.2), by solving a problem of the form
\[ \min_{g \in BV} \{ A_2(g) \ ;\ \|Tg - h\|_{L^2} \le \varepsilon \}, \qquad (4.65) \]
with the objective of promoting images with piecewise smooth edges. The main difficulty is that $A_2$ is not a convex functional. One way to solve this difficulty could be to reformulate (4.65) in a Bayesian framework as the search of a maximum of an a-posteriori probability distribution (MAP) as an estimator of $f$.
In this framework, we may instead search for a minimal mean-square error (MMSE) estimator, and this search can be implemented by stochastic algorithms which do not require the convexity of $A$, see [70] and §….

4.9 Appendix: proof of the estimates (4.22)-(4.23)-(4.24)

It is known since the work of Whitney on extension theorems (see in particular [91]) that for any open set $U \subset \mathbb{R}^d$ and any $g \in C^2(U)$ there exists $\tilde g \in C^2(\mathbb{R}^d)$ such that $\tilde g|_U = g$. It follows that for each $1 \le i \le k$ there exists a compactly supported $\tilde f_i \in C^2(\mathbb{R}^2)$ such that $\tilde f_i|_{\Omega_i} = f_i$.

Let $\Gamma_j$ be one of the pieces of $\Gamma$, between the domains $\Omega_k$ and $\Omega_l$, and let $t := \tilde f_k$ and $s := \tilde f_l - \tilde f_k$. Although the domains $\Omega_k$ and $\Omega_l$ are only piecewise smooth, there exists an open set $\Omega'$ with $C^2$ boundary such that, for $\delta_0 > 0$ small enough,
\[ f = s\,\chi_{\Omega'} + t \quad \text{on} \quad \bigcup_{0 < \delta \le \delta_0} (\Gamma_{j,\delta} + B_\delta), \]
where $B_\delta$ is the ball of radius $\delta$ centered at $0$. Note that $\Gamma_j \subset \Gamma' := \partial\Omega'$ and that $s = [f]$ on $\Gamma_j$. In the following, the variables $x, z$ are always subject to the restriction
\[ x \in \Gamma_{j,\delta} \quad \text{and} \quad z = U_\delta(x,u) = x + \delta u\, n(x), \quad \text{where } 0 < \delta \le \delta_0 \text{ and } |u| \le 1; \tag{4.66} \]
note that $z \in \Gamma_{j,\delta} + B_\delta$ and $\|x - z\| \le \delta$. We therefore have
\[ f_\delta(z) = \int_{\Omega'} s(\tilde x)\, \varphi_\delta(z - \tilde x)\, d\tilde x + t_\delta(z), \]
where $t_\delta := t * \varphi_\delta$. The second derivatives of $t_\delta$ are uniformly bounded, and are therefore negligible with regard to all three estimates (4.22), (4.23) and (4.24); indeed
\[ \|d^2 t_\delta\|_{L^\infty} = \|(d^2 t) * \varphi_\delta\|_{L^\infty} \le \|d^2 t\|_{L^\infty} \|\varphi_\delta\|_{L^1} = \|d^2 t\|_{L^\infty} \|\varphi\|_{L^1} < \infty. \]
We now define the $2 \times 2$ matrices
\[ I(z,x) := \int_{\Omega'} (s(\tilde x) - s(x))\, d^2\varphi_\delta(z - \tilde x)\, d\tilde x \quad \text{and} \quad J(z) := \int_{\Omega'} d^2\varphi_\delta(z - \tilde x)\, d\tilde x, \]
so that
\[ d^2 f_\delta(z) = d^2 t_\delta(z) + I(z,x) + [f](x)\, J(z). \tag{4.67} \]
We already know that the contribution of $d^2 t_\delta$ is negligible.
We now prove that the same holds for the contribution of $I(z,x)$. Since $\varphi_\delta(z - \tilde x)$ is non-zero only if $\|\tilde x - z\| \le \delta$, and therefore $\|\tilde x - x\| \le 2\delta$, we can bound the norm of the matrix $I(z,x)$ by
\[ \|I(z,x)\| \le 2\delta\, \|ds\|_{L^\infty} \|d^2\varphi_\delta\|_{L^1} \le 2\delta\, \|ds\|_{L^\infty} \|d^2\varphi\|_{L^1}\, \delta^{-2} = C\delta^{-1}. \tag{4.68} \]
This proves that the contribution of $I(z,x)$ is negligible for the two estimates (4.22) and (4.23). In order to prove that it is also negligible in the estimate (4.24), we need a finer analysis of $t(x)^T I(z,x)\, t(x)$. For this purpose we fix a unit vector $u$ and the pair $(x,z)$. We introduce
\[ \Lambda(\tilde x) := (s(\tilde x) - s(x))\, \partial_u\varphi_\delta(z - \tilde x) + \partial_u s(\tilde x)\, \varphi_\delta(z - \tilde x), \]
so that, by the Leibniz rule,
\[ (s(\tilde x) - s(x))\, \partial_{u,u}\varphi_\delta(z - \tilde x) = \partial_{u,u} s(\tilde x)\, \varphi_\delta(z - \tilde x) - \partial_u \Lambda(\tilde x). \]
Therefore
\begin{align*}
u^T I(z,x)\, u &= \int_{\Omega'} \big( \partial_{u,u} s(\tilde x)\, \varphi_\delta(z - \tilde x) - \partial_u\Lambda(\tilde x) \big)\, d\tilde x \\
&= \int_{\Omega'} \partial_{u,u} s(\tilde x)\, \varphi_\delta(z - \tilde x)\, d\tilde x - \int_{\Gamma'} \Lambda(\tilde x)\, \langle n(\tilde x), u\rangle\, d\tilde x.
\end{align*}
The first integral clearly satisfies
\[ \left| \int_{\Omega'} \partial_{u,u} s(\tilde x)\, \varphi_\delta(z - \tilde x)\, d\tilde x \right| \le \|d^2 s\|_{L^\infty} \|\varphi_\delta\|_{L^1}, \]
and is therefore bounded independently of $\delta$.
We estimate the second integral for the special case $u = t(x)$, remarking that $|\langle n(\tilde x), t(x)\rangle| \le C_0\delta$ on the domain of integration. Therefore
\[ \left| \int_{\Gamma'} \Lambda(\tilde x)\, \langle n(\tilde x), t(x)\rangle\, d\tilde x \right| \le C_0\delta\, |\Gamma' \cap B(z,\delta)|\, \|\Lambda\|_{L^\infty}, \]
where, slightly abusing notations, we denote by $|\Gamma' \cap B(z,\delta)|$ the length (1-dimensional Hausdorff measure) of the curve $\Gamma' \cap B(z,\delta)$. Clearly $\Lambda(\tilde x) = 0$ if $\|z - \tilde x\| \ge \delta$. If $\|z - \tilde x\| \le \delta$ we have
\[ |\Lambda(\tilde x)| \le (\|x - z\| + \|z - \tilde x\|)\, \|ds\|_{L^\infty} \|d\varphi\|_{L^\infty}\, \delta^{-3} + \|ds\|_{L^\infty} \|\varphi\|_{L^\infty}\, \delta^{-2} \le C_1\delta^{-2}. \tag{4.69} \]
Since in addition $|\Gamma' \cap B(z,\delta)| \le C_2\delta$, we finally find that
\[ \left| \int_{\Gamma'} \Lambda(\tilde x)\, \langle n(\tilde x), t(x)\rangle\, d\tilde x \right| \le C_0 C_1 C_2. \]
We have therefore proved that
\[ |t(x)^T I(z,x)\, t(x)| \le C, \]
where the constant $C$ is independent of $\delta$, which shows that the contribution of $I(z,x)$ is negligible in (4.24).

We now analyze the contribution of the quantity $[f](x) J(z)$ in (4.67). For this purpose, we use an expression of the second derivative of the characteristic function $\chi_{\Omega'}$ of a smooth set $\Omega'$ in the distribution sense. We assume without loss of generality that $\Gamma'$ is parametrized in the trigonometric sense, and therefore that $n$ is the inward normal to $\Omega'$.
For any test function $\psi$, we have
\[ -\int_{\Omega'} \partial_{u,v}\psi = \int_{\Gamma'} \partial_u\psi\, \langle v, n\rangle = \int_{\Gamma'} \big( \partial_n\psi\, \langle u, n\rangle + \partial_t\psi\, \langle u, t\rangle \big) \langle v, n\rangle \]
and, by integration by parts,
\[ \int_{\Gamma'} \partial_t\psi\, \langle u, t\rangle \langle v, n\rangle = -\int_{\Gamma'} \psi\, \big( \langle u, \kappa n\rangle \langle v, n\rangle - \langle u, t\rangle \langle v, \kappa t\rangle \big). \]
Combining these two identities we obtain
\[ -\int_{\Omega'} \partial_{u,v}\psi = \int_{\Gamma'} \langle u, n\rangle \langle v, n\rangle (\partial_n\psi - \kappa\psi) + \kappa\, \langle u, t\rangle \langle v, t\rangle\, \psi. \tag{4.70} \]
Applying this formula to $\psi(\tilde x) := \varphi_\delta(z - \tilde x)$ we obtain
\begin{align*}
-u^T J(z)\, v = {} & \int_{\Gamma'} \langle u, n(\tilde x)\rangle \langle v, n(\tilde x)\rangle \big( \partial_n\varphi_\delta(z - \tilde x) - \kappa(\tilde x)\, \varphi_\delta(z - \tilde x) \big)\, d\tilde x \\
& + \int_{\Gamma'} \kappa(\tilde x)\, \langle u, t(\tilde x)\rangle \langle v, t(\tilde x)\rangle\, \varphi_\delta(z - \tilde x)\, d\tilde x. \tag{4.71}
\end{align*}
Since $\Gamma_j$ is $C^2$, there exists a constant $C$ such that for all $x_1, x_2 \in \Gamma_j$ we have
\[ |\langle t(x_1), n(x_2)\rangle| \le C\|x_1 - x_2\| \]
and
\[ |1 - \langle n(x_1), n(x_2)\rangle| = |1 - \langle t(x_1), t(x_2)\rangle| \le C\|x_1 - x_2\|. \]
We finally remark that $|\Gamma' \cap B(z,\delta)| \le C\delta$, and that $\|\varphi_\delta\|_{L^\infty} \le \|\varphi\|_{L^\infty}\delta^{-2}$ and $\|\partial_n\varphi_\delta\|_{L^\infty} \le \|d\varphi\|_{L^\infty}\delta^{-3}$.

Taking the vectors $t(x)$ or $n(x)$ as possible values of $u$ and $v$ in (4.71) and using the above remarks, we obtain the estimates
\[ \left| n(x)^T J(z)\, n(x) + \int_{\Gamma'} \partial_n\varphi_\delta(z - \tilde x)\, d\tilde x \right| \le C\delta^{-1}, \tag{4.72} \]
\[ |t(x)^T J(z)\, n(x)| \le C\delta^{-1}, \tag{4.73} \]
\[ \left| t(x)^T J(z)\, t(x) + \int_{\Gamma'} \kappa(\tilde x)\, \varphi_\delta(z - \tilde x)\, d\tilde x \right| \le C, \tag{4.74} \]
where the constant $C$ depends only on $f$. In view of (4.67) we can immediately derive the estimate (4.23) from (4.73).

In order to derive the estimate (4.24) from (4.74), we first introduce the modulus of continuity $\omega$ of $\kappa$ on $\Gamma_j$,
\[ \omega(\delta) := \sup_{x_1, x_2 \in \Gamma_j ;\ \|x_1 - x_2\| \le \delta} |\kappa(x_1) - \kappa(x_2)|. \]
Therefore
\[ \left| t(x)^T J(z)\, t(x) + \kappa(x) \int_{\Gamma'} \varphi_\delta(z - \tilde x)\, d\tilde x \right| \le \left| t(x)^T J(z)\, t(x) + \int_{\Gamma'} \kappa(\tilde x)\, \varphi_\delta(z - \tilde x)\, d\tilde x \right| + C\omega(\delta)\, \delta^{-1}. \]
We now claim that
\[ \left| \int_{\Gamma'} \varphi_\delta(z - \tilde x)\, d\tilde x - \delta^{-1}\Phi(u) \right| \le C \tag{4.75} \]
holds with $C$ independent of $\delta$, which implies the validity of (4.24). In order to prove (4.75), we use a local parametrization of $\Gamma'$: let $\lambda : \mathbb{R} \to \mathbb{R}$ be such that for $h$ small enough we have $x + h\, t(x) + \lambda(h)\, n(x) \in \Gamma'$. Note that we have $|\lambda(h)| \le C h^2$ and $|\lambda'(h)| \le C h$ for $h$ small enough.
Then for $\delta$ small enough,
\begin{align*}
\left| \int_{\Gamma'} \varphi_\delta(z - \tilde x)\, d\tilde x - \delta^{-1}\Phi(u) \right|
&\le \left| \int_{\mathbb{R}} \varphi_\delta\big(h\, t(x) + (\delta u - \lambda(h))\, n(x)\big) \sqrt{1 + \lambda'(h)^2}\, dh - \int_{\mathbb{R}} \varphi_\delta\big(h\, t(x) + \delta u\, n(x)\big)\, dh \right| \\
&\le C\delta \left( \|\varphi_\delta\|_{L^\infty} \big(\sqrt{1 + C^2\delta^2} - 1\big) + \|d\varphi_\delta\|_{L^\infty}\, C\delta^2 \right) \le C.
\end{align*}
Finally, we can derive the estimate (4.22) from (4.72) using the inequality
\[ \left| \int_{\Gamma'} \partial_n\varphi_\delta(z - \tilde x)\, d\tilde x + \delta^{-2}\Phi'(u) \right| \le C\delta^{-1}, \tag{4.76} \]
whose proof is very similar to that of (4.75).

Part III
Mesh adaptation and riemannian metrics

Chapter 5
Are riemannian metrics equivalent to simplicial meshes?

Contents
5.1 Introduction
5.2 General properties of graded and quasi-acute metrics
5.3 … $d-1$ at each point
5.4 From mesh to metric
5.5 From metric to mesh

5.1 Introduction

Triangulations and meshes are finite objects of combinatorial nature: they can be described by a collection of vertices and of connections between these. This description is well adapted to the demonstration of algebraic results, such as the Euler formula, and to computer processing. In contrast, many approaches towards anisotropic mesh generation are based on a continuous object, namely a riemannian metric $z \mapsto H(z)$, in other words a continuous function $H$ from the domain $\Omega \subset \mathbb{R}^d$ to the set $S_d^+$ of symmetric positive definite matrices.

Figure 5.1: triangles $T$ and their associated ellipses $\mathcal{E}_T$.

Once this metric has been properly designed, it is the task of a mesh generation algorithm such as [15, 66, 93, 94] to generate a triangulation that agrees with this metric, in a sense specified below (5.2).
The purpose of this chapter is to formulate precise equivalence results between some classes of triangulations and of riemannian metrics. This equivalence translates some geometrical constraints satisfied by the triangulations into regularity properties of the equivalent riemannian metrics.

Our results are so far limited to meshes and metrics defined on the entire infinite domain $\mathbb{R}^d$, and the dimension $d \ge 2$ is in some cases restricted to $d = 2$. The choice of the domain $\mathbb{R}^d$ is guided by simplicity, since the curvature and the singularities of the boundary of a bounded domain induce additional difficulties from the point of view of computational mesh generation. Bounded domains will be the object of future work.

We denote by $\mathbf{T}$ the collection of conforming simplicial meshes of $\mathbb{R}^d$, and we introduce in §… three subsets of $\mathbf{T}$ of particular interest,
\[ \mathbf{T}_{i,C} \subset \mathbf{T}_{a,C} \subset \mathbf{T}_{g,C}, \]
where $C \ge 1$ is a constant. The smallest set $\mathbf{T}_{i,C}$ collects all isotropic meshes $\mathcal{T}$, which are heuristically defined as follows: the simplices $T \in \mathcal{T}$ may have strongly varying volumes, but their aspect ratio is uniformly bounded. The largest set $\mathbf{T}_{g,C}$ collects graded meshes, which only satisfy a condition of local consistency: all the geometrical features of the simplices $T \in \mathcal{T}$ (volume, aspect ratio and orientation) may vary strongly, but two neighboring simplices should not be excessively different. The intermediate set $\mathbf{T}_{a,C}$ of quasi-acute meshes is defined by a condition which involves the measure of sliverness $S(T)$ of a simplex $T$, introduced in Chapter 3 and also discussed in Chapter 6. The measure of sliverness plays an important role in the finite element approximation of a function on a mesh $\mathcal{T}$ when the error is measured in the Sobolev $W^{1,p}$ norm.

We associate to each simplex $T$ a symmetric positive definite matrix $\mathcal{H}_T$ such that the ellipsoid
\[ \mathcal{E}_T := \{ z \in \mathbb{R}^d \; ; \; (z - z_T)^T \mathcal{H}_T (z - z_T) \le 1 \}, \tag{5.1} \]
which is centered at the barycenter $z_T$ of $T$, is the ellipsoid of minimal volume containing $T$. Two triangles $T$ and their associated ellipses $\mathcal{E}_T$ are illustrated on Figure 5.1.
We give in §… an expression of $\mathcal{H}_T$ in terms of the coordinates of the vertices of the simplex $T$, and we discuss its main properties. The matrix $\mathcal{H}_T \in S_d^+$ encodes the volume, the aspect ratio and the orientation (but not the angles) of the simplex $T$.

A riemannian metric on $\mathbb{R}^d$ is a continuous map $H$ which associates to any $z \in \mathbb{R}^d$ a symmetric positive definite matrix $H(z) \in S_d^+$. We denote by $\mathbf{H} = C^0(\mathbb{R}^d, S_d^+)$ the collection of riemannian metrics.

Definition 5.1.1. We say that a mesh $\mathcal{T} \in \mathbf{T}$ is $C$-equivalent to a given metric $H \in \mathbf{H}$, where $C \ge 1$ is a fixed constant, if for all $T \in \mathcal{T}$ and all $z \in T$ one has
\[ C^{-2} H(z) \le \mathcal{H}_T \le C^2 H(z). \tag{5.2} \]
We say that a collection of simplicial meshes $\mathbf{T}_* \subset \mathbf{T}$ is equivalent to a collection $\mathbf{H}_*$ of metrics if there exists a uniform constant $C \ge 1$ such that the following holds:
– For any mesh $\mathcal{T} \in \mathbf{T}_*$ there exists a metric $H \in \mathbf{H}_*$ such that $\mathcal{T}$ and $H$ are $C$-equivalent.
– For any metric $H \in \mathbf{H}_*$ there exists a mesh $\mathcal{T} \in \mathbf{T}_*$ such that $\mathcal{T}$ and $H$ are $C$-equivalent.

Our main result (Theorem 5.1.14) states that when the constant $C$ is sufficiently large, and if the dimension is $d = 2$, the three classes of meshes $\mathbf{T}_{i,C} \subset \mathbf{T}_{a,C} \subset \mathbf{T}_{g,C}$ are respectively equivalent to three classes of metrics
\[ \mathbf{H}_i \subset \mathbf{H}_a \subset \mathbf{H}_g, \]
which are defined by precise smoothness conditions on the function $z \mapsto H(z)$ in §…. Although the matrix $\mathcal{H}_T$ does not encode the angles of a simplex $T$, the fact that a mesh $\mathcal{T} \in \mathbf{T}_{a,C}$ is quasi-acute can be translated into a precise regularity condition on the metric $H$ equivalent to $\mathcal{T}$.

In order to state our results more precisely we need to introduce some notations.

We first recall some notations of linear algebra. We denote by $M_d$ the collection of $d \times d$ matrices with real entries. We denote by $GL_d$ the general linear group, by $SL_d$ the special linear group and by $O_d$ the orthogonal group:
\begin{align*}
GL_d &:= \{ A \in M_d \; ; \; \det A \ne 0 \}, \\
SL_d &:= \{ A \in M_d \; ; \; \det A = 1 \}, \\
O_d &:= \{ U \in M_d \; ; \; U^T U = \mathrm{Id} \}.
\end{align*}
We denote by $S_d$ the collection of symmetric matrices, by $S_d^\oplus$ the subset of non-negative symmetric matrices, and by $S_d^+$ the collection of symmetric positive definite matrices:
\begin{align*}
S_d &:= \{ M \in M_d \; ; \; M = M^T \}, \\
S_d^\oplus &:= \{ M \in S_d \; ; \; z^T M z \ge 0 \text{ for all } z \in \mathbb{R}^d \}, \\
S_d^+ &:= S_d^\oplus \cap GL_d.
\end{align*}
For any $M, M' \in S_d$, we write $M \le M'$ if $M' - M \in S_d^\oplus$, and $M < M'$ if $M' - M \in S_d^+$. We recall that for any $M, M' \in S_d$ such that $M \le M'$ and any $\phi \in M_d$ we have $\phi^T M \phi \le \phi^T M' \phi$. Furthermore if $M, M' \in S_d^+$ and $M \le M'$ then $M'^{-1} \le M^{-1}$. For each symmetric positive definite matrix $M \in S_d^+$ we define a norm $\|\cdot\|_M$ on $\mathbb{R}^d$ as follows: for all $u \in \mathbb{R}^d$,
\[ \|u\|_M := \sqrt{u^T M u}. \]

We briefly recall the definition of a $d$-dimensional simplex, and of the collection of its faces.

Definition 5.1.2. A $d$-dimensional simplex $T$ is the convex envelope of a set $V \subset \mathbb{R}^d$ of $d+1$ vertices, not contained in a $(d-1)$-dimensional affine subspace of $\mathbb{R}^d$: $T = \mathrm{Cvx}(V)$. We denote by $\mathcal{F}(T)$ the collection of faces of $T$ of any dimension,
\[ \mathcal{F}(T) := \{ \mathrm{Cvx}(V') \; ; \; V' \subset V \}. \tag{5.3} \]

Observe that $\emptyset$ and $T$ are among the faces of a simplex $T$, since they respectively correspond to $V' = \emptyset$ and $V' = V$. We denote by $z_T \in \mathbb{R}^d$ the barycenter of a simplex $T$ of vertices $V$,
\[ z_T := \frac{1}{d+1} \sum_{v \in V} v, \]
and we define a symmetric positive definite matrix $\mathcal{H}_T \in S_d^+$ as follows:
\[ \mathcal{H}_T^{-1} = \frac{d}{d+1} \sum_{v \in V} (v - z_T)(v - z_T)^T. \tag{5.4} \]
The next proposition shows that this definition is consistent with the characterization of $\mathcal{H}_T$ given in (5.1), and gives some of the key properties of the matrix $\mathcal{H}_T$. Throughout this chapter, we denote by $T_{eq}$ a $d$-dimensional equilateral simplex, centered at the origin, i.e. such that $z_{T_{eq}} = 0$, and having its vertices on the unit sphere. One easily checks that $\mathcal{H}_{T_{eq}} = \mathrm{Id}$.
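As an illustration, the definition (5.4) can be evaluated numerically. The following is a minimal sketch restricted to the planar case $d = 2$; the function names `simplex_matrix` and `equilateral_triangle` are ours and purely illustrative, and matrices are stored as nested tuples to avoid any external dependency.

```python
import math


def simplex_matrix(vertices):
    """Return H_T as a 2x2 nested tuple, for a triangle given by three (x, y) vertices.

    Sketch of (5.4) with d = 2:
        H_T^{-1} = d/(d+1) * sum over vertices v of (v - z_T)(v - z_T)^T.
    """
    d = 2
    if len(vertices) != d + 1:
        raise ValueError("a triangle needs exactly three vertices")
    # Barycenter z_T of the triangle.
    zx = sum(v[0] for v in vertices) / (d + 1)
    zy = sum(v[1] for v in vertices) / (d + 1)
    # Entries (a, b; b, c) of the symmetric matrix H_T^{-1}.
    a = b = c = 0.0
    for x, y in vertices:
        dx, dy = x - zx, y - zy
        a += dx * dx
        b += dx * dy
        c += dy * dy
    scale = d / (d + 1)
    a, b, c = scale * a, scale * b, scale * c
    det = a * c - b * b
    if det <= 0.0:
        raise ValueError("degenerate triangle: vertices are aligned")
    # Invert the 2x2 symmetric matrix to obtain H_T.
    return ((c / det, -b / det), (-b / det, a / det))


def equilateral_triangle():
    """Vertices of T_eq: equilateral, centered at 0, vertices on the unit circle."""
    return [
        (math.cos(math.pi / 2 + 2 * math.pi * k / 3),
         math.sin(math.pi / 2 + 2 * math.pi * k / 3))
        for k in range(3)
    ]


if __name__ == "__main__":
    print(simplex_matrix(equilateral_triangle()))      # close to the identity
    print(simplex_matrix([(0.0, 0.0), (4.0, 0.0), (0.0, 1.0)]))
```

For the equilateral triangle the computed matrix agrees with the identity up to rounding errors, in accordance with the remark $\mathcal{H}_{T_{eq}} = \mathrm{Id}$.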
Proposition 5.1.3. The following holds for any $d$-dimensional simplex $T$.
– For any affine change of coordinates $\Phi$, $\Phi(z) := \phi z + z_0$ where $\phi \in GL_d$ and $z_0 \in \mathbb{R}^d$, we have
\[ \mathcal{H}_{\Phi^{-1}(T)} = \phi^T \mathcal{H}_T\, \phi. \tag{5.5} \]
– There exists a rotation $U \in O_d$, depending on $T$, such that
\[ \mathcal{H}_T^{1/2} (T - z_T) = U(T_{eq}), \tag{5.6} \]
hence
\[ |T| \sqrt{\det \mathcal{H}_T} = |T_{eq}|. \tag{5.7} \]
– We have the inclusions
\[ \{ z \in \mathbb{R}^d \; ; \; \|z - z_T\|_{\mathcal{H}_T} \le 1/d \} \subset T \subset \mathcal{E}_T := \{ z \in \mathbb{R}^d \; ; \; \|z - z_T\|_{\mathcal{H}_T} \le 1 \}, \tag{5.8} \]
and these two ellipsoids are respectively the one of largest volume included in $T$, and the one of smallest volume containing $T$.

Proof: For any vertex $\Phi^{-1}(v)$ of the simplex $\Phi^{-1}(T)$, we have
\[ \Phi^{-1}(v) - z_{\Phi^{-1}(T)} = \Phi^{-1}(v) - \Phi^{-1}(z_T) = \phi^{-1}(v - z_T). \]
Injecting this relation in (5.4) we obtain (5.5). We now turn to the proof of (5.6), and for that purpose we remark that the simplices $T_{eq}$ and $T' := \mathcal{H}_T^{1/2}(T - z_T)$ both have their barycenter at zero, hence there exists a linear map $\phi \in GL_d$ such that $T' = \phi^{-1}(T_{eq})$. We thus obtain, using the formula (5.5) for the transformation of $\mathcal{H}_T$ under affine changes of coordinates,
\[ \mathcal{H}_{T'} = \mathcal{H}_T^{-1/2}\, \mathcal{H}_T\, \mathcal{H}_T^{-1/2} = \mathrm{Id} \quad \text{and} \quad \mathcal{H}_{\phi^{-1}(T_{eq})} = \phi^T \mathcal{H}_{T_{eq}}\, \phi = \phi^T \phi. \]
Hence $\phi^T \phi = \mathrm{Id}$, which establishes that $\phi \in O_d$ and concludes the proof of (5.6). It is well known that the smallest ellipsoid containing $T_{eq}$ is the unit ball, and that the largest ellipsoid included in $T_{eq}$ is the ball of radius $1/d$. Combining this fact with the change of coordinates (5.6) we obtain (5.8), which concludes the proof of this proposition.
$\square$

Note that for any simplex $T$ and any $z, z' \in T$ we obtain using (5.8)
\[ \|z - z'\|_{\mathcal{H}_T} \le \|z - z_T\|_{\mathcal{H}_T} + \|z' - z_T\|_{\mathcal{H}_T} \le 2, \tag{5.9} \]
and
\[ \frac{2}{d} \|\mathcal{H}_T^{-1/2}\| \le \mathrm{diam}(T) \le 2 \|\mathcal{H}_T^{-1/2}\|. \tag{5.10} \]
We introduce a measure of degeneracy $\rho(T) \in [1, \infty)$ of a $d$-dimensional simplex $T$,
\[ \rho(T) := \sqrt{\|\mathcal{H}_T\| \|\mathcal{H}_T^{-1}\|}. \tag{5.11} \]
Note that $\rho(T) = 1$ if and only if $\mathcal{H}_T$ is proportional to the identity, which implies in view of (5.6) that $T = z_T + r\, U(T_{eq})$ for some $r > 0$ and $U \in O_d$.

The measure of degeneracy $\rho$ is slightly different from the measure of degeneracy $\mathrm{diam}(T)^d / |T|$ used in Chapters 2 and 3, but both play the same role: they are minimal for equilateral simplices, and increase as the simplex becomes thinner.

The measure of sliverness $S(T)$ of a simplex $T$ is a quantity which plays an important role in the context of optimal mesh adaptation for the finite element approximation of a function in the Sobolev $W^{1,p}$ norm, see [10, 62] and Chapters 3 and 6. It is introduced in Chapter 3, where it is illustrated by several examples and equivalent quantities, and it is defined by
\[ S(T) := \inf \{ \|\phi\| \|\phi^{-1}\| \; ; \; \phi \in GL_d \text{ and } \phi(T) \text{ is acute} \}. \tag{5.12} \]
We recall that a simplex is acute if and only if the exterior normals $n, n'$ to any two distinct $(d-1)$-dimensional faces satisfy $\langle n, n'\rangle \le 0$. As observed in Chapter 3, the measure of sliverness $S(T)$ can be interpreted as the distance from $T$ to the collection of acute simplices.

The exterior normals $n, n'$ to any two distinct $(d-1)$-dimensional faces of $T_{eq}$ satisfy $\langle n, n'\rangle = -1/d$, hence this simplex is acute. Recalling (5.6) we thus obtain for any simplex $T$
\[ S(T) \le \rho(T). \]
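The two measures can be compared numerically on planar triangles. The sketch below computes $\rho(T)$ from the eigenvalues of $\mathcal{H}_T^{-1}$, following (5.4) and (5.11). For the sliverness it does not evaluate the infimum (5.12): it assumes instead the two-dimensional closed form of Proposition 3.2.1 of Chapter 3, read here as $S(T) = \max\{1, \tan(\theta/2)\}$ with $\theta$ the largest angle of $T$. All function names are illustrative.

```python
import math


def degeneracy(vertices):
    """rho(T) = sqrt(||H_T|| ||H_T^{-1}||) for a triangle, see (5.4) and (5.11).

    Since H_T is symmetric positive definite, rho(T)^2 is the ratio of the
    largest to the smallest eigenvalue of H_T^{-1}.
    """
    zx = sum(v[0] for v in vertices) / 3.0
    zy = sum(v[1] for v in vertices) / 3.0
    a = sum((x - zx) ** 2 for x, _ in vertices) * 2.0 / 3.0
    c = sum((y - zy) ** 2 for _, y in vertices) * 2.0 / 3.0
    b = sum((x - zx) * (y - zy) for x, y in vertices) * 2.0 / 3.0
    # Eigenvalues of the symmetric matrix ((a, b), (b, c)).
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return math.sqrt((mean + radius) / (mean - radius))


def largest_angle(vertices):
    """Largest interior angle of a triangle, in radians."""
    angles = []
    for k in range(3):
        px, py = vertices[k]
        qx, qy = vertices[(k + 1) % 3]
        rx, ry = vertices[(k + 2) % 3]
        ux, uy = qx - px, qy - py
        vx, vy = rx - px, ry - py
        # atan2(|cross|, dot) is a numerically robust angle between two vectors.
        angles.append(math.atan2(abs(ux * vy - uy * vx), ux * vx + uy * vy))
    return max(angles)


def sliverness_2d(vertices):
    """S(T) for a triangle, ASSUMING the formula S(T) = max(1, tan(theta/2))."""
    return max(1.0, math.tan(0.5 * largest_angle(vertices)))


if __name__ == "__main__":
    right = [(0.0, 0.0), (4.0, 0.0), (0.0, 1.0)]   # anisotropic right triangle
    flat = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.05)]   # flat obtuse triangle
    for tri in (right, flat):
        print(sliverness_2d(tri), degeneracy(tri))
```

On both examples the computed sliverness stays below the measure of degeneracy. The right triangle is strongly anisotropic yet has sliverness one, which illustrates that $\mathcal{H}_T$ alone does not determine $S(T)$.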
Apart from this upper bound, the matrix $\mathcal{H}_T$ does not contain any direct information on the measure of sliverness $S(T)$. For any bidimensional triangle $T$ of largest angle $\theta$, Proposition 3.2.1 of Chapter 3 states that
\[ S(T) = \max \left\{ 1, \tan\frac{\theta}{2} \right\}. \]

As a starter, we recall the definition of a conforming simplicial mesh of $\mathbb{R}^d$.

Definition 5.1.4. A conforming simplicial mesh of $\mathbb{R}^d$ is a collection $\mathcal{T}$ of simplices which satisfy the conformity axiom: for all $T, T' \in \mathcal{T}$,
\[ T \cap T' \in \mathcal{F}(T) \cap \mathcal{F}(T'), \tag{5.13} \]
as well as the following properties:
– (Covering) The simplices $T \in \mathcal{T}$ cover the whole infinite domain $\mathbb{R}^d$: $\bigcup_{T \in \mathcal{T}} T = \mathbb{R}^d$.
– (Partition) The interiors of the simplices $T \in \mathcal{T}$ are pairwise disjoint: for any $T, T' \in \mathcal{T}$, $\mathrm{int}(T) \cap \mathrm{int}(T') \ne \emptyset$ implies $T = T'$.
– (Local finiteness) For any compact set $K \subset \mathbb{R}^d$, only a finite collection of simplices $T \in \mathcal{T}$ intersect $K$: the set $\{ T \in \mathcal{T} \; ; \; K \cap T \ne \emptyset \}$ is finite.

The conformity axiom (5.13) states that the intersection of any two simplices in $\mathcal{T}$ needs to be a full common face. We denote by $\mathbf{T}$ the collection of conforming simplicial meshes of $\mathbb{R}^d$.

For optimal efficiency, numerous numerical simulation software packages use anisotropic meshes, in which the simplices may have an arbitrary shape. Anisotropic meshes are used, for instance, to create a thin layer of simplices close to a geometric feature which is relevant in a numerical simulation. In order to avoid excessively wild meshes, one often requires some consistency between the shapes of neighboring simplices. We therefore introduce for any constant $C \ge 1$ the collection $\mathbf{T}_{g,C} \subset \mathbf{T}$ of graded meshes, which is defined as follows.

Definition 5.1.5. A mesh $\mathcal{T}$ belongs to $\mathbf{T}_{g,C}$ if for any two simplices $T, T' \in \mathcal{T}$ one has
\[ T \cap T' \ne \emptyset \;\Rightarrow\; C^{-2} \mathcal{H}_T \le \mathcal{H}_{T'} \le C^2 \mathcal{H}_T. \tag{5.14} \]

Figure 5.2: a quasi-acute triangulation and its refinement.

We also define for any $C \ge 1$ the collection $\mathbf{T}_{i,C}$ of isotropic meshes, by requiring in addition to (5.14) that the measure of degeneracy $\rho$ is uniformly bounded by $C$, which forbids any anisotropy.

Definition 5.1.6. A mesh $\mathcal{T}$ belongs to $\mathbf{T}_{i,C}$ if and only if $\mathcal{T} \in \mathbf{T}_{g,C}$ and for all $T \in \mathcal{T}$
\[ \rho(T) \le C. \]

In order to introduce the intermediate collection $\mathbf{T}_{a,C}$ of $C$-quasi-acute meshes, we first define the notion of refinement of a simplicial mesh.

Definition 5.1.7. Consider two meshes $\mathcal{T}, \mathcal{T}' \in \mathbf{T}$ and a constant $C \ge 1$. We say that $\mathcal{T}'$ is a $C$-refinement of $\mathcal{T}$ if it satisfies the following properties:
– (Inclusion) Any simplex $T' \in \mathcal{T}'$ is contained in a simplex $T \in \mathcal{T}$.
– (Bounded refinement) Any simplex $T \in \mathcal{T}$ contains at most $C$ simplices $T' \in \mathcal{T}'$.

Definition 5.1.8. A graded mesh $\mathcal{T} \in \mathbf{T}_{g,C}$ belongs to the collection $\mathbf{T}_{a,C}$ of quasi-acute meshes if and only if there exists a $C$-refinement $\mathcal{T}'$ of $\mathcal{T}$ which satisfies for all $T' \in \mathcal{T}'$
\[ S(T') \le C. \]

In particular if $\mathcal{T} \in \mathbf{T}_{i,C}$, then choosing $\mathcal{T}' = \mathcal{T}$, which is a $C$-refinement since $C \ge 1$, and recalling that $S(T) \le \rho(T)$ for any simplex $T$, we obtain that $\mathcal{T} \in \mathbf{T}_{a,C}$. Thus
\[ \mathbf{T}_{i,C} \subset \mathbf{T}_{a,C} \subset \mathbf{T}_{g,C} \]
as announced.

The constraints defining graded, quasi-acute and isotropic meshes are illustrated on Figure 8 in the main introduction of the thesis. Figure 5.2 gives an example of a quasi-acute triangulation which, by the refinement process described above, leads to a slightly finer triangulation on which the measure of sliverness $S(T)$ is uniformly bounded.

For any metric $H \in \mathbf{H} := C^0(\mathbb{R}^d, S_d^+)$ and any path $\gamma \in C^1([a,b], \mathbb{R}^d)$, we define the riemannian length $l_H(\gamma)$ by the integral
\[ l_H(\gamma) := \int_a^b \|\gamma'(t)\|_{H(\gamma(t))}\, dt. \]
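The riemannian length can be approximated by a simple quadrature. The sketch below is restricted to $d = 2$ and uses a midpoint rule in which $\gamma'(t)\,dt$ is replaced by the chord of each sub-interval; the names `riemannian_length` and `norm_M`, as well as the sample metric $H(z) = \mathrm{Id}/s(z)^2$ with $s(z) = 1 + |z|$, are illustrative choices and not taken from the text.

```python
import math


def norm_M(u, M):
    """Norm ||u||_M = sqrt(u^T M u) for a symmetric positive definite 2x2 matrix M."""
    (a, b), (_, c) = M
    return math.sqrt(a * u[0] * u[0] + 2.0 * b * u[0] * u[1] + c * u[1] * u[1])


def riemannian_length(metric, path, t0=0.0, t1=1.0, n=1000):
    """Approximate l_H(gamma), the integral of ||gamma'(t)||_{H(gamma(t))} over [t0, t1].

    `metric` maps a point (x, y) to a 2x2 nested tuple, and `path` maps a
    parameter t to a point (x, y). Midpoint rule with n sub-intervals.
    """
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        ta, tb = t0 + k * h, t0 + (k + 1) * h
        pa, pb = path(ta), path(tb)
        mid = path(0.5 * (ta + tb))
        chord = (pb[0] - pa[0], pb[1] - pa[1])  # approximates gamma'(t) * h
        total += norm_M(chord, metric(mid))
    return total


def sample_metric(z):
    """Isotropic metric H(z) = Id / s(z)^2 with the 1-Lipschitz scale s(z) = 1 + |z|."""
    s = 1.0 + math.hypot(z[0], z[1])
    return ((1.0 / s ** 2, 0.0), (0.0, 1.0 / s ** 2))


def segment(t):
    """Straight path from (0, 0) to (1, 0)."""
    return (t, 0.0)


if __name__ == "__main__":
    print(riemannian_length(sample_metric, segment), math.log(2.0))
```

Along this radial segment the exact length is $\int_0^1 dt/(1+t) = \ln 2$, which the quadrature reproduces. Since $d_H$ is defined as an infimum over paths, the length of any particular path only provides an upper bound on the distance between its endpoints.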
The riemannian distance $d_H$ between two points $z, z' \in \mathbb{R}^d$ is the infimum of the lengths of all paths joining $z$ and $z'$,
\[ d_H(z,z') := \inf \big\{ l_H(\gamma) \; ; \; \gamma \in C^1([0,1], \mathbb{R}^d), \ \gamma(0) = z \text{ and } \gamma(1) = z' \big\}. \tag{5.15} \]
If $H, H' \in \mathbf{H}$ are such that $H(z) \le H'(z)$ for all $z \in \mathbb{R}^d$, then $d_H(z,z') \le d_{H'}(z,z')$ for all $z, z' \in \mathbb{R}^d$. By construction, the riemannian distance $d_H$ associated to a metric $H \in \mathbf{H}$ is locally equivalent, close to a point $z$, to the distance defined by the norm $\|\cdot\|_{H(z)}$: for all $z \in \mathbb{R}^d$,
\[ \lim_{\varepsilon \to 0} \ \sup_{p, q \in B(z,\varepsilon),\ p \ne q} \big| \ln d_H(p,q) - \ln \|p - q\|_{H(z)} \big| = 0, \tag{5.16} \]
where $B(z,\varepsilon)$ denotes the euclidean ball of radius $\varepsilon$ around $z$.

Remark 5.1.9 (Riemannian geodesics). When a metric $H$ is sufficiently smooth, a special family of curves named riemannian geodesics can be defined and studied. A riemannian geodesic is a curve $\gamma \in C^2(\mathbb{R}, \mathbb{R}^d)$ which satisfies a second order differential equation called the equation of geodesics. This equation implies that for any sufficiently close $t, t' \in \mathbb{R}$ the curve $\gamma|_{[t,t']}$ is the path of smallest possible length joining the points $\gamma(t)$ and $\gamma(t')$: $d_H(\gamma(t), \gamma(t')) = l_H(\gamma|_{[t,t']})$.
Geodesics are not defined for general continuous or Lipschitz riemannian metrics, such as those constituting the sets $\mathbf{H}$ and $\mathbf{H}_i \subset \mathbf{H}_a \subset \mathbf{H}_g$, since the coefficients of the equation of geodesics involve the second derivatives of the metric. Notions related to geodesics, such as the injectivity radius of a metric, therefore have no meaning in our context. This is not an issue for our purposes since, following the point of view of [58], we study riemannian distances and not riemannian geodesics.

Definition 5.1.10.
We denote by $\mathbf{H}_i$ the collection of metrics $H \in \mathbf{H}$ which are proportional to the identity: for all $z \in \mathbb{R}^d$
\[ H(z) = \frac{\mathrm{Id}}{s(z)^2}, \]
and such that the proportionality factor $s$ satisfies one of the two following conditions, which are surprisingly equivalent as shown in Proposition 5.2.7:
– ("Additive" Lipschitz condition)
\[ |s(z) - s(z')| \le |z - z'| \quad \text{for all } z, z' \in \mathbb{R}^d. \tag{5.17} \]
– ("Multiplicative" Lipschitz condition)
\[ |\ln s(z) - \ln s(z')| \le d_H(z,z') \quad \text{for all } z, z' \in \mathbb{R}^d. \tag{5.18} \]

The two Lipschitz properties (5.17) and (5.18) have a natural generalisation to general anisotropic metrics, but they are not equivalent in that context. In order to introduce the regularity conditions defining the classes $\mathbf{H}_a$ and $\mathbf{H}_g$ of metrics, we need to introduce two distances $d_\times$ and $d_+$ on the set $S_d^+$ of symmetric positive definite matrices. For any $M, M' \in S_d^+$ we define
\[ d_\times(M,M') := \min \{ \delta \ge 0 \; ; \; e^{-2\delta} M \le M' \le e^{2\delta} M \}. \tag{5.19} \]
For instance, a mesh $\mathcal{T}$ is $C$-equivalent to a metric $H$, as defined in (5.2), if and only if for all $T \in \mathcal{T}$ and all $z \in T$ one has
\[ d_\times(H(z), \mathcal{H}_T) \le \ln C. \]
There exist various expressions of the distance $d_\times$: for all $M, M' \in S_d^+$ one easily checks that
\[ d_\times(M,M') = \sup_{u \in \mathbb{R}^d \setminus \{0\}} \big| \ln \|u\|_M - \ln \|u\|_{M'} \big| = \sup_{|u| = 1} \big| \ln \|u\|_M - \ln \|u\|_{M'} \big|, \tag{5.20} \]
and
\[ d_\times(M,M') = \ln \Big( \max \big\{ \|M^{-1/2} M'^{1/2}\|, \ \|M'^{-1/2} M^{1/2}\| \big\} \Big). \tag{5.21} \]
This last expression implies that
\[ d_\times(M,M') \ge \frac{1}{2} \max \big\{ |\ln \|M\| - \ln \|M'\||, \ |\ln \|M^{-1}\| - \ln \|M'^{-1}\|| \big\}. \tag{5.22} \]
The exponential of the distance $d_\times$ was used in the earlier paper [66] under the name "relative deformation".
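For $2 \times 2$ matrices the distance $d_\times$ can be computed in closed form. The sketch below relies on the characterization (5.20) with $\|u\|_M = \sqrt{u^T M u}$, which gives $d_\times(M,M') = \frac{1}{2} \max |\ln \mu|$ where $\mu$ ranges over the eigenvalues of $M^{-1}M'$. This eigenvalue formulation is our own restatement, not a formula quoted from the text, and the function names are illustrative. A brute-force sampling of (5.20) over unit vectors is included as a cross-check.

```python
import math


def d_mult(M, Mp):
    """d_x(M, M') for symmetric positive definite 2x2 matrices (nested tuples).

    Uses d_x = 0.5 * max |ln mu| over the eigenvalues mu of M^{-1} M',
    which are real and positive.
    """
    (a, b), (_, c) = M
    (ap, bp), (_, cp) = Mp
    det_M = a * c - b * b
    trace = (c * ap - 2.0 * b * bp + a * cp) / det_M   # trace of M^{-1} M'
    det = (ap * cp - bp * bp) / det_M                  # determinant of M^{-1} M'
    disc = max(trace * trace - 4.0 * det, 0.0)         # clamp rounding errors
    mu_max = 0.5 * (trace + math.sqrt(disc))
    mu_min = det / mu_max
    return 0.5 * max(abs(math.log(mu_max)), abs(math.log(mu_min)))


def d_mult_sampled(M, Mp, n=20000):
    """Approximate the supremum in (5.20) by sampling unit vectors u."""
    (a, b), (_, c) = M
    (ap, bp), (_, cp) = Mp
    best = 0.0
    for k in range(n):
        theta = math.pi * k / n
        x, y = math.cos(theta), math.sin(theta)
        q = a * x * x + 2.0 * b * x * y + c * y * y
        qp = ap * x * x + 2.0 * bp * x * y + cp * y * y
        # ln ||u||_M - ln ||u||_M' = 0.5 * ln(q / qp)
        best = max(best, 0.5 * abs(math.log(q / qp)))
    return best


def is_equivalent(H_z, H_T, C):
    """C-equivalence test at one point, in the form d_x(H(z), H_T) <= ln C."""
    return d_mult(H_z, H_T) <= math.log(C)


if __name__ == "__main__":
    M = ((2.0, 0.5), (0.5, 1.0))
    Mp = ((1.0, 0.0), (0.0, 3.0))
    print(d_mult(M, Mp), d_mult_sampled(M, Mp))
```

The closed form and the sampled supremum agree up to the sampling resolution, and for isotropic matrices $s^{-2}\mathrm{Id}$ and $s'^{-2}\mathrm{Id}$ the function returns $|\ln s - \ln s'|$.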
We define a second distance $d_+$ on $S_d^+$ as follows:
\[ d_+(M,M') := \|M^{-1/2} - M'^{-1/2}\|. \tag{5.23} \]
The expressions (5.21) and (5.23) of the distances $d_\times$ and $d_+$ show that they are respectively tied to multiplicative and additive properties of matrices, which justifies their notations. Note also that for any $s, s' > 0$
\[ d_\times(s^{-2}\,\mathrm{Id}, s'^{-2}\,\mathrm{Id}) = |\ln s - \ln s'| \quad \text{and} \quad d_+(s^{-2}\,\mathrm{Id}, s'^{-2}\,\mathrm{Id}) = |s - s'|. \tag{5.24} \]

Definition 5.1.11. We denote by $\mathbf{H}_g$ the collection of metrics $H \in \mathbf{H}$ which satisfy the natural extension of (5.18) to general anisotropic metrics:
\[ d_\times(H(z), H(z')) \le d_H(z,z') \quad \text{for all } z, z' \in \mathbb{R}^d. \tag{5.25} \]

This condition can be heuristically described as follows: a metric $H \in \mathbf{H}$ defines close to each point $z \in \mathbb{R}^d$ a norm $\|\cdot\|_{H(z)}$ which encodes an anisotropic notion of scale. The equation (5.25) states that the metric $H$ itself is consistent at the scale that it encodes.

Definition 5.1.12. We denote by $\mathbf{H}_a$ the collection of metrics $H \in \mathbf{H}$ which satisfy both (5.25) and the natural extension of (5.17) to general anisotropic metrics:
\[ d_+(H(z), H(z')) \le |z - z'| \quad \text{for all } z, z' \in \mathbb{R}^d. \tag{5.26} \]

In addition to the results of mesh generation presented in this chapter, we show in the next chapter, Lemma 6.4.1, that this condition is critical in order to define a local averaging operator compatible with Sobolev norms.

Remark 5.1.13. Let $H \in \mathbf{H}$, and let $\lambda > 0$. Using the homogeneity properties of the distances $d_\times$, $d_+$ on $S_d^+$ and $d_H$ on $\mathbb{R}^d$ with respect to $H$, we obtain that $\lambda^2 H \in \mathbf{H}_g$ if and only if
\[ d_\times(H(z), H(z')) \le \lambda\, d_H(z,z') \quad \text{for all } z, z' \in \mathbb{R}^d, \tag{5.27} \]
and $\lambda^2 H \in \mathbf{H}_a$ if and only if we have in addition to the previous property
\[ d_+(H(z), H(z')) \le \lambda\, |z - z'| \quad \text{for all } z, z' \in \mathbb{R}^d. \tag{5.28} \]
In particular the collections $\mathbf{H}_i \subset \mathbf{H}_a \subset \mathbf{H}_g$ of isotropic, quasi-acute and graded metrics are stable under multiplication by a constant larger than one: for any $\star \in \{i, a, g\}$, any $H \in \mathbf{H}_\star$ and any $\lambda \ge 1$ we have $\lambda^2 H \in \mathbf{H}_\star$.

The main result of this chapter is the equivalence of the classes of meshes and metrics defined above, which is announced in the first part of the introduction and proved in sections §5.4 and §5.5, with some points restricted to dimension $d = 2$.

Theorem 5.1.14. There exists $C_0 = C_0(d)$ such that for all $C \ge C_0$ the following holds.
i) The collections $\mathbf{T}_{i,C}$ of triangulations and $\mathbf{H}_i$ of metrics are equivalent.
ii) If $d = 2$, then the collections $\mathbf{T}_{a,C}$ of triangulations and $\mathbf{H}_a$ of metrics are equivalent.
iii) If $d = 2$, then the collections $\mathbf{T}_{g,C}$ of triangulations and $\mathbf{H}_g$ of metrics are equivalent.

Let us comment on this result. The theory of isotropic meshes is already well developed, and our result i) in this direction should be regarded as a reformulation of previous work, intended to put into perspective the results on anisotropic meshes. The main ingredient of the proof of i) is the mesh refinement procedure exposed in [77].

In the case iii) of graded metrics, the key ingredients used for the construction of a bidimensional triangulation $\mathcal{T} \in \mathbf{T}_{g,C}$ equivalent to a metric $H \in \mathbf{H}_g$ come from the computational geometry paper [66]. In contrast, the arguments used to produce a metric $H \in \mathbf{H}_g$ from a mesh $\mathcal{T} \in \mathbf{T}_{g,C}$ hold in any dimension.

Last, the author is not aware of any earlier results on the collection $\mathbf{T}_{a,C}$ of quasi-acute meshes, and most of the techniques used in that context are new.

Conjecture 5.1.15. The author conjectures that the points ii) and iii) of Theorem 5.1.14 hold without restriction on the dimension $d \ge 2$.

This chapter is organised as follows. We establish in section §5.2 some general properties of the metrics in $\mathbf{H}_a$ or $\mathbf{H}_g$. We first prove the invariance of the collection $\mathbf{H}_g$ of graded metrics with respect to affine changes of coordinates. The regularity assumptions (5.25) and (5.26) defining graded and quasi-acute metrics simply state that the map $H : \mathbb{R}^d \to S_d^+$ is Lipschitz with respect to some distances on $\mathbb{R}^d$ and $S_d^+$, and we therefore study metrics from this point of view. We then study the space $\mathbb{R}^d$ equipped with the distance $d_H$ and the measure $\sqrt{\det H(z)}\, dz$ associated to a graded metric $H \in \mathbf{H}_g$. In particular we compare the distance $d_H$ with the euclidean distance and we estimate the volume of balls.

Section §5.3 is devoted to metrics $H \in \mathbf{H}$ such that the symmetric matrix $H(z)$ has an eigenspace of dimension at least $d - 1$ at each point. This property always holds when $d = 2$, but is also relevant in some applications to higher dimension, such as the creation of a thin layer of simplices close to a $(d-1)$-dimensional surface. For such metrics, the classes $\mathbf{H}_a$ and $\mathbf{H}_g$ are characterized in terms of the two eigenvalues of $H$ and of the direction of the eigenspaces.

The last two sections, §5.4 (from mesh to metric) and §5.5 (from metric to mesh), are devoted to the proof of the main result, for metrics $H \in \mathbf{H}_i$, $\mathbf{H}_a$ or $\mathbf{H}_g$.

5.2 General properties of graded and quasi-acute metrics

We study in this section the general properties of the distinguished classes of metrics $\mathbf{H}_i \subset \mathbf{H}_a \subset \mathbf{H}_g$. We first establish some invariance properties of these collections of metrics with respect to affine changes of coordinates. We then regard metrics as Lipschitz functions $H : \mathbb{R}^d \to S_d^+$. The last subsection is devoted to the comparison of the distance $d_H$ and the measure $\sqrt{\det H(z)}\, dz$ associated to a graded metric $H \in \mathbf{H}_g$ with the standard euclidean distance and Lebesgue measure.

We focus in this subsection on the invariance properties of the collections of meshes $\mathbf{T}_{i,C} \subset \mathbf{T}_{a,C} \subset \mathbf{T}_{g,C}$ and of metrics $\mathbf{H}_i \subset \mathbf{H}_a \subset \mathbf{H}_g$ introduced in this chapter. Let $\Phi : \mathbb{R}^d \to \mathbb{R}^d$ be an affine change of coordinates, $\Phi(z) := \phi z + z_0$ where $\phi \in GL_d$ and $z_0 \in \mathbb{R}^d$. We have for any simplex $T$, as previously observed in (5.5),
\[ \mathcal{H}_{\Phi^{-1}(T)} = \phi^T \mathcal{H}_T\, \phi. \tag{5.29} \]
Mimicking this property, we define for any $H \in \mathbf{H}$ a transported version $H_\Phi \in \mathbf{H}$ as follows: for all $z \in \mathbb{R}^d$
\[ H_\Phi(z) = \phi^T H(\Phi(z))\, \phi. \tag{5.30} \]
In the language of differential geometry, the metric $H_\Phi$ is generally referred to as the "pull back" of $H$ by $\Phi$.
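The transformation rule (5.29) and the pull back (5.30) can be checked numerically for triangles. The sketch below recomputes $\mathcal{H}_T$ from (5.4) with $d = 2$ and compares $\mathcal{H}_{\Phi^{-1}(T)}$ with $\phi^T \mathcal{H}_T \phi$ for an arbitrarily chosen affine map; all helper names and the sample values of $\phi$ and $z_0$ are illustrative.

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def transpose(A):
    return ((A[0][0], A[1][0]), (A[0][1], A[1][1]))


def inverse(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return ((A[1][1] / det, -A[0][1] / det), (-A[1][0] / det, A[0][0] / det))


def apply(A, v):
    """Matrix-vector product A v."""
    return (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])


def simplex_matrix(vertices):
    """H_T of a triangle, from H_T^{-1} = (2/3) sum_v (v - z_T)(v - z_T)^T, see (5.4)."""
    zx = sum(v[0] for v in vertices) / 3.0
    zy = sum(v[1] for v in vertices) / 3.0
    a = sum((x - zx) ** 2 for x, _ in vertices) * 2.0 / 3.0
    c = sum((y - zy) ** 2 for _, y in vertices) * 2.0 / 3.0
    b = sum((x - zx) * (y - zy) for x, y in vertices) * 2.0 / 3.0
    return inverse(((a, b), (b, c)))


def preimage_triangle(vertices, phi, z0):
    """Vertices of Phi^{-1}(T), where Phi(z) = phi z + z0."""
    phi_inv = inverse(phi)
    return [apply(phi_inv, (x - z0[0], y - z0[1])) for x, y in vertices]


def pull_back_metric(metric, phi, z0):
    """Return the metric H_Phi(z) = phi^T H(Phi(z)) phi of (5.30)."""
    def pulled(z):
        w = apply(phi, z)
        image = (w[0] + z0[0], w[1] + z0[1])
        return mat_mul(transpose(phi), mat_mul(metric(image), phi))
    return pulled


if __name__ == "__main__":
    T = [(0.0, 0.0), (4.0, 0.0), (0.0, 1.0)]
    phi, z0 = ((2.0, 1.0), (0.0, 1.0)), (1.0, -1.0)
    print(simplex_matrix(preimage_triangle(T, phi, z0)))
    print(mat_mul(transpose(phi), mat_mul(simplex_matrix(T), phi)))
```

The two printed matrices coincide up to rounding errors, which is the content of (5.29). The same conjugation by $\phi$ is what `pull_back_metric` applies pointwise to a metric.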
If a mesh $\mathcal{T} \in \mathbf{T}$ is $C$-equivalent to a metric $H \in \mathbf{H}$, then (5.29) and (5.30) clearly imply that the mesh $\Phi^{-1}(\mathcal{T})$ is $C$-equivalent to the metric $H_\Phi$.

The next proposition studies the invariance of the collections $\mathbf{T}_{\star,C}$ of meshes and $\mathbf{H}_\star$ of metrics, $\star \in \{i, a, g\}$, under the transformations $\mathcal{T} \mapsto \Phi^{-1}(\mathcal{T})$ or $H \mapsto H_\Phi$ induced by an affine change of coordinates $\Phi$.

Proposition 5.2.1. Let $\Phi : \mathbb{R}^d \to \mathbb{R}^d$ be an affine change of coordinates, $\Phi(z) := \phi z + z_0$ where $\phi \in GL_d$ and $z_0 \in \mathbb{R}^d$. The following holds for any constant $C \geq 1$.
i) If $\mathcal{T} \in \mathbf{T}_{g,C}$ then $\Phi^{-1}(\mathcal{T}) \in \mathbf{T}_{g,C}$. If $H \in \mathbf{H}_g$ then $H_\Phi \in \mathbf{H}_g$.
ii) Assume that $\Phi$ is an isometry, in other words $\phi \in O_d$. If $\mathcal{T} \in \mathbf{T}_{a,C}$ then $\Phi^{-1}(\mathcal{T}) \in \mathbf{T}_{a,C}$, and similarly if $\mathcal{T} \in \mathbf{T}_{i,C}$ then $\Phi^{-1}(\mathcal{T}) \in \mathbf{T}_{i,C}$. If $H \in \mathbf{H}_a$ then $H_\Phi \in \mathbf{H}_a$, and similarly if $H \in \mathbf{H}_i$ then $H_\Phi \in \mathbf{H}_i$.

Proof: We first establish the properties announced for meshes, and we then focus on metrics. Let $M, M' \in S_d^+$, let $C \geq 1$ and let $\phi \in GL_d$. Clearly
\[
\text{if } C^{-2} M \leq M' \leq C^2 M \text{ then } C^{-2} \phi^T M \phi \leq \phi^T M' \phi \leq C^2 \phi^T M \phi.
\]
In view of the definition (5.14) of $\mathbf{T}_{g,C}$ we obtain that $\Phi^{-1}(\mathcal{T}) \in \mathbf{T}_{g,C}$ for any mesh $\mathcal{T} \in \mathbf{T}_{g,C}$, as announced in i).

If $\Phi$ is an isometry then for any simplex $T$ we obtain, using the definitions (5.11) and (5.12) of the measure of degeneracy $\rho$ and the measure of sliverness $S$, that
\[
\rho(\Phi^{-1}(T)) = \rho(T) \quad \text{and} \quad S(\Phi^{-1}(T)) = S(T).
\]
This implies the properties of $\mathbf{T}_{a,C}$ and $\mathbf{T}_{i,C}$ announced in ii), which concludes the proof of the properties announced for meshes.

We therefore turn our attention to metrics. The fact that $H_\Phi \in \mathbf{H}_g$ for any $H \in \mathbf{H}_g$ and any affine change of coordinates $\Phi$ is a special case of Proposition 5.2.2 below. We thus admit this property and we focus on isotropic and quasi-acute metrics and isometric changes of coordinates.

If $H \in \mathbf{H}_i$, then $H(z) = \operatorname{Id}/s(z)^2$ where $s : \mathbb{R}^d \to \mathbb{R}_+^*$ is a Lipschitz function.
Since $\phi \in O_d$ we have $\phi^T \phi = \operatorname{Id}$ and therefore
\[
H_\Phi(z) = \phi^T H(\Phi(z)) \phi = \operatorname{Id}/s(\Phi(z))^2.
\]
The map $s \circ \Phi : \mathbb{R}^d \to \mathbb{R}_+^*$ is Lipschitz since it is the composition of a Lipschitz function with an isometry, which establishes that $H_\Phi \in \mathbf{H}_i$ as announced.

Since $\phi \in O_d$ we have for any symmetric matrix $M \in S_d^+$ and any exponent $\alpha \in \mathbb{R}$
\[
(\phi^T M \phi)^\alpha = \phi^T M^\alpha \phi.
\]
Consider $H \in \mathbf{H}_a$, then for any $z, z' \in \mathbb{R}^d$
\[
\begin{aligned}
\| H_\Phi(z)^{-1/2} - H_\Phi(z')^{-1/2} \| &= \| \phi^T \left( H(\Phi(z))^{-1/2} - H(\Phi(z'))^{-1/2} \right) \phi \| \\
&= \| H(\Phi(z))^{-1/2} - H(\Phi(z'))^{-1/2} \| \\
&\leq | \Phi(z) - \Phi(z') | = | \phi(z - z') | = | z - z' |,
\end{aligned}
\]
which establishes that $H_\Phi$ satisfies the Lipschitz regularity condition (5.26). We already know from the case of graded metrics that the other regularity condition (5.25) is satisfied by $H_\Phi$, which establishes that $H_\Phi \in \mathbf{H}_a$ and concludes the proof of this proposition. $\diamond$

In practical applications, one often needs to mesh simultaneously a domain $\Omega$ and a surface $\Gamma$ embedded in $\Omega$. In terms of metrics this raises the following question: given a metric $H$ on $\Omega$ which satisfies certain properties of regularity, what is the regularity of the restriction of this metric to $\Gamma$?

The next proposition answers this question in the simplified setting where $\Omega$ is the infinite domain $\mathbb{R}^d$, $\Gamma$ is an affine subspace of $\mathbb{R}^d$, and the properties of regularity are those defining the collection $\mathbf{H}_g$ of graded metrics. For any $1 \leq k \leq d$ we denote by $\mathbf{H}_g(\mathbb{R}^k)$ the collection of graded metrics on $\mathbb{R}^k$ (with the convention that $\mathbf{H}_g(\mathbb{R})$ is the collection $\operatorname{Lip}(\mathbb{R}, \mathbb{R}_+^*)$ of Lipschitz functions from $\mathbb{R}$ to $\mathbb{R}_+^*$).

Proposition 5.2.2. Let $1 \leq k \leq d$, let $\Phi : \mathbb{R}^k \to \mathbb{R}^d$ be an affine injective map, $\Phi(z) = \phi z + z_0$ where $\phi \in M_{d,k}$ has rank $k$ and $z_0 \in \mathbb{R}^d$. Let $H \in \mathbf{H}$ and let for all $z \in \mathbb{R}^k$
\[
H_\Phi(z) := \phi^T H(\Phi(z)) \phi.
\]
If $H \in \mathbf{H}_g$ then $H_\Phi \in \mathbf{H}_g(\mathbb{R}^k)$.
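As an illustration of the statement, the following sketch restricts an isotropic graded metric of $\mathbb{R}^2$ to an affine line ($k = 1$, $d = 2$) and tests on random samples that the restricted metric is again described by a 1-Lipschitz size function. Several ingredients are assumptions made for the example: the size function $s(z) = 1 + |z|$, the line defined by `PHI` and `Z0`, and the identification of a one-dimensional metric $h$ with the function $h^{-1/2}$, which is our reading of the convention on $\mathbf{H}_g(\mathbb{R})$.

```python
import math
import random

def s(z):
    """Assumed 1-Lipschitz size function; the metric is H(z) = Id / s(z)^2."""
    return 1.0 + math.hypot(z[0], z[1])

PHI = (3.0, 4.0)   # direction of the line, a 2x1 matrix of rank 1 (assumption)
Z0 = (0.5, -2.0)   # a point of the line (assumption)

def Phi(t):
    """Affine injective map from R to R^2."""
    return (PHI[0] * t + Z0[0], PHI[1] * t + Z0[1])

def H_restricted(t):
    """PHI^T H(Phi(t)) PHI, a positive scalar since k = 1."""
    return (PHI[0] ** 2 + PHI[1] ** 2) / s(Phi(t)) ** 2

def s_restricted(t):
    """Size function of the restricted metric: H_restricted = 1 / s_restricted^2."""
    return H_restricted(t) ** -0.5

# Estimate the dilatation of s_restricted from random pairs of parameters.
random.seed(0)
worst = 0.0
for _ in range(10000):
    t1 = random.uniform(-10.0, 10.0)
    t2 = random.uniform(-10.0, 10.0)
    if t1 != t2:
        ratio = abs(s_restricted(t1) - s_restricted(t2)) / abs(t1 - t2)
        worst = max(worst, ratio)
print(worst)
```

The observed ratio never exceeds one (up to rounding), in agreement with the proposition in this special case: here $s_\Phi(t) = s(\Phi(t))/|\phi|$ and $|\Phi(t) - \Phi(t')| = |\phi|\,|t - t'|$.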
Proof: For any $z \in \mathbb{R}^k$ and any $u \in \mathbb{R}^k$, one has
\[
\| u \|_{H_\Phi(z)} = \| \phi(u) \|_{H(\Phi(z))}. \tag{5.31}
\]
Hence for any $z, z' \in \mathbb{R}^k$ we obtain using (5.20),
\[
d_\times(H_\Phi(z), H_\Phi(z')) = \sup_{u \in \mathbb{R}^k \setminus \{0\}} \left| \ln \| \phi(u) \|_{H(\Phi(z))} - \ln \| \phi(u) \|_{H(\Phi(z'))} \right| \leq d_\times(H(\Phi(z)), H(\Phi(z'))). \tag{5.32}
\]
For any path $\gamma \in C^1([0,1], \mathbb{R}^k)$, it follows from (5.31) that
\[
l_H(\Phi \circ \gamma) = \int_0^1 \| \phi(\gamma'(t)) \|_{H(\Phi(\gamma(t)))} \, dt = l_{H_\Phi}(\gamma).
\]
Taking the infimum among all paths $\gamma \in C^1([0,1], \mathbb{R}^k)$ joining two points $z, z' \in \mathbb{R}^k$, we obtain, since $\Phi \circ \gamma \in C^1([0,1], \mathbb{R}^d)$ is a path joining $\Phi(z)$ and $\Phi(z')$,
\[
d_H(\Phi(z), \Phi(z')) \leq d_{H_\Phi}(z, z'). \tag{5.33}
\]
Combining (5.32) and (5.33), we conclude the proof of this proposition. $\diamond$

Definition 5.2.3. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces and let $f \in C^0(X, Y)$ be a continuous function. We define the dilatation $\operatorname{dil}(f) \in [0, \infty]$ of $f$ as follows
\[
\operatorname{dil}(f) := \sup_{x, x' \in X,\ x \neq x'} \frac{d_Y(f(x), f(x'))}{d_X(x, x')}.
\]
We say that $f$ is Lipschitz if and only if $\operatorname{dil}(f) \leq 1$, and more generally that $f$ is $\lambda$-Lipschitz if and only if $\operatorname{dil}(f) \leq \lambda$.

We define the local dilatation $\operatorname{dil}_x(f)$ of a function $f \in C^0(X, Y)$ at a point $x \in X$ as follows
\[
\operatorname{dil}_x(f) := \lim_{\varepsilon \to 0} \operatorname{dil}\left( f_{|B(x,\varepsilon)} \right) = \lim_{\varepsilon \to 0} \left( \sup_{p, q \in B(x,\varepsilon),\ p \neq q} \frac{d_Y(f(p), f(q))}{d_X(p, q)} \right).
\]
(5.34)

We specify the distance function on $X$ or $Y$ if it is not the expected "canonical one", for instance if $X$ or $Y$ is a subset of $\mathbb{R}^k$ and if the associated distance is not the standard euclidean distance but the riemannian distance $d_H$ associated to a metric, or if $X$ or $Y$ is the collection $S_d^+$ of symmetric positive definite matrices, since neither of the two distances $d_+$ or $d_\times$ is canonical. The local dilatation is in that case denoted as follows
\[
\operatorname{dil}_x(f; d_X, d_Y), \quad \operatorname{dil}_x(f; d_X) \quad \text{or} \quad \operatorname{dil}_x(f; d_Y).
\]
The local dilatation $\operatorname{dil}_x(f)$ only depends on the local properties of the metric spaces close to $x$ and $f(x)$. Consider the metric space $(\mathbb{R}^d, d_H)$, where $H \in \mathbf{H}$ is a fixed riemannian metric, and an arbitrary metric space $(Y, d_Y)$. For any $f \in C^0(\mathbb{R}^d, Y)$ and any $z \in \mathbb{R}^d$ we obtain using (5.16)
\[
\| H(z) \|^{-1/2} \operatorname{dil}_z(f) \leq \operatorname{dil}_z(f; d_H) = \operatorname{dil}_z(f; \| \cdot \|_{H(z)}) \leq \| H(z)^{-1} \|^{1/2} \operatorname{dil}_z(f). \tag{5.35}
\]
We also introduce the lower local dilatation $\operatorname{dil}^*_x(f) \leq \operatorname{dil}_x(f)$, which is defined as follows
\[
\operatorname{dil}^*_x(f) := \lim_{\varepsilon \to 0} \left( \inf_{p, q \in B(x,\varepsilon),\ p \neq q} \frac{d_Y(f(p), f(q))}{d_X(p, q)} \right).
\]
The next lemma establishes that local dilatations are sub-multiplicative.

Lemma 5.2.4. Let $(X, d_X)$, $(Y, d_Y)$ and $(Z, d_Z)$ be three metric spaces, and let $f \in C^0(X, Y)$ and $g \in C^0(Y, Z)$. For any $x \in X$ one has
\[
\operatorname{dil}_x(g \circ f) \leq \operatorname{dil}_{f(x)}(g) \operatorname{dil}_x(f), \tag{5.36}
\]
provided no indeterminate product $0 \times \infty$ or $\infty \times 0$ appears in the right hand side. Similarly we have
\[
\operatorname{dil}_x(g \circ f) \geq \operatorname{dil}^*_{f(x)}(g) \operatorname{dil}_x(f), \tag{5.37}
\]
provided no indeterminate product $0 \times \infty$ or $\infty \times 0$ appears in the right hand side.

Proof: We first establish (5.36). If $\operatorname{dil}_x(f) = \infty$ or $\operatorname{dil}_{f(x)}(g) = \infty$, then there is nothing to prove. We may therefore assume that $\operatorname{dil}_x(f) < \infty$ and $\operatorname{dil}_{f(x)}(g) < \infty$.
For any $\varepsilon > 0$ we define
\[
F(\varepsilon) := \sup_{p, q \in B(x,\varepsilon)} \frac{d_Y(f(p), f(q))}{d_X(p, q)} \quad \text{and} \quad G(\varepsilon) := \sup_{p, q \in B(f(x),\varepsilon)} \frac{d_Z(g(p), g(q))}{d_Y(p, q)},
\]
we thus have
\[
\sup_{p, q \in B(x,\varepsilon)} \frac{d_Z(g(f(p)), g(f(q)))}{d_X(p, q)} \leq G(F(\varepsilon)\varepsilon) F(\varepsilon).
\]
Letting $\varepsilon \to 0$ we obtain (5.36). We now turn to (5.37) and consider two sequences $(p_n)_{n \geq 0}$, $(q_n)_{n \geq 0}$ which both tend to $x$ and such that $d_Y(f(p_n), f(q_n))/d_X(p_n, q_n) \to \operatorname{dil}_x(f)$ as $n \to \infty$. We thus have
\[
\frac{d_Z(g(f(p_n)), g(f(q_n)))}{d_X(p_n, q_n)} = \frac{d_Z(g(f(p_n)), g(f(q_n)))}{d_Y(f(p_n), f(q_n))} \times \frac{d_Y(f(p_n), f(q_n))}{d_X(p_n, q_n)},
\]
and taking the limit as $n \to \infty$ we obtain (5.37), which concludes the proof of this lemma. $\diamond$

Let $f : \Omega \to V$ be a $C^1$ function, where $\Omega \subset \mathbb{R}^d$ is an open set and $V$ is a Banach space. Then for any $z \in \Omega$ the local dilatation $\operatorname{dil}_z(f)$ and the lower dilatation $\operatorname{dil}^*_z(f)$ have an explicit expression in terms of the differential $d_z f : \mathbb{R}^d \to V$ of $f$ at $z$
\[
\operatorname{dil}_z(f) = \sup_{|u| = 1} \| d_z f(u) \| \quad \text{and} \quad \operatorname{dil}^*_z(f) = \inf_{|u| = 1} \| d_z f(u) \|. \tag{5.38}
\]
Applying Lemma 5.2.4 and (5.38) to the function $g = \ln$ we obtain for any metric space $(X, d_X)$, any $f \in C^0(X, \mathbb{R}_+^*)$ and any $x \in X$
\[
\operatorname{dil}_x(\ln f) = \frac{\operatorname{dil}_x(f)}{f(x)}. \tag{5.39}
\]
The following proposition shows that, under certain circumstances, the Lipschitz property is a local property.

Proposition 5.2.5. Consider the metric space $(\mathbb{R}^d, d_H)$, where $H \in \mathbf{H}$ is a fixed riemannian metric, and an arbitrary metric space $(Y, d_Y)$. For any $f \in C^0(\mathbb{R}^d, Y)$ and any path $\gamma \in C^1([0,1], \mathbb{R}^d)$, $\gamma(0) = z$, $\gamma(1) = z'$, one has
\[
d_Y(f(z), f(z')) \leq l_H(\gamma) \sup_{0 \leq t \leq 1} \operatorname{dil}_{\gamma(t)}(f; d_H). \tag{5.40}
\]
It follows that
\[
\operatorname{dil}(f; d_H) = \sup_{z \in \mathbb{R}^d} \operatorname{dil}_z(f; d_H).
\]
(5.41)

Proof: We equip the segment $[0,1]$ with the distance $d_\gamma$ defined for $0 \leq t \leq t' \leq 1$ by
\[
d_\gamma(t, t') = d_\gamma(t', t) = l_H(\gamma_{|[t,t']}). \tag{5.42}
\]
Note that for any $t \leq t_* \leq t'$ one has
\[
d_\gamma(t, t') = d_\gamma(t, t_*) + d_\gamma(t_*, t'). \tag{5.43}
\]
The map $\gamma : ([0,1], d_\gamma) \to (\mathbb{R}^d, d_H)$ is clearly Lipschitz. It thus follows from Lemma 5.2.4 that $F := f \circ \gamma : [0,1] \to Y$ satisfies for all $t \in [0,1]$
\[
\operatorname{dil}_t(F) \leq \operatorname{dil}_{\gamma(t)}(f; d_H) \leq \lambda := \sup_{0 \leq t' \leq 1} \operatorname{dil}_{\gamma(t')}(f; d_H).
\]
We assume that $\lambda < \infty$, otherwise there is nothing to prove, and we consider a fixed $\delta > 0$. For each $t \in [0,1]$ there exists therefore an interval $V_t \subset [0,1]$ containing $t$, relatively open in $[0,1]$, and such that $F_{|V_t}$ is $(\lambda + \delta)$-Lipschitz. Since the segment $[0,1]$ is compact, there exists a finite set $I \subset [0,1]$ such that
\[
[0,1] = \bigcup_{t \in I} V_t. \tag{5.44}
\]
Let $V, V' \subset [0,1]$ be two intersecting intervals, and assume that $F$ is $(\lambda + \delta)$-Lipschitz on $V$ and on $V'$. For any $t \in V$ and $t' \in V'$, there exists $t_* \in V \cap V'$ which satisfies $t \leq t_* \leq t'$ or $t \geq t_* \geq t'$. Recalling (5.43) we thus obtain
\[
\begin{aligned}
d_Y(F(t), F(t')) &\leq d_Y(F(t), F(t_*)) + d_Y(F(t_*), F(t')) \\
&\leq (\lambda + \delta) d_\gamma(t, t_*) + (\lambda + \delta) d_\gamma(t_*, t') \\
&= (\lambda + \delta) d_\gamma(t, t').
\end{aligned}
\]
Thus $F$ is $(\lambda + \delta)$-Lipschitz on the interval $V \cup V'$. Since $[0,1]$ is covered by the finite collection of open intervals (5.44) on which $F$ is $(\lambda + \delta)$-Lipschitz, we obtain, proceeding by induction, that $F$ is $(\lambda + \delta)$-Lipschitz on $[0,1]$ and therefore
\[
d_Y(f(z), f(z')) = d_Y(F(0), F(1)) \leq (\lambda + \delta) d_\gamma(0, 1) = (\lambda + \delta) l_H(\gamma).
\]
Letting $\delta \to 0$ we obtain (5.40). From now on $\lambda$ denotes the supremum of $\operatorname{dil}_z(f; d_H)$ over all $z \in \mathbb{R}^d$. Taking the infimum in (5.40) among all paths $\gamma \in C^1([0,1], \mathbb{R}^d)$ joining the points $z$ and $z'$ we obtain $d_Y(f(z), f(z')) \leq \lambda d_H(z, z')$. Therefore $\operatorname{dil}(f; d_H) \leq \lambda$.
Conversely we have $\operatorname{dil}_z(f; d_H) \leq \operatorname{dil}(f; d_H)$ for any $z \in \mathbb{R}^d$, which establishes (5.41) and concludes the proof of this proposition. $\diamond$

The next corollary shows that the global dilatation of a function is controlled by the local dilatation on $\mathbb{R}^d \setminus \Gamma$, if $\Gamma$ is a sufficiently thin set, for instance the skeleton $\cup_{T \in \mathcal{T}} \partial T$ of a mesh $\mathcal{T} \in \mathbf{T}$.

Corollary 5.2.6. Let $\Gamma \subset \mathbb{R}^d$ be a set which has the following property: for any path $\gamma \in C^1([0,1], \mathbb{R}^d)$ there exists a sequence of paths $(\gamma_n)_{n \geq 0}$ which converges to $\gamma$ in $C^1$ norm and such that $\{ t \in [0,1] \,;\, \gamma_n(t) \in \Gamma \}$ is finite for each $n$. Let $H \in \mathbf{H}$, let $(Y, d_Y)$ be an arbitrary metric space, and let $f \in C^0(\mathbb{R}^d, Y)$. Then
\[
\operatorname{dil}(f; d_H) = \sup_{z \in \mathbb{R}^d \setminus \Gamma} \operatorname{dil}_z(f; d_H).
\]
Proof: We denote $\lambda := \sup_{z \in \mathbb{R}^d \setminus \Gamma} \operatorname{dil}_z(f; d_H)$. We consider two points $z, z' \in \mathbb{R}^d$ and a path $\gamma \in C^1([0,1], \mathbb{R}^d)$ such that $\gamma(0) = z$ and $\gamma(1) = z'$.

We first assume that the set $\{ t \in [0,1] \,;\, \gamma(t) \in \Gamma \}$ is finite, and we denote its elements by $t_1 < \cdots < t_k$. We set by convention $t_0 = 0$ and $t_{k+1} = 1$. For each $0 \leq i \leq k$ and each $t, t'$ such that $t_i < t < t' < t_{i+1}$ we have $d_Y(f(\gamma(t)), f(\gamma(t'))) \leq \lambda\, l_H(\gamma_{|[t,t']})$ according to Proposition 5.2.5. Letting $t \to t_i$ and $t' \to t_{i+1}$ we obtain $d_Y(f(\gamma(t_i)), f(\gamma(t_{i+1}))) \leq \lambda\, l_H(\gamma_{|[t_i,t_{i+1}]})$. Adding up we obtain
\[
d_Y(f(z), f(z')) \leq \sum_{0 \leq i \leq k} d_Y(f(\gamma(t_i)), f(\gamma(t_{i+1}))) \leq \lambda \sum_{0 \leq i \leq k} l_H(\gamma_{|[t_i,t_{i+1}]}) = \lambda\, l_H(\gamma).
\]
We now assume that the path $\gamma$ meets the set $\Gamma$ an infinite number of times, and we choose a sequence $(\gamma_n)_{n \geq 0}$ as described in the statement of the corollary. We obtain $d_Y(f(\gamma_n(0)), f(\gamma_n(1))) \leq \lambda\, l_H(\gamma_n)$ for each $n \geq 0$.
Taking the limit as $n \to \infty$ we obtain $d_Y(f(z), f(z')) \leq \lambda\, l_H(\gamma)$ since $\gamma_n$ converges to $\gamma$ in $C^1$ norm as $n \to \infty$.

Taking the infimum among all paths $\gamma \in C^1([0,1], \mathbb{R}^d)$ joining the points $z$ and $z'$, we obtain $d_Y(f(z), f(z')) \leq \lambda\, d_H(z, z')$, which concludes the proof of this corollary. $\diamond$

The next proposition establishes that the regularity conditions (5.25) and (5.26) defining the collections of quasi-acute or graded metrics are equivalent in the case of a metric proportional to the identity.

Proposition 5.2.7. Let $s \in C^0(\mathbb{R}^d, \mathbb{R}_+^*)$ and let $H := \operatorname{Id}/s^2$. We have for all $z \in \mathbb{R}^d$
\[
\operatorname{dil}_z(\ln s; d_H) = \operatorname{dil}_z(s). \tag{5.45}
\]
Furthermore the following properties are equivalent.
1. The map $s : \mathbb{R}^d \to \mathbb{R}_+^*$ is Lipschitz (i.e. $H \in \mathbf{H}_i$).
2. ("Additive" Lipschitz property) For all $z, z' \in \mathbb{R}^d$ one has $d_+(H(z), H(z')) \leq |z - z'|$.
3. ("Multiplicative" Lipschitz property) For all $z, z' \in \mathbb{R}^d$ one has $d_\times(H(z), H(z')) \leq d_H(z, z')$.

Proof: Combining (5.35) and (5.39) we obtain for any $z \in \mathbb{R}^d$
\[
\operatorname{dil}_z(\ln s; d_H) = s(z) \operatorname{dil}_z(\ln s) = \operatorname{dil}_z(s),
\]
which establishes (5.45). According to (5.24) we have for any $z, z' \in \mathbb{R}^d$
\[
d_+(H(z), H(z')) = |s(z) - s(z')| \quad \text{and} \quad d_\times(H(z), H(z')) = |\ln s(z) - \ln s(z')|.
\]
The properties 1. and 2. are thus equivalent to $\operatorname{dil}(s) \leq 1$, and therefore also to:
\[
\operatorname{dil}_z(s) \leq 1 \text{ for all } z \in \mathbb{R}^d, \tag{5.46}
\]
according to Proposition 5.2.5 applied to the constant metric $H = \operatorname{Id}$ and to the function $f = s$. On the other hand property 3. is equivalent to $\operatorname{dil}(\ln s; d_H) \leq 1$, and thus to:
\[
\operatorname{dil}_z(\ln s; d_H) \leq 1 \text{ for all } z \in \mathbb{R}^d,
\]
according to Proposition 5.2.5 applied to the metric $H$ and to the function $f = \ln s$. This property is equivalent to (5.46) according to (5.45), which concludes the proof of this proposition.
$\diamond$

The next corollary uses Proposition 5.2.7 to analyse the regularity of the first and last eigenvalue of a metric.

Corollary 5.2.8.
1. For any $H \in \mathbf{H}_g$, the map $z \mapsto \| H(z) \|^{-1/2}$ is Lipschitz.
2. For any $H \in \mathbf{H}_a$, the map $z \mapsto \| H(z)^{-1/2} \|$ is Lipschitz.

Proof: We first establish Point 1. We consider $H \in \mathbf{H}_g$, and we define
\[
s(z) := \| H(z) \|^{-1/2} \quad \text{and} \quad H'(z) := \frac{\operatorname{Id}}{s(z)^2} = \| H(z) \| \operatorname{Id},
\]
for each $z \in \mathbb{R}^d$. Since $H(z) \leq H'(z)$ for all $z \in \mathbb{R}^d$, we have $d_H(z, z') \leq d_{H'}(z, z')$ for all $z, z' \in \mathbb{R}^d$. It follows that, using (5.22),
\[
| \ln s(z) - \ln s(z') | = \frac{1}{2} \left| \ln \| H(z) \| - \ln \| H(z') \| \right| \leq d_\times(H(z), H(z')) \leq d_H(z, z') \leq d_{H'}(z, z').
\]
Applying Proposition 5.2.7 to the metric $H'$ we obtain that $H' \in \mathbf{H}_i$, and therefore that $s$ is Lipschitz as announced.

The proof of Point 2. is more straightforward, since for any metric $H \in \mathbf{H}_a$ we have
\[
\left| \| H(z)^{-1/2} \| - \| H(z')^{-1/2} \| \right| \leq \| H(z)^{-1/2} - H(z')^{-1/2} \| \leq |z - z'|
\]
for all $z, z' \in \mathbb{R}^d$, which concludes the proof of this corollary. $\diamond$

Our last proposition studies the local dilatation of the maximum and the minimum of two functions.

Proposition 5.2.9. Let $(X, d_X)$ be a metric space and let $\lambda, \mu \in C^0(X, \mathbb{R})$. Let $\gamma := \min\{\lambda, \mu\}$ and $\Gamma := \max\{\lambda, \mu\}$. Then for all $x \in X$
\[
\max\{ \operatorname{dil}_x(\gamma), \operatorname{dil}_x(\Gamma) \} \leq \max\{ \operatorname{dil}_x(\lambda), \operatorname{dil}_x(\mu) \}. \tag{5.47}
\]
If $(X, d_X) = (\mathbb{R}^d, d_H)$, where $H \in \mathbf{H}$ is a riemannian metric, then the above inequality is an equality
\[
\max\{ \operatorname{dil}_x(\gamma; d_H), \operatorname{dil}_x(\Gamma; d_H) \} = \max\{ \operatorname{dil}_x(\lambda; d_H), \operatorname{dil}_x(\mu; d_H) \}.
\]
(5.48)

Proof: We have for any $p, q \in X$,
\[
| \gamma(p) - \gamma(q) | = | \min\{\lambda(p), \mu(p)\} - \min\{\lambda(q), \mu(q)\} | \leq \max\{ |\lambda(p) - \lambda(q)|, |\mu(p) - \mu(q)| \}.
\]
For any $x \in X$ and any $\varepsilon > 0$ we thus have
\[
\begin{aligned}
\operatorname{dil}(\gamma_{|B(x,\varepsilon)}) &= \sup_{p, q \in B(x,\varepsilon)} \frac{|\gamma(p) - \gamma(q)|}{d_X(p, q)} \\
&\leq \sup_{p, q \in B(x,\varepsilon)} \frac{\max\{ |\lambda(p) - \lambda(q)|, |\mu(p) - \mu(q)| \}}{d_X(p, q)} \\
&= \max\left\{ \sup_{p, q \in B(x,\varepsilon)} \frac{|\lambda(p) - \lambda(q)|}{d_X(p, q)},\ \sup_{p, q \in B(x,\varepsilon)} \frac{|\mu(p) - \mu(q)|}{d_X(p, q)} \right\} \\
&= \max\left\{ \operatorname{dil}(\lambda_{|B(x,\varepsilon)}), \operatorname{dil}(\mu_{|B(x,\varepsilon)}) \right\}.
\end{aligned}
\]
Letting $\varepsilon \to 0$ we obtain $\operatorname{dil}_x(\gamma) \leq \max\{ \operatorname{dil}_x(\lambda), \operatorname{dil}_x(\mu) \}$. Proceeding likewise for $\Gamma$ we conclude the proof of (5.47).

We now turn to the proof of (5.48). For that purpose we consider a fixed $\varepsilon > 0$ and we define
\[
K := \max\{ \operatorname{dil}(\gamma_{|B(x,\varepsilon)}; \| \cdot \|_{H(x)}),\ \operatorname{dil}(\Gamma_{|B(x,\varepsilon)}; \| \cdot \|_{H(x)}) \},
\]
where $B(x, \varepsilon)$ stands for the euclidean ball of radius $\varepsilon$ centered at $x$. Consider two points $p, q \in B(x, \varepsilon)$. If $\lambda(p) \geq \mu(p)$ and $\lambda(q) \geq \mu(q)$, or if $\lambda(p) \leq \mu(p)$ and $\lambda(q) \leq \mu(q)$, then $\lambda$ coincides at both points with $\Gamma$, or at both points with $\gamma$, hence
\[
| \lambda(p) - \lambda(q) | \leq \max\{ |\gamma(p) - \gamma(q)|, |\Gamma(p) - \Gamma(q)| \} \leq K \| p - q \|_{H(x)}.
\]
Otherwise, if $\lambda(p) \geq \mu(p)$ and $\lambda(q) \leq \mu(q)$, or $\lambda(p) \leq \mu(p)$ and $\lambda(q) \geq \mu(q)$, then there exists a point $r$ on the segment $[p, q]$ such that $\lambda(r) = \mu(r)$. We thus have
\[
\begin{aligned}
| \lambda(p) - \lambda(q) | &\leq \max\{ |\gamma(p) - \gamma(r)|, |\Gamma(p) - \Gamma(r)| \} + \max\{ |\gamma(r) - \gamma(q)|, |\Gamma(r) - \Gamma(q)| \} \\
&\leq K \| p - r \|_{H(x)} + K \| r - q \|_{H(x)} = K \| p - q \|_{H(x)}.
\end{aligned}
\]
It follows that $\operatorname{dil}(\lambda_{|B(x,\varepsilon)}; \| \cdot \|_{H(x)}) \leq K$, and letting $\varepsilon \to 0$ we obtain
\[
\operatorname{dil}_x(\lambda; \| \cdot \|_{H(x)}) \leq \max\{ \operatorname{dil}_x(\gamma; \| \cdot \|_{H(x)}),\ \operatorname{dil}_x(\Gamma; \| \cdot \|_{H(x)}) \}.
\]
Recalling that $\operatorname{dil}_x(f; d_H) = \operatorname{dil}_x(f; \| \cdot \|_{H(x)})$ for any $f \in C^0(\mathbb{R}^d, \mathbb{R})$ and any $x \in \mathbb{R}^d$, see (5.35), we obtain
\[
\operatorname{dil}_x(\lambda; d_H) \leq \max\{ \operatorname{dil}_x(\gamma; d_H), \operatorname{dil}_x(\Gamma; d_H) \}.
\]
Proceeding likewise for $\mu$ we obtain (5.48), which concludes the proof of this proposition. $\diamond$

We focus in this subsection on the "geometrical properties" of the space $\mathbb{R}^d$ equipped with the distance $d_H$ and the measure $\sqrt{\det H(z)}\, dz$ associated to a graded metric $H \in \mathbf{H}_g$. The next proposition compares the riemannian distance $d_H$ between two points $z$ and $z + u$ with the norm $\| u \|_{H(z)}$ of their difference.

Proposition 5.2.10. Let $H \in \mathbf{H}_g$ and let $z \in \mathbb{R}^d$. For all $u \in \mathbb{R}^d$ one has
\[
\ln(1 + \| u \|_{H(z)}) \leq d_H(z, z + u) \leq -\ln(1 - \| u \|_{H(z)}), \tag{5.49}
\]
where the right hand side equals $\infty$ by convention when $\| u \|_{H(z)} \geq 1$.

Proof: We consider a path $\gamma \in C^1([0,1], \mathbb{R}^d)$ joining $z = \gamma(0)$ to $z + u = \gamma(1)$, and we define for each $t \in [0,1]$
\[
l(t) := \int_0^t \| \gamma'(\tau) \|_{H(\gamma(\tau))} \, d\tau. \tag{5.50}
\]
Since $H \in \mathbf{H}_g$ we have for any $t \in [0,1]$
\[
\exp(-2 d_H(z, \gamma(t)))\, H(z) \leq H(\gamma(t)) \leq \exp(2 d_H(z, \gamma(t)))\, H(z).
\]
Therefore, since $d_H(z, \gamma(t)) \leq l(t)$,
\[
\exp(-2 l(t))\, H(z) \leq H(\gamma(t)) \leq \exp(2 l(t))\, H(z).
\]
It follows that
\[
\| \gamma'(t) \|_{H(z)} \exp(-l(t)) \leq l'(t) = \| \gamma'(t) \|_{H(\gamma(t))} \leq \| \gamma'(t) \|_{H(z)} \exp(l(t)),
\]
hence
\[
-\frac{d}{dt} \exp(-l(t)) \leq \| \gamma'(t) \|_{H(z)} \leq \frac{d}{dt} \exp(l(t)). \tag{5.51}
\]
We have
\[
\| u \|_{H(z)} = \| \gamma(1) - \gamma(0) \|_{H(z)} \leq \int_0^1 \| \gamma'(t) \|_{H(z)} \, dt \tag{5.52}
\]
with equality if $\gamma(t) = z + tu$ for all $t \in [0,1]$, in other words if $\gamma$ is a straight line.
For that specific path $\gamma$ we obtain
\[
\| u \|_{H(z)} = \int_0^1 \| \gamma'(t) \|_{H(z)} \, dt \geq -\int_0^1 \left( \frac{d}{dt} \exp(-l(t)) \right) dt = 1 - \exp(-l_H(\gamma)).
\]
Rearranging the terms we obtain the right part of (5.49):
\[
d_H(z, z + u) \leq l_H(\gamma) \leq -\ln(1 - \| u \|_{H(z)}).
\]
We now consider an arbitrary path $\gamma$ joining $z$ and $z + u$ and we obtain, integrating the right part of (5.51) and recalling (5.52),
\[
\| u \|_{H(z)} \leq \int_0^1 \| \gamma'(t) \|_{H(z)} \, dt \leq \int_0^1 \left( \frac{d}{dt} \exp(l(t)) \right) dt = \exp(l_H(\gamma)) - 1,
\]
which is equivalent to $l_H(\gamma) \geq \ln(1 + \| u \|_{H(z)})$. Taking the infimum among all paths $\gamma \in C^1([0,1], \mathbb{R}^d)$ satisfying $\gamma(0) = z$ and $\gamma(1) = z + u$, we obtain the left part of (5.49), which concludes the proof of this proposition. $\diamond$

We discuss in the rest of this section the consequences of the comparison (5.49) of the riemannian distance $d_H$ with the distance associated to the norm $\| \cdot \|_{H(z)}$ when $H \in \mathbf{H}_g$. Combining the right part of (5.49) with the expression (5.20) of the distance $d_\times$ we obtain an estimate of the local variations of the norm $\| \cdot \|_{H(z)}$ associated to the metric $H$ at a point $z$: let $H \in \mathbf{H}_g$, and let $z, u, v \in \mathbb{R}^d$ be such that $\| u \|_{H(z)} < 1$, then
\[
(1 - \| u \|_{H(z)}) \| v \|_{H(z)} \leq \| v \|_{H(z+u)} \leq (1 - \| u \|_{H(z)})^{-1} \| v \|_{H(z)}. \tag{5.53}
\]
Note also that
\[
(1 - \| u \|_{H(z)})^d \sqrt{\det H(z)} \leq \sqrt{\det H(z + u)} \leq (1 - \| u \|_{H(z)})^{-d} \sqrt{\det H(z)}. \tag{5.54}
\]
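The estimate (5.53) can be tested on a simple graded metric. The sketch below uses the isotropic metric $H(z) = \operatorname{Id}/(1+|z|)^2$ on $\mathbb{R}^2$, which is graded because $z \mapsto 1 + |z|$ is 1-Lipschitz, and samples random triples $(z, u, v)$ with $\|u\|_{H(z)} < 1$. The sampling ranges, the seed and the rounding tolerance are arbitrary choices made for this illustration.

```python
import math
import random

def s(z):
    """Size function of the isotropic metric H(z) = Id / s(z)^2."""
    return 1.0 + math.hypot(z[0], z[1])

def norm_H(z, v):
    """Norm ||v||_{H(z)} = |v| / s(z) for this isotropic metric."""
    return math.hypot(v[0], v[1]) / s(z)

random.seed(1)
TOL = 1.0 + 1e-12   # relative slack for floating point rounding
all_ok = True
for _ in range(10000):
    z = (random.uniform(-5.0, 5.0), random.uniform(-5.0, 5.0))
    v = (random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0))
    # Draw u with ||u||_{H(z)} < 1, i.e. |u| < s(z).
    radius = random.uniform(0.0, 0.99) * s(z)
    angle = random.uniform(0.0, 2.0 * math.pi)
    u = (radius * math.cos(angle), radius * math.sin(angle))
    nu = norm_H(z, u)
    z_plus_u = (z[0] + u[0], z[1] + u[1])
    lower = (1.0 - nu) * norm_H(z, v)     # left hand side of (5.53)
    middle = norm_H(z_plus_u, v)          # ||v||_{H(z+u)}
    upper = norm_H(z, v) / (1.0 - nu)     # right hand side of (5.53)
    all_ok = all_ok and lower <= middle * TOL and middle <= upper * TOL
print(all_ok)
```

For this particular metric both inequalities reduce to $s(z) - |u| \leq s(z+u) \leq s(z) + |u|$, so the check is expected to succeed on every sample.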
It also follows from (5.49) that, for any $z \in \mathbb{R}^d$ and any $r \geq 0$, the ball $\{ z' \in \mathbb{R}^d \,;\, d_H(z, z') \leq r \}$ for the distance $d_H$ is a closed and bounded subset of $\mathbb{R}^d$, hence is compact.

The following example establishes that (5.49) is a sharp inequality. The metric $H$ defined by
\[
H(z) := \frac{\operatorname{Id}}{(1 + |z|)^2}
\]
belongs to $\mathbf{H}_i$ since $z \mapsto |z|$ is Lipschitz, hence $H$ also belongs to $\mathbf{H}_a$ and $\mathbf{H}_g$. For any $v \in \mathbb{R}^d$, one easily checks that the path of minimal length joining $0$ to $v$ is the straight line, and that
\[
d_H(0, v) = \ln(1 + |v|).
\]
Choosing $z = 0$ and $u = v$ we obtain that the left part of (5.49) is sharp
\[
\ln(1 + \| u \|_{H(0)}) = \ln(1 + |u|) = d_H(0, u).
\]
Choosing $z = -v$ and $u = v$ we obtain that the right part of (5.49) is sharp
\[
-\ln\left( 1 - \| u \|_{H(-u)} \right) = -\ln\left( 1 - \frac{|u|}{1 + |u|} \right) = \ln(1 + |u|) = d_H(-u, -u + u).
\]
The next corollary uses Proposition 5.2.10 to obtain a lower bound for the mass of balls in the measure $z \mapsto \sqrt{\det H(z)}\, dz$ associated to a graded metric $H \in \mathbf{H}_g$. For any metric $H \in \mathbf{H}$, any $z \in \mathbb{R}^d$ and any $r > 0$ we define
\[
B_H(z, r) := \{ z + u \,;\, \| u \|_{H(z)} < r \},
\]
and we observe that $|B_H(z, r)| = \omega r^d (\det H(z))^{-1/2}$, where $\omega$ denotes the volume of the standard euclidean ball of radius one.

Corollary 5.2.11. There exists $c = c(d) > 0$ such that the following holds. For any $H \in \mathbf{H}_g$, any $r \geq 1$ and any $z \in \mathbb{R}^d$ we have
\[
\int_{B_H(z, r)} \sqrt{\det H} \geq c \ln r. \tag{5.55}
\]
Proof: It follows from Proposition 5.2.10 that $d_H(z, z') \geq \ln(r + 1)$ for all $z' \in \partial B_H(z, r)$. We define the integer
\[
k = \left\lfloor \ln(r + 1) - \frac{1}{2} \right\rfloor,
\]
and we consider points $z_0, \cdots, z_k \in E$ such that $d_H(z, z_i) = i$ for all $0 \leq i \leq k$. We define $r_0 := 1 - e^{-1/2} = 0.39\cdots$ in such way that $-\ln(1 - r_0) = 1/2$.
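We have according to (5.54) for all $z' \in B_H(z_i, r_0)$
\[
\sqrt{\det H(z')} \geq (1 - \| z' - z_i \|_{H(z_i)})^d \sqrt{\det H(z_i)} \geq (1 - r_0)^d \sqrt{\det H(z_i)}. \tag{5.56}
\]
For any $z' \in B_H(z_i, r_0)$ we have
\[
d_H(z, z') \leq d_H(z, z_i) + d_H(z_i, z') \leq i - \ln(1 - r_0) \leq k + \frac{1}{2} \leq \ln(r + 1),
\]
which implies that $B_H(z_i, r_0) \subset B_H(z, r)$ for all $0 \leq i \leq k$. Furthermore for any $0 \leq i < j \leq k$ the balls $B_H(z_i, r_0)$ and $B_H(z_j, r_0)$ are disjoint: a common point $z'$ would satisfy $d_H(z_i, z_j) \leq d_H(z_i, z') + d_H(z', z_j) < 1$, whereas $d_H(z_i, z_j) \geq d_H(z, z_j) - d_H(z, z_i) = j - i \geq 1$. Integrating (5.56) over each ball and recalling that $|B_H(z_i, r_0)| = \omega r_0^d (\det H(z_i))^{-1/2}$, we obtain with $c_0 := \omega r_0^d (1 - r_0)^d$
\[
\int_{B_H(z, r)} \sqrt{\det H} \geq \sum_{0 \leq i \leq k} \int_{B_H(z_i, r_0)} \sqrt{\det H} \geq (k + 1)\, c_0.
\]
Since $r \geq 1$ we have $\ln(r + 1) \geq \ln 2$ and therefore
\[
k + 1 \geq \ln(r + 1) - \frac{1}{2} \geq \left( 1 - \frac{1}{2 \ln 2} \right) \ln(r + 1) \geq \left( 1 - \frac{1}{2 \ln 2} \right) \ln r,
\]
which concludes the proof with $c = (1 - 1/(2 \ln 2))\, c_0$. $\diamond$

5.3 Metrics having an eigenspace of dimension $d - 1$ at each point

We focus in this section on metrics $H \in \mathbf{H}$ such that the symmetric matrix $H(z)$ has an eigenspace of dimension at least $d - 1$ for each $z \in \mathbb{R}^d$. Note that this condition clearly holds if the dimension is $d = 2$, but is also relevant in some applications to higher dimension as illustrated in §. We characterise the metrics of this type which belong to $\mathbf{H}_a$ or $\mathbf{H}_g$ in terms of the regularity of their eigenvalues and of their eigenvectors.

We denote by $\mathbb{S} := \{ \theta \in \mathbb{R}^d \,;\, |\theta| = 1 \}$ the euclidean unit sphere of $\mathbb{R}^d$, equipped with the distance
\[
d_{\mathbb{S}}(\theta, \theta') := \arccos(\langle \theta, \theta' \rangle).
\]
We denote by $\mathcal{A}$ the space of parameters
\[
\mathcal{A} := \mathbb{R}_+^* \times \mathbb{R}_+^* \times \mathbb{S},
\]
and we define a map $\mathcal{S} : \mathcal{A} \to S_d^+$ as follows
\[
\mathcal{S}(\lambda, \mu, \theta) := \lambda \theta \theta^T + \mu (\operatorname{Id} - \theta \theta^T). \tag{5.57}
\]
The matrix $\mathcal{S}(\lambda, \mu, \theta)$, in any orthonormal basis $\mathcal{B}$ of $\mathbb{R}^d$ which begins with the vector $\theta$, has the following form
\[
[\mathcal{S}(\lambda, \mu, \theta)]_{\mathcal{B}} = \begin{pmatrix} \lambda & & & \\ & \mu & & \\ & & \ddots & \\ & & & \mu \end{pmatrix}.
\]
We also define for any $a = (\lambda, \mu, \theta) \in \mathcal{A}$
\[
\mathcal{H}(a) := \mathcal{S}(a)^{-2} = \lambda^{-2} \theta \theta^T + \mu^{-2} (\operatorname{Id} - \theta \theta^T).
\]
For any metric $H \in \mathbf{H}$ and any $z \in \mathbb{R}^d$ we denote by $\operatorname{dil}_z(H)_\times$ and $\operatorname{dil}_z(H)_+$ the local dilatations at $z$ associated to the Lipschitz conditions (5.25) and (5.26) defining $\mathbf{H}_g$ and $\mathbf{H}_a$:
\[
\operatorname{dil}_z(H)_\times := \operatorname{dil}_z(H; d_H, d_\times) \quad \text{and} \quad \operatorname{dil}_z(H)_+ := \operatorname{dil}_z(H; d_+).
\]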
(5.58)

Let $\Omega \subset \mathbb{R}^d$ be an open set, let $a \in C^0(\Omega, \mathcal{A})$, $a(z) = (\lambda(z), \mu(z), \theta(z))$, and let $z \in \Omega$ be such that $\lambda(z) \neq \mu(z)$ or $\operatorname{dil}_z(\theta) < \infty$. We define
\[
D_z(a)_\times := \max\left\{ \operatorname{dil}_z(\ln \lambda; d_H),\ \operatorname{dil}_z(\ln \mu; d_H),\ \left| \frac{\lambda(z)}{\mu(z)} - \frac{\mu(z)}{\lambda(z)} \right| \operatorname{dil}_z(\theta; d_H) \right\}, \tag{5.59}
\]
and
\[
D_z(a)_+ := \max\left\{ \operatorname{dil}_z(\lambda),\ \operatorname{dil}_z(\mu),\ |\lambda(z) - \mu(z)| \operatorname{dil}_z(\theta) \right\}. \tag{5.60}
\]
Note that these quantities are not defined if $\lambda(z) = \mu(z)$ and $\operatorname{dil}_z(\theta; d_H) = \infty$ simultaneously, since an indeterminate product $0 \times \infty$ appears in (5.59) and (5.60).

Theorem 5.3.1. Let $H \in \mathbf{H}$. Assume that there exists an open set $\Omega \subset \mathbb{R}^d$ and a continuous function $a \in C^0(\Omega, \mathcal{A})$, $a(z) = (\lambda(z), \mu(z), \theta(z))$, such that $H = \mathcal{H} \circ a$ on $\Omega$. For each $z \in \Omega$ such that $\lambda(z) \neq \mu(z)$ or $\operatorname{dil}_z(\theta) < \infty$ one has
\[
D_z(a)_\times \leq \operatorname{dil}_z(H)_\times \leq 2 D_z(a)_\times, \tag{5.61}
\]
\[
D_z(a)_+ \leq \operatorname{dil}_z(H)_+ \leq 2 D_z(a)_+. \tag{5.62}
\]
The rest of this section is devoted to the proof of this theorem and to Corollaries 5.3.5 and 5.3.6. Our first intermediate result in the proof of Theorem 5.3.1 defines and estimates a quantity $\Delta(a, b, c)$ which appears repeatedly in the rest of the proof.

Lemma 5.3.2. For each $a, b, c \in \mathbb{R}$ we define
\[
\Delta(a, b, c) := \left\| \begin{pmatrix} a & b & & & \\ b & c & & & \\ & & c & & \\ & & & \ddots & \\ & & & & c \end{pmatrix} \right\|. \tag{5.63}
\]
Then
\[
\max\{ |a|, |b|, |c| \} \leq \Delta(a, b, c) \leq 2 \max\{ |a|, |b|, |c| \}. \tag{5.64}
\]
Proof: We consider fixed values of $a, b, c \in \mathbb{R}$ and we denote by $M$ the matrix appearing in (5.63), in such way that $\Delta(a, b, c) = \| M \|$. We also define $\lambda := \max\{ |a|, |b|, |c| \}$. We have
\[
\Delta(a, b, c) = \| M \| \geq \max_{1 \leq i, j \leq d} | M_{ij} | = \lambda,
\]
which establishes the left part of (5.64).
For all $p \in [1, \infty]$ we denote by $l^p$ the usual norm on $\mathbb{R}^d$ of exponent $p$. Since each row and each column of $M$ contains at most two non-zero entries, we have
\[
\| M \|_{l^\infty \to l^\infty} = \max_{1 \leq i \leq d} \sum_{1 \leq j \leq d} | M_{ij} | \leq 2\lambda \quad \text{and} \quad \| M \|_{l^1 \to l^1} = \max_{1 \leq j \leq d} \sum_{1 \leq i \leq d} | M_{ij} | \leq 2\lambda.
\]
By interpolation we thus obtain
\[
\| M \| = \| M \|_{l^2 \to l^2} \leq \sqrt{ \| M \|_{l^1 \to l^1} \| M \|_{l^\infty \to l^\infty} } \leq 2\lambda,
\]
which concludes the proof. $\diamond$

The parameter space $\mathcal{A} = \mathbb{R}_+^* \times \mathbb{R}_+^* \times \mathbb{S}$ is a differentiable manifold, and the regularity of functions defined on $\mathcal{A}$ should be seen through local charts. For each $a = (\lambda, \mu, \theta) \in \mathcal{A}$ we introduce a local chart $\psi_a$ of a neighborhood of $a$ in $\mathcal{A}$,
\[
\psi_a : (-\lambda, \infty) \times (-\mu, \infty) \times (B(0, \pi) \cap \theta^\perp) \to \mathcal{A}, \tag{5.65}
\]
where $\theta^\perp := \{ \Theta \in \mathbb{R}^d \,;\, \langle \theta, \Theta \rangle = 0 \}$ denotes the space orthogonal to $\theta$, and $B(0, \pi)$ the open euclidean ball of radius $\pi$. We define
\[
\psi_a(\Lambda, M, r\Theta) := (\lambda + \Lambda,\ \mu + M,\ \cos(r)\theta + \sin(r)\Theta),
\]
where $|\Theta| = 1$. Note that $\psi_a$ is one to one and that $d_{\mathbb{S}}(\theta, \cos(r)\theta + \sin(r)\Theta) = r$.

Let $V$ be a Banach space. We say that a function $\varphi : \mathcal{A} \to V$ is $C^1$ if for any $a \in \mathcal{A}$ the function $\varphi \circ \psi_a$ is $C^1$. In that case for any $a \in \mathcal{A}$ we define the differential $d_a \varphi : \mathbb{R} \times \mathbb{R} \times \theta^\perp \to V$ by the formula
\[
d_a \varphi(A) := d_0(\varphi \circ \psi_a)(A) = \lim_{t \to 0} \frac{\varphi \circ \psi_a(tA) - \varphi(a)}{t}.
\]
Proposition 5.3.3. Define $V_+ := S_d$, equipped with the standard norm $\| \cdot \|$, and $\varphi_+ := \mathcal{S}$ on $\mathcal{A}$. Then for all $a, b \in \mathcal{A}$ we have
\[
d_+(\mathcal{H}(a), \mathcal{H}(b)) = \| \varphi_+(a) - \varphi_+(b) \|. \tag{5.66}
\]
Furthermore $\varphi_+ \in C^1(\mathcal{A}, V_+)$ and for all $a = (\lambda, \mu, \theta) \in \mathcal{A}$ and all $A = (\Lambda, M, r\Theta) \in T_a \mathcal{A}$, with $|\Theta| = 1$, we have
\[
\| d_a \varphi_+(A) \| = \Delta(\Lambda, (\lambda - \mu) r, M).
\]
Proof: The identity (5.66) directly follows from the fact that $\mathcal{H}(a)^{-1/2} = \mathcal{S}(a) = \varphi_+(a)$, and from the definition (5.23) of $d_+$.
The function ϕ + = S is C (in fact C ∞ ), since it hasa polynomial expression S ( λ, µ, θ ) := λθθ T + µ (1 − θθ T ) . We consider a fixed point a = ( λ, µ, θ ) ∈ A , and A = (Λ , M, r Θ) ∈ T a A , where | Θ | = 1.We thus have for t ∈ R S ( ψ a ( tA )) = ( λ + t Λ)( θ + tr Θ)( θ + tr Θ) T +( µ + tM )(Id − ( θ + tr Θ)( θ + tr Θ) T ) + O ( t )= S ( λ, µ, θ ) + t (Λ θθ T + M (Id − θθ T ) + r ( λ − µ )( θ Θ T + Θ θ T )) + O ( t )Therefore d a S ( A ) = Λ θθ T + M (Id − θθ T ) + r ( λ − µ )( θ Θ T + Θ θ T ) . (5.67)Choosing an orthonormal basis B of R d which begins with the unit orthogonal vectors θ and Θ, we obtain [ d a S ( A )] B = Λ r ( λ − µ ) 0 · · · r ( λ − µ ) M · · · M , which concludes the proof of this proposition. (cid:5) Proposition 5.3.4. Define V × := C ( S , R ) , equipped with the (cid:107) · (cid:107) ∞ norm, and define ϕ × : A → V × as follows : for any a ∈ A ϕ × ( a ) := ( u (cid:55)→ ln (cid:107) u (cid:107) H ( a ) ) . Then for all a, b ∈ A we have (cid:107) ϕ × ( a ) − ϕ × ( b ) (cid:107) ∞ = d × ( H ( a ) , H ( b )) . (5.68) Furthermore ϕ × ∈ C ( A , V × ) and for all a = ( λ, µ, θ ) ∈ A and all A = (Λ , M, r Θ) ∈ T a A ,with | Θ | = 1 , we have (cid:107) d a ϕ × ( A ) (cid:107) ∞ = ∆ (cid:18) Λ λ , r (cid:18) λµ − µλ (cid:19) , Mµ (cid:19) . (5.69) Proof: The expression (5.68) immediately follows from the expression (5.20) of the dis-tance d × . It is well known that the inverse map Inv : GL d → GL d is C and has the thefollowing differential : for any φ ∈ GL d and any Φ ∈ M d d φ Inv(Φ) = φ − Φ φ − . Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? Since S : A → S + d is C , as observed in the proof of Proposition 5.3.3, the compositionInv ◦ S : A → S + d is also C , and for any a = ( λ, µ, θ ) ∈ A and any A ∈ R × R × θ ⊥ d a (Inv ◦ S )( A ) = S ( a ) − ( d a S ( A )) S ( a ) − . 
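For any $u \in \mathbb S$ we have
\[
\ln \|u\|_{\mathcal H(a)} = \ln \|\mathcal S(a)^{-1} u\| = \frac12 \ln\big(\langle \mathcal S(a)^{-1} u, \mathcal S(a)^{-1} u\rangle\big).
\]
Therefore, again by composition, and writing $\mathcal S^{-1}$ for $\mathcal S(a)^{-1}$,
\[
\begin{aligned}
d_a(\ln \|u\|_{\mathcal H})(A) &= -\frac{1}{2\|\mathcal S^{-1} u\|^2}\Big(\langle \mathcal S^{-1} (d_a\mathcal S(A))\, \mathcal S^{-1} u,\ \mathcal S^{-1} u\rangle + \langle \mathcal S^{-1} u,\ \mathcal S^{-1} (d_a\mathcal S(A))\, \mathcal S^{-1} u\rangle\Big)\\
&= -\frac{\langle G(a,A) v, v\rangle}{\|v\|^2},
\end{aligned}
\]
where $v = \mathcal S(a)^{-1} u$. The sign comes from the differential of the inverse map and plays no role below, since only absolute values are used. The function $G$ has the following expression: for any $a = (\lambda, \mu, \theta) \in \mathbb R_+^* \times \mathbb R_+^* \times \mathbb S$ and any $A = (\Lambda, M, r\Theta) \in \mathbb R \times \mathbb R \times \theta^\perp$, where $|\Theta| = 1$,
\[
\begin{aligned}
G(a, A) &= \frac{\mathcal S(a)^{-1} (d_a\mathcal S(A)) + (d_a\mathcal S(A))\, \mathcal S(a)^{-1}}{2}\\
&= \frac{\Lambda}{\lambda}\, \theta\theta^T + \frac{M}{\mu}\, (\mathrm{Id} - \theta\theta^T) + \frac{r}{2}\left(\frac{\lambda}{\mu} - \frac{\mu}{\lambda}\right)(\theta\Theta^T + \Theta\theta^T),
\end{aligned}
\]
where we used the explicit expression (5.67) of $d_a\mathcal S(A)$, and the fact that $\theta^T\Theta = 0$. Choosing an orthonormal basis $B$ of $\mathbb R^d$ which begins with the unit orthogonal vectors $\theta$ and $\Theta$, we obtain
\[
[G(a, A)]_B =
\begin{pmatrix}
\frac{\Lambda}{\lambda} & \frac{r}{2}\left(\frac{\lambda}{\mu} - \frac{\mu}{\lambda}\right) & 0 & \cdots & 0\\
\frac{r}{2}\left(\frac{\lambda}{\mu} - \frac{\mu}{\lambda}\right) & \frac{M}{\mu} & 0 & \cdots & 0\\
0 & 0 & \frac{M}{\mu} & & \vdots\\
\vdots & \vdots & & \ddots & 0\\
0 & 0 & \cdots & 0 & \frac{M}{\mu}
\end{pmatrix}.
\]
The map $\varphi_\times$ is differentiable, as the composition of differentiable maps, and $d_a\varphi_\times(A)$ is the element of $V_\times := C^0(\mathbb S, \mathbb R)$ defined by
\[
u \mapsto d_a(\ln \|u\|_{\mathcal H})(A).
\]
Therefore
\[
\|d_a\varphi_\times(A)\|_\infty = \sup\left\{\frac{|\langle G(a,A) v, v\rangle|}{\|v\|^2} \,;\, u \in \mathbb S,\ v = \mathcal S(a)^{-1} u\right\} = \|G(a, A)\|,
\]
which establishes (5.69) and concludes the proof of this proposition. $\square$

We now consider a fixed $z \in \Omega$ and we remark that
\[
\operatorname{dil}_z(H)_\times = \operatorname{dil}_z(\mathcal H \circ a)_\times = \operatorname{dil}_z(\varphi_\times \circ a \,;\, d_H).
\]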
For any norm $\|\cdot\|_{a(z)}$ defined on the tangent space $\mathbb R \times \mathbb R \times \theta(z)^\perp$ to $\mathcal A$ at $a(z)$, we have
\[
\begin{aligned}
\operatorname{dil}_*(\varphi_\times \circ \psi_{a(z)} \,;\, \|\cdot\|_{a(z)})\ \operatorname{dil}_z(\psi_{a(z)}^{-1} \circ a \,;\, d_H, \|\cdot\|_{a(z)})
&\le \operatorname{dil}_z(H)_\times\\
&\le \operatorname{dil}(\varphi_\times \circ \psi_{a(z)} \,;\, \|\cdot\|_{a(z)})\ \operatorname{dil}_z(\psi_{a(z)}^{-1} \circ a \,;\, d_H, \|\cdot\|_{a(z)}),
\end{aligned}
\tag{5.70}
\]
provided no indeterminate product $0 \times \infty$ or $\infty \times 0$ appears.

We first consider the case $\lambda(z) \ne \mu(z)$. For each $a' = (\lambda', \mu', \theta') \in \mathcal A$ we define a norm $\|\cdot\|_{a'}$ on the tangent space $\mathbb R \times \mathbb R \times \theta'^\perp$ to $\mathcal A$ at the point $a'$: for all $A = (\Lambda, M, \Theta) \in \mathbb R \times \mathbb R \times \theta'^\perp$
\[
\|A\|_{a'} = \max\left\{\frac{|\Lambda|}{\lambda'},\ \frac{|M|}{\mu'},\ \frac12\left|\frac{\lambda'}{\mu'} - \frac{\mu'}{\lambda'}\right| \|\Theta\|\right\}.
\]
We have by construction
\[
\operatorname{dil}_z(\psi_{a(z)}^{-1} \circ a \,;\, d_H, \|\cdot\|_{a(z)}) = D_z(a)_\times. \tag{5.71}
\]
According to Lemma 5.3.2 and Proposition 5.3.4 we have for any $a' = (\lambda', \mu', \theta') \in \mathcal A$ and any $A \in \mathbb R \times \mathbb R \times \theta'^\perp$
\[
\|A\|_{a'} \le \|d_{a'}\varphi_\times(A)\|_\infty = \|d_0(\varphi_\times \circ \psi_{a'})(A)\|_\infty \le 2\|A\|_{a'}.
\]
Since $\varphi_\times$ is $C^1$ it follows from (5.38) that
\[
1 \le \operatorname{dil}_*(\varphi_\times \circ \psi_{a'} \,;\, \|\cdot\|_{a'}) \le \operatorname{dil}(\varphi_\times \circ \psi_{a'} \,;\, \|\cdot\|_{a'}) \le 2.
\]
Combining (5.70), (5.71) and the last inequality we obtain
\[
D_z(a)_\times \le \operatorname{dil}_z(H)_\times \le 2 D_z(a)_\times,
\]
which establishes the announced result (5.61) in the case $\lambda(z) \ne \mu(z)$. A similar reasoning establishes the counterpart (5.62) of this inequality for the distance $d_+$.

We now consider the case $\lambda(z) = \mu(z)$.
For any $\varepsilon > 0$ and any $a' = (\lambda', \mu', \theta') \in \mathcal A$ such that $\lambda' = \mu'$ we define a norm $\|\cdot\|_{a', \varepsilon}$ on the tangent space $\mathbb R \times \mathbb R \times \theta'^\perp$ to $\mathcal A$ at $a'$ as follows:
\[
\|A\|_{a', \varepsilon} = \max\left\{\frac{|\Lambda|}{\lambda'},\ \frac{|M|}{\mu'},\ \varepsilon \|\Theta\|\right\}.
\]
This modification is required because the original norm $\|\cdot\|_{a'}$ used in the case $\lambda' \ne \mu'$ is only a semi-norm when $\lambda' = \mu'$. Reasoning similarly to the case $\lambda(z) \ne \mu(z)$ we obtain the upper bound
\[
\operatorname{dil}_z(H)_\times \le 2 \max\{\operatorname{dil}_z(\ln \lambda \,;\, d_H),\ \operatorname{dil}_z(\ln \mu \,;\, d_H),\ \varepsilon \operatorname{dil}_z(\theta \,;\, d_H)\}.
\]
Since $\lambda(z) = \mu(z)$, the assumptions of the theorem state that $\operatorname{dil}_z(\theta \,;\, d_H) < \infty$. Since $\varepsilon > 0$ is arbitrary, we obtain
\[
\operatorname{dil}_z(H)_\times \le 2 \max\{\operatorname{dil}_z(\ln \lambda \,;\, d_H),\ \operatorname{dil}_z(\ln \mu \,;\, d_H)\} = 2 D_z(a)_\times.
\]
We now turn to the lower bound on $\operatorname{dil}_z(H)_\times$. For that purpose we define the functions $\gamma := \min\{\lambda, \mu\}$ and $\Gamma := \max\{\lambda, \mu\}$. For all $z, z' \in \Omega$ we have according to (5.22)
\[
d_\times(H(z), H(z')) \ge \max\{|\ln \gamma(z) - \ln \gamma(z')|,\ |\ln \Gamma(z) - \ln \Gamma(z')|\}.
\]
Therefore
\[
\operatorname{dil}_z(H)_\times = \operatorname{dil}_z(H \,;\, d_\times, d_H) \ge \max\{\operatorname{dil}_z(\ln \gamma \,;\, d_H),\ \operatorname{dil}_z(\ln \Gamma \,;\, d_H)\}.
\]
Hence according to Proposition 5.2.9
\[
\operatorname{dil}_z(H)_\times \ge \max\{\operatorname{dil}_z(\ln \lambda \,;\, d_H),\ \operatorname{dil}_z(\ln \mu \,;\, d_H)\} = D_z(a)_\times.
\]
We have thus obtained
\[
D_z(a)_\times \le \operatorname{dil}_z(H)_\times \le 2 D_z(a)_\times.
\]
Proceeding likewise we obtain
\[
D_z(a)_+ \le \operatorname{dil}_z(H)_+ \le 2 D_z(a)_+,
\]
which concludes the proof of Theorem 5.3.1.

Our first corollary compares $D_z(a)_\times$ with the dilatation $\operatorname{dil}_z(\theta)$.

Corollary 5.3.5.
Under the hypotheses of Theorem 5.3.1 we have | λ ( z ) − µ ( z ) | dil z ( θ ) ≤ D z ( a ) × Proof: We have according to (5.35) | λ ( z ) − µ ( z ) | dil z ( θ ) ≤ | λ ( z ) − µ ( z ) | min { λ ( z ) , µ ( z ) } dil z ( θ ; d H )= (cid:18) max { λ ( z ) , µ ( z ) } min { λ ( z ) , µ ( z ) } − (cid:19) dil z ( θ ; d H ) ≤ (cid:12)(cid:12)(cid:12)(cid:12) λ ( z ) µ ( z ) − µ ( z ) λ ( z ) (cid:12)(cid:12)(cid:12)(cid:12) dil z ( θ ; d H ) ≤ D z ( a ) × which concludes the proof. (cid:5) The next corollary shows how to construct a quasi-acute metric from a graded one.Let M ∈ S d and let U ∈ O d and λ , · · · , λ d ∈ R be such that M = U T diag( λ , · · · , λ d ) U, (5.72)where diag( λ , · · · , λ d ) denotes the diagonal matrix of entries λ , · · · , λ d . For any α ∈ R we define the matrix max { α, M } ∈ S d as followsmax { α, M } := U T diag(max { α, λ } , · · · , max { α, λ d } ) U, (5.73)and we observe that max { α, M } does not depend on the choice of the matrix U and on theorder of the eigenvalues λ , · · · , λ d in the decomposition (5.72) of M . For any ( λ, µ, θ ) ∈ A and any s > { s − , H ( a ) } = min { s, λ } − θθ T + min { s, µ } − (Id − θθ T ) . (5.74) .3. Metrics having an eigenspace of dimension d − at each point Corollary 5.3.6. Let H ∈ H be such that H ( z ) has an eigenvalue of multiplicity at least d − for all z ∈ R d . We define s : R d → R ∗ + and H (cid:48) ∈ H as follows : for all z ∈ R d s ( z ) := inf z (cid:48) ∈ IR d | z − z (cid:48) | + (cid:107) H ( z (cid:48) ) − (cid:107) ,H (cid:48) ( z ) := max { s ( z ) − , H ( z ) } . If H ∈ H g , then H (cid:48) ∈ H a . Proof: We first observe that s : R d → R ∗ + is Lipschitz and that s ( z ) ≤ (cid:107) H ( z ) − (cid:107) for all z ∈ R d . For each ε > z ∈ R d we define H ε ( z ) := max { e ε s ( z ) − , H ( z ) } . Consider a fixed point z ∈ R d . 
If H ( z ) is proportional to Id, which means that (cid:107) H ( z ) (cid:107) − = (cid:107) H ( z ) − (cid:107) ≥ s ( z ), then H ε = e ε Id /s on a neighborhood of z . This impliesaccording to (5.45) dil z ( H ε ) × = dil z ( H ε ) + = dil z ( e − ε s ) ≤ e − ε ≤ . On the other hand if H ( z ) is not proportional to Id then there exists a ∈ C ( R d , A ), a ( z ) = ( λ ( z ) , µ ( z ) , θ ( z )) such that H = H ◦ a on a neighborhood of z , and D z ( a ) × ≤ z , H ε = H ◦ a ε = min { e − ε s, λ } − θθ T + min { e − ε s, µ } − (Id − θθ T ) , where a ε = (min { e − ε s, λ } , min { e − ε s, µ } , θ ) ∈ C ( R d , A ).We define H iε := e ε Id /s and we observe that H ε ≥ H iε and H ε ≥ H , hence for any p, q ∈ R d d H ε ( p, q ) ≥ max { d H ( p, q ) , d H iε ( p, q ) } . It follows from Proposition 5.2.5 thatdil z (ln min { e − ε s, λ } ; d H ε ) ≤ max { dil z (ln( e − ε s ); d H ε ) , dil z (ln λ ; d H ε ) }≤ max { dil z (ln( e − ε s ); d H iε ) , dil z (ln λ ; d H ) }≤ max { dil z ( e − ε s ) , dil z ( H ) × }≤ max { e − ε , } = 1 , where we used Proposition 5.2.7 and Theorem 5.3.1 in the third line. Proceeding likewisefor the other eigenvalue of H ε we conclude thatdil z (ln min { e − ε s, λ } ; d H ε ) ≤ z (ln min { e − ε s, µ } ; d H ε ) ≤ . (5.75)One easily checks that (cid:12)(cid:12)(cid:12)(cid:12) min { e − ε s, λ } min { e − ε s, µ } − min { e − ε s, µ } min { e − ε s, λ } (cid:12)(cid:12)(cid:12)(cid:12) ≤ (cid:12)(cid:12)(cid:12)(cid:12) λµ − µλ (cid:12)(cid:12)(cid:12)(cid:12) , Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? using that t (cid:55)→ t/µ − µ/t is increasing on R ∗ + . Therefore12 (cid:12)(cid:12)(cid:12)(cid:12) min { e − ε s, λ } min { e − ε s, µ } − min { e − ε s, µ } min { e − ε s, λ } (cid:12)(cid:12)(cid:12)(cid:12) dil z ( θ ; d H ε ) ≤ (cid:12)(cid:12)(cid:12)(cid:12) λµ − µλ (cid:12)(cid:12)(cid:12)(cid:12) dil z ( θ ; d H ) ≤ . 
Combining the last estimate with (5.75) we conclude that D z ( a ε ) × ≤ z ( H ε ) × ≤ z ( H ε ) + . Let us assume that λ ( z ) < µ ( z ), thendil z (min { e − ε s, λ } ) ≤ max (cid:8) dil z ( e − ε s ) , dil z ( λ ) (cid:9) ≤ max (cid:26) dil z ( e − ε s ) , dil z ( λ ; d H )min { λ ( z ) , µ ( z ) } (cid:27) = max (cid:26) dil z ( e − ε s ) , dil z ( λ ; d H ) λ ( z ) (cid:27) ≤ max { e − ε , dil z (ln λ ; d H ) }≤ max { e − ε , } = 1 , where we used (5.35) in the second line. Still assuming that λ ( z ) < µ ( z ), we have s ( z ) ≤(cid:107) H ( z ) − (cid:107) = µ ( z ) by construction of s , which implies that min { e − ε s, µ } = e − ε s on aneighborhood of z , and therefore dil z (min { e − ε s, µ } ) = dil z ( e − ε s ) ≤ e − ε ≤ . Proceedingsimilarly if λ ( z ) > µ ( z ) we also obtaindil z (min { e − ε s, λ } ) ≤ z (min { e − ε s, µ } ) ≤ . (5.76)We now focus on the local dilatation of the orientation θ , and for that purpose we observethat | min { e − ε s ( z ) , λ ( z ) } − min { e − ε s ( z ) , µ ( z ) }| ≤ | λ ( z ) − µ ( z ) | . Therefore according to Corollary 5.3.5 | min { e − ε s ( z ) , λ ( z ) } − min { e − ε s ( z ) , µ ( z ) }| dil z ( θ ) ≤ D z ( a ) × ≤ D z ( a ε ) + ≤ z ( H ε ) + ≤ z ( H ε ) × ≤ z ( H ε ) + ≤ , which implies that2 H ε ∈ H g and 4 H ε ∈ H a according to Proposition 5.2.5 and Remark 5.1.13. Hence forany z, z (cid:48) ∈ R d , since H ε ≤ e ε H (cid:48) , d × ( H ε ( z ) , H ε ( z (cid:48) )) ≤ d H ε ( z, z (cid:48) ) ≤ e ε d H (cid:48) ( z, z (cid:48) ) and d + ( H ε ( z ) , H ε ( z (cid:48) )) ≤ | z − z (cid:48) | . Letting ε → d × ( H (cid:48) ( z ) , H (cid:48) ( z (cid:48) )) ≤ d H (cid:48) ( z, z (cid:48) ) and d + ( H (cid:48) ( z ) , H (cid:48) ( z (cid:48) )) ≤ | z − z (cid:48) | . Therefore 4 H ∈ H a according to Remark 5.1.13, which concludes the proof of this corol-lary. (cid:5) .4. 
From mesh to metric The techniques presented in this section show how to produce an metric H ∈ H i , H a or H g from an isotropic, quasi-acute (in two dimensions), or graded mesh T . These resultsare summarized in Proposition 5.4.1 below.Combining this proposition with its counterpart Proposition 5.5.1 in the next section,on the generation of a mesh from a metric, yields the main result of this chapter, Theorem5.1.14, which states the equivalence of the classes T i,C ⊂ T a,C ⊂ T g,C of meshes and H i ⊂ H a ⊂ H g of metrics. Proposition 5.4.1. For any C > there exits C = C ( C , d ) such that the followingholds :i) For any T ∈ T i,C there exists H ∈ H i which is C -adapted to T .ii) If d = 2 , then for any T ∈ T a,C there exists H ∈ H a which is C -adapted to T .iii) For any T ∈ T g,C there exists H ∈ H g which is C -adapted to T . The rest of this section is devoted to the proof of this proposition. For any mesh T ∈ T ,and for any vertex v of T , we denote by n T ( v ) the number of simplices in the mesh T containing v : n T ( v ) := { T ∈ T ; v ∈ T } . We denote by H g T ∈ H the piecewise linear metric on the mesh T which satisfies for eachvertex v of T H g T ( v ) := 1 n T ( v ) (cid:88) T ∈T s.t. v ∈ T H T . (5.77)The next lemma establishes a property of the barycentric coordinates on a simplex,which is useful for the analysis of the regularity of a piecewise linear function on ananisotropic mesh, such as the metric H g T . Lemma 5.4.2. Let T be a simplex and let V be the collection of vertices of T . Let ( λ v ) v ∈ V be the barycentric coordinates on T . Then for all z, z (cid:48) ∈ T (cid:107) z − z (cid:48) (cid:107) H T = d + 1 d (cid:88) v ∈ V | λ v ( z ) − λ v ( z (cid:48) ) | . (5.78) Proof: We first assume that T is the reference equilateral simplex T eq . For any two vertices v, v (cid:48) in the collection V eq of vertices of T eq one has (cid:104) v, v (cid:48) (cid:105) = (cid:26) v = v, (cid:48) − /d if v (cid:54) = v (cid:48) . 
(5.79)

Indeed $|v| = 1$ by construction for any vertex $v$ of $T_{\rm eq}$, and for any $v' \in V_{\rm eq}$ distinct from $v$ the scalar product $\langle v, v'\rangle$ has a value $\alpha$ independent of $v$ and $v'$ by symmetry. Since $T_{\rm eq}$ is centered at the origin we obtain
\[
0 = \Big\langle v, \sum_{v' \in V_{\rm eq}} v'\Big\rangle = |v|^2 + d\alpha,
\]
which establishes (5.79). We thus obtain for any $z, z' \in T_{\rm eq}$
\[
\begin{aligned}
|z - z'|^2 &= \Big|\sum_{v \in V_{\rm eq}} (\lambda_v(z) - \lambda_v(z'))\, v\Big|^2\\
&= \sum_{v \in V_{\rm eq}} |\lambda_v(z) - \lambda_v(z')|^2 - \frac1d \sum_{v \ne v'} (\lambda_v(z) - \lambda_v(z'))(\lambda_{v'}(z) - \lambda_{v'}(z')).
\end{aligned}
\]
Therefore
\[
|z - z'|^2 = \left(1 + \frac1d\right) \sum_{v \in V_{\rm eq}} |\lambda_v(z) - \lambda_v(z')|^2 - \frac1d \Big(\sum_{v \in V_{\rm eq}} \big(\lambda_v(z) - \lambda_v(z')\big)\Big)^2,
\]
which concludes the proof in the case of $T_{\rm eq}$, since $\mathcal H_{T_{\rm eq}} = \mathrm{Id}$ and since the barycentric coordinates of $z$ and of $z'$ each sum to $1$, so that the last term vanishes.

We now consider an arbitrary simplex $T$ and an affine change of coordinates $\Phi$ such that $\Phi(T) = T_{\rm eq}$, $\Phi(z) = \phi z + z_0$, where $\phi \in GL_d$ and $z_0 \in \mathbb R^d$. We have $\mathcal H_T = \phi^T\phi$ according to (5.5), hence for all $z, z' \in T$
\[
|\Phi(z) - \Phi(z')| = |\phi(z - z')| = \|z - z'\|_{\mathcal H_T}.
\]
Furthermore the barycentric coordinates of $z$ and $z'$ in $T$ are the same as those of $\Phi(z)$ and $\Phi(z')$ in $\Phi(T) = T_{\rm eq}$. This implies (5.78) and concludes the proof of this lemma. $\square$

The next proposition immediately implies Point iii) of Proposition 5.4.1. Indeed for any mesh $\mathcal T \in \mathbf T_{g,C}$ the metric $dC\, H^g_{\mathcal T}$ belongs to $\mathbf H_g$ and is $C\sqrt d$ equivalent to $\mathcal T$.

Proposition 5.4.3. For any $\mathcal T \in \mathbf T_{g,C}$ the metric $H^g_{\mathcal T}$ is $C$-adapted to $\mathcal T$, and $dC\, H^g_{\mathcal T} \in \mathbf H_g$.
Proof: We consider a simplex T ∈ T , and we recall that for any neighbor T (cid:48) of T one has C − H T ≤ H T (cid:48) ≤ C H T . Averaging these matrices as in (5.77) we obtain for any vertex v of TC − H T ≤ H g T ( v ) ≤ C H T . (5.80)Averaging with respect to the barycentric coordinates of a point z ∈ T we obtain C − H T ≤ H g T ( z ) ≤ C H T , (5.81)which establishes as announced that the metric H g T is C -adapted to T . .4. From mesh to metric H g T , and for that purpose we consider twopoints z, z (cid:48) ∈ T and a vector u ∈ R d \ { } . We thus have (cid:107) u (cid:107) H g T ( z ) (cid:107) u (cid:107) H g T ( z (cid:48) ) = (cid:80) v ∈ V λ v ( z ) (cid:107) u (cid:107) H g T ( v ) (cid:80) v ∈ V λ v ( z (cid:48) ) (cid:107) u (cid:107) H g T ( v ) = 1 + (cid:80) v ∈ V ( λ v ( z ) − λ v ( z (cid:48) )) (cid:107) u (cid:107) H g T ( v ) (cid:80) v ∈ V λ v ( z (cid:48) ) (cid:107) u (cid:107) H g T ( v ) ≤ C (cid:80) v ∈ V | λ v ( z ) − λ v ( z (cid:48) ) | (cid:80) v ∈ V λ v ( z (cid:48) ) ≤ C (cid:115) ( d + 1) (cid:88) v ∈ V | λ v ( z ) − λ v ( z (cid:48) ) | = 1 + C √ d (cid:107) z − z (cid:48) (cid:107) H T , where we used (5.80) in the third line, the Cauchy-Schwartz inequality in the fourth line,and Lemma 5.4.2 in the fifth line. Proceeding similarly for (cid:107) u (cid:107) H g T ( z (cid:48) ) / (cid:107) u (cid:107) H g T ( z ) we obtain d × ( H g T ( z ) , H g T ( z (cid:48) )) = max u (cid:54) =0 (cid:12)(cid:12)(cid:12) ln (cid:107) u (cid:107) H g T ( z ) − ln (cid:107) u (cid:107) H g T ( z (cid:48) ) (cid:12)(cid:12)(cid:12) ≤ 12 ln(1 + C √ d (cid:107) z − z (cid:48) (cid:107) H T ) ≤ C √ d (cid:107) z − z (cid:48) (cid:107) H T . Hence according to (5.81) and (5.35) for all z ∈ int( T )dil z ( H g T ) × = dil z ( H g T ; d × , (cid:107) · (cid:107) H ( z ) ) ≤ C dil z ( H g T ; d × , (cid:107) · (cid:107) H T ) ≤ C √ d/ . 
If follows from Corollary 5.2.6 that dil z ( H g T ) × ≤ C √ d/ z ∈ R d , and from Remark5.1.13 that ( dC / H ∈ H g , which concludes the proof. (cid:5) We associate to any mesh T ∈ T a metric H i T defined as follows : for all z ∈ R d , H i T ( z ) := (cid:107) H g T ( z ) (cid:107) Id . The next result immediately implies Point i) of Proposition 5.4.1. Indeed for any mesh T ∈ T i,C the metric dC H i T belongs to H i and is C √ d equivalent to the mesh T . Corollary 5.4.4. For any mesh T ∈ T i,C the metric H i T is C -adapted to the mesh T ,and dC H i T ∈ H i . Proof: We have for any simplex T ∈ T C − (cid:107)H T (cid:107) Id ≤ (cid:107)H T (cid:107) ρ ( T ) Id = (cid:107)H − T (cid:107) − Id ≤ H T ≤ (cid:107)H T (cid:107) Id . (5.82)38 Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? For any point z ∈ T we obtain since C − H g T ( z ) ≤ H T ≤ C H g T ( z ) C − H i T ( z ) = C − (cid:107) H g T ( z ) (cid:107) Id ≤ (cid:107)H T (cid:107) Id ≤ C (cid:107) H g T ( z ) (cid:107) = C H i T ( z ) . Combining this with (5.82) we obtain C − H i T ( z ) ≤ H T ≤ C H i T ( z ) , which establishes as announced that the metric H i T is C equivalent to the mesh T .According to Proposition 5.4.3 we have dC H g T ∈ H g . Corollary 5.2.8 thus impliesthat dC H i T ∈ H i , which concludes the proof of this proposition. (cid:5) Before turning to the case of quasi-acute metrics we prove a lemma on the neighbo-rhood of a simplex in a graded mesh. For any mesh T ∈ T and any closed set E ⊂ R d the neighborhood V T ( E ) of E in T is defined as the union of all simplices T ∈ T whichintersect E : V T ( E ) := (cid:91) T ∈T T ∩ E (cid:54) = ∅ T. Proposition 5.4.5. Let T ∈ T g,C and let T ∈ T . For all x ∈ T and all y ∈ R d \ V T ( T ) we have (cid:107) x − y (cid:107) H T ≥ ( C √ d ) − . Proof: We consider the function ϕ : R d → R which is piecewise linear on the mesh T ,and which satisfies for any vertex v of the mesh T ϕ ( v ) = (cid:26) v ∈ T, v / ∈ T. 
We denote by V (cid:48) the collection of vertices of a simplex T (cid:48) ∈ T , and by ( λ v ( z )) v ∈ V (cid:48) thebarycentric coordinates of a point z ∈ T (cid:48) . We have for any z, z (cid:48) ∈ T (cid:48) | ϕ ( z ) − ϕ ( z (cid:48) ) | ≤ (cid:88) v ∈ V (cid:48) | λ v ( z ) − λ v ( z (cid:48) ) | ϕ ( v ) ≤ (cid:115) ( d + 1) (cid:88) v ∈ V (cid:48) | λ v ( z ) − λ v ( z (cid:48) ) | = √ d (cid:107) z − z (cid:48) (cid:107) H T (cid:48) . Since ϕ is identically 0 on T (cid:48) if T ∩ T (cid:48) = ∅ , and since H T (cid:48) ≤ C H T otherwise, we obtainfor any z, z (cid:48) ∈ T (cid:48) | ϕ ( z ) − ϕ ( z (cid:48) ) | ≤ C √ d (cid:107) z − z (cid:48) (cid:107) H T . Therefore dil z ( ϕ, (cid:107) · (cid:107) H T ) for any T (cid:48) ∈ T and any z ∈ T (cid:48) , which implies that ϕ :( R d , (cid:107) · (cid:107) H T ) → R is C √ d -Lipschitz according to Corollary 5.2.6. For any x ∈ T and any y ∈ R d \ V T ( T ) we have ϕ ( x ) = 1 and ϕ ( y ) = 0, thus 1 = ϕ ( x ) − ϕ ( y ) ≤ C √ d (cid:107) x − y (cid:107) H T which concludes the proof of this proposition. (cid:5) .4. From mesh to metric d = 2. For each triangulation T ∈ T we define a function s T : R → R ∗ + and a metric H a T ∈ H as follows : for all z ∈ R s T ( z ) := inf z (cid:48) ∈ IR | z − z (cid:48) | + (cid:107) H g T ( z (cid:48) ) − (cid:107) , (5.83) H a T ( z ) := max { s T ( z ) − , H g T ( z ) } , (5.84)where the maximum of a real and of a symmetric matrix is defined at (5.73). The nextproposition immediately implies Point ii) of Proposition 5.4.1. Indeed the metric 32 C H a T belongs to H a and is √ CC adapted to the mesh T . Proposition 5.4.6. For any C ≥ there exists C = C ( C ) such that the following holds.For any mesh T ∈ T a,C the metric H a T is C -adapted to the mesh T , and C H a T ∈ H a . The fact that 32 C H a T ∈ H a directly follows from the fact that 2 C H g T ∈ H g andfrom Corollary 5.3.6.We define d T ( z ) := (cid:107) H g T ( z ) − (cid:107) for each z ∈ R . 
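The function $s_{\mathcal T}$ of (5.83) is an infimal convolution of the local size $d_{\mathcal T}$ with the Euclidean distance, which is what makes it $1$-Lipschitz and bounded above by $d_{\mathcal T}$. The following Python sketch illustrates these two properties on a discrete one-dimensional analogue; the sample points, the piecewise constant size function and the function name `inf_convolution` are assumptions made for the illustration, not part of the construction in the text.

```python
def inf_convolution(xs, d_vals):
    """Discrete analogue of s(z) = inf_{z'} |z - z'| + d(z') on the sample xs."""
    return [min(abs(x - y) + dy for y, dy in zip(xs, d_vals)) for x in xs]

# Hypothetical sample: a local size function with a sharp jump at x = 0.5.
xs = [i / 50 for i in range(51)]
d_vals = [1.0 if x < 0.5 else 0.05 for x in xs]
s_vals = inf_convolution(xs, d_vals)

# s <= d pointwise (take z' = z in the infimum).
below = all(s <= d + 1e-12 for s, d in zip(s_vals, d_vals))

# s is 1-Lipschitz, by the triangle inequality applied to |z - z'|.
lipschitz = all(abs(s_vals[i] - s_vals[j]) <= abs(xs[i] - xs[j]) + 1e-12
                for i in range(len(xs)) for j in range(len(xs)))

print(below, lipschitz)
```

On this sample the jump of the size function is replaced by a ramp of slope one, which is the smoothing effect exploited when passing from $H^g_{\mathcal T}$ to $H^a_{\mathcal T}$.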
We establish below that there existsa constant η = η ( C ), 0 < η ≤ 1, such that for all z, z (cid:48) ∈ R if | z (cid:48) − z | ≤ ηd T ( z ) then d T ( z (cid:48) ) ≥ ηd T ( z ) . (5.85)Before turning to the proof of this property we show how it leads to the proof of Propo-sition 5.4.6. For all z ∈ R d we have s T ( z ) = inf z (cid:48) ∈ IR | z − z (cid:48) | + d T ( z (cid:48) ) ≥ inf z (cid:48) ∈ IR min {| z − z (cid:48) | , d T ( z (cid:48) ) } ≥ ηd T ( z (cid:48) ) , indeed | z − z (cid:48) | ≥ ηd T ( z ) or d T ( z (cid:48) ) ≥ ηd T ( z ) for any z (cid:48) ∈ R d according to (5.85). Combiningthis with the definition (5.84) of H a T we obtain for all z ∈ R η H a T ( z ) ≤ H g T ( z ) ≤ H a T ( z ) . Recalling that H g T is C -adapted to the mesh T , see proposition 5.4.3, we obtain for all T ∈ T and all z ∈ Tη C − H a T ( z ) ≤ C − H g T ( z ) ≤ H T ≤ C H g T ( z ) ≤ C H a T ( z )which establishes that the metric H a T is C /η adapted to the mesh T , and concludes theproof of Proposition 5.4.6.For notational simplicity we denote from this point H := H g T and C := √ C , hence C H ∈ H g and T is C adapted to H . It follows from (5.10) that C − (cid:107) H ( z ) − (cid:107) ≤ (cid:107)H − T (cid:107) ≤ diam( T ) ≤ (cid:107)H − T (cid:107) ≤ C (cid:107) H ( z ) − (cid:107) , for any T ∈ T and z ∈ T , which impliesdiam( T ) / (2 C ) ≤ d T ( z ) ≤ C diam( T ) . (5.86)40 Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? ( e ) edge e ω + u ω − u ω v e ω + u ω − u ω v v ( x , y ) ω + u ω − u ω z ( x ′ , y ′ ) Figure e and the set ♦ ( e ) (left), sequence ( z n ) n ≥ (center), diameter of atriangle intersecting ♦ ( e ) (right).The quantity d T ( z ) should therefore be heuristically regarded as the diameter of thetriangle T ∈ T containing z . 
Property (5.85) heuristically states that the diameters areconstrained to vary in a Lipschitz manner in a quasi-acute triangulation.We now turn to the proof of (5.85), and for that purpose we introduce some notations.For each u ∈ R , we denote by u ⊥ the vector obtained by rotating u by π / e = [ ω − u, ω + u ] ⊂ R we define the diamond ♦ ( e ) of e as follows ♦ ( e ) := { z ∈ R ; |(cid:104) z − ω, u (cid:105)| + C |(cid:104) z − ω, u ⊥ (cid:105)| < | u | } . This set is illustrated on Figure 5.3 (left).The next lemma considers a triangulation T (cid:48) on which the measure of sliverness isuniformly bounded by C , and compares the length an edge e of T (cid:48) with the diameter ofthe triangles in T (cid:48) which intersect ♦ ( e ). Lemma 5.4.7. Let T (cid:48) ∈ T be a triangulation such that S ( T ) ≤ C for all T ∈ T (cid:48) . Let e = [ ω − u, ω + u ] be an edge of T (cid:48) . Then there is no vertex of T (cid:48) in the diamond ♦ ( e ) ,and for all z ∈ ♦ ( e ) and any triangle T ∈ T (cid:48) containing z , we have diam( T ) ≥ | u | − C |(cid:104) z − ω, u ⊥ / | u |(cid:105)| . Proof: We denote by u x := (1 , 0) and u y := (0 , 1) the canonical basis of R , and weassume up to a translation and a rotation that ω = 0 and u = u y . We recall that for anytriangle T with maximal angle θ one has S ( T ) = max { , tan( θ/ } . We assume for contradiction that there exists a vertex v = ( x , y ) ∈ ♦ ( e ), and weassume without loss of generality that x < 0. We consider a vertex v = ( x , y ) of T (cid:48) such that (cid:104) v − v , u x (cid:105) ≥ θ := (cid:30) ( v − v , u x ) = arccos (cid:18) (cid:104) v − v , u x (cid:105)| v − v || u x | (cid:19) has the smallest possible value. By construction there exists a triangle T ∈ T containing v and v which has an angle larger or equal or equal than 2 θ at the vertex v . It follows .4. 
From mesh to metric δ (cid:3) δ (cid:3) δ (cid:3) ( e ) Figure e , the diamond ♦ ( e ) and the rectangle (cid:3) δ involved in Lemma 5.4.8(left), the rectangles (cid:3) ( e ) and (cid:3) δ .that tan θ ≤ S ( T ) ≤ C , and therefore | y − y | ≤ ( x − x ) tan θ ≤ C ( x − x ) . Proceeding inductively, as illustrated on Figure 5.3 (center), we define a sequence v n =( x n , y n ), n ≥ 0, of vertices of T satisfying | y n +1 − y n | ≤ C ( x n +1 − x n ) . Adding theseinequalities together we obtain | y n − y | ≤ C ( x n − x ) . It follows that one of the edges [ v n , v n +1 ], n ≥ 0, of the mesh T (cid:48) intersects the edge e ,which is a contradiction. As announced the diamond ♦ ( e ) therefore does not contain anyvertex.We now consider a point z = ( x , y ) ∈ ♦ ( e ) and we assume without loss of generalitythat x ≥ 0. We consider a triangle T containing z , and an edge e (cid:48) = [ ω (cid:48) − u (cid:48) , ω (cid:48) + u (cid:48) ] of T which is on the left of z (in the sense that the edge e (cid:48) and the segment [(0 , y ) , ( x , y )]joining the point z to its orthogonal projection on the edge e intersect). The point z andthe triangle T are illustrated on Figure 5.3 (right).Since there is no vertex of T (cid:48) in the diamond ♦ ( e ), the edge e (cid:48) intersects the boundary of ♦ ( e ) at two points. We denote these points by ( x, y ) and ( x (cid:48) , y (cid:48) ), and we assume withoutloss of generality that y ≥ ≥ y (cid:48) . Observing that 0 ≤ x ≤ C − , 0 ≤ x (cid:48) ≤ C − andmin { x, x (cid:48) } ≤ x we obtain y − y (cid:48) = (1 − C x ) − ( − C x (cid:48) ) ≥ − C x . Since u = u y = (0 , ω = 0 and 2 u (cid:48) = λ ( x − x (cid:48) , y − y (cid:48) ) for some | λ | ≥ 1, we obtaindiam( T ) = 2 | u (cid:48) | ≥ |(cid:104) u, u (cid:48) (cid:105)| = y − y (cid:48) ≥ − C x = | u | − C |(cid:104) z − ω, u ⊥ (cid:105)| , which concludes the proof of this proposition. 
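The two geometric objects used in Lemma 5.4.7 are easy to evaluate numerically. The Python sketch below computes the measure of sliverness of a triangle and tests membership in the diamond of an edge. It rests on two readings of the text that should be treated as assumptions: the sliverness is taken as $S(T) = \max\{1, \tan(\theta/2)\}$ with $\theta$ the largest angle of $T$, and the diamond as $\{z \,;\, |\langle z - \omega, u\rangle| + C\,|\langle z - \omega, u^\perp\rangle| < |u|^2\}$, where the exponent on $|u|$ is chosen for homogeneity. The function names and the value of `C` are hypothetical.

```python
import math

def angles(p, q, r):
    """Interior angles (in radians) of the triangle with vertices p, q, r."""
    def angle_at(a, b, c):
        ux, uy = b[0] - a[0], b[1] - a[1]
        vx, vy = c[0] - a[0], c[1] - a[1]
        return math.atan2(abs(ux * vy - uy * vx), ux * vx + uy * vy)
    return angle_at(p, q, r), angle_at(q, r, p), angle_at(r, p, q)

def sliverness(p, q, r):
    """Assumed definition: S(T) = max(1, tan(theta / 2)), theta the largest angle."""
    return max(1.0, math.tan(max(angles(p, q, r)) / 2))

def in_diamond(z, omega, u, C):
    """Assumed definition of the diamond of the edge e = [omega - u, omega + u]."""
    wx, wy = z[0] - omega[0], z[1] - omega[1]
    u_perp = (-u[1], u[0])  # u rotated by pi / 2
    along = abs(wx * u[0] + wy * u[1])
    across = abs(wx * u_perp[0] + wy * u_perp[1])
    return along + C * across < u[0] ** 2 + u[1] ** 2

# A flat triangle: the half-angle at the apex has tangent 0.5 / 0.1.
flat = sliverness((0.0, 0.0), (1.0, 0.0), (0.5, 0.1))

# Normalized edge as in the proof: omega = 0 and u = u_y.
inside = in_diamond((0.1, 0.5), (0.0, 0.0), (0.0, 1.0), C=2.0)
outside = in_diamond((0.4, 0.5), (0.0, 0.0), (0.0, 1.0), C=2.0)
endpoint = in_diamond((0.0, 1.0), (0.0, 0.0), (0.0, 1.0), C=2.0)
print(flat, inside, outside, endpoint)
```

With the strict inequality, the endpoints of $e$ do not belong to the diamond, in line with the later remark that $e \not\subset \diamondsuit(e)$.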
(cid:5) From this point we denote by T (cid:48) a C -refinement of the quasi-acute mesh T ∈ T a,C ,which satisfies S ( T ) ≤ C for all T ∈ T (cid:48) . The next lemma involves a rectangular set (cid:3) ( e )42 Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? associated to an edge e of T (cid:48) , illustrated on Figure 5.4 (right). Note that e ⊂ int( (cid:3) ( e )),while e (cid:54)⊂ ♦ ( e ).It follows from (5.22) and Proposition 5.2.10 that for all z, u ∈ R d d T ( z + u ) ≥ (1 − C (cid:107) u (cid:107) H ( z ) ) d T ( z ) . (5.87) Lemma 5.4.8. There exists a constant δ = δ ( C , d ) > such that the following holds.For each edge e = [ ω − u, ω + u ] of T (cid:48) , let (cid:3) ( e ) be the rectangle defined by (cid:3) ( e ) = { z ∈ R ; |(cid:104) z − ω, u (cid:105)| ≤ (1 + δ ) | u | and |(cid:104) z − ω, u ⊥ (cid:105)| ≤ δ | u | } . One has d T ( z ) ≥ δ | u | for all z ∈ (cid:3) ( e ) . Proof: We denote by u x := (1 , 0) and u y := (0 , 1) the canonical basis of R . Up to arotation and a translation, we may assume that ω = 0 and u = u y = (0 , T ∈ T containing the edge e (note that e ⊂ T but that e may not be an edge of T , since e is an edge in the triangulation T (cid:48) but may not be an edge in the triangulation T ). It follows from (5.86) that d T ( z ) ≥ diam( T ) / (2 C ) ≥ diam( e ) / (2 C ) = C − for all z ∈ T , hence for all z ∈ e . Hence for all z ∈ e and all u ∈ R d , we obtain using(5.87) d T ( z + u ) ≥ (1 − C (cid:107) u (cid:107) H ( z ) ) C − ≥ (1 − C C (cid:107) u (cid:107) H T ) C − . We define C := 4 C C and we observe that, since (cid:107) u y (cid:107) H T ≤ 1, any point z (cid:48) of therectangle (cid:20) − C (cid:107) u x (cid:107) H T , C (cid:107) u x (cid:107) H T (cid:21) × (cid:2) − (1 + C − ) , C − (cid:3) . (5.88)can be written under the form z (cid:48) = z + u where z ∈ e and (cid:107) u (cid:107) H T ≤ / (2 C C ), andtherefore d T ( z (cid:48) ) ≥ / (2 C ) . 
We introduce a small constant δ = δ ( C ) > (cid:107)H T (cid:107) ≤ δ − , with δ := min { δ/C , / (2 C ) } . We thus assume from this point that (cid:107)H T (cid:107) ≥ δ − .We assume that δ < C − and we define the rectangle (cid:3) δ := [ − δ, δ ] × [1 − C δ, − C δ ] , which is included in ♦ ( e ) by construction. Since C H ∈ H g , the function z (cid:55)→ (cid:107) H ( z ) (cid:107) − is C -Lipschitz according to Corollary 5.2.8. We thus have for all z = ( x, y ) ∈ (cid:3) δ (cid:107) H ( z ) (cid:107) − ≤ (cid:107) H (0 , y ) (cid:107) − + C | x | ≤ C (cid:107)H T (cid:107) − + C | x | ≤ r ( δ ) := ( C + C ) δ. (5.89)On the other hand it follows from Lemma 5.4.7 that any z = ( x, y ) ∈ (cid:3) δ is contained ina triangle T (cid:48) such that diam( T (cid:48) ) ≥ (1 − C δ ), hence according to (5.86) d T ( z ) ≥ diam( T (cid:48) ) / (2 C ) ≥ R ( δ ) := (1 − C δ ) / (2 C ) . (5.90) .4. From mesh to metric z ∈ (cid:3) δ , we denote by θ ( z ) = (cos Θ( z ) , sin Θ( z )), Θ( z ) ∈ [0 , π ], the direction ofthe eigenvector associated to the small eigenvalue of H ( z ), in such way that (cid:107) θ ( z ) (cid:107) H ( z ) = (cid:107) H ( z ) − (cid:107) − = d T ( z ) − . (5.91)According to (5.87) we have for all z ∈ (cid:3) δ and all r ∈ R d T ( z + rθ ( z )) ≥ d T ( z )(1 − C | r | /d T ( z )) . For all z ∈ (cid:3) δ and all r ∈ R such that | r | ≤ R ( δ ) := R ( δ ) / (2 C ) we thus obtain d T ( z + rθ ( z )) ≥ R ( δ )(1 − C | r | /R ( δ )) ≥ R ( δ ) / . (5.92)We have for any (0 , y ) ∈ eC ≥ C (cid:107) u y (cid:107) H T ≥ (cid:107) u y (cid:107) H (0 ,y ) ≥ | cos Θ(0 , y ) | (cid:112) (cid:107) H (0 , y ) (cid:107) ≥ | cos Θ(0 , y ) | / ( δC ) , hence | cos Θ(0 , y ) | ≤ C δ. For all z ∈ (cid:3) δ we have according to Corollary 5.3.5( R ( δ ) − r ( δ )) dil z (Θ) ≤ ( (cid:107) H ( z ) − (cid:107) − (cid:107) H ( z ) (cid:107) − ) dil z (Θ) ≤ C . 
Hence we obtain for all z = ( x, y ) ∈ (cid:3) δ ( R ( δ ) − r ( δ )) | Θ( z ) − Θ(0 , y ) | ≤ C | x | ≤ C δ, hence since t (cid:55)→ cos t is a Lipschitz function | cos Θ( z ) | ≤ cos Θ ( δ ) := C δ + 2 C δR ( δ ) − r ( δ ) . (5.93)It follows that d T ( z (cid:48) ) ≥ R ( δ ) / z (cid:48) in the set { z + ˜ rθ ( z ) ; z ∈ (cid:3) δ , | ˜ r | ≤ R ( δ ) } ⊃ [ − δ , δ ] × [ − (1 + µ ) , µ ]where δ := δ − r cos Θ ( δ ) and µ := − C δ + r sin Θ ( δ ) (5.94)and where the constant 0 ≤ r ≤ R ( δ ) can be freely chosen. The rest of the proof consistsin choosing appropriate constants δ and r .As δ → r ( δ ) → R ( δ ) → / (2 C ) and cos Θ ( δ ) → 0. We thus choose δ = δ ( C ) > R ≥ / (4 C ) , cos Θ ≤ / (8 C ) , sin Θ ≥ / . (5.95)and such that r := 4 C δ , is smaller than R ( δ ) = R ( δ ) / (2 C ). Injecting this choice of r and (5.95) in (5.94) we thus obtain δ ≥ δ/ µ ≥ C δ. Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? This establishes under the hypothesis (cid:107)H T (cid:107) ≥ δ − that d T ( z ) ≥ / (8 C ) for all z in therectangle [ − δ/ , δ/ × [ − (1 + C δ ) , C δ ] , which concludes the proof of the proposition. (cid:5) Our next intermediate result establishes a technical property on the covering of asegment by smaller ones. Lemma 5.4.9. Let x , · · · , x k ∈ [0 , and let h , · · · , h k > be such that [0 , ⊂ (cid:91) ≤ i ≤ k [ x i − h i , x i + h i ] . (5.96) Let δ > and let h ∗ = h ∗ ( k, δ ) := exp(1 − k/δ ) /δ . Then [0 , ⊂ (cid:91) ≤ i ≤ k s.t. h i ≥ h ∗ [ x i − (1 + δ ) h i , x i + (1 + δ ) h i ] . (5.97) Proof: For each y ∈ [0 , 1] we denote by h ( y ) the value h i associated to the segment[ x i − h i , x i + h i ] of smallest length containing y . We thus have (cid:90) [0 , dyh ( y ) ≤ (cid:88) ≤ i ≤ k (cid:90) x i + h i x i − h i h i = 2 k. 
We assume for contradiction that (5.97) does not hold, and we consider a point x ∈ [0 , , y ∈ [0 , 1] there exists i ,1 ≤ i ≤ k , h ( y ) = h i , such that y ∈ [ x i − h i , x i + h i ] and x / ∈ [ x i − (1 + δ ) h i , x i + (1 + δ ) h i ].Therefore one of the two following possibilities holds δh ( y ) < | x − y | or h ( y ) < h ∗ . Therefore h ( y ) ≤ max {| x − y | /δ, h ∗ } , which implies (cid:90) dyh ( y ) ≥ (cid:90) dy max {| x − y | /δ, h ∗ } ≥ (cid:90) δh ∗ h ∗ + (cid:90) δh ∗ δy dy = δ (1 − ln( δh ∗ )) . Note that the second inequality is an equality if x = 0 or 1. Thus 2 k > δ (1 − ln( δh ∗ )),which contradicts the definition of h ∗ . This concludes the proof of this lemma. (cid:5) Observe that under the hypotheses of Lemma 5.4.9 we have[ − h ∗ δ, h ∗ δ ] ⊂ (cid:91) ≤ i ≤ k s.t. h i ≥ h ∗ [ x i − (1 + 2 δ ) h i , x i + (1 + 2 δ ) h i ] . (5.98)We now conclude the proof of Proposition 5.4.6. We consider a point z ∈ R containedin a triangle T ∈ T , and we denote by e the longest edge of T . Up to a rotation and a .4. From mesh to metric e Te e e h < h ∗ e e e e (cid:3) ( e ) (cid:3) ( e ) (cid:3) ( e ) Figure T ∈ T and the edges ( e i ) ≤ i ≤ k of T (cid:48) contained in T (left), theset (5.99) (right).translation, we may assume that e = [0 , u x ] joins the origin of R to u x := (1 , T (cid:48) is a C -refinement of T , the edge e of T is built of k ≤ C edges ( e i ) ≤ i ≤ k of T (cid:48) , e i = [( x i − h i ) u x , ( x i + h i ) u x ] . The triangle T and the edges ( e i ) ≤ i ≤ k are illustrated on Figure 5.5 (left).We define h ∗ := h ∗ ( C , δ / 2) = 2 exp(1 − C /δ ) /δ , where δ is the constant fromLemma 5.4.8, and we obtain using (5.98) that (cid:91) h ∗ ≤ h i (cid:3) ( e i ) ⊃ (cid:20) − h ∗ δ , h ∗ δ (cid:21) × [ − h ∗ δ , h ∗ δ ] , (5.99)This set is illustrated on Figure 5.5 (right). Since e is the longest edge of T , the angles of T at the extremities of e are acute. 
Furthermore the height h of T orthogonal to the edge e = [0 , u x ] satisfies h/ | T | = | T eq |√ det H T = | T eq | (cid:113) (cid:107)H T (cid:107)(cid:107)H − T (cid:107) − = | T eq |(cid:107)H − T (cid:107) ρ ( T ) ≥ | T eq | ρ ( T ) , where we used that (cid:107)H − T (cid:107) ≥ diam( T ) / ≥ / c := | T eq | / T ⊂ [0 , × [ − c/ρ ( T ) , c/ρ ( T )] (5.100)We distinguish two cases depending on the value of ρ ( T ). If ρ ( T ) ≥ ρ := 2 c/ ( h ∗ δ ) weobtain combining (5.99) and (5.100) that d T ≥ h ∗ δ on the set z + (cid:20) − h ∗ δ , h ∗ δ (cid:21) . Since d T ( z ) ≤ C diam( T ) = C this implies (5.85) with η = h ∗ δ / (2 C ). One the otherhand if ρ ( T ) ≤ ρ , then (cid:107) H ( z ) (cid:107) ≤ C (cid:107)H T (cid:107) ≤ ρ C (cid:107)H − T (cid:107) − ≤ ρ C (cid:107) H ( z ) − (cid:107) − ≤ C ρ d T ( z )46 Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? Figure u ∈ R d T ( z + u ) d T ( z ) ≥ − C (cid:107) u (cid:107) H ( z ) ≥ − C ∗ | u | d T ( z ) , where C ∗ := C ρ C . We thus obtain (5.85) with the constant η := 1 / (2 C ∗ ), whichconcludes the proof of Proposition 5.4.6. This section is devoted to the proof of the following result, on the generation of a meshfrom a metric. Proposition 5.5.1. There exists C = C ( d ) such that the following holds :i) For any H ∈ H i there exists T ∈ T i,C which is C -adapted to H .ii) If d = 2 , then for any H ∈ H a there exists T ∈ T a which is C -adapted to T .iii) If d = 2 , then for any H ∈ H g there exists T ∈ T g which is C -adapted to T . Point i) is established in § iii) in § ii) in § Remark 5.5.2. Let (cid:63) ∈ { i, a, g } be a symbol, and let ≥ c > and C ≥ be numericalconstants. Assume that for any metric H such that c H ∈ H (cid:63) there exists a mesh T ∈ T (cid:63),C which is C adapted to H . Then for any metric H (cid:48) ∈ H (cid:63) there exists a mesh T (cid:48) ∈ T (cid:63),C which is C/c -adapted to H (cid:48) . 
(Indeed, choose a mesh $\mathcal T'$ which is $C$-equivalent to the metric $H'/c^2$, and observe that $\mathcal T'$ is $C/c$-equivalent to $H'$.)

The survey paper [77] describes the construction of a hierarchical family of meshes of the rectangular domain $[0,1]^d$. All these meshes are refinements of a "fundamental" mesh
$$\mathcal T_0 = \{T_\sigma \,;\, \sigma \in \Sigma_d\},$$
where $\Sigma_d$ stands for the collection of permutations of the set $\{1, \cdots, d\}$, and $T_\sigma$ for the Kuhn simplex
$$T_\sigma := \{z \in [0,1]^d \,;\, z_{\sigma(1)} \le \cdots \le z_{\sigma(d)}\}.$$
Starting from $\mathcal T_0$, the algorithm described in [77] produces, guided by the user, a family $(\mathcal T_n)_{n \ge 0}$ of conforming meshes of the cube $[0,1]^d$, proceeding as follows. Assume that the mesh $\mathcal T_n$ has already been generated.
– (Marking) The user selects a subset $\mathcal M_n \subset \mathcal T_n$.
– (Refinement) The algorithm generates a 2-refinement $\mathcal T_{n+1}$ of $\mathcal T_n$,
$$\mathcal T_{n+1} := \operatorname{Refine}(\mathcal T_n, \mathcal M_n),$$
which does not contain any of the marked simplices: $\mathcal T_{n+1} \cap \mathcal M_n = \emptyset$.

The elements of the meshes $(\mathcal T_n)_{n \ge 0}$ have a very specific structure, the main features of which are described below.

Property 5.5.3. Any simplex $T$ produced by this algorithm satisfies the following properties.
a) (Discrete volumes) The volume of $T$ has the form
$$|T| = \frac{2^{-g(T)}}{d!}, \tag{5.101}$$
where $g(T) \ge 0$ is an integer called the "generation" of $T$.
b) (Finite number of classes) The simplex $2^{g(T)/d}(T - z_T)$ belongs to a finite family $\mathbb T$ of simplices.

The survey paper [77] also contains a result, Proposition 4.1, which establishes that $\mathcal T_{n+1} := \operatorname{Refine}(\mathcal T_n, \mathcal M_n)$ is a local refinement of $\mathcal T_n$ in the following sense: for all $T' \in \mathcal T_n \setminus \mathcal T_{n+1}$ there exists $T \in \mathcal M_n$ such that
$$g(T) \ge g(T') \quad\text{and}\quad d(T, T') \le D\, 2^{-g(T')/d}, \tag{5.102}$$
where $D = D(d)$ is a fixed constant and where $d(T, T') := \min\{|z - z'| \,;\, z \in T,\ z' \in T'\}$.
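The Kuhn partition of the unit cube described above can be checked on small dimensions with a short script. The following is only a minimal sketch: the helper names (`kuhn_vertices`, `volume`, `contains`) are hypothetical and do not come from [77], permutations are 0-indexed, and the determinant is computed by a naive expansion that is only reasonable for small $d$. It verifies that there are $d!$ simplices, each of volume $1/d!$ (generation $g(T) = 0$ in (5.101)), and that a point with distinct coordinates lies in exactly one of them.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def kuhn_vertices(sigma):
    """Vertices of T_sigma = {z in [0,1]^d : z[sigma[0]] <= ... <= z[sigma[d-1]]}.

    sigma is a 0-indexed permutation of range(d) (an indexing assumption).
    """
    d = len(sigma)
    vertices = []
    for k in range(d + 1):
        v = [0] * d
        for idx in sigma[d - k:]:  # the k last coordinates in the ordering equal 1
            v[idx] = 1
        vertices.append(tuple(v))
    return vertices


def det(m):
    """Exact determinant by Laplace expansion (only sensible for small d)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def volume(vertices):
    """Volume of a d-simplex given its d+1 vertices."""
    d = len(vertices) - 1
    v0 = vertices[0]
    edges = [[Fraction(v[i] - v0[i]) for i in range(d)] for v in vertices[1:]]
    return abs(det(edges)) / factorial(d)


def contains(sigma, z):
    """True if z satisfies the ordering constraints defining T_sigma."""
    return all(z[sigma[i]] <= z[sigma[i + 1]] for i in range(len(sigma) - 1))


if __name__ == "__main__":
    d = 3
    simplices = [kuhn_vertices(s) for s in permutations(range(d))]
    print(len(simplices), "simplices, volumes:", {volume(v) for v in simplices})
```

Exact rational arithmetic is used so that the volume identity can be tested with equality rather than with a floating-point tolerance.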
The next lemma shows that this refinement algorithm can be used to produce a conforming isotropic mesh $\mathcal T$ of the unit cube $[0,1]^d$, such that the diameters $\operatorname{diam}(T)$, $T \in \mathcal T$, are specified by a given Lipschitz function.

Lemma 5.5.4. Let $s : [0,1]^d \to (0, \sqrt d]$ be a $c$-Lipschitz function, where $0 < c \le 1$. Let $(\mathcal T_n)_{n \ge 0}$ be the sequence of meshes produced by the algorithm when the collection $\mathcal M_n \subset \mathcal T_n$ of marked simplices is defined as follows:
$$\mathcal M_n := \Big\{ T \in \mathcal T_n \,;\, \min_T s < \operatorname{diam}(T) \Big\}. \tag{5.103}$$
Then the sequence of meshes stabilizes: there exists $N \ge 0$ such that $\mathcal T_n = \mathcal T_N$ for all $n \ge N$. Furthermore we have for all $T \in \mathcal T_N$ and all $z \in T$
$$\operatorname{diam}(T) \le s(z) \le K_c \operatorname{diam}(T), \tag{5.104}$$
where
$$K_c := 2^{1/d}\, \frac{D_+(1+c) + cD}{D_-}, \tag{5.105}$$
in which $D_+ := \max\{\operatorname{diam}(T) \,;\, T \in \mathbb T\}$ and $D_- := \min\{\operatorname{diam}(T) \,;\, T \in \mathbb T\}$, the set $\mathbb T$ being the finite family of Property 5.5.3 b).

Proof: We first show that for any $n \ge 0$, any $T^* \in \mathcal T_n$, and any $z \in T^*$ we have $s(z) \le K_c \operatorname{diam}(T^*)$. Indeed if $T^* \in \mathcal T_0$ then
$$s(z) \le \sqrt d = \operatorname{diam}(T^*) \le K_c \operatorname{diam}(T^*),$$
since $K_c \ge 1$ and $s$ is uniformly bounded by $\sqrt d$. Otherwise $T^*$ has a father $T'$ which was bisected in the refinement process, hence there exists a simplex $T$ such that (5.102) holds and which was selected for bisection at some stage of the algorithm. We thus obtain for any $z \in T^*$, since $g(T^*) - 1 \le g(T') \le g(T)$ and $\min_T s < \operatorname{diam}(T)$,
$$
\begin{aligned}
s(z) &\le \min_T s + c\,(\operatorname{diam}(T') + d(T, T'))\\
&\le \operatorname{diam}(T) + c\,(\operatorname{diam}(T') + d(T, T'))\\
&\le D_+ 2^{-g(T)/d} + c\,\big(D_+ 2^{-g(T')/d} + D\, 2^{-g(T')/d}\big)\\
&\le 2^{-(g(T^*) - 1)/d}\,\big(D_+(1+c) + cD\big)\\
&= K_c D_- 2^{-g(T^*)/d}\\
&\le K_c \operatorname{diam}(T^*),
\end{aligned}
$$
which concludes the proof of the right part of (5.104).

From the fourth line of this inequality we also obtain a uniform lower bound on the volume of the simplices generated by the refinement procedure:
$$|T^*| = \frac{2^{-g(T^*)}}{d!} \ge c_0 := \frac{1}{d!} \left( \frac{\min\{s(z) \,;\, z \in [0,1]^d\}}{2^{1/d}\,(D_+(1+c) + cD)} \right)^{d}.$$
It follows that $\#(\mathcal T_n)$ is uniformly bounded by $1/c_0$, which immediately implies that the sequence $(\mathcal T_n)_{n \ge 0}$ of meshes stabilizes.

Let $N$ be such that $\mathcal T_n = \mathcal T_N$ for all $n \ge N$. We thus have $\mathcal M_N = \emptyset$ and therefore $\operatorname{diam}(T) \le \min_T s$ for all $T \in \mathcal T_N$, which implies the left part of (5.104) and concludes the proof. $\diamond$

The rest of this section is devoted to the proof of Point i) of Proposition 5.5.1. We thus consider a metric $H \in \mathbf H_i$, and we recall that there exists a 1-Lipschitz function $s : \mathbb R^d \to \mathbb R_+^*$ such that for all $z \in \mathbb R^d$
$$H(z) = \frac{\mathrm{Id}}{s(z)^2}.$$
For each $n \ge 0$ we define a function $s_n : [0,1]^d \to \mathbb R_+^*$, which is clearly 1-Lipschitz, as follows:
$$s_n(z) := 2^{-n} s(2^n (z - z_0)), \tag{5.106}$$
where $z_0 = (1/2, \cdots, 1/2)$ is the barycenter of the cube $[0,1]^d$. For all $z \in [0,1]^d$ and all $n \ge 0$ we have
$$s_n(z) \le s_n(z_0) + |z - z_0| \le 2^{-n} s(0) + \frac{\sqrt d}{2}.$$
The function $s_n$ is therefore uniformly bounded by $\sqrt d$ on $[0,1]^d$ when $n$ is sufficiently large. We denote by $\mathcal T_n^0$ the mesh of the cube $[0,1]^d$ described by Lemma 5.5.4 for the function $s_n$. We thus have
$$\operatorname{diam}(T) \le s_n(z) \le K \operatorname{diam}(T) \quad\text{for all } T \in \mathcal T_n^0,\ z \in T, \tag{5.107}$$
where $K = K(d)$ is the constant from Lemma 5.5.4. We denote by $\mathcal T_n$ the mesh of the cube $[-2^{n-1}, 2^{n-1}]^d$ obtained by translating the mesh $\mathcal T_n^0$ by $-z_0$ and dilating it by $2^n$. In mathematical terms:
$$\mathcal T_n := \{2^n (T - z_0) \,;\, T \in \mathcal T_n^0\}.$$
In view of (5.106) and (5.107) we thus have
$$\operatorname{diam}(T) \le s(z) \le K \operatorname{diam}(T) \quad\text{for all } T \in \mathcal T_n,\ z \in T. \tag{5.108}$$
We denote by $C$ the smallest constant such that for all $T$ in the finite set $\mathbb T$ one has
$$C^{-2}\, \frac{\mathrm{Id}}{\operatorname{diam}(T)^2} \le \mathcal H_T \le C^2\, \frac{\mathrm{Id}}{(K \operatorname{diam}(T))^2},$$
and we thus obtain for all $T \in \mathcal T_n$ and all $z \in T$
$$C^{-2} H(z) \le \mathcal H_T \le C^2 H(z). \tag{5.109}$$
Unfortunately the mesh $\mathcal T_n$ does not cover $\mathbb R^d$ but only the cube $[-2^{n-1}, 2^{n-1}]^d$. The next lemma shows that a global mesh $\mathcal T$ of the infinite domain $\mathbb R^d$ can be extracted from the sequence of meshes $(\mathcal T_n)_{n \ge 0}$.

Lemma 5.5.5.
Let $(\Omega_n)_{n \ge 0}$ be an increasing sequence of polygonal domains of $\mathbb R^d$, which exhausts $\mathbb R^d$. In other words
$$\bigcup_{n \ge 0} \Omega_n = \mathbb R^d \quad\text{and}\quad \Omega_n \subset \Omega_{n+1} \text{ for all } n \ge 0.$$
Let $H \in \mathbf H_g$, and let $(\mathcal T_n)_{n \ge 0}$ be a sequence of conforming simplicial meshes of the domains $\Omega_n$. Assume that for all $n \ge 0$, all $T \in \mathcal T_n$ and all $z \in T$ one has
$$C^{-2} H(z) \le \mathcal H_T \le C^2 H(z). \tag{5.110}$$
Then there exists a mesh $\mathcal T \in \mathbf T$ such that $C^{-2} H(z) \le \mathcal H_T \le C^2 H(z)$ for all $T \in \mathcal T$ and all $z \in T$.

Proof: The heuristic idea of this proof is to "extract a converging subsequence" from the sequence $(\mathcal T_n)_{n \ge 0}$ of meshes. Unfortunately meshes are discrete objects of combinatorial nature, and the convergence of meshes has no meaning a priori. We therefore use Lipschitz functions as an intermediate object, because their convergence is well defined and they benefit from a compactness property.

We associate a function $\varphi_T : \mathbb R^d \to [0,1]$ to each simplex $T$, which is defined as follows. The function $\varphi_T$ is supported on $T$, and for all $z \in T$ we have, in terms of the barycentric coordinates $\lambda_v(z)$ of $z$ with respect to the vertices $v \in V$ of $T$,
$$\varphi_T(z) := \min_{v \in V} \lambda_v(z).$$
According to Lemma 5.4.2, we have for any $z, z' \in T$ and any $v \in V$
$$|\lambda_v(z) - \lambda_v(z')| \le \|z - z'\|_{\mathcal H_T},$$
hence
$$|\varphi_T(z) - \varphi_T(z')| \le \max_{v \in V} |\lambda_v(z) - \lambda_v(z')| \le \|z - z'\|_{\mathcal H_T}. \tag{5.111}$$
For all $n \ge 0$ we define $\varphi_n : \mathbb R^d \to [0,1]$ as follows:
$$\varphi_n := \sum_{T \in \mathcal T_n} \varphi_T.$$
It follows from (5.110), (5.111) and (5.35) that for all $T \in \mathcal T_n$ and all $z \in \operatorname{int}(T)$
$$\operatorname{dil}_z(\varphi_n ; d_H) \le C.$$
Since in addition $\varphi_n = 0$ on $\mathbb R^d \setminus \Omega_n$, Corollary 5.2.6 implies that $\varphi_n : (\mathbb R^d, d_H) \to \mathbb R$ is $C$-Lipschitz.
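The auxiliary function $\varphi_T(z) = \min_v \lambda_v(z)$ used in this proof can be evaluated directly in dimension two. The sketch below is illustrative only: the function names `barycentric` and `phi` are hypothetical, and the restriction to triangles with rational arithmetic is an assumption made for brevity. It shows that $\varphi_T$ vanishes on the boundary of $T$ and outside $T$, and takes the value $1/3$ at the barycenter.

```python
from fractions import Fraction


def barycentric(tri, z):
    """Barycentric coordinates of the point z with respect to the triangle tri.

    tri is a triple of 2D points; exact arithmetic is used when the inputs
    are integers or Fractions.
    """
    (x1, y1), (x2, y2), (x3, y3) = tri
    x, y = z
    det = Fraction((y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3))
    l1 = Fraction((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    l2 = Fraction((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return l1, l2, 1 - l1 - l2


def phi(tri, z):
    """phi_T(z): the smallest barycentric coordinate inside T, and 0 outside T."""
    return max(Fraction(0), min(barycentric(tri, z)))


if __name__ == "__main__":
    T = ((0, 0), (1, 0), (0, 1))
    third = Fraction(1, 3)
    print(phi(T, (third, third)), phi(T, (0, 0)), phi(T, (1, 1)))
```

Summing `phi` over the triangles of a mesh gives the function $\varphi_n$ of the proof, whose positivity set is the union of the open triangles.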
It follows from Ascoli's compactness theorem (for instance the version recalled in Theorem 6.2.3 in the next section) that there exists a subsequence $(\varphi_{n_k})_{k \ge 0}$ which converges uniformly on all compact sets of $\mathbb R^d$ to a $C$-Lipschitz function $\varphi$. We therefore assume, up to such an extraction, that $\varphi_n \to \varphi$ uniformly on all compact sets of $\mathbb R^d$. We denote by $\mathcal T$ the collection of closures of connected components of the set $\{z \in \mathbb R^d \,;\, \varphi(z) > 0\}$:
$$\mathcal T := \big\{ \overline E \,;\, E \text{ is a connected component of } \{z \in \mathbb R^d \,;\, \varphi(z) > 0\} \big\}.$$
For each $z \in \mathbb R^d$ and each $n \ge 0$, we denote by $T_n(z)$ an element of $\mathcal T_n$ containing $z$ if it exists, and $T_n(z) = \emptyset$ otherwise. Likewise we denote by $T(z)$ an element of $\mathcal T$ containing $z$ if it exists, and $\emptyset$ otherwise. If $\varphi(z) > 0$, then $z \in \operatorname{int}(T_n(z))$ for all $n$ sufficiently large, and one easily checks that
$$T_n(z) \text{ converges in Hausdorff distance to } T(z) \in \mathcal T. \tag{5.112}$$
Hence $\mathcal T$ is a collection of simplices, and the convergence $T_n(z) \to T(z)$ in Hausdorff distance, combined with the hypothesis $C^{-2} H(z) \le \mathcal H_{T_n(z)} \le C^2 H(z)$, implies that
$$C^{-2} H(z) \le \mathcal H_{T(z)} \le C^2 H(z). \tag{5.113}$$
In order to conclude this proof we need to show that the collection $\mathcal T$ of simplices is a conforming mesh of $\mathbb R^d$. We thus have to check the hypotheses of Definition 5.1.4. The interiors $\operatorname{int}(T)$ of the simplices $T \in \mathcal T$ are pairwise disjoint since they are the connected components of the set $\{z \in \mathbb R^d \,;\, \varphi(z) > 0\}$. We claim furthermore that $\mathcal T$ covers $\mathbb R^d$. Indeed let $z \in \mathbb R^d$ be arbitrary and let $n \ge 0$ be such that $z \in \Omega_n$, hence $T_n(z) \neq \emptyset$. Since $C^{-2} H(z) \le \mathcal H_{T_n(z)} \le C^2 H(z)$ there exists a subsequence $(T_{n_{\varphi(k)}}(z))_{k \ge 0}$ which converges in the Hausdorff distance to a simplex $T$ containing $z$. We clearly have $T \in \mathcal T$, which establishes as announced that $\mathcal T$ covers $\mathbb R^d$.
The fact that $\mathcal T$ is locally finite easily follows from (5.113).

We now turn to the proof of the conformity property, and for that purpose we consider an arbitrary 1-Lipschitz function $f : (\mathbb R^d, d_H) \to \mathbb R$ with compact support. Let $T$ be a simplex such that $C^{-2} H(z) \le \mathcal H_T \le C^2 H(z)$. We have for any vertex $v$ of $T$
$$|f(v) - f(z_T)| \le d_H(z_T, v) \le C \|z_T - v\|_{\mathcal H_T} = C.$$
Denoting as before by $V$ the collection of vertices of $T$, and by $(\lambda_v)_{v \in V}$ the barycentric coordinates on $T$, we thus obtain for any $z, z' \in T$
$$\mathrm I_T f(z) - \mathrm I_T f(z') = \sum_{v \in V} (\lambda_v(z) - \lambda_v(z'))\,(f(v) - f(z_T)),$$
where $\mathrm I_T$ denotes the $\mathbb P_1$ (piecewise affine) interpolation on a simplex $T$. Hence using Lemma 5.4.2
$$
\begin{aligned}
|\mathrm I_T f(z) - \mathrm I_T f(z')| &\le C \sum_{v \in V} |\lambda_v(z) - \lambda_v(z')|\\
&\le C \sqrt{(d+1) \sum_{v \in V} |\lambda_v(z) - \lambda_v(z')|^2}\\
&= C \sqrt d\, \|z - z'\|_{\mathcal H_T}.
\end{aligned}
$$
The last inequality implies that $\operatorname{dil}_z(\mathrm I_{\mathcal T_n} f ; d_H) \le C \sqrt d$ for all $T \in \mathcal T_n$ and all $z \in \operatorname{int}(T)$, where $\mathrm I_{\mathcal T'}$ denotes the $\mathbb P_1$ interpolation on a mesh $\mathcal T' \in \mathbf T$. Let $n \ge 0$ be such that $\operatorname{supp}(f) \subset \Omega_n$, which implies that $\mathrm I_{\mathcal T_n} f$ is continuous. It follows from Corollary 5.2.6 that $\mathrm I_{\mathcal T_n} f : (\mathbb R^d, d_H) \to \mathbb R$ is $C \sqrt d$-Lipschitz.

We define $D := \cup_{T \in \mathcal T} \operatorname{int}(T)$, and we (abusively) denote by $\mathrm I_{\mathcal T} f : D \to \mathbb R$ the function which coincides with $\mathrm I_T f$ on the interior of each $T \in \mathcal T$. For any $z \in D$ we obtain using (5.112) that $\mathrm I_{\mathcal T_n} f(z) \to \mathrm I_{\mathcal T} f(z)$, hence for any $z, z' \in D$
$$|\mathrm I_{\mathcal T} f(z) - \mathrm I_{\mathcal T} f(z')| = \lim_{n \to \infty} |\mathrm I_{\mathcal T_n} f(z) - \mathrm I_{\mathcal T_n} f(z')| \le \sqrt d\, C\, d_H(z, z').$$
The piecewise interpolant $\mathrm I_{\mathcal T} f : D \to \mathbb R$ therefore extends to a $\sqrt d\, C$-Lipschitz function $\mathrm I_{\mathcal T} f : (\mathbb R^d, d_H) \to \mathbb R$. It easily follows that $\mathrm I_{\mathcal T} f$ is continuous for any compactly supported $C^1$ function $f$.
This property characterizes conforming meshes, hence T ∈ T whichconcludes the proof. (cid:5) Corollary 5.5.6. (Bidimensional triangulations) Assume that the dimension is d = 2 .There exists c > such that the following holds : if s : R → R ∗ + is c -Lipschitz, then thereexist a mesh T of R built of half squares, and such that for all T ∈ T and z ∈ T diam( T ) ≤ s ( z ) ≤ T ) / . (5.114) Proof: When the dimension is d = 2, the collection T of triangles contains only halfsquares, of area 1 / 2, hence D + = D − = √ 2. As c → K c defined by (5.105)tends to √ 2. Hence K c ≤ / c is sufficiently small.From this point we construct as before a sequence of meshes ( T n ) n ≥ n of the squares[ − n − , n − ] which satisfies (5.107) hence (5.114), and we extract a global mesh T of R using Lemma 5.5.5. The triangles T ∈ T still satisfy the inequality (5.114) since they arelimits in the Haussdorff distance of triangles from the triangulations T n . (cid:5) We prove in this section a result of bi-dimensional mesh generation, Theorem 5.5.8,which implies Point (iii) of Proposition 5.5.1. The key ingredients of this section comefrom the paper [66]. We assume throughout this section that the dimension is d = 2.We first introduce the notion of equispaced points in an abstract metric space. Let( X, d X ) be a metric space and let δ > 0, we say that a subset V ⊂ X is δ -equispaced ifthe two following properties hold.a) (Covering) The distance from an arbitrary point x ∈ X to V is bounded by 1 : for all x ∈ X d X ( x, V ) := inf { d X ( x, v ) ; v ∈ V} ≤ . (5.115)b) (Separation) The pairwise distances between the points of V are larger than δ : for all v, v (cid:48) ∈ V such that v (cid:54) = v (cid:48) d X ( v, v (cid:48) ) ≥ δ. (5.116)The next lemma establishes the existence of such a set under certain assumptions. .5. From metric to mesh Lemma 5.5.7. 
Let ( X, d X ) be a metric space in which the closed balls B (cid:48) ( x, r ) := { y ∈ X ; d X ( x, y ) ≤ r } are compact for all x ∈ X and all r ≥ . Let V ⊂ X be such that d X ( v, v (cid:48) ) ≥ for all v, v (cid:48) ∈ V such that v (cid:54) = v (cid:48) . Then there exists a -equispaced set V ofthe metric space ( X, d X ) , containing V . Proof: We may assume that V (cid:54) = ∅ , up to including an arbitrary point of X . We choosean arbitrary point x ∈ V , and we define inductively a sequence ( x n ) ≤ n 1, andwe remark that x ∗ ∈ X n for all n ≥ 0. The definition (5.118) of x n +1 implies that d X ( x , x n ) ≤ d X ( x , x ∗ ) for all n ≥ 0, hence the collection of points { x n ; n ≥ } isincluded in the closed ball { x ∈ X ; d X ( x , x ) ≤ d X ( x , x ∗ ) } which is compact. Thesequence ( x n ) n ≥ therefore admits a converging subsequence, but this contradict the factthat the pairwise distances between these points are larger than 1, which concludes theproof of this proposition. (cid:5) We consider a fixed metric H ∈ H . For any x, y ∈ R we define d x ( y ) := (cid:107) x − y (cid:107) H ( x ) . We also consider a fixed discrete collection of vertices, or “sites”, V ⊂ R , and we definethe (anisotropic) Voronoi cell of a site v ∈ V as followsVor( v ) := { z ∈ R ; d v ( z ) ≤ d w ( z ) for all w ∈ V } . (5.119)The (anisotropic) Voronoi cell Vor( v ) is thus the collection of points z which are closer tothe vertex v than to any other site . The Voronoi diagram is the collection { Vor( v ) ; v ∈V} of all Voronoi cells. The classical isotropic Voronoi is obtained by choosing H = Ididentically on R d . An instance of an (anisotropic) Voronoi diagram is illustrated on Figure5.7.For each z ∈ R we denote by V z ⊂ V the collection of vertices which contain z intheir Voronoi region : V z := { v ∈ V ; z ∈ Vor( v ) } . Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? 
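The definition (5.119) of the anisotropic Voronoi cell, and of the set $\mathcal V_z$ of sites whose cell contains $z$, can be illustrated by a brute-force computation. This is only a sketch under stated assumptions: the names `d_site` and `owners`, the tolerance `tol`, and the two sample metrics are hypothetical choices made for the example, and a metric is represented as a Python function returning a symmetric positive definite $2 \times 2$ matrix. The sample metric `stretched` is not claimed to belong to any of the classes studied in this chapter.

```python
import math


def d_site(H, v, z):
    """d_v(z) = ||z - v||_{H(v)}: the metric is frozen at the site v."""
    (a, b), (_, c) = H(v)
    dx, dy = z[0] - v[0], z[1] - v[1]
    return math.sqrt(a * dx * dx + 2 * b * dx * dy + c * dy * dy)


def owners(H, sites, z, tol=1e-12):
    """The set V_z: sites v such that d_v(z) <= d_w(z) for all sites w.

    Ties are detected up to the tolerance tol (an assumption of this sketch).
    """
    dists = [d_site(H, v, z) for v in sites]
    dmin = min(dists)
    return [v for v, dv in zip(sites, dists) if dv <= dmin + tol]


def identity(p):
    """Isotropic case H = Id, which gives the classical Voronoi diagram."""
    return ((1.0, 0.0), (0.0, 1.0))


def stretched(p):
    """Hypothetical anisotropic metric, stiffer in y as |x| grows."""
    return ((1.0, 0.0), (0.0, 1.0 + p[0] ** 2))


if __name__ == "__main__":
    sites = [(0.0, 0.0), (2.0, 0.0)]
    print(owners(identity, sites, (1.0, 1.0)))   # on the Euclidean bisector
    print(owners(stretched, sites, (1.0, 1.0)))  # the bisector has moved
```

Because each site measures distances with its own matrix $H(v)$, a point of the Euclidean bisector generally belongs to a single anisotropic cell, which is why the cells need not be convex or even connected.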
Figure 5.7: An anisotropic Voronoi diagram (left); the wedge of two points (right).

We consider the graph $G(\mathcal V, H)$ defined as follows:
$$G(\mathcal V, H) := \{[v, v'] \,;\, \text{there exists } z \in \mathbb R^2 \text{ such that } \mathcal V_z = \{v, v'\}\}. \tag{5.120}$$
The next theorem establishes that $G(\mathcal V, H)$ is, under certain conditions on the metric $H$ and the set $\mathcal V$, the graph of a triangulation adapted to $H$.

Theorem 5.5.8. For each $\delta > 0$ there exist $c = c(\delta) > 0$ and $C = C(\delta) \ge 1$ such that the following holds. Let $H$ be a metric such that $c^2 H \in \mathbf H_g$, and let $\mathcal V$ be a $\delta$-equispaced subset of $(\mathbb R^2, d_H)$. Then $G(\mathcal V, H)$ is a planar graph which defines a partition of $\mathbb R^2$ into strictly convex polygons. Arbitrarily triangulating each polygon yields a triangulation $\mathcal T \in \mathbf T_{g,C}$ which is $C$-adapted to $H$, and called an anisotropic Delaunay triangulation. Furthermore if $\mathcal V$ is in general position then $G(\mathcal V, H)$ is already the graph of a triangulation.

Denote by $c$ and $C$ the constants attached to $\delta = 1$ in this theorem. It follows that for any metric $H$ such that $c^2 H \in \mathbf H_g$, there exists a mesh $\mathcal T \in \mathbf T_{g,C}$ which is $C$-adapted to $H$, obtained as the Delaunay triangulation of a 1-equispaced set of points $\mathcal V \subset (\mathbb R^2, d_H)$. Such a set exists according to Lemma 5.5.7, since the balls of the metric space $(\mathbb R^2, d_H)$ are compact according to Proposition 5.2.10. This immediately implies Point iii) of Proposition 5.5.1 according to Remark 5.5.2.

The rest of this section is devoted to the proof of Theorem 5.5.8, which is based on the methods and results presented in [66]. For that purpose we recall some definitions of this paper.

Definition 5.5.9.
– (Wedge of two points) We define the wedge $\operatorname{wedge}(v, w)$ of two distinct sites $v, w \in \mathcal V$ as follows:
$$\operatorname{wedge}(v, w) := \{z \in \mathbb R^2 \,;\, (z - v)^{\mathrm T} H(v)(w - v) > 0 \text{ and } (z - w)^{\mathrm T} H(w)(v - w) > 0\}.$$
– (Wedge property) We say that the Voronoi diagram of $\mathcal V$ is wedged if for any two distinct $v, w \in \mathcal V$ we have
$$\operatorname{Vor}(v) \cap \operatorname{Vor}(w) \subset \operatorname{wedge}(v, w).$$
In other words, the set $\operatorname{Vor}(v) \cap \operatorname{Vor}(w)$ of points shared by the Voronoi regions of the sites $v$ and $w$ lies in a cone defined by two linear inequalities. The wedge of two points is illustrated on Figure 5.7 (right). Note that the wedge property is not satisfied on this figure.

For any metric $H$ and any constant $c > 0$ such that $c^2 H \in \mathbf H_g$, it easily follows from Proposition 5.2.10 that for all $x, y \in \mathbb R^2$
$$\ln(1 + c\, d_x(y)) \le c\, d_H(x, y) \le -\ln(1 - c\, d_x(y)), \tag{5.121}$$
which is equivalent to
$$1 - \exp(-c\, d_H(x, y)) \le c\, d_x(y) \le \exp(c\, d_H(x, y)) - 1. \tag{5.122}$$

Lemma 5.5.10. The following holds for any $\delta > 0$ and any $c > 0$. Let $H$ be a metric such that $c^2 H \in \mathbf H_g$ and let $\mathcal V$ be a $\delta$-equispaced subset of $(\mathbb R^2, d_H)$. Then for any $v \in \mathcal V$ and any $q \in \operatorname{Vor}(v)$ one has
$$d_v(q) := \|q - v\|_{H(v)} \le r_1(c) := \frac{e^c - 1}{c}. \tag{5.123}$$
If $v, w \in \mathcal V$ satisfy $\operatorname{Vor}(v) \cap \operatorname{Vor}(w) \neq \emptyset$, then
$$d_H(v, w) \le r_2(c) := -\frac{2 \ln(2 - e^c)}{c}. \tag{5.124}$$
Note that $r_1(c) \to 1$ and $r_2(c) \to 2$ as $c \to 0$. Note also that $d_H(v, w) \ge \delta$.

Proof: For any $q \in \mathbb R^2$ there exists $v^* \in \mathcal V$ such that $d_H(q, v^*) \le 1$, and therefore, by (5.122),
$$\|q - v^*\|_{H(v^*)} \le r_1(c) := \frac{e^c - 1}{c},$$
which implies (5.123) using the definition (5.119) of the Voronoi diagram. We thus turn to the proof of (5.124), and for that purpose we consider a point $q \in \operatorname{Vor}(v) \cap \operatorname{Vor}(w)$. We thus have $\|q - v\|_{H(v)} = \|q - w\|_{H(w)} \le r_1(c)$, hence using (5.121)
$$d_H(v, w) \le d_H(v, q) + d_H(q, w) \le r_2(c) := -\frac{2 \ln(1 - c\, r_1(c))}{c},$$
which concludes the proof. $\diamond$

The next proposition, which is partly based on Lemma 5 from [66], shows that the wedge property is automatically satisfied in our context.

Proposition 5.5.11. For each $\delta > 0$ there exists $c = c(\delta) > 0$ such that the following holds. If a metric $H$ satisfies $c^2 H \in \mathbf H_g$ and if $\mathcal V$ is a $\delta$-equispaced subset of $(\mathbb R^2, d_H)$, then the wedge property is satisfied.
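The wedge condition of Definition 5.5.9, which the proposition above guarantees for well-spaced sites, is a pair of linear inequalities and can be tested pointwise. The following is a minimal sketch: the function name `wedge_contains` and the two sample metrics are hypothetical, and a metric is again assumed to be a Python function returning a $2 \times 2$ symmetric matrix at a given site.

```python
def wedge_contains(H, v, w, z):
    """True if z lies in wedge(v, w), i.e. if
    (z - v)^T H(v) (w - v) > 0  and  (z - w)^T H(w) (v - w) > 0."""

    def form(M, p, q):
        # Bilinear form p^T M q for a 2x2 matrix M.
        return (p[0] * (M[0][0] * q[0] + M[0][1] * q[1])
                + p[1] * (M[1][0] * q[0] + M[1][1] * q[1]))

    zv = (z[0] - v[0], z[1] - v[1])
    wv = (w[0] - v[0], w[1] - v[1])
    zw = (z[0] - w[0], z[1] - w[1])
    vw = (v[0] - w[0], v[1] - w[1])
    return form(H(v), zv, wv) > 0 and form(H(w), zw, vw) > 0


def identity(p):
    """Isotropic metric: the wedge is the open strip between the two sites."""
    return ((1.0, 0.0), (0.0, 1.0))


def sheared(p):
    """Hypothetical metric that is strongly distorted at sites with x > 1."""
    if p[0] > 1.0:
        return ((1.0, 0.9), (0.9, 1.0))
    return ((1.0, 0.0), (0.0, 1.0))


if __name__ == "__main__":
    v, w = (0.0, 0.0), (2.0, 0.0)
    print(wedge_contains(identity, v, w, (1.0, 5.0)))
    print(wedge_contains(sheared, v, w, (1.0, 5.0)))
```

In the isotropic case every point of the bisector of $[v, w]$ lies in the wedge, whereas a large relative distortion between $H(v)$ and $H(w)$ tilts the cone, which is the situation the proposition rules out when $c$ is small.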
Proof: We assume for contradiction that there exist two distinct vertices $v, w \in \mathcal V$, and a point $q \in \operatorname{Vor}(v) \cap \operatorname{Vor}(w)$ such that $q \notin \operatorname{wedge}(v, w)$. We may assume, up to exchanging the roles of $v$ and $w$, that
$$(q - w)^{\mathrm T} H(w)(v - w) \le 0.$$
Defining the "relative distortion" $\tau := \exp d_\times(H(v), H(w))$ we obtain using Lemma 5 from [66]
$$d_w(v) \le d_w(q) \sqrt{\tau^2 - 1}. \tag{5.125}$$
We now estimate the quantities $d_w(v)$, $d_w(q)$ and $\tau$ appearing in this estimate, in order to obtain a contradiction when $c$ is sufficiently small. We have
$$\ln \tau = d_\times(H(v), H(w)) \le c\, d_H(v, w) \le c\, r_2(c). \tag{5.126}$$
Since $\mathcal V$ is $\delta$-equispaced we have $d_H(v, w) \ge \delta$ and therefore, by (5.122),
$$c\, d_w(v) \ge 1 - \exp(-c\, d_H(v, w)) \ge 1 - \exp(-c \delta). \tag{5.127}$$
Injecting (5.123), (5.126) and (5.127) in (5.125) we obtain
$$\frac{1 - \exp(-c \delta)}{c} \le \frac{\exp(c) - 1}{c} \sqrt{\exp(2 c\, r_2(c)) - 1}. \tag{5.128}$$
The left hand side tends to $\delta$ as $c \to 0$, while the right hand side tends to $0$ (and only depends on $c$). We thus obtain a contradiction when $c$ is sufficiently small, which concludes the proof of this proposition. $\diamond$

Theorem 5.5.12 (Dual triangulation theorem. Labelle, Shewchuk, [66]). Let $H \in \mathbf H$ and let $\mathcal V \subset \mathbb R^2$ be a discrete point set. If the Voronoi diagram of $\mathcal V$ is wedged, then the geometric dual of this diagram is a polygonalization of $\mathbb R^2$ with strictly convex polygons, and is a triangulation if $\mathcal V$ is in general position.

Under the hypotheses of this theorem the vertices $v_1, \cdots, v_k$ of a polygon in the dual Voronoi diagram satisfy by construction $\operatorname{Vor}(v_1) \cap \cdots \cap \operatorname{Vor}(v_k) \neq \emptyset$. Let $\mathcal T$ be a triangulation obtained by arbitrarily triangulating these polygons. The vertices $v_1, v_2, v_3$ of a triangle $T \in \mathcal T$ satisfy $\operatorname{Vor}(v_1) \cap \operatorname{Vor}(v_2) \cap \operatorname{Vor}(v_3) \neq \emptyset$.

The next theorem, originally stated in [66] as Corollary 10 to Theorem 9 of the same paper, makes it possible to control the angles of a triangle in the anisotropic Delaunay triangulation.
Theorem 5.5.13 (Labelle, Shewchuk, [66]) . Let v , v , v ∈ V and let q ∈ Vor( v ) ∩ Vor( v ) ∩ Vor( v ) . Let r = d v ( q ) = d v ( q ) = d v ( q ) and l = min { d v i ( v j ) ; i, j ∈ { , , } , i (cid:54) = j } . Define β := max { r/l, / √ } , γ := exp max { d × ( H ( v i ) , H ( v j ))1 ≤ i < j ≤ } , and χ := 12 β − ( γ − β . (5.129) If γ ≤ √ and χ > then any angle θ of the triangle of vertices ( (cid:112) H ( v ) v k ) ≤ k ≤ satisfies arcsin( χ/γ ) ≤ θ ≤ χ/γ ) . .5. From metric to mesh Corollary 5.5.14. For each δ , < δ ≤ there exists c = c ( δ ) > and C = C ( δ ) ≥ such that the following holds. Let H be a metric such that c H ∈ H g and let V be a δ -equispaced subset of ( R , d H ) . If v , v , v ∈ V and if Vor( v ) ∩ Vor( v ) ∩ Vor( v ) (cid:54) = ∅ ,then denoting by T the triangle of vertices v , v , v we have for all z ∈ TC − H ( z ) ≤ H T ≤ C H ( z ) . Proof: We first obtain obtain some explicit bounds on the quantities r , l , β , γ and χ appearing in Theorem 5.5.13. It follows from Lemma 5.5.10 that r ≤ r ( c ) and δ ≤ d H ( v i , v j ) ≤ r ( c ) . Therefore l := min i (cid:54) = j d v i ( v j ) ≥ r ( c ) := (1 − exp( − cδ )) /c,L := max i (cid:54) = j d v i ( v j ) ≤ (exp( cr ( c )) − /c, and γ ≤ max i (cid:54) = j exp( c d H ( v i , v j )) ≤ exp( c r ( c )) . It follows that β ≤ r ( c ) := min { r ( c ) /r ( c ) , / √ } . Injecting this into (5.129) yields χ ≥ r ( c ) := 12 r ( c ) − (exp(2 cr ( c )) − r ( c )2 , and therefore χ/γ ≥ r ( c ) := r ( c ) exp( − cr ( c )).For any fixed δ , 0 < δ ≤ c → r ( c ) → r ( c ) → r ( c ) → δ , r ( c ) → min { δ, / √ } , r ( c ) → / (2 min { δ, / √ } ) and r ( c ) has the samelimit.We may therefore choose c sufficiently small in such way that l ≥ l := δ/ , L ≤ L := 4 and χ/γ ≥ χ := 1 / (4 δ ) . We denote by T the triangle of vertices v , v , v . The length of any edge of T (cid:48) := (cid:112) H ( v )( T ) is bounded above by L and below by l respectively, and the angles of T (cid:48) satisfy arcsin( χ ) ≤ θ ≤ χ ). 
The collection of triangles centered at the originwhich satisfy these inequalities is compact, thus there exists a constant C = C ( δ ) suchthat C − Id ≤ H T (cid:48) ≤ C Id , hence C − H ( v ) ≤ H T (cid:48) ≤ C H ( v ). For any z ∈ T we obtain, remarking that (cid:107) z − v (cid:107) H T ≤ z ∈ T and using (5.53), C − (1 − cC ) H ( z ) ≤ H T (cid:48) ≤ C (1 − cC ) − H ( z ) . We may assume that c ≤ / (4 C ), which concludes the proof of this proposition with C = 2 C . (cid:5) Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? We now prove of Theorem 5.5.8. According to the above results for any δ > c = c ( δ ) and C = C ( δ ) such that the following holds. For any metric H suchthat c H ∈ H g , and any set V ⊂ ( R , d H ) which is δ -equispaced, the Voronoi diagramassociated to V and H is wedged according to Proposition 5.5.11. The dual of this diagramis therefore a polygonalization of R with strictly convex polygons according to Theorem5.5.12. Randomly triangulating these cells yields a triangulation T ∈ T which is C -adapted to H according to Corollary 5.5.14.Furthermore consider to simplices T, T (cid:48) ∈ T such that T ∩ T (cid:48) (cid:54) = ∅ , and a point z ∈ T ∩ T (cid:48) . Then C − H T ≤ H ( z ) ≤ C H T (cid:48) , which shows that T ∈ T g,C , and concludes the proof of Theorem 5.5.8. This section is devoted to the proof of Point ii) of Proposition 5.5.1, which is obtainedby combining the constructions of isotropic and graded triangulations presented in § § few strongly obtuse angles if the set V of vertices is structured along lines which are transverse to the direction of anisotropy , as illustrated on Figure 5.2 (in the introductionof this chapter).We consider a fixed metric H such that c H ∈ H a , where c is an absolute constantspecified in the proof and such that 0 < c ≤ 1. This subsection is devoted to the construc-tion of a mesh T ∈ T a,C which is C -adapted to H , where C is an absolute constant alsospecified in the proof. 
This construction immediately implies Proposition 5.5.1 accordingto Remark 5.5.2.For each z ∈ R we define λ ( z ) := (cid:107) H ( z ) − (cid:107) and µ ( z ) = (cid:107) H ( z ) (cid:107) − , and we observe that λ ( z ) ≥ µ ( z ). The functions λ and µ are c -Lipschitz according toCorollary 5.2.8. If c is sufficiently small, then there exists according to Corollary 5.5.6 amesh T such that for all T ∈ T and all z ∈ T λ ( z ) ≤ diam( T ) ≤ λ ( z ) . (5.130)The mesh T is built of half squares by construction (as illustrated on Figure 5.6). Thediameter of any triangle in T is a power of √ 2, hence for any two triangles T, T (cid:48) ∈ T sharing a vertex v we obtain using (5.130)diam( T ) / √ ≤ diam( T (cid:48) ) ≤ √ T ) . (5.131)For any edge [ v, w ] of any triangle T ∈ T we have diam( T ) / √ ≤ | v − w | ≤ diam( T ),hence for any two edges [ v, w ] and [ v, w (cid:48) ] (sharing the vertex v ) of the triangulation T | v − w | / ≤ | v − w (cid:48) | ≤ | v − w | . (5.132) .5. From metric to mesh v, w ] of T λ ( z ) √ ≤ diam( T ) √ ≤ | v − w | ≤ diam( T ) ≤ λ ( z ) (5.133)We denote by V the collection of vertices of T . Consider a fixed v ∈ V and any vertex w ∈ V such that | v − w | is minimal. Then [ v, w ] is clearly an edge of T , since thistriangulation is conforming and built of half squares. This implies for any distinct v, w ∈ V λ ( v ) √ ≤ | v − w | . (5.134)For each z ∈ R we define ρ ( z ) := λ ( z ) /µ ( z ) = (cid:112) (cid:107) H ( z ) (cid:107)(cid:107) H ( z ) − (cid:107) ∈ [1 , ∞ ) . Consider z, z (cid:48) ∈ R such that | z − z (cid:48) | ≤ rλ ( z ), where r ≥ λ and µ are c -Lipschitz ρ ( z (cid:48) ) = λ ( z (cid:48) ) µ ( z (cid:48) ) ≥ λ ( z )(1 − rc ) µ ( z ) + rcλ ( z ) ≥ − rcρ ( z ) − + rc . (5.135)For each ρ ≥ r > c ρ ( ρ , r ) > ρ = − r c ρ ( ρ ,r )( ρ +1) − + r c ρ ( ρ ,c ) . 
Hence for all z, z (cid:48) ∈ R | z − z (cid:48) | ≤ r λ ( z ) ρ ( z ) ≥ ρ + 1 c ≤ c ρ ( ρ , r ) implies ρ ( z (cid:48) ) ≥ ρ For each z ∈ R such that ρ ( z ) > θ ( z ) ∈ S := { u ∈ R ; | u | = 1 } aunit vector such that H ( z ) = λ ( z ) − θ ( z ) θ ( z ) T + µ ( z ) − (Id − θ ( z ) θ ( z ) T ) . (5.136)Note that there exists two, opposite, choices for the vector θ ( z ). For each u, u (cid:48) ∈ R \ { } we define the (unoriented) angle (cid:30) ( u, u (cid:48) ) := arccos (cid:18) (cid:104) u, u (cid:48) (cid:105)| u || u (cid:48) | (cid:19) ∈ [0 , π ] . and (cid:1) ( u, u (cid:48) ) := min { (cid:30) ( u, u (cid:48) ) , (cid:30) ( u, − u (cid:48) ) } = arccos (cid:18) |(cid:104) u, u (cid:48) (cid:105)|| u || u (cid:48) | (cid:19) ∈ [0 , π / . Lemma 5.5.15. If c is sufficiently small, then the following holds. For any z, z (cid:48) ∈ R (cid:26) ρ ( z ) ≥ | z (cid:48) − z | ≤ λ ( z ) implies (cid:1) ( θ ( z ) , θ ( z (cid:48) )) ≤ π / . Likewise, ρ ( z ) ≥ and | z − z (cid:48) | ≤ λ ( z ) implies (cid:1) ( θ ( z ) , θ ( z (cid:48) )) ≤ / (10 √ . Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? v w + v w − θ ( v ) θ ( v ) ⊥ Figure V T ( v ) of a simplex v in the triangulation T (left), theneighbors w + and w − of a vertex v in T (right). Proof: We consider a fixed point z such that ρ ( z ) ≥ B := { z (cid:48) ∈ R ; | z (cid:48) − z | ≤ λ ( z ) } . For each z (cid:48) ∈ B we have λ ( z (cid:48) ) − µ ( z (cid:48) ) ≥ ( λ ( z ) − c | z − z (cid:48) | ) − ( µ ( z ) + c | z − z (cid:48) | )= λ ( z ) − µ ( z ) − c | z − z (cid:48) |≥ λ ( z ) − λ ( z ) / − cλ ( z )= λ ( z )(1 / − c ) , hence λ ( z (cid:48) ) − µ ( z (cid:48) ) ≥ λ ( z ) / c ≤ / B is simply connected there exists a continuous function θ ∗ : B → S such that θ ∗ ( z (cid:48) ) = ± θ ( z (cid:48) ) for all z (cid:48) ∈ B . It follows from Theorem 5.3.1 that for all z (cid:48) ∈ Bλ ( z ) dil z (cid:48) ( θ ∗ ) ≤ λ ( z (cid:48) ) − µ ( z (cid:48) )) dil z (cid:48) ( θ ∗ ) ≤ c. 
Therefore, since the distance d S on S involved in Theorem 5.3.1 coincides with the angle (cid:30) we obtain (cid:1) ( θ ( z ) , θ ( z (cid:48) )) ≤ (cid:30) ( θ ∗ ( z ) , θ ∗ ( z (cid:48) )) ≤ | z − z (cid:48) | max ˜ z ∈ [ z,z (cid:48) ] dil ˜ z ( θ ∗ ) ≤ c which concludes the proof if 30 c ≤ π / 12. The second estimate is obtained similarly. (cid:5) For each vertex v ∈ V we define a set Γ v ⊂ R as follows : Γ v := { v } if ρ ( v ) < 3, andΓ v := (cid:20) v, v w − (cid:21) ∪ (cid:20) v, v w + (cid:21) (5.137)if ρ ( v ) ≥ 3, where [ v, w − ] and [ v, w + ] are two edges of T which respectively minimize andmaximize the quantity (cid:104) θ ( v ) ⊥ , w − v (cid:105) (among the edges containing v ). In other words forany edge of T of the form [ v, w ] we have (cid:104) θ ( v ) ⊥ , w − − v (cid:105) ≤ (cid:104) θ ( v ) ⊥ , w − v (cid:105) ≤ (cid:104) θ ( v ) ⊥ , w + − v (cid:105) . (5.138)This property is illustrated on Figure 5.8 (right). .5. From metric to mesh Lemma 5.5.16. let v ∈ V be such that ρ ( v ) ≥ , and let [ v, w + ] , [ v, w − ] be two edges of T satisfying (5.138). We have (cid:30) ( θ ( v ) ⊥ , w + − v ) ≤ π / , and (cid:30) ( θ ( v ) ⊥ , v − w − ) ≤ π / . (5.139) Proof: We restrict without loss of generality our attention to the proof of the first in-equality, and we assume for contradiction that ϕ := (cid:30) ( θ ( v ) ⊥ , w + − v ) > π / 4. Among thetwo vertices w ∈ V such that the triangle of vertices v, w, w + belongs to T , we choose theone such that (cid:30) ( θ ⊥ , w − v ) is minimal. We denote ψ := (cid:30) ( w − v, w + − v ) ∈ { π / , π / } and we obtain (cid:104) θ ( v ) ⊥ , w + − v (cid:105) = | w + − v | cos ϕ and (cid:104) θ ( v ) ⊥ , w − v (cid:105) = | w − v | cos( ϕ − ψ ) . If ψ = π / | w + − v | = | w − v | since the triangles T ∈ T are half squares, andcos( ϕ − ψ ) = sin ϕ . If follows from (5.138) that cos ϕ ≥ sin ϕ which contradicts our as-sumption that ϕ > π / 4. 
If ψ = π / | w + − v | = √ | w − v | or | w + − v | = | w − v | / √ √ ϕ ≥ cos( ϕ − π / 4) which again contradicts our assumption that ϕ > π / (cid:5) We define Γ := ∪ v ∈V Γ v , and for each z ∈ Γ \ V we defineΓ z := (cid:91) v ∈V z ∈ Γ v Γ v . (5.140)Such a set Γ z is illustrated on Figure 5.9 (right). For each z ∈ Γ there exists v ∈ V suchthat Γ z = Γ v , or there exists v, v (cid:48) ∈ V , and w, w (cid:48) ∈ V , such that z ∈ [ v, v (cid:48) ] andΓ z = Γ v ∪ Γ (cid:48) v = (cid:20) v + 2 w , v (cid:21) ∪ [ v, v (cid:48) ] ∪ (cid:20) v (cid:48) , v (cid:48) + 2 w (cid:48) (cid:21) . Note that for any vertex v ∈ V one has Γ v ∩ V = { v } , hence (5.140) agrees with (5.137)for any z ∈ V . Note also that for any z, z (cid:48) ∈ Γ the following are equivalent– z (cid:48) ∈ Γ z ,– There exists v ∈ V such that { z, z (cid:48) } ⊂ Γ v ,– z ∈ Γ z (cid:48) .The next proposition gives a robust estimate of the orientation of the set Γ. Lemma 5.5.17. The following holds if c is sufficiently small. Let z ∈ Γ , let p, q ∈ Γ z betwo distinct points, and let z (cid:48) be such that | z − z (cid:48) | ≤ λ ( z ) . Then (cid:1) ( θ ( z (cid:48) ) ⊥ , p − q ) ≤ π / . Proof: We first remark that Γ z is not a singleton since the points p and q are distinct.Therefore Γ z = Γ v ∪ Γ v (cid:48) , or Γ z = Γ v , for some v, v (cid:48) ∈ V satisfying ρ ( v ) ≥ ρ ( v (cid:48) ) ≥ v ∪ Γ v (cid:48) .Since | v − z | ≤ min { λ ( z ) , λ ( v ) } we obtain | v − z (cid:48) | ≤ | v − z | + | z − z (cid:48) | ≤ λ ( v ) + 5 λ ( z ) ≤ λ ( v ) + 5( λ ( v ) + c | v − z | ) ≤ λ ( v )(6 + 5 c ) . Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? v z w ∗ R \ v τ ( v ) θ ( z ) z Figure c is sufficiently small, we thus have | v − z (cid:48) | ≤ λ ( v ) and therefore (cid:1) ( θ ( v ) , θ ( z (cid:48) )) ≤ π / t to the polygonal line Γ v thus satisfies (cid:1) ( θ ( z (cid:48) ) ⊥ , t ) ≤ (cid:1) ( θ ( v ) ⊥ , t ) + (cid:1) ( θ ( v ) , θ ( z )) ≤ π / π / 12 = π / . 
Proceeding likewise we obtain that (cid:1) ( θ ( z (cid:48) ) ⊥ , t ) ≤ π / z = Γ v ∪ Γ v (cid:48) , and therefore (cid:1) ( θ ( z (cid:48) ) ⊥ , p − q ) ≤ π / p, q on this line which concludes the proof. (cid:5) The next proposition gives some upper and lower bounds on the distance from a pointto the set Γ. Proposition 5.5.18. If c is sufficiently small, then the following holds. For any z ∈ Γ , min {| z − e | ; e ∈ Γ \ Γ z } ≥ λ ( z ) (5.141) and for any z ∈ R min {(cid:107) z − e (cid:107) H ( z ) ; e ∈ Γ } ≤ Proof: We first establish (5.141). Let z ∈ Γ and let v ∈ V be such that z ∈ Γ v . Thereexists an edge [ v, w ∗ ] of T and a parameter α ∈ [0 , / 3] such that z = (1 − α ) v + αw ∗ .We denote by V T ( v ) the union of all triangles T ∈ T containing v , and by V ( v ) thecollection of vertices w such that [ v, w ] is an edge of T . We thus haveΓ \ Γ z ⊂ R \ V T ( v ) (cid:91) w ∈V ( v ) \{ w ∗ } (cid:20) v + w , w (cid:21) . The set appearing in the right hand side is illustrated on Figure 5.9 (left). Denoting .5. From metric to mesh d ( z, E ) := inf {| z − e | ; e ∈ E } we obtain d ( z, Γ \ Γ z ) ≥ min (cid:26) d ( z, R \ V T ( v )) , min w ∈V ( v ) \{ w ∗ } d (cid:18) z, (cid:20) v + w , w (cid:21)(cid:19)(cid:27) ≥ min (cid:26) | w ∗ − z |√ , √ w ∈V ( v ) \{ w ∗ } | v − w | (cid:27) ≥ √ w ∈V ( v ) | v − w |≥ √ λ ( v ) √ ≥ λ ( z ) − c | z − v | ≥ λ ( z )(1 − c )9 , where we used (5.133) in the fourth line, and the fact that λ is c -Lipschitz in the last line.This establishes (5.141) if c is sufficiently small.We now turn to the proof of (5.142). Let T ∈ T and let z ∈ T . Since T is a half squarethere exists a vertex v of T such that | z − v | ≤ diam( T ) / 2, hence (cid:107) z − v (cid:107) H ( z ) ≤ | z − v | µ ( z ) ≤ diam( T )2 µ ( z ) ≤ λ ( z )2 µ ( z ) = ρ ( z )2 , which concludes the proof if ρ ( z ) ≤ 4. 
If ρ ( z ) ≥ c ≤ c ρ (3 , 1) then ρ ( v ) ≥ v of the triangle T .We choose two vertices v, w of T such that the line ( z + rθ ( z )) r ∈ IR intersects ∂T on thesegment [ v, ( v + w ) / w − , w + ∈ V the neighbors of v in T which satisfy(5.138). Our first step is to establish that the line ( z + rθ ( z )) r ∈ IR intersects the set Γ v ,which is heuristically due to the fact that the angle (cid:1) ( θ ( z ) , θ ( v )) is small and that thevertices w − and w + are the furthest away from v in the direction of θ ( v ) ⊥ as illustratedon Figure 5.8 (right).It follows from (5.132) an Lemma 5.5.16 that (cid:104) θ ( v ) ⊥ , w + − v (cid:105) ≥ | w + − v | cos( π / ≥ | w − v | √ . (5.143)We assume without loss of generality that (cid:104) θ ( z ) , θ ( v ) (cid:105) ≥ | θ ( z ) − θ ( v ) | ≤ (cid:30) ( θ ( z ) , θ ( v )) = (cid:1) ( θ ( z ) , θ ( v )) ≤ δ := 1 / (10 √ . It follows that (cid:104) θ ( z ) ⊥ , w − v (cid:105) = (cid:104) θ ( z ) ⊥ − θ ( v ) ⊥ , w − v (cid:105) + (cid:104) θ ( v ) ⊥ , w − v (cid:105)≤ | θ ( z ) − θ ( v ) || w − v | + (cid:104) θ ( v ) ⊥ , w + − v (cid:105)≤ (1 + 2 δ √ (cid:104) θ ( v ) ⊥ , w + − v (cid:105) , Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? where we injected (5.138) in the second line, and (5.143) in the third line. DenotingΘ := (cid:30) ( θ ( v ) ⊥ , w + − v ) ≤ π / (cid:30) ( θ ( z ) ⊥ , w + − v ) we obtain (cid:104) θ ( z ) ⊥ , w + − v (cid:105)(cid:104) θ ( v ) ⊥ , w + − v (cid:105) = cos(Φ)cos(Θ) ≥ cos(Θ) − | Φ − Θ | cos(Θ) = 1 − | Φ − Θ | cos Θ ≥ − δ √ , hence (cid:104) θ ( z ) ⊥ , w − v (cid:105) ≤ δ √ − δ √ (cid:104) θ ( z ) ⊥ , w + − v (cid:105) = 43 (cid:104) θ ( z ) ⊥ , w + − v (cid:105) . 
Proceeding likewise for w − we obtain43 (cid:104) θ ( z ) ⊥ , w − − v (cid:105) ≤ (cid:104) θ ( z ) ⊥ , w − v (cid:105) ≤ (cid:104) θ ( z ) ⊥ , w + − v (cid:105) , and therefore since z + rθ ( z ) ∈ [ v, ( v + w ) / 2] for some r ∈ R ,23 (cid:104) θ ( z ) ⊥ , w − − v (cid:105) ≤ (cid:104) θ ( z ) ⊥ , z − v (cid:105) ≤ (cid:104) θ ( z ) ⊥ , w + − v (cid:105) . It follows that the line ( z + rθ ( z )) r ∈ IR intersects the set Γ v := [ v, ( v + 2 w + ) / ∪ [ v, ( v +2 w − ) / 3] at some point z (cid:48) . The point z (cid:48) belongs to a triangle T (cid:48) containing v , hence λ ( z ) (cid:107) z (cid:48) − z (cid:107) H ( z ) = λ ( z ) (cid:107) rθ ( z ) (cid:107) H ( z ) = | r | = | z − z (cid:48) |≤ | z − v | + | v − z (cid:48) |≤ diam( T ) + 23 diam( T (cid:48) ) ≤ (cid:18) √ (cid:19) diam( T ) ≤ (cid:18) √ (cid:19) λ ( z ) , where we used (5.131) in the last line. This concludes the proof of this proposition. (cid:5) We construct in the next lemma a collection V of sites by sampling the metric space(Γ , d H ), which is the collection of vertices of our future quasi-acute triangulation T . Thisset is illustrated by small crosses on Γ z on Figure 5.9 (right). Lemma 5.5.19. The following holds if c is sufficiently small. There exists a discrete set V ⊂ Γ containing V and satisfying the following :i) (Separation) For all v, w ∈ V , d H ( v, w ) ≥ / .ii) (Distance from Γ ) For all z ∈ Γ , d H ( z, V ) ≤ / .iii) (Distance from R ) For all z ∈ R , d H ( z, V ) ≤ / . Proof: For any distinct v, w ∈ V we obtain using (5.134) and Proposition 5.2.10 d H ( v, w ) ≥ ln(1 + c (cid:107) v − w (cid:107) H ( v ) ) /c ≥ ln(1 + c | v − w | /λ ( v )) /c ≥ ln(1 + c √ / /c .5. From metric to mesh / 10 if c is sufficiently small.We denote by V a 1-equispaced subset of the metric space (Γ , d H ) containing V ,which exists according to Lemma 5.5.7. 
The two first announced properties, separationand distance from Γ, follow directly from this construction.The third property is obtained as follows : for any z ∈ R there exists according toProposition 5.5.18 a point e ∈ Γ such that (cid:107) z − e (cid:107) H ( z ) ≤ 2. Furthermore there exists avertex v ∈ V such that d H ( e, v ) ≤ / 10, hence d H ( z, V ) ≤ d H ( z, e ) + d H ( e, v ) ≤ − ln(1 − c (cid:107) z − e (cid:107) H ( z ) ) /c + 1 / ≤ − ln(1 − c ) /c + 1 / . which is smaller than 2 + 1 / c is sufficiently small. (cid:5) The next lemma introduces the anisotropic Delaunay triangulation T associated tothe metric H and the set of points V . Lemma 5.5.20. There exists an absolute constant C ≥ such that the following holdsif c is sufficiently small. The set V is wedged with respect to the metric H , and theanisotropic Delaunay triangulation T obtained by arbitrarily triangulating the cells of thegraph G ( V , H ) is C -adapted to H and belongs to T g,C . Proof: We denote α := (2 + 1 / − throughout the proof of this lemma. It follows from toLemma 5.5.19 that the set V is a α/ R , αd H ).Theorem 5.5.8 applied to the metric α H , therefore implies that, if c is sufficientlysmall, the set V is wedged for the metric α H and that arbitrarily triangulating theconvex cells of the graph G ( V , α H ), which are convex yields a triangulation T which is C -adapted to the metric α H (hence C := C/α -adapted to H ), where C is an absoluteconstant.It immediately follows from the definition (5.120) that the graphs G ( V , α H ) and G ( V , H ) are equal, and Definition 5.5.9 shows that the wedge property holds for H if andonly if it holds for α H . (cid:5) The key ingredient of the construction of the anisotropic Delaunay triangulation is theanisotropic Voronoi diagram introduced in (5.119). Our first lemma compares the Voronoiregions with some balls. 
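Before the Voronoi regions are compared with balls, the following Python sketch may help fix ideas. It assigns a point to the site whose anisotropic Voronoi region contains it. Two assumptions are made here: definition (5.119), which is not restated in this section, is taken in the usual form $\mathrm{Vor}(v) = \{z \,;\, \|z-v\|_{H(v)} \le \|z-w\|_{H(w)} \text{ for all sites } w\}$, and the metric is taken in the form (5.136), $H = \lambda^{-2}\theta\theta^T + \mu^{-2}(\mathrm{Id}-\theta\theta^T)$. The function names and the numerical values are hypothetical.

```python
import numpy as np

def metric(theta, lam, mu):
    """Assumed form of (5.136): H = lam^-2 * t t^T + mu^-2 * (Id - t t^T),
    where t is the unit vector of angle theta."""
    t = np.array([np.cos(theta), np.sin(theta)])
    P = np.outer(t, t)
    return P / lam**2 + (np.eye(2) - P) / mu**2

def aniso_norm(H, u):
    """Norm ||u||_H = sqrt(u^T H u)."""
    return float(np.sqrt(u @ H @ u))

def voronoi_owner(z, sites, metrics):
    """Index of the site v minimizing ||z - v||_{H(v)} (assumed definition of Vor)."""
    dists = [aniso_norm(H, z - v) for v, H in zip(sites, metrics)]
    return int(np.argmin(dists))

# Hypothetical data: site 0 is stretched along the x axis (lam = 4, mu = 1),
# site 1 carries the identity metric.
sites = [np.array([0.0, 0.0]), np.array([1.0, 0.0])]
metrics = [metric(0.0, 4.0, 1.0), metric(0.0, 1.0, 1.0)]
print(voronoi_owner(np.array([0.7, 0.0]), sites, metrics))
```

In this example the point $(0.7, 0)$ is closer to the second site in the euclidean sense, yet it is assigned to the first site because distances along $\theta$ are divided by $\lambda = 4$ there.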
We recall that B H ( z, r ) := { z (cid:48) ∈ R ; (cid:107) z (cid:48) − z (cid:107) ≤ r } , and B ( z, r )denotes the usual euclidean ball of radius r centered at z . Lemma 5.5.21. If c is sufficiently small, then the following holds. For any v ∈ V B H ( v, / ⊂ Vor( v ) B H ( v, / ⊃ Vor( v ) B ( v, µ ( v ) / ⊃ Vor( v ) ∩ Γ = Vor( v ) ∩ Γ v Proof: We first establish Point 1., and for that purpose we consider z ∈ ∂ Vor( v ), hence z ∈ Vor( v ) ∩ Vor( w ) for some vertex w ∈ V . We define r := (cid:107) z − v (cid:107) H ( v ) = (cid:107) z − w (cid:107) H ( w ) Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? and we obtain using Point i) of Lemma 5.5.191 / ≤ d H ( v, w ) ≤ d H ( z, v ) + d H ( z, w ) ≤ − ln(1 − c (cid:107) z − v (cid:107) H ( v ) ) /c − ln(1 − c (cid:107) z − v (cid:107) H ( v ) ) /c = − − cr ) /c. It follows that r ≥ (1 − exp( − c/ /c, and therefore r ≥ / 25 if c is sufficiently small, which concludes the proof of Point 1.We now turn to Point 2, and for that purpose we remark that for any z ∈ R thereexists according to Point iii) of Lemma 5.5.19 a vertex w ∈ V such that d H ( z, w ) ≤ / (cid:107) z − w (cid:107) H ( w ) ≤ (exp( cd H ( z, w )) − /c ≤ (exp( c (2 + 1 / − /c. Hence (cid:107) z − w (cid:107) H ( w ) ≤ / c is sufficiently small. It thus follows from the definition(5.119) of the Voronoi diagram that (cid:107) z − v (cid:107) H ( v ) ≤ / z ∈ Vor( v ), whichconcludes the proof of Point 2.We now turn to Point 3, and for that purpose we remark that for all z ∈ Γ there existsaccording to Point ii) of Lemma 5.5.19 a vertex w ∈ V such that d H ( z, w ) ≤ / 10. Itfollows that (cid:107) z − w (cid:107) H ( w ) ≤ (exp( cd H ( z, w )) − /c ≤ (exp( c/ − /c. Hence (cid:107) z − w (cid:107) H ( w ) ≤ / 19 if c is sufficiently small. It thus follows from the definition ofthe Voronoi diagram that (cid:107) z − v (cid:107) H ( v ) ≤ / 19 for all z ∈ Vor( v ) ∩ Γ. 
Therefore | z − v | λ ( v ) ≤ (cid:107) z − v (cid:107) H ( v ) ≤ < z ∈ Γ v according to Proposition 5.5.18. We thus have (cid:1) ( θ ( v ) ⊥ , z − v ) ≤ π / 3, according to Lemma 5.5.17, and therefore | z − v | µ ( v ) cos( π / ≤ (cid:107) z − v (cid:107) H ( v ) ≤ (cid:5) We recall that, by definition of the anisotropic Delaunay triangulation, the Voronoiregions Vor( v ) and Vor( w ) intersect for any edge [ v, w ] of T . We say that an edge [ v, w ]of T is transverse if Vor( v ) ∩ Vor( w ) ∩ Γ (cid:54) = ∅ , and we say that [ v, w ] is aligned otherwise.Our next intermediate result estimates the length of transverse and aligned edges. Lemma 5.5.22. Let [ v, w ] be an edge of T and let z ∈ [ v, w ] .1. If [ v, w ] is transverse, then | v − w | ≤ µ ( z ) / . More precisely max {| v − x | , | w − x |} ≤ µ ( z ) / for all x ∈ Vor( v ) ∩ Vor( w ) ∩ Γ . .5. From metric to mesh 2. In any case (cid:107) v − w (cid:107) H ( z ) ≤ . Proof: We first establish Point 1., hence we assume that [ v, w ] is transverse and weconsider a point x ∈ Vor( v ) ∩ Vor( w ) ∩ Γ. For notational simplicity we denote β := 4 / | v − x | ≤ βµ ( v ) ≤ β ( µ ( x )+ c | v − x | ) , hence | v − x | ≤ β (1 − cβ ) − µ ( x ) . Likewise | w − x | ≤ β (1 − cβ ) − µ ( x ). Furthermore | z − x | ≤ max {| v − x | , | w − x |} since z ∈ [ v, w ], hence µ ( x ) ≤ µ ( z ) + c | z − x | ≤ µ ( z ) + cβ (1 − cβ ) − µ ( x )which implies µ ( x ) ≤ (1 − cβ (1 − cβ ) − ) − µ ( z ) . Therefore | v − x | ≤ β (1 − cβ ) − µ ( x ) ≤ β (1 − cβ ) − (1 − cβ (1 − cβ ) − ) − µ ( z ) . The term in front of µ ( z ) in the right hand side tends to β = 4 / < / c → c is sufficiently small we therefore obtain as announced | v − x | ≤ µ ( z ) / 4. Likewise | w − x | ≤ µ ( z ) / | v − w | ≤ µ ( z ) / αd H ( v, w ) = d α H ( v, w ) ≤ − − e c (cid:48) ) /c (cid:48) where c (cid:48) := c/α . Hence d H ( v, w ) ≤ r ( c ) := − − e c/α ) /c. 
If follows that (cid:107) v − w (cid:107) H ( v ) ≤ r ( c ) := ( e cr ( c ) − /c, which implies using (5.53) that (cid:107) v − w (cid:107) H ( z ) ≤ (cid:107) v − w (cid:107) H ( v ) (1 − c (cid:107) v − z (cid:107) H ( v ) ) − ≤ r ( c ) := r ( c )(1 − cr ( c )) − for any z ∈ [ v, w ]. As c → r , r and r all tend to 2(1 + 1 / < 5. Choosing c sufficiently small we conclude the proof of this proposition. (cid:5) The next lemma describes the orientations of the transverse and aligned edges of T ,in regions of sufficient anisotropy. We introduce the constant ρ := max { / sin( π / , C } , where C is the constant from Lemma 5.5.20 (we only use in the sequel that T is C equivalent to the metric H ).For any v, w ∈ V , we say that [ v, w ] is an edge of Γ if( v, w ) ⊂ Γ \ V , where ( v, w ) denotes the relative interior of the segment [ v, w ]. The next lemma characte-rizes the transverse edges of T in regions where the anisotropy is sufficiently pronounced.For any x ∈ Γ Γ + x := { v (cid:48) ∈ Γ v ; (cid:104) θ ( x ) ⊥ , v (cid:48) − x (cid:105) > } , and we define Γ − x similarly. If [ v, w ] is an edge of Γ, then one easily checks that w is theclosest element to v in Γ + v ∩ V or Γ − v ∩ V .68 Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? Lemma 5.5.23. The following holds if c is sufficiently small. Let v, w ∈ V be a transverseedge, and assume that there exists z ∈ [ v, w ] such that ρ ( z ) ≥ ρ . Then [ v, w ] is and edgeof Γ , Vor( v ) ∩ Vor( w ) ∩ Γ is a singleton { x } , and x ∈ [ v, w ] . Proof: We first consider an transverse edge [ v, w ], a point z ∈ [ v, w ] such that ρ ( z ) ≥ ρ ,and a point x ∈ Vor( v ) ∩ Vor( w ) ∩ Γ. According to Lemma 5.5.21 we have x ∈ Γ v ∩ Γ w ,hence v, w ∈ Γ x .We may assume without loss of generality that v ∈ Γ + x . We denote by v (cid:48) the pointof Γ + x which is the closest to x , and our first objective is to to establish that v = v (cid:48) . 
Weassume for contradiction that this is not the case.It follows from Lemma 5.5.17 that the segment [ x, v ] and the line v (cid:48) + θ ( v (cid:48) ) R intersect.Hence there exists α ∈ [0 , 1] and r ∈ R such that x + α ( v − x ) = v (cid:48) + rθ ( v (cid:48) )Let u be a unit vector orthogonal to v − x . We have (cid:104) x, u (cid:105) = (cid:104) v (cid:48) , u (cid:105) + r (cid:104) θ ( v (cid:48) ) , u (cid:105) , andtherefore using Lemma 5.5.17, | r | / ≤ | r | cos (cid:1) ( θ ( v (cid:48) ) ⊥ , v − x )= | r (cid:104) θ ( v (cid:48) ) , u (cid:105)| = |(cid:104) x − v (cid:48) , u (cid:105)|≤ | x − v (cid:48) | ≤ | x − v | ≤ µ ( z ) / x, v ] does not intersect B H ( v (cid:48) , / | r | ≥ λ ( v (cid:48) ) / 25 which yields λ ( z ) − c | z − v (cid:48) | ≤ λ ( v (cid:48) ) ≤ µ ( z )25 / . It follows from Lemma 5.5.22 that | z − v (cid:48) | ≤ | z − x | + | x − v (cid:48) | ≤ max {| v − x | , | w − x |} + | x − v | ≤ µ ( z ) / . We thus obtain 25 / c/ ≥ ρ ( z ) ≥ ρ which is a contradiction since c ≤ 1. Thiscontradiction is illustrated on Figure 5.10 (left).Hence v = v (cid:48) is the point of Γ + x the closest from x . If w ∈ Γ + x , then we obtain w = v (cid:48) = v which is a contradiction. Hence w ∈ Γ − x , and reasoning similarly we find that w is thepoint of Γ − x the closest to x . This implies that [ v, w ] is the unique edge of Γ containing x (note that the only points of Γ which belong to two edges of Γ are some vertices v ∈ V .But x / ∈ V since any vertex v ∈ V belongs to the interior of its own Voronoi region).Assume for contradiction that there exists another point x (cid:48) ∈ Vor( v ) ∩ Vor( w ) ∩ Γ. Theabove argument shows that [ v, w ] is the edge of Γ containing x (cid:48) . Therefore x, x (cid:48) ∈ [ v, w ],but this implies x = x (cid:48) since the Voronoi regions are star shaped. (cid:5) Our next purpose is to estimate the orientations of the edges of T . .5. From metric to mesh xv θ ( v ′ ) v ′ w ∗ v ′ θ ( v ) x v Figure Lemma 5.5.24. 
The following holds if c is sufficiently small. Let [ v, w ] be an edge of T such that w / ∈ Γ v and assume that ρ ( z ) ≥ ρ for some z ∈ [ v, w ] . Then (cid:1) ( θ ( z ) , w − v ) ≤ π / . Proof: It follows from Lemma 5.5.20 that | w − v | µ ( z ) sin (cid:1) ( θ ( z ) , w − v ) ≤ (cid:107) w − v (cid:107) H ( z ) ≤ . On the other hand Proposition 5.5.18 yields since w / ∈ Γ v | v − w | ≥ λ ( v )10 / ≥ ( λ ( z ) − c | z − w | )10 / , therefore | v − w | /λ ( z ) ≥ / (91 + 10 c/ 91) which implies that | v − w | ≥ λ ( z ) / 10 if c issufficiently small. It follows thatsin (cid:1) ( θ ( z ) , w − v ) ≤ ρ ( z ) , which concludes the proof since ρ ≥ / sin( π / (cid:5) The next lemma establishes that the aligned edges of T are, as their name indicates,aligned with the direction θ of anisotropy in regions where this anisotropy is sufficientlypronounced. Lemma 5.5.25. The following holds if c is sufficiently small.1. For any vertex v ∈ V such that ρ ( v ) ≥ ρ , there exists two aligned edges [ v, w ] , [ v, w (cid:48) ] of T such that (cid:30) ( θ ( v ) , w − v ) ≤ π / and (cid:30) ( θ ( v ) , v − w (cid:48) ) ≤ π / 12 (5.144) 2. Consider an aligned edge [ v, v (cid:48) ] of T and a point z ∈ [ v, v (cid:48) ] such that ρ ( z ) ≥ ρ + 1 .Then (cid:1) ( θ ( z ) , w − v ) ≤ π / . Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? Proof: We first establish Point 1., and as an intermediate objective we prove that Γ v \ Vor( v ) (cid:54) = ∅ . It follows from the definition (5.140) of Γ v that there exists an edge [ v , w ]of T such that v ∈ [ v , ( v + 2 w ) / ⊂ Γ v . Therefore according to (5.133)max {| v − v | , | v − ( v + 2 w ) / |} ≥ | v − w | / ≥ λ ( v ) √ / . On the other hand | z − v | ≤ µ ( v ) / 19 for all z ∈ Vor( v ) ∩ Γ v according to Lemma 5.5.21.Since ρ ( v ) ≥ ρ > / (19 √ 2) we find that one of the points v or w does not belong toVor( v ).Consider a point x ∈ Γ v ∩ ∂ Vor( v ). 
There exists vertex w ∗ ∈ V such that x ∈ Vor( w ∗ )and [ v, w ∗ ] is an edge of T . Therefore [ v, w ∗ ] is a transverse edge of T such that x ∈ Vor( v ) ∩ Vor( w ∗ ) ∩ Γ and ρ ( v ) ≥ ρ . This implies according to Lemma 5.5.23 that [ v, w ∗ ]is an edge of Γ containing x .Let T, T (cid:48) ∈ T be the two triangles containing the edge [ v, w ∗ ], and let w, w (cid:48) ∈ V bethe third vertex of these triangles respectively. Since T is C -adapted to the metric H , weobtain using (5.10) | v − w ∗ | + | v − w | > diam( T ) ≥ (cid:107)H − T (cid:107) ≥ (cid:107) H ( v ) − (cid:107) /C = λ ( v ) /C . (5.145)If w ∈ Γ v , then we obtain using Lemma 5.5.17 and Lemma 5.5.22 that | v − w | / (2 µ ( v )) ≤ | v − w | cos( (cid:1) ( θ ( v ) ⊥ , w − v )) /µ ( v ) ≤ (cid:107) v − w (cid:107) H ( v ) ≤ , and | v − w ∗ | ≤ µ ( v ) / 2. Injecting this in (5.145) we obtain µ ( v )(10 + 1 / ≤ λ ( v ) /C hence ρ ( v ) ≤ (10 + 1 / C which is a contradiction. Therefore w / ∈ Γ v , which implies (cid:1) ( θ ( v ) , w − v ) ≤ π / 12 according to Lemma 5.5.24. Hence (cid:30) ( θ ( v ) , w − v ) ∈ [0 , π / ∪ [11 π / , π ] . and likewise (cid:30) ( θ ( v ) , w (cid:48) − v ) ∈ [0 , π / ∪ [11 π / , π ]. On the other hand we have (cid:1) ( θ ( v ) ⊥ , w ∗ − v ) ≤ π / (cid:30) ( θ ( v ) , w ∗ − v ) ∈ [ π / , π / T and T (cid:48) are on opposite sides of the edge [ v, w ∗ ] weobtain (5.144) which concludes the proof of Point 1.We now turn to the proof of Point 2. We have | v − z | /λ ( v ) ≤ (cid:107) v − w (cid:107) H ( z ) ≤ c ≤ c ρ ( ρ , 5) we therefore obtain ρ ( v ) ≥ ρ . If v (cid:48) / ∈ Γ v , then Lemma 5.5.24gives the announced result. We therefore assume for contradiction that v (cid:48) ∈ Γ v , and wemay assume without loss of generality that v (cid:48) ∈ Γ + v . 
It follows that Γ + v \ Vor( v ) (cid:54) = ∅ .Repeating our previous argument we consider a point x ∈ Γ + v ∩ ∂ Vor( v ), and we find thatthere exists a vertex w ∗ ∈ Γ + v such that [ v, w ∗ ] is an edge common to T and Γ containing x . We denote by T, T (cid:48) the two triangles containing this edge and by w, w (cid:48) the third vertexof these triangles.Since v (cid:48) ∈ Γ + v we have (cid:1) ( θ ( v ) ⊥ , v − v (cid:48) ) ≤ π / (cid:30) ( θ ( v ) , v (cid:48) − v ) ∈ [ π / , π / w, w (cid:48) satisfy (5.144) we obtain that [ v, v (cid:48) ] intersectsone of the triangles T or T (cid:48) , which is a contradiction, as illustrated on Figure 5.10 (right).This concludes the proof of this lemma. (cid:5) .5. From metric to mesh v ∈ V such that ρ ( v ) ≥ ρ we define Γ (cid:48) v := [ v, w − ] ∪ [ v, w + ] where w − , w + ∈ V are such that Γ v := [ v, (2 w − + v ) / ∪ [ v, (2 w + + v ) / (cid:48) theunion of these sets Γ (cid:48) := (cid:91) v ∈V ρ ( v ) ≥ ρ Γ (cid:48) v , which is a union of segments joining some vertices in V , hence in V according to Lemma5.5.19. We denote by Q the partition into convex polygons of R obtained by bisectingthe triangles T ∈ T by these segments. Such a partition is generally referred to as theoverlay T and Γ (cid:48) . Proposition 5.5.26. The following holds if c is sufficiently small. Denote by T (cid:48) thetriangulation obtained by arbitrarily triangulating the convex cells of Q . Then T (cid:48) is a -refinement of T and S ( T (cid:48) ) ≤ max { C (3 + ρ ) , tan(11 π / } for all T (cid:48) ∈ T (cid:48) . Proof: Our first objective is to give a uniform bound on the measure of sliverness in T (cid:48) , and for that purpose we denote by V (cid:48) the collection of vertices of T (cid:48) . We have byconstruction V (cid:48) ⊂ Γ ∪ Γ (cid:48) .Assuming that c ≤ c ρ ( ρ + 1 , ρ ( z ) ≥ ρ + 1 for all z ∈ Γ (cid:48) .Consider a vertex z ∈ V (cid:48) , and assume in a first time that z ∈ Γ (cid:48) . 
It follows from Lemma5.5.17 that there exists two edges [ z, v ], [ z, v (cid:48) ] of T (cid:48) , contained in Γ (cid:48) , and such that (cid:30) ( θ ( z ) ⊥ , v − z ) ≤ π / (cid:30) ( θ ( z ) ⊥ , z − v (cid:48) ) ≤ π / . Furthermore, it follows from Lemma 5.5.25 that there exists two edges [ z, w ], [ z, w (cid:48) ], of T (cid:48) such that (cid:30) ( θ ( v ) , w − z ) ≤ π / 12 and (cid:30) ( θ ( v ) , z − w (cid:48) ) ≤ π / . In more detail two cases are possible : either z is a vertex of the original triangulation T ,and Point 1. of Lemma 5.5.25 applies, or z is a vertex of T (cid:48) created by the overlay of Γand T . In that case Point 2. of Lemma 5.5.25 applies and the edges [ z, w ], [ z, w (cid:48) ], of T (cid:48) are contained in the same aligned edge of T .We thus have (cid:30) ( w − z, v − z ) ≤ (cid:30) ( w − z, θ ( z )) + (cid:30) ( θ ( z ) , θ ( z ) ⊥ ) + (cid:30) ( θ ( z ) ⊥ , v − z ) ≤ π / 12 + π / π / 3= 11 π / . Likewise (cid:30) ( v − z, w (cid:48) − z ) ≤ π / , (cid:30) ( w (cid:48) − z, v (cid:48) − z ) ≤ π / 12 and (cid:30) ( v (cid:48) − z, w − z ) ≤ π / , which immediately implies that any angle at the vertex z is bounded by 11 π / z ∈ V (cid:48) \ Γ (cid:48) . Since V (cid:48) ⊂ Γ ∪ Γ (cid:48) there exists v ∈ V such that z ∈ Γ v . If ρ ( z ) ≥ ρ + 3 and c ≤ c ρ ( ρ + 3 , 1) then ρ ( v ) ≥ ρ + 2 andtherefore z ∈ Γ (cid:48) , which is a contradiction. Therefore ρ ( z ) ≤ ρ + 3, which implies that the72 Chapter 5. Are riemannian metrics equivalent to simplicial meshes ? triangles T ∈ T containing z satisfy S ( T ) ≤ ρ ( T ) ≤ C ρ ( z ) ≤ C ( ρ + 3). Any angle θ at the vertex z in the triangulation T , hence also in the triangulation T (cid:48) , thus satisfiestan( θ/ ≤ C ( ρ + 3), which concludes the proof of the upper bound of the measure ofsliverness S on T (cid:48) .We now prove that T (cid:48) is a bounded refinement of T , and for that purpose we consideran edge e = [ v, w ] of T , and we denote z := ( v + w ) / 2. 
We have by construction $|v-w|/\lambda(z) \le \|v-w\|_{H(z)} \le 5$. If a (half-square) triangle $T$ intersects $[v,w]$, then any $z' \in [v,w] \cap T$ satisfies $|z-z'| \le |v-w|/2 \le 5\lambda(z)/2$, hence
$$\lambda(z)(1-5c/2) \le \lambda(z)-c|z-z'| \le \lambda(z') \le \lambda(z)+c|z-z'| \le \lambda(z)(1+5c/2).$$
Recalling that $\operatorname{diam}(T) \le \lambda(z')$, and that $\operatorname{diam}(T)$ is also bounded below by a fixed fraction of $\lambda(z')$, we obtain that $T$ is contained in a ball centered at $z$ whose radius is bounded by a fixed multiple of $\lambda(z)$. On the other hand, since $T$ is a half square, $\sqrt{|T|} = \operatorname{diam}(T)/2$ is bounded below by a fixed multiple of $\lambda(z)(1-5c/2)$. Comparing the areas, we obtain that the number of such triangles intersecting $[v,w]$ is smaller than $100$ if $c$ is sufficiently small.

It follows that any edge $[v,w]$ of $\mathcal{T}$ is cut in at most $100$ parts by the set $\Gamma$. For any $T \in \mathcal{T}$ we thus obtain $\#(\partial T \cap \mathcal{V}') \le 3 \times 100 = 300$. Since any conforming triangulation of the convex envelope of $n+2$ points uses $n$ triangles, we obtain that the triangulation $\mathcal{T}'$ is a $298$-refinement of $\mathcal{T}$, which concludes the proof of this proposition. $\diamond$

Chapter 6

Approximation theory based on metrics

The main purpose of adaptive mesh generation, compared with uniform mesh generation, is to reduce the number of simplices required to achieve a given task. Here are two relevant examples which are further discussed in this chapter:

1. Given two domains of interest, generate the mesh with smallest possible cardinality that separates these domains with a layer of simplices.
2. Given a function $f$, generate the mesh of a prescribed cardinality $N$ that yields the best finite element approximation to $f$ of a given order $m-1$ in $L^p$ or $W^{1,p}$.

Both examples can be viewed as optimization problems posed on the set of meshes. The second problem was dealt with in Chapters 2 and 3 in the case where $f$ is smooth, and an optimal mesh was constructed in the asymptotic regime $N \to +\infty$. The results of the previous chapter have revealed that certain relevant classes of meshes can be equivalently described in terms of riemannian metrics, and a natural objective is therefore to translate the above problems into optimization problems posed on the set of metrics. Optimization problems posed on metrics may indeed be easier to solve in practice, due to the continuous nature of these objects. The goal of this chapter is to explain how this objective can be met for the above two examples.

Throughout, $d$ denotes the dimension. We denote by $\mathbb{T}$ the collection of conforming simplicial meshes of $\mathbb{R}^d$ and by $\mathbb{H} := C(\mathbb{R}^d, S_d^+)$ the collection of metrics. In the previous chapter, we introduced the classes of meshes
$$\mathbb{T}_{i,C} \subset \mathbb{T}_{a,C} \subset \mathbb{T}_{g,C},$$
and characterized the respectively equivalent classes of metrics
$$\mathbb{H}_i \subset \mathbb{H}_a \subset \mathbb{H}_g.$$
These classes of meshes and metrics are defined on the entire unbounded domain $\mathbb{R}^d$. The above optimization problems do not make sense in this setting, since the number of elements in such meshes is infinite. In order to circumvent this difficulty while still avoiding the difficulties related to the boundary of a domain, we work in a periodized setting that we describe below.
We denote by $\mathbb{T}_{per}$ the collection of meshes $\mathcal{T} \in \mathbb{T}$ which satisfy the following properties:
i) (Translation invariance) For any $T \in \mathcal{T}$ and any $u \in \mathbb{Z}^d$, one has $u + T \in \mathcal{T}$.
ii) (Constrained diameters) For any $T \in \mathcal{T}$ one has $\mathcal{H}_T \ge \mathrm{Id}$.

Accordingly, we denote by $\mathbb{H}_{per}$ the collection of metrics $H \in \mathbb{H}$ which satisfy the following properties:
i) (Translation invariance) For any $z \in \mathbb{R}^d$ and any $u \in \mathbb{Z}^d$, one has $H(z) = H(z+u)$.
ii) (Constrained positivity) For any $z \in \mathbb{R}^d$ one has $H(z) \ge \mathrm{Id}$.

A mesh $\mathcal{T} \in \mathbb{T}_{per}$ can be regarded, thanks to the translation invariance property, as a mesh of the compact periodic space
$$\Pi^d := (\mathbb{R}/\mathbb{Z})^d, \tag{6.1}$$
namely the $d$-dimensional torus. The diameter constraint property ensures that the simplices $T$ of a periodic mesh $\mathcal{T} \in \mathbb{T}_{per}$ do not have a significantly larger size than the fundamental cell $[0,1]^d$ of the standard $\mathbb{Z}^d$-periodic tiling of $\mathbb{R}^d$.

Strictly speaking, a periodic mesh $\mathcal{T} \in \mathbb{T}_{per}$ has an infinite cardinality. However, it only contains a finite number of simplices up to translation by an element of $\mathbb{Z}^d$. For any $\mathcal{T} \in \mathbb{T}_{per}$, we therefore denote by $\#(\mathcal{T})$ the number of equivalence classes in $\mathcal{T}$ for the relation
$$T \sim T' \quad \text{if and only if there exists } u \in \mathbb{Z}^d \text{ such that } T' = T + u.$$
The number $\#(\mathcal{T})$ is finite, and is the cardinality of $\mathcal{T}$ seen as a mesh of the compact periodic space (6.1). We define the mass $m(H)$ of a periodic metric $H \in \mathbb{H}_{per}$ as follows:
$$m(H) := \int_{[0,1]^d} \sqrt{\det H}.$$
The next proposition shows that the mass of a metric $H$ and the cardinality of a mesh $\mathcal{T}$ are close when $H$ and $\mathcal{T}$ are equivalent in the sense studied in the previous chapter. We denote by $T_{eq}$ a fixed equilateral simplex centered at the origin of $\mathbb{R}^d$ and having its vertices on the unit sphere.

Proposition 6.1.1. The cardinality of a mesh $\mathcal{T} \in \mathbb{T}_{per}$ and the mass of a metric $H \in \mathbb{H}_{per}$ satisfy the following properties:
– If $\mathcal{H}_T \le C^2 H(z)$ for all $T \in \mathcal{T}$ and all $z \in T$, then $|T_{eq}|\, \#(\mathcal{T}) \le C^d m(H)$.
– If $H(z) \le C^2 \mathcal{H}_T$ for all $T \in \mathcal{T}$ and all $z \in T$, then $m(H) \le C^d |T_{eq}|\, \#(\mathcal{T})$.
In particular, if $\mathcal{T}$ is $C$-adapted to $H$ in the sense of Definition 5.1.1, then
$$C^{-d} m(H) \le |T_{eq}|\, \#(\mathcal{T}) \le C^d m(H). \tag{6.2}$$

Proof: For any $z \in \mathbb{R}^d$ we denote by $T(z)$ a simplex $T \in \mathcal{T}$ containing $z$. Recalling (see Proposition 5.1.3 in the previous chapter) that any simplex $T$ satisfies $|T| \sqrt{\det \mathcal{H}_T} = |T_{eq}|$, we obtain
$$\#(\mathcal{T}) = \int_{[0,1]^d} \frac{dz}{|T(z)|} = \frac{1}{|T_{eq}|} \int_{[0,1]^d} \sqrt{\det \mathcal{H}_{T(z)}}\, dz. \tag{6.3}$$
Furthermore we have $\det S \le \det S'$ for any $S, S' \in S_d^+$ satisfying $S \le S'$. Applying this remark to the matrices $\mathcal{H}_{T(z)}$ and $C^2 H(z)$ or $C^{-2} H(z)$, we obtain the announced inequalities, which concludes the proof of the proposition. $\diamond$

The equivalence between classes of meshes and metrics of $\mathbb{R}^d$ established in Theorem 5.1.14 of the previous chapter has an immediate generalization to periodic meshes and metrics.

Theorem 6.1.2. There exists a constant $C_0 = C_0(d)$ such that for all $C \ge C_0$:
1. The collection of meshes $\mathbb{T}_{i,C} \cap \mathbb{T}_{per}$ is equivalent to the collection of metrics $\mathbb{H}_i \cap \mathbb{H}_{per}$.
2. If $d = 2$, then the collection of meshes $\mathbb{T}_{a,C} \cap \mathbb{T}_{per}$ is equivalent to the collection of metrics $\mathbb{H}_a \cap \mathbb{H}_{per}$.
3. If $d = 2$, then the collection of meshes $\mathbb{T}_{g,C} \cap \mathbb{T}_{per}$ is equivalent to the collection of metrics $\mathbb{H}_g \cap \mathbb{H}_{per}$.

Proof: The proof of Theorem 5.1.14 presented in the previous chapter can be adapted to periodic metrics and triangulations with only slight changes, which are left to the reader.
$\diamond$

In order to ensure the existence of an "optimal metric" for a number of optimization problems posed on periodic and isotropic, quasi-acute or graded metrics, we need a property of compactness for such metrics.

We equip the set $\mathbb{H}_{per}$ of continuous and $\mathbb{Z}^d$-periodic metrics with a distance $d_{per}$ which is defined as follows: for all $H, H' \in \mathbb{H}_{per}$,
$$d_{per}(H, H') := \sup_{z \in [0,1]^d} d_\times(H(z), H'(z)) = \sup_{z \in \mathbb{R}^d} d_\times(H(z), H'(z)).$$
Consider $H, H' \in \mathbb{H}_{per}$ and define $\delta = d_{per}(H, H')$. We thus have for all $z \in \mathbb{R}^d$,
$$e^{-2\delta} H(z) \le H'(z) \le e^{2\delta} H(z). \tag{6.4}$$
It immediately follows that
$$e^{-d\delta} m(H) \le m(H') \le e^{d\delta} m(H), \tag{6.5}$$
and
$$e^{-\delta} d_H(x,y) \le d_{H'}(x,y) \le e^{\delta} d_H(x,y), \tag{6.6}$$
which shows that the mass $H \mapsto m(H)$ and the distance between two fixed points $H \mapsto d_H(x,y)$ depend continuously on $H \in \mathbb{H}_{per}$.

We establish later in this chapter that for any bound $M$ and any $\star \in \{i, a, g\}$, the collection of metrics
$$\{H \in \mathbb{H}_{per} \cap \mathbb{H}_\star \,;\, m(H) \le M\}$$
is a compact subset of $\mathbb{H}_{per}$.

Isotropic and anisotropic mesh adaptation is often used to separate some geometric sets, in the sense given by the following definition. For any closed subset $E \subset \mathbb{R}^d$ and any mesh $\mathcal{T} \in \mathbb{T}$ we define the neighborhood $V_{\mathcal{T}}(E)$ of $E$ in $\mathcal{T}$ as follows:
$$V_{\mathcal{T}}(E) := \bigcup_{T \in \mathcal{T},\; T \cap E \ne \emptyset} T.$$
We say that a mesh $\mathcal{T} \in \mathbb{T}$ separates two closed sets $X, Y \subset \mathbb{R}^d$ if and only if
$$V_{\mathcal{T}}(X) \cap V_{\mathcal{T}}(Y) = \emptyset,$$
in other words if $X$ and $Y$ have disjoint neighborhoods in the mesh $\mathcal{T}$.

A natural objective is to build a mesh of minimal cardinality which separates two given regions. Consider two closed, disjoint and $\mathbb{Z}^d$-periodic sets $X, Y \subset \mathbb{R}^d$.
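The separation criterion defined above can be tested on a small planar example. The sketch below is only illustrative: it assumes a finite, non-periodic, conforming triangulation given by vertex coordinates and index triples, and it replaces the closed sets $X$ and $Y$ by finite point samples. It relies on the fact that two triangles of a conforming mesh intersect exactly when they share a vertex. All function names are hypothetical.

```python
def _cross(o, a, b):
    """z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def contains(tri, p, eps=1e-12):
    """True if the closed triangle tri = (a, b, c) contains the point p."""
    a, b, c = tri
    s = (_cross(a, b, p), _cross(b, c, p), _cross(c, a, p))
    return min(s) >= -eps or max(s) <= eps

def neighborhood(vertices, triangles, E):
    """Indices of the triangles meeting the finite point set E (the set V_T(E))."""
    return {k for k, tri in enumerate(triangles)
            if any(contains([vertices[i] for i in tri], p) for p in E)}

def separates(vertices, triangles, X, Y):
    """True if the neighborhoods of X and Y in the mesh are disjoint.
    In a conforming mesh this means that they share no vertex."""
    vx = {i for k in neighborhood(vertices, triangles, X) for i in triangles[k]}
    vy = {i for k in neighborhood(vertices, triangles, Y) for i in triangles[k]}
    return vx.isdisjoint(vy)

# Hypothetical mesh: a strip of four unit squares, each cut into two half squares.
vertices = [(i, j) for i in range(5) for j in range(2)]  # index 2*i + j
triangles = []
for i in range(4):
    triangles.append((2 * i, 2 * i + 2, 2 * i + 3))  # lower half square
    triangles.append((2 * i, 2 * i + 3, 2 * i + 1))  # upper half square

print(separates(vertices, triangles, [(0.5, 0.25)], [(3.5, 0.25)]))
print(separates(vertices, triangles, [(0.5, 0.25)], [(1.5, 0.25)]))
```

The first call reports a separation, since the two sample points lie in triangles with no common vertex; the second does not, since the two triangles involved share the vertex $(1,0)$.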
We consider in §6.3 the optimization problem
$$m_{\star,C_0}(X, Y) := \min\{\#(\mathcal T) \,;\ \mathcal T \in \mathbf T_{\rm per} \cap \mathbf T_{\star,C_0} \text{ and } \mathcal T \text{ separates } X \text{ and } Y\},$$
where $\star \in \{i, a, g\}$ is a symbol and $C_0 \ge 1$ is a constant. We show that this problem is equivalent, at least if $d = 2$, to the following optimization problem posed on metrics:
$$m_\star(X, Y) := \min\{m(H) \,;\ H \in \mathbf H_{\rm per} \cap \mathbf H_\star \text{ and } H \text{ separates } X \text{ and } Y\},$$
where we say that a metric $H$ separates $X$ and $Y$ if
$$d_H(x, y) \ge 1 \quad \text{for all } (x, y) \in X \times Y.$$
Precisely, we show that there exists a constant $C = C(C_0) \ge 1$, independent of the sets $X, Y$, and such that
$$C^{-1}\, m_\star(X, Y) \le m_{\star,C_0}(X, Y) \le C\, m_\star(X, Y).$$

Imposing more constraints on the triangulation typically raises the number of triangles required to achieve a given task, and likewise assuming more regularity of the metric raises the mass required for a given task. We illustrate this remark in the context of the separation of domains as follows. We fix the parameter $r_0 := 1/4$ and we define
$$X := \{x \in \mathbb R^d \,;\ d(x, \mathbb Z^d) \le r_0\},$$
where $d(x, E) := \min\{|x - e| \,;\ e \in E\}$, and for any $0 < \delta \le r_0$
$$Y_\delta := \{y \in \mathbb R^d \,;\ d(y, \mathbb Z^d) \ge r_0 + \delta\}.$$
The sets $X$ and $Y_\delta$ are illustrated on Figure 6.1. The following equivalences are established as $\delta \to 0$:
$$m_i(X, Y_\delta) \simeq \delta^{-(d-1)}, \qquad m_a(X, Y_\delta) \simeq \delta^{-\frac{d-1}{2}}\, |\ln \delta|, \qquad m_g(X, Y_\delta) \simeq \delta^{-\frac{d-1}{2}}.$$
A construction closely related to the separation of the sets $X$ and $Y_\delta$ is presented on Figure 8 in the main introduction of this thesis.

Figure 6.1: The sets $X$ (light gray) and $Y_\delta$ (dark gray), with $d = 2$.

One of the key purposes of anisotropic mesh generation is the adaptive approximation of functions, as discussed extensively in Chapters 2 and 3.
In order to study this problem from the point of view of metrics, we introduce an error $e_H(f)_p$ associated to a function and a metric, and we compare this quantity with the approximation error of $f$ on a mesh. For that purpose we denote by $B_H(z)$ the open ball of radius 1 around $z$ for the norm $\|\cdot\|_{H(z)}$ (therefore an ellipse):
$$B_H(z) := \{z + u \,;\ \|u\|_{H(z)} < 1\}.$$
For any exponent $p$, $1 \le p \le \infty$, and any $f \in L^p_{\rm loc}(\mathbb R^d)$ we denote by $e_H(f; z)_p$ the error of best approximation of $f$ on $B_H(z)$:
$$e_H(f; z)_p := \inf_{\mu \in \mathbb P_{m-1}} \|f - \mu\|_{L^p(B_H(z))}. \tag{6.7}$$
We denote by $L^p_{\rm per}$ the collection of functions $f \in L^p_{\rm loc}(\mathbb R^d)$ which are $\mathbb Z^d$-periodic. For all $f \in L^p_{\rm per}$ and all $H \in \mathbf H_{\rm per}$ we define the approximation error $e_H(f)_p$ of $f$ associated to the metric $H$ as follows:
$$e_H(f)_p^p := \int_{[0,1]^d} \sqrt{\det H(z)}\; e_H(f; z)_p^p\, dz. \tag{6.8}$$
Consider a metric $H \in \mathbf H_{\rm per} \cap \mathbf H_g$ and a mesh $\mathcal T \in \mathbf T_{\rm per}$ such that
$$H_T \ge H(z), \quad z \in T,\ T \in \mathcal T, \tag{6.9}$$
which heuristically means that the mesh $\mathcal T$ is "more refined" than the metric $H$. We establish that for any $f \in L^p_{\rm per}$
$$\inf_{g \in V_{m-1}(\mathcal T)} \|f - g\|_p \le C\, e_H(f)_p, \tag{6.10}$$
where $V_k(\mathcal T)$ stands for the space of finite elements of degree $k$ on $\mathcal T$, and $C$ is an absolute constant. Furthermore we give an explicit expression of a finite element approximation $g \in V_{m-1}(\mathcal T)$ which satisfies (6.10), namely
$$g := I^{m-1}_{\mathcal T} f_{H'}, \tag{6.11}$$
where $H' := 4H$. We denote by $f_H$ the convolution, distorted by the metric $H$, of a function $f$ with a fixed compactly supported mollifier $\varphi$ (which satisfies a moments condition):
$$f_H(z) := \int_{\mathbb R^d} f\big(z + H(z)^{-\frac12} u\big)\, \varphi(|u|)\, du.$$
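The distorted convolution above can be rewritten by a change of variables, which makes the role of the ellipse $B_H(z)$ explicit. The short derivation below is only a sketch: it assumes the convention $\|w\|_{H(z)} = |H(z)^{1/2} w|$ and that $\varphi$ is supported in $[0,1]$, as required later in this chapter. The substitution variable $w$ is introduced for this illustration only.

```latex
% Substitute w = H(z)^{-1/2} u, so that u = H(z)^{1/2} w and du = \sqrt{\det H(z)}\, dw.
\begin{align*}
f_H(z) &= \int_{\mathbb{R}^d} f\big(z + H(z)^{-1/2} u\big)\, \varphi(|u|)\, du \\
       &= \sqrt{\det H(z)} \int_{\mathbb{R}^d} f(z + w)\, \varphi\big(\|w\|_{H(z)}\big)\, dw .
\end{align*}
% Since \varphi vanishes outside [0,1], only the points z + w with \|w\|_{H(z)} \le 1,
% that is the closure of the ellipse B_H(z), contribute to f_H(z).
```

In this form $f_H(z)$ appears as a weighted average of $f$ over the ellipse attached to the metric at $z$, which is the domain used to define the local error $e_H(f; z)_p$.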
The extension of the approximation result (6.10) to the $W^{1,p}$ semi-norm is not trivial, because interpolation on simplices with a large measure of sliverness is not a stable procedure.

For any vector field $v \in L^p_{\rm loc}(\mathbb R^d, \mathbb R^d)$ and any metric $H \in \mathbf H$ we introduce the two quantities
$$e^a_H(v; z)_p := \min_{\nu \in \mathbb P^d_{m-2}} \|v - \nu\|_{L^p(B_H(z))} \tag{6.12}$$
and
$$e^g_H(v; z)_p := \|H(z)\|^{\frac12} \min_{\nu \in \mathbb P^d_{m-2}} \big\|H(z)^{-\frac12}(v - \nu)\big\|_{L^p(B_H(z))}. \tag{6.13}$$
Note that
$$e^a_H(v; z)_p \le e^g_H(v; z)_p, \tag{6.14}$$
with equality if $H(z)$ is proportional to ${\rm Id}$. If $v$ is $\mathbb Z^d$-periodic and if $H \in \mathbf H_{\rm per}$, then we define $e^a_H(v)_p$ and $e^g_H(v)_p$ similarly to (6.8).

Consider a metric $H \in \mathbf H_{\rm per} \cap \mathbf H_a$ and a mesh $\mathcal T \in \mathbf T_{\rm per}$ which satisfy (6.9). We establish that for any $f \in W^{1,p}_{\rm per}$
$$\inf_{g \in V_{m-1}(\mathcal T)} \|\nabla(f - g)\|_p \le C\, e^a_H(\nabla f)_p\, \max_{T \in \mathcal T} S(T), \tag{6.15}$$
where $C = C(d)$.

This estimate should be compared with the following. Consider a metric $H \in \mathbf H_{\rm per} \cap \mathbf H_g$ and a mesh $\mathcal T \in \mathbf T$ which satisfy (6.9) and in addition $C_0^2 H(z) \ge H_T$ for all $T \in \mathcal T$ and $z \in T$. We establish that
$$\inf_{g \in V_{m-1}(\mathcal T)} \|\nabla(f - g)\|_p \le C\, e^g_H(\nabla f)_p, \tag{6.16}$$
where $C = C(C_0, d)$.

The function $g$ given by (6.11) satisfies both estimates (6.15) and (6.16). These results highlight the price to pay to obtain the best error estimate (6.15) in the $W^{1,p}$ norm: the measure of sliverness should be uniformly bounded on $\mathcal T$, and the metric $H$ should belong to $\mathbf H_a$. If these conditions are not satisfied then we can only prove the estimate (6.16), which penalizes the anisotropy of the metric $H$ as observed in (6.14). Note that according to Theorem 6.1.2, if the dimension is $d = 2$ then for any metric $H \in \mathbf H_a \cap \mathbf H_{\rm per}$ there exists a mesh $\mathcal T \in \mathbf T_{a,C} \cap \mathbf T_{\rm per}$ which is $C'$-equivalent to $H$, where $C$ and $C'$ are absolute constants.
By definition of quasi-acute meshes there exists a $C$-refinement $\mathcal T'$ of $\mathcal T$ on which the measure of sliverness is uniformly bounded by $C$, which satisfies (6.9), and such that $\#(\mathcal T') \le C'' m(H)$ where $C''$ is an absolute constant.

We establish some counterparts for metrics of the asymptotic estimates presented in Chapters 2 and 3 for triangulations and more general meshes. Our estimates hold without restriction on the degree $m - 1$, where $m \ge 2$, or on the dimension $d \ge 2$. They apply to a sufficiently smooth function and to metrics of asymptotically large mass.

We denote by $\mathbb H_m$ the vector space of all homogeneous polynomials of degree $m$ in $d$ variables, equipped with the norm
$$\|\pi\| := \sup_{|u| \le 1} |\pi(u)|.$$
We define the shape function $K$ as follows: for all $\pi \in \mathbb H_m$
$$K(\pi) := \inf_{A \in SL_d} \|\pi \circ A\|. \tag{6.17}$$
We establish that for any $\mathbb Z^d$-periodic and $C^m$ function $f$
$$\limsup_{M \to \infty}\; M^{\frac{m}{d}} \inf_{\substack{H \in \mathbf H_g \\ m(H) \le M}} e_H(f)_p \le C\, \|K(d^m f)\|_\tau, \tag{6.18}$$
where the exponent $\tau$ is defined by $\frac{1}{\tau} := \frac{m}{d} + \frac{1}{p}$.

The proof of this result would be greatly simplified if one could establish the existence of a smooth map $\pi \in \mathbb H_m \mapsto A(\pi) \in SL_d$ such that $K(\pi) = \|\pi \circ A(\pi)\|$. This is unfortunately not the case, and the key ingredient of the proof of (6.18) is the definition and the analysis of a family $K^{(\alpha)}$ of functions which tend to $K$ as $\alpha \to \infty$ and which are defined by a well posed optimization problem (for which there exists a unique minimizer depending continuously on the parameter $\pi$).
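The definition (6.17) can be explored numerically in the simplest case $d = 2$, $m = 2$, where $\pi(u) = u^{\rm T} Q u$ for a symmetric $2 \times 2$ matrix $Q$. There $\|\pi \circ A\|$ is the largest absolute eigenvalue of $A^{\rm T} Q A$, whose determinant equals $\det Q$ when $\det A = 1$, so that $\|\pi \circ A\| \ge \sqrt{|\det Q|}$, with equality attainable when $\det Q \ne 0$. The sketch below is an illustration under these assumptions and not the method used in this thesis: the function names and the grid search over $A = R_\phi\, {\rm diag}(t, 1/t)$ are hypothetical choices (a further rotation on the right of $A$ does not change the norm, so it is omitted).

```python
import math

def sup_norm(a, b, c):
    """Largest absolute eigenvalue of [[a, b], [b, c]], i.e. the maximum of
    |a x^2 + 2 b x y + c y^2| over the unit disc."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return max(abs(mean + radius), abs(mean - radius))

def compose(q, phi, t):
    """Coefficients (a, b, c) of pi o A for A = R(phi) diag(t, 1/t), det A = 1."""
    a, b, c = q
    cs, sn = math.cos(phi), math.sin(phi)
    # Entries of R^T Q R, with R the rotation of angle phi.
    a1 = a * cs * cs + 2 * b * cs * sn + c * sn * sn
    b1 = (c - a) * cs * sn + b * (cs * cs - sn * sn)
    c1 = a * sn * sn - 2 * b * cs * sn + c * cs * cs
    # Conjugation by diag(t, 1/t) rescales the diagonal entries only.
    return a1 * t * t, b1, c1 / (t * t)

def shape_function_grid(q, n_phi=180, n_log_t=400, log_t_max=2.0):
    """Grid-search upper estimate of K(pi) = inf over SL_2 of ||pi o A||."""
    best = sup_norm(*q)
    for i in range(n_phi):
        phi = math.pi * i / n_phi
        for j in range(n_log_t + 1):
            t = math.exp(-log_t_max + 2.0 * log_t_max * j / n_log_t)
            best = min(best, sup_norm(*compose(q, phi, t)))
    return best

def shape_function_exact(q):
    """Closed form sqrt(|det Q|), valid for quadratic forms in two variables."""
    a, b, c = q
    return math.sqrt(abs(a * c - b * b))

if __name__ == "__main__":
    q = (1.0, 0.0, 4.0)  # pi(x, y) = x^2 + 4 y^2
    print(shape_function_grid(q), shape_function_exact(q))
```

For a degenerate form such as $\pi(x, y) = x^2$, the grid estimate keeps decreasing as the range of $t$ is widened: the infimum $0$ is approached but never attained, which is one elementary way to see why a minimizer $A(\pi)$ cannot be expected in general.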
We also prove that $K$ is uniformly equivalent to a continuous function on $\mathbb H_m$, as well as the shape functions $L_a$ and $L_g$ defined below.

In the case of $W^{1,p}$ norms we introduce two distinct shape functions
$$L_a(\pi) := \inf_{A \in SL_d} \|(\nabla \pi) \circ A\|, \qquad L_g(\pi) := \inf_{A \in SL_d} \|A^{-1}\|\, \|\nabla(\pi \circ A)\|,$$
where $\mathbb H^d_{m-1}$ is equipped with the norm $\|\nu\| := \sup_{|u| = 1} |\nu(u)|$. We establish results similar to (6.18): for any $C^m$ and $\mathbb Z^d$-periodic function $f$
$$\limsup_{M \to \infty}\; M^{\frac{m-1}{d}} \inf_{\substack{H \in \mathbf H_a \\ m(H) \le M}} e^a_H(\nabla f)_p \le C\, \|L_a(d^m f)\|_\tau \tag{6.19}$$
and
$$\limsup_{M \to \infty}\; M^{\frac{m-1}{d}} \inf_{\substack{H \in \mathbf H_g \\ m(H) \le M}} e^g_H(\nabla f)_p \le C\, \|L_g(d^m f)\|_\tau, \tag{6.20}$$
where the exponent $\tau$ is defined by $\frac{1}{\tau} := \frac{m-1}{d} + \frac{1}{p}$.

For a quadratic or cubic (in two dimensions) homogeneous polynomial $\pi$, we give an explicit expression of near-minimizers of the optimization problems defining $L_a(\pi)$ and $L_g(\pi)$ in terms of the coefficients of $\pi$. This expression can be used in numerical applications to compute efficiently a metric $H$ adapted to the approximated function $f$, using the available information on the derivatives of $f$.

We also use these minimizers to compute the ratio $L_a(\pi)/L_g(\pi) \le 1$, and we discuss for which polynomials this ratio is significantly small. In other words, we discuss for which types of anisotropic features of the approximated function $f$ the use of quasi-acute triangulations leads to a significantly smaller approximation error than the use of graded anisotropic triangulations (which is the case in all numerical software known to the author).

6.2. Compactness of metrics of a given mass

The purpose of this section is to establish the following compactness result. We recall that the collection $\mathbf H_{\rm per}$ of $\mathbb Z^d$-periodic metrics is equipped with the distance
$$d_{\rm per}(H, H') := \sup_{z \in \mathbb R^d} d_\times(H(z), H'(z)).$$

Theorem 6.2.1.
For any symbol $\star \in \{i, a, g\}$ and any $m_0 \ge 1$ the collection of metrics
$$\{H \in \mathbf H_{\rm per} \cap \mathbf H_\star \,;\ m(H) \le m_0\} \tag{6.21}$$
is a compact subset of $\mathbf H_{\rm per}$.

The rest of this section is devoted to the proof of this theorem. Our first intermediate result establishes that the set (6.21) is closed in $\mathbf H_{\rm per}$.

Lemma 6.2.2. The set (6.21) is closed in $\mathbf H_{\rm per}$, as well as the sets $\mathbf H_i$, $\mathbf H_a$, $\mathbf H_g$.

Proof: As observed in §6.1, the value $H \mapsto H(z)$ at a fixed point $z \in \mathbb R^d$, the distance $H \mapsto d_H(x, y)$ between two fixed points $x, y \in \mathbb R^d$ and the mass $H \mapsto m(H)$ define continuous maps on $\mathbf H_{\rm per}$, see (6.4), (6.6) and (6.5) respectively. For any fixed $x, y \in \mathbb R^d$, the collection of all metrics $H \in \mathbf H_{\rm per}$ which satisfy one of the following conditions
$$d_\times(H(x), H(y)) \le d_H(x, y),$$
$$d_+(H(x), H(y)) \le |x - y|,$$
$$\|H(x)\|\, \|H(x)^{-1}\| = 1,$$
$$m(H) \le m_0,$$
is therefore a closed subset of $\mathbf H_{\rm per}$.

A metric $H$ belongs to $\mathbf H_g$ (resp. $\mathbf H_a$, resp. $\mathbf H_i$) if and only if it satisfies the first of these conditions for all $x, y \in \mathbb R^d$ (resp. the first two, resp. the first three). The sets $\mathbf H_i$, $\mathbf H_a$ and $\mathbf H_g$ are thus intersections of closed subsets of $\mathbf H_{\rm per}$, hence are closed. The set (6.21) is obtained by imposing in addition the fourth condition, and is therefore also closed. $\diamond$

We recall a classical result which states that the collection of 1-Lipschitz functions between two compact metric spaces is itself a compact metric space. It is an immediate consequence of the Ascoli theorem.

Theorem 6.2.3. If two metric spaces $(X, d_X)$ and $(Y, d_Y)$ are compact, then the collection ${\rm Lip}(X, Y)$ of 1-Lipschitz functions from $X$ to $Y$ is also compact when equipped with the distance
$$d(f, g) := \max_{x \in X} d_Y(f(x), g(x)).$$

We use this theorem to establish that closed and bounded subsets of $\mathbf H_{\rm per} \cap \mathbf H_g$ are compact.

Lemma 6.2.4. Any closed and bounded subset $\mathbf H_*$ of $\mathbf H_{\rm per} \cap \mathbf H_g$ is compact.
Proof: Since the collection $\mathbf H_*$ of metrics is bounded, the following quantity $M$ is finite:
$$M := \sup_{H \in \mathbf H_*}\ \sup_{z \in [0,1]^d} \|H(z)\|.$$
For any $x, y \in \mathbb R^d$ and any $H \in \mathbf H_*$ we thus have $d_H(x, y) \le M |x - y|$.

We consider the compact metric space $X := [0,1]^d$, equipped with the distance $d_X(x, y) := M |x - y|$, and the compact metric space
$$Y := \{M \in S_d^+ \,;\ {\rm Id} \le M \le C\, {\rm Id}\}$$
equipped with the distance $d_\times$. According to Theorem 6.2.3 the metric space ${\rm Lip}(X, Y)$ is thus compact.

Consider a sequence $(H_n)_{n \ge 0}$ of elements of $\mathbf H_*$. For all $n \ge 0$, the restriction $H_n|_{[0,1]^d}$ is an element of ${\rm Lip}(X, Y)$. Since this space is compact there exists a sub-sequence $H_{\varphi(n)}$ which converges uniformly on $[0,1]^d$. Recalling that $H_n$ is periodic for all $n \ge 0$, we obtain that $H_{\varphi(n)}$ converges uniformly to a periodic metric $H \in \mathbf H_{\rm per}$.

Since $\mathbf H_*$ is closed in $\mathbf H_{\rm per} \cap \mathbf H_g$, and since $\mathbf H_{\rm per} \cap \mathbf H_g$ is closed in $\mathbf H_{\rm per}$, we obtain that $\mathbf H_*$ is closed in $\mathbf H_{\rm per}$ and therefore $H \in \mathbf H_*$, which concludes the proof. $\diamond$

The next lemma establishes that the set (6.21) appearing in Theorem 6.2.1 is bounded. This set is closed according to Lemma 6.2.2, hence compact according to Lemma 6.2.4, which concludes the proof of Theorem 6.2.1.

Lemma 6.2.5. Let $m_0 \ge 1$ be a constant.
1. There exists $C = C(m_0, d)$ such that for any $H \in \mathbf H_g$ satisfying $H(z) \ge {\rm Id}$ for all $|z| \le 1$ and
$$\int_{|z| \le 1} \sqrt{\det H(z)}\, dz \le m_0,$$
we have $\|H(0)\| \le C$.
2. The set $\{H \in \mathbf H_{\rm per} \cap \mathbf H_g \,;\ m(H) \le m_0\}$ is bounded.

Proof: We first show how Point 2 can be obtained assuming Point 1, and for that purpose we consider $H \in \mathbf H_{\rm per} \cap \mathbf H_g$ satisfying $m(H) \le m_0$ and a point $z_* \in \mathbb R^d$. The ball $\{z \in \mathbb R^d \,;\ |z - z_*| \le 1\}$ is covered by the $2^d$ cubes $z_* + u + [0,1]^d$ where $u \in \{-1, 0\}^d$. Consequently
$$\int_{|z - z_*| \le 1} \sqrt{\det H(z)}\, dz \le \sum_{u \in \{-1,0\}^d} \int_{z_* + u + [0,1]^d} \sqrt{\det H} = 2^d m(H) \le 2^d m_0,$$
and we already know that $H(z) \ge {\rm Id}$ for all $z \in \mathbb R^d$.
It follows that $\|H(z_*)\| \le C'$ where $C'$ is the constant associated to $2^d m_0$ in Point 1, which concludes the proof of Point 2.

We now turn to the proof of Point 1. At each point $z \in \mathbb R^d$ we denote the eigenvalues of $H(z)^{\frac12}$ by
$$\lambda_1(z) \ge \cdots \ge \lambda_d(z).$$
For each $k \in \{1, \cdots, d\}$ the function $\lambda_k : \mathbb R^d \to \mathbb R_+^*$ is continuous, and $\lambda_k(z) \ge 1$ for all $|z| \le 1$. Our purpose is to bound $\lambda_1(0) = \sqrt{\|H(0)\|}$ in terms of $m_0$. We shall proceed as follows: we first give an upper bound for $\lambda_d(0)$ that depends only on $m_0$, and then for each $1 \le k \le d - 1$ we bound $\lambda_k(0)$ in terms of $m_0$ and $\lambda_{k+1}(0)$.

The upper bound on $\lambda_d(0)$. We define the ellipse
$$E := \{z \in \mathbb R^d \,;\ \|z\|_{H(0)} \le \lambda_d(0)\}$$
and we observe that $|z| \le 1$ for all $z \in E$. We thus obtain using Corollary 5.2.11 that
$$c_d \ln \lambda_d(0) \le \int_E \sqrt{\det H(z)}\, dz \le \int_{|z| \le 1} \sqrt{\det H(z)}\, dz \le m_0,$$
where $c_d > 0$ is a constant depending only on $d$. This gives as announced an upper bound on $\lambda_d(0)$ in terms of $m_0$.

The upper bound on $\lambda_k(0)$. The vector space $\mathbb R^d$ is the orthogonal sum $\mathbb R^d = U \oplus V$, where $U$ is the sum of the eigenspaces associated to the eigenvalues $\lambda_1(0), \cdots, \lambda_k(0)$ of $H(0)^{\frac12}$ and $V$ is associated to the other eigenvalues $\lambda_{k+1}(0), \cdots, \lambda_d(0)$. We have for all $u \in U$ and all $v \in V$,
$$\|u\|_{H(0)} \ge \lambda_k(0) |u| \quad \text{and} \quad \|v\|_{H(0)} \le \lambda_{k+1}(0) |v|.$$
We consider an isometry $P \in O_{d,k}$ (in the sense that $P^{\rm T} P$ is the $k \times k$ identity matrix) satisfying $P(\mathbb R^k) = U$. For each $v \in V$ we define a metric $H_v$ on $\mathbb R^k$ as follows: for all $u \in \mathbb R^k$
$$H_v(u) := P^{\rm T} H(v + P u) P.$$
It follows from Proposition 5.2.2 (in the previous chapter) that $H_v$ belongs to the collection $\mathbf H_g(\mathbb R^k)$ of graded metrics on $\mathbb R^k$. We define
$$B_U := \{u \in \mathbb R^k \,;\ |u| \le 1/2\} \quad \text{and} \quad B_V := \{v \in V \,;\ \lambda_{k+1}(0) |v| \le 1/2\},$$
and we observe that $|P u + v| \le 1$ for all $(u, v) \in B_U \times B_V$.
(6.22)

Recalling that $\lambda_i(z) \ge 1$ for all $1 \le i \le d$ and $|z| \le 1$, we obtain for all $(u, v) \in B_U \times B_V$
$$\sqrt{\det H_v(u)} \le \lambda_1(Pu + v) \cdots \lambda_k(Pu + v) \le \lambda_1(Pu + v) \cdots \lambda_d(Pu + v) = \sqrt{\det H(Pu + v)}. \tag{6.23}$$
For any $u \in \mathbb R^k$ and any $v \in B_V$ we have according to (5.53)
$$\|u\|_{H_v(0)} = \|P u\|_{H(v)} \ge (1 - \|v\|_{H(0)})\, \|P u\|_{H(0)} \ge \lambda_k(0) |u| / 2, \tag{6.24}$$
and therefore
$$\{u \in \mathbb R^k \,;\ \|u\|_{H_v(0)} \le \lambda_k(0)/4\} \subset B_U.$$
Denoting by $c_k$ the constant from Corollary 5.2.11, applied in dimension $k$, we therefore obtain for any $v \in B_V$
$$\int_{B_U} \sqrt{\det H_v(u)}\, du \ge c_k \ln(\lambda_k(0)/4). \tag{6.25}$$
We thus obtain successively
$$\begin{aligned}
\int_{|z| \le 1} \sqrt{\det H(z)}\, dz &\ge \int_{v \in B_V} \int_{u \in B_U} \sqrt{\det H(Pu + v)}\, du\, dv \\
&\ge \int_{v \in B_V} \int_{u \in B_U} \sqrt{\det H_v(u)}\, du\, dv \\
&\ge \int_{v \in B_V} c_k \ln(\lambda_k(0)/4)\, dv \\
&= |B_V|\, c_k \ln(\lambda_k(0)/4) = \frac{\omega_{d-k}}{(2 \lambda_{k+1}(0))^{d-k}}\, c_k \ln(\lambda_k(0)/4),
\end{aligned}$$
where we used (6.22) and the Fubini integration formula in the first line, (6.23) in the second line, and (6.25) in the third line. In the last line we denote by $\omega_{d-k}$ the volume of the unit euclidean ball in $\mathbb R^{d-k}$.

Since the left hand side is bounded by $m_0$, this inequality yields an upper bound on $\lambda_k(0)$ in terms of $m_0$ and $\lambda_{k+1}(0)$:
$$\lambda_k(0) \le 4 \exp\left(\frac{m_0\, (2 \lambda_{k+1}(0))^{d-k}}{c_k\, \omega_{d-k}}\right),$$
which concludes the proof. $\diamond$

6.3. Separation of two domains

We study in this section a geometrical problem: the separation of two sets using a mesh of minimal cardinality. Our first result, Proposition 6.3.1, establishes that this problem has an equivalent formulation in terms of metrics.

Proposition 6.3.1.
We assume that the collection of meshes $\mathbf T_{\rm per} \cap \mathbf T_{\star,C_0}$ and the collection of metrics $\mathbf H_{\rm per} \cap \mathbf H_\star$ are equivalent in dimension $d$, where $\star \in \{i, a, g\}$ is a symbol and $C_0 \ge 1$ is a constant (according to Theorem 6.1.2, this condition holds at least if $d = 2$ and $C_0$ is sufficiently large).

There exists a constant $C \ge 1$ such that the following holds. Let $X, Y \subset \mathbb R^d$ be two closed, disjoint and periodic sets, in the sense that
$$x + u \in X \ \text{ and } \ y + u \in Y, \quad \text{for any } x \in X,\ y \in Y \text{ and } u \in \mathbb Z^d.$$
Consider the two optimization problems
$$m_{\star,C_0}(X, Y) := \min\{\#(\mathcal T) \,;\ \mathcal T \in \mathbf T_{\rm per} \cap \mathbf T_{\star,C_0} \text{ and } \mathcal T \text{ separates } X \text{ and } Y\},$$
$$m_\star(X, Y) := \min\{m(H) \,;\ H \in \mathbf H_{\rm per} \cap \mathbf H_\star \text{ and } H \text{ separates } X \text{ and } Y\}.$$
Then
$$C^{-1}\, m_\star(X, Y) \le m_{\star,C_0}(X, Y) \le C\, m_\star(X, Y). \tag{6.26}$$

Proof: We denote by $C_1 \ge 1$ a constant such that for any mesh $\mathcal T \in \mathbf T_{\rm per} \cap \mathbf T_{\star,C_0}$ there exists a metric $H \in \mathbf H_{\rm per} \cap \mathbf H_\star$ which is $C_1$-equivalent to $\mathcal T$, and vice versa. The proof of this proposition is a simple translation between the vocabulary of meshes and the vocabulary of metrics. We first establish that $m_\star(X, Y) \le C\, m_{\star,C_0}(X, Y)$, and then that $m_{\star,C_0}(X, Y) \le C\, m_\star(X, Y)$.

Let $\mathcal T \in \mathbf T_{\rm per} \cap \mathbf T_{\star,C_0}$ be a mesh which separates the sets $X$ and $Y$, and let $H \in \mathbf H_{\rm per} \cap \mathbf H_\star$ be a metric which is $C_1$-equivalent to $\mathcal T$. We thus have for any $T \in \mathcal T$ and any $z \in T$
$$C_1^{-2} H(z) \le H_T \le C_1^2 H(z).$$
Let $x \in X$ and let $y \in Y$. Let $T \in \mathcal T$ be the simplex containing $x$ and let $V_{\mathcal T}(T)$ be the neighborhood of $T$ in $\mathcal T$. Since $x \in T$ and $y \notin V_{\mathcal T}(T)$ we have $\|x - y\|_{H_T} \ge c := (C_0 \sqrt d)^{-1}$ according to Proposition 5.4.5 (in the previous chapter). It follows that
$$d_H(x, y) \ge \ln(1 + \|x - y\|_{H(x)}) \ge \ln(1 + \|x - y\|_{H_T}/C_1) \ge r := \ln(1 + c\, C_1^{-1}).$$
The metric $\tilde H := r^{-2} H$ thus belongs to $\mathbf H_{\rm per} \cap \mathbf H_\star$ since $0 < r \le 1$, see Remark 5.1.13. Furthermore this metric separates the sets $X$ and $Y$ and satisfies according to Proposition 6.1.1
$$m(\tilde H) \le (C_1/r)^d\, |T_{\rm eq}|\, \#(\mathcal T).$$
Taking the infimum over all meshes $\mathcal T \in \mathbf T_{\star,C_0}$ which separate $X$ and $Y$ we obtain the left part of (6.26): $m_\star(X, Y) \le |T_{\rm eq}|\, (C_1/r)^d\, m_{\star,C_0}(X, Y)$.

We now turn to the proof of the right part of (6.26), and for that purpose we consider a Riemannian metric $H \in \mathbf H_\star \cap \mathbf H_{\rm per}$ which separates the sets $X$ and $Y$. We consider a parameter $\lambda \ge 1$, whose value will be specified later, and we observe that $\tilde H := \lambda^2 H \in \mathbf H_\star \cap \mathbf H_{\rm per}$. Therefore there exists a mesh $\mathcal T \in \mathbf T_{\star,C_0} \cap \mathbf T_{\rm per}$ such that for all $T \in \mathcal T$ and all $z \in T$ one has
$$C_1^{-2} \tilde H(z) \le H_T \le C_1^2 \tilde H(z).$$
Let us assume that $\mathcal T$ does not separate the sets $X$ and $Y$, hence that $V_{\mathcal T}(X) \cap V_{\mathcal T}(Y) \neq \emptyset$. In that case there exist two simplices $T, T' \in \mathcal T$ sharing a vertex $v$, and two points $x \in T \cap X$ and $y \in T' \cap Y$. For any simplex $T$ and any $z, z' \in T$ we have $\|z - z'\|_{H_T} \le 2$, hence
$$\lambda \le d_{\tilde H}(x, y) \le d_{\tilde H}(x, v) + d_{\tilde H}(v, y) \le C_1 \left(\|x - v\|_{H_T} + \|v - y\|_{H_{T'}}\right) \le 4 C_1.$$
We now choose the particular value $\lambda := 4 C_1 + 1$, which contradicts the previous equation and shows that $\mathcal T$ does separate $X$ and $Y$. Furthermore
$$|T_{\rm eq}|\, \#(\mathcal T) \le C_1^d\, m(\tilde H) = (\lambda C_1)^d\, m(H).$$
Taking the infimum among all metrics $H \in \mathbf H_\star$ which separate $X$ and $Y$ we obtain $|T_{\rm eq}|\, m_{\star,C_0}(X, Y) \le (\lambda C_1)^d\, m_\star(X, Y)$, which concludes the proof of this proposition. $\diamond$

As a concrete example, we define $r_0 := 1/4$ and for all $0 < \delta \le r_0$ we consider in the following the sets
$$X := \{x \in \mathbb R^d \,;\ d(x, \mathbb Z^d) \le r_0\} \quad \text{and} \quad Y_\delta := \{y \in \mathbb R^d \,;\ d(y, \mathbb Z^d) \ge r_0 + \delta\}, \tag{6.27}$$
which are illustrated on Figure 6.1.
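The definition (6.27) is easy to test pointwise, since the nearest point of $\mathbb Z^d$ is obtained by rounding each coordinate. The following minimal sketch classifies a point as lying in $X$, in $Y_\delta$, or in the gap of width $\delta$ between them; the function names and the string labels are hypothetical and introduced only for this illustration.

```python
import math

R0 = 0.25  # the parameter r_0 = 1/4 fixed in the text

def dist_to_lattice(x):
    """Euclidean distance from the point x (a sequence of floats) to Z^d.
    The nearest lattice point is obtained by rounding each coordinate."""
    return math.sqrt(sum((xi - round(xi)) ** 2 for xi in x))

def region(x, delta):
    """Return 'X', 'Y' or 'gap' according to definition (6.27)."""
    if not 0.0 < delta <= R0:
        raise ValueError("delta must satisfy 0 < delta <= r_0")
    dist = dist_to_lattice(x)
    if dist <= R0:
        return "X"
    if dist >= R0 + delta:
        return "Y"
    return "gap"

if __name__ == "__main__":
    for point in [(0.1, 0.2), (0.3, 0.0), (0.5, 0.5), (1.1, -0.8)]:
        print(point, region(point, delta=0.1))
```

The last sample point is a $\mathbb Z^2$-translate of the first one, so both receive the same label, which reflects the periodicity of the two sets.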
The next theorem gives a sharp estimate of the minimal mass of a metric which separates the sets $X$ and $Y_\delta$, as $\delta \to 0$. The notation $\alpha_\delta \simeq \beta_\delta$ stands for the following: there exists a constant $C \ge 1$ such that for all $0 < \delta \le r_0$
$$C^{-1} \alpha_\delta \le \beta_\delta \le C \alpha_\delta.$$

Theorem 6.3.2. We have the equivalences
$$m_i(X, Y_\delta) \simeq \delta^{-(d-1)}, \qquad m_a(X, Y_\delta) \simeq \delta^{-\frac{d-1}{2}}\, |\ln \delta|, \qquad m_g(X, Y_\delta) \simeq \delta^{-\frac{d-1}{2}}.$$
Furthermore for any $\star \in \{i, a, g\}$ and any $0 < \delta \le r_0$ the explicit metric $H^\star_\delta \in \mathbf H_\star \cap \mathbf H_{\rm per}$ defined below is a near minimizer of the problem $m_\star(X, Y_\delta)$. In other words $H^\star_\delta$ separates $X$ and $Y_\delta$ and satisfies $m(H^\star_\delta) \simeq m_\star(X, Y_\delta)$.

The proof is divided in two parts: we first give the explicit expression of the metrics $H^\star_\delta$ and we use it to obtain an upper bound on $m_\star(X, Y_\delta)$. We then prove a lower bound on $m_\star(X, Y_\delta)$.

We define in this section the metrics $H^\star_\delta$, where $\star \in \{i, a, g\}$ and $0 < \delta \le r_0 := 1/4$. We prove in Lemma 6.3.3 that $H^\star_\delta \in \mathbf H_\star$, in Lemma 6.3.4 that $H^\star_\delta$ separates $X$ and $Y_\delta$, and we prove in Lemma 6.3.5 an upper estimate on the masses $m(H^\star_\delta)$ of these metrics.

In order to define $H^\star_\delta$ we need to introduce some notations. For any $z \in \mathbb R^d \setminus \{0\}$ we define
$$s(z) := \min\{r_0, \big||z| - r_0\big|\} \quad \text{and} \quad \theta(z) := \frac{z}{|z|}. \tag{6.28}$$
For any $0 < \delta \le r_0$ and any $z \in \mathbb R^d \setminus \{0\}$ we define three matrices $S^\star_\delta(z) \in S_d^+$, where $\star \in \{i, a, g\}$, as follows:
$$\begin{aligned}
S^i_\delta(z) &:= \max\{s(z), \delta\}\, {\rm Id}, \\
S^a_\delta(z) &:= \max\{s(z), \delta\}\, \theta(z)\theta(z)^{\rm T} + \max\{s(z), \sqrt\delta\}\, ({\rm Id} - \theta(z)\theta(z)^{\rm T}), \\
S^g_\delta(z) &:= \max\{s(z), \delta\}\, \theta(z)\theta(z)^{\rm T} + \sqrt{r_0 \max\{s(z), \delta\}}\, ({\rm Id} - \theta(z)\theta(z)^{\rm T}).
\end{aligned} \tag{6.29}$$
We also define $S^\star_\delta(0) := r_0\, {\rm Id}$ for any $\star \in \{i, a, g\}$.
Finally we define the metrics $H^\star_\delta$ by the equality
$$2\, H^\star_\delta(z + u)^{-\frac12} := S^\star_\delta(z) \tag{6.30}$$
for all $z \in \left[-\frac12, \frac12\right]^d$ and all $u \in \mathbb Z^d$. Observe that
$$H^\star_\delta(z) = 8^2\, {\rm Id} \quad \text{for all } z \in [-1/2, 1/2]^d \text{ such that } |z| \ge 2 r_0 = 1/2, \tag{6.31}$$
since $S^\star_\delta(z) = r_0\, {\rm Id}$ on this set. Figure 8 in the main introduction of this thesis illustrates some (finite) triangulations which are respectively equivalent to the metrics $H^\star_\delta$, $\star \in \{i, a, g\}$.

Lemma 6.3.3. For all $0 < \delta \le r_0$ and all $\star \in \{i, a, g\}$ one has $H^\star_\delta \in \mathbf H_{\rm per} \cap \mathbf H_\star$.

Proof: We first remark that $s = r_0$ on the boundary $\partial\left(\left[-\frac12, \frac12\right]^d\right)$, and therefore $H^\star_\delta = 8^2\, {\rm Id}$ on this set for any $\star \in \{i, a, g\}$. Using (6.30) we thus obtain that $H^\star_\delta$ is continuous and $\mathbb Z^d$-periodic.

We denote
$$s_{\rm per}(z) := \min\{r_0, |d(z, \mathbb Z^d) - r_0|\}$$
and we observe that $s_{\rm per}$ is a 1-Lipschitz function. Since $H^i_\delta = 4\, {\rm Id} / \max\{s_{\rm per}, \delta\}^2$, we obtain that $H^i_\delta \in \mathbf H_i$ (because $z \mapsto \max\{s_{\rm per}(z), \delta\}/2$ is 1-Lipschitz), which concludes the proof in the case $\star = i$.

We now consider the case $\star = g$, and for that purpose we use the notations of the previous chapter. We show that the local dilatation ${\rm dil}_z(H)_\times$ defined in (5.58) is smaller than 1 for all $z$ in the set
$$\Omega := \{z \in \mathbb R^d \,;\ 0 < |z| < r_0 - \delta\}.$$
We define the functions
$$\lambda := s/2 \quad \text{and} \quad \mu := \sqrt s / 4,$$
and we observe that $H^g_\delta = \lambda^{-2}\, \theta\theta^{\rm T} + \mu^{-2}\, ({\rm Id} - \theta\theta^{\rm T})$ on $\Omega$. The functions $\lambda$ and $\mu$ are $C^\infty$ on $\Omega$ and we have
$$\nabla\lambda = -\frac{\theta}{2}, \qquad \nabla\mu = -\frac{\theta}{8\sqrt s}.$$
Therefore for any $z \in \Omega$
$${\rm dil}_z(\ln\lambda; d_H) = \frac{{\rm dil}_z(\lambda; d_H)}{\lambda(z)} = \frac{\|\nabla\lambda(z)\|_{H(z)^{-1}}}{\lambda(z)} = \frac{\|-\theta(z)/2\|_{H(z)^{-1}}}{\lambda(z)} = 1/2,$$
and
$${\rm dil}_z(\ln\mu; d_H) = \frac{{\rm dil}_z(\mu; d_H)}{\mu(z)} = \frac{\|\nabla\mu(z)\|_{H(z)^{-1}}}{\mu(z)} = \frac{\lambda(z)}{8\sqrt{s(z)}\, \mu(z)} = 1/4.$$
On the other hand the derivative of θ at a point z ∈ R d \ { } in the direction u ∈ R d isthe component of u orthogonal to θ and divided by | z | : ∂θ∂u ( z ) = u − (cid:104) u, θ ( z ) (cid:105) θ ( z ) | z | , therefore dil z ( θ ; d H ) = µ ( z ) | z | . On the other hand for z ∈ Ω (cid:12)(cid:12)(cid:12)(cid:12) λ ( z ) µ ( z ) − µ ( z ) λ ( z ) (cid:12)(cid:12)(cid:12)(cid:12) = (cid:114) r s ( z ) − (cid:115) s ( z ) r ≤ r − s ( z ) = | z | , where we used the fact that √ t − / √ t ≤ t for all t ≥ (cid:12)(cid:12)(cid:12)(cid:12) λ ( z ) µ ( z ) − µ ( z ) λ ( z ) (cid:12)(cid:12)(cid:12)(cid:12) dil z ( θ ; d H ) ≤ µ ( z )2 ≤ / . It follows that the quantity D z ( a ) × defined in (5.59) is smaller than 1 / 2, and thereforethat dil z ( H gδ ) × ≤ z ( H gδ ) × ≤ { z ∈ R d ; r + δ < | z | < r } . Furthermore H gδ isconstant on the annulus { z ∈ R d ; r − δ < | z | < r + δ } and on the domain { z ∈ ( − / , / d ; | z | > r } . DefiningΓ := { z ∈ R d ; | z | ∈ { , r − δ, r + δ, r }} ∪ ∂ ([ − / , / d ) , and Γ := { z + u ; z ∈ Γ , u ∈ ZZ d } we have thus established that dil z ( H gδ ) × ≤ R d \ Γ.Corollary 5.2.6 thus implies that H gδ ∈ H g .The proof that H aδ ∈ H a is extremely similar to the proof that H gδ ∈ H g : the quantities D z ( a ) × and D z ( a ) + defined in (5.59) and (5.60) can be explicitely computed on R d \ Γ andare smaller that 1 / 2. Theorem 5.3.1 thus implies that the local dilatations dil z ( H aδ ) × anddil z ( H aδ ) + are bounded by 1 on R d \ Γ, and it follows from Corollary 5.2.6 that H aδ ∈ H a .The details of the proof are left to the reader. (cid:5) The next lemma establishes the that the metrics H (cid:63)δ do separate the sets of interest. .3. Separation of two domains Lemma 6.3.4. For all < δ ≤ r and all (cid:63) ∈ { i, a, g } the metric H (cid:63)δ separates the sets X and Y δ . Proof: Let γ ∈ C ([0 , , R d ) be such that γ (0) ∈ X and γ (1) ∈ Y δ . 
Since the sets $X$, $Y_\delta$ are $\mathbb Z^d$-periodic, as well as the metrics $H^\star_\delta$, we may assume that $\gamma(0) \in \left[-\frac12, \frac12\right]^d$. Recalling the definition (6.27) of $X$ and $Y_\delta$ we obtain
$$|\gamma(0)| \le r_0 \quad \text{and} \quad |\gamma(1)| \ge r_0 + \delta.$$
We define $t_0 := \max\{t \in [0,1] \,;\ \gamma(t) \in X\}$ and $t_1 := \min\{t \in [t_0, 1] \,;\ \gamma(t) \in Y_\delta\}$. For any $t \in [t_0, t_1]$ we have
$$|\gamma(t_0)| = r_0 \le |\gamma(t)| \le r_0 + \delta = |\gamma(t_1)|,$$
and therefore
$$\|\gamma'(t)\|_{H^\star_\delta(\gamma(t))} \ge \frac{2\, |\langle \gamma'(t), \theta(\gamma(t)) \rangle|}{\delta} = \frac{2\, \big||\gamma|'(t)\big|}{\delta},$$
where $|\gamma|$ stands for the function $t \mapsto |\gamma(t)|$. Hence
$$l_{H^\star_\delta}(\gamma) \ge \int_{t_0}^{t_1} \|\gamma'(t)\|_{H^\star_\delta(\gamma(t))}\, dt \ge 2 \int_{t_0}^{t_1} \frac{|\gamma|'(t)\, dt}{\delta} = 2\, \frac{|\gamma(t_1)| - |\gamma(t_0)|}{\delta} = 2.$$
It follows that $d_{H^\star_\delta}(X, Y_\delta) \ge 2$, and in particular that $H^\star_\delta$ separates $X$ and $Y_\delta$. $\diamond$

The next lemma estimates the masses $m(H^\star_\delta)$ of these metrics, which yields an upper estimate on the minimal mass $m_\star(X, Y_\delta)$ required to separate the sets $X$ and $Y_\delta$ using a metric in $\mathbf H_{\rm per} \cap \mathbf H_\star$.

Lemma 6.3.5. There exists a constant $C = C(d) > 0$ such that for all $0 < \delta \le r_0$
$$m(H^i_\delta) \le C \delta^{-(d-1)}, \tag{6.32}$$
$$m(H^a_\delta) \le C \delta^{-\frac{d-1}{2}}\, |\ln \delta|, \tag{6.33}$$
$$m(H^g_\delta) \le C \delta^{-\frac{d-1}{2}}. \tag{6.34}$$

Proof: Recalling the definition (6.30) of $H^\star_\delta$ in terms of $S^\star_\delta$ we obtain
$$m(H^\star_\delta) = 8^d c_0 + 2^d \int_{|z| \le 2 r_0} \frac{dz}{\det S^\star_\delta(z)},$$
where $c_0$ stands for the volume of the following set, on which $H^\star_\delta = 8^2\, {\rm Id}$ according to (6.31):
$$c_0 := \left|\left\{z \in \left[-\tfrac12, \tfrac12\right]^d \,;\ |z| \ge 2 r_0\right\}\right|.$$
Furthermore
$$\int_{|z| \le 2 r_0} \frac{dz}{\det S^\star_\delta(z)} = c_1 \int_{r=0}^{2 r_0} \frac{r^{d-1}\, dr}{s^\star_\delta(r)},$$
where $c_1$ stands for the $(d-1)$-dimensional measure of the sphere $\{z \in \mathbb R^d \,;\ |z| = 1\}$, and where
$$s^\star_\delta(r) := \det S^\star_\delta(z)$$
for any $z$ such that $|z| = r$.
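The rates announced in Lemma 6.3.5 can be observed numerically on the radial integral $\int_0^{2r_0} dr / s^\star_\delta(r)$, which controls the mass up to constants. The sketch below evaluates this integral with a plain midpoint rule and divides it by the corresponding rate; it is an illustration only, with hypothetical function names, a quadrature chosen for simplicity, and the determinants $s^\star_\delta(r)$ read off from (6.29) using $s(z) = |r - r_0|$ for $|z| = r \le 2 r_0$.

```python
import math

R0 = 0.25  # the parameter r_0 = 1/4

def s_radial(kind, r, delta, d):
    """det S^kind_delta(z) for |z| = r in [0, 2 r_0], following (6.29)."""
    dist = abs(r - R0)              # equals s(z) when 0 <= |z| <= 2 r_0
    radial = max(dist, delta)       # eigenvalue in the radial direction theta
    if kind == "i":
        return radial ** d
    if kind == "a":
        return radial * max(dist, math.sqrt(delta)) ** (d - 1)
    if kind == "g":
        return radial * math.sqrt(R0 * radial) ** (d - 1)
    raise ValueError(kind)

def radial_integral(kind, delta, d, n=100_000):
    """Midpoint-rule approximation of the integral of 1/s over [0, 2 r_0]."""
    h = 2.0 * R0 / n
    return h * sum(1.0 / s_radial(kind, (k + 0.5) * h, delta, d) for k in range(n))

def normalized(kind, delta, d):
    """Radial integral divided by the rate announced in Lemma 6.3.5."""
    half = (d - 1) / 2.0
    rate = {
        "i": delta ** -(d - 1),
        "a": delta ** -half * abs(math.log(delta)),
        "g": delta ** -half,
    }[kind]
    return radial_integral(kind, delta, d) / rate

if __name__ == "__main__":
    for delta in (1e-2, 1e-3):
        print(delta, [round(normalized(k, delta, 2), 3) for k in "iag"])
```

In dimension $d = 2$ the three normalized quantities remain bounded as $\delta$ decreases, which is consistent with the estimates (6.32), (6.33) and (6.34).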
Therefore there exists a constant $c > 0$, independent of $\delta$, $0 < \delta \le r_0$, and of $\star \in \{i, a, g\}$, and such that
$$m(H^\star_\delta) \le c \int_{r=0}^{2 r_0} \frac{dr}{s^\star_\delta(r)}.$$
Recalling the definition (6.29) of $S^i_\delta(z)$ we obtain $s^i_\delta(r) = \max\{\delta, |r - r_0|\}^d$ for all $0 \le r \le 2 r_0$, hence
$$\begin{aligned}
\int_{r=0}^{2 r_0} \frac{dr}{s^i_\delta(r)} &= \int_0^{r_0 - \delta} \frac{dr}{(r_0 - r)^d} + \int_{r_0 - \delta}^{r_0 + \delta} \frac{dr}{\delta^d} + \int_{r_0 + \delta}^{2 r_0} \frac{dr}{(r - r_0)^d} \\
&= 2 \delta^{-(d-1)} + 2 \left(\frac{\delta^{-(d-1)}}{d-1} - \frac{r_0^{-(d-1)}}{d-1}\right) \le 4 \delta^{-(d-1)},
\end{aligned}$$
which establishes (6.32). Similarly, we find $s^g_\delta(r) = r_0^{\frac{d-1}{2}} \max\{\delta, |r - r_0|\}^{\frac{d+1}{2}}$, hence
$$\begin{aligned}
r_0^{\frac{d-1}{2}} \int_{r=0}^{2 r_0} \frac{dr}{s^g_\delta(r)} &= \int_0^{r_0 - \delta} \frac{dr}{(r_0 - r)^{\frac{d+1}{2}}} + \int_{r_0 - \delta}^{r_0 + \delta} \frac{dr}{\delta^{\frac{d+1}{2}}} + \int_{r_0 + \delta}^{2 r_0} \frac{dr}{(r - r_0)^{\frac{d+1}{2}}} \\
&= 2 \delta^{-\frac{d-1}{2}} + 2 \left(\frac{\delta^{-\frac{d-1}{2}}}{\frac{d-1}{2}} - \frac{r_0^{-\frac{d-1}{2}}}{\frac{d-1}{2}}\right) \le 6 \delta^{-\frac{d-1}{2}},
\end{aligned}$$
which establishes (6.34). Finally, we have $s^a_\delta(r) = \max\{\delta, |r - r_0|\}\, \max\{\sqrt\delta, |r - r_0|\}^{d-1}$, and therefore
$$\int_{r=0}^{2 r_0} \frac{dr}{s^a_\delta(r)} = \int_{I_\delta} \frac{dr}{|r - r_0|^d} + \int_{J_\delta} \frac{dr}{|r - r_0|\, \delta^{\frac{d-1}{2}}} + \int_{r_0 - \delta}^{r_0 + \delta} \frac{dr}{\delta^{\frac{d+1}{2}}}, \tag{6.35}$$
where $I_\delta = [0, r_0 - \sqrt\delta] \cup [r_0 + \sqrt\delta, 2 r_0]$ and $J_\delta = [r_0 - \sqrt\delta, r_0 - \delta] \cup [r_0 + \delta, r_0 + \sqrt\delta]$. Therefore
$$\begin{aligned}
\int_{r=0}^{2 r_0} \frac{dr}{s^a_\delta(r)} &= \frac{2}{d-1} \left((\sqrt\delta)^{-(d-1)} - r_0^{-(d-1)}\right) + 2 \delta^{-\frac{d-1}{2}} \left(\ln\sqrt\delta - \ln\delta\right) + 2 \delta^{-\frac{d-1}{2}} \\
&\le \delta^{-\frac{d-1}{2}}\, |\ln\delta| + 4 \delta^{-\frac{d-1}{2}},
\end{aligned}$$
which establishes (6.33). $\diamond$

We prove in this subsection the lower bound announced in Theorem 6.3.2.

Lemma 6.3.6. There exists a constant $c = c(d) > 0$ such that for all $0 < \delta \le r_0$,
$$m_i(X, Y_\delta) \ge c\, \delta^{-(d-1)}, \tag{6.36}$$
$$m_a(X, Y_\delta) \ge c\, \delta^{-\frac{d-1}{2}}\, |\ln\delta|, \tag{6.37}$$
$$m_g(X, Y_\delta) \ge c\, \delta^{-\frac{d-1}{2}}. \tag{6.38}$$

Proof: We consider a metric $H \in \mathbf H_\star$, where $\star \in \{i, a, g\}$, which separates the sets $X$ and $Y_\delta$. In particular $H \in \mathbf H_g$. We denote by $S := \{v \in \mathbb R^d \,;\ |v| = 1\}$ the euclidean unit sphere of $\mathbb R^d$, and by $\omega$ the $(d-1)$-dimensional measure of $S$.
We define for all $v \in S$
$$n(v) := \|v\|_{H(r_0 v)} \quad \text{and} \quad m(v) := \inf_{|w| = 1} \|w\|_{H(r_0 v)} = \|H(r_0 v)^{-1}\|^{-\frac12}.$$
Proposition 5.2.10 (in the previous chapter) states that for any $z, u \in \mathbb R^d$ we have
$$d_H(z, z + u) \le -\ln(1 - \|u\|_{H(z)}), \tag{6.39}$$
where the right hand side equals $\infty$ by convention if $\|u\|_{H(z)} \ge 1$. We consider $v \in S$ and we observe that $r_0 v \in X$ and $(r_0 + \delta) v \in Y_\delta$. Therefore
$$1 \le d_H(r_0 v, (r_0 + \delta) v) \le -\ln\left(1 - \delta \|v\|_{H(r_0 v)}\right),$$
hence $\lambda \delta\, n(v) \ge 1$ where $\lambda := (1 - e^{-1})^{-1}$.

Let $v, w \in S$, and let $\sigma \in \{-1, 1\}$ be such that $\sigma \langle v, w \rangle \ge 0$. Remarking that $2 r_0 + \delta \le 3/4 \le 1$ we obtain
$$\left|r_0 v + \sigma \sqrt\delta\, w\right|^2 = r_0^2 + \delta + 2 r_0 \sqrt\delta\, \sigma \langle v, w \rangle \ge r_0^2 + \delta \ge r_0^2 + (2 r_0 + \delta)\delta = (r_0 + \delta)^2.$$
Hence $r_0 v \in X$ and $r_0 v + \sigma \sqrt\delta\, w \in Y_\delta$. Therefore
$$1 \le d_H(r_0 v, r_0 v + \sigma \sqrt\delta\, w) \le -\ln\left(1 - \sqrt\delta\, \|w\|_{H(r_0 v)}\right),$$
which implies $\lambda \sqrt\delta\, m(v) \ge 1$. We now distinguish between the three cases $\star = i$, $a$ or $g$.

– If $\star = i$, then $H(z) = {\rm Id}/s(z)^2$ where $s$ is a 1-Lipschitz function, and $s(r_0 v) = n(v)^{-1} \le \lambda\delta$. It follows that
$$m(H) \ge \int_S \int_0^{2 r_0} \frac{r^{d-1}\, dr\, dv}{s(r v)^d} \ge \int_S \int_{r_0}^{r_0 + \delta} \frac{r^{d-1}\, dr\, dv}{(n(v)^{-1} + \delta)^d} \ge \frac{r_0^{d-1}\, \delta\, \omega}{(\lambda\delta + \delta)^d} = \frac{r_0^{d-1}\, \omega}{(\lambda + 1)^d}\, \delta^{-(d-1)},$$
where $\omega$ denotes the $(d-1)$-dimensional measure of the sphere $S$. Taking the infimum among all metrics $H \in \mathbf H_i$ which separate the sets $X$ and $Y_\delta$, we obtain as announced
$$m_i(X, Y_\delta) \ge c\, \delta^{-(d-1)} \quad \text{where } c = \frac{r_0^{d-1}\, \omega}{(\lambda + 1)^d}.$$

– If $\star = a$, then the functions $z \mapsto \|H(z)\|^{-\frac12}$ as well as $z \mapsto \|H(z)^{-1}\|^{\frac12}$ are 1-Lipschitz. Hence for any $v \in S$ and any $\mu \in \mathbb R$ we have
$$\|H((r_0 + \mu) v)\|^{-\frac12} \le n(v)^{-1} + |\mu| \le \lambda\delta + |\mu|,$$
$$\|H((r_0 + \mu) v)^{-1}\|^{\frac12} \le m(v)^{-1} + |\mu| \le \lambda\sqrt\delta + |\mu|.$$
Recalling that $\det M \ge \|M\|\, \|M^{-1}\|^{-(d-1)}$ for any $M \in S_d^+$ we obtain
$$(\lambda\delta + |\mu|)\, (\lambda\sqrt\delta + |\mu|)^{d-1}\, \sqrt{\det H((r_0 + \mu) v)} \ge 1.$$
It follows that
$$\begin{aligned}
m(H) &\ge \omega \int_{-r_0}^{r_0} \frac{(r_0 + \mu)^{d-1}\, d\mu}{(\lambda\delta + |\mu|)\, (\lambda\sqrt\delta + |\mu|)^{d-1}} \\
&\ge \omega \int_0^{\sqrt\delta} \frac{r_0^{d-1}\, d\mu}{(\lambda\delta + \mu)\, (\lambda\sqrt\delta + \sqrt\delta)^{d-1}} \\
&\ge \frac{\omega\, r_0^{d-1}}{((\lambda + 1)\sqrt\delta)^{d-1}}\, \left(\ln\sqrt\delta - \ln(\lambda\delta)\right).
\end{aligned}$$
Hence
$$m_a(X, Y_\delta) \ge c\, \delta^{-\frac{d-1}{2}}\, (-\ln\delta - 2\ln\lambda) \quad \text{where } c = \frac{\omega\, r_0^{d-1}}{2 (\lambda + 1)^{d-1}},$$
which concludes the proof of (6.37).

– We now consider the case $\star = g$. We have for any $v \in S$
$$\sqrt{\det H(r_0 v)} \ge n(v)\, m(v)^{d-1} \ge \frac{n(v)}{(\lambda\sqrt\delta)^{d-1}}.$$
Using (5.54) (in the previous chapter) we therefore obtain for any $\mu \in \mathbb R$
$$\sqrt{\det H((r_0 + \mu) v)} \ge (1 - |\mu|\, n(v))^d\, \sqrt{\det H(r_0 v)} \ge (1 - |\mu|\, n(v))^d\, \frac{n(v)}{(\lambda\sqrt\delta)^{d-1}},$$
hence
$$\begin{aligned}
m(H) &\ge \int_S \int_{-r_0}^{r_0} (r_0 + \mu)^{d-1}\, (1 - |\mu|\, n(v))^d\, \frac{n(v)}{(\lambda\sqrt\delta)^{d-1}}\, d\mu\, dv \\
&\ge \int_S \int_0^{\frac{1}{2 n(v)}} r_0^{d-1}\, (1/2)^d\, \frac{n(v)}{(\lambda\sqrt\delta)^{d-1}}\, d\mu\, dv = \frac{\omega\, r_0^{d-1}}{2^{d+1} (\lambda\sqrt\delta)^{d-1}}.
\end{aligned}$$
Hence
$$m_g(X, Y_\delta) \ge c\, \delta^{-\frac{d-1}{2}} \quad \text{where } c = \frac{\omega\, r_0^{d-1}}{2^{d+1} \lambda^{d-1}},$$
which concludes the proof of (6.38) and of this lemma. $\diamond$

Remark 6.3.7 (Constrained triangulations). A variation of the problem considered in this section is often considered in the literature. We say that a mesh $\mathcal T \in \mathbf T$ weakly separates two regions $X, Y \subset \mathbb R^d$ if
$${\rm int}(V_{\mathcal T}(X)) \cap {\rm int}(V_{\mathcal T}(Y)) = \emptyset,$$
where ${\rm int}(E)$ refers to the collection of interior points of $E$.

A common approach for weakly separating two closed and disjoint sets $X, Y \subset \mathbb R^d$ is based on constrained mesh generation. The first step is to build a polygonal hypersurface $\Gamma$ such that for any path $\gamma \in C^0([0,1], \mathbb R^d)$ satisfying $\gamma(0) \in X$ and $\gamma(1) \in Y$, there exists $s \in [0,1]$ such that $\gamma(s) \in \Gamma$.
The second step is to build a mesh $\mathcal{T}$ of the domain $\Omega$ which is constrained to contain the polygonal surface $\Gamma$ in its skeleton $\cup_{T \in \mathcal{T}} \partial T$.
The main advantage of weak separation is that it can be achieved efficiently without requiring anisotropic simplices, see Figure 6.2. Heuristically the same number of simplices is required to separate the sets $X$ and $Y_\delta$ "weakly" with an isotropic triangulation, or "strongly" with a graded triangulation. We refer the interested reader to [86] for a survey on constrained mesh generation.

6.4. Approximation in a given norm

We compare in this section the approximation error, measured in the $L^p$ norm or the $W^{1,p}$ semi norm, of a function $f$ by finite elements on a mesh $\mathcal{T}$ with the error associated to $f$ and a metric $H$. In this section, as well as in the next one, we fix an integer $m \ge 2$ and consider finite elements of degree $m-1$. The symbol $\alpha$ always refers to a $d$-plet of non-negative integers $\alpha = (\alpha_1, \cdots, \alpha_d) \in \mathbb{Z}_+^d$, and we denote $|\alpha| = \alpha_1 + \cdots + \alpha_d$. We define the monomial
$$ Z^\alpha := \prod_{1 \le i \le d} Z_i^{\alpha_i}, $$
where the variable is $Z = (Z_1, ..., Z_d)$, and we denote by $\mathbb{P}_k$ the collection of all polynomials of degree at most $k$:
$$ \mathbb{P}_k := \mathrm{Vect}\{Z^\alpha ;\ |\alpha| \le k\}. $$
We introduce a fixed function $\varphi \in L^\infty(\mathbb{R}_+, \mathbb{R})$ supported in $[0,1]$ and satisfying the following moments property: for all $\mu \in \mathbb{P}_{m-1}$ one has
$$ \int_{\mathbb{R}^d} \mu(z)\, \varphi(|z|)\, dz = \mu(0). \tag{6.40} $$
For any $f \in L^1(\mathbb{R}^d)$ and any $H \in \mathbf{H}$ we denote by $f_H$ the convolution of $f$ with $\varphi(|\cdot|)$ distorted by the metric $H$, also referred to as the distorted mollification of $f$ by $H$, and which is defined as follows: for all $z \in \mathbb{R}^d$
$$ f_H(z) := \int_{\mathbb{R}^d} f(z + H(z)^{-1/2} u)\, \varphi(|u|)\, du \tag{6.41} $$
$$ \phantom{f_H(z) :}= \sqrt{\det H(z)} \int_{\mathbb{R}^d} f(z + u)\, \varphi(\|u\|_{H(z)})\, du. \tag{6.42} $$
Note that $\mu_H = \mu$ for any polynomial $\mu \in \mathbb{P}_{m-1}$. Indeed, applying the moments property (6.40) to the polynomial $u \mapsto \mu(z + H(z)^{-1/2} u)$, we obtain for all $z \in \mathbb{R}^d$
$$ \mu_H(z) = \mu(z + H(z)^{-1/2}\, 0) = \mu(z). $$
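The reproduction property $\mu_H = \mu$ can be observed numerically. The sketch below is only an illustration under stated assumptions: it works in dimension $d = 1$ with $m = 2$, takes the profile $\varphi = 1/2$ on $[0,1]$ (which satisfies the moments property for affine polynomials in that setting), uses a hypothetical scalar metric `example_metric` that does not come from the text, and approximates the integral (6.41) with a midpoint rule.

```python
def phi(t):
    """Profile supported in [0, 1]. Assumption: d = 1 and m = 2, so that
    the integral of mu(u) * phi(|u|) over R equals mu(0) for affine mu."""
    return 0.5 if 0.0 <= t <= 1.0 else 0.0


def example_metric(z):
    """Hypothetical scalar metric H(z) > 0, chosen only for illustration."""
    return 4.0 + z * z


def mollify(f, metric, z, n=2000):
    """Distorted mollification f_H(z) = int f(z + H(z)^(-1/2) u) phi(|u|) du
    in dimension 1, approximated by the midpoint rule on [-1, 1]."""
    scale = metric(z) ** -0.5          # H(z)^(-1/2)
    h = 2.0 / n                        # width of each quadrature cell
    total = 0.0
    for i in range(n):
        u = -1.0 + (i + 0.5) * h       # midpoint of the i-th cell
        total += f(z + scale * u) * phi(abs(u))
    return total * h


def affine(x):
    return 3.0 * x - 2.0


def square(x):
    return x * x


# Affine functions are reproduced (up to rounding); x -> x^2 is not:
# in this setting the defect equals 1 / (3 H(z)).
defect_affine = mollify(affine, example_metric, 0.7) - affine(0.7)
defect_square = mollify(square, example_metric, 0.7) - square(0.7)
```

With this choice of $\varphi$ only polynomials of degree at most $1$ are reproduced; a profile with more vanishing moments would be needed for larger $m$.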
The next lemma establishes that the distorted mollification $f \mapsto f_H$ is bounded in some functional spaces. For any metric $H \in \mathbf{H}$, any point $z \in \mathbb{R}^d$ and any radius $r > 0$ we define
$$ B_H(z, r) := \{z + u ;\ \|u\|_{H(z)} < r\}, $$
as well as $B_H(z) := B_H(z, 1)$ and $B(z, r) := B_{\mathrm{Id}}(z, r)$ (the latter being the standard open euclidean ball). Note that
$$ |B_H(z, r)|\, \sqrt{\det H(z)} = \omega\, r^d \quad\text{where}\quad \omega := |B(0,1)|. \tag{6.43} $$

Lemma 6.4.1.
1. (Boundedness from $L^1$ to $C^0$) For any $H \in \mathbf{H}$ and any $f \in L^1$ one has $f_H \in C^0(\mathbb{R}^d)$. Furthermore for any $z \in \mathbb{R}^d$
$$ |f_H(z)| \le \sqrt{\det H(z)}\, \|\varphi\|_{L^\infty}\, \|f\|_{L^1(B_H(z))}, \tag{6.44} $$
where, as above, $B_H(z) = \{z + u ;\ \|u\|_{H(z)} < 1\}$.
2. (Boundedness from $W^{1,1}$ to $W^{1,\infty}_{loc}$) For any $H \in \mathbf{H}_a$ and $f \in W^{1,1}$ one has $f_H \in W^{1,\infty}_{loc}$. Furthermore for any $z \in \mathbb{R}^d$
$$ \mathrm{dil}_z(f_H) \le 2 \sqrt{\det H(z)}\, \|\varphi\|_{L^\infty}\, \|\nabla f\|_{L^1(B_H(z))} $$
where $\mathrm{dil}_z(f_H) := \lim_{\varepsilon \to 0} \|\nabla f_H\|_{L^\infty(B(z,\varepsilon))}$ is the local dilatation of $f_H$ at $z$.

Proof: We first establish 1. The continuity of $f_H$ follows from the expression (6.41) of the distorted mollification, and from general results on parameter dependent integrals. Inequality (6.44) follows from the fact that $\varphi$ is supported on $[0,1]$ and from (6.42).
We now turn to the proof of 2. Metrics $H \in \mathbf{H}_a$ satisfy $\|H(z)^{-1/2} - H(z')^{-1/2}\| \le |z - z'|$ for any $z, z' \in \mathbb{R}^d$, see Definition (5.1.12) (in the previous chapter). Hence for any fixed $u \in B(0,1)$ the map $z \mapsto z + H(z)^{-1/2} u$ is $2$-Lipschitz.
Differentiating (6.41) under the integral sign we therefore obtain
$$ \mathrm{dil}_z(f_H) \le 2 \int_{\mathbb{R}^d} |\nabla f(z + H(z)^{-1/2} u)|\, |\varphi(|u|)|\, du \le 2 \|\varphi\|_{L^\infty} \int_{B(0,1)} |\nabla f(z + H(z)^{-1/2} u)|\, du = 2 \sqrt{\det H(z)}\, \|\varphi\|_{L^\infty}\, \|\nabla f\|_{L^1(B_H(z))}, $$
which concludes the proof of this lemma. $\diamond$

$L^p$ error

This subsection is devoted to the proof of the following proposition, which states that the finite element approximation error measured in the $L^p$ norm of a function $f$ on a mesh $\mathcal{T}$ is controlled by the error $e_H(f)_p$ associated to a metric $H$ when the mesh satisfies the size condition (6.45).

Proposition 6.4.2. There exists a constant $C = C(m, d, \varphi)$ such that the following holds. Let $H \in \mathbf{H}_{per} \cap \mathbf{H}_g$ and $\mathcal{T} \in \mathbf{T}_{per}$ be such that
$$ H(z) \le \mathcal{H}_T, \quad T \in \mathcal{T},\ z \in T. \tag{6.45} $$
Then for all $1 \le p \le \infty$ and all $f \in L^p_{per}$ we have
$$ \|f - \mathrm{I}^{m-1}_{\mathcal{T}} f_{H'}\|_{L^p([0,1]^d)} \le C\, e_H(f)_p, $$
where we denoted $H' := 4^2 H$.

Our first intermediate lemma studies the variations of the balls $B_H(z, r)$ as $z \in \mathbb{R}^d$ and $r > 0$ vary, when $H \in \mathbf{H}_g$ is a graded metric. We recall, see (5.53) and (5.54) (in the previous chapter), that for any $H \in \mathbf{H}_g$, and for any $z, u, v \in \mathbb{R}^d$ satisfying $\|u\|_{H(z)} < 1$,
$$ \begin{aligned} (1 - \|u\|_{H(z)})\, \|v\|_{H(z)} &\le \|v\|_{H(z+u)} \le (1 - \|u\|_{H(z)})^{-1}\, \|v\|_{H(z)}, \\ (1 - \|u\|_{H(z)})^{d} \sqrt{\det H(z)} &\le \sqrt{\det H(z+u)} \le (1 - \|u\|_{H(z)})^{-d} \sqrt{\det H(z)}. \end{aligned} \tag{6.46} $$

Lemma 6.4.3. Let $H \in \mathbf{H}_g$ and let $z \in \mathbb{R}^d$.
1. For any $z' \in B_H(z, 1/2)$ one has $B_H(z', 1/4) \subset B_H(z)$.
2. For any $z' \in B_H(z, 1/6)$ one has $B_H(z', 1/2) \supset B_H(z, 1/4)$.
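The two inclusions of Lemma 6.4.3 can be tested on a toy example. The sketch below rests on several assumptions that are not part of the text: it works in dimension $d = 1$ with the isotropic metric $H(z) = 1/s(z)^2$, where $s(z) = 1/10 + |z|$ is a hypothetical $1$-Lipschitz size function (such a metric satisfies the two-sided estimate on $\|v\|_{H(z+u)}$ in (6.46)), and it takes the radii $1/2$, $1/4$ in Point 1 and $1/6$, $1/2$, $1/4$ in Point 2. Exact rational arithmetic avoids rounding issues in the comparisons.

```python
import random
from fractions import Fraction as F


def s(z):
    """Hypothetical 1-Lipschitz size function; H(z) = 1 / s(z)^2 in dimension 1."""
    return F(1, 10) + abs(z)


def ball(z, r):
    """B_H(z, r) = {z + u : |u| / s(z) < r}, represented by its two endpoints."""
    return (z - r * s(z), z + r * s(z))


def contains(outer, inner):
    """True when the interval `inner` is included in the interval `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def check_inclusions(z, t):
    """Check both points of the lemma; t in (-1, 1) places z' inside the
    relevant ball around z."""
    z1 = z + t * F(1, 2) * s(z)        # z' in B_H(z, 1/2)
    z2 = z + t * F(1, 6) * s(z)        # z' in B_H(z, 1/6)
    point1 = contains(ball(z, 1), ball(z1, F(1, 4)))
    point2 = contains(ball(z2, F(1, 2)), ball(z, F(1, 4)))
    return point1 and point2


def run_random_checks(trials=200, seed=0):
    """Sample rational centres and offsets; return True if every check passes."""
    rng = random.Random(seed)
    return all(
        check_inclusions(F(rng.randint(-1000, 1000), 100),
                         F(rng.randint(-99, 99), 100))
        for _ in range(trials)
    )
```

Such a check is no substitute for the proof; it only illustrates how the radii combine for one simple graded metric.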
Proof: We first prove Point 1, and for that purpose we observe that for any $q \in \mathbb{R}^d$
$$ \begin{aligned} \|q - z'\|_{H(z')} &\ge (1 - \|z' - z\|_{H(z)})\, \|q - z'\|_{H(z)} \\ &\ge (1 - \|z' - z\|_{H(z)})\, (\|q - z\|_{H(z)} - \|z' - z\|_{H(z)}) \\ &\ge (1 - 1/2)\, (\|q - z\|_{H(z)} - 1/2). \end{aligned} $$
If $\|q - z\|_{H(z)} \ge 1$ then $\|q - z'\|_{H(z')} \ge 1/4$, which establishes the first inclusion.
We now turn to the proof of Point 2. We obtain for any $q \in \mathbb{R}^d$
$$ \begin{aligned} \|q - z'\|_{H(z')} &\le (1 - \|z' - z\|_{H(z)})^{-1}\, \|q - z'\|_{H(z)} \\ &\le (1 - \|z - z'\|_{H(z)})^{-1}\, (\|q - z\|_{H(z)} + \|z' - z\|_{H(z)}) \\ &\le (1 - 1/6)^{-1}\, (\|q - z\|_{H(z)} + 1/6). \end{aligned} $$
If $\|q - z\|_{H(z)} < 1/4$ then $\|q - z'\|_{H(z')} < \frac{6}{5}\left(\frac{1}{4} + \frac{1}{6}\right) = 1/2$. Hence $B_H(z', 1/2) \supset B_H(z, 1/4)$. $\diamond$

The next lemma estimates the finite element approximation error locally.

Lemma 6.4.4. There exists $C_0 = C_0(m, d, \varphi)$ such that the following holds. Let $H \in \mathbf{H}_g$, $\mathcal{T} \in \mathbf{T}$, $z \in \mathbb{R}^d$ and let
$$ \mathcal{T}_z := \{T \in \mathcal{T} ;\ T \subset B_H(z, 1/2)\}. \tag{6.47} $$
Then
$$ \|f - \mathrm{I}^{m-1}_{\mathcal{T}} f_{H'}\|_{L^p(\Omega_z)} \le C_0\, e_H(f; z)_p, \tag{6.48} $$
where $H' := 4^2 H$ and $\Omega_z$ denotes the union of all triangles in $\mathcal{T}_z$.

Proof: We denote by $C_I$ the norm of the interpolation operator
$$ \mathrm{I}^{m-1}_{T_{eq}} : C^0(T_{eq}) \to C^0(T_{eq}), $$
where $T_{eq}$ denotes the reference equilateral simplex. We obtain using a change of variables that for any triangle $T$ and any $g \in C^0(T)$ one has
$$ \|\mathrm{I}^{m-1}_T g\|_{L^\infty(T)} \le C_I\, \|g\|_{L^\infty(T)}. $$
Let $\mu \in \mathbb{P}_{m-1}$ be a polynomial satisfying $\|f - \mu\|_{L^p(B_H(z))} = e_H(f; z)_p$. Let $g := f - \mu$ and let $T \in \mathcal{T}_z$.
We have
$$ \|\mathrm{I}^{m-1}_T g_{H'}\|_{L^p(T)} \le |T|^{\frac{1}{p}}\, \|\mathrm{I}^{m-1}_T g_{H'}\|_{L^\infty(T)} \le C_I\, |T|^{\frac{1}{p}}\, \|g_{H'}\|_{L^\infty(T)} \le C_I\, |T|^{\frac{1}{p}}\, \|g_{H'}\|_{L^\infty(B_H(z, 1/2))}. $$
Furthermore, Lemma 6.4.1 yields
$$ \begin{aligned} \|g_{H'}\|_{L^\infty(B_H(z, 1/2))} &\le \|\varphi\|_{L^\infty} \max\left\{ \|g\|_{L^1(B_{H'}(z'))} \sqrt{\det H'(z')} ;\ z' \in B_H(z, 1/2) \right\} \\ &\le \|\varphi\|_{L^\infty}\, \|g\|_{L^1(B_H(z))}\, 4^d (1 - 1/2)^{-d} \sqrt{\det H(z)}, \end{aligned} $$
where we used the inclusion stated in Point 1 of Lemma 6.4.3 and (6.46) in the second line. Defining $C_* := 8^d\, C_I\, \|\varphi\|_{L^\infty}$ we thus obtain
$$ \begin{aligned} \|\mathrm{I}^{m-1}_{\mathcal{T}} g_{H'}\|_{L^p(\Omega_z)} &= \left( \sum_{T \in \mathcal{T}_z} \|\mathrm{I}^{m-1}_T g_{H'}\|^p_{L^p(T)} \right)^{\frac{1}{p}} \\ &\le C_*\, \|g\|_{L^1(B_H(z))} \sqrt{\det H(z)} \left( \sum_{T \in \mathcal{T}_z} |T| \right)^{\frac{1}{p}} \\ &\le C_*\, \|g\|_{L^1(B_H(z))}\, \frac{\omega}{|B_H(z)|}\, |B_H(z)|^{\frac{1}{p}} \\ &\le C_*\, \omega\, \|g\|_{L^p(B_H(z))}, \end{aligned} $$
where we used in the third line the expression (6.43) of the volume $|B_H(z)|$ and the inclusion (6.47), and Hölder's inequality in the last line. Since the distorted convolution reproduces polynomials in $\mathbb{P}_{m-1}$, as well as the interpolation operator $\mathrm{I}^{m-1}_{\mathcal{T}}$, we have
$$ g - \mathrm{I}^{m-1}_{\mathcal{T}} g_{H'} = (f - \mu) - \mathrm{I}^{m-1}_{\mathcal{T}} (f - \mu)_{H'} = f - \mathrm{I}^{m-1}_{\mathcal{T}} f_{H'}. $$
Therefore
$$ \begin{aligned} \|f - \mathrm{I}^{m-1}_{\mathcal{T}} f_{H'}\|_{L^p(\Omega_z)} &= \|g - \mathrm{I}^{m-1}_{\mathcal{T}} g_{H'}\|_{L^p(\Omega_z)} \\ &\le \|g\|_{L^p(\Omega_z)} + \|\mathrm{I}^{m-1}_{\mathcal{T}} g_{H'}\|_{L^p(\Omega_z)} \\ &\le (1 + C_* \omega)\, \|g\|_{L^p(B_H(z))} = (1 + C_* \omega)\, e_H(f; z)_p, \end{aligned} $$
which concludes the proof. $\diamond$

We now prove Proposition 6.4.2. For each triangle $T \in \mathcal{T}$ we define
$$ \Omega_T := \{z \in \mathbb{R}^d ;\ T \subset B_H(z, 1/2)\} \quad\text{and}\quad n(T) := \int_{\Omega_T} \sqrt{\det H(z)}\, dz, $$
and we observe that $z \in \Omega_T$ if and only if $T \in \mathcal{T}_z$, where $\mathcal{T}_z$ is defined in (6.47).
Note also that $\Omega_T \subset B(z_T, 1/2)$, where $z_T$ denotes the barycenter of $T$, since $H(z) \ge \mathrm{Id}$ for all $z \in \mathbb{R}^d$; hence the sets $u + \Omega_T$, $u \in \mathbb{Z}^d$, are pairwise disjoint. We denote by $\mathcal{T}_*$ a system of representatives of the set $\mathcal{T}$ for the relation of equivalence
$$ T \sim T' \ \text{ if }\ T = T' + u \ \text{ for some } u \in \mathbb{Z}^d. $$
It follows from the local error estimate (6.48) that
$$ \begin{aligned} C_0^p\, e_H(f)_p^p &= C_0^p \int_{[0,1]^d} \sqrt{\det H(z)}\, e_H(f; z)_p^p\, dz \\ &\ge \int_{[0,1]^d} \sqrt{\det H(z)}\, \|f - \mathrm{I}^{m-1}_{\mathcal{T}} f_{H'}\|^p_{L^p(\Omega_z)}\, dz \\ &= \int_{[0,1]^d} \sqrt{\det H(z)} \sum_{T \in \mathcal{T}_z} \|f - \mathrm{I}^{m-1}_{\mathcal{T}} f_{H'}\|^p_{L^p(T)}\, dz \\ &= \sum_{T \in \mathcal{T}_*} \|f - \mathrm{I}^{m-1}_{\mathcal{T}} f_{H'}\|^p_{L^p(T)} \int_{[0,1]^d} \sqrt{\det H(z)} \sum_{u \in \mathbb{Z}^d} \chi_{\Omega_T}(z - u)\, dz \\ &= \sum_{T \in \mathcal{T}_*} \|f - \mathrm{I}^{m-1}_{\mathcal{T}} f_{H'}\|^p_{L^p(T)} \int_{\mathbb{R}^d} \sqrt{\det H(z)}\, \chi_{\Omega_T}(z)\, dz \\ &= \sum_{T \in \mathcal{T}_*} \|f - \mathrm{I}^{m-1}_{\mathcal{T}} f_{H'}\|^p_{L^p(T)}\, n(T) \\ &\ge \|f - \mathrm{I}^{m-1}_{\mathcal{T}} f_{H'}\|^p_{L^p([0,1]^d)}\, \min_{T \in \mathcal{T}} n(T). \end{aligned} $$
We now establish that $n(T)$ is uniformly bounded below, which concludes the proof of Proposition 6.4.2. It follows from (6.45) that $T \subset B_H(z_T, 1/4)$. The inclusion property established in Point 2 of Lemma 6.4.3 thus implies that $B_H(z_T, 1/6) \subset \Omega_T$. We thus obtain the lower bound
$$ n(T) \ge |B_H(z_T, 1/6)| \min_{z \in B_H(z_T, 1/6)} \sqrt{\det H(z)} \ge \frac{\omega\, 6^{-d}}{\sqrt{\det H(z_T)}}\, \sqrt{\det H(z_T)}\, (1 - 1/6)^d = \omega\, (5/36)^d, $$
where we used the explicit expression (6.43) of the volume of $B_H(z, r)$ and the estimate (6.46) on the variations of $\sqrt{\det H(z)}$.

$W^{1,p}$ error, when the measure of sliverness is uniformly bounded

We establish in this subsection the following proposition which shows that the finite element approximation error, measured in the $W^{1,p}$ semi norm on a mesh with bounded measure of sliverness, is controlled by $e^a_H(\nabla f)_p$.

Proposition 6.4.5.
There exists a constant $C = C(m, d, \varphi)$ such that the following holds. Let $H \in \mathbf{H}_{per} \cap \mathbf{H}_a$ and $\mathcal{T} \in \mathbf{T}_{per}$ be such that
$$ H(z) \le \mathcal{H}_T, \quad T \in \mathcal{T},\ z \in T. \tag{6.49} $$
Then for all $1 \le p \le \infty$ and all $f \in W^{1,p}_{per}$ we have
$$ \|\nabla (f - \mathrm{I}^{m-1}_{\mathcal{T}} f_{H'})\|_{L^p([0,1]^d)} \le C\, e^a_H(\nabla f)_p\, \max_{T \in \mathcal{T}} S(T), $$
where we denoted $H' := 4^2 H$.

The main difficulty of this proof is contained in the following lemma, which states that a near-best approximation of the gradient of a function, on an ellipsoid, has the form of the gradient of a polynomial. In this lemma we denote by $u \cdot v$ the scalar product of two vectors $u, v \in \mathbb{R}^d$.

Lemma 6.4.6.
1. (The orthogonal projection is compatible with gradients) We equip the space $L^2(B, \mathbb{R}^d)$, where $B := B(0,1)$, with the scalar product
$$ \langle v, w \rangle := \int_B v(z) \cdot w(z)\, dz. $$
For any $f \in H^1(B)$ there exists $\mu \in \mathbb{P}_{m-1}$ such that the orthogonal projection of $\nabla f$ onto $\mathbb{P}^d_{m-2}$ is $\nabla \mu$.
2. (Optimization among gradients) There exists $C_{proj} = C_{proj}(m, d)$ such that for any ellipsoid $E \subset \mathbb{R}^d$, any $1 \le p \le \infty$ and any $f \in W^{1,p}(E)$ one has
$$ \inf_{\mu \in \mathbb{P}_{m-1}} \|\nabla f - \nabla \mu\|_{L^p(E)} \le C_{proj} \inf_{\nu \in \mathbb{P}^d_{m-2}} \|\nabla f - \nu\|_{L^p(E)}. \tag{6.50} $$

Proof: We denote $k := m - 1$, and we first establish Point 1. We define $S := \partial B$ and we denote by $D_k$ the collection of polynomial vector fields $\nu \in \mathbb{P}^d_{k-1}$ satisfying
$$ \begin{cases} \operatorname{div} \nu(z) = 0 & \text{for all } z \in B, \\ \nu(z) \cdot z = 0 & \text{for all } z \in S. \end{cases} $$
For any $g \in H^1(B)$ and any $\nu \in D_k$ we thus have
$$ \int_B \nabla g \cdot \nu = -\int_B g(z) \operatorname{div} \nu(z)\, dz + \int_S g(z)\, \nu(z) \cdot z\, dz = 0. \tag{6.51} $$
Our first objective is to establish the orthogonal decomposition
$$ \mathbb{P}^d_{k-1} = D_k \oplus \nabla \mathbb{P}_k. $$
The spaces $\nabla \mathbb{P}_k$ and $D_k$ are orthogonal according to (6.51), and we know that $\dim \nabla \mathbb{P}_k = \dim \mathbb{P}_k - 1$.
Therefore we only have to show that $\dim D_k \ge \dim(\mathbb{P}^d_{k-1}) - \dim \mathbb{P}_k + 1$. The vector space $D_k$ is the kernel of the map $d_k$ defined as follows
$$ d_k : \mathbb{P}^d_{k-1} \to \mathbb{P}_{k-2} \oplus \mathbb{P}_k(S), \qquad \nu \mapsto (\operatorname{div} \nu,\ (\nu(z) \cdot z)_{|S}), $$
where $\mathbb{P}_k(S)$ denotes the collection of restrictions to $S$ of elements of $\mathbb{P}_k$. The kernel of the map
$$ \gamma_k : \mathbb{P}_k \to \mathbb{P}_k(S), \qquad \mu \mapsto \mu_{|S} $$
is $\operatorname{Ker}(\gamma_k) = (1 - |z|^2)\, \mathbb{P}_{k-2}$. Hence $\dim(\mathbb{P}_k(S)) = \dim \mathbb{P}_k - \dim \mathbb{P}_{k-2}$. The map $d_k$ is not surjective, since we have the linear relation
$$ \int_B \operatorname{div} \nu(z)\, dz = \int_S \nu(z) \cdot z\, dz. $$
Therefore
$$ \begin{aligned} \dim D_k = \dim \operatorname{Ker} d_k &= \dim(\mathbb{P}^d_{k-1}) - \dim \operatorname{Im} d_k \\ &\ge \dim(\mathbb{P}^d_{k-1}) - (\dim \mathbb{P}_{k-2} + \dim \mathbb{P}_k(S) - 1) \\ &= \dim(\mathbb{P}^d_{k-1}) - \dim \mathbb{P}_k + 1, \end{aligned} $$
which implies the orthogonal decomposition $\mathbb{P}^d_{k-1} = D_k \oplus \nabla \mathbb{P}_k$ as announced.
Finally for any $f \in H^1(B)$ the vector field $\nabla f$ is orthogonal to $D_k$ according to (6.51). The orthogonal projection of $\nabla f$ onto $\mathbb{P}^d_{k-1}$ is therefore also orthogonal to $D_k$, hence is an element of $\nabla \mathbb{P}_k$, which concludes the proof of Point 1.
We now turn to the proof of Point 2. We denote by $Q : L^2(B, \mathbb{R}^d) \to \mathbb{P}^d_{k-1}$ the $L^2(B)$ orthogonal projection. We recall that for any $v \in L^2(B, \mathbb{R}^d)$ the projection $Q(v) \in \mathbb{P}^d_{k-1}$ is defined by the collection of linear conditions
$$ \int_B (v - Q(v)) \cdot \nu = 0 \quad\text{for all } \nu \in \mathbb{P}^d_{k-1}. $$
Hence $Q(v)$ depends linearly on a finite number of moments of $v$, and therefore continuously extends to all $v \in L^1(B, \mathbb{R}^d)$. Therefore there exists a constant $C_0$ such that for all $1 \le p \le \infty$ and all $v \in L^p(B, \mathbb{R}^d)$
$$ \|Q(v)\|_{L^p(B)} \le C_0\, \|v\|_{L^p(B)}. $$
For any $A \in GL_d$ and any $v \in L^1(B, \mathbb{R}^d)$ we claim that $Q(A(v)) = A(Q(v))$. Indeed for any $\nu \in \mathbb{P}^d_{k-1}$ one has
$$ \int_B \big(A(v(z)) - A(Q(v)(z))\big) \cdot \nu(z)\, dz = \int_B \big(v(z) - Q(v)(z)\big) \cdot A^{\mathrm{T}}(\nu(z))\, dz = 0. $$
Consider $1 \le p \le \infty$, $f \in W^{1,p}(B)$ and $A \in GL_d$.
Let $\mu \in \mathbb{P}_{m-1}$ be such that $Q(\nabla f) = \nabla \mu$, and let $\nu \in \mathbb{P}^d_{m-2}$. Since $Q(A(\nu)) = A(\nu)$, we obtain
$$ \begin{aligned} \|A(\nabla f - \nabla \mu)\|_{L^p(B)} &= \|A(\nabla f) - A(Q(\nabla f))\|_{L^p(B)} \\ &= \|A(\nabla f) - Q(A(\nabla f))\|_{L^p(B)} \\ &= \|A(\nabla f - \nu) - Q(A(\nabla f - \nu))\|_{L^p(B)} \\ &\le (1 + C_0)\, \|A(\nabla f - \nu)\|_{L^p(B)}. \end{aligned} $$
Therefore
$$ \inf_{\mu \in \mathbb{P}_{m-1}} \|A(\nabla f - \nabla \mu)\|_{L^p(B)} \le (1 + C_0) \inf_{\nu \in \mathbb{P}^d_{m-2}} \|A(\nabla f - \nu)\|_{L^p(B)}. $$
Using an affine change of variables mapping $E$ onto $B$ we obtain the announced result (6.50), with the constant $C_{proj} := C_0 + 1$. This concludes the proof of this lemma. $\diamond$

The rest of the proof of Proposition 6.4.5 is extremely similar to the proof of Proposition 6.4.2. The estimates
$$ |f_H(z)| \le \sqrt{\det H(z)}\, \|\varphi\|_{L^\infty}\, \|f\|_{L^1(B_H(z))}, \qquad \|\mathrm{I}^{m-1}_T g\|_{L^\infty(T)} \le C_I\, \|g\|_{L^\infty(T)}, $$
used in the proof of Proposition 6.4.2 have the counterparts
$$ \mathrm{dil}_z(f_H) \le 2 \sqrt{\det H(z)}\, \|\varphi\|_{L^\infty}\, \|\nabla f\|_{L^1(B_H(z))}, \qquad \|\nabla \mathrm{I}^{m-1}_T g\|_{L^\infty(T)} \le C'_I\, \|\nabla g\|_{L^\infty(T)}\, S(T), $$
which are proved respectively in Lemma 6.4.1 and Lemma 3.6.3 (in Chapter 3). From this point the adaptation of the proof is straightforward and is therefore left to the reader.

$W^{1,p}$ error, on a general mesh

This subsection is devoted to the proof of Proposition 6.4.7, which estimates the $W^{1,p}$ approximation error of a function $f$ on a mesh with arbitrary measure of sliverness, in terms of the approximation error $e^g_H(\nabla f)_p$ with respect to a metric. This approach does not allow us to recover the optimal anisotropic error estimates for $W^{1,p}$ norms developed in Chapter 3.
However it is perhaps more tailored to current anisotropic mesh generation software, which generally does not guarantee any bound on the measure of sliverness (or the maximal angle in dimension 2, since $S(T) = \max\{1, \tan(\theta/2)\}$ for any triangle $T$ of maximal angle $\theta$).

Proposition 6.4.7. For all $C_0 \ge 1$ there exists a constant $C = C(C_0, m, d, \varphi)$ such that the following holds. Let $H \in \mathbf{H}_{per} \cap \mathbf{H}_g$ and let $\mathcal{T} \in \mathbf{T}_{per}$ be such that
$$ H(z) \le \mathcal{H}_T \le (4 C_0)^2 H(z), \quad T \in \mathcal{T},\ z \in T. \tag{6.52} $$
Then for all $1 \le p \le \infty$ and all $f \in W^{1,p}_{per}$ we have
$$ \|\nabla (f - \mathrm{I}^{m-1}_{\mathcal{T}} f_{H'})\|_{L^p([0,1]^d)} \le C\, e^g_H(\nabla f)_p, $$
where we denoted $H' := 4^2 H$.

Our first step in the proof of this proposition is the next lemma, which estimates the local approximation error.

Lemma 6.4.8. For all $C_0 \ge 1$ there exists $C = C(C_0, m, d)$ such that the following holds. Let $H \in \mathbf{H}_{per}$ and $\mathcal{T} \in \mathbf{T}_{per}$ be such that (6.52) holds. Let $T \in \mathcal{T}$ and $z \in T$. Then for all $1 \le p \le \infty$ and all $f \in W^{1,p}_{per}$ we have
$$ \|\nabla (f - \mathrm{I}^{m-1}_{\mathcal{T}} f_{H'})\|_{L^p(T)} \le C\, e^g_H(\nabla f; z)_p, \tag{6.53} $$
where we denoted $H' := 4^2 H$.

Proof: We first assume that $z = 0$ and that $H(z) = \mathrm{Id}$, in such a way that $B_H(z) = B := B(0,1)$ is the euclidean unit ball. We consider a polynomial $\mu \in \mathbb{P}_{m-1}$ such that $\|\nabla (f - \mu)\|_{L^p(B)}$ is minimal, and such that $\int_B (f - \mu) = 0$. The latter point can be ensured by adding an appropriate constant to $\mu$. Defining $g := f - \mu$, we obtain using Lemma 6.4.6
$$ \|\nabla g\|_{L^p(B)} \le C_{proj}\, e^a_H(\nabla f; z)_p. $$
Since $g \in W^{1,1}(B)$ and $\int_B g = 0$ there exists according to Sobolev's injection theorem a constant $C_{sob} = C_{sob}(m, d)$ such that
$$ \|g\|_{L^1(B)} \le C_{sob}\, \|\nabla g\|_{L^1(B)}. $$
The function $g_{H'}$ is continuous on $T$ and satisfies according to Lemma 6.4.1
$$ \begin{aligned} \|g_{H'}\|_{L^\infty(T)} &\le \|\varphi\|_\infty \max\left\{ \|g\|_{L^1(B_{H'}(z'))} \sqrt{\det H'(z')} ;\ z' \in T \right\} \\ &\le \|\varphi\|_\infty\, \|g\|_{L^1(B)}\, 4^d (1 - 1/4)^{-d} = C_1\, \|g\|_{L^1(B)}, \end{aligned} $$
where we used (6.46) and the fact that $\|z - z'\|_{H(z)} \le 1/4$ for all $z' \in T$.
We denote by $C_I$ the norm of the operator $\nabla \mathrm{I}^{m-1}_{T_{eq}} : C^0(T_{eq}) \to \mathbb{P}^d_{m-2}$, where $T_{eq}$ denotes the reference equilateral simplex: for any $h \in C^0(T_{eq})$
$$ \|\nabla \mathrm{I}^{m-1}_{T_{eq}} h\|_{L^\infty(T_{eq})} \le C_I\, \|h\|_{L^\infty(T_{eq})}. $$
We recall that $\mathcal{H}_T^{1/2}(T)$ is the image of $T_{eq}$ by a translation and a rotation, see Proposition 5.1.3 in the previous chapter. Choosing the function $h := g_{H'} \circ \mathcal{H}_T^{-1/2} \in C^0(T_{eq})$, we obtain after a change of variables
$$ \|\mathcal{H}_T^{-1/2} (\nabla \mathrm{I}^{m-1}_T g_{H'})\|_{L^\infty(T)} \le C_I\, \|g_{H'}\|_{L^\infty(T)}, $$
which implies
$$ \|\nabla \mathrm{I}^{m-1}_T g_{H'}\|_{L^\infty(T)} \le C_I\, \|\mathcal{H}_T^{1/2}\|\, \|g_{H'}\|_{L^\infty(T)}. $$
We thus obtain, since $\|\mathcal{H}_T^{1/2}\| \le C_0\, \|H'(z)^{1/2}\| = 4 C_0$ and since $|T| \le |B|$,
$$ \begin{aligned} \|\nabla \mathrm{I}^{m-1}_T g_{H'}\|_{L^p(T)} &\le |B|^{\frac{1}{p}}\, \|\nabla \mathrm{I}^{m-1}_T g_{H'}\|_{L^\infty(T)} \\ &\le 4 C_0 C_I\, |B|^{\frac{1}{p}}\, \|g_{H'}\|_{L^\infty(T)} \\ &\le 4 C_0 C_1 C_I\, |B|^{\frac{1}{p}}\, \|g\|_{L^1(B)} \\ &\le 4 C_0 C_1 C_I C_{sob}\, |B|^{\frac{1}{p}}\, \|\nabla g\|_{L^1(B)} \\ &\le 4 C_0 C_1 C_I C_{sob}\, |B|\, \|\nabla g\|_{L^p(B)}. \end{aligned} $$
Defining $C_* := 4 C_0 C_1 C_I C_{sob} |B|$ we obtain
$$ \|\nabla (f - \mathrm{I}^{m-1}_T f_{H'})\|_{L^p(T)} = \|\nabla (g - \mathrm{I}^{m-1}_T g_{H'})\|_{L^p(T)} \le (1 + C_*)\, \|\nabla g\|_{L^p(B)} \le C\, e^a_H(\nabla f; z)_p, $$
where $C = (1 + C_*)\, C_{proj}$.
We now turn to the general case, and we no longer assume that $z = 0$ and $H(z) = \mathrm{Id}$. We define the affine change of coordinates $\Phi(z') := H(z)^{-1/2} z' + z$, and we apply our previous reasoning to the metric $H_\Phi$ defined by (5.30) (in the previous chapter),
$$ H_\Phi(z') := H(z)^{-1/2}\, H(\Phi(z'))\, H(z)^{-1/2}, $$
which satisfies $H_\Phi(0) = \mathrm{Id}$ and $H_\Phi \in \mathbf{H}_g$ according to Proposition 5.2.2 (in the previous chapter). We also define the triangle $T_\Phi := \Phi^{-1}(T)$, the function $f_\Phi := f \circ \Phi$ and the metric $H'_\Phi = 4^2 H_\Phi$. The first part of this proof implies that
$$ \|\nabla (f_\Phi - \mathrm{I}^{m-1}_{T_\Phi} (f_\Phi)_{H'_\Phi})\|_{L^p(T_\Phi)} \le C\, e^a_{H_\Phi}(\nabla f_\Phi; 0)_p. $$
We thus obtain
$$ \begin{aligned} \|\nabla (f - \mathrm{I}^{m-1}_T f_{H'})\|_{L^p(T)} &\le \|H(z)^{1/2}\|\, \|H(z)^{-1/2} (\nabla (f - \mathrm{I}^{m-1}_T f_{H'}))\|_{L^p(T)} \\ &= \|H(z)^{1/2}\|\, (\det H(z))^{-\frac{1}{2p}}\, \|\nabla (f_\Phi - \mathrm{I}^{m-1}_{T_\Phi} (f_\Phi)_{H'_\Phi})\|_{L^p(T_\Phi)} \\ &\le C\, \|H(z)^{1/2}\|\, (\det H(z))^{-\frac{1}{2p}}\, e^a_{H_\Phi}(\nabla f_\Phi; 0)_p \\ &= C\, \|H(z)^{1/2}\|\, e^a_H(H(z)^{-1/2} (\nabla f); z)_p \\ &= C\, e^g_H(\nabla f; z)_p, \end{aligned} $$
where we used the change of variables $\Phi^{-1} : T \to T_\Phi$ in the second and fourth lines. This concludes the proof of this lemma. $\diamond$

We now conclude the proof of Proposition 6.4.7. For any $T \in \mathcal{T}$ and any $z \in T$, recalling that $\mathcal{H}_T \le (4 C_0)^2 H(z)$ we obtain
$$ |T_{eq}| / |T| = \sqrt{\det \mathcal{H}_T} \le (4 C_0)^d \sqrt{\det H(z)}. $$
Averaging (6.53) over $T$, it follows that
$$ \|\nabla (f - \mathrm{I}^{m-1}_T f_{H'})\|^p_{L^p(T)} \le \frac{C^p}{|T|} \int_T e^g_H(\nabla f; z)_p^p\, dz \le C' \int_T \sqrt{\det H(z)}\, e^g_H(\nabla f; z)_p^p\, dz, $$
where $C' := (4 C_0)^d C^p / |T_{eq}|$.
We denote by $\mathcal{T}_*$ a system of representatives of the set $\mathcal{T}$ for the relation of equivalence
$$ T \sim T' \ \text{ if }\ T = T' + u \ \text{ for some } u \in \mathbb{Z}^d. $$
We thus obtain
$$ \begin{aligned} \|\nabla (f - \mathrm{I}^{m-1}_{\mathcal{T}} f_{H'})\|^p_{L^p([0,1]^d)} &= \sum_{T \in \mathcal{T}_*} \|\nabla (f - \mathrm{I}^{m-1}_T f_{H'})\|^p_{L^p(T)} \\ &\le C' \sum_{T \in \mathcal{T}_*} \int_T \sqrt{\det H(z)}\, e^g_H(\nabla f; z)_p^p\, dz \\ &= C' \int_{[0,1]^d} \sqrt{\det H(z)}\, e^g_H(\nabla f; z)_p^p\, dz, \end{aligned} $$
which concludes the proof of this proposition.

6.5. Asymptotic approximation and explicit metrics

This section is devoted to the explicit construction of a metric $H$ adapted to a given function $f$, in the setting where the function $f$ is sufficiently smooth and the mass of the metric is asymptotically large.
Our main result is the following theorem, which is the counterpart for metrics of the results developed in Chapters 2 and 3. It involves the shape functions $K$, $L_a$ and $L_g$ which are defined in the introduction of this chapter.

Theorem 6.5.1. There exists a constant $C = C(m, d)$ such that for each $f \in C^m_{per}$
$$ \limsup_{M \to \infty}\ M^{\frac{m}{d}} \inf_{\substack{H \in \mathbf{H}_g \\ m(H) \le M}} e_H(f)_p \le C\, \|K(d^m f)\|_\tau, $$
where $\frac{1}{\tau} := \frac{m}{d} + \frac{1}{p}$. Furthermore
$$ \limsup_{M \to \infty}\ M^{\frac{m-1}{d}} \inf_{\substack{H \in \mathbf{H}_a \\ m(H) \le M}} e^a_H(\nabla f)_p \le C\, \|L_a(d^m f)\|_\tau, \qquad \limsup_{M \to \infty}\ M^{\frac{m-1}{d}} \inf_{\substack{H \in \mathbf{H}_g \\ m(H) \le M}} e^g_H(\nabla f)_p \le C\, \|L_g(d^m f)\|_\tau, $$
where $\frac{1}{\tau} := \frac{m-1}{d} + \frac{1}{p}$.

Our first lemma shows that the optimization problems on $SL_d$ defining the shape functions $K$, $L_a$ and $L_g$ can be reformulated into optimization problems posed on the collection $S_d^+$ of symmetric positive definite matrices.

Lemma 6.5.2. Let $\psi \in C^0(GL_d, \mathbb{R}^*_+)$ be homogeneous of degree $r > 0$ and such that $\psi(A) = \psi(AO)$ for any $A \in GL_d$ and any $O \in O_d$. Then
$$ \inf\left\{ (\det M)^{\frac{r}{2d}} ;\ M \in S_d^+ \text{ and } \psi(M^{-1/2}) \le 1 \right\} = \inf\{ \psi(A) ;\ A \in SL_d \}. $$

Proof: To each $M \in S_d^+$ satisfying $\psi(M^{-1/2}) \le 1$ we associate $A := (\det M)^{\frac{1}{2d}} M^{-1/2} \in GL_d$, in such a way that $\det(A) = 1$ and $\psi(A) = (\det M)^{\frac{r}{2d}}\, \psi(M^{-1/2}) \le (\det M)^{\frac{r}{2d}}$.
Conversely to each $A \in SL_d$ we associate $M := \psi(A)^{\frac{2}{r}} (A A^{\mathrm{T}})^{-1} \in S_d^+$, in such a way that $(\det M)^{\frac{r}{2d}} = \psi(A)$ and
$$ \psi(M^{-1/2}) = \psi\left( \psi(A)^{-\frac{1}{r}} (A A^{\mathrm{T}})^{1/2} \right) = \psi(A)^{-1}\, \psi(AO) = 1, $$
where $O := A^{-1} (A A^{\mathrm{T}})^{1/2}$ is orthogonal. $\diamond$

It follows from this lemma that for any $\pi \in \mathbb{H}_m$, the infima being taken over $M \in S_d^+$,
$$ K(\pi) = \inf\left\{ (\det M)^{\frac{m}{2d}} ;\ \|\pi \circ M^{-1/2}\| \le 1 \right\}, \tag{6.54} $$
$$ L_a(\pi) = \inf\left\{ (\det M)^{\frac{m-1}{2d}} ;\ \|(\nabla \pi) \circ M^{-1/2}\| \le 1 \right\}, \tag{6.55} $$
$$ L_g(\pi) = \inf\left\{ (\det M)^{\frac{m-1}{2d}} ;\ \sqrt{\|M\|}\, \|\nabla (\pi \circ M^{-1/2})\| \le 1 \right\}. \tag{6.56} $$

Remark 6.5.3. The optimizations among symmetric matrices appearing in these expressions have a geometrical interpretation: finding the ellipsoid of maximal volume included in a set of algebraic boundary. Indeed one easily checks that the three following properties are equivalent for all $\pi \in \mathbb{H}_m$ and all $M \in S_d^+$:
i) $\|\pi \circ M^{-1/2}\| \le 1$ (resp. $\|(\nabla \pi) \circ M^{-1/2}\| \le 1$, resp. $\sqrt{\|M\|}\, \|\nabla (\pi \circ M^{-1/2})\| \le 1$).
ii) $|\pi(z)| \le \|z\|^m_M$ for all $z \in \mathbb{R}^d$ (resp. $|\nabla \pi(z)| \le \|z\|^{m-1}_M$, resp. $\|M\|^{1/2}\, |M^{-1/2} (\nabla \pi(z))| \le \|z\|^{m-1}_M$).
iii) The ellipsoid defined by the inequality $\|z\|_M \le 1$, $z \in \mathbb{R}^d$, is included in the set of algebraic boundary defined by the inequality $|\pi(z)| \le 1$ (resp. $|\nabla \pi(z)| \le 1$, resp. $\|M\|^{1/2}\, |M^{-1/2} (\nabla \pi(z))| \le 1$, which depends on $M$).

We consider a fixed function $f \in C^m$ and for each $z \in \mathbb{R}^d$ we define the homogeneous polynomial $\pi_z \in \mathbb{H}_m$ as follows
$$ \pi_z(Z) := \sum_{|\alpha| = m} \frac{\partial^\alpha f(z)}{(\partial z)^\alpha}\, \frac{Z^\alpha}{\alpha!}, \tag{6.57} $$
where $\alpha! = \alpha_1! \cdots \alpha_d!$ for $\alpha = (\alpha_1, \cdots, \alpha_d) \in \mathbb{Z}_+^d$.
The heuristic guideline of the proof of Theorem 6.5.1 is the following (here in the case of approximation in the $L^p$ norm).
Consider a $C^\infty$ periodic map $M : \mathbb{R}^d \to S_d^+$ such that $\|\pi_z \circ M(z)^{-1/2}\| \le 1$ for all $z \in \mathbb{R}^d$, and define
$$ H(z) := (\det M(z))^{-\frac{1}{mp + d}}\, M(z). $$
Then it is not difficult to show that for all $\lambda$ sufficiently large one has $\lambda H \in \mathbf{H}_a$ and
$$ m(\lambda H)^{\frac{m}{d}}\, e_{\lambda H}(f)_p \le C \left\| (\det M)^{\frac{m}{2d}} \right\|_{L^\tau([0,1]^d)}. $$
This leads to the estimate announced in Theorem 6.5.1 if $\left\| (\det M)^{\frac{m}{2d}} \right\|_{L^\tau([0,1]^d)}$ is comparable to $\|K(d^m f)\|_{L^\tau([0,1]^d)}$, in other words if $(\det M(z))^{\frac{m}{2d}}$ is comparable to $K(\pi_z)$ for all $z \in \mathbb{R}^d$, which means that $M(z)$ is close to being a minimizer of the infimum (6.54) defining $K(\pi_z)$. The same principles apply to the two other estimates stated in Theorem 6.5.1.
The main ingredient of the proof of Theorem 6.5.1 is the construction of a continuous map $\pi \mapsto M(\pi)$ such that $\|\pi \circ M(\pi)^{-1/2}\| \le 1$ and such that $(\det M(\pi))^{\frac{m}{2d}}$ is sufficiently close to $K(\pi)$ (in the case of the approximation in the $L^p$ norm). Up to a few technicalities, which are addressed in § [...], defining $M(z) := M(\pi_z)$ and reasoning as above then concludes the proof.
We give in § [...] $\pi$, in the case of piecewise linear and piecewise quadratic approximation in two dimensions. Such explicit expressions are valuable for numerical implementation in finite element software which uses anisotropic meshes, and have been implemented by the author in [93].
We introduce in § [...] $C^0(S_d^+, \mathbb{R}^*_+)$, which is a variant of the classical notion of convexity. This is used in § [...] $K^{(\alpha)}$, $\alpha \in (0, \infty)$ of the shape function $K$ which are defined by well posed optimization problems on $S_d^+$, in arbitrary dimension $d$ and degree $m$. In contrast, for some choices of $\pi \in \mathbb{H}_m$ there exists no minimizer to the optimization problem appearing in the expression (6.54) of $K(\pi)$, and for other choices there exists an infinity of minimizers.
We conclude the proof of Theorem 6.5.1 in § [...] § [...] $K$, $L_a$ and $L_g$: they are uniformly equivalent on $\mathbb{H}_m$ to some continuous functions.

We give in this section some explicit minimizers, or near minimizers (in a sense defined below), of the optimization problems among symmetric matrices appearing in the expressions (6.54), (6.55) and (6.56) of the shape functions $K$, $L_a$ and $L_g$. Our results are so far limited to the case $m = 2$, which corresponds to piecewise linear approximation, and to $m = 3$ in dimension $d = 2$, which corresponds to piecewise quadratic approximation.
These minimizers or near-minimizers have already been presented in Chapter 2 for the shape function $K$, and Chapter 3 for the shape function $L_a$. The latter are recalled in the next proposition, and completed with their counterparts for the shape function $L_g$.
For any homogeneous quadratic polynomial $\pi$ we denote by $[\pi] \in S_d$ the associated symmetric matrix, which satisfies $z^{\mathrm{T}} [\pi] z = \pi(z)$ for all $z \in \mathbb{R}^d$. Note that $\|\pi\| = \|[\pi]\|$.

Proposition 6.5.4.
i) (Piecewise linear approximation, in any dimension) If $m = 2$, in any dimension $d$, then for any $\pi \in \mathbb{H}_2$ one has
$$ L_a(\pi) = 2 \sqrt[d]{|\det \pi|} \quad\text{and}\quad L_g(\pi) = 2 \sqrt{\|\pi\|}\ \sqrt[2d]{|\det \pi|}. $$
The minimizing matrices $M$ in the expressions (6.55) and (6.56) of $L_a(\pi)$ and $L_g(\pi)$ are respectively
$$ \mathcal{M}_a(\pi) = 4 [\pi]^2 \quad\text{and}\quad \mathcal{M}_g(\pi) = 4 \|\pi\|\, |[\pi]|. $$
ii) (Piecewise quadratic approximation, in two dimensions) If $m = 3$ and $d = 2$, let $\pi = a x^3 + 3 b x^2 y + 3 c x y^2 + d y^3 \in \mathbb{H}_3$, let
$$ \mathcal{M}_a(\pi) := \sqrt{ \begin{pmatrix} a & b \\ b & c \end{pmatrix}^2 + \begin{pmatrix} b & c \\ c & d \end{pmatrix}^2 } $$
and let
$$ \mathcal{M}_g(\pi) := \mathcal{M}_a(\pi) + \left( \frac{-\operatorname{disc} \pi}{\|\pi\|} \right)_+^{\frac{1}{3}} \mathrm{Id}, $$
where $\operatorname{disc} \pi$ is the discriminant of the cubic polynomial $\pi$:
$$ \operatorname{disc} \pi = 4 (ac - b^2)(bd - c^2) - (ad - bc)^2. \tag{6.58} $$
Then $\mathcal{M}_a(\pi)$ and $\mathcal{M}_g(\pi)$ are near minimizers of the optimization problems defining $L_a(\pi)$ and $L_g(\pi)$ respectively, in the following sense: there exists a constant $C$ such that for all $\pi \in \mathbb{H}_3$
$$ \sqrt{\det \mathcal{M}_a(\pi)} \le C\, L_a(\pi) \quad\text{and}\quad \sqrt{\det \mathcal{M}_g(\pi)} \le C\, L_g(\pi). $$
Furthermore if $\pi$ is not univariate then $\mathcal{M}_a(\pi)$ and $\mathcal{M}_g(\pi)$ are non degenerate and we have
$$ \|(\nabla \pi) \circ \mathcal{M}_a(\pi)^{-1/2}\| \le C \quad\text{and}\quad \|\mathcal{M}_g(\pi)\|^{1/2}\, \|\nabla (\pi \circ \mathcal{M}_g(\pi)^{-1/2})\| \le C. \tag{6.59} $$

Proof: We recall that $\nabla (\pi \circ A) = A^{\mathrm{T}} ((\nabla \pi) \circ A)$ for any $\pi \in \mathbb{H}_m$ and $A \in GL_d$.
We begin with the case i) of a quadratic polynomial $\pi \in \mathbb{H}_2$, and we observe that $\nabla \pi(z) = 2 [\pi] z$. Hence $|\nabla \pi(z)|^2 = 4 z^{\mathrm{T}} [\pi]^2 z$ and the announced result for $L_a$ easily follows. For $L_g$, we are looking for a matrix $M \in S_d^+$ such that
$$ \|M\|^{1/2}\, \|\nabla (\pi \circ M^{-1/2})\| = 2 \|M\|^{1/2}\, \|M^{-1/2} [\pi] M^{-1/2}\| \le 1, \tag{6.60} $$
and of minimal determinant. Recalling that $\|AB\| \ge \|A\|\, \|B^{-1}\|^{-1}$ for all $A, B \in GL_d$, and therefore $\|M^{-1/2} [\pi] M^{-1/2}\| \ge \|\pi\| / \|M\|$, we obtain from (6.60) that $2 \|\pi\| \le \|M\|^{1/2}$. Hence
$$ 4 \|\pi\|\, \|M^{-1/2} [\pi] M^{-1/2}\| \le 1. \tag{6.61} $$
A matrix of minimal determinant which achieves this inequality is clearly $M = 4 \|\pi\|\, |[\pi]|$, where $|S|$ denotes the absolute value of a symmetric matrix $S \in S_d$.
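The closed forms of Point i) lend themselves to a quick numerical check. The sketch below is an illustration restricted to $d = 2$, written with hypothetical helper names and the standard library only: it builds $4[\pi]^2$ and $4\|\pi\|\,|[\pi]|$ from the eigen-decomposition of $[\pi]$, and measures the slack in the pointwise form ii) of the constraints from Remark 6.5.3, which is expected to vanish for both matrices.

```python
import math


def eig_sym2(p):
    """Eigenvalues l1, l2 and rotation angle t of a symmetric 2x2 matrix p;
    the eigenvectors are (cos t, sin t) and (-sin t, cos t)."""
    a, b, c = p[0][0], p[0][1], p[1][1]
    t = 0.5 * math.atan2(2.0 * b, a - c)
    ct, st = math.cos(t), math.sin(t)
    l1 = a * ct * ct + 2.0 * b * ct * st + c * st * st
    l2 = a * st * st - 2.0 * b * ct * st + c * ct * ct
    return l1, l2, t


def from_eig(l1, l2, t):
    """Symmetric matrix with eigenvalues l1, l2 and rotation angle t."""
    ct, st = math.cos(t), math.sin(t)
    off = (l1 - l2) * ct * st
    return [[l1 * ct * ct + l2 * st * st, off],
            [off, l1 * st * st + l2 * ct * ct]]


def metric_a(p):
    """4 [pi]^2, the minimizer stated for L_a when m = 2."""
    l1, l2, t = eig_sym2(p)
    return from_eig(4.0 * l1 * l1, 4.0 * l2 * l2, t)


def metric_g(p):
    """4 ||pi|| |[pi]|, the minimizer stated for L_g when m = 2."""
    l1, l2, t = eig_sym2(p)
    norm_pi = max(abs(l1), abs(l2))
    return from_eig(4.0 * norm_pi * abs(l1), 4.0 * norm_pi * abs(l2), t)


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def quad(m, z):
    """Quadratic form z^T m z."""
    return m[0][0] * z[0] ** 2 + 2.0 * m[0][1] * z[0] * z[1] + m[1][1] * z[1] ** 2


def grad(p, z):
    """Gradient of pi(z) = z^T [pi] z, namely 2 [pi] z."""
    return (2.0 * (p[0][0] * z[0] + p[0][1] * z[1]),
            2.0 * (p[1][0] * z[0] + p[1][1] * z[1]))


def slack_a(p, z):
    """||z||_M^2 - |grad pi(z)|^2 for M = metric_a(p)."""
    g = grad(p, z)
    return quad(metric_a(p), z) - (g[0] ** 2 + g[1] ** 2)


def slack_g(p, z):
    """||z||_M^2 - ||M|| |M^(-1/2) grad pi(z)|^2 for M = metric_g(p)."""
    m = metric_g(p)
    l1, l2, _ = eig_sym2(m)
    return quad(m, z) - max(abs(l1), abs(l2)) * quad(inv2(m), grad(p, z))
```

For an indefinite, non-degenerate $[\pi]$, both slacks are zero up to rounding at every point $z$, and $(\det M)^{1/(2d)}$ evaluated at the two matrices agrees with the stated values of $L_a(\pi)$ and $L_g(\pi)$; `slack_g` requires $\det \pi \neq 0$ because it inverts the matrix.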
Observing that the matrix $M = 4 \|\pi\|\, |[\pi]|$ also satisfies the condition (6.60), we conclude the proof of Point i).
We now turn to the case ii) of a cubic polynomial in $\mathbb{H}_3$. The case of $L_a$ and $\mathcal{M}_a$ is exposed in Chapter 3. We therefore turn to the case of $L_g$. Combining the homogeneity and the continuity of the functions involved, one easily shows that there exists a constant $C \ge 1$ such that for all $\pi \in \mathbb{H}_3$
$$ C^{-1} \|\pi\| \le \|\nabla \pi\| \le C \|\pi\|, \qquad C^{-1} \|\pi\| \le \lambda(\pi) := \|\mathcal{M}_a(\pi)\| \le C \|\pi\|, \tag{6.62} $$
and
$$ \mu(\pi) := \|\mathcal{M}_a(\pi)^{-1}\|^{-1} \le C \|\pi\|, \qquad \nu(\pi) := \left( \frac{-\operatorname{disc} \pi}{\|\pi\|} \right)_+^{\frac{1}{3}} \le C \|\pi\|. \tag{6.63} $$
We consider below a fixed polynomial $\pi \in \mathbb{H}_3$, and we recall that
$$ L_g(\pi) := \inf\left\{ \sqrt{\det M} ;\ \sqrt{\|M\|}\, \|\nabla (\pi \circ M^{-1/2})\| \le 1 \right\}. $$
We first prove the lower bound $L_g(\pi) \ge c \sqrt{\det \mathcal{M}_g(\pi)}$, where $c > 0$ is a constant independent of $\pi$. Consider a matrix $M \in S_2^+$ satisfying the constraint $\|M\|^{1/2}\, \|\nabla (\pi \circ M^{-1/2})\| \le 1$. Combining this with (6.62) we obtain
$$ 1 \ge C^{-1} \|M\|^{1/2}\, \|\pi \circ M^{-1/2}\| \ge C^{-1} \|M\|^{1/2}\, \frac{\|\pi\|}{\|M\|^{3/2}} = \frac{\|\pi\|}{C \|M\|}, $$
hence $\|\pi\| \le C \|M\|$. It follows that
$$ 1 \ge C^{-1} (C^{-1} \|\pi\|)^{1/2}\, \|\pi \circ M^{-1/2}\| = \|\tilde\pi \circ M^{-1/2}\|, $$
where $\tilde\pi := C^{-3/2} \|\pi\|^{1/2}\, \pi$. The solution $\tilde M$ of the minimization problem
$$ \inf\left\{ \det M ;\ \|\tilde\pi \circ M^{-1/2}\| \le 1 \right\} $$
is known exactly in the case of cubic polynomials in two dimensions, see Proposition 2.4.4 in Chapter 2, and satisfies
$$ (\det \tilde M)^3 = \kappa\, |\operatorname{disc} \tilde\pi| = \kappa\, (C^{-3/2} \|\pi\|^{1/2})^4\, |\operatorname{disc} \pi| = \kappa\, C^{-6}\, \|\pi\|^2\, |\operatorname{disc} \pi|, $$
where $\kappa$ is a positive constant which depends only on the sign of $\operatorname{disc} \tilde\pi$.
Defining $c_1 := \tilde\kappa^{1/6} C_0^{-1}$, where $\tilde\kappa$ stands for the minimal value of $\kappa$ for the two possible signs, we obtain
\[
L_g(\pi) \ge \sqrt{\det \tilde M} = \kappa^{1/6}\, C_0^{-1}\, \|\pi\|^{1/3}\, |\operatorname{disc}\pi|^{1/6} \ge c_1 \sqrt{\|\pi\|\,\nu(\pi)}.
\]
On the other hand, there exists $c_2 > 0$ such that
\[
L_g(\pi) \ge L_a(\pi) \ge c_2\sqrt{\det \mathcal{M}_a(\pi)} = c_2\sqrt{\lambda(\pi)\mu(\pi)}.
\]
Furthermore, since $\nu(\pi) \le C_0\|\pi\| \le C_0^2\lambda(\pi)$, we have
\[
\det \mathcal{M}_g(\pi) = (\lambda(\pi) + \nu(\pi))(\mu(\pi) + \nu(\pi)) \le (1 + C_0^2)\min\{\lambda(\pi), \|\pi\|\}\,\big(2\max\{\mu(\pi), \nu(\pi)\}\big) \le 2(1 + C_0^2)\max\{\lambda(\pi)\mu(\pi),\ \|\pi\|\,\nu(\pi)\}.
\]
We thus obtain
\[
\sqrt{\det \mathcal{M}_g(\pi)} \le C_* L_g(\pi) \quad\text{where}\quad C_*^2 := 2(1 + C_0^2)\max\{c_1^{-2}, c_2^{-2}\},
\]
which concludes the proof of our lower estimate for $L_g$.

We now turn to the proof of the property (6.59), which is equivalent to the following:
\[
\|\mathcal{M}_g(\pi)\|\, \big|\mathcal{M}_g(\pi)^{-1/2}\,\nabla\pi(x, y)\big|^2 \le C^2 \quad\text{when}\quad \|(x, y)\|_{\mathcal{M}_g(\pi)} \le 1.
\]
For any rotation $U \in \mathcal{O}_2$ we have $\mathcal{M}_a(\pi \circ U) = U^T \mathcal{M}_a(\pi)\, U$, see Proposition 3.4.3 in Chapter 3, and $\operatorname{disc}(\pi \circ U) = (\operatorname{disc}\pi)(\det U)^6 = \operatorname{disc}\pi$, see Chapter 2. We may therefore assume, up to a rotation, that $\mathcal{M}_a(\pi)$ is a diagonal matrix and that the first diagonal coefficient is larger than the second one. Therefore $\lambda(\pi)^2 = a^2 + 2b^2 + c^2$ and $\mu(\pi)^2 = b^2 + 2c^2 + d^2$, and
\[
\mathcal{M}_a(\pi) = \begin{pmatrix} \lambda(\pi) & 0 \\ 0 & \mu(\pi) \end{pmatrix}.
\]
Note that this matrix is degenerate if and only if $\mu(\pi) = 0$, which means that $\pi$ is the univariate polynomial $ax^3$. In order to avoid notational clutter, and since $\pi \in \mathbb{H}_3$ is fixed, we denote below by $\lambda$, $\mu$ and $\nu$ the quantities $\lambda(\pi)$, $\mu(\pi)$ and $\nu(\pi)$. Our purpose is to establish an upper bound on the quantity
\[
\|\mathcal{M}_g(\pi)\|\, \big|\mathcal{M}_g(\pi)^{-1/2}\,\nabla\pi(x, y)\big|^2 = 9\left[(ax^2 + 2bxy + cy^2)^2 + \left(\frac{\lambda + \nu}{\mu + \nu}\right)(bx^2 + 2cxy + dy^2)^2\right] \tag{6.64}
\]
under the hypothesis
\[
\|(x, y)\|_{\mathcal{M}_g(\pi)}^2 = (\lambda + \nu)x^2 + (\mu + \nu)y^2 \le 1,
\]
which implies in particular $\lambda x^2 \le 1$ and $\max\{\mu, \nu\}\, y^2 \le 1$. Injecting this in (6.64), and observing using (6.62) and (6.63) that $\nu \le C_0^2\lambda$, we see that it is sufficient to bound the following quantities
\[
\frac{|a|}{\lambda},\quad \frac{|b|}{\sqrt{\lambda\mu}},\quad \frac{|c|}{\mu} \quad\text{and}\quad \sqrt{\frac{\lambda}{\mu}}\,\frac{|b|}{\lambda},\quad \sqrt{\frac{\lambda}{\mu}}\,\frac{|c|}{\sqrt{\mu\lambda}},\quad \sqrt{\frac{\lambda}{\max\{\mu, \nu\}}}\,\frac{|d|}{\max\{\mu, \nu\}}. \tag{6.65}
\]
Observing that
\[
\mu \ge \max\{|b|, |c|, |d|\} \quad\text{and}\quad \lambda \ge \max\{|a|, |b|, |c|, \mu\} \ge \max\{|a|, |b|, |c|, |d|\}, \tag{6.66}
\]
we find that all the quantities appearing in (6.65) are smaller than one, except perhaps the last one:
\[
\sqrt{\frac{\lambda}{\max\{\mu, \nu\}}}\,\frac{|d|}{\max\{\mu, \nu\}}. \tag{6.67}
\]
Using the expression (6.58) of the discriminant we find that
\[
-\operatorname{disc}\pi = (a^2 + 2b^2 + c^2)\,d^2 + a\,Q(b, c, d) + R(b, c, d),
\]
where $Q$ and $R$ are homogeneous polynomials of degree 3 and 4 respectively. Hence there exists a constant $C_1$, independent of $\pi$, such that
\[
\lambda^2 d^2 = -\operatorname{disc}\pi - \big(a\,Q(b, c, d) + R(b, c, d)\big) \le \|\pi\|\,\nu^3 + C_1\lambda\mu^3 \le (C_0 + C_1)\,\lambda \max\{\mu, \nu\}^3.
\]
The previous inequality yields a uniform bound on (6.67), which concludes the proof of this proposition. $\diamond$

Proposition 6.5.4 immediately gives an explicit expression, up to a fixed multiplicative constant, of the shape functions $L_a$ and $L_g$ in the case $m = 3$ and $d = 2$: $L_a(\pi) \simeq \sqrt{\det \mathcal{M}_a(\pi)}$ and $L_g(\pi) \simeq \sqrt{\det \mathcal{M}_g(\pi)}$. It follows that
\[
L_g(ax^3 + dy^3) \simeq \sqrt[3]{|ad| \max\{|a|, |d|\}} \quad\text{and}\quad L_a(ax^3 + dy^3) \simeq \sqrt{|ad|},
\]
and
\[
L_g(ax^3 + 3cxy^2) \simeq L_a(ax^3 + 3cxy^2) \simeq \sqrt{\max\{|a|, |c|\}\,|c|},
\]
where $f(a, b, c, d) \simeq g(a, b, c, d)$ means that there exists a constant $C \ge 1$ such that $C^{-1} f(a, b, c, d) \le g(a, b, c, d) \le C f(a, b, c, d)$ for all $a, b, c, d \in \mathbb{R}$.

For a function which has anisotropic features of the type $ax^3 + dy^3$, where $a$ and $d$ have different orders of magnitude, the use of quasi-acute meshes and metrics may therefore lead to a substantial improvement of the approximation error compared to the use of graded meshes and metrics. In contrast, this is not the case if all the anisotropic features of the approximated function are of the type $ax^3 + 3cxy^2$.

We introduce in this section the geometric average of two symmetric positive definite matrices, and the property of geometric convexity for functions defined on $S_d^+$. This notion is a variant of the classical property of convexity, which is appropriate for the study of the minimization problems appearing in the expressions (6.54), (6.55) and (6.56) of the shape functions $K$, $L_a$ and $L_g$.

Let $M, M' \in S_d^+$, and let $S := M^{-1/2}$. We define the geometric average $\operatorname{Avg}(M, M') \in S_d^+$ of $M$ and $M'$ as follows:
\[
S \operatorname{Avg}(M, M')\, S := \sqrt{S M' S}. \tag{6.68}
\]
For instance $\operatorname{Avg}(M, M^{-1}) = \mathrm{Id}$, and for any $k, k' > 0$ we have
\[
\operatorname{Avg}(kM, k'M') = \sqrt{kk'}\, \operatorname{Avg}(M, M').
\]
The following proposition gives an alternative characterization (6.69) of $\operatorname{Avg}(M, M')$. This proposition also shows that $\operatorname{Avg}(M, M')$ is a natural midpoint between $M$ and $M'$ for the distance $d_\times$, and establishes two basic properties of the geometric average of matrices.

Proposition 6.5.5. For any $M, M' \in S_d^+$ the following holds.

1. A matrix $\tilde M \in S_d^+$ satisfies $\tilde M = \operatorname{Avg}(M, M')$ if and only if there exists $A \in GL_d$ and a diagonal matrix $D$ such that
\[
\tilde M = A^T A, \qquad M = A^T e^{2D} A, \qquad M' = A^T e^{-2D} A. \tag{6.69}
\]

2. The geometric average $\operatorname{Avg}(M, M')$ is a midpoint between $M$ and $M'$ for the distance $d_\times$ on $S_d^+$:
\[
d_\times(M, \operatorname{Avg}(M, M')) = d_\times(\operatorname{Avg}(M, M'), M') = \tfrac{1}{2}\, d_\times(M, M').
\]

3. The geometric average is commutative and compatible with the inversion of matrices:
\[
\operatorname{Avg}(M, M') = \operatorname{Avg}(M', M), \tag{6.70}
\]
\[
\operatorname{Avg}(M, M')^{-1} = \operatorname{Avg}(M^{-1}, M'^{-1}). \tag{6.71}
\]

Proof: We first establish Point 1. Let us first assume that $M$, $M'$ and $\tilde M$ have the form (6.69). We easily obtain
\[
\tilde M M^{-1} \tilde M = M'.
\]
Defining $S := M^{-1/2}$ we thus have
\[
(S \tilde M S)^2 = S M' S.
\]
Taking the square root of the previous equation we obtain (6.68), which shows that $\tilde M = \operatorname{Avg}(M, M')$.

In order to establish the converse of this property we only need to show that for any $M, M' \in S_d^+$ there exists $A \in GL_d$ and a diagonal matrix $D$ such that $M = A^T e^{2D} A$ and $M' = A^T e^{-2D} A$. The expression of $\operatorname{Avg}(M, M')$ then automatically follows from the previous argument. Since the matrix $M'^{-1/2} M M'^{-1/2}$ is symmetric positive definite, there exists $U \in \mathcal{O}_d$ and a diagonal matrix $D$ such that $M'^{-1/2} M M'^{-1/2} = U^T \exp(4D)\, U$. Choosing $A := e^{D} U M'^{1/2}$ we obtain the announced result.

We now turn to the proof of Point 2. Let $A \in GL_d$ and let $D, D'$ be diagonal matrices, of diagonal coefficients $D_1, \cdots, D_d$ and $D'_1, \cdots, D'_d$ respectively. We define
\[
\delta := \max_{1 \le i \le d} |D_i - D'_i|.
\]
We claim that the matrices $N := A^T e^{2D} A$ and $N' := A^T e^{2D'} A$ satisfy $d_\times(N, N') = \delta$. Indeed
\[
d_\times(N, N') := \sup_{u \ne 0} \big|\ln \|u\|_N - \ln \|u\|_{N'}\big| = \sup_{u \ne 0} \big|\ln |e^{D} A u| - \ln |e^{D'} A u|\big| = \sup_{v \ne 0} \big|\ln |e^{D} v| - \ln |e^{D'} v|\big|.
\]
For any $v = (v_1, \cdots, v_d) \in \mathbb{R}^d$ we have
\[
|e^{D} v|^2 = \sum_{1 \le i \le d} e^{2D_i} v_i^2 \le e^{2\delta} \sum_{1 \le i \le d} e^{2D'_i} v_i^2 = \big(e^{\delta} |e^{D'} v|\big)^2,
\]
and similarly $|e^{D'} v| \le e^{\delta} |e^{D} v|$. Therefore $\big|\ln |e^{D} v| - \ln |e^{D'} v|\big| \le \delta$, and taking the supremum among all $v \ne 0$ we obtain $d_\times(N, N') \le \delta$. Furthermore denote by $i_0$, $1 \le i_0 \le d$, the position such that $|D_{i_0} - D'_{i_0}| = \delta$. Choosing $v = (0, \cdots, 0, 1, 0, \cdots, 0)$, with the coefficient $1$ in position $i_0$, we obtain $\delta = \big|\ln |e^{D} v| - \ln |e^{D'} v|\big| \le d_\times(N, N')$. We thus conclude that
\[
d_\times(N, N') = d_\times(A^T e^{2D} A,\ A^T e^{2D'} A) = \delta = \|D - D'\|.
\]
Recalling that $M$, $M'$ and $\tilde M := \operatorname{Avg}(M, M')$ have the form (6.69) we thus obtain
\[
d_\times(M, \operatorname{Avg}(M, M')) = d_\times(\operatorname{Avg}(M, M'), M') = \|D\| \quad\text{and}\quad d_\times(M, M') = 2\|D\|,
\]
which concludes the proof of this point.

Finally we establish Point 3. Using the first point we obtain that $M$, $M'$ and $\tilde M := \operatorname{Avg}(M, M')$ have the form
\[
\tilde M = A^T A, \qquad M = A^T e^{2D} A, \qquad M' = A^T e^{-2D} A.
\]
It follows that
\[
\tilde M = A^T A, \qquad M' = A^T e^{-2D} A, \qquad M = A^T e^{2D} A, \tag{6.72}
\]
and, denoting $A^{-T} := (A^{-1})^T$,
\[
\tilde M^{-1} = (A^{-T})^T A^{-T}, \qquad M^{-1} = (A^{-T})^T e^{-2D} A^{-T}, \qquad M'^{-1} = (A^{-T})^T e^{2D} A^{-T}. \tag{6.73}
\]
Equation (6.70) (resp. (6.71)) is the consequence of (6.72) (resp. (6.73)) and of the characterization of the geometric average given in the first point. $\diamond$

We recall that a function $\psi \in C^0(S_d, \mathbb{R})$ is convex if and only if for all $M, M' \in S_d$ one has
\[
\psi\left(\frac{M + M'}{2}\right) \le \frac{\psi(M) + \psi(M')}{2}. \tag{6.74}
\]
The next definition introduces the family of geometrically convex functions on $S_d^+$, which is defined by a property similar to (6.74) but in which the arithmetic averages are replaced with geometric averages.

Definition 6.5.6. We say that a function $\varphi \in C^0(S_d^+, \mathbb{R}_+^*)$ is Geometrically Convex and Homogeneous (GCH) if it satisfies the following properties.

i) (Sub-multiplicativity) For any $M, M' \in S_d^+$ one has
\[
\varphi(\operatorname{Avg}(M, M')) \le \sqrt{\varphi(M)\,\varphi(M')}. \tag{6.75}
\]

ii) (Homogeneity) There exists $\alpha \in \mathbb{R}$, called the degree of $\varphi$, such that for all $M \in S_d^+$ and all $\lambda \in \mathbb{R}_+^*$ one has
\[
\varphi(\lambda M) = \lambda^{\alpha} \varphi(M).
\]

We say that the function $\varphi$ is Strictly Geometrically Convex and Homogeneous (SGCH) if $\varphi$ is GCH and if the equality in (6.75) only holds if $M$ and $M'$ are proportional.

For instance the function $\det$ is GCH of degree $d$, since
\[
\det \operatorname{Avg}(M, M') = \sqrt{\det M \det M'} \quad\text{and}\quad \det(\lambda M) = \lambda^d \det M.
\]
We insist on the point that GCH functions are by assumption continuous and strictly positive on $S_d^+$. The next lemma enumerates some simple properties of geometrically convex functions.

Lemma 6.5.7.

1. The product of two GCH functions is GCH, and the product of a GCH function with a SGCH function is SGCH. Furthermore the degrees add.

2. The elevation to a positive power $\alpha \in \mathbb{R}_+^*$ of a GCH (resp. SGCH) function is also GCH (resp. SGCH). Furthermore the degree is multiplied by $\alpha$.

3. If a function $\varphi$ is GCH (resp. SGCH) then $M \mapsto \varphi(M^{-1})$ is also GCH (resp. SGCH), and has the opposite degree.
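The definition (6.68) of the geometric average and the identities stated above can be checked numerically. The following is a minimal sketch in pure Python, restricted to the case $d = 2$; the helper names (`sqrt_spd`, `geometric_average`, and so on) are illustrative and do not come from the text. It relies on the closed form $\sqrt{M} = (M + \sqrt{\det M}\,\mathrm{Id}) / \sqrt{\operatorname{Tr} M + 2\sqrt{\det M}}$, valid for $2 \times 2$ symmetric positive definite matrices by the Cayley-Hamilton theorem.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def inv(A):
    d = det(A)
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

def sqrt_spd(M):
    """Square root of a 2x2 symmetric positive definite matrix.

    Uses (M + s Id)^2 = (Tr M + 2 s) M with s = sqrt(det M).
    """
    s = math.sqrt(det(M))
    t = math.sqrt(M[0][0] + M[1][1] + 2.0 * s)
    return [[(M[0][0] + s) / t, M[0][1] / t],
            [M[1][0] / t, (M[1][1] + s) / t]]

def geometric_average(M, Mp):
    """Avg(M, M') = M^{1/2} (M^{-1/2} M' M^{-1/2})^{1/2} M^{1/2}, see (6.68)."""
    R = sqrt_spd(M)          # M^{1/2}
    S = inv(R)               # S = M^{-1/2}
    inner = sqrt_spd(matmul(matmul(S, Mp), S))
    return matmul(matmul(R, inner), R)

def close(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

# Two arbitrary symmetric positive definite matrices (illustrative values).
M = [[4.0, 1.0], [1.0, 3.0]]
Mp = [[2.0, -0.5], [-0.5, 1.0]]
G = geometric_average(M, Mp)
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

print("Avg(M, M^-1) = Id        :", close(geometric_average(M, inv(M)), IDENTITY))
print("commutativity (6.70)     :", close(G, geometric_average(Mp, M)))
print("inversion (6.71)         :", close(inv(G), geometric_average(inv(M), inv(Mp))))
print("det Avg = sqrt(det det') :", abs(det(G) - math.sqrt(det(M) * det(Mp))) < 1e-9)
```

The four checks correspond to the example following (6.68), to (6.70) and (6.71), and to the identity showing that $\det$ is GCH. A general-dimension version would replace `sqrt_spd` with an eigendecomposition-based square root.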
Proof: The first two points are immediate, and the last point is a direct consequence of the compatibility (6.71) of the geometric average with the inversion: if $\varphi$ is GCH and $M, M' \in S_d^+$ then
\[
\varphi(\operatorname{Avg}(M, M')^{-1}) = \varphi(\operatorname{Avg}(M^{-1}, M'^{-1})) \le \sqrt{\varphi(M^{-1})\,\varphi(M'^{-1})},
\]
which concludes the proof. $\diamond$

Aside from the determinant, a number of functions on $S_d^+$ are geometrically convex. The next proposition, which is based on an argument of complex analysis, enumerates some of them. For all $\pi \in \mathbb{H}_m$ we define
\[
\|\pi\|_{\mathbb{C}} := \sup\{|\pi(z)| \;;\; z \in \mathbb{C}^d,\ |z| \le 1\}.
\]

Proposition 6.5.8.

1. The trace map $M \mapsto \operatorname{Tr} M$ is SGCH.

2. For any fixed polynomial $\pi \in \mathbb{H}_m \setminus \{0\}$ the map $M \mapsto \|\pi \circ M^{1/2}\|_{\mathbb{C}}$ is GCH, of degree $m/2$.

3. The norm map $M \mapsto \|M\|$ is GCH.

Proof: We first establish Point 1. Let $M, M' \in S_d^+$ and let $\tilde M := \operatorname{Avg}(M, M')$. Let $A \in GL_d$ and let $D$ be a diagonal matrix such that $M = A^T e^{2D} A$, $M' = A^T e^{-2D} A$ and $\tilde M = A^T A$. We denote by $(A_{ij})_{1 \le i, j \le d}$ the coefficients of $A$, and we define for all $1 \le i \le d$
\[
A_i := \sum_{1 \le j \le d} A_{ij}^2.
\]
Note that $A_i > 0$ for all $1 \le i \le d$, since otherwise a full line of $A$ would be zero and $A$ would be degenerate. Denoting by $D_1, \cdots, D_d$ the diagonal coefficients of $D$ we obtain
\[
\operatorname{Tr}(M) = \sum_{1 \le i \le d} A_i e^{2D_i}, \qquad \operatorname{Tr}(M') = \sum_{1 \le i \le d} A_i e^{-2D_i}, \qquad \operatorname{Tr}(\tilde M) = \sum_{1 \le i \le d} A_i.
\]
It thus follows from the Cauchy-Schwarz inequality that $\operatorname{Tr}(\tilde M)^2 \le \operatorname{Tr}(M) \operatorname{Tr}(M')$. Furthermore equality occurs if and only if the vectors
\[
\big(\sqrt{A_i}\, e^{D_i}\big)_{1 \le i \le d} \quad\text{and}\quad \big(\sqrt{A_i}\, e^{-D_i}\big)_{1 \le i \le d}
\]
are proportional, which implies that $D_i = D_j$ for all $1 \le i, j \le d$ and therefore that $M$ and $M'$ are proportional.

We now turn to the proof of Point 2, and for that purpose we recall a classical result of complex analysis, called Hadamard's three lines theorem. Let $\Omega := \{\alpha \in \mathbb{C} \;;\; |\Re(\alpha)| < 1\}$, where $\Re$ denotes the real part, and let $g$ be a continuous and bounded function on $\overline{\Omega}$ which is holomorphic on $\Omega$. Then
\[
|g(0)|^2 \le \left(\sup_{\Re\alpha = -1} |g(\alpha)|\right) \left(\sup_{\Re\alpha = 1} |g(\alpha)|\right). \tag{6.76}
\]
We consider a fixed $z \in \mathbb{C}^d$ satisfying $|z| \le 1$, and we define for all $\alpha \in \overline{\Omega}$
\[
g(\alpha) := \pi(A^T e^{\alpha D} z).
\]
The function $g$ is continuous, uniformly bounded on $\overline{\Omega}$ by $\|\pi\|_{\mathbb{C}}\, (\|A\|\, e^{\|D\|})^m$, and holomorphic on $\mathbb{C}$, hence on $\Omega$. Therefore Hadamard's three lines theorem applies. For any $\sigma \in \{-1, 1\}$ one has
\[
\sup_{\Re(\alpha) = \sigma} |\pi(A^T e^{\alpha D} z)| \le \|\pi \circ (A^T e^{\sigma D})\|_{\mathbb{C}},
\]
since $|e^{i \Im(\alpha) D} z| = |z| \le 1$, where $\Im$ denotes the imaginary part and $i$ the imaginary unit. Applying (6.76) to the function $g$ we thus obtain
\[
|\pi(A^T z)|^2 \le \|\pi \circ (A^T e^{D})\|_{\mathbb{C}}\, \|\pi \circ (A^T e^{-D})\|_{\mathbb{C}}.
\]
Taking the supremum among all $z \in \mathbb{C}^d$ such that $|z| \le 1$ we obtain
\[
\|\pi \circ A^T\|_{\mathbb{C}}^2 \le \|\pi \circ (A^T e^{D})\|_{\mathbb{C}}\, \|\pi \circ (A^T e^{-D})\|_{\mathbb{C}}.
\]
One easily checks that the matrices $O, O', \tilde O \in GL_d$ defined by
\[
O M^{1/2} = e^{D} A, \qquad O' M'^{1/2} = e^{-D} A \quad\text{and}\quad \tilde O \tilde M^{1/2} = A,
\]
are orthogonal. Since orthogonal matrices are also unitary matrices, in the sense that $|Oz| = |z|$ for all $z \in \mathbb{C}^d$, we obtain
\[
\|\pi \circ A^T e^{D}\|_{\mathbb{C}} = \|\pi \circ M^{1/2}\|_{\mathbb{C}}, \qquad \|\pi \circ A^T e^{-D}\|_{\mathbb{C}} = \|\pi \circ M'^{1/2}\|_{\mathbb{C}}, \qquad \|\pi \circ A^T\|_{\mathbb{C}} = \|\pi \circ \tilde M^{1/2}\|_{\mathbb{C}},
\]
thus $M \mapsto \|\pi \circ M^{1/2}\|_{\mathbb{C}}$ satisfies the sub-multiplicativity property (6.75). Furthermore the function $M \in S_d^+ \mapsto \|\pi \circ M^{1/2}\|_{\mathbb{C}}$ is clearly continuous, strictly positive, and homogeneous of degree $m/2$.

We finally turn to Point 3, and we denote by $\pi \in \mathbb{H}_2$ the canonical quadratic form
\[
\pi(z) := \sum_{1 \le j \le d} z_j^2.
\]
Our purpose is to establish that for any $M \in S_d^+$
\[
\|\pi \circ M^{1/2}\|_{\mathbb{C}} = \|M\|, \tag{6.77}
\]
which implies that the map $M \mapsto \|M\|$ is GCH in view of the previous point. For any $z \in \mathbb{C}^d$ such that $|z| \le 1$ we have
\[
|\pi(M^{1/2} z)| = |z^T M^{1/2} M^{1/2} z| \le |M^{1/2} z|^2 \le \|M^{1/2}\|^2 = \|M\|.
\]
Furthermore choosing a normalized eigenvector $z \in \mathbb{R}^d$ associated to the maximal eigenvalue of $M$ we obtain $|\pi(M^{1/2} z)| = \|M\|$, which establishes (6.77) and concludes the proof. $\diamond$

For each $M \in S_d^+$ we define
\[
\kappa(M) := \frac{1}{d} \sqrt{\operatorname{Tr}(M) \operatorname{Tr}(M^{-1})}.
\]
The function $\kappa$ is SGCH, of degree $0$, according to Lemma 6.5.7 and since $M \mapsto \operatorname{Tr} M$ is SGCH. In terms of the eigenvalues $\lambda_1, \cdots, \lambda_d$ of $M$ we have
\[
d^2 \kappa(M)^2 = \left(\sum_{1 \le i \le d} \lambda_i\right) \left(\sum_{1 \le i \le d} \lambda_i^{-1}\right),
\]
therefore $\kappa(M) \ge 1$, using the Cauchy-Schwarz inequality, with equality if and only if $M = m \mathrm{Id}$ for some $m > 0$. Note also that
\[
\kappa(M) \le \sqrt{\|M\|\,\|M^{-1}\|} \le d\, \kappa(M).
\]
We regard the function $\kappa$ as a close variation on the condition number $\sqrt{\|M\|\,\|M^{-1}\|}$ of the matrix $M^{1/2}$.

We define in this section three variants of the shape functions $K$, $L_a$ and $L_g$: for all $\pi \in \mathbb{H}_m$
\[
K^{\mathbb{C}}(\pi) := \inf_{A \in SL_d} \|\pi \circ A\|_{\mathbb{C}}, \qquad
L_a^{\mathbb{C}}(\pi) := \inf_{A \in SL_d} \sqrt{\|G(\pi) \circ A\|_{\mathbb{C}}}, \qquad
L_g^{\mathbb{C}}(\pi) := \inf_{A \in SL_d} \|A^{-1}\|\, \|\pi \circ A\|_{\mathbb{C}},
\]
where $G(\pi) := |\nabla\pi|^2 \in \mathbb{H}_{2m-2}$. For each $\alpha \in (0, \infty)$ we define three new variants of the shape functions:
\[
K^{(\alpha)}(\pi) := \inf\left\{\kappa(M)^{1/\alpha} (\det M)^{\frac{m}{2d}} \;;\; \|\pi \circ M^{-1/2}\|_{\mathbb{C}} \le 1 \text{ and } \kappa(M) \le e^{\alpha}\right\},
\]
\[
L_a^{(\alpha)}(\pi) := \inf\left\{\kappa(M)^{1/\alpha} (\det M)^{\frac{m-1}{2d}} \;;\; \|G(\pi) \circ M^{-1/2}\|_{\mathbb{C}} \le 1 \text{ and } \kappa(M) \le e^{\alpha}\right\},
\]
\[
L_g^{(\alpha)}(\pi) := \inf\left\{\kappa(M)^{1/\alpha} (\det M)^{\frac{m-1}{2d}} \;;\; \|M\|^{1/2}\, \|\pi \circ M^{-1/2}\|_{\mathbb{C}} \le 1 \text{ and } \kappa(M) \le e^{\alpha}\right\}.
\]
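The quantity $\kappa(M)$ only depends on the eigenvalues of $M$, which makes its comparison with the condition number easy to test. The snippet below is a small sketch, assuming that the spectrum of $M$ is given as a list of positive numbers; the function names are illustrative and are not part of the text.

```python
import math

def kappa(eigenvalues):
    """kappa(M) = (1/d) sqrt(Tr(M) Tr(M^{-1})), computed from the spectrum of M."""
    d = len(eigenvalues)
    trace = sum(eigenvalues)
    trace_inverse = sum(1.0 / lam for lam in eigenvalues)
    return math.sqrt(trace * trace_inverse) / d

def sqrt_condition_number(eigenvalues):
    """sqrt(||M|| ||M^{-1}||) for a symmetric positive definite matrix M."""
    return math.sqrt(max(eigenvalues) / min(eigenvalues))

# Illustrative spectrum of a strongly anisotropic 3x3 matrix.
spectrum = [1.0, 4.0, 100.0]
k = kappa(spectrum)
c = sqrt_condition_number(spectrum)

# Expected ordering: 1 <= kappa <= sqrt(cond) <= d * kappa.
print(1.0 <= k <= c <= len(spectrum) * k)

# kappa is homogeneous of degree 0, and equals 1 on multiples of the identity.
print(abs(kappa([3.0 * lam for lam in spectrum]) - k) < 1e-12)
print(abs(kappa([2.0, 2.0, 2.0]) - 1.0) < 1e-12)
```

Unlike the condition number, $\kappa$ depends smoothly on all eigenvalues, which is consistent with its strict geometric convexity and with its use as a regularization term in the variants $K^{(\alpha)}$, $L_a^{(\alpha)}$ and $L_g^{(\alpha)}$.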
This section is devoted to the proof of the following theorem.

Theorem 6.5.9.

1. The shape function $K$ (resp. $L_a$, resp. $L_g$) is uniformly equivalent to $K^{\mathbb{C}}$ on $\mathbb{H}_m$ (resp. $L_a^{\mathbb{C}}$, resp. $L_g^{\mathbb{C}}$).

2. For each $\pi \in \mathbb{H}_m$ we have the decreasing convergence
\[
\lim_{\alpha \to \infty} K^{(\alpha)}(\pi) = K^{\mathbb{C}}(\pi)
\]
(resp. $L_a^{(\alpha)}(\pi) \to L_a^{\mathbb{C}}(\pi)$ decreasingly, resp. $L_g^{(\alpha)}(\pi) \to L_g^{\mathbb{C}}(\pi)$ decreasingly, as $\alpha \to \infty$).

3. For each $\alpha \in (0, \infty)$ and each $\pi \in \mathbb{H}_m \setminus \{0\}$ there exists a unique minimizer $\mathcal{M}^{(\alpha)}(\pi) \in S_d^+$ to the optimization problem defining $K^{(\alpha)}(\pi)$, and the map
\[
\pi \in \mathbb{H}_m \setminus \{0\} \mapsto \mathcal{M}^{(\alpha)}(\pi) \in S_d^+
\]
is continuous (resp. likewise for $L_a^{(\alpha)}(\pi)$ and $\mathcal{M}_a^{(\alpha)}(\pi)$, resp. likewise for $L_g^{(\alpha)}(\pi)$ and $\mathcal{M}_g^{(\alpha)}(\pi)$).

We first establish Point 1 of this theorem. Since the vector spaces $\mathbb{H}_m$ and $\mathbb{H}_{2m-2}$ have finite dimension, there exist three constants $C_K$, $C_a$ and $C_g$ such that
\[
\begin{aligned}
\|\pi\| \le \|\pi\|_{\mathbb{C}} &\le C_K \|\pi\|, && \pi \in \mathbb{H}_m, \\
\|\pi'\| \le \|\pi'\|_{\mathbb{C}} &\le C_a \|\pi'\|, && \pi' \in \mathbb{H}_{2m-2}, \\
C_g^{-1} \|\nabla\pi\| \le \|\pi\|_{\mathbb{C}} &\le C_g \|\nabla\pi\|, && \pi \in \mathbb{H}_m.
\end{aligned} \tag{6.78}
\]
Hence it follows from the definition of the original shape functions, given in § [...], that $K \le K^{\mathbb{C}} \le C_K K$, $L_a \le L_a^{\mathbb{C}} \le \sqrt{C_a}\, L_a$ and $C_g^{-1} L_g \le L_g^{\mathbb{C}} \le C_g L_g$ on $\mathbb{H}_m$.

Point 2 of this theorem is an immediate consequence of the fact that $\kappa(M) \in [1, \infty)$ for all $M \in S_d^+$.

The following proposition illustrates the use of $\kappa$ as a regularization term, and immediately implies Point 3 of Theorem 6.5.9.

Proposition 6.5.10. Let $\varphi$ be a GCH function of degree $r > 0$. Let $(X, d_X)$ be a metric space, and let $\psi \in C^0(S_d^+ \times X, \mathbb{R}_+^*)$ be such that $\psi(\cdot, x)$ is GCH of degree $-1$ for any fixed $x \in X$.
We define for all $x \in X$ and all $\alpha \in (0, \infty)$
\[
L^{(\alpha)}(x) := \inf\left\{\kappa(M)^{1/\alpha} \varphi(M) \;;\; \psi(M, x) \le 1 \text{ and } \kappa(M) \le e^{\alpha}\right\}.
\]
For any $x \in X$ there exists a unique minimizer $\mathcal{M}^{(\alpha)}(x)$ to the optimization problem defining $L^{(\alpha)}(x)$. Furthermore the map $\mathcal{M}^{(\alpha)} : X \to S_d^+$ is continuous.

Proof: We consider a fixed parameter $\alpha \in (0, \infty)$. We first establish the existence of a minimizer to the optimization problem defining $L^{(\alpha)}(x)$, and we thus consider a fixed $x \in X$. For any $\beta \ge 0$ we define the compact subset of $S_d^+$
\[
K_\beta := \{M \in S_d^+ \;;\; |\ln(\det M)| \le \beta \text{ and } \kappa(M) \le e^{\alpha}\}. \tag{6.79}
\]
We define
\[
\Lambda := \sup_{M \in K_0} \varphi(M) \quad\text{and}\quad \lambda := \inf_{M \in K_0} \psi(M, x),
\]
in such way that, by homogeneity, for all $M \in S_d^+$ satisfying $\kappa(M) \le e^{\alpha}$
\[
\varphi(M) \le \Lambda (\det M)^{\frac{r}{d}} \quad\text{and}\quad \psi(M, x) \ge \lambda (\det M)^{-\frac{1}{d}}. \tag{6.80}
\]
Let $(M_n)_{n \ge 0}$, $M_n \in S_d^+$, be a minimizing sequence for the optimization problem defining $L^{(\alpha)}(x)$. It follows from (6.80) that $\det(M_n)$ is uniformly bounded above and below: hence there exists $\beta \ge 0$ such that $|\ln \det M_n| \le \beta$ for all $n \ge 0$. Furthermore $\kappa(M_n) \le e^{\alpha}$ for each $n \ge 0$, hence $M_n \in K_\beta$. Since the set $K_\beta$ is compact there exists a converging subsequence $M_{\sigma(n)} \to M_\infty \in K_\beta$. By continuity of the functions $\varphi$, $\psi(\cdot, x)$ and $\kappa$ on $S_d^+$, the matrix $M_\infty$ is a minimizer of the optimization problem $L^{(\alpha)}(x)$.

We now consider two minimizers $M, M'$ to the optimization problem defining $L^{(\alpha)}(x)$, and we intend to show that $M = M'$. We denote $\tilde M := \operatorname{Avg}(M, M')$. Since the functions $\psi(\cdot, x)$ and $\kappa$ are GCH, the matrix $\tilde M$ satisfies the inequalities
\[
\psi(\tilde M, x) \le 1 \quad\text{and}\quad \kappa(\tilde M) \le e^{\alpha}.
\]
We define for each $M \in S_d^+$ the quantity
\[
\Phi(M) := \kappa(M)^{1/\alpha} \varphi(M), \tag{6.81}
\]
and we observe that $\Phi : S_d^+ \to \mathbb{R}_+^*$ is a SGCH function of degree $r > 0$. If $M$ is not proportional to $M'$ we thus have
\[
\Phi(\tilde M) < \sqrt{\Phi(M)\,\Phi(M')} = L^{(\alpha)}(x),
\]
which contradicts the fact that $M$ and $M'$ are minimizers of the optimization problem defining $L^{(\alpha)}(x)$.
Thus $M = mM'$ for some $m > 0$, but this implies
\[
L^{(\alpha)}(x) = \Phi(M') = \Phi(M) = m^r \Phi(M'),
\]
therefore $m = 1$ and $M = M'$, which establishes as announced that the minimizer is unique.

Finally we establish that the map $x \mapsto \mathcal{M}^{(\alpha)}(x)$ is continuous. For all $x, x' \in X$, we define
\[
\omega(x, x') := \sup_{M \in K_0} |\ln \psi(M, x) - \ln \psi(M, x')|,
\]
where the compact set $K_0$ is defined by (6.79), and we observe that $\omega(x, x') \to 0$ as $x' \to x$. (Note that $\omega(x, x')$ is the modulus of continuity of the continuous function $X \to C^0(K_0, \mathbb{R})$, defined by $x \mapsto \ln \psi(\cdot, x)$.)

Let $x, x' \in X$ and let $M := e^{\omega(x, x')} \mathcal{M}^{(\alpha)}(x)$ and $M' := e^{\omega(x, x')} \mathcal{M}^{(\alpha)}(x')$. We thus have $\kappa(M) = \kappa(\mathcal{M}^{(\alpha)}(x)) \le e^{\alpha}$, and likewise $\kappa(M') \le e^{\alpha}$. Since $\psi(\cdot, x)$ and $\psi(\cdot, x')$ are homogeneous of degree $-1$ we obtain
\[
\psi(M, x') \le e^{\omega(x, x')} \psi(M, x) = e^{\omega(x, x')}\, \psi\big(e^{\omega(x, x')} \mathcal{M}^{(\alpha)}(x), x\big) = \psi\big(\mathcal{M}^{(\alpha)}(x), x\big) \le 1,
\]
and likewise $\psi(M', x) \le 1$. Furthermore since the function $\Phi$ defined by (6.81) is homogeneous of degree $r$ we obtain
\[
L^{(\alpha)}(x') \le \Phi(M) = e^{r\omega(x, x')} L^{(\alpha)}(x),
\]
which implies
\[
\Phi(M') = \Phi\big(e^{\omega(x, x')} \mathcal{M}^{(\alpha)}(x')\big) = e^{r\omega(x, x')} L^{(\alpha)}(x') \le e^{2r\omega(x, x')} L^{(\alpha)}(x).
\]
Consider a converging sequence in $X$, $x_n \to x$, and define
\[
M_n := e^{\omega(x, x_n)} \mathcal{M}^{(\alpha)}(x_n).
\]
The arguments above show that $M_n$ is a minimizing sequence for the optimization problem $L^{(\alpha)}(x)$.
The arguments developed for the existence of a minimizer show that $\{M_n \;;\; n \ge 0\}$ is contained in the compact set $K_\beta$ if the constant $\beta \ge 0$ is sufficiently large, and that any converging subsequence of $(M_n)_{n \ge 0}$ tends to a minimizer of the optimization problem $L^{(\alpha)}(x)$, thus to $\mathcal{M}^{(\alpha)}(x)$ by uniqueness. Since the sequence $M_n$ takes its values in the compact set $K_\beta$ and has the single adherence value $\mathcal{M}^{(\alpha)}(x)$, we have the convergence $M_n \to \mathcal{M}^{(\alpha)}(x)$ as $n \to \infty$. Since $\omega(x, x_n) \to 0$ we obtain
\[
\mathcal{M}^{(\alpha)}(x_n) \to \mathcal{M}^{(\alpha)}(x),
\]
which concludes the proof of this proposition. $\diamond$

Remark 6.5.11. One easily checks that for each $\pi \in \mathbb{H}_m$, each $\alpha \in (0, \infty)$ and each $\lambda \in \mathbb{R}_+^*$ we have
\[
K^{(\alpha)}(\lambda\pi) = \lambda K^{(\alpha)}(\pi) \quad\text{and}\quad \mathcal{M}^{(\alpha)}(\lambda\pi) = \lambda^{\frac{2}{m}} \mathcal{M}^{(\alpha)}(\pi).
\]
It follows that $\mathcal{M}^{(\alpha)}(\pi) \to 0$ as $\pi \to 0$, and therefore that the map from $\mathbb{H}_m$ to $S_d$ defined by
\[
\pi \mapsto \begin{cases} \mathcal{M}^{(\alpha)}(\pi) & \text{if } \pi \ne 0, \\ 0 & \text{if } \pi = 0, \end{cases}
\]
is continuous. Likewise $L_a^{(\alpha)}$ and $L_g^{(\alpha)}$ are $1$-homogeneous on $\mathbb{H}_m$, and $\mathcal{M}_a^{(\alpha)}$ and $\mathcal{M}_g^{(\alpha)}$ are $\frac{2}{m-1}$-homogeneous on $\mathbb{H}_m$, which implies that they tend to $0$ as $\pi \to 0$, and are continuous on $\mathbb{H}_m$.

This subsection is devoted to the construction of a metric $H$ which is adapted to a given smooth function $f$, in the sense that the error $e_H(f)_p$ (resp. $e^a_H(\nabla f)_p$, resp. $e^g_H(\nabla f)_p$) is optimally small among all metrics of the same mass $m(H)$, up to a fixed multiplicative constant. Our construction only applies in the regime of a large mass of the metric $H$. We do not establish that it is optimal, but this is suggested by the lower error estimates given [...] $f$ is so far an open question.

Our first lemma shows that the regularity constraints defining the classes $\mathbf{H}_a$ and $\mathbf{H}_g$ of metrics "disappear" when one multiplies a metric with a large constant.

Lemma 6.5.12. For any $H \in \mathbf{H}_{\mathrm{per}}$ which is $C^1$, there exists $\lambda_0 = \lambda_0(H)$ such that $\lambda^2 H \in \mathbf{H}_a$ for all $\lambda \ge \lambda_0$.
Proof: The following maps are $C^1$ since they are the composition of $C^1$ maps:
\[
z \in \mathbb{R}^d \mapsto H(z)^{-1/2} \in S_d, \qquad
z \in \mathbb{R}^d \mapsto \varphi_\times(H(z)) := \big(u \mapsto \ln \|u\|_{H(z)}\big) \in C^0(\mathbb{S}, \mathbb{R}),
\]
where $\mathbb{S} := \{u \in \mathbb{R}^d \;;\; |u| = 1\}$ denotes the unit euclidean sphere (see § [...] for the definition of $\varphi_\times$). Since $H$ is $C^1$ and $\mathbb{Z}^d$-periodic these maps are uniformly Lipschitz, hence there exist two constants $C_+$, $C_\times$ such that for all $z, z' \in \mathbb{R}^d$
\[
\begin{aligned}
d_+(H(z), H(z')) &:= \|H(z)^{-1/2} - H(z')^{-1/2}\| \le C_+ |z - z'|, \\
d_\times(H(z), H(z')) &:= \|\varphi_\times(H(z)) - \varphi_\times(H(z'))\|_{L^\infty(\mathbb{S})} \le C_\times |z - z'|.
\end{aligned}
\]
Furthermore, since $H$ is continuous and periodic, there exists $\varepsilon > 0$ such that $H(z) \ge \varepsilon^2 \mathrm{Id}$ for all $z \in \mathbb{R}^d$. Hence
\[
d_H(z, z') \ge \varepsilon |z - z'|
\]
for all $z, z' \in \mathbb{R}^d$. From this point, it follows from Remark 5.1.13 in the previous chapter that $\lambda^2 H \in \mathbf{H}_a$ for all $\lambda \ge \lambda_0 := \max\{C_+, C_\times/\varepsilon\}$. $\diamond$

Given a function $f \in C^m_{\mathrm{per}}$, an exponent $1 \le p \le \infty$, and a triplet $s = (\alpha, \rho, \delta) \in (\mathbb{R}_+^*)^3$ of parameters, we construct a $C^\infty$ metric $H$ as follows. For each $z \in \mathbb{R}^d$ we define
\[
M(z) := \mathcal{M}^{(\alpha)}(\pi_z) + \rho\, \mathrm{Id},
\]
where the homogeneous polynomial $\pi_z \in \mathbb{H}_m$ is defined by (6.57). Note that $M$ is continuous according to Remark 6.5.11. We consider a fixed mollifier $\psi$, namely a radial non-negative compactly supported $C^\infty$ function of integral one, and we denote $\psi_\delta := \psi(\cdot/\delta)/\delta^d$. We then define
\[
H(z) := (\det M'(z))^{\frac{-1}{mp + d}} M'(z) \quad\text{where}\quad M' := \psi_\delta * M. \tag{6.82}
\]
Given a symbol $\star \in \{a, g\}$ we define similarly $M_\star(z) := \mathcal{M}_\star^{(\alpha)}(\pi_z) + \rho\, \mathrm{Id}$, $M'_\star := \psi_\delta * M_\star$ and
\[
H_\star(z) := (\det M'_\star(z))^{\frac{-1}{(m-1)p + d}} M'_\star(z).
\]
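The normalization step in (6.82) is a pointwise rescaling of the mollified matrix field, and its effect on the determinant can be checked directly. The sketch below applies it to a single $2 \times 2$ matrix; the mollification $M' = \psi_\delta * M$ is not implemented, and the function names and numerical values are illustrative assumptions rather than part of the construction.

```python
import math

def det2(M):
    """Determinant of a 2x2 matrix stored as nested lists."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def normalize_metric(M_prime, m, p, d=2):
    """Return H = (det M')^{-1/(m p + d)} M', the pointwise rescaling of (6.82)."""
    scale = det2(M_prime) ** (-1.0 / (m * p + d))
    return [[scale * entry for entry in row] for row in M_prime]

def local_density(H, lam=1.0, d=2):
    """Integrand sqrt(det(lam^2 H)) of the mass m(lam^2 H) at one point."""
    return math.sqrt((lam ** (2 * d)) * det2(H))

# Illustrative value of the mollified matrix M'(z) at some point z.
M_prime = [[4.0, 1.0], [1.0, 3.0]]
m, p, d = 3, 2.0, 2
H = normalize_metric(M_prime, m, p, d)

# det H = (det M')^{m p / (m p + d)}, as used in the proof of Theorem 6.5.13.
print(abs(det2(H) - det2(M_prime) ** (m * p / (m * p + d))) < 1e-12)

# The mass density scales like lam^d when H is replaced with lam^2 H.
lam = 5.0
print(abs(local_density(H, lam, d) - lam ** d * local_density(H, 1.0, d)) < 1e-9)
```

The first check is the identity $\det H(z) = (\det M'(z))^{\frac{mp}{mp+d}}$; the second is the pointwise form of $m(\lambda^2 H) = \lambda^d m(H)$. In practice `M_prime` would be evaluated on a grid after smoothing the field $z \mapsto M(z)$.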
The rest of this subsection is devoted to the proof of the following theorem, which immediately implies the asymptotic estimates (6.18), (6.19) and (6.20) announced in the introduction.

Theorem 6.5.13. There exists a constant $C = C(m, d)$ such that the following holds for any $f \in C^m_{\mathrm{per}}$, any $1 \le p \le \infty$ and any $\varepsilon > 0$. If the triplet of parameters $s = (\alpha, \rho, \delta)$ is well chosen then

1. Defining $\tau$ by $\frac{1}{\tau} := \frac{m}{d} + \frac{1}{p}$ we have
\[
\lim_{\lambda \to \infty} m(\lambda^2 H)^{\frac{m}{d}}\, e_{\lambda^2 H}(f)_p \le C \|K(d^m f)\|_{L^\tau([0,1]^d)} + \varepsilon. \tag{6.83}
\]

2. Defining $\tau$ by $\frac{1}{\tau} := \frac{m-1}{d} + \frac{1}{p}$ we have
\[
\lim_{\lambda \to \infty} m(\lambda^2 H_a)^{\frac{m-1}{d}}\, e^a_{\lambda^2 H_a}(\nabla f)_p \le C \|L_a(d^m f)\|_{L^\tau([0,1]^d)} + \varepsilon
\]
and
\[
\lim_{\lambda \to \infty} m(\lambda^2 H_g)^{\frac{m-1}{d}}\, e^g_{\lambda^2 H_g}(\nabla f)_p \le C \|L_g(d^m f)\|_{L^\tau([0,1]^d)} + \varepsilon.
\]

The proofs of the three estimates are completely similar. We therefore focus on the proof of (6.83), and the details of the two other estimates are left to the reader. For each $z \in \mathbb{R}^d$ we identify the polynomial $\pi_z$ with the collection $d^m f(z)/m!$ of $m$-th order derivatives of $f$ at $z$, and we observe that, since $\kappa \ge 1$,
\[
\big\|\det(\mathcal{M}^{(\alpha)}(d^m f))^{\frac{m}{2d}}\big\|_{L^\tau([0,1]^d)} \le \|K^{(\alpha)}(d^m f)\|_{L^\tau([0,1]^d)} \xrightarrow[\alpha \to \infty]{} \|K^{\mathbb{C}}(d^m f)\|_{L^\tau([0,1]^d)} \le C_K \|K(d^m f)\|_{L^\tau([0,1]^d)},
\]
where the constant $C_K$ is defined in (6.78). For the convergence we used the fact that $K^{(\alpha)}(d^m f)$ converges pointwise decreasingly to $K^{\mathbb{C}}(d^m f)$, which implies the convergence of the integrals. We may therefore choose $\alpha$ sufficiently large and $\rho$ sufficiently small in such way that
\[
\big\|(\det M)^{\frac{m}{2d}}\big\|_{L^\tau([0,1]^d)} < C \|K(d^m f)\|_{L^\tau([0,1]^d)} + \varepsilon,
\]
where $C = C_K/m!$
(recall that $\mathcal{M}^{(\alpha)}$ and $K^{(\alpha)}$ are homogeneous functions, as observed in Remark 6.5.11).

For each $z \in \mathbb{R}^d$ such that $\pi_z \ne 0$ we have, writing $\mathcal{M}(\pi_z) := \mathcal{M}^{(\alpha)}(\pi_z)$,
\[
\|\pi_z \circ M(z)^{-1/2}\| = \sup_{|u| = 1} \frac{|\pi_z(u)|}{\|u\|^m_{M(z)}} = \sup_{|u| = 1} \frac{|\pi_z(u)|}{\big(\|u\|^2_{\mathcal{M}(\pi_z)} + \rho\big)^{\frac{m}{2}}} < \sup_{|u| = 1} \frac{|\pi_z(u)|}{\|u\|^m_{\mathcal{M}(\pi_z)}} = \|\pi_z \circ \mathcal{M}(\pi_z)^{-1/2}\| \le 1.
\]
We may therefore choose $\delta > 0$ sufficiently small in such way that $\|\pi_z \circ M'(z)^{-1/2}\| < 1$ for all $z \in [0,1]^d$, and such that
\[
\big\|(\det M'(z))^{\frac{m}{2d}}\big\|_{L^\tau([0,1]^d)} < C \|K(d^m f)\|_{L^\tau([0,1]^d)} + \varepsilon. \tag{6.84}
\]
Note that
\[
m(H)^{\frac{1}{\tau}} = \big\|(\det M'(z))^{\frac{m}{2d}}\big\|_{L^\tau([0,1]^d)}.
\]
We define for each $z \in \mathbb{R}^d$ $\mu_z := \sum_{|\alpha|}$ [...] $R > H \ge \mathrm{Id}/R$ uniformly on $[0,1]^d$, hence on $\mathbb{R}^d$. The function $f_{z,\lambda}$ converges uniformly to $\pi_z$ on $B(0, R)$:
\[
\lim_{\lambda \to \infty} \|f_{z,\lambda} - \pi_z\|_{L^\infty(B(0,R))} = 0, \tag{6.86}
\]
since $\pi_z$ is the $m$-th term in the Taylor development of $f$ at $z$. Since $B_H(z) \subset B(0, R)$ we therefore obtain
\[
\begin{aligned}
\lim_{\lambda \to \infty} \lambda^{m + \frac{d}{p}}\, e_{\lambda^2 H}(f; z)_p
&= \lim_{\lambda \to \infty} \inf_{\mu \in \mathbb{P}_{m-1}} \|f_{z,\lambda} - \mu\|_{L^p(B_H(z))} \\
&= \inf_{\mu \in \mathbb{P}_{m-1}} \|\pi_z - \mu\|_{L^p(B_H(z))} \le \|\pi_z\|_{L^p(B_H(z))} \le |B_H(z)|^{\frac{1}{p}}\, \|\pi_z\|_{L^\infty(B_H(z))} \\
&= \omega^{\frac{1}{p}} (\det H(z))^{-\frac{1}{2p}}\, \|\pi_z \circ H(z)^{-1/2}\| = \omega^{\frac{1}{p}}\, \|\pi_z \circ M'(z)^{-1/2}\| \\
&\le \omega^{\frac{1}{p}},
\end{aligned}
\]
where $\omega := |B(0, 1)|$ denotes the volume of the unit euclidean ball. We used in the third line the proportionality relation $H(z) = (\det M'(z))^{\frac{-1}{mp + d}} M'(z)$ and the fact that $\pi_z$ is homogeneous of degree $m$, which imply
\[
\det H(z) = (\det M'(z))^{\frac{mp}{mp + d}} \quad\text{and}\quad \|\pi_z \circ H(z)^{-1/2}\| = (\det M'(z))^{\frac{m}{2(mp + d)}}\, \|\pi_z \circ M'(z)^{-1/2}\|.
\]
The convergence (6.86) is uniform over all $z \in [0,1]^d$, and therefore
\[
\begin{aligned}
\lim_{\lambda \to \infty} \lambda^{mp}\, e_{\lambda^2 H}(f)_p^p
&= \lim_{\lambda \to \infty} \lambda^{mp} \int_{[0,1]^d} \sqrt{\det(\lambda^2 H(z))}\; e_{\lambda^2 H}(f; z)_p^p\, dz \\
&= \int_{[0,1]^d} \sqrt{\det H(z)} \left(\lim_{\lambda \to \infty} \lambda^{mp + d}\, e_{\lambda^2 H}(f; z)_p^p\right) dz \\
&\le \omega \int_{[0,1]^d} \sqrt{\det H(z)}\, dz = \omega\, m(H).
\end{aligned}
\]
Since $m(\lambda^2 H) = \lambda^d m(H)$ we therefore obtain, injecting (6.84),
\[
\lim_{\lambda \to \infty} m(\lambda^2 H)^{\frac{m}{d}}\, e_{\lambda^2 H}(f)_p \le \omega^{\frac{1}{p}}\, m(H)^{\frac{m}{d} + \frac{1}{p}} \le \omega^{\frac{1}{p}} \left(C \|K(d^m f)\|_{L^\tau([0,1]^d)} + \varepsilon\right),
\]
which concludes the proof, up to a modification of the constant $C$ and of $\varepsilon$.

The shape functions $K$, $L_a$ and $L_g$ are defined in § [...] $SL_d$ of all matrices of unit determinant. One can infer from this property that the shape functions are upper semi-continuous, but they are not continuous in general, as illustrated in Remark 2.4.7 (in Chapter 2) for the shape function $K$ when $m = 4$ and $d = 2$.

The purpose of this subsection is to establish the following theorem, which states that the shape functions are nevertheless uniformly equivalent to a continuous function on $\mathbb{H}_m$.

Theorem 6.5.14. There exists a constant $C = C(m, d)$ and three continuous functions $K^c$, $L^c_a$ and $L^c_g$ on $\mathbb{H}_m$ such that for all $\pi \in \mathbb{H}_m$
\[
\begin{aligned}
C^{-1} K^c(\pi) &\le K(\pi) \le C K^c(\pi), \\
C^{-1} L^c_a(\pi) &\le L_a(\pi) \le C L^c_a(\pi), \\
C^{-1} L^c_g(\pi) &\le L_g(\pi) \le C L^c_g(\pi).
\end{aligned}
\]

Note that in the case of the shape functions $K$ and $L_a$, in dimension $d = 2$, this theorem is a consequence of the explicit algebraic equivalents of these shape functions given in Chapters 2 and 3.

The analysis presented below applies to all three shape functions, without restriction on the dimension $d$. As observed in § [...], $L_a$ and $L_g$ are uniformly equivalent on $\mathbb{H}_m$ to the functions $L^*_a$ and $L^*_g$ respectively, defined for all $\pi \in \mathbb{H}_m$ by
\[
L^*_a(\pi) := \inf_{A \in SL_d} \sqrt{\|G(\pi) \circ A\|}, \tag{6.87}
\]
\[
L^*_g(\pi) := \inf_{A \in SL_d} \|A^{-1}\|\, \|\pi \circ A\|, \tag{6.88}
\]
where $G(\pi) := |\nabla\pi|^2 \in \mathbb{H}_{2m-2}$.
Let us also recall that
\[
K(\pi) := \inf_{A \in SL_d} \|\pi \circ A\|. \tag{6.89}
\]
The fact that $L^*_a$ is uniformly equivalent to a continuous function therefore directly follows from the same property for the shape function $K$ (in degree $2m - 2$ instead of $m$), by composition with the continuous functions $G : \mathbb{H}_m \to \mathbb{H}_{2m-2}$ and $\sqrt{\cdot} : \mathbb{R}_+ \to \mathbb{R}_+$.

The next lemma gives a criterion on the lower semi-continuous envelope of a function, which guarantees that it is equivalent to a continuous function.

Lemma 6.5.15. Let $(X, d_X)$ be an arbitrary metric space, and let $f : X \to \mathbb{R}_+$. We define the lower semi-continuous envelope $\underline{f}$ of $f$ as follows: for all $x \in X$
\[
\underline{f}(x) := \lim_{\varepsilon \to 0}\ \inf_{y \in B(x, \varepsilon)} f(y).
\]
Assume that $f$ is upper semi-continuous and that there exists a constant $C$ such that
\[
f(x) < C \underline{f}(x) \quad\text{or}\quad f(x) = 0 \tag{6.90}
\]
for all $x \in X$. Then there exists a continuous function $f^c \in C^0(X, \mathbb{R}_+)$ such that
\[
C^{-1} f^c \le f \le C f^c. \tag{6.91}
\]

Proof: For each $n \in \mathbb{Z}$ we define the set
\[
E_n := \{x \in X \;;\; f(x) \ge C^n\}, \tag{6.92}
\]
which is closed since $f$ is upper semi-continuous. For any $x \in \overline{X \setminus E_n}$ there exists a sequence $(x_k)_{k \ge 0}$, $x_k \in X \setminus E_n$ (hence $f(x_k) < C^n$), converging to $x$. Therefore
\[
\underline{f}(x) \le \liminf_{k \to \infty} f(x_k) \le C^n.
\]
Combining this with (6.90) we obtain that $x \notin E_{n+1}$, hence the closed sets $E_{n+1}$ and $\overline{X \setminus E_n}$ are disjoint. We denote by $(r_n)_{n \in \mathbb{Z}}$, $r_n \in C^0(X, [0, 1])$, a family of functions satisfying
\[
r_n|_{E_{n+1}} = 1 \quad\text{and}\quad r_n|_{X \setminus E_n} = 0.
\]
The simplest construction of such functions is the following:
\[
r_n(x) := \frac{d_X(x, X \setminus E_n)}{d_X(x, X \setminus E_n) + d_X(x, E_{n+1})},
\]
where $d_X(x, E) := \inf\{d_X(x, e) \;;\; e \in E\}$. We define a function $r : X \to [-\infty, \infty)$ (observe that $-\infty$ is included) by the sum
\[
r(x) := \sum_{n < 0} (r_n(x) - 1) + \sum_{n \ge 0} r_n(x).
\]
For each $n \in \mathbb{Z}$ the function $r$ is equal to $n + r_n + r_{n+1}$ on the open set $\operatorname{int}(E_n \setminus E_{n+2})$. Furthermore
\[
E_{n+1} \setminus E_{n+2} \subset \{x \in X \,;\ f(x) < C^{n+2} \text{ and } \underline{f}(x) > C^n\} \subset \operatorname{int}(E_n \setminus E_{n+2}),
\]
therefore $r$ is continuous on the union
\[
\bigcup_{n \in \mathbb{Z}} \operatorname{int}(E_n \setminus E_{n+2}) \supset \bigcup_{n \in \mathbb{Z}} E_{n+1} \setminus E_{n+2} = \{x \in X \,;\ f(x) > 0\}.
\]
Consider $x \in X$ such that $f(x) = 0$, hence $r(x) = -\infty$. Since $f$ is upper semi-continuous there exists for each $n \in \mathbb{Z}$ a constant $\delta > 0$ such that $f < C^n$ on $B(x,\delta)$. It follows that $r \le n$ on $B(x,\delta)$, hence that $r$ is continuous at $x$, in the topology of $[-\infty, \infty)$.

The function $r$ is therefore continuous on the whole space $X$. Since $C \ge 1$, the function $f_c : X \to \mathbb{R}_+$ defined by
\[
f_c(x) := C^{r(x)}
\]
is therefore also continuous. We have $f_c(x) = 0$ if and only if $r(x) = -\infty$, which is equivalent to $f(x) = 0$. Furthermore for each $n \in \mathbb{Z}$ and each $x \in E_n \setminus E_{n+1}$ we have $C^n \le f(x) < C^{n+1}$ and $C^n \le f_c(x) \le C^{n+1}$ by construction, which establishes (6.91) and concludes the proof of this lemma. $\diamond$

The proofs that the functions $K$ and $L_g^*$ are uniformly equivalent on $\mathbb{H}_m$ to a continuous function are extremely similar. The case of $L_g^*$ is nevertheless slightly more involved due to the term $\|A^{-1}\|$ appearing in (6.88), which is not present in (6.89). We therefore focus our attention on $L_g^*$, and we leave to the reader the details of the adaptation of the proof to the shape function $K$.

We define an auxiliary function $L$ on $M_d \times \mathbb{H}_m$ as follows: for all $(B, \pi) \in M_d \times \mathbb{H}_m$
\[
L(B, \pi) := \inf_{A \in SL_d} \|A^{-1} B\|\, \|\pi \circ A\|, \tag{6.93}
\]
hence $L(\mathrm{Id}, \pi) = L_g^*(\pi)$ for all $\pi \in \mathbb{H}_m$.

The function $L$ is upper semi-continuous since it is defined as the infimum of a family of continuous functions. Hence for any converging sequence $(B_n, \pi_n) \to (B, \pi)$ in $M_d \times \mathbb{H}_m$ we have
\[
L(B, \pi) \ge \limsup_{n \to \infty} L(B_n, \pi_n).
\]
The next lemma shows that this inequality becomes an equality if the limit is zero.

Lemma 6.5.16. For any converging sequence $(B_n, \pi_n) \to (B, \pi)$ in $M_d \times \mathbb{H}_m$ the following holds: if $\lim_{n \to \infty} L(B_n, \pi_n) = 0$ then $L(B, \pi) = 0$.

Proof: If $L(B_n, \pi_n) \to 0$, then there exists a sequence $(A_n)_{n \ge 0}$, $A_n \in SL_d$, such that
\[
\lim_{n \to \infty} \|A_n^{-1} B_n\|\, \|\pi_n \circ A_n\| = 0. \tag{6.94}
\]
For each $n$ we consider the singular value decomposition $A_n = U_n D_n V_n$, where $U_n, V_n \in \mathcal{O}_d$ and $D_n$ is a diagonal matrix satisfying $\det D_n = 1$, with positive diagonal coefficients $t_n = (t_{n,1}, \cdots, t_{n,d})$. We have for any $n \ge 0$, since $V_n$ is orthogonal,
\[
\|A_n^{-1} B_n\|\, \|\pi_n \circ A_n\| = \|V_n^{\mathrm T} D_n^{-1} U_n^{\mathrm T} B_n\|\, \|\pi_n \circ (U_n D_n V_n)\| = \|D_n^{-1} U_n^{\mathrm T} B_n\|\, \|\pi_n \circ (U_n D_n)\|.
\]
Since the collection $\mathcal{O}_d$ of orthogonal matrices is compact, the sequence of orthogonal matrices $U_n$ admits a converging sub-sequence $U_{\varphi(n)} \to U$. We define $B' := U^{\mathrm T} B$, $\pi' := \pi \circ U$, and we observe that since $|\det U| = 1$
\[
L(B, \pi) = L(B', \pi'). \tag{6.95}
\]
We similarly define $B'_n := U_{\varphi(n)}^{\mathrm T} B_{\varphi(n)}$, $\pi'_n := \pi_{\varphi(n)} \circ U_{\varphi(n)}$, and $D'_n := D_{\varphi(n)}$, in such a way that
\[
\lim_{n \to \infty} (B'_n, \pi'_n) = (B', \pi') \quad \text{and} \quad \lim_{n \to \infty} \|D_n'^{-1} B'_n\|\, \|\pi'_n \circ D'_n\| = 0. \tag{6.96}
\]
We thus recognize our starting point (6.94), except that the matrices $A_n$ are replaced with diagonal matrices $D'_n$, of coefficients
\[
t'_n := (t'_{n,1}, \cdots, t'_{n,d}) = (t_{\varphi(n),1}, \cdots, t_{\varphi(n),d}).
\]
In order to avoid notational clutter we do not keep track of the sub-sequence extraction, and until the end of this proof we write $\pi$, $\pi_n$, $B$, $B_n$, $D_n$, $t_n$ and $t_{n,i}$ for the variables $\pi'$, $\pi'_n$, $B'$, $B'_n$, $D'_n$, $t'_n$ and $t'_{n,i}$.

We denote by $I \subset \{1, \cdots, d\}$ the collection of indices $i$ such that the associated row of $B$ is nonzero. There exists a constant $c > 0$ such that for all $n$ sufficiently large
\[
\|D_n^{-1} B_n\| \ge c \max_{i \in I} t_{n,i}^{-1}. \tag{6.97}
\]
We denote by $\Lambda$ the collection of exponents
\[
\Lambda := \{\alpha \in \mathbb{Z}_+^d \,;\ |\alpha| \le m\}
\]
and by $\pi_\alpha$, $\pi_{n,\alpha}$ the coefficients of $\pi$ and $\pi_n$ respectively:
\[
\pi = \sum_{\alpha \in \Lambda} \pi_\alpha Z^\alpha \quad \text{and} \quad \pi_n = \sum_{\alpha \in \Lambda} \pi_{n,\alpha} Z^\alpha.
\]
It follows from (6.96) and (6.97) that for any $\alpha \in \Lambda$ we have
\[
\lim_{n \to \infty} \Big(\max_{i \in I} t_{n,i}^{-1}\Big)\, \pi_{n,\alpha}\, t_n^\alpha = 0. \tag{6.98}
\]
For each $t = (t_1, \cdots, t_d) \in \mathbb{R}_+^d$ we define
\[
\Lambda(t) := \Big\{\alpha \in \Lambda \,;\ \Big(\max_{i \in I} t_i^{-1}\Big)\, t^\alpha \ge 1\Big\}.
\]
The set $\Lambda$ has only a finite number of subsets, since it is finite. Hence there exists an extraction $\psi$ such that $\Lambda^* := \Lambda(t_{\psi(n)})$ does not depend on $n$. In view of (6.98) we thus have for all $\alpha \in \Lambda^*$
\[
\pi_\alpha = \lim_{n \to \infty} \pi_{\psi(n),\alpha} = 0.
\]
Defining $D := D_{\psi(0)}$, and denoting by $t = (t_1, \cdots, t_d)$ its diagonal coefficients, we thus obtain
\[
\lim_{n \to \infty} \|D^{-n} B\|\, \|\pi \circ D^n\| = 0.
\]
Indeed $\|D^{-n} B\| \le C \max_{i \in I} t_i^{-n}$ and
\[
\Big(\max_{i \in I} t_i^{-n}\Big)\, \pi_\alpha\, t^{n\alpha} = \pi_\alpha \Big(t^\alpha \max_{i \in I} t_i^{-1}\Big)^n,
\]
which equals $0$ if $\alpha \in \Lambda^*$, and tends to zero if $\alpha \in \Lambda \setminus \Lambda^*$. It follows that $L(B, \pi) = 0$ as announced. $\diamond$

The next lemma compares the function $L$ with its lower semi-continuous envelope $\underline{L}$.

Lemma 6.5.17. There exists a constant $C$ such that $L \le C \underline{L}$ on $M_d \times \mathbb{H}_m$.
Proof: We denote by $\mathcal{A}$ the following compact subset of $M_d \times \mathbb{H}_m$
\[
\mathcal{A} := \{(B, \pi) \in M_d \times \mathbb{H}_m \,;\ \|B\| = \|\pi\| = 1 \text{ and } \|A^{-1} B\|\, \|\pi \circ A\| \ge 1 \text{ for all } A \in SL_d\}.
\]
It follows from the definition (6.93) of $L$ that $L(B, \pi) = 1$ for all $(B, \pi) \in \mathcal{A}$. Therefore $\underline{L}$ does not vanish on $\mathcal{A}$ according to Lemma 6.5.16. Since $\underline{L}$ is lower semi-continuous, it attains its minimum on $\mathcal{A}$; we thus define
\[
C^{-1} := \inf_{\mathcal{A}} \underline{L}.
\]
Let $(B, \pi) \in M_d \times \mathbb{H}_m$, and let $(A_n)_{n \ge 0}$, $A_n \in SL_d$, be a sequence such that
\[
\lim_{n \to \infty} \|A_n^{-1} B\|\, \|\pi \circ A_n\| = L(B, \pi) := \inf_{A \in SL_d} \|A^{-1} B\|\, \|\pi \circ A\|. \tag{6.99}
\]
If $L(B, \pi) = 0$, then $\underline{L}(B, \pi) = 0$ and there is nothing to prove; we may therefore assume that $L(B, \pi) > 0$. We have for each $n \ge 0$
\[
\underline{L}\left(\frac{A_n^{-1} B}{\|A_n^{-1} B\|},\ \frac{\pi \circ A_n}{\|\pi \circ A_n\|}\right) = \frac{\underline{L}(B, \pi)}{\|A_n^{-1} B\|\, \|\pi \circ A_n\|},
\]
which tends to $\underline{L}(B, \pi) / L(B, \pi)$ as $n \to \infty$. Consider an extraction $\varphi$ such that the following quantities converge:
\[
\lim_{n \to \infty} \frac{A_{\varphi(n)}^{-1} B}{\|A_{\varphi(n)}^{-1} B\|} = B^* \quad \text{and} \quad \lim_{n \to \infty} \frac{\pi \circ A_{\varphi(n)}}{\|\pi \circ A_{\varphi(n)}\|} = \pi^*.
\]
We now prove that $(B^*, \pi^*) \in \mathcal{A}$. We first remark that $\|B^*\| = \|\pi^*\| = 1$ by construction. Assume for contradiction that there exists $A \in SL_d$ such that $\|A^{-1} B^*\|\, \|\pi^* \circ A\| = \delta < 1$. We obtain
\[
\begin{aligned}
\lim_{n \to \infty} \|(A_{\varphi(n)} A)^{-1} B\|\, \|\pi \circ (A_{\varphi(n)} A)\|
&= \lim_{n \to \infty} \frac{\|A^{-1} (A_{\varphi(n)}^{-1} B)\|}{\|A_{\varphi(n)}^{-1} B\|}\ \frac{\|(\pi \circ A_{\varphi(n)}) \circ A\|}{\|\pi \circ A_{\varphi(n)}\|}\ \|A_{\varphi(n)}^{-1} B\|\, \|\pi \circ A_{\varphi(n)}\| \\
&= \delta\, L(B, \pi),
\end{aligned}
\]
which is strictly smaller than $L(B, \pi)$ and thus contradicts the definition of $L$, hence $(B^*, \pi^*) \in \mathcal{A}$ as announced.
We thus obtain, since $\underline{L}$ is lower semi-continuous,
\[
\lim_{n \to \infty} \underline{L}\left(\frac{A_n^{-1} B}{\|A_n^{-1} B\|},\ \frac{\pi \circ A_n}{\|\pi \circ A_n\|}\right) \ge \underline{L}(B^*, \pi^*) \ge C^{-1}.
\]
It follows that $C \underline{L}(B, \pi) \ge L(B, \pi)$, which concludes the proof of this lemma. $\diamond$

We now conclude the proof that $L_g^*$ is equivalent to a continuous function. Applying Lemma 6.5.15 to the function $L$ and the constant $C + 1$, where $C$ is the constant from Lemma 6.5.17, we obtain that $L$ is equivalent to a continuous function on $M_d \times \mathbb{H}_m$. Recalling that $L_g^*(\pi) = L(\mathrm{Id}, \pi)$ we obtain that $L_g^*$ is also equivalent to a continuous function on $\mathbb{H}_m$, which concludes the proof of Theorem 6.5.14.

Part IV
Hierarchical refinement algorithms

Chapter 7
Adaptive multiresolution analysis based on anisotropic triangulations

7.1 Introduction

Approximation by piecewise polynomial functions is a standard procedure which occurs in various applications. In some of them, such as terrain data simplification or image compression, the function to be approximated might be fully known, while it might be only partially known or fully unknown in other applications such as denoising, statistical learning or the finite element discretization of PDE's.

In all these applications, one usually makes the distinction between uniform and adaptive approximation. In the uniform case, the domain of interest is decomposed into a partition where all elements have comparable shape and size, while these attributes are allowed to vary strongly in the adaptive case. In the context of adaptive triangulations, another important distinction is between isotropic and anisotropic triangulations. In the first case the triangles satisfy a condition which guarantees that they do not differ too much from equilateral triangles.
This can either be stated in terms of a minimal value $\theta_0 > 0$ for all angles, or in terms of a uniform bound on the aspect ratio
\[
\rho_T := \frac{h_T}{r_T}
\]
of each triangle $T$, where $h_T$ and $r_T$ respectively denote the diameter of $T$ and of its largest inscribed disc. In the second case, which is in the scope of the present chapter, the aspect ratio is allowed to be arbitrarily large, i.e. long and thin triangles are allowed. In summary, adaptive and anisotropic triangulations mean that we do not fix any constraint on the size and shape of the triangles.

Given a function $f$ and a norm $\|\cdot\|_X$ of interest, we can formulate the problem of finding the optimal triangulation for $f$ in the $X$-norm in two related forms:
– For a given $N$ find a triangulation $\mathcal{T}_N$ with $N$ triangles and a piecewise polynomial function $f_N$ (of some fixed degree) on $\mathcal{T}_N$ such that $\|f - f_N\|_X$ is minimized.
– For a given $\varepsilon > 0$ find a triangulation $\mathcal{T}_N$ with minimal number of triangles $N$ and a piecewise polynomial function $f_N$ such that $\|f - f_N\|_X \le \varepsilon$.

In this chapter $X$ will be the $L^p$ norm for some arbitrary $1 \le p \le \infty$. The exact solution to such problems is usually out of reach both analytically and algorithmically: even when restricting the search of the vertices to a finite grid, the number of possible triangulations has combinatorial complexity and an exhaustive search is therefore prohibitive.

Concrete mesh generation algorithms have been developed in order to generate in reasonable time triangulations which are "close" to the above described optimal trade-off between error and complexity. They are typically governed by two intuitively desirable features:
1. The triangulation should equidistribute the local approximation error between each triangle. This rationale is typically used in local mesh refinement algorithms for numerical PDE's [90]: a triangle is refined when the local approximation error (estimated by an a-posteriori error indicator) is large.
2. In the case of anisotropic meshes, the local aspect ratio should in addition be optimally adapted to the approximated function $f$. In the case of piecewise linear approximation, this is achieved by imposing that the triangles are isotropic with respect to a distorted metric induced by the Hessian $d^2 f$. We refer in particular to [16] where this task is executed using Delaunay mesh generation techniques.

While these last algorithms quickly produce anisotropic meshes which are naturally adapted to the approximated function, they suffer from two intrinsic limitations:
1. They are based on the evaluation of the Hessian $d^2 f$, and therefore do not in principle apply to arbitrary functions $f \in L^p(\Omega)$ for $1 \le p \le \infty$ or to noisy data.
2. They are non-hierarchical: for $N > M$, the triangulation $\mathcal{T}_N$ is not a refinement of $\mathcal{T}_M$.

One way to circumvent the first limitation is to regularize the function $f$, either by projection onto a finite element space or by convolution by a mollifier. However this raises the additional problem of appropriately tuning the amount of smoothing, in particular depending on the noise level in $f$.

The need for hierarchical triangulations is critical in the construction of wavelet bases, which play an important role in applications to image and terrain data processing, in particular data compression [31]. In such applications, the multilevel structure is also of key use for the fast encoding of the information. Hierarchy is also useful in the design of optimally converging adaptive methods for PDE's [12, 50, 74, 87]. However, all these developments are so far mostly restricted to isotropic refinement methods. Let us mention that hierarchical and anisotropic triangulations have been investigated in [64], yet in this work the triangulations are fixed in advance and therefore generally not adapted to the approximated function.

A natural objective is therefore to design adaptive algorithmic techniques that combine hierarchy and anisotropy, and that apply to any function $f \in L^p(\Omega)$, without any need for regularization.
In this chapter we propose and study a simple greedy refinement procedure that achieves this goal: starting from an initial triangulation $\mathcal{D}_0$, the procedure bisects every triangle from one of its vertices to the mid-point of the opposite segment. The vertex is typically chosen, among the three options, as the one which minimizes the new approximation error after bisection.

Surprisingly, it turns out that, in the case of piecewise linear approximation, this elementary strategy tends to generate anisotropic triangles with optimal aspect ratio. This fact is rigorously proved in Chapter 8, which establishes optimal error estimates for the approximation of smooth and convex functions $f \in C^2$ by adaptive triangulations $\mathcal{T}_N$ with $N$ triangles. These triangulations are obtained by consecutively applying the refinement procedure to the triangle of maximal error. The estimates in Chapter 8 are of the form
\[
\|f - f_N\|_{L^p} \le C N^{-1} \big\|\sqrt{|\det(d^2 f)|}\big\|_{L^\tau}, \qquad \frac{1}{\tau} = \frac{1}{p} + 1, \tag{7.1}
\]
and were already established in [4, 27] for functions which are not necessarily convex, however based on triangulations which are non-hierarchical and based on the evaluation of $d^2 f$. Note that (7.1) improves on the estimate
\[
\|f - f_N\|_{L^p} \le C N^{-1} \|d^2 f\|_{L^\tau}, \qquad \frac{1}{\tau} = \frac{1}{p} + 1, \tag{7.2}
\]
which can be established, see § […], for adaptive triangulations with isotropic triangles, and which itself improves on the classical estimate
\[
\|f - f_N\|_{L^p} \le C N^{-1} \|d^2 f\|_{L^p}, \tag{7.3}
\]
which is known to hold for uniform triangulations.

The main objective of the present chapter is to introduce the refinement procedure as well as several approximation methods based on it, and to study their convergence for an arbitrary function $f \in L^p$. In §7.2 we introduce the refinement procedure and define the anisotropic hierarchy of triangulations $(\mathcal{D}_j)_{j \ge 0}$.
We show how this general framework can be used to derive adaptive approximations of $f$ either by triangulations based on greedy or optimal trees, or by wavelet thresholding. In §7.3 we show that these approximation methods, as defined, may fail to converge for certain $f \in L^p$, and show how to modify the procedure so that convergence holds for any arbitrary $f \in L^p$. We finally present in § […]

7.2 An adaptive and anisotropic multiresolution framework

Our refinement procedure is based on a local approximation operator $A_T$ acting from $L^p(T)$ onto $\mathbb{P}_m$, the space of polynomials of total degree less than or equal to $m$. Here, the parameters $m \ge 0$ and $1 \le p \le \infty$ are arbitrary but fixed. For a generic triangle $T$, we denote by $(a, b, c)$ its edge vectors oriented in clockwise or anticlockwise direction so that
\[
a + b + c = 0.
\]
We define the local $L^p$ approximation error
\[
e_T(f)_p := \|f - A_T f\|_{L^p(T)}.
\]
The most natural choice for $A_T$ is the operator $B_T$ of best $L^p(T)$ approximation, which is defined by
\[
\|f - B_T f\|_{L^p(T)} = \min_{\pi \in \mathbb{P}_m} \|f - \pi\|_{L^p(T)}.
\]
However this operator is non-linear and not easy to compute when $p \ne 2$. In practice, one prefers to use an operator which is easier to compute, yet nearly optimal in the sense that
\[
\|f - A_T f\|_{L^p(T)} \le C \inf_{\pi \in \mathbb{P}_m} \|f - \pi\|_{L^p(T)}, \tag{7.4}
\]
with $C$ a Lebesgue constant independent of $f$ and $T$. Two particularly simple admissible choices of approximation operators are the following:
1. $A_T = P_T$, the $L^2(T)$-orthogonal projection onto $\mathbb{P}_m$, defined by $P_T f \in \mathbb{P}_m$ such that $\int_T (f - P_T f)\, \pi = 0$ for all $\pi \in \mathbb{P}_m$. This operator has finite Lebesgue constant for all $p$, with $C = 1$ when $p = 2$ and $C \ge 1$ otherwise.
2. $A_T = I_T$, the local interpolation operator, which is defined by $I_T f \in \mathbb{P}_m$ such that $I_T f(\gamma) = f(\gamma)$ for all $\gamma \in \Sigma := \{\sum_i \frac{k_i}{m} v_i \,;\ k_i \in \mathbb{N},\ \sum_i k_i = m\}$, where $\{v_1, v_2, v_3\}$ are the vertices of $T$ (in the case $m = 0$ we can take for $\Sigma$ the barycenter of $T$). This operator is only defined on continuous functions and has Lebesgue constant $C > 1$ in the $L^\infty$ norm.

In what follows, $A_T$ is either $P_T$ or $I_T$ (in the case where $p = \infty$), or any linear operator that fulfills the continuity assumption (7.4).

Given a target function, our refinement procedure defines by induction a hierarchy of nested triangulations $(\mathcal{D}_j)_{j \ge 0}$ with $\#(\mathcal{D}_j) = 2^j\, \#(\mathcal{D}_0)$. The procedure starts from the coarse triangulation $\mathcal{D}_0$ of $\Omega$, which is fixed independently of $f$. When $\Omega = [0,1]^2$ we may split it into two symmetric triangles so that $\#(\mathcal{D}_0) = 2$. For every $T \in \mathcal{D}_j$, we split $T$ into two sub-triangles of equal area by bisection from one of its three vertices towards the mid-point of the opposite edge $e \in \{a, b, c\}$. We denote by $T_e^1$ and $T_e^2$ the two resulting triangles. The choice of $e \in \{a, b, c\}$ is made according to a refinement rule that selects this edge depending on the properties of $f$. We denote by $\mathcal{R}$ this refinement rule, which can therefore be viewed as a mapping
\[
\mathcal{R} : (f, T) \mapsto e.
\]
We thus obtain two children of $T$ corresponding to the choice $e$. The triangulation $\mathcal{D}_{j+1}$ consists of all such pairs corresponding to all $T \in \mathcal{D}_j$.

In this chapter, we consider refinement rules where the selected edge $e$ minimizes a decision function $e \mapsto d_T(e, f)$ among $\{a, b, c\}$. We refer to such rules as greedy refinement rules. A more elaborate type of refinement rule is also considered in § […]. Such refinement rules depend on $f$, in contrast to simpler procedures such as newest vertex bisection (i.e. split $T$ from the most recently created vertex), which is independent of $f$ and generates triangulations with isotropic shape constraint.

Therefore, the choice of $d_T(e, f)$ is critical in order to obtain triangles with an optimal aspect ratio. The most natural choice corresponds to the optimal split
\[
d_T(e, f) = e_{T_e^1}(f)_p^p + e_{T_e^2}(f)_p^p, \tag{7.5}
\]
i.e. choose the edge that minimizes the resulting $L^p$ error after bisection.
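One step of the greedy rule (7.5) can be sketched as follows in the case of piecewise linear interpolation ($m = 1$, $A_T = I_T$). This is only a minimal illustration under stated assumptions, not the implementation used in this work: the function names (`centroids`, `interp_error_p`, `greedy_bisect`) are hypothetical, and the local error is approximated by an equal-weight quadrature at the centroids of $n^2$ congruent sub-triangles.

```python
def centroids(n):
    """Barycentric coordinates of the centroids of the n*n congruent
    sub-triangles of a triangle (equal-weight quadrature nodes)."""
    nodes = []
    for i in range(n):
        for j in range(n - i):
            nodes.append(((i + 1 / 3) / n, (j + 1 / 3) / n))      # upright cells
            if i + j < n - 1:
                nodes.append(((i + 2 / 3) / n, (j + 2 / 3) / n))  # inverted cells
    return [(a, b, 1.0 - a - b) for a, b in nodes]


def area(tri):
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


def interp_error_p(f, tri, p=1.0, n=12):
    """Approximate ||f - I_T f||_{L^p(T)}^p, with I_T f the affine interpolant
    of f at the three vertices of T (assumed quadrature, see lead-in)."""
    fv = [f(x, y) for x, y in tri]
    nodes = centroids(n)
    total = 0.0
    for lam in nodes:
        x = sum(l * v[0] for l, v in zip(lam, tri))
        y = sum(l * v[1] for l, v in zip(lam, tri))
        interp = sum(l * val for l, val in zip(lam, fv))
        total += abs(f(x, y) - interp) ** p
    return area(tri) * total / len(nodes)


def greedy_bisect(f, tri, p=1.0):
    """Try the three bisections and keep the one minimizing the decision
    function. Returns (k, children, d): vertex k is joined to the midpoint
    of the opposite edge, d is the value of the decision function."""
    best = None
    for k in range(3):
        v, a, b = tri[k], tri[(k + 1) % 3], tri[(k + 2) % 3]
        mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
        children = ((v, a, mid), (v, mid, b))
        d = sum(interp_error_p(f, child, p) for child in children)
        if best is None or d < best[2]:
            best = (k, children, d)
    return best


# f depends on x only, so the edge with the largest extent in x should be cut.
k, children, d = greedy_bisect(lambda x, y: x * x,
                               ((0.0, 0.0), (1.0, 0.0), (3.0, 1.0)))
print(k, d)
```

For this example the selected vertex is the one of index 1, i.e. the edge joining $(0,0)$ to $(3,1)$ is cut, which is the edge with the largest extent in the $x$ direction. The exact value of the decision function is $10/48$ for this split, and the quadrature reproduces it up to a small relative error.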
It is proved in Chapter 8, in the case of piecewise linear approximation, that when $f$ is a $C^2$ function which is strictly convex or concave, the refinement rule based on the decision function
\[
d_T(e, f) = \|f - I_{T_e^1} f\|_{L^1(T_e^1)} + \|f - I_{T_e^2} f\|_{L^1(T_e^2)} \tag{7.6}
\]
generates triangles which tend to have an optimal aspect ratio, locally adapted to the Hessian $d^2 f$. This aspect ratio is independent of the $L^p$ norm in which one wants to minimize the error between $f$ and its piecewise affine approximation.

Remark 7.2.1. If the minimizer $e$ is not unique, we may choose it among the multiple minimizers either randomly or according to some prescribed ordering of the edges (for example the largest coordinate pair of the opposite vertex in lexicographical order).

Remark 7.2.2. The triangulations $\mathcal{D}_j$ which are generated by the greedy procedure are in general non-conforming, i.e. exhibit hanging nodes. This is not problematic in the present setting since we consider approximation in the $L^p$ norm, which does not require global continuity of the piecewise polynomial functions.

The refinement rule $\mathcal{R}$ defines a multiresolution framework. For a given $f \in L^p(\Omega)$ and any triangle $T$ we denote by
\[
\mathcal{C}(T) := \{T_1, T_2\}
\]
the children of $T$, which are the two triangles obtained by splitting $T$ based on the prescribed decision function $d_T(e, f)$. We also say that $T$ is the parent of $T_1$ and $T_2$ and write
\[
T = \mathcal{P}(T_1) = \mathcal{P}(T_2).
\]
Note that $\mathcal{D}_j := \cup_{T \in \mathcal{D}_{j-1}} \mathcal{C}(T)$. We also define
\[
\mathcal{D} := \cup_{j \ge 0} \mathcal{D}_j,
\]
which has the structure of an infinite binary tree. Note that $\mathcal{D}_j$ depends on $f$ (except for $j = 0$) and on the refinement rule $\mathcal{R}$, and thus $\mathcal{D}$ also depends on $f$ and $\mathcal{R}$:
\[
\mathcal{D}_j = \mathcal{D}_j(f, \mathcal{R}) \quad \text{and} \quad \mathcal{D} = \mathcal{D}(f, \mathcal{R}).
\]
For notational simplicity, we sometimes omit the dependence on $f$ and $\mathcal{R}$ when there is no possible ambiguity.
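The tree structure described above can be sketched in a few lines. The following is an illustrative sketch only: `build_hierarchy` and `longest_edge_rule` are hypothetical names, and the longest-edge rule is merely a placeholder for a refinement rule $(f, T) \mapsto e$ (it ignores $f$), used here so that the example is self-contained.

```python
def bisect(tri, k):
    """Split tri from vertex k to the midpoint of the opposite edge."""
    v, a, b = tri[k], tri[(k + 1) % 3], tri[(k + 2) % 3]
    mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    return (v, a, mid), (v, mid, b)


def area(tri):
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


def longest_edge_rule(f, tri):
    """Placeholder refinement rule: ignores f and cuts the longest edge.
    Returns the index of the vertex opposite to the selected edge."""
    def opposite_len2(k):
        a, b = tri[(k + 1) % 3], tri[(k + 2) % 3]
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return max(range(3), key=opposite_len2)


def build_hierarchy(f, D0, rule, J):
    """Return the levels D_0, ..., D_J and the parent map P(T)."""
    levels, parent = [list(D0)], {}
    for _ in range(J):
        next_level = []
        for T in levels[-1]:
            for child in bisect(T, rule(f, T)):
                parent[child] = T
                next_level.append(child)
        levels.append(next_level)
    return levels, parent


# Unit square split into two symmetric triangles, so that #(D_0) = 2.
D0 = [((0.0, 0.0), (1.0, 0.0), (1.0, 1.0)),
      ((0.0, 0.0), (1.0, 1.0), (0.0, 1.0))]
levels, parent = build_hierarchy(None, D0, longest_edge_rule, J=4)
print([len(level) for level in levels])  # [2, 4, 8, 16, 32]
```

Each level has twice as many triangles as the previous one, and every level remains a partition of the square since each bisection preserves the total area.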
A first application of the multiresolution framework is the design of adaptive anisotropic triangulations $\mathcal{T}_N$ for piecewise polynomial approximation, by a greedy tree algorithm. For any finite sub-tree $\mathcal{S} \subset \mathcal{D}$, we denote by
\[
\mathcal{L}(\mathcal{S}) := \{T \in \mathcal{S} \ \text{s.t.}\ \mathcal{C}(T) \not\subset \mathcal{S}\}
\]
its leaves, which form a partition of $\Omega$. We also denote by
\[
\mathcal{I}(\mathcal{S}) := \mathcal{S} \setminus \mathcal{L}(\mathcal{S})
\]
its inner nodes. Note that any finite partition of $\Omega$ by elements of $\mathcal{D}$ is the set of leaves of a finite sub-tree. One easily checks that
\[
\#(\mathcal{S}) = 2\, \#(\mathcal{L}(\mathcal{S})) - N_0, \qquad N_0 := \#(\mathcal{D}_0).
\]
For each $N$, the greedy tree algorithm defines a finite sub-tree $\mathcal{S}_N$ of $\mathcal{D}$ which grows from $\mathcal{S}_{N_0} := \mathcal{D}_0 = \mathcal{T}_{N_0}$, by adding to $\mathcal{S}_{N-1}$ the two children of the triangle $T^*_{N-1}$ which maximizes the local $L^p$-error $e_T(f)_p$ over all triangles in $\mathcal{T}_{N-1}$.

The adaptive partition $\mathcal{T}_N$ associated with the greedy algorithm is defined by
\[
\mathcal{T}_N := \mathcal{L}(\mathcal{S}_N).
\]
Similarly to $\mathcal{D}$, the triangulation $\mathcal{T}_N$ depends on $f$ and on the refinement rule $\mathcal{R}$, but also on $p$ and on the choice of the approximation operator $A_T$. We denote by $f_N$ the piecewise polynomial approximation to $f$ which is defined as $A_T f$ on each $T \in \mathcal{T}_N$. The global $L^p$ approximation error is thus given by
\[
\|f - f_N\|_{L^p} = \big\|(e_T(f)_p)\big\|_{\ell^p(\mathcal{T}_N)}.
\]
Stopping criteria for the algorithm can be defined in various ways:
– Number of triangles: stop once a prescribed $N$ is attained.
– Local error: stop once $e_T(f)_p \le \varepsilon$ for all $T \in \mathcal{T}_N$, for some prescribed $\varepsilon > 0$.
– Global error: stop once $\|f - f_N\|_{L^p} \le \varepsilon$ for some prescribed $\varepsilon > 0$.

Remark 7.2.3. The role of the triangle selection based on the largest $e_T(f)_p$ is to equidistribute the local $L^p$ error, a feature which is desirable when we want to approximate $f$ in $L^p(\Omega)$ with the smallest number of triangles. However, it should be well understood that the refinement rule may still be chosen based on a decision function defined by approximation errors in norms that differ from $L^p$.
In particular, as explained earlier, the decision function (7.6) generates triangles which tend to have an optimal aspect ratio, locally adapted to the Hessian $d^2 f$ when $f$ is strictly convex or concave, and this aspect ratio is independent of the $L^p$ norm in which one wants to minimize the error between $f$ and its piecewise affine approximation.

The greedy algorithm is one particular way of deriving an adaptive triangulation for $f$ within the multiresolution framework defined by the infinite tree $\mathcal{D}$. An interesting alternative is to build adaptive triangulations within $\mathcal{D}$ which offer an optimal trade-off between error and complexity. This can be done when $1 \le p < \infty$ by solving the minimization problem
\[
\min_{\mathcal{S}} \Big\{ \sum_{T \in \mathcal{L}(\mathcal{S})} e_T(f)_p^p + \lambda\, \#(\mathcal{S}) \Big\} \tag{7.7}
\]
among all finite trees, for some fixed $\lambda > 0$. In this approach, we do not directly control the number of triangles, which depends on the penalty parameter $\lambda$. However, it is immediate to see that if $N = N(\lambda)$ is the cardinality of $\mathcal{T}_N^* = \mathcal{L}(\mathcal{S}^*)$, where $\mathcal{S}^*$ is the minimizing tree, then $\mathcal{T}_N^*$ minimizes the $L^p$ approximation error:
\[
\mathcal{T}_N^* := \operatorname*{argmin}_{\#(\mathcal{T}) \le N}\ \sum_{T \in \mathcal{T}} e_T(f)_p^p,
\]
where the minimum is taken among all partitions $\mathcal{T}$ of $\Omega$ within $\mathcal{D}$ of cardinality less than or equal to $N$.

Due to the additive structure of the error term, the minimization problem (7.7) can be solved in fast computational time using an optimal pruning algorithm of CART type, see [17, 49]. In the case $p = \infty$ the associated minimization problem
\[
\min_{\mathcal{S}} \Big\{ \sup_{T \in \mathcal{L}(\mathcal{S})} e_T(f)_\infty + \lambda\, \#(\mathcal{S}) \Big\} \tag{7.8}
\]
can also be solved by a similar fast algorithm. It is obvious that this method improves over the greedy tree algorithm: if $N$ is the cardinality of the triangulation resulting from the minimization in (7.7) and $f_N^*$ the corresponding piecewise polynomial approximation of $f$ associated with this triangulation, we have
\[
\|f - f_N^*\|_{L^p} \le \|f - f_N\|_{L^p},
\]
where $f_N$ is built by the greedy tree algorithm.

The multiresolution framework allows us to introduce the piecewise polynomial multiresolution spaces
\[
V_j = V_j(f, \mathcal{R}) := \{g \ \text{s.t.}\ g|_T \in \mathbb{P}_m,\ T \in \mathcal{D}_j\},
\]
which depend on $f$ and on the refinement rule $\mathcal{R}$. These spaces are nested and we denote by
\[
V = V(f, \mathcal{R}) = \cup_{j \ge 0} V_j(f, \mathcal{R})
\]
their union. For notational simplicity, we sometimes omit the dependence on $f$ and $\mathcal{R}$ when there is no possible ambiguity.

The $V_j$ spaces may be used to construct wavelet bases, following the approach introduced in [1] and that we describe in our present setting.

The space $V_j$ is equipped with an orthonormal scaling function basis:
\[
\varphi_T^i, \qquad i = 1, \cdots, \tfrac{1}{2}(m+1)(m+2), \quad T \in \mathcal{D}_j,
\]
where the $\varphi_T^i$ for $i = 1, \cdots, \tfrac{1}{2}(m+1)(m+2)$ are supported in $T$ and constitute an orthonormal basis of $\mathbb{P}_m$ in the sense of $L^2(T)$ for each $T \in \mathcal{D}$. There are several possible choices for such a basis. In the particular case where $m = 1$, a simple one is to take, for $T$ with vertices $(v_1, v_2, v_3)$,
\[
\varphi_T^i(v_i) = |T|^{-1/2} \sqrt{3} \quad \text{and} \quad \varphi_T^i(v_j) = -|T|^{-1/2} \sqrt{3}, \quad j \ne i.
\]
We denote by $P_j$ the orthogonal projection onto $V_j$:
\[
P_j g := \sum_{T \in \mathcal{D}_j} \sum_i \langle g, \varphi_T^i \rangle\, \varphi_T^i.
\]
We next introduce for each $T \in \mathcal{D}_j$ a set of wavelets
\[
\psi_T^i, \qquad i = 1, \cdots, \tfrac{1}{2}(m+1)(m+2),
\]
which constitutes an orthonormal basis of the orthogonal complement of $\mathbb{P}_m(T)$ in $\mathbb{P}_m(T') \oplus \mathbb{P}_m(T'')$, with $\{T', T''\}$ the children of $T$.
In the particular case where $m = 1$, a simple choice for such a basis is as follows: if $(v_1, v_2, v_3)$ and $(w_1, w_2, w_3)$ denote the vertices of $T'$ and $T''$, with the convention that $v_1 = w_1$ and $v_2 = w_2$ denote the common vertices, $v_2$ being the mid-point of the segment $(v_3, w_3)$ (i.e. $T$ has vertices $(v_3, w_3, v_1)$), then
\[
\begin{aligned}
\psi_T^1 &:= \frac{\varphi_{T'}^3 - \varphi_{T''}^3}{\sqrt{2}}, \\
\psi_T^2 &:= \frac{\varphi_{T'}^1 - \varphi_{T'}^2 - \varphi_{T''}^1 + \varphi_{T''}^2}{2}, \\
\psi_T^3 &:= \frac{\varphi_{T'}^2 - \varphi_{T'}^3 + \varphi_{T''}^2 - \varphi_{T''}^3}{2},
\end{aligned}
\]
where $\varphi_{T'}^i$ and $\varphi_{T''}^i$ are the above defined scaling functions.

The family
\[
\psi_T^i, \qquad i = 1, \cdots, \tfrac{1}{2}(m+1)(m+2), \quad T \in \mathcal{D}_j,
\]
constitutes an orthonormal basis of $W_j$, the $L^2$-orthogonal complement of $V_j$ in $V_{j+1}$. A multiscale orthonormal basis of $V_J$ is given by
\[
\{\varphi_T^i\}_{T \in \mathcal{D}_0} \cup \{\psi_T^i\}_{i = 1, \cdots, \frac{1}{2}(m+1)(m+2),\ T \in \mathcal{D}_j,\ j = 0, \cdots, J-1}.
\]
Letting $J$ go to $+\infty$ we thus obtain that
\[
\{\varphi_T^i\}_{T \in \mathcal{D}_0} \cup \{\psi_T^i\}_{i = 1, \cdots, \frac{1}{2}(m+1)(m+2),\ T \in \mathcal{D}_j,\ j \ge 0}
\]
is an orthonormal basis of the space
\[
\overline{V}(f, \mathcal{R})_2 := \overline{V(f, \mathcal{R})}^{L^2(\Omega)} = \overline{\cup_{j \ge 0} V_j(f, \mathcal{R})}^{L^2(\Omega)}.
\]
For the sake of notational simplicity, we rewrite this basis as $(\psi_\lambda)_{\lambda \in \Lambda}$.

Note that $V(f, \mathcal{R})$ is not necessarily dense in $L^2(\Omega)$ and so $\overline{V}(f, \mathcal{R})_2$ is not always equal to $L^2(\Omega)$. Therefore, the expansion of an arbitrary function $g \in L^2(\Omega)$ in the above wavelet basis does not always converge towards $g$ in $L^2(\Omega)$. The same remark holds for the $L^p$ convergence of the wavelet expansion of an arbitrary function $g \in L^p(\Omega)$ (or $C(\Omega)$ in the case $p = \infty$): $L^p$-convergence holds when the space
\[
\overline{V}(f, \mathcal{R})_p := \overline{V(f, \mathcal{R})}^{L^p(\Omega)}
\]
coincides with $L^p(\Omega)$ (or contains $C(\Omega)$ in the case $p = \infty$), since we have
\[
\Big\| f - \sum_{|\lambda| < j} \langle f, \psi_\lambda \rangle\, \psi_\lambda \Big\|_{L^p} = \|f - P_j f\|_{L^p} \le C \inf_{g \in V_j} \|f - g\|_{L^p}.
\]
However, this condition might not hold for the hierarchy $(\mathcal{D}_j)_{j \ge 0}$ produced by the refinement procedure.
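The $m = 1$ scaling functions and wavelets discussed above can be checked numerically. The sketch below is an illustration under explicit assumptions rather than part of the original construction: it assumes the labelling $T' = (v_1, v_2, v_3)$, $T'' = (v_1, v_2, w_3)$ with $v_2$ the mid-point of $(v_3, w_3)$, and the wavelet coefficients listed in `psi` are one choice that is consistent with this labelling. The helper names are hypothetical. Affine functions are represented by their vertex values, and their $L^2(T)$ inner product is computed exactly from $\int_T \lambda_i \lambda_j = |T|/12$ for $i \ne j$ and $|T|/6$ for $i = j$.

```python
import math


def inner(u, v, area):
    """Exact L2(T) inner product of two affine functions on a triangle of the
    given area, each described by its three vertex values."""
    return area / 12.0 * (sum(a * b for a, b in zip(u, v)) + sum(u) * sum(v))


def scaling_values(area):
    """Vertex values of phi^1, phi^2, phi^3: +s at vertex i, -s elsewhere."""
    s = math.sqrt(3.0 / area)
    return [[s if j == i else -s for j in range(3)] for i in range(3)]


def tri_area(tri):
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


# Assumed labelling: parent T = (v3, w3, v1), v2 = midpoint of (v3, w3).
v1, v3, w3 = (0.2, 1.0), (0.0, 0.0), (1.5, 0.3)
v2 = ((v3[0] + w3[0]) / 2, (v3[1] + w3[1]) / 2)
children = [(v1, v2, v3), (v1, v2, w3)]
a = tri_area(children[0])            # both children have the same area
phi = scaling_values(a)

# Gram matrix of the three scaling functions on one child.
gram_phi = [[inner(phi[i], phi[j], a) for j in range(3)] for i in range(3)]

# Wavelets as coefficient vectors over
# (phi_T'^1, phi_T'^2, phi_T'^3, phi_T''^1, phi_T''^2, phi_T''^3).
r = 1.0 / math.sqrt(2.0)
psi = [[0.0, 0.0, r, 0.0, 0.0, -r],
       [0.5, -0.5, 0.0, -0.5, 0.5, 0.0],
       [0.0, 0.5, -0.5, 0.0, 0.5, -0.5]]


def coeffs(ell):
    """Coefficients of the affine function ell(x, y) in the six scaling functions."""
    out = []
    for tri in children:
        vals = [ell(x, y) for x, y in tri]
        out += [inner(vals, phi[i], a) for i in range(3)]
    return out


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


gram_psi = [[dot(p, q) for q in psi] for p in psi]
affine = (lambda x, y: 1.0, lambda x, y: x, lambda x, y: y)
against_P1 = [[dot(p, coeffs(ell)) for ell in affine] for p in psi]
```

Since the scaling functions of the two children form an orthonormal family, inner products between elements of their span reduce to dot products of coefficient vectors. Both Gram matrices should be the identity up to rounding, and every entry of `against_P1` should vanish, which expresses orthogonality of the wavelets to the affine functions on the parent triangle.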
On the other hand, the multiresolution approximation being intrinsically adapted to $f$, a more reasonable requirement is that the expansion of $f$ converges towards $f$ in $L^p(\Omega)$ when $f \in L^p(\Omega)$ (or $C(\Omega)$ in the case $p = \infty$). This is equivalent to the property
\[
f \in \overline{V}(f, \mathcal{R})_p.
\]
We may then define an adaptive approximation of $f$ by thresholding its coefficients at some level $\varepsilon > 0$:
\[
f_\varepsilon := \sum_{|f_\lambda| \ge \varepsilon} f_\lambda \psi_\lambda,
\]
where $f_\lambda := \langle f, \psi_\lambda \rangle$. When measuring the error in the $L^p$ norm, a more natural choice is to perform thresholding on the component of the expansion measured in this norm, defining therefore
\[
f_\varepsilon := \sum_{\|f_\lambda \psi_\lambda\|_{L^p} \ge \varepsilon} f_\lambda \psi_\lambda.
\]
We shall next see that the condition $f \in \overline{V}(f, \mathcal{R})_p$ also ensures the convergence of the tree-based adaptive approximations $f_N$ and $f_N^*$ towards $f$ in $L^p(\Omega)$. We shall also see that this condition may not hold for certain functions $f$, but that this difficulty can be circumvented by a modification of the refinement procedure.

7.3 Convergence analysis

The following result relates the convergence towards $f$ of its approximations by projection onto the spaces $V_j(f, \mathcal{R})$, greedy and optimal tree algorithms, and wavelet thresholding. This result is valid for any refinement rule $\mathcal{R}$.

Theorem 7.3.1. Let $\mathcal{R} : (f, T) \mapsto e$ be an arbitrary refinement rule and let $f \in L^p(\Omega)$. The following statements are equivalent:
(i) $f \in \overline{V}(f, \mathcal{R})_p$.
(ii) The greedy tree approximation converges: $\lim_{N \to +\infty} \|f - f_N\|_{L^p} = 0$.
(iii) The optimal tree approximation converges: $\lim_{N \to +\infty} \|f - f_N^*\|_{L^p} = 0$.
In the case $p = 2$, they are also equivalent to:
(iv) The thresholding approximation converges: $\lim_{\varepsilon \to 0} \|f - f_\varepsilon\|_{L^2} = 0$.

Proof: Clearly, (ii) implies (iii) since $\|f - f_N^*\|_{L^p} \le \|f - f_N\|_{L^p}$.
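Since the triangulation $\mathcal{D}_N$ is a refinement of $\mathcal{T}_N^*$, we also find that (iii) implies (i), as $\inf_{g \in V_N} \|f - g\|_{L^p} \le \|f - f_N^*\|_{L^p}$.

We next show that (i) implies (ii). We first note that a consequence of (i) is that
\[
\lim_{j \to +\infty}\ \sup_{T \in \mathcal{D}_j} e_T(f)_p = 0.
\]
It follows that for any $\eta > 0$, there exists $N(\eta)$ such that for $N > N(\eta)$, all triangles $T \in \mathcal{T}_N$ satisfy
\[
e_T(f)_p \le \eta.
\]
On the other hand, (i) means that for all $\varepsilon > 0$, there exists $J = J(\varepsilon)$ such that
\[
\inf_{g \in V_J} \|f - g\|_{L^p} \le \varepsilon.
\]
For $N > N(\eta)$, we now split $\mathcal{T}_N$ into $\mathcal{T}_N^+ \cup \mathcal{T}_N^-$, where
\[
\mathcal{T}_N^+ := \mathcal{T}_N \cap (\cup_{j \ge J} \mathcal{D}_j) \quad \text{and} \quad \mathcal{T}_N^- := \mathcal{T}_N \cap (\cup_{j < J} \mathcal{D}_j).
\]
Every $T \in \mathcal{T}_N^+$ is contained in a triangle of $\mathcal{D}_J$, so that (7.4) bounds the contribution of $\mathcal{T}_N^+$ by $C^p \inf_{g \in V_J} \|f - g\|_{L^p}^p$, while $\#(\mathcal{T}_N^-) \le 2^J N_0$ with $N_0 := \#(\mathcal{D}_0)$. When $p < \infty$ we therefore obtain
\[
\|f - f_N\|_{L^p}^p = \sum_{T \in \mathcal{T}_N^+} e_T(f)_p^p + \sum_{T \in \mathcal{T}_N^-} e_T(f)_p^p \le C^p \varepsilon^p + 2^{J(\varepsilon)} N_0\, \eta^p.
\]
This implies (ii), since for any $\delta > 0$ we can first choose $\varepsilon > 0$ such that $C^p \varepsilon^p < \delta/2$, and then choose $\eta > 0$ such that $2^{J(\varepsilon)} N_0\, \eta^p < \delta/2$. When $p = \infty$ the estimate is modified into
\[
\|f - f_N\|_{L^\infty} \le \max\{C \varepsilon, \eta\},
\]
which also implies (ii) by a similar reasoning.

We finally prove the equivalence between (i) and (iv) when $p = 2$. Property (i) is equivalent to the $L^2$ convergence of the orthogonal projection $P_j f$ to $f$ as $j \to +\infty$, or equivalently of the partial sum $\sum_{|\lambda| < j} f_\lambda \psi_\lambda$ to $f$. Since $(\psi_\lambda)_{\lambda \in \Lambda}$ is an orthonormal basis of $\overline{V}(f, \mathcal{R})_2$, denoting by $P f$ the orthogonal projection of $f$ onto this space we have $\|f - f_\varepsilon\|_{L^2}^2 = \|f - P f\|_{L^2}^2 + \sum_{|f_\lambda| < \varepsilon} |f_\lambda|^2$, which tends to $\|f - P f\|_{L^2}^2$ as $\varepsilon \to 0$. Hence (iv) holds if and only if $f = P f$, which is (i). $\diamond$

The equivalence between statements (i) and (iv) can be extended to […] such that for any finitely supported sequences $(c_\lambda)$ and $(d_\lambda)$ such that $|c_\lambda| \le |d_\lambda|$ for all $\lambda$, one has
\[
\Big\| \sum c_\lambda \psi_\lambda \Big\|_{L^p} \le C\, \Big\| \sum d_\lambda \psi_\lambda \Big\|_{L^p}.
\]
A consequence of this property is that if $f \in L^p$ can be expressed as the $L^p$ limit
\[
f = \lim_{j \to +\infty} \sum_{|\lambda| < j} f_\lambda \psi_\lambda,
\]
[…]

We now show that if we use a greedy refinement rule $\mathcal{R}$ based on a decision function defined either from the interpolation error or from the $L^2$ projection error after bisection, there exist functions $f \in L^p(\Omega)$ such that $f \notin \overline{V}(f, \mathcal{R})_p$.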
Without loss of generality, it is enough to construct $f$ on the reference triangle $T_{\mathrm{ref}}$ of vertices $\{(0,0), (1,0), (1,1)\}$, since our construction can be adapted to any triangle by an affine change of variables.

Consider first a decision function defined from the interpolation error after bisection, such as (7.6), or more generally
\[
d_T(f, e) := \|f - I_{T_e^1} f\|_{L^p(T_e^1)}^p + \|f - I_{T_e^2} f\|_{L^p(T_e^2)}^p.
\]
Let $f$ be a continuous function which is not identically $0$ on $T_{\mathrm{ref}}$ and which vanishes at all points $(x, y)$ such that $x = \frac{k}{2m}$ for $k = 0, 1, \cdots, 2m$, where $m$ is the degree of polynomial approximation. For such an $f$, it is easy to see that $I_T f = 0$ for $T = T_{\mathrm{ref}}$ and that $I_{T'} f = 0$ for all subtriangles $T'$ obtained by one bisection of $T_{\mathrm{ref}}$. This shows that there is no preferred bisection. Assuming that we bisect from the vertex $(0,0)$ to the opposite mid-point $(1, \frac{1}{2})$, we find that a similar situation occurs when splitting the two subtriangles. Iterating this observation, we see that an admissible choice of bisections leads after $j$ steps to a triangulation $\mathcal{D}_j$ of $T_{\mathrm{ref}}$ consisting of the triangles $T_{j,k}$ with vertices $\{(0,0), (1, 2^{-j} k), (1, 2^{-j}(k+1))\}$ with $k = 0, \cdots, 2^j - 1$. On each of these triangles $f$ is interpolated by the null function, and therefore by (7.4) the best $L^\infty$ approximation in $V_j$ does not converge to $f$ as $j \to +\infty$, i.e. $f \notin \overline{V}(f, \mathcal{R})_\infty$. It can also easily be checked that $f \notin \overline{V}(f, \mathcal{R})_p$.

Similar counter-examples can be constructed when the decision function is defined from the $L^2$ projection error, and has the form
\[
d_T(f, e) := \|f - P_{T_e^1} f\|_{L^p(T_e^1)}^p + \|f - P_{T_e^2} f\|_{L^p(T_e^2)}^p.
\]
Here, we describe such a construction in the case $m = 1$.
We define $f$ as a function of the first variable alone, given by
\[ f(x,y)=u(x)\ \text{ if } x\in\big[0,\tfrac12\big],\qquad f(x,y)=u\big(x-\tfrac12\big)\ \text{ if } x\in\big(\tfrac12,1\big], \]
where $u$ is a non-trivial function in $L^2([0,\frac12])$ such that
\[ \int_0^{1/2}u(x)\,dx=\int_0^{1/2}x\,u(x)\,dx=\int_0^{1/2}x^2u(x)\,dx=0. \]
A possible choice is $u(x)=L_3(4x-1)=160x^3-120x^2+24x-1$, where $L_3$ is the Legendre polynomial of degree 3 defined on $[-1,1]$.

Lemma 7.3.3. Let $T$ be any triangle such that its vertices have $x$ coordinates either $(0,\frac12,1)$, or $(0,1,1)$, or $(\frac12,1,1)$. Then $f$ is orthogonal to $\mathbb{P}_1$ in $L^2(T)$.

Proof: Define
\[ T_0:=\big\{(x,y)\in T,\ x\in[0,\tfrac12]\big\}\quad\text{and}\quad T_1:=\big\{(x,y)\in T,\ x\in(\tfrac12,1]\big\}. \]
Then, with $v(x,y)$ being either the function $1$ or $x$ or $y$, we have
\[ \int_T f\,v\,dx\,dy=\int_{T_0}u(x)\,v(x,y)\,dx\,dy+\int_{T_1}u\big(x-\tfrac12\big)\,v(x,y)\,dx\,dy=\int_0^{1/2}u(x)\,q_0(x)\,dx+\int_0^{1/2}u(x)\,q_1(x)\,dx, \]
with
\[ q_0(x):=\int_{T_{0,x}}v(x,y)\,dy,\qquad q_1(x):=\int_{T_{1,x+1/2}}v\big(x+\tfrac12,y\big)\,dy, \]
where $T_{i,x}=\{y\ :\ (x,y)\in T_i\}$ for $i=0,1$. The functions $q_0$ and $q_1$ are polynomials of degree at most 2 and we thus obtain from the properties of $u$ that $\int_T f\,v=0$. $\diamond$

The above lemma shows that for any of the three possible choices of bisection of $T_{\rm ref}$ based on the $L^2$ decision function, the error is left unchanged since the projection of $f$ on all possible sub-triangles is $0$. There is therefore no preferred choice, and assuming that we bisect from the vertex $(0,0)$ to the opposite mid-point $(1,\frac12)$, we see that a similar situation occurs when splitting the two subtriangles.
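The moment conditions imposed on $u$ can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original text: the helper names `u`, `legendre3` and `moment` are hypothetical, and the polynomial coefficients are those of the choice $u(x)=L_3(4x-1)$ given above.

```python
from fractions import Fraction

# Coefficients of u(x) = 160 x^3 - 120 x^2 + 24 x - 1, lowest degree first.
U_COEFFS = [Fraction(-1), Fraction(24), Fraction(-120), Fraction(160)]
HALF = Fraction(1, 2)


def u(x):
    """Evaluate u at a rational point x."""
    return sum(c * x ** i for i, c in enumerate(U_COEFFS))


def legendre3(t):
    """Legendre polynomial of degree 3 on [-1, 1]."""
    return (5 * t ** 3 - 3 * t) / 2


def moment(k):
    """Exact integral of x^k * u(x) over [0, 1/2]."""
    # The integral of c * x^(i+k) over [0, 1/2] is c * (1/2)^(i+k+1) / (i+k+1).
    return sum(c * HALF ** (i + k + 1) / (i + k + 1)
               for i, c in enumerate(U_COEFFS))


# Moments of order 0, 1, 2 (required to vanish) and of order 3 (not required to).
moments = [moment(k) for k in range(4)]
print(moments)
```

The first three moments vanish, which is exactly what the proof of the lemma uses; the third-order moment does not, so the construction is specific to the case $m=1$.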
The rest of the arguments showing that $f\notin V(f,\mathcal{R})^p$ are the same as in the previous counter-example.

The above two examples of non-convergence reflect the fact that when $f$ has some oscillations, the refinement procedure cannot determine the most appropriate bisection. In order to circumvent this difficulty one needs to modify the refinement rule.

Our modification consists of bisecting from the most recently generated vertex of $T$ whenever the local error is not sufficiently reduced by any of the three bisections. More precisely, we modify the choice of the bisection of any $T$ as follows:

Let $e$ be the edge which minimizes the decision function $d_T(f,e)$. If
\[ \big(e_{T_e^1}(f)_p^p+e_{T_e^2}(f)_p^p\big)^{1/p}\le\theta\,e_T(f)_p, \]
we bisect $T$ towards the edge $e$ (greedy bisection). Otherwise, we bisect $T$ from its most recently generated vertex (newest vertex bisection). Here $\theta$ is a fixed number in $(0,1)$. In the case $p=\infty$ we use the condition
\[ \max\{e_{T_e^1}(f)_\infty,\ e_{T_e^2}(f)_\infty\}\le\theta\,e_T(f)_\infty. \]
This new refinement rule benefits from the mesh size reduction properties of newest vertex bisection. Indeed, a property illustrated in Figure 7.1 is that a sequence $\{BNN\}$ of one arbitrary bisection ($B$) followed by two newest vertex bisections ($N$) produces triangles with diameters bounded by half the diameter of the initial triangle. A more general property, whose proof is elementary yet tedious, is the following: a sequence of the type $\{BNB\cdots BN\}$ of length $k+2$ with a newest vertex bisection at iterations $2$ and $k+2$ produces triangles with diameter bounded by $(1-2^{-k})$ times the diameter of the initial triangle, the worst case being illustrated in Figure 7.2 with $k=3$.

Our next result shows that the modified algorithm now converges for any $f\in L^p$.

Figure 7.1: a $\{BNN\}$ sequence.

Figure 7.2: a $\{BNBBN\}$ sequence (the dark triangle has diameter at most $7/8$ of the initial diameter).

Theorem 7.3.4.
With $\mathcal{R}$ defined as the modified bisection rule, we have $f\in V(f,\mathcal{R})^p$ for any $f\in L^p(\Omega)$ (or $C(\Omega)$ when $p=\infty$).

Proof: We first give the proof when $p<\infty$. For each triangle $T\in\mathcal{D}_j$ with $j\ge1$, we introduce the two quantities
\[ \alpha(T):=\frac{e_T(f)_p^p}{e_T(f)_p^p+e_{T'}(f)_p^p}\quad\text{and}\quad\beta(T):=\frac{e_T(f)_p^p+e_{T'}(f)_p^p}{e_{P(T)}(f)_p^p}, \]
where $T'$ is the "brother" of $T$, i.e. $\mathcal{C}(P(T))=\{T,T'\}$. When a greedy bisection occurs in the split of $P(T)$, we have $\beta(T)\le\theta^p$. When a newest vertex bisection occurs, we have
\[ \beta(T)\le C^p\,\frac{\inf_{\pi\in\mathbb{P}_m}\|f-\pi\|^p_{L^p(T)}+\inf_{\pi\in\mathbb{P}_m}\|f-\pi\|^p_{L^p(T')}}{\inf_{\pi\in\mathbb{P}_m}\|f-\pi\|^p_{L^p(P(T))}}\le C^p, \]
where $C$ is the constant of (7.4).

We now consider a given level index $j>0$ and the triangulation $\mathcal{D}_j$. For each $T\in\mathcal{D}_j$, we consider the chain of nested triangles $(T_n)_{n=0}^j$ with $T_j=T$ and $T_{n-1}=P(T_n)$, $n=j,j-1,\cdots,1$. We define
\[ \bar\alpha(T)=\prod_{n=1}^j\alpha(T_n)\quad\text{and}\quad\bar\beta(T)=\prod_{n=1}^j\beta(T_n). \]
It is easy to see that
\[ \bar\alpha(T)\,\bar\beta(T)=\frac{e_T(f)_p^p}{e_{T_0}(f)_p^p}\quad\text{so that}\quad e_T(f)_p^p\le C_0\,\bar\alpha(T)\,\bar\beta(T), \]
with $C_0:=\max_{T\in\mathcal{D}_0}e_T(f)_p^p$. It is also easy to check by induction on $j$ that
\[ \sum_{T\in\mathcal{D}_j}\bar\alpha(T)=\sum_{T\in\mathcal{D}_{j-1}}\bar\alpha(T)=\cdots=\sum_{T\in\mathcal{D}_0}\bar\alpha(T)=\#(\mathcal{D}_0). \]
We denote by $f_j$ the approximation to $f$ in $V_j$ defined by $f_j=A_Tf$ on all $T\in\mathcal{D}_j$, so that
\[ \|f-f_j\|^p_{L^p}=\sum_{T\in\mathcal{D}_j}e_T(f)_p^p. \]
In order to prove that $f_j$ converges to $f$ in $L^p$, it is sufficient to show that the sequence
\[ \varepsilon_j:=\max_{T\in\mathcal{D}_j}\min\{\bar\beta(T),\operatorname{diam}(T)\} \]
tends to $0$ as $j$ grows. Indeed, if this holds, we split $\mathcal{D}_j$ into two sets $\mathcal{D}_j^+$ and $\mathcal{D}_j^-$ over which $\bar\beta(T)\le\varepsilon_j$ and $\operatorname{diam}(T)\le\varepsilon_j$ respectively.
We can then write
\[ \|f-f_j\|^p_{L^p}=\sum_{T\in\mathcal{D}_j^+}e_T(f)_p^p+\sum_{T\in\mathcal{D}_j^-}e_T(f)_p^p\le C_0\,\varepsilon_j\sum_{T\in\mathcal{D}_j^+}\bar\alpha(T)+C^p\sum_{T\in\mathcal{D}_j^-}\inf_{\pi\in\mathbb{P}_m}\|f-\pi\|^p_{L^p(T)}\le C_0\,\#(\mathcal{D}_0)\,\varepsilon_j+C^p\sum_{T\in\mathcal{D}_j^-}\inf_{\pi\in\mathbb{P}_m}\|f-\pi\|^p_{L^p(T)}, \]
where $C$ is the constant of (7.4). Clearly the first term tends to $0$, and so does the second term by standard properties of $L^p$ spaces since the diameter of the triangles in $\mathcal{D}_j^-$ goes to $0$. It thus remains to prove that
\[ \lim_{j\to+\infty}\varepsilon_j=0. \]
Again we consider the chain $(T_n)_{n=0}^j$ which terminates at $T$, and we associate to it a chain $(q_n)_{n=0}^{j-1}$ where $q_n=1$ or $2$ if the bisection of $T_n$ is greedy or newest vertex respectively. If $r$ is the total number of $2$ in the chain $(q_n)$ we have
\[ \bar\beta(T)\le C^{pr}\theta^{p(j-r)}, \]
with $C$ the constant in (7.4). Let $k>0$ be a fixed integer such that $C\theta^{k-1}\le1$. We thus have
\[ \bar\beta(T)\le\big(C^p\theta^{p(k-1)}\big)^r\theta^{p(j-rk)}\le\theta^{p(j-rk)}. \qquad (7.9) \]
We now denote by $l$ the maximal number of disjoint sub-chains of the type
\[ (\nu_1,2,\nu_2,\nu_3,\cdots,\nu_q,2) \]
with $\nu_j\in\{1,2\}$ and of length $q+2\le k+3$ which can be extracted from $(q_n)_{n=0}^{j-1}$. From the remarks on the diameter reduction properties of newest vertex bisection, we see that
\[ \operatorname{diam}(T)\le B_0\,(1-2^{-k})^l, \]
with $B_0:=\max_{T\in\mathcal{D}_0}\operatorname{diam}(T)$ a fixed constant. On the other hand, it is not difficult to check that
\[ r\le3l+3+\frac{j-r}{k}. \qquad (7.10) \]
Indeed let $\alpha_0$ be the total number of $1$ in the sequence $(q_n)$ which are not preceded by a $2$, and let $\alpha_i$ be the size of the series of $1$ following the $i$-th occurrence of $2$ in $(q_n)$ for $i=1,\cdots,r$. Note that some $\alpha_i$ might be $0$. Clearly we have
\[ j=\alpha_0+\alpha_1+\cdots+\alpha_r+r. \]
From the above equality, the number of $i$ such that $\alpha_i>k$ is less than $\frac{j-r}{k}$, and therefore there are at least $m\ge r-\frac{j-r}{k}$ indices $\{i_0,\cdots,i_{m-1}\}$ such that $\alpha_i\le k$.
Denoting by $\beta_i$ the position of the $i$-th occurrence of $2$ (so that $\beta_{i+1}=\beta_i+\alpha_i+1$), we now consider the disjoint sequences of indices
\[ S_t=\{\beta_{i_{3t}},\cdots,\beta_{i_{3t}+2}\},\quad t=0,1,\cdots \]
There are at least $\frac m3-1$ such sequences within $\{1,\cdots,j\}$ and by construction each of them contains a sequence of the type $(\nu_1,2,\nu_2,\nu_3,\cdots,\nu_q,2)$ with $\nu_j\in\{1,2\}$ and of length $q+2\le k+3$. Therefore the maximal number of disjoint sequences of such type satisfies
\[ l\ge\frac13\Big(r-\frac{j-r}{k}\Big)-1, \]
which is equivalent to (7.10). Therefore according to (7.9)
\[ \bar\beta(T)\le\theta^{p(j-rk)}\le\theta^{p(j-3(l+1)k+r)}. \]
If $3(l+1)k\le\frac j2$, we have $\bar\beta(T)\le\theta^{pj/2}$. On the other hand, if $3(l+1)k\ge\frac j2$, we have
\[ \operatorname{diam}(T)\le B_0\,(1-2^{-k})^{\frac{j}{6k}-1}. \]
We therefore conclude that $\varepsilon_j$ goes to $0$ as $j$ grows, which proves the result for $p<\infty$.

We briefly sketch the proof for $p=\infty$, which is simpler. We now define $\beta(T)$ as
\[ \beta(T):=\frac{e_T(f)_\infty}{e_{P(T)}(f)_\infty}, \]
so that $\beta(T)\le\theta$ if a greedy bisection occurs in the split of $P(T)$. With the same definition of $\bar\beta(T)$ we now have
\[ e_T(f)_\infty\le C_0\,\bar\beta(T), \]
where $C_0:=\max_{T\in\mathcal{D}_0}e_T(f)_\infty$. With the same definition of $\varepsilon_j$ and splitting of $\mathcal{D}_j$, we now reach
\[ \|f-f_j\|_{L^\infty}\le\max\Big\{C_0\,\varepsilon_j,\ C\max_{T\in\mathcal{D}_j^-}\inf_{\pi\in\mathbb{P}_m}\|f-\pi\|_{L^\infty(T)}\Big\}, \]
which again tends to $0$ if $\varepsilon_j$ tends to $0$ and $f$ is continuous. The proof that $\varepsilon_j$ tends to $0$ as $j$ grows is then similar to the case $p<\infty$. $\diamond$

Remark 7.3.5. The choice of the parameter $\theta<1$ deserves some attention: if it is chosen too small, then most bisections are of type $N$ and we end up with an isotropic triangulation. In the case $m=1$, a proper choice can be found by observing that when $f$ has $C^2$ smoothness, it can be locally approximated by a quadratic polynomial $q\in\mathbb{P}_2$ with $e_T(f)_p\approx e_T(q)_p$ when the triangle $T$ is small enough.
For such quadratic functions $q\in\mathbb{P}_2$, one can explicitly study the minimal error reduction which is always ensured by the greedy refinement rule defined by a given decision function. In the particular case $p=2$ and with the choice $A_T=P_T$ which is considered in the numerical experiments of §[…], the error $\|q-P_Tq\|_{L^2(T)}$ can be obtained by formal computing and can be used to prove a guaranteed error reduction by a factor $\theta^*=$ […]. It is therefore natural to choose $\theta$ such that $\theta^*<\theta<1$ (for example $\theta=$ […]), which ensures that bisections of type $N$ only occur in the early steps of the algorithm, when the function still exhibits too many oscillations on some triangles.

The following numerical experiments were conducted with piecewise linear approximation in the $L^2$ setting: we use the $L^2$-based decision function
\[ d_T(f,e):=\|f-P_{T_e^1}f\|^2_{L^2(T_e^1)}+\|f-P_{T_e^2}f\|^2_{L^2(T_e^2)}, \]
and we take for $A_T$ the $L^2(T)$-orthogonal projection $P_T$ onto $\mathbb{P}_1$. In these experiments, the function $f$ is either a quadratic polynomial or a function with a simple analytic expression, which allows us to compute the quantities $e_T(f)$ and $d_T(f,e)$ without any quadrature error, or a numerical image, in which case the computation of these quantities is discretized on the pixel grid.

Our first goal is to illustrate numerically the optimal adaptation properties of the refinement procedure in terms of triangle shape. For this purpose, we take $f=q$ a quadratic form, i.e. a homogeneous polynomial of degree 2. In this case, all triangles should have the same aspect ratio since the Hessian is constant. In order to measure the quality of the shape of a triangle $T$ in relation to $q$, we introduce the following quantity: if $(a,b,c)$ are the edge vectors of $T$, we define
\[ \rho_q(T):=\frac{\max\{|q(a)|,|q(b)|,|q(c)|\}}{|T|\sqrt{|\det(q)|}}, \]
where $\det(q)$ is the determinant of the $2\times2$ symmetric matrix $Q$ associated with $q$, i.e. such that $q(u)=\langle Qu,u\rangle$ for all $u\in\mathbb{R}^2$. Using the reference triangle and an affine change of variables, it is proved in §[…] that
\[ e_T(q)_p\sim|T|^{1+\frac1p}\,\rho_q(T)\,\sqrt{|\det(q)|}, \]
with equivalence constants independent of $q$ and $T$. Therefore, if $T$ is a triangle of given area, its shape should be designed in order to minimize $\rho_q(T)$.

Figure 7.3: the triangulation $\mathcal{D}_8$ for $q(x,y):=x^2+100y^2$ (left) and $q(x,y):=x^2-10y^2$ (right).

In the case where $q$ is positive definite or negative definite, $\rho_q(T)$ takes small values when $T$ is isotropic with respect to the metric $|(x,y)|_q:=\sqrt{|q(x,y)|}$, the minimal value $\frac{4}{\sqrt3}$ being attained for an equilateral triangle for this metric. Specifically, we choose $q(x,y):=x^2+100y^2$ and display in Figure 7.3 (left) the triangulation $\mathcal{D}_8$ obtained after $j=8$ iterations of the refinement procedure, starting with a triangle which is equilateral for the euclidean metric (and therefore not adapted to $q$). Triangles such that $\rho_q(T)\le$ […] $q(x,y):=x^2-10y^2$. For such quadratic functions, triangles which are isotropic with respect to the metric $|\cdot|_{|q|}$ have a low value of $\rho_q$, where $|q|$ denotes the positive quadratic form associated to the absolute value $|Q|$ of the symmetric matrix $Q$ associated to $q$. Recall that for any symmetric matrix $Q$ there exist $\lambda_1,\lambda_2\in\mathbb{R}$ and a rotation $R$ such that
\[ Q=R^{\rm T}\begin{pmatrix}\lambda_1&0\\0&\lambda_2\end{pmatrix}R, \]
and the absolute value $|Q|$ is defined as
\[ |Q|=R^{\rm T}\begin{pmatrix}|\lambda_1|&0\\0&|\lambda_2|\end{pmatrix}R. \]
In the present case, $R=I$ and $|q|(x,y)=x^2+10y^2$.

Figure 7.4: the function $g_\delta$, for $\delta=$ […].

[…] $\rho_q$ is left invariant by any linear transformation with eigenvalues $(t,\frac1t)$ for any $t>0$ and eigenvectors $(u,v)$ such that $q(u)=q(v)=0$, i.e. belonging to the null cone of $q$. More precisely, for any such transformation $\psi$ and any triangle $T$, one has $\rho_q(\psi(T))=\rho_q(T)$.
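The shape measure $\rho_q$ and its invariance along the null cone can be checked numerically. The following is a minimal sketch under stated assumptions, not the implementation used for the experiments: the helper names (`quad`, `area`, `rho`, `null_cone_map`, `apply_map`) and the test triangle are hypothetical, and the null cone directions $(\sqrt{10},\pm1)$ are those obtained by solving $x^2-10y^2=0$ for the mixed form of Figure 7.3 (right).

```python
import math


def quad(Q, v):
    """Evaluate q(v) = <Qv, v> for a symmetric 2x2 matrix Q = ((a, b), (b, c))."""
    (a, b), (_, c) = Q
    x, y = v
    return a * x * x + 2 * b * x * y + c * y * y


def area(tri):
    """Euclidean area of a triangle given by three vertices."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2


def rho(Q, tri):
    """Shape measure max|q(edge)| / (|T| sqrt|det Q|)."""
    edges = [(tri[(i + 1) % 3][0] - tri[i][0], tri[(i + 1) % 3][1] - tri[i][1])
             for i in range(3)]
    det = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
    return max(abs(quad(Q, e)) for e in edges) / (area(tri) * math.sqrt(abs(det)))


def null_cone_map(t):
    """Linear map with eigenvalues (t, 1/t) and eigenvectors (sqrt(10), +1), (sqrt(10), -1)."""
    s = math.sqrt(10.0)
    c = (t + 1 / t) / 2
    h = (t - 1 / t) / 2
    return ((c, s * h), (h / s, c))


def apply_map(M, tri):
    """Image of a triangle under the linear map M."""
    return [(M[0][0] * x + M[0][1] * y, M[1][0] * x + M[1][1] * y) for x, y in tri]


Q_EUCLID = ((1.0, 0.0), (0.0, 1.0))
Q_MIXED = ((1.0, 0.0), (0.0, -10.0))  # q(x, y) = x^2 - 10 y^2

equilateral = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
triangle = [(0.0, 0.0), (2.0, 0.5), (0.3, 1.0)]  # arbitrary test triangle
stretched = apply_map(null_cone_map(3.0), triangle)

print(rho(Q_EUCLID, equilateral), rho(Q_MIXED, triangle), rho(Q_MIXED, stretched))
```

The stretched triangle can be arbitrarily elongated for the euclidean metric while keeping the same value of $\rho_q$, which is why adaptation to $q$ and adaptation to $|q|$ are distinguished in the discussion of the mixed case.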
In our example we have $u=(\sqrt{10},1)$ and $v=(\sqrt{10},-1)$ […] $\rho_q$. Triangles $T$ such that $\rho_{|q|}(T)\le$ […] $\rho_q(T)\le$ […] $\rho_{|q|}(T)>$ […] $q$ but not to $|q|$, are displayed in grey, and the others in dark. We observe that all the triangles produced by the refinement procedure except one are either of the first or second type and therefore have a good aspect ratio. These empirical observations will be rigorously justified in §[…].

We next study the adaptive triangulations produced by the greedy tree algorithm for a function $f$ displaying a sharp transition along a curved edge. Specifically we take
\[ f(x,y)=f_\delta(x,y):=g_\delta\big(\sqrt{x^2+y^2}\big), \]
where $g_\delta$ is defined by $g_\delta(r)=$ […] for $0\le r\le1$, $g_\delta(1+\delta+r)=$ […] for $r\ge0$, and $g_\delta$ is a polynomial of degree 5 on $[1,1+\delta]$ which is determined by imposing that $g_\delta$ is globally $C^2$. The parameter $\delta$ therefore measures the sharpness of the transition, as illustrated in Figure 7.4. It can be shown that the Hessian of $f_\delta$ is negative definite for $\sqrt{x^2+y^2}<1+\delta/2$, and of mixed type for $1+\delta/2<\sqrt{x^2+y^2}\le$ […].

Figure 7.5 displays the triangulation obtained after 10000 steps of the algorithm for $\delta=0.2$. In particular, triangles $T$ such that $\rho_q(T)\le$ […], where $q$ is the quadratic form associated with $d^2f$ measured at the barycenter of $T$, are displayed in white, others in grey. As expected, most triangles are of the first type and therefore well adapted to $f$. We also display on this figure the adaptive isotropic triangulation produced by the greedy tree algorithm based on newest vertex bisection for the same number of triangles.

Since $f$ is a $C^2$ function, approximations by uniform, adaptive isotropic and adaptive anisotropic triangulations all yield the convergence rate $O(N^{-1})$. However the constant
\[ C:=\limsup_{N\to+\infty}N\,\|f-f_N\|_{L^2} \]
strongly differs depending on the algorithm and on the sharpness of the transition, as illustrated in the table below.
We denote by $C_U$, $C_I$ and $C_A$ the empirical constants (estimated by $N\|f-f_N\|_{L^2}$ for $N=8192$) in the uniform, adaptive isotropic and adaptive anisotropic case respectively, and by $U(f):=\|d^2f\|_{L^2}$, $I(f):=\|d^2f\|_{L^{2/3}}$ and $A(f):=\|\sqrt{|\det(d^2f)|}\|_{L^{2/3}}$ the theoretical constants suggested by (7.3), (7.2) and (7.1). We observe that $C_U$ and $C_I$ grow in a similar way as $U(f)$ and $I(f)$ as $\delta\to0$ (a detailed computation shows that $U(f)\approx$ […] and $I(f)\approx$ […]). In contrast $C_A$ and $A(f)$ remain uniformly bounded, a fact which reflects the superiority of the anisotropic mesh as the layer becomes thinner.

Figure 7.5: the triangulation (left), detail (center), isotropic triangulation (right).

Table (columns $\delta$, $U(f)$, $I(f)$, $A(f)$, $C_U$, $C_I$, $C_A$; entries as extracted, several digits lost):
. . 75 7 . 87 1 . 78 0 . . . 50 23 . . 98 0 . . 05 1705 82 8 . 48 65 . . 13 0 . . 02 3670 105 8 . 47 200 6 . 60 0 .

We finally apply the greedy tree algorithm to numerical images. In this case the data $f$ has the form of a discrete array of pixels, and the $L^2(T)$-orthogonal projection is replaced by the $\ell^2(S_T)$-orthogonal projection, where $S_T$ is the set of pixels with centers contained in $T$. The approximated $512\times512$ image is displayed in Figure 7.6, which also shows its approximation $f_N$ by the greedy tree algorithm based on newest vertex bisection with $N=2000$ triangles. The systematic use of isotropic triangles results in strong ringing artifacts near the edges.

We display in Figure 7.7 the result of the same algorithm now based on our greedy bisection procedure with the same number of triangles. As expected, the edges are better approximated due to the presence of well oriented anisotropic triangles. Yet artifacts persist on certain edges due to oscillatory features in the image which tend to mislead the algorithm in its search for triangles with good aspect ratio, as explained in §[…]. […] $L^p$ function.
Note that encoding a triangulation resulting from $N$ iterations of the anisotropic refinement algorithm is more costly than for the newest vertex rule: the algorithm encounters at most $2N$ triangles and for each of them, one needs to encode one out of four options (bisect towards edge $a$ or $b$ or $c$, or do not bisect), therefore resulting in $4N$ bits, while only two options need to be encoded when using the newest vertex rule (bisect or not), therefore resulting in $2N$ bits. In the perspective of applications to image compression, another issue is the quantization and encoding of the piecewise affine function, as well as the treatment of the triangular visual artifacts that are inherent to the use of discontinuous piecewise polynomials on triangulated domains. These issues will be discussed in a further work specifically dealing with image applications.

Figure 7.6: […] $f$ with newest vertex (right).

Figure 7.7: […] $f$ with greedy bisection (left), modified procedure (right).

In this chapter, we have studied a simple greedy refinement procedure which generates triangles that tend to have an optimal aspect ratio. This fact is rigorously proved in Chapter 8, together with the optimal convergence estimate (7.1) for the adaptive triangulations constructed by the greedy tree algorithm in the case where the approximated function $f$ is $C^2$ and convex. Our numerical results illustrate these properties.

In the present chapter we also show that for a general $f\in L^p$ the refinement procedure can be misled by oscillations in $f$, and that this drawback may be circumvented by a simple modification of the refinement procedure. This modification appears to be useful in image processing applications, as shown by our numerical results.

Let us finally mention several perspectives that are raised by our work, and that are the object of current investigation:

1.
Conforming triangulations: our algorithm inherently generates hanging nodes, which might not be desirable in certain applications, such as the numerical discretization of PDEs where anisotropic elements are sometimes used [2]. When using the greedy tree algorithm, an obvious way of avoiding this phenomenon is to bisect the chosen triangle together with an adjacent triangle in order to preserve conformity. However, it is no longer clear that this strategy generates optimal triangulations. In fact, we observed that many inappropriately oriented triangles can be generated by this approach. An alternative strategy is to apply the non-conforming greedy tree algorithm until a prescribed accuracy is met, followed by an additional refinement procedure in order to remove hanging nodes.

2. Discretization and encoding: our work is in part motivated by applications to image and terrain data processing and compression. In such applications the data to be approximated is usually given in discrete form (pixels or point clouds) and the algorithm can be adapted to such data, as shown in our numerical image examples. Key issues which need to be dealt with are then (i) the efficient encoding of the approximations and of the triangulations using the tree structure, in a similar spirit as in [31], and (ii) the removal of the triangular visual artifacts due to discontinuous piecewise polynomial approximation by an appropriate post-processing step.

3. Adaptation to curved edges: one of the motivations for the use of anisotropic triangulations is the approximation of functions with jump discontinuities along an edge. For simple functions, such as characteristic functions of domains with smooth boundaries, the $L^p$-error rate with an optimally adapted triangulation of $N$ elements is $O(N^{-\frac2p})$. This rate reflects an $O(1)$ error concentrated on a strip of area $O(N^{-2})$ separating the curved edge from a polygonal line.
Our first investigations in this direction indicate that the greedy tree algorithm based on our refinement procedure cannot achieve this rate, due to the fact that bisection does not offer enough geometrical adaptation. This is in contrast with other splitting procedures, such as in [40], in which the direction of the new cutting edge is optimized within an infinite range of possible choices, or [48], where the number of choices grows together with the resolution level. An interesting question is thus to understand if the optimal rate for edges can be achieved by a splitting procedure with a small and fixed number of choices similar to our refinement procedure, which would be beneficial from both a computational and an encoding viewpoint. This question is addressed, and partially answered, in §[…].

Chapter 8
Greedy bisection generates optimally adapted triangulations

In finite element approximation, a classical and important distinction is made between uniform and adaptive methods. In the first case all the elements which constitute the mesh have comparable shape and size, while these attributes are allowed to vary strongly in the second case. An important feature of adaptive methods is the fact that the mesh is not fixed in advance but rather tailored to the properties of the function $f$ to be approximated. Since the function approximating $f$ is not picked from a fixed linear space, adaptive finite elements can be considered as an instance of non-linear approximation. Other instances include approximation by rational functions, or by $N$-term linear combinations of a basis or dictionary. We refer to [42] for a general survey on non-linear approximation.

In this chapter, we focus our interest on piecewise linear finite element functions defined over triangulations of a bidimensional polygonal domain $\Omega\subset\mathbb{R}^2$. Given a triangulation $\mathcal{T}$ we denote by $V_{\mathcal{T}}:=\{v\ \text{s.t.}\ v_{|T}\in\mathbb{P}_1,\ T\in\mathcal{T}\}$ the associated finite element space.
The norm in which we measure the approximation error is the $L^p$ norm for $1\le p\le\infty$, and we therefore do not require that the triangulations are conforming and that the functions of $V_{\mathcal{T}}$ are continuous between triangles. For a given function $f$ we define
\[ e_N(f)_{L^p}:=\inf_{\#(\mathcal{T})\le N}\ \inf_{g\in V_{\mathcal{T}}}\|f-g\|_{L^p}, \]
the best approximation error of $f$ when using at most $N$ elements. In adaptive finite element approximation, critical questions are:

1. Given a function $f$ and a number $N>0$, how can we characterize the optimal mesh for $f$ with $N$ elements corresponding to the above defined best approximation error?
2. What quantitative estimates are available for the best approximation error $e_N(f)_{L^p}$? Such estimates should involve the derivatives of $f$ in a different way than for non-adaptive meshes.
3. Can we build by a simple algorithmic procedure a mesh $\mathcal{T}_N$ of cardinality $N$ and a finite element function $f_N\in V_{\mathcal{T}_N}$ such that $\|f-f_N\|_{L^p}$ is comparable to $e_N(f)_{L^p}$?

While the optimal mesh is usually difficult to characterize exactly, it should satisfy two intuitively desirable features: (i) the triangulation should equidistribute the local approximation error between each triangle and (ii) the aspect ratio of a triangle $T$ should be isotropic with respect to a distorted metric induced by the local value of the hessian $d^2f$ on $T$ (and therefore anisotropic in the sense of the euclidean metric). Under such prescriptions on the mesh, quantitative error estimates have recently been obtained in [4, 27] when $f$ is a $C^2$ function.
These estimates are of the form
\[ \limsup_{N\to\infty}N\,e_N(f)_{L^p}\le C\,\big\|\sqrt{|\det(d^2f)|}\big\|_{L^\tau},\qquad\frac1\tau=\frac1p+1, \qquad (8.1) \]
where $\det(d^2f)$ is the determinant of the $2\times2$ hessian matrix. For a convex $C^2$ function $f$ this estimate has been proved to be asymptotically optimal in [27], in the following sense
\[ \liminf_{N\to+\infty}N\,e_N(f)_{L^p}\ge c\,\big\|\sqrt{|\det(d^2f)|}\big\|_{L^\tau}. \qquad (8.2) \]
The convexity assumption can actually be replaced by a mild assumption on the sequence of triangulations which is used for the approximation of $f$: a sequence $(\mathcal{T}_N)_{N\ge N_0}$ is said to be admissible if $\#(\mathcal{T}_N)\le N$ and
\[ \sup_{N\ge N_0}\Big(N^{1/2}\max_{T\in\mathcal{T}_N}\operatorname{diam}(T)\Big)<\infty. \]
Then it is proved in Chapter 2 that for any admissible sequence and any $C^2$ function $f$, one has
\[ \liminf_{N\to+\infty}N\inf_{g\in V_{\mathcal{T}_N}}\|f-g\|_{L^p}\ge c\,\big\|\sqrt{|\det(d^2f)|}\big\|_{L^\tau}. \qquad (8.3) \]
The admissibility assumption is not a severe limitation for an upper estimate of the error, since it is also proved that for all $\varepsilon>0$, there exists an admissible sequence such that
\[ \limsup_{N\to+\infty}N\inf_{g\in V_{\mathcal{T}_N}}\|f-g\|_{L^p}\le C\,\big\|\sqrt{|\det(d^2f)|}\big\|_{L^\tau}+\varepsilon. \qquad (8.4) \]
[…] $d^2f$ and imposing that each triangle of the mesh is isotropic with respect to a metric which is properly related to its local value. We refer in particular to [16] where this program is executed using Delaunay mesh generation techniques. While these algorithms quickly produce anisotropic meshes which are naturally adapted to the approximated function, they suffer from two intrinsic limitations:

1. They use the data of $d^2f$, and therefore do not apply to non-smooth or noisy functions.
2.
They are non-hierarchical: for $N>M$, the triangulation $\mathcal{T}_N$ is not a refinement of $\mathcal{T}_M$.

In Chapter 7, an alternate strategy is proposed for the design of adaptive hierarchical meshes, based on a simple greedy algorithm: starting from an initial triangulation $\mathcal{T}_{N_0}$, the algorithm picks the triangle $T\in\mathcal{T}_N$ with the largest local $L^p$ error. This triangle is then bisected from the mid-point of one of its edges to the opposite vertex. The choice of the edge among the three options is the one that minimizes the new approximation error after bisection. The algorithm can be applied to any $L^p$ function, smooth or not, in the context of piecewise polynomial approximation of any given order. In the case of piecewise linear approximation, numerical experiments in Chapter 7 indicate that this elementary strategy generates triangles with an optimal aspect ratio and approximations $f_N\in V_{\mathcal{T}_N}$ such that $\|f-f_N\|_{L^p}$ satisfies the same estimate as $e_N(f)_{L^p}$ in (8.1).

The goal of this chapter is to support these experimental observations by a rigorous analysis. This chapter is organized as follows:

In §[…] […] $T$ with respect to a quadratic form. We show that the optimal error estimate (8.1) is met when each triangle is non-degenerate in the sense of the above measure with respect to the quadratic form given by the local hessian $d^2f$. We end by briefly recalling the greedy algorithm which is introduced in Chapter 7.

In §[…] […] $q$ such that its associated quadratic form $\mathbf{q}$ is of positive or negative sign. A key observation is that the edge which is bisected is the longest with respect to the metric induced by $\mathbf{q}$. This allows us to prove that the triangles generated by the refinement procedure adopt an optimal aspect ratio in the sense of the non-degeneracy measure introduced in §[…].

§[…] […] $C^2$ function $f$ which is assumed to be strictly convex (or strictly concave).
We first establish a perturbation result, which shows that when $f$ is locally close to a quadratic function $q$ the algorithm behaves in a similar manner as when applied to $q$. We then prove that the diameters of the triangles produced by the algorithm tend to zero, so that the perturbation result can be applied. This allows us to show that the optimal convergence estimate
\[ \limsup_{N\to\infty}N\,\|f-f_N\|_{L^p}\le C\,\big\|\sqrt{|\det(d^2f)|}\big\|_{L^\tau} \qquad (8.5) \]
is met by the sequence of approximations $f_N\in V_{\mathcal{T}_N}$ generated by the algorithm.

The extension of this result to an arbitrary $C^2$ function $f$ remains an open problem. It is possible to proceed to an analysis similar to §[…] […] $\mathbf{q}$ is of mixed sign, also proving that the triangles adopt an optimal aspect ratio as they get refined. We describe this analysis in §[…]. §[…] […] $C^2$ functions $f$ for which the approximation $f_N$ fails to converge towards $f$ due to this phenomenon. Such examples are discussed in Chapter 7, which also proposes a modification of the algorithm for which convergence is always ensured. However, we do not know if the optimal convergence estimate (8.5) holds for any $f\in C^2$ with this modified algorithm, although this seems plausible from the numerical experiments.

We shall make use of a linear approximation operator $A_T$ that maps continuous functions defined on $T$ onto $\mathbb{P}_1$. For an arbitrary but fixed $1\le p\le\infty$, we define the local $L^p$ approximation error
\[ e_T(f)_p:=\|f-A_Tf\|_{L^p(T)}. \]
The critical assumptions in our analysis for the operator $A_T$ will be the following:

1. $A_T$ is continuous in the $L^\infty$ norm.
2. $A_T$ commutes with affine changes of variables: $A_T(f)\circ\phi=A_{\phi^{-1}(T)}(f\circ\phi)$ for all affine $\phi$.
3.
$A_T$ reproduces $\mathbb{P}_1$: $A_T(\pi)=\pi$ for any $\pi\in\mathbb{P}_1$.

Note that the commutation assumption implies that for any function $f$ and any affine transformation $\phi:x\mapsto x_0+Lx$ we have
\[ e_{\phi(T)}(f)_p=|\det(L)|^{1/p}\,e_T(f\circ\phi)_p. \qquad (8.6) \]
Two particularly simple admissible choices of approximation operators are the following:

– $A_T=P_T$, the $L^2(T)$-orthogonal projection operator: $\int_T(f-P_Tf)\pi=0$ for all $\pi\in\mathbb{P}_1$.
– $A_T=I_T$, the local interpolation operator: $I_Tf(v_i)=f(v_i)$ with $\{v_0,v_1,v_2\}$ the vertices of $T$.

All our results are simultaneously valid when $A_T$ is either $P_T$ or $I_T$, or any linear operator that fulfills the three above assumptions.

Given a function $f$ and a triangulation $\mathcal{T}_N$ with $N=\#(\mathcal{T}_N)$, we can associate a finite element approximation $f_N$ defined on each $T\in\mathcal{T}_N$ by $f_N(x)=A_Tf(x)$. The global approximation error is given by
\[ \|f-f_N\|_{L^p}=\Big(\sum_{T\in\mathcal{T}_N}e_T(f)_p^p\Big)^{\frac1p}, \]
with the usual modification when $p=\infty$.

Remark 8.2.1. The operator $B_T$ of best $L^p(T)$ approximation, which is defined by
\[ \|f-B_Tf\|_{L^p(T)}=\min_{\pi\in\mathbb{P}_m}\|f-\pi\|_{L^p(T)}, \]
does not fall in the above category of operators, since it is non-linear (and not easy to compute) when $p\ne2$. However, it is clear that any estimate on $\|f-f_N\|_{L^p}$ with $f_N$ defined as $A_Tf$ on each $T$ implies a similar estimate when $f_N$ is defined as $B_Tf$ on each $T$.

Here and throughout the chapter, when
\[ q(x,y)=a_{2,0}x^2+2a_{1,1}xy+a_{0,2}y^2+a_{1,0}x+a_{0,1}y+a_{0,0}, \]
we denote by $\mathbf{q}$ the associated quadratic form: if $u=(x,y)$,
\[ \mathbf{q}(u)=a_{2,0}x^2+2a_{1,1}xy+a_{0,2}y^2. \]
Note that $\mathbf{q}(u)=\langle Qu,u\rangle$ where
\[ Q=\begin{pmatrix}a_{2,0}&a_{1,1}\\a_{1,1}&a_{0,2}\end{pmatrix}. \]
We define $\det(\mathbf{q}):=\det(Q)$.
If $\mathbf{q}$ is a positive or negative quadratic form, we define the $\mathbf{q}$-metric
\[ |v|_{\mathbf{q}}:=\sqrt{|\mathbf{q}(v)|}, \qquad (8.7) \]
which coincides with the euclidean norm when $\mathbf{q}(v)=x^2+y^2$ for $v=(x,y)$. If $\mathbf{q}$ is a quadratic form of mixed sign, we define the associated positive form $|\mathbf{q}|$ which corresponds to the symmetric matrix $|Q|$ that has the same eigenvectors as $Q$, with eigenvalues $(|\lambda|,|\mu|)$ if $(\lambda,\mu)$ are the eigenvalues of $Q$. Note that generally $|\mathbf{q}|(u)\ne|\mathbf{q}(u)|$ and that one always has $|\mathbf{q}(u)|\le|\mathbf{q}|(u)$.

Remark 8.2.2. If $\det Q>0$, then there exists a $2\times2$ matrix $L$ and $\varepsilon\in\{+1,-1\}$ such that
\[ L^{\rm t}QL=\varepsilon\begin{pmatrix}1&0\\0&1\end{pmatrix}. \]
The linear change of coordinates $\phi(u):=Lu$, where $u=(x,y)\in\mathbb{R}^2$, therefore satisfies $\mathbf{q}\circ\phi(u)=\varepsilon(x^2+y^2)$. On the other hand, if $\det Q<0$ then there exists a $2\times2$ matrix $L$ such that
\[ L^{\rm t}QL=\begin{pmatrix}1&0\\0&-1\end{pmatrix}. \]
Defining again $\phi(u):=Lu$ we obtain in this case $\mathbf{q}\circ\phi(u)=x^2-y^2$.

A standard estimate in finite element approximation states that if $f\in W^{2,p}(\Omega)$ then
\[ \inf_{g\in V_h}\|f-g\|_{L^p}\le Ch^2\|d^2f\|_{L^p}, \]
where $V_h$ is the piecewise linear finite element space associated with a triangulation $\mathcal{T}_h$ of mesh size $h:=\max_{T\in\mathcal{T}_h}\operatorname{diam}(T)$. If we restrict our attention to uniform triangulations, we have
\[ N:=\#(\mathcal{T}_h)\sim h^{-2}. \]
Therefore, denoting by $e^{\rm unif}_N(f)_{L^p}$ the $L^p$ approximation error by a uniform triangulation of cardinality $N$, we can re-express the above estimate as
\[ e^{\rm unif}_N(f)_{L^p}\le CN^{-1}\|d^2f\|_{L^p}. \qquad (8.8) \]
This estimate can be significantly improved when using adaptive partitions.
We give here some heuristic arguments, which are based on the assumption that on each triangle $T$ the relative variation of $d^2 f$ is small so that it can be considered as constant over $T$ (which means that $f$ is replaced by a quadratic function on each $T$), and we also indicate the available results which are proved more rigorously.

First consider isotropic triangulations, i.e. such that all triangles satisfy a uniform estimate
$$\rho_T = \frac{h_T}{r_T} \leq A, \tag{8.9}$$
where $h_T := \operatorname{diam}(T)$ denotes the size of the longest edge of $T$, and $r_T$ is the radius of the largest disc contained in $T$. In such a case we start from the local approximation estimate on any $T$,
$$e_T(f)_p \leq C h_T^2 \|d^2 f\|_{L^p(T)},$$
and notice that
$$h_T^2 \|d^2 f\|_{L^p(T)} \sim |T|\, \|d^2 f\|_{L^p(T)} = \|d^2 f\|_{L^\tau(T)},$$
with $\frac{1}{\tau} := \frac{1}{p} + 1$ and $|T|$ the area of $T$, where we have used the isotropy assumption (8.9) in the equivalence and the fact that $d^2 f$ is constant over $T$ in the equality. It follows that
$$e_T(f)_p \leq C \|d^2 f\|_{L^\tau(T)}, \qquad \frac{1}{\tau} := \frac{1}{p} + 1.$$
Assume now that we can construct adaptive isotropic triangulations $\mathcal{T}_N$ with $N := \#(\mathcal{T}_N)$ which equidistribute the local error in the sense that for some prescribed $\varepsilon > 0$
$$c\varepsilon \leq e_T(f)_p \leq \varepsilon, \tag{8.10}$$
with $c > 0$ a fixed constant independent of $T$ and $N$. Then defining $f_N$ as $A_T(f)$ on each $T \in \mathcal{T}_N$, we have on the one hand
$$\|f - f_N\|_{L^p} \leq N^{1/p} \varepsilon,$$
and on the other hand, with $\frac{1}{\tau} := \frac{1}{p} + 1$,
$$N (c\varepsilon)^\tau \leq \sum_{T \in \mathcal{T}_N} \|f - f_N\|_{L^p(T)}^\tau \leq C^\tau \sum_{T \in \mathcal{T}_N} \|d^2 f\|_{L^\tau(T)}^\tau \leq C^\tau \|d^2 f\|_{L^\tau}^\tau.$$
Combining both, one obtains for $e_N^{\rm iso}(f)_{L^p} := \|f - f_N\|_{L^p}$ the estimate
$$e_N^{\rm iso}(f)_{L^p} \leq C N^{-1} \|d^2 f\|_{L^\tau}.$$
(8.11)

This estimate improves upon (8.8) since the rate $N^{-1}$ is now obtained with the weaker smoothness condition $d^2 f \in L^\tau$ and since, even for smooth $f$, the quantity $\|d^2 f\|_{L^\tau}$ might be significantly smaller than $\|d^2 f\|_{L^p}$. This type of result is classical in non-linear approximation and also occurs when we consider best $N$-term approximation in a wavelet basis.

The principle of error equidistribution suggests a simple greedy algorithm to build an adaptive isotropic triangulation for a given $f$, similar to our algorithm but where the bisection of the triangle $T$ that maximizes the local error $e_T(f)_p$ is systematically done from its most recently created vertex in order to preserve the estimate (8.9). Such an algorithm cannot exactly equilibrate the error in the sense of (8.10) and therefore does not lead to the same optimal estimate as in (8.11). However, it was proved in [13] that it satisfies
$$\|f - f_N\|_{L^p} \leq C |f|_{B^2_{\tau,\tau}} N^{-1},$$
for all $\tau$ such that $\frac{1}{\tau} < \frac{1}{p} + 1$, provided that the local approximation operator $A_T$ is bounded in the $L^p$ norm. Here $B^2_{\tau,\tau}$ denotes the usual Besov space, which is a natural substitute for $W^{2,\tau}$ when $\tau < 1$. Therefore this estimate is not far from (8.11).

We now turn to anisotropic adaptive triangulations, and start by discussing the optimal shape of a triangle $T$ for a given function $f$ at a given point. For this purpose, we again replace $f$ by a quadratic function, assuming that $d^2 f$ is constant over $T$. For such a $q \in \mathbb{P}_2$ and its associated quadratic form $\mathbf{q}$, we first derive an equivalent quantity for the local approximation error. Here as well as in §8.3 and §8.4, we consider a triangle $T$ and we denote by $(a, b, c)$ its edge vectors oriented in clockwise or anti-clockwise direction so that $a + b + c = 0$.

Proposition 8.2.3. The local $L^p$-approximation error satisfies
$$e_T(q)_p = e_T(\mathbf{q})_p \sim |T|^{1/p} \max\{|\mathbf{q}(a)|, |\mathbf{q}(b)|, |\mathbf{q}(c)|\},$$
where the constant in the equivalence is independent of $q$, $T$ and $p$.
Proof: The first equality is trivial since $q$ and $\mathbf{q}$ differ by an affine function. Let $T_{\rm eq}$ be an equilateral triangle of area $|T_{\rm eq}| = 1$, and edges $a, b, c$. Let $E$ be the 3-dimensional vector space of all quadratic forms. Then the following quantities are norms on $E$, and thus equivalent:
$$e_{T_{\rm eq}}(\mathbf{q})_p \sim \max\{|\mathbf{q}(a)|, |\mathbf{q}(b)|, |\mathbf{q}(c)|\}. \tag{8.12}$$
Note that the constants in this equivalence are independent of $p$ since all $L^p(T_{\rm eq})$ norms are uniformly equivalent on $E$.

If $T$ is an arbitrary triangle, there exists an affine transform $\phi : x \mapsto x_0 + Lx$ such that $T = \phi(T_{\rm eq})$. For any quadratic function $q$, we thus obtain from (8.6)
$$e_T(q)_p = e_T(\mathbf{q})_p = e_{\phi(T_{\rm eq})}(\mathbf{q})_p = |\det L|^{1/p} e_{T_{\rm eq}}(\mathbf{q} \circ \phi)_p = |\det L|^{1/p} e_{T_{\rm eq}}(\mathbf{q} \circ L)_p,$$
since $\mathbf{q} \circ L$ is the homogeneous part of $\mathbf{q} \circ \phi$. By (8.12), we thus have
$$e_T(q)_p \sim |\det L|^{1/p} \max\{|\mathbf{q}(La)|, |\mathbf{q}(Lb)|, |\mathbf{q}(Lc)|\},$$
where $\{a, b, c\}$ are again the edge vectors of $T_{\rm eq}$. Remarking that $|T| = |\det L|$ and that $\{La, Lb, Lc\}$ are the edge vectors of $T$, this concludes the proof of this proposition. $\diamond$

In order to describe the optimal shape of a triangle $T$ for the quadratic function $q$, we fix the area $|T|$ and try to minimize the error $e_T(q)_p$, or equivalently $\max\{|\mathbf{q}(a)|, |\mathbf{q}(b)|, |\mathbf{q}(c)|\}$. The solution to this problem can be found by introducing, for any $\mathbf{q}$ such that $\det(\mathbf{q}) \neq 0$, the following measure of non-degeneracy for $T$:
$$\rho_{\mathbf{q}}(T) := \frac{\max\{|\mathbf{q}(a)|, |\mathbf{q}(b)|, |\mathbf{q}(c)|\}}{|T| \sqrt{|\det(\mathbf{q})|}}. \tag{8.13}$$
Let $\phi$ be a linear change of variables, $\mathbf{q}$ a quadratic form and $T$ a triangle of edges $a, b, c$. Then $\det(\mathbf{q} \circ \phi) = (\det \phi)^2 \det(\mathbf{q})$, the edges of $\phi(T)$ are $\phi(a), \phi(b), \phi(c)$ and $|\phi(T)| = |\det \phi|\,|T|$.
Hence we obtain
$$\rho_{\mathbf{q} \circ \phi}(T) = \frac{\max\{|\mathbf{q} \circ \phi(a)|, |\mathbf{q} \circ \phi(b)|, |\mathbf{q} \circ \phi(c)|\}}{|T| \sqrt{|\det(\mathbf{q} \circ \phi)|}} = \frac{\max\{|\mathbf{q}(\phi(a))|, |\mathbf{q}(\phi(b))|, |\mathbf{q}(\phi(c))|\}}{|\det \phi|\,|T| \sqrt{|\det(\mathbf{q})|}} = \rho_{\mathbf{q}}(\phi(T)). \tag{8.14}$$
The last equation, combined with Remark 8.2.2, allows us to reduce the study of $\rho_{\mathbf{q}}(T)$ to two elementary cases by change of variable:

1. The case where $\det(\mathbf{q}) > 0$ is reduced to $\mathbf{q}(x, y) = x^2 + y^2$. Recall that for any triangle $T$ with edges $a, b, c$ we define $h_T := \operatorname{diam}(T) = \max\{|a|, |b|, |c|\}$, with $|\cdot|$ the euclidean norm. In this case we therefore have $\rho_{\mathbf{q}}(T) = \frac{h_T^2}{|T|}$, which corresponds to a standard measure of shape regularity in the sense that its boundedness is equivalent to a property such as (8.9). This quantity is minimized when the triangle $T$ is equilateral, with minimal value $\frac{4}{\sqrt{3}}$ (in fact it was also proved in [28] that the minimum of the interpolation error $\|q - I_T q\|_{L^p(T)}$ among all triangles of area $|T| = 1$ is attained when $T$ is equilateral). For a general quadratic form $\mathbf{q}$ of positive sign, we obtain by change of variable that the minimal value $\frac{4}{\sqrt{3}}$ is obtained for triangles which are equilateral with respect to the metric $|\cdot|_{\mathbf{q}}$. More generally, triangles with a good aspect ratio, i.e. a small value of $\rho_{\mathbf{q}}(T)$, are those which are isotropic with respect to this metric. Of course, a similar conclusion holds for a quadratic form of negative sign.

2. The case where $\det(\mathbf{q}) < 0$ is reduced to $\mathbf{q}(x, y) = x^2 - y^2$. In this case, the analysis presented in [26] shows that the quantity $\rho_{\mathbf{q}}(T)$ is minimized when $T$ is a half of a square with sides parallel to the $x$ and $y$ axes, with minimal value 2. But using (8.14) we also notice that $\rho_{\mathbf{q}}(T) = \rho_{\mathbf{q}}(L(T))$ for any linear transformation $L$ such that $\mathbf{q} = \mathbf{q} \circ L$. This holds if $L$ has eigenvalues $(\lambda, \frac{1}{\lambda})$, where $\lambda \neq 0$, and eigenvectors $(1, 1)$ and $(-1, 1)$. Therefore, all images of the half square by such transformations $L$ are also optimal triangles.
Note that such triangles can be highly anisotropic. For a general quadratic form $\mathbf{q}$ of mixed sign, we notice that $\rho_{\mathbf{q}}(T) \leq \rho_{|\mathbf{q}|}(T)$, and therefore triangles which are equilateral with respect to the metric $|\cdot|_{|\mathbf{q}|}$ have a good aspect ratio, i.e. a small value of $\rho_{\mathbf{q}}(T)$. In addition, by similar arguments, we find that all images of such triangles by linear transforms $L$ with eigenvalues $(\lambda, \frac{1}{\lambda})$ and eigenvectors $(u, v)$ such that $\mathbf{q}(u) = \mathbf{q}(v) = 0$ also have a good aspect ratio, since $\mathbf{q} = \mathbf{q} \circ L$ for such transforms.

We leave aside the special case where $\det(\mathbf{q}) = 0$. In such a case, the triangles minimizing the error for a given area degenerate, in the sense that they should be infinitely long and thin, aligned with the direction of the null eigenvalue of $\mathbf{q}$.

Summing up, we find that triangles with a good aspect ratio are characterized by the fact that $\rho_{\mathbf{q}}(T)$ is small. In addition, from Proposition 8.2.3 and the definition of $\rho_{\mathbf{q}}(T)$, we have
$$e_T(q)_p \sim |T|^{1 + \frac{1}{p}} \sqrt{|\det(\mathbf{q})|}\, \rho_{\mathbf{q}}(T) = \big\|\sqrt{|\det(\mathbf{q})|}\big\|_{L^\tau(T)}\, \rho_{\mathbf{q}}(T), \qquad \frac{1}{\tau} := \frac{1}{p} + 1. \tag{8.15}$$
We now return to a function $f$ such that $d^2 f$ is assumed to be constant on every $T \in \mathcal{T}_N$. Assuming that all triangles have a good aspect ratio in the sense that
$$\rho_{\mathbf{q}}(T) \leq C$$
for some fixed constant $C$ and with $\mathbf{q}$ the value of $d^2 f$ over $T$, we find up to a change in $C$ that
$$e_T(f)_p \leq C \big\|\sqrt{|\det(d^2 f)|}\big\|_{L^\tau(T)}. \tag{8.16}$$
By a similar reasoning as with isotropic triangulations, we now obtain that if the triangulation equidistributes the error in the sense of (8.10),
$$\|f - f_N\|_{L^p} \leq C N^{-1} \big\|\sqrt{|\det(d^2 f)|}\big\|_{L^\tau}, \tag{8.17}$$
and therefore (8.1) holds.
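The two elementary cases of the non-degeneracy measure (8.13) can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names (`q_form`, `rho`, `transform`, `hyperbolic_map`) are hypothetical, triangles are given as triples of vertices, and quadratic forms as symmetric $2 \times 2$ matrices. It evaluates $\rho_{\mathbf{q}}$ on the equilateral triangle for $x^2 + y^2$, on the half square for $x^2 - y^2$, and on the image of the half square by a map with eigenvalues $(\lambda, 1/\lambda)$ and eigenvectors $(1,1)$, $(-1,1)$.

```python
import math

def q_form(Q, v):
    """Evaluate the quadratic form <Qv, v> for a symmetric 2x2 matrix Q."""
    x, y = v
    return Q[0][0] * x * x + 2.0 * Q[0][1] * x * y + Q[1][1] * y * y

def edge_vectors(tri):
    """Edge vectors (a, b, c) of a triangle, oriented so that a + b + c = 0."""
    z0, z1, z2 = tri
    return [(z1[0] - z0[0], z1[1] - z0[1]),
            (z2[0] - z1[0], z2[1] - z1[1]),
            (z0[0] - z2[0], z0[1] - z2[1])]

def area(tri):
    a, b, _ = edge_vectors(tri)
    return 0.5 * abs(a[0] * b[1] - a[1] * b[0])

def rho(Q, tri):
    """Non-degeneracy measure (8.13); requires det(Q) != 0."""
    det_q = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
    worst = max(abs(q_form(Q, e)) for e in edge_vectors(tri))
    return worst / (area(tri) * math.sqrt(abs(det_q)))

def transform(L, tri):
    """Image of a triangle under the linear map with matrix L."""
    return tuple((L[0][0] * x + L[0][1] * y, L[1][0] * x + L[1][1] * y)
                 for x, y in tri)

def hyperbolic_map(lam):
    """Matrix with eigenvalues (lam, 1/lam) and eigenvectors (1, 1), (-1, 1)."""
    s, d = 0.5 * (lam + 1.0 / lam), 0.5 * (lam - 1.0 / lam)
    return ((s, d), (d, s))

EUCLID = ((1.0, 0.0), (0.0, 1.0))        # q(x, y) = x^2 + y^2
HYPERBOLIC = ((1.0, 0.0), (0.0, -1.0))   # q(x, y) = x^2 - y^2

equilateral = ((0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3.0) / 2.0))
half_square = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))

rho_equilateral = rho(EUCLID, equilateral)       # expected close to 4 / sqrt(3)
rho_half_square = rho(HYPERBOLIC, half_square)   # expected close to 2
# A highly anisotropic image of the half square keeps the same value.
rho_stretched = rho(HYPERBOLIC, transform(hyperbolic_map(10.0), half_square))
```

The stretched triangle has vertices $(0,0)$, $(5.05, 4.95)$, $(4.95, 5.05)$, so it is very thin in the euclidean sense while remaining optimal for $x^2 - y^2$.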
This estimate improves upon (8.11) since the quantity $\|d^2 f\|_{L^\tau}$ might be significantly larger than $\|\sqrt{|\det(d^2 f)|}\|_{L^\tau}$, in particular when $f$ has some anisotropic features, such as sharp gradients along curved edges.

The above derivation of (8.1) is heuristic and non-rigorous. Clearly, this estimate cannot be valid as such since $\det(d^2 f)$ may vanish while the approximation error does not (consider for instance $f$ depending only on a single variable). More rigorous versions were derived in [27] and [4]. In these results $|d^2 f|$ is typically replaced by a majorant $|d^2 f| + \varepsilon I$, avoiding that its determinant vanishes. The estimate (8.1) can then be rigorously proved but holds for $N \geq N(\varepsilon, f)$ large enough. This limitation is unavoidable and reflects the fact that enough resolution is needed so that the hessian can be viewed as locally constant over each optimized triangle. Another formulation, which is rigorously proved in Chapter 2, reads as follows.

Proposition 8.2.4. There exists an absolute constant $C > 0$ such that for any polygonal domain $\Omega$ and any function $f \in C^2(\Omega)$, one has
$$\limsup_{N \to +\infty} N e_N(f)_{L^p} \leq C \big\|\sqrt{|\det(d^2 f)|}\big\|_{L^\tau}.$$

Given a target function $f$, our algorithm iteratively builds triangulations $\mathcal{T}_N$ with $N = \#(\mathcal{T}_N)$ and finite element approximations $f_N$. The starting point is a coarse triangulation $\mathcal{T}_{N_0}$. Given $\mathcal{T}_N$, the algorithm selects the triangle $T$ which maximizes the local error $e_T(f)_p$ among all triangles of $\mathcal{T}_N$, and bisects it from the mid-point of one of its edges towards the opposite vertex. This gives the new triangulation $\mathcal{T}_{N+1}$.

The critical part of the algorithm lies in the choice of the edge $e \in \{a, b, c\}$ from which $T$ is bisected.
Denoting by $T_e^1$ and $T_e^2$ the two resulting triangles, we choose $e$ as the minimizer of a decision function $d_T(e, f)$, whose role is to drive the generated triangles towards an optimal aspect ratio. While the most natural choice for $d_T(e, f)$ corresponds to the split that minimizes the error after bisection, namely
$$d_T(e, f) = e_{T_e^1}(f)_p^p + e_{T_e^2}(f)_p^p,$$
we shall instead focus our attention on a decision function which is defined as the $L^1$ norm of the interpolation error
$$d_T(e, f) = \|f - I_{T_e^1} f\|_{L^1(T_e^1)} + \|f - I_{T_e^2} f\|_{L^1(T_e^2)}. \tag{8.18}$$
For this decision, the analysis of the algorithm is made simpler, due to the fact that we can derive explicit expressions of $\|f - I_T f\|_{L^1(T)}$ when $f = q$ is a quadratic polynomial with a positive homogeneous part $\mathbf{q}$. We prove in §8.3 that the triangles generated by the algorithm tend to adopt an optimal aspect ratio, in the sense of a small value of $\rho_{\mathbf{q}}(T)$. This leads us in §8.4 to the optimal estimate (8.1) in the case where $f$ is $C^2$ and strictly convex.

Remark 8.2.5. It should be well understood that while the decision function is based on the $L^1$ norm, the selection of the triangle to be bisected is done by maximizing $e_T(f)_p$. The algorithm remains therefore governed by the $L^p$ norm in which we wish to minimize the error $\|f - f_N\|_p$ for a given number of triangles. Intuitively, this means that the $L^p$-norm influences the size of the triangles which have to equidistribute the error, but not their optimal shape.

Remark 8.2.6. It was pointed out to us that the $L^1$ norm of the interpolation error to a suitable convex function is also used to improve the mesh in the context of moving grid techniques, see [29].

We define a variant of the decision function as follows:
$$D_T(e, f) := \|f - I_T f\|_{L^1(T)} - d_T(e, f).$$
Note that $D_T(e, f)$ is the reduction of the $L^1$ interpolation error resulting from the bisection of the edge $e$, and that the selected edge that minimizes $d_T(\cdot, f)$ is also the one that maximizes $D_T(\cdot, f)$.
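The refinement loop described above can be sketched as follows. This is a minimal illustration, not the implementation used for the results of this chapter, and it rests on several assumptions: the error is measured with $p = 1$ and $A_T = I_T$, the $L^1$ interpolation error is computed by the edge-midpoint quadrature rule (exact for convex quadratic polynomials, only approximate otherwise), and all function names (`l1_interp_error`, `bisect`, `decision`, `greedy_bisection`) are hypothetical.

```python
import heapq

def area(tri):
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))

def midpoint(p, r):
    return ((p[0] + r[0]) / 2.0, (p[1] + r[1]) / 2.0)

def l1_interp_error(f, tri):
    """L1 norm of f - I_T f by edge-midpoint quadrature.

    Assumption: f is convex, so that I_T f >= f on the triangle.
    The rule is exact when f is a quadratic polynomial.
    """
    total = 0.0
    for i in range(3):
        p, r = tri[i], tri[(i + 1) % 3]
        total += 0.5 * (f(p) + f(r)) - f(midpoint(p, r))
    return area(tri) / 3.0 * total

def bisect(tri, k):
    """Split tri from the midpoint of the edge opposite vertex k towards vertex k."""
    v, p, r = tri[k], tri[(k + 1) % 3], tri[(k + 2) % 3]
    m = midpoint(p, r)
    return (v, p, m), (v, m, r)

def decision(f, children):
    """Decision function (8.18): summed L1 interpolation error on both children."""
    return l1_interp_error(f, children[0]) + l1_interp_error(f, children[1])

def greedy_bisection(f, initial_triangles, n_triangles):
    """Bisect the triangle of largest local error until n_triangles are reached."""
    heap = [(-l1_interp_error(f, t), i, t) for i, t in enumerate(initial_triangles)]
    heapq.heapify(heap)
    counter = len(heap)  # tie-breaker so that triangles are never compared
    while len(heap) < n_triangles:
        _, _, tri = heapq.heappop(heap)  # triangle maximizing the local error
        children = min((bisect(tri, k) for k in range(3)),
                       key=lambda ch: decision(f, ch))
        for child in children:
            heapq.heappush(heap, (-l1_interp_error(f, child), counter, child))
            counter += 1
    return [t for _, _, t in heap]

def total_error(f, triangles):
    return sum(l1_interp_error(f, t) for t in triangles)

# Strongly anisotropic convex quadratic on a single initial triangle.
f = lambda z: z[0] ** 2 + 100.0 * z[1] ** 2
domain = [((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))]
coarse = greedy_bisection(f, domain, 16)
fine = greedy_bisection(f, domain, 64)
```

A priority queue keyed on the local error keeps each refinement step logarithmic in the number of triangles; for a general exponent $p$ or operator $A_T$ the function `l1_interp_error` used for the selection would have to be replaced by a suitable evaluation of $e_T(f)_p$.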
The function $D_T$ has a simple expression in the case where $f$ is a convex function.

Lemma 8.2.7. Let $T$ be a triangle and let $f$ be a convex function on $T$. Let $e$ be an edge of $T$ with endpoints $z_0$ and $z_1$. Then
$$D_T(e, f) = \frac{|T|}{3} \left( \frac{f(z_0) + f(z_1)}{2} - f\left(\frac{z_0 + z_1}{2}\right) \right). \tag{8.19}$$
If in addition $f$ has $C^2$ smoothness, we also have
$$D_T(e, f) = \frac{|T|}{6} \int_0^1 \langle d^2 f(z_t) e, e \rangle \min\{t, 1 - t\}\, dt, \quad \text{where } z_t := (1 - t) z_0 + t z_1. \tag{8.20}$$

Proof: Since $f$ is convex, we have $I_T f \geq f$ on $T$, hence
$$\|f - I_T f\|_{L^1(T)} = \int_T (I_T f - f).$$
Similarly $I_{T_e^1} f \geq f$ on $T_e^1$ and $I_{T_e^2} f \geq f$ on $T_e^2$, hence
$$D_T(e, f) = \int_T I_T f - \int_{T_e^1} I_{T_e^1} f - \int_{T_e^2} I_{T_e^2} f.$$
Let $z_2$ be the vertex of $T$ opposite the edge $e$. Since the function $f$ is convex, it follows from the previous expression that $D_T(e, f)$ is the volume of the tetrahedron of vertices
$$\left( \frac{z_{0,x} + z_{1,x}}{2}, \frac{z_{0,y} + z_{1,y}}{2}, f\left(\frac{z_0 + z_1}{2}\right) \right) \quad \text{and} \quad (z_{i,x}, z_{i,y}, f(z_i)) \text{ for } i = 0, 1, 2,$$
where $(z_{i,x}, z_{i,y})$ are the coordinates of $z_i$. Let $u = z_0 - z_2$ and $v = z_1 - z_2$. We thus have $D_T(e, f) = \frac{1}{6} |\det(M)|$ where
$$M := \begin{pmatrix} u_x & v_x & \frac{u_x + v_x}{2} \\ u_y & v_y & \frac{u_y + v_y}{2} \\ f(z_0) - f(z_2) & f(z_1) - f(z_2) & f\left(\frac{z_0 + z_1}{2}\right) - f(z_2) \end{pmatrix}.$$
Subtracting half of the first two columns from the third one, we find that $M$ has the same determinant as
$$\tilde{M} := \begin{pmatrix} u_x & v_x & 0 \\ u_y & v_y & 0 \\ f(z_0) - f(z_2) & f(z_1) - f(z_2) & f\left(\frac{z_0 + z_1}{2}\right) - \frac{f(z_0) + f(z_1)}{2} \end{pmatrix}.$$
Recalling that $2|T| = |\det(u, v)|$ we therefore obtain (8.19). In order to establish (8.20), we observe that we have in the distribution sense
$$\partial_t^2 \left( \min\{t, 1 - t\}_+ \right) = \delta_0 - 2\delta_{1/2} + \delta_1,$$
where $\delta_t$ is the one-dimensional Dirac function at a point $t$. Hence for any univariate function $h \in C^2([0, 1])$,
$$\int_0^1 h''(t) \min\{t, 1 - t\}\, dt = h(0) - 2h(1/2) + h(1).$$
Combining this result with (8.19) we obtain (8.20).
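The identity (8.19) can be checked numerically on a convex quadratic polynomial. In the sketch below, which uses hypothetical helper names, the quantity $D_T(e, f)$ is computed directly as the difference of $L^1$ interpolation errors before and after bisection, using the edge-midpoint quadrature rule (an assumption of this illustration; the rule is exact for quadratics), and compared with the right-hand side of (8.19). For a quadratic with homogeneous part $\mathbf{q}$, the bracket in (8.19) equals $\mathbf{q}(e)/4$, so both values should also match $|T|\,\mathbf{q}(e)/12$.

```python
def area(tri):
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))

def midpoint(p, r):
    return ((p[0] + r[0]) / 2.0, (p[1] + r[1]) / 2.0)

def l1_interp_error(f, tri):
    """Integral of I_T f - f by edge-midpoint quadrature (exact for quadratics)."""
    total = 0.0
    for i in range(3):
        p, r = tri[i], tri[(i + 1) % 3]
        total += 0.5 * (f(p) + f(r)) - f(midpoint(p, r))
    return area(tri) / 3.0 * total

def bisect(tri, k):
    """Split tri from the midpoint of the edge opposite vertex k towards vertex k."""
    v, p, r = tri[k], tri[(k + 1) % 3], tri[(k + 2) % 3]
    m = midpoint(p, r)
    return (v, p, m), (v, m, r)

def d_direct(f, tri, k):
    """D_T(e, f) from its definition: error on T minus errors on the two children."""
    c1, c2 = bisect(tri, k)
    return l1_interp_error(f, tri) - l1_interp_error(f, c1) - l1_interp_error(f, c2)

def d_formula(f, tri, k):
    """Right-hand side of (8.19) for the edge opposite vertex k."""
    z0, z1 = tri[(k + 1) % 3], tri[(k + 2) % 3]
    return area(tri) / 3.0 * (0.5 * (f(z0) + f(z1)) - f(midpoint(z0, z1)))

# Convex quadratic polynomial and its homogeneous part (chosen for illustration).
f = lambda z: 2.0 * z[0] ** 2 + z[0] * z[1] + 3.0 * z[1] ** 2 - z[0] + 4.0
form = lambda e: 2.0 * e[0] ** 2 + e[0] * e[1] + 3.0 * e[1] ** 2

tri = ((0.0, 0.0), (2.0, 0.5), (0.3, 1.7))
checks = []
for k in range(3):
    z0, z1 = tri[(k + 1) % 3], tri[(k + 2) % 3]
    e = (z1[0] - z0[0], z1[1] - z0[1])
    checks.append((d_direct(f, tri, k), d_formula(f, tri, k),
                   area(tri) * form(e) / 12.0))
```

Each triple in `checks` should contain three values that agree up to rounding errors.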
$\diamond$

8.3. Positive quadratic functions

In this section, we study the algorithm when applied to a quadratic polynomial $q$ such that $\det(\mathbf{q}) > 0$. We shall assume without loss of generality that $\mathbf{q}$ is positive definite, since all our results extend in a trivial manner to the negative definite case.

Our first observation is that the refinement procedure based on the decision function (8.18) always selects for bisection the longest edge in the sense of the $\mathbf{q}$-metric $|\cdot|_{\mathbf{q}}$ defined by (8.7).

Lemma 8.3.1. An edge $e$ of $T$ maximizes $D_T(e, q)$ among all edges of $T$ if and only if it maximizes $|e|_{\mathbf{q}}$ among all edges of $T$.

Proof: The hessian $d^2 q$ is constant and for all $e \in \mathbb{R}^2$ one has
$$\langle d^2 q\, e, e \rangle = 2 \mathbf{q}(e).$$
If $e$ is an edge of a triangle $T$, and if $q$ is a convex quadratic function, equation (8.20) therefore gives
$$D_T(e, q) = \frac{|T|}{3}\, \mathbf{q}(e) \int_0^1 \min\{t, 1 - t\}\, dt = \frac{|T|}{12} |e|_{\mathbf{q}}^2. \tag{8.21}$$
This concludes the proof. $\diamond$

It follows from this lemma that the longest edge of $T$ in the sense of the $\mathbf{q}$-metric is selected for bisection by the decision function. In the remainder of this section, we use this fact to prove that the refinement procedure produces triangles which tend to adopt an optimal aspect ratio, in the sense that $\rho_{\mathbf{q}}(T)$ becomes small in an average sense.

For this purpose, it is convenient to introduce a close variant to $\rho_{\mathbf{q}}(T)$: if $T$ is a triangle with edges $a, b, c$ such that $|a|_{\mathbf{q}} \geq |b|_{\mathbf{q}} \geq |c|_{\mathbf{q}}$, we define
$$\sigma_{\mathbf{q}}(T) := \frac{\mathbf{q}(b) + \mathbf{q}(c)}{4 |T| \sqrt{\det \mathbf{q}}} = \frac{|b|_{\mathbf{q}}^2 + |c|_{\mathbf{q}}^2}{4 |T| \sqrt{\det \mathbf{q}}}. \tag{8.22}$$
Using the inequalities $|b|_{\mathbf{q}}^2 + |c|_{\mathbf{q}}^2 \leq 2 |a|_{\mathbf{q}}^2$ and $|a|_{\mathbf{q}}^2 \leq 2(|b|_{\mathbf{q}}^2 + |c|_{\mathbf{q}}^2)$, we obtain the equivalence
$$\frac{\rho_{\mathbf{q}}(T)}{8} \leq \sigma_{\mathbf{q}}(T) \leq \frac{\rho_{\mathbf{q}}(T)}{2}. \tag{8.23}$$
Like $\rho_{\mathbf{q}}$, this quantity is invariant under linear coordinate changes $\phi$, in the sense that
$$\sigma_{\mathbf{q} \circ \phi}(T) = \sigma_{\mathbf{q}}(\phi(T)).$$
From (8.15) and (8.23) we can relate $\sigma_{\mathbf{q}}$ to the local approximation error.

Proposition 8.3.2.
There exists a constant $C_0$, which depends only on the choice of $A_T$, such that for any triangle $T$, quadratic function $q$ and exponent $1 \leq p \leq \infty$, the local $L^p$-approximation error satisfies
$$C_0^{-1} e_T(q)_p \leq \sigma_{\mathbf{q}}(T) \big\|\sqrt{\det \mathbf{q}}\big\|_{L^\tau(T)} \leq C_0\, e_T(q)_p, \tag{8.24}$$
where $\frac{1}{\tau} := \frac{1}{p} + 1$.

Our next result shows that $\sigma_{\mathbf{q}}(T)$ is always reduced by the refinement procedure.

Proposition 8.3.3. If $T$ is a triangle with children $T_1$ and $T_2$ obtained by the refinement procedure for the quadratic function $q$, then
$$\max\{\sigma_{\mathbf{q}}(T_1), \sigma_{\mathbf{q}}(T_2)\} \leq \sigma_{\mathbf{q}}(T).$$

Proof: Assuming that $|a|_{\mathbf{q}} \geq |b|_{\mathbf{q}} \geq |c|_{\mathbf{q}}$, we know that the edge $a$ is cut and that the children have area $|T|/2$ and edges $a/2, b, (c - b)/2$ and $a/2, (b - c)/2, c$ (recall that $a + b + c = 0$). We then have
$$\begin{aligned}
2 |T| \sqrt{\det \mathbf{q}}\; \sigma_{\mathbf{q}}(T_i) &\leq \mathbf{q}\left(\frac{a}{2}\right) + \mathbf{q}\left(\frac{b - c}{2}\right) && (8.25) \\
&= \mathbf{q}\left(\frac{b + c}{2}\right) + \mathbf{q}\left(\frac{b - c}{2}\right) && (8.26) \\
&= \frac{\mathbf{q}(b) + \mathbf{q}(c)}{2} && (8.27) \\
&= 2 |T| \sqrt{\det \mathbf{q}}\; \sigma_{\mathbf{q}}(T). && (8.28)
\end{aligned}$$
$\diamond$

Remark 8.3.4. For any positive definite quadratic form $\mathbf{q}$, the minimum of $\sigma_{\mathbf{q}}$ is 1, as is easily seen using the inequality
$$2 |\det(b, c)| \leq 2 |b|\,|c| \leq |b|^2 + |c|^2,$$
in which we have equality if and only if $b, c \in \mathbb{R}^2$ are orthogonal vectors of the same norm. When $\mathbf{q}$ is the euclidean metric, the triangle that minimizes $\sigma_{\mathbf{q}}$ is thus the half square. This is consistent with the above result since it is the only triangle which is similar (i.e. identical up to a translation, a rotation and a dilation) to both of its children after one step of longest edge bisection.

Remark 8.3.5. A result of similar nature was already proved in [82]: longest edge bisection has the effect that the minimal angle in any triangle after an arbitrary number of refinements is at least half the minimal angle of the initial triangle.
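The monotonicity of $\sigma_{\mathbf{q}}$ under longest edge bisection in the $\mathbf{q}$-metric can be observed numerically. The following sketch is an illustration only: the helper names are hypothetical, the positive definite form and the random triangles are arbitrary choices, and nearly flat triangles are skipped to limit rounding effects. It records the largest observed ratio $\sigma_{\mathbf{q}}(T_i) / \sigma_{\mathbf{q}}(T)$ and evaluates $\sigma_{\mathbf{q}}$ on the half square for the euclidean metric.

```python
import math
import random

def q_form(Q, v):
    """Evaluate the quadratic form <Qv, v> for a symmetric 2x2 matrix Q."""
    x, y = v
    return Q[0][0] * x * x + 2.0 * Q[0][1] * x * y + Q[1][1] * y * y

def area(tri):
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))

def opposite_edge(tri, k):
    """Edge vector joining the two vertices other than vertex k."""
    p, r = tri[(k + 1) % 3], tri[(k + 2) % 3]
    return (r[0] - p[0], r[1] - p[1])

def sigma(Q, tri):
    """Measure (8.22): sum of the two smallest q(edge) over 4 |T| sqrt(det q)."""
    det_q = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
    vals = sorted(q_form(Q, opposite_edge(tri, k)) for k in range(3))
    return (vals[0] + vals[1]) / (4.0 * area(tri) * math.sqrt(det_q))

def bisect_longest(Q, tri):
    """Bisect the longest edge in the q-metric towards the opposite vertex."""
    k = max(range(3), key=lambda j: q_form(Q, opposite_edge(tri, j)))
    v, p, r = tri[k], tri[(k + 1) % 3], tri[(k + 2) % 3]
    m = ((p[0] + r[0]) / 2.0, (p[1] + r[1]) / 2.0)
    return (v, p, m), (v, m, r)

Q = ((1.0, 0.3), (0.3, 25.0))  # an arbitrary positive definite form
rng = random.Random(0)
worst_ratio = 0.0
for _ in range(1000):
    tri = tuple((rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(3))
    if area(tri) < 1e-3:
        continue  # skip nearly degenerate samples
    s = sigma(Q, tri)
    for child in bisect_longest(Q, tri):
        worst_ratio = max(worst_ratio, sigma(Q, child) / s)

EUCLID = ((1.0, 0.0), (0.0, 1.0))
half_square = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
sigma_half_square = sigma(EUCLID, half_square)  # minimal value, see Remark 8.3.4
```

In agreement with Proposition 8.3.3, `worst_ratio` should not exceed 1 beyond rounding errors.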
Our next objective is to show that as we iterate the refinement process, the value of $\sigma_{\mathbf{q}}(T)$ becomes bounded independently of $\mathbf{q}$ for almost all generated triangles. For this purpose we introduce the following notation: if $T$ is a triangle with edges such that $|a|_{\mathbf{q}} \geq |b|_{\mathbf{q}} \geq |c|_{\mathbf{q}}$, we denote by $\psi_{\mathbf{q}}(T)$ the subtriangle of $T$ obtained after bisection of $a$ which contains the smallest edge $c$. We first establish inequalities between the measures $\sigma_{\mathbf{q}}$ and $\rho_{\mathbf{q}}$ applied to $T$ and $\psi_{\mathbf{q}}(T)$.

Proposition 8.3.6. Let $T$ be a triangle, then
$$\sigma_{\mathbf{q}}(\psi_{\mathbf{q}}(T)) \leq \frac{5}{8} \rho_{\mathbf{q}}(T), \tag{8.29}$$
$$\rho_{\mathbf{q}}(\psi_{\mathbf{q}}(T)) \leq \frac{\rho_{\mathbf{q}}(T)}{2} \left( 1 + \frac{16}{\rho_{\mathbf{q}}(T)^2} \right). \tag{8.30}$$

Proof: We first prove (8.29). Obviously, $\psi_{\mathbf{q}}(T)$ contains one edge $s \in \{a, b, c\}$ from $T$, and one half edge $t \in \{\frac{a}{2}, \frac{b}{2}, \frac{c}{2}\}$ from $T$. Therefore
$$\sigma_{\mathbf{q}}(\psi_{\mathbf{q}}(T)) \leq \frac{|s|_{\mathbf{q}}^2 + |t|_{\mathbf{q}}^2}{4 |\psi_{\mathbf{q}}(T)| \sqrt{\det \mathbf{q}}} \leq \frac{|a|_{\mathbf{q}}^2 + |\frac{a}{2}|_{\mathbf{q}}^2}{2 |T| \sqrt{\det \mathbf{q}}} = \frac{5}{8} \rho_{\mathbf{q}}(T).$$
For the proof of (8.30), we restrict our attention to the case $\mathbf{q} = x^2 + y^2$, without loss of generality thanks to the invariance formula (8.14). Let $T$ be a triangle with edges $|a| \geq |b| \geq |c|$. If $h$ is the width of $T$ in the direction perpendicular to $a$, then
$$h = \frac{2 |T|}{|a|} = \frac{2 |a|}{\rho_{\mathbf{q}}(T)}.$$
The sub-triangle $\psi_{\mathbf{q}}(T)$ of $T$ has edges $\frac{a}{2}, c, d$ where $d = \frac{b - c}{2}$, and the angles at the ends of $\frac{a}{2}$ are acute. Indeed
$$\langle c, a/2 \rangle = \frac{1}{4} \left( |b|^2 - |a|^2 - |c|^2 \right) \leq 0 \quad \text{and} \quad \langle d, a/2 \rangle = \frac{1}{4} \left( |c|^2 - |b|^2 \right) \leq 0.$$
By Pythagoras' theorem we thus find
$$\max\left\{ \left|\frac{a}{2}\right|^2, |c|^2, |d|^2 \right\} \leq \left|\frac{a}{2}\right|^2 + h^2 = \frac{|a|^2}{4} \left( 1 + \frac{16}{\rho_{\mathbf{q}}(T)^2} \right).$$
Dividing by the respective areas of $T$ and $\psi_{\mathbf{q}}(T)$, we obtain the announced result. $\diamond$

Our next result shows that a significant reduction of $\sigma_{\mathbf{q}}$ occurs at least for one of the triangles obtained by three successive refinements, unless it has reached a small value of $\sigma_{\mathbf{q}}$.
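We use the notation $\psi_{\mathbf{q}}^2(T) := \psi_{\mathbf{q}}(\psi_{\mathbf{q}}(T))$ and $\psi_{\mathbf{q}}^3(T) := \psi_{\mathbf{q}}(\psi_{\mathbf{q}}^2(T))$.

Proposition 8.3.7. Let $T$ be a triangle such that $\sigma_{\mathbf{q}}(\psi_{\mathbf{q}}^3(T)) \geq 5$. Then $\sigma_{\mathbf{q}}(\psi_{\mathbf{q}}^3(T)) \leq 0.69\, \sigma_{\mathbf{q}}(T)$.

Proof: The monotonicity of $\sigma_{\mathbf{q}}$ established in Proposition 8.3.3 implies that
$$5 \leq \sigma_{\mathbf{q}}(\psi_{\mathbf{q}}^3(T)) \leq \sigma_{\mathbf{q}}(\psi_{\mathbf{q}}^2(T)) \leq \sigma_{\mathbf{q}}(\psi_{\mathbf{q}}(T)).$$
Combining this with inequality (8.29) we obtain
$$8 \leq \min\{\rho_{\mathbf{q}}(\psi_{\mathbf{q}}^2(T)), \rho_{\mathbf{q}}(\psi_{\mathbf{q}}(T)), \rho_{\mathbf{q}}(T)\}.$$
If a triangle $S$ obeys $\rho_{\mathbf{q}}(S) \geq 4$, then $\frac{1}{2} \left( 1 + \frac{16}{\rho_{\mathbf{q}}(S)^2} \right) \leq 1$ and therefore $\rho_{\mathbf{q}}(\psi_{\mathbf{q}}(S)) \leq \rho_{\mathbf{q}}(S)$ according to inequality (8.30). We can apply this to $S = \psi_{\mathbf{q}}(T)$ and $S = T$, therefore obtaining
$$\rho_{\mathbf{q}}(\psi_{\mathbf{q}}^2(T)) \leq \rho_{\mathbf{q}}(\psi_{\mathbf{q}}(T)) \leq \rho_{\mathbf{q}}(T). \tag{8.31}$$
We now remark that inequality (8.30) is equivalent to
$$(\rho_{\mathbf{q}}(S) - \rho_{\mathbf{q}}(\psi_{\mathbf{q}}(S)))^2 \geq \rho_{\mathbf{q}}(\psi_{\mathbf{q}}(S))^2 - 16,$$
hence
$$\rho_{\mathbf{q}}(S) \geq \rho_{\mathbf{q}}(\psi_{\mathbf{q}}(S)) + \sqrt{\rho_{\mathbf{q}}(\psi_{\mathbf{q}}(S))^2 - 16} \tag{8.32}$$
provided that $\rho_{\mathbf{q}}(S) \geq \rho_{\mathbf{q}}(\psi_{\mathbf{q}}(S))$. Applying this to $S = \psi_{\mathbf{q}}(T)$ and recalling that $\rho_{\mathbf{q}}(\psi_{\mathbf{q}}^2(T)) \geq 8$, we obtain
$$\rho_{\mathbf{q}}(\psi_{\mathbf{q}}(T)) \geq 8 + \sqrt{8^2 - 16} \geq 14.9.$$
Applying again (8.32) to $S = T$ we obtain
$$\rho_{\mathbf{q}}(T) \geq 14.9 + \sqrt{14.9^2 - 16} \geq 29.2.$$
Using (8.30), it follows that
$$\frac{\rho_{\mathbf{q}}(\psi_{\mathbf{q}}^3(T))}{\rho_{\mathbf{q}}(T)} \leq \frac{1}{8} \left( 1 + \frac{16}{\rho_{\mathbf{q}}(\psi_{\mathbf{q}}^2(T))^2} \right) \left( 1 + \frac{16}{\rho_{\mathbf{q}}(\psi_{\mathbf{q}}(T))^2} \right) \left( 1 + \frac{16}{\rho_{\mathbf{q}}(T)^2} \right) \leq 0.171.$$
Eventually, the inequalities (8.23) imply that
$$2 \sigma_{\mathbf{q}}(\psi_{\mathbf{q}}^3(T)) \leq \rho_{\mathbf{q}}(\psi_{\mathbf{q}}^3(T)) \leq 0.171\, \rho_{\mathbf{q}}(T) \leq 0.171\, (8 \sigma_{\mathbf{q}}(T)),$$
which concludes the proof. $\diamond$

An immediate consequence of Propositions 8.3.3 and 8.3.7 is the following.

Corollary 8.3.8. If $(T_i)_{i=1}^8$ are the eight children obtained from three successive refinement procedures from $T$ for the function $q$, then
– for all $i$, $\sigma_{\mathbf{q}}(T_i) \leq \sigma_{\mathbf{q}}(T)$,
– there exists $i$ such that $\sigma_{\mathbf{q}}(T_i) \leq 0.69\, \sigma_{\mathbf{q}}(T)$ or $\sigma_{\mathbf{q}}(T_i) \leq 5$.

We are now ready to prove that most triangles tend to adopt an optimal aspect ratio as one iterates the refinement procedure.

Theorem 8.3.9.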
Let $T$ be a triangle, and $q$ a positive definite quadratic function. Let $k = \frac{\ln \sigma_{\mathbf{q}}(T) - \ln 5}{-\ln(0.69)}$. Then after $n$ applications of the refinement procedure starting from $T$, at most $C n^k 7^{n/3}$ of the $2^n$ generated triangles satisfy $\sigma_{\mathbf{q}}(S) \geq 5$, where $C$ is an absolute constant. Therefore the proportion of such triangles tends exponentially fast to 0 as $n \to +\infty$.

Proof: If we prove the proposition for $n$ multiple of 3, then it will hold for all $n$ (with a larger constant) since $\sigma_{\mathbf{q}}$ decreases at each refinement step. We now assume that $n = 3m$, and consider the octree with root $T$ obtained by only considering the triangles of generation $3i$ for $i = 0, \cdots, m$.

According to Corollary 8.3.8, for each node of this tree, one of its eight children either satisfies $\sigma_{\mathbf{q}} \leq 5$ or has its value of $\sigma_{\mathbf{q}}$ diminished by a factor $\theta := 0.69$. We remark that if $\sigma_{\mathbf{q}}$ is diminished at least $k$ times on the path going from the root $T$ to a leaf $S$, then $\sigma_{\mathbf{q}}(S) \leq 5$. As a consequence, the number $N(m)$ of triangles $S$ which are such that $\sigma_{\mathbf{q}}(S) > 5$ after $n = 3m$ refinements is bounded by the number of words in an eight letter alphabet $\{a_1, \cdots, a_8\}$ with length $m$ and that use the letter $a_8$ at most $k$ times, namely
$$N(m) \leq \sum_{l=0}^{k} \binom{m}{l} 7^{m-l} \leq C m^k 7^m,$$
which is the announced result. $\diamond$

The fact that most triangles tend to adopt an optimal aspect ratio as one iterates the refinement procedure is a first hint that the approximation error in the greedy algorithm might satisfy the estimate (8.1) corresponding to an optimal triangulation. The following result shows that this is indeed the case, when this algorithm is applied on a triangular domain $\Omega$ to a quadratic function $q$ with positive definite associated quadratic form $\mathbf{q}$. The extension of this result to more general $C^2$ convex functions on polygonal domains requires a more involved analysis based on local perturbation arguments and is the object of the next section.

Corollary 8.3.10. Let $\Omega$ be a triangle, and let $q$ be a quadratic function with positive definite associated quadratic form $\mathbf{q}$.
Let $q_N$ be the approximant of $q$ on $\Omega$ obtained by the greedy algorithm for the $L^p$ metric, using the $L^1$ decision function (8.18). Then
$$\limsup_{N \to \infty} N \|q - q_N\|_{L^p(\Omega)} \leq C \big\|\sqrt{\det(\mathbf{q})}\big\|_{L^\tau(\Omega)},$$
where $\frac{1}{\tau} = \frac{1}{p} + 1$ and where the constant $C$ depends only on the choice of the approximation operator $A_T$ used in the definition of the approximant.

Proof: For any triangle $T$, quadratic function $q \in \mathbb{P}_2$, and exponent $p$, let
$$e'_T(q)_p := \inf_{\pi \in \mathbb{P}_1} \|q - \pi\|_{L^p(T)}$$
be the error of best approximation of $q$ on $T$. Let $T_0$ be a fixed triangle of area 1, then for any $q \in \mathbb{P}_2$ and $1 \leq p \leq \infty$ one has
$$e'_{T_0}(q)_1 \leq e'_{T_0}(q)_p \leq e_{T_0}(q)_p \leq e_{T_0}(q)_\infty.$$
Furthermore, $e'_{T_0}(\cdot)_1$ and $e_{T_0}(\cdot)_\infty$ are semi-norms on the finite dimensional space $\mathbb{P}_2$ which vanish precisely on the same subspace of $\mathbb{P}_2$, namely $\mathbb{P}_1$. Hence these semi-norms are equivalent. It follows that
$$c\, e_{T_0}(q)_p \leq e'_{T_0}(q)_p \leq e_{T_0}(q)_p, \tag{8.33}$$
where $c$ is independent of $q \in \mathbb{P}_2$ and of $p \geq 1$. Using the invariance property (8.6) we find that (8.33) holds for any triangle $T$ in place of $T_0$ with the same constant $c$. We also define for any triangulation $\mathcal{T}$,
$$e_{\mathcal{T}}(f)_p^p := \sum_{T \in \mathcal{T}} e_T(f)_p^p \quad \text{and} \quad e'_{\mathcal{T}}(f)_p^p := \sum_{T \in \mathcal{T}} e'_T(f)_p^p,$$
and we remark that $c\, e_{\mathcal{T}}(q)_p \leq e'_{\mathcal{T}}(q)_p \leq e_{\mathcal{T}}(q)_p$. For each $n$, we denote by $\mathcal{T}_n^u$ the triangulation of $\Omega$ produced by $n$ successive refinements based on the $L^1$ decision function (8.18) for the quadratic function $q$ of interest (note that $\#(\mathcal{T}_n^u) = 2^n$). We also define
$$\mathcal{T}_n^\sigma := \{T \in \mathcal{T}_n^u \,;\, \sigma_{\mathbf{q}}(T) > 5\}.$$
Therefore $\sigma_{\mathbf{q}}(T) \leq 5$ if $T \notin \mathcal{T}_n^\sigma$, and on the other hand we know from Proposition 8.3.3 that $\sigma_{\mathbf{q}}(T) \leq \sigma_{\mathbf{q}}(\Omega)$ for any $T \in \mathcal{T}_n^u$.
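It follows from Proposition 8.3.2 that
$$e_{\mathcal{T}_n^u}(q)_p \leq C_0 \Big( \sum_{T \in \mathcal{T}_n^u} \big( \sigma_{\mathbf{q}}(T) |T|^{\frac{1}{\tau}} \sqrt{\det \mathbf{q}} \big)^p \Big)^{\frac{1}{p}} \leq C_0 \Big( 5^p \times 2^n + \sigma_{\mathbf{q}}(\Omega)^p \#(\mathcal{T}_n^\sigma) \Big)^{\frac{1}{p}} \Big( \frac{|\Omega|}{2^n} \Big)^{\frac{1}{\tau}} \sqrt{\det \mathbf{q}},$$
where $C_0$ is the constant in (8.24). According to Theorem 8.3.9, we know that
$$\lim_{n \to +\infty} 2^{-n} \#(\mathcal{T}_n^\sigma) = 0.$$
Hence
$$\limsup_{n \to \infty} 2^n e_{\mathcal{T}_n^u}(q)_p \leq 5 C_0 |\Omega|^{\frac{1}{\tau}} \sqrt{\det \mathbf{q}} = 5 C_0 \big\|\sqrt{\det \mathbf{q}}\big\|_{L^\tau(\Omega)}.$$
We now denote by $\mathcal{T}_n^g$ the triangulation generated by the greedy procedure with stopping criterion based on the error $\eta_n := C_0^{-1} 2^{-\frac{n}{\tau}} \|\sqrt{\det \mathbf{q}}\|_{L^\tau(\Omega)}$. It follows from (8.24) that for all $T \in \mathcal{T}_k^u$ with $k \leq n$, one has
$$e_T(q)_p \geq C_0^{-1} \sigma_{\mathbf{q}}(T) \big\|\sqrt{\det \mathbf{q}}\big\|_{L^\tau(T)} \geq C_0^{-1} 2^{-\frac{k}{\tau}} \big\|\sqrt{\det \mathbf{q}}\big\|_{L^\tau(\Omega)} \geq \eta_n,$$
where we used that $|T| = 2^{-k} |\Omega|$ and that the minimal value of $\sigma_{\mathbf{q}}$ is 1. This shows that $\mathcal{T}_n^g$ is a refinement of $\mathcal{T}_n^u$. Furthermore any triangle $T \in \mathcal{T}_n^u$ has at most $2^{k(T)}$ children in $\mathcal{T}_n^g$, where $k(T)$ is the smallest integer such that
$$\eta_n \geq C_0\, 2^{-\frac{n + k(T)}{\tau}} \sigma_{\mathbf{q}}(T) \big\|\sqrt{\det \mathbf{q}}\big\|_{L^\tau(\Omega)}.$$
Since $\frac{1}{2} \leq \tau \leq 1$, we have $2^{k(T)} \leq 2^{\frac{k(T)}{\tau}} \leq 2^{\frac{1}{\tau}} C_0^2 \sigma_{\mathbf{q}}(T) \leq 4 C_0^2 \sigma_{\mathbf{q}}(T)$. Hence
$$\#(\mathcal{T}_n^g) \leq 4 C_0^2 \sum_{T \in \mathcal{T}_n^u} \sigma_{\mathbf{q}}(T) \leq 4 C_0^2 \big( 5 \times 2^n + \sigma_{\mathbf{q}}(\Omega) \#(\mathcal{T}_n^\sigma) \big) = C_1 2^n (1 + \varepsilon_n),$$
where $C_1 = 20 C_0^2$ and $\varepsilon_n \to 0$ as $n \to \infty$. If $\mathcal{T}_N$ is the triangulation generated after $N$ steps of the greedy algorithm, then there exists $n \geq 0$ such that $\mathcal{T}_N$ is a refinement of $\mathcal{T}_n^g$ (hence a refinement of $\mathcal{T}_n^u$) and $\mathcal{T}_{n+1}^g$ is a refinement of $\mathcal{T}_N$.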
It follows that $\#(\mathcal{T}_N) \leq \#(\mathcal{T}_{n+1}^g) \leq C_1 2^{n+1} (1 + \varepsilon_{n+1})$, and
$$c\, e_{\mathcal{T}_N}(q)_p \leq e'_{\mathcal{T}_N}(q)_p \leq e'_{\mathcal{T}_n^u}(q)_p \leq e_{\mathcal{T}_n^u}(q)_p,$$
where we have used the fact that $e'_{\mathcal{T}}(f)_p \leq e'_{\tilde{\mathcal{T}}}(f)_p$ whenever $\mathcal{T}$ is a refinement of $\tilde{\mathcal{T}}$. Eventually,
$$\limsup_{N \to \infty} N e_{\mathcal{T}_N}(q)_p \leq \limsup_{n \to \infty} \frac{C_1}{c} 2^{n+1} (1 + \varepsilon_{n+1})\, e_{\mathcal{T}_n^u}(q)_p \leq \frac{10\, C_0 C_1}{c} \big\|\sqrt{\det \mathbf{q}}\big\|_{L^\tau(\Omega)},$$
which concludes the proof. $\diamond$

8.4. The case of strictly convex functions

The goal of this section is to prove that the approximation error in the greedy algorithm applied to a $C^2$ function $f$ satisfies the estimate (8.1) corresponding to an optimal triangulation. Our main result is so far limited to the case where $f$ is strictly convex.

Theorem 8.4.1. Let $f \in C^2(\Omega)$ be such that
$$d^2 f(x) \geq mI, \quad \text{for all } x \in \Omega,$$
for some arbitrary but fixed $m > 0$ independent of $x$. Let $f_N$ be the approximant obtained by the greedy algorithm for the $L^p$ metric, using the $L^1$ decision function (8.18). Then
$$\limsup_{N \to \infty} N \|f - f_N\|_{L^p} \leq C \big\|\sqrt{\det(d^2 f)}\big\|_{L^\tau}, \tag{8.34}$$
where $\frac{1}{\tau} = \frac{1}{p} + 1$ and where $C$ is a constant independent of $p$, $f$ and $m$.

Equation (8.34) can be rephrased as follows: there exists a sequence $\varepsilon_N(f)$ such that $\varepsilon_N(f) \to 0$ as $N \to \infty$ and
$$\|f - f_N\|_{L^p} \leq \Big( C \big\|\sqrt{\det(d^2 f)}\big\|_{L^\tau} + \varepsilon_N(f) \Big) N^{-1}.$$
Note also that since $\|\sqrt{\det(d^2 f)}\|_{L^\tau} > 0$, there exists $N(f)$ such that $\|f - f_N\|_{L^p} \leq 2C \|\sqrt{\det(d^2 f)}\|_{L^\tau} N^{-1}$ for all $N \geq N(f)$. It should be stressed that $N(f)$ can be arbitrarily large depending on the function $f$. Intuitively, this means that when $f$ […] $C^2$ functions is so far incomplete, as explained at the end of the introduction.
The proof of Theorem 8.4.1 uses the fact that a strictly convex $C^2$ function is locally close to a quadratic function with positive definite hessian, which allows us to exploit the results obtained in §8.3.

We consider a triangle $T$, a function $f \in C^2(T)$, a convex quadratic function $q$ and $\mu > 0$ such that on $T$
$$d^2 q \leq d^2 f \leq (1 + \mu)\, d^2 q. \tag{8.35}$$
It follows that
$$\det(d^2 q) \leq \det(d^2 f) \leq \det((1 + \mu)\, d^2 q) = (1 + \mu)^2 \det(d^2 q).$$
Since $\det(d^2 q) = 4 \det(\mathbf{q})$, we obtain
$$2 \big\|\sqrt{\det \mathbf{q}}\big\|_{L^\tau(T)} \leq \big\|\sqrt{\det d^2 f}\big\|_{L^\tau(T)} \leq 2 (1 + \mu) \big\|\sqrt{\det \mathbf{q}}\big\|_{L^\tau(T)}. \tag{8.36}$$
The following proposition shows that the local errors associated to $f$ and $q$ are close.

Proposition 8.4.2. There exists a constant $C_e > 0$, depending only on the operator $A_T$, such that
$$(1 - C_e \mu)\, e_T(q)_p \leq e_T(f)_p \leq (1 + C_e \mu)\, e_T(q)_p. \tag{8.37}$$

Proof: It follows from inequality (8.35) that the functions $f - q$ and $(1 + \mu) q - f$ are convex, hence
$$I_T(f - q) - (f - q) \geq 0 \quad \text{and} \quad I_T((1 + \mu) q - f) - ((1 + \mu) q - f) \geq 0 \quad \text{on } T.$$
We therefore obtain
$$0 \leq (I_T f - f) - (I_T q - q) \leq \mu (I_T q - q).$$
There exists a constant $C_1 > 0$ depending only on $A_T$ such that for any $h \in C^0(T)$,
$$e_T(h)_p \leq C_1 |T|^{\frac{1}{p}} \|h\|_{L^\infty(T)}.$$
Furthermore, according to Proposition 8.2.3 there exists a constant $C_2 > 0$ depending only on $A_T$ such that
$$|T|^{\frac{1}{p}} \|q - I_T q\|_{L^\infty(T)} \leq C_2\, e_T(q)_p.$$
Hence
$$\begin{aligned}
|e_T(f)_p - e_T(q)_p| &\leq e_T(f - q)_p \\
&= e_T((I_T f - f) - (I_T q - q))_p \\
&\leq C_1 |T|^{\frac{1}{p}} \|(I_T f - f) - (I_T q - q)\|_{L^\infty(T)} \\
&\leq C_1 |T|^{\frac{1}{p}} \|\mu (I_T q - q)\|_{L^\infty(T)} \\
&\leq C_1 C_2\, \mu\, e_T(q)_p.
\end{aligned}$$
This concludes the proof of this proposition, with $C_e = C_1 C_2$.
$\diamond$

Note that using Proposition 8.3.2, and assuming that $\mu \leq c_e := \frac{1}{2 C_e}$, we have with $\frac{1}{\tau} := 1 + \frac{1}{p}$,
$$e_T(f)_p \sim e_T(q)_p \sim \sigma_{\mathbf{q}}(T) \big\|\sqrt{|\det \mathbf{q}|}\big\|_{L^\tau(T)} \sim \sigma_{\mathbf{q}}(T) \big\|\sqrt{\det d^2 f}\big\|_{L^\tau(T)}, \tag{8.38}$$
with absolute constants in the equivalence.

We next study the behavior of the decision function $e \mapsto d_T(e, f)$. For this purpose, we introduce the following definition.

Definition 8.4.3. Let $T$ be a triangle with edges $a, b, c$. A $\delta$-near longest edge bisection with respect to the $\mathbf{q}$-metric is a bisection of any edge $e \in \{a, b, c\}$ such that
$$\mathbf{q}(e) \geq (1 - \delta) \max\{\mathbf{q}(a), \mathbf{q}(b), \mathbf{q}(c)\}.$$

Proposition 8.4.4. Assume that $f$ and $q$ satisfy (8.35). Then, the bisection of $T$ prescribed by the decision function $e \mapsto d_T(e, f)$ is a $\mu$-near longest edge bisection for the $\mathbf{q}$-metric.

Proof: It follows directly from Equation (8.20) that for any edge $e$ of $T$,
$$D_T(e, q) \leq D_T(e, f) \leq D_T(e, (1 + \mu) q),$$
hence we obtain using (8.21)
$$\frac{|T|}{12} \mathbf{q}(e) \leq D_T(e, f) \leq (1 + \mu) \frac{|T|}{12} \mathbf{q}(e). \tag{8.39}$$
Therefore the bisection of $T$ prescribed by the decision function $e \mapsto d_T(e, f)$ selects an $e$ such that
$$(1 + \mu)\, \mathbf{q}(e) \geq \max\{\mathbf{q}(a), \mathbf{q}(b), \mathbf{q}(c)\}.$$
It is therefore a $\delta$-near longest edge bisection for the $\mathbf{q}$-metric with $\delta = \frac{\mu}{1 + \mu} \leq \mu$, and therefore also a $\mu$-near longest edge bisection. $\diamond$

In the rest of this section, we analyze the difference between a longest edge bisection in the $\mathbf{q}$-metric and a $\delta$-near longest edge bisection. For that purpose we introduce a distance between triangles: if $T_1, T_2$ are two triangles with edges $a_1, b_1, c_1$ and $a_2, b_2, c_2$ such that
$$\mathbf{q}(a_1) \geq \mathbf{q}(b_1) \geq \mathbf{q}(c_1) \quad \text{and} \quad \mathbf{q}(a_2) \geq \mathbf{q}(b_2) \geq \mathbf{q}(c_2), \tag{8.40}$$
we define
$$\Delta_{\mathbf{q}}(T_1, T_2) = \max\{|\mathbf{q}(a_1) - \mathbf{q}(a_2)|, |\mathbf{q}(b_1) - \mathbf{q}(b_2)|, |\mathbf{q}(c_1) - \mathbf{q}(c_2)|\}.$$
Note that $\Delta_{\mathbf{q}}$ is a distance up to rigid transformations.

Lemma 8.4.5.
Let T , T be two triangles, let ( R , U ) and ( R , U ) be the two pairs ofchildren from the longest edge bisection of T in the q -metric, and a δ -near longest edgebisection of T in the q -metric. Then, up to a permutation of the pair of triangles ( R , U ) , max { ∆ q ( R , R ) , ∆ q ( U , U ) } ≤ 54 ∆ q ( T , T ) + δ q ( a ) . where a is the longest edge of T in the q -metric. Proof: We assume that the edges of T and T are named and ordered as in (8.40). Up toa permutation, R and U have edge vectors b , a / , ( c − b ) / c , a / , ( b − c ) / R , U ) :– q ( e ) < (1 − δ ) q ( a ) for e = b and c . In such a case the triangle T is bisected towards a , so that up to a permutation, R and U have edge vectors b , a / , ( c − b ) / c , a / , ( b − c ) / 2. Using that q (( c − b ) / 2) = q ( c ) / q ( b ) / − q ( a ) / a + b + c = 0, it clearly follows thatmax { ∆ q ( R , R ) , ∆ q ( U , U ) } ≤ 54 ∆ q ( T , T ) . – q ( e ) ≥ (1 − δ ) q ( a ) for some e = b or c . In such a case T may be bisected saytowards b , so that up to a permutation, R and U have edge vectors a , b / , ( c − a ) / c , b / , ( b − c ) / 2. But since | q ( b ) − q ( a ) | ≤ δ q ( a ), we obtain thatmax { ∆ q ( R , R ) , ∆ q ( U , U ) } ≤ 54 ∆ q ( T , T ) + δ q ( a ) . (8.41) (cid:5) We now introduce a perturbed version of the estimates describing the decay of thenon-degeneracy measure which were obtained in Proposition 8.3.3 and Corollary 8.3.8. Proposition 8.4.6. If ( T i ) i =1 are the two children obtained from a refinement of a triangle T in which a δ -near longest edge bisection in the q -metric is selected, then max { σ q ( T ) , σ q ( T ) } ≤ (1 + 4 δ ) σ q ( T ) . (8.42) If ( T i ) i =1 are the eight children of a triangle T obtained from three successive refinementsin which a δ -near longest edge bisection in the q -metric is selected, then– for all i , σ q ( T i ) ≤ σ q ( T )(1 + C δ ) ,– there exists i such that σ q ( T i ) ≤ . 
σ q ( T )(1 + C δ ) or σ q ( T i ) ≤ M ,where C = and M = 5(1 + C δ ) . Proof: We first prove (8.42), and for that purpose we introduce the two children T (cid:48) , T (cid:48) obtained by bisecting the longest edge of T in the q -metric. If follows from (8.41) that,up to a permutation of the pair ( T (cid:48) , T (cid:48) ),max { ∆ q ( T , T (cid:48) ) , ∆ q ( T , T (cid:48) ) } ≤ δ q ( a ) , Chapter 8. Greedy bisection generates optimally adapted triangulations where a is the longest edge of T in the q -metric. Hence | σ q ( T i ) − σ q ( T (cid:48) i ) | ≤ q ( T i , T (cid:48) i )4 | T i | (cid:112) det( q ) ≤ δ q ( a )4 | T i | (cid:112) det( q ) ≤ δσ q ( T ) . (8.43)We know from Proposition 8.3.3 that max { σ q ( T (cid:48) ) , σ q ( T (cid:48) ) } ≤ σ q ( T ). Combining this pointwith (8.43) we conclude the proof of (8.42).We now turn to proof of the second part of the proposition and for that purpose weintroduce the eight children ( T (cid:48) i ) i =1 obtained from three successive refinements of T inwhich the longest edge in the q -metric is selected. Iterating (8.41), we find that, up to apermutation of the triangles ( T (cid:48) i ) i =1 , one hasmax i =1 , ··· , ∆ q ( T i , T (cid:48) i ) ≤ (cid:32) (cid:18) (cid:19) (cid:33) δ q ( a ) = 6116 δ q ( a ) = C δ q ( a ) , where, again, a is the longest edge of T in the q -metric. Repeating the argument (8.43)we find that max i =1 , ··· , | σ q ( T i ) − σ q ( T (cid:48) i ) | ≤ C δσ q ( T ) . (8.44)We know from Corollary 8.3.8 that σ q ( T (cid:48) i ) ≤ σ q ( T ) for all i and that there exists i suchthat either σ q ( T (cid:48) i ) ≤ . σ q ( T ) or σ q ( T (cid:48) i ) ≤ 5. Combining this point with (8.44) weconclude the proof of the proposition. 
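As a worked illustration of the iteration of (8.41) in the proof above, write $\Delta_k$ (a notation introduced here only for this sketch) for the largest value of $\Delta_q$ between matched triangles after $k$ bisections, so that $\Delta_0 = 0$ since both families start from the same triangle $T$. Using the factor $\frac{5}{4}$ of Lemma 8.4.5, and the fact that no triangle obtained by bisection has an edge longer than $a$ in the $q$-metric, one finds
\[ \Delta_{k+1} \le \tfrac{5}{4} \Delta_k + \delta\, \mathbf{q}(a), \]
\[ \Delta_1 \le \delta\, \mathbf{q}(a), \qquad \Delta_2 \le \big(1 + \tfrac{5}{4}\big)\, \delta\, \mathbf{q}(a), \qquad \Delta_3 \le \big(1 + \tfrac{5}{4} + \tfrac{25}{16}\big)\, \delta\, \mathbf{q}(a) = \tfrac{61}{16}\, \delta\, \mathbf{q}(a). \]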
(cid:5) Our next step towards the proof of Theorem 8.4.1 is to show that the triangulationproduced by the greedy algorithm is locally optimal in the following sense : if the refine-ment procedure for the function f produces a triangle T ∈ D on which f is close enoughto a quadratic function q , then the triangles which are generated from the refinement of T tend to adopt an optimal aspect ratio in the q -metric, and a local version of the optimalestimate (8.1) holds on T .We first prove that most triangles adopt an optimal aspect ratio as we iterate therefinement procedure. Our goal is thus to obtain a result similar to Theorem 8.3.9 whichwas restricted to quadratic functions. However, due to the perturbations by C µ thatappear in Proposition 8.4.6, the formulation will be slightly different, yet sufficient for ourpurposes : we shall prove that the measure of non-degeneracy becomes bounded by anabsolute constant in an average sense, as we iterate the refinement procedure.As in the previous section, we assume that f and q satisfy (8.35). For any T , wedefine T un ( T ) the triangulation of T which is built by iteratively applying the refinementprocedure for the function f to all generated triangles up to 3 n generation levels. Notethat T un ( T )) = 2 n and | T (cid:48) | = 2 − n | T | , T (cid:48) ∈ T un ( T ) . .4. The case of strictly convex functions r > 0, we define the average r -th power of the measure of non-degeneracy of the 2 n triangles obtained from T after 3 n iterations by σ r q ( n ) = 12 n (cid:88) T (cid:48) ∈T un ( T ) σ r q ( T (cid:48) ) . We also define γ ( r, µ ) := 18 (cid:16) . C µ ) (cid:17) r + 78 (1 + C µ ) r , where C is the constant in Proposition 8.4.6. Note that for any r > 0, the function γ ( r, · )is continuous and increasing, and that 0 < γ ( r, < 1. Hence for any r > 0, there exists µ ( r ) > < γ ( r ) < γ ( r, µ ) ≤ γ ( r ), if 0 < µ < µ ( r ). Proposition 8.4.7. Assume that f and q satisfy (8.35) with < µ ≤ µ ( r ) . 
We then have σ r q ( n ) ≤ σ r q ( T ) γ ( r ) n + M r − γ ( r )) , where M is the constant in Proposition 8.4.6. Therefore σ r q ( n ) ≤ C := 1 + M r − γ ( r )) , if n ≥ σ q ( T ) λ with λ := r ln 2 − ln γ ( r ) . Proof: Let us use the notations u = 0 . C µ ) and v = (1 + C µ ). According toProposition 8.4.6, we have σ r q ( n ) ≤ E ( σ rn ) , where E is the expectation operator and σ n is the Markov chain with value in [1 , + ∞ [defined by– σ n +1 = max { σ n u, M } with probability α := ,– σ n +1 = σ n v with probability β := ,– σ := σ q ( T ) with probability 1.Denoting by µ n the probability distribution of σ n , we have E ( σ rn +1 ) = (cid:90) ∞ σ r dµ n +1 ( σ )= (cid:90) ∞ ( α (max { uσ, M } ) r + β ( vσ ) r ) dµ n ( σ )= αM r (cid:90) M/u dµ n ( σ ) + αu r (cid:90) ∞ M/u σ r dµ n ( σ ) + βv r (cid:90) + ∞ σ r dµ n ( σ ) ≤ αM r + ( αu r + βv r ) E ( σ rn ) ≤ αM r + γ ( r ) E ( σ rn )By iteration, it follows that E ( σ rn ) ≤ E ( σ r ) γ ( r ) n + αM r − γ ( r ) , Chapter 8. Greedy bisection generates optimally adapted triangulations which gives the result. (cid:5) Our next goal is to show that the greedy algorithm initialized from T generates atriangulation which is a refinement of T un ( T ) and therefore more accurate, yet with asimilar amount of triangles. To this end, we apply the greedy algorithm with root T andstopping criterion given by the local error η := min T (cid:48) ∈T un ( T ) e T (cid:48) ( f ) p . Therefore T (cid:48) is splitted if and only if e T (cid:48) ( f ) p > η . We denote by T N ( T ) the resultingtriangulation where N is its cardinality. From the definition of the stopping criterion, itis clear that T N ( T ) is a refinement of T un ( T ). Proposition 8.4.8. Assume that f and q satisfy (8.35) with µ ≤ , and define r := ln 2ln 4 − ln 3 > . We then have N ≤ C n σ r q ( n ) , where C is an absolute constant. 
Assuming in addition that µ ≤ µ ( r ) as in Proposition8.4.7, we obtain that N ≤ C n , if n ≥ σ q ( T ) λ with λ := r ln 2 − ln γ ( r ) , and where C = C C . Proof: Let T be a triangle in T un ( T ) and T a triangle in T N ( T ) such that T ⊂ T . Weshall give a bound on the number of splits k which were applied between T and T , i.e.such that | T | = 2 − k | T | . We first remark that according to Proposition 8.3.2 and (8.38),we have η ≥ c min T (cid:48) ∈T un ( T ) | T (cid:48) | p σ q ( T (cid:48) ) (cid:112) det q ≥ c | T | p (cid:112) det q , where c is an absolute constant. On the other hand, using both Proposition 8.4.2 andProposition 8.4.6, we obtain e T ( f ) q ≤ C | T | p σ q ( T ) √ det q = C | T | p − k (1+ p ) σ q ( T ) √ det q ≤ C | T | p σ q ( T ) (cid:16) − (1+ p ) (1 + 4 µ ) (cid:17) k √ det q . ≤ Cc σ q ( T ) (cid:16) µ (cid:17) k η ≤ Cc σ q ( T )( ) k η, where C is an absolute constant. Therefore we see that k is at most the smallest integersuch that Cc σ q ( T )( ) k ≤ 1. It follows that the total number n ( T ) of triangles T ∈ T N ( T )which are contained in T is bounded by n ( T ) ≤ k ≤ (cid:18) Cc σ q ( T ) (cid:19) r , .4. The case of strictly convex functions N = (cid:88) T ∈T un ( T ) n ( T ) ≤ (cid:18) Cc (cid:19) r (cid:88) T ∈T un ( T ) σ q ( T ) r = C n σ r q ( n ) , with C = 2 (cid:0) Cc (cid:1) r . The fact that N ≤ C n when 2 n ≥ σ q ( T ) λ with λ := r ln 2 − ln γ ( r ) is animmediate consequence of Proposition 8.4.7. (cid:5) Our last step towards the proof of Theorem 8.4.1 consists in deriving local error esti-mates for the greedy algorithm. For η > 0, we denote by f η the approximant to f obtainedby the greedy algorithm with stopping criterion given by the local error η : a triangle T is splitted if and only if e T ( f ) p > η . The resulting triangulation is denoted by T η = T N , with N = N ( η ) = T η ) . For this N , we thus have f η = f N . 
For a given T generated by the refinement procedureand such that η ≤ e T ( f ) p , we also define T η ( T ) = { T (cid:48) ⊂ T ; T (cid:48) ∈ T η } the triangles in T η which are contained in T and N ( T, η ) = T η ( T )) . Our next result provides with estimates of the local error (cid:107) f − f η (cid:107) L p ( T ) and of N ( T, η ) interms of η , provided that µ is small enough. Theorem 8.4.9. Assume that f and q satisfy (8.35) with µ ≤ c := min { , µ ( r ) } , andthat η ≤ η , where η = η ( T ) := (cid:18) | T | σ q ( T ) λ (cid:19) τ (cid:112) det q , with λ := r ln 2 − ln γ ( r ) , and τ = p + 1 . Then (cid:107) f − f η (cid:107) L p ( T ) ≤ ηN ( T, η ) p , (8.45) and N ( T, η ) ≤ C η − τ (cid:107) (cid:112) det( d f ) (cid:107) τL τ ( T ) , (8.46) where C is an absolute constant. Proof: The first estimate is trivial since (cid:107) f − f η (cid:107) L p ( T ) = (cid:16) (cid:88) T (cid:48) ∈T η ( T ) e T (cid:48) ( f ) pp (cid:17) p ≤ (cid:16) (cid:88) T (cid:48) ∈T η ( T ) η p (cid:17) p = ηN ( T, η ) p . Chapter 8. Greedy bisection generates optimally adapted triangulations In the case p = ∞ , we trivially have (cid:107) f − f η (cid:107) L ∞ ( T ) ≤ η. For the second estimate, we define n = n ( T ) the smallest positive integer such that2 n ( T ) ≥ σ q ( T ) λ with λ := r ln 2 − ln γ ( r ) . For any fixed n ≥ n , we define η n := min T (cid:48) ∈T un ( T ) e T (cid:48) ( f ) p . We know from Proposition 8.4.8 that with the choice η = η n N ( T, η n ) ≤ C n . (8.47)On the other hand, we know from Proposition 8.4.7, that σ r q ( n ) ≤ C , from which itfollows that min T (cid:48) ∈T un ( T ) σ q ( T (cid:48) ) ≤ C r . According to Proposition 8.4.2, we also have η n ≤ C min T (cid:48) ∈T un ( T ) | T (cid:48) | p σ q ( T (cid:48) ) (cid:112) det q ≤ C r C (cid:16) | T | n (cid:17) τ (cid:112) det q , where C is an absolute constant, which also reads2 n ≤ C τr C τ η − τn | T | (cid:112) det q τ . 
Combining this with (8.47), we have obtained the estimate N ( T, η n ) ≤ C C τr C τ η − τn | T | (cid:112) det q τ , which by Proposition 8.4.2 is equivalent to (8.46) with η = η n . In order to obtain (8.46)for all arbitrary values of η , we write that η n +1 < η ≤ η n for some n ≥ n , then N ( T, η ) ≤ N ( T, η n +1 ) ≤ C n +1) ≤ C C τr C τ η − τn | T |√ det q τ ≤ C C τr C τ η − τ | T |√ det q τ , which by Proposition 8.4.2 is equivalent to (8.46). In the case where η ≥ η n , we simplywrite N ( T, η ) ≤ N ( T, η n ) ≤ C n ≤ C σ q ( T ) λ = 64 C η − τ | T |√ det q τ ≤ C η − τ | T |√ det q τ , and we conclude in the same way. (cid:5) .4. The case of strictly convex functions (cid:107) f − f η (cid:107) L p ( T ) ≤ C τ (cid:107) (cid:112) det( d f ) (cid:107) L τ ( T ) N ( T, η ) − . In order to obtain the global estimate of Theorem 8.4.1, we need to be ensured that aftersufficiently many steps of the greedy algorithm, the target f can be well approximatedby quadratic function q = q ( T ) on each triangle T , so that our local results will apply onsuch triangles. This is ensured due to the following key result. Proposition 8.4.10. Let f be a C function such that d f ( x ) ≥ mI for some arbitrarybut fixed m > independent of x . Let T N be the triangulation generated by the greedyalgorithm applied to f using the L decision function given by (8.18). Then lim N → + ∞ max T ∈T N diam( T ) = 0 , i.e. the diameter of all triangles tends to . Proof: Let T be a triangle with an angle θ at a vertex z . The other vertices of T can bewritten as z = z + αu and z = z + βv where α, β ∈ R + and u, v ∈ R are unitary. Weassume that αu is the longest edge of T , hence θ ≤ π / 2. Observe that ρ ( T ) := h T | T | = α αβ sin θ = 2 αβ sin θ , and | u − v | = 2 sin (cid:18) θ (cid:19) = sin θ cos( θ ) = 2 αβρ ( T ) cos( θ ) . Since √ ≤ cos (cid:0) θ (cid:1) we thus obtain | u − v | ≤ √ αβρ ( T ) ≤ αβρ ( T ) . 
We now set M := (cid:107) d f (cid:107) L ∞ (Ω) and for all δ > ω ( δ ) := sup z,z (cid:48) ∈ Ω , (cid:107) z − z (cid:48) (cid:107)≤ δ (cid:107) d f ( z ) − d f ( z (cid:48) ) (cid:107) . For t ∈ IR, we define H ut := d f z + tu and H vt := d f z + tv . and notice that (cid:107) H ut − H vt (cid:107) ≤ ω ( t | u − v | ) . Hence, if 0 ≤ t ≤ β , we have (cid:107) H ut − H vt (cid:107) ≤ ω (cid:18) αρ ( T ) (cid:19) . Chapter 8. Greedy bisection generates optimally adapted triangulations Furthermore, for all t we have |(cid:104) H ut u, u (cid:105) − (cid:104) H ut v, v (cid:105)| = |(cid:104) H ut u, u (cid:105) − (cid:104) H ut u − ( u − v ) , u − ( u − v ) (cid:105)| = | (cid:104) H ut u, u − v (cid:105) − (cid:104) H ut ( u − v ) , u − v (cid:105)|≤ M | u || u − v | + M | u − v | ≤ Mβ (cid:18) αβρ ( T ) + (cid:16) αρ ( T ) (cid:17) (cid:19) . Applying the identity (8.20) to the edges e = αu and βv , and using a change of variable,we can write D T ( αu, f ) = (cid:90) R min { t, α − t } + (cid:104) H ut u, u (cid:105) dt and D T ( βv, f ) = (cid:90) R min { t, β − t } + (cid:104) H vt v, v (cid:105) dt where we have used the notation r + := max { r, } . 
Hence, noticing that (cid:90) R min { t, λ − t } + dt = (cid:90) λ min { t, λ − t } dt = ( λ + ) , and using the previous estimates we obtain D T ( αu, f ) − D T ( βv, f ) = (cid:90) R (min { t, α − t } + (cid:104) H ut u, u (cid:105) − min { t, β − t } + (cid:104) H vt v, v (cid:105) ) dt = (cid:90) R (min { t, α − t } + − min { t, β − t } + ) (cid:104) H ut u, u (cid:105) dt − (cid:90) R min { t, β − t } + ( (cid:104) H vt v, v (cid:105) − (cid:104) H ut u, u (cid:105) ) dt ≥ m (cid:90) R (min { t, α − t } + − min { t, β − t } + ) dt − (cid:90) β min { t, β − t } ( |(cid:104) H ut u, u (cid:105) − (cid:104) H ut v, v (cid:105)| + |(cid:104) ( H ut − H vt ) v, v (cid:105)| ) dt ≥ m α − β − M (cid:32) αβρ ( T ) + (cid:18) αρ ( T ) (cid:19) (cid:33) − β ω (cid:18) αρ ( T ) (cid:19) ≥ m α − β − α (cid:18) ρ ( T ) + 9 ρ ( T ) + ω (cid:18) αρ ( T ) (cid:19)(cid:19) , where we have used the fact that α > β in the last line. We can therefore write D T ( αu, f ) − D T ( βv, f ) ≥ m (cid:16) α (cid:16) − ˜ ω (cid:16) ρ ( T ) (cid:17)(cid:17) − β (cid:17) , (8.48)where we have set ˜ ω ( δ ) := 1 m (cid:16) δ + 9 δ + ω (3 diam(Ω) δ ) (cid:17) . .4. The case of strictly convex functions d T ( · , f ) prescribes a ˜ ω (cid:16) ρ ( T ) (cid:17) -near longest edge bisectionin the euclidean metric for any triangle T . Indeed if the smaller edge βv was selected, wewould necessarily have | βv | = β ≥ (cid:16) − ˜ ω (cid:16) ρ ( T ) (cid:17)(cid:17) α = (cid:16) − ˜ ω (cid:16) ρ ( T ) (cid:17)(cid:17) | αu | . Notice that ˜ ω ( δ ) → δ → f is strictly convex, there does not exists any triangle T ⊂ Ω such that e T ( f ) p =0. Let us assume for contradiction that the diameter of the triangles generated by thegreedy algorithm does not tend to zero. Then there exists a sequence ( T i ) i ≥ of trianglessuch that T i +1 is one of the children of T i , and h T i → d > i → ∞ , where h T denotesthe diameter of a triangle T . 
Since | T i | → 0, this also implies that ρ ( T i ) → + ∞ as i → ∞ .We can therefore choose i large enough such that h T i < d and C ˜ ω (cid:16) ρ ( T j ) (cid:17) ≤ for all j ≥ i , where C is the constant in Proposition 8.4.6. According to this Proposition, wehave σ ( T i +3 ) ≤ σ ( T i ) , where σ stands for σ q in the euclidean case q = x + y . On the other hand, we have forany triangle T , h T | T | ≤ σ ( T ) ≤ h T | T | , from which it follows that h T i +3 ≤ | T i +3 | σ ( T i +3 ) | T i | σ ( T i ) h T i ≤ h T i . Therefore, h T i +3 < d which is a contradiction. This concludes the proof of Proposition8.4.10. (cid:5) Proof of Theorem 8.4.1 Since f ∈ C , an immediate consequence of Proposition 8.4.10is that for all µ > 0, there exists N := N ( f, µ ) , such that for all T ∈ T N , there exists a quadratic function q T such that d q T ≤ d f ≤ (1 + µ ) d q T . Therefore our local results apply on all T ∈ T N . Specifically, we choose N := N ( f, c ) , with c the constant in Theorem 8.4.9. We then take η ≤ η := min T ∈T N (cid:26) e T ( f ) p , (cid:16) | T | σ q T ( T ) λ (cid:17) τ (cid:112) det q T (cid:27) . We use the notations f η = f N , T η = T N , N = N ( η ) = T η ) = T N ) , Chapter 8. Greedy bisection generates optimally adapted triangulations for the approximants and triangulation obtained by the greedy algorithm with stop-ping criterion given by the local error η . Note that T η is a refinement of T N , since η ≤ min T ∈T N e T ( f ) p , and therefore N ≥ N . We obviously have (cid:107) f − f N (cid:107) L p ≤ ηN p . Using Theorem 8.4.9, we also have N = (cid:88) T ∈T N N ( T, η ) ≤ C η − τ (cid:107) (cid:112) det( d f ) (cid:107) τL τ (Ω) , and therefore (cid:107) f − f N (cid:107) ≤ C τ (cid:107) (cid:112) det( d f ) (cid:107) L τ (Ω) N − , which is the claimed estimate. Since we have assumed η ≤ η , this estimate holds for N > N , where N is largest value of N such that e T ( f ) p ≥ η for at least one T ∈ T N . (cid:5) Remark 8.4.11. 
In Chapter 7 a modification of the algorithm is proposed so that its convergence in the $L^p$ norm is ensured for any function $f \in L^p(\Omega)$ (or $f \in C(\Omega)$ when $p = \infty$). However this modification is not needed in the proof of Theorem 8.4.1, due to the assumption that $f$ is convex.

Chapter 9
Variants of the greedy bisection algorithm

We study in this chapter several variants of the greedy algorithm that was discussed in Chapter 7 and Chapter 8.

One of the key features of this algorithm is the decision function which governs the creation of anisotropy by selecting among the available directions of refinement of a triangle. It was proved in Chapter 8 that for piecewise linear approximation, the $L^1$ based decision function (9.1) leads to optimally adapted triangles. We consider in §9.2 alternative decision functions, based on the $L^2$ approximation error or the $L^\infty$ interpolation error respectively, both in the context of piecewise linear approximation. The study of these decision functions is motivated by the following reasons. The $L^2$ based decision function can be computed at a significantly smaller numerical cost than the $L^1$ based or $L^\infty$ based decision functions, and is therefore the most suited method for numerical applications. The $L^\infty$ based decision function is computationally more costly, but it leads to more general convergence results.

We then focus our attention in § […] ($N^{-1}$ in the $L^2$ norm) expected for such functions. This is inherently due to the fact that the geometry of the bisections used in the algorithm is too limited, regardless of the decision function which is being used. We thus consider some modifications of the greedy algorithm which partially solve this problem using alternative bisection choices. Instead of bisecting a triangle from a vertex to the midpoint of the opposite edge, we offer different possibilities which lead to better directional selectivity.
In turn we obtain optimal convergence rates for simple cartoon functions of the form $f = \chi_P$ where $P$ is any half plane. The behavior of the greedy algorithm on general cartoon functions remains an open question.

Finally, we consider in § […] C function and which is in accordance with the convergence estimate established in Chapter 1 for optimized rectangular partitions.

Let us briefly recall the two steps of the greedy refinement algorithm studied in Chapter 7 and Chapter 8:

a) A triangle $T \in \mathcal{T}$ which maximizes the approximation error is selected:
\[ T = \operatorname{argmax}_{T' \in \mathcal{T}} e_{T'}(f)_p. \]

b) For each edge $e \in \{a, b, c\}$ of the triangle $T$, the bisection of $T$ from the midpoint of $e$ to the opposite vertex defines two children $T_e^1$ and $T_e^2$. The edge $e$ which minimizes a given decision function $d_T(e, f)$ is selected,
\[ e := \operatorname{argmin}_{e' \in \{a, b, c\}} d_T(e', f), \]
and we define the new triangulation $\mathcal{T}'$ by
\[ \mathcal{T}' := \mathcal{T} - \{T\} + \{T_e^1, T_e^2\}. \]

Theorem 8.4.1 states that if the function $f$ to be approximated is $C^2$ and strictly convex, and if the decision function is given by
\[ d_T(e, f) := \|f - I_{T_e^1} f\|_{L^1(T_e^1)} + \|f - I_{T_e^2} f\|_{L^1(T_e^2)}, \tag{9.1} \]
then
\[ \limsup_{N \to \infty} N\, e_{\mathcal{T}_N}(f)_p \le C\, \Big\| \sqrt{|\det(d^2 f)|} \Big\|_{L^\tau(\Omega)}. \tag{9.2} \]
We now want to investigate two alternate choices for the decision function $d_T(e, f)$. The first of these choices is based on the $L^2$ approximation error
\[ d_T(e, f) := \|f - P_{T_e^1} f\|_{L^2(T_e^1)}^2 + \|f - P_{T_e^2} f\|_{L^2(T_e^2)}^2, \tag{9.3} \]
where, for any triangle $T$, we denote by $P_T$ the $L^2(T)$ orthogonal projection onto the space $\mathbb{P}_1$ of affine functions. The second choice is based on the $L^\infty$ interpolation error,
\[ d_T(e, f) := \|f - I_{T_e^1} f\|_{L^\infty(T_e^1)} + \|f - I_{T_e^2} f\|_{L^\infty(T_e^2)}. \tag{9.4} \]
Our motivation for studying the $L^2$ based decision function is computational.
Indeed, from the point of view of computer run time, the biggest cost of the greedy algorithm comes from the evaluation of the decision function. As explained below, a trick allows us to evaluate the (discretized) $L^2$ based decision function (9.3) much faster than the $L^1$ based or $L^\infty$ based decision functions (9.1) and (9.4). The greedy algorithm has been implemented in C++ by Lihua Yang. For an image of realistic resolution, say 512 × […] $L^2$ based decision function if it is efficiently evaluated. In contrast it may take up to a minute to generate the same number of triangles based on a brute force evaluation of the $L^2$ based decision function, or the $L^1$ based or $L^\infty$ based decision functions. As a result the $L^2$ based decision function (9.3) is the preferred one in numerical experiments. We shall prove that the greedy algorithm based on the $L^2$ decision function generates a sequence of triangulations satisfying the optimal estimate (9.2) when it is applied to any quadratic function $f = q$ such that the quadratic form $\mathbf{q}$ is non-degenerate. In particular $\mathbf{q}$ may be either strictly convex, concave or of hyperbolic type. The latter case, which corresponds to a mixed signature $(1, 1)$ of the quadratic form $\mathbf{q}$, is not treated in the study in Chapter 8 of the $L^1$ based decision function (9.1).

The decision function based on the $L^\infty$ error (9.4) is comparable to the $L^1$ based decision function in terms of computational cost, but it leads to the most complete theoretical results. Indeed we shall prove that the optimal convergence estimate (9.2) holds for any quadratic function $f = q$ such that the quadratic form $\mathbf{q}$ is non-degenerate, as well as for functions which are $C^2$ and strictly convex.
The latter case was treated in Theorem 8.4.1 of Chapter 8 for the $L^1$ based decision function, but is not established for the $L^2$ based decision function.

Before turning to the detailed study of the different decision functions, we explain as announced why the (discretized) $L^2$ based decision function (9.3) is much less expensive to compute than the $L^1$ based or $L^\infty$ based decision functions, in terms of computer run time.

Let $T$ be a triangle, and let $M_T$ be the following $3 \times 3$ matrix, which is the Gram matrix of the basis $(1, x, y)$ of the subspace $\mathbb{P}_1$ of $L^2(T)$,
\[ M_T := \int_T \begin{pmatrix} 1 & x & y \\ x & x^2 & xy \\ y & xy & y^2 \end{pmatrix} dx\, dy = \int_T (1, x, y)^{\mathrm{T}} (1, x, y)\, dx\, dy. \]
Let also $V_T(f) \in \mathbb{R}^3$ be the vector of the first order moments of the function $f$ on $T$,
\[ V_T(f) := \int_T (1, x, y)^{\mathrm{T}} f(x, y)\, dx\, dy. \]
The $L^2(T)$ orthogonal projection of $f$ onto $\mathbb{P}_1$ has the expression
\[ P_T(f) = (1, x, y)\, M_T^{-1} V_T(f). \]
Therefore, denoting by $\langle \cdot, \cdot \rangle$ the $L^2(T)$ scalar product,
\[ \|f - P_T f\|_{L^2(T)}^2 = \|f\|_{L^2(T)}^2 - 2 \langle f, P_T(f) \rangle + \|P_T(f)\|_{L^2(T)}^2 = \|f\|_{L^2(T)}^2 - \|P_T(f)\|_{L^2(T)}^2 = \int_T f(x, y)^2\, dx\, dy - V_T(f)^{\mathrm{T}} M_T^{-1} V_T(f). \]
Hence the squared $L^2(T)$ approximation error of $f$ on $T$ has an expression in terms of the integrals on $T$ of the functions $1, x, y, x^2, xy, y^2$, and $f, xf, yf, f^2$. The Fubini formula transforms an integration on a bidimensional domain into two successive one dimensional integrations. Indeed for any triangle $T$ and any $g \in L^1(T)$,
\[ \int_T g(x, y)\, dx\, dy = \int_{y_*(T)}^{y^*(T)} \left( \int_{x_*(T, y)}^{x^*(T, y)} g(x, y)\, dx \right) dy = \int_{y_*(T)}^{y^*(T)} \Big( G(x^*(T, y), y) - G(x_*(T, y), y) \Big)\, dy. \]
(9.5)
Here we have used the notations
\[ G(x, y) := \int_{-\infty}^{x} g(x', y)\, dx', \]
\[ y_*(T) := \min\{ y \in \mathbb{R}\ ;\ \exists x \in \mathbb{R} \text{ such that } (x, y) \in T \}, \]
\[ x_*(T, y) := \min\{ x \in \mathbb{R}\ ;\ (x, y) \in T \}, \]
and $y^*(T)$ and $x^*(T, y)$ are obtained by taking the maximum instead of the minimum in the definitions of $y_*(T)$ and $x_*(T, y)$ respectively. The important point in (9.5) is that the function $G$ does not depend on the triangle $T$. If this function is known, then (9.5) transforms the bidimensional integration of $g$ into a one dimensional integration.

This strategy extends to the discrete setting with the Lebesgue measure on $T$ replaced by the counting measure at pixels where the approximated function $f$ is sampled and whose center is contained in $T$. The integrals of the type $\int_T g(x, y)\, dx\, dy$ appearing in the previous equations are therefore replaced by discrete sums
\[ \sum_{(i, j) \in T} g(i, j), \tag{9.6} \]
where $(i, j)$ stands for the center of the pixel, and without loss of generality runs over $\{0, \cdots, N-1\}^2$ for an $N \times N$ image. In particular the quantity $\|f - P_T f\|_{L^2(T)}^2$ appearing in the definition (9.3) of the $L^2$ based decision function is replaced with
\[ \inf_{\pi \in \mathbb{P}_1} \sum_{(i, j) \in T} |f(i, j) - \pi(i, j)|^2. \tag{9.7} \]
Again, this quantity can be expressed in terms of the (discrete) integrals of the functions $1, x, y, \cdots, xf, yf, f^2$, which do not depend on $T$. The Fubini formula (9.5) is replaced with
\[ \sum_{(i, j) \in T} g(i, j) = \sum_{j_*(T) \le j \le j^*(T)} \Big( G_{i^*(T, j),\, j} - G_{i_*(T, j),\, j} \Big), \tag{9.8} \]
where
\[ G_{i, j} = \sum_{0 \le i' \le i} g(i', j), \qquad j_*(T) = \lfloor y_*(T) \rfloor, \qquad i_*(T, j) = \lfloor x_*(T, j) \rfloor, \]
and similarly $j^*(T) = \lfloor y^*(T) \rfloor$ and $i^*(T, j) = \lfloor x^*(T, j) \rfloor$. The points $(i_*(T, j), j)$, for all $j_*(T) \le j \le j^*(T)$, are illustrated on the right of Figure 9.1 and surrounded by grayed squares.
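The row-wise summation idea can be sketched in a few lines of Python. This is only an illustration under assumptions made here, not the C++ implementation mentioned in the text: the names `row_prefix_sums`, `row_span`, `triangle_sum` and `brute_force_sum` are hypothetical, the image is stored as a list of rows `g[j][i]`, and the index conventions (exclusive prefix sums, a ceiling at the left end of each row) differ slightly from those of (9.8) while computing the same sum over pixel centers contained in the triangle.

```python
import math

def row_prefix_sums(g):
    """prefix[j][i] = g[j][0] + ... + g[j][i-1] (exclusive prefix sums along each row)."""
    prefix = []
    for row in g:
        acc = [0.0]
        for value in row:
            acc.append(acc[-1] + value)
        prefix.append(acc)
    return prefix

def row_span(tri, y):
    """Interval (x_min, x_max) of the triangle at height y, or None if the row misses it."""
    xs = []
    for k in range(3):
        (x0, y0), (x1, y1) = tri[k], tri[(k + 1) % 3]
        if y0 == y1:
            if y0 == y:                      # horizontal edge lying on this row
                xs += [x0, x1]
        elif min(y0, y1) <= y <= max(y0, y1):
            xs.append(x0 + (y - y0) * (x1 - x0) / (y1 - y0))
    return (min(xs), max(xs)) if xs else None

def triangle_sum(prefix, tri):
    """Sum of g over pixel centers (i, j) inside tri, using two lookups per row."""
    n_rows, n_cols = len(prefix), len(prefix[0]) - 1
    j_lo = max(0, math.ceil(min(p[1] for p in tri)))
    j_hi = min(n_rows - 1, math.floor(max(p[1] for p in tri)))
    total = 0.0
    for j in range(j_lo, j_hi + 1):
        span = row_span(tri, j)
        if span is None:
            continue
        i_lo = max(0, math.ceil(span[0]))
        i_hi = min(n_cols - 1, math.floor(span[1]))
        if i_lo <= i_hi:
            total += prefix[j][i_hi + 1] - prefix[j][i_lo]
    return total

def brute_force_sum(g, tri):
    """Reference implementation: test every pixel center against the three edges."""
    def side(p, a, b):
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    total = 0.0
    for j, row in enumerate(g):
        for i, value in enumerate(row):
            s = [side((i, j), tri[k], tri[(k + 1) % 3]) for k in range(3)]
            if all(v >= 0 for v in s) or all(v <= 0 for v in s):
                total += value
    return total
```

The prefix sums are built once per function $g$; each triangle then costs a number of lookups proportional to its height in pixels rather than to its area, which is the saving described above. Pixel centers lying exactly on an edge are sensitive to floating point rounding in this sketch.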
On the same figure the points $(i^*(T, j), j)$, for all $j_*(T) \le j \le j^*(T)$, are surrounded by white squares.

Consider a fixed function $g$, such as $1, x, y, x^2, xy, y^2$ or $f, xf, yf, f^2$ which are needed for the evaluation of (9.7), for which the sum (9.6) needs to be evaluated for a large number of distinct triangles. The matrix $(G_{i,j})$, $0 \le i \le N$, $1 \le j \le N$, is evaluated once, and for any triangle $T$ the sum (9.6) is computed using (9.5) which involves a much smaller collection of points, see Figure 9.1. This method greatly accelerates the numerical implementation of the greedy algorithm, especially for the large and isotropic triangles that occur in its first steps.

[Figure 9.1: The cost of evaluating the approximation error of $f$ on a triangle $T$ in the $L^1$ or $L^\infty$ norm is proportional to the number of points with integer coordinates on $T$ (left). Fewer points need to be considered to compute the $L^2$ projection error (right).]

In this section, we study the algorithm when applied to a quadratic polynomial $q$ such that $\det(\mathbf{q}) > 0$. We shall assume without loss of generality that $\mathbf{q}$ is positive definite, since all our results extend in a trivial manner to the negative definite case.

We first establish that the refinement procedure, using either the $L^2$ based decision function (9.3) or the $L^\infty$ based (9.4), always selects for bisection the longest edge in the sense of the $q$-metric $|u|_q := \sqrt{\mathbf{q}(u)}$ for $u \in \mathbb{R}^2$, as was already the case for the $L^1$ based decision function (9.1). This is used to prove that the refinement procedure produces triangles which tend to adopt an optimal aspect ratio.

The $L^\infty$-based split

Let us denote by
\[ \alpha_T(f) := \|f - I_T f\|_{L^\infty(T)} \]
the interpolation error in the sup norm. The decision function (9.4) can be re-expressed as
\[ d_T(e, f) = \alpha_{T_e^1}(f) + \alpha_{T_e^2}(f). \tag{9.9} \]

Theorem 9.2.1. If $|a|_q > \max\{|b|_q, |c|_q\}$, then $d_T(a, q) < \min\{d_T(b, q), d_T(c, q)\}$.
Therefore the refinement procedure based on (9.9) selects the longest edge in the sense of the $q$-metric.

In order to prove this result, we need to study the interpolation error in detail.

Proposition 9.2.2. Let $T$ be a triangle with edges $a, b, c$ such that $|a|_q \ge |b|_q \ge |c|_q$, and let $w \in \mathbb{R}^2$ and $r > 0$ be the center and radius of the circumscribed circle for the $q$-metric, i.e. such that $|v - w|_q = r$ for all the vertices $v$ of $T$. Then
\[ \frac{|a|_q^2}{4} \le \alpha_T(q) \le r^2. \]
Right equality holds if $T$ is acute, i.e. $\langle Q b, c \rangle \le 0$, where $Q$ is the symmetric matrix associated to the quadratic form $\mathbf{q}$. Left equality holds if $T$ is obtuse, i.e. $\langle Q b, c \rangle \ge 0$.

Proof: At any point $u \in \mathbb{R}^2$, we have
\[ (q - I_T q)(u) = |u - w|_q^2 - r^2. \]
Indeed, the difference $(q - I_T q)(u) - (|u - w|_q^2 - r^2)$ is an affine function of $u$, which vanishes at the three vertices of $T$. Hence this difference is zero.

The function $|u - w|_q^2 - r^2$ is nonpositive on $T$ with maximal value $0$ at the vertices. If $T$ is acute, then its minimal value on $T$ is $-r^2$ and is attained at $w \in T$. If $T$ is not acute, then the minimum is attained at $m_a$, the midpoint of $a$, and if we choose a vertex $v$ at one end of $a$, we obtain the value at the minimum by Pythagoras' identity which gives
\[ (q - I_T q)(m_a) = |m_a - w|_q^2 - r^2 = |m_a - w|_q^2 - |v - w|_q^2 = -|v - m_a|_q^2 = -\frac{|a|_q^2}{4}. \]
⋄

The dichotomy in the above result is illustrated in the case of the euclidean metric on Figure 4. Note that it would be sufficient to establish the above proof in this particular case, since we can perform an affine coordinate change $\phi = Q^{-1/2}$ such that $\mathbf{q} \circ \phi$ is the standard euclidean form and that the $L^\infty$ interpolation error is left invariant by this coordinate change.

[Figure 4: Maximum point for the $L^\infty$ interpolation error.]

We now prove the following result which clearly implies Theorem 9.2.1.

Proposition 9.2.3.
Let $T$ be a triangle with edges $|a|_q \ge |b|_q \ge |c|_q$. We then have:
\[ d_T(b, q) - d_T(a, q) \ge \frac{1}{4} \big( \mathbf{q}(a) - \mathbf{q}(b) \big), \]
\[ d_T(c, q) - d_T(a, q) \ge \frac{1}{4} \left( \mathbf{q}(a) - \left( \frac{|b|_q + |c|_q}{2} \right)^2 \right). \]

[Figure 5: Notations in the proof of Proposition 9.2.3.]

Proof: We introduce sub-triangles $T_e^i$, $i = 1, 2$ and $e = a, b, c$, as defined in Figure 5, which correspond to the three refinement scenarios. With such definitions, the following inequalities are easily derived from Proposition 9.2.2:
\[ 4 \alpha_{T_a^1}(q) = \mathbf{q}(b) \quad (\text{since } T_a^1 \text{ is obtuse}), \]
\[ 4 \alpha_{T_b^1}(q) \ge \mathbf{q}(a), \]
\[ 4 \alpha_{T_c^1}(q) \ge \mathbf{q}(a), \]
\[ 4 \alpha_{T_c^2}(q) \ge \mathbf{q}(b). \]
On the other hand, we shall prove
\[ \alpha_{T_a^2}(q) \le \alpha_{T_b^2}(q), \tag{9.10} \]
and
\[ 4 \alpha_{T_a^2}(q) \le \left( \frac{|b|_q + |c|_q}{2} \right)^2. \tag{9.11} \]
The proof of (9.10) follows from elementary geometric observations. Let $L$ be a line which is parallel to $c$ but does not contain it, and for $x \in L$ denote by $T_x$ the triangle of vertices $x$ and the end points of $c$. Denoting respectively by $u(x)$ and $v(x)$ the diameter of $T_x$ and of its circumscribed circle for the $q$-metric, we remark that these functions decrease monotonically as $x$ tends to a point $x_c$ which is the orthogonal projection (also for the $q$-metric) of the mid-point of $c$ onto $L$. Since the function $x \mapsto \alpha_{T_x}(q)$ is continuous in $x$ and equal to $u(x)^2/4$ or $v(x)^2/4$ at all $x$, we conclude that this function also decreases monotonically as $x$ tends to $x_c$. Applying this observation to the line that contains $m_a$ and $m_b$, the mid-points of $a$ and $b$, and remarking that $m_a$ is closer to $x_c$ than $m_b$, we conclude that (9.10) holds.

From (9.10) and the first set of inequalities, we obtain the first statement of the proposition since
\[ d_T(b, q) - d_T(a, q) = \alpha_{T_b^1}(q) + \alpha_{T_b^2}(q) - \alpha_{T_a^1}(q) - \alpha_{T_a^2}(q) \ge \alpha_{T_b^1}(q) - \alpha_{T_a^1}(q) \ge \frac{1}{4} \big( \mathbf{q}(a) - \mathbf{q}(b) \big). \]
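The first inequality can be checked numerically on an example. The sketch below assumes the euclidean case $q(x, y) = x^2 + y^2$ (sufficient by the affine change of coordinates mentioned after Proposition 9.2.2) and evaluates $\alpha_T(q)$ through the closed form of that proposition; the function names `sup_interp_error` and `decision` are hypothetical and chosen here for illustration.

```python
def sq_len(p, r):
    """Squared euclidean length of the segment [p, r]."""
    return (p[0] - r[0]) ** 2 + (p[1] - r[1]) ** 2

def area(tri):
    (x0, y0), (x1, y1), (x2, y2) = tri
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2

def sup_interp_error(tri):
    """alpha_T(q) for q(x, y) = x^2 + y^2, using the dichotomy of Proposition 9.2.2."""
    s = sorted([sq_len(tri[0], tri[1]), sq_len(tri[1], tri[2]), sq_len(tri[2], tri[0])])
    if s[2] >= s[0] + s[1]:
        # right or obtuse triangle: a quarter of the squared longest edge
        return s[2] / 4
    # acute triangle: squared circumradius, R^2 = (abc)^2 / (16 area^2)
    return s[0] * s[1] * s[2] / (16 * area(tri) ** 2)

def decision(tri, k):
    """d_T(e, q) as in (9.9), where e is the edge opposite to vertex number k."""
    apex, u, v = tri[k], tri[(k + 1) % 3], tri[(k + 2) % 3]
    mid = ((u[0] + v[0]) / 2, (u[1] + v[1]) / 2)
    return sup_interp_error((apex, u, mid)) + sup_interp_error((apex, mid, v))

# Example: squared edge lengths 16, 13 and 5, opposite to vertices 0, 1 and 2.
tri = [(1.0, 2.0), (0.0, 0.0), (4.0, 0.0)]
d = [decision(tri, k) for k in range(3)]
```

On this triangle the smallest of the three values is the one attached to the longest edge, and the gap between the two largest edges satisfies the first inequality of the proposition.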
The proof of (9.11) also follows from elementary geometric observations. In the case where $T_a^2$ is obtuse, one of its edges $e$ is such that $4\alpha_{T_a^2}(q) = \mathbf q(e)$, and (9.11) follows since $|e|_{\mathbf q} \le \frac12(|b|_{\mathbf q} + |c|_{\mathbf q})$ for all $e$, using the triangle inequality. Let $R$ be an acute triangle of vertices $u, v, w$ and let $m = (u + v)/2$. Remarking that the center of the (Euclidean) circumscribed circle to $R$ lies inside $R$, one easily checks by convexity that the (Euclidean) diameter of this circle is smaller than $|m - u| + |m - w|$. In the case where $T_a^2$ is acute, the diameter of its circumscribed circle is thus bounded by $\frac12(|b|_{\mathbf q} + |c|_{\mathbf q})$, as illustrated on Figure 6 when $\mathbf q$ is the Euclidean metric.

Figure 6: The case where $T_a^2$ is acute.

From (9.11) and the first set of inequalities, we obtain the second statement of the proposition since
$$d_T(c, q) - d_T(a, q) = \alpha_{T_c^1}(q) + \alpha_{T_c^2}(q) - \alpha_{T_a^1}(q) - \alpha_{T_a^2}(q) \ge \frac14\left(\mathbf q(b) + \mathbf q(a) - \mathbf q(b) - \left(\frac{|b|_{\mathbf q} + |c|_{\mathbf q}}{2}\right)^2\right) = \frac14\left(\mathbf q(a) - \left(\frac{|b|_{\mathbf q} + |c|_{\mathbf q}}{2}\right)^2\right). \qquad \diamond$$

The $L^2$-based split

We now denote by
$$\beta_T(f) := \|f - P_T f\|_{L^2(T)}$$
the orthogonal projection error in the $L^2$ norm. The decision function (9.3) now writes
$$d_T(e, f) = \beta_{T_e^1}(f)^2 + \beta_{T_e^2}(f)^2. \qquad (9.12)$$
We shall prove that the refinement procedure based on (9.12) behaves in a similar way as (9.9).

Theorem 9.2.4. If $d, e \in \{a, b, c\}$ are two edges such that $|d|_{\mathbf q} < |e|_{\mathbf q}$, then $d_T(e, q) < d_T(d, q)$.

[...] Let $T$ be a triangle with edges $a, b, c$ and area $|T|$, and let $q$ be a quadratic function. Then
$$\beta_T(q)^2 = |T| \left( c_1 \big(\mathbf q(a) + \mathbf q(b) + \mathbf q(c)\big)^2 - c_2 \det(\mathbf q)\, |T|^2 \right), \qquad (9.13)$$
with constants $c_1 = \frac{1}{1200}$ and $c_2 = \frac{64}{3} c_1 = \frac{4}{225}$.

Proof: We first prove (9.13) on the triangle $R$ of vertices $\{(0,0), (0,1), (1,0)\}$. It is easy to compute the integrals on $R$ of monomials $x^k y^l$, $k + l \le 4$.
Using these quantities, we can derive the orthogonal projection of a quadratic function thanks to a formal computing program, which gives us
$$q = ux^2 + vy^2 + 2wxy \ \Rightarrow\ P_R q = -\frac{u + v + w}{10} + \frac{2x}{5}(2u + w) + \frac{2y}{5}(2v + w).$$
This yields the following expression for the $L^2$-squared error between $q$ and its projection:
$$\int_R (q - P_R q)^2 = \frac{1}{300}\left(\frac{u^2}{2} + \frac{uv}{3} + \frac{v^2}{2} - uw - vw + \frac{7w^2}{6}\right),$$
which is equivalent to (9.13).

For an arbitrary triangle $T$, using an affine bijective transformation $\phi$ from $R$ to $T$, we have
$$\beta_T(q)^2 = J_\phi\, \beta_R(\tilde q)^2,$$
where $\tilde q = q \circ \phi$ and $J_\phi$ is the constant jacobian of $\phi$. Using the validity of (9.13) on $R$ and the fact that $|T| = J_\phi |R|$, we thus obtain
$$\beta_T(q)^2 = |T| \left( c_1 \big(\tilde{\mathbf q}(\tilde a) + \tilde{\mathbf q}(\tilde b) + \tilde{\mathbf q}(\tilde c)\big)^2 - c_2 \det(\tilde{\mathbf q})\, |R|^2 \right),$$
where $\tilde{\mathbf q}$ is the quadratic form associated to $\tilde q$ and $\tilde e$ denotes the edge segment of $R$ mapped onto $e$ by $\phi$. Since $\tilde{\mathbf q}(\tilde e) = \mathbf q(e)$ and $\det(\tilde{\mathbf q}) = J_\phi^2 \det(\mathbf q)$, we obtain (9.13) for $T$. $\diamond$

We now prove the following result which clearly implies Theorem 9.2.4.

Corollary 9.2.6. Let $T$ be a triangle with edges $a, b, c$ and area $|T|$, with $|a|_{\mathbf q} \ge |b|_{\mathbf q}$ and $|a|_{\mathbf q} \ge |c|_{\mathbf q}$. Then
$$d_T(b, q) - d_T(a, q) \ge \frac{5c_1}{4}\, |T| \left( \mathbf q(a)^2 - \mathbf q(b)^2 \right), \qquad (9.14)$$
$$d_T(c, q) - d_T(a, q) \ge \frac{5c_1}{4}\, |T| \left( \mathbf q(a)^2 - \mathbf q(c)^2 \right). \qquad (9.15)$$

Proof: The children triangles all have area $|T|/2$, and take their edges among $a, b, c$, $a/2, b/2, c/2$, $\frac{a - b}{2}, \frac{b - c}{2}, \frac{c - a}{2}$ (recall that $a + b + c = 0$). We use the identity
$$\mathbf q(u + v) + \mathbf q(u - v) = 2\mathbf q(u) + 2\mathbf q(v),$$
which is valid for all quadratic forms, and implies
$$\mathbf q\left(\frac{b - c}{2}\right) = \frac{\mathbf q(b) + \mathbf q(c)}{2} - \frac{\mathbf q(a)}{4}.$$
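The computation on the reference triangle can be replayed in exact rational arithmetic. The sketch below is not the formal computing program mentioned in the proof: it is a minimal check, with hypothetical helper names, that uses the monomial integrals $\int_R x^k y^l = k!\,l!/(k+l+2)!$ on the triangle of vertices $(0,0), (1,0), (0,1)$, verifies that the stated $P_R q$ is indeed the orthogonal projection, and compares the squared error with the closed form (9.13), assuming the values $c_1 = 1/1200$ and $c_2 = 4/225$.

```python
from fractions import Fraction as F
from math import factorial


def monomial_integral(k, l):
    """Exact integral of x^k y^l over the triangle (0,0), (1,0), (0,1)."""
    return F(factorial(k) * factorial(l), factorial(k + l + 2))


def integrate(poly):
    """Integrate a polynomial stored as {(k, l): coefficient}."""
    return sum((c * monomial_integral(k, l) for (k, l), c in poly.items()), F(0))


def multiply(p, r):
    """Product of two polynomials stored as {(k, l): coefficient}."""
    out = {}
    for (k1, l1), c1 in p.items():
        for (k2, l2), c2 in r.items():
            key = (k1 + k2, l1 + l2)
            out[key] = out.get(key, F(0)) + c1 * c2
    return out


def residual(u, v, w):
    """q - P_R q for q = u x^2 + v y^2 + 2 w x y, with P_R q as stated in the proof."""
    u, v, w = F(u), F(v), F(w)
    return {
        (2, 0): u, (0, 2): v, (1, 1): 2 * w,
        (0, 0): (u + v + w) / 10,
        (1, 0): -F(2, 5) * (2 * u + w),
        (0, 1): -F(2, 5) * (2 * v + w),
    }


def error_sq_direct(u, v, w):
    """Squared L2 projection error, integrated exactly."""
    r = residual(u, v, w)
    return integrate(multiply(r, r))


def error_sq_closed_form(u, v, w):
    """Right-hand side of (9.13) on the reference triangle (assumed constants)."""
    u, v, w = F(u), F(v), F(w)
    area = F(1, 2)
    edge_sum = u + v + (u + v - 2 * w)  # form evaluated on edges (1,0), (0,1), (1,-1)
    det = u * v - w * w
    c1, c2 = F(1, 1200), F(4, 225)
    return area * (c1 * edge_sum ** 2 - c2 * det * area ** 2)


def is_orthogonal_to_affine(u, v, w):
    """Check that the residual is orthogonal to 1, x and y."""
    r = residual(u, v, w)
    tests = [{(0, 0): F(1)}, {(1, 0): F(1)}, {(0, 1): F(1)}]
    return all(integrate(multiply(r, t)) == 0 for t in tests)
```

Since both sides are quadratic in $(u, v, w)$, agreement on a handful of integer triples that span the six monomials is enough to confirm the identity on $R$.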
Using (9.13), this allows us to compute the local projection errors for the children of $T$. For example, bisecting the edge $a$ creates two children $T'$ and $T''$ with edges $\frac a2, b, \frac{c - b}{2}$ and $\frac a2, c, \frac{b - c}{2}$, and therefore
$$\beta_{T'}(q)^2 = |T'| \left( c_1 \left( \mathbf q\left(\frac a2\right) + \mathbf q(b) + \mathbf q\left(\frac{c - b}{2}\right) \right)^2 - c_2 \det(\mathbf q)\, |T'|^2 \right) = \frac{|T|}{2} \left( c_1 \left( \frac{3\mathbf q(b) + \mathbf q(c)}{2} \right)^2 - c_2 \det(\mathbf q) \frac{|T|^2}{4} \right),$$
and similarly
$$\beta_{T''}(q)^2 = \frac{|T|}{2} \left( c_1 \left( \frac{3\mathbf q(c) + \mathbf q(b)}{2} \right)^2 - c_2 \det(\mathbf q) \frac{|T|^2}{4} \right).$$
Adding up, we thus obtain
$$d_T(a, q) = \frac{|T|}{8} \left( c_1 \big(3\mathbf q(b) + \mathbf q(c)\big)^2 + c_1 \big(\mathbf q(b) + 3\mathbf q(c)\big)^2 - 2 c_2 \det(\mathbf q)\, |T|^2 \right).$$
Subtracting this from the analogous expression for $d_T(b, q)$ we obtain
$$d_T(b, q) - d_T(a, q) = \frac{c_1 |T|}{4} \left( 5\big(\mathbf q(a)^2 - \mathbf q(b)^2\big) + 6\, \mathbf q(c) \big(\mathbf q(a) - \mathbf q(b)\big) \right), \qquad (9.16)$$
which implies (9.14). Exchanging $b$ with $c$ we obtain (9.15), which concludes the proof. $\diamond$

Convergence towards the optimal aspect ratio

We have established in Theorems 9.2.1 and 9.2.4 that for any triangle $T$ and any quadratic function $q \in \mathbb{P}_2$ such that the homogeneous component $\mathbf q \in \mathbb{H}_2$ is positive definite, the decision functions (9.3) and (9.4) based on the $L^2$ or $L^\infty$ error lead to the bisection of the same edge of $T$: the longest edge in the sense of the $\mathbf q$-metric. Similarly, it is established in Lemma 8.3.1 in the previous chapter that the decision function (9.1) based on the $L^1$ interpolation error leads to the bisection of the same edge. As a result, the decision functions can be interchanged without changing the result of Corollary 8.3.10, which extends as follows.

Theorem 9.2.7. Let $\Omega$ be a triangle, and let $q$ be a quadratic function with positive definite associated quadratic form $\mathbf q$.
Let q N be the approximant of q on Ω obtained by thegreedy algorithm for the L p metric, using the L , L or L ∞ decision function (9.1), (9.3),(9.4). Then lim sup N →∞ N (cid:107) q − q N (cid:107) L p (Ω) ≤ C (cid:107) (cid:112) det( q ) (cid:107) L τ (Ω) , (9.17) where τ = p + 1 and where the constant C depends only on on the choice of the approxi-mation operator A T used in the definition of the approximant. The local perturbation analysis exposed in the previous chapter, § L based decision function extends without difficulties to the L and L ∞ based decisionfunctions. From this point, it is not difficult to extend Theorem 9.2.7 to functions f whichare close enough to a positive definite quadratic function q , in the sense that d q ≤ d f ≤ (1 + µ ) d q (9.18)for a constant µ > C and strictly convex function f defined on a polygonal domainΩ. In order to establish an asymptotical error estimate, we need to be ensured thatafter sufficiently many steps of the greedy algorithm, the target function f can be wellapproximated by a quadratic function q = q T on each triangle T , in the sense of (9.18),so that our local results will apply on such triangles. This property follows from the nextproposition, which is an extension of Proposition 8.4.10 in which the L based decisionfunction (9.1) is replaced with the L ∞ based decision function. We postpone its proof tothe appendix § Proposition 9.2.8. Let f be a C function such that d f ( x ) ≥ mI for some arbitrarybut fixed m > independent of x . Let T N be the triangulation generated by the greedyalgorithm applied to f using the L ∞ decision function given by (9.4). Then lim N → + ∞ max T ∈T N diam( T ) = 0 , i.e. the diameter of all triangles tends to . As a consequence, we may extend Theorem 8.4.1 to the L ∞ based decision function.96 Chapter 9. Variants of the greedy bisection algorithm Theorem 9.2.9. 
Let Ω be a polygonal domain and let f ∈ C (Ω) be such that d f ( x ) ≥ mI for all x ∈ Ω , for some arbitrary but fixed m > independent of x . Let f N be theapproximant obtained by the greedy algorithm for the L p metric, using the L decisionfunction (9.1) or the L ∞ decision function (9.4). Then lim sup N →∞ N (cid:107) f − f N (cid:107) L p ≤ C (cid:107) (cid:112) det( d f ) (cid:107) L τ , where τ = p + 1 and where C is a constant independent of p , f and m . In this section, we study the algorithm when applied to a quadratic polynomial q suchthat det( q ) < 0. We shall follow the same steps, and reach similar conclusions, as in thepositive definite case, using a measure of non-degeneracy which is equivalent to ρ q ( T ). Ifa triangle T has edges a, b, c such that | q ( a ) | ≥ | q ( b ) | ≥ | q ( c ) | , we will still refer to a asthe “longest” edge in the sense of q , although q does not define a proper metric anymore.Recall that ρ q ( T ) := | q ( a ) || T | (cid:112) | det q | . The following inequalities that will be repeatedly used in this section can be derivedwhen ρ q ( T ) is large enough. We postpone their proof to the appendix. Proposition 9.2.10. Let T be a triangle such that | q ( a ) | ≥ | q ( b ) | ≥ | q ( c ) | , and define d = b − c .If ρ q ( T ) ≥ , then q ( a ) q ( b ) ≥ , | q ( a ) | ≥ | q ( c ) | and | q ( a ) | ≥ | q ( d ) | . (9.19) If ρ q ( T ) ≥ , then | q ( a ) | ≤ | q ( b ) + q ( c ) | . (9.20) The L ∞ -based splitTheorem 9.2.11. The refinement procedure based on (9.9) selects the longest edge in thesense of q : if | q ( a ) | > max {| q ( b ) | , | q ( c ) |} and ρ q ( T ) ≥ , then d T ( a, q ) < min { d T ( b, q ) , d T ( c, q ) } . This theorem is very similar to the one for positive quadratic functions. In order toprove it, we first study the interpolation error which has a simple form in this context. Proposition 9.2.12. Let T be a triangle with edges a, b, c . Then α T ( q ) = 14 max {| q ( a ) | , | q ( b ) | , | q ( c ) |} . .2. 
Alternative decision functions Proof: Let x be the point of T at which the interpolation error is attained : x =argmax T | q − I T q | . If x is in the interior of T , then it must be a local extremum of q − I T q . However this function has only one critical point on R , which is not an extre-mum since q has mixed signature. Therefore x must lie on an edge. On each edge of T ,the function q − I T q is a one dimensional quadratic function vanishing at the endpoints.It follows that x must lie in the middle of an edge and the result follows. (cid:5) Proposition 9.2.13. Let T be a triangle with edges | q ( a ) | ≥ | q ( b ) | ≥ | q ( c ) | and suchthat ρ q ( T ) ≥ . Then d T ( b, q ) − d T ( a, q ) ≥ | q ( a ) | − | q ( b ) | ,d T ( c, q ) − d T ( a, q ) ≥ | q ( a ) | . Proof: The bisection through the edge a creates two sub-triangles T a , T a of edges respec-tively a , b, d and a , c, d . Using the last two inequalities in (9.19) we obtain that 4 α T a =max {| q ( a/ | , | q ( b ) |} and 4 α T a = | q ( a/ | . Therefore4 d T ( a, q ) = | q ( a ) | (cid:26) | q ( a ) | , | q ( b ) | (cid:27) . On the other hand, the choice of bisecting the edge b creates two subtriangles respectivelycontaining the edges a and b , and the choice of bisecting the edge c creates two subtrianglesrespectively containing the edges a and b . This provides us with the lower bounds4 d T ( b, q ) ≥ | q ( a ) | + | q ( b ) | , d T ( c, q ) ≥ | q ( a ) | + | q ( b ) | . The proposition follows easily, distinguishing between the two cases | q ( a ) | ≤ | q ( b ) | and | q ( a ) | ≥ | q ( b ) | . (cid:5) The L -based split The same conclusions can be reached for the refinement procedure based on (9.12). Theorem 9.2.14. The refinement procedure based on (9.12) selects the longest edgein the sense of q : if | q ( a ) | > max {| q ( b ) | , | q ( c ) |} and ρ q ( T ) ≥ , then d T ( a, q ) < min { d T ( b, q ) , d T ( c, q ) } . Proof: The expression found in (9.16) remains valid when det( q ) < 0. 
Substituting a by b or c and subtracting, we obtain d T ( b, q ) − d T ( a, q ) = 5 c | T | ( q ( a ) − q ( b )) (cid:18) s + q ( b )5 (cid:19) ,d T ( c, q ) − d T ( a, q ) = 5 c | T | ( q ( a ) − q ( c )) (cid:18) s + q ( c )5 (cid:19) , Chapter 9. Variants of the greedy bisection algorithm where s = q ( a ) + q ( b ) + q ( c ). Using (9.19), we see that the quantities s + q ( b )5 , s + q ( c )5 , q ( a ) − q ( b ) and q ( a ) − q ( c )all have the same sign as q ( a ) and are non-zero. It follows that d T ( a, q ) < min { d T ( b, q ) , d T ( c, q ) } which concludes the proof. (cid:5) Convergence toward the optimal aspect ratio. We have proved that the refinement procedure - either based on the L ∞ or L decisionfunction - systematically picks the longest edge in the sense of q if ρ q ( T ) ≥ 4. Similarlyto the positive definite case, we now study the iteration of several refinement steps andshow that the generated triangles tend to adopt an optimal “aspect ratio” in the senseof the measure of non-degeneracy ρ q ( T ) introduced in § T is a triangle with edges a, b, c , let us recall that ρ q ( T ) := max {| q ( a ) | , | q ( b ) | , | q ( c ) |}| T | (cid:112) | det q | . As in § ρ q ( T ). If T is a triangle with edges a, b, c ,we define σ q ( T ) := min {| q ( a ) + q ( b ) | , | q ( b ) + q ( c ) | , | q ( c ) + q ( a ) |} | T | (cid:112) | det q | . (9.21)Note that if q was a positive quadratic form, this definition is consistent with (8.22). Wedefine our measure of non-degeneracy κ q by κ q ( T ) := max { σ q ( T ) , / } . (9.22)We first show that the quantities κ q and ρ q are equivalent. Proposition 9.2.15. For any triangle T , one has σ q ( T ) ≤ ρ q ( T ) , (9.23) and κ q ( T ) ≤ ρ q ( T ) ≤ κ q ( T ) . 
(9.24) Proof: We denote by a, b, c the edges of T , and we assume that | q ( a ) | ≥ | q ( b ) | ≥ | q ( c ) | .The inequality (9.23) follows directly from the triangle inequality :2 | T | (cid:112) | det q | σ q ( T ) ≤ | q ( b ) + q ( c ) | ≤ | q ( a ) | ≤ | T | (cid:112) | det q | ρ q ( T ) . .2. Alternative decision functions ρ q ( T ) is always larger than 2 and therefore (9.23) implies the leftinequality in (9.24).It remains to prove the right inequality in (9.24). If ρ q ( T ) ≤ 8, it is immediate since κ q ( T ) ≥ and 323 52 ≥ 8. If ρ q ( T ) ≥ q ( a ) q ( b ) ≥ | q ( b ) + q ( c ) | ≤ | q ( a ) + q ( c ) | ≤ | q ( a ) + q ( b ) | . We obtain using (9.20) that ρ q ( T ) = | q ( a ) || T | (cid:112) | det q | ≤ | q ( b ) + q ( c ) || T | (cid:112) | det q | = 323 σ q ( T ) ≤ κ q ( T ) , which concludes the proof. (cid:5) Similar to ρ q , the quantity κ q is invariant by a linear coordinate changes φ , in thesense that κ q ◦ φ ( T ) = κ q ( φ ( T )) . Our next result shows that κ q ( T ) is always reduced by the refinement procedure. Proposition 9.2.16. If T is a triangle with children T and T obtained by the refinementprocedure for the quadratic function q , using the L based or L ∞ based decision function,then max { κ q ( T ) , κ q ( T ) } ≤ κ q ( T ) . Proof: Let us assume that a is the longest edge in the sense of q . In the case where ρ q ( T ) ≥ 4, we already noticed in the proof of Proposition 9.2.15 that σ q ( T ) = | q ( b ) + q ( c ) | | T | (cid:112) | det q | . Moreover, according Theorems 9.2.11 and 9.2.14 the edge a is selected by both decisionfunctions. It follows that the children T i have edges a/ , b, ( c − b ) / a/ , ( b − c ) / , c (recall that a + b + c = 0). 
We thus have2 | T | (cid:112) | det q | σ q ( T i ) ≤ (cid:12)(cid:12)(cid:12)(cid:12) q (cid:16) a (cid:17) + q (cid:18) b − c (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12) q (cid:18) b + c (cid:19) + q (cid:18) b − c (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) = | q ( b ) + q ( c ) | 2= 2 | T | (cid:112) | det q | σ q ( T ) . We have proved that σ q ( T i ) ≤ σ q ( T ), and it readily follows that κ q ( T i ) ≤ κ q ( T ).In the case where ρ q ( T ) ≤ 4, we remark that T i contains at least one edge from T , say s ∈ { a, b, c } and one half-edge t ∈ { a , b , c } . This provides an upper bound for σ q : σ q ( T i ) ≤ | q ( s ) + q ( t ) | | T | (cid:112) | det q | ≤ | q ( a ) | + | q ( a ) | | T | (cid:112) | det q | = 58 ρ q ( T ) ≤ . (9.25)00 Chapter 9. Variants of the greedy bisection algorithm Therefore κ q ( T i ) = ≤ κ q ( T ). (cid:5) Our next objective is to show that as we iterate the refinement process, the value of κ q ( T ) becomes bounded independently of q for almost all generated triangles. If T is atriangle such that | q ( a ) | ≥ | q ( b ) | ≥ | q ( c ) | and if the edge a is cut (which is the case assoon as ρ q ( T ) ≥ 4, using the L or L ∞ based decision functions) we define ψ q ( T ) as thesubtriangle containing the edge c . We first prove a result which is analogous to Proposition8.3.7. Proposition 9.2.17. If T is a triangle such that κ q ( ψ q ( T )) > , then κ q ( ψ q ( T )) ≤ κ q ( T ) . Proof: Let S be a triangle such that κ q ( ψ q ( S )) > . According to (9.25), one must have ρ q ( S ) > 4. We assume that the edges of S satisfy | q ( a ) | ≥ | q ( b ) | ≥ | q ( c ) | . Since the threeedges of ψ q ( S ) are a , c and d = b − c , it follows from (9.19) that the longest edge of ψ q ( S )in the sense of q is a .Since κ q ( ψ q ( T )) ≥ κ q ( ψ q ( T )) ≥ κ q ( ψ q ( T )) > , we can apply this observation to the triangles T , ψ q ( T ) and ψ q ( T ). 
Therefore, denoting by a the longest edge of T in the sense of q , we find that a is the longest edge of ψ q ( T ). Since | ψ q ( T ) | = | T | / 8, we obtain that ρ q ( ψ q ( T )) = ρ q ( T ) / 8. Using the results of Proposition9.2.15, we thus have κ q ( ψ q ( T )) = σ q ( ψ q ( T )) ≤ ρ q ( ψ q ( T )) = 116 ρ q ( T ) ≤ κ q ( T ) , which concludes the proof. (cid:5) An immediate consequence of Propositions 9.2.16 and 9.2.17 is the following. Corollary 9.2.18. If ( T i ) i =1 are the 8 children obtained from 3 successive refinementprocedures from T for the function q , then– for all i, κ q ( T i ) ≤ κ q ( T ) ,– there exists i such that , κ q ( T i ) ≤ κ q ( T ) or κ q ( T i ) = . We eventually obtain the two following results which proof are exactly similar to the onesof Theorem 8.3.9 and Corollary 8.3.10. Theorem 9.2.19. Let T be a triangle, and let q a quadratic function of mixed type. Let k = ln(2 κ q ( T ) / − ln 2 . Then after n applications of the refinement procedure starting from T , atmost Cn k n/ of the n generated triangles are such that κ q ( S ) > where C is an absoluteconstant. Therefore the proportion of such triangles tends exponentially to as n → + ∞ . .3. Alternative bisection choices, and the approximation of cartoon functions Theorem 9.2.20. Let Ω be a triangle, and let q be a quadratic function such that det q < . Let q N be the approximant of q on Ω obtained by the greedy algorithm for the L p metric,using the L or L ∞ decision function (9.3), (9.4). Then lim sup N →∞ N (cid:107) q − q N (cid:107) L p (Ω) ≤ C (cid:107) (cid:112) det( q ) (cid:107) L τ (Ω) , where τ = p + 1 and where the constant C depends only on on the choice of the approxi-mation operator A T used in the definition of the approximant. A popular, although simplistic, model for images is the set of cartoon functions, seeDefinition 4.3.1. 
In summary, a function $f$ defined on a domain $\Omega$ is a cartoon function if $f$ is $C^2$ except along a finite collection of $C^2$ curves where $f$ may have discontinuities. For any cartoon function $f$ on the bi-dimensional domain $\Omega = [0, 1]^2$, the approximations $(f_N)_{N \ge 0}$ obtained by retaining the $N$ largest coefficients in the wavelet expansion of $f$ satisfy, as observed in (4.3),
$$\|f - f_N\|_{L^2(\Omega)} \le C N^{-1/2}. \qquad (9.26)$$
One of the purposes of anisotropic mesh adaptation is to improve on this estimate, and we therefore focus in this section on the $L^2$ approximation error. For any triangle $T$ and any function $f \in L^2(T)$, we define
$$e_T(f) := \inf_{\pi \in \mathbb{P}_1} \|f - \pi\|_{L^2(T)} = \|f - P_T(f)\|_{L^2(T)}, \qquad (9.27)$$
where $P_T$ is the operator of $L^2(T)$ orthogonal projection onto the space $\mathbb{P}_1$ of affine functions. For any triangulation $\mathcal T$ of a polygonal domain $\Omega$, and any $f \in L^2(\Omega)$, we denote by $e_{\mathcal T}(f)$ the error of approximation of $f$ in the $L^2(\Omega)$ norm by discontinuous piecewise affine functions on $\mathcal T$. This quantity is defined by
$$e_{\mathcal T}(f)^2 := \sum_{T \in \mathcal T} e_T(f)^2 = \|f - P_{\mathcal T}(f)\|_{L^2(\Omega)}^2, \qquad (9.28)$$
where $P_{\mathcal T}$ is the $L^2(\Omega)$ orthogonal projection onto the space of discontinuous piecewise affine functions on the triangulation $\mathcal T$. The heuristic analysis carried out in (4.15) suggests that, if $f$ is a cartoon function on a polygonal domain $\Omega$, then there exists a sequence $(\mathcal T_N)_{N \ge N_0}$ of anisotropic triangulations of $\Omega$, satisfying $\#(\mathcal T_N) \le N$, and such that
$$e_{\mathcal T_N}(f) \le C N^{-1}. \qquad (9.29)$$
We make in this section an attempt, not totally conclusive, to answer the following question: is it possible to produce a sequence $(\mathcal T_N)_{N \ge N_0}$ of triangulations satisfying (9.29) using a hierarchical refinement algorithm?

Figure 9.2: (left) Three bisection choices, (right) six bisection choices.

The analysis presented below shows that, unfortunately, the refinement algorithm studied in the previous chapters does not satisfy (9.29).
This failure is inherently due to the limited geometric selectivity of the algorithmdue to the choice of bisections. For this reason, we study other types of bisections in § § T which maximizes the approximation error is selected within the par-tition T . T = argmax T (cid:48) ∈T e T (cid:48) ( f ) . 2. The refinement procedure considers a finite set S T of segments s along which theelement T may be split into two, thus creating two children elements T s and T s .The segment s ∈ S T which minimizes a given decision function d T ( s, f ) is selectedand we define T (cid:48) := T − { T } + { T s , T s } . Two possible choices for the set S T , denoted by S T and S T , are illustrated on Figure9.2. The set S T contains three segments, which join one of the three vertices of T to themidpoint of the opposite edge. It corresponds to the three bisection choices studied inthe previous chapters. The set S T contains a larger choice, namely six segments joining avertex v of T to the barycenter v + v of the other vertices v and v . Other bisectionchoices, that lead to non-triangular partitions, are displayed on Figure 9.9.We typically consider the L based decision function : if an element T is bisected alonga segment s , creating two smaller elements T s and T s , then d T ( s, f ) := e T s ( f ) + e T s ( f ) . (9.30)Starting from a triangulation T N of the polygonal domain Ω on which the function f needs to be approximated, with T N ) = N , the greedy algorithm creates step after stepa sequence T N +1 , T N +2 , · · · of triangulations of Ω satisfying T N ) = N . 
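The two steps of the greedy procedure described above can be sketched as follows. This is only an illustrative Python sketch under explicit assumptions: the local error $e_T(f)^2$ is replaced by a discrete least-squares surrogate on a barycentric sample (the hypothetical `local_error_sq`), the candidate segments are the three medians, and the decision function is taken to be the sum of the squared errors on the two children, in the spirit of (9.30). None of the function names come from the thesis.

```python
import heapq


def area(T):
    (x0, y0), (x1, y1), (x2, y2) = T
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2


def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def solve3(G, b):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(G)
    sol = []
    for j in range(3):
        M = [row[:] for row in G]
        for i in range(3):
            M[i][j] = b[i]
        sol.append(det3(M) / d)
    return sol


def local_error_sq(f, T, n=12):
    """Discrete surrogate of e_T(f)^2: least-squares affine fit on interior samples."""
    G = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    ff, count = 0.0, 0
    for i in range(n):
        for j in range(n - i):
            lam = ((i + 1 / 3) / n, (j + 1 / 3) / n)
            lam = (lam[0], lam[1], 1 - lam[0] - lam[1])  # barycentric coordinates
            x = sum(l * v[0] for l, v in zip(lam, T))
            y = sum(l * v[1] for l, v in zip(lam, T))
            val = f(x, y)
            ff += val * val
            count += 1
            for r in range(3):
                b[r] += lam[r] * val
                for s in range(3):
                    G[r][s] += lam[r] * lam[s]
    coeffs = solve3(G, b)
    rss = max(0.0, ff - sum(c * bi for c, bi in zip(coeffs, b)))
    return area(T) * rss / count


def median_splits(T):
    """Three bisection choices: a vertex joined to the midpoint of the opposite edge."""
    out = []
    for k in range(3):
        v, p, r = T[k], T[(k + 1) % 3], T[(k + 2) % 3]
        m = ((p[0] + r[0]) / 2, (p[1] + r[1]) / 2)
        out.append(((v, p, m), (v, m, r)))
    return out


def greedy(f, triangles, steps):
    """Step 1: pick the triangle of largest error. Step 2: split it along the best segment."""
    heap = [(-local_error_sq(f, T), k, T) for k, T in enumerate(triangles)]
    heapq.heapify(heap)
    counter = len(heap)  # unique tie-breaker, so triangles are never compared
    for _ in range(steps):
        _, _, T = heapq.heappop(heap)
        best = min(median_splits(T),
                   key=lambda ch: sum(local_error_sq(f, c) for c in ch))
        for child in best:
            heapq.heappush(heap, (-local_error_sq(f, child), counter, child))
            counter += 1
    return [T for _, _, T in heap]
```

Each step replaces one triangle by two children, so starting from $N_0$ triangles the mesh has $N_0 + k$ triangles after $k$ steps. Other sets of segments can be tried by swapping `median_splits` for another candidate generator.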
At the time of writing, unfortunately, the author does not know how to choose the set of segments $S_T$ and the decision function $d_T(s, f)$ such that, given any cartoon function $f$, the sequence $(\mathcal T_N)_{N \ge N_0}$ of partitions produced by the greedy algorithm satisfies the desired estimate (9.29).

As an intermediate objective, we choose to study the behavior of the refinement procedure when $f$ is a particularly simple cartoon function, the characteristic function $f = \chi_P$, where $P$ is a half plane. Up to translation and rotation, we may always assume that $P$ is of the form
$$P := \{(x, y) \in \mathbb{R}^2 \,;\ y \ge 0\}.$$
We denote by $D$ the line
$$D = \partial P = \{(x, 0) \,;\ x \in \mathbb{R}\}, \qquad (9.31)$$
and we observe that, denoting by $\mathring{T}$ the interior of a triangle $T$, one has
$$e_T(\chi_P) = 0 \ \text{ if and only if } \ \mathring{T} \cap D = \emptyset. \qquad (9.32)$$
We shall first prove that, for certain configurations between the line $D$ and the initial triangulation $\mathcal T_{N_0}$, the refinement procedure based on the three bisection choices produces a sequence of triangulations $(\mathcal T_N)_{N \ge N_0}$ that does not satisfy the optimal convergence estimate
$$e_{\mathcal T_N}(\chi_P) \le C N^{-1}. \qquad (9.33)$$
We then show that if the six bisection choices illustrated on Figure 9.2 are allowed, and if an appropriate decision function is used, then (9.33) is satisfied. Finally, we consider another variant of the refinement procedure, which only involves three bisection choices at each step, and for which (9.33) is again satisfied.

We show in this section that the greedy algorithm studied in the previous chapters beats the error estimate (9.26) associated to wavelet expansions when $f = \chi_P$. However, it fails to achieve the desired error estimate (9.33).
More precisely, Proposition 9.3.4 shows that, if the decision function $d_T(s, f)$ satisfies Assumptions 9.3.1, then the greedy refinement procedure for the function $f = \chi_P$, starting from an arbitrary triangulation $\mathcal T_{N_0}$, $\#(\mathcal T_{N_0}) = N_0$, produces a sequence $(\mathcal T_N)_{N \ge N_0}$ of triangulations, $\#(\mathcal T_N) = N$, such that
$$e_{\mathcal T_N}(\chi_P) \le C N^{-\lambda/2}, \qquad (9.34)$$
where $\lambda/2 = 0.73\cdots$. This estimate cannot be improved, since we establish in Proposition 9.3.2 that there exists a triangle $T_{\rm ref}$ such that the sequence $(\mathcal T_N)_{N \ge 1}$ produced by the greedy refinement procedure for the function $f = \chi_P$ starting from $\mathcal T_1 = \{T_{\rm ref}\}$ satisfies
$$e_{\mathcal T_N}(\chi_P) \ge c N^{-\lambda/2}, \qquad (9.35)$$
where $c > 0$.

We say that two triangles $T, T'$ are $D$-affine equivalent if there exists an affine change of coordinates $\phi : \mathbb{R}^2 \to \mathbb{R}^2$ that transforms $T$ into $T'$ and leaves the line $D$ defined at (9.31) invariant:
$$\phi(T) = T' \ \text{ and } \ \phi(D) = D.$$
If $T$ and $T'$ are $D$-affine equivalent, then it follows from a change of variables that
$$|T|^{-1} e_T(\chi_P)^2 = |T'|^{-1} e_{T'}(\chi_P)^2. \qquad (9.36)$$
The following assumptions on the decision function are illustrated on Figure 9.3. Their purpose is to create, through the refinement process, a thin layer of triangles along the line $D$.

Figure 9.3: The line $D$ (thick), forbidden bisection choices (dashed), authorized bisection choices (full).

Assumptions 9.3.1. Let $T$ be a triangle such that $e_T(\chi_P) \ne 0$. In the following we work under the assumption that the segment $s \in S_T$ which minimizes the decision function $d_T(s, \chi_P)$ satisfies the following properties.
a) If there is such a possibility in $S_T$, the bisection of $T$ along $s$ creates a child $T'$ such that $e_{T'}(\chi_P) = 0$.
b) If a) is impossible, then the decision function selects a segment $s \in S_T$ joining a vertex of the triangle $T$ to the midpoint of an edge which is intersected by the line $D$.

These two properties are illustrated in Figure 9.3.
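The criterion (9.32) and the two rules of Assumptions 9.3.1 are purely geometric, so they can be expressed without computing any projection error. The sketch below is a hypothetical illustration (the names `crosses_D` and `authorized_medians` are not from the thesis): it lists, for a triangle given by its vertices, which of the three medians are allowed by rule a), or by rule b) when a) is impossible. It does not try to decide between several authorized segments, which the assumptions leave to the decision function.

```python
def crosses_D(T):
    """True when the open triangle T meets D = {y = 0}, i.e. e_T(chi_P) != 0 by (9.32)."""
    ys = [v[1] for v in T]
    return min(ys) < 0 < max(ys)


def medians(T):
    """Yield (vertex index, opposite edge, children) for the three median bisections."""
    for k in range(3):
        v, p, r = T[k], T[(k + 1) % 3], T[(k + 2) % 3]
        m = ((p[0] + r[0]) / 2, (p[1] + r[1]) / 2)
        yield k, (p, r), ((v, p, m), (v, m, r))


def authorized_medians(T):
    """Vertex indices of the medians allowed by Assumptions 9.3.1 (sketch)."""
    candidates = list(medians(T))
    # Rule a): some child does not cross D, hence carries zero error.
    rule_a = [k for k, _, children in candidates
              if not all(crosses_D(c) for c in children)]
    if rule_a:
        return rule_a
    # Rule b): the median ends at the midpoint of an edge intersected by D.
    return [k for k, (p, r), _ in candidates
            if min(p[1], r[1]) <= 0 <= max(p[1], r[1])]
```

For example, for the triangle of vertices $(0,-1), (0,1), (2,0)$, the median issued from $(2,0)$ lies on $D$ and produces two children with zero error, so rule a) selects it.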
Numerical experiments stronglysuggest that they are satisfied for the decision function based on the L error which isdefined by (9.30).We first prove the lower error bound (9.35), and for that purpose we denote by T ref the triangle of vertices (0 , − , (3 , − , (0 , .T ref is the large triangle enclosing the others on Figure 9.4. The refinement procedureapplied to the triangle T ref and the function f = χ P creates an infinite master tree P ref oftriangles. If P is a finite subtree of P ref , (in other words P contains T ref and the nodes of P have either two or no children), then the leaves of P define a triangulation T of T ref . Proposition 9.3.2. There exists a constant c > such that : for any triangulation T associated to a finite subtree of P ref , we have e T ( χ P ) ≥ c T ) − λ/ (9.37) where c > is a positive constant and λ = 1 . · · · is the solution of (cid:18) (cid:19) λ + (cid:18) (cid:19) λ = 1 . (9.38) In particular let ( T N ) N ≥ be the sequence of triangulations generated by the greedy al-gorithm applied to the function f = χ P and starting from T = { T ref } . If the decisionfunction satisfies Assumptions 9.3.1, then for all N ≥ one has e T N ( χ P ) ≥ cN − λ/ . .3. Alternative bisection choices, and the approximation of cartoon functions Figure T ref , some of its children and grand-children, witha decision function satisfying Assumptions 9.3.1. The colored triangles satisfy e T ( χ P ) = 0. aba 1 cba 1 2 Figure D -affine equivalence(right). Proof: Figures 9.4 and 9.5 reveal a surprising property of the master tree P ref . After afew steps, as illustrated on the right of Figure 9.4 we obtain two (white) triangles whichare D -affine equivalent to the initial triangle T ref , and three (colored) triangles on whichthe approximation error e T ( χ P ) is zero. 
It follows that, up to the relation of $D$-affine equivalence, the tree $\mathcal P_{\rm ref}$ is self-similar and is a repetition of the pattern presented on the right of Figure 9.5.

We define an auxiliary tree $\mathcal P'_{\rm ref}$ as follows. The nodes of $\mathcal P'_{\rm ref}$ are the triangles $T \in \mathcal P_{\rm ref}$ which are affine equivalent to $T_{\rm ref}$ with respect to $D$. The children $T_1, T_2$ of a triangle $T$ in $\mathcal P'_{\rm ref}$ are the grandchild and great-grandchild of $T$ in $\mathcal P_{\rm ref}$ which are affine equivalent to $T_{\rm ref}$. Note that
$$|T| = 4|T_1| = 8|T_2|. \qquad (9.39)$$
Let $\mathcal T$ be a triangulation of $T_{\rm ref}$ associated to a finite subtree $\mathcal P$ of $\mathcal P_{\rm ref}$. We denote by $\mathcal P'$ the smallest subtree of $\mathcal P'_{\rm ref}$ such that any leaf of $\mathcal P'$ is contained in a leaf of $\mathcal P$. We denote by $\mathcal T'$ the collection of leaves of $\mathcal P'$. For instance, if $\mathcal T$ is any of the three triangulations illustrated on Figure 9.4, then $\mathcal T'$ consists of the two white triangles in the right of Figure 9.4. A triangle $T \in \mathcal T$ contains a single triangle $T' \in \mathcal T'$ if $e_T(\chi_P) \ne 0$, and none otherwise. Therefore
$$\#(\mathcal T') \le \#(\mathcal T). \qquad (9.40)$$
Furthermore, since all the elements of $\mathcal T'$ are $D$-affine equivalent to $T_{\rm ref}$, we have
$$e_{\mathcal T}(\chi_P)^2 = \sum_{T \in \mathcal T} e_T(\chi_P)^2 \ge \sum_{T \in \mathcal T'} e_T(\chi_P)^2 = c_0 \sum_{T \in \mathcal T'} |T|, \qquad (9.41)$$
where $c_0 = e_{T_{\rm ref}}(\chi_P)^2 / |T_{\rm ref}|$. For any triangle $T \in \mathcal P'$, we denote by $\mathcal E(T)$ the collection of the elements of $\mathcal T'$ that it contains,
$$\mathcal E(T) := \{T' \in \mathcal T' \,;\ T' \subset T\},$$
and we define
$$a(T) := \sum_{T' \in \mathcal E(T)} |T'| \ \text{ and } \ n(T) := \#(\mathcal E(T)).$$
We establish below that for any $T \in \mathcal P'$, the following inequality holds:
$$|T| \le a(T)\, n(T)^{\lambda}. \qquad (9.42)$$
Once this property is established, choosing $T = T_{\rm ref}$ and combining this inequality with (9.40) and (9.41) yields
$$e_{\mathcal T}(\chi_P)^2 \ge c_0 \sum_{T \in \mathcal T'} |T| = c_0\, a(T_{\rm ref}) \ge c_0 |T_{\rm ref}|\, n(T_{\rm ref})^{-\lambda} = c^2\, \#(\mathcal T')^{-\lambda} \ge c^2\, \#(\mathcal T)^{-\lambda},$$
where $c^2 = c_0 |T_{\rm ref}| = e_{T_{\rm ref}}(\chi_P)^2$. This establishes the announced result (9.37) and concludes the proof of this proposition.

In order to prove (9.42) we use an induction argument on the tree $\mathcal P'$, and for that purpose we distinguish two cases.
i) If $T$ is a leaf of $\mathcal P'$, then $T \in \mathcal T'$ and therefore $a(T) = |T|$ and $n(T) = 1$, hence (9.42) holds.
ii) Otherwise $T$ has two children $T_1$ and $T_2$ in $\mathcal P'$, and we may assume as an induction hypothesis that
$$|T_1| \le a(T_1)\, n(T_1)^{\lambda} \ \text{ and } \ |T_2| \le a(T_2)\, n(T_2)^{\lambda}. \qquad (9.43)$$
Applying the Hölder inequality $u_1 v_1 + u_2 v_2 \le (u_1^p + u_2^p)^{1/p} (v_1^q + v_2^q)^{1/q}$ to the numbers
$$u_i = a(T_i)^{1/p}, \quad v_i = n(T_i)^{1/q}, \quad p = 1 + \lambda, \quad q = \frac{1 + \lambda}{\lambda} = \frac{p}{\lambda},$$
and raising to the power $p$, we obtain
$$\left( \big(a(T_1)\, n(T_1)^{\lambda}\big)^{1/p} + \big(a(T_2)\, n(T_2)^{\lambda}\big)^{1/p} \right)^p \le \big(a(T_1) + a(T_2)\big) \big(n(T_1) + n(T_2)\big)^{\lambda}. \qquad (9.44)$$
Hence
$$|T| = \left( \left(\frac{|T|}{4}\right)^{1/p} + \left(\frac{|T|}{8}\right)^{1/p} \right)^p = \left( |T_1|^{1/p} + |T_2|^{1/p} \right)^p \le \left( \big(a(T_1)\, n(T_1)^{\lambda}\big)^{1/p} + \big(a(T_2)\, n(T_2)^{\lambda}\big)^{1/p} \right)^p \le \big(a(T_1) + a(T_2)\big) \big(n(T_1) + n(T_2)\big)^{\lambda} = a(T)\, n(T)^{\lambda},$$
where we use successively (9.38), (9.39), (9.43), (9.44) and the equalities $a(T) = a(T_1) + a(T_2)$ and $n(T) = n(T_1) + n(T_2)$. Again (9.42) holds, and by induction it holds for any element of $\mathcal P'$. $\diamond$

We now turn to the proof of the upper estimate (9.34), and for that purpose we begin with a geometrical lemma which uses the two assumptions (9.3.1) to analyse the first steps of the refinement procedure.

Lemma 9.3.3.
Let T be a triangle such that e T ( χ P ) (cid:54) = 0 . If the decision function satisfies(9.3.1), then one of the following holdsi) A child T (cid:48) produced by the bisection of T satisfies e T (cid:48) ( χ P ) = 0 .ii) Applying iteratively the refinement procedure to T , some of its children, grand-childrenand grand-grand-children we obtain a triangulation T of T such that– T ) ≤ , and | T (cid:48) | ≥ | T | for all T (cid:48) ∈ T .– The approximation error e T (cid:48) ( χ P ) is non-zero for at most two triangles T (cid:48) ∈ T .Denoting these triangles by T , T , one of the following two possibilities holds | T | = 2 | T | = 16 | T | or | T | = 4 | T | = 8 | T | . Proof: Rather than reading a long discussion, the author invites the reader to check that,if i) is impossible, then one of the two possibilities illustrated on Figure 9.6 occurs andsatisfies ii). (cid:5) Chapter 9. Variants of the greedy bisection algorithm Proposition 9.3.4. Let T N be an arbitrary triangulation in R and let ( T N ) N ≥ N be thesequence of triangulations generated by the greedy algorithm applied to the function f = χ P and starting from the triangulation T . If the decision function satisfies Assumptions 9.3.1,then e T N ( χ P ) ≤ CN − λ/ , where λ = 1 . · · · is defined by (9.38). Proof: A proof of this proposition can be obtained by a straightforward adaptation ofLemma 9.3.7 and Proposition 9.3.8 in the next section, which is left to the reader. 
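The exponent $\lambda$ entering Propositions 9.3.2 and 9.3.4 can be evaluated numerically. The sketch below rests on an assumption about the exact form of (9.38): it takes the relation to be $(1/4)^{1/(1+\lambda)} + (1/8)^{1/(1+\lambda)} = 1$, which is the identity needed to combine the area ratios (9.39) with the Hölder exponent $p = 1 + \lambda$ in the induction argument. The function names are hypothetical, and plain interval bisection is used because the left-hand side is increasing in $\lambda$.

```python
def defect(lam):
    """Left-hand side minus right-hand side of the assumed form of (9.38)."""
    e = 1.0 / (1.0 + lam)
    return 0.25 ** e + 0.125 ** e - 1.0


def solve_lambda(lo=1.0, hi=2.0, iters=60):
    """Interval bisection on [lo, hi], where defect changes sign from - to +."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if defect(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    lam = solve_lambda()
    print("lambda   =", lam)
    print("lambda/2 =", lam / 2)
```

Under this assumption the root lies between 1 and 2, so the rate $N^{-\lambda/2}$ sits strictly between the wavelet rate $N^{-1/2}$ of (9.26) and the target rate $N^{-1}$ of (9.33).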
(cid:5) The previous section contains a negative result, Proposition 9.3.2, showing that thegreedy refinement procedure does not achieve the desired convergence rate (9.33) when itis based on the three bisection choices S T and a decision function satisfying the naturalassumptions 9.3.1.In this section the segment along which a triangle T is bisected is picked among thesix possibilities in S T , illustrated on Figure 9.2, instead of the three possibilities in S T .If the decision function satisfies Assumptions 9.3.5, then the desired convergence rate(9.33) is achieved, see Proposition 9.3.8. This result can be seen as a first step towardsthe construction of a hierarchical and anisotropic refinement algorithm well adapted tothe approximation of cartoon functions.The following assumptions on the decision function d T ( s, f ) are illustrated on Figure9.7. Their purpose is to create, through the refinement process, a thin layer of trianglesalong the line D . Note that for any triangle T any for any segment s ∈ S T , the twochildren of T have areas | T | and | T | . Assumptions 9.3.5. Let T be a triangle such that e T ( χ P ) (cid:54) = 0 . In the following wework under the assumption that segment s ∈ S T which minimizes the decision function d T ( s, χ P ) satisfies the following properties.a) If there is such a possibility in S T , the bisection of T along s creates a child T (cid:48) suchthat e T (cid:48) ( χ P ) = 0 and | T (cid:48) | = | T | .b) If a) is impossible, then whenever there is such a possibility in S T , the bisection of T along s creates a child T (cid:48) such that e T (cid:48) ( χ P ) = 0 and | T (cid:48) | = | T | .c) If a) and b) are impossible then the decision function selects a segment s = [ v , b ] ∈ S T joining a vertex v of the triangle T to the barycenter b = v + v of the other vertices.The choice of s must be such that the line D intersects the segment [ b, v ] . 
The next lemma uses these three assumptions to analyse the first steps of the refine-ment procedure on a triangle T . Lemma 9.3.6. Let T be a triangle such that e T ( χ P ) (cid:54) = 0 . If the decision function satisfies(9.3.5), then one of the following holdsi) A child T (cid:48) produced by the bisection of T satisfies e T (cid:48) ( χ P ) = 0 . .3. Alternative bisection choices, and the approximation of cartoon functions Figure D (Thick), forbidden bisection choices (dashed), authorized bisection choices (full). ii) Applying iteratively the refinement procedure to T , some of its children, grand-childrenand grand-grand-children we obtain a triangulation T of T such that– T ) ≤ , and | T (cid:48) | ≥ | T | for all T (cid:48) ∈ T .– The approximation error e T (cid:48) ( χ P ) is non-zero for at most two triangles T (cid:48) ∈ T .Denoting these by T , T , there exists i ∈ { , , } such that | T || T | ≤ × (cid:18) − i (cid:19) and | T || T | ≤ × (cid:18) i + 13 (cid:19) . (9.45) Proof: Rather than reading a long discussion, the author invites the reader to check that,if i) is impossible, then one of the three possibilities illustrated on Figure 9.8 occurs andsatisfies ii). (cid:5) Let T r be a triangle such that e T r ( χ P ) (cid:54) = 0. The decision function applied to T r and itsdescendants defines a master tree P r of triangles. If P is a finite subtree of P r , (in otherwords P contains T r , and the nodes of P have either two or no children), then the leaves of P define a triangulation T of T r . The next lemma describes some of these triangulations. Lemma 9.3.7. For any < ε ≤ , there exists a triangulation T ε of T r associated to afinite subtree of P r , which satisfies T ε ) ≤ Cε − µ and max {| T | ; T ∈ T ε and e T ( χ P ) (cid:54) = ∅} ≤ ε | T r | (9.46) where C = 30 and µ = 0 . · · · is the solution of (cid:18) (cid:19) µ + (cid:18) (cid:19) µ = 1 . 
(9.47) As a result, the master tree P r contains at most Cε − µ triangles T such that e T ( χ P ) ≥ ε | T r | . Proof: We consider a second tree P (cid:48) r of triangles, with root T r . A triangle T ∈ P (cid:48) r suchthat e T ( χ P ) = 0 has no children. A triangle T ∈ P (cid:48) r such that e T ( χ P ) (cid:54) = 0 has at most 7children, those described in Lemma 9.3.6. Two cases are possible :i) The triangle T has two children and one of them, denoted by T (cid:48) , satisfies e T (cid:48) ( χ P ) = 0.The child T (cid:48) of T is therefore a leaf of the tree P (cid:48) r .ii) The triangle T has at most 7 children, and at most two of them T , T are not leavesof the tree P (cid:48) r . Furthermore the areas of T , T satisfy (9.45) for some i ∈ { , , } .10 Chapter 9. Variants of the greedy bisection algorithm Figure P (cid:48) ε the subtree of P (cid:48) r created as follows : we start from the root T r , andwe include the children of a triangle T if and only if e T ( χ P ) (cid:54) = 0 and | T | > ε | T r | . Wedefine T ε as the collection of leaves of P (cid:48) ε and we remark that (9.46) is satisfied. Note thatthe parent T ∈ P (cid:48) ε of any triangle T (cid:48) ∈ P (cid:48) ε satisfies | T | > ε | T r | . Lemma 9.3.6 thereforeimplies that | T (cid:48) | ≥ ε | T r | for any T (cid:48) ∈ P (cid:48) ε . (9.48)We associate to each triangle T ∈ P (cid:48) ε the number n ( T ) := { T ∈ T ε ; T ⊂ T } , and we intend to show that for any triangle T in the tree P (cid:48) ε one has n ( T ) + 5 ≤ C (cid:18) | T | ε | T r | (cid:19) µ (9.49)Once this point is established, choosing T = T r concludes the proof of the first part of thisproposition. For that purpose, we proceed by induction on the tree P (cid:48) ε . We first remarkthat for any leaf T of P (cid:48) ε is follows from (9.48) that C (cid:18) | T | ε | T r | (cid:19) µ ≥ C (cid:18) (cid:19) µ ≥ n ( T ) + 5 . 
If T is not a leaf of P (cid:48) ε , then we may assume as an induction hypothesis that (9.49)holds for all the children of T . We now distinguish between the two types of nodes i) andii) of the tree P (cid:48) ε , and we obtain the following.Type i) The triangle T has two children, T , T , where T is a leaf of the tree P (cid:48) ε . Since T is not a leaf of P (cid:48) ε , it satisfies | T | ≥ ε | T r | . Furthermore | T | ≤ | T | , hence C (cid:18) | T | ε | T r | (cid:19) µ = C (cid:18) × | T | ε | T r | (cid:19) µ + C (cid:18) − (cid:18) (cid:19) µ (cid:19) (cid:18) | T | ε | T r | (cid:19) µ ≥ C (cid:18) | T | ε | T r | (cid:19) µ + C (cid:18) − (cid:18) (cid:19) µ (cid:19) ≥ n ( T ) + 5 + 1= n ( T ) + 5 . Therefore T satisfies (9.49). .3. Alternative bisection choices, and the approximation of cartoon functions T has at most seven children, and at most two of them, denotedby T , T , are not leaves. Hence n ( T ) ≤ n ( T ) + n ( T ) + 5. Using the estimate(9.45) on the areas of T and T , and the definition (9.47) of µ , we obtain (cid:18) | T || T | (cid:19) µ + (cid:18) | T || T | (cid:19) µ ≤ . Therefore C (cid:18) | T | ε | T r | (cid:19) µ ≥ C (cid:18) | T | ε | T r | (cid:19) µ + C (cid:18) | T | ε | T r | (cid:19) λ ≥ n ( T ) + 5 + n ( T ) + 5 ≥ n ( T ) + 5 , which concludes the proof of (9.49).We now turn to the second part of the proposition, and for that purpose we denote by P ε the subtree of P r associated to the triangulation T ε . Note that this tree is distinct fromthe subtree P (cid:48) ε of P (cid:48) r . In particular P ε is a binary tree while P (cid:48) ε is not (the tree P (cid:48) ε canbe obtained by grouping some of the elements of P ε in a single node with some of theirdescendants).Since P ε is a binary tree, the cardinality of the set P ε \ T ε of its inner nodes has asimple expression P ε \ T ε ) = T ε ) − . 
Furthermore, any triangle T ∈ P r which does not belong to P ε \ T ε is an element of T ε orone of its descendants, hence T satisfies e T ( χ P ) = 0 or e T ( χ P ) ≤ | T | < ε | T r | . Therefore { T ∈ P r ; e T ( χ P ) ≥ ε | T r |} ) ≤ P ε \ T ε ) = T ε ) − ≤ Cε − µ , which concludes the proof. (cid:5) Proposition 9.3.8. Let T N be an arbitrary triangulation in R and let ( T N ) N ≥ N be thesequence of triangulations generated by the greedy algorithm applied to the function f = χ P and starting from the triangulation T . If the decision function satisfies the assumptions9.3.5, then e T N ( χ P ) ≤ CN − ν , (9.50) where ν = µ − = 1 . · · · . Proof: Let T be a triangle, let f ∈ L ( T ) and let T , T be the children produced bysome bisection of T . Then e T ( f ) + e T ( f ) = inf π ,π ∈ IP (cid:107) f − π (cid:107) L ( T ) + (cid:107) f − π (cid:107) L ( T ) ≤ inf π ∈ IP (cid:107) f − π (cid:107) L ( T ) + (cid:107) f − π (cid:107) L ( T ) = e T ( f ) . Chapter 9. Variants of the greedy bisection algorithm Hence max { e T ( f ) , e T ( f ) } ≤ e T ( f ) . The decision function applied to the triangles in T and their descendants defines a collec-tion P of N master trees. It follows from the previous inequality that the approximationerror e T ( χ P ) is larger on a triangle T than on any of its descendants.At each step N ≥ N , the greedy algorithm selects for bisection a triangle in T N ∈ T N which maximizes the approximation error e T ( χ P ) among all T ∈ T N . As a result we havefor any N ≥ N , e T N ( χ P ) ≥ e T N ( χ P ) ≥ · · · ≥ e T N ( χ P ) ≥ max { e T ( χ P ) ; T ∈ P \ { T N , · · · , T N }} . We thus have for any N ≥ N , since T N ) = N , e T N ( χ P ) = (cid:88) T ∈T N e T ( χ P ) ≤ N e T N ( χ P ) . (9.51)According to Lemma 9.3.7, for any 0 < ε ≤ P of N trees contains atmost C ε − µ triangles T such that e T ( χ P ) ≥ ε , where C is independent of ε . Choosing ε = C µ ( N − N + 1) − µ we obtain e T N ( χ P ) ≤ C µ ( N − N + 1) − µ . 
Combining this with (9.51) yields (9.50), which concludes the proof. $\diamond$

Remark 9.3.9. The convergence rate obtained in Proposition 9.3.8 actually exceeds the desired error estimate (9.33). This is due to the fact that $f = \chi_P$ is a particularly simple cartoon function. For a cartoon function $f$ which is discontinuous along curved edges, instead of a straight line, or which is not piecewise affine, the convergence rate (9.33) is optimal.

We suggest in this section another modification of the refinement procedure, in an attempt to improve the result of Proposition 9.3.8. This proposition is indeed not completely satisfactory, for the following three reasons.
– The decision function $d_T(s, f)$ needs to satisfy Assumptions 9.3.5. Unfortunately, numerical results suggest that the natural decision function (9.30) does not satisfy these assumptions, and the author has not found a decision function with a simple expression which satisfies them.
– The number of segments $s \in S_T$ along which a triangle $T$ may be bisected has increased from 3 to 6. From the point of view of data compression, the hierarchical bisection tree is therefore more expensive to encode. From the theoretical point of view, the algorithm could be regarded as less elegant.
– Since the cartoon function $f = \chi_P$ is particularly simple, see Remark 9.3.9, a convergence rate faster than (9.50) could be expected, for example an exponential convergence rate $e_{\mathcal T_N}(\chi_P) \le C e^{-\beta N}$ where $\beta > 0$.

Figure 9.9: Bisection of a convex set $T \in \mathcal C$ along the set of segments $S^*_T$ determined by three points $a, b, c \in \partial T$ which maximize (9.52).

The modification of the algorithm described below solves these three points, at the following cost.
The modified greedy algorithm no longer produces triangulations, but partitions into convex polygons of the domain $\Omega$ on which the function $f$ to be approximated is defined.

We denote by $\mathcal C$ the collection of bounded, closed and convex sets with nonempty interior included in $\mathbb R^2$. We denote by the capital letter $T$ the elements of $\mathcal C$, whether they are triangles or not, and by $\mathcal T$ any partition of a domain $\Omega$ into elements of $\mathcal C$. The definitions (9.27) and (9.28) of the approximation errors $e_T(f)$ and $e_{\mathcal T}(f)$ are unchanged.

Let $T \in \mathcal C$ be a bounded, closed and convex set, and let $a, b, c \in \partial T$ be three distinct points on the boundary of $T$. If none of the segments $[a, b]$, $[b, c]$ or $[c, a]$ is included in $\partial T$, then the complement in $T$ of the triangle of vertices $a, b, c$ has three connected components. We denote by $T_a$ the closure of the connected component which does not contain $a$, and we define $T_b$ and $T_c$ likewise, see Figure 9.9. The bisection of $T$ along the segment $[b, c]$, $[c, a]$ or $[a, b]$ produces a pair of children $\{T_a, T \setminus T_a\}$, $\{T_b, T \setminus T_b\}$ or $\{T_c, T \setminus T_c\}$ respectively, which also belong to $\mathcal C$. We define
\[ \tau(T; a, b, c) := \min\{|T_a|, |T_b|, |T_c|\}, \]
and we set the convention $\tau(T; a, b, c) = 0$ for any $a, b, c \in \partial T$ such that the complement in $T$ of the triangle of vertices $a, b, c$ has only two connected components. We also define
\[ \tau(T) := \sup_{a, b, c \in \partial T} \tau(T; a, b, c). \tag{9.52} \]
The next proposition gives a lower bound for the ratio $\tau(T)/|T|$.

Proposition 9.3.10. The supremum (9.52) is attained, and for any $a, b, c$ satisfying $\tau(T; a, b, c) = \tau(T)$ we have $|T_a| = |T_b| = |T_c| = \tau(T)$. Furthermore, for any $T \in \mathcal C$ one has
\[ |T| \le 7\, \tau(T). \tag{9.53} \]

Proof: For any fixed $T \in \mathcal C$, the boundary $\partial T$ is compact and the map $(a, b, c) \in (\partial T)^3 \mapsto \tau(T; a, b, c)$ is continuous. Hence the supremum (9.52) is attained. Furthermore, let $a, b, c \in \partial T$ and let us assume that $|T_a| > |T_b|$; then moving the point $c \in \partial T$ in the direction of $T_a$ continuously decreases $|T_a|$ and increases $|T_b|$. As a result, if the triplet $(a, b, c) \in (\partial T)^3$ maximizes (9.52), then one must have $|T_a| = |T_b| = |T_c|$.

For any affine change of coordinates $\Phi(z) := \phi z + z_0$, where $\phi \in GL_2$ and $z_0 \in \mathbb R^2$, one easily checks that
\[ \frac{\tau(T)}{|T|} = \frac{\tau(\Phi(T))}{|\Phi(T)|}. \tag{9.54} \]
Let $T \in \mathcal C$ and let $(a, b, c) \in (\partial T)^3$ be a triplet maximizing (9.52). In order to establish (9.53) we may assume using (9.54), up to an affine change of coordinates, that the points $a, b, c$ are the vertices of an equilateral triangle $T_{eq}$ centered at the origin $O = (0, 0) \in \mathbb R^2$ and of edge length $2 = |a - b| = |b - c| = |c - a|$. See Figure 9.10 for an illustration of this triangle and of the notations defined below.

Figure 9.10: The sets $T_a$, $T_b$ and $T_c$ and the notations of the proof. Right: three triangles contained in $T_{a'}$, $T_{b'}$ and $T_{c'}$.

We denote by $a'$ the point of $\partial T$ such that $O \in [a, a']$, and we denote by $\alpha$ the distance from $a'$ to the segment $[b, c]$. Since the convex set $T_a$ contains the triangle of vertices $a', b, c$, we have
\[ |T_a| \ge \frac{|b - c|\, \alpha}{2} = \alpha. \tag{9.55} \]
We define $b', c'$ and $\beta, \gamma$ similarly. The triplet $(a', b', c') \in (\partial T)^3$ defines three convex sets $T_{a'}, T_{b'}, T_{c'}$. Recalling that
\[ \tau(T) = \tau(T; a, b, c) = |T_a| = |T_b| = |T_c| \ge \tau(T; a', b', c') = \min\{|T_{a'}|, |T_{b'}|, |T_{c'}|\}, \]
we obtain
\[ \tau(T) \ge \max\{|T_a|, |T_b|, |T_c|, \min\{|T_{a'}|, |T_{b'}|, |T_{c'}|\}\}. \tag{9.56} \]
Since $T_{a'}$ contains the triangle of vertices $a, b', c'$, an elementary computation shows that
\[ |T_{a'}| \ge \frac{1}{2} |\det(b' - a, c' - a)| = \frac{1}{4} \left| \sqrt 3 (1 - \beta\gamma) + \beta + \gamma \right|, \]
hence
\[ \tau(T) \ge \max\left\{ \alpha, \beta, \gamma, \ \frac{\sqrt 3}{4} + \frac{1}{4} \min\{\beta + \gamma - \sqrt 3 \beta\gamma, \ \gamma + \alpha - \sqrt 3 \gamma\alpha, \ \alpha + \beta - \sqrt 3 \alpha\beta\} \right\}. \]
If the minimum appearing in the previous equation is negative, then $\alpha$, $\beta$ or $\gamma$ is larger than $1/\sqrt 3 > \sqrt 3/4$. It follows that $\tau(T) \ge \sqrt 3/4$ in all cases, which concludes the proof since $|T| = |T_{eq}| + 3\tau(T) = \sqrt 3 + 3\tau(T) \le 7\tau(T)$. $\diamond$

Figure 9.11: Left: the bisection choices $S^*_T$ (blue, dotted) and the line $D$ (black, thick). Right: after 8 steps of the greedy algorithm, with a decision function satisfying Assumptions 9.3.11.

We denote by $S^*_T := \{[a, b], [b, c], [c, a]\}$ the collection of segments joining the three points $(a, b, c) \in (\partial T)^3$ which maximize (9.52). Some convex polygons and the associated optimal triplets $(a, b, c) \in (\partial T)^3$ are illustrated on Figure 9.9. If there are several triplets $(a, b, c)$ realizing this maximum, for instance if $T$ is a disc or a regular polygon, then we consider the minimal triplet with respect to the lexicographic order on $(\mathbb R^2)^3$.

From an algorithmic point of view, for any convex polygon $T \in \mathcal C$ the points $(a, b, c)$ which maximize (9.52) can be found by solving a finite number of third degree equations (one for each triplet of distinct faces of $T$). Note that Proposition 9.3.10 is not optimal. We conjecture that the minimal value of $\tau(T)/|T|$ is attained when the convex set $T$ is the unit disc, and is equal to $(\pi/3 - \sqrt 3/4)/\pi \simeq 0.195\cdots$, which is strictly larger than $1/7 \simeq 0.142\cdots$.

Assumptions 9.3.11. Let $T \in \mathcal C$ be such that $e_T(\chi_P) \ne 0$. In the following we work under the assumption that the segment $s \in S^*_T$ which minimizes the decision function $d_T(s, \chi_P)$ satisfies the following property.
a) The bisection of $T$ along $s$ creates a child $T'$ such that $e_{T'}(\chi_P) = 0$. (This is always possible.)

The following corollary shows that, under the previous assumption on the decision function, the refinement procedure based on these bisection choices creates an exponentially thin layer around the discontinuity of the function $f = \chi_P$, as illustrated on Figure 9.11.

Corollary 9.3.12. Let $\Omega \subset \mathbb R^2$ be a convex polygon, and let $(\mathcal T_N)_{N \ge 1}$ be the sequence of partitions of $\Omega$ into convex polygons generated by the greedy algorithm applied to the function $f = \chi_P$ and starting from $\mathcal T_1 = \{\Omega\}$. If the decision function satisfies Assumptions 9.3.11, then
\[ e_{\mathcal T_N}(\chi_P) \le |\Omega| \, (1 - 1/7)^{N-1}. \]

Figure 9.12: Partitions generated for the function $f(x, y) = x^2 + 100 y^2$ with the decision function (9.57) and 50 (i) and 200 (ii) elements. Likewise for the function $f(x, y) = x^2 + y^2$ and 30 elements (iii).

Proof: It follows from Assumptions 9.3.11 that for all $N \ge 1$ a single element $T \in \mathcal T_N$ satisfies $e_T(\chi_P) \ne 0$. We denote this element by $T_N$, and we remark that $T_{N+1}$ is a child of $T_N$. It follows from Proposition 9.3.10 that $|T_{N+1}| \le (1 - 1/7) |T_N|$ for any $N \ge 1$, hence
\[ e_{\mathcal T_N}(\chi_P) = e_{T_N}(\chi_P) \le |T_N| \le |\Omega| \, (1 - 1/7)^{N-1}, \]
which concludes the proof. See Figure 9.11 for an illustration. $\diamond$

The bisection of $T \in \mathcal C$ along a segment $s \in S^*_T$ creates two children $T_s^1$ and $T_s^2$. In terms of the notations previously used in this section, the bisection along $s = [b, c] \in S^*_T$ creates the two children $T_s^1 = T_a$ and $T_s^2 = T \setminus T_a$.
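The comparison between the conjectured extremal ratio for the disc and the bound $1/7$ of Proposition 9.3.10 can be reproduced numerically. The following minimal sketch is not taken from the thesis: the helper names are hypothetical, and it assumes that for the unit disc the optimal points $a, b, c$ are the vertices of an inscribed equilateral triangle (which is forced by the equal-area property of Proposition 9.3.10). It evaluates the closed form $(\pi/3 - \sqrt 3/4)/\pi$ and cross-checks it against a regular polygon approximating the disc.

```python
import math


def polygon_area(points):
    """Area of a simple polygon given as a list of (x, y), by the shoelace formula."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def disc_ratio_closed_form():
    """tau(T)/|T| for the unit disc, assuming a, b, c form an inscribed
    equilateral triangle: each circular segment has area pi/3 - sqrt(3)/4."""
    return (math.pi / 3 - math.sqrt(3) / 4) / math.pi


def disc_ratio_polygonal(n=3000):
    """Same ratio for a regular n-gon (n divisible by 3) approximating the disc."""
    pts = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
           for k in range(n)]
    # Boundary arc between two of the three points, closed by the chord [b, c].
    cap = pts[: n // 3 + 1]
    return polygon_area(cap) / polygon_area(pts)


if __name__ == "__main__":
    print("closed form :", disc_ratio_closed_form())
    print("polygonal   :", disc_ratio_polygonal())
    print("bound 1/7   :", 1 / 7)
```

Both values are close to $0.1955$, above the guaranteed ratio $1/7$, which is consistent with the remark that Proposition 9.3.10 is not optimal.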
From this point, we assume that the decisionfunction d T ( s, f ) is defined by d T ( s, f ) := min (cid:40) e T s ( f ) | T s | , e T s ( f ) | T s | (cid:41) . (9.57)If the approximation error of f on T s or T s is zero, then d T ( s, f ) = 0. Therefore this deci-sion function satisfies Assumptions 9.3.11, and leads to an exponentially fast convergenceif f = χ P according to Corollary 9.3.12.Let us consider a quadratic function f ( z ) = z T Qz , where Q is a 2 × T of Ω should adopt, for an efficient approximation of f , an aspectratio dictated by the hessian matrix d f = Q of f (at least in an average sense). Thenumerical experiments presented in Figure 9.12 suggest that the greedy algorithm, basedon the bisection choices S ∗ T and the decision function (9.57), generates such partitions.Unfortunately, the author could not prove this property.Eventually, numerical experiments were also led for the characteristic function f = χ D of the unit disc D = { ( x, y ) ∈ R ; x + y ≤ } . They seem to indicate that the partitions( T N ) N ≥ of the triangle of vertices (0 , , (2 , , (0 , 2) generated by the greedy algorithmdo satisfy the desired error estimate e T N ( χ D ) ≤ CN − , .4. A greedy algorithm based on rectangles Figure f = χ D , with the decision function (9.57) and 200 elements (i). Detail (ii). Decay of theerror e T N ( χ D ), in logarithmic scale, for 1 ≤ N ≤ f of interest but using itsexplicit form in order to have an exact expression of the L approximation error. We study in this section another variation of the greedy algorithm which is simplerthan the previous ones. This algorithm produces partitions ( T N ) N ≥ of the unit squareΩ = [0 , into rectangles aligned with the horizontal or vertical axes, as studied inChapter 1 and illustrated on Figure 5 in the main introduction of the thesis, insteadof triangulations built of triangles of arbitrary direction. 
Here we work with piecewise constant approximants, and we measure the approximation error in the $L^\infty$ norm. These important simplifications allow us to establish an error estimate which applies to any function $f \in C^1(\Omega)$ and is asymptotically optimal, in accordance with the results of Chapter 1. We also establish a second error estimate which gives some valuable (yet non-optimal) information in a non-asymptotic sense, i.e. for any value of $N \ge 1$.

A rectangle $T \subset \mathbb R^2$ is a set of the form $T = I \times J$ where $I$ and $J$ are compact intervals of $\mathbb R$. For any rectangle $T$ and any $f$ in $C^0(T)$, we denote by $e_T(f)$ the difference between the supremum and the infimum of $f$ on $T$:
\[ e_T(f) := \sup_T f - \inf_T f. \]
Note that $e_T(f)$ is also twice the error of best approximation of $f$ on $T$ by a constant in the $L^\infty$ norm,
\[ e_T(f) = 2 \inf_{c \in \mathbb R} \|f - c\|_{L^\infty(T)}. \]
If a function $f \in C^1(T)$ attains its maximum on the rectangle $T$ at the point $z_M = (x_M, y_M)$, and its minimum at the point $z_m = (x_m, y_m)$, then observe that
\[ f(z_M) - f(z_m) \le \left| \int_{x_m}^{x_M} \partial_x f(x, y_m)\, dx \right| + \left| \int_{y_m}^{y_M} \partial_y f(x_M, y)\, dy \right| \le |x_M - x_m| \, \|\partial_x f\|_{L^\infty(T)} + |y_M - y_m| \, \|\partial_y f\|_{L^\infty(T)}. \]
Therefore if $T = I \times J$, we obtain
\[ e_T(f) \le |I| \, \|\partial_x f\|_{L^\infty(T)} + |J| \, \|\partial_y f\|_{L^\infty(T)}. \tag{9.58} \]
For any partition $\mathcal T$ of a rectangle $T_0$ into smaller rectangles, and for any $f \in C^0(T_0)$, we denote by $e_{\mathcal T}(f)$ the maximum of $e_T(f)$ among all $T \in \mathcal T$:
\[ e_{\mathcal T}(f) := \max_{T \in \mathcal T} e_T(f). \]
We clearly also have
\[ e_{\mathcal T}(f) = 2 \min_{g \in V_{\mathcal T}} \|f - g\|_{L^\infty(T_0)}, \]
where $V_{\mathcal T}$ is the collection of functions on $T_0$ which are constant on the interior of each $T \in \mathcal T$.

Remark 9.4.1. Please note that the definition of $e_T(f)$ in this section differs significantly from the definition of $e_T(f)$ in the previous section. Indeed, $e_T(f)$ stands here for twice the error of best approximation of $f$ in the $L^\infty(T)$ norm by a constant $c \in \mathbb R$, whereas in the previous section $e_T(f)$ stands for the error of best approximation of $f$ by an affine function $\pi \in \mathbb P_1$, measured in the integral norm used there.

For any rectangle $T$ we denote by $(T_d, T_u)$ the down and up rectangles which are obtained by a horizontal split of $T$ through its barycenter, and by $(T_l, T_r)$ the left and right rectangles which are obtained by a vertical split of $T$ through its barycenter. For any function $f \in C^0(T)$, we define
\[ e_{T,h}(f) := \max\{e_{T_d}(f), e_{T_u}(f)\} \quad \text{and} \quad e_{T,v}(f) := \max\{e_{T_l}(f), e_{T_r}(f)\}. \]
The greedy algorithm studied in this section takes as input a function $f \in C^0(\Omega)$, where $\Omega = [0, 1]^2$ is the unit square, and a parameter $\rho \in [0, 1]$. Given a partition $\mathcal T$ of $\Omega$ into rectangles aligned with the coordinate axes, the greedy algorithm produces in one step a refinement $\mathcal T'$ of $\mathcal T$, satisfying $\#(\mathcal T') = \#(\mathcal T) + 1$, proceeding as follows.
1. Select a rectangle $T \in \mathcal T$ which maximizes $e_T(f)$.
2. (Greedy split) If
\[ \min\{e_{T,h}(f), e_{T,v}(f)\} \le \rho \, e_T(f), \tag{9.59} \]
then bisect the rectangle $T$ horizontally if $e_{T,h}(f) \le e_{T,v}(f)$ and vertically otherwise, thus producing two children $T_1, T_2$ of $T$. Define $\mathcal T' := \mathcal T - \{T\} + \{T_1, T_2\}$.
3. (Safety split) If
\[ \min\{e_{T,h}(f), e_{T,v}(f)\} > \rho \, e_T(f), \tag{9.60} \]
then bisect the rectangle $T$ vertically if $|I| \ge |J|$ and horizontally otherwise, where $I, J$ are the two compact intervals such that $T = I \times J$; in other words the longest side of $T$ is halved, which is the convention used in the proof of Lemma 9.4.3. This bisection produces two children $T_1, T_2$ of $T$ and we define $\mathcal T' := \mathcal T - \{T\} + \{T_1, T_2\}$.

Starting from the partition $\mathcal T_1 = \{\Omega\}$ of the square $\Omega = [0, 1]^2$, the greedy algorithm produces step after step a sequence $(\mathcal T_N)_{N \ge 1}$ of partitions of $\Omega$ into dyadic rectangles, satisfying $\#(\mathcal T_N) = N$. The parameter $\rho \in [0, 1]$ is chosen by the user.
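The three steps above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation used for the experiments of the thesis: the function names are hypothetical, the quantity $e_T(f) = \sup_T f - \inf_T f$ is only approximated by sampling $f$ on a small grid, the default value of `rho` is an assumption, and the safety split halves the longest side of the rectangle, which is the convention used in the proof of Lemma 9.4.3.

```python
import heapq
import itertools
import math


def oscillation(f, rect, n=9):
    """Approximate e_T(f) = sup_T f - inf_T f by sampling f on an n x n grid.
    (Assumption: exact evaluation of sup and inf is replaced by sampling.)"""
    x0, x1, y0, y1 = rect
    values = [f(x0 + (x1 - x0) * i / (n - 1), y0 + (y1 - y0) * j / (n - 1))
              for i in range(n) for j in range(n)]
    return max(values) - min(values)


def split(rect, horizontal):
    """Bisect rect = (x0, x1, y0, y1) through its barycenter."""
    x0, x1, y0, y1 = rect
    if horizontal:  # horizontal cut: down and up children, J is halved
        ym = 0.5 * (y0 + y1)
        return (x0, x1, y0, ym), (x0, x1, ym, y1)
    xm = 0.5 * (x0 + x1)  # vertical cut: left and right children, I is halved
    return (x0, xm, y0, y1), (xm, x1, y0, y1)


def greedy_rectangles(f, n_rect, rho=1 / math.sqrt(2)):
    """Return a partition of [0, 1]^2 into n_rect dyadic rectangles."""
    root = (0.0, 1.0, 0.0, 1.0)
    counter = itertools.count()  # tie-breaker so that rectangles are never compared
    heap = [(-oscillation(f, root), next(counter), root)]
    while len(heap) < n_rect:
        neg_e, _, rect = heapq.heappop(heap)  # step 1: rectangle of maximal error
        e = -neg_e
        e_h = max(oscillation(f, child) for child in split(rect, True))
        e_v = max(oscillation(f, child) for child in split(rect, False))
        if min(e_h, e_v) <= rho * e:          # step 2: greedy split
            horizontal = e_h <= e_v
        else:                                 # step 3: safety split, longest side
            x0, x1, y0, y1 = rect
            horizontal = (y1 - y0) > (x1 - x0)
        for child in split(rect, horizontal):
            heapq.heappush(heap, (-oscillation(f, child), next(counter), child))
    return [rect for _, _, rect in heap]


if __name__ == "__main__":
    partition = greedy_rectangles(lambda x, y: math.sin(3 * x) + y * y, 50)
    print(len(partition), max(oscillation(lambda x, y: math.sin(3 * x) + y * y, r)
                              for r in partition))
```

For a function depending on $x$ only, such as $f(x, y) = x$, every split performed by this sketch is a vertical greedy split, so that all rectangles keep the full height $[0, 1]$; this is the anisotropic behaviour the greedy criterion (9.59) is designed to produce.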
It should not be chosen too small, in order to avoid that all splits are of safety type, which would then lead to isotropic partitions. The choice $\rho = 1$ is not appropriate either. Indeed let $f(x, y) := \sin(4\pi x)$; then for any rectangle $T$ of the form $T = [0, 1] \times J$, where $J$ is a compact interval, one has
\[ e_T(f) = e_{T,h}(f) = e_{T,v}(f) = 2. \]
If a rectangle $T = [0, 1] \times J$ is selected for bisection and if $\rho = 1$, then $T$ is bisected horizontally and we are facing a similar situation for its children $T_u$ and $T_d$. As a result, the greedy algorithm produces a sequence $(\mathcal T_N)_{N \ge 1}$ of partitions of $\Omega$ consisting of rectangles of the form $[0, 1] \times J$ where $J$ are dyadic intervals. This shows that the approximations produced by the algorithm fail to converge towards $f$, and the global error remains $e_{\mathcal T_N}(f) = 2$ for all $N \ge 1$. On the contrary, it is established in [35] that for any choice of $\rho \in [0, 1[$ and any $f \in C^1(\Omega)$, the partitions $(\mathcal T_N)_{N \ge 1}$ produced by the greedy algorithm satisfy $e_{\mathcal T_N}(f) \to 0$ as $N \to \infty$.

We now prove that, using the specific value $\rho = 1/\sqrt 2$ and measuring the error in the $L^\infty$ norm, the refinement algorithm has optimal convergence properties (9.61) similar to Theorem 1.1.5. The authors suspect that this result can be extended to the $L^p$ norm, where $1 \le p \le \infty$, and to an arbitrary parameter $\rho \in \,]1/2, 1[$.

Theorem 9.4.2. There exists a constant $C > 0$ such that for any $f \in C^1(\Omega)$, the partitions $(\mathcal T_N)_{N \ge 1}$ produced by the modified greedy refinement algorithm with parameter $\rho := 1/\sqrt 2$ satisfy the asymptotic convergence estimate
\[ \limsup_{N \to +\infty} N^{1/2} e_{\mathcal T_N}(f) \le C \left\| \sqrt{|\partial_x f \, \partial_y f|} \right\|_{L^2}. \tag{9.61} \]
Furthermore for all $N \ge 1$ one has
\[ e_{\mathcal T_N}(f) \le C \|\nabla f\|_{L^\infty(\Omega)} N^{-1/2}. \tag{9.62} \]

The asymptotic error estimate (9.61) is the best that one can hope to obtain. Indeed it follows from Theorem 1.1.4 that there exists a constant $c > 0$, independent of $f$, such that
\[ \liminf_{N \to +\infty} N^{1/2} e_{\mathcal T'_N}(f) \ge c \left\| \sqrt{|\partial_x f \, \partial_y f|} \right\|_{L^2}, \tag{9.63} \]
for any admissible sequence $(\mathcal T'_N)_{N \ge N_0}$ of partitions of $\Omega$ into rectangles. (Admissibility is a minor technical condition, see Definition 1.1.3.)

Here and after, we use the $\ell^\infty$ norm on $\mathbb R^2$ for measuring the gradient: for $z = (x, y) \in \Omega$,
\[ |\nabla f(z)| := \max\{|\partial_x f(z)|, |\partial_y f(z)|\}, \]
and
\[ \|\nabla f\|_{L^\infty(T)} := \sup_{z \in T} |\nabla f(z)| = \max\{\|\partial_x f\|_{L^\infty(T)}, \|\partial_y f\|_{L^\infty(T)}\}. \]

We consider a fixed function $f \in C^1(\Omega)$ and we denote by $(\mathcal T_N)_{N \ge 1}$ the sequence of partitions of the square $\Omega = [0, 1]^2$ into dyadic rectangles generated by the greedy algorithm. Furthermore we define for each $N \ge 1$
\[ \varepsilon(N) := e_{\mathcal T_N}(f). \]
Finally we sometimes use the notation $x(z)$ and $y(z)$ to denote the coordinates of a point $z \in \mathbb R^2$. The proof of Theorem 9.4.2 requires a preliminary result.

Lemma 9.4.3. Let $T_0 = I_0 \times J_0 \in \mathcal T_M$ be a dyadic rectangle obtained at some stage $M \ge 1$ of the refinement algorithm, and let $T = I \times J \in \mathcal T_N$ be a dyadic rectangle obtained at some later stage $N > M$ and such that $T \subset T_0$. We then have
\[ |I| \ge \min\left\{ |I_0|, \frac{\varepsilon(N)}{4 \|\nabla f\|_{L^\infty(T_0)}} \right\} \quad \text{and} \quad |J| \ge \min\left\{ |J_0|, \frac{\varepsilon(N)}{4 \|\nabla f\|_{L^\infty(T_0)}} \right\}. \tag{9.64} \]

Proof: Since the coordinates $x$ and $y$ play symmetrical roles, it suffices to prove the first inequality. We reason by contradiction.
If the inequality does not hold, there exists a rectangle $T' = I' \times J'$ in the chain that led from $T_0$ to $T$ which is such that
\[ |I'| < \frac{\varepsilon(N)}{2 \|\nabla f\|_{L^\infty(T_0)}}, \tag{9.65} \]
and such that $T'$ is split vertically by the algorithm. If this was a safety split, we would have $|J'| \le |I'|$ and therefore, using (9.58),
\[ e_{T'}(f) \le (|I'| + |J'|) \|\nabla f\|_{L^\infty(T_0)} \le 2 |I'| \, \|\nabla f\|_{L^\infty(T_0)} < \varepsilon(N), \]
which is a contradiction, since every ancestor $T'$ of $T$ satisfies $e_{T'}(f) \ge \varepsilon(N)$. Hence this split was necessarily a greedy split.

Let $z_m := \operatorname{argmin}_{z \in T'} f(z)$ and $z_M := \operatorname{argmax}_{z \in T'} f(z)$, and let $T''$ be the child of $T'$ (after the vertical split) containing $z_M$. Then $T''$ also contains a point $z'_m$ such that $|x(z'_m) - x(z_m)| \le |I'|/2$ and $y(z'_m) = y(z_m)$. It follows that
\[ e_{T',v}(f) \ge e_{T''}(f) \ge f(z_M) - f(z'_m) \ge f(z_M) - f(z_m) - \|\partial_x f\|_{L^\infty(T')} \frac{|I'|}{2} \ge e_{T'}(f) - \frac{\varepsilon(N)}{4} \ge \frac{3}{4} e_{T'}(f) > \rho \, e_{T'}(f). \]
This contradicts the greedy criterion (9.59), since a vertical greedy split requires $e_{T',v}(f) \le \rho \, e_{T'}(f)$. $\diamond$

Proof of Inequality (9.62). We first observe that for any $N \ge 1$, using (9.58),
\[ \varepsilon(N) \le \varepsilon(1) = \sup_\Omega f - \inf_\Omega f \le 2 \|\nabla f\|_{L^\infty(\Omega)}. \]
Applying Lemma 9.4.3 to the square $\Omega = [0, 1] \times [0, 1]$ we obtain
\[ \min\{|I|, |J|\} \ge \frac{\varepsilon(N)}{4 \|\nabla f\|_{L^\infty(\Omega)}} \]
for any rectangle $T = I \times J \in \mathcal T_N$. Therefore
\[ 1 = |\Omega| = \sum_{T = I \times J \in \mathcal T_N} |I| \times |J| \ge N \left( \frac{\varepsilon(N)}{4 \|\nabla f\|_{L^\infty(\Omega)}} \right)^2, \]
hence
\[ N \varepsilon(N)^2 \le 16 \|\nabla f\|_{L^\infty(\Omega)}^2, \tag{9.66} \]
which concludes the proof of (9.62).
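The counting argument above gives the explicit form $\varepsilon(N) \le 4 \|\nabla f\|_{L^\infty(\Omega)} N^{-1/2}$ of (9.62). As a sanity check of the arithmetic only, the following sketch compares this bound with the error on a uniform dyadic partition for the hypothetical test function $f(x, y) = x + y$, whose gradient has $\ell^\infty$ norm 1. The function names are illustrative, and the uniform partition is an assumption chosen for simplicity, not necessarily the output of the greedy algorithm.

```python
import math


def uniform_partition_error(k):
    """Number of cells N and error e_{T_N}(f) for f(x, y) = x + y on the uniform
    partition of [0, 1]^2 into N = 4**k squares of side 2**-k.
    On each square, sup f - inf f equals twice the side length."""
    side = 2.0 ** (-k)
    return 4 ** k, 2.0 * side


def bound_9_62(n, grad_norm, c=4.0):
    """Right-hand side c * ||grad f|| * N**(-1/2), with c = 4 as in (9.66)."""
    return c * grad_norm / math.sqrt(n)


if __name__ == "__main__":
    for k in range(1, 6):
        n, err = uniform_partition_error(k)
        print(n, err, bound_9_62(n, 1.0))
```

On these partitions the error equals $2 N^{-1/2}$, which stays below the bound $4 N^{-1/2}$ as expected.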
$\diamond$

Proof of Inequality (9.61). We consider a small but fixed $\delta > 0$, and we define $h(\delta)$ as the maximal $h > 0$ such that
\[ \forall z, z' \in \Omega, \quad |z - z'| \le h(\delta) \ \Rightarrow \ |\nabla f(z) - \nabla f(z')| \le \delta. \tag{9.67} \]
For any rectangle $T = I \times J \subset \Omega$, we thus have
\[ e_T(f) \ge (\|\partial_x f\|_{L^\infty(T)} - \delta) \min\{h(\delta), |I|\}, \qquad e_T(f) \ge (\|\partial_y f\|_{L^\infty(T)} - \delta) \min\{h(\delta), |J|\}. \tag{9.68} \]
Let $M = M(f, \delta)$ be the smallest value of $N$ such that $\varepsilon(N) < 9 \delta h(\delta)$. For all $N \ge M$, and therefore $\varepsilon(N) < 9 \delta h(\delta)$, we consider the partition $\mathcal T_N$, which is a refinement of $\mathcal T_M$. For any rectangle $T_0 = I_0 \times J_0 \in \mathcal T_M$, we denote by $\mathcal T_N(T_0)$ the set of rectangles of $\mathcal T_N$ that are contained in $T_0$. We thus have
\[ \mathcal T_N = \bigcup_{T_0 \in \mathcal T_M} \mathcal T_N(T_0), \]
and $\mathcal T_N(T_0)$ is a partition of $T_0$. We shall next bound from below the side lengths of the rectangles $T = I \times J$ contained in $\mathcal T_N(T_0)$, distinguishing different cases depending on the behaviour of $f$ on $T_0$.

Case 1. If $T_0 \in \mathcal T_M$ is such that $\|\nabla f\|_{L^\infty(T_0)} \le 10 \delta$, then a direct application of Lemma 9.4.3 shows that for all $T = I \times J \in \mathcal T_N(T_0)$ we have
\[ |I| \ge \min\left\{ |I_0|, \frac{\varepsilon(N)}{40 \delta} \right\} \quad \text{and} \quad |J| \ge \min\left\{ |J_0|, \frac{\varepsilon(N)}{40 \delta} \right\}. \tag{9.69} \]

Case 2. If $T_0 \in \mathcal T_M$ is such that $\|\partial_x f\|_{L^\infty(T_0)} \ge 10 \delta$ and $\|\partial_y f\|_{L^\infty(T_0)} \ge 10 \delta$, we then claim that for all $T = I \times J \in \mathcal T_N(T_0)$ we have
\[ |I| \ge \min\left\{ |I_0|, \frac{\varepsilon(N)}{20 \|\partial_x f\|_{L^\infty(T_0)}} \right\} \quad \text{and} \quad |J| \ge \min\left\{ |J_0|, \frac{\varepsilon(N)}{20 \|\partial_y f\|_{L^\infty(T_0)}} \right\}, \tag{9.70} \]
and that furthermore
\[ |T_0| \, \|\partial_x f\|_{L^\infty(T_0)} \|\partial_y f\|_{L^\infty(T_0)} \le \left( \frac{10}{9} \right)^2 \int_{T_0} |\partial_x f \, \partial_y f| \, dx \, dy. \tag{9.71} \]
This last statement easily follows from the following observation: recalling (9.68) we obtain
\[ 9 \delta h(\delta) > \varepsilon(M) \ge e_{T_0}(f) \ge (\|\partial_y f\|_{L^\infty(T_0)} - \delta) \min\{h(\delta), |J_0|\} \ge 9 \delta \min\{h(\delta), |J_0|\}, \]
therefore $|J_0| \le h(\delta)$. Similarly $|I_0| \le h(\delta)$, and therefore for all $z \in T_0$ we obtain
\[ |\partial_x f(z)| \ge \|\partial_x f\|_{L^\infty(T_0)} - \delta \ge \frac{9}{10} \|\partial_x f\|_{L^\infty(T_0)}, \qquad |\partial_y f(z)| \ge \|\partial_y f\|_{L^\infty(T_0)} - \delta \ge \frac{9}{10} \|\partial_y f\|_{L^\infty(T_0)}. \]
Integrating over $T_0$ yields (9.71). Furthermore the previous inequalities show that $\partial_x f$ and $\partial_y f$ have a constant sign on $T_0$. Let us assume that both are positive; then for any rectangle $T = [a, b] \times [c, d] \subset T_0$ we obtain
\[ e_T(f) = \sup_T f - \inf_T f \ge f(b, d) - f(a, c) = \int_a^b \partial_x f(x, c) \, dx + \int_c^d \partial_y f(b, y) \, dy \ge \frac{9}{10} \left( (b - a) \|\partial_x f\|_{L^\infty(T_0)} + (d - c) \|\partial_y f\|_{L^\infty(T_0)} \right). \]
A similar reasoning can be applied if $\partial_x f$ or $\partial_y f$ is negative on $T_0$. Recalling (9.58) we therefore obtain, for any $T = I \times J \subset T_0$,
\[ \frac{9}{10} \le \frac{e_T(f)}{\|\partial_x f\|_{L^\infty(T_0)} |I| + \|\partial_y f\|_{L^\infty(T_0)} |J|} \le 1. \tag{9.72} \]
Clearly the two inequalities in (9.70) are symmetrical, and it suffices to prove the first one. Similarly to the proof of Lemma 9.4.3, we reason by contradiction, assuming that a rectangle $T' = I' \times J'$ with
\[ |I'| \, \|\partial_x f\|_{L^\infty(T_0)} < \frac{\varepsilon(N)}{10} \tag{9.73} \]
was split vertically by the algorithm in the chain leading from $T_0$ to $T$. A simple computation using (9.72) shows that
\[ \frac{e_{T',h}(f)}{e_{T'}(f)} \le \frac{e_{T',h}(f)}{e_{T',v}(f)} \le \frac{10}{9} \times \frac{1 + 2\sigma}{2 + \sigma} \quad \text{with} \quad \sigma := \frac{\|\partial_x f\|_{L^\infty(T_0)} |I'|}{\|\partial_y f\|_{L^\infty(T_0)} |J'|}. \]
In particular if $\sigma < 0.2$, then
\[ e_{T',h}(f) \le \rho \, e_{T'}(f) \quad \text{and} \quad e_{T',h}(f) \le e_{T',v}(f), \]
hence the algorithm performs a horizontal greedy split on $T'$, which contradicts our assumption. Therefore $\sigma \ge 0.2$, but this also leads to a contradiction since, using (9.73),
\[ \varepsilon(N) \le e_{T'}(f) \le \|\partial_x f\|_{L^\infty(T_0)} |I'| + \|\partial_y f\|_{L^\infty(T_0)} |J'| \le (1 + \sigma^{-1}) \frac{\varepsilon(N)}{10} < \varepsilon(N). \]

Case 3. If $T_0 \in \mathcal T_M$ is such that $\|\partial_x f\|_{L^\infty(T_0)} \le 10 \delta$ and $\|\partial_y f\|_{L^\infty(T_0)} \ge 10 \delta$, we then claim that for all $T = I \times J \in \mathcal T_N(T_0)$ we have, with $C_0 := 200$,
\[ |I| \ge \min\left\{ |I_0|, \frac{\varepsilon(N)}{C_0 \delta} \right\} \quad \text{and} \quad |J| \ge \min\left\{ |J_0|, \frac{\varepsilon(N)}{4 \|\nabla f\|_{L^\infty(T_0)}} \right\}, \tag{9.74} \]
with a symmetrical result if $T_0$ is such that $\|\partial_x f\|_{L^\infty(T_0)} \ge 10 \delta$ and $\|\partial_y f\|_{L^\infty(T_0)} \le 10 \delta$. The right part of (9.74) is a direct consequence of Lemma 9.4.3, hence we focus on the left part. Applying the second inequality of (9.68) to $T = T_0$, we obtain
\[ 9 \delta h(\delta) > e_{T_0}(f) \ge (\|\partial_y f\|_{L^\infty(T_0)} - \delta) \min\{h(\delta), |J_0|\} \ge 9 \delta \min\{h(\delta), |J_0|\}, \]
from which we infer that $|J_0| \le h(\delta)$. If $z_1, z_2 \in T_0$ and $x(z_1) = x(z_2)$ we therefore have $|\partial_y f(z_1)| \ge |\partial_y f(z_2)| - \delta$. It follows that for any rectangle $T = I \times J \subset T_0$ we have
\[ (\|\partial_y f\|_{L^\infty(T)} - \delta) |J| \le e_T(f) \le \|\partial_y f\|_{L^\infty(T)} |J| + 10 \delta |I|. \tag{9.75} \]
We then again reason by contradiction, assuming that a rectangle $T' = I' \times J'$ with
\[ |I'| < \frac{2 \varepsilon(N)}{C_0 \delta} \tag{9.76} \]
was split vertically by the algorithm in the chain leading from $T_0$ to $T$. If $\|\partial_y f\|_{L^\infty(T')} \le 10 \delta$, then $\|\nabla f\|_{L^\infty(T')} \le 10 \delta$ and Lemma 9.4.3 shows that $T'$ should not have been split vertically, which is a contradiction.
Otherwise $\|\partial_y f\|_{L^\infty(T')} - \delta \ge \tfrac{9}{10} \|\partial_y f\|_{L^\infty(T')}$, and we obtain using (9.75) and (9.76), where $C_0 = 200$ denotes the constant of (9.74), that
$$\tfrac{9}{10} \|\partial_y f\|_{L^\infty(T')} |J'| \le e_{T'}(f) \le \|\partial_y f\|_{L^\infty(T')} |J'| + 10\delta \, \frac{2\varepsilon(N)}{C_0 \delta}.$$
Recalling that $e_{T'}(f) \ge \varepsilon(N)$ we obtain
$$\Big(1 - \frac{20}{C_0}\Big) e_{T'}(f) \le \|\partial_y f\|_{L^\infty(T')} |J'| \le \frac{10}{9} e_{T'}(f). \tag{9.77}$$
We now consider the children $T'_h$ and $T'_v$ of $T'$ of maximal error after a horizontal and a vertical split respectively. We obtain
$$\begin{aligned}
e_{T',h}(f) = e_{T'_h}(f) &\le \|\partial_y f\|_{L^\infty(T')} \frac{|J'|}{2} + 10\delta |I'| \\
&\le \frac{5}{9} e_{T'}(f) + \frac{20\,\varepsilon(N)}{C_0} \\
&\le \Big(\frac{5}{9} + \frac{20}{C_0}\Big) e_{T'}(f) = \frac{59}{90} e_{T'}(f),
\end{aligned}$$
where we used (9.75) in the second line, and (9.76) and (9.77) in the third line. On the other hand, using (9.77) in the fourth line,
$$\begin{aligned}
e_{T',v}(f) = e_{T'_v}(f) &\ge \big(\|\partial_y f\|_{L^\infty(T')} - \delta\big) |J'| \\
&\ge \tfrac{9}{10} \|\partial_y f\|_{L^\infty(T')} |J'| \\
&\ge \frac{9}{10}\Big(1 - \frac{20}{C_0}\Big) e_{T'}(f) = \frac{81}{100} e_{T'}(f).
\end{aligned}$$
Therefore $e_{T',v}(f) > e_{T',h}(f)$, which is a contradiction, since our decision rule would then select a horizontal split.

We now recall that $\varepsilon(N) \to 0$ as $N \to \infty$, see (9.66), and we choose $N$ large enough so that the minimum in (9.69), (9.70) and (9.74) is always equal to the second term.
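The numerical constants used in Cases 2 and 3 can be double-checked with exact rational arithmetic. The following sketch is only a sanity check and not part of the proof. It assumes that the constant in (9.74) equals 200, that the bound derived from (9.72) reads $(10/9)(\sigma + 1/2)/(1 + \sigma/2)$, and that the threshold on $\sigma$ is $0.2$; the names `C0`, `upper_h`, `lower_v` and `case2_ratio_bound` are hypothetical.

```python
from fractions import Fraction

# Constant of (9.74), assumed here to be 200.
C0 = Fraction(200)

# Case 3: upper bound on e_{T',h}(f) / e_{T'}(f).
upper_h = Fraction(5, 9) + 20 / C0

# Case 3: lower bound on e_{T',v}(f) / e_{T'}(f).
lower_v = Fraction(9, 10) * (1 - 20 / C0)


def case2_ratio_bound(sigma):
    """Assumed Case 2 bound on e_{T',h}(f) / e_{T',v}(f) as a function of sigma."""
    return Fraction(10, 9) * (sigma + Fraction(1, 2)) / (1 + sigma / 2)


if __name__ == "__main__":
    print("Case 3 upper bound for the horizontal split:", upper_h)
    print("Case 3 lower bound for the vertical split:  ", lower_v)
    print("Case 2 bound at sigma = 1/5:", case2_ratio_bound(Fraction(1, 5)))
```

With these values the vertical split error strictly dominates the horizontal one in Case 3, which is the contradiction used above.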
For all $T \in \mathcal{T}_N(T_0)$, we respectively find that
$$\frac{\varepsilon(N)^2}{|T|} \le C^2 \begin{cases}
\delta^2 & \text{if } \|\nabla f\|_{L^\infty(T_0)} \le 10\delta, \\[2pt]
\dfrac{1}{|T_0|} \displaystyle\int_{T_0} |\partial_x f \, \partial_y f| & \text{if } \|\partial_x f\|_{L^\infty(T_0)} \ge 10\delta \text{ and } \|\partial_y f\|_{L^\infty(T_0)} \ge 10\delta, \\[2pt]
\delta \|\nabla f\|_{L^\infty} & \text{if } \|\partial_x f\|_{L^\infty(T_0)} \le 10\delta \text{ and } \|\partial_y f\|_{L^\infty(T_0)} \ge 10\delta \text{ (or reversed)},
\end{cases}$$
with $C := \max\{40, \tfrac{10}{9} \cdot 20, 2\sqrt{C_0}\} = 40$, where $C_0 = 200$ is the constant of (9.74). For $z \in \Omega$, we set $\psi_N(z) := \frac{1}{|T|}$ where $T \in \mathcal{T}_N$ is such that $z \in T$, and we obtain
$$N = \#(\mathcal{T}_N) = \int_\Omega \psi_N(z)\,dz \le C^2 \varepsilon(N)^{-2} \Big(\int_\Omega |\partial_x f \, \partial_y f| \, dx\,dy + \delta \|\nabla f\|_{L^\infty} + \delta^2\Big).$$
Therefore
$$\limsup_{N \to \infty} N \varepsilon(N)^2 \le C^2 \Big(\int_\Omega |\partial_x f \, \partial_y f| \, dx\,dy + \delta \|\nabla f\|_{L^\infty} + \delta^2\Big).$$
Taking the limit as $\delta \to 0$, we obtain the announced result
$$\limsup_{N \to \infty} N^{1/2} \varepsilon(N) \le C \Big\| \sqrt{|\partial_x f \, \partial_y f|} \Big\|_{L^2},$$
which concludes the proof. $\square$

9.5 Appendix

Let $q$ be a quadratic function of mixed type, and let $T$ be a triangle with edges $a, b, c$ such that $|q(a)| \ge |q(b)| \ge |q(c)|$. Up to a linear change of variables, we may assume that the quadratic part of $q$ is $q(x, y) = x^2 - y^2$. Up to a translation, linear rescaling and permutation between the $x$ and $y$ coordinates, we may assume that the edge $a$ is centered at $0$ and such that $q(a) = 4$. We write $a = (u, v)$, and assume that $u > 0$. Then $4 = q(a) = u^2 - v^2$, and there must be $\theta \in \mathbb{R}$ such that $(u, v) = 2(\cosh\theta, \sinh\theta)$.

We next observe that the linear transformation $\varphi$ of matrix
$$\begin{pmatrix} \cosh\theta & -\sinh\theta \\ -\sinh\theta & \cosh\theta \end{pmatrix}$$
leaves $q$ invariant, in the sense that $q \circ \varphi = q$, and satisfies $\varphi(a) = (2, 0)$. We may therefore assume that $a = (2, 0)$, so that the vertices of $T$ at the ends of the edge $a$ are $(-1, 0)$ and $(1, 0)$. We denote the third vertex by $(s, t)$. There is no loss of generality, eventually, in assuming that $s \ge 0$ and $t > 0$. Note that $|T| = t$ and $\rho_q(T) = 4/t$.

We now specialize to the case where $\rho_q(T) \ge 4$, which is equivalent to $t \le 1$.
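The normalization above rests on the fact that the hyperbolic rotation $\varphi$ preserves $q(x, y) = x^2 - y^2$. The short numerical sketch below illustrates this under the assumption that the edge is written $a = 2(\cosh\theta, \sinh\theta)$, so that $q(a) = 4$. The function names `q`, `phi` and `triangle_area`, and the sample values of $\theta$, $s$, $t$, are hypothetical choices made for illustration.

```python
import math


def q(x, y):
    """Quadratic form of mixed type in normalized coordinates."""
    return x * x - y * y


def phi(theta, x, y):
    """Hyperbolic rotation with matrix [[cosh, -sinh], [-sinh, cosh]]."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return ch * x - sh * y, -sh * x + ch * y


def triangle_area(p0, p1, p2):
    """Area of a triangle from its vertices (shoelace formula)."""
    return abs((p1[0] - p0[0]) * (p2[1] - p0[1])
               - (p2[0] - p0[0]) * (p1[1] - p0[1])) / 2.0


theta = 0.7                                        # arbitrary sample angle
a = (2 * math.cosh(theta), 2 * math.sinh(theta))   # edge with q(a) = 4
a_normalized = phi(theta, *a)                      # expected to be close to (2, 0)

s, t = 0.4, 0.8                                    # sample third vertex, s >= 0, t > 0
area = triangle_area((-1.0, 0.0), (1.0, 0.0), (s, t))

if __name__ == "__main__":
    print("q(a) =", q(*a))
    print("phi(a) =", a_normalized)
    print("|T| =", area, "for t =", t)
```

In this sketch the area of the normalized triangle coincides with $t$, in agreement with the remark $|T| = t$.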
Recall that the edges of $T$ are such that $|q(a)| \ge |q(b)| \ge |q(c)|$. The following lines show that $q(1 + s, t) \ge |q(s - 1, t)|$, which implies that $b = -(1 + s, t)$ and $c = (s - 1, t)$:
$$q(1 + s, t) - q(s - 1, t) = 4s \ge 0, \qquad q(1 + s, t) + q(s - 1, t) = 2(1 + s^2 - t^2) \ge 0.$$
In addition we see that $q(b) \ge 0$, hence $q(a) q(b) \ge 0$. For the second and third inequality in (9.19) we remark that $q(c) = (s - 1)^2 - t^2$ and $q(d) = s^2 - t^2$. Clearly $-1 \le -t^2 \le \min\{q(c), q(d)\}$. If $0 \le s \le 1$, we also clearly have $\max\{q(c), q(d)\} \le 1$. If $s \ge 1$, then
$$q(c) \le q(d) = s^2 - t^2 = (s + 1)^2 - t^2 - (2s + 1) = q(b) - (2s + 1) \le q(a) - 3 = 1.$$
We thus always have
$$\max\{|q(c)|, |q(d)|\} \le 1 = \tfrac{1}{4} |q(a)|, \tag{9.78}$$
which concludes the proof of the inequalities (9.19).

Last, we specialize to the case where $\rho_q(T) = 4/t \ge 8$, equivalently $t \le 1/2$, to prove (9.20):
$$q(b) + q(c) = (s + 1)^2 - t^2 + (s - 1)^2 - t^2 = 2 + 2s^2 - 2t^2 \ge \frac{3}{2} = \frac{3}{8} q(a).$$

For any triangle $T$ we denote by $T_x$ the interval defined as the projection of $T$ on the $x$ axis, and by $|T_x|$ its length. We denote by $I_T$ and $I_{T_x}$ the two dimensional and one dimensional local interpolation operators on $T$ and $T_x$ respectively. It is clear that if $g$ is a $C^2$ convex function of one variable and if $G(x, y) := g(x)$, then
$$\|G - I_T G\|_{L^\infty(T)} = \|g - I_{T_x} g\|_{L^\infty(T_x)}. \tag{9.79}$$
The following lemma compares the interpolation error on an interval of $\mathbb{R}$ and on a sub-interval.

Lemma 9.5.1. Let $g \in C^2(\mathbb{R}, \mathbb{R})$ be such that $0 < m \le g'' \le M$. Let $x_0, x_1, x_2$ be real numbers satisfying $x_0 < \frac{x_0 + x_2}{2} \le x_1 < x_2$, and denote $u := x_1 - x_0 \le v := x_2 - x_0$.
Then, denoting by $I_u$ and $I_v$ the interpolation operators on the intervals $[x_0, x_1]$ and $[x_0, x_2]$ respectively, we have
$$\|g - I_v g\|_{L^\infty([x_0, x_2])} \ge \frac{m v^2}{8}$$
and
$$\|g - I_u g\|_{L^\infty([x_0, x_1])} \le \Big(1 - \alpha \frac{v - u}{v}\Big) \|g - I_v g\|_{L^\infty([x_0, x_2])}$$
with $\alpha := \frac{1}{4}\sqrt{m/M}$.

Proof: Let us define $h_v = I_v g - g$. Since $h_v'' = -g''$ and $h_v(x_0) = h_v(x_2) = 0$, $h_v$ can be represented as the integral
$$h_v(x) = \int_{x_0}^{x_2} K_v(x, y) \, g''(y)\,dy, \tag{9.80}$$
where the Green kernel $K_v$ is given by
$$K_v(x, y) := \frac{1}{v} \begin{cases} (x - x_0)(x_2 - y) & \text{if } x \le y, \\ (x_2 - x)(y - x_0) & \text{if } x \ge y. \end{cases}$$
Of course, we have a similar representation of $h_u = I_u g - g$ with a kernel $K_u$. The first part of the lemma immediately follows from
$$h_v\Big(\frac{x_0 + x_2}{2}\Big) \ge m \int_{x_0}^{x_2} K_v\Big(\frac{x_0 + x_2}{2}, y\Big)\,dy = \frac{m v^2}{8}.$$
In order to prove the second part, we shall compare the Green kernels $K_u$ and $K_v$. For this purpose, we define
$$\mu(x) := \frac{(x_1 - x)/u}{(x_2 - x)/v}.$$
For all $x, y \in [x_0, x_1]$, we thus have
$$\frac{K_u(x, y)}{K_v(x, y)} = \min\{\mu(x), \mu(y)\} \le \mu(x).$$
Defining $x_u := \operatorname{argmax}_{t \in [x_0, x_1]} (I_u g - g)(t)$ and using (9.80), we obtain
$$\begin{aligned}
\|g - I_u g\|_{L^\infty([x_0, x_1])} = h_u(x_u) &= \int_{x_0}^{x_1} K_u(x_u, y) \, g''(y)\,dy \\
&\le \mu(x_u) \int_{x_0}^{x_1} K_v(x_u, y) \, g''(y)\,dy \\
&\le \mu(x_u) \int_{x_0}^{x_2} K_v(x_u, y) \, g''(y)\,dy \\
&\le \mu(x_u) \, \|g - I_v g\|_{L^\infty([x_0, x_2])}.
\end{aligned}$$
In order to conclude, we need to estimate $\mu(x_u)$. One easily checks by differentiation that $\mu$ is decreasing and concave on the interval $[x_0, x_1]$, and therefore for all $x \in [x_0, x_1]$
$$\mu(x) \le 1 - (x - x_0) \frac{v - u}{v^2}.$$
Differentiating the integral representation of $h_u$, we obtain
$$u \, h_u'(x_u) = -\int_{x_0}^{x_u} (y - x_0) \, g''(y)\,dy + \int_{x_u}^{x_1} (x_1 - y) \, g''(y)\,dy.$$
Since $h_u'(x_u) = 0$ and $0 < m \le g'' \le M$, we obtain
$$(x_1 - x_u)^2 m \le (x_u - x_0)^2 M.$$
Since $x_1 \ge \frac{x_0 + x_2}{2}$, this gives $x_u - x_0 \ge \frac{1}{4}\sqrt{m/M}\, v$. Therefore,
$$\mu(x_u) \le 1 - \frac{1}{4}\sqrt{\frac{m}{M}} \, \frac{v - u}{v},$$
which concludes the proof. $\diamond$

The following corollary uses the above lemma to compare the values of the $L^\infty$-based decision functions for a convex function that depends only on one variable. For any vector $v \in \mathbb{R}^2$ we denote by $v_x$ the absolute value of its $x$ coordinate.

Corollary 9.5.2. Let $T$ be a triangle with edges $a, b, c$, such that $a_x \ge b_x \ge c_x$. Let $G(x, y) = g(x)$, where $g \in C^2$ and $0 < m \le g'' \le M$. Then, with $d_T$ defined by (9.9),
$$d_T(b, G) - d_T(a, G) \ge C a_x (a_x - b_x), \qquad d_T(c, G) - d_T(a, G) \ge C a_x^2 / 2,$$
with $C = \frac{m^{3/2}}{32\sqrt{M}}$.

Proof: We recall the notation $\alpha_T(f) := \|f - I_T f\|_{L^\infty(T)}$. It follows from (9.79) that
$$\alpha_T(G) = \|g - I_{T_x} g\|_{L^\infty(T_x)}.$$
We label the extremities of $a$ by $i \in \{1, 2\}$, and denote by $T_e^i$, $i \in \{1, 2\}$, $e \in \{a, b, c\}$, the child of $T$ resulting from the bisection through the edge $e$ in such a way that $T_e^i$ contains the extremity of $a$ of label $i$. We denote by $T_{e,x}^i$ the projection of $T_e^i$ onto the $x$-axis. Then (up to exchanging the labels of the extremities of $a$),
$$|T_{a,x}^1| = b_x, \quad |T_{a,x}^2| = a_x/2, \quad |T_{b,x}^1| = a_x, \quad |T_{b,x}^2| = c_x + b_x/2, \quad |T_{c,x}^1| = b_x + c_x/2, \quad |T_{c,x}^2| = a_x.$$
In particular, we have $T_{a,x}^2 \subset T_{b,x}^2$ and $T_{a,x}^1 \subset T_{c,x}^1$, and therefore
$$\alpha_{T_a^2}(G) = \|g - I_{T_{a,x}^2} g\|_{L^\infty(T_{a,x}^2)} \le \|g - I_{T_{b,x}^2} g\|_{L^\infty(T_{b,x}^2)} = \alpha_{T_b^2}(G),$$
and similarly $\alpha_{T_a^1}(G) \le \alpha_{T_c^1}(G)$. Moreover, we can apply the previous lemma with $[x_0, x_1] = T_{a,x}^1 \subset T_{b,x}^1 = [x_0, x_2]$ or with $[x_0, x_1] = T_{a,x}^2 \subset T_{c,x}^2 = [x_0, x_2]$ (up to a reflection of the $x$ axis), which respectively leads to
$$\alpha_{T_b^1}(G) - \alpha_{T_a^1}(G) \ge \frac{m^{3/2}}{32\sqrt{M}} \, a_x (a_x - b_x), \quad\text{and}\quad \alpha_{T_c^2}(G) - \alpha_{T_a^2}(G) \ge \frac{m^{3/2}}{32\sqrt{M}} \, a_x^2 / 2.$$
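The two estimates of Lemma 9.5.1 can be checked numerically on a sample function. The sketch below is an illustration under stated assumptions rather than a verification of the lemma: it takes $\alpha = \frac{1}{4}\sqrt{m/M}$, uses the hypothetical test function $g(x) = x^2 + \frac{1}{4}\sin(2x)$, for which $1 \le g'' \le 3$, and approximates the sup norms on a uniform grid. The helper `interp_error` and all numerical values are choices made for this example.

```python
import math


def g(x):
    """Sample convex function with g''(x) = 2 - sin(2x), hence 1 <= g'' <= 3."""
    return x * x + 0.25 * math.sin(2.0 * x)


def interp_error(func, lo, hi, n=4000):
    """Grid approximation of the sup of (I func - func) on [lo, hi],
    where I is the linear interpolant at the two endpoints."""
    f_lo, f_hi = func(lo), func(hi)
    worst = 0.0
    for k in range(n + 1):
        x = lo + (hi - lo) * k / n
        chord = f_lo + (f_hi - f_lo) * (x - lo) / (hi - lo)
        worst = max(worst, chord - func(x))
    return worst


m, M = 1.0, 3.0                  # bounds on g''
x0, x1, x2 = 0.0, 0.7, 1.0       # x0 < (x0 + x2)/2 <= x1 < x2
u, v = x1 - x0, x2 - x0

alpha = 0.25 * math.sqrt(m / M)  # assumed value of the constant alpha
err_u = interp_error(g, x0, x1)  # error on the sub-interval [x0, x1]
err_v = interp_error(g, x0, x2)  # error on the full interval [x0, x2]
lower_bound = m * v * v / 8
contraction = 1 - alpha * (v - u) / v

if __name__ == "__main__":
    print("error on [x0, x2]:", err_v, ">= m v^2 / 8 =", lower_bound)
    print("error on [x0, x1]:", err_u, "<=", contraction * err_v)
```

On this example both inequalities hold with a comfortable margin, which suggests that the constants in the lemma are not sharp.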
This allows us to conclude since $d_T(e, G) = \alpha_{T_e^1}(G) + \alpha_{T_e^2}(G)$ for $e \in \{a, b, c\}$. $\diamond$

Using the above result, we now prove that the decision function $d_T$ tends to prescribe a longest edge bisection with respect to the euclidean metric when the triangle $T$ becomes too thin.

Corollary 9.5.3. Let $f$ be a convex function such that $m \,\mathrm{Id} \le d^2 f \le M \,\mathrm{Id}$, and let $T$ be a triangle with measure of non-degeneracy $\sigma(T)$ for the euclidean metric and edges $a, b, c$ such that $|a| \ge |b| \ge |c|$. Then if $\sigma(T) > K$ the bisection prescribed by the decision function $d_T(\cdot, f)$ is a $\delta$-near longest edge bisection with respect to the euclidean metric (in the sense of Definition 8.4.3), with $K = 128 \big(\frac{M}{m}\big)^{3/2}$ and $\delta := \frac{K}{\sigma(T)}$.

Proof: We denote by $p(X)$ the affine orthogonal projection of a point $X \in \mathbb{R}^2$ onto the line which includes the edge $a$, and we denote by $O$ the midpoint of $a$. We then define
$$\tilde f(X) = f(p(X)) + df_O(X - p(X)).$$
Then, using the notation $X(z) = p(X) + z(X - p(X))$, we have
$$f(X) - \tilde f(X) = f(X) - f(p(X)) - df_O(X - p(X)) = \int_0^1 (df_{X(z)} - df_O)(X - p(X))\,dz.$$
But for $X \in T$, we have $X(z) \in T$ for all $z \in [0, 1]$, hence $|X(z) - O| \le |a|$ and therefore
$$\|df_{X(z)} - df_O\| \le M |a|.$$
Moreover,
$$|X - p(X)| \le \frac{2|T|}{|a|} \le \frac{|a|}{\sigma(T)}.$$
Therefore, $\|f - \tilde f\|_{L^\infty(T)} \le \frac{M |a|^2}{\sigma(T)}$. We can apply Corollary 9.5.2 to the function $\tilde f$, since it is the sum of a function of one variable and of an affine function which has no effect on the interpolation error. Assuming without loss of generality (up to a rotation) that $a$ is parallel to the $x$ axis, this gives us
$$d_T(b, \tilde f) - d_T(a, \tilde f) \ge C |a| (|a| - b_x) \ge C |a| (|a| - |b|), \qquad d_T(c, \tilde f) - d_T(a, \tilde f) \ge C |a|^2 / 2,$$
with $C = \frac{m^{3/2}}{32\sqrt{M}}$.
Since the Lebesgue constant of the interpolation operator on any triangle is 2, we have
$$|d_T(e, f) - d_T(e, \tilde f)| \le \|f - \tilde f\|_{L^\infty(T)},$$
which implies the following inequalities:
$$d_T(b, f) - d_T(a, f) \ge C |a| (|a| - |b|) - M |a|^2 / \sigma(T), \qquad d_T(c, f) - d_T(a, f) \ge |a|^2 \big(C/2 - M/\sigma(T)\big).$$
The second inequality shows that the edge $c$ cannot be cut if $C/2 - M/\sigma(T) > 0$, which holds when $\sigma(T) > K$. The first inequality shows that $b$ may be cut provided that $C(|a| - |b|) - M|a|/\sigma(T) \le 0$, i.e. $|b| \ge \big(1 - \frac{M}{C \sigma(T)}\big) |a|$, which shows the property of $\delta$-near longest edge bisection with $\delta := \frac{K}{\sigma(T)}$. $\diamond$

The proof of Proposition 9.2.8 directly follows from this last result, proceeding exactly as in the last paragraph of the proof of Proposition 8.4.10.